1 Introduction

1.1 Background

An important aim of the scientific community is to endow machines with the capability of processing an image the way a human being does. Artificial vision pursues this objective, and digital image processing is its first phase. Digital image processing has been applied in diverse areas such as robotics [1], surveillance [2], medicine [3] and ecology [4], to mention a few. In these areas, image identification and interpretation are fundamental tasks. An essential step in identifying regions of interest (ROIs) is to divide the image into two or more regions. Nevertheless, this process is notably affected when the quality of the input image is low. In the medical area in particular, processing microscopic images in gray levels is useful for identifying biological specimens that typically stand out from the background. Blood cell images are of special interest in medicine because their morphological analysis allows the identification of pathologies such as leukemia and other hematological disorders. However, because the conditions of image acquisition are not controlled, these images are generally of poor quality and their properties differ within the same group of samples. Some causes are poor illumination, lack of constant communication with the image sensor, and wrong lens settings during the acquisition process.

Although image enhancement techniques have been applied to the analysis of leukemia cell images, as in [5,6,7], most of them do not consider the particular properties of each image and apply the same transformations to all images. Moreover, traditional enhancement approaches are highly dependent on the image and require manual parameter adjustment. A technique commonly used for image enhancement is contrast stretching, which consists of expanding the range of intensity levels in the image [8]. Several proposals address the problem of determining adequate parameters for image enhancement. In [9] a Modified Differential Evolution algorithm is proposed for contrast and brightness enhancement; nevertheless, it is computationally complex. In [10] two chaotic Differential Evolution schemes for contrast enhancement are proposed, where enhancement is treated as a constrained nonlinear optimization problem. Nonetheless, that study uses only two test images and does not include an assessment of the resulting images. Finally, in [11, 12] enhancement techniques are applied to leukemia images, but the results are not reliable because they are assessed only visually and without comparison with other works.

Since microscopic image analysis is a common technique in the medical area, image enhancement is often required because of poor quality or lack of homogeneity in the images. An enhancement technique should therefore adapt to each image to give satisfactory results in later processing phases. This paper proposes the use of Differential Evolution (DE) to fit a Gaussian mixture model (GMM) to the image histogram. From this fit, the parameters used for contrast enhancement are computed. Edge detection and dilation are then applied to the output image to isolate the cell nucleus region, from which geometrical features are extracted to classify the cell using two types of neural networks.

The paper is organized as follows. Section 2 presents the theoretical fundamentals of the proposed methodology. Section 3 describes the contrast enhancement process with DE for the detection of leukemia in blood cell images. Experimental results are shown in Sect. 4. Conclusions are drawn in Sect. 5.

2 Theoretical Fundamentals

2.1 Contrast Enhancement

The goal of image enhancement is to remove as many unwanted aspects as possible while retaining those aspects of the image that are critical to later processing. One approach is to use per-pixel operations, which return a single value for each pixel of the input image [13]. A commonly used function of this kind is the contrast stretch, which yields an image that covers a wider range of values in the image histogram. In its simple form it is defined as (1),

$$\begin{aligned} y=A\frac{c-a}{b-a} \end{aligned}$$
(1)

where A is the maximum value that the output pixels should take, a and b are the lower and upper limits, and c is the gray value in the input image. This is illustrated graphically in Fig. 1. The values of a and b can be estimated manually or by approximation.
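As an illustration, the stretch in (1) can be sketched in a few lines of Python. This is a minimal sketch rather than the implementation used in this work: the function name, the nested-list image representation, the default A = 255 and the clipping of values outside [a, b] are all assumptions made for the example.

```python
def contrast_stretch(image, a, b, A=255):
    """Apply y = A * (c - a) / (b - a) to every gray value c.

    Values outside [a, b] are clipped to 0 or A. The clipping is an
    assumption of this sketch; Eq. (1) itself does not state it.
    """
    if b <= a:
        raise ValueError("the upper limit b must be greater than a")
    scale = A / (b - a)
    return [
        [min(A, max(0, round((c - a) * scale))) for c in row]
        for row in image
    ]


# Small made-up image, given as rows of gray values.
image = [[40, 60, 90], [100, 120, 200]]
stretched = contrast_stretch(image, a=40, b=120)
```

Gray values at or below a map to 0 and values at or above b map to A, so the interval [a, b] is spread over the full output range.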

Fig. 1. Typical histogram

2.2 Approximation of Histogram with Gaussian Mixture Model (GMM)

In image processing, GMMs are commonly used to obtain probabilistic data models. In the case of an image, the distribution of gray levels can be expressed as a histogram h(gl). Considering L gray levels, \([0, \cdots , L-1]\), the histogram can be treated as a probability distribution function. To this end, it is normalized by dividing the count of each gray level gl by the total number of pixels N in the image. The histogram h(gl) can then be approximated by a mixture of Gaussians (2),

$$\begin{aligned} p(x)= \sum _{i=1}^{K} P_i\cdot p_i(x)= \sum _{i=1}^{K} \frac{P_i}{ \sqrt{ 2\pi }\,\sigma _i}e^{ -\frac{(x-\mu _i)^{2}}{2\sigma _{i}^{2}}} \end{aligned}$$
(2)

where \(P_i\) is the a priori probability of class i, \(p_i(x)\) is the probability distribution function of the gray-level random variable x in class i, K is the number of classes, and \(\mu _i\) and \(\sigma _i\) are the mean and standard deviation of the i-th probability function. In addition, the a priori probabilities are constrained to sum to 1. The parameters of the mixture are assessed through the mean squared error between the sum of Gaussians and the image histogram (3).

$$\begin{aligned} E= \frac{1}{ n} \sum _{i=1}^{n} (p(x_i)-h(x_i))^{2} \end{aligned}$$
(3)

Estimating the parameters that minimize the error in (3) is a complex problem, and it becomes more difficult as the number of mixture components increases. For this reason, the DE algorithm is used to estimate the parameters of each component, which are later used to deduce appropriate values of a and b in (1) for the contrast enhancement of the image.
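The quantities in (2) and (3) can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function names and the representation of the mixture as a list of (P_i, mu_i, sigma_i) tuples are hypothetical, and each component uses the standard Gaussian density, normalized by sqrt(2*pi)*sigma_i and with a negative exponent.

```python
import math


def normalized_histogram(pixels, levels=256):
    """Return h(gl): the fraction of pixels at each gray level."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def gmm_density(x, params):
    """Evaluate the mixture p(x); params is a list of (P_i, mu_i, sigma_i)."""
    return sum(
        P / (math.sqrt(2.0 * math.pi) * sigma)
        * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
        for P, mu, sigma in params
    )


def mean_squared_error(params, hist):
    """Mean squared difference between p(x) and h(x) over all gray levels."""
    n = len(hist)
    return sum((gmm_density(x, params) - hist[x]) ** 2 for x in range(n)) / n


# Tiny made-up example with only four gray levels.
hist = normalized_histogram([0, 0, 1, 2, 2, 2, 3, 3], levels=4)
params = [(0.5, 1.0, 0.8), (0.5, 2.5, 0.6)]
error = mean_squared_error(params, hist)
```

A function such as `mean_squared_error` is what an optimizer would minimize with respect to the mixture parameters.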

2.3 Differential Evolution

Differential Evolution (DE) is a parallel direct search method which utilizes NP D-dimensional parameter vectors as a population for each generation G, as in (4), where NP does not change during the minimization process. The initial population is chosen randomly and should cover the entire parameter space [14].

$$\begin{aligned} x_{i,G} , i=1,2,...,NP \end{aligned}$$
(4)

DE is useful for solving optimization problems in continuous spaces where the variables are represented by real numbers. The initial population is randomly generated, and three individuals are selected as parents: one of them acts as the main parent and is perturbed by the other two. If the modified parent is fitter than the individual it competes against in the selection step, it is kept; otherwise it is replaced. DE's basic strategy therefore includes the functions of mutation, crossover and selection. The strategy is as follows.

For each vector \(\overrightarrow{x_{i,G}}\), \(i = 1, 2, \ldots , NP\), a mutant vector \(\overrightarrow{v_{i,G+1}}\) is generated according to (5):

$$\begin{aligned} \overrightarrow{v_{i,G+1}} = \overrightarrow{x_{r1,G}}+ F\cdot (\overrightarrow{x_{r2,G}}-\overrightarrow{x_{r3,G}}) \end{aligned}$$
(5)

with randomly chosen integer indexes \( r_1,r_2,r_3 \in \{1,2,\ldots ,NP\}\), mutually different, and \(F>0\). The values \(r_1\), \(r_2\) and \(r_3\) must also differ from the running index i, so NP must be greater than or equal to four to allow for this condition. F is a real, constant factor \(\in [0,2]\) which controls the amplification of the differential variation \((\overrightarrow{x_{r2,G}}-\overrightarrow{x_{r3,G}})\). To increase the diversity of the perturbed parameter vectors, crossover is used. For this purpose the trial vector u (6) is defined,

$$\begin{aligned} \overrightarrow{u_{i,G+1}} = (u_{1i,G+1}, u_{2i,G+1}, \ldots , u_{Di,G+1}) \end{aligned}$$
(6)

where D is the vector dimension. The trial vector u is formed according to (7):

$$\begin{aligned} u_{ji,G+1} = \left\{ \begin{array}{ll} v_{ji,G+1} & \text{if } (randb(j)\le CR) \text{ or } j=rnbr(i)\\ x_{ji,G} & \text{if } (randb(j)>CR) \text{ and } j\ne rnbr(i) \end{array} \right. \quad j= 1,2, \ldots , D. \end{aligned}$$
(7)

In (7), randb(j) is the j-th evaluation of a uniform random number generator with outcome \(\in [0,1]\). CR is the crossover constant \(\in [0,1]\), which is determined by the user. rnbr(i) is a randomly chosen index \(\in \{1,2, \ldots , D\}\) which ensures that \(u_{i,G+1}\) gets at least one parameter from \(v_{i,G+1}\). To select the individual of the next generation \(G+1\), the trial vector \(u_{i,G+1}\) is compared to the target vector \(x_{i,G}\) using the greedy criterion. If vector \(u_{i,G+1}\) yields a smaller cost function value than \(x_{i,G}\), then \(x_{i,G+1}\) is set to \(u_{i,G+1}\); otherwise, the old value \(x_{i,G}\) is retained.

In particular, considering (3) as the objective function, the process described above is repeated until an ending criterion is attained or a predetermined number of generations is reached.
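The mutation, crossover and selection steps described in this section can be put together as in the following sketch. It is a minimal, unconstrained illustration and not the code used in this work: the function name and signature are hypothetical, the default values of NP, F, CR and the number of generations mirror the settings reported in Sect. 4, and the fixed random seed and the sphere test function are assumptions made for the example.

```python
import random


def differential_evolution(cost, bounds, NP=30, F=0.3, CR=0.8,
                           generations=200, seed=0):
    """Minimize cost(x) with mutation, binomial crossover and greedy selection."""
    if NP < 4:
        raise ValueError("NP must be at least 4")
    rng = random.Random(seed)
    D = len(bounds)
    # Random initial population covering the whole parameter space.
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    costs = [cost(x) for x in population]
    for _ in range(generations):
        next_population, next_costs = [], []
        for i in range(NP):
            # Three mutually different indexes, all different from i.
            r1, r2, r3 = rng.sample([k for k in range(NP) if k != i], 3)
            x1, x2, x3 = population[r1], population[r2], population[r3]
            mutant = [x1[j] + F * (x2[j] - x3[j]) for j in range(D)]
            # Crossover: position `forced` always comes from the mutant.
            forced = rng.randrange(D)
            trial = [
                mutant[j] if (rng.random() <= CR or j == forced)
                else population[i][j]
                for j in range(D)
            ]
            # Greedy selection between the trial and the target vector.
            trial_cost = cost(trial)
            if trial_cost < costs[i]:
                next_population.append(trial)
                next_costs.append(trial_cost)
            else:
                next_population.append(population[i])
                next_costs.append(costs[i])
        population, costs = next_population, next_costs
    best = min(range(NP), key=costs.__getitem__)
    return population[best], costs[best]


def sphere(x):
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

The sketch stops only after a fixed number of generations; an additional ending criterion, such as a cost threshold, could be checked inside the outer loop.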

Fig. 2. Images of blood cells

3 Automatic Contrast Enhancement with Differential Evolution

This section describes the contrast enhancement process, which aims to highlight the region of interest (ROI) in images of blood cells. Since the cell nucleus is relevant for identifying the cell type, the contrast enhancement focuses on highlighting this region of the cell. Figure 2 shows images of healthy cells (first column) and leukemic cells (second column). As can be seen, the nucleus is notably different for each class. Moreover, typical histograms of blood cells contain at least three modes, as shown in Fig. 3. For this reason, a Gaussian mixture model (GMM) with K = 3 components is proposed to approximate the image histogram, and it is used to obtain the values of a and b in (1). In this way, a GMM is obtained from the image histogram by using (2).

Fig. 3. Overlapping representative histograms of blood cells of a set of five images per class: healthy and leukemic cells.

It is worth mentioning that estimating the parameters of the mixture is a complicated problem: the larger the number of components K, the more complex the estimation. The parameter estimation for each component of the mixture is therefore treated as an optimization problem, as in (8).

$$\begin{aligned} \text{Minimize}\; f(E)= \frac{1}{ n} \sum _{i=1}^{n} (p(x_i)-h(x_i))^{2} \end{aligned}$$
(8)

with design variables,

$$\begin{aligned} x= \{P_1,\sigma _1,\mu _1,P_2,\sigma _2,\mu _2,P_3,\sigma _3,\mu _3\} \end{aligned}$$
(9)

subject to:

$$\begin{aligned} h_1(E)= (P_1+P_2+P_3)-1=0 \end{aligned}$$
(10)

From the above, each individual has the structure defined in (9), which is used by the DE algorithm to obtain the parameters of the GMM. The feasibility rules proposed in [15] are used to handle the constraint \(h_1(E)\). These rules take into account the fitness and/or feasibility of each individual during the selection process of the evolutionary algorithm. In the representative histograms of Fig. 3, the leftmost mode M1 represents the darker regions of the image (i.e., the region of the cell nucleus). To enhance the contrast in this region, the values of a and b in (1) are therefore obtained solely from this mode. They are computed using the 3-sigma rule, which states that about \(99.7\%\) of the values of a normal distribution lie within three standard deviations of the mean [16]. Thus, from the values \(\mu _1\) and \(\sigma _1\), a and b can be obtained as in (11).

$$\begin{aligned} \begin{array}{l} a= \mu _1-3\sigma _1\\ b= \mu _1+3\sigma _1 \end{array} \end{aligned}$$
(11)

Finally, stretching the contrast with these limits delimits the cell nucleus region very accurately.
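The computation of the limits from the leftmost mode can be sketched as follows. The function name, the tuple layout (P_i, mu_i, sigma_i) and the numerical values are hypothetical, and two details are assumptions of this sketch rather than statements of the method: M1 is taken as the component with the smallest mean, and the limits are clamped to the valid gray range.

```python
def nucleus_stretch_limits(components, levels=256):
    """Return (a, b) from the darkest mixture component via the 3-sigma rule.

    components is a list of (P_i, mu_i, sigma_i) tuples from the fitted GMM.
    """
    # M1 is taken as the component with the smallest mean (leftmost mode).
    _, mu1, sigma1 = min(components, key=lambda comp: comp[1])
    # Clamping to [0, levels - 1] is an assumption of this sketch.
    a = max(0.0, mu1 - 3.0 * sigma1)
    b = min(float(levels - 1), mu1 + 3.0 * sigma1)
    return a, b


# Made-up mixture parameters for illustration only.
components = [(0.5, 200.0, 15.0), (0.2, 60.0, 10.0), (0.3, 140.0, 20.0)]
a, b = nucleus_stretch_limits(components)
```

Here the darkest component has a mean of 60 and a standard deviation of 10, so the stretch would be applied between gray levels 30 and 90.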

4 Experimental Results

The dataset used in this work includes 260 RGB images of blood cells, half of which are healthy cells and the remainder leukemic cells [17]. The image size is \(600\times 600\) pixels. The G channel is used in the experiments because it shows better contrast than the gray-level image or the R and B channels, as shown in Fig. 4. The algorithms were coded in MATLAB R2018a and executed on a computer with a Core i7 processor, 16 GB of memory and a GeForce 6.1 graphics processing unit.

In particular, the goal of the proposed contrast enhancement is to highlight the cell nucleus region for the identification of leukemic cells. First, the DE algorithm estimates the parameters defined in (9) using the objective function in (8), subject to (10) with a tolerance \(\epsilon =10^{-6}\). A population of 30 individuals is considered, where each individual has the structure defined in (9). This population is randomly initialized, and the DE algorithm uses a mutation factor \(F=0.3\) and a crossover constant \(CR=0.8\). The number of generations is 200. All these values were obtained experimentally. Deb's feasibility rules are used for constraint handling because the search space can include parameter sets that are not feasible solutions to the problem. The experiments were carried out with 10 executions.
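The way feasibility rules of this kind enter the selection step can be sketched as follows. The function names are hypothetical and the sketch is not the code used in the experiments. It assumes the usual formulation of the rules (a feasible individual beats an infeasible one, two feasible individuals are compared by cost, and two infeasible individuals are compared by constraint violation) and relaxes the equality constraint with the tolerance given above.

```python
def violation(P1, P2, P3, tol=1e-6):
    """Violation of (P1 + P2 + P3) - 1 = 0, relaxed by the tolerance tol."""
    return max(0.0, abs(P1 + P2 + P3 - 1.0) - tol)


def trial_wins(trial_cost, trial_violation, target_cost, target_violation):
    """Decide whether the trial vector replaces the target vector."""
    trial_feasible = trial_violation == 0.0
    target_feasible = target_violation == 0.0
    if trial_feasible and target_feasible:
        # Both feasible: the lower objective value wins.
        return trial_cost < target_cost
    if trial_feasible != target_feasible:
        # Exactly one is feasible: the feasible one wins.
        return trial_feasible
    # Both infeasible: the smaller constraint violation wins.
    return trial_violation < target_violation


# A feasible trial replaces an infeasible target even at a higher cost.
replaced = trial_wins(5.0, violation(0.2, 0.3, 0.5), 1.0, violation(0.5, 0.5, 0.5))
```

In a DE loop, a comparison like `trial_wins` would take the place of the plain cost comparison in the greedy selection.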

Fig. 4. Blood cells images. In first column images in gray level. Columns 2–4 images of channels R, G and B, respectively.

From the above, the parameters obtained from the approximation of the image histogram are used to obtain the values of a and b for the mode M1, as defined in (11) in Sect. 3. The contrast enhancement based on these values of a and b isolates the cell nucleus region in most cases. To identify the leukemic cells in the resulting image, post-processing with morphological operators was applied to join some regions in the image for its later classification. First, the edges are extracted using the Roberts algorithm. Then, a dilation is applied using a disk-shaped structuring element of radius 5. The edge extraction algorithm, the disk size and its shape were chosen by visual inspection. To this end, the Sobel, Canny, Prewitt and LoG methods were also applied for edge extraction, and diamond, octagon and sphere shapes with sizes between 3 and 7 were tested for the structuring element. In Fig. 5, the third column shows the results of this process for one image per class. As can be seen, the proposed enhancement highlights only the area of interest, unlike other techniques such as histogram equalization.
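The two post-processing steps can be illustrated with the following pure-Python sketch. It is not the MATLAB code used in the experiments: the function names are hypothetical, the explicit gradient threshold is an assumed parameter, and the structuring element is an exact Euclidean disk, which may differ from the disk approximation of a particular toolbox.

```python
def roberts_edges(image, threshold):
    """Binary edge map from the Roberts cross gradient magnitude."""
    rows, cols = len(image), len(image[0])
    edges = [[0] * cols for _ in range(rows)]
    for r in range(rows - 1):
        for c in range(cols - 1):
            # Differences along the two diagonals of each 2x2 window.
            gx = image[r][c] - image[r + 1][c + 1]
            gy = image[r][c + 1] - image[r + 1][c]
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                edges[r][c] = 1
    return edges


def dilate_with_disk(binary, radius=5):
    """Dilate a binary image with a disk-shaped structuring element."""
    rows, cols = len(binary), len(binary[0])
    offsets = [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if binary[r][c]:
                # Stamp the disk around every foreground pixel.
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        out[rr][cc] = 1
    return out


# Made-up 7x7 image with a bright 3x3 block in the center.
image = [[100 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
         for r in range(7)]
edges = roberts_edges(image, threshold=50)
mask = dilate_with_disk(edges, radius=1)
```

Dilating the edge map thickens the detected contours, which is what joins neighboring regions before the features are extracted.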

Fig. 5. Isolation of cell nucleus from the proposed contrast enhancement.

To determine whether the proposed contrast enhancement is useful for identifying leukemic cells, three types of neural networks were used: MLP, LeNet and AlexNet. For the MLP, 17 geometric features were extracted: area, diameter ratio, extent, eccentricity, orientation, solidity, rectangularity, Euler number, perimeter, convex area and the 7 Hu invariants. For the convolutional networks LeNet and AlexNet, the binary images obtained previously were used. In all cases \(80\%\) of the data were used for training and the remainder for testing. Table 1 summarizes the training outcomes, which show that the proposed contrast enhancement method performs well in identifying leukemic cells with an MLP network. Moreover, unlike the convolutional networks, the MLP requires a much shorter classification time.

Table 1. Results of classification.

5 Conclusions

The automatic contrast enhancement method presented in this paper has been shown to be useful for identifying leukemic cells. Evolutionary algorithms such as Differential Evolution made it possible to obtain good solutions, achieving a contrast enhancement that is sufficient to identify the cells without a subsequent process such as segmentation. Regarding cell identification, the use of different types of neural networks shows that convolutional networks perform poorly when classifying binary images, while an MLP is better suited to this task.

In brief, the proposed method favours the use of few computational resources to identify leukemic cells in gray-level images. Finally, the method could be used for any type of gray-level image, provided that the histograms are characterized beforehand.