Texture Image Segmentation Approach Based on Neural Networks

Abstract: One of the major problems in texture analysis is segmenting an image into regions according to their textures. In this paper, we present a new texture segmentation approach based on both Kohonen maps and mathematical morphology. It uses three different kinds of texture features: Haralick features based on the gray-level co-occurrence matrix (GLCM), fractal features based on the fractal dimension estimated with the differential box counting method, and wavelet features based on the wavelet transform. These features are used to train the Kohonen network, which is then represented by the underlying probability density function (PDF). This representation of the map is segmented by the morphological watershed transformation. In the final part of our algorithm, the textured image is segmented by assigning each pixel to a modal region extracted from the map. We report the results obtained with the three extraction methods, taking into consideration the execution time and the error rate.


Introduction
Texture is a rich source of information for interpreting an image, and is therefore an important feature for detecting and segmenting different objects of the same color.
The purpose of texture segmentation is to determine the number and types of textures present in an image, and which texture each region belongs to. Segmentation generally involves two steps: the first extracts texture features for each pixel of the image, and the second uses these features to determine the uniform regions.
In this paper, we present three texture feature extraction methods. The first calculates the Haralick features extracted from the gray-level co-occurrence matrix (GLCM), the second computes local fractal features using the differential box counting method, and the third uses the wavelet transform to calculate wavelet features.
After computing the features with one of the three methods, we apply our unsupervised classification approach, which is based on a morphological and neural concept and which, unlike other methods such as k-means, does not require the number of regions in the texture image to be known in advance.
First, each vector to classify is projected onto a two-dimensional self-organizing map, the Kohonen self-organizing feature map. To help extract the homogeneous regions of this map, we first represent each cell of the map by the value of the probability density function (PDF), estimated by a nonparametric procedure; we then extract the modal regions automatically using the watershed transformation. The segmentation phase consists in comparing the feature vector of each pixel of the image to the weight vectors corresponding to the detected modal regions, in order to assign each pixel to one of the extracted classes.
In the last section of the paper, we first present a comparative study of the effectiveness of the three kinds of features, based on the error rate and the execution time, and then compare our segmentation approach with k-means clustering.

Features Extraction
Like any statistical approach to segmenting textured images, our approach starts by selecting a set of features that characterize the local texture. In our study, we use three samples of observations: the first consists of Haralick features extracted from GLCM matrices, the second of fractal features calculated by the differential box counting method, and the third of wavelet features extracted using the wavelet transform.
These three extraction methods are based on the same principle, as shown in Figure 1. An analysis window is assigned to each pixel of the image, and a vector is extracted from this window to characterize the local textural information around the pixel. The choice of the window size is very important for the quality of the extracted information: a window that is too small is likely to lose a lot of textural information, while a window that is too large reduces the accuracy with which the borders between the different regions of the image are located.

Haralick Features
The gray-level co-occurrence matrix (GLCM) is a statistical method for examining texture proposed by Haralick [1][2]. Unlike first-order methods, the GLCM is a second-order method that considers the relationship between the values of neighboring pixels. It is one of the most widely used feature extraction methods in texture analysis: it is simple to implement and gives good results for most types of images.
A gray-level co-occurrence matrix $G_{d}(i,j)$ is a square matrix that represents the relative frequencies with which two pixels separated by a displacement $d = (\Delta x, \Delta y)$ occur in the image $I$, the first with gray tone $i$ and the second with gray tone $j$. Mathematically, we can define the GLCM as:
$$G_{d}(i,j) = \frac{1}{N_d}\,\#\{(x,y) : I(x,y) = i,\ I(x+\Delta x,\, y+\Delta y) = j\}$$
where $N_d$ is the number of pixel pairs of the image separated by $d$. Choosing a short distance between the two pixels generally gives good results, so in our study we chose a distance $d$ of one pixel.
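As an illustration of the definition above, the following minimal sketch counts the co-occurring pairs for one displacement and normalizes the counts into relative frequencies. The function name `glcm`, its parameters and the small 4-level test image are assumptions made for this example; they are not taken from our implementation.

```python
def glcm(image, dx, dy, levels):
    """Return the normalized co-occurrence matrix for displacement (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    pairs = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            # Only count pairs whose second pixel is still inside the image.
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y][x]][image[y2][x2]] += 1
                pairs += 1
    if pairs == 0:  # displacement larger than the image: nothing to normalize
        return counts
    return [[c / pairs for c in row] for row in counts]


# Hypothetical 4x4 image with 4 gray levels, horizontal neighbor at distance 1.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
P = glcm(image, dx=1, dy=0, levels=4)
```

Here `P[i][j]` is the fraction of horizontal pixel pairs whose left pixel has gray level `i` and whose right pixel has gray level `j`. This sketch does not make the matrix symmetric; whether to count each pair in both directions is a design choice left open here.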
The large amount of information in a GLCM makes it difficult to process directly. Thus, instead of using the GLCM itself, we calculate some of the fourteen features defined by Haralick. According to the experimental results, we use the five least correlated Haralick coefficients for texture analysis: Homogeneity ($f_1$), Energy ($f_2$), Entropy ($f_3$), Contrast ($f_4$) and Correlation ($f_5$). In the following, $p(i,j)$ denotes the entry $(i,j)$ of the normalized GLCM.
• The Homogeneity coefficient ($f_1$) measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal; its value is high if the texture has more homogeneous regions:
$$f_1 = \sum_{i}\sum_{j} \frac{p(i,j)}{1 + |i-j|}$$
• The Energy coefficient ($f_2$) is the sum of squared elements in the GLCM; it reaches high values when the distribution of the gray levels is either constant or periodic:
$$f_2 = \sum_{i}\sum_{j} p(i,j)^2$$
• The Entropy coefficient ($f_3$) measures the complexity of the image; it reaches high values when the values of the co-occurrence matrix are all almost equal, and low values when a few entries dominate the matrix, as in a very regular texture:
$$f_3 = -\sum_{i}\sum_{j} p(i,j)\,\log p(i,j)$$
• The Contrast coefficient ($f_4$) measures the local variations; if these variations are large, the contrast reaches high values:
$$f_4 = \sum_{i}\sum_{j} (i-j)^2\, p(i,j)$$
• The Correlation coefficient ($f_5$) measures the linear dependence of gray-level values in the GLCM; the value of this coefficient is high when the values are more uniformly distributed in the co-occurrence matrix:
$$f_5 = \sum_{i}\sum_{j} \frac{(i-\mu_i)(j-\mu_j)\, p(i,j)}{\sigma_i\,\sigma_j}$$
where $\mu_i$, $\mu_j$ and $\sigma_i$, $\sigma_j$ are the means and standard deviations of the row and column marginal distributions of $p$.
iJES, Vol. 6, No. 1, 2018
The process of extracting Haralick features from an image $I$ is defined by the following algorithm:
1. Browse the image pixel by pixel with a sliding window $W$ of size $n \times n$; the current pixel $I(x,y)$ is the center pixel of $W(x,y)$.
2. Compute the GLCM of the current window $W(x,y)$ in the four directions $\theta \in \{0°, 45°, 90°, 135°\}$.
3. Extract the five Haralick features from each directional GLCM.
4. Compute the mean of these features over the four directions, so that each pixel $(x,y)$ has a vector $V_H(x,y) = (\bar{f}_1, \bar{f}_2, \bar{f}_3, \bar{f}_4, \bar{f}_5)$.
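Steps 3 and 4 of the algorithm above can be sketched as follows. This is a minimal illustration that assumes the common textbook forms of the five coefficients (in particular $1/(1+|i-j|)$ for homogeneity and the natural logarithm for entropy); the function names and the two toy matrices are hypothetical.

```python
import math


def haralick_features(p):
    """Five features of a normalized GLCM p (rows of a matrix summing to 1)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    spread = math.sqrt(var_i * var_j)
    # Correlation is undefined for a constant window; 0.0 is a convention chosen here.
    correlation = cov / spread if spread > 0 else 0.0
    return [homogeneity, energy, entropy, contrast, correlation]


def mean_features(glcms):
    """Average the feature vectors of several directional GLCMs."""
    vectors = [haralick_features(p) for p in glcms]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


diagonal = [[0.5, 0.0], [0.0, 0.5]]      # paired pixels always share a gray level
off_diagonal = [[0.0, 0.5], [0.5, 0.0]]  # paired pixels always differ by one level
features = mean_features([diagonal, off_diagonal])
```

In a full extraction, `mean_features` would receive the four directional GLCMs of the analysis window centered on each pixel; only two toy matrices are used here to keep the arithmetic easy to follow.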

Fractal Features
Fractal geometry appeared in the 1970s and introduced new concepts for understanding certain complex phenomena. The concept of fractals has many fields of application, including image analysis.
In image analysis, fractal geometry is most often used through the concept of fractal dimension ($FD$). Many methods exist to estimate this dimension; in our study, we chose the differential box counting method, as it can be computed automatically and can be applied to patterns with or without self-similarity.
The differential box counting method [3][4] consists in partitioning the image space into boxes of different sizes $r$. For each box size, the quantity $N(r)$ is calculated from the difference between the maximum and minimum gray levels in each box. The fractal dimension is then estimated using the equation:
$$FD = \lim_{r \to 0} \frac{\log N(r)}{\log(1/r)}$$
To compute the fractal dimension of a pixel $(x,y)$ of the image $I$, we apply the method locally, on the analysis window centered on this pixel, and plot $\log N(r)$ against $\log(1/r)$ for several box sizes. The fractal dimension is obtained by linear regression, as the slope of this line fit.
In this paper, we use the differential box counting method to extract different features from the textured image. We use not only the original image ($I_1$) but also four derived images:
• the high gray-valued image ($I_2$)
• the low gray-valued image ($I_3$)
• the horizontally smoothed image ($I_4$)
• the vertically smoothed image ($I_5$)
Finally, we have for each pixel a vector of five fractal features, one fractal dimension per image.
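The estimation of the fractal dimension can be sketched as below. The sketch assumes the usual formulation of differential box counting, in which a block of side $s$ in an image of side $M$ with $G$ gray levels is covered by boxes of height $sG/M$, and each block contributes the number of boxes between its minimum and maximum gray levels. All function names, the box sizes and the two test images are assumptions for illustration.

```python
import math


def box_count(image, s, gray_levels=256):
    """N(r) for boxes of side s on a square image (r = s / image side)."""
    m = len(image)
    height = s * gray_levels / m  # box height in gray-level units
    total = 0
    for top in range(0, m, s):
        for left in range(0, m, s):
            block = [image[y][x]
                     for y in range(top, min(top + s, m))
                     for x in range(left, min(left + s, m))]
            low = int(min(block) // height)   # box holding the minimum
            high = int(max(block) // height)  # box holding the maximum
            total += high - low + 1
    return total


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def fractal_dimension(image, sizes, gray_levels=256):
    """Slope of log N(r) against log(1/r) over the given box sizes."""
    m = len(image)
    xs = [math.log(m / s) for s in sizes]  # log(1/r)
    ys = [math.log(box_count(image, s, gray_levels)) for s in sizes]
    return fit_slope(xs, ys)


flat = [[100] * 16 for _ in range(16)]                                  # smooth surface
rough = [[255 * ((x + y) % 2) for x in range(16)] for y in range(16)]   # pixel checkerboard
fd_flat = fractal_dimension(flat, sizes=[2, 4, 8])
fd_rough = fractal_dimension(rough, sizes=[2, 4, 8])
```

The two test images are the extreme cases: a constant window behaves like a plane (dimension 2), while a checkerboard alternating between 0 and 255 fills the whole gray-level volume (dimension 3). Natural textures fall between these values.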

Wavelet Features
The wavelet transform is a powerful tool widely used in various image processing fields. Its main advantage is that, unlike the Fourier transform, wavelets are localized in both time and frequency.
The input signal can be decomposed using a series of elementary wavelet functions $\psi_{a,b}(t)$, created by translation and dilation of a mother wavelet $\psi$.
According to the earlier work of Grossmann and Morlet [5], the wavelet coefficients $C_{a,b}$ resulting from this transform contain information about the studied signal $f(t)$ at different scales.
To facilitate the implementation of the discrete wavelet transform (DWT), Mallat proposed a fast wavelet decomposition and reconstruction algorithm based on a filter bank made of a low-pass filter (LP) and a high-pass filter (HP) [6]. The input signal passes through the high-pass filter and the low-pass filter, and the filtering is repeated on each sub-band after a sub-sampling operation, as shown in Figure 2.
Wavelet decomposition can be extended to higher dimensions, including two-dimensional signals. In this case, the DWT is applied first row by row, then column by column, and four images are generated at each level. Figure 3 shows an example of a three-level decomposition of an image.
To extract the wavelet attributes of each pixel of the image, the DWT algorithm is applied to an analysis window representing the local texture [7]. A decomposition of $n$ levels yields $3n+1$ feature channels, which give a feature vector $V$ for each pixel. To reduce the size of the vector $V$, we replace each feature channel of each scale by its energy. Ultimately, each pixel of the image has a vector $V_W$ of $3n+1$ energies that characterizes the local texture in its vicinity.
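The following sketch illustrates how a decomposition of $n$ levels leads to $3n+1$ energy features for one analysis window. For brevity it uses the Haar wavelet rather than the Daubechies family used in our experiments, and it takes the energy of a channel to be the mean of its squared coefficients; both choices, like the function names, are assumptions of this example.

```python
def haar_step(block):
    """One 2-D Haar level on a square block with an even side.

    Returns the approximation and three detail sub-bands.
    """
    half = len(block) // 2
    approx = [[0.0] * half for _ in range(half)]
    col_diff = [[0.0] * half for _ in range(half)]
    row_diff = [[0.0] * half for _ in range(half)]
    diag = [[0.0] * half for _ in range(half)]
    for y in range(half):
        for x in range(half):
            a, b = block[2 * y][2 * x], block[2 * y][2 * x + 1]
            c, d = block[2 * y + 1][2 * x], block[2 * y + 1][2 * x + 1]
            approx[y][x] = (a + b + c + d) / 2
            col_diff[y][x] = (a - b + c - d) / 2  # difference between columns
            row_diff[y][x] = (a + b - c - d) / 2  # difference between rows
            diag[y][x] = (a - b - c + d) / 2      # diagonal difference
    return approx, col_diff, row_diff, diag


def energy(channel):
    """Mean of the squared coefficients of one channel (assumed definition)."""
    count = len(channel) * len(channel[0])
    return sum(v * v for row in channel for v in row) / count


def wavelet_features(window, levels):
    """3 * levels + 1 energies; the window side must be divisible by 2**levels."""
    features = []
    approx = window
    for _ in range(levels):
        approx, col_diff, row_diff, diag = haar_step(approx)
        features += [energy(col_diff), energy(row_diff), energy(diag)]
    features.append(energy(approx))
    return features


flat = [[1.0] * 4 for _ in range(4)]
stripes = [[0.0, 2.0, 0.0, 2.0] for _ in range(4)]  # vertical stripes
flat_features = wavelet_features(flat, levels=2)
stripe_features = wavelet_features(stripes, levels=1)
```

A flat window has energy only in the final approximation channel, whereas vertical stripes also put energy in the channel that measures differences between columns, which is what makes these energies usable as texture descriptors.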

Segmentation
After extracting the texture attributes of the image, we use our unsupervised classification approach to classify the cloud of observations formed by the extracted features. Our segmentation approach is composed of four main modules, as shown in Figure 4. In the first module, the textural feature vectors are used in the learning phase of the Kohonen self-organizing feature map. In the second module, this map is represented using the underlying probability density function (PDF). In the third module, the watershed transformation is applied to identify the homogeneous regions in the PDF representation. In the last module, the image is segmented by assigning each pixel to one of the extracted homogeneous regions.

Kohonen map
In 1984 Kohonen designed a neural network known as the self-organizing Kohonen map [8]. Using a learning rule similar to the cortex interaction laws, he was able to show the same phenomenon of self-organization. The Kohonen network has been successfully applied in various fields, such as speech recognition, robotics or image processing.
The architecture of the Kohonen network differs from that of competitive learning network in the organization of neurons in the output layer.
In his model, Kohonen takes into account the physical position of the neurons within the network, and defines different types of topological organization of the neurons within a layer. The neurons are thus ordered on a grid forming a map, and each neuron is associated with a pair of position indexes on the grid, as shown in Figure 5.
Let $X = \{X_1, X_2, \dots, X_Q\}$ be a sample of $Q$ observations in an $N$-dimensional space, such that $X_q = (x_{q,1}, x_{q,2}, \dots, x_{q,N})^T$. The Kohonen network is made of two layers. The first is the input layer, composed of the $N$ attributes of the observation $X_q$. The second is the output layer, composed of $M$ neural units regularly distributed on the map, which build prototypes of the data.
The neural units of the first layer are connected to the units of the second layer. Each interconnection from an input unit $i$ to an output unit $j$ has a weight $w_{j,i}$, so that each output unit $j$ has a corresponding weight vector $W_j = (w_{j,1}, w_{j,2}, \dots, w_{j,N})^T$.
The steps of the learning algorithm are:
1. Initialize the weights of the neurons in the Kohonen map layer with small random values.
2. Present an input vector $X_q$.
3. Find the winning node $j^*$ using the Euclidean distance between the vector $X_q$ and the weight vectors of the nodes of the output layer.
4. Update the weights $W_j$ of the winning node, as well as those of the nodes in its neighborhood $V_{j^*}(t)$, using equation 19, where $\alpha(t)$ is the learning rate:
$$W_j(t+1) = W_j(t) + \alpha(t)\,[X_q - W_j(t)], \quad j \in V_{j^*}(t) \qquad (19)$$
5. Decrease the size of the neighborhood of the winning node.
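The five learning steps can be sketched as follows. This is a minimal illustration, not our implementation: it assumes a Gaussian neighborhood function and a linear decay of both the learning rate and the neighborhood radius, and the parameter names (`epochs`, `lr0`, `seed`) and sample data are hypothetical.

```python
import math
import random


def train_som(samples, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a rows x cols Kohonen map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Step 1: small random initial weights, one vector per map unit.
    weights = {(r, c): [rng.uniform(0.0, 0.1) for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2
    steps = epochs * len(samples)
    t = 0
    for _ in range(epochs):
        for x in samples:  # Step 2: present an input vector.
            decay = 1 - t / steps
            lr = lr0 * decay
            radius = max(radius0 * decay, 0.5)  # Step 5: shrinking neighborhood.
            # Step 3: the winner is the unit closest to x (Euclidean distance).
            winner = min(weights, key=lambda u: sum(
                (a - b) ** 2 for a, b in zip(weights[u], x)))
            # Step 4: move the winner and its map neighbors toward x.
            for (r, c), w in weights.items():
                d2 = (r - winner[0]) ** 2 + (c - winner[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            t += 1
    return weights


# Hypothetical 2-D observations forming two small groups.
samples = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(samples, rows=3, cols=3)
```

Because each update moves a weight vector part of the way toward an observation, the trained prototypes stay inside the range spanned by the data and the initial weights. A fixed seed makes the result reproducible.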

Probability density function
Once the learning phase is complete, the resulting weight vectors in the multidimensional data space are used to estimate the underlying probability density function (PDF) [9]. This representation technique is based on the calculation of distances between the weight vectors. It helps to distinguish classes that are relatively far from one another; when two classes are very close, however, the distances calculated at their borders are relatively small and become undetectable.
To view the PDF, we use the nonparametric Parzen estimator, defined by:
$$\hat{p}(X) = \frac{1}{Q\,h_Q^{N}} \sum_{q=1}^{Q} K\!\left(\frac{X - X_q}{h_Q}\right)$$
where $Q$ is the number of observations $X_q$, $N$ their dimension, $K$ the kernel function and $h_Q$ the smoothing parameter. This visualization displays the Kohonen map as a digital image in which each unit of the map is represented by a gray-level pixel. However, the parameter $h_Q$ has a great effect on the quality of the estimation. If it is too large, small density maxima become undetectable. Conversely, if it is too small, the estimate contains many noise maxima. In Figure 6 we can observe that the map consists of four regions where the PDF takes high values, separated by valleys where the PDF takes low values. This figure shows the importance of the value of $h_Q$.
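A minimal sketch of the PDF representation of the map is given below: the density is evaluated at the weight vector of every unit. It assumes a Gaussian kernel, which is only one possible choice, and the names `parzen_density` and `pdf_map` as well as the sample values are hypothetical.

```python
import math


def parzen_density(point, samples, h):
    """Parzen estimate at `point` with a Gaussian kernel of width h (assumed kernel)."""
    dim = len(point)
    norm = len(samples) * (h * math.sqrt(2 * math.pi)) ** dim
    total = 0.0
    for s in samples:
        d2 = sum((p - q) ** 2 for p, q in zip(point, s))
        total += math.exp(-d2 / (2 * h * h))
    return total / norm


def pdf_map(weights, samples, h):
    """Density value for every map unit, evaluated at its weight vector."""
    return {unit: parzen_density(w, samples, h) for unit, w in weights.items()}


# Hypothetical map with one unit inside the data cloud and one far from it.
weights = {(0, 0): [0.0, 0.0], (0, 1): [5.0, 5.0]}
samples = [[0.0, 0.1], [0.1, 0.0], [-0.1, 0.0], [0.0, -0.1]]
density = pdf_map(weights, samples, h=0.5)
```

Units whose prototypes lie inside a dense group of observations receive high values, and units lying between groups receive low values. Rescaling these values to gray levels gives the image of the map described above.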

Watershed transformation
Although the PDF representation reduces the size of the set of multidimensional observations, the method cannot be considered automatic, because an analyst must still decide the number of classes present in the sample through an interactive classification. To automate this step, we use digital mathematical morphology, and specifically the watershed technique [10][11][12].
To detect the modal regions of the PDF, we first apply a numerical morphological opening to this estimate. This transformation tends to remove the tips of the peaks of the PDF while preserving the valleys that separate the modal regions. The opening is therefore useful to highlight the significant regional maxima of the PDF.
Let $f(X)$ be the filtered PDF. The proposed procedure determines the so-called catchment basins corresponding to the regional minima of the inverse function $-f(X)$, through consecutive homotopic thinning operations followed by sequential pruning operations until idempotence [12]. Figure 7 shows the watershed technique applied to a PDF representation containing four different modal regions.
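The effect of the opening can be illustrated with a minimal sketch, assuming a flat 3×3 structuring element and neighborhoods clipped at the borders of the map; both are choices made for this example, and the helper names are hypothetical.

```python
def _filter3x3(grid, pick):
    """Apply `pick` (min or max) over the 3x3 neighborhood of every cell."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            pick(grid[rr][cc]
                 for rr in range(max(r - 1, 0), min(r + 2, rows))
                 for cc in range(max(c - 1, 0), min(c + 2, cols)))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def opening(grid):
    """Gray-level opening: erosion (local min) followed by dilation (local max)."""
    return _filter3x3(_filter3x3(grid, min), max)


# An isolated one-cell peak is removed by the opening.
spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 9

# A 3x3 plateau is as large as the structuring element and is preserved.
plateau = [[5 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
           for r in range(5)]

opened_spike = opening(spike)
opened_plateau = opening(plateau)
```

Narrow peaks disappear while broad modes keep their height and extent, which is why the opening leaves only the significant regional maxima.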
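To give an intuition of what the catchment basins look like, the sketch below labels the cells of a small PDF map by steepest ascent: every cell climbs to its highest neighbor until it reaches a local maximum, and cells that reach the same maximum share a label. This is a deliberately simpler stand-in, not the homotopic thinning and pruning procedure described above; in particular it does not produce one-pixel separating lines and it treats every cell of a flat plateau as its own maximum. The function name and the test map are hypothetical.

```python
def climb_labels(pdf):
    """Label each cell by the local maximum reached through steepest ascent."""
    rows, cols = len(pdf), len(pdf[0])

    def uphill(r, c):
        # Highest strictly greater cell among the 8 neighbors, else the cell itself.
        best = (r, c)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if (0 <= rr < rows and 0 <= cc < cols
                        and pdf[rr][cc] > pdf[best[0]][best[1]]):
                    best = (rr, cc)
        return best

    peaks = {}
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            current = (r, c)
            while True:
                nxt = uphill(*current)
                if nxt == current:  # reached a local maximum
                    break
                current = nxt
            labels[r][c] = peaks.setdefault(current, len(peaks) + 1)
    return labels


# One-row map with two modes (values 5 and 4) separated by a valley (value 1).
pdf = [[1, 3, 5, 2, 1, 4, 2]]
regions = climb_labels(pdf)
```

On this map the first four cells climb to the mode of height 5 and the last three to the mode of height 4, so two modal regions are found without specifying their number in advance.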

Segmentation process
The function resulting from our watershed technique has two useful characteristics. First, its modal regions take constant values, each equal to the regional maximum of the region. Second, the line separating two neighboring modal regions is no more than one pixel thick. These two properties allow us to extract the modal regions by means of a simple numerical dilation of this function.
After extracting the modal regions of the map, we compare each pixel of the image, which is represented by its vector of textural features, with all the cells of the Kohonen map through the Euclidean distance. Finally, we assign to the pixel the label of the closest modal region.
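The assignment step can be sketched as follows, assuming that each map unit carries the label of its modal region and that units lying on a separating line carry the label 0 and are ignored. The names `assign_pixels`, `weights` and `unit_region` and the values used are hypothetical.

```python
def assign_pixels(pixel_features, weights, unit_region):
    """Give each pixel the region label of its nearest labeled map unit."""
    labeled = [u for u in weights if unit_region.get(u, 0) != 0]
    result = []
    for f in pixel_features:
        # Squared Euclidean distance gives the same nearest unit as the distance.
        nearest = min(labeled, key=lambda u: sum(
            (a - b) ** 2 for a, b in zip(weights[u], f)))
        result.append(unit_region[nearest])
    return result


# Three map units; the middle one lies on the line between regions 1 and 2.
weights = {(0, 0): [0.0, 0.0], (0, 1): [0.5, 0.5], (0, 2): [1.0, 1.0]}
unit_region = {(0, 0): 1, (0, 1): 0, (0, 2): 2}
pixels = [[0.1, 0.0], [0.45, 0.5], [0.9, 1.0]]
labels = assign_pixels(pixels, weights, unit_region)
```

The second pixel is closest to the unlabeled middle unit, so it falls back to the nearest unit that belongs to a modal region.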

Experimental Results and Evaluations
Our segmentation approach is tested on three images generated by combining several different textures, as shown in Figure 8. The three images share the same size and have 256 gray levels. The first image is composed of four different synthetic textures created with image manipulation software. The second image is a combination of two natural textures from the Brodatz album [13], combined using an irregular shape. The last image is composed of five Brodatz textures of different types. All these textures had their histograms equalized and are therefore indistinguishable on the basis of their gray level or their first-order statistics alone. To extract the five Haralick features from these images, we searched for the optimal size of the analysis window; we did the same for the other two extraction methods, the five fractal features and the wavelet features. For the latter method, Daubechies wavelets [14] gave the best segmentation results and the best speed, which led us to use this wavelet family.
After extracting the feature vectors of each image, we classified them using our segmentation approach. Figure 9 shows the visual results of the segmentation of our three textured images with the three feature extraction methods. To evaluate the segmentation results correctly, the original images were manually segmented to obtain labeled ground truth images, which were then compared with our results to obtain a segmentation error rate. In addition, we compared our results with the segmentation obtained by the k-means clustering method [15].
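A possible way to compute such an error rate is sketched below. Because the labels produced by an unsupervised method are arbitrary, the sketch first maps every predicted region to the ground truth label that is most frequent inside it; this majority mapping is an assumption of the example and not necessarily the matching rule used for Table 1.

```python
from collections import Counter


def error_rate(predicted, truth):
    """Fraction of pixels whose region, after majority mapping, differs from truth."""
    by_region = {}
    for p, t in zip(predicted, truth):
        by_region.setdefault(p, Counter())[t] += 1
    # Pixels that agree with the majority ground truth label of their region.
    correct = sum(counts.most_common(1)[0][1] for counts in by_region.values())
    return 1 - correct / len(truth)


# Flattened label images (hypothetical values).
predicted = [1, 1, 1, 2, 2, 2]
truth = ["a", "a", "b", "b", "b", "b"]
rate = error_rate(predicted, truth)
```

One caveat of this mapping is that it favors over-segmentation: if every pixel were its own region, the error would be zero. It should therefore be read together with the number of regions found.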
All the modules of our segmentation approach were programmed using the C++ library OpenCV 2, and all the results were obtained on a computer with an Intel Core i5-2540M processor at 2.60 GHz and 4 GB of RAM. Table 1 presents the segmentation results obtained for our three images, and the computing time of each extraction approach. The results for the first image show that the Haralick method is more effective for synthetic images than the other methods. In terms of classification rate, however, the results for the second and third images indicate that fractal attributes are more suitable for images with natural textures.
One of the major flaws of the method based on Haralick features is its long extraction time: because it computes co-occurrence matrices for each pixel of the image, it requires more computational resources than the other methods. Furthermore, the segmentation results for the third image show that, unlike the fractal method, Haralick attributes are not well suited to macro textures.
Unlike the k-means approach, which requires the number of classes k to be fixed, our segmentation approach segments the image without knowing the number of regions in advance, thanks to its combination of a Kohonen neural network and mathematical morphology. Besides this advantage, the results show that our approach achieves a higher classification rate than the k-means approach.

Conclusion
In this paper, we presented a comparative study of three different texture feature extraction methods, using our textured image segmentation approach, which is based on applying the morphological watershed transformation to a Kohonen map.
The objective of this study was to characterize the textures present in textured images using these three extraction methods: the first is based on the GLCM, the second uses the local fractal dimension, and the last is based on wavelet features. Applying these three methods shows that the fractal features are the most effective for natural textures and, in addition, do not consume a great amount of computing time.
In comparison with the k-means approach, the classification rates obtained by our segmentation approach are very encouraging. This opens the perspective of applying our approach to color textured images and to 3D images.