CN102609721A

CN102609721A - Remote sensing image clustering method

Info

Publication number: CN102609721A
Application number: CN2012100223532A
Authority: CN
Inventors: 唐宏; 陈云浩; 慎利; 齐银凤
Original assignee: Beijing Normal University
Current assignee: Beijing Normal University
Priority date: 2012-02-01
Filing date: 2012-02-01
Publication date: 2012-07-25
Anticipated expiration: 2032-02-01
Also published as: CN102609721B

Abstract

The invention discloses a remote sensing image clustering method and belongs to the technical field of image analysis. The method comprises the following steps: A, determining the optimal number of clustering centers for an original image; B, obtaining a multi-scale expression of the original image through a Gaussian convolution function, and mapping the original image into its scale space to produce a multilayer document set; C, establishing, from the multilayer document set, a latent Dirichlet allocation model in which the semantics of overlapping image regions are invariant, and estimating the topic mixing proportion parameter of each document in the multilayer document set and the distribution parameters with which each topic probabilistically generates visual words; and D, obtaining the cluster category of each visual word by maximizing the posterior probability. The method avoids the computational complexity caused by generating the document set in advance, preserves the correlation between documents, and improves the efficiency of detecting geographic targets in remote sensing images.

Description

Clustering method of remote sensing images

Technical Field

The invention relates to the technical field of image analysis, in particular to a clustering method of remote sensing images.

Background

The Latent Dirichlet Allocation (LDA) model is a probabilistic topic model for text modeling proposed by Blei et al. in 2003. Expressed as a probabilistic graphical model, it models the conditional probability relations among words, documents and topics, and fully mines probabilistic semantic information at both the document level and the word level.

The first probabilistic topic model in the true sense is generally considered to be the probabilistic Latent Semantic Analysis (pLSA) model, which Hofmann constructed in 1999 on the basis of Latent Semantic Analysis (LSA). pLSA abandons the original, complex singular value decomposition, is built from the perspective of a generative model, and has been applied successfully to text analysis.

However, the pLSA model does not establish a proper probabilistic framework at the document level: all variables at this level are treated as model parameters, so every document has its own set of parameters. The number of parameters to be estimated therefore grows linearly with the number of documents, which makes the model prone to overfitting and unable to handle new documents.

As a major improvement on pLSA, the LDA model is a complete generative model. By introducing hyperparameters, it treats the topic mixing proportions of a document as the parameters of a multinomial distribution drawn from a Dirichlet prior, rather than as a set of individual parameters tied to a specific document, and thereby overcomes the overfitting problem of pLSA. A series of other probabilistic topic models have since been developed from pLSA and LDA to meet different practical requirements. Although they differ in detail, probabilistic topic models generally share a basic theoretical assumption: a document is a mixture of topics, and each topic is a probability distribution over words. Without using any supervision information, this type of model can automatically mine topic and semantic information from data, which opens up a new approach to natural language understanding based on statistical learning theory.

Because probabilistic topic models analyze the statistical correlation among documents, topics and words well, they have been applied in computer vision, pattern recognition and related fields, with many successful cases in natural image recognition, retrieval and scene analysis. The modeling objects of a probabilistic topic model also map onto the analysis objects of object-oriented high-resolution remote sensing imagery: a word corresponds to a pixel, a document to a pixel cluster of a specific pattern, and a topic to a ground object category. The problem of deciding the category of each pixel in a pixel cluster thus converts naturally into the problem of deciding the topic of each visual word in a document. The inherent characteristics of probabilistic topic models therefore match the requirements of information extraction from high-resolution remote sensing images, and, drawing on their success in natural image processing and analysis, it is feasible to apply them to remote sensing image analysis.

At present, in text modeling and in computer vision applications such as image segmentation or recognition, documents are given in advance and are assumed to be mutually independent during modeling. In remote sensing image information extraction, documents must first be generated from the given image in some way, for example as image segments or small image blocks, before a probabilistic topic model can be built. Once generated, these documents are still assumed to be independent of one another during modeling. However, to represent the spatial correlation (between pixels or features) in a remote sensing image, the documents must overlap to some degree. In other words, the documents are not independent; they are strongly correlated. Under the existing models, the same pixel in different documents may be assigned to different semantic categories by the probabilistic topic model, further post-processing is usually required to remove this ambiguity, and the number of documents increases sharply as the overlap increases. Furthermore, probabilistic topic models generally assume when analyzing text that a document is an unordered set of words, but this "bag of words" assumption ignores the spatial correlation that may exist between visual words and is therefore not reasonable for remote sensing image analysis. Integrating information analogous to word order or syntax among visual words would help to further mine spatial context information at the pixel level.

Clustering algorithms for remote sensing images can be divided, by analysis unit, into pixel-based clustering and object-based clustering. Pixel-based algorithms rely mainly on the spectral information of individual pixels and do not introduce spatial information, so clustering results for high-resolution remote sensing images often show an obvious "salt and pepper" effect, which degrades the result. In contrast, object-oriented clustering algorithms analyze image objects, that is, image patches obtained by primitives such as segmentation operators. The acquisition of image objects depends heavily on the quality of the patches produced by the segmentation algorithm, and image segmentation remains a difficult problem in image processing for which no good general-purpose algorithm currently exists. In general, many current clustering algorithms can exploit spatial information to some extent, but few algorithms that consider the semantic information among pixels have been applied to remote sensing image clustering.

In summary, conventional clustering algorithms for remote sensing images need to generate a very large document set in advance, so computational complexity and storage cost are high, the correlation between documents is poorly preserved, and geographic targets in remote sensing images are detected inefficiently.

Disclosure of Invention

(I) Technical problem to be solved

The technical problem to be solved by the invention is to provide a clustering method for remote sensing images that reduces the complexity of the clustering algorithm, reduces storage cost, preserves the correlation among documents, and improves the efficiency of detecting geographic targets in remote sensing images.

(II) Technical scheme

In order to solve the above problems, the present invention provides a method for clustering remote sensing images, comprising the following steps:

A: determining the optimal number of clustering centers of the original image;

B: obtaining a multi-scale expression of the original image through a Gaussian convolution function, and mapping the original image to its scale space to generate a multilayer document set;

C: establishing, from the multilayer document set, a latent Dirichlet allocation model in which the semantics of overlapping image regions are invariant, and estimating the mixing proportion parameter of the topics in each document in the multilayer document set and the distribution parameter with which each topic probabilistically generates visual words;

D: obtaining the cluster category of each visual word by maximizing the posterior probability.

Wherein step A further comprises: assuming, according to the minimum description length (MDL) criterion, that the features of the original image follow a Gaussian mixture distribution, and using the relation between the MDL value of the original image and the number of clustering centers to obtain, as the optimal number, the number of clustering centers at which the MDL value of the image is minimal.

Wherein, in step B, obtaining the multi-scale expression of the original image through a Gaussian convolution function further comprises: obtaining the multi-scale expression of the original image by convolving a scale-variable Gaussian function with the original image.

Wherein, in step C, establishing the latent Dirichlet allocation model in which the semantics of overlapping image regions are invariant further comprises: establishing a latent Dirichlet allocation model from the multilayer document set to generate observation words, and constructing the multilayer document set composed of the observation words such that the same pixel belonging to different documents is assigned the same topic.

The latent Dirichlet allocation model is established from the multilayer document set to generate observation words as follows. For the multilayer document set, assume the following generative process:

1) Sample the visual-word distributions $(\phi_k)_s$ (an N × K × S array) of the K topics at each scale layer from the Dirichlet distribution $p(\phi_k \mid \beta_s)$ with parameter $\beta_s$;

2) Scale sampling: for the t-th pixel, sample its scale coordinate index $s_t$ from the prior distribution $p(s_t \mid \gamma)$, indicating that the pixel is assigned a topic from the $s_t$-th layer of the scale space;

3) Document sampling: for the t-th pixel, sample its document index $d_t$ from the prior distribution $p(d_t \mid \sigma, h)$;

4) Topic sampling: for the t-th pixel, sample its topic class $z_t$ from the multinomial distribution $p(z_t \mid \theta_{d_t}^{s_t})$, where $\theta_{d_t}^{s_t}$ is the mixing proportion coefficient of the sampled document $d_t$ at scale $s_t$;

5) Visual word sampling: the visual word corresponding to the t-th pixel is sampled from the discrete distribution of the topic $z_t$.

Wherein, in step C, estimating the mixing proportion parameter of the topics in each document in the multilayer document set and the distribution parameter with which each topic probabilistically generates visual words further comprises: estimating these two parameters by approximate inference using Gibbs sampling.

(III) Advantageous effects

The invention is based on the Latent Dirichlet Allocation (LDA) model and adopts an implicit document generation scheme, so a very large document set need not be generated in advance. The invention can therefore reduce the complexity of the clustering algorithm, reduce storage cost, and detect geographic targets in high-resolution remote sensing images effectively. The method preserves the correlation among documents, so the neighborhood spatial relations among documents can be taken into account. By adding the constraint that the semantics of overlapping image regions are invariant, the invention ensures that a pixel belonging to different documents is always assigned one and the same topic class, thereby fusing image clustering and model inference into a unified framework. In addition, the multi-scale expression takes further account of the spatial relations between adjacent pixels, so the clustering result of the image shows a pronounced object-oriented character.

Drawings

FIG. 1 is a flowchart of a clustering method for remote sensing images according to an embodiment of the present invention;

FIG. 2a is the original urban QUICKBIRD image, and FIG. 2b is a schematic diagram of the optimal number of clustering centers detected for the urban QUICKBIRD image using the MDL criterion;

FIG. 3a is the original rural EROS-B image in the embodiment of the present invention, and FIG. 3b is a schematic diagram of the optimal number of clustering centers detected for the rural EROS-B image using the MDL criterion;

FIG. 4a is the original suburban QUICKBIRD image in the embodiment of the present invention, and FIG. 4b is a schematic diagram of the optimal number of clustering centers detected for the suburban QUICKBIRD image using the MDL criterion;

FIG. 5 is a schematic diagram of the probabilistic graphical model of the msLDA model;

FIG. 6 is a diagram of the results of the quantitative comparison of different clustering methods on the urban QUICKBIRD image according to an embodiment of the present invention, where a is the ground-truth distribution of ground features, b is the K-means clustering result, c is the traditional LDA clustering result, and d is the msLDA clustering result;

FIG. 7 is a diagram comparing the generalized overall entropy of each category in the results of different clustering methods on the urban QUICKBIRD image according to an embodiment of the present invention;

FIG. 8 is a diagram of the results of the quantitative comparison of different clustering methods on the rural EROS-B image according to an embodiment of the present invention, where a is the ground-truth distribution of ground features, b is the K-means clustering result, c is the traditional LDA clustering result, and d is the msLDA clustering result;

FIG. 9 is a diagram comparing the generalized overall entropy of each category in the results of different clustering methods on the rural EROS-B image according to an embodiment of the present invention;

FIG. 10 is a diagram of the results of the quantitative comparison of different clustering methods on the suburban QUICKBIRD image according to an embodiment of the present invention, where a is the ground-truth distribution of ground features, b is the K-means clustering result, c is the traditional LDA clustering result, and d is the msLDA clustering result;

FIG. 11 is a diagram comparing the generalized overall entropy of each category in the results of different clustering methods on the suburban QUICKBIRD image according to an embodiment of the present invention.

Detailed Description

The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.

As shown in fig. 1, the method for clustering remote sensing images according to the present invention comprises the following steps:

Step A: determine the optimal number of clustering centers of the original image.

In this step, the features of the original image are assumed, according to the minimum description length (MDL) criterion, to follow a Gaussian mixture distribution, and the relation between the MDL value of the original image and the number of clustering centers is used to obtain the number of clustering centers at which the MDL value of the image is minimal.

For example, the present invention uses three original images, shown in fig. 2a, 3a and 4a. The curves relating the MDL value of each image to the number of clustering centers are shown in fig. 2b, 3b and 4b.

As can be seen from fig. 2-4, when the number of clustering centers is set to 7, the MDL values of all three images reach their minimum, so the complexity of the clustered images is minimal. Therefore, for these three high-resolution remote sensing images, the optimal number of clustering centers selected by the MDL criterion is 7.
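The MDL-based selection described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of the invention: it assumes one-dimensional gray-level features, fits the Gaussian mixture with a plain EM loop, and uses the common two-part score MDL = -log L + (p/2) log N with p = 3K - 1 free parameters. The function names `fit_gmm_1d`, `mdl_score` and `select_num_clusters`, and the synthetic data, are hypothetical.

```python
import math
import random


def fit_gmm_1d(data, k, iters=50, var_floor=1e-3):
    """Fit a 1-D Gaussian mixture with k components by EM; return the log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    mean_all = sum(data) / n
    var_all = max(sum((x - mean_all) ** 2 for x in data) / n, var_floor)
    variances = [var_all] * k
    weights = [1.0 / k] * k

    def density(x, j):
        v = variances[j]
        return weights[j] * math.exp(-(x - means[j]) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [density(x, j) for j in range(k)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances in place.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, var_floor)

    return sum(math.log(sum(density(x, j) for j in range(k)) or 1e-300) for x in data)


def mdl_score(log_likelihood, k, n):
    """Assumed two-part MDL: negative log-likelihood plus a model-size penalty."""
    n_params = 3 * k - 1  # k means + k variances + (k - 1) free weights
    return -log_likelihood + 0.5 * n_params * math.log(n)


def select_num_clusters(data, candidates):
    """Return the candidate k with the smallest MDL score, plus all scores."""
    scores = {k: mdl_score(fit_gmm_1d(data, k), k, len(data)) for k in candidates}
    return min(scores, key=scores.get), scores


# Synthetic gray levels drawn from two well-separated groups (illustrative only).
rng = random.Random(0)
pixels = [rng.gauss(50, 5) for _ in range(150)] + [rng.gauss(150, 5) for _ in range(150)]
best_k, scores = select_num_clusters(pixels, range(1, 5))
print(best_k, scores)
```

Plotting `scores` against the candidate numbers gives a curve of the kind shown in fig. 2b, 3b and 4b; the minimum of that curve is the selected number of clustering centers.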

Step B: obtain a multi-scale expression of the original image through a Gaussian convolution function, and map the original image to its scale space to generate a multilayer document set.

In this step, the multi-scale expression of the original image is obtained by convolving a scale-variable Gaussian function with the original image.

The multi-scale features of the image data can be modeled by mapping the original image to its scale space. The Gaussian convolution kernel is the only linear kernel that implements the scale transformation, so the scale space of a two-dimensional image I(x, y) can be defined as:

$$L(x, y, \delta) = G(x, y, \delta) * I(x, y)$$

where

$$G(x, y, \delta) = \frac{1}{2\pi\delta^2} e^{-(x^2 + y^2)/(2\delta^2)}$$

is a Gaussian function with variable scale, (x, y) are the spatial coordinates, δ is the scale coordinate, and * denotes the convolution operation.

Therefore, given a specific scale coordinate δ, the convolved image of the corresponding scale can be generated and the document set of that layer constructed. If S-1 levels of scale expression are considered, an S-layer document set is obtained.
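The construction of the scale space can be illustrated with a short sketch. It assumes a separable Gaussian kernel truncated at three scale units and replicated image borders; neither choice is specified in the text, and the function names are hypothetical. The original image is kept as the first layer, so S-1 scale coordinates yield S layers.

```python
import math


def gaussian_kernel(delta):
    """Normalised 1-D Gaussian kernel truncated at 3*delta (truncation is an assumption)."""
    radius = max(1, int(math.ceil(3 * delta)))
    weights = [math.exp(-(i * i) / (2 * delta * delta)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(kernel[j + radius] * row[min(max(x + j, 0), width - 1)]
                for j in range(-radius, radius + 1))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, delta):
    """L(x, y, delta) = G(x, y, delta) * I(x, y), using the separability of the 2-D Gaussian."""
    kernel = gaussian_kernel(delta)
    blurred_rows = convolve_rows(image, kernel)
    return transpose(convolve_rows(transpose(blurred_rows), kernel))


def scale_space(image, deltas):
    """Return the original image followed by one blurred layer per scale coordinate."""
    return [image] + [gaussian_blur(image, d) for d in deltas]


# A single bright pixel spreads out more as the scale coordinate grows.
image = [[0.0] * 7 for _ in range(7)]
image[3][3] = 1.0
layers = scale_space(image, deltas=[1.0, 2.0])
print(len(layers), layers[1][3][3], layers[2][3][3])
```

Each layer has the same size as the original image, so a pixel position indexes the same location in every layer of the document set.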

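Step C: establish, from the multilayer document set, a latent Dirichlet allocation model in which the semantics of overlapping image regions are invariant, and estimate the model parameters.

In this step, a latent Dirichlet allocation model is established from the multilayer document set to generate observation words, and the multilayer document set composed of the observation words is constructed such that the same pixel belonging to different documents is assigned the same topic.

When the latent Dirichlet allocation model is established from the multilayer document set to generate observation words, the following generative process is assumed for the multilayer document set:

1) Sample the visual-word distributions $(\phi_k)_s$ (an N × K × S array) of the K topics at each scale layer from the Dirichlet distribution $p(\phi_k \mid \beta_s)$ with parameter $\beta_s$;

2) Scale sampling: for the t-th pixel, sample its scale coordinate index $s_t$ from the prior distribution $p(s_t \mid \gamma)$, indicating that the pixel is assigned a topic from the $s_t$-th layer of the scale space;

3) Document sampling: for the t-th pixel, sample its document index $d_t$ from the prior distribution $p(d_t \mid \sigma, h)$;

4) Topic sampling: for the t-th pixel, sample its topic class $z_t$ from the multinomial distribution $p(z_t \mid \theta_{d_t}^{s_t})$, where $\theta_{d_t}^{s_t}$ is the mixing proportion coefficient of the sampled document $d_t$ at scale $s_t$;

5) Visual word sampling: the visual word corresponding to the t-th pixel is sampled from the discrete distribution of the topic $z_t$.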
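The generative process can be sketched as a forward sampler. The forms of the scale prior and the document prior are not given in the text, so uniform distributions are used below purely as stand-ins; a Dirichlet prior with parameter `alpha` on the topic mixtures is assumed as in standard LDA. All function names and numeric values are hypothetical.

```python
import random


def sample_categorical(probs, rng):
    """Draw an index from a discrete distribution given as a list of probabilities."""
    u, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if u < acc:
            return i
    return len(probs) - 1


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_pixels(n_pixels, n_topics, n_words, n_scales, n_docs, alpha, beta, seed=0):
    rng = random.Random(seed)
    # Visual-word distribution of every topic at every scale: phi[s][k].
    phi = [[sample_dirichlet([beta] * n_words, rng) for _ in range(n_topics)]
           for _ in range(n_scales)]
    # Topic mixture of every document at every scale: theta[d][s] (assumed Dirichlet prior).
    theta = [[sample_dirichlet([alpha] * n_topics, rng) for _ in range(n_scales)]
             for _ in range(n_docs)]
    scale_prior = [1.0 / n_scales] * n_scales  # uniform stand-in for p(s_t | gamma)
    doc_prior = [1.0 / n_docs] * n_docs        # uniform stand-in for p(d_t | sigma, h)
    pixels = []
    for _ in range(n_pixels):
        s = sample_categorical(scale_prior, rng)  # scale sampling
        d = sample_categorical(doc_prior, rng)    # document sampling
        z = sample_categorical(theta[d][s], rng)  # topic sampling
        w = sample_categorical(phi[s][z], rng)    # visual-word sampling
        pixels.append((s, d, z, w))
    return pixels


samples = generate_pixels(n_pixels=10, n_topics=7, n_words=16, n_scales=3,
                          n_docs=4, alpha=50 / 7, beta=1.0)
print(samples[:3])
```

Inference runs this process in reverse: the visual words are observed, and the scale, document and topic indices are the latent variables to be sampled.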

The mixing proportion parameter of the topics in each document in the multilayer document set and the distribution parameter with which each topic probabilistically generates visual words are estimated by approximate inference using Gibbs sampling.

Given a scale coordinate index and document index for each pixel, the joint probability of the topic class and the visual word can be approximated by a Gibbs sampling method, i.e.

$$p(w_t = j, z_t = k \mid d_t = i, s_t = s, \vec{\alpha}, \vec{\beta}) = \frac{n_i^{(k)} + \alpha_k}{\sum_{k=1}^{K} \left( n_i^{(k)} + \alpha_k \right)} \cdot \frac{n_k^{(j)} + \beta_j}{\sum_{j=1}^{V} \left( n_i^{(j)} + \beta_j \right)}$$

Thus, the mixing proportion parameter Θ (a K × M matrix) of the topics in each document and the distribution parameter Φ (an N × K × S matrix) with which each topic probabilistically generates visual words can be obtained from the formula above.
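As an illustration, the two factors of the joint probability can be computed from count tables as follows. The sketch treats a single scale layer and normalizes each topic's word counts over that topic, which is the usual LDA convention and an assumption about the intended reading of the second factor. The count tables and function names are hypothetical.

```python
def estimate_theta(doc_topic_counts, alpha):
    """theta[i][k] = (n_i^(k) + alpha_k) / sum_k (n_i^(k) + alpha_k)."""
    theta = []
    for counts in doc_topic_counts:
        denom = sum(c + a for c, a in zip(counts, alpha))
        theta.append([(c + a) / denom for c, a in zip(counts, alpha)])
    return theta


def estimate_phi(topic_word_counts, beta):
    """phi[k][j] = (n_k^(j) + beta_j) / sum_j (n_k^(j) + beta_j), for one scale layer."""
    phi = []
    for counts in topic_word_counts:
        denom = sum(c + b for c, b in zip(counts, beta))
        phi.append([(c + b) / denom for c, b in zip(counts, beta)])
    return phi


def joint_probability(theta, phi, i, k, j):
    """p(w = j, z = k | d = i) as the product of the two smoothed count ratios."""
    return theta[i][k] * phi[k][j]


# Hypothetical counts: 2 documents x 2 topics, and 2 topics x 3 visual words.
doc_topic_counts = [[3, 1], [0, 4]]
topic_word_counts = [[2, 0, 2], [1, 3, 0]]
theta = estimate_theta(doc_topic_counts, alpha=[1.0, 1.0])
phi = estimate_phi(topic_word_counts, beta=[0.5, 0.5, 0.5])
print(joint_probability(theta, phi, i=0, k=0, j=0))
```

With several scale layers, `estimate_phi` would be applied once per layer to that layer's count table, giving the N × K × S array Φ.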

The document index of each pixel can be obtained by sampling the posterior distribution shown in the following formula, i.e.

$$P(d_t \mid w_t, z_t, s_t, \vec{\alpha}, \vec{\beta}, \gamma, \sigma, h) \propto P(d_t = j \mid \sigma, h) \cdot \frac{n_{\neg i,k}^{(j)} + \alpha_k}{\sum_{\neg i,k}^{K} \left( n_{\neg i,k'}^{(j)} + \alpha_{k'} \right)}$$

The scale coordinate index for each pixel can be obtained by sampling the a posteriori distribution as shown in the following formula, i.e.

$$P(s_t \mid S_{t\neg}, D, Z, W) \propto P(s_t \mid \gamma) \, \frac{n_{t\neg, w_t^s}^{(z_t)} + \beta_{w_t^s}}{\sum_{w=1}^{N} \left( n_{t\neg, w}^{(z_t)} + \beta_w \right)}$$

According to the msLDA model, one can obtain the topic mixing proportion coefficient $\theta_{ik} = p(z = k \mid d = i)$ of the document centered on each pixel, the scale $s$ corresponding to each pixel position, and the distribution with which the topics at scale $s$ probabilistically generate visual words, $(\phi_{kj})_s = p(w = j \mid z = k, \text{scale} = s)$.

Thus, for a given pixel $w = j$ whose sampled scale is $\text{scale} = s$, the corresponding topic class is obtained by maximizing the posterior probability:

$$\text{Topic}_{w_j} = \underset{1 \le k \le K}{\arg\max} \left( p(z = k \mid d = i) \cdot p(w = j \mid z = k, \text{scale} = s) \right)$$
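The maximum a posteriori assignment amounts to an argmax over K products. A minimal sketch, with hypothetical parameter values and a hypothetical function name `map_topic`:

```python
def map_topic(theta_i, phi_s, word):
    """Return argmax_k p(z = k | d = i) * p(w = word | z = k, scale = s)."""
    scores = [theta_i[k] * phi_s[k][word] for k in range(len(theta_i))]
    return max(range(len(scores)), key=scores.__getitem__)


# theta_i: topic mixture of the document centered on the pixel (K = 3 topics).
theta_i = [0.2, 0.5, 0.3]
# phi_s[k][j]: probability that topic k generates visual word j at the sampled scale.
phi_s = [
    [0.7, 0.3],
    [0.1, 0.9],
    [0.5, 0.5],
]
print(map_topic(theta_i, phi_s, word=0))  # scores 0.14, 0.05, 0.15 -> topic 2
print(map_topic(theta_i, phi_s, word=1))  # scores 0.06, 0.45, 0.15 -> topic 1
```

The first call shows that the most probable topic of the document (topic 1) is not necessarily chosen: the word likelihood at the sampled scale can outweigh the document-level mixture.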

Clustering result analysis based on the msLDA method

With the analysis and computation flow of the msLDA method established, three high-resolution remote sensing images are used for cluster analysis to demonstrate, to a certain extent, the effectiveness of the method. The three experimental images cover urban, rural and suburban areas respectively, so the distribution of ground features in different scenes is taken into account and the images are representative and typical. In analyzing the experimental results, the clustering result of the msLDA method is compared, both qualitatively and quantitatively, with the results of K-means and of the traditional LDA method.

1.1 Qualitative evaluation method

The clustering effect is evaluated qualitatively in two ways: by visually comparing the clustering result with the ground-truth distribution of ground features, and by assessing the object-oriented character of the clustering result.

In order to better and more objectively embody the object-oriented characteristics of the clustered images, the invention adopts two landscape indexes to analyze and compare the image clustering results corresponding to the three clustering methods.

Landscape indices are generally used to quantitatively analyze data that reflect the distribution of real surface landscapes, such as maps or land use maps, and thereby characterize the composition and spatial configuration of the landscape. A landscape is the complex of land and of the spaces and objects on it, and reflects complex natural processes and human activities. In an image clustering result, the landscape consists of a series of geographic patches that correspond one-to-one to real geographic objects on the ground. Two landscape indices are used to evaluate the landscape characteristics of the image clustering results, so that differences in the object-oriented character of different clustering results are reflected more intuitively. The two selected indices are described below:

(1) Number of Patches (NP): this index describes the number of mutually separated patches formed by pixels of the different cluster categories in the clustered image. In the optimal case, the number of patches in the clustering result equals the number of real geographic objects on the ground, and the two correspond one-to-one. The number of real geographic objects on the ground is relatively fixed, so a larger number of patches means that geographic objects have been split into fragmented sub-patches, which weakens the object-oriented character of the image.

(2) Perimeter-Area Fractal Dimension (PAFRAC): this index characterizes the shape complexity of the patches and is typically greater than 1. The index increases as shape complexity increases. When patch shapes are very simple, such as squares or circles, the index takes a value of 1.
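The NP index can be illustrated by counting connected regions of equal cluster label. The sketch below assumes a 4-neighbor rule for deciding whether two pixels belong to the same patch; the text does not state which neighborhood is used, and the function name and label maps are hypothetical.

```python
from collections import deque


def number_of_patches(label_map):
    """Count 4-connected regions of identical cluster label (neighbour rule assumed)."""
    rows, cols = len(label_map), len(label_map[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            # A new, unvisited pixel starts a new patch; flood-fill its region.
            patches += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and not seen[ny][nx] and label_map[ny][nx] == label_map[y][x]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return patches


clean = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 2],
]
fragmented = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [2, 2, 2, 2],
]
print(number_of_patches(clean), number_of_patches(fragmented))  # 3 4
```

The fragmented map contains an isolated pixel of label 0, which raises NP from 3 to 4; a salt-and-pepper clustering result inflates the index in the same way.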

1.2 Quantitative evaluation method

The overall entropy is used as the quantitative evaluation index for comparing the overall clustering effect of the three clustering methods. It generally comprises two parts, the cluster entropy and the class entropy. In general, the smaller the overall entropy, the better the clustering effect. However, the overall entropy cannot fully reflect how well a clustering algorithm performs on a specific ground feature, so the present invention further introduces the generalized overall entropy.

Computing these entropy values requires the ground truth, that is, the distribution of real geographic objects on the earth's surface. $h_{ck}$ denotes the number of pixels in cluster $k$ of the clustered image that belong to class $c$ in the ground truth, and $h_c = \sum_{k=1}^{K} h_{ck}$ denotes the total number of pixels in the clustering result image that belong to class $c$ in the ground truth. Likewise, $h_{kc}$ denotes the number of pixels of ground-truth class $c$ that belong to cluster $k$ in the clustered image, and $h_k = \sum_{c=1}^{C} h_{kc}$ denotes the total number of pixels that belong to cluster $k$ in the clustering result image. $K$ is the total number of image clusters, and $C$ is the total number of classes in the ground truth. Each class in the ground truth is associated with a cluster category in the clustered image: each ground-truth class corresponds one-to-one to the cluster category that accounts for its largest proportion in the clustered image. The quality of each cluster category in the clustering result image is judged by the degree of homogeneity of that cluster category with respect to the pixels of each class in the ground truth. This degree of homogeneity is reflected by the combination of cluster entropy and class entropy, with smaller entropy values corresponding to higher homogeneity.

For class c in the ground truth image, the class entropy $E_c$ is given by the following formula

For cluster k in the clustering result image, the cluster entropy $E_k$ is given by the following formula

For a specific ground object class c, the class entropy $E_c$ and its corresponding cluster entropy $E_k$ can be combined to construct its generalized overall entropy $E_{generalized}$, calculated as follows:

$$E_{generalized} = \beta E_c + (1 - \beta) E_k$$

In the above formula, β is a weight adjustment parameter, set to 0.5 in the experiments. In general, a smaller overall entropy value corresponds to a more homogeneous clustering result.
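A small sketch shows how the generalized overall entropy might be computed from a class-by-cluster contingency table. The formulas for the class entropy and the cluster entropy are not reproduced in the text above, so the usual Shannon forms with natural logarithm are assumed here: the class entropy measures how a ground-truth class is spread over clusters, and the cluster entropy measures how many classes are mixed inside the matched cluster. The matching rule (the cluster holding most pixels of the class) and all names are assumptions.

```python
import math


def entropy(counts):
    """Shannon entropy (natural log) of a list of non-negative counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def generalized_overall_entropy(h, c, beta=0.5):
    """h[c][k] = number of pixels of ground-truth class c that fall in cluster k."""
    class_entropy = entropy(h[c])                     # spread of class c over clusters
    k = max(range(len(h[c])), key=h[c].__getitem__)   # cluster matched to class c (assumed rule)
    cluster_entropy = entropy([row[k] for row in h])  # mix of classes inside cluster k
    return beta * class_entropy + (1 - beta) * cluster_entropy


# Two classes, two clusters: a perfect clustering and a completely mixed one.
perfect = [[10, 0], [0, 10]]
mixed = [[5, 5], [5, 5]]
print(generalized_overall_entropy(perfect, c=0))
print(generalized_overall_entropy(mixed, c=0))
```

Under these assumptions the perfect clustering gives zero and the fully mixed one gives log 2 for each class, which matches the stated rule that smaller values indicate a more homogeneous result.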

2. Experimental evaluation of clustering results

Several key input parameters of the msLDA model are determined before the experiments. The Dirichlet priors are initialized as symmetric, with α set to 50/K and β set to 100; both can be re-estimated from the learned model during inference. The size of the implicit document is set empirically to 17 × 17 pixels. The number of clusters is determined by the MDL criterion and is 7 for all three images.
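
The parameter choices above can be collected into a small configuration sketch. The dictionary layout and key names below are hypothetical and chosen only for illustration; the values are those stated in the text.

```python
K = 7  # number of clusters selected by the MDL criterion for all three images

# Hypothetical configuration layout; only the values come from the text.
mslda_params = {
    "num_topics": K,
    "alpha": 50.0 / K,          # symmetric Dirichlet prior, alpha = 50/K
    "beta": 100.0,              # symmetric Dirichlet prior, as stated in the text
    "document_size": (17, 17),  # implicit document size in pixels
}

height, width = mslda_params["document_size"]
pixels_per_document = height * width
print(round(mslda_params["alpha"], 4), pixels_per_document)
```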

1.1 Experiment #1: urban QUICKBIRD image

As shown in Fig. 2a, the data used in this experiment is a QUICKBIRD panchromatic image acquired on 11 February 2002, covering the urban area of Beijing. The image size is 500 × 500 pixels, and the resolution is 0.6 m. The main ground features in the image include buildings, roads, shadows, water bodies and bare land. It should be noted that the image contains a large amount of shadow, which has almost the same gray value as the water bodies.

The ground truth map and the clustering results of the three methods compared are shown in Fig. 6(a)-(d), respectively.

Qualitative evaluation

As is clear from Fig. 6, in the K-means clustering result almost all shadows are wrongly assigned to water bodies, because the two have almost the same gray level. In contrast, both the traditional LDA method and the msLDA method can largely distinguish water from shadows. The main reason is that K-means clustering relies only on the gray values of individual pixels and ignores the spatial correlation between ground features. The traditional LDA and msLDA methods, by contrast, use both the gray value of a pixel and the information in its neighborhood document: the final cluster of each pixel is determined jointly by its gray level and by the clusters of the other pixels in the neighborhood document, so water bodies and shadows can be distinguished to a certain extent.

In addition, visual comparison of the clustering results of the three methods shows that the msLDA result is more compact and contains fewer isolated pixel clusters than the results of the other two methods, so it has a certain object-oriented character and can be put into direct one-to-one correspondence with real geographic objects on the earth's surface. Two landscape indexes of the experimental results are calculated with the FRAGSTATS software; the values for the clustering results of the three methods are shown in Table 1.

TABLE 1

Method	Number of patches	Fractal dimension
K-means method	13542	1.4673
LDA method	11425	1.4454
msLDA method	5112	1.3944

Table 1 shows that both landscape indexes of the msLDA clustering result are smaller than those of the other two methods. The patches in the msLDA clustering result are therefore less complex in shape and fewer in number, and closer to the spatial distribution of real geographic objects on the earth's surface, so the clustering result of this method is more object-oriented than those of the other two methods.
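
To illustrate what the first landscape index measures, the sketch below counts connected regions of equal label in a small label map. It is not the FRAGSTATS implementation; the function name count_patches, the use of 4-connectivity and the two toy label maps are assumptions made for illustration.

```python
from collections import deque

def count_patches(label_map):
    """Count connected regions of equal label, using 4-connectivity (an assumption)."""
    rows, cols = len(label_map), len(label_map[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            patches += 1  # an unvisited pixel starts a new patch
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill over same-label neighbors
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and label_map[ny][nx] == label_map[y][x]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return patches

# Hypothetical label maps: a salt-and-pepper result and a compact one.
noisy = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]]
compact = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 1]]
print(count_patches(noisy), count_patches(compact))
```

A fragmented, pixel-wise clustering produces many small patches, while a compact clustering of the same area produces few.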

Quantitative evaluation

The overall entropy of the clustering results of the three methods is then calculated. As shown in Table 2, the overall entropy of the msLDA method is clearly smaller than that of the other two methods.

TABLE 2

Clustering method	Overall cluster entropy	Overall class entropy	Overall entropy
K-means method	0.84326	1.3399	1.0916
LDA method	0.77865	1.3514	1.065
msLDA method	0.75479	1.3001	1.0274
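
As a quick arithmetic check, the last column of Table 2 is numerically consistent with an equal-weight (β = 0.5) combination of the two preceding columns. This is an observation from the reported numbers rather than something the text states, and the variable names below are illustrative.

```python
# Rows of Table 2: (second column, third column, last column).
table2 = {
    "K-means": (0.84326, 1.3399, 1.0916),
    "LDA": (0.77865, 1.3514, 1.065),
    "msLDA": (0.75479, 1.3001, 1.0274),
}

beta = 0.5
gaps = {}
for method, (cluster_e, class_e, reported) in table2.items():
    combined = beta * class_e + (1 - beta) * cluster_e
    gaps[method] = abs(combined - reported)  # difference from the reported value
    print(f"{method}: combined={combined:.4f}, reported={reported}")
```

The differences stay within rounding of the four reported decimal places.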

The generalized overall entropy is further calculated; the values for each category under the different clustering methods are shown in Fig. 7. In the msLDA clustering result, the generalized overall entropy values of shadows, buildings, roads and bare land are all smaller than those of K-means and the traditional LDA method. In other words, the msLDA method identifies these four ground features more accurately. For water bodies, the accuracy of the msLDA result is very slightly lower than that of the K-means method but higher than that of the traditional LDA method. Overall, the msLDA method achieves higher accuracy in extracting the various kinds of geographic entities. In particular, shadows and water bodies are well distinguished by this method.

1.2 Experiment #2: rural EROS-B image

As shown in Fig. 3a, the data used in this experiment is an EROS-B panchromatic image acquired on 18 June 2010, covering Monyang City, Anhui. The image size is 800 × 800 pixels, and the resolution is 0.6 m. The experimental area is a typical rural area, with a large amount of agricultural land and forest land. The image also contains ground features such as water bodies, shadows and roads.

For this experiment, the ground truth map and the clustering results of the three methods compared are shown in Fig. 8(a)-(d), respectively.

Qualitative evaluation

As shown in Fig. 8, the msLDA method yields a more compact distribution of ground features than the other two methods. In particular, the water body areas clustered by the msLDA method are very smooth: even where the water presents different gray tones at different positions, the smoothing mechanism built into the msLDA method keeps the water body cluster uniform, which neither K-means nor the traditional LDA method can do. The object-oriented character of the msLDA clustering results is further confirmed by the landscape index comparison listed in Table 3.

TABLE 3

Method	Number of patches	Fractal dimension
K-means method	22443	1.4951
LDA method	22107	1.4755
msLDA method	4133	1.4263

Quantitative evaluation

The overall entropy and the generalized overall entropy of the clustering results of the three methods are calculated. The overall entropy comparison is shown in Table 4; the generalized overall entropy comparison is shown in Fig. 9.

TABLE 4

Clustering method	Overall cluster entropy	Overall class entropy	Overall entropy
K-means method	0.77045	1.3309	1.0507
LDA method	0.72353	1.341	1.0322
msLDA method	0.59921	1.0216	0.8104

All entropy indexes of the msLDA clustering result are lower than those of the other two methods. This experiment therefore shows that the msLDA method also clusters better on high-resolution remote sensing images of rural areas, where agricultural land and forest land are widely distributed.

1.3 Experiment #3: suburban QUICKBIRD image

As shown in Fig. 4a, a QUICKBIRD panchromatic image (900 × 900 pixels) with a resolution of 0.6 m is used as the experimental data. The image was acquired on 22 April 2006. The study area was deliberately chosen in the suburbs, between urban and rural areas, so that the ground feature types are more complex and the clustering performance of the three methods can be evaluated in a complex scene. The main ground features in the image include trees, buildings, roads, water bodies, shadows and farmland.

For this experiment, the ground truth map and the clustering results of the three methods compared are shown in Fig. 10(a)-(d), respectively.

Qualitative evaluation

Although the distribution of ground features in the image selected for experiment #3 is more complex, Fig. 10 and Table 5 show that the msLDA method still clusters better, especially in distinguishing water bodies from shadows.

TABLE 5

Method	Number of patches	Fractal dimension
K-means method	22443	1.4951
LDA method	22107	1.4755
msLDA method	7457	1.4229

Quantitative evaluation

The overall entropies of the three methods are compared in Table 6; all entropy values of the msLDA clustering result are lower than those of the other two methods.

TABLE 6

Clustering method	Overall cluster entropy	Overall class entropy	Overall entropy
K-means method	1.0923	1.1172	1.1048
LDA method	1.0478	1.0326	1.0402
msLDA method	1.0344	0.97891	1.0066

In addition, the generalized overall entropy calculated for each ground feature under the different methods is shown in Fig. 11. In the msLDA clustering result, the entropy values of the three categories buildings, farmland and trees are all smaller than those of K-means and the traditional LDA method. In addition, the msLDA result is very close to the most accurate clustering result, so it can still be regarded as a good clustering result. For the generalized overall entropy of shadows and roads, the minimum is obtained by the traditional LDA method rather than by the msLDA model. Considering the mechanisms of the two models, the msLDA result can be regarded as the traditional LDA clustering result with an added spatial-consistency smoothness constraint, introduced through the multi-scale expression of the image. A trade-off between a smoother clustering result and edge preservation is therefore often required. In fact, better edge preservation can be obtained by lowering the smoothing parameter in the msLDA model. Overall, the msLDA method achieves higher accuracy in extracting the various kinds of geographic entities.

The above embodiments are intended only to illustrate the invention and not to limit it. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention; all equivalent technical solutions therefore also fall within the scope of the invention, which is defined by the claims.

Claims

1. A clustering method of remote sensing images is characterized by comprising the following steps:

2. The method for clustering remote sensing images according to claim 1, wherein step A further comprises: according to the minimum description length (MDL) criterion, assuming that the features of the original image follow a Gaussian mixture distribution, and, using the relationship between the MDL value of the original image and the number of cluster centers, taking as the optimal number of cluster centers the number at which the MDL value of the image is minimal.

3. The method for clustering remote sensing images according to claim 1, wherein in step B, obtaining the multi-scale expression of the original image through a Gaussian convolution function further comprises: obtaining a multi-scale representation of the original image by convolving a scale-variable Gaussian function with the original image.

4. The method for clustering remote sensing images according to claim 1, wherein in step C, establishing a latent Dirichlet allocation model with unchanged overlapped image semantics according to the multi-layer document set further comprises: establishing a latent Dirichlet allocation model according to the multi-layer document set to generate observation words, and constructing the multi-layer document set composed of the observation words so that the same pixel belonging to different documents is assigned the same topic.

5. The remote sensing image clustering method of claim 4, wherein establishing a latent Dirichlet allocation model according to the multi-layer document set to generate observation words specifically comprises: for the multi-layer document set

Assume that the following generation process exists:

Sampling its topic class, wherein the associated parameter is the mixing proportion coefficient of the sampled document d_t at scale s_t;

6. The method for clustering remote sensing images according to claim 1, wherein in step C, estimating the mixing proportion parameter of the topics in each document in the multi-layer document set and the distribution parameters of the visual words generated by each topic according to probability further comprises: estimating these parameters by a Gibbs sampling approximate inference method.