
CN105243154A - Remote sensing image retrieval method and system based on significant point characteristics and spare self-encodings - Google Patents


Info

Publication number
CN105243154A
CN105243154A (application CN201510708598.4A)
Authority
CN
China
Prior art keywords
image
feature
matrix
training
salient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510708598.4A
Other languages
Chinese (zh)
Other versions
CN105243154B (en)
Inventor
邵振峰
周维勋
李从敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201510708598.4A priority Critical patent/CN105243154B/en
Publication of CN105243154A publication Critical patent/CN105243154A/en
Application granted granted Critical
Publication of CN105243154B publication Critical patent/CN105243154B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/5838Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A remote sensing image retrieval method and system based on salient point features and sparse self-encoding are disclosed. The method comprises the steps of extracting feature points of each image in an image library to obtain a feature point matrix, and calculating a saliency map of each image based on a visual attention model; binarizing the saliency maps by an adaptive threshold method and performing a mask operation with the feature point matrix to obtain the filtered salient feature points; separately choosing a plurality of salient feature points from each training image to construct training samples; training a sparse auto-encoder network on the whitened training sample set to obtain a feature extractor; extracting features with the feature extractor and sparsifying the extracted image features with a threshold function to obtain the final feature vector for retrieval; and performing image retrieval according to a preset similarity measurement criterion based on the extracted feature vectors. Automatic extraction of image features is realized through the trained sparse auto-encoder network; in addition, the extracted features are highly discriminative, so retrieval precision is ensured.

Description

Remote sensing image retrieval method and system based on salient point features and sparse self-coding
Technical Field
The invention belongs to the technical field of image processing, and relates to a remote sensing image retrieval method and system based on salient point features and sparse self-coding.
Background
Along with the improvement of Earth-observation capability, the obtained remote sensing data are becoming increasingly diverse and massive. Although massive remote sensing data provide rich data sources for various important applications, the problem of "massive data, inundated information" in remote sensing big data is increasingly prominent because current ground data processing and analysis capabilities are insufficient. How to utilize emerging scientific computing technologies to quickly locate and intelligently retrieve a target or region of interest in remote sensing imagery is a challenge facing remote sensing big data processing and analysis, and is also a scientific problem to be solved urgently in the field of remote sensing image processing. Remote sensing image retrieval is an effective method for solving this bottleneck, and research on efficient image retrieval technology is therefore of great significance.
Current remote sensing image retrieval technology mainly performs similarity measurement on the low-level features of an image so as to return similar images. Compared with traditional keyword-based retrieval methods, content-based retrieval has higher efficiency and accuracy, but designing a feature description method that can effectively describe various complex remote sensing scenes is very difficult. In recent years, deep learning has become a research focus in the field of image recognition due to its good feature learning ability. Compared with hand-crafted features, methods based on deep learning can obtain a feature extractor through sample training to realize automatic extraction of image features, and are suitable for remote sensing image retrieval involving complex scenes. Because its network design and training are relatively simple, sparse self-coding has become a common deep learning method and is widely applied in image processing.
For sparse self-coding network training, in the aspect of constructing training samples, existing methods generally randomly select a certain number of fixed-size image blocks from the training images, and this sample construction method has the following defects. First, from the perspective of human vision, people are interested in specific targets in a remote sensing image, and a randomly selected image block may not contain the specific target of interest. Second, since the size of a training image is fixed, randomly selecting image blocks may yield an insufficient number of training samples. Third, since the training samples are image blocks, a trained network extracts the features of image blocks rather than of the entire image, so these features cannot be directly used for image retrieval; to obtain features of the whole image, a convolution step is generally adopted, but this is not only computationally inefficient, it also introduces additional parameters. In the aspect of selecting an activation function, existing methods usually adopt the sigmoid function as the activation function of the hidden layer neurons, but the sigmoid function suffers from severe gradient vanishing during back-propagation, which hinders network training. For sparse self-coding feature extraction, existing methods generally take the activation values of the hidden layer directly as the extracted features without sparsification, whereas experiments show that sparse features perform better.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a remote sensing image retrieval technical scheme based on salient point features and sparse self-coding. The method takes the salient point features extracted from remote sensing images as the input of the sparse self-coding network to train it, and finally extracts image features with the trained feature extractor to realize remote sensing image retrieval.
The technical scheme adopted by the invention is a remote sensing image retrieval method based on salient point characteristics and sparse self-coding, which comprises the following steps:
step 1, extracting characteristic points of each image in an image library to obtain a characteristic point matrix, and calculating a saliency map of each image by using a visual attention model;
step 2, binarizing the saliency maps of each image in the image library by adopting a self-adaptive threshold method respectively, and performing mask operation on a feature point matrix corresponding to the image to obtain filtered saliency feature points; the implementation mode is as follows,
when the adaptive threshold method is adopted to binarize the saliency map, according to the saliency of the saliency map pixels, the binarization threshold value T of the saliency map is determined as follows,
T = (2 / (w × h)) Σ_{x=1}^{w} Σ_{y=1}^{h} I(x, y)
wherein w and h represent the width and height of the saliency map, respectively, and I(x, y) represents the saliency value of the saliency map pixel (x, y);
binarizing the saliency map according to the binarization threshold T gives a binarized saliency map with a corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix, which is calculated as follows,
P_I = P ⊗ I_binary
step 3, taking a plurality of images from the image library as training images, respectively selecting a plurality of significant feature points from each training image to construct a training sample to obtain a training sample set X, and training a sparse self-coding network according to the whitened training sample set X' to obtain a feature extractor;
the sparse self-coding network comprises an input layer, a hidden layer and an output layer, wherein the hidden layer neurons adopt the ReLU function as the activation function, the output layer neurons adopt the softplus function as the activation function, and the cost function of the sparse self-coding network is defined as follows,
J(W, b) = (1/2) ||X' − H_{W,b}||² + (λ/2) ||W||²
wherein the first term is the mean square error term and the second term is the regularization term, H_{W,b} represents the network output for the whitened training sample set X', W = [W_1, W_2] and b = [b_1, b_2] respectively represent the matrices constructed from the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer, and λ represents the regularization coefficient;
step 4, extracting the features of all the images in the image library by using the feature extractor obtained by training in the step 3, and performing sparsification processing on the extracted image features by using a threshold function to obtain a final feature vector for retrieval; the implementation mode is as follows,
the extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I' + b_1)
wherein the salient feature point matrix P_I' is the whitened result of the filtered salient feature point matrix P_I obtained in step 2;
for the extracted image feature Y, the following sparsification processing is carried out to obtain a sparse feature matrix Z,
Z = [Z⁺, Z⁻] = [max(0, Y − α), max(0, α − Y)]
wherein α represents the threshold of the threshold function, and the matrices Z⁺ = max(0, Y − α) and Z⁻ = max(0, α − Y);
letting n be the number of SIFT points detected in an image, the sparse feature matrix Z is further processed to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1}^{n} [Z⁺_i, Z⁻_i]
wherein Z⁺_i and Z⁻_i respectively represent the i-th column vectors of the matrices Z⁺ and Z⁻.
And 5, based on the feature vector extracted in the step 4, carrying out image retrieval according to a preset similarity measurement criterion.
Moreover, in step 1, the feature points of each image in the image library are extracted with a SIFT operator to obtain the feature point matrix.
In step 5, the preset similarity measurement criterion is the city-block distance.
The invention also correspondingly provides a remote sensing image retrieval system based on the salient point characteristics and the sparse self-coding, which comprises the following modules,
the characteristic point extraction module is used for extracting characteristic points of each image in the image library to obtain a characteristic point matrix and calculating a saliency map of each image by using a visual attention model;
the salient feature point extraction module is used for binarizing the salient images of all the images in the image library by adopting a self-adaptive threshold method respectively, and performing mask operation on a feature point matrix corresponding to the images to obtain filtered salient feature points; the implementation mode is as follows,
when the adaptive threshold method is adopted to binarize the saliency map, according to the saliency of the saliency map pixels, the binarization threshold value T of the saliency map is determined as follows,
T = (2 / (w × h)) Σ_{x=1}^{w} Σ_{y=1}^{h} I(x, y)
wherein w and h represent the width and height of the saliency map, respectively, and I(x, y) represents the saliency value of the saliency map pixel (x, y);
binarizing the saliency map according to the binarization threshold T gives a binarized saliency map with a corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix, which is calculated as follows,
P_I = P ⊗ I_binary
the training module is used for taking a plurality of images from the image library as training images, respectively selecting a plurality of significant feature points from each training image to construct a training sample to obtain a training sample set X, and training a sparse self-coding network according to the whitened training sample set X' to obtain a feature extractor;
the sparse self-coding network comprises an input layer, a hidden layer and an output layer, wherein the hidden layer neurons adopt the ReLU function as the activation function, the output layer neurons adopt the softplus function as the activation function, and the cost function of the sparse self-coding network is defined as follows,
J(W, b) = (1/2) ||X' − H_{W,b}||² + (λ/2) ||W||²
wherein the first term is the mean square error term and the second term is the regularization term, H_{W,b} represents the network output for the whitened training sample set X', W = [W_1, W_2] and b = [b_1, b_2] respectively represent the matrices constructed from the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer, and λ represents the regularization coefficient;
the feature extraction module is used for extracting features of all images in the image library by using the feature extractor obtained by the training module, and performing sparsification processing on the extracted image features by using a threshold function to obtain the final feature vector for retrieval; the implementation mode is as follows,
the extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I' + b_1)
wherein the salient feature point matrix P_I' is the whitened result of the filtered salient feature point matrix P_I obtained by the salient feature point extraction module;
for the extracted image feature Y, the following sparsification processing is carried out to obtain a sparse feature matrix Z,
Z = [Z⁺, Z⁻] = [max(0, Y − α), max(0, α − Y)]
wherein α represents the threshold of the threshold function, and the matrices Z⁺ = max(0, Y − α) and Z⁻ = max(0, α − Y);
letting n be the number of SIFT points detected in an image, the sparse feature matrix Z is further processed to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1}^{n} [Z⁺_i, Z⁻_i]
wherein Z⁺_i and Z⁻_i respectively represent the i-th column vectors of the matrices Z⁺ and Z⁻.
And the retrieval module is used for retrieving the image according to a preset similarity measurement criterion on the basis of the feature vector extracted by the feature extraction module.
Moreover, in the feature point extraction module, the feature points of each image in the image library are extracted with a SIFT operator to obtain the feature point matrix.
In the retrieval module, the preset similarity measurement criterion adopts the city-block distance.
Compared with the prior art, the invention has the following characteristics and beneficial effects,
1. The saliency map of each image is calculated by adopting the visual attention model, and the feature points extracted by SIFT are filtered through the binarized saliency map to obtain the salient feature points of the image, which not only accords with the visual attention characteristics of human eyes, but also better reflects the retrieval requirements of users.
2. The significant feature points of the images are selected to construct training samples, so that the defect that the training samples are constructed by random sampling on the training images in the prior art is overcome.
3. The feature extractor obtained by sparse self-coding network training is used for realizing automatic extraction of image features, and the feature design process aiming at complex remote sensing images is omitted.
4. Good extensibility: the training samples include but are not limited to the salient feature points.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention.
Detailed Description
The remote sensing image retrieval method based on salient point features and sparse self-coding first extracts the feature points of an image to obtain a feature point matrix and calculates the saliency map of the image. It then binarizes the saliency map with an adaptive threshold and performs a "mask" operation with the feature point matrix to obtain the salient feature points, selects a certain number of salient feature points to construct training samples and trains a sparse self-coding network, automatically extracts the image features with the trained feature extractor to obtain the feature vectors for retrieval, and finally performs image retrieval according to a preset similarity measurement method and returns similar images.
To explain the technical solution of the present invention in detail, referring to fig. 1, the embodiment flow is specifically explained as follows:
step 1, extracting characteristic points of each image in an image library to obtain a characteristic point matrix, and calculating a saliency map of each image by using a visual attention model.
In particular, an existing image library or an image library constructed by a person skilled in the art may be used. For example, a high-resolution remote sensing image containing a plurality of ground feature categories is selected and segmented in a Tiles blocking mode to construct a retrieval image library containing a plurality of categories. For each image in the image library, the embodiment first adopts a SIFT (Scale-Invariant Feature Transform) operator to extract the feature points (key points) of the image to obtain a feature point matrix, and then adopts a GBVS (Graph-Based Visual Saliency) model to calculate the saliency map of the image.
And 2, for the saliency maps of the images in the image library, binarizing the saliency maps by adopting a self-adaptive threshold method respectively, and carrying out mask operation on the feature point matrixes corresponding to the images to obtain the filtered saliency feature points.
In the embodiment, a binarization threshold of the saliency map is determined according to the saliency of the pixel, and the binarization saliency map and the feature point matrix are subjected to 'mask' operation to obtain the salient feature points, which is realized as follows:
According to the saliency values of the saliency map pixels, the binarization threshold T of the saliency map is determined by formula (1).
T = (2 / (w × h)) Σ_{x=1}^{w} Σ_{y=1}^{h} I(x, y)    (1)
Where w and h represent the width and height of the saliency map, respectively, and I (x, y) represents the saliency value of a pixel at the saliency map (x, y).
Binarizing the saliency map according to the binarization threshold T gives a binarized saliency map with a corresponding matrix I_binary. Filtering the feature point matrix of the image with the binarized saliency map yields the salient feature points. Let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix; the salient feature point matrix can be calculated by equation (2).
P_I = P ⊗ I_binary    (2)
wherein the matrix P stores, at each position (x, y), the feature vector P_128(x, y) corresponding to a SIFT key point; the feature vector corresponding to a SIFT key point is generally 128-dimensional, and the embodiment of the invention correspondingly uses 128 dimensions. If the pixel at (x, y) has no feature point, P_128(x, y) = 0. Each element of the matrix I_binary is 0 or 1, and I_binary(x, y) represents the value of the binarized saliency map at (x, y). The symbol ⊗ denotes element-wise (number-times) multiplication.
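For illustration only (not part of the original disclosure), the adaptive threshold of formula (1) and the mask of formula (2) can be sketched in numpy, assuming the feature point matrix P is stored as a w × h × 128 array with all-zero vectors at non-keypoint pixels; the toy saliency map and keypoints are hypothetical stand-ins:

```python
import numpy as np

def binarize_saliency(saliency):
    # Formula (1): T = 2/(w*h) * sum of I(x, y), i.e. twice the mean saliency
    T = 2.0 * saliency.mean()
    return (saliency >= T).astype(float)

def filter_salient_points(P, saliency):
    # Formula (2): element-wise mask; feature vectors at non-salient pixels become 0
    I_binary = binarize_saliency(saliency)
    return P * I_binary[:, :, None]

# toy 4x4 saliency map with two salient pixels, and three synthetic keypoints
sal = np.zeros((4, 4))
sal[0, 0] = sal[1, 1] = 1.0
P = np.zeros((4, 4, 128))
P[0, 0] = P[1, 1] = P[3, 3] = 1.0
P_I = filter_salient_points(P, sal)   # the keypoint at (3, 3) is filtered out
```

Here the mean saliency is 0.125, so T = 0.25 and only the two bright pixels survive the binarization.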
And 3, selecting a plurality of images from the image library as training images, respectively selecting a plurality of significant feature points from each training image to construct training samples, training a sparse self-coding network, and obtaining the feature extractor.
In the embodiment, in step 3, a certain number of salient feature points of the training images, instead of the traditional image blocks, are selected to construct the training samples, and the ReLU (Rectified Linear Units) function, instead of the traditional sigmoid function, is selected as the activation function of the hidden layer neurons of the sparse self-coding network during training. For example, each salient feature point in step 3 is a feature vector with dimensions 4 × 4 × 8 = 128, and one feature point constitutes one training sample. In the implementation, the number of training images and the number of salient feature points per training image can be specified by those skilled in the art.
The concrete implementation is as follows:
firstly, selecting the salient feature points of the image, and constructing a training sample set.
The embodiment first randomly selects a certain number of images from the image library as training images, and then randomly selects a certain number of salient feature points from the training images to construct the training sample set, which may be represented by equation (3):
X = [x_{i,j}] (i = 1, …, 128; j = 1, …, m)    (3)
where m represents the number of training samples and each column of X represents a salient feature point, i.e., a training sample. For example, [x_{1,1}, x_{2,1}, …, x_{128,1}] is the 1st training sample and [x_{1,2}, x_{2,2}, …, x_{128,2}] is the 2nd training sample.
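A minimal sketch of assembling the 128 × m training sample set X of equation (3) by random selection; the random matrices here merely stand in for each image's filtered salient feature points, and the counts are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)
# stand-ins for the filtered salient feature points of 5 training images,
# each image i holding a 128 x k_i matrix of SIFT descriptors
images = [rng.normal(size=(128, int(rng.integers(20, 40)))) for _ in range(5)]

samples_per_image = 10
cols = [img[:, rng.choice(img.shape[1], samples_per_image, replace=False)]
        for img in images]
X = np.hstack(cols)   # 128 x m training sample set, one salient point per column
```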
Then, training the sparse self-coding network to obtain a feature extractor.
Because the salient feature points extracted from the same training image have a certain correlation, the training sample set X cannot be directly input into the sparse self-coding network for training. Before training, ZCA (Zero-phase Component Analysis) whitening is adopted to process the training samples to obtain the whitened training sample set X', and the relevant ZCA whitening parameters are stored. ZCA whitening is prior art and is not described in detail in the present invention.
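ZCA whitening itself can be sketched as follows; the eps regularizer and the returned parameters (mean and transform, which the method reuses later on P_I) follow common practice and are assumptions rather than specifics of the disclosure:

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    # X is d x m, columns are training samples (d = 8 here stands in for 128)
    mu = X.mean(axis=1, keepdims=True)
    Xc = X - mu
    cov = Xc @ Xc.T / X.shape[1]
    U, S, _ = np.linalg.svd(cov)
    W_zca = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T   # zero-phase transform
    # return the parameters too, so the same transform can be reused in step 4
    return W_zca @ Xc, mu, W_zca

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 200))
X_white, mu, W_zca = zca_whiten(X)
cov_white = X_white @ X_white.T / X_white.shape[1]   # approximately identity
```

After whitening, the sample covariance is close to the identity, i.e. the correlations between the salient feature points have been removed.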
The embodiment defines a sparse self-coding network comprising 3 layers: an input layer, a hidden layer and an output layer, wherein the hidden layer neurons use the ReLU function f_1 = max(0, x) as the activation function and the output layer neurons use the softplus function f_2 = ln(1 + e^x) as the activation function. Compared with the traditional sigmoid function, the ReLU function can relieve the problem of gradient disappearance to a certain extent and is more beneficial to network training. Given the whitened training sample set X', the cost function of the sparse self-coding network can be defined as equation (4).
J(W, b) = (1/2) ||X' − H_{W,b}||² + (λ/2) ||W||²    (4)
Where the first term is the mean square error term and the second term is the regularization term, H_{W,b} represents the network output for the whitened training sample set X', W = [W_1, W_2] and b = [b_1, b_2] respectively represent the matrices constructed from the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer, and λ represents the regularization coefficient. In specific implementation, the cost function in formula (4) can be optimized during training by gradient descent and other methods to obtain the weight and bias parameters W and b.
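A sketch of the forward pass and the cost of formula (4), with ReLU hidden units and softplus output units as described; the tiny random network and its sizes are illustrative assumptions, and the optimizer (gradient descent or otherwise) is left out:

```python
import numpy as np

def relu(x):        # hidden-layer activation f1 = max(0, x)
    return np.maximum(0.0, x)

def softplus(x):    # output-layer activation f2 = ln(1 + e^x)
    return np.log1p(np.exp(x))

def forward(X, W1, b1, W2, b2):
    # 3-layer network: input -> ReLU hidden layer -> softplus output layer
    H = relu(W1 @ X + b1)
    return softplus(W2 @ H + b2)          # reconstruction H_{W,b}

def cost(X, W1, b1, W2, b2, lam):
    # Formula (4): 0.5*||X' - H_{W,b}||^2 + 0.5*lam*||W||^2
    R = forward(X, W1, b1, W2, b2)
    return 0.5 * np.sum((X - R) ** 2) + 0.5 * lam * (np.sum(W1**2) + np.sum(W2**2))

# illustrative tiny network: 6 inputs, 4 hidden units
rng = np.random.default_rng(1)
Xw = rng.normal(size=(6, 20))             # stands in for the whitened set X'
W1 = 0.1 * rng.normal(size=(4, 6)); b1 = np.zeros((4, 1))
W2 = 0.1 * rng.normal(size=(6, 4)); b2 = np.zeros((6, 1))
J = cost(Xw, W1, b1, W2, b2, lam=1e-3)    # scalar to minimize during training
```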
And 4, extracting the features of all the images in the image library by using the feature extractor obtained by training in the step 3, and performing sparsification on the extracted features by using a threshold function to obtain a final feature vector for retrieval.
In step 4 of the embodiment, the salient feature points of the image are input into the feature extractor to be mapped to obtain corresponding image features, and then the extracted features are subjected to sparsification by using a threshold function to obtain a final feature vector for retrieval.
The extracted image feature Y can be represented by equation (5),
Y = f_1(W_1 P_I' + b_1)    (5)
wherein W_1 P_I' + b_1 is substituted as the variable x of the ReLU function f_1 = max(0, x), and the salient feature point matrix P_I' used here is the result of preprocessing the filtered salient feature point matrix obtained in step 2 with the same ZCA whitening parameters as used when whitening the training sample set X. The extracted image feature Y is then sparsified using equation (6) to obtain the sparse feature matrix Z.
Z = [Z⁺, Z⁻] = [max(0, Y − α), max(0, α − Y)]    (6)
wherein α denotes the threshold of the threshold functions f = max(0, Y − α) and f = max(0, α − Y), and the matrices Z⁺ = max(0, Y − α), Z⁻ = max(0, α − Y).
In order to obtain the final feature vector F for retrieval, let n be the number of SIFT points detected in one image; the sparse feature matrix Z is further processed using equation (7).
F = (1/n) Σ_{i=1}^{n} [Z⁺_i, Z⁻_i]    (7)
wherein Z⁺_i and Z⁻_i respectively represent the i-th column vectors of the matrices Z⁺ and Z⁻.
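The sparsification of formula (6) and the pooling of formula (7) can be sketched together; the toy matrix Y (one hidden unit, n = 3 salient points) and the threshold value are illustrative assumptions:

```python
import numpy as np

def sparsify_and_pool(Y, alpha):
    # Formula (6): rectified positive and negative parts of the features
    Z_pos = np.maximum(0.0, Y - alpha)
    Z_neg = np.maximum(0.0, alpha - Y)
    # Formula (7): average the columns over the n salient points, then concatenate
    return np.concatenate([Z_pos.mean(axis=1), Z_neg.mean(axis=1)])

# one hidden unit, n = 3 salient points
Y = np.array([[1.0, -1.0, 0.5]])
F = sparsify_and_pool(Y, alpha=0.2)   # -> [(0.8 + 0 + 0.3)/3, (0 + 1.2 + 0)/3]
```

Note that F has twice the hidden-layer dimension, since the positive and negative parts are concatenated.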
Step 5, based on the feature vectors extracted in step 4, image retrieval is carried out according to a preset similarity measurement criterion. In the implementation, one skilled in the art can preset the similarity measurement criterion. The embodiment uses the city-block distance (L1 norm) to calculate the similarity of the query image to the other images and returns related images ranked by similarity. In specific implementation, any image in the image library can be used as a query image to obtain the related images returned according to similarity; for images outside the image library, the feature vector can likewise be extracted in the same way and retrieved against the image library.
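A sketch of the retrieval step under the city-block (L1) criterion; the two-dimensional library vectors are illustrative stand-ins for the pooled feature vectors F:

```python
import numpy as np

def retrieve(query_F, library_F, top_k=3):
    # city-block (L1) distance from the query vector to every library vector
    d = np.abs(library_F - query_F).sum(axis=1)
    return np.argsort(d)[:top_k]      # indices of the most similar images first

library = np.array([[0.0, 0.0],       # identical to the query
                    [1.0, 1.0],       # far away
                    [0.1, 0.0]])      # close
order = retrieve(np.zeros(2), library)
```

The ranking places the identical vector first, the near one second, and the distant one last.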
In specific implementation, the above processes can adopt a computer software mode to realize an automatic operation process, and can also adopt a modularized mode to provide a corresponding system. The invention also correspondingly provides a remote sensing image retrieval system based on the salient point characteristics and the sparse self-coding, which comprises the following modules,
the characteristic point extraction module is used for extracting characteristic points of each image in the image library to obtain a characteristic point matrix and calculating a saliency map of each image by using a visual attention model;
the salient feature point extraction module is used for binarizing the salient images of all the images in the image library by adopting a self-adaptive threshold method respectively, and performing mask operation on a feature point matrix corresponding to the images to obtain filtered salient feature points; the implementation mode is as follows,
when the adaptive threshold method is adopted to binarize the saliency map, according to the saliency of the saliency map pixels, the binarization threshold value T of the saliency map is determined as follows,
T = (2 / (w × h)) Σ_{x=1}^{w} Σ_{y=1}^{h} I(x, y)
wherein w and h represent the width and height of the saliency map, respectively, and I(x, y) represents the saliency value of the saliency map pixel (x, y);
binarizing the saliency map according to the binarization threshold T gives a binarized saliency map with a corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix, which is calculated as follows,
P_I = P ⊗ I_binary
the training module is used for taking a plurality of images from the image library as training images, respectively selecting a plurality of significant feature points from each training image to construct a training sample to obtain a training sample set X, and training a sparse self-coding network according to the whitened training sample set X' to obtain a feature extractor;
the sparse self-coding network comprises an input layer, a hidden layer and an output layer, wherein the hidden layer neurons adopt the ReLU function as the activation function, the output layer neurons adopt the softplus function as the activation function, and the cost function of the sparse self-coding network is defined as follows,
J(W, b) = (1/2) ||X' − H_{W,b}||² + (λ/2) ||W||²
wherein the first term is the mean square error term and the second term is the regularization term, H_{W,b} represents the network output for the whitened training sample set X', W = [W_1, W_2] and b = [b_1, b_2] respectively represent the matrices constructed from the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer, and λ represents the regularization coefficient;
the query feature extraction module is used for extracting the features of the image to be queried by using the feature extractor obtained by the training module, and sparsifying the extracted image features with a threshold function to obtain the final feature vector for retrieval; the implementation is as follows,
the extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I′ + b_1)

wherein f_1 denotes the hidden-layer activation function, and the salient feature point matrix P_I′ is the whitening result of the filtered salient feature point matrix P_I obtained by the salient feature point extraction module;
for the extracted image feature Y, the following thinning processing is carried out to obtain a sparse feature matrix Z,
Z = [Z_+, Z_−] = [max(0, Y − α), max(0, α − Y)]

wherein α denotes the threshold of the threshold function, and the matrices Z_+ = max(0, Y − α) and Z_− = max(0, α − Y);

letting n denote the number of SIFT feature points detected in the image, the sparse feature matrix Z is further processed to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1}^{n} [Z_+^i, Z_−^i]
wherein Z_+^i and Z_−^i denote the ith column vectors of the matrices Z_+ and Z_−, respectively.
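The query-side pipeline (hidden-layer encoding, threshold sparsification, and column averaging) can be sketched end to end as follows; the function name and the column-per-SIFT-point layout of P_I′ are assumptions for illustration, and f_1 is taken to be the ReLU named earlier:

```python
import numpy as np

def query_feature(P_I_whitened, W1, b1, alpha):
    """Compute the retrieval feature vector F per the formulas above.

    P_I_whitened : (d, n) whitened salient-point descriptors, one column
                   per detected SIFT point
    Returns F, a length-2k vector (k = number of hidden units).
    """
    Y = np.maximum(0.0, W1 @ P_I_whitened + b1)  # Y = f1(W1 P_I' + b1)
    Z_plus = np.maximum(0.0, Y - alpha)          # Z+ = max(0, Y - α)
    Z_minus = np.maximum(0.0, alpha - Y)         # Z- = max(0, α - Y)
    # F = (1/n) Σ_i [Z+^i, Z-^i]: average the stacked column vectors
    F = np.concatenate([Z_plus, Z_minus]).mean(axis=1)
    return F
```

The split into positive and negative half-rectified parts keeps both "above threshold" and "below threshold" evidence, and the average over the n salient points yields one fixed-length vector per image regardless of how many points were detected.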
the retrieval module is used for performing image retrieval according to a preset similarity measurement criterion based on the feature vector extracted by the query feature extraction module.
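Assuming the "urban area distance" of claim 6 is the city-block (L1, Manhattan) distance, the retrieval module reduces to ranking library images by L1 distance to the query feature vector. A minimal sketch (function name and top-k interface are illustrative):

```python
import numpy as np

def retrieve(query_F, library_F, top_k=10):
    """Rank library images by city-block (L1) distance to the query
    feature vector; smaller distance means more similar.

    library_F : (N, 2k) matrix with one feature vector F per image
    Returns the indices of the top_k matches and their distances.
    """
    dists = np.abs(library_F - query_F).sum(axis=1)  # L1 distance per image
    order = np.argsort(dists)                        # ascending = most similar first
    return order[:top_k], dists[order[:top_k]]
```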
The specific embodiments described herein are merely illustrative of the invention. Those skilled in the art may make various modifications, additions, or substitutions to the described embodiments without departing from the spirit or scope of the invention as defined in the appended claims.

Claims (6)

1. A remote sensing image retrieval method based on salient point features and sparse self-coding, characterized by comprising the following steps:
step 1, extracting characteristic points of each image in an image library to obtain a characteristic point matrix, and calculating a saliency map of each image by using a visual attention model;
step 2, binarizing the saliency map of each image in the image library by using an adaptive threshold method, and performing a mask operation on the corresponding feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows,
when the adaptive threshold method is adopted to binarize the saliency map, according to the saliency of the saliency map pixels, the binarization threshold value T of the saliency map is determined as follows,
T = (2 / (w × h)) · Σ_{x=1}^{w} Σ_{y=1}^{h} I(x, y)
wherein w and h represent the width and height of the saliency map, respectively, and I (x, y) represents the saliency value of the saliency map pixel (x, y);
the saliency map is binarized according to the threshold T, yielding a binarized saliency map with corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix, which is calculated as follows,
P_I = P ⊗ I_binary
step 3, taking a plurality of images from the image library as training images, selecting a plurality of salient feature points from each training image to construct training samples, obtaining a training sample set X, and training a sparse self-coding network with the whitened training sample set X′ to obtain a feature extractor;
the sparse self-coding network comprises an input layer, an implicit layer and an output layer, wherein a neuron of the implicit layer adopts a ReLU function as an activation function, a neuron of the output layer adopts a softplus function as the activation function, a cost function of the sparse self-coding network is defined as follows,
J(W, b) = (1/2) ‖X′ − H_{W,b}‖² + (λ/2) ‖W‖²
wherein the first term is the mean square error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X′; W = [W_1, W_2] and b = [b_1, b_2] are formed respectively by the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer; λ denotes the regularization coefficient;
step 4, extracting the features of all the images in the image library by using the feature extractor obtained by training in the step 3, and performing sparsification processing on the extracted image features by using a threshold function to obtain a final feature vector for retrieval; the implementation mode is as follows,
the extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I′ + b_1)

wherein f_1 denotes the hidden-layer activation function, and the salient feature point matrix P_I′ is the whitening result of the filtered salient feature point matrix P_I obtained in step 2;
for the extracted image feature Y, the following thinning processing is carried out to obtain a sparse feature matrix Z,
Z = [Z_+, Z_−] = [max(0, Y − α), max(0, α − Y)]

wherein α denotes the threshold of the threshold function, and the matrices Z_+ = max(0, Y − α) and Z_− = max(0, α − Y);

letting n denote the number of SIFT feature points detected in the image, the sparse feature matrix Z is further processed to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1}^{n} [Z_+^i, Z_−^i]
wherein Z_+^i and Z_−^i denote the ith column vectors of the matrices Z_+ and Z_−, respectively.
step 5, based on the feature vector extracted in step 4, performing image retrieval according to a preset similarity measurement criterion.
2. The remote sensing image retrieval method based on salient point features and sparse self-coding as claimed in claim 1, characterized in that: in step 1, the feature points of each image in the image library are extracted with the SIFT operator to obtain the feature point matrix.
3. The remote sensing image retrieval method based on salient point features and sparse self-coding as claimed in claim 1 or 2, characterized in that: in step 5, the city-block (Manhattan) distance is adopted as the preset similarity measurement criterion.
4. A remote sensing image retrieval system based on salient point features and sparse self-coding, characterized by comprising the following modules:
the characteristic point extraction module is used for extracting characteristic points of each image in the image library to obtain a characteristic point matrix and calculating a saliency map of each image by using a visual attention model;
the salient feature point extraction module is used for binarizing the saliency map of each image in the image library by using an adaptive threshold method, and performing a mask operation on the corresponding feature point matrix of the image to obtain the filtered salient feature points; the implementation is as follows,
when the adaptive threshold method is adopted to binarize the saliency map, according to the saliency of the saliency map pixels, the binarization threshold value T of the saliency map is determined as follows,
T = (2 / (w × h)) · Σ_{x=1}^{w} Σ_{y=1}^{h} I(x, y)
wherein w and h represent the width and height of the saliency map, respectively, and I (x, y) represents the saliency value of the saliency map pixel (x, y);
the saliency map is binarized according to the threshold T, yielding a binarized saliency map with corresponding matrix I_binary; let P denote the feature point matrix of the image and P_I denote the filtered salient feature point matrix, which is calculated as follows,
P_I = P ⊗ I_binary
the training module is used for taking a plurality of images from the image library as training images, selecting a plurality of salient feature points from each training image to construct training samples, obtaining a training sample set X, and training a sparse self-coding network with the whitened training sample set X′ to obtain a feature extractor;
the sparse self-coding network comprises an input layer, an implicit layer and an output layer, wherein a neuron of the implicit layer adopts a ReLU function as an activation function, a neuron of the output layer adopts a softplus function as the activation function, a cost function of the sparse self-coding network is defined as follows,
J(W, b) = (1/2) ‖X′ − H_{W,b}‖² + (λ/2) ‖W‖²
wherein the first term is the mean square error term and the second term is the regularization term; H_{W,b} denotes the network output for the training sample set X′; W = [W_1, W_2] and b = [b_1, b_2] are formed respectively by the weights W_1 and bias b_1 between the input layer and the hidden layer and the weights W_2 and bias b_2 between the hidden layer and the output layer; λ denotes the regularization coefficient;
the feature extraction module is used for extracting the features of all images in the image library by using the feature extractor obtained by the training module, and sparsifying the extracted image features with a threshold function to obtain the final feature vector for retrieval; the implementation is as follows,
the extracted image feature Y is expressed as follows,
Y = f_1(W_1 P_I′ + b_1)

wherein f_1 denotes the hidden-layer activation function, and the salient feature point matrix P_I′ is the whitening result of the filtered salient feature point matrix P_I obtained by the salient feature point extraction module;
for the extracted image feature Y, the following thinning processing is carried out to obtain a sparse feature matrix Z,
Z = [Z_+, Z_−] = [max(0, Y − α), max(0, α − Y)]

wherein α denotes the threshold of the threshold function, and the matrices Z_+ = max(0, Y − α) and Z_− = max(0, α − Y);

letting n denote the number of SIFT feature points detected in the image, the sparse feature matrix Z is further processed to obtain the feature vector F as follows,
F = (1/n) Σ_{i=1}^{n} [Z_+^i, Z_−^i]
wherein Z_+^i and Z_−^i denote the ith column vectors of the matrices Z_+ and Z_−, respectively.
the retrieval module is used for performing image retrieval according to a preset similarity measurement criterion based on the feature vector extracted by the feature extraction module.
5. The remote sensing image retrieval system based on salient point features and sparse self-coding as claimed in claim 4, characterized in that: in the feature point extraction module, the feature points of each image in the image library are extracted with the SIFT operator to obtain the feature point matrix.
6. The remote sensing image retrieval system based on salient point features and sparse self-coding as claimed in claim 4 or 5, characterized in that: in the retrieval module, the city-block (Manhattan) distance is adopted as the preset similarity measurement criterion.
CN201510708598.4A 2015-10-27 2015-10-27 Remote sensing image retrieval method and system based on salient point features and sparse self-coding Active CN105243154B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510708598.4A CN105243154B (en) 2015-10-27 2015-10-27 Remote sensing image retrieval method and system based on salient point features and sparse self-coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510708598.4A CN105243154B (en) 2015-10-27 2015-10-27 Remote sensing image retrieval method and system based on salient point features and sparse self-coding

Publications (2)

Publication Number Publication Date
CN105243154A true CN105243154A (en) 2016-01-13
CN105243154B CN105243154B (en) 2018-08-21

Family

ID=55040802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510708598.4A Active CN105243154B (en) Remote sensing image retrieval method and system based on salient point features and sparse self-coding

Country Status (1)

Country Link
CN (1) CN105243154B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718531A (en) * 2016-01-14 2016-06-29 广州市万联信息科技有限公司 Image database building method and image recognition method
CN106228130A (en) * 2016-07-19 2016-12-14 武汉大学 Remote sensing image cloud detection method of optic based on fuzzy autoencoder network
CN106295613A (en) * 2016-08-23 2017-01-04 哈尔滨理工大学 A kind of unmanned plane target localization method and system
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN107122809A (en) * 2017-04-24 2017-09-01 北京工业大学 Neural network characteristics learning method based on image own coding
CN107515895A (en) * 2017-07-14 2017-12-26 中国科学院计算技术研究所 A kind of sensation target search method and system based on target detection
CN108830172A (en) * 2018-05-24 2018-11-16 天津大学 Aircraft remote sensing images detection method based on depth residual error network and SV coding
CN109259733A (en) * 2018-10-25 2019-01-25 深圳和而泰智能控制股份有限公司 Apnea detection method, apparatus and detection device in a kind of sleep
CN111144483A (en) * 2019-12-26 2020-05-12 歌尔股份有限公司 Image feature point filtering method and terminal
CN112731410A (en) * 2020-12-25 2021-04-30 上海大学 Underwater target sonar detection method based on CNN

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073748A (en) * 2011-03-08 2011-05-25 武汉大学 Visual keyword based remote sensing image semantic searching method
CN102867196A (en) * 2012-09-13 2013-01-09 武汉大学 Method for detecting complex sea-surface remote sensing image ships based on Gist characteristic study
CN103309982A (en) * 2013-06-17 2013-09-18 武汉大学 Remote sensing image retrieval method based on vision saliency point characteristics
CN104462494A (en) * 2014-12-22 2015-03-25 武汉大学 Remote sensing image retrieval method and system based on non-supervision characteristic learning

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102073748A (en) * 2011-03-08 2011-05-25 武汉大学 Visual keyword based remote sensing image semantic searching method
CN102867196A (en) * 2012-09-13 2013-01-09 武汉大学 Method for detecting complex sea-surface remote sensing image ships based on Gist characteristic study
CN103309982A (en) * 2013-06-17 2013-09-18 武汉大学 Remote sensing image retrieval method based on vision saliency point characteristics
CN104462494A (en) * 2014-12-22 2015-03-25 武汉大学 Remote sensing image retrieval method and system based on non-supervision characteristic learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHOU Weixun et al.: "Remote Sensing Image Retrieval Method Using Visual Attention Model and Local Features", Geomatics and Information Science of Wuhan University *
WANG Xing et al.: "Remote Sensing Image Retrieval Method Based on Visual Salient Point Features", Science of Surveying and Mapping *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718531A (en) * 2016-01-14 2016-06-29 广州市万联信息科技有限公司 Image database building method and image recognition method
CN105718531B (en) * 2016-01-14 2019-12-17 广州市万联信息科技有限公司 Image database establishing method and image identification method
CN106228130B (en) * 2016-07-19 2019-09-10 武汉大学 Remote sensing image cloud detection method of optic based on fuzzy autoencoder network
CN106228130A (en) * 2016-07-19 2016-12-14 武汉大学 Remote sensing image cloud detection method of optic based on fuzzy autoencoder network
CN106295613A (en) * 2016-08-23 2017-01-04 哈尔滨理工大学 A kind of unmanned plane target localization method and system
CN106909924A (en) * 2017-02-18 2017-06-30 北京工业大学 A kind of remote sensing image method for quickly retrieving based on depth conspicuousness
CN106909924B (en) * 2017-02-18 2020-08-28 北京工业大学 Remote sensing image rapid retrieval method based on depth significance
CN107122809A (en) * 2017-04-24 2017-09-01 北京工业大学 Neural network characteristics learning method based on image own coding
CN107122809B (en) * 2017-04-24 2020-04-28 北京工业大学 Neural network feature learning method based on image self-coding
CN107515895B (en) * 2017-07-14 2020-06-05 中国科学院计算技术研究所 Visual target retrieval method and system based on target detection
CN107515895A (en) * 2017-07-14 2017-12-26 中国科学院计算技术研究所 A kind of sensation target search method and system based on target detection
CN108830172A (en) * 2018-05-24 2018-11-16 天津大学 Aircraft remote sensing images detection method based on depth residual error network and SV coding
CN109259733A (en) * 2018-10-25 2019-01-25 深圳和而泰智能控制股份有限公司 Apnea detection method, apparatus and detection device in a kind of sleep
CN111144483A (en) * 2019-12-26 2020-05-12 歌尔股份有限公司 Image feature point filtering method and terminal
CN111144483B (en) * 2019-12-26 2023-10-17 歌尔股份有限公司 Image feature point filtering method and terminal
CN112731410A (en) * 2020-12-25 2021-04-30 上海大学 Underwater target sonar detection method based on CNN
CN112731410B (en) * 2020-12-25 2021-11-05 上海大学 Underwater target sonar detection method based on CNN

Also Published As

Publication number Publication date
CN105243154B (en) 2018-08-21

Similar Documents

Publication Publication Date Title
CN112750140B (en) Information mining-based disguised target image segmentation method
CN105243154B (en) Remote sensing image retrieval method and system based on salient point features and sparse self-coding
CN111259850B (en) Pedestrian re-identification method integrating random batch mask and multi-scale representation learning
CN110348376B (en) Pedestrian real-time detection method based on neural network
CN110378381B (en) Object detection method, device and computer storage medium
CN112446398B (en) Image classification method and device
CN115240121B (en) Joint modeling method and device for enhancing local features of pedestrians
CN104462494B (en) A kind of remote sensing image retrieval method and system based on unsupervised feature learning
CN105678284B (en) A kind of fixed bit human body behavior analysis method
CN105574063B (en) The image search method of view-based access control model conspicuousness
CN111680176A (en) Remote sensing image retrieval method and system based on attention and bidirectional feature fusion
CN109714526B (en) Intelligent camera and control system
CN113298815A (en) Semi-supervised remote sensing image semantic segmentation method and device and computer equipment
CN113139489B (en) Crowd counting method and system based on background extraction and multi-scale fusion network
CN109635726B (en) Landslide identification method based on combination of symmetric deep network and multi-scale pooling
CN107767416A (en) The recognition methods of pedestrian's direction in a kind of low-resolution image
CN115375781A (en) Data processing method and device
CN113269224A (en) Scene image classification method, system and storage medium
CN115410081A (en) Multi-scale aggregated cloud and cloud shadow identification method, system, equipment and storage medium
Liu et al. CAFFNet: channel attention and feature fusion network for multi-target traffic sign detection
CN113011253B (en) Facial expression recognition method, device, equipment and storage medium based on ResNeXt network
CN116524189A (en) High-resolution remote sensing image semantic segmentation method based on coding and decoding indexing edge characterization
Lu et al. An iterative classification and semantic segmentation network for old landslide detection using high-resolution remote sensing images
CN117557774A (en) Unmanned aerial vehicle image small target detection method based on improved YOLOv8
CN114926734B (en) Solid waste detection device and method based on feature aggregation and attention fusion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant