CN102033933A

CN102033933A - Distance metric optimization method for maximizing mean average precision (MAP)

Info

Publication number: CN102033933A
Application number: CN 201010594547
Authority: CN
Inventors: 冯前进; 阳维; 卢振泰; 陈武凡
Original assignee: Southern Medical University
Current assignee: Southern Medical University
Priority date: 2010-12-17
Filing date: 2010-12-17
Publication date: 2011-04-27
Anticipated expiration: 2030-12-17
Also published as: CN102033933B

Abstract

The invention discloses a distance measurement optimization method for maximizing the average precision rate (MAP), which comprises the following steps: (1) obtaining an image database containing a query image-to-be-queried image pair set; (2) using a smoothing function to MAP performs approximation, converts it into an objective function and adds a regularization term, so that the objective function is continuously differentiable; (3) Using the query image-to-be-queried image pair in the image database, the optimization technique based on gradient descent is used to optimize the The function is optimized, and the optimization process is repeated, and the solution that maximizes the MAP is taken as the optimal linear transformation matrix; (4) The distance measure defined by the optimal linear transformation matrix is obtained. This method has the characteristics of optimization process and direct target, and the distance measure obtained by this method can achieve better image retrieval performance.

Description

Optimal distance measure method for maximizing the mean of average precision

技术领域technical field

本发明涉及一种用于提高图像检索性能的距离测度优化方法，具体来说涉及一种最大化平均查准率均值的距离测度优化方法。The invention relates to a distance measure optimization method for improving image retrieval performance, in particular to a distance measure optimization method for maximizing the mean value of the average precision rate.

背景技术Background technique

基于内容的图像检索(Content-Based Image Retrieval，CBIR)的基本思想是对图像的视觉特征进行提取，以视觉特征作为索引来实现较高层次的图像检索。CBIR技术可以为管理图像数据、临床诊断、医学教学等提供支持和帮助。特别地，医学图像中相似病灶的检索，可提高临床诊断的可靠性和相关信息的完整性。The basic idea of content-based image retrieval (Content-Based Image Retrieval, CBIR) is to extract the visual features of the image, and use the visual features as an index to achieve higher-level image retrieval. CBIR technology can provide support and assistance for image data management, clinical diagnosis, medical teaching, etc. In particular, the retrieval of similar lesions in medical images can improve the reliability of clinical diagnosis and the integrity of relevant information.

一般地，CBIR系统为了能让查询结果按照相似度从大到小的顺序返回，需在图像特征空间中定义一个距离测度或相似性度量来衡量数据库中待查询图像与查询图像间的差别，返回结果时，待查询图像按照差别大小的排序。常用的距离测度有：欧氏距离、马氏距离、柯西距离等。CBIR系统的检索性能很大程度上取决于两方面：1.用于表达图像内容和语义概念的视觉特征；2.在特征空间中定义的距离测度或相似性度量。CBIR系统中通常采用低层视觉特征，如颜色、纹理、形状、边缘等描述图像内容。然而低层视觉特征的描述能力有限，表达图像时信息会有损失，因而低层视觉特征一般不能直接表达或反映图像的高层语义。这样，在低层特征空间中，我们难以用上述一般性的距离测度定义一个与语义概念紧密关联的相似性度量，即存在所谓的“语义鸿沟”，使得CBIR系统的检索性能与实际需求有一定距离。Generally, in order to allow the query results to be returned in descending order of similarity, the CBIR system needs to define a distance measure or similarity measure in the image feature space to measure the difference between the image to be queried and the query image in the database, and return In the result, the images to be queried are sorted according to the difference. Commonly used distance measures are: Euclidean distance, Mahalanobis distance, Cauchy distance, etc. The retrieval performance of a CBIR system largely depends on two aspects: 1. Visual features used to express image content and semantic concepts; 2. Distance measure or similarity measure defined in the feature space. CBIR systems usually use low-level visual features, such as color, texture, shape, edge, etc., to describe image content. However, low-level visual features have limited ability to describe, and information will be lost when expressing images, so low-level visual features generally cannot directly express or reflect the high-level semantics of images. In this way, in the low-level feature space, it is difficult for us to use the above-mentioned general distance measure to define a similarity measure closely related to semantic concepts, that is, there is a so-called "semantic gap", which makes the retrieval performance of the CBIR system far from the actual demand. .

关联反馈(Relevance Feedback)和距离测度学习(Distance Metric Learning)方法可以用于减小语义鸿沟，提高CBIR的检索性能。关联反馈中用于学习的数据难以获取，系统需要多次调整才能达到满意效果，用户使用不是非常方便。距离测度学习方法可依据事先标注的图像样本，建立低层图像特征和高层语义之间的映射，从而定义反映语义差别的距离测度，提高检索性能。现有的距离测度学习方法，多针对聚类或k近邻分类问题设计，其学习目标是为了提高聚类或k近邻分类的准确率。然而，CBIR系统的检索精度指标与分类准确率有很大的不同，如平均查准率均值(Mean Average Precision，MAP)、NDCG(Normalized DiscountedCumulative Gain)，都与数据库中待查询图像的排序有关系，使用优化针对聚类或分类问题设计的目标函数进行距离测度学习，不一定能得到检索性能最佳的距离测度。而且，由于检索精度指标不是连续可微的，对其进行优化存在一定困难。Relevance Feedback and Distance Metric Learning methods can be used to reduce the semantic gap and improve the retrieval performance of CBIR. The data used for learning in relational feedback is difficult to obtain, and the system needs multiple adjustments to achieve satisfactory results, which is not very convenient for users to use. The distance measure learning method can establish a mapping between low-level image features and high-level semantics based on pre-labeled image samples, thereby defining a distance measure that reflects semantic differences and improving retrieval performance. The existing distance measure learning methods are mostly designed for clustering or k-nearest neighbor classification problems, and the learning goal is to improve the accuracy of clustering or k-nearest neighbor classification. However, the retrieval accuracy index of the CBIR system is very different from the classification accuracy rate, such as Mean Average Precision (MAP) and NDCG (Normalized Discounted Cumulative Gain), which are related to the ordering of the images to be queried in the database , using optimized objective functions designed for clustering or classification problems for distance measure learning may not necessarily obtain the best distance measure for retrieval performance. Moreover, since the retrieval accuracy index is not continuously differentiable, it is difficult to optimize it.

发明内容Contents of the invention

本发明的目的在于提供一种最大化平均查准率均值的距离测度优化方法，本方法具有优化过程和目标直接，优化得到的距离测度能反映图像的语义差别，用本方法得到的距离测度可达到更好的图像检索性能。The object of the present invention is to provide a distance measure optimization method for maximizing the mean value of the average precision rate. This method has an optimization process and a direct target. The distance measure obtained by optimization can reflect the semantic difference of the image. The distance measure obtained by the method can be achieve better image retrieval performance.

本发明的目的可通过以下的技术措施来实现：The purpose of the present invention can be achieved through the following technical measures:

一种最大化平均查准率均值的距离测度优化方法，包括以下步骤：A distance measure optimization method for maximizing the average precision rate, comprising the following steps:

(1)获取包含查询图像-待查询图像对集合的图像数据库；(1) Obtain an image database that includes a collection of query image-image pairs to be queried;

(2)采用平滑函数对MAP进行逼近，将其转换为目标函数并添加正则化项，使得目标函数是连续可微的；(2) Approximating MAP with a smooth function, converting it into an objective function and adding a regularization term, so that the objective function is continuously differentiable;

(3)使用图像数据库中的查询图像-待查询图像对，对目标函数采用基于梯度下降的方法进行优化，并重复优化过程，取使MAP最大的解作为最优的线性变换矩阵；(3) Using the query image-to-be-queried image pair in the image database, the objective function is optimized using a method based on gradient descent, and the optimization process is repeated, and the solution that maximizes the MAP is taken as the optimal linear transformation matrix;

(4)图像间距离测度的获取，包括两种方式：用步骤(3)中的最优线性变换矩阵对图像数据库中的图像特征先进行变换，再用变换特征的欧氏距离定义图像间的距离测度；或者是用步骤(3)中的最优线性变换矩阵定义马氏距离作为图像间的距离测度。(4) Acquisition of the distance measure between images, including two ways: use the optimal linear transformation matrix in step (3) to transform the image features in the image database first, and then use the Euclidean distance of the transformed features to define the distance between images distance measure; or use the optimal linear transformation matrix in step (3) to define the Mahalanobis distance as the distance measure between images.

所述步骤(1)中的图像数据库由图像或图像区域特征、图像类别构成。The image database in the step (1) is composed of images or image region features and image categories.

所述步骤(1)中依据是否包含相同解剖结构、相同类型病灶等确定图像对是否相关，对查询-待查询图像对进行标注。In the step (1), it is determined whether the image pair is related according to whether it contains the same anatomical structure, the same type of lesion, etc., and the query-to-be-queried image pair is marked.

所述步骤(2)中对MAP进行逼近的过程为：对给定的查询图像，任一待查询图像的排序序号使用Sigmoid型函数进行平滑逼近，进而得到MAP的平滑逼近。The process of approximating the MAP in the step (2) is: for a given query image, the sequence number of any image to be queried is smoothly approximated using a Sigmoid function, and then the smooth approximation of the MAP is obtained.

所述待查询图像的排序序号由该待查询图像和其他所有待查询图像与查询图像之间的距离进行比较确定。The sequence number of the image to be queried is determined by comparing the distance between the image to be queried and all other images to be queried and the query image.

所述步骤(3)中变换矩阵的初始解可为一随机矩阵或由其他方法得到的变换矩阵。The initial solution of the transformation matrix in the step (3) can be a random matrix or a transformation matrix obtained by other methods.

所述步骤(3)中采用随机梯度下降算法来优化目标函数。In the step (3), a stochastic gradient descent algorithm is used to optimize the objective function.

本发明方法相对于现有技术来说，具有以下的有益效果：Compared with the prior art, the inventive method has the following beneficial effects:

1、本方法涉及图像检索、统计学习和数值优化等技术，通过统计学习直接优化平均查准率均值(Mean Average Precision，MAP)，可用于图像检索、聚类分析、数据约减等。本方法主要对标注的图像数据进行统计学习，得到一个线性变换矩阵，将原始特征空间映射至一个新的更具鉴别力的特征空间，在新的特征空间中以欧氏距离作为图像相似性度量进行检索，达到较高的MAP。1. This method involves technologies such as image retrieval, statistical learning, and numerical optimization. Through statistical learning, the Mean Average Precision (MAP) is directly optimized, which can be used for image retrieval, cluster analysis, data reduction, etc. This method mainly performs statistical learning on the labeled image data, obtains a linear transformation matrix, maps the original feature space to a new more discriminative feature space, and uses Euclidean distance as the image similarity measure in the new feature space Search to achieve a higher MAP.

2、本发明方法直接解线性变换矩阵，无需使用费时的半正定规划算法进行求解，并可通过直接调整变换矩阵的维数对原始特征进行维数约减，而不是对半正定矩阵的秩进行约束或间接地调节正则化参数实现维数约减。该方法以检索精度指标MAP的平滑逼近为目标函数，相对于其他一些距离测度学习方法，优化过程和目标更为直接，映射后的低层图像特征与高层语义特征联系更为紧密，检索性能更优。在基于内容的医学图像检索应用中，本发明比其他一些距离测度学习方法得到的距离测度可达到更好的检索性能。2. The method of the present invention directly solves the linear transformation matrix without using the time-consuming semi-positive definite programming algorithm to solve it, and can directly adjust the dimension of the transformation matrix to perform dimension reduction on the original features instead of performing the rank reduction on the semi-positive definite matrix. Constrained or indirectly adjusted regularization parameters to achieve dimensionality reduction. This method takes the smooth approximation of the retrieval accuracy index MAP as the objective function. Compared with some other distance measure learning methods, the optimization process and objectives are more direct. The mapped low-level image features are more closely related to the high-level semantic features, and the retrieval performance is better. . In content-based medical image retrieval applications, the invention can achieve better retrieval performance than distance measures obtained by other distance measure learning methods.

3、由于对所有查询图像-待查询图像对计算目标函数及梯度的计算量非常大，因此，本方法中采用随机梯度下降算法进行优化，可减小对计算机内存的需求、加快算法收敛速度；此外，由于目标函数非凸，解空间中存在很多局部极值点，为避免解陷入局部极值点，通过其他较快速的距离测度学习方法估计一个较接近最优解的初始解，然后多次重复优化过程取最优解。3. Since the calculation of the objective function and gradient for all query images-to-be-queried images is very large, therefore, this method adopts the stochastic gradient descent algorithm for optimization, which can reduce the demand for computer memory and speed up the convergence of the algorithm; In addition, since the objective function is non-convex, there are many local extremum points in the solution space. In order to prevent the solution from falling into local extremum points, an initial solution closer to the optimal solution is estimated by other faster distance measure learning methods, and then multiple times Repeat the optimization process to get the optimal solution.

附图说明Description of drawings

图1是本发明的最大化平均查准率均值的距离测度优化方法的流程图；Fig. 1 is the flow chart of the distance measure optimization method of maximizing the average precision rate mean value of the present invention;

图2是本发明优化检索精度指标MAP过程中MAP随优化算法迭代次数的收敛情况示意图；Fig. 2 is a schematic diagram of the convergence of MAP with the number of optimization algorithm iterations in the process of optimizing the retrieval precision index MAP of the present invention;

图3是本发明与其他距离测度优化方法得到的距离测度用于检索的查全-查准率曲线；Fig. 3 is the recall-precision rate curve that the distance measure obtained by the present invention and other distance measure optimization methods is used for retrieval;

图4是四种距离测度在返回图像为50幅时的查准率均值曲线图；Fig. 4 is a curve diagram of the average precision rate of four kinds of distance measures when the return image is 50 pieces;

图5是使用本发明优化得到的距离测度用于脑部肿瘤T1加权增强MRI图像病检索的示例Fig. 5 is an example of using the distance measure optimized by the present invention for brain tumor T1 weighted enhanced MRI image disease retrieval

具体实施方式Detailed ways

图1示出了本发明的最大化平均查准率均值的距离测度优化方法的具体流程，包括以下步骤：Fig. 1 shows the specific process of the distance measure optimization method of maximizing the average precision rate mean value of the present invention, comprising the following steps:

(1)获取包含人工标注的查询图像-待查询图像对的集合的图像数据库；图像数据库由图像或图像区域特征、图像类别构成。(1) Obtain an image database containing a collection of manually labeled query image-to-be-queried image pairs; the image database is composed of images or image region features, and image categories.

(2)采用平滑函数对MAP进行逼近，将其转换为目标函数并添加正则化项，使得目标函数是连续可微的；变换矩阵初始解可取一个随机矩阵或由其他较快速的距离测度学习方法估计的变换矩阵。对MAP进行逼近的过程为：对给定的查询图像，任一待查询图像的排序序号使用Sigmoid型函数进行平滑逼近，进而得到MAP的平滑逼近。(2) Approximate MAP with a smooth function, convert it into an objective function and add a regularization term, so that the objective function is continuously differentiable; the initial solution of the transformation matrix can be a random matrix or other faster distance measure learning methods Estimated transformation matrix. The process of approximating the MAP is: for a given query image, the sequence number of any image to be queried is smoothly approximated using a Sigmoid function, and then the smooth approximation of the MAP is obtained.

待查询图像的排序序号由该待查询图像和其他所有待查询图像与查询图像之间的距离进行比较确定。The sequence number of the image to be queried is determined by comparing the distance between the image to be queried and all other images to be queried and the query image.

(3)使用图像数据库中的查询图像-待查询图像并采用随机梯度下降算法来优化目标函数，重复优化过程，取使MAP最大的解作为最优的线性变换矩阵；(3) Use the query image in the image database-the image to be queried and use the stochastic gradient descent algorithm to optimize the objective function, repeat the optimization process, and take the solution that maximizes the MAP as the optimal linear transformation matrix;

(4)图像间的距离测度用于对图像进行检索，步骤(3)中的最优线性变换矩阵的使用方式包括两种：一种是用线性变换矩阵对图像数据库中的图像特征先进行变换，再用变换特征的欧氏距离定义图像间的距离测度，从而进行图像查询；另一种是用线性变换矩阵定义马氏距离作为图像的距离测度进行图像查询。(4) The distance measure between images is used to retrieve images, and the optimal linear transformation matrix in step (3) includes two ways: one is to use the linear transformation matrix to transform the image features in the image database first , and then use the Euclidean distance of the transformation feature to define the distance measure between images, so as to perform image query; the other is to use the linear transformation matrix to define the Mahalanobis distance as the distance measure of the image for image query.

下面结合脑部T1加权增强MRI图像检索问题，阐述本发明的工作原理和步骤。In the following, the working principle and steps of the present invention will be described in combination with the T1 weighted enhanced MRI image retrieval problem of the brain.

步骤1，从图像数据库读取MRI图像、图中肿瘤的病理类型和医生手工勾画的病灶轮廓，提取病灶区域灰度、纹理、形状和边缘等特征，特征向量表示为x_i。定义包含相同类型肿瘤的图像为相关图像，否则为不相关图像。图像检索的任务是将数据库中与查询图像中病灶区域病理类型相同的图像返回给用户。Step 1. Read the MRI image, the pathological type of the tumor in the image, and the outline of the lesion manually drawn by the doctor from the image database, and extract features such as grayscale, texture, shape, and edge of the lesion area, and the feature vector is expressed as x _i . Define images containing the same type of tumors as correlated images, otherwise as irrelevant images. The task of image retrieval is to return images of the same pathological type as the lesion area in the query image in the database to the user.

步骤2，由于MAP与待查询图像的排序有关，首先依据查询图像与待查询图像之间的距离确定和逼近每一待查询图像的排序序号。设L为一d×D(d≤D)实数矩阵，半正定矩阵W＝L^TL，则x_i与x_j之间的马氏距离(平方)为Step 2, since MAP is related to the ordering of the images to be queried, firstly determine and approximate the ordering number of each image to be queried according to the distance between the image to be queried and the image to be queried. Let L be a d×D (d≤D) real number matrix, positive semi-definite matrix W=L ^T L, then the Mahalanobis distance (square) between x _i and x _j is

d_ij＝||Lx_i-Lx_j||²＝(x_i-x_j)^TL^TL(x_i-x_j)＝(x_i-x_j)^TW(x_i-x_j)，d _ij ＝||Lx _i -Lx _j || ² ＝(xi _-x _j ) ^T L ^T L( _xi -x _j )＝( _xi -x _j ) ^T W( _xi -x _j ),

这里^T表示矩阵的转置操作。设查询特征向量为x_q，查询-待查询对为(x_q，{x_i，i＝1，2，...，N})，这里N为待查询图像总数，待查询特征向量x_i的排序序号可表达为Here ^T represents the transpose operation of the matrix. Let the query feature vector be x _q , the query-to-query pair is (x _q , { _xi , i=1, 2, ..., N}), where N is the total number of images to be queried, and the eigenvector to be queried x _i The sort sequence number of can be expressed as

$π π (({x x}_{i i})) = = 11 + + \underset{k k,, k k &NotEqual; &NotEqual; i i}{Σ Σ} 11 {{{d d}_{qi qi} > > {d d}_{qk qk}}} = = 11 + + \underset{k k,, k k &NotEqual; &NotEqual; i i}{Σ Σ} 11 {{Δ Δ {d d}_{ik ik} > > 00}},,$

其中Δd_ik＝d_qi-d_qk，1{}为指示函数。由于指示函数不连续、不可微，排序序号π(x_i)作为距离d_qi或L的函数也是不连续、不可微的。指示函数可用连续可微的Sigmoid型函数，记为S(Z)。排序序号π(x_i)的平滑逼近为 Wherein Δd _ik =d _qi -d _qk , 1{} is an indicator function. Since the indicator function is discontinuous and non-differentiable, the sequence number π( _xi ) as a function of the distance d _qi or L is also discontinuous and non-differentiable. The indicator function can be a continuously differentiable Sigmoid function, denoted as S(Z). The smooth approximation of the sequence number π(xi ₎ is

检索精度指标平均查准率(Average Precision，AP)定义为

其中rel(x_i)∈{0，1}表示两级相关的标号，N⁺为与查询图像相关的图像总数，

MAP为平均查准率在所有查询对上的平均。AP可写成如下形式The average precision rate (Average Precision, AP) of retrieval precision index is defined as

where rel( _xi )∈{0, 1} represents the label of two-level correlation, N ⁺ is the total number of images related to the query image,

MAP is the average of the average precision over all query pairs. AP can be written as follows

$AP AP = = \frac{11}{{N N}^{+ +}} \underset{i i}{Σ Σ} ((\frac{rel rel (({x x}_{i i}))}{π π (({x x}_{i i}))} + + \underset{i i,, i i &NotEqual; &NotEqual; j j}{Σ Σ} rel rel (({x x}_{i i})) rel rel (({x x}_{j j})) \frac{11 {{Δ Δ {d d}_{ij ij} > > 00}}}{π π (({x x}_{i i}))})) . .$

用

替代π(x_i)，并再次用Sigmoid函数逼近指示函数，AP的平滑逼近为use

Substituting π( _xi ), and again using the Sigmoid function to approximate the indicator function, the smooth approximation of AP is

$\overset{^^}{A A} P P = = \frac{11}{{N N}^{+ +}} {Σ Σ}_{i i = = 11}^{N N} ((\frac{rel rel (({x x}_{i i}))}{\overset{^^}{π π} (({x x}_{i i}))} + + \underset{k k,, k k &NotEqual; &NotEqual; i i}{Σ Σ} \frac{rel rel (({x x}_{i i})) rel rel (({x x}_{k k})) S S ((Δ Δ {d d}_{ik ik}))}{\overset{^^}{π π} (({x x}_{i i}))})) . .$

变换矩阵L的正则化项表示为Reg(L)。Reg(L)可取trace(L^TL)，这里trace(M)表示矩阵M的迹；Reg(L)也可取L的其他范数形式。The regularization term of the transformation matrix L is denoted as Reg(L). Reg(L) can take trace(L ^T L), where trace(M) represents the trace of matrix M; Reg(L) can also take other norm forms of L.

最大化MAP，等价于最小化1-MAP。定义平均查准率损失函数：Maximizing MAP is equivalent to minimizing 1-MAP. Define the average precision loss function:

${Loss loss}_{AP AP} ((L L)) = = 11 - - \overset{^^}{A A} P P,,$

本发明定义的目标函数具有如下形式：The objective function defined by the present invention has the following form:

R(L)＝Loss_AP(L)+γReg(L)，R(L)=Loss _AP (L)+γReg(L),

其中γ为权重系数，用于控制损失函数和正则化项之间的平衡。R(L)是连续可微的。where γ is a weight coefficient used to control the balance between the loss function and the regularization term. R(L) is continuously differentiable.

本发明的距离测度优化问题表达为：The distance measure optimization problem of the present invention is expressed as:

$L L * * = = \underset{L L &Element; &Element; {R R}^{d d \times \times D D.}}{arg arg min min} \frac{11}{M m} {Σ Σ}_{m m = = 11}^{M m} {R R}_{m m} ((L L)),,$

其中M为查询-待查询图像对总数，R_m(L)为在第m个查询-待查询图像对上的目标函数。Where M is the total number of query-to-query image pairs, and R _m (L) is the objective function on the mth query-to-query image pair.

步骤3，首先估计初始解L₀，L₀可为一d×D随机矩阵或者取其他距离测度学习方法得到的解。第t次迭代时，随机从图像数据库中产生查询-待查询对(x_q，{x_i，i＝1，2，...，N_t})_t，计算R(L_t)在该查询对上关于L_t的梯度g_t，更新L_t＝L_t-1-η_tg_t，这里

t＝1，2，...，T为迭代次数，T为设定的最大迭代次数，η_t为第t次迭代的步长；重复迭代，直至达到最大迭代次数或解收敛条件。多次重复上述过程，选取使检索精度指标最大的解作为L^*。如取trace(L^TL)作为正则化项，则梯度g_t为：Step 3, first estimate the initial solution L ₀ , _which can be a d×D random matrix or a solution obtained by other distance measure learning methods. In the tth iteration, the query-to-be-queried pair (x _q , { _xi , i=1, 2, ..., N _t }) _t is randomly generated from the image database, and R(L _t ) is calculated in the query For the gradient g _t about L _t above, update L _t =L _t-1 -η _t g _t , where

t=1, 2,..., T is the number of iterations, T is the maximum number of iterations set, η _t is the step size of the tth iteration; repeat iterations until reaching the maximum number of iterations or solving the convergence condition. The above process is repeated many times, and the solution that maximizes the index of retrieval accuracy is selected as L ^* . If trace(L ^T L) is used as the regularization term, the gradient g _t is:

${g g}_{t t} = = \frac{&PartialD; &PartialD; R R (({L L}_{t t}))}{&PartialD; &PartialD; {L L}_{t t}} = = - - \frac{&PartialD; &PartialD; {Loss loss}_{AP AP} (({L L}_{t t}))}{&PartialD; &PartialD; {L L}_{t t}} + + 22 γ γ {L L}_{t t}$

损失函数关于L_t的偏导

通过链式法则得到。The partial derivative of the loss function with respect to L _t

obtained by the chain rule.

算法参数α、γ可通过交叉检验方法进行调整，经验地α取1～10，γ取0.0001～0.01可达到满意效果。梯度下降的步长可取η_t＝η₀exp(-2t/T)，t为当前迭代次数，也可采用其他方式调整步长。图2示出了本发明优化检索精度指标MAP过程中MAP随优化算法迭代次数的收敛情况。Algorithm parameters α and γ can be adjusted through the cross-check method. Empirically, α ranges from 1 to 10, and γ ranges from 0.0001 to 0.01 to achieve satisfactory results. The step size of gradient descent can be η _t = η ₀ exp(-2t/T), t is the current number of iterations, and the step size can also be adjusted in other ways. FIG. 2 shows the convergence of MAP with the number of optimization algorithm iterations in the process of optimizing the retrieval accuracy index MAP in the present invention.

步骤4，对于步骤3得到的线性变换L^*，有两种等价的方式应用于图像检索。一是用L^*将原始特征向量映射至新的特征空间，然后用标准的欧氏距离作为图像的相似性度量；一是直接用L^*定义马氏距离作为图像的相似性度量进行检索，前者可减小特征所需的存储空间、加快图像检索速度。Step 4, for the linear transformation L ^* obtained in step 3, there are two equivalent ways to apply to image retrieval. One is to use L ^* to map the original feature vector to a new feature space, and then use the standard Euclidean distance as the image similarity measure; the other is to directly use L ^* to define the Mahalanobis distance as the image similarity measure for retrieval, the former It can reduce the storage space required by features and speed up image retrieval.

图3为本发明与其他距离测度学习方法得到的距离测度用于检索的查全-查准率曲线，其中的实验数据为脑部T1加权增强MRI图像，包含相同类型肿瘤的图像为相关图像，反之为不相关图像。比较用的距离测度包括标准的欧氏距离、由大间隔近邻分类(LMNN)和局部Fisher判别分析(LFDA)方法学习得到的马氏距离。可见本发明方法可达到更高的MAP，优于欧氏距离和其他两种距离测度学习方法。Fig. 3 is the recall-precision rate curve of the distance measure obtained by the present invention and other distance measure learning methods for retrieval, wherein the experimental data are T1 weighted enhanced MRI images of the brain, and images containing the same type of tumor are related images, Otherwise, it is an irrelevant image. The distance measures used for comparison include standard Euclidean distance, Mahalanobis distance learned by Large Margin Neighbor Classification (LMNN) and Local Fisher Discriminant Analysis (LFDA) methods. It can be seen that the method of the present invention can achieve a higher MAP, which is better than the Euclidean distance and the other two distance measure learning methods.

图4为四种距离测度在返回图像为50幅时的查准率均值曲线，所用图像数据与图3相同。本发明中的距离测度优化方法，在返回前5-50图像时，查准率均值高于其他三种距离测度。Figure 4 shows the average precision curves of the four distance measures when 50 images are returned, and the image data used is the same as that in Figure 3. In the distance measure optimization method in the present invention, when returning the first 5-50 images, the average precision rate is higher than the other three distance measures.

图5为使用本发明得到的距离测度用于脑部肿瘤T1加权增强MRI图像病检索的示例，图中左上角的图像为查询图像，加实线框(白色线框)的表示检索到的相关图像(表示肿瘤类型相同)，加虚线框(白色虚线框)的表示检索到的不相关图像(表示肿瘤类型不相同)，图中肿瘤区域轮廓已人工勾出。上述实验数据表明本发明比其他一些距离测度学习方法得到的距离测度可达到更好的检索性能。Fig. 5 is the example that uses the distance measure that the present invention obtains to be used for brain tumor T1 weighted enhanced MRI image disease retrieval, the image in the upper left corner among the figure is query image, and the representation retrieval of solid line frame (white line frame) Images (indicating the same type of tumor), those with a dotted line frame (white dotted line frame) indicate irrelevant images retrieved (indicating different tumor types), and the outline of the tumor area in the figure has been drawn manually. The above experimental data show that the distance measure obtained by the present invention can achieve better retrieval performance than some other distance measure learning methods.

在本发明的实施方式不限于此，其他类似MAP由待查询图像排序序号定义的检索精度指标，如Prec@k和NDCG等，也可以使用类似上述的公式和方法进行逼近，然后用本发明的优化方法进行求解，在此不一一列举；其他一些基于梯度下降的优化技术和步长调整方式，如共轭梯度下降算法也可用于实现本发明目的；根据本发明的上述内容，按照本领域的普通技术知识和惯用手段，在不脱离本发明上述基本技术思想前提下，本发明还可以做出其它多种形式的等效修改、替换或变更，均落在本发明权利保护范围之内。Embodiments of the present invention are not limited thereto. Other similar MAP retrieval accuracy indicators defined by the sorting numbers of images to be queried, such as Prec@k and NDCG, can also be approximated using formulas and methods similar to the above, and then use the method of the present invention Optimization method is solved, not enumerating one by one here; Other optimization techniques and step size adjustment methods based on gradient descent, such as conjugate gradient descent algorithm can also be used to realize the object of the present invention; According to the above-mentioned content of the present invention, according to the art Without departing from the above-mentioned basic technical idea of the present invention, the present invention can also make other equivalent modifications, replacements or changes, all of which fall within the protection scope of the present invention.

Claims

One kind the maximization average precision ratio average the distance measure optimization method, may further comprise the steps:

(1) obtains the image data base that comprises query image-image pair set to be checked;

(2) adopt smooth function that MAP is approached, be converted into objective function and add regularization term, make that objective function is continuously differentiable;

(3) query image-image to be checked in the use image data base is right, adopts the method that descends based on gradient to be optimized to objective function, and repeats optimizing process, gets the matrix of a linear transformation of separating the conduct optimum that makes the MAP maximum;

(4) image pitch comprises dual mode from obtaining of estimating: the optimum linearity transformation matrix in the usefulness step (3) is to the advanced line translation of the characteristics of image in the image data base, again with the distance measure between the Euclidean distance definition image of transform characteristics; Perhaps use the optimum linearity transformation matrix in the step (3) to define mahalanobis distance as the distance measure between image.
2. the distance measure optimization method of the average precision ratio average of maximization according to claim 1 is characterized in that: the image data base in the described step (1) is made of image or image-region feature, image category.
3. the distance measure optimization method of the average precision ratio average of maximization according to claim 1, it is characterized in that: the inquiry in the described step (1)-right set of image to be checked is that whether foundation comprises same anatomical, the same type focus determines that image to whether similar, marks.
4. the distance measure optimization method of the average precision ratio average of maximization according to claim 1 is characterized in that: the transformation matrix that obtains with stochastic matrix or other distance measure learning methods in the described step (2) is as the initial solution of the matrix of a linear transformation.
5. the distance measure optimization method of the average precision ratio average of maximization according to claim 1, it is characterized in that: the process of in the described step (2) MAP being approached is: to given query image, the ordering sequence number of arbitrary image to be checked uses the Sigmoid type function smoothly to approach, and then obtains smoothly approaching of MAP.
6. the distance measure optimization method of the average precision ratio average of maximization according to claim 1 is characterized in that: adopting at random in the described step (3), gradient descent algorithm comes the optimization aim function.