CN110443257B - A saliency detection method based on active learning - Google Patents
- Publication number
- CN110443257B (application CN201910609780.2A)
- Authority
- CN
- China
- Prior art keywords
- sample
- samples
- score
- image
- positive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/2163—Partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
Description
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to computer vision, and in particular relates to an image saliency detection method.
Background Art
With the rapid development of the economy and of science and technology, people are constantly exposed to all kinds of fragmented information, of which image and video data are the most abundant and the most important. How to process image data quickly and effectively has therefore become a difficult problem. Usually, people attend only to the most eye-catching regions of an image, i.e., the foreground regions or salient objects, while ignoring the background. Computers are therefore used to simulate the human visual system and perform saliency detection. At present, saliency research is widely applied in many areas of computer vision, including image retrieval, image classification, object recognition, and image segmentation.
The goal of saliency detection is to accurately detect the salient objects in an image. Saliency detection algorithms based on supervised learning share a common problem: model training usually requires a large amount of manually labeled data, labeling salient regions consumes considerable resources, and many training samples contain redundant information that can even degrade model accuracy.
Summary of the Invention
The technical problem to be solved by the present invention is to remedy the deficiencies of the existing methods described above and to propose an image saliency detection method based on active learning, so as to obtain higher model accuracy with fewer training samples.
Technical solution of the present invention:
A saliency detection method based on active learning, comprising the following steps:
(1) First, 500 images are randomly selected from the MSRA database and added to the training set L as the initial training set; region candidate segmentations (proposals) are generated for every image, and region-level CNN features are extracted for all proposals.
(2) Positive and negative samples among the region candidate segmentations are defined. A confidence score is designed to measure how a sample scores against the foreground and background of the ground-truth map. The score is computed in advance from two components A and C, where A is an accuracy score and C is a coverage score, both derived from the overlap between Oi, the candidate object segmentation of the i-th sample, and G, the ground-truth map of the image; ξ is a weight that balances the accuracy score against the coverage score. In this method, samples with a confidence value above 0.9 are regarded as positive samples, and samples with a confidence value below 0.6 are regarded as negative samples. Because the number of positive samples turns out to be far smaller than the number of negative samples, all positive samples are used and an equal number of negative samples is selected at random. For ranking support vector machine training, the positive and negative samples are combined into sample pairs: a positive sample paired against a negative sample is defined as a positive pair, and the reverse as a negative pair.
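The exact formulas for the confidence value and for the accuracy and coverage scores do not survive in this text. The sketch below is therefore only an illustration of the sample-definition step: it assumes the common overlap definitions A = |Oi ∩ G| / |Oi| and C = |Oi ∩ G| / |G| and a weighted combination ξ·A + (1 − ξ)·C with ξ = 0.5; the 0.9 / 0.6 thresholds and the pairing rule follow the text.

```python
import numpy as np

def confidence_score(proposal_mask, gt_mask, xi=0.5):
    """Confidence of a region proposal against the ground-truth map G.

    A = |O ∩ G| / |O| (accuracy), C = |O ∩ G| / |G| (coverage); the weighted
    combination and xi = 0.5 are assumptions, the text only states that xi
    balances the two scores."""
    o = np.asarray(proposal_mask, dtype=bool)
    g = np.asarray(gt_mask, dtype=bool)
    inter = np.logical_and(o, g).sum()
    A = inter / max(o.sum(), 1)        # accuracy score
    C = inter / max(g.sum(), 1)        # coverage score
    return xi * A + (1.0 - xi) * C

def build_ranking_pairs(features, confidences, pos_thr=0.9, neg_thr=0.6, seed=0):
    """Positives: confidence > 0.9; negatives: confidence < 0.6, subsampled to
    the number of positives.  A (positive, negative) pair is labeled +1 and the
    reversed pair -1, as required for ranking-SVM training."""
    rng = np.random.default_rng(seed)
    pos = [i for i, c in enumerate(confidences) if c > pos_thr]
    neg = [i for i, c in enumerate(confidences) if c < neg_thr]
    k = min(len(pos), len(neg))
    neg = rng.choice(neg, size=k, replace=False).tolist() if k else []
    pairs, labels = [], []
    for p in pos:
        for n in neg:
            pairs.append((features[p], features[n])); labels.append(+1)
            pairs.append((features[n], features[p])); labels.append(-1)
    return pairs, labels
```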
A ranking support vector machine and subspace learning are trained jointly, yielding a ranker KSR that ranks a sample's region candidate segmentations by saliency; higher-ranked candidates are more similar to the foreground. Here w is the ranking coefficient of the ranking support vector machine; the objective uses a logistic loss function, where a is the loss-function parameter and e is the exponential function; φ(xi) denotes the feature xi of a sample after kernel mapping; p is the number of pair constraints over samples xin and xjn; (in, jn) are the indices of the n-th pair constraint; yn ∈ {+1, −1} indicates whether the two samples of a pair belong to the same class or to different classes, i.e., whether or not they both belong to the foreground or both to the background; L ∈ R^(l×d) (l < d) is the learned mapping matrix, d being the initial feature dimension and l the feature dimension after mapping; μ and λ denote the regularization parameters.
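The joint objective itself is given by a formula that is not reproduced in this text. As a rough illustration only, the sketch below minimizes a pairwise logistic loss of the form l(z) = (1/a)·log(1 + exp(−a·z)) over scores si = wᵀ P ki, with l2 regularization on w and P, by plain gradient descent; the concrete loss, regularizers and optimizer of the KSR model are assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between row-feature sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_ksr_toy(X, pairs, y, l_dim=64, a=1.0, mu=1e-3, lam=1e-3,
                  lr=1e-2, iters=200, seed=0):
    """Toy joint optimization of the subspace matrix P (l_dim x N) and the
    ranking vector w over pairwise constraints.  X: (N, D) proposal features,
    pairs: list of (i, j) index pairs, y: +1/-1 label per pair."""
    rng = np.random.default_rng(seed)
    K = rbf_kernel(X, X)                    # kernel matrix; k_i = K[:, i]
    N = X.shape[0]
    P = 0.01 * rng.standard_normal((l_dim, N))
    w = 0.01 * rng.standard_normal(l_dim)
    for _ in range(iters):
        gP, gw = lam * P, mu * w            # gradients of the l2 regularizers
        for (i, j), yn in zip(pairs, y):
            dk = K[:, i] - K[:, j]          # kernel-feature difference
            z = yn * (w @ (P @ dk))
            g = -yn / (1.0 + np.exp(a * z)) # derivative of the logistic loss
            gw += g * (P @ dk)
            gP += g * np.outer(w, dk)
        w -= lr * gw
        P -= lr * gP
    return w, P

def ksr_score(w, P, k_cols):
    """Ranking score s_i = w^T P k_i for each kernel column in k_cols (N x M)."""
    return w @ (P @ k_cols)
```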
Training samples are selected by active learning. First, the model obtained from the initialization above is used to rank, by saliency, the candidate object segmentations of all samples in the unlabeled pool; the ranking score is obtained as si = wᵀ P ki. To simplify the computation of the joint training, L = P φᵀ(X) is introduced, where P ∈ R^(l×N), N is the number of samples and φᵀ(X) is the kernel operation; a kernel function is introduced during the simplification. All ranking scores si of an image are min–max normalized, i.e., the normalized score is (si − smin)/(smax − smin), where smin and smax are the smallest and largest ranking scores in the image's score set. For every image, the candidate object segmentations whose normalized ranking scores lie between 0.4 and 0.9 are collected; Xp denotes this selected set of candidate segmentations and X denotes the set of all candidate segmentations of the image. The proportion β = card(Xp)/card(X) of candidate segmentations whose ranking scores lie between 0.4 and 0.9 is then computed, where card(X) is the total number of candidate segmentations of the image and card(Xp) is the number of candidates in Xp. This proportion is taken as the uncertainty value of the image. In this way an uncertainty-value set Β = {β1, β2, …, βn} is obtained for all unlabeled samples in the pool; the samples with high uncertainty are manually labeled and added to the training set. Each selection keeps the samples whose uncertainty β is greater than μ0 + λ0δ, where μ0 is the mean of the uncertainty set B, δ is its standard deviation and λ0 is a weight parameter, chosen as λ0 = 1.145; the samples selected in each round form the set Quc. A density clustering algorithm is applied to the image set Quc with the best parameters ε = 0.05 and MinPts = 2, so that two or more samples within the neighborhood of a core point are grouped into one cluster. After clustering, high-density sample clusters C = {c1, c2, …, cn} and clusters containing only one isolated sample O = {o1, o2, …, om} are obtained, and the image set Quc is partitioned as Quc = {ci, i = 1, 2, …, n} ∪ {oi, i = 1, 2, …, m}. From every high-density cluster ct, the sample Ut with the largest uncertainty is selected; in addition, all isolated samples are added to the candidate set Q, since such sample points increase the generalization ability of the trained model. The final candidate set is Q = {Ut, t = 1, …, n} ∪ {Oi, i = 1, …, m}. The sample set Q represents the samples selected in each round by the selection model that jointly considers uncertainty and diversity; after manual labeling they are added to the training set L.
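A small sketch of the per-image uncertainty β and of the μ0 + λ0δ selection rule described above, assuming min–max normalization of the ranking scores as stated:

```python
import numpy as np

def image_uncertainty(scores, low=0.4, high=0.9):
    """beta: fraction of an image's proposal ranking scores that fall, after
    min-max normalization, in the ambiguous band [0.4, 0.9]."""
    s = np.asarray(scores, dtype=float)
    s_norm = (s - s.min()) / max(s.max() - s.min(), 1e-12)
    return float(np.mean((s_norm >= low) & (s_norm <= high)))

def select_uncertain(betas, lambda0=1.145):
    """Indices of images whose uncertainty exceeds mu0 + lambda0 * delta."""
    b = np.asarray(betas, dtype=float)
    return np.where(b > b.mean() + lambda0 * b.std())[0]
```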
(3) The sample set Q selected above is manually labeled and added to the training set L; a ranker KSR is trained again with the updated training set L, and the performance of the retrained model is verified on the validation set. Step (2) is then repeated until the model performance changes only slightly or begins to decrease; the training set selected in the previous iteration is taken as the final training set, and the model trained on it as the final training model. This model ranks the region candidate segmentations of each test image by saliency, and the top 16 region candidate segmentations are weighted and fused to obtain the saliency map Mp of the image.
(4) The saliency map Mp obtained in step (3) still handles the edge details of the object insufficiently, so the present invention proposes a processing method at the superpixel level in order to refine the boundary. First, the SLIC superpixel segmentation algorithm is applied with 100, 150 and 200 superpixels respectively to construct the superpixel set SPi of image i, and the CNN feature xj of each superpixel is extracted. The saliency map Mp obtained in step (3) is binarized to serve as the prior saliency map Ei. Positive and negative superpixel samples are then determined: to obtain the highest confidence, the superpixels that lie entirely inside the foreground region of the prior saliency map Ei form the positive sample set POi, and the superpixels that lie entirely inside the background region of Ei form the negative sample set Ni. The samples of the positive and negative sets are combined into positive and negative pairs, and a model KSRi specific to image i is self-trained. With this model KSRi, all superpixels of the image are scored using si = wᵀ P ki and sorted, S = {s1, s2, …, sn}; the higher the score, the closer a superpixel is to the foreground, and the lower the score, the closer it is to the background. The score of each pixel inside a superpixel is then obtained, all scores are normalized to the range 0 to 1, and a weighted fusion finally yields the superpixel-level saliency map Ms. The final saliency map is obtained as M = w1 × Mp + w2 × Ms, where M is the final saliency map, Mp is the original saliency map, and Ms is the superpixel-level saliency map.
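An illustrative sketch of the superpixel-level refinement follows. It uses the SLIC implementation from scikit-image at the three stated granularities; the helpers extract_feat (the CNN feature of a superpixel) and train_ranker (the self-trained per-image ranker KSRi) are placeholders, and the equal-weight averaging of the three granularities into Ms is an assumption.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_saliency(image, prior_map, extract_feat, train_ranker,
                        n_segments=(100, 150, 200), thr=0.5):
    """Superpixel-level refinement of a proposal-level saliency map M_p.

    extract_feat(image, mask) -> feature vector   (stand-in for the CNN feature x_j)
    train_ranker(pos_feats, neg_feats) -> scorer  (stand-in for the per-image KSR_i)"""
    prior = np.asarray(prior_map) >= thr          # binarized prior map E_i
    maps = []
    for n in n_segments:                          # 100, 150 and 200 superpixels
        labels = slic(image, n_segments=n, start_label=0)
        ids = np.unique(labels)
        feats = [extract_feat(image, labels == sp) for sp in ids]
        pos = [f for sp, f in zip(ids, feats) if prior[labels == sp].all()]
        neg = [f for sp, f in zip(ids, feats) if (~prior[labels == sp]).all()]
        scorer = train_ranker(pos, neg)           # self-trained on this image only
        scores = np.array([scorer(f) for f in feats], dtype=float)
        scores = (scores - scores.min()) / max(np.ptp(scores), 1e-12)
        sal = np.zeros(prior.shape, dtype=float)
        for sp, s in zip(ids, scores):
            sal[labels == sp] = s                 # broadcast the score to its pixels
        maps.append(sal)
    return np.mean(maps, axis=0)                  # fused superpixel-level map M_s
```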
Beneficial effects of the present invention: the saliency detection algorithm based on active learning proposed by the present invention applies the idea of active learning to the field of saliency detection. By taking both the uncertainty and the diversity of samples into account, the samples most useful for model training are selected from the unlabeled sample set and added to the training set, and the final KSR model is trained; this model outputs the initial saliency map of a test sample. Afterwards, in order to refine the object boundaries of the saliency map, a superpixel-level post-processing method is designed to further improve performance. The invention reduces the labeling cost and the redundancy of the training set, so that the experimental results are greatly improved compared with the original KSR model. Comparative experiments also show that the performance of the proposed method is superior to that of many classical algorithms.
Brief Description of the Drawings
Figure 1 is the basic flow chart of the present invention.
Figure 2 is the initial saliency map obtained by applying active learning to the KSR model.
Figure 3 is the final saliency map obtained by applying superpixel-level post-processing fusion to the initial saliency map.
Detailed Description of the Embodiments
The specific embodiments of the present invention are further described below with reference to the accompanying drawings and the technical solution.
The idea of the present invention is as follows. The training process of supervised learning usually requires a large amount of manually labeled data, labeling salient regions consumes considerable resources, and many training samples contain redundant information that can even degrade model accuracy. Active learning uses a selection mechanism to choose the most informative samples for training, so that higher model accuracy is obtained with fewer training samples. On this basis, the present invention combines the idea of Active Learning (AL) with the Kernelized Subspace Ranking (KSR) algorithm. The invention designs a pool-based active learning strategy, i.e., the uncertainty and diversity of unlabeled samples are considered in order to pick informative samples for training, thereby reducing the number of training samples and the labeling cost.
The present invention extracts convolutional neural network (CNN) features of object-level region candidate segmentations (proposals) and jointly learns a ranker by subspace mapping and a ranking support vector machine; the ranker ranks the region candidate segmentations of a test image by saliency, and the top-ranked candidates are weighted and fused to obtain the saliency map. Finally, in order to refine the object boundaries of the saliency map, the invention designs a superpixel-level post-processing method that further improves performance.
The present invention is implemented as follows:
(1) First, 500 images are randomly selected from the MSRA database and added to the training set L as the initial training set; region candidate segmentations (proposals) are generated for every image, and region-level CNN features are extracted for all proposals.
(2) Positive and negative samples among the region candidate segmentations are defined. The algorithm designs a confidence score to measure how a sample scores against the foreground and background of the ground-truth map. The score is computed in advance from two components A and C, where A is an accuracy score and C is a coverage score, both derived from the overlap between Oi, the candidate object segmentation of the i-th sample, and G, the ground-truth map of the image; ξ is a weight that balances the accuracy score against the coverage score. In this algorithm, samples with a confidence value above 0.9 are regarded as positive samples, and samples with a confidence value below 0.6 are regarded as negative samples. Because the number of positive samples is far smaller than the number of negative samples, all positive samples are used and an equal number of negative samples is selected at random. For ranking support vector machine training, the positive and negative samples are combined into sample pairs: a positive sample paired against a negative sample is defined as a positive pair, and the reverse as a negative pair.
A ranking support vector machine and subspace learning are trained jointly, and a ranker KSR is obtained that ranks a sample's region candidate segmentations by saliency; higher-ranked candidates are more similar to the foreground. Here w is the ranking coefficient of the ranking support vector machine; the objective uses a logistic loss function, where a is the loss-function parameter and e is the exponential function; φ(xi) denotes the feature xi of a sample after kernel mapping; p is the number of pair constraints over samples xin and xjn; (in, jn) are the indices of the n-th pair constraint; yn ∈ {+1, −1} indicates whether the two samples of a pair belong to the same class or to different classes, i.e., whether or not they both belong to the foreground or both to the background; L ∈ R^(l×d) (l < d) is the learned mapping matrix, d being the initial feature dimension and l the feature dimension after mapping; μ and λ denote the regularization parameters.
Training samples are selected by active learning. First, the model obtained from the initialization above is used to rank, by saliency, the candidate object segmentations of all samples in the unlabeled pool; the ranking score is obtained as si = wᵀ P ki. To simplify the computation of the joint training, L = P φᵀ(X) is introduced, where P ∈ R^(l×N), N is the number of samples and φᵀ(X) is the kernel operation; during the simplification this patent introduces the kernel function. All ranking scores si of an image are min–max normalized, i.e., the normalized score is (si − smin)/(smax − smin), where smin and smax are the smallest and largest ranking scores in the image's score set. For every image, the candidate object segmentations whose normalized ranking scores lie between 0.4 and 0.9 are collected; Xp denotes this selected set of candidate segmentations and X denotes the set of all candidate segmentations of the image. The proportion β = card(Xp)/card(X) of candidate segmentations whose ranking scores lie between 0.4 and 0.9 is then computed, where card(X) is the total number of candidate segmentations of the image and card(Xp) is the number of candidates in Xp. This proportion is taken as the uncertainty value of the image. In this way an uncertainty-value set Β = {β1, β2, …, βn} is obtained for all unlabeled samples in the pool; the samples with high uncertainty are manually labeled and added to the training set. Each selection keeps the samples whose uncertainty β is greater than μ0 + λ0δ, where μ0 is the mean of the uncertainty set B, δ is its standard deviation and λ0 is a weight parameter, chosen as λ0 = 1.145; the samples selected in each round form the set Quc. A density clustering algorithm is applied to the image set Quc; the best parameters obtained through experiments are ε = 0.05 and MinPts = 2, so that two or more samples within the neighborhood of a core point are grouped into one cluster. After clustering, high-density sample clusters C = {c1, c2, …, cn} and clusters containing only one isolated sample O = {o1, o2, …, om} are obtained, and the image set Quc is partitioned as Quc = {ci, i = 1, 2, …, n} ∪ {oi, i = 1, 2, …, m}. From every high-density cluster ct, the sample Ut with the largest uncertainty is selected; in addition, all isolated samples are added to the candidate set Q, since such sample points increase the generalization ability of the trained model. The final candidate set is Q = {Ut, t = 1, …, n} ∪ {Oi, i = 1, …, m}. The sample set Q represents the samples selected in each round by the selection model that jointly considers uncertainty and diversity; after manual labeling they are added to the training set L.
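The density clustering step is parameterized by ε and MinPts, so the sketch below assumes it is DBSCAN (from scikit-learn) applied to image-level feature vectors; treating DBSCAN noise points as the isolated samples is part of that assumption.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def select_diverse(features, betas, eps=0.05, min_pts=2):
    """From every dense cluster keep the most uncertain image; keep every
    isolated sample as well.  Returns indices into `features`."""
    X = np.asarray(features, dtype=float)
    b = np.asarray(betas, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X)
    chosen = []
    for c in set(labels.tolist()):
        idx = np.where(labels == c)[0]
        if c == -1:                               # noise points = isolated samples
            chosen.extend(idx.tolist())
        else:                                     # dense cluster c_t: keep U_t
            chosen.append(int(idx[np.argmax(b[idx])]))
    return chosen
```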
(3) The sample set Q selected above is manually labeled and added to the training set L; a ranker KSR is trained again with the updated training set L, and the performance of the retrained model is verified on the validation set. Step (2) above is then repeated until the model performance changes only slightly or begins to decrease; the training set selected in the previous iteration is taken as the final training set, and the model trained on it as the final training model. This model ranks the region candidate segmentations of each test image by saliency, and the top 16 region candidate segmentations are weighted and fused to obtain the saliency map Mp of the image.
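The outer active-learning loop of step (3) could be organized as in the following sketch; train_ksr, validate, select_samples and oracle_label stand for the operations described above and are placeholders, and the stopping tolerance is an assumption.

```python
def active_learning_loop(L, pool, train_ksr, validate, select_samples,
                         oracle_label, tol=1e-3, max_rounds=20):
    """Grow the training set L from the unlabeled pool until the validation
    performance changes little or drops, then return the model and the
    training set of the previous round, as described in step (3)."""
    model = train_ksr(L)
    best_perf, best_model, best_L = validate(model), model, list(L)
    for _ in range(max_rounds):
        Q = select_samples(model, pool)              # uncertainty + diversity
        L = list(L) + [oracle_label(q) for q in Q]   # manual annotation
        pool = [x for x in pool if x not in Q]
        model = train_ksr(L)
        perf = validate(model)
        if perf <= best_perf + tol:                  # little change or a drop
            return best_model, best_L
        best_perf, best_model, best_L = perf, model, list(L)
    return best_model, best_L
```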
(4) The saliency map Mp obtained in step (3) still handles the edge details of the object insufficiently, so the present invention proposes a processing method at the superpixel level in order to refine the boundary. First, the SLIC superpixel segmentation algorithm is applied with 100, 150 and 200 superpixels respectively to construct the superpixel set SPi of image i, and the CNN feature xj of each superpixel is extracted. The saliency map Mp obtained in step (3) is binarized to serve as the prior saliency map Ei. Positive and negative superpixel samples are then determined: to obtain the highest confidence, the superpixels that lie entirely inside the foreground region of the prior saliency map Ei form the positive sample set POi, and the superpixels that lie entirely inside the background region of Ei form the negative sample set Ni. The samples of the positive and negative sets are combined into positive and negative pairs, and a model KSRi specific to image i is self-trained. With this model KSRi, all superpixels of the image are scored using si = wᵀ P ki and sorted, S = {s1, s2, …, sn}; the higher the score, the closer a superpixel is to the foreground, and the lower the score, the closer it is to the background. The score of each pixel inside a superpixel is then obtained, all scores are normalized to the range 0 to 1, and a weighted fusion finally yields the superpixel-level saliency map Ms. The final saliency map is obtained as M = w1 × Mp + w2 × Ms, where M is the final saliency map, Mp is the original saliency map, Ms is the superpixel-level saliency map, w1 is set to 1 and w2 to 0.3.
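With the weights stated here (w1 = 1, w2 = 0.3) the final fusion is a direct pixel-wise combination; clipping the result to [0, 1] is an assumption about how the fused map is kept a valid saliency map.

```python
import numpy as np

def fuse_saliency(M_p, M_s, w1=1.0, w2=0.3):
    """Final saliency map M = w1 * M_p + w2 * M_s (w1 = 1, w2 = 0.3 as stated)."""
    M = w1 * np.asarray(M_p, dtype=float) + w2 * np.asarray(M_s, dtype=float)
    return np.clip(M, 0.0, 1.0)
```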
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910609780.2A CN110443257B (en) | 2019-07-08 | 2019-07-08 | A saliency detection method based on active learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910609780.2A CN110443257B (en) | 2019-07-08 | 2019-07-08 | A saliency detection method based on active learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110443257A CN110443257A (en) | 2019-11-12 |
CN110443257B true CN110443257B (en) | 2022-04-12 |
Family
ID=68429598
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910609780.2A Active CN110443257B (en) | 2019-07-08 | 2019-07-08 | A saliency detection method based on active learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110443257B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149688A (en) * | 2020-09-24 | 2020-12-29 | 北京汽车研究总院有限公司 | Image processing method and apparatus, computer readable storage medium, computer equipment |
CN114118413A (en) * | 2021-11-30 | 2022-03-01 | 上海商汤临港智能科技有限公司 | Network training and equipment control method, device, equipment and storage medium |
CN114120048B (en) * | 2022-01-26 | 2022-05-13 | 中兴通讯股份有限公司 | Image processing method, electronic device, and computer-readable storage medium |
CN114332489B (en) * | 2022-03-15 | 2022-06-24 | 江西财经大学 | Image salient target detection method and system based on uncertainty perception |
CN115880268B (en) * | 2022-12-28 | 2024-01-30 | 南京航空航天大学 | Method, system, equipment and medium for detecting inferior goods in plastic hose production |
CN116664492B (en) * | 2023-05-04 | 2024-11-19 | 鲸朵(上海)智能科技有限公司 | Self-learning quality inspection method and device |
CN117173701B (en) * | 2023-08-14 | 2024-11-05 | 西北工业大学 | Semantic segmentation active learning method based on super-pixel feature characterization learning |
CN118196399B (en) * | 2024-05-16 | 2024-07-23 | 创意信息技术股份有限公司 | Target detection model optimization method and device based on active learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103927394A (en) * | 2014-05-04 | 2014-07-16 | 苏州大学 | Multi-label active learning classification method and system based on SVM |
US9414048B2 (en) * | 2011-12-09 | 2016-08-09 | Microsoft Technology Licensing, LLC | Automatic 2D-to-stereoscopic video conversion |
CN107103608A (en) * | 2017-04-17 | 2017-08-29 | 大连理工大学 | Saliency detection method based on region candidate sample selection |
CN107133955A (en) * | 2017-04-14 | 2017-09-05 | 大连理工大学 | Multi-level combined co-saliency detection method |
- 2019-07-08: CN application CN201910609780.2A, patent CN110443257B (en), status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9414048B2 (en) * | 2011-12-09 | 2016-08-09 | Microsoft Technology Licensing, LLC | Automatic 2D-to-stereoscopic video conversion |
CN103927394A (en) * | 2014-05-04 | 2014-07-16 | 苏州大学 | Multi-label active learning classification method and system based on SVM |
CN107133955A (en) * | 2017-04-14 | 2017-09-05 | 大连理工大学 | Multi-level combined co-saliency detection method |
CN107103608A (en) * | 2017-04-17 | 2017-08-29 | 大连理工大学 | Saliency detection method based on region candidate sample selection |
Non-Patent Citations (2)
Title |
---|
Kernelized Subspace Ranking for Saliency Detection; Tiantian Wang et al.; ECCV 2016; 2016-09-17; full text *
Kernel nearest-neighbor convex hull classifier based on kernel subspace sample selection; Zhou Xiaofei et al.; Computer Engineering and Applications; 2007-12-31; Vol. 43, No. 32; full text *
Also Published As
Publication number | Publication date |
---|---|
CN110443257A (en) | 2019-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |