CN110443257B - A saliency detection method based on active learning - Google Patents
- Publication number
- CN110443257B (application CN201910609780.2A)
- Authority
- CN
- China
- Prior art keywords
- sample
- samples
- score
- image
- positive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/2163—Partitioning the feature space
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
Abstract
Description
Technical Field
The invention belongs to the technical field of artificial intelligence, relates to computer vision, and in particular relates to an image saliency detection method.
Background Art
With the rapid development of the economy and of science and technology, people are constantly exposed to all kinds of fragmented information, of which image and video data are the most abundant and the most important. How to process image data quickly and effectively has therefore become a difficult problem. Usually, people attend only to the most eye-catching regions of an image, i.e., the foreground regions or salient objects, while ignoring the background. Computers are therefore used to simulate the human visual system and perform saliency detection. At present, saliency research is widely applied in many areas of computer vision, including image retrieval, image classification, object recognition, and image segmentation.
The goal of saliency detection is to accurately detect the salient objects in an image. Saliency detection algorithms based on supervised learning share a common problem: model training usually requires a large amount of manually labeled data, labeling salient regions consumes considerable resources, and many training samples contain redundant information that can even degrade model accuracy.
Summary of the Invention
The technical problem to be solved by the present invention is to remedy the deficiencies of the existing methods described above and to propose an image saliency detection method based on active learning, so as to obtain higher model accuracy with fewer training samples.
Technical solution of the present invention:
A saliency detection method based on active learning, comprising the following steps:
(1) First, 500 images are randomly selected from the MSRA database and added to the training set L as the initial training set; region candidate segmentations (proposals) are generated for every image, and region-level CNN features are extracted for all proposals.
(2) Positive and negative samples among the region candidate segmentations are defined. A confidence score is designed to measure how a sample scores against the foreground and background of the ground-truth map. The score is computed in advance from two components A and C, where A is an accuracy score and C is a coverage score, both derived from the overlap between Oi, the candidate object segmentation of the i-th sample, and G, the ground-truth map of the image; ξ is a weight that balances the accuracy score against the coverage score. In this method, samples with a confidence value above 0.9 are regarded as positive samples, and samples with a confidence value below 0.6 are regarded as negative samples. Because the number of positive samples turns out to be far smaller than the number of negative samples, all positive samples are used and an equal number of negative samples is selected at random. For ranking support vector machine training, the positive and negative samples are combined into sample pairs: a positive sample paired against a negative sample is defined as a positive pair, and the reverse as a negative pair.
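The exact formulas for the confidence value and for the accuracy and coverage scores do not survive in this text. The sketch below is therefore only an illustration of the sample-definition step: it assumes the common overlap definitions A = |Oi ∩ G| / |Oi| and C = |Oi ∩ G| / |G| and a weighted combination ξ·A + (1 − ξ)·C with ξ = 0.5; the 0.9 / 0.6 thresholds and the pairing rule follow the text.

```python
import numpy as np

def confidence_score(proposal_mask, gt_mask, xi=0.5):
    """Confidence of a region proposal against the ground-truth map G.

    A = |O ∩ G| / |O| (accuracy), C = |O ∩ G| / |G| (coverage); the weighted
    combination and xi = 0.5 are assumptions, the text only states that xi
    balances the two scores."""
    o = np.asarray(proposal_mask, dtype=bool)
    g = np.asarray(gt_mask, dtype=bool)
    inter = np.logical_and(o, g).sum()
    A = inter / max(o.sum(), 1)        # accuracy score
    C = inter / max(g.sum(), 1)        # coverage score
    return xi * A + (1.0 - xi) * C

def build_ranking_pairs(features, confidences, pos_thr=0.9, neg_thr=0.6, seed=0):
    """Positives: confidence > 0.9; negatives: confidence < 0.6, subsampled to
    the number of positives.  A (positive, negative) pair is labeled +1 and the
    reversed pair -1, as required for ranking-SVM training."""
    rng = np.random.default_rng(seed)
    pos = [i for i, c in enumerate(confidences) if c > pos_thr]
    neg = [i for i, c in enumerate(confidences) if c < neg_thr]
    k = min(len(pos), len(neg))
    neg = rng.choice(neg, size=k, replace=False).tolist() if k else []
    pairs, labels = [], []
    for p in pos:
        for n in neg:
            pairs.append((features[p], features[n])); labels.append(+1)
            pairs.append((features[n], features[p])); labels.append(-1)
    return pairs, labels
```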
A ranking support vector machine and subspace learning are trained jointly, yielding a ranker KSR that ranks a sample's region candidate segmentations by saliency; higher-ranked candidates are more similar to the foreground. Here w is the ranking coefficient of the ranking support vector machine; the objective uses a logistic loss function, where a is the loss-function parameter and e is the exponential function; φ(xi) denotes the feature xi of a sample after kernel mapping; p is the number of pair constraints over samples xin and xjn; (in, jn) are the indices of the n-th pair constraint; yn ∈ {+1, −1} indicates whether the two samples of a pair belong to the same class or to different classes, i.e., whether or not they both belong to the foreground or both to the background; L ∈ R^(l×d) (l < d) is the learned mapping matrix, d being the initial feature dimension and l the feature dimension after mapping; μ and λ denote the regularization parameters.
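The joint objective itself is given by a formula that is not reproduced in this text. As a rough illustration only, the sketch below minimizes a pairwise logistic loss of the form l(z) = (1/a)·log(1 + exp(−a·z)) over scores si = wᵀ P ki, with l2 regularization on w and P, by plain gradient descent; the concrete loss, regularizers and optimizer of the KSR model are assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between row-feature sets X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def train_ksr_toy(X, pairs, y, l_dim=64, a=1.0, mu=1e-3, lam=1e-3,
                  lr=1e-2, iters=200, seed=0):
    """Toy joint optimization of the subspace matrix P (l_dim x N) and the
    ranking vector w over pairwise constraints.  X: (N, D) proposal features,
    pairs: list of (i, j) index pairs, y: +1/-1 label per pair."""
    rng = np.random.default_rng(seed)
    K = rbf_kernel(X, X)                    # kernel matrix; k_i = K[:, i]
    N = X.shape[0]
    P = 0.01 * rng.standard_normal((l_dim, N))
    w = 0.01 * rng.standard_normal(l_dim)
    for _ in range(iters):
        gP, gw = lam * P, mu * w            # gradients of the l2 regularizers
        for (i, j), yn in zip(pairs, y):
            dk = K[:, i] - K[:, j]          # kernel-feature difference
            z = yn * (w @ (P @ dk))
            g = -yn / (1.0 + np.exp(a * z)) # derivative of the logistic loss
            gw += g * (P @ dk)
            gP += g * np.outer(w, dk)
        w -= lr * gw
        P -= lr * gP
    return w, P

def ksr_score(w, P, k_cols):
    """Ranking score s_i = w^T P k_i for each kernel column in k_cols (N x M)."""
    return w @ (P @ k_cols)
```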
Training samples are selected by active learning. First, the model obtained from the initialization above is used to rank, by saliency, the candidate object segmentations of all samples in the unlabeled pool; the ranking score is obtained as si = wᵀ P ki. To simplify the computation of the joint training, L = P φᵀ(X) is introduced, where P ∈ R^(l×N), N is the number of samples and φᵀ(X) is the kernel operation; a kernel function is introduced during the simplification. All ranking scores si of an image are min–max normalized, i.e., the normalized score is (si − smin)/(smax − smin), where smin and smax are the smallest and largest ranking scores in the image's score set. For every image, the candidate object segmentations whose normalized ranking scores lie between 0.4 and 0.9 are collected; Xp denotes this selected set of candidate segmentations and X denotes the set of all candidate segmentations of the image. The proportion β = card(Xp)/card(X) of candidate segmentations whose ranking scores lie between 0.4 and 0.9 is then computed, where card(X) is the total number of candidate segmentations of the image and card(Xp) is the number of candidates in Xp. This proportion is taken as the uncertainty value of the image. In this way an uncertainty-value set Β = {β1, β2, …, βn} is obtained for all unlabeled samples in the pool; the samples with high uncertainty are manually labeled and added to the training set. Each selection keeps the samples whose uncertainty β is greater than μ0 + λ0δ, where μ0 is the mean of the uncertainty set B, δ is its standard deviation and λ0 is a weight parameter, chosen as λ0 = 1.145; the samples selected in each round form the set Quc. A density clustering algorithm is applied to the image set Quc with the best parameters ε = 0.05 and MinPts = 2, so that two or more samples within the neighborhood of a core point are grouped into one cluster. After clustering, high-density sample clusters C = {c1, c2, …, cn} and clusters containing only one isolated sample O = {o1, o2, …, om} are obtained, and the image set Quc is partitioned as Quc = {ci, i = 1, 2, …, n} ∪ {oi, i = 1, 2, …, m}. From every high-density cluster ct, the sample Ut with the largest uncertainty is selected; in addition, all isolated samples are added to the candidate set Q, since such sample points increase the generalization ability of the trained model. The final candidate set is Q = {Ut, t = 1, …, n} ∪ {Oi, i = 1, …, m}. The sample set Q represents the samples selected in each round by the selection model that jointly considers uncertainty and diversity; after manual labeling they are added to the training set L.
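A small sketch of the per-image uncertainty β and of the μ0 + λ0δ selection rule described above, assuming min–max normalization of the ranking scores as stated:

```python
import numpy as np

def image_uncertainty(scores, low=0.4, high=0.9):
    """beta: fraction of an image's proposal ranking scores that fall, after
    min-max normalization, in the ambiguous band [0.4, 0.9]."""
    s = np.asarray(scores, dtype=float)
    s_norm = (s - s.min()) / max(s.max() - s.min(), 1e-12)
    return float(np.mean((s_norm >= low) & (s_norm <= high)))

def select_uncertain(betas, lambda0=1.145):
    """Indices of images whose uncertainty exceeds mu0 + lambda0 * delta."""
    b = np.asarray(betas, dtype=float)
    return np.where(b > b.mean() + lambda0 * b.std())[0]
```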
(3) The sample set Q selected above is manually labeled and added to the training set L; a ranker KSR is trained again with the updated training set L, and the performance of the retrained model is verified on the validation set. Step (2) is then repeated until the model performance changes only slightly or begins to decrease; the training set selected in the previous iteration is taken as the final training set, and the model trained on it as the final training model. This model ranks the region candidate segmentations of each test image by saliency, and the top 16 region candidate segmentations are weighted and fused to obtain the saliency map Mp of the image.
(4) The saliency map Mp obtained in step (3) still handles the edge details of the object insufficiently, so the present invention proposes a processing method at the superpixel level in order to refine the boundary. First, the SLIC superpixel segmentation algorithm is applied with 100, 150 and 200 superpixels respectively to construct the superpixel set SPi of image i, and the CNN feature xj of each superpixel is extracted. The saliency map Mp obtained in step (3) is binarized to serve as the prior saliency map Ei. Positive and negative superpixel samples are then determined: to obtain the highest confidence, the superpixels that lie entirely inside the foreground region of the prior saliency map Ei form the positive sample set POi, and the superpixels that lie entirely inside the background region of Ei form the negative sample set Ni. The samples of the positive and negative sets are combined into positive and negative pairs, and a model KSRi specific to image i is self-trained. With this model KSRi, all superpixels of the image are scored using si = wᵀ P ki and sorted, S = {s1, s2, …, sn}; the higher the score, the closer a superpixel is to the foreground, and the lower the score, the closer it is to the background. The score of each pixel inside a superpixel is then obtained, all scores are normalized to the range 0 to 1, and a weighted fusion finally yields the superpixel-level saliency map Ms. The final saliency map is obtained as M = w1 × Mp + w2 × Ms, where M is the final saliency map, Mp is the original saliency map, and Ms is the superpixel-level saliency map.
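An illustrative sketch of the superpixel-level refinement follows. It uses the SLIC implementation from scikit-image at the three stated granularities; the helpers extract_feat (the CNN feature of a superpixel) and train_ranker (the self-trained per-image ranker KSRi) are placeholders, and the equal-weight averaging of the three granularities into Ms is an assumption.

```python
import numpy as np
from skimage.segmentation import slic

def superpixel_saliency(image, prior_map, extract_feat, train_ranker,
                        n_segments=(100, 150, 200), thr=0.5):
    """Superpixel-level refinement of a proposal-level saliency map M_p.

    extract_feat(image, mask) -> feature vector   (stand-in for the CNN feature x_j)
    train_ranker(pos_feats, neg_feats) -> scorer  (stand-in for the per-image KSR_i)"""
    prior = np.asarray(prior_map) >= thr          # binarized prior map E_i
    maps = []
    for n in n_segments:                          # 100, 150 and 200 superpixels
        labels = slic(image, n_segments=n, start_label=0)
        ids = np.unique(labels)
        feats = [extract_feat(image, labels == sp) for sp in ids]
        pos = [f for sp, f in zip(ids, feats) if prior[labels == sp].all()]
        neg = [f for sp, f in zip(ids, feats) if (~prior[labels == sp]).all()]
        scorer = train_ranker(pos, neg)           # self-trained on this image only
        scores = np.array([scorer(f) for f in feats], dtype=float)
        scores = (scores - scores.min()) / max(np.ptp(scores), 1e-12)
        sal = np.zeros(prior.shape, dtype=float)
        for sp, s in zip(ids, scores):
            sal[labels == sp] = s                 # broadcast the score to its pixels
        maps.append(sal)
    return np.mean(maps, axis=0)                  # fused superpixel-level map M_s
```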
Beneficial effects of the present invention: the saliency detection algorithm based on active learning proposed by the present invention applies the idea of active learning to the field of saliency detection. By taking both the uncertainty and the diversity of samples into account, the samples most useful for model training are selected from the unlabeled sample set and added to the training set, and the final KSR model is trained; this model outputs the initial saliency map of a test sample. Afterwards, in order to refine the object boundaries of the saliency map, a superpixel-level post-processing method is designed to further improve performance. The invention reduces the labeling cost and the redundancy of the training set, so that the experimental results are greatly improved compared with the original KSR model. Comparative experiments also show that the performance of the proposed method is superior to that of many classical algorithms.
Brief Description of the Drawings
Figure 1 is the basic flow chart of the present invention.
Figure 2 is the initial saliency map obtained by applying active learning to the KSR model.
Figure 3 is the final saliency map obtained by applying superpixel-level post-processing fusion to the initial saliency map.
Detailed Description of the Embodiments
The specific embodiments of the present invention are further described below with reference to the accompanying drawings and the technical solution.
The idea of the present invention is as follows. The training process of supervised learning usually requires a large amount of manually labeled data, labeling salient regions consumes considerable resources, and many training samples contain redundant information that can even degrade model accuracy. Active learning uses a selection mechanism to choose the most informative samples for training, so that higher model accuracy is obtained with fewer training samples. On this basis, the present invention combines the idea of Active Learning (AL) with the Kernelized Subspace Ranking (KSR) algorithm. The invention designs a pool-based active learning strategy, i.e., the uncertainty and diversity of unlabeled samples are considered in order to pick informative samples for training, thereby reducing the number of training samples and the labeling cost.
The present invention extracts convolutional neural network (CNN) features of object-level region candidate segmentations (proposals) and jointly learns a ranker by subspace mapping and a ranking support vector machine; the ranker ranks the region candidate segmentations of a test image by saliency, and the top-ranked candidates are weighted and fused to obtain the saliency map. Finally, in order to refine the object boundaries of the saliency map, the invention designs a superpixel-level post-processing method that further improves performance.
The present invention is implemented as follows:
(1) First, 500 images are randomly selected from the MSRA database and added to the training set L as the initial training set; region candidate segmentations (proposals) are generated for every image, and region-level CNN features are extracted for all proposals.
(2) Positive and negative samples among the region candidate segmentations are defined. The algorithm designs a confidence score to measure how a sample scores against the foreground and background of the ground-truth map. The score is computed in advance from two components A and C, where A is an accuracy score and C is a coverage score, both derived from the overlap between Oi, the candidate object segmentation of the i-th sample, and G, the ground-truth map of the image; ξ is a weight that balances the accuracy score against the coverage score. In this algorithm, samples with a confidence value above 0.9 are regarded as positive samples, and samples with a confidence value below 0.6 are regarded as negative samples. Because the number of positive samples is far smaller than the number of negative samples, all positive samples are used and an equal number of negative samples is selected at random. For ranking support vector machine training, the positive and negative samples are combined into sample pairs: a positive sample paired against a negative sample is defined as a positive pair, and the reverse as a negative pair.
A ranking support vector machine and subspace learning are trained jointly, and a ranker KSR is obtained that ranks a sample's region candidate segmentations by saliency; higher-ranked candidates are more similar to the foreground. Here w is the ranking coefficient of the ranking support vector machine; the objective uses a logistic loss function, where a is the loss-function parameter and e is the exponential function; φ(xi) denotes the feature xi of a sample after kernel mapping; p is the number of pair constraints over samples xin and xjn; (in, jn) are the indices of the n-th pair constraint; yn ∈ {+1, −1} indicates whether the two samples of a pair belong to the same class or to different classes, i.e., whether or not they both belong to the foreground or both to the background; L ∈ R^(l×d) (l < d) is the learned mapping matrix, d being the initial feature dimension and l the feature dimension after mapping; μ and λ denote the regularization parameters.
Training samples are selected by active learning. First, the model obtained from the initialization above is used to rank, by saliency, the candidate object segmentations of all samples in the unlabeled pool; the ranking score is obtained as si = wᵀ P ki. To simplify the computation of the joint training, L = P φᵀ(X) is introduced, where P ∈ R^(l×N), N is the number of samples and φᵀ(X) is the kernel operation; during the simplification this patent introduces the kernel function. All ranking scores si of an image are min–max normalized, i.e., the normalized score is (si − smin)/(smax − smin), where smin and smax are the smallest and largest ranking scores in the image's score set. For every image, the candidate object segmentations whose normalized ranking scores lie between 0.4 and 0.9 are collected; Xp denotes this selected set of candidate segmentations and X denotes the set of all candidate segmentations of the image. The proportion β = card(Xp)/card(X) of candidate segmentations whose ranking scores lie between 0.4 and 0.9 is then computed, where card(X) is the total number of candidate segmentations of the image and card(Xp) is the number of candidates in Xp. This proportion is taken as the uncertainty value of the image. In this way an uncertainty-value set Β = {β1, β2, …, βn} is obtained for all unlabeled samples in the pool; the samples with high uncertainty are manually labeled and added to the training set. Each selection keeps the samples whose uncertainty β is greater than μ0 + λ0δ, where μ0 is the mean of the uncertainty set B, δ is its standard deviation and λ0 is a weight parameter, chosen as λ0 = 1.145; the samples selected in each round form the set Quc. A density clustering algorithm is applied to the image set Quc; the best parameters obtained through experiments are ε = 0.05 and MinPts = 2, so that two or more samples within the neighborhood of a core point are grouped into one cluster. After clustering, high-density sample clusters C = {c1, c2, …, cn} and clusters containing only one isolated sample O = {o1, o2, …, om} are obtained, and the image set Quc is partitioned as Quc = {ci, i = 1, 2, …, n} ∪ {oi, i = 1, 2, …, m}. From every high-density cluster ct, the sample Ut with the largest uncertainty is selected; in addition, all isolated samples are added to the candidate set Q, since such sample points increase the generalization ability of the trained model. The final candidate set is Q = {Ut, t = 1, …, n} ∪ {Oi, i = 1, …, m}. The sample set Q represents the samples selected in each round by the selection model that jointly considers uncertainty and diversity; after manual labeling they are added to the training set L.
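The density clustering step is parameterized by ε and MinPts, so the sketch below assumes it is DBSCAN (from scikit-learn) applied to image-level feature vectors; treating DBSCAN noise points as the isolated samples is part of that assumption.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def select_diverse(features, betas, eps=0.05, min_pts=2):
    """From every dense cluster keep the most uncertain image; keep every
    isolated sample as well.  Returns indices into `features`."""
    X = np.asarray(features, dtype=float)
    b = np.asarray(betas, dtype=float)
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(X)
    chosen = []
    for c in set(labels.tolist()):
        idx = np.where(labels == c)[0]
        if c == -1:                               # noise points = isolated samples
            chosen.extend(idx.tolist())
        else:                                     # dense cluster c_t: keep U_t
            chosen.append(int(idx[np.argmax(b[idx])]))
    return chosen
```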
(3) The sample set Q selected above is manually labeled and added to the training set L; a ranker KSR is trained again with the updated training set L, and the performance of the retrained model is verified on the validation set. Step (2) above is then repeated until the model performance changes only slightly or begins to decrease; the training set selected in the previous iteration is taken as the final training set, and the model trained on it as the final training model. This model ranks the region candidate segmentations of each test image by saliency, and the top 16 region candidate segmentations are weighted and fused to obtain the saliency map Mp of the image.
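The outer active-learning loop of step (3) could be organized as in the following sketch; train_ksr, validate, select_samples and oracle_label stand for the operations described above and are placeholders, and the stopping tolerance is an assumption.

```python
def active_learning_loop(L, pool, train_ksr, validate, select_samples,
                         oracle_label, tol=1e-3, max_rounds=20):
    """Grow the training set L from the unlabeled pool until the validation
    performance changes little or drops, then return the model and the
    training set of the previous round, as described in step (3)."""
    model = train_ksr(L)
    best_perf, best_model, best_L = validate(model), model, list(L)
    for _ in range(max_rounds):
        Q = select_samples(model, pool)              # uncertainty + diversity
        L = list(L) + [oracle_label(q) for q in Q]   # manual annotation
        pool = [x for x in pool if x not in Q]
        model = train_ksr(L)
        perf = validate(model)
        if perf <= best_perf + tol:                  # little change or a drop
            return best_model, best_L
        best_perf, best_model, best_L = perf, model, list(L)
    return best_model, best_L
```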
(4) The saliency map Mp obtained in step (3) still handles the edge details of the object insufficiently, so the present invention proposes a processing method at the superpixel level in order to refine the boundary. First, the SLIC superpixel segmentation algorithm is applied with 100, 150 and 200 superpixels respectively to construct the superpixel set SPi of image i, and the CNN feature xj of each superpixel is extracted. The saliency map Mp obtained in step (3) is binarized to serve as the prior saliency map Ei. Positive and negative superpixel samples are then determined: to obtain the highest confidence, the superpixels that lie entirely inside the foreground region of the prior saliency map Ei form the positive sample set POi, and the superpixels that lie entirely inside the background region of Ei form the negative sample set Ni. The samples of the positive and negative sets are combined into positive and negative pairs, and a model KSRi specific to image i is self-trained. With this model KSRi, all superpixels of the image are scored using si = wᵀ P ki and sorted, S = {s1, s2, …, sn}; the higher the score, the closer a superpixel is to the foreground, and the lower the score, the closer it is to the background. The score of each pixel inside a superpixel is then obtained, all scores are normalized to the range 0 to 1, and a weighted fusion finally yields the superpixel-level saliency map Ms. The final saliency map is obtained as M = w1 × Mp + w2 × Ms, where M is the final saliency map, Mp is the original saliency map, Ms is the superpixel-level saliency map, w1 is set to 1 and w2 to 0.3.
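With the weights stated here (w1 = 1, w2 = 0.3) the final fusion is a direct pixel-wise combination; clipping the result to [0, 1] is an assumption about how the fused map is kept a valid saliency map.

```python
import numpy as np

def fuse_saliency(M_p, M_s, w1=1.0, w2=0.3):
    """Final saliency map M = w1 * M_p + w2 * M_s (w1 = 1, w2 = 0.3 as stated)."""
    M = w1 * np.asarray(M_p, dtype=float) + w2 * np.asarray(M_s, dtype=float)
    return np.clip(M, 0.0, 1.0)
```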
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910609780.2A CN110443257B (en) | 2019-07-08 | 2019-07-08 | A saliency detection method based on active learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910609780.2A CN110443257B (en) | 2019-07-08 | 2019-07-08 | A saliency detection method based on active learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110443257A CN110443257A (en) | 2019-11-12 |
CN110443257B true CN110443257B (en) | 2022-04-12 |
Family
ID=68429598
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910609780.2A Active CN110443257B (en) | 2019-07-08 | 2019-07-08 | A saliency detection method based on active learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110443257B (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112149688A (en) * | 2020-09-24 | 2020-12-29 | 北京汽车研究总院有限公司 | Image processing method and apparatus, computer readable storage medium, computer equipment |
CN114118413A (en) * | 2021-11-30 | 2022-03-01 | 上海商汤临港智能科技有限公司 | Network training and equipment control method, device, equipment and storage medium |
CN114120048B (en) * | 2022-01-26 | 2022-05-13 | 中兴通讯股份有限公司 | Image processing method, electronic device, and computer-readable storage medium |
CN114332489B (en) * | 2022-03-15 | 2022-06-24 | 江西财经大学 | Image salient target detection method and system based on uncertainty perception |
CN115880268B (en) * | 2022-12-28 | 2024-01-30 | 南京航空航天大学 | Method, system, equipment and medium for detecting inferior goods in plastic hose production |
CN116664492B (en) * | 2023-05-04 | 2024-11-19 | 鲸朵(上海)智能科技有限公司 | Self-learning quality inspection method and device |
CN117173701B (en) * | 2023-08-14 | 2024-11-05 | 西北工业大学 | Semantic segmentation active learning method based on super-pixel feature characterization learning |
CN118196399B (en) * | 2024-05-16 | 2024-07-23 | 创意信息技术股份有限公司 | Target detection model optimization method and device based on active learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103927394A (en) * | 2014-05-04 | 2014-07-16 | 苏州大学 | Multi-label active learning classification method and system based on SVM |
US9414048B2 (en) * | 2011-12-09 | 2016-08-09 | Microsoft Technology Licensing, LLC | Automatic 2D-to-stereoscopic video conversion |
CN107103608A (en) * | 2017-04-17 | 2017-08-29 | 大连理工大学 | Saliency detection method based on region candidate sample selection |
CN107133955A (en) * | 2017-04-14 | 2017-09-05 | 大连理工大学 | Multi-level combined co-saliency detection method |
- 2019-07-08: CN application CN201910609780.2A, patent CN110443257B (en), status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9414048B2 (en) * | 2011-12-09 | 2016-08-09 | Microsoft Technology Licensing, LLC | Automatic 2D-to-stereoscopic video conversion |
CN103927394A (en) * | 2014-05-04 | 2014-07-16 | 苏州大学 | Multi-label active learning classification method and system based on SVM |
CN107133955A (en) * | 2017-04-14 | 2017-09-05 | 大连理工大学 | Multi-level combined co-saliency detection method |
CN107103608A (en) * | 2017-04-17 | 2017-08-29 | 大连理工大学 | Saliency detection method based on region candidate sample selection |
Non-Patent Citations (2)
Title |
---|
Kernelized Subspace Ranking for Saliency Detection; Tiantian Wang et al.; ECCV 2016; 2016-09-17; full text *
Kernel nearest-neighbor convex hull classifier based on kernel subspace sample selection; Zhou Xiaofei et al.; Computer Engineering and Applications; 2007-12-31; Vol. 43, No. 32; full text *
Also Published As
Publication number | Publication date |
---|---|
CN110443257A (en) | 2019-11-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |