
CN107545276B - Multi-view learning method combining low-rank representation and sparse regression - Google Patents


Info

Publication number
CN107545276B
Authority
CN
China
Prior art keywords
low
features
image
rank
level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710648597.4A
Other languages
Chinese (zh)
Other versions
CN107545276A (en)
Inventor
刘安安
史英迪
苏育挺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201710648597.4A priority Critical patent/CN107545276B/en
Publication of CN107545276A publication Critical patent/CN107545276A/en
Application granted granted Critical
Publication of CN107545276B publication Critical patent/CN107545276B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a multi-view learning method combining low-rank representation and sparse regression. The method includes the following steps: extracting low-level features and high-level attribute features from the SUN dataset, whose images are labeled with memorability scores; placing the low-rank representation, the sparse regression model, and the multi-view consistency loss within a single framework to form a whole, and constructing a joint low-rank and sparse regression multi-view model; using a multi-view adaptive regression algorithm to solve the problem of automatically predicting image memorability, and obtaining the relationship among low-level image features, image attribute features, and image memorability under optimal parameters; and combining the low-level and high-level attribute features of an image and using the relationship obtained under optimal parameters to predict the memorability of the database test-set images, with the predictions verified against the relevant evaluation criteria. The multi-view learning framework combining low-rank representation and sparse regression accurately predicts the memorability of image regions.

Description

Multi-view learning method combining low-rank representation and sparse regression

Technical Field

The invention relates to the field of low-rank representation and sparse regression, and in particular to a multi-view learning method combining low-rank representation and sparse regression.

Background

Humans can remember thousands of images, yet not all images are stored in the brain in the same way. Some representative images are remembered at a glance, while others fade from memory easily. Image memorability measures the degree to which an image is remembered or forgotten after a specific period of time. Previous work has shown that memorability is an intrinsic property of an image: it is consistent across time intervals and across observers. In this context, as with many other high-level image attributes (such as popularity, interestingness, emotion, and aesthetics), several studies have begun to explore the potential correlation between image content representation and image memorability.

Analyzing image memorability can be applied in several fields such as user interface design, video summarization, scene understanding, and advertising design. For example, image collections or videos can be summarized by using memorability as a guiding criterion to select meaningful images. By improving consumers' memory of a target brand, memorable advertisements can be designed to help merchants expand their influence.

Recently, low-rank representation (LRR) has been successfully applied in the multimedia and computer vision domains. To better handle the feature representation problem, LRR reveals the underlying low-dimensional subspace structure embedded in the data by decomposing the original data matrix into a low-rank representation matrix while eliminating irrelevant details. Traditional methods are often insufficient for handling outliers. To address this issue, some recent studies have also focused on sparse regression learning.

However, one of the main shortcomings of these works is that feature representation and memorability prediction are performed in two separate stages. That is, once the feature combination used for image memorability prediction has been fixed, the final performance of the regression step is largely determined by the processed features. Reference [1] proposes a joint low-rank and sparse regression feature-coding algorithm to handle outliers, and reference [2] develops a joint graph embedding and sparse regression framework for dimensionality reduction, but both are designed for visual classification problems rather than the image memorability prediction task.

Summary of the Invention

The present invention provides a multi-view learning method combining low-rank representation and sparse regression. The multi-view learning framework of the present invention, which combines low-rank representation and sparse regression, accurately predicts the memorability of image regions, as described below:

extracting low-level features and high-level attribute features from the SUN dataset labeled with image memorability scores;

placing the low-rank representation, the sparse regression model, and the multi-view consistency loss within a single framework to form a whole, and constructing a joint low-rank and sparse regression multi-view model;

using a multi-view adaptive regression algorithm to solve the problem of automatically predicting image memorability, and obtaining the relationship among low-level image features, image attribute features, and image memorability under optimal parameters;

combining the low-level features and high-level attribute features of the image, using the relationship obtained under optimal parameters to predict the memorability of the database test-set images, and verifying the predictions against the relevant evaluation criteria;

wherein using the multi-view adaptive regression algorithm to solve the problem of automatically predicting image memorability comprises: converting the problem into an equivalent one through a slack variable Q:

Figure GDA0002772751910000024

s.t. X = XA + E, Q = Aw

where A is the mapping matrix of the low-rank representation; w is the linear dependence between the low-rank feature representation and the output memorability score; E is the sparse error constraint term; α is the balance parameter between the prediction-error term and the regularization term; β is the parameter controlling sparsity; λ > 0 is a balance parameter; X is the input feature matrix; y is the memorability score vector; ||·||* denotes the nuclear norm; φ is the graph-regularization constraint term; and L is the Laplacian operator;

Two slack variables Y1 and Y2 are introduced to obtain the augmented Lagrangian function:

Figure GDA0002772751910000021

where <·,·> denotes the matrix inner product, Y1 and Y2 are the Lagrange multiplier matrices, and μ > 0 is a positive penalty parameter; the above terms are combined as:

Figure GDA0002772751910000022

where

Figure GDA0002772751910000023

A variable t is introduced, and At, Et, Qt, wt, Y1,t, Y2,t and μ are defined as the results of the t-th iteration of the variables; the (t+1)-th iteration is then obtained as follows:

The iteration result for A:

Figure GDA0002772751910000031

where

Figure GDA0002772751910000032
Figure GDA0002772751910000033

Fixing w, A, and Q, the optimization result for E is as follows:

Figure GDA0002772751910000034

By fixing E, A, and Q, the result of optimizing w is as follows:

Figure GDA0002772751910000035

The above problem is a ridge regression problem, and the optimal solution is

Figure GDA0002772751910000036

Finally, fixing E, w, and A and optimizing Q gives:

Figure GDA0002772751910000037

Y1 and Y2 are updated according to the following scheme:

Y1,t+1 = Y1,t + μt(X - XAt+1 - Et+1)

Y2,t+1 = Y2,t + μt(Qt+1 - At+1wt+1)

where

Figure GDA0002772751910000038
denotes the partial derivative symbol;

The joint low-rank and sparse regression multi-view model is specifically:

Figure GDA0002772751910000039

where:

Figure GDA00027727519100000310

Figure GDA00027727519100000311

G(φH,φl) = tr[(XlAlwl)T XHAHwH]

Figure GDA0002772751910000041
is the loss function serving as the prediction error of the high-level features;
Figure GDA0002772751910000042
is the loss function serving as the prediction error of the low-level features;
Figure GDA0002772751910000043
is the graph regularization term used to alleviate overfitting; XH is the high-level attribute feature matrix; AH is the mapping matrix of the low-rank representation of the high-level attribute features; EH is the sparse error constraint term of the high-level attribute features; wH is the linear dependence between the low-rank representation of the high-level attribute features and the output memorability score; Al is the mapping matrix of the low-rank representation of the low-level features; El is the sparse error constraint term of the low-level features; Xl is the low-level feature matrix; and wl is the linear dependence between the low-rank representation of the low-level features and the output memorability score.

The method further includes: acquiring an image memorability dataset.

The low-level features include: scale-invariant feature transform features, generalized search tree features, histogram of oriented gradients features, and structural similarity features.

The high-level attribute features include: 327-dimensional scene category attribute features and 106-dimensional object attribute features.

The beneficial effects of the technical solution provided by the present invention are:

1. Low-rank representation and sparse regression are combined for image memorability prediction, where the low-rank constraint reveals the intrinsic structure embedded in the original data and the sparse constraint removes outliers and redundant information. When low-rank representation and sparse regression are performed jointly, the low-rank representation shared by all features captures the internal structure of the features, thereby improving prediction accuracy;

2. The present invention is based on the multi-view adaptive regression (MAR) algorithm, which solves the optimization of the objective function with fast convergence.

Brief Description of the Drawings

Fig. 1 is a flowchart of the multi-view learning method combining low-rank representation and sparse regression;

Fig. 2 shows sample database images labeled with image memorability scores;

Fig. 3 shows the convergence of the algorithm;

Fig. 4 compares the results of the proposed method with those of other methods.

Detailed Description

To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in further detail below.

Embodiment 1

Studies have shown that image attribute features are much higher-level semantic features than the original low-level features. To study the visual features of images and predict image memorability, an embodiment of the present invention proposes a multi-view learning method combining low-rank representation and sparse regression for image memorability prediction. Referring to Fig. 1, the method includes the following steps:

101: Acquire the image memorability dataset;

The image memorability dataset [1] contains 2,222 images from the SUN dataset [11]. The memorability score of each image was obtained through Amazon Mechanical Turk's Visual Memory Game, and image memorability is a continuous value from 0 to 1; the higher the value, the harder the image is to remember. Sample images with various memorability scores are shown in Fig. 2.

102: Extract low-level features and high-level attribute features from the SUN dataset labeled with image memorability scores;

The extracted low-level features include SIFT (scale-invariant feature transform), Gist (generalized search trees), HOG (histogram of oriented gradients), and SSIM (structural similarity index) features; together, these four types of low-level features constitute the low-level feature library. The embodiment also uses two types of high-level attribute features: 327-dimensional scene category attribute features and 106-dimensional object attribute features.

The scene category attributes cover 327 scene categories, and the object attribute features are labeled with 106 object categories; the specific dimensionality is set according to the needs of the actual application and is not limited by this embodiment.
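For illustration only, the two views described above could be assembled as feature matrices in the following way. This is a hedged sketch, not part of the patent: the individual descriptor dimensionalities are assumptions, and random arrays stand in for the real SIFT/Gist/HOG/SSIM extractors and attribute predictors.

```python
import numpy as np

# Sketch: assemble the low-level and high-level views used by Mv-JLRSR.
# Descriptor sizes are assumptions; random arrays replace real extractors.
N = 2222                                       # images in the dataset
rng = np.random.default_rng(0)

sift = rng.standard_normal((N, 512))           # assumed SIFT descriptor size
gist = rng.standard_normal((N, 512))           # assumed Gist descriptor size
hog = rng.standard_normal((N, 1024))           # assumed HOG descriptor size
ssim = rng.standard_normal((N, 128))           # assumed SSIM descriptor size
X_low = np.hstack([sift, gist, hog, ssim])     # low-level view, shape (N, D_l)

scene_attr = rng.random((N, 327))              # 327-dim scene category attributes
object_attr = rng.random((N, 106))             # 106-dim object attributes
X_high = np.hstack([scene_attr, object_attr])  # high-level view, shape (N, 433)

y = rng.random(N)                              # memorability scores in [0, 1]
print(X_low.shape, X_high.shape, y.shape)
```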

103: Place the low-rank representation, the sparse regression model, and the multi-view consistency loss within a single framework to form a whole, and construct the Mv-JLRSR (multi-view joint low-rank and sparse regression) model;

104: Use the multi-view adaptive regression (MAR) algorithm to solve the problem of automatically predicting image memorability, and obtain the relationship among low-level image features, image attribute features, and image memorability under optimal parameters;

105: Combine the low-level features and high-level attribute features of the image, use the relationship obtained under optimal parameters to predict the memorability of the database test-set images, and verify the predictions against the relevant evaluation criteria.

In summary, through steps 101-105, the embodiment uses the low-rank constraint to reveal the intrinsic structure of the original data and the sparse constraint to remove feature outliers and redundant information. When the low-rank representation and sparse regression are performed jointly, the lowest-rank representation shared by all features not only captures the global structure of all modalities but also satisfies the requirements of the regression. Because the formulated objective function is non-smooth and difficult to solve directly, the multi-view adaptive regression (MAR) algorithm is used to solve the automatic image memorability prediction problem, so that the optimization converges quickly.

Embodiment 2

The scheme of Embodiment 1 is further described below in conjunction with the specific calculation formulas:

201: The image memorability dataset contains 2,222 images from the SUN dataset;

This dataset is well known to those skilled in the art and is not described in detail here.

202: Perform feature extraction on the images of the SUN dataset labeled with image memorability scores. The extracted SIFT, Gist, HOG, and SSIM features constitute the low-level feature library, and two types of high-level attribute features are used, namely 327-dimensional scene category attributes and 106-dimensional object attributes.

The dataset includes 2,222 images of various environments, each labeled with an image memorability score; Fig. 2 shows examples of database images labeled with memorability scores. The features are expressed as

Figure GDA0002772751910000061
where Di denotes the dimensionality of the i-th type of feature and N denotes the number of images in the database (2,222). These features constitute the feature library B = {B1, ..., BM}.

203: Establish the Mv-JLRSR model, which combines low-rank representation and sparse regression on the basis of the extracted low-level and high-level attribute features to build a more robust feature representation and an accurate regression model.

The general framework defined by the Mv-JLRSR model is as follows:

Figure GDA0002772751910000062

where F(A, w) is the loss function serving as the prediction error; L(A, E) denotes the feature encoder based on the low-rank representation; G(A) is the graph regularization term used to alleviate overfitting; A is the mapping matrix of the low-rank representation; w is the linear dependence between the low-rank feature representation and the output memorability score; and E is the sparse error constraint term.

The image memorability dataset [1] contains 2,222 images from the SUN dataset [11], and the memorability scores of the images were obtained through Amazon Mechanical Turk's Visual Memory Game. Combined with regression training based on adaptive transfer learning, a linear regression method is used to train on the extracted feature library. The prediction of image memorability scores has two aspects: on the one hand, the feature representations are used directly to predict image memorability, yielding a mapping matrix wi from each type of low-level image feature to image memorability; on the other hand, high-level image attribute features also play a very important role in predicting the memorability score, and combined with low-rank learning, the relationship between each type of image attribute and image memorability is obtained. Given the initial image feature vector set X ∈ RN×D, the goal of the Mv-JLRSR model is to combine low-rank representation and sparse regression on the basis of the extracted visual cues to obtain an enhanced robust feature representation and an accurate regression model.

Each part is now introduced in detail:

Because the low-rank constraint can remove noise or redundant information and thereby help reveal the essential structure of the data, the extracted low-level and high-level features can be integrated into feature learning to deal with these problems. LRR assumes that the original feature matrix contains a latent lowest-rank structural component shared by all samples plus a unique error matrix:

Figure GDA0002772751910000063

where A ∈ RD×D is the low-rank projection matrix over the N samples, E ∈ RN×D is the unique sparse error term constrained by the l1 norm in order to handle random errors, λ > 0 is the balance parameter, X is the input feature matrix, D is the feature dimensionality after the low-rank constraint, and rank(·) denotes the rank used in the low-rank representation.

Since the above equation is difficult to optimize, the nuclear norm ||A||* (the sum of the singular values of the matrix) is used to approximate the rank of A, so L(A, E) can be defined as follows:

Figure GDA0002772751910000071
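For readability, a likely written-out form of the two expressions rendered as images above is the standard LRR formulation; the following is a reconstruction from the surrounding definitions (A, E, λ, the constraint X = XA + E, and the nuclear-norm relaxation), not a verbatim copy of the patent's figures.

```latex
% Reconstructed sketch (assumption): rank-minimization form and its
% nuclear-norm relaxation L(A,E), under the self-expression constraint.
\min_{A,E}\ \operatorname{rank}(A) + \lambda \lVert E \rVert_1
\quad \text{s.t.}\ X = XA + E
\qquad\Longrightarrow\qquad
L(A,E) = \lVert A \rVert_* + \lambda \lVert E \rVert_1,
\quad \text{s.t.}\ X = XA + E
```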

In the framework proposed in this embodiment, the image memorability prediction problem is treated as a standard regression problem. The lasso [5] regression method establishes a linear relationship v between the input feature matrix X and the memorability score vector y and minimizes the least-squares error

Figure GDA0002772751910000072
to solve the prediction problem. After adding ridge regularization [6] to the least-squares error term, a typical least-squares problem with ridge regression is obtained:

Figure GDA0002772751910000073

where α is the balance parameter between the prediction-error term and the regularization term.
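A plausible written-out form of the ridge-regularized least-squares problem shown as an image above, reconstructed from the definitions of X, y, v, and α (an assumption, since the patent gives the formula only as a figure), is:

```latex
% Reconstructed sketch (assumption): least-squares error with ridge regularization.
\min_{v}\ \lVert X v - y \rVert_2^2 \;+\; \alpha \lVert v \rVert_2^2
```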

From the perspective of matrix factorization, the transformation vector v can be decomposed into the product of two components: a low-rank projection matrix A is applied to capture the low-rank structure shared among samples, and a coefficient vector w is applied to relate the transformed samples to their memorability scores. Introducing Q = Aw, the loss function F(A, w) is defined as:

Figure GDA0002772751910000074

Based on the idea of manifold learning, graph regularization is adopted to maintain the consistency of the geometric structure. The core idea of graph regularization is that if samples are close in their feature representations, then their memorability scores should also be close, and vice versa. Geometric consistency between the features and the memorability scores is achieved by minimizing the graph regularizer G(A):

Figure GDA0002772751910000075

where L = B - S is the Laplacian operator, B is the diagonal degree matrix with Bii = Σj Sij, and S is the weight matrix computed by the Gaussian similarity function:

Figure GDA0002772751910000076

where yi and yj are the memorability scores of the i-th and j-th samples, NK denotes the set of K nearest neighbors of yi, and σ is a radius parameter, simply set to the median of the Euclidean distances over all sample pairs.
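A minimal sketch of how such a graph Laplacian could be constructed is given below. It assumes the usual Gaussian kernel exp(-d²/2σ²) restricted to the K nearest neighbors, with σ set to the median pairwise distance; the exact kernel is shown only as an image in the patent, so this form is an assumption.

```python
import numpy as np

def build_graph_laplacian(y, k=5):
    """Sketch of the regularization graph (assumed Gaussian kernel on the scores y).

    Returns L = B - S, where S is the similarity matrix restricted to k nearest
    neighbours and B is the diagonal degree matrix with B_ii = sum_j S_ij.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    dist = np.abs(y[:, None] - y[None, :])            # pairwise Euclidean distances
    sigma = np.median(dist[np.triu_indices(n, 1)])    # radius = median pairwise distance
    S = np.exp(-dist**2 / (2.0 * sigma**2))           # Gaussian similarity
    mask = np.zeros_like(S, dtype=bool)
    for i in range(n):
        mask[i, np.argsort(dist[i])[1:k + 1]] = True  # k nearest neighbours of sample i
    S = np.where(mask | mask.T, S, 0.0)               # keep neighbours only, symmetrized
    B = np.diag(S.sum(axis=1))                        # degree matrix
    return B - S                                      # graph Laplacian

L = build_graph_laplacian(np.random.rand(20))
print(L.shape)
```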

Multiple features are usually extracted to represent images from different views, because these multiple representations provide compatible and complementary information. For the image memorability prediction task, the natural choice is to integrate these multiple representations to represent an image for better performance, rather than relying on a single feature. This embodiment extracts high-level attribute features and low-level visual features.

The Mv-JLRSR model is therefore defined as:

Figure GDA0002772751910000081

where:

Figure GDA0002772751910000082

Figure GDA0002772751910000083

Figure GDA0002772751910000084

A ∈ RD×D is the low-rank projection matrix over the N samples, capturing the underlying low-rank structure shared among them, and E ∈ RN×D is the error term constrained by the l1 norm to account for random errors.

where
Figure GDA0002772751910000085
is the loss function serving as the prediction error of the high-level features;
Figure GDA0002772751910000086
is the loss function serving as the prediction error of the low-level features;
Figure GDA0002772751910000087
is the graph regularization term used to alleviate overfitting; XH is the high-level attribute feature matrix; AH is the mapping matrix of the low-rank representation of the high-level attribute features; EH is the sparse error constraint term of the high-level attribute features; β is the parameter controlling sparsity; wH is the linear dependence between the low-rank representation of the high-level attribute features and the output memorability score; Al is the mapping matrix of the low-rank representation of the low-level features; El is the sparse error constraint term of the low-level features; Xl is the low-level feature matrix; wl is the linear dependence between the low-rank representation of the low-level features and the output memorability score; and y is the label vector of the training samples.

Figure GDA0002772751910000088
β||Alwl|| + λ||XlAlwl - y||F is the defined error function, which solves the prediction problem by establishing a linear vector v between the input feature matrices (XH and Xl) and the memorability score vector y.

Figure GDA0002772751910000089
ensures that samples with similar features also have similar memorability scores.

Initialize α, β, λ, and φ in FH(φH) and FL(φL) of the Mv-JLRSR model; then alternately fix A, E, w, and Q and take derivatives, repeating the derivation process until the error reaches the set minimum.

The solution process is now described in detail. The multi-view adaptive regression (MAR) algorithm [7] is used to solve the problem of automatically predicting image memorability and thereby solve the optimization problem.

First, a slack variable Q is introduced to transform the above problem into an equivalent one:

Figure GDA0002772751910000091

Then, two slack variables Y1 and Y2 are introduced to obtain the augmented Lagrangian function:

Figure GDA0002772751910000092

where <·,·> denotes the matrix inner product, Y1 and Y2 are the Lagrange multiplier matrices, μ > 0 is a positive penalty parameter, ||·||* denotes the nuclear norm, and φ is the graph-regularization constraint term; the above terms are combined as:

Figure GDA0002772751910000093

where

Figure GDA0002772751910000094

The method is solved by alternating iteration. Each subproblem is handled separately by approximating the quadratic term h(A, Q, E, w, Y1, Y2, μ) with its second-order Taylor expansion. To describe this process, a variable t is introduced, and At, Et, Qt, wt, Y1,t, Y2,t and μ are defined as the results of the t-th iteration of the variables; the (t+1)-th iteration is then obtained as follows:

The iteration result for A:

Figure GDA0002772751910000095

where

Figure GDA0002772751910000096
Figure GDA0002772751910000097

Then, fixing w, A, and Q, the optimization result for E is as follows:

Figure GDA0002772751910000101
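The E-update shown as an image above solves an l1-regularized subproblem; in augmented-Lagrangian schemes of this kind it is usually the elementwise soft-thresholding of the constraint residual. The sketch below assumes that standard form (the patent gives the exact formula only as a figure).

```python
import numpy as np

def soft_threshold(M, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def update_E(X, A, Y1, mu, lam):
    """Assumed E-update: shrink the residual of the constraint X = XA + E."""
    residual = X - X @ A + Y1 / mu
    return soft_threshold(residual, lam / mu)
```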

By fixing E, A, and Q, the result of optimizing w is as follows:

Figure GDA0002772751910000102

The above problem is actually the well-known ridge regression problem, and its optimal solution is
Figure GDA0002772751910000103
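As a sketch only: for a ridge problem of the form min_w ||Zw - y||² + α||w||², with Z standing for the transformed features (for example Z = XA; the exact matrices are shown only as an image in the patent), the closed-form solution follows the usual normal equations.

```python
import numpy as np

def ridge_solution(Z, y, alpha):
    """Closed-form ridge solution w* = (Z^T Z + alpha I)^(-1) Z^T y."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + alpha * np.eye(d), Z.T @ y)
```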

Finally, fixing E, w, and A and optimizing Q gives:

Figure GDA0002772751910000104

Furthermore, the Lagrange multipliers Y1 and Y2 are updated by the following scheme:

Y1,t+1 = Y1,t + μt(X - XAt+1 - Et+1)

Y2,t+1 = Y2,t + μt(Qt+1 - At+1wt+1)

where

Figure GDA0002772751910000105
denotes the partial derivative symbol.
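The two multiplier updates above translate directly into code; the following sketch names the matrices as in the text and covers one iteration of the dual update.

```python
import numpy as np

def update_multipliers(X, A, E, Q, w, Y1, Y2, mu):
    """Dual updates, exactly as given in the text, for one iteration."""
    Y1_next = Y1 + mu * (X - X @ A - E)   # Y1_{t+1} = Y1_t + mu_t (X - X A_{t+1} - E_{t+1})
    Y2_next = Y2 + mu * (Q - A @ w)       # Y2_{t+1} = Y2_t + mu_t (Q_{t+1} - A_{t+1} w_{t+1})
    return Y1_next, Y2_next
```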

The relationship between the predicted scores and the actual scores is examined under the selected evaluation criteria to obtain the performance of the algorithm.

In this embodiment, the database is randomly divided into 10 groups, the above steps are performed on each group to obtain 10 sets of correlation coefficients, and their average is taken to evaluate the performance of the algorithm. The evaluation criteria selected for this method are Ranking Correlation and R-value, which are described in detail in Embodiment 3.
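A minimal sketch of this evaluation protocol is given below, assuming a generic train/predict interface; train_mv_jlrsr and predict are hypothetical placeholders standing in for the model described above.

```python
import numpy as np
from scipy.stats import spearmanr

def evaluate_10_splits(X_low, X_high, y, train_mv_jlrsr, predict, seed=0):
    """Split the data into 10 random groups; each group serves once as the test set.

    Returns the mean Spearman rank correlation over the 10 splits.
    train_mv_jlrsr / predict are placeholders for the model in this patent.
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), 10)
    scores = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(n), test_idx)
        model = train_mv_jlrsr(X_low[train_idx], X_high[train_idx], y[train_idx])
        y_pred = predict(model, X_low[test_idx], X_high[test_idx])
        rho, _ = spearmanr(y[test_idx], y_pred)
        scores.append(rho)
    return float(np.mean(scores))
```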

Embodiment 3

The feasibility of the schemes in Embodiments 1 and 2 is verified below with specific experimental data and Figs. 3 and 4:

The image memorability dataset contains 2,222 images from the SUN dataset. The memorability score of each image was obtained through Amazon Mechanical Turk's Visual Memory Game, and image memorability is a continuous value from 0 to 1; the higher the value, the harder the image is to remember. Sample images with various memorability scores are shown in Fig. 2.

Two evaluation methods are adopted:

Ranking Correlation (RC): the ranking by true memorability and the ranking by predicted memorability scores are obtained, and the Spearman rank correlation coefficient is used to measure the correlation between the two rankings. Its value range is [-1, 1]; a higher value indicates that the two rankings are closer:

Figure GDA0002772751910000111

where N is the number of test-set images, the element r1i of r1 is the rank of the i-th image in the ground-truth ordering, and the element r2i of r2 is the rank of the i-th image in the predicted ordering.
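The RC formula itself is shown only as an image in the patent; the standard Spearman rank correlation computed from the two rank vectors r1 and r2 (assumed to be the intended formula) can be sketched as:

```python
import numpy as np
from scipy.stats import rankdata

def ranking_correlation(y_true, y_pred):
    """Spearman rank correlation 1 - 6*sum(d_i^2)/(N*(N^2-1)), no-ties form."""
    r1 = rankdata(y_true)          # rank of each image in the ground-truth ordering
    r2 = rankdata(y_pred)          # rank of each image in the predicted ordering
    n = len(y_true)
    d = r1 - r2
    return 1.0 - 6.0 * float(np.sum(d**2)) / (n * (n**2 - 1))
```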

R-value: the correlation coefficient between the predicted scores and the actual scores is evaluated to facilitate comparison between regression models. The R-value ranges over [-1, 1], where 1 indicates positive correlation and -1 indicates negative correlation:

Figure GDA0002772751910000112

where N is the number of test-set images, si is the vector of true image memorability scores,
Figure GDA0002772751910000113
is the mean of the true memorability scores over all images, vi is the vector of predicted image memorability scores, and
Figure GDA0002772751910000114
is the mean of the predicted memorability scores over all images.
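Since the R-value as described is the Pearson correlation between the predicted and true scores (the exact formula appears only as an image), an assumed sketch of the computation is:

```python
import numpy as np

def r_value(s, v):
    """Pearson correlation between true scores s and predicted scores v."""
    s = np.asarray(s, dtype=float)
    v = np.asarray(v, dtype=float)
    s_c, v_c = s - s.mean(), v - v.mean()
    return float(np.sum(s_c * v_c) / np.sqrt(np.sum(s_c**2) * np.sum(v_c**2)))
```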

In the experiments, the proposed method is compared with the following four methods:

LR (Linear Regression): a linear prediction function is trained on the relationship between the low-level features and the memorability scores;

SVR (Support Vector Regression): the low-level features are concatenated, and a nonlinear function is learned with an RBF kernel to predict image memorability;

MRR [9] (Multiple Rank Regression): a regression model is built using multiple-rank left and right projection vectors;

MLHR [10] (Multi-Level via Hierarchical Regression): multimedia information analysis based on hierarchical regression.

Fig. 3 verifies the convergence of the algorithm. Fig. 4 shows the performance comparison between the proposed method and the other methods; the proposed method outperforms them. The comparison methods only explore the relationship between low-level features and memorability prediction, whereas the proposed method combines low-level features and image attribute features within a single framework to predict image memorability and obtains a more stable model. The experimental results verify the feasibility and superiority of the method.

References

[1] Zhang Z, Li F, Zhao M, et al. Joint low-rank and sparse principal feature coding for enhanced robust representation and visual classification. IEEE Transactions on Image Processing, 2016, 25(6): 2429-2443.

[2] Shi X, Guo Z, Lai Z, et al. A framework of joint graph embedding and sparse regression for dimensionality reduction. IEEE Transactions on Image Processing, 2015, 24(4): 1341-1355.

[3] P. Isola, J. Xiao, A. Torralba, and A. Oliva. What makes an image memorable? In Proc. Int. Conf. Comput. Vis. Pattern Recognit., 2011, pp. 145-152.

[4] P. Isola, D. Parikh, A. Torralba, and A. Oliva. Understanding the intrinsic memorability of images. In Proc. Adv. Conf. Neural Inf. Process. Syst., 2011, pp. 2429-2437.

[5] Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 1996: 267-288.

[6] Hoerl A E, Kennard R W. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 1970, 12(1): 55-67.

[7] Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 2007, 81(3): 559-575.

[8] Q. You, H. Jin, and J. Luo. Visual sentiment analysis by attending on local image regions. In Thirty-First AAAI Conference on Artificial Intelligence, 2017.

[9] Hou C, Nie F, Yi D, et al. Efficient image classification via multiple rank regression. IEEE Transactions on Image Processing, 2013, 22(1): 340-352.

[10] Sundt B. A multi-level hierarchical credibility regression model. Scandinavian Actuarial Journal, 1980, 1980(1): 25-32.

[11] J. Xiao, J. Hays, K. Ehinger, A. Oliva, A. Torralba, et al. Sun database: Large-scale scene recognition from abbey to zoo. In Proc. Int. Conf. Comput. Vis. Pattern Recognit., 2010, pp. 3485-3492.

Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred embodiment, and that the above serial numbers of the embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.

The above are only preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (4)

1. A multi-view learning method combining low-rank representation and sparse regression, characterized in that the method comprises the following steps:
extracting low-level features and high-level attribute features from the SUN dataset labeled with image memorability scores;
placing the low-rank representation, the sparse regression model, and the multi-view consistency loss within a single framework to form a whole, and constructing a joint low-rank and sparse regression multi-view model;
using a multi-view adaptive regression algorithm to solve the problem of automatically predicting image memorability, and obtaining the relationship among low-level image features, image attribute features, and image memorability under optimal parameters;
combining the low-level features and high-level attribute features of the image, using the relationship obtained under optimal parameters to predict the memorability of the database test-set images, and verifying the predictions against the relevant evaluation criteria;
wherein using the multi-view adaptive regression algorithm to solve the problem of automatically predicting image memorability comprises: converting the problem into an equivalent one through a slack variable Q:
Figure FDA0002772751900000011
s.t. X = XA + E, Q = Aw
where A is the mapping matrix of the low-rank representation; w is the linear dependence between the low-rank feature representation and the output memorability score; E is the sparse error constraint term; α is the balance parameter between the prediction-error term and the regularization term; β is the parameter controlling sparsity; λ > 0 is a balance parameter; X is the input feature matrix; y is the memorability score vector; ||·||* denotes the nuclear norm; φ is the graph-regularization constraint term; and L is the Laplacian operator;
introducing two slack variables Y1 and Y2 to obtain the augmented Lagrangian function:
Figure FDA0002772751900000012
where <·,·> denotes the matrix inner product, Y1 and Y2 are the Lagrange multiplier matrices, and μ > 0 is a positive penalty parameter; the above terms are combined as:
Figure FDA0002772751900000013
where
Figure FDA0002772751900000014
introducing a variable t and defining At, Et, Qt, wt, Y1,t, Y2,t and μ as the results of the t-th iteration of the variables, the (t+1)-th iteration is obtained as follows:
the iteration result for A:
Figure FDA0002772751900000021
where
Figure FDA0002772751900000022
Figure FDA0002772751900000023
fixing w, A, and Q, the optimization result for E is as follows:
Figure FDA0002772751900000024
fixing E, A, and Q, the result of optimizing w is as follows:
Figure FDA0002772751900000025
the above problem is a ridge regression problem, and the optimal solution is
Figure FDA0002772751900000026
finally, fixing E, w, and A and optimizing Q gives:
Figure FDA0002772751900000027
Y1 and Y2 are updated according to the following scheme:
Y1,t+1 = Y1,t + μt(X - XAt+1 - Et+1)
Y2,t+1 = Y2,t + μt(Qt+1 - At+1wt+1)
where
Figure FDA0002772751900000028
denotes the partial derivative symbol;
the joint low-rank and sparse regression multi-view model is specifically:
Figure FDA0002772751900000029
where:
Figure FDA0002772751900000031
Figure FDA0002772751900000032
G(φH,φl) = tr[(XlAlwl)T XHAHwH]
Figure FDA0002772751900000033
is the loss function serving as the prediction error of the high-level features;
Figure FDA0002772751900000034
is the loss function serving as the prediction error of the low-level features;
Figure FDA0002772751900000035
is the graph regularization term used to alleviate overfitting; XH is the high-level attribute feature matrix; AH is the mapping matrix of the low-rank representation of the high-level attribute features; EH is the sparse error constraint term of the high-level attribute features; wH is the linear dependence between the low-rank representation of the high-level attribute features and the output memorability score; Al is the mapping matrix of the low-rank representation of the low-level features; El is the sparse error constraint term of the low-level features; Xl is the low-level feature matrix; and wl is the linear dependence between the low-rank representation of the low-level features and the output memorability score.
2. The multi-view learning method combining low-rank representation and sparse regression according to claim 1, characterized in that the method further comprises: acquiring an image memorability dataset.
3. The multi-view learning method combining low-rank representation and sparse regression according to claim 1, characterized in that the low-level features comprise: scale-invariant feature transform features, generalized search tree features, histogram of oriented gradients features, and structural similarity features.
4. The multi-view learning method combining low-rank representation and sparse regression according to claim 1, characterized in that the high-level attribute features comprise: 327-dimensional scene category attribute features and 106-dimensional object attribute features.
CN201710648597.4A 2017-08-01 2017-08-01 Multi-view learning method combining low-rank representation and sparse regression Active CN107545276B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710648597.4A CN107545276B (en) 2017-08-01 2017-08-01 Multi-view learning method combining low-rank representation and sparse regression

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710648597.4A CN107545276B (en) 2017-08-01 2017-08-01 Multi-view learning method combining low-rank representation and sparse regression

Publications (2)

Publication Number Publication Date
CN107545276A CN107545276A (en) 2018-01-05
CN107545276B true CN107545276B (en) 2021-02-05

Family

ID=60971226

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710648597.4A Active CN107545276B (en) 2017-08-01 2017-08-01 Multi-view learning method combining low-rank representation and sparse regression

Country Status (1)

Country Link
CN (1) CN107545276B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108256486B (en) * 2018-01-18 2022-02-22 河南科技大学 Image identification method and device based on nonnegative low-rank and semi-supervised learning
CN108197320B (en) * 2018-02-02 2019-07-30 北方工业大学 An automatic labeling method for multi-view images
CN109522956B (en) * 2018-11-16 2022-09-30 哈尔滨理工大学 Low-rank discriminant feature subspace learning method
CN109583498B (en) * 2018-11-29 2023-04-07 天津大学 Fashion compatibility prediction method based on low-rank regularization feature enhancement characterization
CN109858543B (en) * 2019-01-25 2023-03-21 天津大学 Image memorability prediction method based on low-rank sparse representation and relationship inference
CN110619367B (en) * 2019-09-20 2022-05-13 哈尔滨理工大学 Joint low-rank constraint cross-view-angle discrimination subspace learning method and device
CN110727818B (en) * 2019-09-27 2023-11-14 天津大学 A binary image feature encoding method based on low-rank embedding representation
CN112990242A (en) * 2019-12-16 2021-06-18 京东数字科技控股有限公司 Training method and training device for image classification model
CN111242102B (en) * 2019-12-17 2022-11-18 大连理工大学 Fine-grained image recognition algorithm of Gaussian mixture model based on discriminant feature guide
CN111259917B (en) * 2020-02-20 2022-06-07 西北工业大学 An Image Feature Extraction Method Based on Local Neighbor Component Analysis

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103400143A (en) * 2013-07-12 2013-11-20 中国科学院自动化研究所 Data subspace clustering method based on multiple view angles
CN105809119A (en) * 2016-03-03 2016-07-27 厦门大学 Sparse low-rank structure based multi-task learning behavior identification method
CN106971200A (en) * 2017-03-13 2017-07-21 天津大学 A kind of iconic memory degree Forecasting Methodology learnt based on adaptive-migration

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103400143A (en) * 2013-07-12 2013-11-20 中国科学院自动化研究所 Data subspace clustering method based on multiple view angles
CN105809119A (en) * 2016-03-03 2016-07-27 厦门大学 Sparse low-rank structure based multi-task learning behavior identification method
CN106971200A (en) * 2017-03-13 2017-07-21 天津大学 A kind of iconic memory degree Forecasting Methodology learnt based on adaptive-migration

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Joint Low-Rank and Sparse Principal Feature Coding for Enhanced Robust Representation and Visual Classification; Zhao Zhang et al.; IEEE Transactions on Image Processing; 2016-03-25; pp. 2429-2443 *

Also Published As

Publication number Publication date
CN107545276A (en) 2018-01-05

Similar Documents

Publication Publication Date Title
CN107545276B (en) Multi-view learning method combining low-rank representation and sparse regression
Zhang et al. Vector of locally and adaptively aggregated descriptors for image feature representation
Zhou et al. Atrank: An attention-based user behavior modeling framework for recommendation
Lu et al. Co-attending free-form regions and detections with multi-modal multiplicative feature embedding for visual question answering
CN107590505B (en) Learning method combining low-rank representation and sparse regression
Gao et al. Multi‐dimensional data modelling of video image action recognition and motion capture in deep learning framework
Huang et al. Object-location-aware hashing for multi-label image retrieval via automatic mask learning
Wang et al. Semantic supplementary network with prior information for multi-label image classification
Yang et al. STA-TSN: spatial-temporal attention temporal segment network for action recognition in video
Zhang et al. Learning implicit class knowledge for RGB-D co-salient object detection with transformers
Zheng et al. Diagnostic regions attention network (DRA-Net) for histopathology WSI recommendation and retrieval
Xie et al. Hierarchical coding of convolutional features for scene recognition
CN113239159B (en) Cross-modal retrieval method for video and text based on relational inference network
Li et al. Learning hierarchical video representation for action recognition
Gao et al. k-Partite graph reinforcement and its application in multimedia information retrieval
Xu et al. Instance-level coupled subspace learning for fine-grained sketch-based image retrieval
Wu et al. Variant semiboost for improving human detection in application scenes
Su et al. Deep low-rank matrix factorization with latent correlation estimation for micro-video multi-label classification
Wu et al. Deep spatiotemporal LSTM network with temporal pattern feature for 3D human action recognition
CN108108769A (en) Data classification method and device and storage medium
Huang et al. Deep multimodal embedding model for fine-grained sketch-based image retrieval
Soltanian et al. Spatio-temporal VLAD encoding of visual events using temporal ordering of the mid-level deep semantics
Indu et al. Survey on sketch based image retrieval methods
Dong et al. A supervised dictionary learning and discriminative weighting model for action recognition
Qi et al. Cross-media similarity metric learning with unified deep networks

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant