CN107545276B - Multi-view learning method combining low-rank representation and sparse regression - Google Patents
Publication number: CN107545276B (application CN201710648597.4A)
Authority: CN (China)
Legal status: Active
Abstract
The invention discloses a multi-view learning method combining low-rank representation and sparse regression, comprising the following steps: extracting low-level features and high-level attribute features, respectively, from the SUN dataset annotated with image memorability scores; placing three parts, namely the low-rank representation, the sparse regression model, and the multi-view consistency loss, under the same framework to form a whole, thereby constructing a multi-view model combining low-rank representation and sparse regression; solving the automatic image-memorability prediction problem with a multi-view adaptive regression algorithm, and obtaining, under the optimal parameters, the relation between the low-level features of an image, its attribute features, and its memorability; and combining the low-level features and high-level attribute features of the image, predicting the image memorability of the database test set with the relation learned under the optimal parameters, and verifying the prediction result against the relevant evaluation criteria. By uniting low-rank representation and sparse regression in a single multi-view learning framework, the method accurately predicts image memorability.
Description
Technical Field
The invention relates to the field of low-rank representation and sparse regression, and in particular to a multi-view learning method combining low-rank representation and sparse regression.
Background
Humans can remember thousands of images, but not all images are stored in the brain in the same way. Some representative pictures are memorized at a glance, while others quickly fade from memory. Image memorability measures the extent to which an image is remembered or forgotten after a certain period of time. Previous research has shown that the memorability of a picture is related to intrinsic properties of the image; that is, it is consistent over different time intervals and between different observers. Accordingly, just as other high-level image attributes (such as popularity, interestingness, mood, and aesthetics) have been studied, some research efforts have begun exploring potential correlations between image content representations and image memorability.
Analyzing image memorability can be applied in several fields such as user-interface design, video summarization, scene understanding, and advertisement design. For example, an image collection or video can be summarized by selecting meaningful images, using memorability as the guiding criterion. By improving consumers' memory of a target brand, memorable advertisements can be designed to help merchants expand their influence.
Recently, low-rank representation (LRR) has been successfully applied in the multimedia and computer-vision fields. To better handle the feature-representation problem, LRR reveals the underlying low-dimensional subspace structure embedded in the data by decomposing the original data matrix into a low-rank representation matrix while eliminating irrelevant details. Conventional methods are often inadequate for dealing with outliers; to address this, some recent studies have also focused on sparse regression learning.
However, one of the main drawbacks of these works is that feature representation and memorability prediction are performed in two separate stages: once the feature combination used for image-memorability prediction is fixed, the final performance of the regression step is largely determined by the features it is given. Reference [1] proposes a feature-coding algorithm combining low rank and sparse regression to handle outliers, and reference [2] develops a joint graph-embedding and sparse-regression framework for dimensionality reduction, but both are designed for the visual classification problem rather than the image-memorability prediction task.
Disclosure of Invention
The invention provides a multi-view learning method combining low-rank representation and sparse regression, which unites low-rank representation and sparse regression in a single multi-view learning framework to accurately predict image memorability, described in detail as follows:
extracting low-level features and high-level attribute features, respectively, from the SUN dataset annotated with image memorability scores;

placing three parts, namely the low-rank representation, the sparse regression model, and the multi-view consistency loss, under the same framework to form a whole, thereby constructing a multi-view model combining low-rank representation and sparse regression;

solving the automatic image-memorability prediction problem with a multi-view adaptive regression (MAR) algorithm, and obtaining, under the optimal parameters, the relation between the low-level features of an image, its attribute features, and its memorability;

combining the low-level features and high-level attribute features of the image, predicting the image memorability of the database test set with the relation learned under the optimal parameters, and verifying the prediction result against the relevant evaluation criteria;
the automatic image-memorability prediction problem solved by the multi-view adaptive regression algorithm is transformed into an equivalent problem by relaxing with the variable Q:

$$\min_{A,E,Q,w}\;\|A\|_* + \lambda\|E\|_1 + \alpha\|XQ - y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big]$$

$$\text{s.t.}\quad X = XA + E,\quad Q = Aw$$

wherein A is the mapping matrix of the low-rank representation; w is the linear dependency between the low-rank feature representation and the output memorability score; E is the sparse error part; α is a balance parameter between the prediction-error part and the regularization part; β is the parameter controlling sparsity; λ > 0 is a balance parameter; X is the input feature matrix; y is the memorability score vector; ‖·‖_* is the nuclear norm; φ weights the graph-regularization constraint term; L is the graph Laplacian;
introducing two Lagrange multiplier matrices Y_1 and Y_2 yields the augmented Lagrangian function:

$$\mathcal{L} = \|A\|_* + \lambda\|E\|_1 + \alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + \langle Y_1,\, X - XA - E\rangle + \langle Y_2,\, Q - Aw\rangle + \frac{\mu}{2}\Big(\|X - XA - E\|_F^2 + \|Q - Aw\|_F^2\Big)$$

wherein ⟨·,·⟩ represents the inner product operation of matrices, Y_1 and Y_2 represent Lagrange multiplier matrices, and μ > 0 is a positive penalty parameter; combining terms gives:

$$\mathcal{L} = \|A\|_* + \lambda\|E\|_1 + \alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + h(A,E,Q,w,Y_1,Y_2,\mu) - \frac{1}{2\mu}\Big(\|Y_1\|_F^2 + \|Y_2\|_F^2\Big)$$

wherein

$$h(A,E,Q,w,Y_1,Y_2,\mu) = \frac{\mu}{2}\Big(\Big\|X - XA - E + \frac{Y_1}{\mu}\Big\|_F^2 + \Big\|Q - Aw + \frac{Y_2}{\mu}\Big\|_F^2\Big)$$
introducing the variable t and defining A_t, E_t, Q_t, w_t, Y_{1,t}, Y_{2,t} and μ_t as the results of the t-th iteration of the variables, the results of the (t+1)-th iteration are obtained as follows:

iteration result of A, obtained by fixing E, w, Q and applying singular value thresholding to the linearized subproblem:

$$A_{t+1} = \arg\min_A\;\|A\|_* + h(A, E_t, Q_t, w_t, Y_{1,t}, Y_{2,t}, \mu_t)$$

the optimization result for E obtained by fixing w, A, Q is the soft-thresholding update:

$$E_{t+1} = \mathcal{S}_{\lambda/\mu_t}\Big(X - XA_{t+1} + \frac{Y_{1,t}}{\mu_t}\Big)$$

by fixing E, A, Q, the optimization result for w is the ridge-regression solution:

$$w_{t+1} = \big(A_{t+1}^T A_{t+1} + \varepsilon I\big)^{-1} A_{t+1}^T\Big(Q_t + \frac{Y_{2,t}}{\mu_t}\Big)$$

wherein ε ≥ 0 is a small ridge parameter; and finally, fixing E, w and A and optimizing Q gives:

$$Q_{t+1} = \arg\min_Q\;\alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + \frac{\mu_t}{2}\Big\|Q - A_{t+1}w_{t+1} + \frac{Y_{2,t}}{\mu_t}\Big\|_F^2$$

Y_1 and Y_2 are updated by the following scheme:

$$Y_{1,t+1} = Y_{1,t} + \mu_t(X - XA_{t+1} - E_{t+1})$$

$$Y_{2,t+1} = Y_{2,t} + \mu_t(Q_{t+1} - A_{t+1}w_{t+1})$$
the multi-view model combining low rank and sparse regression is specifically:

$$\min_{\phi_H,\phi_l}\;\sum_{v\in\{H,l\}}\Big(\|A_v\|_* + \lambda\|E_v\|_1 + F_v(\phi_v) + \beta\|A_v w_v\|_1 + \Phi(\phi_v)\Big) - G(\phi_H,\phi_l)\quad \text{s.t.}\; X_v = X_v A_v + E_v,\; v\in\{H,l\}$$

wherein:

$$G(\phi_H,\phi_l) = \mathrm{tr}\big[(X_l A_l w_l)^T X_H A_H w_H\big]$$

is the multi-view consistency term; F_H(φ_H) = α‖X_H A_H w_H − y‖₂² is the loss function for the high-level feature prediction error; F_l(φ_l) = α‖X_l A_l w_l − y‖₂² is the loss function for the low-level feature prediction error; Φ(φ_v) = φ tr[(X_v A_v w_v)^T L (X_v A_v w_v)] is the graph-regularization term used to alleviate over-fitting; X_H is the high-level attribute feature matrix; A_H is the mapping matrix of the low-rank representation of the high-level attribute features; E_H is the sparse error part of the high-level attribute features; w_H is the linear dependency between the low-rank representation of the high-level attribute features and the output memorability score; A_l is the mapping matrix of the low-rank representation of the low-level features; E_l is the sparse error part of the low-level features; X_l is the low-level feature matrix; w_l is the linear dependency between the low-rank representation of the low-level features and the output memorability score.
The method further comprises: acquiring an image memorability dataset.

The low-level features include: scale-invariant feature transform (SIFT) features, GIST features, histogram of oriented gradients (HOG) features, and structural similarity (SSIM) features.

The high-level attribute features include: a 327-dimensional scene-category attribute feature and a 106-dimensional object attribute feature.
The technical scheme provided by the invention has the following beneficial effects:

1. low-rank representation and sparse regression are combined for image-memorability prediction, wherein a low-rank constraint reveals the intrinsic structure embedded in the original data and a sparse constraint removes outliers and redundant information; when low-rank representation and sparse regression are executed jointly, the low-rank representation shared by all features captures the intrinsic structure of the features, improving prediction accuracy;

2. the invention solves the optimization problem of the objective function with fast convergence, based on the multi-view adaptive regression (MAR) algorithm.
Drawings
FIG. 1 is a flow chart of the multi-view learning method combining low-rank representation and sparse regression;

FIG. 2 shows sample database images annotated with image memorability scores;

FIG. 3 is a graph of the algorithm's convergence;

FIG. 4 is a graph comparing the results of the present method with those of other methods.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention are described in further detail below.
Example 1
Research shows that image attribute features are higher-level semantic features than the original low-level features. To study the visual features of images and predict image memorability, the embodiment of the invention provides a multi-view learning method combining low-rank representation and sparse regression for image-memorability prediction, the method comprising the following steps:
101: acquiring an image memorability dataset;
wherein the image memorability dataset [1] contains 2,222 images drawn from the SUN dataset [11]. The memorability score of each image was collected with the Visual Memory Game on Amazon Mechanical Turk and is a continuous value from 0 to 1; the higher the score, the more memorable the image. Sample images with various memorability scores are shown in FIG. 2.
102: extracting low-level features and high-level attribute features, respectively, from the SUN dataset annotated with image memorability scores;

wherein the extracted low-level features include: SIFT (scale-invariant feature transform), GIST, HOG (histogram of oriented gradients), and SSIM (structural similarity) features, which together constitute the low-level feature library. The embodiment of the invention simultaneously uses two types of high-level attribute features: a 327-dimensional scene-category attribute feature and a 106-dimensional object attribute feature.

The scene-category attribute covers 327 scene types, and the object attribute feature is annotated with 106 object categories; the specific dimensions are set as required in practical applications and are not limited by the embodiment of the invention.
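As a concrete illustration of how the two feature views might be assembled, a minimal sketch follows; the per-descriptor encoders are assumed to be computed elsewhere, and every dimension other than the 327/106 attribute dimensions is a hypothetical placeholder rather than a value fixed by the invention:

```python
import numpy as np

N = 2222  # images in the SUN memorability subset

# Low-level view: concatenate the four descriptor encodings per image.
# Random arrays stand in for real SIFT/GIST/HOG/SSIM encodings; 512 is arbitrary.
sift = np.random.rand(N, 512)
gist = np.random.rand(N, 512)
hog = np.random.rand(N, 512)
ssim = np.random.rand(N, 512)
X_low = np.hstack([sift, gist, hog, ssim])

# High-level view: 327-dim scene-category attributes + 106-dim object attributes.
scene_attr = np.random.rand(N, 327)
object_attr = np.random.rand(N, 106)
X_high = np.hstack([scene_attr, object_attr])

# Column-wise z-scoring so the views are comparable in scale (an assumption;
# the patent does not state its normalization).
def zscore(X):
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)

X_low, X_high = zscore(X_low), zscore(X_high)
print(X_low.shape, X_high.shape)  # (2222, 2048) (2222, 433)
```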
103: placing three parts, namely the low-rank representation, the sparse regression model, and the multi-view consistency loss, under the same framework to form a whole, thereby constructing the Mv-JLRSR model (multi-view model combining low rank and sparse regression);

104: solving the automatic image-memorability prediction problem with the multi-view adaptive regression (MAR) algorithm, and obtaining, under the optimal parameters, the relation between the low-level features of an image, its attribute features, and its memorability;

105: combining the low-level features and high-level attribute features of the image, predicting the image memorability of the database test set with the relation learned under the optimal parameters, and verifying the prediction result against the relevant evaluation criteria.
In summary, through steps 101 to 105 the embodiment of the invention adopts a low-rank constraint to reveal the intrinsic structure of the original data and a sparse constraint to remove outliers and redundant information from the features. When low-rank representation and sparse regression are executed jointly, the lowest-rank representation shared by all features not only captures the global structure of all modalities but also serves the regression. Since the proposed objective function is non-smooth and difficult to solve, the multi-view adaptive regression (MAR) algorithm is used to solve the automatic image-memorability prediction problem, handling the optimization with fast convergence.
Example 2
The scheme of Example 1 is further described below with reference to specific calculation formulas, in detail as follows:
201: the image memorability dataset contains 2,222 images from the SUN dataset;

The dataset is well known to those skilled in the art and is not described in detail here.
202: extracting features from the pictures of the SUN dataset annotated with image memorability scores; the extracted SIFT, GIST, HOG and SSIM features form the low-level feature library, and two types of high-level attribute features are used, including the 327-dimensional scene-category attributes and the 106-dimensional object attributes.

The dataset comprises 2,222 pictures of various environments, each annotated with an image memorability score; FIG. 2 shows samples of the annotated pictures in the database. Each feature is expressed as B_i ∈ R^{N×D_i}, where D_i denotes the dimension of that feature and N denotes the number of images contained in the database (2,222). These features constitute a feature library B = {B_1, ..., B_M}.
203: establishing the Mv-JLRSR model: combining low-rank representation and sparse regression on the basis of the extracted low-level features and high-level attribute features, so as to build a more robust feature representation and an accurate regression model.
The general framework defined by the Mv-JLRSR model is as follows:

$$\min_{A,E,w}\; F(A,w) + L(A,E) + G(A)$$

wherein F(A,w) is the loss function used as the prediction error; L(A,E) denotes the feature encoder based on low-rank representation; G(A) is the graph-regularization term used to alleviate over-fitting; A is the mapping matrix of the low-rank representation; w is the linear dependency between the low-rank feature representation and the output memorability score; E is the sparse error part.
The image memorability dataset [1] contains 2,222 images from the SUN dataset [11], whose memorability scores were obtained with the Visual Memory Game on Amazon Mechanical Turk. The extracted feature library is trained by linear regression combined with the regression training of adaptive transfer learning. Score prediction of image memorability proceeds along two lines: on one hand, feature representations are used directly to predict image memorability, yielding a mapping matrix w_i from each type of low-level image feature to image memorability; on the other hand, the high-level attribute features of the image, which play a very important role in predicting the memorability score, are combined with low-rank learning to obtain the relation between each type of image attribute and image memorability. From the initial image features a vector set X ∈ R^{N×D} is obtained. The goal of the Mv-JLRSR model is to combine low-rank representation and sparse regression on the basis of the extracted visual cues to strengthen robust feature representation and an accurate regression model.
Each part is specifically described as follows:

Noise and redundant information can be removed by the low-rank constraint, helping to reveal the essential structure of the data; the extracted low-level and high-level features can therefore be integrated into feature learning to handle these problems. LRR assumes that the original feature matrix contains a latent lowest-rank structural component shared by all samples plus a unique error matrix:

$$\min_{A,E}\;\mathrm{rank}(A) + \lambda\|E\|_1 \quad \text{s.t.}\; X = XA + E$$

wherein A ∈ R^{D×D} is the low-rank projection matrix of the N samples, E ∈ R^{N×D} is the unique sparse error part, constrained by the ℓ1 norm to handle random errors, λ > 0 is a balance parameter, X is the input feature matrix, D is the feature dimension after the low-rank constraint, and rank(·) is the rank of the low-rank representation.

Because the above problem is difficult to optimize, the nuclear norm ‖A‖_* (the sum of the singular values of the matrix) is adopted to approximate the rank of A, so the formula for L(A,E) can be defined as follows:

$$L(A,E) = \|A\|_* + \lambda\|E\|_1 \quad \text{s.t.}\; X = XA + E$$
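Both relaxed terms have standard proximal operators: the nuclear norm is minimized by singular value thresholding (SVT) and the ℓ1 term by elementwise soft thresholding. A minimal sketch of the two operators follows; these are the standard textbook forms, offered as an illustration rather than the patent's verbatim procedure:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)  # shrink the singular values toward zero
    return (U * s) @ Vt

def soft_threshold(M, tau):
    """Elementwise soft thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)
```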
in the framework proposed by the embodiment of the present invention, the problem of image memory prediction is taken as a standard regression problem. Propose lasso[5]The regression method minimizes the least square error by establishing a linear relationship v between the input feature matrix X and the memorability score vector yTo solve the prediction problem. Adding ridge regularization to least squares error portion[6]Thereafter, a typical least squares problem with ridge regression is obtained.
Where α is a balance parameter between the prediction error portion and the regularization portion.
From a matrix-decomposition perspective, the transform vector v may be decomposed as the product of two components: the low-rank projection matrix A, applied to capture the low-rank structure shared between the samples, and the coefficient vector w, applied to associate the transformed samples with their memorability scores. Q = Aw is introduced, and the loss function F(A,w) is defined as:

$$F(A,w) = \alpha\|XAw - y\|_2^2 + \beta\|Aw\|_1$$

Based on the idea of multi-view learning, graph regularization is employed to maintain geometric consistency. Its core idea is that if samples are close in their feature representation, then their memorability scores should also be close, and vice versa. Geometric consistency between features and memorability scores is achieved by minimizing the graph regularizer G(A):

$$G(A) = \phi\,\mathrm{tr}\big[(XAw)^T L (XAw)\big]$$

wherein L = B − S is the graph Laplacian, B is the diagonal degree matrix with B_ii = Σ_j S_ij, and S is a weight matrix calculated by a Gaussian similarity function:

$$S_{ij} = \begin{cases}\exp\big(-\frac{(y_i - y_j)^2}{\sigma^2}\big), & y_j \in N_K(y_i)\\ 0, & \text{otherwise}\end{cases}$$

wherein y_i and y_j are the memorability scores of the i-th and j-th samples, N_K(y_i) denotes the K-nearest neighborhood of y_i, and σ is a radius parameter, simply set to the median of the Euclidean distances over all sample pairs.
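A minimal sketch of this Laplacian construction follows; the function name, the choice k = 10, and the kNN symmetrization are assumptions for illustration:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def graph_laplacian(y, k=10):
    """Build L = B - S from Gaussian similarities over k-nearest neighbors."""
    d = squareform(pdist(y.reshape(-1, 1)))  # pairwise |y_i - y_j|
    # sigma: median pairwise distance, as in the text
    sigma = np.median(d[np.triu_indices_from(d, k=1)])
    S = np.exp(-(d ** 2) / (sigma ** 2 + 1e-12))
    # keep each sample's k nearest neighbors (column 0 is the sample itself)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]
    mask = np.zeros_like(S, dtype=bool)
    mask[np.arange(len(y))[:, None], idx] = True
    S = np.where(mask | mask.T, S, 0.0)      # symmetrize the neighborhood graph
    B = np.diag(S.sum(axis=1))               # degree matrix, B_ii = sum_j S_ij
    return B - S
```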
Multiple features are typically extracted to represent images from different views, because these multiple representations provide compatible and complementary information. For the image-memorability prediction task, integrating these multiple representations to describe an image is a natural choice for better performance, rather than relying on a single feature. The embodiment of the invention extracts high-level attribute features and low-level visual features.
The Mv-JLRSR model is thus defined as:

$$\min_{\phi_H,\phi_l}\;\sum_{v\in\{H,l\}}\Big(\|A_v\|_* + \lambda\|E_v\|_1 + F_v(\phi_v) + \beta\|A_v w_v\|_1 + \Phi(\phi_v)\Big) - G(\phi_H,\phi_l)\quad \text{s.t.}\; X_v = X_v A_v + E_v,\; v\in\{H,l\}$$

wherein:

$$G(\phi_H,\phi_l) = \mathrm{tr}\big[(X_l A_l w_l)^T X_H A_H w_H\big]$$

A_v ∈ R^{D×D} is the low-rank projection matrix of the N samples of view v, capturing the underlying low-rank structure shared between samples, and E_v ∈ R^{N×D} is the sparse error part, constrained by the ℓ1 norm to handle random errors.

F_H(φ_H) = α‖X_H A_H w_H − y‖₂² is the loss function for the high-level feature prediction error; F_l(φ_l) = α‖X_l A_l w_l − y‖₂² is the loss function for the low-level feature prediction error; Φ(φ_v) = φ tr[(X_v A_v w_v)^T L (X_v A_v w_v)] is the graph-regularization term used to alleviate over-fitting; X_H is the high-level attribute feature matrix; A_H is the mapping matrix of the low-rank representation of the high-level attribute features; E_H is the sparse error part of the high-level attribute features; β is the parameter controlling sparsity; w_H is the linear dependency between the low-rank representation of the high-level attribute features and the output memorability score; A_l is the mapping matrix of the low-rank representation of the low-level features; E_l is the sparse error part of the low-level features; X_l is the low-level feature matrix; w_l is the linear dependency between the low-rank representation of the low-level features and the output memorability score; y is the label vector of the training samples.

The error function β‖A_l w_l‖₁ + α‖X_l A_l w_l − y‖₂² solves the prediction problem by establishing linear vectors between the input feature matrices (X_H and X_l) and the memorability score vector y.
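Since each view's prediction X_v A_v w_v is a score vector, the consistency term G reduces to a plain inner product between the two views' predictions. A minimal sketch follows, with hypothetical, randomly initialized parameters; the low-level dimension 512 is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_H, D_l = 2222, 433, 512
X_H, X_l = rng.random((N, D_H)), rng.random((N, D_l))
A_H, w_H = rng.random((D_H, D_H)), rng.random(D_H)
A_l, w_l = rng.random((D_l, D_l)), rng.random(D_l)

pred_H = X_H @ (A_H @ w_H)  # high-level view's predicted scores
pred_l = X_l @ (A_l @ w_l)  # low-level view's predicted scores
G = pred_l @ pred_H         # tr[(X_l A_l w_l)^T X_H A_H w_H]
print(G)
```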
For F_H(φ_H) and F_l(φ_l) in the Mv-JLRSR model, α, β, λ and φ are initialized; then A, E, w and Q are fixed in turn, the corresponding derivative is taken, and this derivation process is repeated until the error reaches the set minimum.
The solution process is described in more detail below; the multi-view adaptive regression (MAR) algorithm [7] is used to solve the automatic image-memorability prediction problem and thereby the optimization problem.

First, the slack variable Q is introduced to transform the problem into the equivalent form:

$$\min_{A,E,Q,w}\;\|A\|_* + \lambda\|E\|_1 + \alpha\|XQ - y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big]\quad \text{s.t.}\; X = XA + E,\; Q = Aw$$

Then, two Lagrange multiplier matrices Y_1 and Y_2 are introduced to obtain the augmented Lagrangian function:

$$\mathcal{L} = \|A\|_* + \lambda\|E\|_1 + \alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + \langle Y_1,\, X - XA - E\rangle + \langle Y_2,\, Q - Aw\rangle + \frac{\mu}{2}\Big(\|X - XA - E\|_F^2 + \|Q - Aw\|_F^2\Big)$$

wherein ⟨·,·⟩ represents the inner product operation of matrices, Y_1 and Y_2 represent Lagrange multiplier matrices, μ > 0 is a positive penalty parameter, ‖·‖_* is the nuclear norm, and φ weights the graph-regularization constraint term. Combining terms gives:

$$\mathcal{L} = \|A\|_* + \lambda\|E\|_1 + \alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + h(A,E,Q,w,Y_1,Y_2,\mu) - \frac{1}{2\mu}\Big(\|Y_1\|_F^2 + \|Y_2\|_F^2\Big)$$

wherein

$$h(A,E,Q,w,Y_1,Y_2,\mu) = \frac{\mu}{2}\Big(\Big\|X - XA - E + \frac{Y_1}{\mu}\Big\|_F^2 + \Big\|Q - Aw + \frac{Y_2}{\mu}\Big\|_F^2\Big)$$
The problem is solved by alternating iteration: the quadratic term h(A,Q,E,w,Y_1,Y_2,μ) is approximated by its second-order Taylor expansion so that each subproblem can be treated separately. To clarify this process, a variable t is introduced, and A_t, E_t, Q_t, w_t, Y_{1,t}, Y_{2,t} and μ_t are defined as the results of the t-th iteration of the variables, yielding the (t+1)-th iteration results as follows:

Iteration result of A, obtained by fixing E, w, Q and applying singular value thresholding to the linearized subproblem:

$$A_{t+1} = \arg\min_A\;\|A\|_* + h(A, E_t, Q_t, w_t, Y_{1,t}, Y_{2,t}, \mu_t)$$

Then, the optimization result for E obtained by fixing w, A and Q is the soft-thresholding update:

$$E_{t+1} = \mathcal{S}_{\lambda/\mu_t}\Big(X - XA_{t+1} + \frac{Y_{1,t}}{\mu_t}\Big)$$

By fixing E, A, Q, the w subproblem is:

$$w_{t+1} = \arg\min_w\; h(A_{t+1}, E_{t+1}, Q_t, w, Y_{1,t}, Y_{2,t}, \mu_t)$$

This is the well-known ridge regression problem, whose optimal solution is

$$w_{t+1} = \big(A_{t+1}^T A_{t+1} + \varepsilon I\big)^{-1} A_{t+1}^T\Big(Q_t + \frac{Y_{2,t}}{\mu_t}\Big)$$

with ε ≥ 0 a small ridge parameter.

Finally, fixing E, w and A and optimizing Q gives:

$$Q_{t+1} = \arg\min_Q\;\alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + \frac{\mu_t}{2}\Big\|Q - A_{t+1}w_{t+1} + \frac{Y_{2,t}}{\mu_t}\Big\|_F^2$$
in addition, the Lagrange multiplier Y1And Y2The update can be achieved through the following scheme:
Y1,t+1=Y1,t+μt(X-XAt+1-Et+1)
Y2,t+1=Y2,t+μt(Qt+1-At+1wt+1)
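The closed-form updates above are rendered as images in the source document, so the following sketch only follows the alternating pattern the text describes (fix three variables, update the fourth, then update the multipliers). It reuses the svt and soft_threshold helpers sketched earlier; the step sizes eta and tq, the ridge parameter eps, and the penalty schedule rho/mu_max are assumptions, not values from the patent:

```python
import numpy as np

def mar_single_view(X, y, L, alpha=1.0, beta=0.1, lam=1.0, phi=0.01,
                    mu=1e-2, rho=1.1, mu_max=1e6, eps=1e-6, iters=200):
    """Alternating (inexact ALM) sketch for one view of the relaxed problem:
    min ||A||_* + lam*||E||_1 + alpha*||XQ - y||^2 + beta*||Q||_1
        + phi*(XQ)^T L (XQ)   s.t.  X = XA + E,  Q = Aw.
    """
    N, D = X.shape
    A = np.zeros((D, D)); E = np.zeros((N, D))
    w = np.zeros(D); Q = np.zeros(D)
    Y1 = np.zeros((N, D)); Y2 = np.zeros(D)
    eta = np.linalg.norm(X, 2) ** 2 + 1.0  # proximal step constant for A

    for t in range(iters):
        # A-step: linearize h at the current A, then singular value threshold.
        r = Q - A @ w + Y2 / mu
        G = -X.T @ (X - X @ A - E + Y1 / mu) - np.outer(r, w)  # grad of h / mu
        A = svt(A - G / eta, 1.0 / (mu * eta))
        # E-step: closed-form soft thresholding.
        E = soft_threshold(X - X @ A + Y1 / mu, lam / mu)
        # w-step: ridge-regularized least squares under the constraint Q ~ Aw.
        w = np.linalg.solve(A.T @ A + eps * np.eye(D), A.T @ (Q + Y2 / mu))
        # Q-step: one proximal-gradient step on the smooth part (crude step
        # size; a line search would be safer), then shrink for the l1 term.
        g = (2 * alpha * X.T @ (X @ Q - y) + 2 * phi * X.T @ (L @ (X @ Q))
             + mu * (Q - A @ w + Y2 / mu))
        tq = 1.0 / (2 * alpha * eta + mu)
        Q = soft_threshold(Q - tq * g, beta * tq)
        # Multiplier and penalty updates.
        Y1 += mu * (X - X @ A - E)
        Y2 += mu * (Q - A @ w)
        mu = min(rho * mu, mu_max)
    return A, E, Q, w
```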
The relation between the predicted scores and the actual scores is then studied under the selected evaluation criteria to obtain the algorithm's performance.
In the embodiment of the invention, the database is randomly divided into 10 groups; the above steps are performed on each group to obtain 10 sets of correlation coefficients, whose average is taken to evaluate the performance of the algorithm. The evaluation criteria selected by the method are rank correlation (RC) and R-value, as described in detail in Example 3.
Example 3
The feasibility of the schemes of Examples 1 and 2 is verified below in FIGS. 3 and 4 with reference to specific experimental data, in detail as follows:

The image memorability dataset contains 2,222 images from the SUN dataset. The memorability score of each image was collected with the Visual Memory Game on Amazon Mechanical Turk and is a continuous value from 0 to 1; the higher the score, the more memorable the image. Sample images with various memorability scores are shown in FIG. 2.
The method adopts two evaluation methods:
Rank correlation evaluation method (RC): the ordering relation between the true memorability ranking and the ranking induced by the predicted memorability scores is obtained, and the Spearman rank-correlation coefficient is adopted to measure the correlation between the two rankings. Its range is [−1, 1], and the higher the value, the closer the two rankings:

$$RC = 1 - \frac{6\sum_{i=1}^{N}(r_{1i} - r_{2i})^2}{N(N^2 - 1)}$$

wherein N is the number of images in the test set, element r_{1i} of r_1 is the position of the i-th picture in the true ranking, and element r_{2i} of r_2 is its position in the predicted ranking.
R-value: the correlation coefficient between the predicted scores and the actual scores, which facilitates comparison of regression models. The range of the R-value is [−1, 1]; 1 indicates perfect positive correlation and −1 perfect negative correlation:

$$R = \frac{\sum_{i=1}^{N}(s_i - \bar{s})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{N}(s_i - \bar{s})^2\;\sum_{i=1}^{N}(v_i - \bar{v})^2}}$$

wherein N is the number of images in the test set, s_i is the true memorability score of the i-th image, s̄ is the average of the true memorability scores of all images, v_i is the predicted memorability score of the i-th image, and v̄ is the average of all predicted memorability scores.
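Both criteria have direct implementations; a minimal sketch follows (the Spearman form below is the closed formula given above and ignores ties):

```python
import numpy as np

def rank_correlation(y_true, y_pred):
    """Spearman rank correlation: 1 - 6*sum(d^2) / (N*(N^2 - 1))."""
    r1 = np.argsort(np.argsort(y_true))  # rank of each image in the true ordering
    r2 = np.argsort(np.argsort(y_pred))  # rank in the predicted ordering
    n = len(y_true)
    return 1.0 - 6.0 * np.sum((r1 - r2) ** 2) / (n * (n ** 2 - 1))

def r_value(y_true, y_pred):
    """Pearson correlation between predicted and ground-truth scores."""
    a = y_true - y_true.mean()
    b = y_pred - y_pred.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))
```

Following the protocol of Example 2, both measures would be computed on each of the 10 random groups and averaged.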
The method was compared experimentally with the following four methods:
LR (Linear Regression): training the relation between the low-level features and the memorability score with a linear prediction function;

SVR (Support Vector Regression): the low-level features are concatenated, and a nonlinear function predicting image memorability is learned with an RBF kernel;

MRR [9] (Multiple Rank Regression): establishing a regression model using multiple-rank left and right projection vectors;

MLHR [10] (Multi-Level via Hierarchical Regression): analyzing the multimedia information based on hierarchical regression.
FIG. 3 verifies the convergence of the algorithm; FIG. 4 shows the performance comparison between the method and the other methods, from which it can be seen that the method is superior. The comparison methods only explore the relation between low-level features and memorability prediction, whereas the method combines the low-level features and the image attribute features in the same framework to predict image memorability and obtains a stable model. The experimental results verify the feasibility and superiority of the method.
References

[1] Zhang Z, Li F, Zhao M, et al. Joint low-rank and sparse principal feature coding for enhanced robust representation and visual classification. IEEE Transactions on Image Processing, 2016, 25(6): 2429-2443.
[2] Shi X, Guo Z, Lai Z, et al. A framework of joint graph embedding and sparse regression for dimensionality reduction. IEEE Transactions on Image Processing, 2015, 24(4): 1341-1355.
[3] P. Isola, J. Xiao, A. Torralba, and A. Oliva. What makes an image memorable? In Proc. Int. Conf. Comput. Vis. Pattern Recognit., 2011, pp. 145-152.
[4] P. Isola, D. Parikh, A. Torralba, and A. Oliva. Understanding the intrinsic memorability of images. In Proc. Adv. Conf. Neural Inf. Process. Syst., 2011, pp. 2429-2437.
[5] Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 1996: 267-288.
[6] Hoerl A E, Kennard R W. Ridge regression: biased estimation for nonorthogonal problems. Technometrics, 1970, 12(1): 55-67.
[7] Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics, 2007, 81(3): 559-575.
[8] Q. You, H. Jin, and J. Luo. Visual sentiment analysis by attending on local image regions. In Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[9] Hou C, Nie F, Yi D, et al. Efficient image classification via multiple rank regression. IEEE Transactions on Image Processing, 2013, 22(1): 340-352.
[10] Sundt B. A multi-level hierarchical credibility regression model. Scandinavian Actuarial Journal, 1980, 1980(1): 25-32.
[11] J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba. SUN database: large-scale scene recognition from abbey to zoo. In Proc. Int. Conf. Comput. Vis. Pattern Recognit., 2010, pp. 3485-3492.
Those skilled in the art will appreciate that the drawings are only schematic illustrations of preferred embodiments, and the above-described embodiments of the present invention are merely provided for description and do not represent the merits of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (4)
1. A multi-view learning method combining low-rank representation and sparse regression, the method comprising the following steps:

extracting low-level features and high-level attribute features, respectively, from the SUN dataset annotated with image memorability scores;

placing three parts, namely the low-rank representation, the sparse regression model, and the multi-view consistency loss, under the same framework to form a whole, thereby constructing a multi-view model combining low rank and sparse regression;

solving the automatic image-memorability prediction problem with a multi-view adaptive regression algorithm, and obtaining, under the optimal parameters, the relation between the low-level features of an image, its attribute features, and its memorability;

combining the low-level features and high-level attribute features of the image, predicting the image memorability of the database test set with the relation learned under the optimal parameters, and verifying the prediction result against the relevant evaluation criteria;
the automatic image-memorability prediction problem solved by the multi-view adaptive regression algorithm is transformed into an equivalent problem by relaxing with the variable Q:

$$\min_{A,E,Q,w}\;\|A\|_* + \lambda\|E\|_1 + \alpha\|XQ - y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big]$$

$$\text{s.t.}\quad X = XA + E,\quad Q = Aw$$

wherein A is the mapping matrix of the low-rank representation; w is the linear dependency between the low-rank feature representation and the output memorability score; E is the sparse error part; α is a balance parameter between the prediction-error part and the regularization part; β is the parameter controlling sparsity; λ > 0 is a balance parameter; X is the input feature matrix; y is the memorability score vector; ‖·‖_* is the nuclear norm; φ weights the graph-regularization constraint term; L is the graph Laplacian;
introducing two Lagrange multiplier matrices Y_1 and Y_2 yields the augmented Lagrangian function:

$$\mathcal{L} = \|A\|_* + \lambda\|E\|_1 + \alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + \langle Y_1,\, X - XA - E\rangle + \langle Y_2,\, Q - Aw\rangle + \frac{\mu}{2}\Big(\|X - XA - E\|_F^2 + \|Q - Aw\|_F^2\Big)$$

wherein ⟨·,·⟩ represents the inner product operation of matrices, Y_1 and Y_2 represent Lagrange multiplier matrices, and μ > 0 is a positive penalty parameter; combining terms gives:

$$\mathcal{L} = \|A\|_* + \lambda\|E\|_1 + \alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + h(A,E,Q,w,Y_1,Y_2,\mu) - \frac{1}{2\mu}\Big(\|Y_1\|_F^2 + \|Y_2\|_F^2\Big)$$

wherein

$$h(A,E,Q,w,Y_1,Y_2,\mu) = \frac{\mu}{2}\Big(\Big\|X - XA - E + \frac{Y_1}{\mu}\Big\|_F^2 + \Big\|Q - Aw + \frac{Y_2}{\mu}\Big\|_F^2\Big)$$
introducing the variable t and defining A_t, E_t, Q_t, w_t, Y_{1,t}, Y_{2,t} and μ_t as the results of the t-th iteration of the variables, the results of the (t+1)-th iteration are as follows:

iteration result of A, obtained by fixing E, w, Q and applying singular value thresholding to the linearized subproblem:

$$A_{t+1} = \arg\min_A\;\|A\|_* + h(A, E_t, Q_t, w_t, Y_{1,t}, Y_{2,t}, \mu_t)$$

the optimization result for E obtained by fixing w, A, Q is the soft-thresholding update:

$$E_{t+1} = \mathcal{S}_{\lambda/\mu_t}\Big(X - XA_{t+1} + \frac{Y_{1,t}}{\mu_t}\Big)$$

by fixing E, A, Q, the optimization result for w is the ridge-regression solution:

$$w_{t+1} = \big(A_{t+1}^T A_{t+1} + \varepsilon I\big)^{-1} A_{t+1}^T\Big(Q_t + \frac{Y_{2,t}}{\mu_t}\Big)$$

wherein ε ≥ 0 is a small ridge parameter; and finally, fixing E, w and A and optimizing Q gives:

$$Q_{t+1} = \arg\min_Q\;\alpha\|XQ-y\|_2^2 + \beta\|Q\|_1 + \phi\,\mathrm{tr}\big[(XQ)^T L (XQ)\big] + \frac{\mu_t}{2}\Big\|Q - A_{t+1}w_{t+1} + \frac{Y_{2,t}}{\mu_t}\Big\|_F^2$$

Y_1 and Y_2 are updated by the following scheme:

$$Y_{1,t+1} = Y_{1,t} + \mu_t(X - XA_{t+1} - E_{t+1})$$

$$Y_{2,t+1} = Y_{2,t} + \mu_t(Q_{t+1} - A_{t+1}w_{t+1})$$
the multi-view model combining low rank and sparse regression is specifically:

$$\min_{\phi_H,\phi_l}\;\sum_{v\in\{H,l\}}\Big(\|A_v\|_* + \lambda\|E_v\|_1 + F_v(\phi_v) + \beta\|A_v w_v\|_1 + \Phi(\phi_v)\Big) - G(\phi_H,\phi_l)\quad \text{s.t.}\; X_v = X_v A_v + E_v,\; v\in\{H,l\}$$

wherein:

$$G(\phi_H,\phi_l) = \mathrm{tr}\big[(X_l A_l w_l)^T X_H A_H w_H\big]$$

is the multi-view consistency term; F_H(φ_H) = α‖X_H A_H w_H − y‖₂² is the loss function for the high-level feature prediction error; F_l(φ_l) = α‖X_l A_l w_l − y‖₂² is the loss function for the low-level feature prediction error; Φ(φ_v) = φ tr[(X_v A_v w_v)^T L (X_v A_v w_v)] is the graph-regularization term used to alleviate over-fitting; X_H is the high-level attribute feature matrix; A_H is the mapping matrix of the low-rank representation of the high-level attribute features; E_H is the sparse error part of the high-level attribute features; w_H is the linear dependency between the low-rank representation of the high-level attribute features and the output memorability score; A_l is the mapping matrix of the low-rank representation of the low-level features; E_l is the sparse error part of the low-level features; X_l is the low-level feature matrix; w_l is the linear dependency between the low-rank representation of the low-level features and the output memorability score.
2. The multi-view learning method combining low-rank representation and sparse regression of claim 1, further comprising: acquiring an image memorability dataset.

3. The multi-view learning method combining low-rank representation and sparse regression of claim 1, wherein the low-level features comprise: scale-invariant feature transform features, GIST features, histogram of oriented gradients features, and structural similarity features.

4. The multi-view learning method combining low-rank representation and sparse regression of claim 1, wherein the high-level attribute features comprise: a 327-dimensional scene-category attribute feature and a 106-dimensional object attribute feature.