CN109242007A - Tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data
- Publication number
- CN109242007A (application CN201810970444.6A)
- Authority
- CN
- China
- Prior art keywords
- tensor
- scoring
- vector
- feature space
- space
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data. Sample tensors are constructed by fusing cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the order corresponding to each feature space to obtain connection tensors; a scoring vector is obtained for each feature space from the connection tensors, and the outer product of the feature space scoring vectors gives a scoring tensor; the feature space combination vectors and the scoring tensor are introduced into the high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is calculated from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination vectors; multi-perspective clustering results under different views are obtained from the view matrix. This solves the technical problem in the prior art that different clustering results cannot be generated according to the needs of applications in different situations, so that high-quality clustering services can be provided for different applications.
Description
Technical field
The present invention relates to the technical field of information processing, and in particular to a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data.
Background art
Multi-perspective clustering analysis is an emerging research area in data mining. It explores an unknown data set from different perspectives, allowing multiple clustering processes and one or more clustering results. Compared with traditional single clustering, which explores an unknown data set from one perspective and produces only one clustering result, it better matches the diverse ways in which humans view the world. Performing multi-perspective clustering analysis on big data can therefore reveal all the structures in the data and better serve people.
Existing multi-perspective clustering techniques mainly include multi-view clustering, alternative clustering and subspace clustering. Multi-view clustering fuses multi-source information to mine the intrinsic structure of the data and performs better than single-view clustering. However, it can only learn a single clustering result from multi-source information and cannot select combinations of different features to produce different clustering results from multiple angles. Alternative clustering mines different patterns in the data and offers the user several different clustering results to choose from. But it focuses only on the diversity among the clustering results, cannot explain their meaning, and cannot fuse multi-view information to improve clustering performance. Subspace clustering is used for high-dimensional clustering and can find good clusters in extracted subspaces, but it cannot produce multiple different clustering results from different viewpoints of the data.
The prior-art multi-perspective clustering techniques therefore cannot, on the basis of fused cross-domain multi-view information, let the user select different combinations of data features according to different situational contexts. The resulting technical problem is that different clustering results cannot be generated for different situations to provide high-quality clustering services for upper-layer big data applications.
Summary of the invention
Embodiments of the present invention provide a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data, which solve the technical problem in the prior art that different clustering results cannot be generated for different situations to provide high-quality clustering services for upper-layer big data applications.
In view of the above problem, the embodiments of the present application are proposed to provide a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data.
In a first aspect, the present invention provides a tensor-based multi-perspective clustering method for cross-domain heterogeneous big data, the method including:
constructing sample tensors by fusing cross-domain heterogeneous feature spaces, and constructing feature space combination vectors according to different situational contexts; accumulating the sample tensors to obtain a merged tensor; normalizing the merged tensor along the order corresponding to each feature space to obtain connection tensors; calculating, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor; introducing the feature space combination vectors and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance; calculating sample similarity from the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vectors; and obtaining multi-perspective clustering results under different views from the view matrix.
Preferably, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Preferably, calculating the stationary distribution under multi-attribute association from the connection tensors to obtain the scoring vector of each feature space, and taking the outer product of the scoring vectors to obtain the scoring tensor, further includes: obtaining l connection tensors, where l is a positive integer; initializing a probability parameter and a threshold parameter; selecting an initial vector and a random vector; taking the single-mode product of each of the l connection tensors with the initial vector and the random vector; judging, for each, whether the error between two successive scoring vectors is less than the threshold parameter; when the error between two successive scoring vectors is less than the threshold parameter, obtaining l scoring vectors; truncating the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors; and taking the outer product of the feature space scoring vectors to obtain the scoring tensor.
Preferably, obtaining the multi-perspective clustering results under different views from the view matrix further includes: inputting the view matrix to a typical clustering algorithm to obtain the multi-perspective clustering results.
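As one illustration of feeding a view matrix to a typical clustering algorithm, the sketch below implements a simplified density-peak clustering. It assumes the view matrix has already been converted to a symmetric matrix of pairwise distances `dist`; the cutoff `dc`, the fixed number of clusters, and the use of the product of density and separation to pick centers are illustrative choices, and `density_peaks` is a hypothetical name rather than anything defined by the patent.

```python
def density_peaks(dist, dc, n_clusters):
    """Simplified density-peak clustering on a symmetric distance matrix."""
    n = len(dist)
    # Local density: number of other points closer than the cutoff dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]
    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n     # distance to the nearest denser point
    parent = [-1] * n     # index of that nearest denser point
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(dist[i])   # densest point: use its largest distance
            continue
        j = min(order[:pos], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    # Assumed heuristic: centers are the points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: -(rho[i] * delta[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Each remaining point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels
```

Running the function once per view matrix would yield one label list per feature space combination.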
In a second aspect, the present invention provides a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data, the device including:
a first construction unit, configured to construct sample tensors by fusing cross-domain heterogeneous feature spaces, and to construct feature space combination vectors according to different situational contexts;
a first obtaining unit, configured to accumulate the sample tensors to obtain a merged tensor;
a second obtaining unit, configured to normalize the merged tensor along the order corresponding to each feature space to obtain connection tensors;
a third obtaining unit, configured to calculate, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and to take the outer product of the feature space scoring vectors to obtain a scoring tensor;
a second construction unit, configured to introduce the feature space combination vectors and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance;
a third construction unit, configured to calculate sample similarity from the combined scoring tensor distance, and to construct a view matrix according to the feature space combination vectors;
a fourth obtaining unit, configured to obtain multi-perspective clustering results under different views from the view matrix.
Preferably, in the first construction unit, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Preferably, the third obtaining unit, which calculates the stationary distribution under multi-attribute association from the connection tensors to obtain the scoring vector of each feature space and takes the outer product of the scoring vectors to obtain the scoring tensor, further includes:
a fifth obtaining unit, configured to obtain l connection tensors, where l is a positive integer;
a first execution unit, configured to initialize a probability parameter and a threshold parameter;
a second execution unit, configured to select an initial vector and a random vector;
a third execution unit, configured to take the single-mode product of each of the l connection tensors with the initial vector and the random vector;
a first judging unit, configured to judge, for each, whether the error between two successive scoring vectors is less than the threshold parameter;
a sixth obtaining unit, configured to obtain l scoring vectors when the error between two successive scoring vectors is less than the threshold parameter;
a seventh obtaining unit, configured to truncate the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors;
an eighth obtaining unit, configured to take the outer product of the feature space scoring vectors to obtain the scoring tensor.
Preferably, the fourth obtaining unit, which obtains the multi-perspective clustering results under different views from the view matrix, further includes:
a ninth obtaining unit, configured to input the view matrix to a typical clustering algorithm to obtain the multi-perspective clustering results.
In a third aspect, the present invention provides a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, performs the following steps: constructing sample tensors by fusing cross-domain heterogeneous feature spaces, and constructing feature space combination vectors according to different situational contexts; accumulating the sample tensors to obtain a merged tensor; normalizing the merged tensor along the order corresponding to each feature space to obtain connection tensors; calculating, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor; introducing the feature space combination vectors and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance; calculating sample similarity from the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vectors; and obtaining multi-perspective clustering results under different views from the view matrix.
The one or more technical solutions in the embodiments of the present application have at least the following one or more technical effects:
In the tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data provided by the embodiments of the present application, sample tensors are constructed by fusing cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the order corresponding to each feature space to obtain connection tensors; the stationary distribution under multi-attribute association is calculated from the connection tensors to obtain a scoring vector for each feature space, and the outer product of the feature space scoring vectors gives a scoring tensor; the feature space combination vectors and the scoring tensor are introduced into the high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is calculated from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination vectors; multi-perspective clustering results under different views are obtained from the view matrix. This solves the technical problem in the prior art that different clustering results cannot be generated for different situations to provide high-quality clustering services for upper-layer big data applications, and achieves the following technical effects: the influence of the fused interaction of multiple modal features on the clustering result is considered simultaneously, giving better clustering performance than single-view clustering; the required feature spaces can be selected flexibly according to the needs of applications in different situations, and multiple clustering results are generated according to the tensor element mapping relations to provide high-quality clustering services for different applications, with better interpretability of the clustering results; distances between data samples in high-order space are measured more effectively, which suits the calculation of sample similarity for cross-domain heterogeneous big data; and the influence of important attributes on the clustering result is increased while the influence of noisy attributes is weakened, so that clustering quality is better than without scoring.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, and so that the above and other objects, features and advantages of the invention are easier to understand, specific embodiments of the invention are set out below.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the scoring learning algorithm provided in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of the tensor multi-perspective clustering algorithm provided in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another tensor-based multi-perspective clustering device for cross-domain heterogeneous big data in an embodiment of the present invention.
Reference numerals: bus 300, receiver 301, processor 302, transmitter 303, memory 304, bus interface 306.
Detailed description of the embodiments
Embodiments of the present invention provide a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data. The general idea of the technical solution is as follows: sample tensors are constructed by fusing cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the order corresponding to each feature space to obtain connection tensors; the stationary distribution under multi-attribute association is calculated from the connection tensors to obtain a scoring vector for each feature space, and the outer product of the scoring vectors gives a scoring tensor; the feature space combination vectors and the scoring tensor are introduced into the high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is calculated from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination; multi-perspective clustering results under different views are obtained from the view matrix. This solves the technical problem in the prior art that different clustering results cannot be generated for different situations to provide high-quality clustering services for upper-layer big data applications, and achieves the following technical effects: the influence of the fused interaction of multiple modal features on the clustering result is considered simultaneously, giving better clustering performance than single-view clustering; the required feature spaces can be selected flexibly according to the needs of applications in different situations, and multiple clustering results are generated according to the tensor element mapping relations to provide high-quality clustering services for different applications, with better interpretability of the clustering results; distances between data samples in high-order space are measured more effectively, which suits the calculation of sample similarity for cross-domain heterogeneous big data; and the influence of important attributes on the clustering result is increased while the influence of noisy attributes is weakened, so that clustering quality is better than without scoring.
The technical solution of the present invention is described in detail below with reference to the drawings and specific embodiments. It should be understood that the embodiments of the present application and the specific features in them are a detailed description of the technical solution of the present application, not a limitation of it; where no conflict arises, the embodiments and the technical features in them may be combined with one another.
Embodiment 1
Fig. 1 is a schematic flowchart of a tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step 110: construct sample tensors by fusing cross-domain heterogeneous feature spaces, and construct feature space combination vectors according to different situational contexts.
Further, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Specifically, referring to Fig. 4, sample tensors are constructed by fusing the cross-domain heterogeneous feature spaces, where $F_1, F_2, \ldots, F_l$ denote the $l$ feature spaces. Given the high-dimensional, multi-source and heterogeneous nature of big data, using tensors to fuse cross-domain heterogeneous multi-source information and mine the intrinsic structure of the data takes the fused interaction of multiple modal features into account simultaneously and gives better clustering performance than single-view clustering. The cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space. Feature space combination vectors $v_1, v_2, \ldots, v_m \in \{0,1\}^l$ are constructed according to different situational contexts.
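As a rough illustration of step 110, the sketch below builds one sample tensor and lists candidate combination vectors. Because the construction formula is not reproduced in this text, the sketch assumes that a sample tensor is the outer product of one feature vector per feature space, stored sparsely as a dict keyed by index tuples; `sample_tensor` and `combination_vectors` are hypothetical names. Enumerating every non-empty vector in $\{0,1\}^l$ only shows the space of choices; an application would pick the $m$ vectors that match its contexts.

```python
from itertools import product
from math import prod


def sample_tensor(features):
    """Sparse outer product of one feature vector per feature space.

    Assumption: the patent's own construction is not shown in the text,
    so the outer product is used here purely for illustration.
    """
    tensor = {}
    for idx in product(*(range(len(f)) for f in features)):
        value = prod(f[i] for f, i in zip(features, idx))
        if value != 0:            # keep only non-zero entries
            tensor[idx] = value
    return tensor


def combination_vectors(l):
    """All non-empty selections v in {0,1}^l of the l feature spaces."""
    return [v for v in product((0, 1), repeat=l) if any(v)]
```

For example, with three feature spaces the vector `(1, 0, 1)` would select the first and third space and ignore the second.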
Step 120: accumulate the sample tensors to obtain a merged tensor.
Step 130: normalize the merged tensor along the order corresponding to each feature space to obtain connection tensors.
Specifically, the sample tensors are accumulated to obtain the merged tensor, and the merged tensor is normalized along the order corresponding to each feature space to obtain the connection tensors.
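Steps 120 and 130 can be sketched as follows, assuming sparse tensors stored as dicts keyed by index tuples with non-negative entries. `merge`, `normalize_along` and `connection_tensors` are hypothetical helpers, and dividing each entry by the sum of its fiber along the chosen order (mode) is an assumed reading of "normalization", since the formula is not shown here.

```python
def merge(sample_tensors):
    """Accumulate (sum) sparse sample tensors entry by entry."""
    merged = {}
    for tensor in sample_tensors:
        for idx, val in tensor.items():
            merged[idx] = merged.get(idx, 0.0) + val
    return merged


def normalize_along(tensor, mode):
    """Scale entries so that every fiber along `mode` sums to 1 (assumed rule)."""
    sums = {}
    for idx, val in tensor.items():
        key = idx[:mode] + idx[mode + 1:]     # the index with `mode` removed
        sums[key] = sums.get(key, 0.0) + val
    out = {}
    for idx, val in tensor.items():
        total = sums[idx[:mode] + idx[mode + 1:]]
        if total != 0:
            out[idx] = val / total
    return out


def connection_tensors(merged, order):
    """One connection tensor per order of the merged tensor."""
    return [normalize_along(merged, k) for k in range(order)]
```

Under this reading, an order-$l$ merged tensor yields $l$ connection tensors, one per feature space.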
Step 140: calculate, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and take the outer product of the feature space scoring vectors to obtain a scoring tensor.
Further, calculating the stationary distribution under multi-attribute association from the connection tensors to obtain the scoring vector of each feature space, and taking the outer product of the scoring vectors to obtain the scoring tensor, further includes: obtaining l connection tensors, where l is a positive integer; initializing a probability parameter and a threshold parameter; selecting an initial vector and a random vector; taking the single-mode product of each of the l connection tensors with the initial vector and the random vector; judging, for each, whether the error between two successive scoring vectors is less than the threshold parameter; when the error between two successive scoring vectors is less than the threshold parameter, obtaining l scoring vectors; truncating the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors; and taking the outer product of the feature space scoring vectors to obtain the scoring tensor.
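The iteration described above might look like the sketch below. The update rule is an assumption modeled on a PageRank-style random walk with restart: each scoring vector is a mix, weighted by the probability parameter `mu`, of the connection tensor contracted with the other scoring vectors and a fixed restart vector that stands in for the "random vector". Function names, the uniform initial and restart vectors, and the default values are all illustrative, not taken from the patent.

```python
def contract_except(tensor, vectors, mode):
    """Weight each entry by the scores of all other modes, summed into `mode`."""
    out = [0.0] * len(vectors[mode])
    for idx, val in tensor.items():
        weight = val
        for m, i in enumerate(idx):
            if m != mode:
                weight *= vectors[m][i]
        out[idx[mode]] += weight
    return out


def stationary_scores(conn, dims, mu=0.85, delta=1e-9, max_iter=1000):
    """Iterate until successive scoring vectors differ by less than delta.

    conn[k] is the connection tensor normalized along mode k (sparse dict).
    """
    x = [[1.0 / d] * d for d in dims]        # initial vectors (assumed uniform)
    restart = [[1.0 / d] * d for d in dims]  # stand-in for the random vectors
    for _ in range(max_iter):
        new = []
        for k in range(len(dims)):
            walk = contract_except(conn[k], x, k)
            new.append([mu * w + (1.0 - mu) * r
                        for w, r in zip(walk, restart[k])])
        err = max(abs(a - b) for old, cur in zip(x, new)
                  for a, b in zip(old, cur))
        x = new
        if err < delta:                      # threshold test on successive vectors
            break
    return x
```

With connection tensors whose fibers sum to 1, each scoring vector stays a probability distribution throughout the iteration.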
Specifically, the stationary distribution of the connection tensors under multi-attribute association is calculated to obtain the scoring vector of each feature space, and the outer product of the scoring vectors gives the scoring tensor. Referring to Fig. 3, the scoring learning algorithm provided in this embodiment includes: obtaining $l$ connection tensors; initializing the probability parameter $\mu$ and the threshold parameter $\delta$; taking the single-mode product of each of the $l$ connection tensors with the initial vector and the random vector; and judging, for each, whether the error between two successive scoring vectors is less than the threshold parameter $\delta$, at which point the $l$ scoring vectors are obtained. By introducing the feature space combination coefficients into the high-dimensional tensor distance, the required feature spaces can be selected flexibly according to the needs of applications in different situations, and multiple clustering results are generated according to the tensor element mapping relations to provide high-quality clustering services for different applications, with better interpretability of the clustering results. The $l$ scoring vectors are truncated according to the feature space dimensions to obtain the feature space scoring vectors $e_1, e_2, \ldots, e_l$, and the outer product of $e_1, e_2, \ldots, e_l$ gives the scoring tensor.
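Truncation and the outer product can be sketched as below, reading "truncated according to the feature space dimensions" as keeping the first $d_k$ entries of the $k$-th scoring vector, which is an assumption. `scoring_tensor` is a hypothetical name, and the result is a dict keyed by index tuples.

```python
from itertools import product
from math import prod


def scoring_tensor(score_vectors, dims):
    """Truncate each scoring vector to its feature space size, then take
    the outer product of the truncated vectors e_1, ..., e_l."""
    e = [vec[:d] for vec, d in zip(score_vectors, dims)]   # assumed truncation
    return {idx: prod(ek[i] for ek, i in zip(e, idx))
            for idx in product(*(range(d) for d in dims))}
```

Each entry of the scoring tensor is therefore the product of the scores of the coordinates it combines, so a coordinate with a low score in any one feature space is down-weighted.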
Step 150: introduce the feature space combination vectors and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance.
Step 160: calculate sample similarity from the combined scoring tensor distance, and construct a view matrix according to the feature space combination vectors.
Step 170: obtain multi-perspective clustering results under different views from the view matrix.
Further, the view matrix is input to a typical clustering algorithm to obtain the multi-perspective clustering results.
Specifically, the feature space combination vectors and the scoring tensor are introduced into the high-dimensional tensor distance to construct the combined scoring tensor distance, where the combined scoring tensor distance formula is:
Sample similarity is calculated from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination vectors. The view matrix is input to a typical clustering algorithm (the fast-search density peaks clustering algorithm) to obtain the multi-perspective clustering results $cl_1, cl_2, \ldots, cl_m$, providing high-quality clustering services for upper-layer big data applications. The combined scoring tensor distance takes into account the complex relationships between different coordinates, so it measures distances between data samples in high-order space more effectively and suits the calculation of sample similarity for cross-domain heterogeneous big data. In addition, introducing the feature space scoring coefficients into the combined scoring tensor distance increases the influence of important attributes on the clustering result while weakening the influence of noisy attributes, so that clustering quality is better than without scoring.
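Since the distance formula is not reproduced in this text, the following is only a guess at its general shape, not the patent's definition. It assumes a tensor distance with a Gaussian metric over coordinate positions, weights each coordinate difference by the matching scoring tensor entry, and sums out the orders whose combination vector entry is 0. Similarity is taken as `exp(-d)`, which is also an assumption; all names and the `sigma` parameter are hypothetical.

```python
import math


def marginalize(tensor, v):
    """Sum out every mode k with v[k] == 0, keeping selected feature spaces."""
    out = {}
    for idx, val in tensor.items():
        key = tuple(i for i, keep in zip(idx, v) if keep)
        out[key] = out.get(key, 0.0) + val
    return out


def combined_scored_distance(x, y, score, v, sigma=1.0):
    """Assumed form: score-weighted differences under a Gaussian metric."""
    diff = {idx: score.get(idx, 0.0) * (x.get(idx, 0.0) - y.get(idx, 0.0))
            for idx in set(x) | set(y)}
    diff = marginalize(diff, v)
    total = 0.0
    for p, dp in diff.items():
        for q, dq in diff.items():
            sq = sum((a - b) ** 2 for a, b in zip(p, q))
            # Metric coefficient: nearby coordinates interact more strongly.
            g = math.exp(-sq / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            total += g * dp * dq
    return math.sqrt(max(total, 0.0))   # clamp tiny negative rounding errors


def view_matrix(samples, score, v, sigma=1.0):
    """Pairwise similarity matrix for one combination vector v."""
    n = len(samples)
    view = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = combined_scored_distance(samples[i], samples[j], score, v, sigma)
            view[i][j] = view[j][i] = math.exp(-d)
    return view
```

Calling `view_matrix` once for each of the $m$ combination vectors would give $m$ view matrices, one per clustering result.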
Embodiment 2
Based on the same inventive concept as the tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in the foregoing embodiment, the present invention also provides a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data. As shown in Fig. 2, the device includes:
a first construction unit, configured to construct sample tensors by fusing cross-domain heterogeneous feature spaces, and to construct feature space combination vectors according to different situational contexts;
a first obtaining unit, configured to accumulate the sample tensors to obtain a merged tensor;
a second obtaining unit, configured to normalize the merged tensor along the order corresponding to each feature space to obtain connection tensors;
a third obtaining unit, configured to calculate, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and to take the outer product of the feature space scoring vectors to obtain a scoring tensor;
a second construction unit, configured to introduce the feature space combination vectors and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance;
a third construction unit, configured to calculate sample similarity from the combined scoring tensor distance, and to construct a view matrix according to the feature space combination vectors;
a fourth obtaining unit, configured to obtain multi-perspective clustering results under different views from the view matrix.
Further, in the first construction unit, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Further, the third obtaining unit, which calculates the stationary distribution under multi-attribute association from the connection tensors to obtain the scoring vector of each feature space and takes the outer product of the scoring vectors to obtain the scoring tensor, further includes:
a fifth obtaining unit, configured to obtain l connection tensors, where l is a positive integer;
a first execution unit, configured to initialize a probability parameter and a threshold parameter;
a second execution unit, configured to select an initial vector and a random vector;
a third execution unit, configured to take the single-mode product of each of the l connection tensors with the initial vector and the random vector;
a first judging unit, configured to judge, for each, whether the error between two successive scoring vectors is less than the threshold parameter;
a sixth obtaining unit, configured to obtain l scoring vectors when the error between two successive scoring vectors is less than the threshold parameter;
a seventh obtaining unit, configured to truncate the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors;
an eighth obtaining unit, configured to take the outer product of the feature space scoring vectors to obtain the scoring tensor.
Further, the fourth obtaining unit, which obtains the multi-perspective clustering results under different views from the view matrix, further includes:
a ninth obtaining unit, configured to input the view matrix to a typical clustering algorithm to obtain the multi-perspective clustering results.
The variations and specific examples of the tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in Embodiment 1 (Fig. 1) apply equally to the tensor-based multi-perspective clustering device of this embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand how the device of this embodiment is implemented, so for brevity the details are not repeated here.
Embodiment 3
Based on the same inventive concept as the tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in the foregoing embodiment, the present invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of any of the tensor-based multi-perspective clustering methods for cross-domain heterogeneous big data described above.
In Fig. 5, the bus architecture is represented by bus 300. Bus 300 may include any number of interconnected buses and bridges, and links together various circuits including one or more processors, represented by processor 302, and memory, represented by memory 304. Bus 300 may also link together various other circuits such as peripheral devices, voltage regulators and power management circuits, all of which are well known in the art and are therefore not described further here. Bus interface 306 provides an interface between bus 300 and the receiver 301 and transmitter 303. The receiver 301 and transmitter 303 may be the same element, namely a transceiver, providing a unit for communicating with various other devices over a transmission medium.
Processor 302 is responsible for managing bus 300 and general processing, and memory 304 may be used to store information used by processor 302 when performing operations.
One or more of the above technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
The tensor-based cross-domain heterogeneous big data multi-view clustering method and device provided by the embodiments of the present application construct sample tensors from the fused cross-domain heterogeneous feature spaces, and construct feature space combination vectors according to different situational contexts; accumulate the sample tensors to obtain a merged tensor; normalize the merged tensor along the order corresponding to each feature space to obtain a connection tensor; compute, from the connection tensor, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and take the outer product of the scoring vectors to obtain a scoring tensor; introduce the feature space combination vector and the scoring tensor into the high-dimensional-space tensor distance to construct a combined scoring tensor distance; compute sample similarity from the combined scoring tensor distance, and construct a view matrix according to the feature space combination vector; and obtain the multi-view clustering results under different views from the view matrix.
This solves the technical problem in the prior art that different clustering results cannot be generated for different situations, so that high-quality clustering services cannot be provided to upper-layer big data applications. The following technical effects are achieved:
the influence of the fused interaction of multiple modal features on the clustering result can be considered simultaneously, giving better clustering performance than single-view clustering;
the required feature spaces can be selected flexibly according to application demands in different situations and, through the tensor element mapping relationship, multiple clustering results are generated to provide high-quality clustering services to different applications, with better interpretability of the clustering results;
the distance between data samples in high-order space is measured more effectively, which suits the calculation of sample similarity for cross-domain heterogeneous big data;
the influence of important attributes on the clustering result is strengthened while the influence of noisy attributes is weakened, so that clustering quality is better than when no scoring is applied.
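For illustration only, the sketch below shows one way a scoring tensor and a feature space combination vector could enter a sample distance and yield a view matrix per combination. The exact combined scoring tensor distance is not reproduced in this passage, so the sketch substitutes a simplified form: a score-weighted Euclidean distance in which feature spaces not selected by the combination vector are summed out. That substitution, the `exp(-d)` similarity kernel, the function names `combined_distance` and `view_matrix`, and all sample values are assumptions.

```python
import numpy as np

def combined_distance(x, y, score_vectors, combination):
    """Score-weighted Euclidean distance over the feature spaces flagged in `combination`."""
    dropped = tuple(k for k, keep in enumerate(combination) if not keep)
    diff = x - y
    if dropped:
        diff = diff.sum(axis=dropped)        # collapse feature spaces left out of this view
    weights = np.ones(())                    # scalar 1; gains one order per kept feature space
    for k, keep in enumerate(combination):
        if keep:
            weights = np.multiply.outer(weights, score_vectors[k])
    return float(np.sqrt((weights * diff ** 2).sum()))

def view_matrix(samples, score_vectors, combination):
    """Pairwise similarity exp(-distance) between samples for one combination vector."""
    n = len(samples)
    view = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            d = combined_distance(samples[i], samples[j], score_vectors, combination)
            view[i, j] = np.exp(-d)          # similarity kernel is an assumption
    return view

# Three hypothetical samples over two feature spaces of size 2 each.
samples = [np.array([[1.0, 0.0], [0.0, 0.0]]),
           np.array([[0.0, 1.0], [0.0, 0.0]]),
           np.array([[0.0, 0.0], [1.0, 0.0]])]
scores = [np.array([0.5, 0.5]), np.array([0.9, 0.1])]   # hypothetical scoring vectors

view_space0 = view_matrix(samples, scores, combination=(1, 0))
view_space1 = view_matrix(samples, scores, combination=(0, 1))
```

In this example samples 0 and 1 coincide when only the first feature space is selected, whereas samples 0 and 2 coincide when only the second is selected, so the two view matrices would lead a clustering algorithm to different groupings of the same data.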
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.
Claims (6)
1. A tensor-based cross-domain heterogeneous big data multi-view clustering method, characterized in that the method comprises:
constructing sample tensors from the fused cross-domain heterogeneous feature spaces, and constructing a feature space combination vector according to different situational contexts;
accumulating the sample tensors to obtain a merged tensor;
normalizing the merged tensor along the order corresponding to each feature space to obtain a connection tensor;
computing, from the connection tensor, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor;
introducing the feature space combination vector and the scoring tensor into the high-dimensional-space tensor distance to construct a combined scoring tensor distance;
computing sample similarity from the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vector;
obtaining the multi-view clustering results under different views from the view matrix.
2. The method according to claim 1, characterized in that the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space, and social space.
3. The method according to claim 1, characterized in that computing, from the connection tensor, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor, comprises:
obtaining l connection tensors, where l is a positive integer;
initializing a probability parameter and a threshold parameter;
selecting initial vectors and a random vector;
taking the single-mode product of each of the l connection tensors with the initial vectors and the random vector;
determining, for each scoring vector, whether the error between two successive iterates is less than the threshold parameter;
when the error between two successive scoring vectors is less than the threshold parameter, obtaining l scoring vectors;
truncating the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors;
taking the outer product of the feature space scoring vectors to obtain the scoring tensor.
4. The method according to claim 1, characterized in that obtaining the multi-view clustering results under different views from the view matrix comprises:
inputting the view matrix into a typical clustering algorithm to obtain the multi-view clustering result.
5. A tensor-based cross-domain heterogeneous big data multi-view clustering device, characterized in that the device comprises:
a first construction unit, configured to construct sample tensors from the fused cross-domain heterogeneous feature spaces, and to construct a feature space combination vector according to different situational contexts;
a first obtaining unit, configured to accumulate the sample tensors to obtain a merged tensor;
a second obtaining unit, configured to normalize the merged tensor along the order corresponding to each feature space to obtain a connection tensor;
a third obtaining unit, configured to compute, from the connection tensor, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and to take the outer product of the feature space scoring vectors to obtain a scoring tensor;
a second construction unit, configured to introduce the feature space combination vector and the scoring tensor into the high-dimensional-space tensor distance to construct a combined scoring tensor distance;
a third construction unit, configured to compute sample similarity from the combined scoring tensor distance, and to construct a view matrix according to the feature space combination vector;
a fourth obtaining unit, configured to obtain the multi-view clustering results under different views from the view matrix.
6. A tensor-based cross-domain heterogeneous big data multi-view clustering device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the following steps:
constructing sample tensors from the fused cross-domain heterogeneous feature spaces, and constructing a feature space combination vector according to different situational contexts;
accumulating the sample tensors to obtain a merged tensor;
normalizing the merged tensor along the order corresponding to each feature space to obtain a connection tensor;
computing, from the connection tensor, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor;
introducing the feature space combination vector and the scoring tensor into the high-dimensional-space tensor distance to construct a combined scoring tensor distance;
computing sample similarity from the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vector;
obtaining the multi-view clustering results under different views from the view matrix.
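By way of non-limiting illustration, the iterative scoring procedure recited in claim 3 can be sketched as follows. The claims do not state the update formula, so the sketch assumes a restart-style fixed-point iteration in which each connection tensor is contracted with the current vectors of the other feature spaces. Here `alpha` stands in for the probability parameter, `tol` for the threshold parameter, and a uniform `restart` vector for the random vector; these choices, the uniform fallback for all-zero fibers, and all function names are assumptions. The truncation step is omitted because each vector already has the length of its feature space.

```python
import numpy as np

def connection_tensor(merged, mode):
    """Normalize a non-negative tensor so that its entries sum to 1 along `mode`."""
    totals = merged.sum(axis=mode, keepdims=True)
    safe = np.where(totals > 0, totals, 1.0)
    # Fibers that sum to zero fall back to a uniform distribution (assumption).
    return np.where(totals > 0, merged / safe, 1.0 / merged.shape[mode])

def scoring_vectors(merged, alpha=0.85, tol=1e-10, max_iter=1000):
    """Iterate until successive scoring vectors differ by less than `tol`."""
    order = merged.ndim
    tensors = [connection_tensor(merged, k) for k in range(order)]
    restart = [np.full(n, 1.0 / n) for n in merged.shape]   # stand-in for the random vector
    x = [r.copy() for r in restart]                          # initial vectors
    for _ in range(max_iter):
        new_x = []
        for k in range(order):
            t = tensors[k]
            # Contract every mode except k, highest axis first so lower indices stay valid.
            for m in reversed(range(order)):
                if m != k:
                    t = np.tensordot(t, x[m], axes=([m], [0]))
            new_x.append(alpha * t + (1 - alpha) * restart[k])
        err = max(np.abs(n - o).sum() for n, o in zip(new_x, x))
        x = new_x
        if err < tol:
            break
    return x

def scoring_tensor(vectors):
    """Outer product of the per-feature-space scoring vectors."""
    out = vectors[0]
    for v in vectors[1:]:
        out = np.multiply.outer(out, v)
    return out

# Hypothetical merged tensor over three feature spaces; index 0 of the first dominates.
merged = np.ones((2, 3, 2))
merged[0] += 4.0

scores = scoring_vectors(merged)
score_tensor = scoring_tensor(scores)
```

Because every connection tensor sums to 1 along its own mode and the restart vector is a probability vector, each scoring vector remains a probability vector throughout the iteration. Convergence of this kind of iteration is not guaranteed for every tensor and every `alpha`, which is why the sketch caps the number of iterations.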
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810970444.6A CN109242007A (en) | 2018-08-24 | 2018-08-24 | A kind of cross-domain isomery big data multi-angle of view clustering method and device based on tensor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810970444.6A CN109242007A (en) | 2018-08-24 | 2018-08-24 | A kind of cross-domain isomery big data multi-angle of view clustering method and device based on tensor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109242007A (en) | 2019-01-18 |
Family
ID=65068306
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810970444.6A Pending CN109242007A (en) | 2018-08-24 | 2018-08-24 | A kind of cross-domain isomery big data multi-angle of view clustering method and device based on tensor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109242007A (en) |
-
2018
- 2018-08-24 CN CN201810970444.6A patent/CN109242007A/en active Pending
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110414150A (en) * | 2019-07-30 | 2019-11-05 | 华东交通大学 | A kind of tensor subspace continuous system recognition methods of bridge time-varying |
CN110414150B (en) * | 2019-07-30 | 2021-06-22 | 四川省公路规划勘察设计研究院有限公司 | Tensor subspace continuous system identification method of bridge time-varying system |
CN111242318A (en) * | 2020-01-13 | 2020-06-05 | 拉扎斯网络科技(上海)有限公司 | Business model training method and device based on heterogeneous feature library |
CN111242318B (en) * | 2020-01-13 | 2024-04-26 | 拉扎斯网络科技(上海)有限公司 | Service model training method and device based on heterogeneous feature library |
WO2023221275A1 (en) * | 2022-05-17 | 2023-11-23 | 中山大学 | Node classification method and system based on tensor graph convolutional network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190118 |