CN109242007A - Tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data - Google Patents

Tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data

Info

Publication number
CN109242007A
Authority
CN
China
Prior art keywords
tensor
scoring
vector
feature space
space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810970444.6A
Other languages
Chinese (zh)
Inventor
杨天若
赵雅靓
张荣皓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Ezhou Institute of Industrial Technology Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Ezhou Institute of Industrial Technology Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology and Ezhou Institute of Industrial Technology Huazhong University of Science and Technology
Priority to CN201810970444.6A
Publication of CN109242007A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/22 - Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data. A sample tensor is constructed from the fused cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the mode corresponding to each feature space to obtain connection tensors; a scoring vector for each feature space is obtained from the connection tensors, and the outer product of the feature space scoring vectors gives a scoring tensor; the feature space combination vectors and the scoring tensor are introduced into a high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is computed from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination vectors; multi-perspective clustering results under the different views are obtained from the view matrix. This solves the technical problem that the prior art cannot generate different clustering results according to the needs of applications in different situations, and so cannot provide high-quality clustering services for different applications.

Description

Tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data
Technical field
The present invention relates to the technical field of information processing, and in particular to a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data.
Background
Multi-perspective cluster analysis is an emerging research field in data mining. It explores an unknown data set from different perspectives, allowing multiple clustering processes that yield one or more clustering results. Compared with traditional single clustering, which explores an unknown data set from one perspective and produces only one clustering result, it better matches the diverse way in which humans view the world. Performing multi-perspective cluster analysis on big data can therefore uncover all the structures in the data and serve people better.
Existing multi-perspective clustering techniques mainly include multi-view clustering, alternative clustering and subspace clustering. Multi-view clustering fuses multi-source information to mine the intrinsic structure of the data and performs better than single-view clustering. However, it can only learn a single clustering result from the multi-source information and cannot select different feature combinations to produce different clustering results from multiple angles. Alternative clustering mines different patterns in the data and offers the user several different clustering results to choose from, but it attends only to the diversity among those results, offers no way to interpret their meaning, and cannot fuse multi-view information to improve clustering performance. Subspace clustering targets high-dimensional data and can find good clusters in extracted subspaces, but it cannot produce multiple different clustering results from different viewpoints on the data.
Prior-art multi-perspective clustering techniques cannot, on the basis of fused cross-domain multi-view information, let the user select different combinations of data features according to different situational contexts. They therefore cannot generate different clustering results for different situations, and so cannot provide high-quality clustering services for upper-layer big data applications.
Summary of the invention
Embodiments of the present invention provide a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data, which solve the technical problem that the prior art cannot generate different clustering results for different situations and so cannot provide high-quality clustering services for upper-layer big data applications.
In view of the above problem, the embodiments of the present application are proposed to provide a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data.
In a first aspect, the present invention provides a tensor-based multi-perspective clustering method for cross-domain heterogeneous big data, the method comprising:
constructing a sample tensor from the fused cross-domain heterogeneous feature spaces, and constructing feature space combination vectors according to different situational contexts; accumulating the sample tensors to obtain a merged tensor; normalizing the merged tensor along the mode corresponding to each feature space to obtain connection tensors; computing the stationary distribution of the connection tensors under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor; introducing the feature space combination vectors and the scoring tensor into a high-dimensional tensor distance to construct a combined scoring tensor distance; computing sample similarity from the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vectors; and obtaining multi-perspective clustering results under the different views from the view matrix.
Preferably, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Preferably, computing the stationary distribution of the connection tensors under multi-attribute association to obtain the scoring vector for each feature space, and taking the outer product of the scoring vectors to obtain the scoring tensor, further comprises: obtaining l connection tensors, l being a positive integer; initializing a probability parameter and a threshold parameter; selecting an initial vector and a random vector; taking the single-mode products of the l connection tensors with the initial vector and the random vector respectively; judging, for each, whether the error between two successive scoring vectors is less than the threshold parameter; when the error between two successive scoring vectors is less than the threshold parameter, obtaining l scoring vectors; truncating the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors; and taking the outer product of the feature space scoring vectors to obtain the scoring tensor.
Preferably, obtaining the multi-perspective clustering results under the different views from the view matrix further comprises: inputting the view matrix into a typical clustering algorithm to obtain the multi-perspective clustering results.
In a second aspect, the present invention provides a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data, the device comprising:
a first construction unit, configured to construct a sample tensor from the fused cross-domain heterogeneous feature spaces and to construct feature space combination vectors according to different situational contexts;
a first obtaining unit, configured to accumulate the sample tensors to obtain a merged tensor;
a second obtaining unit, configured to normalize the merged tensor along the mode corresponding to each feature space to obtain connection tensors;
a third obtaining unit, configured to compute the stationary distribution of the connection tensors under multi-attribute association, obtain a scoring vector for each feature space, and take the outer product of the feature space scoring vectors to obtain a scoring tensor;
a second construction unit, configured to introduce the feature space combination vectors and the scoring tensor into a high-dimensional tensor distance to construct a combined scoring tensor distance;
a third construction unit, configured to compute sample similarity from the combined scoring tensor distance and to construct a view matrix according to the feature space combination vectors;
a fourth obtaining unit, configured to obtain multi-perspective clustering results under the different views from the view matrix.
Preferably, in the first construction unit, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Preferably, the third obtaining unit, which computes the stationary distribution of the connection tensors under multi-attribute association, obtains the scoring vector for each feature space and takes the outer product of the scoring vectors to obtain the scoring tensor, further comprises:
a fifth obtaining unit, configured to obtain l connection tensors, l being a positive integer;
a first execution unit, configured to initialize a probability parameter and a threshold parameter;
a second execution unit, configured to select an initial vector and a random vector;
a third execution unit, configured to take the single-mode products of the l connection tensors with the initial vector and the random vector respectively;
a first judging unit, configured to judge, for each, whether the error between two successive scoring vectors is less than the threshold parameter;
a sixth obtaining unit, configured to obtain l scoring vectors when the error between two successive scoring vectors is less than the threshold parameter;
a seventh obtaining unit, configured to truncate the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors;
an eighth obtaining unit, configured to take the outer product of the feature space scoring vectors to obtain the scoring tensor.
Preferably, the fourth obtaining unit, which obtains the multi-perspective clustering results under the different views from the view matrix, further comprises:
a ninth obtaining unit, configured to input the view matrix into a typical clustering algorithm to obtain the multi-perspective clustering results.
In a third aspect, the present invention provides a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, performs the following steps: constructing a sample tensor from the fused cross-domain heterogeneous feature spaces, and constructing feature space combination vectors according to different situational contexts; accumulating the sample tensors to obtain a merged tensor; normalizing the merged tensor along the mode corresponding to each feature space to obtain connection tensors; computing the stationary distribution of the connection tensors under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor; introducing the feature space combination vectors and the scoring tensor into a high-dimensional tensor distance to construct a combined scoring tensor distance; computing sample similarity from the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vectors; and obtaining multi-perspective clustering results under the different views from the view matrix.
The one or more technical solutions in the embodiments of the present application have at least the following one or more technical effects:
In the tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data provided by the embodiments of the present application, a sample tensor is constructed from the fused cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the mode corresponding to each feature space to obtain connection tensors; the stationary distribution of the connection tensors under multi-attribute association is computed to obtain a scoring vector for each feature space, and the outer product of the feature space scoring vectors gives a scoring tensor; the feature space combination vectors and the scoring tensor are introduced into a high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is computed from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination vectors; multi-perspective clustering results under the different views are obtained from the view matrix. This solves the technical problem that the prior art cannot generate different clustering results for different situations and so cannot provide high-quality clustering services for upper-layer big data applications. The following technical effects are achieved: the influence of the fused interaction of multiple modal features on the clustering result is considered simultaneously, giving better clustering performance than single-view clustering; the required feature spaces can be selected flexibly according to the needs of applications in different situations, and multiple clustering results are generated according to the tensor element mapping relationship, providing high-quality clustering services for different applications with more interpretable results; the distance between data samples in high-order space is measured more effectively, which suits similarity computation for cross-domain heterogeneous big data samples; and the influence of important attributes on the clustering result is increased while that of noisy attributes is weakened, so that clustering quality is better than without scoring.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the scoring learning algorithm provided in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of the tensor multi-perspective clustering algorithm provided in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another tensor-based multi-perspective clustering device for cross-domain heterogeneous big data in an embodiment of the present invention.
Reference numerals: bus 300, receiver 301, processor 302, transmitter 303, memory 304, bus interface 306.
Detailed description of the embodiments
Embodiments of the present invention provide a tensor-based multi-perspective clustering method and device for cross-domain heterogeneous big data. The general idea of the technical solution is as follows: a sample tensor is constructed from the fused cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the mode corresponding to each feature space to obtain connection tensors; the stationary distribution of the connection tensors under multi-attribute association is computed to obtain a scoring vector for each feature space, and the outer product of the scoring vectors gives a scoring tensor; the feature space combination vectors and the scoring tensor are introduced into a high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is computed from the combined scoring tensor distance, and a view matrix is constructed according to the feature space combination; multi-perspective clustering results under the different views are obtained from the view matrix. This solves the technical problem that the prior art cannot generate different clustering results for different situations and so cannot provide high-quality clustering services for upper-layer big data applications. The following technical effects are achieved: the influence of the fused interaction of multiple modal features on the clustering result is considered simultaneously, giving better clustering performance than single-view clustering; the required feature spaces can be selected flexibly according to the needs of applications in different situations, and multiple clustering results are generated according to the tensor element mapping relationship, providing high-quality clustering services for different applications with more interpretable results; the distance between data samples in high-order space is measured more effectively, which suits similarity computation for cross-domain heterogeneous big data samples; and the influence of important attributes on the clustering result is increased while that of noisy attributes is weakened, so that clustering quality is better than without scoring.
The technical solution of the present invention is described in detail below with reference to the drawings and specific embodiments. It should be understood that the embodiments of the present application and the specific features in them are detailed explanations of the technical solution rather than limitations on it, and that, where no conflict arises, the technical features in the embodiments may be combined with one another.
Embodiment 1
Fig. 1 is a schematic flowchart of a tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in an embodiment of the present invention. As shown in Fig. 1, the method comprises:
Step 110: construct a sample tensor from the fused cross-domain heterogeneous feature spaces, and construct feature space combination vectors according to different situational contexts.
Further, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Specifically, referring to Fig. 4, a sample tensor is constructed from the fused cross-domain heterogeneous feature spaces, where $F_1, F_2, \ldots, F_l$ denote the l feature spaces. Given the high-dimensional, multi-source and heterogeneous nature of big data, tensors are used to fuse cross-domain heterogeneous multi-source information and mine the intrinsic structure of the data. This accounts simultaneously for the influence of the fused interaction of multiple modal features on the clustering result and gives better clustering performance than single-view clustering. The cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space. Feature space combination vectors $v_1, v_2, \ldots, v_m \in \{0,1\}^l$ are constructed according to different situational contexts.
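Step 110 can be illustrated with a minimal NumPy sketch. The construction formula for the sample tensor is not present in this text, so the sketch assumes that each sample carries one feature vector per feature space and that fusion is their outer product; the function name `sample_tensor`, the three example spaces and all values are hypothetical.

```python
from functools import reduce

import numpy as np


def sample_tensor(features):
    """Fuse one sample's per-space feature vectors into an order-l tensor.

    Assumption: fusion is the outer product of the per-space vectors.
    The source text does not reproduce its own construction formula.
    """
    return reduce(np.multiply.outer, features)


# Hypothetical sample described in l = 3 feature spaces
# (cyber, physical, social) of dimensions 2, 3 and 2.
cyber = np.array([1.0, 0.0])
physical = np.array([0.2, 0.5, 0.3])
social = np.array([0.0, 1.0])
X = sample_tensor([cyber, physical, social])  # shape (2, 3, 2)

# Feature space combination vectors v_1..v_m in {0,1}^l:
# entry k is 1 when feature space F_k takes part in that view.
V = np.array([
    [1, 1, 1],  # view that uses all three spaces
    [1, 0, 1],  # view that uses the cyber and social spaces
    [0, 1, 0],  # view that uses the physical space only
])
```

Each row of `V` corresponds to one situational context, so m contexts give m views over the same fused samples.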
Step 120: accumulate the sample tensors to obtain a merged tensor;
Step 130: normalize the merged tensor along the mode corresponding to each feature space to obtain connection tensors.
Specifically, the sample tensors are accumulated to obtain the merged tensor, and the merged tensor is normalized along the mode corresponding to each feature space to obtain the connection tensors.
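Steps 120 and 130 can be sketched as follows. The source does not spell out the normalization, so this sketch assumes that normalizing along mode k rescales every fibre along axis k to sum to 1, which yields one connection tensor per feature space; the uniform fallback for all-zero fibres and the function names are likewise assumptions.

```python
import numpy as np


def merge(sample_tensors):
    """Accumulate the sample tensors into one merged tensor."""
    return np.sum(sample_tensors, axis=0)


def connection_tensors(merged):
    """Return one normalized copy of `merged` per mode (feature space).

    Assumption: every fibre along axis k is rescaled to sum to 1.
    All-zero fibres are replaced by a uniform distribution so that
    each connection tensor stays stochastic along its own mode.
    """
    result = []
    for k in range(merged.ndim):
        totals = merged.sum(axis=k, keepdims=True)
        uniform = np.full_like(merged, 1.0 / merged.shape[k], dtype=float)
        safe = np.where(totals > 0, totals, 1.0)  # avoid division by zero
        result.append(np.where(totals > 0, merged / safe, uniform))
    return result


# Hypothetical data: five random order-3 sample tensors.
rng = np.random.default_rng(0)
samples = [rng.random((2, 3, 2)) for _ in range(5)]
A = merge(samples)
C = connection_tensors(A)  # l = 3 connection tensors
```

Under this reading, an order-l merged tensor gives exactly l connection tensors, which matches the l connection tensors consumed by the scoring step.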
Step 140: compute the stationary distribution of the connection tensors under multi-attribute association to obtain a scoring vector for each feature space, and take the outer product of the feature space scoring vectors to obtain a scoring tensor;
Further, computing the stationary distribution of the connection tensors under multi-attribute association to obtain the scoring vector for each feature space, and taking the outer product of the scoring vectors to obtain the scoring tensor, further comprises: obtaining l connection tensors, l being a positive integer; initializing a probability parameter and a threshold parameter; selecting an initial vector and a random vector; taking the single-mode products of the l connection tensors with the initial vector and the random vector respectively; judging, for each, whether the error between two successive scoring vectors is less than the threshold parameter; when the error between two successive scoring vectors is less than the threshold parameter, obtaining l scoring vectors; truncating the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors; and taking the outer product of the feature space scoring vectors to obtain the scoring tensor.
Specifically, the stationary distribution of the connection tensors under multi-attribute association is computed to obtain the scoring vector for each feature space, and the outer product of the scoring vectors gives the scoring tensor. Referring to Fig. 3, the scoring learning algorithm provided in this embodiment comprises: obtaining l connection tensors; initializing a probability parameter μ and a threshold parameter δ; taking the single-mode products of the l connection tensors with the initial vector and the random vector respectively; and judging, for each, whether the error between two successive scoring vectors is less than the threshold parameter δ, at which point l scoring vectors are obtained. By introducing feature space combination coefficients into the high-dimensional tensor distance, the required feature spaces can be selected flexibly according to the needs of applications in different situations, and multiple clustering results are generated according to the tensor element mapping relationship, providing high-quality clustering services for different applications with more interpretable results. The l scoring vectors are truncated according to the feature space dimensions to obtain the feature space scoring vectors $e_1, e_2, \ldots, e_l$, and the outer product of $e_1, e_2, \ldots, e_l$ gives the scoring tensor.
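The scoring learning loop can be sketched as a damped fixed-point iteration. The exact update rule is not given in this text, so the rule used here (mix the single-mode products with a fixed random vector using weight μ, and stop once successive iterates differ by less than δ in the 1-norm) is an assumption modelled on random-walk ranking, not the patented algorithm. The names `contract_all_but`, `learn_scores` and `scoring_tensor` and the default parameter values are hypothetical. Each scoring vector in this sketch already has the dimension of its feature space, so the truncation step is a no-op.

```python
from functools import reduce

import numpy as np


def contract_all_but(T, vectors, k):
    """Single-mode products of T with every vector except the k-th."""
    out = T
    # Contract from the last mode down so lower axis indices stay valid.
    for j in reversed(range(T.ndim)):
        if j != k:
            out = np.tensordot(out, vectors[j], axes=([j], [0]))
    return out


def learn_scores(connections, mu=0.5, delta=1e-8, max_iter=1000, seed=0):
    """Iterate l scoring vectors until successive iterates differ by < delta."""
    rng = np.random.default_rng(seed)
    dims = connections[0].shape
    # Initial vectors: uniform distribution over each feature space.
    x = [np.full(n, 1.0 / n) for n in dims]
    # Random vectors: fixed random distributions mixed in with weight 1 - mu.
    r = [v / v.sum() for v in (rng.random(n) for n in dims)]
    for _ in range(max_iter):
        new = [mu * contract_all_but(connections[k], x, k) + (1 - mu) * r[k]
               for k in range(len(dims))]
        err = max(np.abs(a - b).sum() for a, b in zip(new, x))
        x = new
        if err < delta:
            break
    return x


def scoring_tensor(scores):
    """Outer product of the feature space scoring vectors."""
    return reduce(np.multiply.outer, scores)


# Hypothetical connection tensors: one random tensor normalized per mode.
rng = np.random.default_rng(1)
A = rng.random((2, 3, 2))
C = [A / A.sum(axis=k, keepdims=True) for k in range(A.ndim)]
e = learn_scores(C)
S = scoring_tensor(e)  # shape (2, 3, 2)
```

Because each connection tensor is stochastic along its own mode, every iterate stays a probability vector. `max_iter` caps the loop, since this sketch makes no claim that the iteration converges for every choice of μ.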
Step 150: introduce the feature space combination vectors and the scoring tensor into a high-dimensional tensor distance to construct a combined scoring tensor distance;
Step 160: compute sample similarity from the combined scoring tensor distance, and construct a view matrix according to the feature space combination;
Step 170: obtain multi-perspective clustering results under the different views from the view matrix.
Further, the view matrix is input into a typical clustering algorithm to obtain the multi-perspective clustering results.
Specifically, the feature space combination vectors and the scoring tensor are introduced into the high-dimensional tensor distance to construct the combined scoring tensor distance, where the combined scoring tensor distance is given by the formula: [formula not present in the source text]
Sample similarity is computed from the combined scoring tensor distance, and the view matrix is constructed according to the feature space combination. The view matrix is input into a typical clustering algorithm (the fast-search density peaks clustering algorithm) to obtain the multi-perspective clustering results $cl_1, cl_2, \ldots, cl_m$, providing high-quality clustering services for upper-layer big data applications. The combined scoring tensor distance takes into account the complex relationships between different coordinates, so it measures the distance between data samples in high-order space more effectively and suits similarity computation for cross-domain heterogeneous big data samples. In addition, introducing the feature space scoring coefficients into the combined scoring tensor distance increases the influence of important attributes on the clustering result while weakening that of noisy attributes, so that clustering quality is better than without scoring.
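Since the distance formula itself is missing from this text, the following is only a simplified stand-in for Steps 150 and 160. It assumes that a combination vector selects modes by summing out the unselected ones, and that the scoring tensor acts as a per-entry weight in a Euclidean distance. The coordinate-coupling term that a full tensor distance would carry is deliberately omitted, the view matrix is stored as pairwise distances, and all function names are hypothetical.

```python
import numpy as np


def project(T, v):
    """Keep the modes selected by the 0/1 vector v and sum out the rest."""
    drop = tuple(k for k, keep in enumerate(v) if not keep)
    return T.sum(axis=drop) if drop else T


def combined_scoring_distance(X, Y, E, v):
    """Score-weighted Euclidean distance restricted to the selected spaces.

    Simplification: only per-entry weights from the scoring tensor E are
    applied; coupling between different coordinates is not modelled.
    """
    diff = project(X, v) - project(Y, v)
    return float(np.sqrt(np.sum(project(E, v) * diff ** 2)))


def view_matrix(samples, E, v):
    """Pairwise distance matrix of all samples under one view v."""
    n = len(samples)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = combined_scoring_distance(samples[i], samples[j], E, v)
            D[i, j] = D[j, i] = d
    return D


# Hypothetical samples that differ only in the second feature space.
a, c = np.array([0.3, 0.7]), np.array([0.6, 0.4])
X = np.einsum('i,j,k->ijk', a, np.array([1.0, 0.0, 0.0]), c)
Y = np.einsum('i,j,k->ijk', a, np.array([0.0, 0.0, 1.0]), c)
E = np.full((2, 3, 2), 1.0 / 12)  # uniform scoring tensor for illustration

D_all = view_matrix([X, Y], E, [1, 1, 1])   # the second space matters here
D_part = view_matrix([X, Y], E, [1, 0, 1])  # the second space is ignored
```

In this example the two samples are separated in the view that includes the second feature space and coincide in the view that excludes it, which is the kind of view-dependent behaviour the combination vectors are meant to provide.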
Embodiment 2
Based on the same inventive concept as the tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in the foregoing embodiment, the present invention also provides a tensor-based multi-perspective clustering device for cross-domain heterogeneous big data. As shown in Fig. 2, the device comprises:
a first construction unit, configured to construct a sample tensor from the fused cross-domain heterogeneous feature spaces and to construct feature space combination vectors according to different situational contexts;
a first obtaining unit, configured to accumulate the sample tensors to obtain a merged tensor;
a second obtaining unit, configured to normalize the merged tensor along the mode corresponding to each feature space to obtain connection tensors;
a third obtaining unit, configured to compute the stationary distribution of the connection tensors under multi-attribute association, obtain a scoring vector for each feature space, and take the outer product of the feature space scoring vectors to obtain a scoring tensor;
a second construction unit, configured to introduce the feature space combination vectors and the scoring tensor into a high-dimensional tensor distance to construct a combined scoring tensor distance;
a third construction unit, configured to compute sample similarity from the combined scoring tensor distance and to construct a view matrix according to the feature space combination vectors;
a fourth obtaining unit, configured to obtain multi-perspective clustering results under the different views from the view matrix.
Further, in the first construction unit, the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space and social space.
Further, the third obtaining unit, which computes the stationary distribution of the connection tensors under multi-attribute association, obtains the scoring vector for each feature space and takes the outer product of the scoring vectors to obtain the scoring tensor, further comprises:
a fifth obtaining unit, configured to obtain l connection tensors, l being a positive integer;
a first execution unit, configured to initialize a probability parameter and a threshold parameter;
a second execution unit, configured to select an initial vector and a random vector;
a third execution unit, configured to take the single-mode products of the l connection tensors with the initial vector and the random vector respectively;
a first judging unit, configured to judge, for each, whether the error between two successive scoring vectors is less than the threshold parameter;
a sixth obtaining unit, configured to obtain l scoring vectors when the error between two successive scoring vectors is less than the threshold parameter;
a seventh obtaining unit, configured to truncate the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors;
an eighth obtaining unit, configured to take the outer product of the feature space scoring vectors to obtain the scoring tensor.
Further, the fourth obtaining unit, which obtains the multi-perspective clustering results under the different views from the view matrix, further comprises:
a ninth obtaining unit, configured to input the view matrix into a typical clustering algorithm to obtain the multi-perspective clustering results.
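Embodiment 1 names the fast-search density peaks algorithm as the typical clustering algorithm. A compact sketch of that idea, operating on a pairwise distance matrix, is given below. The Gaussian density kernel, the percentile rule for the cutoff `dc` and the requirement that the caller supplies the number of clusters are choices made for this illustration, not details taken from the source.

```python
import numpy as np


def density_peaks(D, n_clusters, dc=None):
    """Cluster from a symmetric pairwise distance matrix D."""
    n = D.shape[0]
    if dc is None:
        # Hypothetical default: 20th percentile of the pairwise distances.
        dc = np.percentile(D[np.triu_indices(n, k=1)], 20)
    dc = max(float(dc), np.finfo(float).eps)
    # Local density with a Gaussian kernel (the self term is removed).
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)  # point indices by decreasing density
    delta = np.zeros(n)       # distance to the nearest denser point
    parent = np.full(n, -1)   # index of that nearest denser point
    delta[order[0]] = D[order[0]].max()
    for pos in range(1, n):
        i = order[pos]
        higher = order[:pos]
        j = higher[np.argmin(D[i, higher])]
        delta[i], parent[i] = D[i, j], j
    # Centres combine high density with a large distance to denser points.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always a centre
    centers = np.argsort(-gamma)[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:           # a denser parent is always labelled first
        if labels[i] < 0:
            labels[i] = labels[parent[i]]
    return labels


# Hypothetical data: two well-separated groups of 15 points each.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (15, 2)), rng.normal(10, 0.3, (15, 2))])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
labels = density_peaks(D, 2)
```

Running the function once per view matrix would give one label vector per view, which corresponds to the m clustering results of the method.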
The variations and specific examples of the tensor-based multi-perspective clustering method for cross-domain heterogeneous big data in Embodiment 1 (Fig. 1) apply equally to the tensor-based multi-perspective clustering device of this embodiment. From the foregoing detailed description of the method, those skilled in the art can clearly understand how the device in this embodiment is implemented, so, for brevity of the specification, the details are not repeated here.
Embodiment 3
Based on the same inventive concept as the tensor-based multi-view clustering method for cross-domain heterogeneous big data in the foregoing embodiments, the present invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of any of the tensor-based multi-view clustering methods for cross-domain heterogeneous big data described above.
In Fig. 5, the bus architecture is represented by bus 300. Bus 300 may include any number of interconnected buses and bridges, and links together various circuits including one or more processors, represented by processor 302, and memory, represented by memory 304. Bus 300 may also link together various other circuits such as peripheral devices, voltage regulators, and power management circuits; these are well known in the art and are therefore not described further here. Bus interface 306 provides an interface between bus 300 and receiver 301 and transmitter 303. Receiver 301 and transmitter 303 may be the same element, namely a transceiver, which provides a unit for communicating with various other devices over a transmission medium.
Processor 302 is responsible for managing bus 300 and general processing, and memory 304 may be used to store information used by processor 302 when performing operations.
The one or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
In the tensor-based multi-view clustering method and device for cross-domain heterogeneous big data provided by the embodiments of the present application, sample tensors are constructed by fusing cross-domain heterogeneous feature spaces, and feature space combination vectors are constructed according to different situational contexts; the sample tensors are accumulated to obtain a merged tensor; the merged tensor is normalized along the order corresponding to each feature space to obtain connection tensors; the stationary distribution under multi-attribute association is computed from the connection tensors to obtain a scoring vector for each feature space, and the outer product of the scoring vectors gives a scoring tensor; the feature space combination vector and the scoring tensor are introduced into the high-dimensional tensor distance to construct a combined scoring tensor distance; sample similarity is computed from the combined scoring tensor distance, and view matrices are constructed according to the feature space combinations; and multi-view clustering results under different views are obtained from the view matrices. This solves the technical problem in the prior art that different clustering results cannot be generated for different situations to provide high-quality clustering services to upper-layer big data applications. The following technical effects are achieved: the influence of the fusion and interaction of multiple modal features on the clustering result is taken into account at the same time, giving better clustering performance than single-view clustering; the required feature spaces can be selected flexibly according to the needs of applications in different situations and, through the tensor element mapping relations, multiple clustering results are generated to provide high-quality clustering services to different applications, with better interpretability of the clustering results; the distance between data samples in high-order space is measured more effectively, which suits similarity computation for cross-domain heterogeneous big data samples; and the influence of important attributes on the clustering result is increased while the influence of noise attributes is reduced, so that clustering quality is better than without scoring.
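The text does not give the formula for the combined scoring tensor distance. The sketch below shows one plausible reading under stated assumptions: it uses the commonly cited tensor distance, in which element differences are coupled by a Gaussian metric coefficient that depends on how far apart the elements sit in the tensor, and it weights each element difference by the corresponding entry of the scoring tensor. The score weighting, the conversion to similarity via `exp(-d)`, and the function names are all assumptions; the selection of feature spaces by the combination vector is not modeled.

```python
import math


def scored_tensor_distance(X, Y, S, sigma=1.0):
    """Tensor distance between two equal-shaped second-order sample tensors,
    with each element difference scaled by its score S[i][j] (an assumption)."""
    pos = [(i, j) for i in range(len(X)) for j in range(len(X[0]))]
    diff = [S[i][j] * (X[i][j] - Y[i][j]) for i, j in pos]
    total = 0.0
    for a, (i1, j1) in enumerate(pos):
        for b, (i2, j2) in enumerate(pos):
            # Metric coefficient: Gaussian of the squared distance between
            # the two element positions inside the tensor.
            sq = (i1 - i2) ** 2 + (j1 - j2) ** 2
            g = math.exp(-sq / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            total += g * diff[a] * diff[b]
    return math.sqrt(max(total, 0.0))


def similarity_matrix(samples, S, sigma=1.0):
    """Pairwise similarities exp(-distance); stands in for one view matrix."""
    return [[math.exp(-scored_tensor_distance(a, b, S, sigma)) for b in samples]
            for a in samples]


X = [[1.0, 0.0]]
Y = [[0.0, 0.0]]
d_scored = scored_tensor_distance(X, Y, S=[[1.0, 1.0]])
d_muted = scored_tensor_distance(X, Y, S=[[0.0, 1.0]])
V = similarity_matrix([X, Y], S=[[1.0, 1.0]])
```

In this toy case the two samples differ only in their first element, so setting that element's score to zero makes the distance vanish, which mirrors the stated aim of reducing the influence of noise attributes.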
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (6)

1. A tensor-based multi-view clustering method for cross-domain heterogeneous big data, characterized in that the method comprises:
constructing sample tensors by fusing cross-domain heterogeneous feature spaces, and constructing feature space combination vectors according to different situational contexts;
accumulating the sample tensors to obtain a merged tensor;
normalizing the merged tensor along the order corresponding to each feature space to obtain connection tensors;
computing, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor;
introducing the feature space combination vector and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance;
computing sample similarity according to the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vector;
obtaining multi-view clustering results under different views according to the view matrix.
2. The method according to claim 1, characterized in that the cross-domain heterogeneous feature spaces include one or more of cyberspace, physical space, and social space.
3. The method according to claim 1, characterized in that computing, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor, comprises:
obtaining l connection tensors, where l is a positive integer;
initializing a probability parameter and a threshold parameter;
selecting an initial vector and a random vector;
taking the single-mode product of each of the l connection tensors with the initial vector and the random vector;
judging, for each scoring vector, whether the error between two successive iterates is less than the threshold parameter;
obtaining l scoring vectors when the error between two successive scoring vectors is less than the threshold parameter;
truncating the l scoring vectors according to the feature space dimensions to obtain the feature space scoring vectors;
taking the outer product of the feature space scoring vectors to obtain the scoring tensor.
4. The method according to claim 1, characterized in that obtaining multi-view clustering results under different views according to the view matrix comprises:
inputting the view matrix into a typical clustering algorithm to obtain the multi-view clustering results.
5. A tensor-based multi-view clustering device for cross-domain heterogeneous big data, characterized in that the device comprises:
a first construction unit, configured to construct sample tensors by fusing cross-domain heterogeneous feature spaces, and to construct feature space combination vectors according to different situational contexts;
a first obtaining unit, configured to accumulate the sample tensors to obtain a merged tensor;
a second obtaining unit, configured to normalize the merged tensor along the order corresponding to each feature space to obtain connection tensors;
a third obtaining unit, configured to compute, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and to take the outer product of the feature space scoring vectors to obtain a scoring tensor;
a second construction unit, configured to introduce the feature space combination vector and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance;
a third construction unit, configured to compute sample similarity according to the combined scoring tensor distance, and to construct a view matrix according to the feature space combination vector;
a fourth obtaining unit, configured to obtain multi-view clustering results under different views according to the view matrix.
6. A tensor-based multi-view clustering device for cross-domain heterogeneous big data, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the program:
constructing sample tensors by fusing cross-domain heterogeneous feature spaces, and constructing feature space combination vectors according to different situational contexts;
accumulating the sample tensors to obtain a merged tensor;
normalizing the merged tensor along the order corresponding to each feature space to obtain connection tensors;
computing, from the connection tensors, the stationary distribution under multi-attribute association to obtain a scoring vector for each feature space, and taking the outer product of the feature space scoring vectors to obtain a scoring tensor;
introducing the feature space combination vector and the scoring tensor into the high-dimensional tensor distance to construct a combined scoring tensor distance;
computing sample similarity according to the combined scoring tensor distance, and constructing a view matrix according to the feature space combination vector;
obtaining multi-view clustering results under different views according to the view matrix.
CN201810970444.6A 2018-08-24 2018-08-24 Tensor-based multi-view clustering method and device for cross-domain heterogeneous big data Pending CN109242007A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810970444.6A CN109242007A (en) 2018-08-24 2018-08-24 Tensor-based multi-view clustering method and device for cross-domain heterogeneous big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810970444.6A CN109242007A (en) 2018-08-24 2018-08-24 Tensor-based multi-view clustering method and device for cross-domain heterogeneous big data

Publications (1)

Publication Number Publication Date
CN109242007A (en) 2019-01-18

Family

ID=65068306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810970444.6A Pending CN109242007A (en) 2018-08-24 2018-08-24 A kind of cross-domain isomery big data multi-angle of view clustering method and device based on tensor

Country Status (1)

Country Link
CN (1) CN109242007A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414150A (en) * 2019-07-30 2019-11-05 华东交通大学 Tensor subspace continuous system identification method of bridge time-varying system
CN110414150B (en) * 2019-07-30 2021-06-22 四川省公路规划勘察设计研究院有限公司 Tensor subspace continuous system identification method of bridge time-varying system
CN111242318A (en) * 2020-01-13 2020-06-05 拉扎斯网络科技(上海)有限公司 Business model training method and device based on heterogeneous feature library
CN111242318B (en) * 2020-01-13 2024-04-26 拉扎斯网络科技(上海)有限公司 Service model training method and device based on heterogeneous feature library
WO2023221275A1 (en) * 2022-05-17 2023-11-23 中山大学 Node classification method and system based on tensor graph convolutional network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190118