
CN107203529B - Multi-service relevance analysis method and device based on metadata graph structure similarity - Google Patents


Info

Publication number
CN107203529B
Authority
CN
China
Prior art keywords
metadata
graph
similarity
vertex
objects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610150952.0A
Other languages
Chinese (zh)
Other versions
CN107203529A (en
Inventor
李湛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Hebei Co Ltd
Original Assignee
China Mobile Group Hebei Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Hebei Co Ltd filed Critical China Mobile Group Hebei Co Ltd
Priority to CN201610150952.0A priority Critical patent/CN107203529B/en
Publication of CN107203529A publication Critical patent/CN107203529A/en
Application granted granted Critical
Publication of CN107203529B publication Critical patent/CN107203529B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for analyzing multi-service relevance based on the structural similarity of metadata graphs. The method comprises the following steps: after metadata is obtained from a plurality of services, establishing a relationship graph of metadata objects; determining whether the same meta-model in the relationship graph has common metadata objects and metadata object attributes and, if so, obtaining the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph; and determining the association relationship among the plurality of services based on the structural similarity of the metadata graph.

Description

Multi-service relevance analysis method and device based on metadata graph structure similarity
Technical Field
The present invention relates to relevance analysis technologies, and in particular, to a method and an apparatus for analyzing multi-service relevance based on similarity of metadata graph structures.
Background
Metadata is data that describes data, mainly the concepts, relationships, rules, and semantics of a domain. Metadata is an effective way to manage mass data systems (e.g., data warehouses, data marts, Hadoop big data platforms), because it provides a clear and complete directory for accessing data, so that users can understand the data as a whole and use it efficiently.
When relevance analysis is performed based on metadata using the prior art, the main defects are as follows:
First, a metadata relationship link is a service process with a direct reference relationship or a direct data flow relationship from beginning to end. In general, however, many indirect connections exist among the services of an enterprise, and existing metadata systems have no method for determining the key connection points among services. As a result, when the statistical caliber of one service changes, its influence on other services cannot be evaluated intuitively, and the influence of each metadata object on the service process can only be found by manual backtracking.
Second, existing metadata relevance analysis only roughly compares the number of metadata objects that coincide in two relationship links. In fact, different services often use different attributes of a metadata object, and their logical flow relationships often differ, so the result of a relevance analysis that ignores metadata object attribute information and logical relationships often lacks accuracy.
At present, service classification is mainly carried out manually, relying on the experience of service personnel. This is barely manageable when the data volume is small, but manual classification clearly cannot cope with massive and complex big data, and existing metadata application systems lack a method and mechanism for assisted classification.
Disclosure of Invention
In view of this, embodiments of the present invention are intended to provide a method and an apparatus for analyzing multi-service relevance based on similarity of metadata graph structures, so as to at least solve the problems in the prior art.
The technical scheme of the embodiment of the invention is realized as follows:
An embodiment of the invention provides a multi-service relevance analysis method based on metadata graph structure similarity, comprising the following steps:
after metadata is obtained from a plurality of services, establishing a relationship graph of metadata objects;
determining whether the same meta-model in the relationship graph of the metadata objects has common metadata objects and metadata object attributes and, if so, obtaining the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph of the metadata objects;
determining an association relationship among the plurality of services based on the structural similarity of the metadata graph.
In the above solution, establishing the relationship graph of metadata objects after metadata is obtained from the plurality of services includes:
dividing the metadata into a plurality of classes according to granularity, wherein the description model established for each class is the meta-model;
composing the metadata objects from instances or entities of the meta-model;
and establishing metadata relationships according to the reference or data flow relationships among the metadata objects, building a directed graph with the metadata objects as vertices and the relationships among the metadata objects as edges, and taking this directed graph as the relationship graph of the metadata objects.
In the above solution, the method further comprises:
representing the resource objects involved in each service, and the relationships between those resource objects, using the directed graph of the metadata objects.
In the above solution, obtaining the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph of the metadata objects includes:
obtaining the similarity of the vertices combined with the vertex attributes of the structure in the relationship graph of the metadata objects;
obtaining the similarity of the edges in the relationship graph of the metadata objects;
and obtaining the structural similarity of the metadata graph from the similarity of the vertices combined with the vertex attributes and the similarity of the edges.
In the above solution, obtaining the similarity of the vertices combined with the vertex attributes of the structure in the relationship graph of the metadata objects includes:
representing each service by a metadata subgraph of the metadata directed graph;
and obtaining the proportion that the common vertices of two metadata subgraphs, together with their attributes, account for in a graph of specified size, and calculating from this proportion the similarity of the vertices combined with the vertex attributes of the metadata subgraph structures corresponding to any two services.
In the above solution, obtaining the similarity of the edges in the relationship graph of the metadata objects includes:
representing each service by a metadata subgraph of the metadata directed graph;
and obtaining the proportion that the common edges of two metadata subgraphs account for in a graph of specified size, and calculating from this proportion the similarity of the edges of the metadata subgraph structures corresponding to any two services.
In the above solution, determining the association relationship among the plurality of services based on the structural similarity of the metadata graph includes:
combining the similarity of the vertices and vertex attributes with the similarity of the edges of the metadata subgraph structures corresponding to any two services, to measure the relevance between the two services;
and adjusting the weights through an adjustment factor, according to the aspect that needs attention in practice, to obtain a service relevance value, and determining the association relationship among the plurality of services according to the service relevance value.
An embodiment of the invention provides a multi-service relevance analysis device based on metadata graph structure similarity, comprising:
an establishing unit, configured to establish a relationship graph of metadata objects after metadata is obtained from a plurality of services;
a processing unit, configured to determine whether the same meta-model in the relationship graph of the metadata objects has common metadata objects and metadata object attributes and, if so, to obtain the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph of the metadata objects;
and a determining unit, configured to determine the association relationship among the plurality of services based on the structural similarity of the metadata graph.
In the above solution, the establishing unit further includes:
a classification subunit, configured to divide the metadata into a plurality of classes according to granularity, wherein the description model established for each class is the meta-model;
a composition subunit, configured to compose the metadata objects from instances or entities of the meta-model;
and a relationship establishing subunit, configured to establish metadata relationships according to the reference or data flow relationships among the metadata objects, to build a directed graph with the metadata objects as vertices and the relationships among the metadata objects as edges, and to take this directed graph as the relationship graph of the metadata objects.
In the above solution, the device further supports:
representing the resource objects involved in each service, and the relationships between those resource objects, using the directed graph of the metadata objects.
In the above solution, the processing unit further includes:
a first processing subunit, configured to obtain the similarity of the vertices combined with the vertex attributes of the structure in the relationship graph of the metadata objects;
a second processing subunit, configured to obtain the similarity of the edges in the relationship graph of the metadata objects;
and a third processing subunit, configured to obtain the structural similarity of the metadata graph from the similarity of the vertices combined with the vertex attributes and the similarity of the edges.
In the above solution, the first processing subunit is further configured to:
represent each service by a metadata subgraph of the metadata directed graph;
and obtain the proportion that the common vertices of two metadata subgraphs, together with their attributes, account for in a graph of specified size, and calculate from this proportion the similarity of the vertices combined with the vertex attributes of the metadata subgraph structures corresponding to any two services.
In the above solution, the second processing subunit is further configured to:
represent each service by a metadata subgraph of the metadata directed graph;
and obtain the proportion that the common edges of two metadata subgraphs account for in a graph of specified size, and calculate from this proportion the similarity of the edges of the metadata subgraph structures corresponding to any two services.
In the above solution, the determining unit is further configured to:
combine the similarity of the vertices and vertex attributes with the similarity of the edges of the metadata subgraph structures corresponding to any two services, to measure the relevance between the two services;
and adjust the weights through an adjustment factor, according to the aspect that needs attention in practice, to obtain a service relevance value, and determine the association relationship among the plurality of services according to the service relevance value.
The multi-service relevance analysis method based on metadata graph structure similarity in the embodiments of the invention comprises the following steps: after metadata is obtained from a plurality of services, establishing a relationship graph of metadata objects; determining whether the same meta-model in the relationship graph has common metadata objects and metadata object attributes and, if so, obtaining the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph; and determining the association relationship among the plurality of services based on the structural similarity of the metadata graph. The embodiments of the invention can improve the accuracy and effectiveness of relevance analysis.
Drawings
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a three-level architecture of a metadata system in an application scenario to which an embodiment of the present invention is applied;
FIG. 3 is a schematic diagram of a multi-service relevance analysis based on structural similarity of metadata graphs in an application scenario to which an embodiment of the present invention is applied;
FIG. 4 is a flowchart of a multi-service relevance analysis based on the similarity of metadata graph structures in an application scenario to which an embodiment of the present invention is applied.
Detailed Description
The following describes the embodiments in further detail with reference to the accompanying drawings.
An embodiment of the invention provides a multi-service relevance analysis method based on metadata graph structure similarity. As shown in FIG. 1, the method comprises the following steps:
Step 101: after metadata is obtained from a plurality of services, establish a relationship graph of metadata objects.
Step 102: determine whether the same meta-model in the relationship graph of the metadata objects has common metadata objects and metadata object attributes and, if so, obtain the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph.
Step 103: determine the association relationship among the plurality of services based on the structural similarity of the metadata graph.
In an embodiment of the present invention, establishing the relationship graph of metadata objects after metadata is obtained from the plurality of services includes: dividing the metadata into a plurality of classes according to granularity, wherein the description model established for each class is the meta-model; composing the metadata objects from instances or entities of the meta-model; and establishing metadata relationships according to the reference or data flow relationships among the metadata objects, building a directed graph with the metadata objects as vertices and the relationships among the metadata objects as edges, and taking this directed graph as the relationship graph of the metadata objects.
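The construction described above can be illustrated with a minimal Python sketch. The class names `MetaObject` and `MetadataGraph`, the field names, and the sample objects are hypothetical and are not taken from the patent; the sketch only shows metadata objects stored as vertices and reference or data flow relationships stored as directed edges.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetaObject:
    """An instance of a meta-model (hypothetical structure for illustration)."""
    name: str
    meta_model: str          # the class (meta-model) this object belongs to
    attributes: tuple = ()   # attribute names defined by the meta-model


@dataclass
class MetadataGraph:
    """Directed graph: metadata objects are vertices, relationships are edges."""
    vertices: dict = field(default_factory=dict)  # name -> MetaObject
    edges: set = field(default_factory=set)       # (source name, target name)

    def add_object(self, obj):
        self.vertices[obj.name] = obj

    def add_relation(self, source, target):
        # A reference or data flow relationship from source to target.
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must be registered metadata objects")
        self.edges.add((source, target))


# Sample data (invented): a stored procedure feeds a table, which feeds a report.
graph = MetadataGraph()
graph.add_object(MetaObject("load_sales", "stored_procedure"))
graph.add_object(MetaObject("sales_summary", "table", ("region", "amount")))
graph.add_object(MetaObject("share_report", "report"))
graph.add_relation("load_sales", "sales_summary")
graph.add_relation("sales_summary", "share_report")
```

Storing edges as a set of ordered pairs keeps direction explicit and makes later comparisons between two services a matter of set intersection.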
In an implementation of an embodiment of the present invention, the method further includes: representing the resource objects involved in each service, and the relationships between those resource objects, using the directed graph of the metadata objects.
In an embodiment of the present invention, obtaining the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph of the metadata objects includes: obtaining the similarity of the vertices combined with the vertex attributes of the structure in the relationship graph; obtaining the similarity of the edges in the relationship graph; and obtaining the structural similarity of the metadata graph from the similarity of the vertices combined with the vertex attributes and the similarity of the edges.
In an embodiment of the present invention, obtaining the similarity of the vertices combined with the vertex attributes of the structure in the relationship graph of the metadata objects includes: representing each service by a metadata subgraph of the metadata directed graph; and obtaining the proportion that the common vertices of two metadata subgraphs, together with their attributes, account for in a graph of specified size (for example, the minimum graph), and calculating from this proportion the similarity of the vertices combined with the vertex attributes of the metadata subgraph structures corresponding to any two services.
In an embodiment of the present invention, obtaining the similarity of the edges in the relationship graph of the metadata objects includes: representing each service by a metadata subgraph of the metadata directed graph; and obtaining the proportion that the common edges of two metadata subgraphs account for in a graph of specified size (for example, the minimum graph), and calculating from this proportion the similarity of the edges of the metadata subgraph structures corresponding to any two services.
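As a rough illustration of the edge comparison just described, the Python sketch below takes the "minimum graph" to mean the subgraph with fewer edges and divides the number of common directed edges by that count. This reading is an assumption; the function name `edge_similarity` and the sample edges are hypothetical.

```python
def edge_similarity(edges_a, edges_b):
    """Share of common directed edges relative to the smaller edge set.

    Assumption: the 'minimum graph' is the subgraph with fewer edges.
    Each edge is a (source, target) pair, so direction is preserved.
    """
    smaller = min(len(edges_a), len(edges_b))
    if smaller == 0:
        return 0.0  # a service without edges shares no logical flow
    return len(edges_a & edges_b) / smaller


# Two services that share the step "load_sales -> sales_summary" (invented data).
service_alpha = {("load_sales", "sales_summary"), ("sales_summary", "report_a")}
service_beta = {
    ("load_sales", "sales_summary"),
    ("sales_summary", "report_b"),
    ("sales_summary", "export"),
}
similarity = edge_similarity(service_alpha, service_beta)  # 1 common edge / 2
```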
In an embodiment of the present invention, determining the association relationship among the plurality of services based on the structural similarity of the metadata graph includes: combining the similarity of the vertices and vertex attributes with the similarity of the edges of the metadata subgraph structures corresponding to any two services, to measure the relevance between the two services; and adjusting the weights through an adjustment factor, according to the aspect that needs attention in practice, to obtain a service relevance value, and determining the association relationship among the plurality of services according to the service relevance value.
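One simple way to picture the weighting step is a convex combination of the two similarity values. The text above does not give the combining formula at this point, so the form below, the function name `service_relevance`, and the parameter `factor` are all assumptions made for illustration.

```python
def service_relevance(vertex_sim, edge_sim, factor=0.5):
    """Blend vertex/attribute similarity and edge similarity.

    Assumption: the adjustment factor acts as a weight in [0, 1];
    a larger factor emphasises shared resources, a smaller one
    emphasises shared logical flow.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must lie in [0, 1]")
    return factor * vertex_sim + (1.0 - factor) * edge_sim


# Equal weighting of the two aspects (sample values are invented).
balanced = service_relevance(0.8, 0.4)
# Emphasis on shared resource objects and attributes.
resource_focused = service_relevance(0.8, 0.4, factor=0.9)
```

Under this assumed form, ranking service pairs by the resulting value gives a direct ordering of how strongly a change in one service is likely to touch another.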
An embodiment of the invention provides a multi-service relevance analysis device based on metadata graph structure similarity, comprising: an establishing unit, configured to establish a relationship graph of metadata objects after metadata is obtained from a plurality of services; a processing unit, configured to determine whether the same meta-model in the relationship graph of the metadata objects has common metadata objects and metadata object attributes and, if so, to obtain the structural similarity of the metadata graph from the similarity of the vertices, the vertex attributes, and the edges of the structure in the relationship graph; and a determining unit, configured to determine the association relationship among the plurality of services based on the structural similarity of the metadata graph.
In an implementation of the embodiment of the present invention, the establishing unit further includes:
a classification subunit, configured to divide the metadata into a plurality of classes according to granularity, wherein the description model established for each class is the meta-model;
a composition subunit, configured to compose the metadata objects from instances or entities of the meta-model;
and a relationship establishing subunit, configured to establish metadata relationships according to the reference or data flow relationships among the metadata objects, to build a directed graph with the metadata objects as vertices and the relationships among the metadata objects as edges, and to take this directed graph as the relationship graph of the metadata objects.
In an implementation of an embodiment of the present invention, the device further supports:
representing the resource objects involved in each service, and the relationships between those resource objects, using the directed graph of the metadata objects.
In an embodiment of the present invention, the processing unit further includes:
a first processing subunit, configured to obtain the similarity of the vertices combined with the vertex attributes of the structure in the relationship graph of the metadata objects;
a second processing subunit, configured to obtain the similarity of the edges in the relationship graph of the metadata objects;
and a third processing subunit, configured to obtain the structural similarity of the metadata graph from the similarity of the vertices combined with the vertex attributes and the similarity of the edges.
In an implementation of the embodiment of the present invention, the first processing subunit is further configured to:
represent each service by a metadata subgraph of the metadata directed graph;
and obtain the proportion that the common vertices of two metadata subgraphs, together with their attributes, account for in a graph of specified size, and calculate from this proportion the similarity of the vertices combined with the vertex attributes of the metadata subgraph structures corresponding to any two services.
In an implementation of the embodiment of the present invention, the second processing subunit is further configured to:
represent each service by a metadata subgraph of the metadata directed graph;
and obtain the proportion that the common edges of two metadata subgraphs account for in a graph of specified size, and calculate from this proportion the similarity of the edges of the metadata subgraph structures corresponding to any two services.
In an implementation of the embodiment of the present invention, the determining unit is further configured to:
combine the similarity of the vertices and vertex attributes with the similarity of the edges of the metadata subgraph structures corresponding to any two services, to measure the relevance between the two services;
and adjust the weights through an adjustment factor, according to the aspect that needs attention in practice, to obtain a service relevance value, and determine the association relationship among the plurality of services according to the service relevance value.
The embodiment of the invention is explained below by taking a practical application scenario as an example.
The application scenario is introduced as follows:
In today's big data age, the successful implementation and deployment of business intelligence (BI) depends on efficient metadata management and application. Metadata is defined as data that describes other data, mainly covering related topics, concepts, terms, structures, procedures, relationships, and rules in the business, technical, and management fields. High-level metadata applications can serve as a guide to complex systems and massive data, help users better understand the full context of each service, strengthen the basic support that data gives to services, improve control over data quality, and enable efficient enterprise management. At present, however, the application of metadata is still at a simple stage; for lack of in-depth research and application at a high level, the analysis of complex relationships among multiple services still needs considerable improvement.
In this application scenario, the embodiment of the invention measures the relevance among a plurality of services from the similarity of the vertices, attributes, and edges of the metadata directed graphs corresponding to those services. This reflects the complex, intersecting influence relationships among services, helps service personnel become familiar with those relationships, and supports enterprise decisions in business analysis. The main problems addressed are: 1) The association relationship among a plurality of services is determined by establishing a relationship graph of metadata objects and comparing the similarity of the vertices, attributes, and edges of the graph structure, so that the degree to which a change in one service influences other services can be reflected intuitively. 2) The analysis takes into account that different services may use different attributes of the same resource and may differ in their upstream and downstream logical relationships. The structural similarity of the metadata graph is measured using the vertices of the metadata objects, the attribute information of the metadata objects, and the edges corresponding to the logical flow relationships, which makes the service relevance analysis result more credible. 3) Once the service relevance has been calculated from the structural similarity of the metadata graph, a series of applications can be built on it, for example a service change impact early-warning system, the sorting out and merging of redundant or repeated service processes, and automatic assisted service classification, which can solve some of the complex problems posed by big data.
The three-layer architecture of the metadata system is shown in fig. 2. In this application scenario, the embodiment of the invention adds a multi-service meta-object analysis function module based on metadata graph structure similarity (shown as a15 in fig. 2). On this basis it further proposes modules for high-level extended applications (shown as a16 in fig. 2), such as a service change early-warning module, a module for sorting out and merging redundant or repeated service processes, and an automatic assisted service classification module. The remaining modules, marked a11, a12, a13, and a14, are existing modules.
The main principle of the multi-service relevance analysis function module based on metadata graph structure similarity is shown in fig. 3 and is described in detail as follows:
First, the resource objects used by different services are similar: the services jointly use certain resource objects or certain attributes of those resource objects. Mapped onto the metadata graph, this means that the vertices and their attributes are similar, and an association therefore exists between these services.
For example, in fig. 3 the company has two market share reports corresponding to two services. The data of both reports is summarized from the same table, and the same fields of that table are used under different statistical calibers, so the reports of the two services are associated through the table and its attribute fields.
At present, descriptive information about structured data (e.g., relational databases, OLAP online analysis data) and unstructured data (e.g., log files, XML files, Webservice interfaces, Hadoop platform data) is the usual source of metadata, and automatic or manual extraction and entry of this descriptive data is the main way in which the acquisition layer of a metadata system obtains data.
In the logic layer of the metadata system, metadata is divided into $\delta$ classes according to granularity, and a description model, called a meta-model, is established for each class. All metadata can thus be classified by meta-model and expressed as a set $M = \{m_1, m_2, \ldots, m_\delta\}$, in which each meta-model $m_\chi$ is described by several attributes, i.e. $m_\chi = (a_1, a_2, \ldots, a_\kappa)$. An instance or entity of a meta-model, called a meta-object, is represented as
Figure BDA0000943106840000101
Metadata relationships are established from the reference or data flow relationships between meta-objects and are denoted $r_{\chi,\gamma}$, i.e. the relationship between the metadata objects
Figure BDA0000943106840000102
and
Figure BDA0000943106840000103
With meta-objects as vertices and the relationships between meta-objects as edges, a directed graph of the metadata can be built, denoted $G = \langle V, E \rangle$, in which the vertices are represented as the set
Figure BDA0000943106840000104
and the edges are represented as an adjacency matrix
Figure BDA0000943106840000105
Thus, the resource objects involved in each service and the relationships between these resource objects can be represented by a directed graph of metadata. In the functional layer of the metadata system, on the basis of the metadata directed graphs abstracted from the services, the graphs are compared to determine whether the same meta-model has common meta-objects and meta-object attributes, and the relevance among different services is measured by the similarity of the vertices and vertex attributes of the metadata graph structure.
Because the attribute dimensions of a given meta-model are fixed, meta-objects derived from the same meta-model have attributes of the same dimension; the specific meta-objects or their attribute values, however, may differ from service to service. In this embodiment of the invention, cosine similarity is used to measure the similarity between the attributes of different meta-objects
Figure BDA0000943106840000106
and
Figure BDA0000943106840000107
of the same meta-model $m_\chi$, calculated as in equation (1-1):
Figure BDA0000943106840000108
wherein each attribute of a meta-object,
Figure BDA00009431068400001010
is represented as 1 if it is not null, and as 0 otherwise.
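The attribute comparison described above can be sketched as follows. Equation (1-1) is not reproduced in the text, so this assumes the standard cosine similarity applied to 0/1 vectors in which a non-null attribute is 1; the function names and sample values are hypothetical.

```python
import math


def presence_vector(attribute_values):
    """Map each attribute value to 1 if it is not null (None), else 0."""
    return [0 if value is None else 1 for value in attribute_values]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector is all zeros (an assumption; the text
    does not say how this case is handled).
    """
    if len(u) != len(v):
        raise ValueError("meta-objects of one meta-model must have equal dimensions")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# The same hypothetical "table" meta-model as filled in by two services.
table_in_service_a = ["dw", "sales facts", None, "daily"]
table_in_service_b = ["dw", None, None, "daily"]
similarity = cosine_similarity(
    presence_vector(table_in_service_a),
    presence_vector(table_in_service_b),
)
```

Here the two presence vectors are (1, 1, 0, 1) and (1, 0, 0, 1), so the similarity is 2 / (√3 · √2).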
Each service is represented by a subgraph of the metadata directed graph. By considering the proportion that the common vertices of two graphs and their attributes account for in the smaller of the two graphs, the similarity of the vertices and vertex attributes of the metadata graph structures corresponding to any two services α and β can be calculated, as shown in formula (1-2):
Figure BDA0000943106840000111
where the subgraphs $g_\alpha, g_\beta \in G$.
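Formula (1-2) is not reproduced in the text, so the following is only one plausible reading of it: sum the attribute similarity over the vertices the two subgraphs share, then divide by the vertex count of the smaller subgraph. The function names and data are hypothetical, and meta-objects with the same name are assumed to belong to the same meta-model.

```python
import math


def attribute_similarity(values_a, values_b):
    """Cosine similarity of the 0/1 'attribute is not null' vectors."""
    u = [0 if value is None else 1 for value in values_a]
    v = [0 if value is None else 1 for value in values_b]
    norm = math.sqrt(sum(u)) * math.sqrt(sum(v))
    return sum(p * q for p, q in zip(u, v)) / norm if norm else 0.0


def vertex_similarity(g_a, g_b):
    """Assumed form of s_v: shared-vertex attribute similarity over the smaller graph.

    Each graph is a dict mapping a meta-object name to its attribute values.
    """
    smaller = min(len(g_a), len(g_b))
    if smaller == 0:
        return 0.0
    common = g_a.keys() & g_b.keys()
    total = sum(attribute_similarity(g_a[name], g_b[name]) for name in common)
    return total / smaller


g_alpha = {
    "sales_table": ["dw", "sales facts", None, "daily"],
    "customer_table": ["dw", "customers", "crm", "daily"],
    "proc_load_sales": ["sql", None],
}
g_beta = {
    "sales_table": ["dw", None, None, "daily"],
    "customer_table": ["dw", "customers", "crm", "daily"],
}
s_v = vertex_similarity(g_alpha, g_beta)
```

Dividing by the smaller graph means a service whose vertices are all contained in another service scores close to 1, which fits the sub-service case the description mentions later.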
When the logic flows of different services are similar, that is, when the services all use a logic flow from certain resource objects or their attributes to other resource objects or their corresponding attributes, the services are associated; mapped onto the metadata graph, this appears as common, continuous directed edges.
In this embodiment, from the perspective of the abstracted metadata directed graph, if there are common, continuous directed edges between metadata objects, the relevance between different services can be measured by the similarity of the edges of the metadata graph structure. For example, in the example above, two business-related reports of a company are associated through the tables they both involve and the attribute fields of those tables, and both the tables and the fields are processed by the same stored procedure, so that there is a continuous, directed logical link from the stored procedure to the tables and their fields.
The edge similarity of the metadata graph structures corresponding to any two services α and β can be calculated by considering the proportion that the common edges of the two services' metadata directed graphs account for in the smaller of the two graphs, as shown in formula (1-3):
Figure BDA0000943106840000112
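Formula (1-3) is likewise not reproduced in the text. A minimal sketch, assuming it is the number of shared directed edges divided by the edge count of the smaller subgraph, could look like this; the function name and edge data are hypothetical, and the sketch does not check that the shared edges form a continuous path.

```python
def edge_similarity(edges_a, edges_b):
    """Assumed form of s_e: shared directed edges over the smaller edge set.

    Each argument is a set of (source_name, target_name) tuples.
    """
    smaller = min(len(edges_a), len(edges_b))
    if smaller == 0:
        return 0.0
    return len(edges_a & edges_b) / smaller


edges_alpha = {
    ("proc_load_sales", "sales_table"),
    ("sales_table", "report_a"),
}
edges_beta = {
    ("proc_load_sales", "sales_table"),
    ("sales_table", "report_b"),
    ("customer_table", "report_b"),
}
s_e = edge_similarity(edges_alpha, edges_beta)
```

Because the edges are ordered pairs, an edge from a table to a procedure does not match an edge from that procedure to the table, which preserves the direction of the logic flow.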
Thirdly, the similarity of the vertices and vertex attributes of the metadata graph structure is combined with the similarity of the edges to measure the relevance among multiple different services, and the weights are adjusted through an adjustment factor according to the aspect of practical interest to obtain the service relevance value.
In reality, the relationship between two services is often compared by simultaneously considering the resource objects and attributes and the service logic flows commonly used by the two services, so the embodiment proposes a formula for calculating the association degree of any two services α and β by combining the above two formulas (1-2) and (1-3), as shown in (1-4):
$$\mathrm{rel}(\alpha,\beta)=\mathrm{sim}(g_\alpha,g_\beta)=\theta\cdot s_v(g_\alpha,g_\beta)+(1-\theta)\cdot s_e(g_\alpha,g_\beta) \tag{1-4}$$
If one of the two services α and β is a sub-service of the other, the association between the two services is 100%, i.e. rel(α, β) = 1.
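Formula (1-4) and the sub-service rule can be sketched as a single function. The function name, the default value of the adjustment factor θ, and the `is_sub_service` flag are illustrative assumptions; the text does not say how a sub-service is detected.

```python
def relevance(s_v, s_e, theta=0.5, is_sub_service=False):
    """rel = theta * s_v + (1 - theta) * s_e, per formula (1-4).

    `theta` weights vertex/attribute similarity against edge similarity.
    When one service is a sub-service of the other, the relevance is 1.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if is_sub_service:
        return 1.0
    return theta * s_v + (1.0 - theta) * s_e


# Emphasise shared resource objects (theta = 0.7) over shared logic flows.
rel_value = relevance(s_v=0.9, s_e=0.5, theta=0.7)
```

With θ = 1 only shared resource objects and attributes count, and with θ = 0 only shared logic flows count.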
Fourthly, a series of applications can be created according to the relevance values between different services, such as a service-change impact early-warning system, the sorting out and merging of redundant, repeated service flows, automatic assisted service classification, and so on.
A series of advanced extension applications can be established on a functional layer of a metadata system by utilizing the multi-service relevance analysis result of the structural similarity of the metadata graph.
Specifically, the service-change impact early-warning system can evaluate in advance the impact that a change to one service has on other services; if the services are highly associated and the impact exceeds an early-warning threshold, an alarm is raised, which avoids the serious adverse effects of changing one service while neglecting the others.
The application for sorting out and merging redundant, repeated service flows can, based on the relevance of the services, find redundant or repeated flows that may exist among them and adjust and merge those flows, thereby saving resources and cost.
The automatic assisted service classification application can classify a service automatically, as an aid, according to its relevance to the existing service classes, reducing the workload of manual classification.
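A minimal sketch of the early-warning check follows. It assumes that the relevance value itself serves as the impact measure and that an alarm is raised when it strictly exceeds the threshold; the function name, the service names, and the numbers are hypothetical.

```python
def services_to_warn(changed, relevance_by_pair, threshold):
    """List (service, relevance) pairs whose relevance to `changed` exceeds the threshold.

    `relevance_by_pair` maps a frozenset of two service names to their
    relevance value. The result is sorted from most to least relevant.
    """
    affected = []
    for pair, value in relevance_by_pair.items():
        if changed in pair and value > threshold:
            (other,) = pair - {changed}
            affected.append((other, value))
    return sorted(affected, key=lambda item: item[1], reverse=True)


relevance_by_pair = {
    frozenset({"billing", "crm"}): 0.82,
    frozenset({"billing", "reporting"}): 0.35,
    frozenset({"crm", "reporting"}): 0.90,
}
warnings = services_to_warn("billing", relevance_by_pair, threshold=0.6)
```

In this example a planned change to the hypothetical "billing" service would raise an alarm for "crm" only.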
Fig. 4 is a flowchart of a multi-service relevance analysis based on the structural similarity of the metadata graph, and as shown in fig. 4, the whole flow of the multi-service relevance analysis method based on the structural similarity of the metadata graph is as follows:
Step 11, collecting metadata of the services to be managed.
Metadata collection covers descriptions of resource objects and of relationship rules, drawn from the interfaces, source code, and documents of business systems; the tables, views, and stored procedures of databases; the extraction, cleaning, conversion, mapping, and loading rules of ETL data; the data-model APIs of modeling tools; and OLAP (online analytical processing) data. Structured data can be obtained through a data dictionary, while unstructured data, including XML files, log files, Webservice interfaces, Hadoop platforms, and the like, can be obtained by parsing according to provided standard rules.
Step 12, establishing meta-models according to the specified granularity, wherein each meta-model $m_\chi$ is described by several attributes, i.e. $m_\chi = (a_1, a_2, \ldots, a_\kappa)$, and the metadata is classified by meta-model.
Step 13, describing the metadata according to the meta-models to establish the meta-objects
Figure BDA0000943106840000121
and establishing the metadata relationships $r_{\chi,\gamma}$ according to the rules of reference/referenced or data outflow/inflow between meta-objects, i.e. the relationship between the meta-objects
Figure BDA0000943106840000122
and
Figure BDA0000943106840000123
A directed graph $G = \langle V, E \rangle$ of the metadata is established by taking the meta-objects as vertices and the relationships between meta-objects as edges, wherein the vertices form a set and the edges form the adjacency matrix
Figure BDA0000943106840000131
Step 14, according to the principle that services are associated when they share certain resource objects or certain attributes of those objects, calculating the similarity of the vertices combined with the vertex attributes of the metadata graph structures corresponding to any two services α and β according to formulas (1-1) and (1-2).
Step 15, according to the principle that services are associated when they all use a logic flow from certain resource objects or their attributes to other resource objects or their corresponding attributes, calculating the edge similarity of the metadata graph structures corresponding to any two services α and β according to formula (1-3).
Step 16, combining the similarity of the vertices and vertex attributes of the metadata graph structure with the similarity of the edges to calculate the relevance of any two services α and β according to formula (1-4).
Step 17, judging whether the two services α and β are related according to a threshold on their relevance; if so, executing step 18; otherwise, returning to step 12.
Step 18, a series of applications are created according to the relevance values of the two services α and β.
Here, the applications created from the relevance values include, for example, a service-change impact early-warning system, the sorting out and merging of redundant, repeated service flows, automatic assisted service classification, and so on.
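Steps 14 to 17 of the flow can be sketched end to end as follows, taking the graphs from steps 11 to 13 as given. Since formulas (1-1) to (1-3) are not reproduced in the text, the similarity forms used here are assumptions (cosine similarity on 0/1 attribute vectors, and shared vertices or edges over the smaller graph); the `ServiceGraph` class, the `analyse_pair` function, the threshold comparison with `>=`, and all sample data are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class ServiceGraph:
    """Metadata subgraph of one service (steps 11 to 13 are assumed done)."""
    vertices: dict  # meta-object name -> list of attribute values (None = null)
    edges: set      # {(source_name, target_name), ...}


def _attribute_similarity(a, b):
    """Cosine similarity of the 0/1 'attribute is not null' vectors."""
    u = [0 if x is None else 1 for x in a]
    v = [0 if x is None else 1 for x in b]
    norm = math.sqrt(sum(u)) * math.sqrt(sum(v))
    return sum(p * q for p, q in zip(u, v)) / norm if norm else 0.0


def analyse_pair(g_a, g_b, theta=0.5, threshold=0.6):
    """Return (relevance, related) for two service graphs."""
    # Step 14: vertex similarity combined with vertex attributes.
    common = g_a.vertices.keys() & g_b.vertices.keys()
    smaller_v = min(len(g_a.vertices), len(g_b.vertices))
    shared = sum(_attribute_similarity(g_a.vertices[n], g_b.vertices[n]) for n in common)
    s_v = shared / smaller_v if smaller_v else 0.0
    # Step 15: edge similarity.
    smaller_e = min(len(g_a.edges), len(g_b.edges))
    s_e = len(g_a.edges & g_b.edges) / smaller_e if smaller_e else 0.0
    # Step 16: weighted combination, formula (1-4).
    rel = theta * s_v + (1 - theta) * s_e
    # Step 17: threshold decision.
    return rel, rel >= threshold


report_a = ServiceGraph(
    vertices={"proc": ["sql"], "sales": ["dw", "daily"], "report_a": ["pdf"]},
    edges={("proc", "sales"), ("sales", "report_a")},
)
report_b = ServiceGraph(
    vertices={"proc": ["sql"], "sales": ["dw", None], "report_b": ["xls"]},
    edges={("proc", "sales"), ("sales", "report_b")},
)
rel, related = analyse_pair(report_a, report_b, theta=0.5, threshold=0.5)
```

Pairs judged related would then feed the applications of step 18, such as early warning or flow merging.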
With this embodiment of the invention: 1) the similarity of the vertices, vertex attributes, and edges of the metadata directed graph structure is abstracted from two observations, namely that services share certain resource objects or certain of their attributes, and that services share business logic flows from certain resource objects or their attributes to other resource objects or their corresponding attributes; the relevance among multiple services is measured by this similarity; 2) a metadata association analysis module analyzes the association among multiple services using the similarity of the vertices, vertex attributes, and edges of the metadata directed graph structure, and implements the application layer. The embodiment remedies the inability of existing metadata technology to adequately handle the complex relationships among multiple services. The proposed multi-service relevance analysis method based on metadata graph structure similarity determines the association among multiple services by building a relation graph of metadata objects and comparing the similarity of the graph's vertices, attributes, and edges, so that the degree to which a change in one service affects other services is reflected intuitively. On this basis, a series of applications can be created to implement service-change early warning, merging of redundant service flows, automatic assisted service classification, and the like, thereby addressing some problems in big data. The scheme is highly practical in real applications.
The integrated module according to the embodiment of the present invention may also be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as an independent product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.
Correspondingly, the embodiment of the present invention further provides a computer storage medium, in which a computer program is stored, where the computer program is used to execute the method for analyzing multi-service relevance based on the similarity of metadata graph structures according to the embodiment of the present invention.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention.

Claims (8)

1. A multi-service relevance analysis method based on metadata graph structure similarity is characterized by comprising the following steps:
after metadata are obtained from a plurality of services, a relation graph of metadata objects is established; the relation graph of the metadata object is a directed graph of the metadata object;
judging whether the same meta-model in the relation graph of the metadata object has a common metadata object and metadata object attribute, if so, obtaining the structural similarity of the relation graph of the metadata object according to the similarity of the vertex, the vertex attribute and the edge of the structure in the relation graph of the metadata object;
determining an incidence relation among a plurality of businesses based on the structural similarity of the relation graph of the metadata object;
wherein, the obtaining the structural similarity of the relationship graph of the metadata object according to the similarity of the vertex, the vertex attribute and the edge of the structure in the relationship graph of the metadata object comprises: acquiring similarity of vertex combination vertex attributes of the structure in the relation graph of the metadata object; acquiring the similarity of edges in the relation graph of the metadata object; according to the similarity of the vertex combined with the vertex attributes and the similarity of the edges, obtaining the structural similarity of the relation graph of the metadata object;
the obtaining of the similarity of the vertex combination vertex attributes of the structure in the relationship graph of the metadata object includes:
each service is represented by a metadata sub-graph of a directed graph of metadata objects; acquiring the proportion of the common vertex of the two metadata subgraphs and the attribute thereof in a specified specification graph, and calculating the similarity of the vertex combination vertex attribute of the metadata subgraph structure corresponding to any two services according to the proportion;
the obtaining of the similarity of the edges in the relationship graph of the metadata object includes: each service is represented by a metadata sub-graph of a directed graph of metadata objects; and acquiring the proportion of the common edge of the two metadata subgraphs in the specified specification graph, and calculating the similarity of the edges of the metadata subgraph structures corresponding to any two services according to the proportion.
2. The method of claim 1, wherein after obtaining metadata from the plurality of services, establishing a relationship graph of metadata objects comprises:
dividing the metadata into a plurality of classes according to different granularities, wherein a description model established by each class is the meta-model;
composing the metadata object from instances or entities of the meta-model;
and establishing a metadata relationship according to the reference or data flow relationship among the metadata objects, establishing a directed graph of the metadata objects by taking the metadata objects as vertexes and the relationship among the metadata objects as edges, and taking the directed graph of the metadata objects as the relationship graph of the metadata objects.
3. The method of claim 2, further comprising:
the resource objects involved by each business and the relationships between the resource objects support representation using directed graphs of the metadata objects.
4. The method of claim 1, wherein determining the associative relationship between the plurality of businesses based on the structural similarity of the relationship graph of the metadata object comprises:
combining the vertex of the metadata sub-graph structure corresponding to any two services with the similarity of the vertex attributes and the similarity of the edges of the metadata sub-graph structure corresponding to any two services, and measuring the relevance between any different services;
and adjusting the weights through an adjustment factor, according to the aspect of practical interest, to obtain a service relevance value, and determining the association relationship among the plurality of services according to the service relevance value.
5. A multi-service relevance analysis apparatus based on metadata graph structure similarity, the apparatus comprising:
an establishing unit, configured to establish a relation graph of metadata objects after metadata is acquired from a plurality of services;
the processing unit is used for judging whether the same meta-model in the relation graph of the metadata object has the common metadata object and the metadata object attribute, and if the same meta-model in the relation graph of the metadata object has the common metadata object and the common metadata object attribute, the structural similarity of the relation graph of the metadata object is obtained according to the similarity of the vertex, the vertex attribute and the edge of the structure in the relation graph of the metadata object;
a determining unit, configured to determine an association relationship between multiple businesses based on structural similarity of the relationship graph of the metadata object;
wherein the processing unit further comprises: the first processing subunit is used for acquiring similarity of vertex combination vertex attributes of a structure in a relational graph of the metadata object;
the second processing subunit is used for acquiring the similarity of edges in the relation graph of the metadata object;
the third processing subunit is used for obtaining the structural similarity of the relation graph of the metadata object according to the similarity of the vertex combined with the vertex attribute and the similarity of the edge;
the first processing subunit further configured to: each service is represented by a metadata sub-graph of a directed graph of metadata objects; acquiring the proportion of the common vertex of the two metadata subgraphs and the attribute thereof in a specified specification graph, and calculating the similarity of the vertex combination vertex attribute of the metadata subgraph structure corresponding to any two services according to the proportion;
the second processing subunit further configured to: each service is represented by a metadata sub-graph of a directed graph of metadata objects; and acquiring the proportion of the common edge of the two metadata subgraphs in the specified specification graph, and calculating the similarity of the edges of the metadata subgraph structures corresponding to any two services according to the proportion.
6. The apparatus of claim 5, wherein the establishing unit further comprises:
the classification subunit is used for dividing the metadata into a plurality of classes according to different granularities, and the description model established by each class is the meta-model;
a composition subunit for composing the metadata object from instances or entities of the meta-model;
and the relationship establishing subunit is used for establishing a metadata relationship according to the reference or data flow direction relationship among the metadata objects, establishing a directed graph of the metadata objects by taking the metadata objects as vertexes and the relationship among the metadata objects as edges, and taking the directed graph of the metadata objects as the relationship graph of the metadata objects.
7. The apparatus of claim 6, further comprising:
the resource objects involved by each business and the relationships between the resource objects support representation using directed graphs of the metadata objects.
8. The apparatus of claim 5, wherein the determining unit is further configured to:
combining the vertex of the metadata sub-graph structure corresponding to any two services with the similarity of the vertex attributes and the similarity of the edges of the metadata sub-graph structure corresponding to any two services, and measuring the relevance between any different services;
and adjusting the weights through an adjustment factor, according to the aspect of practical interest, to obtain a service relevance value, and determining the association relationship among the plurality of services according to the service relevance value.
CN201610150952.0A 2016-03-16 2016-03-16 Multi-service relevance analysis method and device based on metadata graph structure similarity Active CN107203529B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610150952.0A CN107203529B (en) 2016-03-16 2016-03-16 Multi-service relevance analysis method and device based on metadata graph structure similarity

Publications (2)

Publication Number Publication Date
CN107203529A CN107203529A (en) 2017-09-26
CN107203529B true CN107203529B (en) 2020-02-21

Family

ID=59903515

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610150952.0A Active CN107203529B (en) 2016-03-16 2016-03-16 Multi-service relevance analysis method and device based on metadata graph structure similarity

Country Status (1)

Country Link
CN (1) CN107203529B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1460580A1 (en) * 2001-12-14 2004-09-22 NEC Corporation Face meta-data creation and face similarity calculation
CN102239458A (en) * 2008-12-02 2011-11-09 起元技术有限责任公司 Visualizing relationships between data elements
CN102982168A (en) * 2012-12-12 2013-03-20 江苏省电力公司信息通信分公司 Metadata schema matching method based on XML (extensive markup language) document
CN104850632A (en) * 2015-05-22 2015-08-19 东北师范大学 Generic similarity calculation method and system based on heterogeneous information network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant