CN111914098A - Knowledge graph construction method and device, electronic equipment and readable storage medium - Google Patents
- Publication number
- CN111914098A (application CN202010695185.8A)
- Authority
- CN
- China
- Prior art keywords
- knowledge
- software
- information
- graph
- knowledge graph
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/53—Decompilation; Disassembly
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
- G06F8/61—Installation
Abstract
The application relates to the technical field of information processing, and in particular to a knowledge graph construction method and device, an electronic device, and a readable storage medium. The method comprises: collecting sample software information and preprocessing it to obtain preprocessed data; determining a knowledge graph knowledge base according to the preprocessed data and preset rules; analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base; and constructing a software knowledge graph according to the expanded knowledge graph knowledge base. By applying diversified analysis to the sample software, the method constructs a software project knowledge graph even when data is sparse because software project information resources cannot be obtained. The resulting graph can help software developers with software impact assessment, dependency retrieval, code generation or verification and similar tasks, improving software development efficiency and reducing problems such as software version conflicts.
Description
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a method and an apparatus for constructing a knowledge graph, an electronic device, and a readable storage medium.
Background
A knowledge graph (also known as knowledge domain visualization or knowledge domain mapping) is a family of graphs that display the development of knowledge and its structural relationships. It uses visualization technology to describe knowledge resources and their carriers, and to mine, analyze, construct, draw and display knowledge and the relations among knowledge items. In the software field, constructing a software knowledge graph mainly depends on software resources such as the source code, question-and-answer documents, and requirement/design documents of software projects, and the resulting graph is mainly oriented to the code level. However, in industries such as finance, most software is developed by third-party commercial companies, and these resources cannot be obtained. As a result, packages, classes, interfaces, methods, attributes, exceptions, method parameters, return values and the relationships among these elements cannot be extracted, and a knowledge graph of the software project cannot be established. In addition, a code-level knowledge graph cannot be used to evaluate influence relationships such as those arising from software version upgrades.
Disclosure of Invention
The present application aims to solve at least one of the above technical drawbacks. The technical scheme adopted by the application is as follows:
In a first aspect, an embodiment of the present application provides a method for constructing a knowledge graph, where the method includes:
collecting sample software information and preprocessing the sample software information to obtain preprocessed data;
determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base;
and constructing a software knowledge graph according to the expanded knowledge graph knowledge base.
Optionally, collecting the sample software information includes, but is not limited to:
collecting a sample software repository, software installation package names, directory structure information, construction information and program running information.
Optionally, determining a knowledge-graph knowledge base according to the preprocessed data and preset rules includes:
determining the knowledge graph knowledge base according to the preprocessed data and expert knowledge in the software field, wherein the knowledge base comprises a concept set and an attribute set of the sample software;
the concept set is used as the entities for constructing the knowledge graph, and the attribute set is used as the entity associations for constructing the knowledge graph.
Optionally, analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base includes:
analyzing and processing the software information of the sample software to expand the concept set and the attribute set of the knowledge base.
Optionally, analyzing and processing the software information of the sample software to expand the concept set and the attribute set of the knowledge graph knowledge base includes at least one of:
analyzing the sample software directory structure information to obtain knowledge graph concept set elements and attribute set elements; or,
analyzing the sample software file format construction information to obtain a knowledge graph concept set element and an attribute set element; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain knowledge graph concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, classification and the like of the sample software, as well as compiling information and environment information;
the attribute set elements include, but are not limited to: the belonging-to, containing, plug-in, inheritance, aggregation and dependency relationships.
Optionally, the method further comprises:
analyzing the process or directory relations of the sample software program in the running state;
mining the relations of the sample software project in the running state;
associating the mined running-state relations with knowledge items in the determined knowledge graph knowledge base;
obtaining the hot-loading, distributed-deployment and calling dependency relationships of the sample software according to the association result;
and adding the hot-loading, distributed-deployment and calling dependency relationships to the attribute set of the knowledge graph knowledge base.
In a second aspect, an embodiment of the present application provides a knowledge graph building apparatus, including: the device comprises an acquisition module, a processing module, a storage module and a construction module; wherein,
the acquisition module is used for acquiring sample software information;
the processing module is used for preprocessing the sample software information to obtain preprocessed data;
the processing module is further used for determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
the processing module is also used for analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base;
the construction module is used for constructing a software knowledge graph according to the expanded knowledge graph knowledge base;
the storage module is used for storing the preset rules, the preprocessed data and the knowledge graph knowledge base.
Optionally, the processing module is specifically configured to determine the knowledge graph knowledge base according to the preprocessed data and expert knowledge in the software field, where the knowledge base includes a concept set and an attribute set of the sample software;
the concept set is used as the entities for constructing the knowledge graph, and the attribute set is used as the entity associations for constructing the knowledge graph.
Optionally, the processing module is further configured to perform at least one of:
analyzing the sample software directory structure information to obtain knowledge graph concept set elements and attribute set elements; or,
analyzing the sample software file format construction information to obtain a knowledge graph concept set element and an attribute set element; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain knowledge graph concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, classification and the like of the sample software, as well as compiling information and environment information;
the attribute set elements include, but are not limited to: the belonging-to, containing, plug-in, inheritance, aggregation and dependency relationships.
Optionally, the processing module is further configured to:
analyzing the process or directory relations of the sample software program in the running state;
mining the relations of the sample software project in the running state;
associating the mined running-state relations with knowledge items in the determined knowledge graph knowledge base;
obtaining the hot-loading, distributed-deployment and calling dependency relationships of the sample software according to the association result;
and adding the hot-loading, distributed-deployment and calling dependency relationships to the attribute set of the knowledge graph knowledge base.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the knowledge graph construction method by calling the operation instructions.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the above knowledge graph construction method.
The technical scheme provided by the embodiment of the application has the following beneficial effects: the knowledge graph construction method provided by the embodiment of the application comprises the following steps: collecting sample software information and preprocessing the sample software information to obtain preprocessed data; determining a knowledge graph knowledge base according to the preprocessed data and preset rules; analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base; and constructing a software knowledge graph according to the expanded knowledge graph knowledge base. With this technical scheme, even when data is sparse because software project information resources cannot be obtained, entity extraction and entity association extraction are achieved by applying diversified analysis to the sample software and establishing a knowledge base, and a software project knowledge graph is thereby constructed.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
FIG. 1 is a schematic flow chart diagram of a knowledge graph construction method provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a knowledge graph constructing apparatus according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any one of, and all combinations of, the associated listed items.
The embodiment of the application relates to knowledge graph construction technology for software projects, and provides a knowledge graph construction method that addresses the difficulty, described in the background, of constructing a knowledge graph of a software project. The following describes the technical solutions of the present application, and how they solve the above technical problems, with specific embodiments in conjunction with the accompanying drawings. The following specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
In order to make the objects, technical solutions and advantages of the present application clearer, fig. 1 discloses a flowchart of a method for constructing a knowledge graph provided by an embodiment of the present application, and as shown in fig. 1, the method for constructing a knowledge graph includes:
S101, collecting sample software information and preprocessing the sample software information to obtain preprocessed data;
S102, determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
S103, analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base;
S104, constructing a software knowledge graph according to the expanded knowledge graph knowledge base.
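Steps S101 to S104 can be pictured as a small pipeline. The following Python sketch is illustrative only and is not taken from the application: the `KnowledgeBase` class, the function names, and the representation of mined feature data as `(subject, attribute, object)` tuples are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical container: concept set, attribute set, and mined facts."""
    concepts: set = field(default_factory=set)
    attributes: set = field(default_factory=set)
    facts: list = field(default_factory=list)  # (subject, attribute, object)


def preprocess(raw_records):
    # S101: assumed preprocessing rule, trim whitespace and drop empty records.
    return [r.strip() for r in raw_records if r and r.strip()]


def seed_knowledge_base(preset_concepts, preset_attributes):
    # S102: start from the expert-defined core concepts and attributes.
    return KnowledgeBase(set(preset_concepts), set(preset_attributes))


def expand(kb, mined_facts):
    # S103: add mined facts; unseen attribute names extend the attribute set.
    for subj, attr, obj in mined_facts:
        kb.attributes.add(attr)
        kb.facts.append((subj, attr, obj))
    return kb


def build_graph(kb):
    # S104: entities become nodes, entity associations become labelled edges.
    nodes = {s for s, _, _ in kb.facts} | {o for _, _, o in kb.facts}
    return nodes, list(kb.facts)
```

For example, seeding with the attribute `dependency` and then expanding with a mined `plug-in` fact leaves both relation names in `kb.attributes`, which mirrors how step S103 grows the knowledge base before the graph is built.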
Further, in the embodiments of the present application, collecting the sample software information includes, but is not limited to:
collecting a sample software repository, software installation package names, directory structure information, construction information and program running information. The construction (build) information comprises the information contained in collected build files of formats such as pom.xml, build.gradle, build.sbt and build.xml.
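As an illustration of the collection step, the sketch below walks a deployment directory and lists the build files named in the text. The function name `collect_build_files` and the choice to return paths relative to the scanned root are assumptions for this example, not details from the application.

```python
import os

# Build file names mentioned in the description.
BUILD_FILES = {"pom.xml", "build.gradle", "build.sbt", "build.xml"}


def collect_build_files(root):
    """Return sorted paths (relative to root) of known build files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name in BUILD_FILES:
                full_path = os.path.join(dirpath, name)
                found.append(os.path.relpath(full_path, root))
    return sorted(found)
```

The relative paths also carry directory structure information, so the same walk could feed the directory analysis described later.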
Further, in this embodiment of the present application, determining the knowledge graph knowledge base according to the preprocessed data and the preset rules includes: establishing a top-level core concept set and attribute set according to the preprocessed data and the domain knowledge of software experts, so as to determine the knowledge graph knowledge base, wherein the knowledge base comprises a concept set and an attribute set of the sample software; the concept set is used as the entities for constructing the knowledge graph, and the attribute set is used as the entity associations for constructing the knowledge graph. This step should ensure the correctness of the established concepts and attributes, without placing high requirements on completeness for the time being. The established knowledge base is an RDF/OWL file that defines the core concept set and attribute set, which solves the "cold start" problem of the core ontology of the knowledge base.
Further, in this embodiment of the present application, analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base includes: analyzing and processing the software information of the sample software to expand the concept set and the attribute set of the knowledge base.
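To make the RDF/OWL seed file concrete, the following sketch serializes a core concept set and attribute set as Turtle text, declaring each concept as an `owl:Class` and each attribute as an `owl:ObjectProperty`. This is one possible encoding, assumed for illustration; the application does not specify the serialization, and the `ex:` namespace URI and the function name are hypothetical. Names are assumed to be valid Turtle local names (no spaces).

```python
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "@prefix ex: <http://example.org/software#> .\n"  # hypothetical namespace
)


def seed_ontology_turtle(concepts, attributes):
    """Render the core concept and attribute sets as a small Turtle document."""
    lines = [PREFIXES]
    for concept in sorted(concepts):
        lines.append(f"ex:{concept} rdf:type owl:Class .")
    for attribute in sorted(attributes):
        lines.append(f"ex:{attribute} rdf:type owl:ObjectProperty .")
    return "\n".join(lines) + "\n"
```

Because the seed only needs to be correct rather than complete, a handful of classes and properties such as `Software`, `Version` and `dependency` would be enough to start from.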
Optionally, analyzing and processing the software information of the sample software may expand the concept set and the attribute set of the knowledge graph knowledge base in the following ways:
1) Analyzing the sample software directory structure information to obtain knowledge graph concept set elements and attribute set elements; that is, information such as the name, version and dependency relationships of the software project is extracted from the structure of the deployed files of the sample software.
2) Analyzing the sample software file format construction information to obtain knowledge graph concept set elements and attribute set elements; that is, the construction information packaged with the sample software, such as pom.xml, build.gradle, build.sbt and build.xml, is analyzed to obtain the project information, compiling information, environment information, and inheritance, aggregation and dependency relationships of the sample software.
3) Decompressing a software package in a specific format, or decompiling a software program, to obtain knowledge graph concept set elements and attribute set elements; that is, information such as the name, version and dependency relationships of the sample software project is extracted by decompressing a sample software package (for example in JAR format) or by decompiling a software program.
Wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, classification and the like of the sample software, as well as compiling information and environment information; the attribute set elements include, but are not limited to: the belonging-to, containing, plug-in, inheritance, aggregation and dependency relationships.
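For approach 2), a minimal sketch of reading a Maven pom.xml is given below. It uses the standard POM namespace and element names (`artifactId`, `version`, `packaging`, `parent`, `modules`, `dependencies`); the function name `parse_pom`, the returned dictionary keys, and the mapping of `parent` to "inheritance" and `modules` to "aggregation" are assumptions for this example. Property interpolation and values inherited from the parent POM are not resolved.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def _child_text(elem, tag):
    # Text of a direct child element, or None when it is absent or empty.
    node = elem.find(f"m:{tag}", NS)
    return node.text.strip() if node is not None and node.text else None


def parse_pom(pom_text):
    """Return (project info, list of (subject, attribute, object) facts)."""
    root = ET.fromstring(pom_text)
    project = {
        "name": _child_text(root, "artifactId"),
        "version": _child_text(root, "version"),
        # Maven treats a missing <packaging> as "jar".
        "packaging": _child_text(root, "packaging") or "jar",
    }
    facts = []
    parent = root.find("m:parent", NS)
    if parent is not None:
        facts.append((project["name"], "inheritance", _child_text(parent, "artifactId")))
    for module in root.findall("m:modules/m:module", NS):
        facts.append((project["name"], "aggregation", module.text.strip()))
    for dep in root.findall("m:dependencies/m:dependency", NS):
        facts.append((project["name"], "dependency", _child_text(dep, "artifactId")))
    return project, facts
```

The project dictionary supplies concept set elements (name, version, packaging mechanism), while the fact tuples supply attribute set elements.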
The above embodiments are mainly intended for the case in which software resources such as the source code, question-and-answer documents, and requirement/design documents of the software cannot be obtained. A series of static and dynamic analyses is performed on the file structure, construction information, decompilation results and the like of the sample software project to extract feature data and to mine new concepts and attributes. The concept set of the knowledge base constructed in the embodiment (including project information such as the project or organization, the common name of the project, the project version, the packaging mechanism, the user-facing descriptive name of the project, the official address of the development team and the classification, as well as compiling information, environment information and the like) is expanded and used as the entities of the knowledge graph to be constructed for the sample software project. The attribute set of the knowledge base (including the belonging-to, containing, plug-in, inheritance, aggregation and dependency relationships) is expanded and used as the entity associations of the knowledge graph to be constructed.
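When only a packaged JAR is available, part of this information can be read from the archive manifest. The sketch below is an assumption-laden illustration rather than the application's procedure: it reads only the main section of `META-INF/MANIFEST.MF` and relies on the optional `Implementation-Title`, `Implementation-Version` and `Class-Path` headers, which a given JAR may not contain. The function names are hypothetical.

```python
import io
import zipfile


def read_main_manifest(jar_bytes):
    """Parse the main section of a JAR manifest into a header dictionary."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    headers, last_key = {}, None
    for line in raw.splitlines():
        if not line.strip():
            if headers:
                break  # a blank line ends the main section
            continue
        if line.startswith(" ") and last_key:
            headers[last_key] += line[1:]  # continuation of the previous header
        elif ": " in line:
            last_key, value = line.split(": ", 1)
            headers[last_key] = value
    return headers


def jar_facts(jar_bytes):
    """Return (project info, dependency facts) derived from the manifest."""
    headers = read_main_manifest(jar_bytes)
    info = {
        "name": headers.get("Implementation-Title"),
        "version": headers.get("Implementation-Version"),
    }
    facts = [(info["name"], "dependency", entry)
             for entry in headers.get("Class-Path", "").split()]
    return info, facts
```

A decompiler would be needed to go further, for example to recover class-level relationships; that step is outside this sketch.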
In an alternative embodiment, the attribute set of the knowledge base described in the present application may further be expanded by: analyzing the process or directory relations of the sample software program in the running state; mining the relations of the sample software project in the running state; associating the mined running-state relations with knowledge items in the determined knowledge graph knowledge base; obtaining the hot-loading, distributed-deployment and calling dependency relationships of the sample software according to the association result; and adding the hot-loading, distributed-deployment and calling dependency relationships to the attribute set of the knowledge graph knowledge base.
Through the above embodiments, a software knowledge base comprising the concept set and the attribute set is obtained. The concept set elements are used as the entities of the knowledge graph to be constructed, and the attribute set elements are used as its entity associations; the entities then serve as nodes and the entity associations as edges, which completes the construction of the software knowledge graph disclosed by the application.
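One way to picture the running-state analysis is to inspect the command line of a running Java process and associate its classpath entries with entities already in the knowledge base. The sketch below is hypothetical throughout: the application does not say how running-state relations are mined, and the `name-version.jar` naming convention, the `runtime-dependency` label and the function names are assumptions for this example.

```python
import shlex

# Classpath options accepted by the java launcher.
CLASSPATH_FLAGS = ("-cp", "-classpath", "--class-path")


def classpath_entries(cmdline, sep=":"):
    """Return the classpath entries found on a java command line, if any."""
    tokens = shlex.split(cmdline)
    for i, tok in enumerate(tokens[:-1]):
        if tok in CLASSPATH_FLAGS:
            return [entry for entry in tokens[i + 1].split(sep) if entry]
    return []


def jar_stem(path):
    # "lib/core-2.3.jar" -> "core", assuming a name-version.jar convention.
    base = path.rsplit("/", 1)[-1]
    if base.endswith(".jar"):
        base = base[:-4]
    name, dash, version = base.rpartition("-")
    return name if dash and version[:1].isdigit() else base


def associate_runtime(project, cmdline, known_entities, sep=":"):
    """Keep only classpath entries that match entities already in the knowledge base."""
    facts = []
    for entry in classpath_entries(cmdline, sep):
        name = jar_stem(entry)
        if name in known_entities and name != project:
            facts.append((project, "runtime-dependency", name))
    return facts
```

Filtering against `known_entities` plays the role of the knowledge item association step: only relations whose endpoints are already known to the knowledge base are added to its attribute set.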
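The node-and-edge construction, and the kind of impact assessment it enables, can be sketched as follows. The tuple representation of entity associations, the function names and the breadth-first traversal are assumptions chosen for illustration, not the application's stated implementation.

```python
from collections import defaultdict, deque


def to_adjacency(facts):
    """Entities become nodes; each (subject, attribute, object) fact becomes a labelled edge."""
    graph = defaultdict(list)
    for subj, attr, obj in facts:
        graph[subj].append((attr, obj))
        graph[obj]  # touching the key registers the object as a node too
    return dict(graph)


def impacted_by(facts, target, relation="dependency"):
    """Entities that depend on target directly or transitively (e.g. before an upgrade)."""
    reverse = defaultdict(set)
    for subj, attr, obj in facts:
        if attr == relation:
            reverse[obj].add(subj)
    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for dependant in reverse.get(current, ()):
            if dependant != target and dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen
```

With the facts `app -> core` and `core -> util` under the `dependency` relation, asking what is impacted by `util` returns both `core` and `app`, which is the sort of dependency retrieval the application cites as a use of the graph.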
Based on the knowledge graph construction method provided by the embodiment shown in FIG. 1, FIG. 2 shows a knowledge graph construction apparatus provided by an embodiment of the present application. As shown in FIG. 2, the apparatus includes: an acquisition module 201, a processing module 202, a storage module 203 and a construction module 204; wherein,
the acquisition module 201 is used for acquiring sample software information;
the processing module 202 is used for preprocessing the sample software information to obtain preprocessed data;
the processing module 202 is further used for determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
the processing module 202 is further used for analyzing and processing the sample software information and mining feature data to expand the knowledge graph knowledge base;
the construction module 204 is used for constructing a software knowledge graph according to the expanded knowledge graph knowledge base;
and the storage module 203 is used for storing the preset rules, the preprocessed data and the knowledge graph knowledge base.
In the embodiment of the application, the acquisition module is used for acquiring information such as a sample software repository, software installation package names, directory structure information, construction information and program running information.
In an embodiment of the application, the processing module is specifically configured to determine the knowledge graph knowledge base according to the preprocessed data and expert knowledge in the software field, where the knowledge base includes a concept set and an attribute set of the sample software;
the concept set is used as the entities for constructing the knowledge graph, and the attribute set is used as the entity associations for constructing the knowledge graph.
In this embodiment, the processing module is further configured to analyze and process the software information of the sample software to expand the concept set and the attribute set of the knowledge base.
In an embodiment of the present application, the processing module is further configured to perform at least one of:
analyzing the sample software directory structure information to obtain knowledge graph concept set elements and attribute set elements; or,
analyzing the sample software file format construction information to obtain a knowledge graph concept set element and an attribute set element; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain knowledge graph concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, classification and the like of the sample software, as well as compiling information and environment information;
the attribute set elements include, but are not limited to: the belonging-to, containing, plug-in, inheritance, aggregation and dependency relationships.
In an embodiment of the present application, the processing module is further configured to: analyze the process or directory relations of the sample software program in the running state; mine the relations of the sample software project in the running state; associate the mined running-state relations with knowledge items in the determined knowledge graph knowledge base; obtain the hot-loading, distributed-deployment and calling dependency relationships of the sample software according to the association result; and add the hot-loading, distributed-deployment and calling dependency relationships to the attribute set of the knowledge graph knowledge base.
It is to be understood that the above modules of the knowledge graph constructing apparatus in the present embodiment have functions of implementing the respective steps of the method in the embodiment shown in fig. 1. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware, and each module can be implemented independently or by integrating a plurality of modules. For the functional description of each module, reference may be specifically made to the corresponding description of the method in the embodiment shown in fig. 1, and details are not repeated here.
The embodiment of the application provides an electronic device, which comprises a processor and a memory;
a memory for storing operating instructions;
and the processor is used for executing the knowledge graph construction method provided by any embodiment of the application by calling the operation instruction.
As an example, FIG. 3 shows a schematic structural diagram of an electronic device to which an embodiment of the present application is applicable. As shown in FIG. 3, the electronic device 2000 includes: a processor 2001 and a memory 2003. The processor 2001 is coupled to the memory 2003, for example via a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that, in practical applications, the transceiver 2004 is not limited to one, and the structure of the electronic device 2000 does not constitute a limitation on the embodiments of the present application.
The processor 2001 is applied to the embodiment of the present application to implement the method shown in the above method embodiment. The transceiver 2004 may include a receiver and a transmitter, and is applied to the embodiments of the present application to enable the electronic device to communicate with other devices.
The processor 2001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 2001 may also be a combination that implements computing functions, e.g., a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The memory 2003 may be a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
Optionally, the memory 2003 is used to store the application program code for carrying out the solution of the present application, and execution of that code is controlled by the processor 2001. The processor 2001 is configured to execute the application program code stored in the memory 2003 to implement the knowledge graph construction method provided in any of the embodiments of the present application.
The electronic device provided by the embodiments of the present application is applicable to any of the above method embodiments, and details are not repeated here.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the knowledge graph construction method shown in the above method embodiments.
The computer-readable storage medium provided by the embodiments of the present application is applicable to any of the above method embodiments, and details are not repeated here.
According to the knowledge graph construction scheme provided by the embodiments of the present application, sample software information is collected and preprocessed to obtain preprocessed data; a knowledge graph knowledge base is determined according to the preprocessed data and preset rules; the sample software information is analyzed and processed, feature data is mined, and the knowledge graph knowledge base is expanded; and a software knowledge graph is constructed according to the expanded knowledge graph knowledge base. With this technical scheme, against the background of data sparsity caused by software project information resources being unobtainable, entity extraction and entity association extraction are achieved by performing diversified analysis and processing on the sample software and establishing a knowledge base, and a software project knowledge graph is thereby constructed.
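The four steps summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names (`preprocess`, `determine_knowledge_base`, `expand_knowledge_base`, `build_graph`), the dictionary layout of the sample software information, the form of the preset rules, and the relation labels are all hypothetical and introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Hypothetical container for the knowledge graph knowledge base."""
    concepts: set = field(default_factory=set)    # (field, value) pairs used as entities
    attributes: set = field(default_factory=set)  # relation labels used as entity associations


def preprocess(sample_info):
    """Step 1 (assumed form): strip string values and drop empty fields."""
    cleaned = {}
    for key, value in sample_info.items():
        if isinstance(value, str):
            value = value.strip()
        if value:  # drops None, empty strings and empty lists
            cleaned[key] = value
    return cleaned


def determine_knowledge_base(data, preset_rules):
    """Step 2: seed the knowledge base from the preprocessed data and preset rules."""
    kb = KnowledgeBase()
    for key in preset_rules["concept_fields"]:
        if key in data:
            kb.concepts.add((key, data[key]))
    kb.attributes.update(preset_rules["relations"])
    return kb


def expand_knowledge_base(kb, data):
    """Step 3: mine feature data (here, declared dependencies) to expand the knowledge base."""
    for dep in data.get("dependencies", []):
        kb.concepts.add(("name", dep))
        kb.attributes.add("dependency")
    return kb


def build_graph(kb, data):
    """Step 4: emit (head, relation, tail) triples from the expanded knowledge base."""
    triples = []
    if "dependency" in kb.attributes:
        for dep in data.get("dependencies", []):
            if ("name", dep) in kb.concepts:
                triples.append((data["name"], "dependency", dep))
    return sorted(triples)


# Usage with made-up sample software information.
sample = {"name": " demo-app ", "version": "1.0", "homepage": "",
          "dependencies": ["libfoo", "libbar"]}
rules = {"concept_fields": ["name", "version"], "relations": ["belonging"]}

data = preprocess(sample)
kb = expand_knowledge_base(determine_knowledge_base(data, rules), data)
graph = build_graph(kb, data)
print(graph)
```

Keeping the knowledge base separate from the graph mirrors the order of the steps: the knowledge base can be expanded repeatedly before any triples are produced.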
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not restricted to the exact order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.
Claims (10)
1. A method of knowledge graph construction, the method comprising:
collecting sample software information and preprocessing the sample software information to obtain preprocessed data;
determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
analyzing and processing the sample software information, mining feature data, and expanding the knowledge graph knowledge base;
and constructing a software knowledge graph according to the expanded knowledge graph knowledge base.
2. The method of knowledge-graph construction according to claim 1, wherein the collected sample software information includes but is not limited to:
a sample software repository, a software installation package name, directory structure information, build information, and program running information.
3. The method of knowledge-graph construction according to claim 2, wherein determining the knowledge graph knowledge base according to the preprocessed data and the preset rules comprises:
determining the knowledge graph knowledge base according to the preprocessed data and expert knowledge in the software field, wherein the knowledge base comprises a concept set and an attribute set of the sample software;
the concept set is used as the entities for constructing the knowledge graph, and the attribute set is used as the entity associations for constructing the knowledge graph.
4. The method of knowledge-graph construction according to claim 3, wherein the analyzing and processing of the sample software information, mining of feature data, and expanding of the knowledge graph knowledge base comprises:
analyzing and processing the software information of the sample software to expand the concept set and the attribute set of the knowledge graph knowledge base.
5. The method of knowledge-graph construction according to claim 4, wherein said analyzing and processing the software information of the sample software to expand the concept set and the attribute set of the knowledge graph knowledge base comprises at least one of:
analyzing the directory structure information of the sample software to obtain knowledge graph concept set elements and attribute set elements; or,
analyzing the file format and build information of the sample software to obtain knowledge graph concept set elements and attribute set elements; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain knowledge graph concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-described software name, official address of the development team, and classification of the sample software, as well as compiling information and environment information;
the attribute set elements include, but are not limited to: belonging, inclusion, plug-in, inheritance, aggregation, and dependency.
6. The method of knowledge-graph construction according to any one of claims 3 to 5, wherein the method further comprises:
analyzing the process or directory relationships of the sample software program in its running state;
mining the relationships of the sample software project in the running state;
performing knowledge item association between the mined running-state relationships and the determined knowledge graph knowledge base;
acquiring the hot-loading, distributed program deployment, and calling dependency relationships of the sample software according to the association result;
and adding the hot-loading, distributed program deployment, and calling dependency relationships to the attribute set of the knowledge graph knowledge base.
7. An apparatus for knowledge-graph construction, the apparatus comprising: an acquisition module, a processing module, a storage module and a construction module; wherein,
the acquisition module is used for collecting sample software information;
the processing module is used for preprocessing the sample software information to obtain preprocessed data;
the processing module is further used for determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
the processing module is further used for analyzing and processing the sample software information, mining feature data, and expanding the knowledge graph knowledge base;
the construction module is used for constructing a software knowledge graph according to the expanded knowledge graph knowledge base;
the storage module is used for storing the preset rules, the preprocessed data and the knowledge graph knowledge base.
8. The apparatus of claim 7, wherein the processing module is configured to determine the knowledge graph knowledge base according to the preprocessed data and expert knowledge in the software field, the knowledge base comprising a concept set and an attribute set of the sample software;
the concept set is used as the entities for constructing the knowledge graph, and the attribute set is used as the entity associations for constructing the knowledge graph.
9. An electronic device comprising a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the method of any one of claims 1-6 by calling the operation instruction.
10. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, carries out the method of any one of claims 1-6.
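As an illustration of the directory-structure analysis recited in claim 5, the sketch below derives concept set elements (name, version) and attribute set elements (relation triples) from a directory tree. It is a hedged example only: the function `analyze_directory`, the `<name>-<version>` directory naming convention, the `plugins/` folder convention, and the relation labels `includes` and `plug-in` (standing in for the inclusion and plug-in relations) are assumptions made for this example and are not specified by the application.

```python
import re
import tempfile
from pathlib import Path


def analyze_directory(root):
    """Return (concepts, relations) derived from the layout of a software directory."""
    # Assumed convention: the top-level directory is named "<name>-<version>".
    match = re.fullmatch(r"(?P<name>.+)-(?P<version>\d+(?:\.\d+)*)", root.name)
    name = match.group("name") if match else root.name
    concepts = {"name": name}
    if match:
        concepts["version"] = match.group("version")

    relations = []
    for child in sorted(root.iterdir()):
        if not child.is_dir():
            continue
        if child.name == "plugins":
            # Assumed convention: every entry under plugins/ is a plug-in of the software.
            for plugin in sorted(child.iterdir()):
                relations.append((name, "plug-in", plugin.stem))
        else:
            # Any other sub-directory is treated as a component the software includes.
            relations.append((name, "includes", child.name))
    return concepts, relations


# Usage on a throwaway directory tree built only for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "demo-app-1.2.0"
    (root / "plugins").mkdir(parents=True)
    (root / "plugins" / "export.so").touch()
    (root / "core").mkdir()
    concepts, relations = analyze_directory(root)

print(concepts)
print(relations)
```

The returned concepts would feed the concept set and the triples would feed the attribute set; analysis of build files or decompiled packages could return results in the same shape.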
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010695185.8A CN111914098A (en) | 2020-07-19 | 2020-07-19 | Knowledge graph construction method and device, electronic equipment and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010695185.8A CN111914098A (en) | 2020-07-19 | 2020-07-19 | Knowledge graph construction method and device, electronic equipment and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111914098A (en) | 2020-11-10 |
Family
ID=73281659
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010695185.8A Pending CN111914098A (en) | 2020-07-19 | 2020-07-19 | Knowledge graph construction method and device, electronic equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111914098A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105760495A (en) * | 2016-02-17 | 2016-07-13 | 扬州大学 | Method for carrying out exploratory search for bug problem based on knowledge map |
CN107943874A (en) * | 2017-11-13 | 2018-04-20 | 平安科技(深圳)有限公司 | Knowledge mapping processing method, device, computer equipment and storage medium |
CN108196880A (en) * | 2017-12-11 | 2018-06-22 | 北京大学 | Software project knowledge mapping method for automatically constructing and system |
CN108959433A (en) * | 2018-06-11 | 2018-12-07 | 北京大学 | A kind of method and system extracting knowledge mapping and question and answer from software project data |
CN110287704A (en) * | 2019-06-25 | 2019-09-27 | 北京中科微澜科技有限公司 | A kind of loophole software dependence construction method based on loophole map |
Non-Patent Citations (2)
Title |
---|
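Li Wenpeng et al.: "Software knowledge graph construction method for open source software projects" (面向开源软件项目的软件知识图谱构建方法), Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 31 October 2016 (2016-10-31), pages 851 - 862 * |
Wang Fei et al.: "Research on code knowledge graph construction and intelligent software development methods" (代码知识图谱构建及智能化软件开发方法研究), Journal of Software (软件学报), 6 November 2019 (2019-11-06) * |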
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112486494A (en) * | 2020-11-16 | 2021-03-12 | 中信银行股份有限公司 | File generation method and device, electronic equipment and computer readable storage medium |
CN112632336A (en) * | 2020-12-16 | 2021-04-09 | 恩亿科(北京)数据科技有限公司 | Method and system for processing real-time streaming graph relation |
CN113504972A (en) * | 2021-07-26 | 2021-10-15 | 京东科技控股股份有限公司 | Service deployment method and device, electronic equipment and storage medium |
CN116661768A (en) * | 2023-07-25 | 2023-08-29 | 苏州浮木云科技有限公司 | Knowledge graph-based page code generation method, system, device and medium |
CN116661768B (en) * | 2023-07-25 | 2023-12-29 | 苏州浮木云科技有限公司 | Knowledge graph-based page code generation method, system, device and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111914098A (en) | Knowledge graph construction method and device, electronic equipment and readable storage medium | |
CN108270629B (en) | Website visitor behavior monitoring method and device | |
CN110704064B (en) | Method and device for compiling and executing intelligent contract | |
WO2008156969A1 (en) | Discoscript: a simplified distributed computing scripting language | |
CN103778373A (en) | Virus detection method and device | |
US11074079B2 (en) | Event handling instruction processing | |
CN112579146A (en) | Interface change detection method and device | |
Groth et al. | A model of process documentation to determine provenance in mash-ups | |
CN105630656A (en) | Log model based system robustness analysis method and apparatus | |
CN105824647A (en) | Form page generating method and device | |
CN113495728A (en) | Dependency relationship determination method, dependency relationship determination device, electronic equipment and medium | |
CN111258905A (en) | Defect positioning method and device, electronic equipment and computer readable storage medium | |
CN113434582A (en) | Service data processing method and device, computer equipment and storage medium | |
CN113901169A (en) | Information processing method, information processing device, electronic equipment and storage medium | |
CN112181479A (en) | Method and device for determining difference between code file versions and electronic equipment | |
CN116483888A (en) | Program evaluation method and device, electronic equipment and computer readable storage medium | |
CN104408198A (en) | Method and device for acquiring webpage contents | |
CN113656044B (en) | Android installation package compression method and device, computer equipment and storage medium | |
CN113296834B (en) | Android closed source service type information extraction method based on reverse engineering | |
CN113282541B (en) | File calling method and device and electronic equipment | |
CN116578282A (en) | Code generation method, device, electronic equipment and medium | |
CN113934405A (en) | Plug-in processing method, device, equipment, storage medium and computer program product | |
CN112199080A (en) | Webpack construction method and equipment for vuejs project | |
CN112181825A (en) | Test case library construction method and device, electronic equipment and medium | |
CN116700840B (en) | File execution method, device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||