
CN111914098A - Knowledge graph construction method and device, electronic equipment and readable storage medium - Google Patents

Knowledge graph construction method and device, electronic equipment and readable storage medium

Info

Publication number
CN111914098A
Authority
CN
China
Prior art keywords
knowledge
software
information
graph
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010695185.8A
Other languages
Chinese (zh)
Inventor
熊龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Citic Bank Corp Ltd
Original Assignee
China Citic Bank Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Citic Bank Corp Ltd
Priority to CN202010695185.8A
Publication of CN111914098A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/40: Transformation of program code
    • G06F8/53: Decompilation; Disassembly
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00: Arrangements for software engineering
    • G06F8/60: Software deployment
    • G06F8/61: Installation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Stored Programmes (AREA)

Abstract

The application relates to the technical field of information processing, and in particular to a knowledge graph construction method and device, an electronic device, and a readable storage medium. The method comprises: collecting sample software information and preprocessing it to obtain preprocessed data; determining a knowledge base for the knowledge graph according to the preprocessed data and preset rules; analyzing the sample software information and mining feature data to expand the knowledge base; and constructing a software knowledge graph from the expanded knowledge base. By applying several kinds of analysis to the sample software, the method builds a software project knowledge graph even when data are sparse because the project's information resources cannot be obtained. The resulting graph can help software developers with software impact assessment, dependency retrieval, code generation or verification, and similar tasks, improving development efficiency and reducing problems such as software version conflicts.

Description

Knowledge graph construction method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of information processing technologies, and in particular, to a method and an apparatus for constructing a knowledge graph, an electronic device, and a readable storage medium.
Background
A knowledge graph (also known as knowledge domain visualization or knowledge domain mapping) is a family of graphs that display the development process and structural relationships of knowledge. It uses visualization techniques to describe knowledge resources and their carriers, and to mine, analyze, construct, draw, and display knowledge and the relationships among its elements. In the software field, a software knowledge graph is built mainly from the software resources of a project, such as source code, question-and-answer documents, and requirement/design documents, and the resulting graph is mainly oriented to the code level. In industries such as finance, however, most software is developed by third-party commercial companies, and these resources cannot be obtained. As a result, packages, classes, interfaces, methods, attributes, exceptions, method parameters, return values, and the relationships among these elements cannot be extracted, and a knowledge graph of the software project cannot be built. In addition, a code-level knowledge graph cannot be used to assess impact relationships such as those caused by software version upgrades.
Disclosure of Invention
The present application aims to solve at least one of the above technical drawbacks. The technical scheme adopted by the application is as follows:
In a first aspect, an embodiment of the present application provides a knowledge graph construction method, the method including:
collecting sample software information and preprocessing it to obtain preprocessed data;
determining a knowledge base for the knowledge graph according to the preprocessed data and preset rules;
analyzing the sample software information and mining feature data to expand the knowledge base;
and constructing a software knowledge graph from the expanded knowledge base.
Optionally, the collected sample software information includes, but is not limited to:
a sample software repository, software installation package names, directory structure information, build information, and program runtime information.
Optionally, determining the knowledge base according to the preprocessed data and preset rules includes:
determining the knowledge base of the knowledge graph according to the preprocessed data and expert knowledge of the software field, wherein the knowledge base comprises a concept set and an attribute set of the sample software;
the concept set provides the entities of the knowledge graph, and the attribute set provides the associations between those entities.
Optionally, analyzing the sample software information and mining feature data to expand the knowledge base includes:
analyzing the software information of the sample software to expand the concept set and the attribute set of the knowledge base.
Optionally, analyzing the software information of the sample software to expand the concept set and the attribute set of the knowledge base includes at least one of:
analyzing the directory structure information of the sample software to obtain concept set elements and attribute set elements; or,
analyzing the build information contained in build files of the sample software to obtain concept set elements and attribute set elements; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, and classification of the sample software, as well as compilation information and environment information;
the attribute set elements include, but are not limited to: belongs-to, contains, plug-in, inheritance, aggregation, and dependency relationships.
Optionally, the method further comprises:
analyzing the process or directory relationships of the sample software program in its running state;
mining the relationships of the sample software project in the running state;
associating the mined runtime relationships with the knowledge items of the determined knowledge base;
obtaining the hot-loading, distributed deployment, and call dependency relationships of the sample software according to the association result;
and adding the hot-loading, distributed deployment, and call dependency relationships to the attribute set of the knowledge base.
In a second aspect, an embodiment of the present application provides a knowledge graph construction apparatus, including an acquisition module, a processing module, a storage module, and a construction module, wherein:
the acquisition module is used for collecting sample software information;
the processing module is used for preprocessing the sample software information to obtain preprocessed data;
the processing module is further used for determining a knowledge base for the knowledge graph according to the preprocessed data and preset rules;
the processing module is further used for analyzing the sample software information and mining feature data to expand the knowledge base;
the construction module is used for constructing a software knowledge graph from the expanded knowledge base;
the storage module is used for storing the preset rules, the preprocessed data, and the knowledge base of the knowledge graph.
Optionally, the processing module is specifically configured to determine the knowledge base of the knowledge graph according to the preprocessed data and expert knowledge of the software field, where the knowledge base includes a concept set and an attribute set of the sample software;
the concept set provides the entities of the knowledge graph, and the attribute set provides the associations between those entities.
Optionally, the processing module is further configured to perform at least one of:
analyzing the directory structure information of the sample software to obtain concept set elements and attribute set elements; or,
analyzing the build information contained in build files of the sample software to obtain concept set elements and attribute set elements; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, and classification of the sample software, as well as compilation information and environment information;
the attribute set elements include, but are not limited to: belongs-to, contains, plug-in, inheritance, aggregation, and dependency relationships.
Optionally, the processing module is further configured to perform:
analyzing the process or directory relationships of the sample software program in its running state;
mining the relationships of the sample software project in the running state;
associating the mined runtime relationships with the knowledge items of the determined knowledge base;
obtaining the hot-loading, distributed deployment, and call dependency relationships of the sample software according to the association result;
and adding the hot-loading, distributed deployment, and call dependency relationships to the attribute set of the knowledge base.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the above knowledge graph construction method by calling the operation instructions.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above knowledge graph construction method.
The technical solution provided by the embodiments of the present application has the following beneficial effects. The knowledge graph construction method comprises: collecting sample software information and preprocessing it to obtain preprocessed data; determining a knowledge base for the knowledge graph according to the preprocessed data and preset rules; analyzing the sample software information and mining feature data to expand the knowledge base; and constructing a software knowledge graph from the expanded knowledge base. Even when data are sparse because the information resources of the software project cannot be obtained, the solution extracts entities and entity associations by applying several kinds of analysis to the sample software and establishing a knowledge base, and thereby builds a software project knowledge graph.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
FIG. 1 is a schematic flow chart diagram of a knowledge graph construction method provided in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a knowledge graph constructing apparatus according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any one of the associated listed items and all combinations of one or more of them.
The embodiments of the present application relate to knowledge graph construction for software projects, and provide a knowledge graph construction method that addresses the difficulty, described in the Background, of building a knowledge graph for a software project. The technical solutions of the present application, and how they solve the above technical problems, are described below through specific embodiments with reference to the accompanying drawings. The specific embodiments may be combined with one another, and details of the same or similar concepts or processes may not be repeated in every embodiment.
To make the objects, technical solutions, and advantages of the present application clearer, FIG. 1 shows a flowchart of the knowledge graph construction method provided by an embodiment of the present application. As shown in FIG. 1, the method includes:
S101, collecting sample software information and preprocessing it to obtain preprocessed data;
S102, determining a knowledge base for the knowledge graph according to the preprocessed data and preset rules;
S103, analyzing the sample software information and mining feature data to expand the knowledge base;
S104, constructing a software knowledge graph from the expanded knowledge base.
Further, in the embodiments of the present application, the collected sample software information includes, but is not limited to:
a sample software repository, software installation package names, directory structure information, build information, and program runtime information. The build information is the information contained in collected build files such as pom.xml, build.gradle, build.sbt, and build.xml.
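The collection step can be illustrated with a minimal Python sketch. It assumes the sample software is available as a deployment directory on disk; the function names, the result layout, and the `<name>-<version>.<ext>` package naming convention are illustrative assumptions, not part of the application. Only the build file names come from the text above.

```python
import os
import re

# Build file names listed in the description.
BUILD_FILES = {"pom.xml", "build.gradle", "build.sbt", "build.xml"}

# Assumed naming convention: <name>-<version>.<ext>, e.g. "commons-io-2.6.jar".
PACKAGE_RE = re.compile(
    r"^(?P<name>.+?)-(?P<version>\d+(?:\.\d+)*)\.(?:jar|war|zip)$"
)


def classify_paths(paths):
    """Sort relative file paths into build files and versioned packages."""
    info = {"build_files": [], "packages": []}
    for path in paths:
        filename = path.replace("\\", "/").rsplit("/", 1)[-1]
        if filename in BUILD_FILES:
            info["build_files"].append(path)
        match = PACKAGE_RE.match(filename)
        if match:
            info["packages"].append(
                {"path": path, "name": match["name"], "version": match["version"]}
            )
    return info


def collect_sample_info(root):
    """Walk a deployment directory and classify every file found under it."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            paths.append(os.path.relpath(full, root))
    return classify_paths(sorted(paths))
```

Splitting the pure classification from the directory walk keeps the naming heuristic easy to test and to replace when packages follow a different convention.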
Further, in this embodiment of the present application, determining the knowledge base according to the preprocessed data and the preset rules includes: establishing a top-level core concept set and attribute set from the preprocessed data and the domain knowledge of software experts, thereby determining the knowledge base of the knowledge graph. The knowledge base includes a concept set and an attribute set of the sample software; the concept set provides the entities of the knowledge graph, and the attribute set provides the associations between those entities. This step should guarantee the correctness of the concepts and attributes it establishes, without yet demanding completeness. The resulting knowledge base is an RDF/OWL file that defines the core concept set and attribute set, which solves the "cold start" problem of the knowledge base's core ontology.
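As a rough illustration of such a seed knowledge base, the sketch below serializes a small concept set and attribute set as OWL declarations in Turtle syntax. The namespace `http://example.org/skg#`, the class names, and the property names are hypothetical stand-ins chosen for this example; the application does not specify them. The output is built by plain string formatting rather than an RDF library, so it only covers this simple declaration pattern.

```python
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/skg#"  # hypothetical namespace for the sketch

# Hypothetical identifiers loosely following the concept and attribute
# sets named in the description.
CORE_CONCEPTS = ["Project", "Version", "PackagingMechanism", "BuildInfo", "Environment"]
CORE_ATTRIBUTES = [
    "belongsTo", "contains", "pluginOf",
    "inheritsFrom", "aggregates", "dependsOn",
]


def to_turtle(concepts, attributes):
    """Declare each concept as an owl:Class and each attribute as an
    owl:ObjectProperty, returning the result as Turtle text."""
    lines = [f"@prefix owl: <{OWL}> .", f"@prefix skg: <{BASE}> .", ""]
    lines += [f"skg:{name} a owl:Class ." for name in concepts]
    lines += [f"skg:{name} a owl:ObjectProperty ." for name in attributes]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_turtle(CORE_CONCEPTS, CORE_ATTRIBUTES))
```

Starting from a small, hand-checked seed like this matches the stated priority of correctness over completeness; later analysis steps can append further declarations.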
Further, in this embodiment of the present application, analyzing the sample software information and mining feature data to expand the knowledge base includes: analyzing the software information of the sample software to expand the concept set and the attribute set of the knowledge base.
Optionally, the software information of the sample software may be analyzed to expand the concept set and the attribute set of the knowledge base in the following ways:
1) Analyzing the directory structure information of the sample software to obtain concept set elements and attribute set elements; that is, extracting information such as the name, version, and dependency relationships of the software project from the file structure of the deployed sample software.
2) Analyzing the build information contained in build files of the sample software to obtain concept set elements and attribute set elements; that is, parsing the build files of a sample software package, such as pom.xml, build.gradle, build.sbt, and build.xml, to obtain the project information, compilation information, environment information, and inheritance, aggregation, and dependency relationships of the sample software.
3) Decompressing a software package in a specific format, or decompiling a software program, to obtain concept set elements and attribute set elements; that is, extracting information such as the name, version, and dependency relationships of the sample software project by decompressing a sample software package (for example, one in JAR format) or decompiling the software program.
The concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, and classification of the sample software, as well as compilation information and environment information. The attribute set elements include, but are not limited to: belongs-to, contains, plug-in, inheritance, aggregation, and dependency relationships.
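For build files, a minimal sketch of the pom.xml case is shown below, using Python's standard `xml.etree.ElementTree`. It assumes the POM declares the usual Maven namespace `http://maven.apache.org/POM/4.0.0`, and it maps Maven's `<parent>`, `<modules>`, and `<dependencies>` elements to the inheritance, aggregation, and dependency relations named above. The function name and the returned tuple layout are illustrative choices, not the application's format.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace; POMs without it are not handled here.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def _text(node, tag):
    """Return the stripped text of a direct child element, or None."""
    child = node.find(f"m:{tag}", NS)
    return child.text.strip() if child is not None and child.text else None


def parse_pom(xml_text):
    """Extract project facts and (relation, target) pairs from a POM."""
    root = ET.fromstring(xml_text)
    project = {
        key: _text(root, key)
        for key in ("groupId", "artifactId", "version", "packaging")
    }
    relations = []
    parent = root.find("m:parent", NS)
    if parent is not None:
        relations.append(("inheritance", _text(parent, "artifactId")))
    for module in root.findall("m:modules/m:module", NS):
        relations.append(("aggregation", (module.text or "").strip()))
    for dep in root.findall("m:dependencies/m:dependency", NS):
        relations.append(("dependency", _text(dep, "artifactId")))
    return project, relations
```

The project dictionary would feed the concept set, while each `(relation, target)` pair, combined with the project's own `artifactId`, yields one candidate entity association.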
The above embodiments mainly apply when software resources such as source code, question-and-answer documents, and requirement/design documents cannot be obtained. In that case, a series of static and dynamic analyses is performed on the file structure, build information, decompilation results, and so on of the sample software project to extract feature data and discover new concepts and attributes. The discovered concepts expand the knowledge base concept set (project or organization, common project name, project version, packaging mechanism, user-facing descriptive name, official address of the development team, classification and other project information, compilation information, environment information, and so on), which provides the entities of the knowledge graph to be constructed for the sample software project. The discovered attributes expand the knowledge base attribute set (belongs-to, contains, plug-in, inheritance, aggregation, and dependency relationships), which provides the entity associations of that knowledge graph.
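One form of static analysis that needs no source code is reading the manifest inside a JAR package. The sketch below uses Python's standard `zipfile` module and reads only the main section of `META-INF/MANIFEST.MF`. Treating `Implementation-Title`, `Implementation-Version`, and `Class-Path` as the source of name, version, and dependency facts is an assumption made for illustration; real packages may omit these headers, and the function names are hypothetical.

```python
import io
import zipfile


def read_manifest(jar):
    """Read the main section of META-INF/MANIFEST.MF from a JAR.

    `jar` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as archive:
        raw = archive.read("META-INF/MANIFEST.MF").decode("utf-8")
    headers, last = {}, None
    for line in raw.splitlines():
        if not line:
            break  # a blank line ends the main section
        if line.startswith(" ") and last is not None:
            headers[last] += line[1:]  # continuation of the previous header
        elif ": " in line:
            last, value = line.split(": ", 1)
            headers[last] = value
    return headers


def manifest_to_facts(headers):
    """Map selected manifest headers to candidate knowledge base facts."""
    return {
        "name": headers.get("Implementation-Title"),
        "version": headers.get("Implementation-Version"),
        "depends_on": headers.get("Class-Path", "").split(),
    }
```

Facts recovered this way can be cross-checked against those parsed from directory names or build files before they are added to the knowledge base.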
In an alternative embodiment, the attribute set of the knowledge base may also be expanded as follows: analyzing the process or directory relationships of the sample software program in its running state; mining the relationships of the sample software project in the running state; associating the mined runtime relationships with the knowledge items of the determined knowledge base; obtaining the hot-loading, distributed deployment, and call dependency relationships of the sample software according to the association result; and adding the hot-loading, distributed deployment, and call dependency relationships to the attribute set of the knowledge base.
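The association step can be sketched as follows. The observation format (a dictionary per running process listing what it hot-loads, is deployed with, or calls) and the relation names `hot_loads`, `deployed_with`, and `calls` are hypothetical; the application does not say how runtime relationships are recorded. The sketch simply keeps the relations whose endpoints are both already known to the knowledge base and sets the rest aside.

```python
RUNTIME_RELATIONS = ("hot_loads", "deployed_with", "calls")  # hypothetical names


def associate_runtime_relations(observations, known_entities):
    """Match observed runtime relations against known entities.

    Returns (triples, unmatched): triples whose source and target are
    both known can extend the attribute set; the others need review.
    """
    triples, unmatched = [], []
    for obs in observations:
        source = obs["process"]
        for relation in RUNTIME_RELATIONS:
            for target in obs.get(relation, []):
                triple = (source, relation, target)
                if source in known_entities and target in known_entities:
                    triples.append(triple)
                else:
                    unmatched.append(triple)
    return triples, unmatched
```

Keeping the unmatched observations, rather than dropping them, leaves room to discover entities that static analysis missed.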
The above embodiments yield a software knowledge base comprising a concept set and an attribute set. The concept set elements serve as the entities of the knowledge graph to be constructed, and the attribute set elements serve as its entity associations. Taking the entities as nodes and the entity associations as edges then completes the construction of the software knowledge graph disclosed in the present application.
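A minimal sketch of this final step, under the assumption that the knowledge base has already been reduced to a set of entity names and a list of `(source, relation, target)` triples, is given below. The in-memory dictionary layout and the function names are illustrative; a production system would more likely load the triples into a graph database. The second function shows one use mentioned in the abstract, impact assessment, as a breadth-first search over reversed dependency edges.

```python
from collections import defaultdict, deque


def build_graph(entities, triples):
    """Build a graph with entities as nodes and associations as edges."""
    graph = {"nodes": set(entities), "edges": defaultdict(list)}
    for source, relation, target in triples:
        graph["nodes"].update((source, target))
        graph["edges"][source].append((relation, target))
    return graph


def impacted_by(graph, target, relation="dependency"):
    """Return every entity that reaches `target` through `relation` edges,
    directly or transitively (for example, what an upgrade may affect)."""
    reverse = defaultdict(list)
    for source, outgoing in graph["edges"].items():
        for rel, dest in outgoing:
            if rel == relation:
                reverse[dest].append(source)
    seen, queue = set(), deque([target])
    while queue:
        node = queue.popleft()
        for dependant in reverse[node]:
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return seen
```

For example, with edges `app -> lib-a` and `lib-a -> lib-b` of type `dependency`, `impacted_by(graph, "lib-b")` returns both `lib-a` and `app`.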
Based on the knowledge graph construction method of the embodiment shown in FIG. 1, an embodiment of the present application provides a knowledge graph construction apparatus. As shown in FIG. 2, the apparatus includes an acquisition module 201, a processing module 202, a storage module 203, and a construction module 204, wherein:
the acquisition module 201 is used for collecting sample software information;
the processing module 202 is configured to preprocess the sample software information to obtain preprocessed data;
the processing module 202 is further configured to determine a knowledge base for the knowledge graph according to the preprocessed data and preset rules;
the processing module 202 is further configured to analyze the sample software information and mine feature data to expand the knowledge base;
the construction module 204 is used for constructing a software knowledge graph from the expanded knowledge base;
and the storage module 203 is used for storing the preset rules, the preprocessed data, and the knowledge base of the knowledge graph.
In an embodiment of the present application, the acquisition module is used for collecting information such as a sample software repository, software installation package names, directory structure information, build information, and program runtime information.
In an embodiment of the present application, the processing module is specifically configured to determine the knowledge base of the knowledge graph according to the preprocessed data and expert knowledge of the software field, where the knowledge base includes a concept set and an attribute set of the sample software;
the concept set provides the entities of the knowledge graph, and the attribute set provides the associations between those entities.
In this embodiment, the processing module is further configured to analyze the software information of the sample software to expand the concept set and the attribute set of the knowledge base.
In an embodiment of the present application, the processing module is further configured to perform at least one of:
analyzing the directory structure information of the sample software to obtain concept set elements and attribute set elements; or,
analyzing the build information contained in build files of the sample software to obtain concept set elements and attribute set elements; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-facing descriptive name, official address of the development team, and classification of the sample software, as well as compilation information and environment information;
the attribute set elements include, but are not limited to: belongs-to, contains, plug-in, inheritance, aggregation, and dependency relationships.
In an embodiment of the present application, the processing module is further configured to: analyze the process or directory relationships of the sample software program in its running state; mine the relationships of the sample software project in the running state; associate the mined runtime relationships with the knowledge items of the determined knowledge base; obtain the hot-loading, distributed deployment, and call dependency relationships of the sample software according to the association result; and add the hot-loading, distributed deployment, and call dependency relationships to the attribute set of the knowledge base.
It is to be understood that the above modules of the knowledge graph constructing apparatus in the present embodiment have functions of implementing the respective steps of the method in the embodiment shown in fig. 1. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware, and each module can be implemented independently or by integrating a plurality of modules. For the functional description of each module, reference may be specifically made to the corresponding description of the method in the embodiment shown in fig. 1, and details are not repeated here.
The embodiment of the application provides an electronic device, which comprises a processor and a memory;
a memory for storing operating instructions;
and the processor is used for executing the knowledge graph construction method provided by any embodiment of the application by calling the operation instruction.
As an example, FIG. 3 shows a schematic structural diagram of an electronic device to which an embodiment of the present application is applicable. As shown in FIG. 3, the electronic device 2000 includes a processor 2001 and a memory 2003, where the processor 2001 is coupled to the memory 2003, for example via a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that, in practical applications, the number of transceivers 2004 is not limited to one, and the structure of the electronic device 2000 does not limit the embodiments of the present application.
The processor 2001 is applied to the embodiment of the present application to implement the method shown in the above method embodiment. The transceiver 2004 may include a receiver and a transmitter, and the transceiver 2004 is applied to the embodiments of the present application to implement the functions of the electronic device of the embodiments of the present application to communicate with other devices when executed.
The processor 2001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor 2001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 2002 may include a path that conveys information between the aforementioned components. The bus 2002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 2002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 3, but this does not mean only one bus or one type of bus.
The memory 2003 may be a ROM (Read-Only Memory) or another type of static storage device that can store static information and instructions; a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions; an EEPROM (Electrically Erasable Programmable Read-Only Memory); a CD-ROM (Compact Disc Read-Only Memory) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like); a magnetic disk storage medium or other magnetic storage device; or any other medium that can carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, without being limited to these.
Optionally, the memory 2003 is used for storing application program code for performing the disclosed aspects, and is controlled in execution by the processor 2001. The processor 2001 is configured to execute the application program code stored in the memory 2003 to implement the method of constructing a knowledge graph provided in any of the embodiments of the present application.
The electronic device provided by the embodiment of the application is applicable to any embodiment of the method, and is not described herein again.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the knowledge graph construction method shown in the above method embodiments.
The computer-readable storage medium provided in the embodiments of the present application is applicable to any of the embodiments of the foregoing method, and is not described herein again.
According to the knowledge graph construction scheme provided by the embodiments of the present application, sample software information is collected and preprocessed to obtain preprocessed data; a knowledge graph knowledge base is determined according to the preprocessed data and preset rules; the sample software information is analyzed and processed to mine feature data and expand the knowledge graph knowledge base; and a software knowledge graph is constructed according to the expanded knowledge graph knowledge base. Against the background of data sparseness caused by the unavailability of software project information resources, this technical scheme performs diversified analysis of the sample software and establishes a knowledge base, thereby achieving entity extraction and entity-association extraction and constructing a software project knowledge graph.
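The four steps summarized above can be illustrated with a minimal Python sketch. It is not the implementation disclosed in the application: the record fields (`name`, `version`, `packaging`), the form of the preset rules (a mapping from field name to relation name), the relation names, and all function names are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Concept set (graph entities) and attribute set (entity associations)."""
    concepts: set = field(default_factory=set)
    attributes: set = field(default_factory=set)  # (head, relation, tail) triples


def preprocess(sample_info):
    """Step 1: strip whitespace from string fields and drop empty values."""
    cleaned = []
    for record in sample_info:
        item = {k: v.strip() for k, v in record.items()
                if isinstance(v, str) and v.strip()}
        if item:
            cleaned.append(item)
    return cleaned


def determine_knowledge_base(preprocessed, rules):
    """Step 2: seed the knowledge base using hypothetical preset rules.

    `rules` maps a record field name to the relation that links the
    software name to that field's value.
    """
    kb = KnowledgeBase()
    for item in preprocessed:
        name = item.get("name")
        if name is None:
            continue
        kb.concepts.add(name)
        for field_name, relation in rules.items():
            if field_name in item:
                kb.concepts.add(item[field_name])
                kb.attributes.add((name, relation, item[field_name]))
    return kb


def expand_knowledge_base(kb, mined_triples):
    """Step 3: merge feature data mined from the sample software."""
    for head, relation, tail in mined_triples:
        kb.concepts.update((head, tail))
        kb.attributes.add((head, relation, tail))
    return kb


def construct_graph(kb):
    """Step 4: build an adjacency-list graph from the expanded knowledge base."""
    graph = {concept: [] for concept in kb.concepts}
    for head, relation, tail in kb.attributes:
        graph[head].append((relation, tail))
    for edges in graph.values():
        edges.sort()  # deterministic edge order
    return graph


if __name__ == "__main__":
    samples = [{"name": "demo-app", "version": " 1.0 ", "packaging": "jar"}]
    rules = {"version": "has_version", "packaging": "packaged_as"}
    kb = determine_knowledge_base(preprocess(samples), rules)
    kb = expand_knowledge_base(kb, [("demo-app", "depends_on", "libfoo")])
    print(construct_graph(kb))
```

Storing associations as (head, relation, tail) triples keeps the expansion step a simple set union, so repeated mining passes over the same sample do not create duplicate edges.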
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the scope of protection of the present invention.

Claims (10)

1. A method of knowledge graph construction, the method comprising:
collecting sample software information and preprocessing the sample software information to obtain preprocessed data;
determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
analyzing and processing the sample software information, mining feature data, and expanding the knowledge graph knowledge base;
and constructing a software knowledge graph according to the expanded knowledge graph knowledge base.
2. The method of knowledge graph construction according to claim 1, wherein collecting the sample software information includes, but is not limited to:
collecting a sample software repository, a software installation package name, directory structure information, build information, and program runtime information.
3. The method for constructing a knowledge graph according to claim 2, wherein determining a knowledge graph knowledge base according to the preprocessed data and the preset rules comprises:
determining a knowledge base of a knowledge graph according to the preprocessed data and expert knowledge in the software field, wherein the knowledge base comprises a concept set and an attribute set of sample software;
the concept set serves as the entities for constructing the knowledge graph, and the attribute set serves as the associations between entities for constructing the knowledge graph.
4. The knowledge graph construction method according to claim 3, wherein analyzing and processing the sample software information, mining feature data, and expanding the knowledge graph knowledge base comprises:
analyzing and processing the software information of the sample software to expand the concept set and the attribute set of the knowledge graph knowledge base.
5. The method of knowledge-graph construction according to claim 4, wherein said analyzing the software information of the sample software to expand the concept sets and attribute sets of the knowledge-graph knowledge base comprises at least one of:
analyzing the sample software directory structure information to obtain knowledge graph concept set elements and attribute set elements; or,
analyzing the sample software file format construction information to obtain a knowledge graph concept set element and an attribute set element; or,
decompressing a software package in a specific format, or decompiling a software program, to obtain knowledge graph concept set elements and attribute set elements;
wherein the concept set elements include, but are not limited to: the name, version, packaging mechanism, user-described software name, official address of the development team, and classification of the sample software, as well as compilation information and environment information;
the attribute set elements include, but are not limited to: belongs-to, includes, plug-in, inheritance, aggregation, and dependency relationships.
6. The method of knowledge-graph construction according to any one of claims 3 to 5, wherein the method further comprises:
analyzing the process or directory relationships of the sample software program in its running state;
mining the relationships of the sample software project in the running state;
associating the relationships mined in the running state with knowledge items in the determined knowledge graph knowledge base;
acquiring the hot-loading, distributed program deployment, and calling dependency relationships of the sample software according to the association result;
and adding the hot-loading, distributed program deployment, and calling dependency relationships to the attribute set of the knowledge graph knowledge base.
7. An apparatus for knowledge graph construction, the apparatus comprising: an acquisition module, a processing module, a storage module, and a construction module; wherein,
the acquisition module is used for acquiring sample software information;
the processing module is used for preprocessing the sample software information to obtain preprocessed data;
the processing module is further used for determining a knowledge graph knowledge base according to the preprocessed data and preset rules;
the processing module is further used for analyzing and processing the sample software information, mining feature data, and expanding the knowledge graph knowledge base;
the construction module is used for constructing a software knowledge graph according to the expanded knowledge graph knowledge base;
the storage module is used for storing the preset rules, the preprocessed data, and the knowledge graph knowledge base.
8. The apparatus of claim 7, wherein the processing module is configured to determine a knowledge base of the knowledge graph based on the preprocessed data and the software domain expert knowledge, the knowledge base comprising a concept set and an attribute set of the sample software;
the concept set serves as the entities for constructing the knowledge graph, and the attribute set serves as the associations between entities for constructing the knowledge graph.
9. An electronic device comprising a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the method of any one of claims 1-6 by calling the operation instructions.
10. A computer-readable storage medium, wherein the storage medium has stored thereon a computer program which, when executed by a processor, carries out the method of any one of claims 1-6.
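As an illustration of the directory-structure analysis recited in claim 5, the following Python sketch derives concept set elements and attribute set elements from a listing of relative paths (for example, the entries of a decompressed package). The function name, the choice of treating every directory and file as a concept, and the relation labels `belongs_to` and `includes` are assumptions made for this sketch; the claims do not prescribe a concrete data model.

```python
from pathlib import PurePosixPath


def analyze_directory_structure(software_name, relative_paths):
    """Derive concept-set and attribute-set elements from a directory listing.

    Every directory and file becomes a concept. Each top-level entry yields
    a "belongs_to" triple pointing at the software itself, and each
    parent/child pair yields an "includes" triple.
    """
    concepts = {software_name}
    attributes = set()
    for raw in relative_paths:
        parts = PurePosixPath(raw).parts
        for depth in range(1, len(parts) + 1):
            node = str(PurePosixPath(*parts[:depth]))
            concepts.add(node)
            if depth == 1:
                attributes.add((node, "belongs_to", software_name))
            else:
                parent = str(PurePosixPath(*parts[:depth - 1]))
                attributes.add((parent, "includes", node))
    return concepts, attributes


if __name__ == "__main__":
    listing = ["lib/core.jar", "plugins/report/plugin.xml"]
    concepts, attributes = analyze_directory_structure("demo-app", listing)
    for triple in sorted(attributes):
        print(triple)
```

Working on path strings rather than the live file system keeps the sketch usable for archive listings as well as for unpacked directories.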

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010695185.8A CN111914098A (en) 2020-07-19 2020-07-19 Knowledge graph construction method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010695185.8A CN111914098A (en) 2020-07-19 2020-07-19 Knowledge graph construction method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN111914098A (en) 2020-11-10

Family

ID=73281659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010695185.8A Pending CN111914098A (en) 2020-07-19 2020-07-19 Knowledge graph construction method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN111914098A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105760495A (en) * 2016-02-17 2016-07-13 扬州大学 Method for carrying out exploratory search for bug problem based on knowledge map
CN107943874A (en) * 2017-11-13 2018-04-20 平安科技(深圳)有限公司 Knowledge mapping processing method, device, computer equipment and storage medium
CN108196880A (en) * 2017-12-11 2018-06-22 北京大学 Software project knowledge mapping method for automatically constructing and system
CN108959433A (en) * 2018-06-11 2018-12-07 北京大学 A kind of method and system extracting knowledge mapping and question and answer from software project data
CN110287704A (en) * 2019-06-25 2019-09-27 北京中科微澜科技有限公司 A kind of loophole software dependence construction method based on loophole map

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
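Li Wenpeng (李文鹏) et al.: "Software knowledge graph construction method for open-source software projects" (面向开源软件项目的软件知识图谱构建方法), Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 31 October 2016 (2016-10-31), pages 851-862 *
Wang Fei (王飞) et al.: "Research on code knowledge graph construction and intelligent software development methods" (代码知识图谱构建及智能化软件开发方法研究), Journal of Software (软件学报), 6 November 2019 (2019-11-06) *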

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112486494A (en) * 2020-11-16 2021-03-12 中信银行股份有限公司 File generation method and device, electronic equipment and computer readable storage medium
CN112632336A (en) * 2020-12-16 2021-04-09 恩亿科(北京)数据科技有限公司 Method and system for processing real-time streaming graph relation
CN113504972A (en) * 2021-07-26 2021-10-15 京东科技控股股份有限公司 Service deployment method and device, electronic equipment and storage medium
CN116661768A (en) * 2023-07-25 2023-08-29 苏州浮木云科技有限公司 Knowledge graph-based page code generation method, system, device and medium
CN116661768B (en) * 2023-07-25 2023-12-29 苏州浮木云科技有限公司 Knowledge graph-based page code generation method, system, device and medium

Similar Documents

Publication Publication Date Title
CN111914098A (en) Knowledge graph construction method and device, electronic equipment and readable storage medium
CN108270629B (en) Website visitor behavior monitoring method and device
CN110704064B (en) Method and device for compiling and executing intelligent contract
WO2008156969A1 (en) Discoscript: a simplified distributed computing scripting language
CN103778373A (en) Virus detection method and device
US11074079B2 (en) Event handling instruction processing
CN112579146A (en) Interface change detection method and device
Groth et al. A model of process documentation to determine provenance in mash-ups
CN105630656A (en) Log model based system robustness analysis method and apparatus
CN105824647A (en) Form page generating method and device
CN113495728A (en) Dependency relationship determination method, dependency relationship determination device, electronic equipment and medium
CN111258905A (en) Defect positioning method and device, electronic equipment and computer readable storage medium
CN113434582A (en) Service data processing method and device, computer equipment and storage medium
CN113901169A (en) Information processing method, information processing device, electronic equipment and storage medium
CN112181479A (en) Method and device for determining difference between code file versions and electronic equipment
CN116483888A (en) Program evaluation method and device, electronic equipment and computer readable storage medium
CN104408198A (en) Method and device for acquiring webpage contents
CN113656044B (en) Android installation package compression method and device, computer equipment and storage medium
CN113296834B (en) Android closed source service type information extraction method based on reverse engineering
CN113282541B (en) File calling method and device and electronic equipment
CN116578282A (en) Code generation method, device, electronic equipment and medium
CN113934405A (en) Plug-in processing method, device, equipment, storage medium and computer program product
CN112199080A (en) Webpack construction method and equipment for vuejs project
CN112181825A (en) Test case library construction method and device, electronic equipment and medium
CN116700840B (en) File execution method, device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination