CN108874819B - Data mining method for database - Google Patents
- Publication number: CN108874819B
- Application number: CN201710329637.9A
- Authority: CN (China)
- Prior art keywords: data, ontology, database, network, nodes
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
A data mining method for a database comprises: converting the data schema of an existing relational database into a proprietary ontology to form a proprietary ontology library; converting the data in the existing relational database into an RDF (Resource Description Framework) knowledge graph corresponding to the proprietary ontology; and then performing node operations on the semantic network formed by the proprietary ontology to obtain the data in the RDF knowledge graph that corresponds to the selected nodes. The invention simplifies the data mining process so that non-IT staff can obtain the data themselves, which greatly improves labor productivity.
Description
Technical Field
The invention relates to the field of semantic search and big data, in particular to a data mining method of a database.
Background
The combination of computers and the internet has created a vast amount of information, and we soon feel overwhelmed by it. While we struggle with this unprecedented volume, we keep producing new information, so the total grows geometrically. We want computers to process this mass of information effectively, so that we can make better use of it and escape the flood.
Computer information processing was initially limited to data with a simple structure: the volume could be large, but the structure stayed relatively simple. As hardware capacity grew rapidly and computers were applied to more complex problems, the structural complexity of data increased greatly. As the internet accumulated data in different ways, data from different sources began to be gathered together, which made data processing more complex still.
Databases make our daily work concise and efficient. As databases are used more intensively, the ecosystem of databases in use grows more and more complicated, and at the same time more and more databases need to be integrated or merged to generate greater benefit. Because database design today is a bottom-up approach, a database that has become very complex turns into a legacy system whose lower layers are a huge black hole that people can hardly reach. When such complex and idiosyncratic databases need to be integrated or merged with databases of the same kind, the task becomes extremely laborious, close to a "mission impossible".
When people think of search, they think of entering "search terms" and receiving relevant results for text or for the textual descriptions of images. Text is also referred to as unstructured data. For structured data stored in a database (that is, row data whose logic can be expressed by a two-dimensional table structure), the usual practice is to hand the request to a DBA (Database Administrator) or other IT personnel, who write query statements in the query language of the relational database, such as SQL (Structured Query Language), and then return the data and the corresponding reports. For example, a project at a health management company wants to know, within its managed population, the data of men aged 50-60 and women aged 45-55 whose glycemic index is close to the diabetic range. The project manager gives this request to the DBA, who writes the corresponding SQL query, extracts the relevant data from the database, and hands it over for browsing and analysis. If a problem is found and further data is needed, the manager must raise a new request, for example to classify the data by profession, and the DBA must query and extract data again. This process is cumbersome and prone to human error.
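To make the round trip described above concrete, the following is a minimal sketch of the two hand-written queries, using Python's built-in `sqlite3` module. The `patient` table, its columns, the sample rows, and the numeric band used for "close to the diabetic range" are all hypothetical and chosen only for illustration; the patent does not specify any of them.

```python
import sqlite3

# Hypothetical table and sample rows; none of these names come from the patent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, sex TEXT, age INTEGER, "
    "glycemic_index REAL, profession TEXT)"
)
conn.executemany(
    "INSERT INTO patient (sex, age, glycemic_index, profession) VALUES (?, ?, ?, ?)",
    [
        ("M", 52, 6.5, "teacher"),
        ("M", 45, 6.6, "driver"),    # outside the male age band
        ("F", 50, 6.3, "nurse"),
        ("F", 50, 5.0, "engineer"),  # below the assumed index band
    ],
)

# Assumed band for "close to the diabetic range" (illustrative values only).
FILTER = """
    glycemic_index >= 6.1 AND glycemic_index < 7.0
    AND ((sex = 'M' AND age BETWEEN 50 AND 60)
      OR (sex = 'F' AND age BETWEEN 45 AND 55))
"""

# First request: the matching individuals.
rows = conn.execute(
    f"SELECT id, sex, age, glycemic_index, profession "
    f"FROM patient WHERE {FILTER} ORDER BY id"
).fetchall()

# Follow-up request: the same population classified by profession.
# It needs a second hand-written query, which is the overhead the text describes.
by_profession = conn.execute(
    f"SELECT profession, COUNT(*) FROM patient WHERE {FILTER} "
    f"GROUP BY profession ORDER BY profession"
).fetchall()

print(rows)
print(by_profession)
```

Every change in the manager's question means another query of this kind, written by someone who knows both SQL and the table layout.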
Disclosure of Invention
The invention provides a data mining method for a database that simplifies the data mining process, enables non-IT staff to obtain data, and greatly improves labor productivity.
In order to achieve the above object, the present invention provides a data mining method for a database, comprising the steps of:
step S1, converting the data schema of the existing relational database into a proprietary ontology to form a proprietary ontology library;
step S2, converting the data in the existing relational database into an RDF knowledge graph corresponding to the proprietary ontology;
step S3, performing node operations on the semantic network formed by the proprietary ontology, and acquiring the data in the RDF knowledge graph corresponding to the nodes.
The step S1 specifically includes the following steps:
S1.1, extracting the data schema of the relational database;
S1.2, converting the data schema into a proprietary ontology;
a table in the relational database represents an entity in the ontology, and the fields of the table are the attributes of that entity;
S1.3, after the proprietary ontology is edited by experts in the proprietary domain, generating an expert-level proprietary ontology and storing it in the proprietary ontology library.
In step S2, the data originally stored in the tables of the relational database forms a semantic network graph in the RDF knowledge graph.
The step S3 specifically includes the following steps:
S3.1, the classes and attributes of the proprietary ontology in the proprietary ontology library form a semantic network graph;
S3.2, selecting a plurality of nodes on the semantic network to generate a sub-network;
S3.3, selecting the data corresponding to the nodes from the RDF knowledge graph according to the sub-network to obtain the search data.
The step of generating a sub-network in step S3.2 specifically includes: selecting a plurality of nodes on the semantic network and filtering out the nodes that are not selected, so that the selected nodes form a sub-network.
After a sub-network is generated, the semantic network can be reset to its initial state so that a new sub-network can be generated, or nodes can continue to be selected on the basis of the current sub-network to generate a new sub-network.
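The two ways of producing the next sub-network (reset, or keep refining the current one) can be sketched as a small piece of state handling. This is an illustrative sketch only: the class name `SemanticNetworkView`, its methods, and the node names are assumptions, and the patent does not prescribe any particular data structure.

```python
class SemanticNetworkView:
    """Tracks which nodes of a semantic network are currently visible."""

    def __init__(self, nodes):
        self.all_nodes = frozenset(nodes)
        self.visible = set(self.all_nodes)  # initial state: every node is shown

    def select(self, nodes):
        """Keep only the chosen nodes; all other visible nodes are filtered out."""
        chosen = set(nodes)
        unknown = chosen - self.visible
        if unknown:
            raise ValueError(f"nodes not in the current network: {sorted(unknown)}")
        self.visible = chosen
        return set(self.visible)

    def reset(self):
        """Return to the initial state so a fresh sub-network can be built."""
        self.visible = set(self.all_nodes)
        return set(self.visible)


# Hypothetical node names for illustration.
view = SemanticNetworkView(["Patient", "Patient.age", "Patient.sex", "Profession"])

first = view.select(["Patient", "Patient.age", "Patient.sex"])  # first sub-network
refined = view.select(["Patient", "Patient.age"])               # continue from it
restored = view.reset()                                         # back to the start
```

Here `select` works on whatever is currently visible, so calling it twice narrows the previous sub-network, while `reset` restores the full network before a new selection.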
The invention applies the proprietary ontology to data mining and converts structured data into a knowledge graph, so that semantic search can be carried out through keywords. This simplifies the data mining process, allows non-IT staff to obtain the data themselves, and greatly improves labor productivity.
Drawings
Fig. 1 is a flowchart of a data mining method for a database according to the present invention.
Fig. 2 is a specific schematic diagram of a data mining method for a database according to the present invention.
Detailed Description
The preferred embodiment of the present invention is described in detail below with reference to fig. 1 and 2.
Ontologies and proprietary ontologies emerged in the computer science and artificial intelligence communities to deal with this kind of complex data processing. They are the foundation of the third-generation internet, namely the Semantic Web, and the cornerstone of semantic search; the third-generation internet and semantic search are in turn the basis of big data processing. Soon after the ontology concept entered the computer field, it was also introduced into database design and development, and database design changed from the earlier bottom-up approach to a top-down approach: first, the compositional relationships among the concepts and entities of the domain and their specific attributes are determined and designed, and a proprietary domain ontology is established; the data of the database is then organized closely around that ontology. Database design, development, and maintenance of this kind emphasize the completeness of concepts and entities and the ability of domain experts to work with them directly. Moreover, the evolution of the database is first reflected in the ontology and only then implemented in the underlying data system. The ontology-driven database thoroughly changes how databases are built and maintained: database integration and merging become a matter of maintaining and updating the ontology, while changes to the lower layers of the database are automated.
According to the top-down concept, as shown in fig. 1, the present invention provides a data mining method for a database, comprising the following steps:
step S1, converting the data schema of the existing relational database into a proprietary ontology to form a proprietary ontology library;
step S2, converting the data in the existing relational database into an RDF knowledge graph corresponding to the proprietary ontology;
step S3, performing node operations on the semantic network formed by the proprietary ontology, and acquiring the data in the RDF knowledge graph corresponding to the nodes.
As shown in fig. 2, the step S1 specifically includes the following steps:
S1.1, extracting the data schema of the relational database;
the relational database is composed of a series of tables in which data is stored, and the tables are determined by the data schema, which is established by a database administrator (DBA);
S1.2, converting the data schema into a proprietary ontology;
the proprietary ontology is established by experts in the proprietary domain;
generally, a table in the relational database represents an entity in the ontology, and the fields of the table are the attributes of that entity; some table fields are foreign keys, that is, primary keys of another table; from the ontology perspective, a foreign key indicates that the two entities are related, with one entity being the attribute value of the other; applying the same rule to all tables of the database roughly converts the data schema into a proprietary ontology, and any existing proprietary ontology takes part in the conversion process;
S1.3, after the proprietary ontology is edited by experts in the proprietary domain, generating an expert-level proprietary ontology and storing it in the proprietary ontology library;
the editing refers to adding, modifying, and deleting.
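Steps S1.1 to S1.3 can be sketched as follows, using SQLite as a stand-in relational database. This is a rough illustration under stated assumptions, not the patented implementation: the dictionary layout of the ontology, the function names, the `patient` and `profession` tables, and the example expert edit are all hypothetical.

```python
import copy
import sqlite3


def extract_schema(conn):
    """S1.1: read tables, columns, and foreign keys from a SQLite database."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    schema = {}
    for table in tables:
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        # foreign_key_list rows: (id, seq, referenced_table, from_col, to_col, ...)
        fks = {row[3]: row[2]
               for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
        schema[table] = {"columns": columns, "foreign_keys": fks}
    return schema


def schema_to_ontology(schema):
    """S1.2: table -> entity, plain field -> attribute, foreign key -> relation."""
    ontology = {}
    for table, info in schema.items():
        fks = info["foreign_keys"]
        ontology[table] = {
            "attributes": [c for c in info["columns"] if c not in fks],
            "relations": dict(fks),  # field name -> related entity
        }
    return ontology


# Hypothetical two-table database for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profession (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, age INTEGER, "
    "profession_id INTEGER REFERENCES profession(id))"
)

ontology = schema_to_ontology(extract_schema(conn))

# S1.3: a domain expert edits the draft (here: a rename, as one example of
# "modifying") before it is stored in the ontology library.
expert_ontology = copy.deepcopy(ontology)
relations = expert_ontology["patient"]["relations"]
relations["has_profession"] = relations.pop("profession_id")
```

The automatic conversion only yields a rough draft, which is why the text has a domain expert review it before it is stored.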
The step S2 specifically includes the following steps:
in the relational database, data is stored in tables and the fields of a table indicate where each value sits; the data is now extracted, and each value becomes the value of the corresponding attribute of the entity in the proprietary ontology; in other words, the data is arranged in tables in the relational database, whereas in the RDF knowledge graph it directly forms a semantic network graph.
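As an illustration of step S2, the sketch below turns table rows into subject-predicate-object triples, which is the basic shape of RDF data. It uses plain Python tuples rather than an RDF library, and the namespace `http://example.org/`, the URI layout, the `ONTOLOGY` mapping, and the sample tables are all assumptions made for the example.

```python
import sqlite3

EX = "http://example.org/"  # hypothetical namespace

# Hypothetical ontology: entity -> key field, attribute fields, relation fields.
ONTOLOGY = {
    "profession": {"key": "id", "attributes": ["name"], "relations": {}},
    "patient": {
        "key": "id",
        "attributes": ["age"],
        "relations": {"profession_id": "profession"},
    },
}


def rows_to_triples(conn, ontology):
    """Convert every row of every mapped table into (subject, predicate, object)."""
    triples = []
    for entity, spec in ontology.items():
        key = spec["key"]
        cols = [key, *spec["attributes"], *spec["relations"]]
        col_list = ", ".join(f'"{c}"' for c in cols)
        query = f'SELECT {col_list} FROM "{entity}" ORDER BY "{key}"'
        for row in conn.execute(query):
            record = dict(zip(cols, row))
            subject = f"{EX}{entity}/{record[key]}"
            triples.append((subject, "rdf:type", f"{EX}{entity}"))
            # A plain field becomes an attribute value of the entity.
            for attr in spec["attributes"]:
                if record[attr] is not None:
                    triples.append((subject, f"{EX}{entity}#{attr}", record[attr]))
            # A foreign key becomes a link to another entity.
            for col, target in spec["relations"].items():
                if record[col] is not None:
                    triples.append(
                        (subject, f"{EX}{entity}#{col}", f"{EX}{target}/{record[col]}")
                    )
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profession (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, age INTEGER, "
    "profession_id INTEGER REFERENCES profession(id))"
)
conn.execute("INSERT INTO profession (id, name) VALUES (1, 'teacher')")
conn.execute("INSERT INTO patient (id, age, profession_id) VALUES (1, 52, 1)")

triples = rows_to_triples(conn, ONTOLOGY)
for triple in triples:
    print(triple)
```

Because the foreign key is emitted as a triple whose object is another subject, the rows of the two tables end up connected as a graph instead of sitting in separate tables.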
As shown in fig. 2, the step S3 specifically includes the following steps:
S3.1, the classes and attributes of the proprietary ontology in the proprietary ontology library form a semantic network graph;
because a proprietary ontology can have a large number of classes and a correspondingly large number of attributes, the semantic network graph has many nodes and complex relationships; the network graph is therefore rendered on a computer interface using existing JavaScript technology so that the nodes formed by the classes and attributes can be clicked; when a node representing a class or an attribute is clicked, that node and the relationships connected to it are highlighted, and the node becomes the focus of attention;
S3.2, selecting a plurality of nodes on the semantic network to generate a sub-network;
a plurality of nodes on the semantic network are clicked and the nodes that were not clicked are filtered out; the clicked nodes form a sub-network, which represents a part of the whole data;
different sub-networks are generated depending on which nodes are selected; after one sub-network is generated, the semantic network can be reset to its initial state to generate a new sub-network, or nodes can continue to be selected on the basis of the current sub-network to generate a new sub-network;
S3.3, selecting the data corresponding to the nodes from the RDF knowledge graph according to the sub-network to obtain the search data.
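Steps S3.1 to S3.3 can be sketched without any user interface by treating a click as a function call. The sketch below is an assumption-laden illustration: the node names, the edge set, the triple layout, and the matching rule in `select_data` (keep a triple if its predicate is a selected attribute node, or if it types a subject as a selected class) are all hypothetical choices, since the patent does not define how nodes are matched to data.

```python
# S3.1: classes and attributes are nodes; edges connect a class to its attributes.
# "Patient.profession" is an attribute whose value is another entity.
EDGES = {
    ("Patient", "Patient.age"),
    ("Patient", "Patient.sex"),
    ("Patient", "Patient.profession"),
    ("Patient.profession", "Profession"),
    ("Profession", "Profession.name"),
}

# Hypothetical knowledge-graph triples keyed by the same node names.
TRIPLES = [
    ("patient/1", "rdf:type", "Patient"),
    ("patient/1", "Patient.age", 52),
    ("patient/1", "Patient.sex", "M"),
    ("patient/1", "Patient.profession", "profession/1"),
    ("profession/1", "rdf:type", "Profession"),
    ("profession/1", "Profession.name", "teacher"),
]


def neighbors(edges, node):
    """Nodes to highlight when `node` is clicked: everything directly connected."""
    return {b for a, b in edges if a == node} | {a for a, b in edges if b == node}


def sub_network(edges, selected):
    """S3.2: drop unselected nodes; keep only edges between selected nodes."""
    chosen = set(selected)
    return {(a, b) for a, b in edges if a in chosen and b in chosen}


def select_data(triples, selected):
    """S3.3: keep the triples that correspond to the selected nodes."""
    chosen = set(selected)
    return [
        (s, p, o) for s, p, o in triples
        if p in chosen or (p == "rdf:type" and o in chosen)
    ]


highlighted = neighbors(EDGES, "Patient")
selection = {"Patient", "Patient.age"}
network = sub_network(EDGES, selection)
data = select_data(TRIPLES, selection)
```

With this selection the result contains only the patient's type and age, so the user narrows the data by picking nodes rather than by writing a query.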
The invention applies the proprietary ontology to data mining and converts structured data into a knowledge graph, so that semantic search can be carried out through keywords. This simplifies the data mining process, allows non-IT staff to obtain the data themselves, and greatly improves labor productivity.
While the present invention has been described in detail with reference to the preferred embodiments, it should be understood that the above description should not be taken as limiting the invention. Various modifications and alterations to this invention will become apparent to those skilled in the art upon reading the foregoing description. Accordingly, the scope of the invention should be determined from the following claims.
Claims (5)
1. A method for mining data of a database, comprising the steps of:
step S1, converting the data schema of the existing relational database into a proprietary ontology to form a proprietary ontology library;
step S2, converting the data in the existing relational database into an RDF knowledge graph corresponding to the proprietary ontology;
step S3, performing node operations on the semantic network formed by the proprietary ontology to acquire the data in the RDF knowledge graph corresponding to the nodes;
the step S1 specifically includes the following steps:
S1.1, extracting the data schema of the relational database;
S1.2, converting the data schema into a proprietary ontology;
a table in the relational database represents an entity in the ontology, and the fields of the table are the attributes of that entity;
S1.3, after the proprietary ontology is edited by experts in the proprietary domain, generating an expert-level proprietary ontology and storing it in the proprietary ontology library.
2. The data mining method for a database according to claim 1, wherein in step S2 the data originally stored in the tables of the relational database forms a semantic network graph in the RDF knowledge graph.
3. The method for mining data of a database according to claim 1, wherein said step S3 specifically comprises the steps of:
S3.1, the classes and attributes of the proprietary ontology in the proprietary ontology library form a semantic network graph;
S3.2, selecting a plurality of nodes on the semantic network to generate a sub-network;
S3.3, selecting the data corresponding to the nodes from the RDF knowledge graph according to the sub-network to obtain the search data.
4. The data mining method for a database according to claim 3, wherein the step of generating a sub-network in step S3.2 comprises: selecting a plurality of nodes on the semantic network and filtering out the nodes that are not selected, so that the selected nodes form a sub-network.
5. The data mining method for a database according to claim 4, wherein after a sub-network is generated, a new sub-network is generated either by resetting the semantic network to its initial state or by continuing to select nodes on the basis of the current sub-network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710329637.9A CN108874819B (en) | 2017-05-11 | 2017-05-11 | Data mining method for database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710329637.9A CN108874819B (en) | 2017-05-11 | 2017-05-11 | Data mining method for database |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108874819A (en) | 2018-11-23 |
CN108874819B (en) | 2021-09-03 |
Family
ID=64319551
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710329637.9A Active CN108874819B (en) | 2017-05-11 | 2017-05-11 | Data mining method for database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108874819B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107330007A (en) * | 2017-06-12 | 2017-11-07 | 南京邮电大学 | A kind of Method for Ontology Learning based on multi-data source |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104102713A (en) * | 2014-07-16 | 2014-10-15 | 百度在线网络技术(北京)有限公司 | Method and device for displaying recommendation results |
CN104462501A (en) * | 2014-12-19 | 2015-03-25 | 北京奇虎科技有限公司 | Knowledge graph construction method and device based on structural data |
CN104866593A (en) * | 2015-05-29 | 2015-08-26 | 中国电子科技集团公司第二十八研究所 | Database searching method based on knowledge graph |
CN105183869A (en) * | 2015-09-16 | 2015-12-23 | 分众(中国)信息技术有限公司 | Building knowledge mapping database and construction method thereof |
CN106202564A (en) * | 2016-08-02 | 2016-12-07 | 浪潮软件股份有限公司 | Ontology relationship data searching framework based on elastic search |
CN106294481A (en) * | 2015-06-05 | 2017-01-04 | 阿里巴巴集团控股有限公司 | A kind of air navigation aid based on collection of illustrative plates and device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8051104B2 (en) * | 1999-09-22 | 2011-11-01 | Google Inc. | Editing a network of interconnected concepts |
Also Published As
Publication number | Publication date |
---|---|
CN108874819A (en) | 2018-11-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||