
CN106528506B - Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment - Google Patents

Info

Publication number
CN106528506B
Authority
CN
China
Prior art keywords
data
xml tag
predefined xml
several
predefined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610915648.0A
Other languages
Chinese (zh)
Other versions
CN106528506A (en
Inventor
魏誉荧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Genius Technology Co Ltd
Original Assignee
Guangdong Genius Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Genius Technology Co Ltd
Priority to CN201610915648.0A
Publication of CN106528506A
Application granted
Publication of CN106528506B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/131 Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention relate to the technical field of data processing and disclose a data processing method, device, and terminal device based on XML tags. The method comprises: obtaining text data; marking the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types; splitting the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice; and saving the several data slices in association with the predefined XML tags. The embodiments are used to quickly look up content in text data and thereby improve the reuse efficiency of the text data.

Description

Data processing method, device, and terminal device based on XML tags
Technical Field
The present invention relates to the technical field of data processing, and in particular to a data processing method, device, and terminal device based on Extensible Markup Language (XML) tags.
Background
In data processing, some data, such as text data, may be extracted and reused repeatedly. Text data usually contains a large amount of content, so to extract a particular piece of data, its position is first located by retrieval and the data is then extracted. Retrieval may be exact or fuzzy. Exact retrieval requires accurate search information, which places high demands on the user's memory of the data. Fuzzy retrieval needs only the main search information, but its accuracy is low. As a result, the reuse efficiency of text data is low.
Summary of the Invention
The embodiments of the invention disclose a data processing method, device, and terminal device based on XML tags, used to quickly look up content in text data and thereby improve the reuse efficiency of the text data.
A first aspect of the invention discloses a data processing method based on XML tags, which may include:
obtaining text data;
marking the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
splitting the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice;
saving the several data slices in association with the predefined XML tags.
In an optional embodiment of the first aspect, the data processing method further includes: when any one of the several data slices needs to be read, looking up the predefined XML tag corresponding to that data slice, and looking up the data slice according to the predefined XML tag found.
In an optional embodiment of the first aspect, before the text data is obtained, the data processing method further includes: establishing, in a cloud storage node, a database corresponding to each type of predefined XML tag.
Saving the several data slices in association with the predefined XML tags includes: determining, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved; and saving each of the several data slices to the determined database.
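The routing described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: in-memory SQLite connections stand in for databases on a cloud storage node, and the tag types `TITLE`, `BODY`, and `NOTE`, the `slices` table, and both function names are assumptions made for the example.

```python
import sqlite3

# Hypothetical tag types; the patent does not name concrete tags here.
TAG_TYPES = ["TITLE", "BODY", "NOTE"]


def create_databases(tag_types):
    """Create one database per tag type (in-memory stand-ins for cloud databases)."""
    databases = {}
    for tag_type in tag_types:
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE slices (id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
        )
        databases[tag_type] = conn
    return databases


def save_slices(databases, slices):
    """Save each (tag_type, content) slice to the database chosen by its tag."""
    for tag_type, content in slices:
        conn = databases[tag_type]  # determine the database from the slice's tag
        conn.execute("INSERT INTO slices (content) VALUES (?)", (content,))
        conn.commit()


databases = create_databases(TAG_TYPES)
save_slices(
    databases,
    [
        ("TITLE", "Chapter One"),
        ("BODY", "It was a quiet morning."),
        ("BODY", "The rain began."),
    ],
)
body_rows = [
    row[0]
    for row in databases["BODY"].execute("SELECT content FROM slices ORDER BY id")
]
```

Because the databases are created before any text is processed, saving a slice only requires a dictionary lookup on its tag type.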
In an optional embodiment of the first aspect, before the database corresponding to each type of predefined XML tag is established in the cloud storage node, the data processing method further includes: collecting text data samples of different businesses and/or different classifications, and defining several XML tags according to the categories of the data content of the text data samples, to obtain several predefined XML tags of different types, wherein one type of predefined XML tag corresponds to one category of data content.
In an optional embodiment of the first aspect, the data processing method further includes: establishing a backup cloud storage node for the cloud storage node, and establishing, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag.
After each of the several data slices is saved to the determined database, the data processing method further includes:
determining, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved; and saving each of the several data slices to the determined backup database.
A second aspect of the invention discloses a data processing device based on XML tags, which may include:
an acquiring unit, configured to obtain text data;
a marking unit, configured to mark the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
a splitting unit, configured to split the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice;
a storage unit, configured to save the several data slices in association with the predefined XML tags.
In an optional embodiment of the second aspect, the data processing device further includes:
a lookup unit, configured to, when any one of the several data slices needs to be read, look up the predefined XML tag corresponding to that data slice and look up the data slice according to the predefined XML tag found.
In an optional embodiment of the second aspect, the data processing device further includes:
an establishing unit, configured to establish, in a cloud storage node, a database corresponding to each type of predefined XML tag before the acquiring unit obtains the text data.
The storage unit specifically includes:
a determining unit, configured to determine, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved;
an associated storage unit, configured to save each of the several data slices to the determined database.
In an optional embodiment of the second aspect, the data processing device further includes:
a collecting unit, configured to collect text data samples of different businesses and/or different classifications, and to define several XML tags according to the categories of the data content of the text data samples, to obtain several predefined XML tags of different types, wherein one type of predefined XML tag corresponds to one category of data content.
In an optional embodiment of the second aspect:
the establishing unit is further configured to establish a backup cloud storage node for the cloud storage node, and to establish, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag;
the determining unit is further configured to determine, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved;
the associated storage unit is further configured to save each of the several data slices to the determined backup database.
A third aspect of the invention discloses a terminal device, which may include the data processing device based on XML tags disclosed in the second aspect.
Compared with the prior art, the embodiments of the invention have the following advantages:
In the embodiments of the invention, after text data is obtained, data content of different categories in the text data is marked with predefined XML tags of different types. The text data is then split, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, with the data content marked by one predefined XML tag yielding one data slice. The several data slices are then saved in association with the predefined XML tags. Once the data slices and predefined XML tags have been saved in association, the corresponding data slice can be found through its predefined XML tag, so content in the text data can be looked up quickly and conveniently. This increases lookup speed and thereby improves the reuse efficiency of the text data.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the invention more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1a is a flowchart of a data processing method based on XML tags disclosed in an embodiment of the invention;
Fig. 1b is a schematic diagram of marking predefined XML tags in text data, disclosed in an embodiment of the invention;
Fig. 2a is a flowchart of a data processing method based on XML tags disclosed in an embodiment of the invention;
Fig. 2b is a schematic diagram of database usage disclosed in an embodiment of the invention;
Fig. 3 is a schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention;
Fig. 4 is another schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention;
Fig. 5 is another schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention;
Fig. 6 is another schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention;
Fig. 7 is a schematic structural diagram of a terminal device disclosed in an embodiment of the invention.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the invention.
The embodiments of the invention disclose a data processing method based on XML tags, used to quickly look up content in text data and thereby improve the reuse efficiency of the text data. The embodiments also disclose a device and a terminal device corresponding to the data processing method based on XML tags.
The terminal device in the embodiments of the invention may be a computer, smartphone, tablet computer, e-book reader, or the like. The embodiments are described in detail below from the perspective of the terminal device, with reference to specific embodiments.
Embodiment One
Referring to Fig. 1a, Fig. 1a is a flowchart of a data processing method based on XML tags disclosed in an embodiment of the invention. As shown in Fig. 1a, a data processing method based on XML tags may include:
101. The terminal device obtains text data.
It will be appreciated that text data in common formats such as TXT or Portable Document Format (PDF) is encountered in data processing. Such text data covers content of different businesses and/or different classifications. For example, TXT text data that records a program and PDF text data that records a novel belong to different businesses, and their content also belongs to different classifications.
102. The terminal device marks the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types.
In this embodiment, the data content in the text data is marked with predefined XML tags according to its category, so that the predefined XML tags distinguish the different data content in the text data.
Further, in this embodiment the terminal device first determines the business and/or classification of the text data and selects the predefined XML tags corresponding to the determined business and/or classification. The terminal device then marks data content of different categories in the text data with predefined XML tags of different types.
Traditional XML tags are quite flexible in the sense that tags describe the structure and meaning of text data. A tag generally encloses letters or words that convey its meaning in angle brackets. For example, <B> is a formatting tag; <STRONG> is a semantic tag indicating that its content is especially important; <TD> is a structural tag indicating that its content is a cell in a table. When defining predefined XML tags, the embodiments of the invention may draw on traditional XML tags and combine them with the business and/or classification of the text data, the categories of data content in the text data, and so on, to flexibly define predefined XML tags whose names carry meaning, such as <TITLE>, which denotes a topic.
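The selection of a tag set by business, followed by marking, might look like the sketch below. The registry contents, the category names, and the `mark` function are hypothetical; the patent only states that tags are chosen per business and/or classification and applied per content category. The segments are assumed to be already classified, and each tag is written as an opening marker only, following the Fig. 1b convention.

```python
# Hypothetical registry: business -> content category -> predefined tag name.
PREDEFINED_TAGS = {
    "ebook": {"title": "TITLE", "chapter heading": "CHAPTER", "paragraph": "BODY"},
    "program": {"import": "IMPORT", "function": "FUNCTION", "comment": "COMMENT"},
}


def mark(business, classified_segments):
    """Prefix each (category, text) segment with the tag defined for its category."""
    tags = PREDEFINED_TAGS[business]  # select the tag set for this business
    return "".join(
        f"<{tags[category]}>{text}" for category, text in classified_segments
    )


tagged = mark(
    "ebook",
    [("title", "The Long Road"), ("paragraph", "It began at dawn.")],
)
```

Keeping the registry as plain data means a new business or category can be added without changing the marking logic.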
103. The terminal device splits the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice.
The predefined XML tags separate the data content in the text. The text data is then split, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices.
Referring to Fig. 1b, Fig. 1b is a schematic diagram of marking predefined XML tags in text data, disclosed in an embodiment of the invention. In Fig. 1b, predefined XML tags are applied according to the category of the data content: <predefined XML tag 1> marks the data content between <predefined XML tag 1> and the next predefined XML tag, <predefined XML tag 2> marks the data content between <predefined XML tag 2> and the next predefined XML tag, and so on. In Fig. 1b, <predefined XML tag 1>, <predefined XML tag 2>, <predefined XML tag 3>, and so on denote predefined XML tags of different types. When the data content is split, it is divided according to the data content marked by each predefined XML tag. For example, the data content between <predefined XML tag 1> and <predefined XML tag 2> is split off to obtain one data slice.
"One predefined XML tag" in step 103 refers to any single predefined XML tag marked in the text data, and is to be understood differently from "one type of predefined XML tag" introduced above. As Fig. 1b also shows, the numerals 1, 2, 3, and so on distinguish different types of predefined XML tags, and a predefined XML tag of the same type may be used multiple times in the text data; for example, <predefined XML tag 2> is used multiple times.
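The splitting rule of Fig. 1b, where each tag covers the content up to the next tag, can be sketched with a regular expression. The tag names, the upper-case naming pattern, and the function name are assumptions for illustration; the patent does not prescribe a tag syntax beyond angle brackets.

```python
import re

# Matches a hypothetical marker such as <TITLE> or <BODY>. No closing tags are
# used: a tag covers the content up to the next tag (or the end of the text).
TAG_PATTERN = re.compile(r"<([A-Z_][A-Z0-9_]*)>")


def split_into_slices(tagged_text):
    """Return one (tag_type, content) pair per tag occurrence, in document order."""
    matches = list(TAG_PATTERN.finditer(tagged_text))
    slices = []
    for i, match in enumerate(matches):
        start = match.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(tagged_text)
        slices.append((match.group(1), tagged_text[start:end].strip()))
    return slices


sample = (
    "<TITLE>Chapter One"
    "<BODY>It was a quiet morning."
    "<NOTE>See appendix."
    "<BODY>The rain began."
)
slices = split_into_slices(sample)
```

Here `BODY` occurs twice and yields two separate slices, which mirrors the distinction between one tag occurrence and one tag type.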
104. The terminal device saves the several data slices in association with the predefined XML tags.
It will be appreciated that the terminal device saves each data slice in association with the predefined XML tag of the corresponding type, so multiple data slices may be saved under a predefined XML tag of the same type.
105. When any one of the several data slices needs to be read, the terminal device looks up the predefined XML tag corresponding to that data slice, and looks up the data slice according to the predefined XML tag found.
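Steps 104 and 105 can be illustrated with a small in-memory store. The class, its method names, and the use of integer slice identifiers are assumptions made for this sketch; the patent does not specify how a data slice is identified.

```python
from collections import defaultdict


class SliceStore:
    """Keeps data slices grouped under the tag type they were marked with."""

    def __init__(self):
        self._by_tag = defaultdict(list)  # tag type -> [(slice_id, content), ...]
        self._tag_of = {}                 # slice_id -> tag type

    def save(self, slice_id, tag_type, content):
        """Step 104: save the slice in association with its tag."""
        self._by_tag[tag_type].append((slice_id, content))
        self._tag_of[slice_id] = tag_type

    def find(self, slice_id):
        """Step 105: find the slice's tag, then search only under that tag."""
        tag_type = self._tag_of[slice_id]
        for stored_id, content in self._by_tag[tag_type]:
            if stored_id == slice_id:
                return tag_type, content
        raise KeyError(slice_id)

    def slices_for(self, tag_type):
        """Return every slice content saved under one tag type."""
        return [content for _, content in self._by_tag.get(tag_type, [])]


store = SliceStore()
store.save(1, "TITLE", "Chapter One")
store.save(2, "BODY", "It was a quiet morning.")
store.save(3, "BODY", "The rain began.")
```

Searching under a single tag type narrows the lookup to the slices of one category instead of scanning the whole text.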
In this embodiment, after text data is obtained, data content of different categories in the text data is marked with predefined XML tags of different types. The text data is then split, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, with the data content marked by one predefined XML tag yielding one data slice. The several data slices are then saved in association with the predefined XML tags. Once the data slices and predefined XML tags have been saved in association, the corresponding data slice can be found through its predefined XML tag, so content in the text data can be looked up quickly and conveniently. This increases lookup speed and thereby improves the reuse efficiency of the text data.
Embodiment Two
Referring to Fig. 2a, Fig. 2a is a flowchart of a data processing method based on XML tags disclosed in an embodiment of the invention. As shown in Fig. 2a, a data processing method based on XML tags may include:
201. The terminal device collects text data samples of different businesses and/or different classifications, and defines several XML tags according to the categories of the data content of the text data samples, to obtain several predefined XML tags of different types, wherein one type of predefined XML tag corresponds to one category of data content.
Further, in this embodiment the terminal device first defines predefined XML tags separately according to the business and/or classification of the text data samples, and then further defines predefined XML tags according to the different categories of data content in the text data samples.
202. The terminal device establishes, in a cloud storage node, a database corresponding to each type of predefined XML tag.
In an optional embodiment, the terminal device first establishes a corresponding number of large databases in the cloud storage node according to the business and/or classification of the text data, and then, under each large database, establishes a database for each category of data content and its predefined XML tag. As shown in Fig. 2b, large database 1 and large database 2 are established for the two businesses of e-books and programs, respectively. Large database 1 contains database A1, database A2, and so on up to database An; likewise, large database 2 contains database B1, database B2, and so on up to database Bn. The databases in large database 1 and large database 2 each correspond to a different type of predefined XML tag.
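The two-level layout of Fig. 2b can be sketched with nested dictionaries, where the outer level plays the role of a large database per business and the inner level holds one database per tag type. The business names, tag names, and function names below are hypothetical, and plain lists stand in for real databases.

```python
# Hypothetical tag types per business, mirroring large databases 1 and 2 in Fig. 2b.
TAG_TYPES_BY_BUSINESS = {
    "ebook": ["TITLE", "CHAPTER", "BODY"],
    "program": ["IMPORT", "FUNCTION", "COMMENT"],
}


def build_storage(tag_types_by_business):
    """Create one large database per business, each holding one database per tag type."""
    return {
        business: {tag_type: [] for tag_type in tag_types}
        for business, tag_types in tag_types_by_business.items()
    }


def save_slice(storage, business, tag_type, content):
    """Select the large database by business, then the database by tag type."""
    storage[business][tag_type].append(content)


storage = build_storage(TAG_TYPES_BY_BUSINESS)
save_slice(storage, "ebook", "TITLE", "The Long Road")
save_slice(storage, "program", "COMMENT", "# entry point")
```

An unknown business or tag type raises `KeyError`, which reflects the requirement that databases exist before any slice is saved.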
203. The terminal device obtains text data.
204. The terminal device marks the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types.
205. The terminal device splits the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice.
206. The terminal device determines, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved.
207. The terminal device saves each of the several data slices to the determined database.
In an optional embodiment, a backup database corresponding to each type of predefined XML tag may be established on a backup cloud storage node. According to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved is determined from the backup cloud storage node, and each data slice is saved to the determined backup database. Data slices are thus stored with a backup: when a data slice cannot be read from the cloud storage node, or a data slice in the cloud storage node is damaged, the corresponding data slice can be read from the backup cloud storage node and, after the cloud storage node is repaired, saved back to the cloud storage node.
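The backup behavior can be sketched as a store that writes every slice to two nodes and falls back to the backup on a failed read. This is a simplified illustration under stated assumptions: dictionaries stand in for the cloud storage node and its backup, a missing entry stands in for an unreadable or damaged slice, and the slice is restored to the primary immediately rather than after a separate repair step. All names are hypothetical.

```python
class MirroredStore:
    """Writes each slice to a primary node and a backup node, keyed by tag type."""

    def __init__(self):
        self.primary = {}  # tag type -> {slice_id: content}
        self.backup = {}   # same layout on the backup node

    def save(self, tag_type, slice_id, content):
        """Save the slice to the database for its tag type on both nodes."""
        for node in (self.primary, self.backup):
            node.setdefault(tag_type, {})[slice_id] = content

    def read(self, tag_type, slice_id):
        """Read from the primary; on failure, read the backup and restore the primary."""
        try:
            return self.primary[tag_type][slice_id]
        except KeyError:
            content = self.backup[tag_type][slice_id]
            # Simplification: restore right away instead of waiting for a repair.
            self.primary.setdefault(tag_type, {})[slice_id] = content
            return content


mirrored = MirroredStore()
mirrored.save("BODY", 1, "It was a quiet morning.")
del mirrored.primary["BODY"][1]  # simulate a slice lost from the primary node
recovered = mirrored.read("BODY", 1)
```

Detecting a damaged (rather than missing) slice would need an integrity check such as a stored checksum, which this sketch leaves out.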
208. The terminal device looks up the predefined XML tag corresponding to any one data slice, and looks up that data slice according to the predefined XML tag found.
As can be seen, in this embodiment the terminal device collects text data samples and analyzes the categories of data content in the samples, so as to define predefined XML tags in one-to-one correspondence with the categories of data content, and establishes a database corresponding to each type of predefined XML tag. Based on the defined predefined XML tags and databases, after obtaining the text data to be processed, the terminal device marks its data content with predefined XML tags of the corresponding types according to the categories of that data content, splits the data content according to the marked predefined XML tags to obtain data slices, and saves each data slice to the corresponding database. A user can subsequently look up data content through the XML tags. The lookup is simple and convenient, which improves the reuse efficiency of the text data.
For example, when the text data is the text data of an e-book, the terminal device may be an e-book reader or another terminal device connected to an e-book reader. The terminal device collects several e-book samples and analyzes the categories of data content in the samples. According to the categories identified, it defines an XML tag for each category of data content and then classifies the defined XML tags by category of data content, so that one category of data content corresponds to one type of XML tag. The terminal device then establishes a corresponding database for each type of XML tag. After that, the terminal device obtains a target e-book, analyzes the categories of data content in it, marks the content with XML tags according to category, splits the data content of the target e-book into data slices according to the marked XML tags, and saves each data slice to the corresponding database.
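The category analysis in the e-book example is not specified in the text, so the sketch below substitutes simple line-based heuristics purely for illustration: the first non-empty line is treated as the title, lines beginning with "Chapter" and a number as chapter headings, and everything else as body text. The tag names and function names are likewise assumptions, and lists stand in for the per-tag databases.

```python
import re
from collections import defaultdict

# Hypothetical heuristic for chapter headings; not taken from the patent.
CHAPTER_RE = re.compile(r"^chapter\s+\d+", re.IGNORECASE)


def classify(line, is_first):
    """Assign a hypothetical tag type to one line of an e-book."""
    if is_first:
        return "TITLE"
    if CHAPTER_RE.match(line):
        return "CHAPTER"
    return "BODY"


def process_ebook(text):
    """Classify each non-empty line and save it under the database for its tag type."""
    databases = defaultdict(list)
    seen_any = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        tag_type = classify(line, is_first=not seen_any)
        seen_any = True
        databases[tag_type].append(line)
    return dict(databases)


book = """The Long Road

Chapter 1
It began at dawn.
Chapter 2
The rain came later."""
result = process_ebook(book)
```

A real classifier would be derived from the collected e-book samples; the heuristics here only show where that analysis plugs into the flow.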
Embodiment Three
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention. As shown in Fig. 3, a data processing device based on XML tags may include:
an acquiring unit 310, configured to obtain text data;
a marking unit 320, configured to mark the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
a splitting unit 330, configured to split the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice;
a storage unit 340, configured to save the several data slices in association with the predefined XML tags.
In this embodiment, after the acquiring unit 310 obtains the text data, the marking unit 320 marks data content of different categories in the text data with predefined XML tags of different types. The splitting unit 330 then splits the text data, taking the data content marked by each predefined XML tag as a unit, to obtain several data slices, with the data content marked by one predefined XML tag yielding one data slice. The storage unit 340 saves the several data slices in association with the predefined XML tags. Once the data slices and predefined XML tags have been saved in association, the corresponding data slice can be found through its predefined XML tag, so content in the text data can be looked up quickly and conveniently. This increases lookup speed and thereby improves the reuse efficiency of the text data.
Embodiment Four
Referring to Fig. 4, Fig. 4 is another schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention. The device shown in Fig. 4 is obtained by optimizing the device shown in Fig. 3. Compared with the device shown in Fig. 3, the device shown in Fig. 4 further includes:
a lookup unit 410, configured to, when any one of the several data slices needs to be read, look up the predefined XML tag corresponding to that data slice and look up the data slice according to the predefined XML tag found.
Embodiment Five
Referring to Fig. 5, Fig. 5 is another schematic structural diagram of a data processing device based on XML tags disclosed in an embodiment of the invention. The device shown in Fig. 5 is obtained by optimizing the device shown in Fig. 3. Compared with the device shown in Fig. 3, the device shown in Fig. 5 further includes:
an establishing unit 510, configured to establish, in a cloud storage node, a database corresponding to each type of predefined XML tag before the acquiring unit 310 obtains the text data.
The storage unit 340 specifically includes:
a determining unit 341, configured to determine, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved;
an associated storage unit 342, configured to save each of the several data slices to the determined database.
It will be appreciated that, based on Embodiment Five, the establishing unit 510 is further configured to, before the acquiring unit obtains the text data, establish a backup cloud storage node for the cloud storage node and establish, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag. The determining unit 341 is further configured to determine, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved. The associated storage unit 342 is further configured to save each of the several data slices to the determined backup database. Data slices are thus stored with a backup: when a data slice cannot be read from the cloud storage node, or a data slice in the cloud storage node is damaged, the corresponding data slice can be read from the backup cloud storage node and, after the cloud storage node is repaired, saved back to the cloud storage node.
Embodiment six
Referring to Fig. 6, Fig. 6 is another schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention. The data processing device shown in Fig. 6 is obtained by optimizing the data processing device based on XML tags shown in Fig. 5. Compared with the device shown in Fig. 5, the data processing device based on XML tags shown in Fig. 6 further includes:
A collector unit 610, configured to collect text data samples of different businesses and/or different categories and, according to the categories of the data content of the text data samples, to define several custom XML tags so as to obtain several different types of predefined XML tags, where one type of predefined XML tag corresponds to one category of data content.
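As a rough illustration of what the collector unit produces, the sketch below maps each distinct content category found in a set of samples to exactly one tag type. It assumes the samples already carry a category label, which the specification does not state, and the function `define_tags` and its naming scheme are hypothetical.

```python
def define_tags(samples):
    """Map each distinct content category in the samples to one tag type."""
    tags = {}
    for category, _text in samples:
        if category not in tags:
            # Hypothetical naming scheme: lower-case the category, spaces to underscores.
            tags[category] = category.lower().replace(" ", "_")
    return tags


# Hypothetical, already-categorized text data samples.
samples = [
    ("Question", "What is 1/2 + 1/4?"),
    ("Answer", "3/4"),
    ("Question", "What is 2 * 3?"),
    ("Worked Solution", "Convert to quarters, then add."),
]

predefined = define_tags(samples)
```

The resulting mapping is one-to-one by construction here, matching the requirement that one type of predefined XML tag corresponds to one category of data content.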
Embodiment seven
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of the terminal device disclosed in an embodiment of the present invention. As shown in Fig. 7, a terminal device may include the data processing device based on XML tags shown in any one of Figs. 3 to 6.
For details of the data processing device based on XML tags, refer to the descriptions in the method embodiments and the device embodiments; they are not repeated here.
Those of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium. The storage medium includes read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), one-time programmable read-only memory (One-Time Programmable Read-Only Memory, OTPROM), electrically erasable programmable read-only memory (Electrically-Erasable Programmable Read-Only Memory, EEPROM), compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
The data processing method based on XML tags, the device, and the terminal device disclosed in the embodiments of the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention, and the descriptions of the above embodiments are intended only to help in understanding the method of the present invention and its core idea. Meanwhile, those of ordinary skill in the art may, in accordance with the idea of the present invention, make changes to the specific implementations and the scope of application. In conclusion, the content of this specification should not be construed as limiting the present invention.

Claims (9)

1. A data processing method based on XML tags, characterized by comprising:
obtaining text data;
marking the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
splitting the text data, in units of the data content marked by the predefined XML tags, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice;
saving the several data slices in association with the predefined XML tags;
wherein, before the obtaining of the text data, the data processing method further comprises:
establishing, in a cloud storage node, a database corresponding to each type of predefined XML tag;
and wherein the saving of the several data slices in association with the predefined XML tags comprises:
determining, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each of the several data slices is to be saved;
saving each of the several data slices into the determined database.
2. The data processing method according to claim 1, characterized in that the data processing method further comprises:
when any one of the several data slices needs to be read, looking up the predefined XML tag corresponding to that data slice, and looking up that data slice according to the predefined XML tag found.
3. The data processing method according to claim 1, characterized in that, before the establishing, in the cloud storage node, of the database corresponding to each type of predefined XML tag, the data processing method further comprises:
collecting text data samples of different businesses and/or different categories and, according to the categories of the data content of the text data samples, defining several custom XML tags to obtain several different types of predefined XML tags, wherein one type of predefined XML tag corresponds to one category of data content.
4. The data processing method according to claim 1, characterized in that the data processing method further comprises:
establishing a backup cloud storage node for the cloud storage node, and establishing, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag;
wherein, after the saving of each of the several data slices into the determined database, the data processing method further comprises:
determining, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each of the several data slices is to be saved;
saving each of the several data slices into the determined backup database.
5. A data processing device based on XML tags, characterized by comprising:
an acquiring unit, configured to obtain text data;
a marking unit, configured to mark the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
a splitting unit, configured to split the text data, in units of the data content marked by the predefined XML tags, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice;
a storage unit, configured to save the several data slices in association with the predefined XML tags;
wherein the data processing device further comprises:
an establishing unit, configured to establish, in a cloud storage node and before the acquiring unit obtains the text data, a database corresponding to each type of predefined XML tag;
and wherein the storage unit specifically comprises:
a determination unit, configured to determine, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each of the several data slices is to be saved;
an association storage unit, configured to save each of the several data slices into the determined database.
6. The data processing device according to claim 5, characterized in that the data processing device further comprises:
a lookup unit, configured to, when any one of the several data slices needs to be read, look up the predefined XML tag corresponding to that data slice, and look up that data slice according to the predefined XML tag found.
7. The data processing device according to claim 5, characterized in that the data processing device further comprises:
a collector unit, configured to collect text data samples of different businesses and/or different categories and, according to the categories of the data content of the text data samples, to define several custom XML tags to obtain several different types of predefined XML tags, wherein one type of predefined XML tag corresponds to one category of data content.
8. The data processing device according to claim 5, characterized in that:
the establishing unit is further configured to establish a backup cloud storage node for the cloud storage node, and to establish, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag;
the determination unit is further configured to determine, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each of the several data slices is to be saved;
the association storage unit is further configured to save each of the several data slices into the determined backup database.
9. A terminal device, characterized by comprising:
the data processing device based on XML tags according to any one of claims 5 to 8.
CN201610915648.0A 2016-10-20 2016-10-20 Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment Active CN106528506B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610915648.0A CN106528506B (en) 2016-10-20 2016-10-20 Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610915648.0A CN106528506B (en) 2016-10-20 2016-10-20 Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment

Publications (2)

Publication Number Publication Date
CN106528506A CN106528506A (en) 2017-03-22
CN106528506B true CN106528506B (en) 2019-05-03

Family

ID=58332822

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610915648.0A Active CN106528506B (en) 2016-10-20 2016-10-20 Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment

Country Status (1)

Country Link
CN (1) CN106528506B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175233B (en) * 2019-03-07 2022-03-11 平安科技(深圳)有限公司 Method, device, computer device and storage medium for analyzing target subject portrait
CN109992752B (en) * 2019-03-07 2023-10-20 平安科技(深圳)有限公司 Label marking method, device, computer device and storage medium for contract file

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1581071A (en) * 2003-07-31 2005-02-16 富士通株式会社 Information processing method, apparatus and program in XML driven architecture
CN1825316A (en) * 2005-02-25 2006-08-30 微软公司 Data store for software application documents
CN101263480A (en) * 2005-09-09 2008-09-10 微软公司 Real-time synchronization of XML data between applications
CN101263477A (en) * 2005-09-09 2008-09-10 微软公司 Programmability for XML data store for documents
JP2009129400A (en) * 2007-11-28 2009-06-11 Hitachi Ltd Database management method, database management device and database management program
CN102045388A (en) * 2010-11-25 2011-05-04 汉王科技股份有限公司 Online reading device and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6952800B1 (en) * 1999-09-03 2005-10-04 Cisco Technology, Inc. Arrangement for controlling and logging voice enabled web applications using extensible markup language documents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Case analysis of XML data storage; Gong Ying; Journal of Jiangsu Radio & Television University; 2002-06-30; Vol. 13 (No. 3); full text

Also Published As

Publication number Publication date
CN106528506A (en) 2017-03-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant