CN106528506B - Data processing method and device based on XML (Extensible Markup Language) tags, and terminal device
- Publication number
- CN106528506B (application number CN201610915648.0A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/131—Fragmentation of text files, e.g. creating reusable text-blocks; Linking to fragments, e.g. using XInclude; Namespaces
- G06F40/14—Tree-structured documents
Abstract
Embodiments of the invention relate to the technical field of data processing and disclose a data processing method and device based on XML tags, and a terminal device. The method comprises the following steps: obtaining text data; marking the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types; splitting the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, wherein the data content marked by one predefined XML tag corresponds to one data slice; and storing the several data slices in association with the predefined XML tags. The embodiments are used to quickly find content in text data and thereby improve the reuse efficiency of the text data.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a data processing method and device based on Extensible Markup Language (XML) tags, and to a terminal device.
Background
In data processing, some data, such as text data, may be extracted and reused repeatedly. Text data typically contains a large amount of content, so to extract a particular piece of data, its position is first located by a search and the data is then extracted. Searches are either exact or fuzzy. An exact search requires precise search terms, which demands that the user remember the data well. A fuzzy search needs only the main search terms, but its retrieval accuracy is low. As a result, the reuse efficiency of text data is low.
Summary of the invention
Embodiments of the present invention disclose a data processing method and device based on XML tags, and a terminal device, which are used to quickly find content in text data and thereby improve the reuse efficiency of text data.
A first aspect of the present invention discloses a data processing method based on XML tags, which may include:
obtaining text data;
marking the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
splitting the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, wherein the data content marked by one predefined XML tag corresponds to one data slice; and
storing the several data slices in association with the predefined XML tags.
As an optional implementation, in the first aspect of the present invention, the data processing method further includes: when any one of the several data slices needs to be read, looking up the predefined XML tag corresponding to that data slice, and finding the data slice according to the predefined XML tag that was found.
As an optional implementation, in the first aspect of the present invention, before the text data is obtained, the data processing method further includes: establishing, in a cloud storage node, a database corresponding to each type of predefined XML tag.
Storing the several data slices in association with the predefined XML tags includes: determining, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved; and saving each of the several data slices in the determined database.
As an optional implementation, in the first aspect of the present invention, before the database corresponding to each type of predefined XML tag is established in the cloud storage node, the data processing method further includes: collecting text data samples of different businesses and/or different classifications, and defining several XML tags according to the categories of the data content of the text data samples, to obtain several predefined XML tags of different types, wherein one type of predefined XML tag corresponds to one category of data content.
As an optional implementation, in the first aspect of the present invention, the data processing method further includes: establishing a backup cloud storage node for the cloud storage node, and establishing, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag.
After each of the several data slices is saved in the determined database, the data processing method further includes:
determining, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved; and saving each of the several data slices in the determined backup database.
A second aspect of the present invention discloses a data processing device based on XML tags, which may include:
an acquiring unit, configured to obtain text data;
a marking unit, configured to mark the data content of the text data with predefined XML tags, wherein data content of different categories is marked with predefined XML tags of different types;
a splitting unit, configured to split the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, wherein the data content marked by one predefined XML tag corresponds to one data slice; and
a storage unit, configured to store the several data slices in association with the predefined XML tags.
As an optional implementation, in the second aspect of the present invention, the data processing device further includes:
a lookup unit, configured to, when any one of the several data slices needs to be read, look up the predefined XML tag corresponding to that data slice and find the data slice according to the predefined XML tag that was found.
As an optional implementation, in the second aspect of the present invention, the data processing device further includes:
an establishing unit, configured to establish, in a cloud storage node, a database corresponding to each type of predefined XML tag before the acquiring unit obtains the text data.
The storage unit specifically includes:
a determining unit, configured to determine, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved; and
an association storage unit, configured to save each of the several data slices in the determined database.
As an optional implementation, in the second aspect of the present invention, the data processing device further includes:
a collecting unit, configured to collect text data samples of different businesses and/or different classifications and to define several XML tags according to the categories of the data content of the text data samples, to obtain several predefined XML tags of different types, wherein one type of predefined XML tag corresponds to one category of data content.
As an optional implementation, in the second aspect of the present invention:
the establishing unit is further configured to establish a backup cloud storage node for the cloud storage node, and to establish, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag;
the determining unit is further configured to determine, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved; and
the association storage unit is further configured to save each of the several data slices in the determined backup database.
A third aspect of the present invention discloses a terminal device, which may include the data processing device based on XML tags disclosed in the second aspect.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects:
In the embodiments of the present invention, after text data is obtained, the different categories of data content in the text data are marked with predefined XML tags of different types. The text data is then split into several data slices, taking the data content marked by each predefined XML tag as a unit, so that the data content marked by one predefined XML tag yields one data slice, and the several data slices are stored in association with the predefined XML tags. Once the data slices and predefined XML tags have been stored in association, the corresponding data slice can be found through its predefined XML tag. Content in the text data is therefore located quickly and conveniently, the search speed increases, and the reuse efficiency of the text data improves.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1a is a schematic flowchart of the data processing method based on XML tags disclosed in an embodiment of the present invention;
Fig. 1b is a schematic diagram of marking predefined XML tags in text data disclosed in an embodiment of the present invention;
Fig. 2a is a schematic flowchart of the data processing method based on XML tags disclosed in an embodiment of the present invention;
Fig. 2b is a schematic diagram of database usage disclosed in an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention;
Fig. 4 is another schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention;
Fig. 5 is another schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention;
Fig. 6 is another schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of the terminal device disclosed in an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiments of the present invention disclose a data processing method based on XML tags, used to quickly find content in text data and thereby improve the reuse efficiency of text data. Embodiments of the present invention also disclose a device and a terminal device corresponding to the data processing method based on XML tags.
The terminal device involved in the embodiments of the present invention may be a computer, a smartphone, a tablet computer, an e-book reader, or the like. The embodiments of the present invention are described in detail below from the perspective of the terminal device, with reference to specific embodiments.
Embodiment one
Referring to Fig. 1a, which is a schematic flowchart of the data processing method based on XML tags disclosed in an embodiment of the present invention. As shown in Fig. 1a, a data processing method based on XML tags may include:
101. The terminal device obtains text data.
It can be understood that text data in formats such as TXT or Portable Document Format (PDF) is common in data processing. Such text data involves content of different businesses and/or different classifications. For example, TXT text data that records a program and PDF text data that records a novel belong to different businesses, and their content also belongs to different classifications.
102. The terminal device marks the data content of the text data with predefined XML tags, where data content of different categories is marked with predefined XML tags of different types.
In this embodiment, the data content in the text data is marked with predefined XML tags according to its category, so that the predefined XML tags distinguish the different data content in the text data.
Further, in this embodiment the terminal device first determines the business and/or classification of the text data and selects the predefined XML tags that correspond to the determined business and/or classification. The terminal device then marks the different categories of data content in the text data with predefined XML tags of different types.
Traditional XML tags are quite flexible in that a tag describes structure and meaning in text data. A tag is generally written inside <> as letters or words that describe its meaning. For example, <B> is a formatting tag; <STRONG> is a semantic tag indicating that the enclosed content is especially important; and <TD> is a structural tag indicating that the content is a cell in a table. When defining predefined XML tags, the embodiments of the present invention can draw on traditional XML tags and combine them with the business and/or classification of the text data, the categories of data content in the text data, and so on, to flexibly define predefined XML tags that carry meaning, such as <TITLE>, which indicates a topic.
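The marking in step 102 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the tag vocabulary in `TAGS_BY_BUSINESS`, the toy classifier `classify_ebook_line`, and the function `mark_text` are all hypothetical, and the sketch assumes that each line of the input is one unit of data content.

```python
# Hypothetical tag vocabularies per business; the patent leaves the concrete
# tags to be defined from collected text data samples.
TAGS_BY_BUSINESS = {
    "ebook": {"heading": "TITLE", "paragraph": "BODY"},
    "program": {"comment": "COMMENT", "statement": "CODE"},
}


def classify_ebook_line(line):
    """Toy classifier (an assumption): lines starting with 'Chapter' are headings."""
    return "heading" if line.startswith("Chapter") else "paragraph"


def mark_text(lines, business, classify):
    """Prefix each line with the predefined tag chosen for its content category."""
    tag_for_category = TAGS_BY_BUSINESS[business]  # select tags by business first
    return "".join(f"<{tag_for_category[classify(line)]}>{line}" for line in lines)


marked = mark_text(["Chapter 1", "It was a dark night."], "ebook", classify_ebook_line)
print(marked)  # <TITLE>Chapter 1<BODY>It was a dark night.
```

Selecting the tag vocabulary by business before classifying individual lines mirrors the two-stage selection described above; a real classifier would need to be far more robust than a prefix check.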
103. The terminal device splits the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, where the data content marked by one predefined XML tag corresponds to one data slice.
The predefined XML tags separate the data content in the text, and the text data is then split into several data slices, taking the data content marked by each predefined XML tag as a unit.
Referring to Fig. 1b, which is a schematic diagram of marking predefined XML tags in text data disclosed in an embodiment of the present invention. In Fig. 1b, predefined XML tags are applied according to the different categories of data content: <predefined XML tag 1> marks the data content between <predefined XML tag 1> and the next predefined XML tag, <predefined XML tag 2> marks the data content between <predefined XML tag 2> and the next predefined XML tag, and so on. In Fig. 1b, <predefined XML tag 1>, <predefined XML tag 2>, <predefined XML tag 3>, etc. denote predefined XML tags of different types. When the data content is split, it is split according to the data content marked by each predefined XML tag; for example, the data content between <predefined XML tag 1> and <predefined XML tag 2> is split out as one data slice.
"One predefined XML tag" in step 103 refers to any single predefined XML tag marked in the text data, which is to be understood differently from "one type of predefined XML tag" introduced above. As Fig. 1b also shows, the numbers 1, 2, 3, etc. distinguish predefined XML tags of different types, and a predefined XML tag of the same type can be used multiple times in the text data; for example, <predefined XML tag 2> is used multiple times.
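The tag-to-next-tag splitting described for Fig. 1b can be sketched as follows. This is a minimal illustration rather than the patented implementation: the tag names `TITLE`, `AUTHOR`, and `BODY` and the function `split_by_tags` are hypothetical, and the sketch assumes that tags appear as bare opening markers with no closing tags, so each slice runs from one tag to the next.

```python
import re

# Hypothetical tag names; the patent does not fix a concrete tag vocabulary.
PREDEFINED_TAGS = ("TITLE", "AUTHOR", "BODY")


def split_by_tags(marked_text, tags=PREDEFINED_TAGS):
    """Return one (tag, data_slice) pair per tag occurrence in marked_text.

    Each slice is the content between a tag and the next predefined tag (or
    the end of the text). Text before the first tag is ignored.
    """
    pattern = re.compile("<(" + "|".join(map(re.escape, tags)) + ")>")
    matches = list(pattern.finditer(marked_text))
    slices = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(marked_text)
        slices.append((match.group(1), marked_text[match.end():end].strip()))
    return slices


marked = "<TITLE>Chapter 1<BODY>It was a dark night.<TITLE>Chapter 2<BODY>Morning came."
for tag, data_slice in split_by_tags(marked):
    print(tag, "->", data_slice)
```

Because the same tag type may occur several times, the function returns a list of pairs rather than a dictionary keyed by tag, so no slice overwrites another.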
104. The terminal device stores the several data slices in association with the predefined XML tags.
It can be understood that the terminal device stores each data slice in association with the predefined XML tag of the corresponding type, so that multiple data slices may be stored under a predefined XML tag of the same type.
105. When any one of the several data slices needs to be read, the terminal device looks up the predefined XML tag corresponding to that data slice and finds the data slice according to the predefined XML tag that was found.
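Steps 104 and 105 can be illustrated with a small in-memory index. This is a sketch under assumptions: the class `TagIndexedStore` and its methods are hypothetical names, and the optional `keyword` filter is an addition for illustration that the patent does not describe.

```python
from collections import defaultdict


class TagIndexedStore:
    """Keeps data slices grouped under the predefined tag type that marked them."""

    def __init__(self):
        self._slices_by_tag = defaultdict(list)

    def save(self, tag, data_slice):
        # Several slices may be stored under the same tag type.
        self._slices_by_tag[tag].append(data_slice)

    def find(self, tag, keyword=None):
        # Narrow the search to one tag first, then optionally filter by keyword.
        candidates = self._slices_by_tag.get(tag, [])
        if keyword is None:
            return list(candidates)
        return [s for s in candidates if keyword in s]


store = TagIndexedStore()
store.save("TITLE", "Chapter 1")
store.save("BODY", "It was a dark night.")
store.save("TITLE", "Chapter 2")
print(store.find("TITLE"))               # ['Chapter 1', 'Chapter 2']
print(store.find("TITLE", keyword="2"))  # ['Chapter 2']
```

Looking up by tag first restricts the search to slices of one category, which is where the claimed speed-up over searching the whole text would come from.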
In this embodiment, after text data is obtained, the different categories of data content in the text data are marked with predefined XML tags of different types. The text data is then split into several data slices, taking the data content marked by each predefined XML tag as a unit, so that the data content marked by one predefined XML tag yields one data slice, and the several data slices are stored in association with the predefined XML tags. Once the data slices and predefined XML tags have been stored in association, the corresponding data slice can be found through its predefined XML tag. Content in the text data is therefore located quickly and conveniently, the search speed increases, and the reuse efficiency of the text data improves.
Embodiment two
Referring to Fig. 2a, which is a schematic flowchart of the data processing method based on XML tags disclosed in an embodiment of the present invention. As shown in Fig. 2a, a data processing method based on XML tags may include:
201. The terminal device collects text data samples of different businesses and/or different classifications and, according to the categories of the data content of the text data samples, defines several XML tags to obtain several predefined XML tags of different types, where one type of predefined XML tag corresponds to one category of data content.
Further, in this embodiment the terminal device first defines predefined XML tags separately according to the business and/or classification of the text data samples, and then further defines predefined XML tags according to the different categories of data content in the text data samples.
202. The terminal device establishes, in a cloud storage node, a database corresponding to each type of predefined XML tag.
As an optional implementation, the terminal device first establishes a corresponding number of large databases in the cloud storage node according to the businesses and/or classifications of the text data, and then, under each large database, establishes a database for each predefined XML tag defined for a category of data content. As shown in Fig. 2b, large database 1 and large database 2 are established for the two businesses of e-books and programs respectively. Large database 1 contains database A1, database A2, and so on up to database An; likewise, large database 2 contains database B1, database B2, and so on up to database Bn. The databases in large database 1 and large database 2 each correspond to predefined XML tags of different types.
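The two-level layout of Fig. 2b can be modelled with nested dictionaries. This is only a sketch under assumptions: `build_cloud_node` and `save_slice` are hypothetical names, plain Python lists stand in for real databases, and the business and tag names are illustrative.

```python
def build_cloud_node(tags_by_business):
    """Create one 'large database' per business and one database per tag type."""
    return {
        business: {tag: [] for tag in tags}
        for business, tags in tags_by_business.items()
    }


def save_slice(node, business, tag, data_slice):
    """Route a slice to the database determined by its business and tag type.

    Raises KeyError if no database was established for that business or tag.
    """
    node[business][tag].append(data_slice)


node = build_cloud_node({
    "ebook": ["TITLE", "BODY"],       # large database 1: databases A1..An
    "program": ["COMMENT", "CODE"],   # large database 2: databases B1..Bn
})
save_slice(node, "ebook", "TITLE", "Chapter 1")
print(sorted(node))            # ['ebook', 'program']
print(node["ebook"]["TITLE"])  # ['Chapter 1']
```

Establishing the databases before any text is processed means that routing a slice is a pair of key lookups, and a slice carrying an unknown tag fails loudly instead of being silently misfiled.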
203. The terminal device obtains text data.
204. The terminal device marks the data content of the text data with predefined XML tags, where data content of different categories is marked with predefined XML tags of different types.
205. The terminal device splits the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, where the data content marked by one predefined XML tag corresponds to one data slice.
206. The terminal device determines, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved.
207. The terminal device saves each of the several data slices in the determined database.
As an optional implementation, in this embodiment a backup database corresponding to each type of predefined XML tag may be established on a backup cloud storage node. According to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved is determined from the backup cloud storage node, and each of the several data slices is saved in the determined backup database. In this way the data slices are backed up. When a data slice cannot be read from the cloud storage node, or a data slice in the cloud storage node is damaged, the corresponding data slice can be read from the backup cloud storage node; after the cloud storage node is repaired, the data slice read from the backup cloud storage node is saved back to the cloud storage node.
208. The terminal device looks up the predefined XML tag corresponding to any one data slice and finds that data slice according to the predefined XML tag that was found.
It can be seen that, in this embodiment, the terminal device collects text data samples and analyzes the categories of the data content in the samples, thereby defining predefined XML tags in one-to-one correspondence with the categories of data content, and establishes a database corresponding to each type of predefined XML tag. Based on the defined predefined XML tags and the databases, after the terminal device obtains the text data to be processed, it marks the data content with predefined XML tags of the corresponding types according to the categories of that content, splits the data content into data slices according to the marked predefined XML tags, and saves each data slice in its corresponding database. A user can subsequently find data content through the XML tags. The lookup is simple and convenient, which improves the reuse efficiency of the text data.
For example, when the text data is the text data of an e-book, the terminal device may be an e-book reader or another terminal device connected to an e-book reader. The terminal device collects several e-book samples and analyzes the categories of the data content in the samples. According to the categories found, it defines an XML tag for each category of data content and then classifies the defined XML tags by category of data content, so that one category of data content corresponds to one type of XML tag. The terminal device then establishes a corresponding database for each type of XML tag. Afterwards, the terminal device obtains a target e-book, analyzes the categories of the data content in the target e-book, marks the content with XML tags according to category, splits the data content of the target e-book into data slices according to the marked XML tags, and saves each data slice in its corresponding database.
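The e-book flow can be sketched end to end with Python's standard `sqlite3` module, using one in-memory SQLite database per tag type. This is an assumption-laden illustration: the patent names no database engine, and `open_databases`, `ingest`, `read_slices`, the `slices` table, and the tag names are all hypothetical.

```python
import re
import sqlite3


def open_databases(tags):
    """Open one in-memory SQLite database per predefined tag type."""
    dbs = {}
    for tag in tags:
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE slices (id INTEGER PRIMARY KEY, content TEXT NOT NULL)"
        )
        dbs[tag] = conn
    return dbs


def ingest(marked_text, dbs):
    """Split marked text at each tag and save every slice in its tag's database."""
    pattern = re.compile("<(" + "|".join(map(re.escape, dbs)) + ")>")
    matches = list(pattern.finditer(marked_text))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(marked_text)
        content = marked_text[match.end():end].strip()
        dbs[match.group(1)].execute(
            "INSERT INTO slices (content) VALUES (?)", (content,)
        )


def read_slices(dbs, tag):
    """Return all slices stored under one tag type, in insertion order."""
    rows = dbs[tag].execute("SELECT content FROM slices ORDER BY id")
    return [row[0] for row in rows]


dbs = open_databases(["TITLE", "BODY"])
ingest("<TITLE>Chapter 1<BODY>It was a dark night.<TITLE>Chapter 2<BODY>Morning came.", dbs)
print(read_slices(dbs, "TITLE"))  # ['Chapter 1', 'Chapter 2']
```

Each in-memory connection is an independent database, so the dictionary of connections plays the role of the per-tag databases; a deployment on a cloud storage node would replace `":memory:"` with real storage.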
Embodiment three
Referring to Fig. 3, which is a schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention. As shown in Fig. 3, a data processing device based on XML tags may include:
an acquiring unit 310, configured to obtain text data;
a marking unit 320, configured to mark the data content of the text data with predefined XML tags, where data content of different categories is marked with predefined XML tags of different types;
a splitting unit 330, configured to split the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, where the data content marked by one predefined XML tag corresponds to one data slice; and
a storage unit 340, configured to store the several data slices in association with the predefined XML tags.
In this embodiment, after the acquiring unit 310 obtains text data, the marking unit 320 marks the different categories of data content in the text data with predefined XML tags of different types. The splitting unit 330 then splits the text data into several data slices, taking the data content marked by each predefined XML tag as a unit, so that the data content marked by one predefined XML tag yields one data slice, and the storage unit 340 stores the several data slices in association with the predefined XML tags. Once the data slices and predefined XML tags have been stored in association, the corresponding data slice can be found through its predefined XML tag. Content in the text data is therefore located quickly and conveniently, the search speed increases, and the reuse efficiency of the text data improves.
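The cooperation of units 310 to 340 can be sketched as a small class whose units are pluggable callables. This is an illustrative structure, not the device itself: `DataProcessingDevice`, `mark_unit`, and `split_unit` are hypothetical names, and the toy marking rule (first line is a title, the rest is body) is an assumption made only so the example runs.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DataProcessingDevice:
    acquire: Callable[[], str]                     # acquiring unit 310
    mark: Callable[[str], str]                     # marking unit 320
    split: Callable[[str], List[Tuple[str, str]]]  # splitting unit 330
    store: Dict[str, List[str]] = field(default_factory=dict)  # storage unit 340

    def run(self) -> Dict[str, List[str]]:
        marked = self.mark(self.acquire())
        for tag, data_slice in self.split(marked):
            self.store.setdefault(tag, []).append(data_slice)
        return self.store


def mark_unit(text: str) -> str:
    """Toy rule (an assumption): the first line is a title, the rest is body."""
    return "".join(
        ("<TITLE>" if i == 0 else "<BODY>") + line
        for i, line in enumerate(text.splitlines())
    )


def split_unit(marked: str) -> List[Tuple[str, str]]:
    # re.split with a capturing group alternates [prefix, tag, content, tag, content, ...].
    parts = re.split(r"<(TITLE|BODY)>", marked)
    return list(zip(parts[1::2], parts[2::2]))


device = DataProcessingDevice(
    acquire=lambda: "A Short Tale\nIt was a dark night.\nMorning came.",
    mark=mark_unit,
    split=split_unit,
)
print(device.run())
# {'TITLE': ['A Short Tale'], 'BODY': ['It was a dark night.', 'Morning came.']}
```

Passing the units in as callables keeps each one replaceable, which matches the way the later embodiments add lookup, establishing, and collecting units to the same base device.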
Embodiment four
Referring to Fig. 4, which is another schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention. The data processing device based on XML tags shown in Fig. 4 is obtained by optimizing the data processing device based on XML tags shown in Fig. 3. The data processing device based on XML tags shown in Fig. 4 further includes:
a lookup unit 410, configured to, when any one of the several data slices needs to be read, look up the predefined XML tag corresponding to that data slice and find the data slice according to the predefined XML tag that was found.
Embodiment five
Referring to Fig. 5, which is another schematic structural diagram of the data processing device based on XML tags disclosed in an embodiment of the present invention. The data processing device based on XML tags shown in Fig. 5 is obtained by optimizing the data processing device based on XML tags shown in Fig. 3. The data processing device based on XML tags shown in Fig. 5 further includes:
an establishing unit 510, configured to establish, in a cloud storage node, a database corresponding to each type of predefined XML tag before the acquiring unit 310 obtains the text data.
The storage unit 340 specifically includes:
a determining unit 341, configured to determine, from the cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the database in which each data slice is to be saved; and
an association storage unit 342, configured to save each of the several data slices in the determined database.
It can be understood that, based on embodiment five, the establishing unit 510 is further configured to, before the acquiring unit obtains the text data, establish a backup cloud storage node for the cloud storage node and establish, in the backup cloud storage node, a backup database corresponding to each type of predefined XML tag. The determining unit 341 is further configured to determine, from the backup cloud storage node and according to the predefined XML tag corresponding to each of the several data slices, the backup database in which each data slice is to be saved. The association storage unit 342 is further configured to save each of the several data slices in the determined backup database. In this way the data slices are backed up. When a data slice cannot be read from the cloud storage node, or a data slice in the cloud storage node is damaged, the corresponding data slice can be read from the backup cloud storage node; after the cloud storage node is repaired, the data slice read from the backup cloud storage node is saved back to the cloud storage node.
Embodiment six
Referring to Fig. 6, Fig. 6 is another structure of the data processing equipment disclosed by the embodiments of the present invention based on XML tag
Schematic diagram;Wherein, the data processing equipment shown in fig. 6 based on XML tag is the data based on XML tag as shown in Figure 5
What processing unit optimized.It, should be based on XML tag in data processing equipment based on XML tag shown in Fig. 6
Data processing equipment is specific further include:
A collector unit 610, configured to collect text data samples of different businesses and/or different
categories, and to customize several XML tags according to the categories of the data content of the text
data samples, so as to obtain several predefined XML tags of different types, where one type of predefined
XML tag corresponds to one category of data content.
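As a rough illustration of how one tag type per content category might be derived from collected samples,
consider the Python sketch below. It assumes the samples already carry a category label, and the function
name `define_tags` and the name-normalization rule are hypothetical choices rather than anything specified
in this document.

```python
import re
from typing import Dict, Iterable, Tuple


def define_tags(samples: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each content category found in (category, text) samples to one XML tag name."""
    tags: Dict[str, str] = {}
    for category, _text in samples:
        if category in tags:
            continue  # this category already has its tag type
        # Build a simple ASCII tag name: lowercase, other characters become underscores.
        name = re.sub(r"[^a-z0-9]+", "_", category.lower()).strip("_")
        if not name:
            raise ValueError(f"cannot derive a tag name from category {category!r}")
        if name[0].isdigit():
            name = "t_" + name  # XML names must not start with a digit
        if name in tags.values():
            # Two categories sharing a tag would break the one-to-one correspondence.
            raise ValueError(f"tag name {name!r} is already used by another category")
        tags[category] = name
    return tags
```

For example, the categories "Customer Name" and "Order No." would map to the tag names `customer_name` and
`order_no` under this rule.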
Embodiment seven
Referring to Fig. 7, Fig. 7 is a schematic structural diagram of the terminal device disclosed in an
embodiment of the present invention. As shown in Fig. 7, a terminal device may include the data processing
device based on XML tags shown in any one of Figs. 3 to 6.
For the data processing device based on XML tags, refer to the detailed description in the method
embodiments and the device embodiments, which is not repeated here.
Those of ordinary skill in the art will understand that all or part of the steps in the methods of the
above embodiments can be completed by a program instructing the relevant hardware. The program can be
stored in a computer-readable storage medium, which includes read-only memory (ROM), random access memory
(RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time
programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM),
compact disc read-only memory (CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape
storage, or any other computer-readable medium that can be used to carry or store data.
The data processing method, device and terminal device based on XML tags disclosed in the embodiments of
the present invention have been described in detail above. Specific examples are used herein to illustrate
the principle and implementation of the invention, and the description of the above embodiments is intended
only to help in understanding the method of the invention and its core idea. Meanwhile, those of ordinary
skill in the art may, following the idea of the invention, make changes to the specific implementation and
the scope of application. In summary, the content of this specification should not be construed as
limiting the invention.
Claims (9)
1. A data processing method based on XML tags, characterized by comprising:
obtaining text data;
marking the data content of the text data with predefined XML tags, wherein data content of different
categories is marked with predefined XML tags of different types;
splitting the text data, in units of the data content marked by a predefined XML tag, to obtain several
data slices, wherein the data content marked by one predefined XML tag corresponds to one data slice;
saving the several data slices in association with the predefined XML tags;
wherein, before the obtaining of the text data, the data processing method further comprises:
establishing, in a cloud storage node, a database corresponding to each type of predefined XML tag;
and wherein the saving of the several data slices in association with the predefined XML tags comprises:
determining, from the cloud storage node and according to the predefined XML tag corresponding to each of
the several data slices, the database in which each of the several data slices is to be saved;
saving each of the several data slices to the determined database.
2. The data processing method according to claim 1, characterized in that the data processing method
further comprises:
when any one of the several data slices needs to be read, looking up the predefined XML tag corresponding
to that data slice, and looking up that data slice according to the predefined XML tag found.
3. The data processing method according to claim 1, characterized in that, before the establishing, in the
cloud storage node, of the database corresponding to each type of predefined XML tag, the data processing
method further comprises:
collecting text data samples of different businesses and/or different categories, and customizing several
XML tags according to the categories of the data content of the text data samples, so as to obtain several
predefined XML tags of different types, wherein one type of predefined XML tag corresponds to one category
of data content.
4. The data processing method according to claim 1, characterized in that the data processing method
further comprises:
establishing a backup cloud storage node of the cloud storage node, and establishing, in the backup cloud
storage node, a backup database corresponding to each type of predefined XML tag;
wherein, after the saving of each of the several data slices to the determined database, the data
processing method further comprises:
determining, from the backup cloud storage node and according to the predefined XML tag corresponding to
each of the several data slices, the backup database in which each of the several data slices is to be
saved;
saving each of the several data slices to the determined backup database.
5. A data processing device based on XML tags, characterized by comprising:
an acquiring unit, configured to obtain text data;
a marking unit, configured to mark the data content of the text data with predefined XML tags, wherein
data content of different categories is marked with predefined XML tags of different types;
a cutting unit, configured to split the text data, in units of the data content marked by a predefined XML
tag, to obtain several data slices, wherein the data content marked by one predefined XML tag corresponds
to one data slice;
a storage unit, configured to save the several data slices in association with the predefined XML tags;
wherein the data processing device further comprises:
an establishing unit, configured to establish, in a cloud storage node and before the acquiring unit
obtains the text data, a database corresponding to each type of predefined XML tag;
and wherein the storage unit specifically comprises:
a determination unit, configured to determine, from the cloud storage node and according to the predefined
XML tag corresponding to each of the several data slices, the database in which each of the several data
slices is to be saved;
an association storage unit, configured to save each of the several data slices to the determined database.
6. The data processing device according to claim 5, characterized in that the data processing device
further comprises:
a searching unit, configured to, when any one of the several data slices needs to be read, look up the
predefined XML tag corresponding to that data slice, and look up that data slice according to the
predefined XML tag found.
7. The data processing device according to claim 5, characterized in that the data processing device
further comprises:
a collector unit, configured to collect text data samples of different businesses and/or different
categories, and to customize several XML tags according to the categories of the data content of the text
data samples, so as to obtain several predefined XML tags of different types, wherein one type of
predefined XML tag corresponds to one category of data content.
8. The data processing device according to claim 5, characterized in that:
the establishing unit is further configured to establish a backup cloud storage node of the cloud storage
node, and to establish, in the backup cloud storage node, a backup database corresponding to each type of
predefined XML tag;
the determination unit is further configured to determine, from the backup cloud storage node and according
to the predefined XML tag corresponding to each of the several data slices, the backup database in which
each of the several data slices is to be saved;
the association storage unit is further configured to save each of the several data slices to the
determined backup database.
9. A terminal device, characterized by comprising:
the data processing device based on XML tags according to any one of claims 5 to 8.
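The flow recited in claims 1 and 2 (create one database per tag type, split tagged text into slices, save
each slice by its tag, then read a slice back through its tag) can be sketched in Python as follows. This
is an illustrative sketch under stated assumptions, not the claimed implementation: the cloud storage node
is modeled as in-memory dictionaries, the text is assumed to be already marked with well-formed tags, and
`PREDEFINED_TAGS`, `TagStore` and `split_into_slices` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

PREDEFINED_TAGS = ("name", "address", "order")  # hypothetical tag types


class TagStore:
    """Hypothetical in-memory stand-in for the per-tag databases on a cloud storage node."""

    def __init__(self, tags: Tuple[str, ...] = PREDEFINED_TAGS) -> None:
        # One database per predefined tag type, created before any text is obtained.
        self.databases: Dict[str, Dict[int, str]] = {tag: {} for tag in tags}
        # Association between each slice id and its tag, used when reading.
        self.tag_of: Dict[int, str] = {}

    def save(self, slices: List[Tuple[str, str]]) -> List[int]:
        """Save each (tag, content) slice to the database chosen by its tag; return slice ids."""
        ids = []
        for tag, content in slices:
            if tag not in self.databases:
                raise KeyError(f"tag {tag!r} is not predefined")
            slice_id = len(self.tag_of)
            self.databases[tag][slice_id] = content
            self.tag_of[slice_id] = tag
            ids.append(slice_id)
        return ids

    def read(self, slice_id: int) -> str:
        """Find the tag for a slice first, then look the slice up in that tag's database."""
        tag = self.tag_of[slice_id]
        return self.databases[tag][slice_id]


def split_into_slices(tagged_text: str) -> List[Tuple[str, str]]:
    """Split tagged text into one (tag, content) slice per top-level tagged span."""
    root = ET.fromstring(f"<doc>{tagged_text}</doc>")  # wrap so the fragment has one root
    return [(child.tag, child.text or "") for child in root]
```

A short usage example: `TagStore().save(split_into_slices("<name>Alice</name><address>1 Main St</address>"))`
stores the two slices in the `name` and `address` databases respectively. Text outside any tag and nested
tags are ignored by this sketch.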
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610915648.0A CN106528506B (en) | 2016-10-20 | 2016-10-20 | Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106528506A CN106528506A (en) | 2017-03-22 |
CN106528506B true CN106528506B (en) | 2019-05-03 |
Family
ID=58332822
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610915648.0A (Active) CN106528506B (en) | 2016-10-20 | 2016-10-20 | Data processing method and device based on XML (Extensible Markup Language) tag and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106528506B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110175233B (en) * | 2019-03-07 | 2022-03-11 | 平安科技(深圳)有限公司 | Method, device, computer device and storage medium for analyzing target subject portrait |
CN109992752B (en) * | 2019-03-07 | 2023-10-20 | 平安科技(深圳)有限公司 | Label marking method, device, computer device and storage medium for contract file |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1581071A (en) * | 2003-07-31 | 2005-02-16 | 富士通株式会社 | Information processing method, apparatus and program in XML driven architecture |
CN1825316A (en) * | 2005-02-25 | 2006-08-30 | 微软公司 | Data store for software application documents |
CN101263480A (en) * | 2005-09-09 | 2008-09-10 | 微软公司 | Real-time synchronization of XML data between applications |
CN101263477A (en) * | 2005-09-09 | 2008-09-10 | 微软公司 | Programmability for XML data store for documents |
JP2009129400A (en) * | 2007-11-28 | 2009-06-11 | Hitachi Ltd | Database management method, database management device and database management program |
CN102045388A (en) * | 2010-11-25 | 2011-05-04 | 汉王科技股份有限公司 | Online reading device and method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6952800B1 (en) * | 1999-09-03 | 2005-10-04 | Cisco Technology, Inc. | Arrangement for controlling and logging voice enabled web applications using extensible markup language documents |
Non-Patent Citations (1)
Title |
---|
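Case analysis of XML data storage; Gong Ying (龚颖); Journal of Jiangsu Radio & Television University (《江苏广播电视大学学报》); 2002-06-30; Vol. 13, No. 3; full text |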
Also Published As
Publication number | Publication date |
---|---|
CN106528506A (en) | 2017-03-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||