CN113095044A - File conversion method, device and equipment - Google Patents
File conversion method, device and equipment
- Publication number
- CN113095044A (CN202110392762.0A)
- Authority
- CN
- China
- Prior art keywords
- node
- xml
- information
- data
- file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
- G06F40/154—Tree transformation for tree-structured or markup documents, e.g. XSLT, XSL-FO or stylesheets
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/14—Tree-structured documents
- G06F40/143—Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
Abstract
The specification relates to the technical field of big data processing, and in particular discloses a file conversion method, device and equipment. The method comprises: acquiring a basic parameter table corresponding to a service data table, where the basic parameter table is determined based on the position information and annotation information of the labeled cells in the sample table corresponding to the service data table, and the annotation information at least comprises node parameters of the xml nodes into which the data corresponding to the labeled cells is to be filled; reading the node parameters in the basic parameter table, and extracting the data values corresponding to the read node parameters from the service data table based on the position information of the corresponding labeled cells; and filling the extracted data values into the xml nodes corresponding to the read node parameters in an xml tree structure to obtain a submission xml file corresponding to the service data table. The embodiments of the specification can greatly improve the efficiency and accuracy of file conversion.
Description
Technical Field
The present specification relates to the field of big data processing technologies, and in particular, to a file conversion method, device, and apparatus.
Background
Overseas branches of financial institutions are often required to report part of their operational data to local regulatory authorities on a periodic basis. The reported data is usually generated by transactions, compiled into an excel report by an internal reporting system, and then submitted to the local regulator. At present, regulators in some regions require xml (Extensible Markup Language) data files in addition to the usual excel report data. The data types in financial excel reports are complex and variable, and the xml file requirements differ greatly between regions, so it is difficult to convert all excel reports directly into xml files that meet every region's requirements with a general-purpose conversion tool. At present, business personnel typically extract the data from excel manually and then convert it into an xml file with a general-purpose conversion tool, which is time-consuming, labor-intensive and error-prone.
Disclosure of Invention
An object of the embodiments of the present disclosure is to provide a method, an apparatus, and a device for file conversion, which can further improve efficiency and accuracy of file conversion.
The present specification provides a file conversion method, device and apparatus, which are implemented in the following manner:
A file conversion method, comprising: acquiring a basic parameter table corresponding to a service data table, where the basic parameter table is determined based on the position information and annotation information of the labeled cells in the sample table corresponding to the service data table, and the annotation information at least comprises node parameters of the xml nodes into which the data corresponding to the labeled cells is to be filled; reading the node parameters in the basic parameter table, and extracting the data values corresponding to the read node parameters from the service data table based on the position information of the corresponding labeled cells; and filling the extracted data values into the xml nodes corresponding to the read node parameters in an xml tree structure to obtain a submission xml file corresponding to the service data table.
In other embodiments of the method provided in this specification, the xml tree structure is constructed in the following manner: acquiring the xsd file specified in the basic parameter table, where the specified xsd file at least comprises the node parameters of the xml nodes contained in the submission xml file; traversing the specified xsd file and screening out the element nodes; constructing a node directed graph according to the reference relationships among the element nodes in the specified xsd file; and generating the xml tree structure from the node directed graph.
In other embodiments of the method provided in this specification, the name of the element node is used as the xml node name in the xml tree structure.
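The construction steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the specification: the sample xsd, the element names A to D, and the helpers `build_node_graph` and `build_xml_tree` are hypothetical, and the sketch assumes that element references are unprefixed and acyclic.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical, minimal xsd; a real regulatory schema would be far larger.
XSD_TEXT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="A">
    <xs:complexType><xs:sequence>
      <xs:element ref="B"/>
    </xs:sequence></xs:complexType>
  </xs:element>
  <xs:element name="B">
    <xs:complexType><xs:sequence>
      <xs:element ref="C"/>
      <xs:element ref="D"/>
    </xs:sequence></xs:complexType>
  </xs:element>
  <xs:element name="C" type="xs:decimal"/>
  <xs:element name="D" type="xs:string"/>
</xs:schema>
"""


def build_node_graph(xsd_text):
    """Map each top-level element name to the element names it references."""
    schema = ET.fromstring(xsd_text)
    graph = {}
    # Screen out the element nodes declared directly under the schema root.
    for element in schema.findall(XS + "element"):
        refs = [
            child.get("ref")
            for child in element.iter(XS + "element")
            if child.get("ref") is not None
        ]
        graph[element.get("name")] = refs
    return graph


def build_xml_tree(graph, root_name):
    """Expand the directed graph into an xml tree (assumes no cycles)."""
    node = ET.Element(root_name)  # element name becomes the xml node name
    for child_name in graph.get(root_name, []):
        node.append(build_xml_tree(graph, child_name))
    return node


graph = build_node_graph(XSD_TEXT)
tree = build_xml_tree(graph, "A")
print(graph)
print(ET.tostring(tree, encoding="unicode"))
```

Keeping the graph as a plain name-to-references mapping makes the reference relationships easy to inspect before the tree is generated.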
In other embodiments of the method provided in this specification, the basic parameter table is determined by: reading the position information and annotation information of the labeled cells in the sample table in sequence; adding a node subscript to the node parameter, where the node subscript is used to distinguish the data corresponding to different labeled cells under the same xml node; and storing the position information of the labeled cells in association with the node parameters to which the node subscripts have been added in an editable data table, to obtain the basic parameter table.
In other embodiments of the method provided in this specification, the annotation information further includes the writing mode in which the service data is written into the cells, where the writing mode at least includes a fixed-position type and a list type; the fixed-position type means that the position of the cells into which the service data is written stays fixed over time; the list type means that the cells into which the service data is written extend in a specified direction over time; correspondingly, adding the node subscript to the node parameter includes: in the case that the annotation information does not include an extension direction and an extension length, adding the node subscript to the node parameter based on a preset node sorting rule of the sample table; and in the case that the annotation information includes an extension direction and an extension length, adding the node subscript to the node parameter based on the preset node sorting rule of the sample table, and then supplementing further node subscripts according to the extension direction and the extension length in the annotation information.
In other embodiments of the method provided in this specification, the annotation information further includes a deletion flag and a zero padding flag; the deletion flag is used to delete the corresponding xml node, or the child node corresponding to the node subscript, in the case that the data value is empty; the zero padding flag is used to fill a specified value into the corresponding xml node, or the child node corresponding to the node subscript, in the case that the data value is empty; the method further comprises: reading the deletion flag and the zero padding flag in the annotation information; and storing the read deletion flags and zero padding flags, the position information of the labeled cells, and the node parameters to which the node subscripts have been added in association in an editable data table, to obtain the basic parameter table.
In other embodiments of the method provided in this specification, when the read node parameter exists in the xml tree structure, based on the location information of the labeled cell corresponding to the read node parameter, the data value corresponding to the read node parameter is extracted from the service data table.
In other embodiments of the method provided in this specification, in a case that the read node parameter does not exist in the xml tree structure, the read node parameter is recorded in the exception table, and the reading of the next node parameter is continued.
In another aspect, embodiments of the present specification further provide a file conversion apparatus, where the apparatus includes: an acquisition module, configured to acquire a basic parameter table corresponding to a service data table, where the basic parameter table is determined based on the position information and annotation information of the labeled cells in the sample table corresponding to the service data table, and the annotation information at least comprises node parameters of the xml nodes into which the data corresponding to the labeled cells is to be filled; an extraction module, configured to read the node parameters in the basic parameter table and extract the data values corresponding to the read node parameters from the service data table based on the position information of the corresponding labeled cells; and a filling module, configured to fill the extracted data values into the xml nodes corresponding to the read node parameters in an xml tree structure to obtain the submission xml file corresponding to the service data table.
In another aspect, an embodiment of the present specification further provides a file conversion device, where the device includes at least one processor and a memory for storing processor-executable instructions, where the instructions, when executed by the processor, implement the steps of the method according to any one or more of the above embodiments.
According to the file conversion method, device and equipment provided by one or more embodiments of the specification, annotation information is configured in the sample table for the cells whose data needs to be filled into the xml file, and the annotation information specifies the xml node to be filled. A parameter table is generated based on the annotation information and the position information of the corresponding cells. Data values are then read based on the parameter table and filled into the corresponding nodes of a pre-constructed xml tree structure to obtain an xml file. The generated parameter table can be reused, accommodates changes in the length of list-type reports, and remains usable as long as the report format and the regulatory xsd are not modified, which improves reporting efficiency. The records in the parameter table are independent of one another, so they can be processed concurrently; a large number of report conversion tasks can be completed quickly, and the data conversion delay at submission time is greatly shortened. Batch conversion of existing excel files is also supported, so that later reporting requirements can be met quickly, which reduces research and development cost and improves efficiency. Using a parameter table also lets the conversion program retain maximum flexibility and adapt to the varied format and data requirements of different reports.
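The read-and-fill flow summarized above can be illustrated with a short sketch. It is a simplified illustration, not the implementation of the specification: the service data table is modeled as a dictionary keyed by (row, column), the parameter table as a list of records, and the names `find_node`, `convert`, and the sample node paths are hypothetical. Node parameters that are missing from the xml tree are collected in an exception list, mirroring the exception table described earlier.

```python
import xml.etree.ElementTree as ET


def find_node(root, node_path):
    """Return the node at a path such as 'A/B/C', or None if it is absent."""
    parts = node_path.split("/")
    if parts[0] != root.tag:
        return None
    node = root
    for name in parts[1:]:
        node = node.find(name)
        if node is None:
            return None
    return node


def convert(parameter_table, service_table, root):
    """Fill the xml tree from the service data; return unknown node paths."""
    exceptions = []
    for record in parameter_table:
        node = find_node(root, record["node"])
        if node is None:
            # Node parameter not in the xml tree: record it and move on.
            exceptions.append(record["node"])
            continue
        value = service_table.get((record["row"], record["col"]))
        node.text = "" if value is None else str(value)
    return exceptions


# Hypothetical pre-constructed xml tree and sample data.
root = ET.fromstring("<A><B><C/><D/></B></A>")
parameter_table = [
    {"node": "A/B/C", "row": 3, "col": 2},
    {"node": "A/B/D", "row": 4, "col": 2},
    {"node": "A/B/E", "row": 5, "col": 2},  # not present in the tree
]
service_table = {(3, 2): 1250.5, (4, 2): "USD"}

exceptions = convert(parameter_table, service_table, root)
print(ET.tostring(root, encoding="unicode"))
print(exceptions)
```

Because each record carries its own position and target node, the loop body does not depend on any other record, which is what makes concurrent processing of records possible.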
Drawings
In order to more clearly illustrate the embodiments of the present specification or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, it is obvious that the drawings in the following description are only some embodiments described in the present specification, and for those skilled in the art, other drawings can be obtained according to the drawings without any creative effort. In the drawings:
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of a file conversion method provided in the present specification;
FIG. 2 is a schematic diagram of annotation information in one embodiment provided herein;
FIG. 3 is a schematic diagram of annotation information in one embodiment provided herein;
FIG. 4 is a parameter table construction diagram in one embodiment provided in the present specification;
FIG. 5 is a schematic diagram of annotation information in one embodiment provided herein;
FIG. 6 is a schematic diagram of annotation information in one embodiment provided herein;
FIG. 7 is a schematic diagram of an xml tree structure generation in one embodiment provided by the present specification;
FIG. 8 is a diagram illustrating a portion of information in an xsd file in one embodiment provided in the present specification;
FIG. 9 is a schematic diagram of a node directed graph in one embodiment provided in the present specification;
FIG. 10 is a data value population diagram in one embodiment provided herein;
FIG. 11 is a flow diagram illustrating xml file generation processing in one embodiment provided herein;
FIG. 12 is a schematic block diagram of a file conversion apparatus provided in this specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in one or more embodiments of the present specification will be clearly and completely described below with reference to the drawings in one or more embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the specification, and not all embodiments. All other embodiments obtained by a person skilled in the art based on one or more embodiments of the present specification without making any creative effort shall fall within the protection scope of the embodiments of the present specification.
In an application scenario example of the present specification, the file conversion method may be applied to a file conversion tool. The execution logic of the file conversion tool can be configured in the server or the client. The client can be an intelligent terminal such as a computer and a service terminal. For example, the file conversion tool can be a client application program constructed based on Java language to convert excel reports into xml files in a specified form in batch, so that a business system of a financial institution can conveniently feed back business data to a corresponding system of a supervision department.
Fig. 1 is a schematic flowchart of an embodiment of the file conversion method provided in this specification. When the method is applied to a server or an end product in practice, the method may be executed sequentially or in parallel according to the embodiments or the method shown in the drawings (for example, in an environment of parallel processors or multi-thread processing, or even in an implementation environment including distributed processing and server clustering). Fig. 1 shows a specific embodiment of a file conversion method provided in this specification, which may be applied to the file conversion tool. The method may comprise the steps of:
S20: acquiring a basic parameter table corresponding to the service data table, where the basic parameter table is determined based on the position information and annotation information of the labeled cells in the sample table corresponding to the service data table, and the annotation information at least comprises node parameters of the xml nodes into which the data corresponding to the labeled cells is to be filled.
The business system of the financial institution can make different sample tables for different businesses. The sample table and the service data table filled with the service data have the same structure format, and are distinguished in that the sample table only contains report style information and does not contain specific service data, and the service data table contains service data written according to report frequency (daily report, monthly report, annual report and the like).
The sample table may be labeled in advance. Typically, not all types of data in a sample table need to be reported to the regulatory authority. Annotation information can be added in advance to the cells in the sample table that need to be reported, based on the reporting requirements issued by the regulatory authority; cells corresponding to data that does not need to be reported are left unlabeled. By reading the annotation information, the tool can then accurately identify the data to be submitted and the position of the cells that hold it, so that the data to be submitted can be extracted from a large amount of complicated data, which makes data screening simpler and more accurate. For convenience of description, a cell to which annotation information has been added is referred to as a labeled cell.
The submission requirements may be determined by an xsd (xml schema definition) file provided by the regulatory authority. The xsd file can contain two kinds of configuration information: one defines the xml node parameters of the data to be reported, such as the node identifiers and node paths of the xml nodes; the other defines the constraints on the data filled into each node of the xml file, such as whether the data is numeric or enumerated, the number of decimal places for numeric data, and the permitted values for enumerated data.
The sample table may be labeled manually by business personnel based on the xsd file provided by the regulatory authority. Alternatively, a labeling program may be designed to read the corresponding information in the xsd file and add annotation information to the cells in the sample table that need to be reported. Other methods may also be used, and no limitation is imposed here.
Correspondingly, the label information at least includes node parameters of the xml node to which the data corresponding to the label cell needs to be filled. The node parameters may include, for example, node identifications of xml nodes, node paths, and the like. By configuring the node parameters of the xml nodes, association can be established between the data corresponding to the marked cells and the xml nodes to be filled, so that the data can be filled accurately.
The basic parameter table can be determined based on the position information and annotation information of the labeled cells in the sample table corresponding to the service data table. For example, the type of data to be filled into each row and column of the parameter table can be preset. The tool can read the annotation information in the sample table together with the position information of the labeled cell associated with it, and fill both into the preset table to obtain the basic parameter table. The basic parameter table is an intermediate file: as long as the sample table and the regulatory xsd file do not change, it only needs to be generated once during initial development and does not need to be regenerated for subsequent daily conversions. The intermediate file is persisted as an excel file, so the data in the parameter table can be modified manually. When the sample table or the xsd changes in a minor way, the parameter table can be edited by hand, so that the change is handled quickly and flexibility is retained.
Most service data changes over time, so the amount of data usually cannot be kept constant, and the regulatory authority cannot determine the actual amount of data for each type of service data. Therefore, the data that the regulatory authority requires for each xml node is usually a category of data, such as the transaction amount data of a certain product. The transaction amount data of a product usually comprises multiple transaction records rather than a single value. That is, the same xml node may correspond to multiple sets of data.
In some embodiments, the position information and annotation information of the labeled cells in the sample table can be read in sequence. A node subscript is added to the node parameter; the node subscript is used to distinguish the data corresponding to different labeled cells under the same xml node. The position information of the labeled cells is stored in association with the node parameters to which the node subscripts have been added in an editable data table, to obtain the basic parameter table. Appending a node subscript to the node path makes writing data under each xml node more flexible and accurate.
In other embodiments, the content of the annotation information may also be determined according to the writing mode of the data in the labeled cell. The writing mode may include at least a fixed-position type and a list type. The fixed-position type means that the position of the cells into which the service data is written stays fixed over time. The list type means that the cells into which the service data is written extend in a specified direction over time. For example, if the individual transaction amounts of a certain product are written into a column of the data table once a day, and the number of transactions of the product varies from day to day, then the cells occupied by the transaction amount data written into that column each day are not fixed, and the occupied cells extend along the column. By contrast, for some service data the position of the cells written stays fixed over time.
Correspondingly, the annotation information written in a labeled cell of the fixed-position type at least includes the node parameters of the xml node into which the data corresponding to the labeled cell is to be filled. Fig. 2 is a schematic diagram showing the annotation information of a labeled cell of the fixed-position type. As shown in fig. 2, the annotation information includes a node path (Metric) and a writing mode (Data Type) of the fixed-position type (Bk-Boolean). The annotation information written in a labeled cell of the list type may include at least the node parameters of the xml node into which the data corresponding to the labeled cell is to be filled, together with an extension direction, an extension length, and the like. Fig. 3 is a schematic diagram showing the annotation information of a labeled cell of the list type. As shown in fig. 3, the annotation information includes a node path (Metric), a writing mode (Data Type) of the list type (Bk-Text (200)), an extension direction (Direction) of downward (Down), and an extension length (Max Length) of 100.
When the annotation information is read, if it does not include an extension direction and an extension length, the writing mode of the labeled cell is determined to be the fixed-position type; if it includes an extension direction and an extension length, the writing mode of the labeled cell is determined to be the list type.
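The rule above can be expressed as a small check. The sketch below is illustrative only: the annotation information is modeled as a dictionary whose keys follow the field names shown in figs. 2 and 3 (Metric, Data Type, Direction, Max Length), the function name `writing_mode` is hypothetical, and treating an annotation with only one of the two extension fields as fixed-position is an assumption not stated in the source.

```python
def writing_mode(annotation):
    """Classify a labeled cell as 'list' or 'fixed' from its annotation."""
    has_direction = annotation.get("Direction") is not None
    has_length = annotation.get("Max Length") is not None
    # Assumption: both fields must be present for the list type.
    return "list" if has_direction and has_length else "fixed"


fixed_cell = {"Metric": "A/B/C", "Data Type": "Bk-Boolean"}
list_cell = {
    "Metric": "A/B/D",
    "Data Type": "Bk-Text (200)",
    "Direction": "Down",
    "Max Length": 100,
}

print(writing_mode(fixed_cell))
print(writing_mode(list_cell))
```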
For labeled cells of the fixed-position type, the position and number of the cells occupied by the corresponding service data are fixed, so all of the occupied cells can be labeled at labeling time. Correspondingly, as shown in fig. 4, when the basic parameter table is generated and the annotation information that was read does not include an extension direction and an extension length, the labeled cells that share a node path are sorted by the upper-left-priority principle of the sample table, and node subscripts are added to their node paths, so that the order of the labeled cells under the same node path is made explicit. For example, as shown in fig. 5, the node paths of cells X and Y are both A/B/C. The node path information of cell X can then be configured as A/B/C^1 and that of cell Y as A/B/C^2; that is, a node subscript is appended to the original node path. The upper-left-priority principle is only a preferred scheme reflecting the usual storage habits of financial business data; a specific embodiment may adopt another sorting principle suited to how the service data table stores its data, such as upper-right priority or lower-left priority.
As shown in fig. 4, for the list type the position and number of the cells occupied by the service data are not fixed, so a maximum cell occupation range may be preset as the extension length and included in the annotation information. For example, the amount of service data under the node path in historical service data tables can be counted, and twice the historical maximum can be used as the extension length. At labeling time, the first cell occupied by the data can be selected as the labeled cell based on the upper-left-priority principle, and the extension direction and extension length are configured in its annotation information. Correspondingly, when the basic parameter table is generated and the annotation information that was read includes an extension direction and an extension length, the node subscript of the labeled cell is first added to the node path according to the upper-left-priority ordering, and the position information of the extended cells, together with their associated node paths and node subscripts, is then supplemented based on the extension direction and extension length.
When the service data table is read, the actual service data may not fill all of the generated node subscripts. A deletion flag can therefore be added to the annotation information in advance, so that surplus empty node subscripts are deleted when no data is filled in. For example, as shown in fig. 6, the annotation information includes an extension direction of downward and an extension length of 100. After sorting by the upper-left-priority principle, if no other labeled cell with the same node identifier precedes this labeled cell, its node subscript is set to 1 and the node path information is A/B/D^1. The position information of the extended cells and their associated node paths and node subscripts are then supplemented based on the extension direction and extension length. A deletion flag (Delete Flag) may be added correspondingly; part of the generated basic parameter table is shown in Table 1.
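The subscript assignment for fixed-position cells can be sketched as below. This is a minimal illustration, not the implementation of the specification: the cell coordinates are invented, the function name `add_subscripts` is hypothetical, and upper-left priority is interpreted here as sorting by row first and then by column, which is an assumption.

```python
from collections import defaultdict


def add_subscripts(cells):
    """Append ^k to each node path, numbering cells that share a path.

    cells: list of (row, col, node_path) tuples for labeled cells.
    """
    counters = defaultdict(int)
    result = []
    # Assumed reading of upper-left priority: smaller row first, then column.
    for row, col, path in sorted(cells, key=lambda c: (c[0], c[1])):
        counters[path] += 1
        result.append((row, col, f"{path}^{counters[path]}"))
    return result


# Hypothetical positions: X and Y share the path A/B/C, Z has its own path.
cells = [
    (7, 2, "A/B/C"),  # Y
    (5, 2, "A/B/C"),  # X
    (6, 3, "A/B/D"),  # Z
]

for entry in add_subscripts(cells):
    print(entry)
```

A different sorting principle, such as upper-right priority, would only change the sort key.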
TABLE 1
Node name | Row | Column | Delete flag |
A/B/D^1 | 23 | 2 | True |
A/B/D^2 | 24 | 2 | True |
A/B/D^3 | 25 | 2 | True |
··· | ··· | ··· | ··· |
A/B/D^100 | 123 | 2 | True |
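The expansion of a list-type labeled cell into parameter table records can be sketched as follows. The sketch is illustrative and uses small invented numbers rather than the values of Table 1. The names `extension_length` and `expand_list_cell` are hypothetical, the "Right" direction is added only for illustration (the source shows only Down), and counting the labeled cell itself as the first of the extended cells is an assumption.

```python
# Row/column step per extension direction. Only "Down" appears in the
# source; "Right" is a hypothetical second direction.
STEP = {"Down": (1, 0), "Right": (0, 1)}


def extension_length(historical_counts):
    """Twice the historical maximum data volume under a node path."""
    return 2 * max(historical_counts)


def expand_list_cell(node_path, row, col, direction, max_length):
    """Generate one record per cell, starting at the labeled cell.

    Assumption: the labeled cell counts as the first of max_length cells.
    """
    d_row, d_col = STEP[direction]
    return [
        {
            "node": f"{node_path}^{i + 1}",
            "row": row + i * d_row,
            "col": col + i * d_col,
            "delete": True,  # deletion flag: drop the node if left empty
        }
        for i in range(max_length)
    ]


length = extension_length([2, 3, 1])  # hypothetical historical volumes
records = expand_list_cell("A/B/D", 10, 2, "Down", length)
for record in records:
    print(record)
```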
Correspondingly, in some embodiments, the annotation information may further include a writing manner of writing the service data into the cell, where the writing manner at least includes a location fixing type and a list type. The position fixing type refers to the position fixing of a cell in which service data is written over time. The list type means that the position of a cell in which service data is written changes with time is expanded to a specified direction. Correspondingly, the adding of the node subscript in the node parameter may include: under the condition that the labeling information does not comprise an extension mode and an extension length, adding a node small label in the node parameter based on a preset node sorting rule of the sample table; and under the condition that the labeling information does not comprise an extension mode and an extension length, adding node small marks in the node parameters based on a preset node sorting rule of the sample table, and supplementing and adding the node small marks in the node parameters according to the extension direction and the extension length in the labeling information. The node small label increasing mode is configured by further distinguishing the table state occupied by writing the service data into the service data table, so that the node small label can be increased more accurately.
When the basic parameter table is generated, the identification information of the sample table can be read and stored in association in the basic parameter table, so as to identify the sample table from which each piece of information in the basic parameter table comes; the service data table corresponding to the sample table can then be found quickly based on this identification. The sample table may also include a plurality of sub-tables, whose identification information may likewise be read and stored in association in the basic parameter table.
Alternatively, the file identifier of the xml file into which the data needs to be written may also be added to the labeling information. When the basic parameter table is generated, the file identifier of the xml file can be read and stored in association in the basic parameter table, so that the xml file to be written can be located quickly.
A zero padding flag (Zero Flag) may also be added to the labeling information. The zero padding flag indicates whether a value is automatically set to 0 when it is empty. The zero padding flag and the deletion flag may be set to be mutually exclusive, with the deletion flag taking priority. Using the deletion flag and the zero padding flag together makes adding and deleting node subscripts more flexible without affecting how empty data or default values in the service data table need to be handled, so that special requirements of the service data table are met and the flexibility of generating the xml file is improved.
Accordingly, in some embodiments, the labeling information may further include a deletion flag and a zero padding flag. The deletion flag is used to delete the corresponding xml node, or the child node corresponding to the node subscript, when the data value is empty. The zero padding flag is used to fill a specified value into the corresponding xml node, or the child node corresponding to the node subscript, when the data value is empty. The method may further comprise: reading the deletion flag and the zero padding flag in the labeling information; and storing the read deletion flag and zero padding flag, the position information of the labeled cells, and the node parameters with the added node subscripts in association in an editable data table to obtain the basic parameter table.
As shown in table 2, table 2 is an example of contents included in a basic parameter table of a transaction detail list provided in an embodiment of the present specification.
TABLE 2
S22: reading the node parameters in the basic parameter table, and extracting the data values corresponding to the read node parameters from the service data table based on the position information of the labeled cells corresponding to the read node parameters.
S24: filling the extracted data values into the xml nodes corresponding to the read node parameters in the xml tree structure to obtain a submission xml file corresponding to the service data table.
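Step S22 can be illustrated with a short sketch. It assumes, purely for illustration, that the service data table is held in memory as a list of rows and that positions in the basic parameter table are 1-based; `extract_values` and the dictionary keys are hypothetical names.

```python
def extract_values(param_rows, service_table):
    """Look up the data value for each node path by cell position.

    param_rows: list of dicts with 'node_path', 'row', 'col' (1-based).
    service_table: list of rows, each a list of cell values.
    Cells outside the table yield None (no data was filled in).
    """
    extracted = {}
    for p in param_rows:
        r, c = p["row"] - 1, p["col"] - 1  # convert to 0-based indices
        in_range = 0 <= r < len(service_table) and 0 <= c < len(service_table[r])
        extracted[p["node_path"]] = service_table[r][c] if in_range else None
    return extracted


service_table = [
    ["Name", "Amount"],
    ["Alice", 10],
    ["Bob", 20],
]
param_rows = [
    {"node_path": "A/B/D^1", "row": 2, "col": 2},
    {"node_path": "A/B/D^2", "row": 3, "col": 2},
    {"node_path": "A/B/D^3", "row": 4, "col": 2},  # beyond the actual data
]
values = extract_values(param_rows, service_table)
print(values)
```

The third entry has no corresponding data, which is the situation the deletion flag and the zero padding flag are meant to handle.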
The file conversion tool may process the xsd file provided by the regulatory body to build the initial xml tree structure. In this embodiment, only element node information is extracted to construct the initial xml tree structure used for generating the xml file, which greatly simplifies xml file generation. Generating the xml file by constructing a tree structure also improves the flexibility and accuracy of data filling.
In some embodiments, the xsd file specified in the basic parameter table may be obtained, where the specified xsd file includes at least the node parameters of the xml nodes contained in the submission xml file. The specified xsd file is traversed and the element nodes are screened out; a node directed graph is constructed according to the reference relationships among the element nodes in the specified xsd file; and the xml tree structure is generated from the node directed graph. Constructing the element node directed graph and building the xml tree structure from it greatly improves the accuracy and efficiency of constructing the xml tree structure from the xsd file.
As shown in FIG. 7, all rows in the xsd file may be traversed first, and the element nodes are screened out. Because the number of elements is large, an index can be built on the element names to speed up lookups in the subsequent steps. Alternatively, the element name can be used as the xml node identifier in the initial xml tree structure, which likewise speeds up later lookups. An element node directed graph can then be constructed according to the reference relationships among the element nodes in the xsd file. The root node can be determined from the reference relationships in the directed graph and used as the starting point of the xml tree structure, and the initial xml tree structure is constructed from the element node directed graph.
Fig. 8 is an example of partial information of an xsd file. In one example scenario, after traversing all the element nodes in fig. 8, the element nodes found are Kontoregister, Info_Daten, MessageSpec, KontoregisterBody, Fastnr_Fon_Tn, Fastnr_Fi, and Vers. An element node directed graph is constructed according to the reference relationships among the element nodes in fig. 8, as shown in fig. 9. The Kontoregister node contains Info_Daten, MessageSpec and KontoregisterBody, and the Info_Daten node contains the Fastnr_Fon_Tn, Fastnr_Fi and Vers nodes. From this directed graph, it can be determined that Kontoregister is the root node. The xml tree structure is then generated with Kontoregister as the root node, with Info_Daten, MessageSpec and KontoregisterBody as its children and Fastnr_Fon_Tn, Fastnr_Fi and Vers as the children of Info_Daten.
By establishing indexes for the nodes, or by using the element node names as the node identifiers of the xml nodes, the data filling efficiency during the subsequent xml construction can be improved and the overall operation speed of the tool is increased. In addition, the memory used by each run of the tool is independent, so that concurrent processing of a plurality of xsd files can be supported and the running efficiency is improved.
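The construction of the element directed graph and the tree can be sketched with Python's standard `xml.etree.ElementTree` module. The schema text below is a hypothetical reconstruction that only reuses the element names from the example; the real xsd file of fig. 8 is not reproduced here, and the sketch assumes that references are expressed with unprefixed `ref` attributes and that the graph has no cycles.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: element names from the example, layout assumed.
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Kontoregister">
    <xs:complexType><xs:sequence>
      <xs:element ref="Info_Daten"/>
      <xs:element ref="MessageSpec"/>
      <xs:element ref="KontoregisterBody"/>
    </xs:sequence></xs:complexType>
  </xs:element>
  <xs:element name="Info_Daten">
    <xs:complexType><xs:sequence>
      <xs:element ref="Fastnr_Fon_Tn"/>
      <xs:element ref="Fastnr_Fi"/>
      <xs:element ref="Vers"/>
    </xs:sequence></xs:complexType>
  </xs:element>
  <xs:element name="MessageSpec" type="xs:string"/>
  <xs:element name="KontoregisterBody" type="xs:string"/>
  <xs:element name="Fastnr_Fon_Tn" type="xs:string"/>
  <xs:element name="Fastnr_Fi" type="xs:string"/>
  <xs:element name="Vers" type="xs:string"/>
</xs:schema>"""


def build_graph(xsd_text):
    """Map each top-level element name to the element names it references."""
    schema = ET.fromstring(xsd_text)
    graph = {}
    for top in schema.findall(f"{XS}element"):  # top-level declarations only
        refs = [e.get("ref") for e in top.iter(f"{XS}element") if e.get("ref")]
        graph[top.get("name")] = refs
    return graph


def find_root(graph):
    """The root is the only element that no other element references."""
    referenced = {child for children in graph.values() for child in children}
    roots = [name for name in graph if name not in referenced]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {roots}")
    return roots[0]


def build_tree(graph, name):
    """Recursively expand the directed graph into an xml tree (no cycles assumed)."""
    node = ET.Element(name)
    for child in graph.get(name, []):
        node.append(build_tree(graph, child))
    return node


graph = build_graph(XSD)
root_name = find_root(graph)
tree = build_tree(graph, root_name)
print(root_name, [child.tag for child in tree])
```

The `graph` dictionary doubles as the name index mentioned above, since each element is looked up by name while the tree is expanded.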
The xml tree structure corresponding to an xsd file may be constructed in advance in the manner described above. The identification information of the xml tree structure can be configured as part of the node path information of the xml nodes and added to the labeling information, so that subsequent data values can be filled in quickly and accurately. Because xsd files are usually updated infrequently, constructing the tree in advance can greatly improve the efficiency of generating the xml file.
Alternatively, the identification information of the xsd file corresponding to a labeled cell may be configured in the basic parameter table. When the xml file corresponding to the service data table is generated based on the basic parameter table, the data value is filled into the corresponding xml node if the xml tree structure of the corresponding xsd file already exists. If it does not exist, the xml tree structure of the corresponding xsd file is first generated in the manner described above, and the data value filling operation is then performed.
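The "use the existing tree if there is one, otherwise build it first" behaviour amounts to a cache keyed by the xsd identifier. The sketch below is one possible arrangement; `build_tree_from_xsd` is a hypothetical stand-in for the xsd-based construction, and the identifier `"report_v1.xsd"` is invented for the example.

```python
import copy
import xml.etree.ElementTree as ET

_tree_cache = {}   # xsd identifier -> initial xml tree structure
build_calls = []   # records how often a tree is actually built


def build_tree_from_xsd(xsd_id):
    """Hypothetical stand-in for building the initial tree from an xsd file."""
    build_calls.append(xsd_id)
    root = ET.Element("a")
    ET.SubElement(root, "b")
    return root


def get_initial_tree(xsd_id):
    """Return a fresh copy of the initial tree, building it only on first use."""
    if xsd_id not in _tree_cache:
        _tree_cache[xsd_id] = build_tree_from_xsd(xsd_id)
    # Copy so that filling one submission file does not alter the cached tree.
    return copy.deepcopy(_tree_cache[xsd_id])


first = get_initial_tree("report_v1.xsd")
second = get_initial_tree("report_v1.xsd")
print(len(build_calls))
```

Returning a copy keeps the cached structure empty, so each service data table starts from the same initial tree.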
Different implementation interfaces can be further configured for different xsd files, and each interface can correspond to an xml file generation processing logic under a corresponding xsd file, so that the flexibility of tool execution is ensured.
As shown in fig. 10, the tool may read the data in the basic parameter table line by line, obtain the position information of the labeled cell corresponding to each node path, extract the data value corresponding to the read node parameter from the service data table, and fill the extracted data value into the xml node corresponding to the read node parameter in the xml tree structure, to obtain the submission xml file corresponding to the service data table. Where a node subscript exists, the child nodes corresponding to the node subscript can be added one by one under the corresponding path in the xml tree structure, and the read data values are filled into the child nodes of the corresponding xml nodes based on the node paths and the node subscripts.
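Filling by node path and node subscript can be sketched as follows. The path syntax (`/a/d^2/g^1`) follows the examples in this description; treating a missing subscript as 1, appending new siblings at the end of the parent, and the starting tree itself are assumptions of this sketch, and `parse_path` and `fill` are hypothetical names. Checking the path against the schema is deliberately left out here.

```python
import re
import xml.etree.ElementTree as ET

STEP = re.compile(r"^([^^]+)(?:\^(\d+))?$")  # "name" or "name^index"


def parse_path(path):
    """Split '/a/d^2/g^1' into [('a', 1), ('d', 2), ('g', 1)]."""
    steps = []
    for part in path.strip("/").split("/"):
        match = STEP.match(part)
        if match is None:
            raise ValueError(f"malformed path step: {part!r}")
        name, index = match.groups()
        steps.append((name, int(index) if index else 1))  # no subscript -> 1
    return steps


def fill(root, path, value):
    """Walk the path, adding child nodes as needed, and set the text value."""
    steps = parse_path(path)
    if steps[0][0] != root.tag:
        raise KeyError(path)
    node = root
    for name, index in steps[1:]:
        siblings = node.findall(name)
        while len(siblings) < index:  # add child nodes one by one
            siblings.append(ET.SubElement(node, name))
        node = siblings[index - 1]
    node.text = "" if value is None else str(value)
    return node


# Assumed initial tree for the illustration.
root = ET.fromstring("<a><b/><c><e/></c><d><f/><g/></d></a>")
fill(root, "/a/c/e^1", 1)
fill(root, "/a/c/e^2", None)   # empty value, corrected later by the flags
fill(root, "/a/d^1/f", 2)
fill(root, "/a/d^2/f", 3)
print(ET.tostring(root, encoding="unicode"))
```

A second `d` node is created on demand when `/a/d^2/f` is read, which mirrors how list-type data grows the tree.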
FIG. 11 is a schematic diagram of the overall xml file generation process in an example scenario. Assume that an initial xml tree structure has been generated based on the regulatory agency's xsd file.
Table 3 is an example table consisting of node path information with node subscripts and the corresponding data values, zero padding flags, and deletion flags. Table 3 is for illustration only; it does not mean that the following information must actually be extracted into a table like Table 3 during processing.
TABLE 3
Node path information | Data value (value) | Delete flag | Zero padding flag |
/a/c/e^1 | 1 | false | true |
/a/c/e^2 | | true | true |
/a/d^1/f | 2 | false | true |
/a/d^2/f | 3 | false | true |
/a/d^2/g^1 | | false | true |
/a/e/x^1 | E | false | true |
Correspondingly, the xml is built up step by step as each piece of node path information is read in turn: /a/c/e^1, /a/c/e^2, /a/d^1/f, /a/d^2/f and /a/d^2/g^1. After each read, the child node addressed by the node path and node subscript is added under the corresponding path, and the data value, if any, is filled in. Since the b node is not involved in the node path information, it is omitted from the illustration for convenience of description.
After the node path information /a/e/x^1 is read, checking it against the initial xml tree structure shows that the path is invalid. In this case a path error prompt can be raised and the node path information is skipped; that is, the xml structure remains as it was after reading the node path information /a/d^2/g^1. If further node path information exists, processing continues with the next entry. In this way, a path error does not terminate the generation of the whole xml file, and the stability of the tool is improved.
Accordingly, in some embodiments, when the read node parameter exists in the xml tree structure, based on the location information of the labeled cell corresponding to the read node parameter, the data value corresponding to the read node parameter may be extracted from the service data table. Under the condition that the read node parameters do not exist in the xml tree structure, the read node parameters are recorded into the exception table, and the next node parameter is continuously read, so that the generation of the whole xml file can be prevented from being terminated when a path error occurs, and the processing stability of the tool is improved.
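The skip-and-record behaviour can be sketched as below. The set of allowed paths stands in for a lookup against the initial xml tree structure and is an assumed example; `strip_subscripts`, `process` and the shape of the exception entries are hypothetical.

```python
import re


def strip_subscripts(path):
    """Remove node subscripts: '/a/d^2/g^1' -> '/a/d/g'."""
    return re.sub(r"\^\d+", "", path).rstrip("/")


def process(records, allowed_paths):
    """Split records into those that can be filled and those with path errors."""
    accepted, exceptions = [], []
    for path, value in records:
        if strip_subscripts(path) in allowed_paths:
            accepted.append((path, value))
        else:
            # Record the problem and continue with the next node parameter.
            exceptions.append(
                {"node_path": path, "reason": "path not in xml tree structure"}
            )
    return accepted, exceptions


# Assumed paths of the initial xml tree structure.
allowed = {"/a", "/a/b", "/a/c", "/a/c/e", "/a/d", "/a/d/f", "/a/d/g"}
records = [("/a/c/e^1", "1"), ("/a/e/x^1", "E"), ("/a/d^2/f", "3")]
accepted, exceptions = process(records, allowed)
print(accepted)
print(exceptions)
```

Because the invalid record only adds an entry to the exception list, the records after it are still processed.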
Then, the xml generated in the above manner can be corrected according to the deletion flag and the zero padding flag corresponding to each node path or to the node subscript under the node path. For the node path information /a/c/e^2, the data value is empty, the deletion flag is true and the zero padding flag is true; because the deletion flag takes priority over the zero padding flag, the child node is deleted. For the node path information /a/d^2/g^1, the data value is empty, the deletion flag is false and the zero padding flag is true, so the child node is filled with 0.
After completing the zero padding and deletion processing, the tool may serialize the xml structure and generate the xml file.
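The correction pass and the final serialization can be sketched as follows. The filled tree used as input is an assumed intermediate state for the Table 3 example, and passing the flags as `(parent, child, delete_flag, zero_flag)` tuples is a simplification chosen for this sketch.

```python
import xml.etree.ElementTree as ET


def apply_flags(entries, fill_value="0"):
    """entries: list of (parent, child, delete_flag, zero_flag) tuples."""
    for parent, child, delete_flag, zero_flag in entries:
        is_empty = not (child.text or "").strip() and len(child) == 0
        if not is_empty:
            continue                  # nodes that hold data are left alone
        if delete_flag:               # deletion takes priority over zero padding
            parent.remove(child)
        elif zero_flag:
            child.text = fill_value


# Assumed state of the tree after all valid node paths have been filled.
root = ET.fromstring(
    "<a><c><e>1</e><e/></c><d><f>2</f></d><d><f>3</f><g/></d></a>"
)
c = root.find("c")
e2 = c.findall("e")[1]        # /a/c/e^2: empty, delete=true, zero=true
d2 = root.findall("d")[1]
g1 = d2.find("g")             # /a/d^2/g^1: empty, delete=false, zero=true
apply_flags([(c, e2, True, True), (d2, g1, False, True)])

xml_text = ET.tostring(root, encoding="unicode")  # serialize the structure
print(xml_text)
```

Writing `xml_text` to disk, or calling `ET.ElementTree(root).write(...)`, would then produce the submission file.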
The records in the basic parameter table are ordered by node path and by node subscript under each node path, which improves the accuracy of xml file generation. The records in the basic parameter table are also independent of one another, so records within one parameter table, as well as different parameter tables, can be processed in parallel, improving the flexibility and efficiency of xml file processing. Because the records are independent, an error handling mechanism can be flexibly configured for errors that may occur in the parameter table; an error in one record then does not cause the whole generation process to fail, which improves the fault tolerance of xml file generation. As shown in Table 3, a parameter table filled with data values can be generated alongside the xml file, so that the submitting personnel can check and correct the xml file, improving the accuracy of the submitted data. The generated xml file can also be verified with the XML Notepad tool and submitted to the regulatory authority's system after verification passes.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. For details, reference may be made to the description of the related embodiments of the related processing, and details are not repeated herein.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the file conversion method, one or more embodiments of the present specification further provide a file conversion apparatus. The apparatus may include systems, software (applications), modules, components, servers, etc. that utilize the methods described in the embodiments of the present specification in conjunction with hardware implementations as necessary. Based on the same innovative conception, embodiments of the present specification provide an apparatus as described in the following embodiments. Since the implementation scheme of the apparatus for solving the problem is similar to that of the method, the specific implementation of the apparatus in the embodiment of the present specification may refer to the implementation of the foregoing method, and repeated details are not repeated. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated. Specifically, fig. 12 is a schematic diagram of a module structure of an embodiment of a file conversion apparatus provided in the specification, and is applied to a file conversion tool. As shown in fig. 12, the apparatus may include:
the obtaining module 102 may be configured to obtain a basic parameter table corresponding to the service data table; the basic parameter table is determined based on the position information and the marking information of the marking cells in the sample table corresponding to the service data table; the marking information at least comprises node parameters of the xml nodes to which the data corresponding to the marking cells need to be filled.
The extracting module 104 may be configured to read a node parameter in the basic parameter table, and extract a data value corresponding to the read node parameter from the service data table based on the position information of the labeled cell corresponding to the read node parameter.
The filling module 106 may be configured to fill the extracted data value into an xml node corresponding to the read node parameter in an xml tree structure, so as to obtain a submission xml file corresponding to the service data table.
It should be noted that the above-described apparatus may also include other embodiments according to the description of the method embodiment. The specific implementation manner may refer to the description of the related method embodiment, and is not described in detail herein.
The present specification also provides a file conversion device, which can be applied to a file conversion tool and can also be applied to various computer data processing systems. The system may be a single server, or may include a server cluster, a system (including a distributed system), software (applications), an actual operating device, a logic gate device, a quantum computer, etc. using one or more of the methods or one or more of the example devices of the present specification, in combination with a terminal device implementing hardware as necessary. In some embodiments, the apparatus may include at least one processor and a memory for storing processor-executable instructions that, when executed by the processor, perform steps comprising the method of any one or more of the embodiments described above.
The memory may include physical means for storing information, typically by digitizing the information for storage on a medium using electrical, magnetic or optical means. The storage medium may include: devices that store information using electrical energy, such as various types of memory, e.g., RAM, ROM, etc.; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, bubble memories, and usb disks; devices that store information optically, such as CDs or DVDs. Of course, there are other ways of storing media that can be read, such as quantum memory, graphene memory, and so forth.
It should be noted that the above-mentioned device may also include other implementation manners according to the description of the method or apparatus embodiment, and specific implementation manners may refer to the description of the related method embodiment, which is not described in detail herein.
The embodiments of the present description are not limited to what must be consistent with a standard data model/template or described in the embodiments of the present description. Certain industry standards, or implementations modified slightly from those described using custom modes or examples, may also achieve the same, equivalent, or similar, or other, contemplated implementations of the above-described examples. The embodiments using these modified or transformed data acquisition, storage, judgment, processing, etc. may still fall within the scope of the alternative embodiments of the present description.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment. In the description of the specification, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.
Claims (10)
1. A method of file conversion, the method comprising:
acquiring a basic parameter table corresponding to the service data table; the basic parameter table is determined based on the position information and the marking information of the marking cells in the sample table corresponding to the service data table; the marking information at least comprises node parameters of the xml nodes to which the data corresponding to the marking cells need to be filled;
reading the node parameters in the basic parameter table, and extracting the data values corresponding to the read node parameters from the service data table based on the position information of the labeled cells corresponding to the read node parameters;
and filling the extracted data values into the xml nodes corresponding to the read node parameters in the xml tree structure to obtain a submission xml file corresponding to the service data table.
2. The method according to claim 1, wherein the xml tree structure is constructed in the following manner:
acquiring an xsd file specified in the basic parameter table; wherein the specified xsd file at least comprises the node parameters of the xml nodes contained in the submission xml file;
traversing the specified xsd file, and screening out element nodes from the specified xsd file;
constructing a node directed graph according to the reference relationship among the element nodes in the specified xsd file;
and generating an xml tree structure by using the node directed graph.
3. The method of claim 2, wherein the name of the element node is taken as the xml node name in the xml tree structure.
4. The method of claim 2 wherein the base parameter table is determined by:
reading the position information and the labeling information of the labeled cells in the sample table in sequence;
adding a node subscript in the node parameter; the node subscript is used for distinguishing data corresponding to different labeled cells under the same xml node;
and storing the position information of the labeled cells and the node parameters added with the node subscripts into an editable data table in an associated manner to obtain a basic parameter table.
5. The method according to claim 4, wherein the labeling information further includes a writing mode in which the service data is written into the cells, and the writing mode includes at least a fixed-position type and a list type; the fixed-position type means that the position of the cell into which the service data is written stays fixed over time; the list type means that the positions of the cells into which the service data is written extend in a specified direction over time;
correspondingly, the adding of the node subscript in the node parameter includes:
under the condition that the labeling information does not include an extension direction and an extension length, adding a node subscript in the node parameter based on a preset node sorting rule of the sample table;
and under the condition that the labeling information includes an extension direction and an extension length, adding node subscripts in the node parameter based on a preset node sorting rule of the sample table, and supplementing further node subscripts in the node parameter according to the extension direction and the extension length in the labeling information.
6. The method of claim 4, wherein the labeling information further comprises a deletion flag and a zero padding flag; the deletion flag is used for deleting the corresponding xml node or the child node corresponding to the node subscript under the condition that the data value is empty; the zero padding flag is used for filling a specified value into the corresponding xml node or the child node corresponding to the node subscript under the condition that the data value is empty; the method further comprises the following steps:
reading a deletion mark and a zero padding mark in the labeling information;
and storing the read deletion marks and zero padding marks, the position information of the marked cells and the node parameters added with the node subscripts into an editable data table in an associated manner to obtain a basic parameter table.
7. The method according to any one of claims 1 to 6, wherein in a case that the read node parameter exists in the xml tree structure, the data value corresponding to the read node parameter is extracted from the service data table based on the location information of the labeled cell corresponding to the read node parameter.
8. Method according to any of claims 1 to 6, characterized in that in case the read node parameter is not present in the xml tree structure, the read node parameter is logged into the exception table and the reading of the next node parameter is continued.
9. A file conversion apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a basic parameter table corresponding to the service data table; the basic parameter table is determined based on the position information and the marking information of the marking cells in the sample table corresponding to the service data table; the marking information at least comprises node parameters of the xml nodes to which the data corresponding to the marking cells need to be filled;
the extraction module is used for reading the node parameters in the basic parameter table and extracting the data values corresponding to the read node parameters from the service data table based on the position information of the labeled cells corresponding to the read node parameters;
and the filling module is used for filling the extracted data values into the xml nodes corresponding to the read node parameters in the xml tree structure to obtain the submission xml file corresponding to the service data table.
10. A file conversion device, characterized in that it comprises at least one processor and a memory for storing processor-executable instructions, which when executed by said processor implement steps comprising the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110392762.0A CN113095044A (en) | 2021-04-13 | 2021-04-13 | File conversion method, device and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113095044A true CN113095044A (en) | 2021-07-09 |
Family
ID=76676541
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110392762.0A Pending CN113095044A (en) | 2021-04-13 | 2021-04-13 | File conversion method, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113095044A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114461207A (en) * | 2021-11-30 | 2022-05-10 | 深圳银兴科技开发有限公司 | Message conversion integrated component |
CN117331887A (en) * | 2023-10-31 | 2024-01-02 | 中国人民解放军32039部队 | Automatic migration method and device for configuration file of aerospace measurement and control system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103593457A (en) * | 2013-11-22 | 2014-02-19 | 方正国际软件有限公司 | Method for converting document format |
CN104156342A (en) * | 2014-08-01 | 2014-11-19 | 福建星网视易信息系统有限公司 | Method and device for converting Excel format testing case into XML (extensive markup language) format |
CN106874493A (en) * | 2017-02-23 | 2017-06-20 | 济南浪潮高新科技投资发展有限公司 | A kind of data transfer device and device |
CN109783554A (en) * | 2018-12-13 | 2019-05-21 | 重庆金融资产交易所有限责任公司 | Excel document analytic method, device and computer readable storage medium |
CN110110150A (en) * | 2018-01-04 | 2019-08-09 | 北大医疗信息技术有限公司 | Read method, reading device, computer equipment and the storage medium of XML data |
CN111259634A (en) * | 2020-01-13 | 2020-06-09 | 陕西心像信息科技有限公司 | XSD format file analyzing method and generating method |
CN111858472A (en) * | 2020-08-03 | 2020-10-30 | 平安国际智慧城市科技股份有限公司 | File format conversion method and device, computer equipment and storage medium |
CN112328841A (en) * | 2020-11-30 | 2021-02-05 | 中国民航信息网络股份有限公司 | Document processing method and device, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
无: "XSD: The Path From Excel to XML: The Basics: Mapping Elements and Attributes", 《DIGITAL COMMONS》, 29 April 2015 (2015-04-29), pages 1 - 13 * |
金智勇,等: "XML存储格式的Excel文件转换为数据库的实现", 浙江树人大学学报, no. 06, 25 December 2005 (2005-12-25), pages 80 - 83 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||