WO2002019165A1 - Virtual fields - Google Patents
- Publication number
- WO2002019165A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- virtual
- fields
- document
- data
- mapping
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
Definitions
- the invention is related to the field of representation and translation of electronic documents.
- a field in some document forms can have multiple meanings.
- in documents with data that convey information, a needed piece of information can reside in one of several possible locations, and the location of that piece of information depends on information in other locations in the concrete document.
- a "segment" (group) called "PO1" is shown in Table 1.
- PO1 has elements (i.e. fields) in it, which are called "PO101", "PO102", etc.
- a qualifier element can contain a qualifier code, which determines the meaning of its related data elements. For example, if PO106 contains "UP", then PO107 holds a UPC code. If PO110 holds a "VP", then PO111 holds a vendor part number. However, any of PO106, PO108, ... PO124 can hold any of the qualifier values.
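- the ambiguity described above can be illustrated with a short sketch. It assumes a segment is modeled as a plain dictionary and that each qualifier element is immediately followed by its data element, as in the PO106/PO107 and PO110/PO111 examples; the helper `element_meanings` and the sample values are hypothetical.

```python
# Qualifier codes named in the text; any other code is reported as "unknown".
QUALIFIER_MEANINGS = {"UP": "UPC code", "VP": "vendor part number"}


def element_meanings(segment):
    """Return {data_element: meaning} for each qualifier-data pair present.

    Assumption: qualifiers sit in PO106, PO108, ..., PO124 and the related
    data element is the next element (PO107, PO109, ..., PO125).
    """
    meanings = {}
    for n in range(6, 25, 2):
        qualifier = segment.get(f"PO1{n:02d}")
        data_element = f"PO1{n + 1:02d}"
        if qualifier is not None and data_element in segment:
            meanings[data_element] = QUALIFIER_MEANINGS.get(qualifier, "unknown")
    return meanings


# Two hypothetical segments: PO107 means something different in each.
order_a = {"PO106": "UP", "PO107": "012345678905"}
order_b = {"PO106": "VP", "PO107": "AB-1234",
           "PO108": "UP", "PO109": "012345678905"}

print(element_meanings(order_a))
print(element_meanings(order_b))
```

- in the first segment PO107 carries the UPC code, while in the second the UPC code has moved to PO109, which is why a fixed field-to-field mapping is not enough.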
- mappings in a document can have more than one meaning. This means that conventional methods of mapping are hard to automate. Instead, the mappings must be manually done and require customized code, which does not allow reuse of mapping knowledge and rules.
- mapping and the mapping rules are one-off. That is, each time a user wants to define how to perform a document translation, similar code must be written and tested. This increases the time needed to define how to translate from the source to the target document.
- mapping and the mapping rules depend on user-written code. This makes it hard to automatically validate the integrity of the mapping. It also sets a minimum bar for the skill level of anyone trying to define a mapping, as they must then know all the document locations that might hold a particular meaning, and must be skilful enough to write the code to handle the case. This imposes a maintenance burden, as fixing a problem in a mapping requires altering code.
- mapping and the mapping rules are translation-language dependent.
- the code that must be written and tested depends on the underlying translation engine that will translate the documents.
- mapping rules will be translation-engine dependent, and a translation defined for one translation engine will likely need adjusting to make the mapping work on a different translation engine. Moving a transform from one translation engine to another is difficult.
- the source and target mappings must be significantly different.
- the code for handling the case described above will differ whether the document is the source or the target document. If one has mapped from A to B, mapping from B to A requires major rework, as the code for the mapping would have to be rewritten using different logic.
- mapping tools use superficial similarities in field names or document structure as the basis for automapping. They cannot automap to virtual structures, forcing users to write code.
- a method is disclosed that includes identifying a meaning for data in a document, automatically locating the data having the meaning, assigning a virtual field to the meaning, and automatically mapping and transforming the document using the virtual field.
- Figure 1 is an example of an embodiment of a data structure for a document.
- Figure 2 is another example of an embodiment of a data structure for a document.
- Figure 3 is an example of data structure used to create a virtual field.
- Figure 4 is an example of a network that translates a document using virtual fields.
- Figure 5 is an example of a computer system that translates a document using virtual fields.
- Figure 6 is an example of a translation system that uses virtual fields to translate documents.
- Figure 7 is an embodiment of a method for automatically generating a transform using virtual fields.
- Virtual fields can be used to automatically generate transforms.
- the virtual fields automatically locate data that has a specific meaning in concrete documents where the meaning can reside in any of a set of locations, as identified by data in other document locations, assign a name and a field to that meaning, and let mapping and mapping rules work with the new field.
- Figure 1 depicts a data structure for part of a document.
- Group1 has fields under it.
- Fields Fieldx_q, Fieldy_q and Fieldz_q are qualifier fields that can hold values from a predefined set of qualifier values.
- Fields Fieldx_d1, Fieldy_d1 and Fieldz_d1 are related data fields that can hold data values that will be moved between the source and the target documents.
- Fields Fieldx_d2, Fieldy_d2 and Fieldz_d2 are another set of related data fields.
- Group1 can contain other fields too.
- Figure 2 is a data structure depicting Group1 as having two virtual fields that represent the possible meanings of the data fields in Figure 1.
- Virtual field UPC_Code represents the data field in Fieldx_d1, Fieldy_d1 and Fieldz_d1 whose related qualifier field holds code "UPC". That is, qualifier and data fields come in pairs, such that if a qualifier field contains "UPC" then its related data field holds a UPC code value.
- the virtual fields are here depicted as having three qualifier-data field pairs. In other embodiments, a virtual field needs at least one qualifier-data pair, and can have more than three.
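- one way to picture a virtual field is as a name, a qualifier code, and an ordered list of qualifier-data field pairs. The sketch below is an assumption about representation, not the structure of Figure 2 itself; the class name `VirtualField` and its attributes are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VirtualField:
    """Hypothetical in-memory description of one virtual field."""
    name: str                            # e.g. "UPC_Code"
    qualifier: str                       # code that selects the data, e.g. "UPC"
    pairs: Tuple[Tuple[str, str], ...]   # ordered (qualifier_field, data_field)

    def __post_init__(self):
        # The text requires at least one qualifier-data pair.
        if len(self.pairs) < 1:
            raise ValueError("a virtual field needs at least one qualifier-data pair")


# Three pairs, as depicted; more or fewer (but at least one) are allowed.
UPC_CODE = VirtualField(
    name="UPC_Code",
    qualifier="UPC",
    pairs=(("Fieldx_q", "Fieldx_d1"),
           ("Fieldy_q", "Fieldy_d1"),
           ("Fieldz_q", "Fieldz_d1")),
)

print(UPC_CODE.name, len(UPC_CODE.pairs))
```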
- UPC_Code and Vendor_Code are "enabled" virtual fields, and Vendor_Subcode is a "disabled" virtual field.
- Enabled fields appear in the document under Group1.
- the document structure at any given time might or might not contain a particular virtual field under a particular group.
- when an event, such as a user operation in a GUI, requests that a virtual field be enabled, the virtual field is added to the document structure.
- when an event requests that a virtual field be disabled, the virtual field is removed from the document structure.
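- the enable and disable behavior can be sketched as a small class. This is an illustration under stated assumptions: the class `Group`, its methods, and the choice to list enabled virtual fields after the concrete fields in sorted order are all hypothetical.

```python
class Group:
    """Hypothetical document group whose structure can gain or lose virtual fields."""

    def __init__(self, name, concrete_fields, defined_virtual_fields):
        self.name = name
        self.concrete_fields = list(concrete_fields)
        self.defined = set(defined_virtual_fields)  # virtual fields that could be enabled
        self.enabled = set()                        # virtual fields currently in the structure

    def enable(self, virtual_field):
        # Only a defined virtual field can be added to the structure.
        if virtual_field not in self.defined:
            raise KeyError(virtual_field)
        self.enabled.add(virtual_field)

    def disable(self, virtual_field):
        # Removing a field that is not enabled is a no-op.
        self.enabled.discard(virtual_field)

    def structure(self):
        # Concrete fields first, then the enabled virtual fields.
        return self.concrete_fields + sorted(self.enabled)


group = Group("Group1",
              ["Fieldx_q", "Fieldx_d1"],
              ["UPC_Code", "Vendor_Code", "Vendor_Subcode"])
group.enable("UPC_Code")
group.enable("Vendor_Code")
print(group.structure())
```

- a disabled virtual field such as Vendor_Subcode stays defined but does not appear in `structure()` until an event enables it.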
- Figure 3 illustrates a data structure including the information needed to create the virtual fields of Figure 2.
- the information needed to generate a virtual field is:
- Name - (optional). If specified, append the Qualifier to the end of the Name, and use the result as the name of the virtual field. If Name is not specified, locate the name of the group that is the parent of the qualifier and data fields, append the qualifier, and use the result as the name of the virtual field.
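- the naming rule can be sketched as follows. This is a minimal illustration, assuming plain string concatenation with no separator; the function name `virtual_field_name` and its parameters are hypothetical.

```python
def virtual_field_name(qualifier, parent_group, name=None):
    """Build a virtual field name from the optional Name and the Qualifier.

    If a name is given, the qualifier is appended to it. Otherwise the
    qualifier is appended to the name of the parent group of the
    qualifier and data fields.
    """
    base = name if name else parent_group
    return f"{base}{qualifier}"


print(virtual_field_name("UPC", "Group1"))
print(virtual_field_name("UPC", "Group1", name="Code_"))
```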
- mapping rules can be applied to meta-data to map from a virtual field in the source document to a corresponding field(s) in the target document.
- a virtual field in a source document can be treated like any other field. Whatever operations - move, or any other mapping rule that might be applied to other fields - apply to virtual fields.
- a transform is the code used by a translation engine to convert one concrete document into another.
- a transform is generated by applying mapping rules to meta-data of the source and target documents. After the mapping rules, and meta-data, including virtual fields, are defined, a transform can be automatically generated, which will perform the following processing on virtual fields defined for a concrete source document:
- the transform will locate the right group, and then examine Fieldx_q, Fieldy_q and Fieldz_q, in that order, and locate the first that holds the value "UPC". It will then stop searching and continue with the mapping rules using the value in Fieldx_d1, Fieldy_d1 or Fieldz_d1, whichever corresponds to the qualifier field that contained "UPC".
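- because a transform is code, the source-side processing above can be pictured as generated code. The sketch below emits Python source for a reader function from virtual-field meta-data and then loads it; Python stands in for whatever language a real translation engine expects, and `generate_reader_source` is a hypothetical name.

```python
def generate_reader_source(name, qualifier, pairs):
    """Emit source code that reads one virtual field from a group.

    The generated function checks the qualifier fields in the given order
    and returns the data field paired with the first match.
    """
    lines = [f"def read_{name}(group):"]
    for q_field, d_field in pairs:
        lines.append(f"    if group.get({q_field!r}) == {qualifier!r}:")
        lines.append(f"        return group.get({d_field!r})")
    lines.append("    return None")
    return "\n".join(lines)


source_code = generate_reader_source(
    "UPC_Code", "UPC",
    [("Fieldx_q", "Fieldx_d1"), ("Fieldy_q", "Fieldy_d1"), ("Fieldz_q", "Fieldz_d1")],
)
print(source_code)

# Load the generated function so it can be applied to a concrete group.
namespace = {}
exec(source_code, namespace)
read_upc_code = namespace["read_UPC_Code"]

# Hypothetical concrete group: the second pair is the first one holding "UPC".
group = {"Fieldx_q": "VEN", "Fieldx_d1": "V-77",
         "Fieldy_q": "UPC", "Fieldy_d1": "012345678905",
         "Fieldz_q": "UPC", "Fieldz_d1": "999999999999"}
print(read_upc_code(group))
```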
- mapping rules can be applied to meta-data to map from location(s) in the source document to a virtual field in the target document.
- a virtual field in a target document can be treated like any other field. Whatever operations - move, or any other manipulation rule that might be applied to other fields - apply to virtual fields.
- a transform is the code used by a translation engine to convert one concrete document into another.
- a transform is generated by applying mapping rules to the meta-data of the source and target documents. After the mapping rules and meta-data, including the virtual fields, are defined, a transform can be automatically generated, which will perform the following processing on virtual fields defined for a concrete target document:
- a host computer system transmits and receives data over a computer network or standard telephone line.
- the steps of accessing, downloading, and manipulating the data, as well as other aspects of the present invention are implemented by a central processing unit (CPU) in the host computer executing sequences of instructions stored in a memory.
- the memory may be a random access memory (RAM), read-only memory (ROM), a persistent store, such as a mass storage device, or any combination of these devices. Execution of the sequences of instructions causes the CPU to perform steps according to the present invention.
- the instructions may be loaded into the memory of the host computer from a storage device, or from one or more other computer systems over a network connection.
- a server computer may transmit a sequence of instructions to the host computer in response to a message transmitted to the server over a network by the host.
- as the host receives the instructions over the network connection, it stores the instructions in memory.
- the host may store the instructions for later execution or execute the instructions as they arrive over the network connection.
- the downloaded instructions may be directly supported by the CPU.
- the instructions may not be directly executable by the CPU, and may instead be executed by an interpreter that interprets the instructions.
- hardwired circuitry may be used in place of, or in combination with, software instructions to implement the present invention.
- the present invention is not limited to any specific combination of hardware circuitry and software, nor to any particular source for the instructions executed by the host computer.
- Figure 4 illustrates a system 400 in which a host computer 402 is connected to a remote computer 404 through a network 410.
- the network interface between host computer 402 and remote computer 404 may also include one or more routers, such as routers 406 and 408, which serve to buffer and route the data transmitted between the host and client computers.
- Network 410 may be the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), or any combination thereof.
- the remote computer 404 may be a World-Wide Web (WWW) server that stores data in the form of 'web pages' and transmits these pages as Hypertext Markup Language (HTML) files over the Internet network 410 to host computer 402.
- host computer 402 runs a 'web browser', which is simply an application program for accessing and providing links to web pages available on various Internet sites.
- Host computer 402 is also configured to communicate to telephone system 412 through a telephone interface, typically a modem.
- Figure 5 is a block diagram of a representative networked computer, such as host computer 402 illustrated in Figure 4.
- the computer system 500 includes a processor 502 coupled through a bus 501 to a random access memory (RAM) 504, a read only memory (ROM) 506, and a mass storage device 507.
- Mass storage device 507 could be a disk or tape drive for storing data and instructions.
- a display device 520 for providing visual output is also coupled to processor 502 through bus 501.
- Keyboard 521 is coupled to bus 501 for communicating information and command selections to processor 502.
- another type of user input device is cursor control unit 522, which may be a device such as a mouse or trackball, for communicating direction commands that control cursor movement on display 520.
- cursor control unit 522 is also coupled to processor 502 through bus 501.
- computer system 500 also includes an audio output port 524 for connection to speakers that output audio signals produced by computer 500.
- a network interface device 523 provides a physical and logical connection between computer system 500 and a network.
- Network interface device 523 is used by various communication applications running on computer 500 for communicating over a network medium and may represent devices such as an ethernet card, ISDN card, or similar devices.
- Modem 526 interfaces computer system 500 to a telephone line and translates digital data produced by the computer into analog signals that can be transmitted over standard telephone lines, such as by telephone system 412 in
- modem 526 provides a hardwired interface to a telephone wall jack; however, modem 526 could also represent a wireless modem for communication over cellular telephone networks. It should be noted that the architecture of Figure 5 is provided only for purposes of illustration, and that a host computer used in conjunction with the present invention is not limited to the specific architecture shown.
- Figure 6 shows an example of the groups and fields of two different documents, a source document format 610 and a target document format 620.
- the document is a purchase order.
- the document may convey any information that one person or business wants to send to another person or business.
- the source group 615 includes the source fields of name, address, city, description, price, quantity, and total.
- the target group 625 includes the fields name, location, information, cost, number, and amount.
- although the formats of the fields in the source and target groups are structurally different, they have similarities and common abstractions such as name, amount, and place to ship the goods.
- the names of the fields in groups 615 and 625 may be different, such as "price" and "cost,” for example, but the data 617 and 627 contained in these fields is the same.
- a virtual field that corresponds to a field in the source and target groups 615 and 625 can be used to capture these common abstractions using meta-data.
- meta-data associated with the source document can be used by the mapping engine to define one or more virtual fields.
- the meta-data used to define the virtual fields can be obtained from a data structure such as the data structure of Figure 3.
- the mapping engine can apply mapping rules to the meta-data associated with the source group, including the virtual fields, to automatically generate a transform.
- the transform is then provided to the translation engine, which uses the transform to convert the source document into the target document.
- a mapping engine 650 creates a translation map, as shown in Figure 6.
- the translation map is used by a translation engine 630 to convert, or translate, a message from a source format to a target format.
- the translation map is a metadata level description of the fields in the source document that will be used to populate a field in the target document.
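- a translation map of this kind can be sketched as a dictionary from each target field to the source fields that populate it. Only the pairing of "price" with "cost" comes from the text; the other pairings, the `translate` function, the sample data, and the choice to join multiple source values with a comma are assumptions made for illustration.

```python
# Hypothetical translation map for the purchase-order example:
# target field -> list of source fields used to populate it.
TRANSLATION_MAP = {
    "name": ["name"],
    "location": ["address", "city"],   # assumed pairing
    "information": ["description"],    # assumed pairing
    "cost": ["price"],                 # pairing given in the text
    "number": ["quantity"],            # assumed pairing
    "amount": ["total"],               # assumed pairing
}


def translate(source, translation_map):
    """Populate each target field from its listed source fields."""
    return {
        target: ", ".join(str(source[field]) for field in source_fields)
        for target, source_fields in translation_map.items()
    }


source_document = {
    "name": "Acme", "address": "1 Main St", "city": "Springfield",
    "description": "Widget", "price": "2.50", "quantity": "4", "total": "10.00",
}
target_document = translate(source_document, TRANSLATION_MAP)
print(target_document)
```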
- Figure 7 shows an embodiment of a method for automatically generating a transform using virtual fields.
- One or more virtual fields for a first document are defined, step 710.
- the virtual fields are defined using meta-data contained in the data structure of Figure 3.
- One or more of these virtual fields are enabled, so that the enabled virtual fields appear in the first document, step 720.
- One or more of the virtual fields may be disabled, so that the disabled virtual fields do not appear in the first document, step 720.
- mapping rules to map data from fields in the first document to fields in a second document are defined, step 740. Then, a transform to convert the first document into the second document is automatically generated by applying the mapping rules to the meta-data, including the enabled virtual fields, of the first and second documents.
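- the overall method (define virtual fields, step 710; enable some of them, step 720; define mapping rules, step 740; generate the transform) can be sketched end to end. This is a simplified illustration: mapping rules are limited to plain moves, the transform is a Python closure rather than engine-specific code, and `generate_transform` and all field values are hypothetical.

```python
def generate_transform(mapping_rules, virtual_fields, enabled):
    """Build a transform from mapping rules and virtual-field meta-data.

    mapping_rules: list of (source_field, target_field) moves.
    virtual_fields: {name: (qualifier, [(qualifier_field, data_field), ...])}.
    enabled: names of the virtual fields that appear in the first document.
    """
    active = {name: virtual_fields[name] for name in enabled}

    def transform(source):
        target = {}
        for source_field, target_field in mapping_rules:
            if source_field in active:
                # Virtual field: use the data paired with the first matching qualifier.
                qualifier, pairs = active[source_field]
                value = next((source.get(d) for q, d in pairs
                              if source.get(q) == qualifier), None)
            else:
                # Concrete field: read it directly.
                value = source.get(source_field)
            if value is not None:
                target[target_field] = value
        return target

    return transform


# Define virtual fields for the first document.
virtual_fields = {
    "UPC_Code": ("UPC", [("Fieldx_q", "Fieldx_d1"), ("Fieldy_q", "Fieldy_d1")]),
    "Vendor_Code": ("VEN", [("Fieldx_q", "Fieldx_d1"), ("Fieldy_q", "Fieldy_d1")]),
}
# Enable one of them; Vendor_Code stays disabled.
enabled = {"UPC_Code"}
# Define mapping rules, then generate the transform.
mapping_rules = [("UPC_Code", "ProductCode"), ("Quantity", "Number")]
transform = generate_transform(mapping_rules, virtual_fields, enabled)

first_document = {"Fieldx_q": "VEN", "Fieldx_d1": "V-77",
                  "Fieldy_q": "UPC", "Fieldy_d1": "012345678905",
                  "Quantity": "4"}
second_document = transform(first_document)
print(second_document)
```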
- virtual fields enable the automatic generation of transform code that maps between source and target documents.
- the automatically generated code enables virtual fields as needed - if it discovers that a virtual field that could potentially be enabled is specified by the code, it enables the virtual field.
- mapping tends to be sufficient if a virtual field is involved.
- the user does not need to write code to locate the data of a source field that is part of a virtual field, or to put data into the correct location into the correct part of a target virtual field.
- if a user needs to put particular information into the first qualifier-data pair of a virtual field, he merely needs to specify that the translation engine run the mapping to that virtual field before other mappings to a different virtual field that maps to the same set of qualifier-data pairs. Alternatively, he can manually write the code to put the data into those target fields.
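- the effect of mapping order on a target virtual field can be sketched as follows. The sketch assumes that each write claims the first qualifier-data pair not yet used in the target group; that placement rule, the function `write_virtual_field`, and the sample values are assumptions, not a description of the generated code.

```python
PAIRS = [("Fieldx_q", "Fieldx_d1"),
         ("Fieldy_q", "Fieldy_d1"),
         ("Fieldz_q", "Fieldz_d1")]


def write_virtual_field(group, pairs, qualifier, value):
    """Store value in the first free qualifier-data pair and return its qualifier field."""
    for q_field, d_field in pairs:
        if q_field not in group:
            group[q_field] = qualifier
            group[d_field] = value
            return q_field
    raise ValueError("no free qualifier-data pair")


# Running the UPC mapping first places it in the first pair.
target_group = {}
write_virtual_field(target_group, PAIRS, "UPC", "012345678905")
write_virtual_field(target_group, PAIRS, "VEN", "V-77")
print(target_group)

# Reversing the order of the two mappings swaps the pairs they occupy.
reordered_group = {}
write_virtual_field(reordered_group, PAIRS, "VEN", "V-77")
write_virtual_field(reordered_group, PAIRS, "UPC", "012345678905")
print(reordered_group)
```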
- transform code can be successfully generated for various translation engines.
- mapping from document A to B is much closer to mapping from B to A than without this invention.
- mapping from B to A has been made closer to the transposition of the mapping from A to B.
- Mapping one direction then provides most of the information needed to map the other direction. If users had to write code to map from A to B, such a transposition would be far more work. With this invention, transposing a mapping is far less work.
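- for mappings that consist only of simple moves, the transposition can be sketched in one line. This is a deliberately narrow illustration: the function `transpose` is hypothetical, and rules that combine or reformat values would still need attention by hand, consistent with the statement that one direction provides most, not all, of the information.

```python
def transpose(mapping_rules):
    """Swap source and target in each (source_field, target_field) move."""
    return [(target, source) for source, target in mapping_rules]


# Mapping from A to B, including a virtual field on the source side.
a_to_b = [("price", "cost"), ("UPC_Code", "ProductCode")]
b_to_a = transpose(a_to_b)
print(b_to_a)
```

- because UPC_Code is a virtual field, it can appear on either side of a rule without the user rewriting the qualifier lookup logic for the reverse direction.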
- mapping to or from a virtual field is translation-engine independent.
- the code appropriate for that translation engine is generated when writing out the transform in the way that translation engine requires.
- mappings to and from virtual fields can be validated, as most cases do not require the user to write code. Because fewer mappings require the user to write code, mapping difference checking is easier.
- mapping is more translation-engine independent, as code to handle virtual fields in the mapping is automatically generated when the mapping is exported, rather than coded by the user before the mapping is exported. Creating a map is also faster, as automapping has a better hit rate.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Document Processing Apparatus (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2001255301A AU2001255301A1 (en) | 2000-08-29 | 2001-04-09 | Virtual fields |
CA002420817A CA2420817A1 (en) | 2000-08-29 | 2001-04-09 | Virtual fields |
JP2002523206A JP2004507841A (en) | 2000-08-29 | 2001-04-09 | Virtual field |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/650,224 | 2000-08-29 | ||
US65022400 | 2000-08-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2002019165A1 (en) | 2002-03-07 |
Family
ID=24608005
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/011671 WO2002019165A1 (en) | 2000-08-29 | 2001-04-09 | Virtual fields |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP2004507841A (en) |
AU (1) | AU2001255301A1 (en) |
WO (1) | WO2002019165A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5369761A (en) * | 1990-03-30 | 1994-11-29 | Conley; John D. | Automatic and transparent denormalization support, wherein denormalization is achieved through appending of fields to base relations of a normalized database |
US5627979A (en) * | 1994-07-18 | 1997-05-06 | International Business Machines Corporation | System and method for providing a graphical user interface for mapping and accessing objects in data stores |
US5724573A (en) * | 1995-12-22 | 1998-03-03 | International Business Machines Corporation | Method and system for mining quantitative association rules in large relational tables |
-
2001
- 2001-04-09 JP JP2002523206A patent/JP2004507841A/en active Pending
- 2001-04-09 WO PCT/US2001/011671 patent/WO2002019165A1/en not_active Application Discontinuation
- 2001-04-09 AU AU2001255301A patent/AU2001255301A1/en not_active Abandoned
Also Published As
Publication number | Publication date |
---|---|
JP2004507841A (en) | 2004-03-11 |
AU2001255301A1 (en) | 2002-03-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6694338B1 (en) | Virtual aggregate fields | |
US6560608B1 (en) | Method and apparatus for automatically selecting a rule | |
US6757739B1 (en) | Method and apparatus for automatically converting the format of an electronic message | |
US7849179B2 (en) | System and program for managing devices in a network | |
US5937411A (en) | Method and apparatus for creating storage for java archive manifest file | |
US6931409B2 (en) | Method, apparatus, and program to efficiently serialize objects | |
JP4972082B2 (en) | Ability for developers to easily discover or extend well-known locations on the system | |
JP4202041B2 (en) | Method and system for applying input mode bias | |
US20060184568A1 (en) | Having a single set of object relational mappings across different instances of the same schemas | |
US7512990B2 (en) | Multiple simultaneous ACL formats on a filesystem | |
KR20050082156A (en) | Methods and systems for providing automated actions on recognized text strings in a computer-generated document | |
US7530075B2 (en) | System and method for employing object-based pipelines | |
US7937418B2 (en) | Method and system for enhancing software documentation and help systems | |
US7991785B1 (en) | Interface to a human interface infrastructure database in an extensible firmware interface environment | |
US20060020681A1 (en) | Modification and importation of live web pages | |
US20060230057A1 (en) | Method and apparatus for mapping web services definition language files to application specific business objects in an integrated application environment | |
US6928616B2 (en) | Method and apparatus for allowing one bookmark to replace another | |
WO2002019165A1 (en) | Virtual fields | |
CA2420817A1 (en) | Virtual fields | |
EP1328873A1 (en) | Virtual groups | |
US6775820B2 (en) | Web based application re-coded for OS/2 compatibility | |
US20080127180A1 (en) | Operating system automated application porting tool | |
JP2022064865A (en) | Computer-implemented method, computer program and computer system (extraction of structured information from unstructured documents) | |
Pincus et al. | The NetCDF Fortran 90 Interface Guide | |
Gobbetti et al. | " A comparative study on Binary Scientific Data Formats" CRS4 Technical Report |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2420817 Country of ref document: CA |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2002523206 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2001928444 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 2001928444 Country of ref document: EP |