WO 02/057926 PCT/NZ02/00004
DATA PROCESSING SYSTEM AND METHOD
Field of the Invention
The invention relates to methods of processing data. An abstract object layer is utilised in relation to data to define a process for the data. A user may interactively define the process using meta-data.
Background to the invention
In many distributed computer systems there is a need to transfer data from one computer system to another computer system, often a remote computer system. Often the data is stored in a different format on each system.
When dealing with certain types of structured information, rules can be established to transform data stored in a first format to data formatted according to a second format. US6,085,196 discloses a method which enables mapping relationships to be defined between structured information in a first format and structured information in a second format, particularly between SGML and HTML. In this patent a mapping database is defined by a user which defines the mapping relationship between elements (e.g. fields) of a first format and elements (e.g. tags) of a second format. This patent deals with structured data where elements are defined within a description document (e.g. a DTD or XSD). Data from a source data source is then parsed utilising the transformations defined in the mapping database to produce target data, formatted according to the second format. The method of this patent involves the definition of rules for transforming defined generic data elements according to a first format to defined generic data elements according to a second format.
Often there is a need to transport or transform data to another format. This need can arise where data, stored without meta-data, must be transported or transformed to a format where the meta-data is defined.
The method of US6,085,196 does not provide means of transforming data where the elements are not defined within a description document (i.e. SGML elements and HTML tags). Furthermore, the method only provides for one to one mapping of source and target fields.
It is an object of the present invention to provide a flexible method and system for enabling the transfer or transformation of data between a wide variety of data formats or to at least provide the public with a useful choice.
According to a first aspect of the invention there is provided a computer implemented method of processing data comprising the steps of:
i) defining meta-data descriptors to represent the data;
ii) in an interactive user application defining a process associated with at least one of the meta-data descriptors; and
iii) processing the data in accordance with the defined process.
A meta-data descriptor may describe formatting, relationships, structure, and attributes relating to data. Meta-data descriptors may be defined by querying a structured database, examining an XML or HTML file, querying an XML schema or based on contextual criteria.
Access to data may be assisted by a meta-data connector. A specific meta-data connector exists for each set of data. For example, a text file where there are three fields being Name, ID, and Address will have a text file meta-data connector that specifies the location of the text file, any other information required to access that text file, and any information required to access text files generally. Another text file with different data but the same fields will use the same meta-data connector. A different text file with different fields will use a different text file meta-data connector. A database file accessed using JDBC will use a JDBC meta-data connector.
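By way of illustration only, the text file meta-data connector described above might be sketched in Python as follows. The class name `TextFileConnector`, its attributes and the sample records are hypothetical and do not appear in this specification; the sketch merely shows access details (location, separator) being held apart from the data itself.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator


@dataclass
class TextFileConnector:
    """Hypothetical connector: holds what is needed to reach one text file."""
    location: str          # where the text file lives (path or URL)
    fields: tuple          # field layout, e.g. ("Name", "ID", "Address")
    separator: str = ","   # general knowledge about reading text files

    def read(self, text: str) -> Iterator[dict]:
        """Yield one record per line, keyed by the connector's field names.

        The file contents are passed in as `text` so that the sketch runs
        without a real file at `location`.
        """
        for row in csv.reader(io.StringIO(text), delimiter=self.separator):
            if row:  # skip blank lines
                yield dict(zip(self.fields, row))


connector = TextFileConnector("people.txt", ("Name", "ID", "Address"))
sample = "Ann,1,12 High St\nBob,2,3 Low Rd\n"
records = list(connector.read(sample))
```

A second file with the same three fields could be read through the same connector definition, whereas a file with different fields would need its own.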
Processing data can include manipulating data, transforming data, and/or transferring data.
Preferably the method involves the transformation of data from a source data source to a target data source. A data source is data accessed through a meta-data connector. The interactive user application displays source meta-data descriptors and target meta-data descriptors and allows a user to define rules for transforming data represented by the source meta-data descriptors into data represented by the target meta-data descriptors. Transformation may be performed at times according to a user defined schedule. Data may be obtained from remote sources and remote devices may perform part of the transformation operation. Transformation may be initiated by a trigger event at a remote device which may be another computer system or software program that sends a "signal" to start the process.
Target data elements may be supplied with the associated target meta-data descriptors to a target data source or a file containing the target data elements may be sent to the target data source. By using different types of meta-data connectors the method may enable transformations between different types of data including JDBC, text, EDI, IDOC, XML and HTML files, dynamic web pages, telnet terminal sessions, web services, and real-time data streams.
According to a further aspect of the invention there is provided a computer implemented method of transforming selected data from one or more source data sources to one or more target data sources comprising the steps of:
i) defining meta-data descriptors for the source data sources and for the target data sources;
ii) in an interactive user application defining a transformation process between the source meta-data descriptors and the target meta-data descriptors; and
iii) transforming source data extracted from the source data sources in accordance with the defined transformation process to generate target data for supply to the target data sources.
According to a further aspect of the invention there is provided a computer system for processing data comprising:
i) a processor;
ii) memory for supplying data to the processor;
iii) an input device for providing user input to the processor;
iv) a display device for displaying information from the processor;
v) an application residing in memory which, when executed by the processor, is responsive to user input to define meta-data descriptors to represent data and to define a process associated with at least one of the meta-data descriptors; and to process the data in accordance with the defined process.
The invention will now be described by way of example with reference to the accompanying drawings in which:
Figure 1: shows a functional diagram illustrating the method for defining the transformation process and processing the data.
Figure 2: shows a functional diagram illustrating the method for defining meta-data descriptors by examining the data through a meta-data connector.
Figure 3: shows a functional diagram illustrating the method for defining meta-data descriptors with user assistance.
Figure 4: shows a functional diagram illustrating the method for defining the transformation process.
Figure 5: shows a functional diagram illustrating the method for transforming data according to the defined transformation process.
Figure 6: shows an example of a meta-data descriptor.
Figure 7: shows the components of a system for implementing the method shown in figures 1 to 5.
Figure 8: shows an example of source data as a CSV file.
Figure 9: shows a screen illustrating a user creating a meta-data connector for the CSV file.
Figure 10: shows a screen illustrating a meta-data descriptor for the CSV file.
Figure 11: shows an XML file from which a meta-data descriptor will be extracted.
Figure 12: shows a screen illustrating a user creating a meta-data connector for the XML file.
Figure 13: shows a screen illustrating the meta-data descriptor for the XML file.
Figure 14: shows a screen illustrating the interactive user application for defining a transformation process by dragging source elements to target elements and establishing a one-to-one direct map.
Figure 15: shows a screen illustrating the interactive user application for creating calculation operations.
Figure 16: shows a screen illustrating the interactive user application where target elements result from direct one-to-one maps with the source elements, transformations from the source elements, and calculated data.
Figure 17: shows a screen illustrating the creation of an activity.
Figure 18: shows a screen illustrating the creation of an action.
Figure 19: shows a screen illustrating constructing an activity from actions.
Figure 20: shows a screen illustrating the scheduling application when scheduling dates for activities and actions.
Figure 21: shows a screen illustrating the scheduling application when scheduling times for activities and actions.
Figure 22: shows a screen illustrating the scheduling application when scheduling an action.
Figure 23: shows a screen illustrating a function of the scheduling application.
Figure 24: shows the components of the simplest system for implementing a method shown in figures 1 to 5.
Figure 25: shows the components of a system for implementing the method shown in figures 1 to 5.
The present invention relates to a method which enables the transfer of data between distributed devices and the transformation of data between a first format and a second format. The method involves the creation of an abstract object layer between the source and target data sources to define the required transformation operations. This provides great flexibility and enables users to define required transformations for specific data types and transformation operations.
Referring to the example shown in figure 1, the transformation process is defined by mapping elements 1 represented by meta-data descriptors 2 for the source data 3 to elements 4 represented by meta-data descriptors 5 for the target data 6. The definition of the transformation process is assisted by a user within an interactive user application. The transformation process may involve mappings which transform or manipulate the source data elements by applying various operations 7 including programmatic and arithmetic operations.
The defined transformation process 8 uses a meta-data connector 9 to access the source data 3. The meta-data connector contains specific information about the source data including how to access the source data. For example, if the source data is to come from a telnet session the meta-data connector may include logon information, information about key strokes required to access the data, and information about how to handle error exceptions received from the telnet session.
In addition to containing specific information about the particular source data, the meta-data connector contains general data for accessing data of that type. For example, a JDBC meta-data connector type 10 is used for JDBC data, an XML meta-data connector type 11 is used for XML data, and a telnet meta-data connector type 12 is used for telnet data.
Data resulting from the transformation process is inserted into the target data 6 using a meta-data connector 13.
Referring to the example shown in figure 2, the first step is to identify the location of the data 14. This may be local or remote data. Meta-data descriptors 15 for that data may be defined by using a meta-data connector 16 to examine the data.
Referring to the example shown in figure 3, meta-data descriptors 18 may be defined with the assistance of a user in an interactive user application 17.
Structured data is data where meta-data is recorded within the data, such as a database. Unstructured data is data where meta-data is not recorded within the data.
Structured data may be examined to determine the meta-data descriptors. For example, a database may be queried to extract meta-data descriptors. For unstructured data, such as text files, rules must be established to enable the meta-data descriptors to be defined. A user may identify the location of the data and the manner in which the data should be parsed to define the meta-data descriptors.
For unstructured data, such as text files, telnet terminal sessions, or HTML pages, contextual criteria may be specified. For example it may be specified that the first row contains field headings. Record terminators and field separators may also be defined. With this information it is possible to parse the data and return field names, data types, data structure and other relevant information to construct a meta-data descriptor.
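As a minimal sketch of querying a database for its meta-data, the following Python example reads column names and declared types from the database itself. It uses Python's built-in `sqlite3` module as a stand-in for a JDBC data source; the table `orders`, its columns and the function `describe_table` are assumptions made for illustration only.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, qty INTEGER, price REAL)")


def describe_table(connection, table):
    """Return (column name, declared type) pairs read from the database."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    # The table name is interpolated directly, so it must come from a
    # trusted source in this sketch.
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]


descriptor = describe_table(conn, "orders")
```

Because the meta-data is recorded within the data source, no user-supplied parsing rules are needed here.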
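The use of contextual criteria can be sketched as follows, assuming a simple delimited text source. The function `build_descriptor`, the crude type inference and the sample text are hypothetical; the sketch only shows how a field separator, a record terminator and a "first row contains field headings" flag are enough to recover field names and rough data types.

```python
def build_descriptor(raw, field_separator, record_terminator,
                     first_row_is_header=True):
    """Parse raw text using contextual criteria; return field names and types."""
    records = [r for r in raw.split(record_terminator) if r]
    rows = [r.split(field_separator) for r in records]
    if first_row_is_header:
        names, data = rows[0], rows[1:]
    else:
        # No headings available, so generate placeholder field names.
        names = [f"field{i + 1}" for i in range(len(rows[0]))]
        data = rows

    def guess_type(values):
        # Rough inference: "number" if every value parses as a float.
        try:
            for v in values:
                float(v)
            return "number"
        except ValueError:
            return "string"

    columns = list(zip(*data))
    return [{"name": n, "type": guess_type(c)} for n, c in zip(names, columns)]


raw = "Customer|Qty|Price;Acme|3|9.50;Bolt|10|1.25;"
descriptor = build_descriptor(raw, field_separator="|", record_terminator=";")
```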
The data may be source data from which data is to be extracted or target data to which data is sent. In the process described above, identification of all target meta-data descriptors and source meta-data descriptors for the target and source data is possible whether the data is structured or unstructured.
In the example shown in Figure 4 a process for transforming data represented by the meta-data descriptors is defined. Any number of steps within the process may be defined for execution. The source meta-data descriptors 19, 20 are preferably displayed on one side of the screen and the target meta-data descriptors 21, 22 displayed on the other side of the screen. With an interactive user application 23 a user may then define mapping relationships between source meta-data descriptors and target meta-data descriptors or any number of operations that must be performed to map source meta-data descriptors to target meta-data descriptors, for example an operation to combine data represented by two source meta-data descriptors to result in data represented by one target meta-data descriptor. Mapping may be performed using a drag and drop operation or another method to associate source and target meta-data descriptors.
Certain operations may involve calculations including the concatenation or breaking up of data represented by source meta-data descriptors to map to a target meta-data descriptor. Target meta-data descriptors can also be specified as calculations without any relationship whatsoever to the source meta-data descriptors, for example where the target data needs to contain constant or calculated values.
In the example shown in figure 5 a source data source 24 and a source data source 25 are shown. It will be appreciated that any number of data sources may be utilised. The data transformation manager using the defined process 26 transforms the source data elements into target data elements of target data source 27 and target data source 28. Again, it will be appreciated that any number of target data sources may be created or utilised to accept data.
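The kinds of operation just described (concatenation, breaking up, and constant values) might be represented as small reusable functions, as in the following sketch. The helper names `concat`, `split_part` and `constant`, and the sample field names, are hypothetical and chosen only for illustration.

```python
def concat(*fields, sep=" "):
    """Combine several source fields into one target value."""
    return lambda record: sep.join(record[f] for f in fields)


def split_part(field, sep, index):
    """Break a source field up and keep one part."""
    return lambda record: record[field].split(sep)[index]


def constant(value):
    """Target value with no relationship to the source data."""
    return lambda record: value


# Hypothetical mapping: target field name -> operation producing its value.
mapping = {
    "FullName": concat("First", "Last"),
    "AreaCode": split_part("Phone", "-", 0),
    "Origin": constant("import"),
}


def transform(record, mapping):
    """Apply every operation in the mapping to one source record."""
    return {target: op(record) for target, op in mapping.items()}


source = {"First": "Ann", "Last": "Lee", "Phone": "09-555-0100"}
target = transform(source, mapping)
```

Keeping each operation as a function of the whole source record means a target field can depend on one source field, several, or none.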
Transformations may be performed locally or by a remote transformation manager. Where a remote transformation manager is employed data associated with selected source meta-data descriptors must be supplied to the remote transformation manager which returns data relating to the selected target meta-data descriptors. The remote transformation manager may further require data from a remote data source to complete a transformation. Software may be installed on a remote computer connected by a TCP/IP connection which enables data to be easily extracted from the remote computer and transported to the local computer by one of a number of transport protocols such as SOAP over HTTP or RMI. Transport protocols may incorporate authentication and encryption to allow the remote computer to communicate securely with the local computer.
Data represented by target meta-data descriptors may be mapped or combined according to a specified function to produce the required target data elements. The transformation software may include a "calculator" which determines the value of target data elements based upon source data and/or target data elements. The calculation may be a simple one to one mapping or use complex predefined or user defined functions. Preferably, the calculations are performed using a scripting language such as Python, Jython, JavaScript or VBScript. The calculations may include mathematical operators (multiply, divide, add, subtract, assignment, mod, brackets), string operators (concatenation), logical operators (equal, not equal, less than, greater than, less than or equal, greater than or equal, AND, OR, XOR), flow control operators (if, if ... else, if ... else if ... else, for, for ... else, while, while ... else, break, continue, pass) and utility operators (number to string conversion). Calculations may include mathematical functions (abs(val), complex(real[, imag]), pow(x, y), divmod(a, b), pi, e, trig functions, exponential functions, logarithmic functions etc). Calculations may also include calendar and date and time functions, string functions, utility functions, list functions, key generators, SQL utilities, variable utilities and area handling utilities.
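A "calculator" of this kind could, for instance, evaluate a user-entered Python expression with the source fields bound as variables. The following is a sketch only: the function `calculate`, the `ALLOWED` table and the sample fields are hypothetical, and `eval` with a restricted set of names is not a security sandbox.

```python
import math

# Names a user expression may use; everything else is unavailable.
ALLOWED = {"abs": abs, "pow": pow, "divmod": divmod, "str": str,
           "float": float, "int": int, "pi": math.pi, "e": math.e,
           "sqrt": math.sqrt}


def calculate(expression, source):
    """Evaluate a user-defined expression with source fields as variables."""
    # Caution: restricting builtins does NOT make eval safe for untrusted
    # input; this only illustrates the idea of a scripted calculation.
    scope = dict(ALLOWED)
    scope.update(source)
    return eval(expression, {"__builtins__": {}}, scope)


source = {"Qty": 3, "Price": 9.5, "ProductCode": "7731"}
total = calculate("Qty * Price", source)                           # arithmetic
code = calculate("'014-' + ProductCode", source)                   # concatenation
label = calculate("'bulk' if Qty >= 10 else 'single'", source)     # conditional
```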
Target data elements may be sent to respective target data sources with their associated target meta-data descriptors or a file containing the target data elements may be sent to the relevant target data sources.
Referring now to figure 6 an example of a meta-data descriptor is shown.
Referring now to figure 7 a system for implementing the method of the invention is shown. A server 29 is seen to include an Executor component 30, a Database 31, a Timer 32 and a 3rd Party Application accessed through an intelligent datasource 33 and its associated Database 34.
A client computer 35 is seen to include an Administrator component 36, a Remote data transfer component 37 and a data source 38. The remote data transfer component 37 is a lightweight component and is connected to server 29 via a TCP/IP connection over a WAN. The remote data transfer component 37 enables executor component 30 to call data from client computer 35 to facilitate a connection and the transfer of data from a remote computer where no direct connection exists.
Administrator component 36 may communicate with the executor component 30 to allow a remote user to schedule actions. These actions may then be performed by executor module 30 at specified times or upon the happening of specified events. Trigger events may include communications from a remote device such as client computer 35. A client computer 39 is seen to have a browser application 40.
The system enables the transport and transformation of data between databases 34 and 38. The Administrator module 36 allows a client to define actions as described above in relation to figure 3. Administrator module 36 also enables the actions to be scheduled to be executed at specified times or upon specified events. The actions and schedule may be stored on server 29. Executor module 30 executes the specified actions as set out in the schedule, either at specified times or upon receipt of event information. The event mechanism may allow an external application such as the 3rd Party Application to initiate workflow activities or actions in the Executor module 30. Data is obtained from Database 34 and, where required, information is requested by the Executor module 30 via the remote data transfer component 37 to query data source 38 and return the required data to executor module 30. The required data transformations are performed and the target data elements are transferred to data source 38.
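The event mechanism described above might be sketched as a registry of actions keyed by event name, as below. The class `Executor`, its methods and the event names are hypothetical; a real executor would also persist the schedule and contact remote components, which this sketch omits.

```python
from collections import defaultdict


class Executor:
    """Hypothetical executor: runs stored actions when their event arrives."""

    def __init__(self):
        self._by_event = defaultdict(list)  # event name -> list of actions
        self.log = []                       # results of executed actions

    def register(self, event, action):
        """Store an action to be run whenever `event` is raised."""
        self._by_event[event].append(action)

    def notify(self, event, payload=None):
        """Called by a timer or an external application to raise an event."""
        for action in self._by_event[event]:
            self.log.append(action(payload))


executor = Executor()
executor.register("orders_file_arrived", lambda p: f"transform {p}")
executor.register("timer:02:00", lambda p: "nightly sync")

executor.notify("orders_file_arrived", "orders.csv")
executor.notify("timer:02:00")
executor.notify("unknown_event")  # no actions registered; nothing happens
```

Treating timer ticks and external signals as the same kind of event keeps the executor indifferent to where a trigger originates.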
A worked example illustrating the creation of meta-data connectors, the creation of meta-data descriptors, and the definition of a process for transforming source data to target data all by the administrator component 36 as seen by a user will now be described with reference to figures 8 to 16.
Referring firstly to figure 8 a CSV file is shown. This CSV file contains field headings in the first row and the subsequent rows contain data relating to "Orders". This file will be used in the transformation process in figure 14 as the source data.
In figure 9 a screen is shown where a user defines the meta-data connector for the CSV file. The data source is given the name "Order CSV file" and the file name or URL at which the file can be called is given. The user has also selected the "First row contains field headings" box. This enables the Executor module 30 to query the CSV file and recover the field headings. The field headings form the meta-data descriptor for this data.
The "Select" button is then actioned and the screen shown in figure 10 shows the fields extracted from the CSV file after the "Import" button is actioned. The source field names "Customer" to "ETA" are listed on the screen.
In figure 11 an XML file is shown.
In figure 12 the meta-data connector for an XML target data source is defined by the user. The user enters the name of the data source as "1.0 XML Order" and a Filename/URL is entered. In this case the meta-data descriptor will be extracted from the XML file shown in figure 11.
In figure 13 the meta-data descriptor that is going to be used as the target meta-data descriptor is displayed. In this case the meta-data descriptor has two entities "HeaderInfo" and "LineItemInfo" as children of an entity "Order" which is in turn a child of entity "Orders". The entity "HeaderInfo" has target field names "Account" to "TotalAmount". The entity "LineItemInfo" has target field names "Price" to "ProductCode". Additional entities and field names may be added if necessary.
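Extraction of such an entity and field hierarchy from an XML file can be sketched with Python's standard `xml.etree.ElementTree` module. The XML text below is a hypothetical stand-in, since the content of figure 11 is not reproduced here; it uses only entity and field names mentioned in this description, and the function `extract_descriptor` is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for the XML file of figure 11.
SAMPLE = """
<Orders>
  <Order>
    <HeaderInfo><Account>A1</Account><TotalAmount>28.50</TotalAmount></HeaderInfo>
    <LineItemInfo><Price>9.50</Price><Quantity>3</Quantity>
      <ProductCode>014-7731</ProductCode></LineItemInfo>
  </Order>
</Orders>
"""


def extract_descriptor(element):
    """Return a nested dict: entities map to dicts, leaf fields map to None."""
    children = list(element)
    if not children:
        return None  # an element with no children is treated as a field
    # Repeated sibling tags collapse into one key, which suits a descriptor.
    return {child.tag: extract_descriptor(child) for child in children}


root = ET.fromstring(SAMPLE.strip())
descriptor = {root.tag: extract_descriptor(root)}
```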
In figure 14 the transformation process is defined. On the left side the source entity and field names from the meta-data descriptor for the CSV file are displayed. On the right side the target entities and field names from the meta-data descriptor extracted from the XML file are displayed. The user may map source field names to target field names. In figure 14 the user has clicked and dragged source field names to a corresponding target field name. This creates a direct one-to-one mapping. In the example the user has dragged "Customer" to "Account". The user can map one or more source fields to one target field or one source field to many target fields. The user may also define certain calculations to result in data for target field names by pressing the Calculation button.
In figure 15 the calculation component is shown. Data for target field names can result from calculations made in relation to source fields or calculations resulting from constant data, data from another source, or other data unrelated to data from the source fields.
Figure 16 shows another example of the definition of a transformation process. In this example the user has directly mapped the "Price" source field to the "Price" target field and the "Qty" source field to the "Quantity" target field. The user has used the calculation component to enter a constant value "Std Item" in the "ItemDescription" target field and prefixed "014-" to data from the "ProductCode" source field to result in the "ProductCode" target field.
The Administrator module 36 allows a client to define activities. An activity consists of actions. The actions may be arranged according to a script. The actions can consist of data transfer actions and other actions that control the computer environment, send e-mails, or handle errors. The actions can consist of functions that monitor or control the current activity or other activities, execute programs or iterate other actions, or other standard programmatic functions. New types of actions can be created by the user. For example the user may require a particular network connection to be operational before a defined process to transform data is started.
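The figure 16 example can be followed end to end in a short Python sketch that reads CSV rows and emits XML line items. The rule table `RULES`, the function `csv_to_line_items` and the sample CSV row are hypothetical; only the four mappings themselves (two direct maps, one constant, one prefix) come from the description above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Target field -> rule computing its value from one source row.
RULES = {
    "Price": lambda src: src["Price"],                       # direct map
    "Quantity": lambda src: src["Qty"],                      # direct map
    "ItemDescription": lambda src: "Std Item",               # constant value
    "ProductCode": lambda src: "014-" + src["ProductCode"],  # prefixed data
}


def csv_to_line_items(csv_text):
    """Build an Order element with one LineItemInfo child per CSV row."""
    root = ET.Element("Order")
    for row in csv.DictReader(io.StringIO(csv_text)):
        item = ET.SubElement(root, "LineItemInfo")
        for target, rule in RULES.items():
            ET.SubElement(item, target).text = rule(row)
    return root


# Hypothetical source data; the first row contains the field headings.
csv_text = "Price,Qty,ProductCode\n9.50,3,7731\n"
order = csv_to_line_items(csv_text)
xml_out = ET.tostring(order, encoding="unicode")
```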
One of the actions within an activity may be a defined process to transform data as described in figure 5.
The execution of actions or activities can be dependent on a trigger event. A trigger event includes events generated by a remote system, a scheduler application, or a specified change on the local system.
Figure 17 shows the creation of an activity. The user gives the activity a name, in this example "Process Daily e-mail Orders", and a description.
Figure 18 shows a screen where a user is defining a particular action. The action is a POP3 e-mail action that logs into a mail server, retrieves e-mails matching certain header fields defined by the user, and extracts the e-mail message and attachments to the local hard drive.
Figure 19 shows how an activity may be composed of various actions. In this example the process is:
• Get order e-mails from the POP3 server - "get all order e-mails".
• Unzip attachments to extract the CSV files containing orders - "Unzip order attachments".
• Convert the CSV orders to an XML format - "CSV to XML purchase order". This step represents a defined process to transform data as in figure 5.
• Validate the resulting XML data against an XML schema to ensure the orders contain valid data - "Validate data against XML PO schema".
• Copy the resulting XML file to an AS/400 system - "Copy XML file to AS/400".
• Execute a command on the AS/400 that will send the orders into an ERP system on the AS/400 - "Process batch on AS/400".
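An activity built from an ordered list of actions, as in the steps above, might be run as in the following sketch. The function `run_activity`, the exception `ActionFailed` and the stand-in actions are hypothetical; each stand-in merely records what a real action (mail retrieval, unzipping, conversion, validation) would produce, and the AS/400 steps are omitted.

```python
class ActionFailed(Exception):
    """Raised by an action to stop the activity."""


def run_activity(name, actions, context):
    """Run named actions in order; stop at the first failure and report it."""
    completed = []
    for label, action in actions:
        try:
            action(context)
        except ActionFailed as exc:
            return {"activity": name, "completed": completed,
                    "failed": label, "error": str(exc)}
        completed.append(label)
    return {"activity": name, "completed": completed,
            "failed": None, "error": None}


# Stand-in actions: each only records what a real action would do.
def get_emails(ctx):
    ctx["emails"] = ["order1.zip"]


def unzip(ctx):
    ctx["csv_files"] = [e.replace(".zip", ".csv") for e in ctx["emails"]]


def csv_to_xml(ctx):
    ctx["xml_files"] = [c.replace(".csv", ".xml") for c in ctx["csv_files"]]


def validate(ctx):
    if not ctx["xml_files"]:
        raise ActionFailed("no XML produced")


actions = [
    ("get all order e-mails", get_emails),
    ("Unzip order attachments", unzip),
    ("CSV to XML purchase order", csv_to_xml),
    ("Validate data against XML PO schema", validate),
]
context = {}
result = run_activity("Process Daily e-mail Orders", actions, context)
```

Passing a shared context between actions lets a later action, such as validation, work on the output of an earlier one.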
The scheduling component may include a graphical user interface as shown in Figures 20 to 23. Figure 20 shows a screen listing for defined actions. Figure 21 shows a screen showing the scheduling of actions in a calendar format. Figure 22 shows the action scheduled for a day. Users can view the scheduling of actions in any desired format.
Referring to Figure 21, a screen showing scheduling properties is shown. A user can select an action, a date and time for execution of the action and in the "Repeat" portion define the periodicity of the action. Actions may also be set up to be triggered upon the occurrence of trigger events including communications from external devices, changes in data etc. Figure 23 shows a schedule listing a series of actions.
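The scheduling properties described above (an action, a date and time, and a repeat period) can be sketched with Python's `datetime` module. The functions `next_runs` and `due_actions`, the action names and the dates are hypothetical and serve only to illustrate periodic execution.

```python
from datetime import datetime, timedelta


def next_runs(first_run, repeat_every, count):
    """Return the first `count` execution times of a repeating action."""
    return [first_run + i * repeat_every for i in range(count)]


def due_actions(schedule, now):
    """Return names of actions whose execution time has been reached."""
    return [name for name, when in schedule.items() if when <= now]


start = datetime(2002, 1, 14, 2, 0)
runs = next_runs(start, timedelta(days=1), 3)   # daily repeat, three runs

schedule = {
    "CSV to XML purchase order": runs[0],
    "weekly report": start + timedelta(days=7),
}
due = due_actions(schedule, datetime(2002, 1, 14, 2, 5))
```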
Figure 24 shows a possible computer system for implementing the method. The method may be deployed on computer hardware comprising an Intel processor 41, SDRAM memory 42, a keyboard and mouse input device 43, a computer monitor visual display device 44 and an application 45 residing in the SDRAM memory.
Figure 25 shows another possible computer system for implementing the method. The method may be deployed on computer hardware comprising an Intel processor 41, SDRAM memory 42, a keyboard and mouse input device 43, a computer monitor visual display device 44 and an application 45 residing in the SDRAM memory. The computer system may include a database 46 residing on a remote system which the processor can access through a remote transfer component proxy device 47 also residing on the remote system 48. The computer system also includes a hardware clock timer device 49 and a local database 50.
Those skilled in the art will appreciate that the method may be deployed on a computer system with more than one processor, more than one memory component, other types of input devices or more than one database located on remote systems or locally. Those skilled in the art will appreciate that the method may be deployed on a network such that some components may communicate with each other over a network such as a LAN or WAN using a protocol such as TCP/IP.
It will be seen that the invention provides a convenient means for transferring data formatted according to a first format to another system in which data is stored in a second format. The invention also provides a method and system which provides great flexibility for a user in the transformation of a wide range of data source formats to a wide range of target source formats. The invention also provides a method whereby changes in the way data is accessed do not affect the defined process, as data access information is isolated to the meta-data connector for that data. The invention is platform independent as the remote data transfer component can be deployed on any system and all transformations managed by a central server. Furthermore, due to the abstract nature of the meta-data connectors and the interactive user interface used to define transformations, inexperienced 3rd party programmers can add new meta-data connector types and define new transformation processes easily. The ability of the invention to define meta-data connectors enables the use of the invention for legacy systems which use out-dated or unusual data access methods, such as telnet sessions. The access complexity handled by the meta-data connectors enables the invention to be used to manage data from a source which requires complex error handling capabilities.
Where in the foregoing description reference has been made to integers or components having known equivalents then such equivalents are herein incorporated as if individually set forth.
Although this invention has been described by way of example it is to be appreciated that improvements and/or modifications may be made thereto without departing from the scope of the invention as defined in the appended claims.