CLAIM OF PRIORITY
This application claims the benefit of U.S. Provisional Patent Application 60/642,019 entitled A TECHNIQUE FOR CREATING SELF DESCRIBED DATA SHARED ACROSS MULTIPLE SERVICES, by Somenath Sengupta, filed Jan. 7, 2005, the entire contents of which are incorporated herein by reference.
COPYRIGHT NOTICE
A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
FIELD OF THE INVENTION
The current invention relates generally to data exchange between applications, and more particularly to a mechanism for creating self-described data that may be shared across multiple services.
BACKGROUND
Today's enterprise computing systems face increasing demands for existing applications, either within a single enterprise or across multiple enterprises, to work with one another. One challenge involved with the data shared across applications is how multiple parties can adapt to the changes in format and/or semantics of the shared data. Presently, one or more factors block successful interaction between applications. These factors include different interaction semantics and different data formats and semantics among different applications.
Currently available techniques do not solve the problem of differences in data formats and data semantics in a dynamic manner. Many current approaches employ adapters to enable each application to communicate with the other applications. Such conventional approaches require each party to the conversation to have pre-existing knowledge about data definitions of peer parties. In such conventional approaches, changing any one of the parties to a communication necessitates changes to the adapters used by each of the other parties.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A-1B are functional block diagrams of example computing environments in which techniques for sharing data across services in one embodiment of the present invention may be implemented.
FIG. 2A is an operational flow diagram illustrating a high level overview of a technique for sharing data in one embodiment of the present invention.
FIG. 2B is an operational flow diagram illustrating a high level overview of a technique for sharing data when an application has been modified in one embodiment of the present invention.
FIG. 2C is an operational flow diagram illustrating a high level overview of a technique for defining data in one embodiment of the present invention.
FIG. 2D is an operational flow diagram illustrating a high level overview of a process for reading data from the conforming schema in one embodiment of the present invention.
FIG. 3 is a hardware block diagram of an example computer system, which may be used to embody one or more components of an embodiment of the present invention.
DETAILED DESCRIPTION
In accordance with one embodiment of the present invention, there are provided mechanisms and methods for sharing data across services. Because these mechanisms and methods allow applications to share information even when the applications use different data formats, a business service can operate without disruption even if the input data stream produced by its peer business service has been changed.
In one embodiment, the invention provides a method for sharing data. One embodiment of the method includes determining a data group for each target application. The data group includes data intended for that target application and identity information about a schema describing the data. A message is prepared from the data by organizing the data groups into a plurality of element groups. In one embodiment, the plurality of element groups is implemented by an unbounded sequence of element groups; however, other implementations are contemplated. Each element group in the plurality of element groups contains data intended for a particular target application and identity information about a schema describing the data. The message is sent to at least one of a plurality of recipients.
The mechanisms and methods for sharing data across services, even when the applications use different data formats, enable a business service application to operate without disruption even if the input data stream produced by its peer business service application has been changed. The data sharing techniques provided by one embodiment enable a multiparty communication scenario to continue working even when one or more subscribers want a different set of information from the other subscribers. In one embodiment, a message producer does not need pre-existing knowledge of the data format or semantics supported by each individual subscriber. In one embodiment, a technique supports strong type validations.
In one embodiment, business service applications may be constrained to only see the portion of the data for which that business service is authorized, even though the producer of the data publishes the message once to a plurality of business services. In one embodiment, the producer of the data can change the data without breaking any subscriber. In one embodiment, each business service can route the data to another service even though the service itself only understands a portion of the data. In one embodiment, if a business service broadcasts the data to a group of business services in which each member of the group of business services understands a different data description language, a message having a set of sub-divided portions described with different languages and conformed to different schema enables the sender to send the same data only once to a group of subscribers.
This ability of a multiparty communication scenario to continue working even when one or more subscribers want a different set of information from the other subscribers makes it possible to attain improved usage from computing resources in a computer system.
FIGS. 1A-1B are functional block diagrams of example computing environments in which techniques for sharing data across services in one embodiment of the present invention may be implemented. As shown in FIG. 1A, an application 101 a is sending a message 90 a to a plurality of applications 103 a, 105 a and 107 a. Applications 101 a, 103 a, 105 a and 107 a may be any kind of business services, data processing services, data storage and retrieval services, web hosting services, scientific applications, entertainment applications and other application types that are contemplated. While FIG. 1A depicts application 101 a as a message sender and applications 103 a, 105 a and 107 a as message recipients, or target applications, in many embodiments, communications between applications will be bi-directional. Communications between applications 101 a, 103 a, 105 a and 107 a can be facilitated by a variety of mechanisms in various embodiments, such as, without limitation, computer networks, wireless networks, direct memory interfaces and other communications mechanisms comprised of hardware, software and combinations thereof, that are contemplated.
Message 90 a comprises one or more individual element groups. Each element group includes data to be exchanged between application 101 a and one of applications 103 a, 105 a and 107 a. Message 90 a may contain data that is structured or semi-structured. Application 101 a processes data into the element groups for one or more of the applications 103 a, 105 a and 107 a using processing described in further detail below with reference to FIGS. 2A-2D. Each element group in message 90 a includes information relating to a schema for the element group. Including information identifying a schema for the data makes the data self-described. Accordingly, message 90 a is an unbounded sequence of elements, each element of which may have sub-elements, and for which there is no single schema for the whole of the data stored in message 90 a. By employing such processing techniques, applications 101 a, 103 a, 105 a and 107 a are able to operate in a loosely coupled environment, meaning that a change to any one application does not necessitate changes to each of the other applications. While FIG. 1A illustrates a communication process in which application 101 a broadcasts a message 90 a to a plurality of recipient applications 103 a, 105 a and 107 a, other communications configurations are also possible.
FIG. 1B illustrates another communications configuration in which sender application 101 b sends a message 90 b to a recipient application 105 b. Application 105 b may then pass the message 90 b on to other recipient applications 103 b and 107 b. In such case, application 105 b may act as both a receiving application and a sending application in certain embodiments. Many other communications configurations between applications 101 a, 103 a, 105 a and 107 a are readily apparent to one skilled in the art. While the invention is illustrated generally herein with reference to an example of devices using a Java™ Virtual Machine (JVM) as the runtime environment, the present invention does not require such an environment, and in some embodiments, techniques according to the invention may be implemented in devices using alternative runtime environments.
FIG. 2A is an operational flow diagram illustrating a high level overview of a technique for sharing data in an embodiment of the present invention. The technique for sharing data shown in FIG. 2A is operable with an application sending data, such as application 101 a of FIG. 1A and application 101 b of FIG. 1B, for example. As shown in FIG. 2A, a data group is determined for each target application (block 202). Each data group comprises data intended for a particular target application and identity information about a schema describing the data. A message is prepared by organizing the data groups into an unbounded sequence of element groups (block 204). Each element group in the unbounded sequence of element groups contains the data intended for a particular target application and the identity information about a schema describing the data. The message is sent to at least one of a plurality of recipients (block 206).
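The flow of blocks 202-206 can be sketched as follows. This is a minimal illustration in Python, not the implementation of any embodiment: the element names follow the schema given later in this description, while the function name build_message, the dictionary keys, and the sample payloads are hypothetical and chosen only for the example. Sending the message (block 206) is left out because the transport is not specified.

```python
import base64
import xml.etree.ElementTree as ET


def build_message(data_groups):
    """Organize data groups into one <NewElements> message (blocks 202-204)."""
    root = ET.Element("NewElements")
    for group in data_groups:
        element = ET.SubElement(root, "NewElement")
        # Identity information about the schema describing this group's data.
        ET.SubElement(element, "schemaName").text = group["schema_name"]
        ET.SubElement(element, "schemaVersion").text = group["schema_version"]
        ET.SubElement(element, "schemaLanguage").text = group["schema_language"]
        # schemaDef is declared as xs:base64Binary, so the schema text is encoded.
        encoded = base64.b64encode(group["schema_def"].encode("utf-8"))
        ET.SubElement(element, "schemaDef").text = encoded.decode("ascii")
        # The body carries the data intended for one target application.
        body = ET.SubElement(element, "body")
        body.append(group["body"])
    return ET.tostring(root, encoding="unicode")


# Hypothetical payloads for two target applications.
order = ET.Element("order")
order.set("id", "42")
invoice = ET.Element("invoice")
invoice.set("total", "10.00")

message = build_message([
    {"schema_name": "Schema1.xsd", "schema_version": "1",
     "schema_language": "xml", "schema_def": "placeholder schema 1",
     "body": order},
    {"schema_name": "Schema2.xsd", "schema_version": "1",
     "schema_language": "xml", "schema_def": "placeholder schema 2",
     "body": invoice},
])
```

Because each element group carries its own schema identity, the sender in this sketch never needs a single schema covering the whole message.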
FIG. 2B is an operational flow diagram illustrating a high level overview of a technique for sharing data when an application has been modified in one embodiment of the present invention. The technique for sharing data shown in FIG. 2B is operable with an application sending data, such as application 101 a of FIG. 1A and application 101 b of FIG. 1B, for example. As shown in FIG. 2B, a data group is determined for at least one application program that has been modified to accept data in a different format and according to a new schema (block 212). The data group comprises additional data intended for the modified application and identity information about the new schema. A second message is prepared by organizing the data group for the modified application, along with any other data groups, into an unbounded sequence of element groups (block 214). The element group in the unbounded sequence of element groups that corresponds to the modified application contains the additional data intended for the modified application and corresponding identity information about the new schema describing the data. The message is sent to at least one of a plurality of recipients (block 216). In one embodiment, the processing illustrated by FIG. 2B enables the modified application to receive data in a new format irrespective of data groups intended for other applications.
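As an illustration of blocks 212-216, the following Python sketch prepares a first message and then a second message after one application has moved to a new schema version. The helper names (element_group, make_message, group_text), the version string "2", and the payload text are assumptions made for the example. The point it tries to show is that the element group for the unmodified application is the same in both messages.

```python
import xml.etree.ElementTree as ET


def element_group(schema_name, schema_version, schema_language, payload):
    """Build one <NewElement> group carrying its own schema identity."""
    group = ET.Element("NewElement")
    ET.SubElement(group, "schemaName").text = schema_name
    ET.SubElement(group, "schemaVersion").text = schema_version
    ET.SubElement(group, "schemaLanguage").text = schema_language
    ET.SubElement(group, "schemaDef").text = ""  # encoded schema omitted here
    ET.SubElement(group, "body").text = payload
    return group


def make_message(groups):
    """Wrap element groups in the <NewElements> sequence."""
    root = ET.Element("NewElements")
    root.extend(groups)
    return root


def group_text(root, schema_name):
    """Serialize the element group with the given schema name, if present."""
    for group in root.findall("NewElement"):
        if group.findtext("schemaName") == schema_name:
            return ET.tostring(group, encoding="unicode")
    return None


# First message: both applications use version 1 of their schemas.
first = make_message([
    element_group("Schema1.xsd", "1", "xml", "data for first application"),
    element_group("Schema2.xsd", "1", "rdf", "data for second application"),
])

# Second message: the second application was modified to accept a new
# schema (block 212); only its element group changes (block 214).
second = make_message([
    element_group("Schema1.xsd", "1", "xml", "data for first application"),
    element_group("Schema2.xsd", "2", "rdf", "data with additional fields"),
])
```

In this sketch, a recipient that reads only the Schema1.xsd group would see no difference between the two messages.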
FIG. 2C is an operational flow diagram illustrating a high level overview of a technique for defining data in one embodiment of the present invention. The technique for defining data shown in FIG. 2C is operable with an application sending data, such as application 101 a of FIG. 1A and application 101 b of FIG. 1B, for example. As shown in FIG. 2C, each group of elements is wrapped in a data description language element, e.g. <newElements> (block 222). Data description languages that may be used in various embodiments include, for example, eXtensible Markup Language (XML) and Resource Description Framework (RDF); other formats are contemplated. Next, each group of elements used in the data is associated with a schema defining the types and semantics of that group of elements (block 224). Each group of elements used in the data is associated with information about the schema, such as the schema name and the schema version (block 226). Each group of elements used in the data is associated with information about the well-known data description language conforming to the schema (block 228). The data is presented as an unbounded sequence of elements where each element itself may have sub-elements (block 230). In the embodiment illustrated by FIG. 2C, there is no single schema for the whole of the data.
FIG. 2D is an operational flow diagram illustrating a high level overview of a process for reading data from the conforming schema in one embodiment of the present invention. The technique for reading data shown in FIG. 2D is operable with an application receiving data, such as applications 103 a, 105 a and 107 a of FIG. 1A and applications 103 b, 105 b and 107 b of FIG. 1B, for example. As shown in FIG. 2D, a message is received (block 232). The message comprises data groups organized into an unbounded sequence of element groups. Each element group in the unbounded sequence of element groups contains data intended for a particular target application. The message also includes identity information about a schema describing the data. Information identifying the schema is read (block 234). Information identifying the schema may include a schema name and a schema version; other forms of identification are contemplated. Next, the schema language is read (block 236). The group of elements is parsed (block 238) based upon an appropriate combination of schema and parser selected from a plurality of available schemas and parsers (e.g. if the schema language is RDF, an RDF parser is selected).
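The reading steps of blocks 232-238 can be sketched in Python as shown below. The parser registry PARSERS, the two stand-in parser functions, and the sample message are hypothetical; in particular, the RDF handler only returns the body text, whereas a real receiver would pass it to an RDF library. Skipping groups with an unrecognized schema language is an assumption of this sketch, not a requirement stated in this description.

```python
import base64
import xml.etree.ElementTree as ET


def parse_xml_body(body, schema_text):
    """Stand-in XML handler: returns the tags of the body's child elements."""
    return [child.tag for child in body]


def parse_rdf_body(body, schema_text):
    """Stand-in RDF handler: returns the raw body text."""
    return (body.text or "").strip()


# Available parsers keyed by schema language (hypothetical registry).
PARSERS = {"xml": parse_xml_body, "rdf": parse_rdf_body}


def read_message(message_text):
    results = []
    for group in ET.fromstring(message_text).findall("NewElement"):
        # Block 234: read the information identifying the schema.
        name = group.findtext("schemaName")
        version = group.findtext("schemaVersion")
        # Block 236: read the schema language.
        language = (group.findtext("schemaLanguage") or "").strip().lower()
        parser = PARSERS.get(language)
        if parser is None:
            continue  # assumption: groups in an unknown language are skipped
        encoded = group.findtext("schemaDef") or ""
        schema_text = base64.b64decode(encoded).decode("utf-8")
        # Block 238: parse the group with the selected schema and parser.
        results.append((name, version, parser(group.find("body"), schema_text)))
    return results


def encode(text):
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


SAMPLE = (
    "<NewElements>"
    "<NewElement><schemaName>Schema1.xsd</schemaName>"
    "<schemaVersion>1</schemaVersion><schemaLanguage>xml</schemaLanguage>"
    "<schemaDef>{0}</schemaDef><body><order/></body></NewElement>"
    "<NewElement><schemaName>Schema2.xsd</schemaName>"
    "<schemaVersion>1</schemaVersion><schemaLanguage>rdf</schemaLanguage>"
    "<schemaDef>{1}</schemaDef><body>placeholder rdf content</body></NewElement>"
    "</NewElements>"
).format(encode("schema 1"), encode("schema 2"))

parsed_groups = read_message(SAMPLE)
```

Keeping the parsers in a lookup table keyed by schema language means that support for another data description language can be added without touching the reading loop.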
The technique illustrated by FIG. 2D does not require each business service participating in a conversation to use this mechanism internally. Rather, in the illustrated embodiment, each business service can internally use any schema, but just before sharing a message with other parties, the business service transforms the message to a common format such as, without limitation, the format illustrated below. Accordingly, this embodiment provides an extensible format for data that accommodates changes in the data structures shared across multiple parties.
In one embodiment, the data structure for each group of elements is XML that conforms to the following schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified" attributeFormDefault="unqualified">
  <xs:complexType name="buildingBlocks">
    <xs:sequence>
      <xs:element name="schemaName" type="xs:string"/>
      <xs:element name="schemaVersion" type="xs:string"/>
      <xs:element name="schemaLanguage" type="xs:string"/>
      <xs:element name="schemaDef" type="xs:base64Binary"/>
      <xs:element name="body" type="xs:anyType"/>
    </xs:sequence>
  </xs:complexType>
  <xs:element name="NewElements">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="NewElement" type="buildingBlocks"
            maxOccurs="unbounded" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Example of the data instance conforming to this schema:
<NewElements>
  <NewElement>
    <schemaName>Schema1.xsd</schemaName>
    <schemaVersion>1</schemaVersion>
    <schemaLanguage>xml</schemaLanguage>
    <schemaDef>base64 encoded schema content</schemaDef>
    <body>
      ..... xml structure
    </body>
  </NewElement>
  <NewElement>
    <schemaName>Schema2.xsd</schemaName>
    <schemaVersion>1</schemaVersion>
    <schemaLanguage>rdf</schemaLanguage>
    <schemaDef>base64 encoded schema content</schemaDef>
    <body>
      ..... rdf structure
    </body>
  </NewElement>
  ..............
</NewElements>
Example of the updated data instance still conforming to this schema:
<NewElements>
  <NewElement>
    <schemaName>Schema1.xsd</schemaName>
    <schemaVersion>1</schemaVersion>
    <schemaLanguage>xml</schemaLanguage>
    <schemaDef>base64 encoded schema content</schemaDef>
    <body>
      ..... xml structure
    </body>
  </NewElement>
  <NewElement>
    <schemaName>Schema2.xsd</schemaName>
    <schemaVersion>1</schemaVersion>
    <schemaLanguage>rdf</schemaLanguage>
    <schemaDef>base64 encoded schema content</schemaDef>
    <body>
      ..... rdf structure
    </body>
  </NewElement>
  <NewElement>
    <schemaName>Schema3.xsd</schemaName>
    <schemaVersion>1</schemaVersion>
    <schemaLanguage>rdf</schemaLanguage>
    <schemaDef>base64 encoded schema content</schemaDef>
    <body>
      ..... rdf structure
    </body>
  </NewElement>
</NewElements>
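The effect of adding a third element group on existing subscribers can be illustrated with a short Python sketch. The function names groups_for and sample_message, the payload strings, and the idea of selecting groups by a set of (schema name, schema version) pairs are assumptions made for this example only; the element names are those of the instances above.

```python
import xml.etree.ElementTree as ET


def groups_for(message_text, known_schemas):
    """Return the bodies of the element groups a subscriber understands.

    known_schemas is a set of (schemaName, schemaVersion) pairs; element
    groups described by any other schema are ignored.
    """
    selected = []
    for group in ET.fromstring(message_text).findall("NewElement"):
        key = (group.findtext("schemaName"), group.findtext("schemaVersion"))
        if key in known_schemas:
            selected.append((group.findtext("body") or "").strip())
    return selected


def sample_message(schema_names):
    """Build a message with one placeholder element group per schema name."""
    parts = ["<NewElements>"]
    for name in schema_names:
        parts.append(
            "<NewElement><schemaName>{0}</schemaName>"
            "<schemaVersion>1</schemaVersion>"
            "<schemaLanguage>rdf</schemaLanguage>"
            "<schemaDef></schemaDef>"
            "<body>payload for {0}</body></NewElement>".format(name)
        )
    parts.append("</NewElements>")
    return "".join(parts)


original = sample_message(["Schema1.xsd", "Schema2.xsd"])
updated = sample_message(["Schema1.xsd", "Schema2.xsd", "Schema3.xsd"])

# A subscriber that only understands Schema1.xsd, version 1.
existing_subscriber = {("Schema1.xsd", "1")}
# A new subscriber that only understands Schema3.xsd, version 1.
new_subscriber = {("Schema3.xsd", "1")}
```

Under these assumptions the existing subscriber obtains the same result from the original and the updated message, while the new subscriber finds its data only in the updated one.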
In other aspects, the invention encompasses, in some embodiments, computer apparatus, computing systems and machine-readable media configured to carry out the foregoing methods. In addition to an embodiment consisting of specifically designed integrated circuits or other electronics, the present invention may be conveniently implemented using a conventional general purpose or a specialized digital computer or microprocessor programmed according to the teachings of the present disclosure, as will be apparent to those skilled in the computer art.
Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art. The invention may also be implemented by the preparation of application specific integrated circuits or by interconnecting an appropriate network of conventional component circuits, as will be readily apparent to those skilled in the art.
The present invention includes a computer program product which is a storage medium (media) having instructions stored thereon/in which can be used to program a computer to perform any of the processes of the present invention. The storage medium can include, but is not limited to, any type of rotating media including floppy disks, optical discs, DVD, CD-ROMs, microdrive, and magneto-optical disks, and magnetic or optical cards, nanosystems (including molecular memory ICs), or any type of media or device suitable for storing instructions and/or data.
Stored on any one of the computer readable medium (media), the present invention includes software for controlling both the hardware of the general purpose/specialized computer or microprocessor, and for enabling the computer or microprocessor to interact with a human user or other mechanism utilizing the results of the present invention. Such software may include, but is not limited to, device drivers, operating systems, and user applications.
Included in the programming (software) of the general/specialized computer or microprocessor are software modules for implementing the teachings of the present invention, including, but not limited to providing mechanisms and methods for sharing data as discussed herein.
FIG. 3 illustrates an exemplary processing system 300, which can comprise one or more of the elements of FIGS. 1A and 1B. While other alternatives might be utilized, it will be presumed for clarity's sake that components of the systems of FIGS. 1A and 1B are implemented in hardware, software or some combination by one or more computing systems consistent therewith, unless otherwise indicated.
Computing system 300 comprises components coupled via one or more communication channels (e.g., bus 301) including one or more general or special purpose processors 302, such as a Pentium®, Centrino®, Power PC®, digital signal processor (“DSP”), and so on. System 300 components also include one or more input devices 303 (such as a mouse, keyboard, microphone, pen, and so on), and one or more output devices 304, such as a suitable display, speakers, actuators, and so on, in accordance with a particular application. (It will be appreciated that input or output devices can also similarly include more specialized devices or hardware/software device enhancements suitable for use by the mentally or physically challenged.)
System 300 also includes a computer readable storage media reader 305 coupled to a computer readable storage medium 306, such as a storage/memory device or hard or removable storage/memory media; such devices or media are further indicated separately as storage 308 and memory 309, which may include hard disk variants, floppy/compact disk variants, digital versatile disk (“DVD”) variants, smart cards, read only memory, random access memory, cache memory, and so on, in accordance with the requirements of a particular application. One or more suitable communication interfaces 307 may also be included, such as a modem, DSL, infrared, RF or other suitable transceiver, and so on for providing inter-device communication directly or via one or more suitable private or public networks or other components that may include but are not limited to those already discussed.
Working memory 310 further includes operating system (“OS”) 311 elements and other programs 312, such as one or more of application programs, mobile code, data, and so on for implementing system 300 components that might be stored or loaded therein during use. The particular OS or OSs may vary in accordance with a particular device, features or other aspects in accordance with a particular application (e.g. Windows®, WindowsCE®, Mac®, Linux, Unix or Palm OS® variants, a cell phone OS, a proprietary OS, Symbian®, and so on). Various programming languages or other tools can also be utilized, such as those compatible with C variants (e.g., C++, C#), the Java 2™ Platform, Enterprise Edition (“J2EE™”) or other programming languages in accordance with the requirements of a particular application. Other programs 312 may further, for example, include one or more of activity systems, education managers, education integrators, or interface, security, other synchronization, other browser or groupware code, and so on, including but not limited to those discussed elsewhere herein.
When implemented in software (e.g. as an application program, object, agent, downloadable, servlet, and so on in whole or part), a learning integration system or other component may be communicated transitionally or more persistently from local or remote storage to memory (SRAM, cache memory, etc.) for execution, or another suitable mechanism can be utilized, and components may be implemented in compiled or interpretive form. Input, intermediate or resulting data or functional elements may further reside more transitionally or more persistently in a storage media, cache or other volatile or non-volatile memory, (e.g., storage device 308 or memory 309) in accordance with a particular application.
Other features, aspects and objects of the invention can be obtained from a review of the figures and the claims. It is to be understood that other embodiments of the invention can be developed and fall within the spirit and scope of the invention and claims. The foregoing description of preferred embodiments of the present invention has been provided for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations will be apparent to the practitioner skilled in the art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, thereby enabling others skilled in the art to understand the invention for various embodiments and with various modifications that are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents.