WO2015055373A2 - Case-based reasoning - Google Patents
- Publication number
- WO2015055373A2 PCT/EP2014/069889 EP2014069889W
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- case
- information
- comparison
- cbr
- dependence
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Definitions
- The present invention relates to case-based reasoning. More particularly, embodiments of the invention provide efficient, effective, adaptable and scalable case-based reasoning techniques that can be applied in a broad range of industries, such as the finance, healthcare and energy industries.
Background of the Invention
- Predictive analytics is a tool for making and supporting decisions. Predictive analytics involves analysing historical data in order to predict future events and thereby automatically propose or take actions.
- The majority of known predictive analytics systems are offline or batch processing systems that do not operate in real-time.
- The data used in the predictive analytics is separate from that used in operational systems and the data may be hours, days, weeks or even months old before analytics algorithms are applied to it.
- These techniques are not appropriate for applications in which it is necessary for the predictive analytics to be performed in real-time.
- Such applications may be, for example, the monitoring of an oil well drilling operation or an operation by a physician, in which it is necessary for problems to be detected, and proposals to be generated, very quickly.
- Complex event processing (CEP) systems generate alerts based on previously created rules for monitoring data.
- Such rule-based systems are inherently limited by the difficulty in defining and maintaining the rules. While a rule may be applied to data in near real-time, the analytics required to create the rule is slow and not real-time. Moreover, the created rules are inflexible and incapable of adapting to changes in the data. The analysis needed to create and update rules is therefore undertaken offline. Accordingly, rule-based systems tend to only be used in stable and predictable environments in which it is possible to define a set of rules for all circumstances and for automatic actions to be taken.
- Rule-based techniques are not appropriate for applying predictive analytics in fast-changing environments. Furthermore, there are scenarios in which it is not appropriate for automatic actions to be taken. If a critical or complicated decision is to be made, for example by an oil well operator during a drilling operation or by a physician during surgery, it is neither feasible nor desirable to take humans out of the decision making process.
- Case-based reasoning is a real-time predictive analytics technique that does not experience the above-described problems of rule-based techniques.
- CBR systems detect and propose solutions to problems using information obtained from a plurality of cases stored in a case base.
- Each of the stored cases comprises a description of a problem and a description of a solution.
- The cases are typically generated manually by system operators, based on problems that were actually experienced and the solutions devised for them.
- CBR systems are able to provide system operators with detailed and reasoned solutions to complicated problems.
- Big data refers to a collection of data sets so large and complex that they become difficult to process using traditional data processing applications. For example, such big data could be encountered when applying predictive analytics within the financial services industry as a vast quantity of financial information is continuously generated and transferred between computing systems all over the world.
- A problem with known CBR systems is that they are not designed for supporting and providing real-time operation on big data.
- a computer-implemented method of monitoring a situation by generating an overall similarity score between a received data stream and a case in case-based reasoning, CBR, comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams, wherein each of the generated plurality of data streams is dependent on the received data stream; generating, for each of the generated data streams, a similarity score for a feature of a case, wherein each similarity score is generated in dependence on a comparison between information in the generated data stream and stored information on the feature of a case, thereby each of the similarity scores is generated in dependence on a comparison with stored information on a different feature of the same case; and generating an overall similarity score between the received data stream and the case in dependence on the generated similarity scores.
- the method further comprises generating each similarity score in dependence on comparison information of the corresponding feature of the case.
- the received data stream comprises one or more streams of parameters and each of the generated data streams comprises one or more streams of parameters in dependence on the received data stream.
- the method further comprises generating the similarity score for each feature in dependence on a comparison of the value of a parameter in a generated data stream of parameters for the feature with the stored value of the parameter for the feature.
- the comparison information comprises weight information and the method further comprises weighting the value of the parameter in the generated data stream for a feature and/or the stored value of the parameter for the feature; and generating the similarity score of the feature in dependence on the weighted value(s).
- the comparison information comprises a comparison function that specifies computations for generating the similarity score and the method further comprises generating the similarity score of each feature in dependence on the comparison function of the feature.
- the method further comprises generating one or more aggregate similarity scores in dependence on one or more of the generated similarity scores; and generating the overall similarity score in dependence on the aggregate similarity score(s).
- the method further comprises determining the overall similarity score in real-time.
- the method is performed by a comparison agent.
- the comparison agent is configured in dependence on the comparison information for each feature and including the stored information for each feature.
- the case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
- a computer-implemented method of monitoring a situation by determining a set of one or more cases in case-based reasoning, CBR, comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams from the received data stream; generating, according to any of the above-described methods, an overall similarity score between each of the plurality of parallel data streams and a case, wherein each overall similarity score is generated from a comparison between one of the plurality of parallel data streams and a different one of a plurality of cases; and determining a set of one or more cases in dependence on the generated overall similarity scores.
- a comparison agent for monitoring a situation by generating an overall similarity score between a received data stream, comprising information on a monitored situation, and a case in case-based reasoning, CBR, wherein the comparison agent is configured to perform any of the above described methods.
- a case-based reasoning, CBR, engine for monitoring a situation by determining a set of one or more cases, wherein the CBR engine is configured to perform any of the above-described methods.
- a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
- a computer-implemented method of monitoring a situation by determining a set of one or more cases in case-based reasoning, CBR, comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams from the received data stream; generating, for each of the parallel data streams, an overall similarity score between the parallel data stream and one of a plurality of cases, wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and a different case; and determining a set of one or more cases in dependence on the generated overall similarity scores.
- each of the overall similarity scores is generated by one of a plurality of comparison agents and each of the comparison agents receives one of the plurality of data streams.
- the method further comprises each comparison agent generating an overall similarity score by: receiving one of the plurality of data streams; generating a further plurality of parallel data streams, wherein each of the generated further plurality of parallel data streams is dependent on the received one of the plurality of data streams; generating, for each of said generated further plurality of parallel data streams, a similarity score in dependence on a comparison between information on a feature in the generated further data stream and stored information on the feature of a case, wherein each of the similarity scores of said generated further plurality of data streams is generated in dependence on a comparison with stored information on a different feature of the same case; and generating an overall similarity score between the received one of the plurality of data streams and the case in dependence on the generated similarity scores.
- the method further comprises determining to include a case in the set of one or more cases if the overall similarity score for the case is above a predetermined threshold level.
- the determined set of cases has a predetermined number of two or more cases, and the method comprises determining the predetermined number of cases for including in the set as the cases with the highest overall similarity scores.
- the method further comprises displaying information dependent on each of the determined one or more cases.
- each case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
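The two selection rules described above (keep every case whose overall similarity score exceeds a predetermined threshold, or keep a predetermined number of the highest-scoring cases) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_cases`, its parameters and the example case identifiers are hypothetical.

```python
from typing import Dict, List, Optional


def select_cases(scores: Dict[str, float],
                 threshold: Optional[float] = None,
                 top_n: Optional[int] = None) -> List[str]:
    """Return case identifiers ranked by overall similarity, highest first.

    threshold: if given, keep only cases scoring strictly above it.
    top_n:     if given, keep at most this many of the best cases.
    """
    ranked = sorted(scores, key=lambda case_id: scores[case_id], reverse=True)
    if threshold is not None:
        ranked = [case_id for case_id in ranked if scores[case_id] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


# Hypothetical overall similarity scores produced by three comparison agents.
scores = {"stuck_pipe": 0.91, "lost_circulation": 0.42, "kick": 0.77}
print(select_cases(scores, threshold=0.5))  # cases above the threshold
print(select_cases(scores, top_n=2))        # the two most similar cases
```

Both rules can be combined by passing both arguments, in which case the threshold is applied before the cut to a fixed number.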
- a case-based reasoning, CBR, engine for monitoring a situation by determining a set of one or more cases, wherein the CBR engine is configured to perform any of the above-described methods.
- a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
- a case-based reasoning, CBR, platform for monitoring a situation comprising: a CBR engine; a unified data cache; a case library; an agent application programming interface, API; a data source API; an application API; and a persistence API; wherein: the data source API is configured to provide an interface between live and/or static data sources, external from the CBR platform, and the unified data cache, wherein the live and/or static data sources are sources of information on a monitored situation; the persistence API is configured to provide an interface between a persistence database, external from the CBR platform, and the unified data cache and the case library; the application API is configured to provide an interface between the systems of data analysts, platform administrators and/or operators, external from the CBR platform, and the unified data cache and the case library; the C
- cases are stored in the case base in XML or serialised code format.
- the case base is any of a single database, a plurality of distributed databases, a directory on a server or a plurality of directories on one or more servers.
- each case comprises metadata that includes comparison information for the case.
- each case is structured according to a computing graph.
- the computing graph is a tree structure or directed acyclic graph.
- the CBR engine is configured to perform the method of: receiving said data stream comprising information on the monitored situation from the unified data cache; generating a plurality of parallel data streams from said received data stream comprising information on the monitored situation; generating, for each of the parallel data streams, an overall similarity score between the parallel data stream and information on one of the plurality of cases received from the case library, wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and information on a different case; and determining the CBR results as a set of one or more cases in dependence on the generated overall similarity scores.
- each case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
- a case-based reasoning apparatus comprising a processor and a memory having stored therein instructions that, when executed by the processor, cause the apparatus to perform a method for creating, in programmable hardware, components of a comparison agent for monitoring a situation, the method comprising: obtaining one or more parameter values and comparison information for each of a plurality of features of a case, wherein the comparison information of each feature defines a configuration of a computation unit; and creating, for each of the plurality of features, a computation unit in dependence on the obtained one or more parameter values and the comparison information of the feature, such that the created computation unit is configured to generate an output in dependence on the obtained one or more parameter values and the comparison information of the feature.
- the method further comprises creating each of the computation units such that each computation unit is configured to receive one or more data streams of parameters and to generate an output in dependence on the values of the one or more received parameters, wherein the one or more data streams comprise information on a monitored situation.
- the method further comprises: obtaining comparison information for one or more aggregate computation units, wherein the comparison information of each aggregate computation unit defines the configuration of the aggregate computation unit; and creating, in dependence on the obtained comparison information, one or more aggregate computation units, wherein each aggregate computation unit is configured to receive an output generated by at least one of the computation units.
- the comparison information for each computation unit includes information on how the obtained one or more parameter values and/or received parameter values are to be weighted by the computation unit, and the method comprises creating each computation unit such that the computation unit is configured to weight the obtained one or more parameter values and/or received parameter values in dependence of the comparison information.
- the comparison information for each aggregate computation unit includes information on how the aggregate computation unit is to weight the values of its inputs, and the method comprises creating each aggregate computation unit such that the aggregate computation unit is configured to weight its inputs in dependence on the comparison information.
- the comparison information for each computation unit includes a comparison function that specifies the computations that the computation unit is required to perform to generate an output, and the method comprises creating each computation unit such that the computation unit is configured to apply computations with the obtained one or more parameter values and received parameter values in dependence on the comparison function.
- the configuration information for each aggregate computation unit includes a comparison function that specifies the computations that the aggregate computation unit is required to perform to generate an output, and the method comprises creating each aggregate computation unit such that the aggregate computation unit is configured to apply computations to its inputs in dependence on the comparison function.
- the comparison information for each computation unit and the comparison information of each aggregate computation unit are obtained by obtaining metadata of the case.
- the computation units and aggregate computation units are configured in dependence on a computing graph for the case.
- the computing graph is a tree structure or directed acyclic graph.
- the case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
- a comparison agent for monitoring a situation, the comparison agent being created according to any of the above-described methods.
- a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
- a case-based reasoning apparatus comprising a processor and a memory having stored therein instructions that, when executed by the processor, cause the apparatus to perform a method of creating a new case in a case-based reasoning, CBR, system for monitoring a situation, the method comprising: determining a set of one or more cases from a plurality of cases in dependence on a received data stream comprising information on a monitored situation, wherein each case comprises information describing a problem and information describing a solution to the problem and the process of determining the set of one or more cases is performed without comparing the description of the problems of any of the plurality of cases with a previously generated current case comprising information describing the monitored situation; generating information describing a solution in dependence on information obtained from the determined set of one or more cases and/or in dependence on information received from a user interface; generating a current case comprising information describing the monitored situation in dependence on the received data stream; and generating a new case in dependence on the generated information describing
- the method further comprises generating the current case after generating the information describing a solution.
- the method further comprises generating a request for the current case; and generating the current case in response to receiving the request.
- the method further comprises automatically generating the current case.
- the plurality of cases are stored in a case base and the method further comprises storing the generated new case in the case base.
- the process of determining a set of one or more cases includes creating a comparison agent for each of the plurality of cases, and the method comprises generating a new comparison agent for the generated new case.
- each case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
- According to a fourteenth aspect of the invention, there is provided a case-based reasoning, CBR, system for monitoring a situation, the CBR system configured to perform any of the above-described methods.
- a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
- Figure 1 provides an overview of a situation in which a CBR system is used.
- Figure 2 is a schematic diagram of a case for use in embodiments.
- Figure 3 is an XML representation of a case according to an embodiment.
- Figure 4 is a block diagram of a comparison agent according to an embodiment.
- Figure 5 is a block diagram of a comparison agent according to an embodiment.
- Figure 6 is a block diagram of part of a CBR engine according to an embodiment.
- Figure 7 shows the steps of a method for generating an overall similarity score between a received data stream and a case according to an embodiment.
- Figure 8 shows the steps of a method for determining a set of one or more cases by a CBR engine according to an embodiment.
- Figure 9 shows the steps of a method for creating components of a comparison agent according to an embodiment.
- Figure 10 is a block diagram showing a CBR cycle according to an embodiment.
- Figure 11 shows the steps of a method for creating a new case using a CBR cycle according to an embodiment.
- Figure 12 shows a CBR platform according to an embodiment.
- Embodiments of the invention provide CBR techniques that are advantageous over known predictive analytics techniques. Embodiments allow CBR systems to be realised that are fast and scalable, as required for real-time operation on big data. Moreover, the CBR systems according to embodiments are adaptable and can be used for many applications. CBR systems according to embodiments are particularly effective in the energy, finance and healthcare industries.
- The CBR techniques of embodiments identify and prevent drilling problems and thereby greatly reduce both costs and drilling time.
- The techniques are especially advantageous in complex drilling operations and multi-well operations as they are able to manage high volumes of data and to quickly recognise trends and indicators.
- The CBR techniques are also applicable to the energy industry in general and are not limited to the oil industry. For example, they may be used to detect and prevent problems in the electrical power generation industries.
- The CBR techniques provide an analytics tool on top of the existing data-capture technology to generate proposals for preventing problems from occurring and solving any problems that have occurred.
- the CBR techniques also provide better assurance for staying in compliance with regulatory requirements and protecting organisations from third-party mistakes. Risk and compliance officers can look at actual past events (default rates, VAR, etc.) to measure the risk of using similar strategies in the future.
- CBR techniques enable hospitals to improve the quality of patient care and reduce costs.
- the CBR techniques apply real-time analysis to identify and manage impacting events by providing physicians with evidence-based decision support.
- An implementation of the type shown in Figure 1 may be used, in which the monitored situation (in this example a drilling operation), the data analysis server comprising the CBR system and the operations centre are all remote from each other and communicate over a network, such as a local network or the internet.
- the data analysis server and/or operations centre may be local to the situation.
- CBR-based blood pressure
- the monitored parameters could be transmitted to a data analysis server comprising a CBR system within the hospital.
- any detected problems and proposed solutions are then displayed, in the operating theatre, to the surgeon operating on the patient.
- The CBR techniques create a comparison agent for each of the cases in a case base.
- A comparison agent is an instantiation of a case containing information describing how to compare a previous situation described by the case with a current situation.
- Each comparison agent comprises computational units that hold all the information required for comparing a feature of a stored case with data streams describing the current situation.
- The computational units are created in dependence on one or more values of parameters of a stored case feature, a function defining how to determine a similarity, and any other information required for comparison, such as weighting information or minimum and maximum values.
- A plurality of parallel data streams is generated from a received data stream that comprises information on monitored parameters of a situation.
- The plurality of data streams is then streamed into the computational units of the comparison agents.
- Each computational unit compares a subset of the parameters in the streamed data with the stored parameters that correspond to a feature of a case, to generate a similarity score between the received data stream and the feature.
- All of the features of a case have a corresponding computational unit that computes the similarity between one or more parameters describing a monitored situation and the feature that describes a past situation.
- An overall similarity score between a monitored situation and the case is then calculated in dependence on the similarity score calculated for each feature.
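The flow described above (split the received data into one stream per feature, score each feature against its stored value, then combine the feature scores into an overall score) might be sketched as below. The feature names, stored values, weights, tolerances, the linear similarity function and the weighted average are all assumptions made for illustration; the source leaves the actual comparison functions and the aggregation rule to each case's comparison information.

```python
from typing import Dict, Tuple

# Hypothetical stored case: feature name -> (stored value, weight, tolerance).
CASE: Dict[str, Tuple[float, float, float]] = {
    "pressure": (250.0, 0.5, 50.0),
    "torque": (12.0, 0.3, 4.0),
    "flow": (30.0, 0.2, 10.0),
}


def split_stream(sample: Dict[str, float],
                 case: Dict[str, Tuple[float, float, float]]) -> Dict[str, float]:
    """Fan one received record out into one value per feature of the case."""
    return {name: sample[name] for name in case}


def feature_similarity(received: float, stored: float, tolerance: float) -> float:
    """Linear similarity in [0, 1]: 1 when equal, 0 at or beyond the tolerance."""
    return max(0.0, 1.0 - abs(received - stored) / tolerance)


def overall_similarity(sample: Dict[str, float],
                       case: Dict[str, Tuple[float, float, float]]) -> float:
    """Weighted average of the per-feature similarity scores."""
    per_feature = split_stream(sample, case)
    total_weight = sum(weight for _, weight, _ in case.values())
    weighted = sum(
        weight * feature_similarity(per_feature[name], stored, tolerance)
        for name, (stored, weight, tolerance) in case.items()
    )
    return weighted / total_weight


sample = {"pressure": 250.0, "torque": 10.0, "flow": 40.0, "depth": 1800.0}
print(overall_similarity(sample, CASE))  # about 0.65
```

Parameters in the record that the case does not use (here the hypothetical `depth`) are simply not routed to any feature.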
- Received data is streamed directly into the computational units of comparison agents. This allows a very fast comparison of features to be performed.
- Embodiments differ from, and are faster than, all known CBR techniques as these require the additional step of first generating a file comprising information on monitored parameters of a situation, referred to herein as a current case, and then comparing the current case with stored case information.
- All of the comparison agents operate in parallel with each other. This is considerably faster than known CBR systems, which sequentially compare a current case with each case of a case base.
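A minimal sketch of running one comparison agent per case concurrently is given below, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The `ComparisonAgent` class, its scoring formula and the case identifiers are hypothetical. Threads are used only to show the structure: in CPython they do not give CPU parallelism for pure-Python arithmetic, so a real deployment would need processes, native code or the programmable hardware mentioned in the source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


class ComparisonAgent:
    """Hypothetical agent holding the stored feature values of one case."""

    def __init__(self, case_id: str, stored: Dict[str, float]) -> None:
        self.case_id = case_id
        self.stored = stored

    def score(self, sample: Dict[str, float]) -> Tuple[str, float]:
        # Assumed similarity: 1 / (1 + |difference|), averaged over features.
        sims = [1.0 / (1.0 + abs(sample[name] - value))
                for name, value in self.stored.items()]
        return self.case_id, sum(sims) / len(sims)


def score_all(agents: List[ComparisonAgent],
              sample: Dict[str, float]) -> Dict[str, float]:
    """Send the same sample to every agent and collect the overall scores."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        return dict(pool.map(lambda agent: agent.score(sample), agents))


agents = [
    ComparisonAgent("case_a", {"x": 1.0, "y": 2.0}),
    ComparisonAgent("case_b", {"x": 2.0, "y": 3.0}),
]
print(score_all(agents, {"x": 1.0, "y": 2.0}))
```

Because no agent depends on another agent's result, adding cases only adds independent work items, which is what makes this arrangement straightforward to scale out.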
- each comparison agent generates an overall similarity score.
- the comparison agents are configured not only with stored parameter information for a case, but also with weight information of parameters and functions that describe how the comparison agent should compare received and stored information. This allows a more sophisticated and tuneable comparison technique to be applied and the generated similarity score is therefore more accurate.
- CBR techniques are performed by a CBR engine supported by a CBR platform.
- the CBR platform is able to integrate with other existing systems and can therefore be used in many applications.
- the CBR platform, and in particular the CBR engine within the platform, are also highly scalable.
- the CBR techniques according to embodiments are described in more detail below.
- Figure 2 is a schematic diagram showing how the information within a case 21 for use in embodiments may be structured.
- Each case 21 comprises a description of a problem, shown as a situation description, and a description of a solution, shown as advice.
- Stored information within each of these sections may be further categorised into sub-sections, such as dynamic and static data for the situation description. Within each sub-section, the stored information may be further categorised into further sub-sections. Although not shown in Figure 2, there may be a number of further categorisations of the stored information into smaller and smaller sub-sections.
- The smallest sub-sections of stored information for the situation description are features of the case 21.
- Each problem that a case 21 solves is represented by a set of features with each feature comprising stored values of a parameter. Values of the same parameter can also be obtained from a monitored situation.
- Each feature may be combined with other features to form an aggregate feature.
- the features that are combined to form the aggregate feature are the child features of the aggregate feature.
- Each aggregate feature may itself be a child feature of another aggregate feature.
- the structure of the situation description of each case 21 is defined by a case description graph.
- the case description graph may be a directed acyclic graph, DAG, a tree or other types of structure.
- the nodes of the graph denote the features of the case 21 while the edges, or paths between the nodes, correspond to the relationships between the nodes. That is to say, a leaf node in a tree structure corresponds to a feature that does not depend on any other feature and the other nodes within the tree structure correspond to aggregate features.
- Features can have any data type.
- the data type can be just a number with a unit or a symbol, or it can be more complex, such as a set, a vector or a sequence of numbers or symbols.
- Features can even be natural language text.
- the comparison between a stored case 21 in a case base and an input data stream from a monitored situation is performed by comparing the parameter information stored within the features on a feature by feature basis.
- Aggregate features have at least one input that is an output from another feature comparison.
- Although parameters within the received data stream may also be directly input to an aggregate feature, aggregate features typically have only outputs from other feature comparisons as inputs.
- the output of a feature comparison is a similarity score while the aggregated similarity score for all features of a case 21 is an overall similarity score for the comparison between the stored case 21 and the received data stream.
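The relationship between leaf features, aggregate features and the overall similarity score can be illustrated with a small tree. In this sketch, which rests on assumptions not stated in the source, a leaf compares one received parameter with its stored value using a linear measure, and an aggregate takes a weighted average of its children's scores; the class names `Leaf` and `Aggregate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Leaf:
    """Feature that compares one received parameter with its stored value."""
    parameter: str
    stored: float
    scale: float  # difference at which the similarity reaches 0

    def score(self, sample: Dict[str, float]) -> float:
        diff = abs(sample[self.parameter] - self.stored)
        return max(0.0, 1.0 - diff / self.scale)


@dataclass
class Aggregate:
    """Aggregate feature whose inputs are the scores of its child features."""
    children: List[Union["Leaf", "Aggregate"]]
    weights: List[float]

    def score(self, sample: Dict[str, float]) -> float:
        total = sum(self.weights)
        return sum(weight * child.score(sample)
                   for weight, child in zip(self.weights, self.children)) / total


# The root aggregate's score plays the role of the overall similarity score.
root = Aggregate(
    children=[
        Aggregate(children=[Leaf("a", 10.0, 10.0), Leaf("b", 5.0, 5.0)],
                  weights=[1.0, 1.0]),
        Leaf("c", 0.0, 2.0),
    ],
    weights=[2.0, 1.0],
)
print(root.score({"a": 10.0, "b": 2.5, "c": 1.0}))  # about 0.667
```

A directed acyclic graph differs from this tree only in that one feature's score may feed more than one aggregate.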
- For every feature, including the aggregate features, comparison information is defined.
- the comparison information may include weights, comparison functions and any other configuration information, such as max and min values for numeric similarity measures or range limits for sequences.
- a comparison function is a function that measures the similarity of one or more received and stored parameters to thereby generate a similarity score that is a measure of the similarity between a feature of a case 21 and information from a monitored situation.
- the comparison function for a feature may use any of the other information in the comparison information, such as weights of parameter values, when generating a similarity score for the feature.
- All of the features that receive parameters in the data stream may comprise weights that are applied to the stored parameter information and/or the parameters in the data stream.
- Each aggregate feature may also comprise weights that are applied to each of its inputs. The weights allow the contributions of each of the features to be controlled and therefore the relative importance of each feature to be included in the information describing a situation.
- Local weights can be distinct for each feature and are individual for each case 21. Local weights need to be stored on a case-by-case basis. Global weights apply to different cases 21 in the same manner and need only be stored centrally. Global weights become local weights once they are customised for individual cases 21.
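One way to read the distinction between global and local weights is as a lookup with per-case overrides, sketched below. The dictionary layout and the function name `effective_weights` are assumptions for illustration only.

```python
from typing import Dict

# Hypothetical centrally stored weights that apply to every case alike.
GLOBAL_WEIGHTS: Dict[str, float] = {"pressure": 1.0, "torque": 1.0, "flow": 0.5}


def effective_weights(global_weights: Dict[str, float],
                      local_weights: Dict[str, float]) -> Dict[str, float]:
    """Weights used for one case: local entries override the global ones."""
    return {**global_weights, **local_weights}


# A case that customises only the weight of "flow" stores just that entry.
case_local = {"flow": 2.0}
print(effective_weights(GLOBAL_WEIGHTS, case_local))
```

Only the customised entries need to be stored with the case; everything else continues to come from the central table.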
- every feature may also have a comparison function that defines how the feature is to be compared against input parameters from the data stream.
- the comparison function for a feature can be any mathematical function that generates a result in dependence on the parameters.
- Each comparison function can be individualised to each feature.
- Features may be provided with a default comparison function or a comparison function that has been determined by a system operator.
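As an illustration of a default comparison function alongside one chosen by a system operator, the sketch below falls back to exact matching unless the feature names its own function. The dictionary layout, the exact-match default and the use of Jaccard similarity for a set-valued feature are assumptions chosen for the example, not functions taken from the source.

```python
from typing import Any, Dict, Set


def default_similarity(received: Any, stored: Any) -> float:
    """Assumed default: exact match scores 1, anything else scores 0."""
    return 1.0 if received == stored else 0.0


def jaccard_similarity(received: Set[str], stored: Set[str]) -> float:
    """Jaccard similarity for set-valued features: |intersection| / |union|."""
    union = received | stored
    return len(received & stored) / len(union) if union else 1.0


def compare(feature: Dict[str, Any], received: Any) -> float:
    """Apply the feature's own comparison function, or the default one."""
    function = feature.get("comparison_function", default_similarity)
    return function(received, feature["stored"])


symbol_feature = {"stored": "gas"}
set_feature = {"stored": {"a", "b", "c"}, "comparison_function": jaccard_similarity}

print(compare(symbol_feature, "gas"))         # 1.0
print(compare(set_feature, {"b", "c", "d"}))  # 0.5
```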
- All of the cases 21 according to embodiments comprise metadata for storing the comparison information for all of the features of each case 21. Metadata can also comprise further information describing a case 21, such as units and textual descriptions of the features to help system operators understand each feature.
- each case 21 can be modelled individually.
- the original compiler of a case 21 has full control over which features are chosen to describe the case 21 , how stored and measured information is compared for each of the features, and how an overall similarity score is generated for the case 21 .
- the metadata for each case 21 can also be modified at a later stage by a system operator in order to change how the case 21 is compared with monitored data. A system operator can therefore tune the comparison of the case 21 .
- case solution: if the solution does not need to be automatically modified by a computer, it can be a textual description of how to solve the problem. Otherwise, the case solution needs to be represented in a format that can be understood by a computer. This advantageously allows a solution to be automatically devised that is based on a plurality of cases 21 similar to the current situation. How to represent a case solution so that it can be understood by computers is known in the art.
- Each case 21 can be stored as an XML file, such as the example shown in Figure 3. There are a number of alternative forms in which each case 21 can be stored, such as serialised code.
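The following Python sketch shows how a case stored as XML might be loaded. The XML layout, element names and attributes are invented for this example and are not the format of Figure 3; only the standard library `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical case file: the schema is an assumption, not the one in Figure 3.
CASE_XML = """
<case id="case-001">
  <situation>
    <feature name="voltage" unit="V" weight="0.7" similarity="linear">230.0</feature>
    <feature name="status" weight="0.3" similarity="exact">ON</feature>
  </situation>
  <solution>Reduce the load and restart the unit.</solution>
</case>
"""

def load_case(xml_text):
    """Parse a case into a dictionary of features (with metadata) and a solution."""
    root = ET.fromstring(xml_text.strip())
    features = {}
    for element in root.findall("./situation/feature"):
        features[element.get("name")] = {
            "stored": element.text.strip(),
            "unit": element.get("unit"),  # None when no unit is given
            "weight": float(element.get("weight", "1.0")),
            "similarity": element.get("similarity", "exact"),
        }
    return {
        "id": root.get("id"),
        "features": features,
        "solution": root.findtext("solution").strip(),
    }
```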
- the case base may be, for example, a single database, a plurality of databases distributed across a plurality of hardware devices, a directory on a server or a plurality of directories on one or more servers.
- a comparison agent is created for each case 21 in the case base.
- Each comparison agent is created in dependence on the case 21 description graph for the situation description of the case 21 .
- An example of a comparison agent 41 for comparing parameters in a received data stream with a case 21 is shown in Figure 4.
- a computation unit has been created for each feature of the case 21 .
- the relative arrangement of the computation units has been defined by the case description graph for the case 21 and the comparison information of the case 21 has been used to configure how each computation unit operates.
- Computation Node 1 is a computation unit that has been configured to generate a comparison result between a received and a stored value of a voltage. In addition to being created with the stored value of the voltage, Computation Node 1 has been configured to compare the stored and received value of the voltage according to the comparison information of the feature that Computation Node 1 corresponds to.
- the comparison information includes a Similarity Measure, that is a mathematical function that describes how a result is generated, as well as a Configuration, that specifies limits on the voltage values.
- Computation Node 2 has been configured to generate a comparison result between a received and stored value of a status. It has been created with a stored value of the status and has been configured to compare the stored and received value of the status according to the comparison information of the feature that Computation Node 2 corresponds to.
- Computation Node 3 has been created for an aggregate feature. Computation Node 3 receives as inputs the outputs from Computation Nodes 1 and 2. It has been configured to weight and combine its inputs according to the comparison information of the aggregate feature that it corresponds to in order to generate an overall comparison result, i.e. overall similarity score.
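The three computation nodes described above can be sketched in Python as a small tree. This is a simplified illustration: the class names, the stored values, the limits, the weights and the choice of a weighted average as the aggregation are all assumptions rather than details taken from Figure 4.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class ComputationNode:
    """Leaf node: compares one received parameter with its stored value."""
    parameter: str
    stored: object
    similarity: Callable[[object, object], float]

    def compute(self, received: Dict[str, object]) -> float:
        return self.similarity(self.stored, received[self.parameter])

@dataclass
class AggregateNode:
    """Aggregate node: weights and combines the outputs of its input nodes."""
    inputs: List[Tuple[float, object]]  # (weight, child node) pairs

    def compute(self, received: Dict[str, object]) -> float:
        total = sum(weight for weight, _ in self.inputs)
        # Assumed aggregation: weighted average of the child scores.
        return sum(w * node.compute(received) for w, node in self.inputs) / total

def voltage_similarity(lo: float, hi: float):
    """Similarity measure plus configuration: values outside [lo, hi] score 0."""
    def similarity(stored, received):
        if not lo <= received <= hi:
            return 0.0
        return 1.0 - abs(stored - received) / (hi - lo)
    return similarity

node1 = ComputationNode("voltage", 230.0, voltage_similarity(200.0, 260.0))
node2 = ComputationNode("status", "ON", lambda s, r: 1.0 if s == r else 0.0)
node3 = AggregateNode([(0.75, node1), (0.25, node2)])
```

Calling `node3.compute({"voltage": 245.0, "status": "ON"})` then yields the overall similarity score for the case under these assumptions.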
- FIG. 5 shows another example of a comparison agent 51 .
- the comparison agent 51 comprises computation units for features F1 , F2 and F3 as well as for aggregate features AF1 , AF2 and AF3.
- the comparison agent 51 also comprises a filtering and splitting component 58.
- a received data stream comprises data streams of parameters A, B and C.
- the received data stream is input to the filtering and splitting component 58 that generates a plurality of parallel data streams that are output to features F1 , F2 and F3.
- the filtering ensures that each of the parallel data streams comprises only the parameters that are required by the computation unit that the data stream corresponds to. Accordingly, a data stream comprising only parameter A is sent to F1 as the computation unit F1 only performs a comparison between a received and stored value for parameter A.
- the data stream sent to F2 differs from that sent to F1 and comprises a data stream of parameter A as well as a data stream of parameter B.
- F2 only performs a comparison between received and stored values of parameters A and B and so these are the only data streams of parameters that are sent to it.
- F3 only performs a comparison between received and stored values of parameters B and C and so these are the only data streams of parameters that are sent to it.
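The routing performed by the filtering and splitting component 58 can be sketched in Python as follows. The mapping of computation units to parameters follows the text; the dictionary representation of a sample and the name `filter_and_split` are assumptions made for the example.

```python
# Parameters required by each computation unit, as described for Figure 5.
REQUIRED = {"F1": ("A",), "F2": ("A", "B"), "F3": ("B", "C")}

def filter_and_split(sample, required=REQUIRED):
    """Split one sample of the received stream into one filtered sample per unit.

    Each unit receives only the parameters it compares; anything else in the
    received sample is dropped.
    """
    return {
        unit: {name: sample[name] for name in names}
        for unit, names in required.items()
    }
```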
- Figure 6 shows the part of a CBR engine 60 that performs case comparison and retrieval for determining one or more similar cases 21 to a current situation according to embodiments.
- the cases 21 are stored in a case base comprising N cases 21 .
- Comparison agents for each of the N cases 21 are created according to the techniques described above.
- the CBR engine 60 comprises the plurality of comparison agents, arranged in parallel with each other, a filtering and splitting component 65 and a retrieval agent 61 .
- a received data stream from a monitored situation is input to the filtering and splitting component 65.
- the filtering and splitting component 65 divides the data stream into a plurality of N parallel data streams, with each of the plurality of parallel data streams being sent to a different comparison agent.
- the filtering and splitting component 65 also filters the received data stream so that each comparison agent only receives data streams comprising parameters that are required by the comparison agent.
- the retrieval agent 61 receives overall similarity scores from each of the comparison agents. On the basis of the received overall similarity scores, the retrieval agent 61 determines if there are any cases 21 in the case base with similar situation descriptions to the situation being monitored.
- One strategy that may be used by the retrieval agent 61 is to retrieve all cases 21 that have an overall similarity score that is above a pre-determined threshold level. Alternatively, the retrieval agent 61 may use the strategy of always retrieving the same predetermined number of cases 21 , the retrieved cases 21 having the highest overall similarity scores. Other retrieval strategies are also possible.
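The two retrieval strategies mentioned above can be sketched in Python. The function names and the representation of the scores as a dictionary from case identifier to overall similarity score are assumptions for illustration.

```python
import heapq

def retrieve_above_threshold(scores, threshold):
    """Return all case ids whose score exceeds the threshold, best first."""
    selected = [case_id for case_id, score in scores.items() if score > threshold]
    return sorted(selected, key=scores.get, reverse=True)

def retrieve_top_k(scores, k):
    """Always return the k case ids with the highest scores, best first."""
    return heapq.nlargest(k, scores, key=scores.get)
```

The threshold strategy may return no cases at all, whereas the fixed-number strategy always returns cases even when none of them is a close match.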
- Figure 7 shows the steps of a computer-implemented method of monitoring a situation by generating an overall similarity score between a received data stream and a case 21 according to an embodiment.
- The method starts in step 701.
- In step 703, a data stream is received comprising information on a monitored situation.
- In step 705, the method generates a plurality of parallel data streams, wherein each of the generated plurality of data streams is dependent on the received data stream.
- In step 707, the method generates, for each of the generated data streams, a similarity score for a feature of a case 21, wherein each similarity score is generated in dependence on a comparison between information in the generated data stream and stored information on the feature of a case 21, and each of the similarity scores is generated in dependence on a comparison with stored information on a different feature of the same case 21.
- In step 709, the method generates an overall similarity score between the received data stream and the case 21 in dependence on the generated similarity scores.
- In step 711, the method ends.
- Figure 8 shows the steps of a computer-implemented method of monitoring a situation by determining a set of one or more cases 21 by a CBR engine according to an embodiment. The method starts in step 801 .
- In step 803, a data stream is received comprising information on a monitored situation.
- In step 805, the method generates a plurality of parallel data streams from the received data stream.
- In step 807, the method generates, for each of the parallel data streams, an overall similarity score between the parallel data stream and one of a plurality of cases 21, wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and a different case 21.
- In step 809, the method determines a set of one or more cases 21 in dependence on the generated overall similarity scores.
- In step 811, the method ends.
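The steps of Figure 8 can be sketched end to end in Python. This is an illustrative simplification: the `ComparisonAgent` class, its fraction-of-matching-parameters score and the use of a thread pool to stand in for the parallel arrangement are all assumptions, not the implementation of the embodiments.

```python
from concurrent.futures import ThreadPoolExecutor

class ComparisonAgent:
    """Hypothetical agent holding the stored parameter values of one case."""

    def __init__(self, case_id, stored):
        self.case_id = case_id
        self.stored = stored  # parameter name -> stored value

    def score(self, stream):
        # Assumed overall similarity: fraction of parameters that match exactly.
        matches = sum(1 for name, value in self.stored.items() if stream.get(name) == value)
        return matches / len(self.stored)

def _score(pair):
    agent, stream = pair
    return agent.score(stream)

def run_engine(sample, agents, threshold):
    # Step 805: one parallel, filtered data stream per comparison agent.
    streams = [
        {name: sample[name] for name in agent.stored if name in sample}
        for agent in agents
    ]
    # Step 807: each agent scores its own stream; map() preserves agent order.
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(_score, zip(agents, streams)))
    # Step 809: determine the set of cases from the overall similarity scores.
    return [agent.case_id for agent, s in zip(agents, scores) if s >= threshold]
```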
- Figure 9 shows the steps of a computer-implemented method for creating components of a comparison agent for monitoring a situation according to an embodiment.
- In step 901, the method starts.
- In step 903, the method obtains one or more parameter values and comparison information for each of a plurality of features of a case 21, wherein the comparison information of each feature defines a configuration of a computation unit.
- In step 905, the method creates, for each of the plurality of features, a computation unit in dependence on the obtained one or more parameter values and the comparison information of the feature, such that the created computation unit is configured to generate an output in dependence on the obtained one or more parameter values and the comparison information of the feature.
- In step 907, the method ends.
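Creating computation units from stored parameter values and comparison information, as in Figure 9, can be sketched in Python. The registry of similarity measures, the dictionary layout of the comparison information and all names are hypothetical.

```python
def make_exact(config):
    """Similarity measure that needs no configuration."""
    return lambda stored, received: 1.0 if stored == received else 0.0

def make_linear(config):
    """Similarity measure configured with assumed 'min' and 'max' limits."""
    span = config["max"] - config["min"]
    return lambda stored, received: max(0.0, 1.0 - abs(stored - received) / span)

# Hypothetical registry mapping a measure name in the metadata to a factory.
SIMILARITY_MEASURES = {"exact": make_exact, "linear": make_linear}

class ComputationUnit:
    """Unit created with a stored value and a configured similarity measure."""

    def __init__(self, stored, measure):
        self.stored = stored
        self.measure = measure

    def __call__(self, received):
        return self.measure(self.stored, received)

def create_computation_units(features):
    """Steps 903 and 905: build one unit per feature from its comparison information."""
    units = {}
    for name, info in features.items():
        factory = SIMILARITY_MEASURES[info["similarity"]]
        units[name] = ComputationUnit(info["stored"], factory(info.get("config", {})))
    return units
```

Because each unit is built from the metadata of its own feature, changing the comparison information of a case and re-creating the units is enough to change how that case is compared.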
- a similarity score for each feature of a case 21 is generated by a computation unit that receives a data stream, or data streams, of parameters from the monitored situation.
- the similarity score for each feature is therefore generated extremely quickly and this allows the CBR techniques of embodiments to be applied in real-time.
- a further advantage is provided by filtering the received data stream so that only the required data streams of parameters are sent to the computation unit for each feature. This reduces the amount of data transmission within the CBR platform.
- each comparison agent can be flexibly configured. This allows a system operator to accurately control how each case 21 is compared with received information and how each overall similarity score is generated.
- Figures 4 and 5 show very simple comparison agents that require very few computation units.
- the comparison agent of an actual case 21 that describes, for example, a drilling operation may contain computation units that correspond to hundreds, or even thousands, of features and the comparison agent would be a lot larger and more complicated.
- a tree based design is particularly advantageous for such large comparison agents since inputting streams of data parameters directly into a parallel arrangement of computation units allows an overall similarity score to be generated quickly.
- the system design of Figure 6 advantageously allows the input of a plurality of parallel data streams directly into comparison agents. The process of generating a current case and the overhead of transmitting the entire current case within the system is therefore avoided. In addition, by filtering each of the data streams that are sent to each of the comparison agents, the data streams only comprise the data streams of parameters that are required for each case 21 . This reduces the amount of information that is communicated within the system. Furthermore, the parallel arrangement of comparison agents allows features of cases to be matched in parallel. This is not possible with, and is a lot faster than, all known CBR systems.
- the output from the retrieval agent 61 shown in Figure 6 is one or more similar cases 21 .
- a solution to a problem that has been identified from the received data stream from the monitored situation, can be generated and provided to a system operator.
- One way of easily generating a solution is to directly copy the solution provided in the case 21 with highest overall similarity score.
- More advanced solutions may be generated by adapting the solution(s) provided in the one or more retrieved cases 21 so as to generate a solution that is dependent on the solution(s) in one or more of the retrieved cases 21 .
- a system operator may also provide a completely new solution, not dependent on any of the solutions of the retrieved cases 21 , as the solution to a problem that has been determined from the received data stream.
- a new case 21 may be generated for which the situation description is dependent on the monitored situation determined from the received data stream and the advice is dependent on the generated solution.
- the determination to generate such a new case 21 , and to store the new case 21 in the case base, may be made by a system operator or performed automatically.
- Figure 10 shows a CBR cycle for generating a new case 21 , and storing the new case 21 in a case base 108, according to an embodiment.
- the CBR cycle is implemented with a retrieval agent 104, a reuse agent 105, a revise agent 106 and a retain agent 107.
- the process also requires a situation description agent, which is not explicitly shown in Figure 10.
- the retrieval agent 104 operates as described above and determines one or more similar cases 21 in dependence on the outputs of the comparison agents 101 , 102 and 103.
- the reuse agent 105 outputs information from cases 21 for display to a system operator.
- the output information may be copied from the solution of only one case 21 , or the output information may be a solution generated automatically by the reuse agent 105 in dependence on two or more solutions from retrieved cases 21 .
- the retrieval and revise agents may operate in substantially the same way as these agents operate in known CBR cycles.
- the purpose of the revise agent 106 is to ensure that the proposed solution is appropriate for the current monitored situation.
- the revise agent 106 can adapt the solution generated by the reuse agent 105 or provide a completely new solution, not dependent on the solution generated by the reuse agent 105.
- the generation of a solution by the revise agent 106 may be performed automatically, such as in response to automatic testing determining that adaption of the solution is required, or controlled, partially or fully, by a system operator.
- the revise agent 106 may perform in substantially the same way as the operation of a revise agent in a known CBR cycle.
- a situation description agent receives the data stream from the monitored situation and generates a current case, i.e. a file comprising a description of the situation.
- the situation description agent is located within the revise agent 106 and so the data stream is input directly to the revise agent 106.
- the situation description agent may be separate from the revise agent 106.
- the current case created by the situation description agent may have the same format as that used to store the problem description information of the cases 21 in the case base 108.
- the situation description agent operates independently of the comparison agents and may be configured parallel to the comparison agents.
- the situation description agent only generates the current case in response to receiving a request for the current case from the revise agent 106.
- the revise agent 106 only sends the request to the situation description agent when it has generated an adapted or new solution.
- the situation description agent automatically generates the current case without requiring a request to be received from the revise agent 106 and the generated current case is automatically sent to the revise agent 106.
- the revise agent 106 receives the current case from the situation description agent. The revise agent 106 then generates a new case 21 based upon the generated solution and the current case.
- the new case 21 preferably comprises metadata with comparison information, as described above for the other cases 21 in the case base 108.
- the retain agent 107 stores the new case 21 generated by the revise agent in the case base 108.
- the case 21 may be stored as, for example, an XML file or serialised code, as described above for the other cases 21 stored in the case base 108.
- the retain agent 107 also creates a new comparison agent for the case 21 and reconfigures the system so that the new comparison agent is supported and operates in the same way as that described above for the other comparison agents 101 , 102 and 103. Accordingly, an additional data stream of parameters is created and transmitted to the new comparison agent and the overall similarity score generated for the new case 21 is input to the retrieval agent.
- the retain agent 107 then creates a comparison agent for each case 21 stored in the case base 108 according to the above-described techniques.
- the computation units of the CBR engine are thereby created in dependence on the comparison information for each feature of each case 21 .
- the process of creating each agent may also be referred to as instantiation.
- information on the most relevant cases 21 to a monitored situation is preferably displayed to a system operator using a case radar, as described in WO2010/106014A2, which is incorporated herein by reference.
- Figure 11 shows the steps of a computer-implemented method for creating a new case 21 using a CBR cycle, the method performed by a CBR system for monitoring a situation, according to an embodiment.
- In step 1101, the method starts.
- In step 1103, the method determines a set of one or more cases 21 from a plurality of cases 21 in dependence on a received data stream comprising information on a monitored situation, wherein each case 21 comprises information describing a problem and information describing a solution to the problem and the process of determining the set of one or more cases 21 is performed without comparing the description of the problems of any of the plurality of cases 21 with a previously generated current case comprising information describing the monitored situation.
- In step 1105, the method generates information describing a solution in dependence on information obtained from the determined set of one or more cases 21 and/or in dependence on information received from a user interface.
- In step 1107, the method generates a current case comprising information describing the monitored situation in dependence on the received data stream.
- In step 1109, the method generates a new case 21 in dependence on the generated information describing a solution and the generated current case.
- the CBR cycle allows proposed solutions to be provided to a system operator, with the proposed solutions being obtained from original cases 21 for a specific situation, from generic cases 21 , or from modified cases 21 .
- a current case comprising a description of a situation is first created and the cases in the case base are searched with the current case.
- the already created current case is combined with an adapted or new solution.
- the CBR cycle according to embodiments is faster and/or more computationally efficient than known CBR cycles as the process of generating and sending a current case to all comparison agents is not required before the content of the case base 108 is searched.
- the situation description agent may operate in parallel with the comparison agents so that the current case is generated at the same time as the content of the case base 108 is searched.
- the situation description agent may only create a current case in response to an instruction from the revise agent 106 or an operator that the current case is required. This latter approach is more computationally efficient since the current case is only created when necessary.
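The on-request variant, in which the current case is only built when the revise agent needs it, can be sketched in Python. The class, the `revise` function and the dictionary layout of a case are hypothetical and serve only to show that no current case is constructed when no adapted or new solution exists.

```python
class SituationDescriptionAgent:
    """Hypothetical agent that tracks the stream but builds a case only on request."""

    def __init__(self):
        self._latest = {}
        self.cases_built = 0  # counts how often a current case was generated

    def observe(self, sample):
        # Cheap bookkeeping on every sample; no case is created here.
        self._latest.update(sample)

    def current_case(self):
        self.cases_built += 1
        return {"situation": dict(self._latest)}

def revise(adapted_solution, describer):
    """Request the current case only when an adapted or new solution exists."""
    if adapted_solution is None:
        return None  # retrieved solution reused as is; no new case is needed
    new_case = describer.current_case()
    new_case["solution"] = adapted_solution
    return new_case
```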
- the high level architecture of a CBR system comprising a CBR platform according to embodiments is shown in Figure 12.
- the CBR platform is designed to be scalable, flexible and adaptable so that it can be used in many different applications and is able to be integrated with a wide variety of data sources and third party systems.
- the CBR platform provides real-time decision support in dependence on received streamed data.
- the CBR system comprises the following components:
- CBR platform 1201
- Persistence database (with Persistence API)
- Data interpretation agents (with Agent API)
- the CBR platform 1201 comprises a system for scaling the deployment of data analysis components in a CBR application.
- the CBR platform 1201 is designed to be able to support very high data throughput and seamless scaling of an application by adding processing nodes, such as computer servers, and distributing computation across nodes in run-time.
- Components of the CBR platform 1201 may include:
- the CBR engine 1206 performs the CBR techniques of any of the embodiments of the invention described throughout the present document to generate overall similarity scores.
- the CBR engine 1206 receives one or more data streams, which describe the current status of a monitored situation, from a unified data cache 1205.
- the CBR engine 1206 also receives case information from a case library 1207 and compares the case information to that of the monitored situation. For each case 21 that a received data stream is compared to, an overall similarity score is generated.
- the overall similarity score may be in the form of a percentage match metric.
- the CBR engine 1206 therefore generates results that provide information on relevant cases 21 .
- the results of the CBR engine 1206 are output to the unified data cache 1205.
- the unified data cache 1205 is able to receive information from, and transmit information to, any of the APIs.
- the unified data cache 1205 may store data for use by data interpretation agents, which may perform pattern recognition, and may store the results of the data interpretation agents.
- the unified data cache 1205 also processes data for inputting to the CBR engine 1206.
- the results of case comparisons, by the CBR engine 1206, are stored in the unified data cache 1205.
- the case comparison results stored in the unified data cache 1205 may be output through the Application API and provided to users, such as system operators and data analysts.
- the data in the unified data cache 1205 may also be output to the persistence database through the persistence API and stored therein.
- the case base may be, for example, a single database, a plurality of databases distributed across a plurality of hardware devices, a directory on a server or a plurality of directories on one or more servers.
- Data interpretation agents: although shown in Figure 12 as being external to the CBR platform 1201, there may also be data interpretation agents within the CBR platform 1201. The data interpretation agents may also provide an executable input to the CBR engine 1206.
- APIs are provided for data input to, and output from, the CBR platform 1201 . These allow persistent data storage and also provide tools for data analysts and platform administrators.
- the APIs may be part of, and integral with, the CBR platform 1201 or they may be separate from the CBR platform 1201 .
- the UIs, data sources, persistence database and external data interpretation agents do not form part of the CBR platform 1201 and may be custom devices for a specific application.
- the data source API 1203 enables integration with a variety of data sources via data connectors, typically implemented as short programs, that connect data streams, that represent information on a monitored situation, from one or more data sources to the CBR platform 1201 .
- the live and static data connectors receive information from respective live and static data sources and map the information to a unified data format.
- the data source API 1203 is provided so that the data connectors can be customised for different implementations. Default connectors may be used but the API also enables the implementation of custom data connectors developed specifically for the application that the CBR platform 1201 is required to support.
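Data connectors of the kind described above could look like the short Python functions below. The unified record layout, the live message keys (`tag`, `val`, `ts`) and the CSV column order are invented for this example; the actual unified data format is not specified in the text.

```python
def unified_record(source, parameter, value, timestamp):
    """Assumed unified data format: one flat record per parameter value."""
    return {
        "source": source,
        "parameter": parameter,
        "value": float(value),
        "timestamp": float(timestamp),
    }

def live_connector(message):
    """Map a hypothetical live sensor message with keys 'tag', 'val' and 'ts'."""
    return unified_record("live", message["tag"], message["val"], message["ts"])

def static_connector(csv_line):
    """Map one 'timestamp,parameter,value' line from a hypothetical static file."""
    timestamp, parameter, value = csv_line.strip().split(",")
    return unified_record("static", parameter, value, timestamp)
```

A custom connector for a new data source would only need to produce records of the same shape.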
- a persistence database for permanently storing some or all of the data that is input to and/or generated within the CBR platform 1201 .
- any new cases 21 generated by a revise agent may be stored in the persistence database.
- the persistence database may be that of a third party or a default database provided with the CBR platform 1201 . It can be implemented according to any known storage solution, such as one or more databases or directories.
- the stored data in the persistence database can be used to replay situations in order to validate data interpretation agents and cases 21 . Additional advantages of having such a persistence database are that it can be used to store the current data within the CBR platform 1201 to thereby allow fast system recovery if there is a system failure. Such an external database also facilitates the handling of big data.
- the persistence database is supported through the persistence API 1204.
- the persistence database can be integrated with the CBR platform 1201 with short programs that translate between the CBR platform 1201 and the data storage solution, that may be a custom data storage solution.
- Each application may have data interpretation agents internal and/or external to the CBR platform 1201 .
- Tasks that may be performed by the data interpretation agents include pre-processing data and filtering out noise before the data is fed into the CBR engine 1206.
- the data interpretation agents may mine the unified data in order to identify patterns in it using pattern recognition methods.
- the pattern recognition methods may be standard or customised.
- the agents are typically highly modular and while some are application specific, others can be reused to identify similar patterns or perform similar noise filtering across a plurality of different applications. For example, an agent may use statistical methods to recognize when there is a sudden increase in a time series of data.
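A data interpretation agent that recognises a sudden increase in a time series could be sketched in Python as follows. The rolling z-score test, the window size and the threshold are assumptions chosen for illustration; the text does not specify which statistical method is used.

```python
from collections import deque
from statistics import mean, stdev

class SuddenIncreaseAgent:
    """Flags a value that lies far above the recent history of the series."""

    def __init__(self, window=20, z_threshold=3.0):
        self.history = deque(maxlen=window)  # rolling window of recent values
        self.z_threshold = z_threshold

    def update(self, value):
        """Return True if value is a sudden increase relative to the window."""
        flagged = False
        if len(self.history) >= 2:
            mu = mean(self.history)
            sigma = stdev(self.history)
            # A constant history (sigma == 0) is never flagged in this sketch.
            if sigma > 0 and (value - mu) / sigma > self.z_threshold:
                flagged = True
        self.history.append(value)
        return flagged
```

The Boolean output could be passed on as event information, or the z-score itself could be emitted as a stream of numerical values for use in case comparison.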
- An example of a more complex agent is one that may analyse trends in a set of parameters to detect certain patterns, such as when a few of the parameters have erratic values relative to the others.
- the data interpretation agents may therefore generate information for detecting specific events, or just single numerical values or streams of numerical values, for use in any of the case comparison processes.
- the data interpretation agents communicate with the CBR platform 1201 through the agent API 1208.
- the agent API 1208 is shown within the CBR platform 1201 in Figure 12 but may alternatively be on the edge of the CBR platform 1201 , in the same way that the other APIs in Figure 12 are shown.
- the agent API 1208 also enables third party developers to create custom agents.
- the agent API 1208 provides the CBR engine 1206 with information for detecting specific events, such as 'overpull' or 'tight spot' events during a drilling operation.
- the agent API 1208 may also provide the CBR engine 1206 with parameter information, for use by the comparison agents in generating similarity scores, and this information may be in the form of single parameter values or one or more data streams of parameter values. That is to say, the CBR engine 1206 may treat a data stream received from the agent API 1208 as if it were a data stream within the received data stream from a monitored situation and use the data stream to generate the overall similarity score for a case 21 .
- Default or custom UIs of applications can communicate with the platform through the Application API 1202.
- Data analysts may be provided with UIs that enable them to view raw or analysed data going through the CBR platform 1201 , view case data, add custom data interpretation agents, test data interpretation agents and case matching, capture cases 21 and configure cases 21 and the case library.
- the results of the CBR engine 1206 may be displayed to a system operator using a case radar as described in WO2010/106014A2, which is incorporated herein by reference.
- the radar provides a highly intuitive visualization that allows a system operator to easily identify relevant cases 21 .
- the CBR platform 1201 is highly adaptable and can be easily integrated into a wide range of applications.
- the CBR platform 1201, in particular the CBR engine 1206 within the CBR platform 1201, is highly scalable and can therefore be used in applications that require a larger case base to be searched and/or large cases 21 within the case base to be searched.
- the CBR engine 1206 can easily adapt to different sizes of case base.
- a case base may increase in size if new cases 21 are added, or decrease if some of the existing cases 21 in the case base are deemed not relevant to the current situation and do not need to be used in comparisons.
- a further advantage is that the CBR platform 1201 can be implemented by a distributed computing system. This increases the scalability, flexibility and adaptability of the CBR platform 1201 .
- Applications that the CBR platform 1201 is suitable for range from the oil and gas industry, in which the cases 21 are typically very large and the case comparisons computationally demanding, to the financial services industry, in which the case comparisons are typically less computationally demanding but the case base is a lot larger.
- a comparison agent is created for each case 21 in the case base.
- An advantage of this approach is that the retrieval agent determines one or more similar cases 21 in dependence on all of the case information in the case base.
- An alternative approach is to first determine a subset of potentially relevant cases 21 from the case base and only generate comparison agents for the subset of cases 21 . This requires the additional process of filtering the cases 21 in the case base so that the subset only includes cases 21 that are potentially relevant.
- the determination of one or more cases 21 is faster and more computationally efficient since fewer comparison agents are required.
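The alternative approach of pre-selecting a subset of cases can be sketched in Python. The selection criterion used here, keeping only cases whose required parameters are all present in the received data stream, is an assumption made for the example; the text does not state how potential relevance is determined.

```python
def candidate_cases(case_base, available_parameters):
    """Return ids of cases that could be compared with the available parameters.

    case_base maps a case id to a dict with a hypothetical 'parameters' list.
    Comparison agents would then be created for the returned subset only.
    """
    available = set(available_parameters)
    return [
        case_id
        for case_id, case in case_base.items()
        if set(case["parameters"]) <= available
    ]
```

The trade-off described above is visible here: fewer comparison agents are created, but a case excluded by the filter can never be retrieved, however similar it might have been.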
- the case 21 shown in Figure 2 has separate dynamic and static data. This separation is not essential and the dynamic and static data may be fully or partially intermingled.
- a filter is provided that filters a received data stream into different data streams of parameters. This filtering is not essential and the data stream could be applied unfiltered to each computation unit. This would increase the amount of communicated information within each comparison agent but avoid the requirement of having a filter at the input to the comparison agent.
- embodiments of the CBR platform are particularly powerful tools for the energy, finance and healthcare industries.
- Embodiments are in no way restricted to these applications and the CBR engine may be used in any industry.
- the CBR engine can provide a powerful tool in the automobile industry, the fish farming industry and for the control of energy grids.
- Embodiments are particularly effective for applications, in any domain, in which humans are required to make decisions based on the information stored in real-time data streams.
- These computer program instructions may be provided to a processor of a general purpose computer(s) or computer system(s), special purpose computer(s) or computer system(s), other programmable data processing apparatus, or the like, to produce a machine, such that the instructions, executed via the processor of the computer (computer system, programmable data processing apparatus, or the like), create mechanisms for implementing the functions specified within the blocks of the flowcharts and/or block diagrams and/or within corresponding portions of the present disclosure.
- These computer program instructions may also be stored in a computer-readable memory (or medium) and direct a computer (computer system, programmable data processing apparatus, or the like) to function in a particular manner, such that the instructions stored in the computer readable memory or medium produce an article of manufacture including instruction means which implement the functions specified in the blocks of the flowchart(s) and/or block diagram(s) and/or within corresponding portions of the present disclosure.
- the computer-readable medium may include, but is not limited to, a non-transitory computer-readable medium, such as a tangible electronic, magnetic, optical, electromagnetic, infrared, and/or semiconductor system, device, and/or other apparatus.
- the non-transitory computer-readable medium includes a tangible medium such as a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a compact disc read-only memory (CD-ROM), and/or some other tangible optical and/or magnetic storage device.
- the computer-readable medium may be transitory, such as, for example, a propagation signal including computer-executable program code portions embodied therein.
- the computer program instructions may also be loaded onto a computer (computer system, other programmable data processing apparatus, or the like) to cause a series of operational steps to be performed on the computer (computer system, other programmable data processing apparatus, or the like) to produce a computer-implemented method or process such that the instructions executed on the computer (computer system, other programmable data processing apparatus, or the like) provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block(s) and/or within corresponding portions of the present disclosure.
- the above described methods and/or processes could be performed by a program executing in a programmable, general purpose computer or computer system.
- Alternative embodiments are implemented in a dedicated or special-purpose computer or computer system in which some or all of the operations, functions, steps, or acts are performed using hardwired logic or firmware.
- “unit” and “engine” may be understood to refer to computing software, firmware, hardware, and/or various combinations thereof.
- some or all of the steps are performed automatically by a processor.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
Disclosed herein is a computer-implemented method of monitoring a situation by generating an overall similarity score between a received data stream and a case (21) in case-based reasoning, CBR, the method comprising: receiving (703) a data stream comprising information on a monitored situation; generating (705) a plurality of parallel data streams, wherein each of the generated plurality of data streams is dependent on the received data stream; generating (707), for each of the generated data streams, a similarity score for a feature of a case (21), wherein each similarity score is generated in dependence on a comparison between information in the generated data stream and stored information on the feature of a case (21), and each of the similarity scores is generated in dependence on a comparison with stored information on a different feature of the same case (21); and generating (709) an overall similarity score between the received data stream and the case (21) in dependence on the generated similarity scores.
Description
CASE-BASED REASONING
Field of the Invention
The present invention relates to case-based reasoning. More particularly, embodiments of the invention provide efficient, effective, adaptable and scalable case-based reasoning techniques that can be applied in a broad range of industries, such as the finance, healthcare and energy industries.
Background of the Invention
Predictive analytics is a tool for making and supporting decisions. Predictive analytics involves analysing historical data in order to predict future events and thereby automatically propose or take actions.
The majority of known predictive analytics systems are offline or batch processing systems that do not operate in real-time. The data used in the predictive analytics is separate from that used in operational systems and the data may be hours, days, weeks or even months old before analytics algorithms are applied to it. These techniques are not appropriate for applications in which it is necessary for the predictive analytics to be performed in real-time. Such applications may be, for example, the monitoring of an oil well drilling operation or an operation by a physician, in which it is necessary for problems to be detected, and proposals to be generated, very quickly.
A known technique for performing real-time predictive analytics is complex event processing, CEP. CEP systems generate alerts based on previously created rules for monitoring data. Such rule-based systems are inherently limited by the difficulty in defining and maintaining the rules. While a rule may be applied to data in near real-time, the analytics required to create the rule is slow and not real-time. Moreover, the created rules are inflexible and incapable of adapting to changes in the data. The analysis needed to create and update rules is therefore undertaken offline. Accordingly, rule-based systems tend to be used only in stable and predictable environments in which it is possible to define a set of rules for all circumstances and for automatic actions to be taken.
Rule-based techniques are not appropriate for applying predictive analytics in fast-changing environments. Furthermore, there are scenarios in which it is not appropriate for automatic actions to be taken. If a critical or complicated decision is to be made, for example by an oil well operator during a drilling operation or by a physician during surgery, it is neither feasible nor desirable to take humans out of the decision making process.
Case-based reasoning, CBR, is a real-time predictive analytics technique that does not experience the above-described problems of rule-based techniques.
CBR systems detect and propose solutions to problems using information obtained from a plurality of cases stored in a case base. Each of the stored cases comprises a description of a problem and a description of a solution. The cases are typically generated manually based on actual experienced problems and devised solutions by system operators. Advantageously, CBR systems are able to provide system operators with detailed and reasoned solutions to complicated problems.
The application of predictive analytics to scenarios increasingly requires the use and handling of big data. Big data refers to a collection of data sets so large and complex that they become difficult to process using traditional data processing applications. For example, such big data could be encountered when applying predictive analytics within the financial services industry as a vast quantity of financial information is continuously generated and transferred between computing systems all over the world. A problem with known CBR systems is that they are not designed for supporting and providing real-time operation on big data.
Summary of the Invention
According to a first aspect of the invention, there is provided a computer-implemented method of monitoring a situation by generating an overall similarity score between a received data stream and a case in case-based reasoning, CBR, the method comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams, wherein each of the generated plurality of data streams is dependent on the received data stream; generating, for each of the generated data streams, a similarity score for a feature of a case, wherein each similarity score is generated in dependence on a comparison between information in the generated data stream and stored information on the feature of a case, thereby each of the similarity scores is generated in dependence on a comparison with stored information on a different feature of the same case; and generating an overall similarity score between the received data stream and the case in dependence on the generated similarity scores.
Preferably, the method further comprises generating each similarity score in dependence on comparison information of the corresponding feature of the case. Preferably the received data stream comprises one or more streams of parameters and each of the generated data streams comprises one or more streams of parameters in dependence on the received data stream.
Preferably, the method further comprises generating the similarity score for each feature in dependence on a comparison of the value of a parameter in a generated data stream of parameters for the feature with the stored value of the parameter for the feature.
Preferably, the comparison information comprises weight information and the method further comprises weighting the value of the parameter in the generated data stream for a feature and/or the stored value of the parameter for the feature; and generating the similarity score of the feature in dependence on the weighted value(s).
Preferably, the comparison information comprises a comparison function that specifies computations for generating the similarity score and the method further comprises generating the similarity score of each feature in dependence on the comparison function of the feature.
Preferably, the method further comprises generating one or more aggregate similarity scores in dependence on one or more of the generated similarity scores; and generating the overall similarity score in dependence on the aggregate similarity score(s).
Preferably, the method further comprises determining the overall similarity score in real-time.
Preferably, the method is performed by a comparison agent.
Preferably, the comparison agent is configured in dependence on the comparison information for each feature and including the stored information for each feature. Preferably, the case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
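The per-feature scoring and its combination into an overall similarity score can be illustrated with a minimal sketch. The names `Feature`, `linear_similarity` and `overall_similarity`, the linear comparison function and the use of a weighted mean are all assumptions made for illustration; the disclosure does not prescribe a particular comparison function or combination rule, and here the weights are applied to the feature scores rather than to the parameter values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def linear_similarity(observed: float, stored: float, span: float) -> float:
    """Hypothetical comparison function: 1.0 for equal values, falling linearly to 0.0."""
    if span <= 0:
        raise ValueError("span must be positive")
    return max(0.0, 1.0 - abs(observed - stored) / span)


@dataclass
class Feature:
    parameter: str        # name of the monitored parameter
    stored_value: float   # value stored in the case for this feature
    weight: float         # weight information taken from the comparison information
    compare: Callable[[float, float], float]  # comparison function of the feature


def overall_similarity(sample: Dict[str, float], features: List[Feature]) -> float:
    """Combine the per-feature similarity scores as a weighted mean (an assumption)."""
    total_weight = sum(f.weight for f in features)
    if total_weight == 0:
        raise ValueError("at least one feature needs a non-zero weight")
    weighted = sum(
        f.weight * f.compare(sample[f.parameter], f.stored_value) for f in features
    )
    return weighted / total_weight


features = [
    Feature("pressure", 100.0, 3.0, lambda o, s: linear_similarity(o, s, span=50.0)),
    Feature("temperature", 80.0, 1.0, lambda o, s: linear_similarity(o, s, span=40.0)),
]
score = overall_similarity({"pressure": 100.0, "temperature": 60.0}, features)
```

In this sketch the pressure feature matches exactly and the temperature feature scores 0.5, so the more heavily weighted pressure feature dominates the overall score.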
According to a second aspect of the invention, there is provided a computer-implemented method of monitoring a situation by determining a set of one or more cases in case-based reasoning, CBR, the method comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams from the received data stream; generating, according to any of the above-described methods, an overall similarity score between each of the plurality of parallel data streams and a case, wherein each overall similarity score is generated from a comparison between one of the plurality of parallel data streams and a different one of a plurality of cases; and determining a set of one or more cases in dependence on the generated overall similarity scores.
According to a third aspect of the invention, there is provided a comparison agent for monitoring a situation by generating an overall similarity score between a received data stream, comprising information on a monitored situation, and a case in case-based reasoning, CBR, wherein the comparison agent is configured to perform any of the above described methods.
According to a fourth aspect of the invention, there is provided a case-based reasoning, CBR, engine for monitoring a situation by determining a set of one or more cases, wherein the CBR engine is configured to perform any of the above-described methods.
According to a fifth aspect of the invention, there is provided a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
According to a sixth aspect of the invention, there is provided a computer-implemented method of monitoring a situation by determining a set of one or more cases in case-based reasoning, CBR, the method comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams from the received data stream; generating, for each of the parallel data streams, an overall similarity score between the parallel data stream and one of a plurality of cases, wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and a different case; and determining a set of one or more cases in dependence on the generated overall similarity scores.
Preferably, each of the overall similarity scores is generated by one of a plurality of comparison agents and each of the comparison agents receives one of the plurality of data streams.
Preferably, the method further comprises each comparison agent generating an overall similarity score by: receiving one of the plurality of data streams; generating a further plurality of parallel data streams, wherein each of the generated further plurality of parallel data streams is dependent on the received one of the plurality of data streams; generating, for each of said generated further plurality of parallel data streams, a similarity score in dependence on a comparison between information on a feature in the generated further data stream and stored information on the feature of a case, wherein each of the similarity scores of said generated further plurality of data streams is generated in dependence on a comparison with stored information on a different feature of the same case; and generating an overall similarity score between the received one of the plurality of data streams and the case in dependence on the generated similarity scores.
Preferably, the method further comprises determining to include a case in the set of one or more cases if the overall similarity score for the case is above a predetermined threshold level. Preferably, the determined set of cases has a predetermined number of two or more cases, and the method comprises determining the predetermined number of cases for including in the set as the cases with the highest overall similarity scores. Preferably, the method further comprises displaying information dependent on each of the determined one or more cases.
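The two selection rules described above, inclusion above a predetermined threshold and inclusion of a predetermined number of highest-scoring cases, can be sketched as follows. The function name `select_cases`, the case identifiers and the score values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def select_cases(
    scores: Dict[str, float],
    threshold: Optional[float] = None,
    top_n: Optional[int] = None,
) -> List[str]:
    """Return case identifiers ordered by descending overall similarity score.

    If a threshold is given, only cases scoring above it are kept.
    If top_n is given, at most that many of the highest-scoring cases are kept.
    """
    ranked = sorted(scores, key=lambda case_id: scores[case_id], reverse=True)
    if threshold is not None:
        ranked = [case_id for case_id in ranked if scores[case_id] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


scores = {"case_a": 0.91, "case_b": 0.42, "case_c": 0.77}
above_threshold = select_cases(scores, threshold=0.7)
best_two = select_cases(scores, top_n=2)
```

Both rules may be combined in one call, in which case the threshold is applied first and the result is then truncated to the predetermined number.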
Preferably, each case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
According to a seventh aspect of the invention, there is provided a case-based reasoning, CBR, engine for monitoring a situation by determining a set of one or more cases, wherein the CBR engine is configured to perform any of the above-described methods.
According to an eighth aspect of the invention, there is provided a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
According to a ninth aspect of the invention, there is provided a case-based reasoning, CBR, platform for monitoring a situation, the CBR platform comprising: a CBR engine; a unified data cache; a case library; an agent application programming interface, API; a data source API; an application API; and a persistence API; wherein: the data source API is configured to provide an interface between live and/or static data sources, external from the CBR platform, and the unified data cache, wherein the live and/or static data sources are sources of information on a monitored situation; the persistence API is configured to provide an interface between a persistence database, external from the CBR platform, and the unified data cache and the case library; the application API is configured to provide an interface between the systems of data analysts, platform administrators and/or operators, external from the CBR platform, and the unified data cache and the case library; the CBR engine is configured to receive information for use in generating CBR results from the agent API, to receive information on a plurality of cases from the case library, to receive a data stream comprising information on the monitored situation from the unified data cache and to generate CBR results that are both dependent on a comparison of a plurality of parallel data streams, dependent on the received data stream, with information on the plurality of cases and also dependent on the information received from the agent API; the unified data cache is configured to receive data from the data source API, to send a data stream comprising information on the monitored situation to the CBR engine and to receive the results of the CBR engine from the CBR engine; the agent API is configured to receive information from data interpretation agents external from the CBR platform and to send the received information to the CBR engine in order to provide the CBR engine with information for use in generating the CBR results; and the case library is configured to send information on a plurality of cases to the CBR engine.
Preferably, the case library comprises a case base.
Preferably, cases are stored in the case base in XML or serialised code format.
Preferably, the case base is any of a single database, a plurality of distributed databases, a directory on a server or a plurality of directories on one or more servers. Preferably, each case comprises metadata that includes comparison information for the case.
Preferably, each case is structured according to a computing graph. Preferably, the computing graph is a tree structure or directed acyclic graph.
Preferably, the CBR engine is configured to perform the method of: receiving said data stream comprising information on the monitored situation from the unified data cache; generating a plurality of parallel data streams from said received data stream comprising information on the monitored situation; generating, for each of the parallel data streams, an overall similarity score between the parallel data stream and information on one of the plurality of cases received from the case library, wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and information on a different case; and determining the CBR results as a set of one or more cases in dependence on the generated overall similarity scores.
Preferably, each case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
According to a tenth aspect of the invention, there is provided a case-based reasoning apparatus comprising a processor and a memory having stored therein instructions that, when executed by the processor, cause the apparatus to perform a method for creating, in programmable hardware, components of a comparison agent for monitoring a situation, the method comprising: obtaining one or more parameter values and comparison information for each of a plurality of features of a case, wherein the comparison information of each feature defines a configuration of a computation unit; and creating, for each of the plurality of features, a computation unit in dependence on the obtained one or more parameter values and the comparison information of the feature, such that the created computation unit is configured to generate an output in dependence on the obtained one or more parameter values and the comparison information of the feature.
Preferably, the method further comprises creating each of the computation units such that each computation unit is configured to receive one or more data streams of parameters and to generate an output in dependence on the values of the one or more received parameters, wherein the one or more data streams comprise information on a monitored situation.
Preferably, the method further comprises: obtaining comparison information for one or more aggregate computation units, wherein the comparison information of each aggregate computation unit defines the configuration of the aggregate computation unit; and creating, in dependence on the obtained comparison information, one or more aggregate computation units, wherein each aggregate computation unit is configured to receive an output generated by at least one of the computation units. Preferably, the comparison information for each computation unit includes information on how the obtained one or more parameter values and/or received parameter values are to be weighted by the computation unit, and the method comprises creating each computation unit such that the computation unit is configured to weight the obtained one or more parameter values and/or received parameter values in dependence of the comparison information.
Preferably, the comparison information for each aggregate computation unit includes information on how the aggregate computation unit is to weight the values of its inputs, and the method comprises creating each aggregate computation unit such that the aggregate computation unit is configured to weight its inputs in dependence on the comparison information.
Preferably, the comparison information for each computation unit includes a comparison function that specifies the computations that the computation unit is required to perform to generate an output, and the method comprises creating each computation unit such that the computation unit is configured to apply computations with the obtained one or more parameter values and received parameter values in dependence on the comparison function.
Preferably, the configuration information for each aggregate computation unit includes a comparison function that specifies the computations that the aggregate computation unit is required to perform to generate an output, and the method comprises creating each aggregate computation unit such that the aggregate computation unit is configured to apply computations to its inputs in dependence on the comparison function.
Preferably, the comparison information for each computation unit and the comparison information of each aggregate computation unit are obtained by obtaining metadata of the case.
Preferably, the computation units and aggregate computation units are configured in dependence on a computing graph for the case. Preferably, the computing graph is a tree structure or directed acyclic graph.
Preferably, the case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
According to an eleventh aspect of the invention, there is provided a comparison agent for monitoring a situation, the comparison agent created according to any of the above-described methods.
According to a twelfth aspect of the invention, there is provided a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
According to a thirteenth aspect of the invention, there is provided a case-based reasoning apparatus comprising a processor and a memory having stored therein instructions that, when executed by the processor, cause the apparatus to perform a method of creating a new case in a case-based reasoning, CBR, system for monitoring a situation, the method comprising: determining a set of one or more cases from a plurality of cases in dependence on a received data stream comprising information on a monitored situation, wherein each case comprises information describing a problem and information describing a solution to the problem and the process of determining the set of one or more cases is performed without comparing the description of the problems of any of the plurality of cases with a previously generated current case comprising information describing the monitored situation; generating information describing a solution in dependence on information obtained from the determined set of one or more cases and/or in dependence on information received from a user interface; generating a current case comprising information describing the monitored situation in dependence on the received data stream; and generating a new case in dependence on the generated information describing a solution and the generated current case.
Preferably, the method further comprises generating the current case after generating the information describing a solution.
Preferably, the method further comprises generating a request for the current case; and generating the current case in response to receiving the request.
Preferably, the method further comprises automatically generating the current case.
Preferably, the plurality of cases are stored in a case base and the method further comprises storing the generated new case in the case base. Preferably, the process of determining a set of one or more cases includes creating a comparison agent for each of the plurality of cases, and the method comprises generating a new comparison agent for the generated new case.
Preferably, each case comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
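The ordering of the steps in the thirteenth aspect, in which cases are retrieved directly from the data stream and the current case is generated only afterwards, can be sketched as follows. The `Case` class, the `create_new_case` function, the distance-based score and the advice strings are hypothetical; the sketch also retrieves only the single best case, whereas the described method may determine a set of several cases.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Case:
    situation: Dict[str, float]  # information describing the problem
    advice: str                  # information describing the solution


def create_new_case(
    stream: List[Dict[str, float]],
    case_base: List[Case],
    score: Callable[[Dict[str, float], Case], float],
    operator_advice: Optional[str] = None,
) -> Case:
    latest = stream[-1]
    # 1. Retrieve: score the stored cases directly against the streamed data.
    #    No current case has been generated at this point.
    best = max(case_base, key=lambda case: score(latest, case))
    # 2. Generate the solution from the retrieved case and/or the user interface.
    advice = operator_advice if operator_advice is not None else best.advice
    # 3. Only now generate the current case from the received data stream.
    current_situation = dict(latest)
    # 4. Combine current case and solution into a new case and store it.
    new_case = Case(current_situation, advice)
    case_base.append(new_case)
    return new_case


def negative_distance(sample: Dict[str, float], case: Case) -> float:
    """Hypothetical score: larger (closer to zero) means more similar."""
    return -sum(abs(sample[name] - value) for name, value in case.situation.items())


case_base = [Case({"pressure": 100.0}, "advice A"), Case({"pressure": 200.0}, "advice B")]
stream = [{"pressure": 110.0}, {"pressure": 190.0}]
new_case = create_new_case(stream, case_base, negative_distance)
```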
According to a fourteenth aspect of the invention, there is provided a case-based reasoning, CBR, system for monitoring a situation, the CBR system configured to perform any of the above-described methods.
According to a fifteenth aspect of the invention, there is provided a computer program that, when executed by a computing device, controls the computing device to perform any of the above-described methods.
Brief Description of the Drawings
Embodiments of the present invention will be described, by way of example only, with reference to the accompanying drawings, in which:
Figure 1 provides an overview of a situation in which a CBR system is used;
Figure 2 is a schematic diagram of a case for use in embodiments;
Figure 3 is an XML representation of a case according to an embodiment;
Figure 4 is a block diagram of a comparison agent according to an embodiment;
Figure 5 is a block diagram of a comparison agent according to an embodiment;
Figure 6 is a block diagram of part of a CBR engine according to an embodiment;
Figure 7 shows the steps of a method for generating an overall similarity score between a received data stream and a case according to an embodiment;
Figure 8 shows the steps of a method for determining a set of one or more cases by a CBR engine according to an embodiment;
Figure 9 shows the steps of a method for creating components of a comparison agent according to an embodiment;
Figure 10 is a block diagram showing a CBR cycle according to an embodiment;
Figure 11 shows the steps of a method for creating a new case using a CBR cycle according to an embodiment; and
Figure 12 shows a CBR platform according to an embodiment.
Detailed Description
Embodiments of the invention provide CBR techniques that are advantageous over known predictive analytics techniques. Embodiments allow CBR systems to be realised that are fast and scalable, as required for real-time operation on big data. Moreover, the CBR systems according to embodiments are adaptable and can be used for many applications. CBR systems according to embodiments are particularly effective in the energy, finance and healthcare industries.
In the oil industry, the CBR techniques of embodiments identify and prevent drilling problems and thereby greatly reduce both costs and drilling time. The techniques are especially advantageous in complex drilling operations and multi-well operations as they are able to manage high volumes of data and to quickly recognise trends and indicators. The CBR techniques are also applicable to the energy industry in general and are not limited to the oil industry. For example, they may be used to detect and prevent problems in the electrical power generation industries.
In the finance industry, it is highly desirable for financial services organisations to have effective systems for predicting and detecting volatility due to any problems that may be caused by IT and service outages, capacity and risk issues, compliance pressures and trading errors. These problems can result in very large financial losses and serious damage to reputations and customer confidence. Although there are already financial services organisations that have IT systems and data for predicting business-compromising events, the CBR techniques according to embodiments provide an analytics tool on top of the existing data-capture technology to generate proposals for preventing problems from occurring and solving any problems that have occurred. The CBR techniques also provide better assurance for staying in compliance with regulatory requirements and protecting organisations from third-party mistakes. Risk and compliance officers can look at actual past events (default rates, VAR, etc.) to measure the risk of using similar strategies in the future. Organisations can also detect anomalous events occurring within the industry to protect themselves from other organisations' mistakes.
In the healthcare industry, CBR techniques according to embodiments enable hospitals to improve the quality of patient care and reduce costs. The CBR techniques apply real-time analysis to identify and manage impacting events by providing physicians with evidence-based decision support.
There are many ways of incorporating a CBR system into the control of a situation. For example, the type of implementation as shown in Figure 1 may be used in which the monitored situation (in this example a drilling operation), the data analysis server comprising the CBR system and the operations centre are all remote from each other and communicate over a network, such as a local network or the internet. Alternatively, the data analysis server and/or operations centre may be local to the situation.
An example of how CBR could be implemented in a hospital is for patients being operated on to all have their temperature, blood pressure and other characteristics continuously monitored. The monitored parameters could be transmitted to a data analysis server comprising a CBR system within the hospital. For each patient, any detected problems and proposed solutions are then displayed, in the operating theatre, to the surgeon operating on the patient. There may also be an operations centre within the hospital in which the results from the CBR system for all of the operations that are occurring at that time are displayed so that all of the operations can be monitored together.
An overview of how the CBR techniques according to embodiments are advantageous over known CBR techniques is provided below.
In order to compare information obtained from a monitored situation to stored information in a case, the CBR techniques according to embodiments create a comparison agent for each of the cases in a case base. A comparison agent is an instantiation of a case containing information describing how to compare a previous situation described by the case with a current situation. Each comparison agent comprises computational units that hold all the information required for comparing a feature of a stored case with data streams describing the current situation. The computational units are created in dependence on one or more values of parameters of a stored case feature, a function defining how to determine a similarity as well as any other information required for comparison, such as weighting information or minimum and maximum values. A plurality of parallel data streams are generated from a received data stream that comprises information on monitored parameters of a situation. The plurality of data streams are then streamed into the computational units of the comparison agents. Each computational unit then compares a subset of the parameters in the streamed data with their stored parameters, that correspond to a feature of a case, to generate a similarity score between the received data stream and the feature. All of the features of a case have a corresponding computational unit that computes the similarity between one or more parameters describing a monitored situation and the feature that describes a past situation. An overall similarity score between a monitored situation and the case is then calculated in dependence on the similarity score calculated for each feature. Advantageously, received data is streamed directly into the computational units of comparison agents. This allows a very fast comparison of features to be performed. 
Embodiments differ from, and are faster than, all known CBR techniques as these require the additional step of first generating a file comprising information on monitored parameters of a situation, referred to herein as a current case, and then comparing the current case with stored case information.
In addition, in embodiments all of the comparison agents operate in parallel with each other. This is a lot faster than known CBR systems that sequentially compare a current case with each case of a case base.
A further advantage is provided by the way in which each comparison agent generates an overall similarity score. The comparison agents are configured not only with stored parameter information for a case, but also with weight information of parameters and functions that describe how the comparison agent should compare received and stored information. This allows a more sophisticated and tuneable comparison technique to be applied and the generated similarity score is therefore more accurate.
In addition, the CBR cycle for generating new or revised solutions to problems is faster and more efficient than known CBR cycles.
CBR techniques according to embodiments are performed by a CBR engine supported by a CBR platform. The CBR platform is able to integrate with other existing systems and can therefore be used in many applications. The CBR platform, and in particular the CBR engine within the platform, are also highly scalable. The CBR techniques according to embodiments are described in more detail below.
Figure 2 is a schematic diagram showing how the information within a case 21 for use in embodiments may be structured.
Each case 21 comprises a description of a problem, shown as a situation description, and a description of a solution, shown as advice. Stored information within each of these sections may be further categorised into sub-sections, such as dynamic and static data for the situation description. Within each sub-section, the stored information may be further categorised into further sub-sections. Although not shown in Figure 2, there may be a number of further categorisations of the stored information into smaller and smaller sub-sections. The smallest sub-sections of stored information for the situation description are features of the case 21. Each problem that a case 21 solves is represented by a set of features with each feature comprising stored values of a parameter. Values of the same parameter can also be obtained from a monitored situation. Each feature may be combined with other features to form an aggregate feature. The features that are combined to form the aggregate feature are the child features of the aggregate feature. Each aggregate feature may itself be a child feature of another aggregate feature. The structure of the situation description of each case 21 is defined by a case description graph. The case description graph may be a directed acyclic graph (DAG), a tree or another type of structure. The nodes of the graph denote the features of the case 21 while the edges, or paths between the nodes, correspond to the relationships between the nodes. That is to say, a leaf node in a tree structure corresponds to a feature that does not depend on any other feature and the other nodes within the tree structure correspond to aggregate features.
Features can have any data type. For example, the data type can be just a number with a unit or a symbol, or it can be more complex, such as a set, a vector or a sequence of numbers or symbols. Features can even be natural language text. There is no restriction on the format or type of the features describing a case 21 . The comparison between a stored case 21 in a case base and an input data stream from a monitored situation is performed by comparing the parameter information stored within the features on a feature by feature basis. Aggregate features have at least one input that is an output from another feature comparison. Although parameters within the received data stream may also be
directly input to an aggregate feature, aggregate features typically have only outputs from other feature comparisons as inputs. The output of a feature comparison is a similarity score while the aggregated similarity score for all features of a case 21 is an overall similarity score for the comparison between the stored case 21 and the received data stream.
For every feature, including the aggregate features, comparison information is defined. The comparison information may include weights, comparison functions and any other configuration information, such as max and min values for numeric similarity measures or range limits for sequences. A comparison function is a function that measures the similarity of one or more received and stored parameters to thereby generate a similarity score that is a measure of the similarity between a feature of a case 21 and information from a monitored situation. The comparison function for a feature may use any of the other information in the comparison information, such as weights of parameter values, when generating a similarity score for the feature.
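As a minimal sketch of what such comparison functions might look like, the Python below assumes a linear numeric similarity bounded by configured minimum and maximum values, an exact-match similarity for symbols, and a weighted mean for aggregate features. The function names and the specific formulas are illustrative assumptions and are not taken from the embodiments.

```python
from typing import Sequence


def numeric_similarity(received: float, stored: float,
                       min_value: float, max_value: float) -> float:
    """Linear similarity in [0, 1]: 1.0 when the values are equal, 0.0 when
    they are a full configured range apart (assumed form)."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    # Clamp both values into the configured range before comparing.
    r = min(max(received, min_value), max_value)
    s = min(max(stored, min_value), max_value)
    return 1.0 - abs(r - s) / (max_value - min_value)


def symbol_similarity(received: str, stored: str) -> float:
    """Exact-match similarity for symbolic parameters."""
    return 1.0 if received == stored else 0.0


def weighted_similarity(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of child similarity scores, as an aggregate feature might use."""
    if len(scores) != len(weights) or not scores:
        raise ValueError("scores and weights must be non-empty and of equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total
```

Any other mathematical function of the parameters could be substituted, since the text places no restriction on the form of the comparison function.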
All of the features that receive parameters in the data stream may comprise weights that are applied to the stored parameter information and/or the parameters in the data stream. Each aggregate feature may also comprise weights that are applied to each of its inputs. The weights allow the contributions of each of the features to be controlled and therefore the relative importance of each feature to be included in the information describing a situation. Local weights can be distinct for each feature and are individual for each case 21 . Local weights need to be stored on a case-by-case basis. Global weights apply to different cases 21 in the same manner and need only be stored centrally. Global weights become local weights once they are customised for individual cases 21 . In addition to weighting the one or more parameters that describe features, every feature may also have a comparison function that defines how the feature is to be compared against input parameters from the data stream. The comparison function for a feature can be any mathematical function that generates a result in dependence on the parameters. Each comparison function
can be individualised to each feature. Features may be provided with a default comparison function or a comparison function that has been determined by a system operator. All of the cases 21 according to embodiments comprise metadata for storing the comparison information for all of the features of each case 21 . Metadata can also comprise further information describing a case 21 , such as units and textual descriptions of the features to help system operators understand each feature. The above-described case 21 structure according to embodiments differs from the case structure used in known CBR systems that do not store metadata in the cases themselves. Advantageously, each case 21 can be modelled individually. The original compiler of a case 21 has full control over which features are chosen to describe the case 21 , how stored and measured information is compared for each of the features, and how an overall similarity score is generated for the case 21 . If required, the metadata for each case 21 can also be modified at a later stage by a system operator in order to change how the case 21 is compared with monitored data. A system operator can therefore tune the comparison of the case 21 .
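The case structure described above can be pictured with a small data model. The following Python sketch is one possible representation, assuming a tree-shaped case description graph; the class and field names (`Feature`, `Case`, `comparison_info` and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Feature:
    """A node of the case description graph."""
    name: str
    # Parameter name -> stored value (empty for purely aggregate features).
    stored_values: Dict[str, Any] = field(default_factory=dict)
    # Metadata: weights, comparison function name, limits, units, description.
    comparison_info: Dict[str, Any] = field(default_factory=dict)
    # Child features; non-empty only for aggregate features.
    children: List["Feature"] = field(default_factory=list)

    @property
    def is_aggregate(self) -> bool:
        return bool(self.children)


@dataclass
class Case:
    case_id: str
    situation: Feature  # root of the case description graph
    advice: str         # textual description of the solution


def leaf_features(feature: Feature) -> List[Feature]:
    """Collect the features that do not depend on any other feature.
    Assumes a tree; in a general DAG a shared leaf would be listed once per path."""
    if not feature.is_aggregate:
        return [feature]
    leaves: List[Feature] = []
    for child in feature.children:
        leaves.extend(leaf_features(child))
    return leaves
```

Because the comparison information travels with each feature, two cases can weight or compare the same parameter differently, which is the individual modelling the paragraph above describes.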
With regard to the case solution, if it is not required for the solution to be automatically modified by a computer, then this can be a textual description of how to solve the problem. Otherwise, the case solution needs to be represented in a format that can be understood by a computer. This advantageously allows a solution to be automatically devised that is based on a plurality of similar cases 21 to the current situation. How to represent a case solution so that it can be understood by computers is known in the art.
Each case 21 can be stored as an XML file, such as the example shown in Figure 3. There are a number of alternative forms in which each case 21 can be stored, such as serialised code.
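For illustration only, the sketch below parses a case from XML using Python's standard `xml.etree.ElementTree` module. The element and attribute names, and the example values, are hypothetical; they are not taken from Figure 3, which is not reproduced here.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, Optional

# Hypothetical schema and values, for illustration only.
CASE_XML = """
<case id="example-1">
  <situation>
    <feature name="voltage" parameter="voltage" value="230.0"
             similarity="linear" min="0" max="400" weight="2.0"/>
    <feature name="status" parameter="status" value="RUNNING"
             similarity="exact" weight="1.0"/>
  </situation>
  <advice>Example advice text.</advice>
</case>
"""


def _opt_float(text: Optional[str]) -> Optional[float]:
    """Convert an optional attribute to float, keeping None when absent."""
    return float(text) if text is not None else None


def load_case(xml_text: str) -> Dict[str, Any]:
    """Read stored values and comparison metadata for each feature of a case."""
    root = ET.fromstring(xml_text.strip())
    features = []
    for el in root.findall("./situation/feature"):
        features.append({
            "name": el.get("name"),
            "parameter": el.get("parameter"),
            "value": el.get("value"),
            "similarity": el.get("similarity"),
            "weight": float(el.get("weight", "1.0")),
            "min": _opt_float(el.get("min")),
            "max": _opt_float(el.get("max")),
        })
    return {"id": root.get("id"), "features": features,
            "advice": root.findtext("advice")}
```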
All of the cases 21 are stored in a case base. The case base may be, for example, a single database, a plurality of databases distributed across a plurality of hardware devices, a directory on a server or a plurality of directories on one or more servers.
In order to compare the cases 21 in a case base with monitored information on a situation from a received data stream, a comparison agent is created for each case 21 in the case base. Each comparison agent is created in dependence on the case 21 description graph for the situation description of the case 21 .
An example of a comparison agent 41 for comparing parameters in a received data stream with a case 21 is shown in Figure 4. A computation unit has been created for each feature of the case 21 . The relative arrangement of the computation units has been defined by the case description graph for the case 21 and the comparison information of the case 21 has been used to configure how each computation unit operates.
Computation Node 1 is a computation unit that has been configured to generate a comparison result between a received and a stored value of a voltage. In addition to being created with the stored value of the voltage, Computation Node 1 has been configured to compare the stored and received value of the voltage according to the comparison information of the feature that Computation Node 1 corresponds to. The comparison information includes a Similarity Measure, that is a mathematical function that describes how a result is generated, as well as a Configuration, that specifies limits on the voltage values. Computation Node 2 has been configured to generate a comparison result between a received and stored value of a status. It has been created with a stored value of the status and has been configured to compare the stored and received value of the status according to the comparison information of the feature that Computation Node 2 corresponds to.
Computation Node 3 has been created for an aggregate feature. Computation Node 3 receives as inputs the outputs from Computation Nodes 1 and 2. It has been configured to weight and combine its inputs according to the comparison
information of the aggregate feature that it corresponds to in order to generate an overall comparison result, i.e. overall similarity score.
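The arrangement of Figure 4 can be sketched as two leaf computation units feeding one aggregate unit. The Python below is a simplified illustration; the stored voltage of 230.0, the limits of 0 to 400, the stored status "RUNNING" and the 2:1 weighting are all assumed values, and the class names are hypothetical.

```python
from typing import Any, Callable, Mapping, Sequence


def linear_similarity(low: float, high: float) -> Callable[[float, float], float]:
    """Build a similarity measure configured with limits on the values."""
    span = high - low

    def measure(received: float, stored: float) -> float:
        r = min(max(received, low), high)  # clamp the received value to the limits
        return 1.0 - abs(r - stored) / span

    return measure


def exact_similarity(received: Any, stored: Any) -> float:
    return 1.0 if received == stored else 0.0


class ComputationUnit:
    """Holds the stored value and comparison function for one feature."""

    def __init__(self, parameter: str, stored: Any,
                 similarity: Callable[[Any, Any], float]) -> None:
        self.parameter = parameter
        self.stored = stored
        self.similarity = similarity

    def score(self, sample: Mapping[str, Any]) -> float:
        return self.similarity(sample[self.parameter], self.stored)


class AggregateUnit:
    """Weights and combines the outputs of its child units."""

    def __init__(self, children: Sequence[Any], weights: Sequence[float]) -> None:
        self.children = list(children)
        self.weights = list(weights)

    def score(self, sample: Mapping[str, Any]) -> float:
        total = sum(self.weights)
        return sum(w * c.score(sample)
                   for w, c in zip(self.weights, self.children)) / total


def build_figure4_agent() -> AggregateUnit:
    node1 = ComputationUnit("voltage", 230.0, linear_similarity(0.0, 400.0))
    node2 = ComputationUnit("status", "RUNNING", exact_similarity)
    return AggregateUnit([node1, node2], [2.0, 1.0])  # Computation Node 3
```

Calling `build_figure4_agent().score({"voltage": 230.0, "status": "RUNNING"})` evaluates the whole graph for one received sample and returns the overall similarity score.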
Figure 5 shows another example of a comparison agent 51 . The comparison agent 51 comprises computation units for features F1 , F2 and F3 as well as for aggregate features AF1 , AF2 and AF3. The comparison agent 51 also comprises a filtering and splitting component 58. A received data stream comprises data streams of parameters A, B and C. The received data stream is input to the filtering and splitting component 58 that generates a plurality of parallel data streams that are output to features F1 , F2 and F3. The filtering ensures that each of the parallel data streams comprises only the parameters that are required by the computation unit that the data stream corresponds to. Accordingly, a data stream comprising only parameter A is sent to F1 as the computation unit F1 only performs a comparison between a received and stored value for parameter A. The data stream sent to F2 differs from that sent to F1 and comprises a data stream of parameter A as well as a data stream of parameter B. F2 only performs a comparison between received and stored values of parameters A and B and so these are the only data streams of parameters that are sent to it. Similarly, F3 only performs a comparison between received and stored values of parameters B and C and so these are the only data streams of parameters that are sent to it.
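The filtering and splitting behaviour described for Figure 5 can be sketched as follows. The mapping of parameters A, B and C to units F1, F2 and F3 follows the text; the function names and the dictionary-based sample format are assumptions made for this sketch.

```python
from typing import Any, Dict, Iterable, Iterator, Mapping, Sequence

# Parameters required by each computation unit, as described for Figure 5.
REQUIRED: Dict[str, Sequence[str]] = {
    "F1": ("A",),
    "F2": ("A", "B"),
    "F3": ("B", "C"),
}


def filter_and_split(sample: Mapping[str, Any],
                     required: Mapping[str, Sequence[str]] = REQUIRED
                     ) -> Dict[str, Dict[str, Any]]:
    """Return one filtered sub-sample per computation unit, containing only
    the parameters that the unit needs."""
    return {unit: {p: sample[p] for p in params if p in sample}
            for unit, params in required.items()}


def stream_split(stream: Iterable[Mapping[str, Any]],
                 required: Mapping[str, Sequence[str]] = REQUIRED
                 ) -> Iterator[Dict[str, Dict[str, Any]]]:
    """Apply the filter lazily to every sample of a received data stream."""
    for sample in stream:
        yield filter_and_split(sample, required)
```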
Figure 6 shows the part of a CBR engine 60 that performs case comparison and retrieval for determining one or more similar cases 21 to a current situation according to embodiments. The cases 21 are stored in a case base comprising N cases 21 . Comparison agents for each of the N cases 21 are created according to the techniques described above. The CBR engine 60 comprises the plurality of comparison agents,
arranged in parallel with each other, a filtering and splitting component 65 and a retrieval agent 61 . A received data stream from a monitored situation is input to the filtering and splitting component 65. The filtering and splitting component 65 divides the data stream into a plurality of N parallel data streams, with each of the plurality of parallel data streams being sent to a different comparison agent. The filtering and splitting component 65 also filters the received data stream so that each comparison
agent only receives data streams comprising parameters that are required by the comparison agent.
The retrieval agent 61 receives overall similarity scores from each of the comparison agents. On the basis of the received overall similarity scores, the retrieval agent 61 determines if there are any cases 21 in the case base with similar situation descriptions to the situation being monitored. One strategy that may be used by the retrieval agent 61 is to retrieve all cases 21 that have an overall similarity score that is above a pre-determined threshold level. Alternatively, the retrieval agent 61 may use the strategy of always retrieving the same predetermined number of cases 21 , the retrieved cases 21 having the highest overall similarity scores. Other retrieval strategies are also possible.
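The two retrieval strategies mentioned above can be sketched in a few lines of Python. The function names are hypothetical, and the scores are assumed to arrive as a mapping from case identifier to overall similarity score.

```python
import heapq
from typing import List, Mapping


def retrieve_above_threshold(scores: Mapping[str, float],
                             threshold: float) -> List[str]:
    """Retrieve every case whose overall similarity score exceeds the threshold,
    ordered from most to least similar."""
    hits = [(case_id, s) for case_id, s in scores.items() if s > threshold]
    hits.sort(key=lambda item: item[1], reverse=True)
    return [case_id for case_id, _ in hits]


def retrieve_top_k(scores: Mapping[str, float], k: int) -> List[str]:
    """Always retrieve the k cases with the highest overall similarity scores."""
    best = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [case_id for case_id, _ in best]
```

The threshold strategy may return no cases at all when nothing is similar enough, whereas the fixed-number strategy always returns cases provided the case base is not empty.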
Figure 7 shows the steps of a computer-implemented method of monitoring a situation by generating an overall similarity score between a received data stream and a case 21 according to an embodiment.
The method starts in step 701 . In step 703, a data stream is received comprising information on a monitored situation.
In step 705, the method generates a plurality of parallel data streams, wherein each of the generated plurality of data streams is dependent on the received data stream.
In step 707, the method generates, for each of the generated data streams, a similarity score for a feature of a case 21 , wherein each similarity score is generated in dependence on a comparison between information in the generated data stream and stored information on the feature of a case 21 , and each of the similarity scores is generated in dependence on a comparison with stored information on a different feature of the same case 21 .
In step 709, the method generates an overall similarity score between the received data stream and the case 21 in dependence on the generated similarity scores.
In step 711, the method ends.
Figure 8 shows the steps of a computer-implemented method of monitoring a situation by determining a set of one or more cases 21 by a CBR engine according to an embodiment. The method starts in step 801 .
In step 803, a data stream is received comprising information on a monitored situation. In step 805, the method generates a plurality of parallel data streams from the received data stream.
In step 807, the method generates, for each of the parallel data streams, an overall similarity score between the parallel data stream and one of a plurality of cases 21 , wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and a different case 21 .
In step 809, the method determines a set of one or more cases 21 in dependence on the generated overall similarity scores.
In step 811, the method ends.
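Steps 803 to 809 can be sketched as below, assuming each comparison agent is a small object holding stored parameter values and that similarity is a simple clamped linear measure. The class, the `span` configuration and the use of a thread pool are assumptions of this sketch; in CPython a thread pool illustrates the parallel structure rather than delivering CPU parallelism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Mapping, Sequence, Tuple


class ComparisonAgent:
    """Simplified agent: one stored value and one value range per parameter."""

    def __init__(self, case_id: str, stored: Mapping[str, float],
                 span: Mapping[str, float]) -> None:
        self.case_id = case_id
        self.stored = dict(stored)
        self.span = dict(span)

    def overall_similarity(self, sample: Mapping[str, float]) -> float:
        # Per-feature scores, then an unweighted mean as the overall score.
        scores = [max(0.0, 1.0 - abs(sample[p] - v) / self.span[p])
                  for p, v in self.stored.items()]
        return sum(scores) / len(scores)


def compare_all(agents: Sequence[ComparisonAgent], sample: Mapping[str, float],
                max_workers: int = 4) -> Dict[str, float]:
    """Send each agent a filtered copy of the sample and collect overall scores."""

    def run(agent: ComparisonAgent) -> Tuple[str, float]:
        filtered = {p: sample[p] for p in agent.stored}  # only required parameters
        return agent.case_id, agent.overall_similarity(filtered)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(run, agents))
```

The returned mapping of case identifiers to overall similarity scores is what a retrieval step would consume in step 809.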
Figure 9 shows the steps of a computer-implemented method for creating components of a comparison agent for monitoring a situation according to an embodiment.
In step 901 , the method starts.
In step 903, the method obtains one or more parameter values and comparison information for each of a plurality of features of a case 21 , wherein the comparison information of each feature defines a configuration of a computation unit.
In step 905, the method creates, for each of the plurality of features, a computation unit in dependence on the obtained one or more parameter values and the comparison information of the feature, such that the created computation unit is configured to generate an output in dependence on the obtained one or more parameter values and the comparison information of the feature.
In step 907, the method ends.
The above-described embodiments of the invention provide significant advantages over known CBR systems.
Advantageously, a similarity score for each feature of a case 21 is generated by a computation unit that receives a data stream, or data streams, of parameters from the monitored situation. The similarity score for each feature is therefore generated extremely quickly and this allows the CBR techniques of embodiments to be applied in real-time.
A further advantage is provided by filtering the received data stream so that only the required data streams of parameters are sent to the computation unit for each feature. This reduces the amount of data transmission within the CBR platform.
The arrangement of the computation units in Figures 4 and 5 has been defined by the case description graph for the case 21 that the comparison agent corresponds to. In Figure 5, for example, it is clear that computation units F1, F2 and F3 correspond to leaf nodes of a tree structure and that AF3 corresponds to the root node of the tree.
Advantageously, since the comparison agent for each case 21 is built according to a case description graph such as a tree, each comparison agent can be flexibly configured. This allows a system operator to accurately control how each case 21 is compared with received information and how each overall similarity score is generated.
Moreover, Figures 4 and 5 show very simple comparison agents that require very few computation units. The comparison agent of an actual case 21 that describes, for example, a drilling operation may contain computation units that correspond to hundreds, or even thousands, of features and the comparison agent would be a lot larger and more complicated. A tree based design is particularly advantageous for such large comparison agents since inputting streams of data parameters directly into a parallel arrangement of computation units allows an overall similarity score to be generated quickly.
The above-described techniques for generating an overall similarity score between stored information for a case 21 and received information from a monitored system are a completely different approach to generating an overall similarity score from that used in known CBR systems.
The design of all known CBR systems has been based on the concept that to find a similar case in a case base to a situation, it is first necessary to create a current case, i.e. a description of the monitored situation, and to compare the current case with descriptions of problems in each of the cases in a case base. Known CBR systems have therefore always performed the time-consuming process of building a current case that describes the current situation. A further problem with creating such a current case is that the same current case is compared with each case. This is inefficient since a comparison agent may be provided with parameter information that it does not require. In particular, in a distributed system, sending the current case to all case comparison agents results in a lot of information being unnecessarily transported within the system. This increases the network traffic and slows down the system.
The system design of Figure 6 advantageously allows the input of a plurality of parallel data streams directly into comparison agents. The process of generating a current case and the overhead of transmitting the entire current case within the system is therefore avoided. In addition, by filtering each of the data streams that are sent to each of the comparison agents, the data streams only comprise the data streams of parameters that are required for each case 21 . This reduces the amount of information that is communicated within the system. Furthermore, the parallel arrangement of comparison agents allows features of cases to be matched in parallel. This is not possible with, and is a lot faster than, all known CBR systems.
The output from the retrieval agent 61 shown in Figure 6 is one or more similar cases 21 . From these retrieved cases 21 , a solution to a problem, that has been identified from the received data stream from the monitored situation, can be generated and provided to a system operator. One way of easily generating a solution is to directly copy the solution provided in the case 21 with highest overall similarity score. More advanced solutions may be generated by adapting the solution(s) provided in the one or more retrieved cases 21 so as to generate a solution that is dependent on the solution(s) in one or more of the retrieved cases 21 . A system operator may also provide a completely new solution, not dependent on any of the solutions of the retrieved cases 21 , as the solution to a problem that has been determined from the received data stream.
For each problem that is determined from a received data stream and for which a solution has been generated, by any of the above-described techniques, a new case 21 may be generated for which the situation description is dependent on the monitored situation determined from the received data stream and the advice is dependent on the generated solution. The determination to generate such a new case 21 , and to store the new case 21 in the case base, may be made by a system operator or performed automatically.
Figure 10 shows a CBR cycle for generating a new case 21, and storing the new case 21 in a case base 108, according to an embodiment. The CBR cycle is implemented with a retrieval agent 104, a reuse agent 105, a revise agent 106 and a retain agent 107. The process also requires a situation description agent, which is not explicitly shown in Figure 10.
The retrieval agent 104 operates as described above and determines one or more similar cases 21 in dependence on the outputs of the comparison agents 101 , 102 and 103.
The reuse agent 105 outputs information from cases 21 for display to a system operator. The output information may be copied from the solution of only one case 21 , or the output information may be a solution generated automatically by the reuse agent 105 in dependence on two or more solutions from retrieved cases 21 .
The retrieval and revise agents may operate in substantially the same way as these agents operate in known CBR cycles.
The purpose of the revise agent 106 is to ensure that the proposed solution is appropriate for the current monitored situation. The revise agent 106 can adapt the solution generated by the reuse agent 105 or provide a completely new solution, not dependent on the solution generated by the reuse agent 105. The generation of a solution by the revise agent 106 may be performed automatically, such as in response to automatic testing determining that adaption of the solution is required, or controlled, partially or fully, by a system operator. To the extent that a solution for a case 21 is generated, the revise agent 106 may perform in substantially the same way as the operation of a revise agent in a known CBR cycle.
In embodiments, a situation description agent, not explicitly shown in Figure 10, receives the data stream from the monitored situation and generates a current case, i.e. a file comprising a description of the situation. In Figure 10, the situation description agent is located within the revise agent 106 and so the data stream is input directly to the revise agent 106. In alternative implementations, the situation description agent may be separate from the revise agent 106.
The current case created by the situation description agent may have the same format as that used to store the problem description information for cases 21 in the case base 108. The situation description agent operates independently of the comparison agents and may be configured in parallel with the comparison agents. In an embodiment, the situation description agent only generates the current case in response to receiving a request for the current case from the revise agent 106. The revise agent 106 only sends the request to the situation description agent when it has generated an adapted or new solution. In an alternative embodiment, the situation description agent automatically generates the current case without requiring a request to be received from the revise agent 106 and the generated current case is automatically sent to the revise agent 106.
The revise agent 106 receives the current case from the situation description agent. The revise agent 106 then generates a new case 21 based upon the generated solution and the current case. The new case 21 preferably comprises metadata with comparison information, as described above for the other cases 21 in the case base 108. The retain agent 107 stores the new case 21 generated by the revise agent in the case base 108. The case 21 may be stored as, for example, an XML file or serialised code, as described above for the other cases 21 stored in the case base 108. The retain agent 107 also creates a new comparison agent for the case 21 and reconfigures the system so that the new comparison agent is supported and operates in the same way as that described above for the other comparison agents 101 , 102 and 103. Accordingly, an additional data stream of parameters is created and transmitted to the new comparison agent and the overall similarity score generated for the new case 21 is input to the retrieval agent.
To create a CBR engine, all of the CBR agents, except the comparison agents, are first created. The retain agent 107 then creates a comparison agent for each case 21 stored in the case base 108 according to the above-described techniques. The computation units of the CBR engine are thereby created in
dependence on the comparison information for each feature of each case 21. The process of creating each agent may also be referred to as instantiation.
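The instantiation step can be sketched as a factory that turns each feature's stored value and comparison information into a computation unit, and each case into a comparison agent. The dictionary layout of a case, the `"linear"` and `"exact"` similarity names and the weighted-mean aggregation are assumptions made for this sketch.

```python
from typing import Any, Callable, Dict, Iterable, Mapping

Sample = Mapping[str, Any]
Unit = Callable[[Sample], float]


def make_unit(spec: Mapping[str, Any]) -> Unit:
    """Create a computation unit from a stored value and comparison information."""
    param, stored = spec["parameter"], spec["value"]
    kind = spec.get("similarity", "exact")
    if kind == "linear":
        span = spec["max"] - spec["min"]
        return lambda sample: max(0.0, 1.0 - abs(sample[param] - stored) / span)
    if kind == "exact":
        return lambda sample: 1.0 if sample[param] == stored else 0.0
    raise ValueError(f"unknown similarity measure: {kind!r}")


def make_agent(case: Mapping[str, Any]) -> Unit:
    """Create a comparison agent: one unit per feature plus a weighted aggregate."""
    units = [make_unit(f) for f in case["features"]]
    weights = [f.get("weight", 1.0) for f in case["features"]]
    total = sum(weights)

    def agent(sample: Sample) -> float:
        return sum(w * u(sample) for w, u in zip(weights, units)) / total

    return agent


def instantiate_engine(case_base: Iterable[Mapping[str, Any]]) -> Dict[str, Unit]:
    """Instantiate one comparison agent for every case in the case base."""
    return {case["id"]: make_agent(case) for case in case_base}
```

Storing a new case then amounts to calling `make_agent` once more and adding the result to the mapping, which mirrors the retain agent creating a comparison agent for each newly stored case.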
In operation, information on the most relevant cases 21 to a monitored situation is preferably displayed to a system operator using a case radar, as described in WO2010/106014A2, which is incorporated herein by reference.
Figure 11 shows the steps of a computer-implemented method for creating a new case 21 using a CBR cycle, the method performed by a CBR system for monitoring a situation, according to an embodiment.
In step 1101, the method starts.
In step 1103, the method determines a set of one or more cases 21 from a plurality of cases 21 in dependence on a received data stream comprising information on a monitored situation, wherein each case 21 comprises information describing a problem and information describing a solution to the problem and the process of determining the set of one or more cases 21 is performed without comparing the description of the problems of any of the plurality of cases 21 with a previously generated current case comprising information describing the monitored situation.
In step 1105, the method generates information describing a solution in dependence on information obtained from the determined set of one or more cases 21 and/or in dependence on information received from a user interface.
In step 1107, the method generates a current case comprising information describing the monitored situation in dependence on the received data stream. In step 1109, the method generates a new case 21 in dependence on the generated information describing a solution and the generated current case.
In step 1111, the method ends.
Advantageously, the CBR cycle allows proposed solutions to be provided to a system operator, with the proposed solutions being obtained from original cases 21 for a specific situation, from generic cases 21 , or from modified cases 21 . In known CBR cycles, a current case comprising a description of a situation is first created and the cases in the case base are searched with the current case. To build a new case, the already created current case is combined with an adapted or new solution. The CBR cycle according to embodiments is faster and/or more computationally efficient than known CBR cycles as the process of generating and sending a current case to all comparison agents is not required before the content of the case base 108 is searched. The situation description agent may operate in parallel with the comparison agents so that the current case is generated at the same time as the content of the case base 108 is searched. Alternatively, the situation description agent may only create a current case in response to an instruction from the revise agent 106 or an operator that the current case is required. This latter approach is more computationally efficient since the current case is only created when necessary.
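The on-demand variant of the cycle can be sketched as follows: retrieval scores the cases directly from the received sample, and a current case is only built if a revised or new solution has to be retained. All names are hypothetical, agents are assumed to be plain callables returning an overall similarity score, and the creation of a comparison agent for the retained case is omitted for brevity.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Tuple


class SituationDescriptionAgent:
    """Buffers received parameters; builds a current case only on request."""

    def __init__(self) -> None:
        self._latest: Dict[str, Any] = {}
        self.builds = 0  # counts how often a current case was generated

    def observe(self, sample: Mapping[str, Any]) -> None:
        self._latest.update(sample)

    def current_case(self) -> Dict[str, Any]:
        self.builds += 1
        return dict(self._latest)


def cbr_cycle(sample: Mapping[str, Any],
              agents: Mapping[str, Callable[[Mapping[str, Any]], float]],
              case_base: Dict[str, Dict[str, Any]],
              describer: SituationDescriptionAgent,
              threshold: float,
              revise: Optional[Callable[[Optional[str]], Optional[str]]] = None
              ) -> Tuple[Optional[str], Optional[str]]:
    describer.observe(sample)
    # Retrieve: score every case directly from the sample; no current case is built.
    scores = {cid: agent(sample) for cid, agent in agents.items()}
    retrieved = sorted((cid for cid, s in scores.items() if s > threshold),
                       key=lambda cid: scores[cid], reverse=True)
    # Reuse: copy the advice of the most similar case, if any.
    proposed = case_base[retrieved[0]]["advice"] if retrieved else None
    # Revise: an optional callback returns an adapted or new solution, or None.
    revised = revise(proposed) if revise is not None else None
    if revised is None:
        return proposed, None
    # Only now is the current case requested from the situation description agent.
    new_case = {"situation": describer.current_case(), "advice": revised}
    # Retain: store the new case in the case base.
    new_id = f"case-{len(case_base) + 1}"
    case_base[new_id] = new_case
    return revised, new_id
```

The `builds` counter makes the efficiency argument visible: it stays at zero for every cycle in which the proposed solution is accepted unchanged.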
The high level architecture of a CBR system comprising a CBR platform according to embodiments is shown in Figure 12. The CBR platform is designed to be scalable, flexible and adaptable so that it can be used in many different applications and is able to be integrated with a wide variety of data sources and third party systems. The CBR platform provides real-time decision support in dependence on received streamed data.
As shown in Figure 12, the CBR system comprises the following components:
- CBR platform 1201
- Data sources (with Data source application programming interface, API)
- Persistence database (with Persistence API)
- Data interpretation agents (with Agent API)
- User interfaces, UIs (with Application API)
The CBR platform 1201 comprises a system for scaling the deployment of data analysis components in a CBR application. The CBR platform 1201 is designed to be able to support very high data throughput and seamless scaling of an application by adding processing nodes, such as computer servers, and distributing computation across nodes in run-time.
Components of the CBR platform 1201 may include:
- A CBR engine 1206. This is a high performance, real-time case-based reasoning engine. The CBR engine 1206 performs the CBR techniques of any of the embodiments of the invention described throughout the present document to generate overall similarity scores. The CBR engine 1206 receives one or more data streams, which describe the current status of a monitored situation, from a unified data cache 1205. The CBR engine 1206 also receives case information from a case library 1207 and compares the case information to that of the monitored situation. For each case 21 that a received data stream is compared to, an overall similarity score is generated. The overall similarity score may be in the form of a percentage match metric. The CBR engine 1206 therefore generates results that provide information on relevant cases 21 . The results of the CBR engine 1206 are output to the unified data cache 1205.
- A unified data cache 1205. The unified data cache 1205 is able to receive information from, and transmit information to, any of the APIs. The unified data cache 1205 may store data for use by data interpretation agents, which may perform pattern recognition, and may store the results of the data interpretation agents. The unified data cache 1205 also processes data for inputting to the CBR engine 1206. The results of case comparisons, by the CBR engine 1206, are stored in the unified data cache 1205. The case comparison results stored in the unified data cache 1205 may be output through the Application API and provided to users, such as system operators and data analysts. The data in the unified data cache 1205 may also be output to the persistence database through the persistence API and stored therein.
- A case library 1207. This is a case base as described in the above embodiments. The case base may be, for example, a single database, a plurality of databases distributed across a plurality of hardware devices, a directory on a server or a plurality of directories on one or more servers.
- Data interpretation agents. Although shown in Figure 12 as being external to the CBR platform 1201 , there may also be data interpretation agents within the CBR platform 1201 . The data interpretation agents may also provide an executable input to the CBR engine 1206.
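By way of illustration only, the scoring role of the CBR engine 1206 described in the list above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any embodiment: the `Case` layout, the linear fall-off in `feature_similarity`, the weighted mean and the parameter names `hook_load` and `torque` are all hypothetical, and each tolerance is assumed to be greater than zero.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Case:
    """Hypothetical stored case: per feature, a stored value, a tolerance and a weight."""
    name: str
    features: Dict[str, Tuple[float, float, float]]  # feature -> (stored, tolerance, weight)


def feature_similarity(observed: float, stored: float, tolerance: float) -> float:
    """Assumed comparison function: 1.0 on an exact match, 0.0 once the gap reaches the tolerance."""
    return max(0.0, 1.0 - abs(observed - stored) / tolerance)


def overall_similarity(sample: Dict[str, float], case: Case) -> float:
    """Weighted mean of the per-feature scores, expressed as a percentage match."""
    total_weight = sum(weight for _, _, weight in case.features.values())
    score = 0.0
    for feature, (stored, tolerance, weight) in case.features.items():
        if feature in sample:  # a parameter missing from the sample contributes 0
            score += weight * feature_similarity(sample[feature], stored, tolerance)
    return 100.0 * score / total_weight


def rank_cases(sample: Dict[str, float], case_library: List[Case]) -> List[Tuple[str, float]]:
    """Score the sample against every case and list the most similar cases first."""
    results = [(case.name, overall_similarity(sample, case)) for case in case_library]
    return sorted(results, key=lambda item: item[1], reverse=True)
```

In this sketch the list returned by `rank_cases` stands in for the results that the CBR engine 1206 outputs to the unified data cache 1205.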
APIs are provided for data input to, and output from, the CBR platform 1201. These allow persistent data storage and also provide tools for data analysts and platform administrators. The APIs may be part of, and integral with, the CBR platform 1201 or they may be separate from the CBR platform 1201. The UIs, data sources, persistence database and external data interpretation agents do not form part of the CBR platform 1201 and may be custom devices for a specific application.
The data source API 1203 enables integration with a variety of data sources via data connectors. The data connectors are typically implemented as short programs that connect data streams, which represent information on a monitored situation, from one or more data sources to the CBR platform 1201. The live and static data connectors receive information from respective live and static data sources and map the information to a unified data format. The data source API 1203 is provided so that the data connectors can be customised for different implementations. Default connectors may be used but the API also enables the implementation of custom data connectors developed specifically for the application that the CBR platform 1201 is required to support.
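A custom data connector of the kind described above might, purely as an illustrative sketch, look like the following. The record layout `UnifiedRecord`, the class `CsvLineConnector`, the `time,name,value` line format and the parameter names are assumptions made for the example; the unified data format itself is not specified here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class UnifiedRecord:
    """Hypothetical unified data format: one timestamped parameter value."""
    timestamp: float
    parameter: str
    value: float


class CsvLineConnector:
    """Hypothetical custom connector mapping 'time,name,value' lines to UnifiedRecord."""

    def __init__(self, rename: Dict[str, str]):
        # Maps source-specific parameter names to the names used in the unified format.
        self.rename = rename

    def connect(self, lines: Iterable[str]) -> Iterator[UnifiedRecord]:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # ignore blank lines in the source
            timestamp, name, value = line.split(",")
            yield UnifiedRecord(float(timestamp), self.rename.get(name, name), float(value))
```

Because `connect` is a generator, the same connector shape can serve a static source (a file) or a live source (lines arriving on a socket) without change.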
External to the CBR platform 1201 is a persistence database for permanently storing some or all of the data that is input to and/or generated within the CBR platform 1201. In particular, any new cases 21 generated by a revise agent may be stored in the persistence database. The persistence database may be that of a third party or a default database provided with the CBR platform 1201. It can be implemented according to any known storage solution, such as one or more databases or directories. The stored data in the persistence database can be used to replay situations in order to validate data interpretation agents and cases 21. An additional advantage of such a persistence database is that it can store the current data within the CBR platform 1201 and thereby allow fast system recovery if there is a system failure. Such an external database also facilitates the handling of big data.
The persistence database is supported through the persistence API 1204. The persistence database can be integrated with the CBR platform 1201 with short programs that translate between the CBR platform 1201 and the data storage solution, that may be a custom data storage solution.
Each application may have data interpretation agents internal and/or external to the CBR platform 1201. Tasks that may be performed by the data interpretation agents include pre-processing data and filtering out noise before the data is fed into the CBR engine 1206. The data interpretation agents may mine the unified data in order to identify patterns in it using pattern recognition methods. The pattern recognition methods may be standard or customised. The agents are typically highly modular and while some are application specific, others can be reused to identify similar patterns or perform similar noise filtering across a plurality of different applications. For example, an agent may use statistical methods to recognize when there is a sudden increase in a time series of data. An example of a more complex agent is one that may analyse trends in a set of parameters to detect certain patterns, such as when a few of the parameters have erratic values relative to the others. The data interpretation agents may therefore generate information for detecting specific events, or just single or streams of numerical values, for use in any of the case comparison processes.
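As a non-limiting illustration of the simple statistical agent mentioned above, a sudden increase in a time series can be flagged by comparing each value with the mean and standard deviation of a trailing window. The function name, the window length, the threshold of three standard deviations and the handling of a perfectly flat history are assumptions chosen for this sketch.

```python
from statistics import mean, stdev
from typing import List, Sequence


def detect_sudden_increases(values: Sequence[float], window: int = 5,
                            threshold: float = 3.0) -> List[int]:
    """Return the indices at which a value rises more than `threshold` standard
    deviations above the mean of the preceding `window` values."""
    events = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0.0:
            # Assumed rule for a perfectly flat history: any rise is an event.
            if values[i] > mu:
                events.append(i)
        elif (values[i] - mu) / sigma > threshold:
            events.append(i)
    return events
```

The returned indices could be passed on as event information, or the per-sample deviation could be emitted as a stream of numerical values for use in case comparison.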
The data interpretation agents communicate with the CBR platform 1201 through the agent API 1208. The agent API 1208 is shown within the CBR platform 1201 in Figure 12 but may alternatively be on the edge of the CBR platform 1201, in the same way that the other APIs in Figure 12 are shown. The agent API 1208 also enables third party developers to create custom agents. The agent API 1208 provides the CBR engine 1206 with information for detecting specific events, such as 'overpull' or 'tight spot' events during a drilling operation. The agent API 1208 may also provide the CBR engine 1206 with parameter information, for use by the comparison agents in generating similarity scores, and this information may be in the form of single parameter values or one or more data streams of parameter values. That is to say, the CBR engine 1206 may treat a data stream received from the agent API 1208 as if it were a data stream within the received data stream from a monitored situation and use the data stream to generate the overall similarity score for a case 21.
Default or custom UIs of applications can communicate with the platform through the Application API 1202.
Data analysts may be provided with UIs that enable them to view raw or analysed data going through the CBR platform 1201, view case data, add custom data interpretation agents, test data interpretation agents and case matching, capture cases 21 and configure cases 21 and the case library.
Platform administrators are provided with UIs that can be used for server cluster installation and configuration. The results of the CBR engine 1206 may be displayed to a system operator using a case radar as described in WO2010/106014A2, which is incorporated herein by reference. The radar provides a highly intuitive visualization that allows a system operator to easily identify relevant cases 21.
Advantageously, the CBR platform 1201 is highly adaptable and can be easily integrated into a wide range of applications. In addition, the CBR platform 1201, in particular the CBR engine 1206 within the CBR platform 1201, is highly scalable and can therefore be used in applications that require a large case base to be searched and/or large cases 21 within the case base to be searched. The CBR engine 1206 can easily adapt to different sizes of case base. A case base may increase in size if new cases 21 are added, or decrease if some of the existing cases 21 in the case base are deemed not relevant to the current situation and do not need to be used in comparisons.
A further advantage is that the CBR platform 1201 can be implemented by a distributed computing system. This increases the scalability, flexibility and adaptability of the CBR platform 1201 . Applications that the CBR platform 1201 is suitable for range from the oil and gas industry, in which the cases 21 are typically very large and the case comparisons computationally demanding, to the financial services industry, in which the case comparisons are typically less computationally demanding but the case base a lot larger.
Further embodiments include modifications and variations of the above described techniques.
For example, in the above-described techniques a comparison agent is created for each case 21 in the case base. An advantage of this approach is that the retrieval agent determines one or more similar cases 21 in dependence on all of the case information in the case base. An alternative approach is to first determine a subset of potentially relevant cases 21 from the case base and only generate comparison agents for the subset of cases 21 . This requires the additional process of filtering the cases 21 in the case base so that the subset only includes cases 21 that are potentially relevant. However, the determination of one or more cases 21 is faster and more computationally efficient since fewer comparison agents are required.
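The alternative approach described above, in which the case base is filtered before comparison agents are created, can be sketched as follows. This is an assumed illustration only: the `Case` fields, the tag-based relevance test and the exact-match scoring inside `make_agent` are placeholders, not the comparison functions of the embodiments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Agent = Callable[[Dict[str, float]], float]


@dataclass
class Case:
    """Hypothetical case record with metadata tags used only for pre-filtering."""
    case_id: str
    tags: Set[str]
    values: Dict[str, float]


def select_candidate_cases(case_base: List[Case],
                           is_relevant: Callable[[Case], bool]) -> List[Case]:
    """Additional filtering step: keep only the potentially relevant cases."""
    return [case for case in case_base if is_relevant(case)]


def make_agent(case: Case) -> Agent:
    """Placeholder comparison agent: percentage of stored values matched exactly."""
    def agent(sample: Dict[str, float]) -> float:
        matches = sum(1 for name, value in case.values.items()
                      if sample.get(name) == value)
        return 100.0 * matches / len(case.values)
    return agent


def create_agents(cases: List[Case]) -> Dict[str, Agent]:
    """Create one comparison agent per case in the (possibly filtered) list."""
    return {case.case_id: make_agent(case) for case in cases}
```

Passing the whole case base to `create_agents` corresponds to the first approach; passing the output of `select_candidate_cases` corresponds to the alternative, which creates fewer agents at the cost of the extra filtering step.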
The case 21 shown in Figure 2 has separate dynamic and static data. This separation is not essential and the dynamic and static data may be fully or partially intermingled. As shown in Figure 5, a filter is provided that filters a received data stream into different data streams of parameters. This filtering is not essential and the data stream could be applied unfiltered to each computation unit. This would increase the amount of communicated information within each comparison agent but avoid the requirement of having a filter at the input to the comparison agent.
Similarly, in Figure 6 it is not essential to filter the received data stream so that comparison agents are only provided with data streams of parameters that they require. Not filtering the received data stream increases the amount of communicated information within the CBR engine but the processing requirements at the input to the CBR engine are reduced.
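The trade-off between filtering and not filtering the received data stream can be illustrated with the following sketch. The representation of a sample as a `(parameter name, value)` pair, the function names and the consumer labels are assumptions made for the example; a consumer here stands for either a computation unit or a comparison agent.

```python
from typing import Dict, Iterable, List, Set, Tuple

Sample = Tuple[str, float]  # assumed form: (parameter name, value)


def filter_stream(stream: Iterable[Sample],
                  required: Dict[str, Set[str]]) -> Dict[str, List[Sample]]:
    """Filtered variant: each consumer receives only the parameters it requires."""
    routed: Dict[str, List[Sample]] = {consumer: [] for consumer in required}
    for name, value in stream:
        for consumer, parameters in required.items():
            if name in parameters:
                routed[consumer].append((name, value))
    return routed


def broadcast_stream(stream: Iterable[Sample],
                     consumers: Iterable[str]) -> Dict[str, List[Sample]]:
    """Unfiltered variant: every consumer receives the whole stream."""
    samples = list(stream)
    return {consumer: list(samples) for consumer in consumers}
```

The filtered variant does more work at the input and communicates less; the unfiltered variant needs no filter but communicates every sample to every consumer.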
As described above, embodiments of the CBR platform are particularly powerful tools for the energy, finance and healthcare industries. Embodiments are in no way restricted to these applications and the CBR engine may be used in any industry. In particular, the CBR engine can provide a powerful tool in the automobile industry, the fish farming industry and for the control of energy grids. Embodiments are particularly effective for applications, in any domain, in which humans are required to make decisions based on the information in real-time data streams.
The flowcharts and description thereof herein should not be understood to prescribe a fixed order of performing the method steps described therein. Rather, the method steps may be performed in any order that is practicable. Although the present invention has been described in connection with specific exemplary embodiments, it should be understood that various changes, substitutions, and alterations apparent to those skilled in the art can be made to the disclosed embodiments without departing from the spirit and scope of the invention as set forth in the appended claims.
Some of the above-described embodiments are described with references to flowcharts and/or block diagrams of methods, apparatuses, and systems. One skilled in the art will appreciate that each block of the flowcharts, block diagrams, and/or their combinations can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer(s) or computer system(s), special purpose computer(s) or computer system(s), other programmable data processing apparatus, or the like, to produce a machine, such that the instructions, executed via the processor of the computer (computer system, programmable data processing apparatus, or the like), create mechanisms for implementing the functions specified within the blocks of the flowcharts and/or block diagrams and/or within corresponding portions of the present disclosure.
These computer program instructions may also be stored in a computer-readable memory (or medium) and direct a computer (computer system, programmable data processing apparatus, or the like) to function in a particular manner, such that the instructions stored in the computer-readable memory or medium produce an article of manufacture including instruction means which implement the functions specified in the blocks of the flowchart(s) and/or block diagram(s) and/or within corresponding portions of the present disclosure.
One skilled in the art will understand that any suitable computer-readable medium may be utilized. In particular, the computer-readable medium may include, but is not limited to, a non-transitory computer-readable medium, such as a tangible electronic, magnetic, optical, electromagnetic, infrared, and/or semiconductor system, device, and/or other apparatus. For example, in some embodiments, the non-transitory computer-readable medium includes a tangible medium such as a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a compact disc read-only memory (CD-ROM), and/or some other tangible optical and/or magnetic storage device. In other embodiments, the computer-readable medium may be transitory, such as, for example, a propagation signal including computer-executable program code portions embodied therein.
The computer program instructions may also be loaded onto a computer (computer system, other programmable data processing apparatus, or the like) to cause a series of operational steps to be performed on the computer (computer system, other programmable data processing apparatus, or the like) to produce a computer-implemented method or process such that the instructions executed on the computer (computer system, other programmable data processing apparatus, or the like) provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block(s) and/or within corresponding portions of the present disclosure.
In some embodiments of the present disclosure, the above described methods and/or processes could be performed by a program executing in a programmable, general purpose computer or computer system. Alternative embodiments are implemented in a dedicated or special-purpose computer or computer system in which some or all of the operations, functions, steps, or acts are performed using hardwired logic or firmware.
Further, as used herein, the terms "unit" and "engine" may be understood to refer to computing software, firmware, hardware, and/or various combinations thereof.
In some embodiments some or all of the steps are performed automatically by a processor.
Claims
1. A computer-implemented method of monitoring a situation by generating an overall similarity score between a received data stream and a case (21) in case-based reasoning, CBR, the method comprising: receiving (703) a data stream comprising information on a monitored situation; generating (705) a plurality of parallel data streams, wherein each of the generated plurality of data streams is dependent on the received data stream; generating (707), for each of the generated data streams, a similarity score for a feature of a case (21), wherein each similarity score is generated in dependence on a comparison between information in the generated data stream and stored information on the feature of a case (21), and each of the similarity scores is generated in dependence on a comparison with stored information on a different feature of the same case (21); and generating (709) an overall similarity score between the received data stream and the case (21) in dependence on the generated similarity scores.
2. The method according to claim 1, further comprising generating each similarity score in dependence on comparison information of the corresponding feature of the case (21).
3. The method according to claim 1 or 2, wherein the received data stream comprises one or more streams of parameters and each of the generated data streams comprises one or more streams of parameters in dependence on the received data stream.
4. The method according to any preceding claim, further comprising generating the similarity score for each feature in dependence on a comparison of the value of a parameter in a generated data stream of parameters for the feature with the stored value of the parameter for the feature.
5. The method according to claim 2 or any dependent claim thereon, wherein the comparison information comprises weight information and the method further comprises weighting the value of the parameter in the generated data stream for a feature and/or the stored value of the parameter for the feature; and generating the similarity score of the feature in dependence on the weighted value(s).
6. The method according to claim 2 or any dependent claim thereon, wherein the comparison information comprises a comparison function that specifies computations for generating the similarity score and the method further comprises generating the similarity score of each feature in dependence on the comparison function of the feature.
7. The method according to any preceding claim, further comprising generating one or more aggregate similarity scores in dependence on one or more of the generated similarity scores; and generating the overall similarity score in dependence on the aggregate similarity score(s).
8. The method according to any preceding claim, further comprising determining the overall similarity score in real-time.
9. The method according to any preceding claim, wherein the method is performed by a comparison agent (41, 51, 62, 63, 64, 101, 102, 103).
10. The method according to claim 9, wherein the comparison agent (41, 51, 62, 63, 64, 101, 102, 103) is configured in dependence on the comparison information for each feature and including the stored information for each feature.
11. The method according to any preceding claim, wherein the case (21) comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
12. A computer-implemented method of monitoring a situation by determining a set of one or more cases (21 ) in case-based reasoning, CBR, the method comprising: receiving a data stream comprising information on a monitored situation; generating a plurality of parallel data streams from the received data stream; generating, according to the method of any preceding claim, an overall similarity score between each of the plurality of parallel data streams and a case (21 ), wherein each overall similarity score is generated from a comparison between one of the plurality of parallel data streams and a different one of a plurality of cases (21 ); and determining a set of one or more cases (21 ) in dependence on the generated overall similarity scores.
13. A comparison agent (41, 51, 62, 63, 64, 101, 102, 103) for monitoring a situation by generating an overall similarity score between a received data stream, comprising information on a monitored situation, and a case (21) in case-based reasoning, CBR, wherein the comparison agent (41, 51, 62, 63, 64, 101, 102, 103) is configured to perform the method of any of claims 1 to 11.
14. A case-based reasoning, CBR, engine (60, 1206) for monitoring a situation by determining a set of one or more cases (21), wherein the CBR engine (60, 1206) is configured to perform the method of claim 12.
15. A computer program that, when executed by a computing device, controls the computing device to perform the method of any of claims 1 to 12.
16. A computer-implemented method of monitoring a situation by determining a set of one or more cases (21 ) in case-based reasoning, CBR, the method comprising: receiving (803) a data stream comprising information on a monitored situation; generating (805) a plurality of parallel data streams from the received data stream; generating (807), for each of the parallel data streams, an overall similarity score between the parallel data stream and one of a plurality of cases (21 ), wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and a different case (21 ); and determining (809) a set of one or more cases (21 ) in dependence on the generated overall similarity scores.
17. The method according to claim 16, wherein each of the overall similarity scores is generated by one of a plurality of comparison agents (41 , 51 , 62, 63, 64, 101 , 102, 103) and each of the comparison agents (41 , 51 , 62, 63, 64, 101 , 102, 103) receives one of the plurality of data streams.
18. The method according to claim 17, the method further comprising each comparison agent (41 , 51 , 62, 63, 64, 101 , 102, 103) generating an overall similarity score by:
receiving one of the plurality of data streams; generating a further plurality of parallel data streams, wherein each of the generated further plurality of parallel data streams is dependent on the received one of the plurality of data streams; generating, for each of said generated further plurality of parallel data streams, a similarity score in dependence on a comparison between information on a feature in the generated further data stream and stored information on the feature of a case (21 ), wherein each of the similarity scores of said generated further plurality of data streams is generated in dependence on a comparison with stored information on a different feature of the same case (21 ); and generating an overall similarity score between the received one of the plurality of data streams and the case (21 ) in dependence on the generated similarity scores.
19. The method according to claim 16 or any dependent claim thereon, further comprising determining to include a case (21 ) in the set of one or more cases (21 ) if the overall similarity score for the case (21 ) is above a predetermined threshold level.
20. The method according to claim 16 or any dependent claim thereon, wherein the determined set of cases (21 ) has a predetermined number of two or more cases (21 ), and the method comprises determining the predetermined number of cases (21 ) for including in the set as the cases (21 ) with the highest overall similarity scores.
21. The method according to claim 16 or any dependent claim thereon, further comprising displaying information dependent on each of the determined one or more cases (21).
22. The method according to claim 16 or any dependent claim thereon, wherein each case (21 ) comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
23. A case-based reasoning, CBR, engine (60, 1206) for monitoring a situation by determining a set of one or more cases (21 ), wherein the CBR engine (60, 1206) is configured to perform the method of any of claims 16 to 22.
24. A computer program that, when executed by a computing device, controls the computing device to perform the method of any of claims 16 to 22.
25. A case-based reasoning, CBR, platform (1201 ) for monitoring a situation, the CBR platform (1201 ) comprising:
a CBR engine (60, 1206);
a unified data cache (1205);
a case library (1207);
an agent application programming interface, API (1208);
a data source API (1203);
an application API (1202); and
a persistence API (1204);
wherein: the data source API (1203) is configured to provide an interface between live and/or static data sources, external from the CBR platform (1201 ), and the unified data cache (1205), wherein the live and/or static data sources are sources of information on a monitored situation; the persistence API (1204) is configured to provide an interface between a persistence database, external from the CBR platform (1201 ), and the unified data cache (1205) and the case library (1207);
the application API (1202) is configured to provide an interface between the systems of data analysts, platform administrators and/or operators, external from the CBR platform (1201 ), and the unified data cache (1205) and the case library (1207); the CBR engine (60, 1206) is configured to receive information for use in generating CBR results from the agent API (1208), to receive information on a plurality of cases (21 ) from the case library (1207), to receive a data stream comprising information on the monitored situation from the unified data cache (1205) and to generate CBR results that are both dependent on a comparison of a plurality of parallel data streams, dependent on the received data stream, with information on the plurality of cases (21 ) and also dependent on the information received from the agent API; the unified data cache (1205) is configured to receive data from the data source API (1203), to send a data stream comprising information on the monitored situation to the CBR engine (60, 1206) and to receive the results of the CBR engine (60, 1206) from the CBR engine (60, 1206); the agent API (1208) is configured to receive information from data interpretation agents external from the CBR platform (1201 ) and to send the received information to the CBR engine (60, 1206) in order to provide the CBR engine (60, 1206) with information for use in generating the CBR results; and the case library (1207) is configured to send information on a plurality of cases (21 ) to the CBR engine (60, 1206).
26. The CBR platform (1201) according to claim 25, wherein the case library (1207) comprises a case base.
27. The CBR platform (1201) according to claim 26, wherein cases (21) are stored in the case base in XML or serialised code format.
28. The CBR platform (1201 ) according to claim 26 or 27, wherein the case base is any of a single database, a plurality of distributed databases, a directory on a server or a plurality of directories on one or more servers.
29. The CBR platform (1201 ) according to claim 25 or any dependent claim thereon, wherein each case (21 ) comprises metadata that includes comparison information for the case (21 ).
30. The CBR platform (1201 ) according to claim 25 or any dependent claim thereon, wherein each case (21 ) is structured according to a computing graph.
31 . The CBR platform (1201 ) according to claim 30, wherein the computing graph is a tree structure or directed acyclic graph.
32. The CBR platform (1201 ) according to claim 25 or any dependent claim thereon, wherein the CBR engine (60, 1206) is configured to perform the method of: receiving said data stream comprising information on the monitored situation from the unified data cache (1205); generating a plurality of parallel data streams from said received data stream comprising information on the monitored situation; generating, for each of the parallel data streams, an overall similarity score between the parallel data stream and information on one of the plurality of cases (21 ) received from the case library (1207), wherein each overall similarity score is generated from a comparison between one of the plurality of data streams and information on a different case (21 ); and
determining the CBR results as a set of one or more cases (21 ) in dependence on the generated overall similarity scores.
33. The CBR platform (1201 ) according to claim 25 or any dependent claim thereon, wherein each case (21 ) comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
34. A computer-implemented method in case-based reasoning, CBR, for creating components of a comparison agent (41 , 51 , 62, 63, 64, 101 , 102, 103) for monitoring a situation, the method comprising: obtaining (903) one or more parameter values and comparison information for each of a plurality of features of a case (21 ), wherein the comparison information of each feature defines a configuration of a computation unit (42, 43, 55, 56, 57); and creating (905), for each of the plurality of features, a computation unit (42, 43, 55, 56, 57) in dependence on the obtained one or more parameter values and the comparison information of the feature, such that the created computation unit (42, 43, 55, 56, 57) is configured to generate an output in dependence on the obtained one or more parameter values and the comparison information of the feature.
35. The method according to claim 34, further comprising creating each of the computation units (42, 43, 55, 56, 57) such that each computation unit (42, 43, 55, 56, 57) is configured to receive one or more data streams of parameters and to generate an output in dependence on the values of the one or more received parameters, wherein the one or more data streams comprise information on a monitored situation.
36. The method according to claim 35, further comprising:
obtaining comparison information for one or more aggregate computation units (44, 52, 53, 54), wherein the comparison information of each aggregate computation unit (44, 52, 53, 54) defines the configuration of the aggregate computation unit (44, 52, 53, 54); and creating, in dependence on the obtained comparison information, one or more aggregate computation units (44, 52, 53, 54), wherein each aggregate computation unit (44, 52, 53, 54) is configured to receive an output generated by at least one of the computation units (42, 43, 55, 56, 57).
37. The method according to claim 35 or 36, wherein the comparison information for each computation unit (42, 43, 55, 56, 57) includes information on how the obtained one or more parameter values and/or received parameter values are to be weighted by the computation unit (42, 43, 55, 56, 57), and the method comprises creating each computation unit (42, 43, 55, 56, 57) such that the computation unit (42, 43, 55, 56, 57) is configured to weight the obtained one or more parameter values and/or received parameter values in dependence on the comparison information.
38. The method according to claim 36 or claim 37 when dependent on claim 36, wherein the comparison information for each aggregate computation unit (44, 52, 53, 54) includes information on how the aggregate computation unit (44, 52, 53, 54) is to weight the values of its inputs, and the method comprises creating each aggregate computation unit (44, 52, 53, 54) such that the aggregate computation unit (44, 52, 53, 54) is configured to weight its inputs in dependence on the comparison information.
39. The method according to claim 35 or any dependent claim thereon, wherein the comparison information for each computation unit (42, 43, 55, 56, 57) includes a comparison function that specifies the computations that the computation unit (42, 43, 55, 56, 57) is required to perform to generate an output, and the method comprises creating each computation unit (42, 43, 55, 56, 57) such that the computation unit (42, 43, 55, 56, 57) is configured to apply computations with the obtained one or more parameter values and received parameter values in dependence on the comparison function.
40. The method according to claim 36 or any dependent claim thereon, wherein the configuration information for each aggregate computation unit (44, 52, 53, 54) includes a comparison function that specifies the computations that the aggregate computation unit (44, 52, 53, 54) is required to perform to generate an output, and the method comprises creating each aggregate computation unit (44, 52, 53, 54) such that the aggregate computation unit (44, 52, 53, 54) is configured to apply computations to its inputs in dependence on the comparison function.
41 . The method according to claim 34 or any dependent claim thereon, wherein the comparison information for each computation unit (42, 43, 55, 56, 57) and the comparison information of each aggregate computation unit (44, 52, 53, 54) are obtained by obtaining metadata of the case (21 ).
42. The method according to claim 41 when dependent on claim 36, wherein the computation units (42, 43, 55, 56, 57) and aggregate computation units (44, 52, 53, 54) are configured in dependence on a computing graph for the case (21 ).
43. The method according to claim 42, wherein the computing graph is a tree structure or directed acyclic graph.
44. The method according to claim 34 or any dependent claim thereon, wherein the case (21) comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
45. A comparison agent (41 , 51 , 62, 63, 64, 101 , 102, 103) for monitoring a situation, the comparison agent (41 , 51 , 62, 63, 64, 101 , 102, 103) created according to the method of any of claims 34 to 44.
46. A computer program that, when executed by a computing device, controls the computing device to perform the method of any of claims 34 to 44.
47. A computer-implemented method of creating a new case (21) in a case-based reasoning, CBR, system for monitoring a situation, the method comprising: determining (1103) a set of one or more cases from a plurality of cases in dependence on a received data stream comprising information on a monitored situation, wherein each case (21) comprises information describing a problem and information describing a solution to the problem and the process of determining the set of one or more cases is performed without comparing the description of the problems of any of the plurality of cases with a previously generated current case comprising information describing the monitored situation; generating (1105) information describing a solution in dependence on information obtained from the determined set of one or more cases and/or in dependence on information received from a user interface; generating (1107) a current case comprising information describing the monitored situation in dependence on the received data stream; and generating (1109) a new case (21) in dependence on the generated information describing a solution and the generated current case.
48. The method according to claim 47, further comprising generating the current case after generating the information describing a solution.
49. The method according to claim 47 or 48, further comprising generating a request for the current case; and generating the current case in response to receiving the request.
50. The method according to claim 47, further comprising automatically generating the current case.
51. The method according to any of claims 47 to 50, wherein the plurality of cases are stored in a case base (108) and the method further comprises storing the generated new case (21) in the case base.
52. The method according to any of claims 47 to 51, wherein the process of determining a set of one or more cases (21) includes creating a comparison agent (41, 51, 62, 63, 64, 101, 102, 103) for each of the plurality of cases (21), and the method comprises generating a new comparison agent (41, 51, 62, 63, 64, 101, 102, 103) for the generated new case (21).
53. The method according to any of claims 47 to 52, wherein each case (21) comprises information that describes a situation in one of the finance industry, healthcare industry or energy industry.
54. A case-based reasoning, CBR, system for monitoring a situation, the CBR system configured to perform the method of any of claims 47 to 53.
55. A computer program that, when executed by a computing device, controls the computing device to perform the method of any of claims 47 to 53.
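The case-creation flow recited in claims 47, 51 and 52 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the class names (`Case`, `ComparisonAgent`, `CBRSystem`), the similarity metric, and the `threshold` parameter are all hypothetical choices made for illustration, and the data stream is simplified to a single dictionary of numeric readings. The point it shows is the ordering of steps: stored cases are matched directly against the stream, and the current case is only built afterwards.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Case:
    problem: Dict[str, float]  # information describing a problem
    solution: str              # information describing a solution


@dataclass
class ComparisonAgent:
    """Hypothetical agent that scores stream data against one stored case."""
    case: Case

    def similarity(self, sample: Dict[str, float]) -> float:
        # Illustrative metric only (an assumption, not from the claims):
        # 1 / (1 + mean absolute difference over shared attributes).
        keys = self.case.problem.keys() & sample.keys()
        if not keys:
            return 0.0
        diff = sum(abs(self.case.problem[k] - sample[k]) for k in keys) / len(keys)
        return 1.0 / (1.0 + diff)


class CBRSystem:
    def __init__(self, cases: List[Case], threshold: float = 0.5) -> None:
        self.case_base = list(cases)
        # One comparison agent per stored case (cf. claim 52).
        self.agents = [ComparisonAgent(c) for c in self.case_base]
        self.threshold = threshold

    def determine_cases(self, sample: Dict[str, float]) -> List[Case]:
        # Step 1103: match stored cases against the stream sample itself;
        # no current case has been generated at this point.
        scored = [(a.similarity(sample), a.case) for a in self.agents]
        scored = [s for s in scored if s[0] >= self.threshold]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [case for _, case in scored]

    def create_new_case(self, sample: Dict[str, float],
                        user_solution: Optional[str] = None) -> Case:
        matched = self.determine_cases(sample)               # step 1103
        # Step 1105: solution from the user interface and/or the matched cases.
        if user_solution is not None:
            solution = user_solution
        elif matched:
            solution = matched[0].solution
        else:
            raise ValueError("no matching case and no user-supplied solution")
        current_problem = dict(sample)                       # step 1107
        new_case = Case(current_problem, solution)           # step 1109
        self.case_base.append(new_case)                      # cf. claim 51
        self.agents.append(ComparisonAgent(new_case))        # cf. claim 52
        return new_case
```

In this sketch, calling `create_new_case` with a fresh set of readings reuses the solution of the best-matching stored case unless a user-supplied solution is given, then stores the new case and registers a comparison agent for it.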
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP14767002.0A EP3058519A2 (en) | 2013-10-17 | 2014-09-18 | Case-based reasoning |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1318395.9 | 2013-10-17 | ||
GBGB1318395.9A GB201318395D0 (en) | 2013-10-17 | 2013-10-17 | Case-Based Reasoning |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2015055373A2 (en) | 2015-04-23 |
WO2015055373A3 (en) | 2015-07-16 |
Family
ID=49726948
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/EP2014/069889 WO2015055373A2 (en) | 2013-10-17 | 2014-09-18 | Case-based reasoning |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP3058519A2 (en) |
GB (1) | GB201318395D0 (en) |
WO (1) | WO2015055373A2 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10318539B2 (en) | 2016-03-24 | 2019-06-11 | General Electric Company | Method and apparatus for managing information across like-cases |
CN110974260A (en) * | 2019-12-16 | 2020-04-10 | 兰州大学 | Case-based reasoning depression recognition system based on electroencephalogram characteristics |
CN111105075A (en) * | 2019-11-25 | 2020-05-05 | 上海建科工程咨询有限公司 | Tower crane risk accident prediction method and system based on case-based reasoning |
CN113642733A (en) * | 2021-10-19 | 2021-11-12 | 矿冶科技集团有限公司 | Case reasoning and matching method for gene mineral separation process |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5799148A (en) * | 1996-12-23 | 1998-08-25 | General Electric Company | System and method for estimating a measure of confidence in a match generated from a case-based reasoning system |
US8170800B2 (en) * | 2009-03-16 | 2012-05-01 | Verdande Technology As | Method and system for monitoring a drilling operation |
2013
- 2013-10-17 GB GBGB1318395.9A patent/GB201318395D0/en not_active Ceased
2014
- 2014-09-18 WO PCT/EP2014/069889 patent/WO2015055373A2/en active Application Filing
- 2014-09-18 EP EP14767002.0A patent/EP3058519A2/en not_active Ceased
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10318539B2 (en) | 2016-03-24 | 2019-06-11 | General Electric Company | Method and apparatus for managing information across like-cases |
CN111105075A (en) * | 2019-11-25 | 2020-05-05 | 上海建科工程咨询有限公司 | Tower crane risk accident prediction method and system based on case-based reasoning |
CN110974260A (en) * | 2019-12-16 | 2020-04-10 | 兰州大学 | Case-based reasoning depression recognition system based on electroencephalogram characteristics |
CN113642733A (en) * | 2021-10-19 | 2021-11-12 | 矿冶科技集团有限公司 | Case reasoning and matching method for gene mineral separation process |
CN113642733B (en) * | 2021-10-19 | 2022-02-15 | 矿冶科技集团有限公司 | Case reasoning and matching method for gene mineral separation process |
Also Published As
Publication number | Publication date |
---|---|
WO2015055373A3 (en) | 2015-07-16 |
GB201318395D0 (en) | 2013-12-04 |
EP3058519A2 (en) | 2016-08-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9613362B2 (en) | Monitoring a situation by comparing parallel data streams | |
US11163731B1 (en) | Autobuild log anomaly detection methods and systems | |
EP2674875B1 (en) | Method, controller, program and data storage system for performing reconciliation processing | |
US10423647B2 (en) | Descriptive datacenter state comparison | |
US9471470B2 (en) | Automatically recommending test suite from historical data based on randomized evolutionary techniques | |
JP6427592B2 (en) | Manage data profiling operations related to data types | |
US20190286509A1 (en) | Hierarchical fault determination in an application performance management system | |
US10324710B2 (en) | Indicating a trait of a continuous delivery pipeline | |
CA2930623A1 (en) | Method and system for aggregating and ranking of security event-based data | |
US20210406003A1 (en) | Meta-indexing, search, compliance, and test framework for software development using smart contracts | |
JP7545461B2 (en) | DATA PROCESSING METHOD, DATA PROCESSING APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM, AND COMPUTER PROGRAM | |
US20160154730A1 (en) | Using linked data to determine package quality | |
US9658834B2 (en) | Program visualization device, program visualization method, and program visualization program | |
JP5901668B2 (en) | System and method for grouping warnings generated during static analysis | |
US8938443B2 (en) | Runtime optimization of spatiotemporal events processing | |
CN111967917B (en) | Method and device for predicting user loss | |
WO2015055373A2 (en) | Case-based reasoning | |
US10878039B2 (en) | Creating knowledge base of similar systems from plurality of systems | |
US20160364282A1 (en) | Application performance management system with collective learning | |
US11153183B2 (en) | Compacted messaging for application performance management system | |
US9619765B2 (en) | Monitoring a situation by generating an overall similarity score | |
US20150112912A1 (en) | Case-based reasoning | |
US20150112914A1 (en) | Case-based reasoning | |
CN112860523A (en) | Fault prediction method and device for batch job processing and server | |
EP2731021A1 (en) | Apparatus, program, and method for reconciliation processing in a graph database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
NENP | Non-entry into the national phase | Ref country code: DE |
REEP | Request for entry into the european phase | Ref document number: 2014767002 Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 2014767002 Country of ref document: EP |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 14767002 Country of ref document: EP Kind code of ref document: A2 |