WO2019103778A1 - Missing label classification and anomaly detection for sparsely populated manufacturing knowledge graphs - Google Patents

Publication number
WO2019103778A1
Authority
WO
WIPO (PCT)
Prior art keywords
knowledge graph
manufacturing knowledge
graph
manufacturing
stored
Prior art date
Application number
PCT/US2018/049159
Other languages
French (fr)
Inventor
Guannan REN
Sanjeev SRIVASTAVA
Erhan Arisoy
Original Assignee
Siemens Aktiengesellschaft
Siemens Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens Aktiengesellschaft, Siemens Corporation

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

Definitions

  • Knowledge graphs are widely used in social media networks and natural language processing. In these areas, data is abundant, easy to obtain, and the relationships between entities are well-defined. Knowledge graphs in manufacturing can be useful for process planning, task scheduling, diagnostics, product design, capability analysis, resource allocation, analyzing production scenarios, and so forth.
  • a computer-implemented method for identifying missing or mislabeled information in a manufacturing knowledge graph includes generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents and performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph. The method further includes determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph.
  • the method additionally includes identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph and detecting and correcting the mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
  • A system for identifying missing or mislabeled information in a manufacturing knowledge graph includes at least one memory storing computer-executable instructions and at least one processor configured to access the at least one memory and execute the computer-executable instructions to perform a set of operations.
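  • The operations include generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents and performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph.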
  • the operations further include determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph.
  • the operations additionally include identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph and detecting and correcting the mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
  • a computer program product for identifying missing or mislabeled information in a manufacturing knowledge graph.
  • the computer program product includes a non-transitory storage medium readable by a processing circuit, the storage medium storing instructions executable by the processing circuit to cause a method to be performed.
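  • The method includes generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents and performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph.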
  • the method further includes determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph.
  • the method additionally includes identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph and detecting and correcting mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
  • FIG. 1 is a hybrid block/data flow diagram schematically depicting a process for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments.
  • FIG. 2 is a process flow diagram of an illustrative method for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments.
  • FIG. 3 is a schematic diagram of an illustrative computing configuration for implementing one or more example embodiments.
  • Example embodiments of the invention relate to, among other things, systems, methods, computer-readable media, techniques, and methodologies for identifying missing or mislabeled information in a manufacturing knowledge graph.
  • Knowledge graphs in the manufacturing domain often face the challenge of sparsity of available information for constructing the knowledge graphs.
  • there is the possibility of missing or mislabeled manufacturing data/information because the data is not always collected in digitized form and may be in the form of textual documents or handwritten notes.
  • Example embodiments address this technical challenge by using a collection of prior stored manufacturing knowledge graphs to learn latent representations present within the stored graphs by traversing at least a portion of the stored graphs using a graph traversal technique such as a random walk traversal.
  • the latent representations constitute an encoding of the stored manufacturing knowledge graphs such that a collection of encoded stored manufacturing knowledge graphs are obtained through learning of the latent representations.
  • A new manufacturing knowledge graph is generated from manufacturing data and/or manufacturing documents. A graph traversal of at least a portion of the new manufacturing knowledge graph is performed to determine latent representation(s) of the new manufacturing knowledge graph that encode the new manufacturing knowledge graph as an encoded new manufacturing knowledge graph.
  • the encoded new manufacturing knowledge graph can be compared to the encoded stored manufacturing knowledge graphs to identify an encoded stored manufacturing knowledge graph that is most similar to the encoded new manufacturing knowledge graph.
  • a structure of the stored manufacturing knowledge graph corresponding to the matching encoded stored manufacturing knowledge graph can then be used to identify missing and/or mislabeled information in the new manufacturing graph.
  • the missing information can be added to the new manufacturing knowledge graph and/or the mislabeled information corrected to obtain a more accurate and complete new manufacturing graph.
  • this new manufacturing knowledge graph can then be stored for potential future matching.
  • the graph traversal and encoded manufacturing graph representation methods according to example embodiments of the invention can be tailored to applications such as a manufacturing process plan that have a low labeled dataset.
  • the terms manufacturing knowledge graph and manufacturing graph may be used interchangeably herein.
  • any given operation of the method 200 may be performed by one or more of the program modules or the like depicted in FIG. 1 , whose operation will be described in more detail later in this disclosure.
  • These program modules may be implemented in any combination of hardware, software, and/or firmware.
  • one or more of these program modules may be implemented, at least in part, as software and/or firmware modules that include computer-executable instructions that when executed by a processing circuit cause one or more operations to be performed.
  • a system or device described herein as being configured to implement example embodiments may include one or more processing circuits, each of which may include one or more processing units or nodes.
  • FIG. 2 is a process flow diagram of an illustrative method 200 for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments.
  • FIGS. 1 and 2 will be described in conjunction with one another hereinafter.
  • the new manufacturing graph 106 may include a set of nodes and a set of edges connecting the nodes.
  • The new manufacturing graph 106 may be representative of a manufacturing process, where each node of the graph 106 may represent a component, sub-component, process, or sub-process of the manufacturing process and each edge may represent a process set, a constraint on the manufacturing process, or the like.
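  • By way of non-limiting illustration, the node-and-edge structure described above might be held in memory as in the following minimal sketch. The class name `ManufacturingGraph`, its fields, and the example node and edge labels are hypothetical and are not taken from the disclosure; edges are treated as undirected purely for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class ManufacturingGraph:
    """Toy manufacturing graph: labeled nodes plus labeled, undirected edges."""

    # node id -> node label (None marks a missing label)
    labels: dict = field(default_factory=dict)
    # node id -> {neighbor id: edge label}
    adjacency: dict = field(default_factory=dict)

    def add_node(self, node_id, label=None):
        self.labels[node_id] = label
        self.adjacency.setdefault(node_id, {})

    def add_edge(self, a, b, edge_label=None):
        # Create endpoints on demand so edges can be added in any order.
        for node_id in (a, b):
            if node_id not in self.labels:
                self.add_node(node_id)
        self.adjacency[a][b] = edge_label
        self.adjacency[b][a] = edge_label


# Hypothetical example: one component and two process steps.
graph = ManufacturingGraph()
graph.add_node("housing", label="component")
graph.add_node("drill_holes", label="process")
graph.add_node("deburr")  # label not yet known
graph.add_edge("housing", "drill_holes", "requires")
graph.add_edge("drill_holes", "deburr", "precedes")
```

  • A node whose label is `None` stands in for the missing information that later steps attempt to recover.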
  • computer-executable instructions of one or more graph traversal modules 108 may be executed to perform a graph traversal 110 of the new manufacturing graph 106 using a graph traversal methodology.
  • the graph traversal methodology may be a random walk graph traversal.
  • only a portion of the new manufacturing graph 106 may be traversed such as a subgraph of the graph 106.
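  • A random walk traversal of the kind mentioned above can be sketched as follows. This is an illustrative sketch only: the function name `random_walks`, its parameters (`walk_length`, `walks_per_node`, `seed`), and the example adjacency map are assumptions, and the disclosure does not specify walk length, number of walks, or sampling strategy. Traversing only a subgraph would amount to passing an adjacency map restricted to that subgraph.

```python
import random


def random_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Return a list of uniform random walks, each a list of node ids."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adjacency[walk[-1]])
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical adjacency map: node id -> list of neighboring node ids.
adjacency = {
    "housing": ["drill_holes"],
    "drill_holes": ["housing", "deburr"],
    "deburr": ["drill_holes", "inspect"],
    "inspect": ["deburr"],
}
walks = random_walks(adjacency)
```

  • Because each walk only consults the neighbors of the current node, walks starting at different nodes can be generated independently of one another.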
  • computer-executable instructions of one or more latent representation modules 112 may be executed to determine a first set 114 of one or more latent representations associated with the new manufacturing graph 106 based at least in part on the traversal 110.
  • the set of latent representation(s) 114 may be indicative of underlying relationships present in the graph 106.
  • The set of latent representation(s) 114 may reveal relationships between words in manufacturing documents 102 by treating the traversal 110 as being equivalent to sentences formed from the words of the manufacturing documents 102.
  • the set of latent representation(s) 114 may embody or otherwise represent an encoding of the new manufacturing knowledge graph 106 as an encoded new manufacturing knowledge graph 118.
  • The encoded new manufacturing graph 118 may be a vectorized representation of the new manufacturing graph 106.
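  • The idea of treating walks as sentences and turning them into a vector can be illustrated with a deliberately simple stand-in. The sketch below does not learn latent representations; it merely counts how often pairs of node labels co-occur within a sliding window over each walk and normalizes the counts into a fixed-length vector. The function name `encode_walks`, the `window` parameter, the `UNKNOWN` placeholder, and the example data are all assumptions made for illustration, and a learned embedding model could be substituted for the counting step.

```python
from collections import Counter
from itertools import combinations_with_replacement

UNKNOWN = "unknown"  # placeholder used for nodes whose label is missing


def encode_walks(walks, labels, vocabulary, window=2):
    """Encode walks as a normalized vector of label-pair co-occurrence counts."""
    vocab = sorted(set(vocabulary) | {UNKNOWN})
    # One vector position per unordered pair of labels, in a fixed order.
    pairs = list(combinations_with_replacement(vocab, 2))
    counts = Counter()
    for walk in walks:
        seq = [labels.get(node) or UNKNOWN for node in walk]
        for i, left in enumerate(seq):
            # Pair each label with the labels that follow it within the window.
            for right in seq[i + 1 : i + 1 + window]:
                counts[tuple(sorted((left, right)))] += 1
    total = sum(counts[p] for p in pairs) or 1
    return [counts[p] / total for p in pairs]


# Hypothetical walk and labels.
walks = [["housing", "drill_holes", "deburr"]]
labels = {"housing": "component", "drill_holes": "process", "deburr": "process"}
vector = encode_walks(walks, labels, vocabulary=["component", "process"])
```

  • Using a shared label vocabulary keeps the vectors of different graphs in the same space, which is what makes a later comparison between them meaningful.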
  • computer-executable instructions of the graph traversal module(s) 108 may be executed to perform respective traversals 124 of each of a plurality of stored manufacturing graphs 120 stored, for example, in one or more datastores 122.
  • computer-executable instructions of the latent representation generation module(s) 112 may be executed to determine a respective set of latent representation(s) associated with each of the stored manufacturing graphs 120 based at least in part on a corresponding traversal 124.
  • The sets of latent representation(s) 126 for the stored manufacturing graphs 120 may embody or otherwise represent an encoding of the stored manufacturing knowledge graphs 120 as a set of encoded stored manufacturing graphs 128.
  • computer-executable instructions of one or more matching modules 116 may be executed to perform a comparison of the encoded new manufacturing graph 118 to the set of encoded stored manufacturing graphs 128.
  • Any suitable comparison algorithm (e.g., a similarity algorithm) may be used to perform the comparison.
  • the result of the comparison at block 212 may be an identification of a stored manufacturing graph 130 that is most similar to the new manufacturing graph 106 based at least in part on the encoded latent representations contained in the encoded new manufacturing graph 118 and encoded stored manufacturing graph 128 that corresponds to the matching stored manufacturing graph 130.
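  • As one possible comparison, the sketch below ranks stored encodings by cosine similarity to the new encoding and returns the closest one. Cosine similarity is an assumption chosen for illustration; the disclosure leaves the comparison algorithm open. The function names, graph identifiers, and vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(encoded_new, encoded_stored):
    """Return (graph_id, score) of the stored encoding closest to the new one."""
    return max(
        ((gid, cosine_similarity(encoded_new, vec)) for gid, vec in encoded_stored.items()),
        key=lambda item: item[1],
    )


# Hypothetical encodings of two stored graphs and one new graph.
encoded_stored = {
    "gearbox_plan": [0.0, 0.7, 0.3],
    "bracket_plan": [0.6, 0.1, 0.3],
}
best_id, score = most_similar([0.1, 0.6, 0.3], encoded_stored)
```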
  • computer-executable instructions of one or more graph correction modules 132 may be executed to detect and correct mislabeled information and/or generate missing labels in the new manufacturing graph 106 based at least in part on a structure of the most similar stored manufacturing graph 130.
  • the result of the operation at block 214 may be an updated new manufacturing graph 134 that contains information in the form of new nodes and edges that were missing from the new manufacturing graph 106.
  • the updated new manufacturing graph 134 may then be stored in the datastore(s) 122 as a stored manufacturing graph 120 for potential future matching.
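  • The correction step can be sketched under a strong simplifying assumption: that nodes in the new graph and in the matching stored graph share identifiers, so that they can be aligned directly. The disclosure does not state how nodes are aligned, and the function name `correct_graph`, the report keys, and the example data are hypothetical. Edges are represented here as (source, target) tuples.

```python
def correct_graph(new_labels, new_edges, ref_labels, ref_edges):
    """Return corrected copies of labels and edges plus a report of the changes."""
    labels = dict(new_labels)
    edges = set(new_edges)
    report = {"filled": [], "relabeled": [], "added_nodes": [], "added_edges": []}
    for node, ref_label in ref_labels.items():
        if node not in labels:
            labels[node] = ref_label  # node missing from the new graph
            report["added_nodes"].append(node)
        elif labels[node] is None:
            labels[node] = ref_label  # label missing: generate it
            report["filled"].append(node)
        elif labels[node] != ref_label:
            labels[node] = ref_label  # label disagrees: treat as mislabeled
            report["relabeled"].append(node)
    for edge in sorted(set(ref_edges) - edges):
        edges.add(edge)  # edge present only in the reference graph
        report["added_edges"].append(edge)
    return labels, edges, report


# Hypothetical new graph and matching stored (reference) graph.
new_labels = {"housing": "component", "drill_holes": "component", "deburr": None}
new_edges = {("housing", "drill_holes")}
ref_labels = {
    "housing": "component",
    "drill_holes": "process",
    "deburr": "process",
    "inspect": "process",
}
ref_edges = {("housing", "drill_holes"), ("drill_holes", "deburr"), ("deburr", "inspect")}

labels, edges, report = correct_graph(new_labels, new_edges, ref_labels, ref_edges)
```

  • Returning a report alongside the corrected graph keeps each change reviewable before the updated graph is stored for future matching.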
  • Example embodiments of the invention provide techniques for latent representation learning for manufacturing domain-specific graphical data structures that are capable of addressing the challenge of sparsity of data in the manufacturing domain.
  • techniques disclosed herein maintain performance even in the presence of a sparse amount of training data.
  • example embodiments of the invention provide a number of technical benefits over conventional approaches including unsupervised representation learning techniques that capture graphical structures independent of the label distribution of nodes; a learned model with low dimensionality that provides a higher generalizability and faster convergence; training from a streaming dataset that allows for decentralized training; and use of graph traversal techniques (e.g., random graph walks) that are localized, and thus, allow for distributed computing and improved algorithm scalability.
  • FIG. 3 is a schematic diagram of an illustrative computing configuration for implementing one or more example embodiments of the invention.
  • FIG. 3 depicts one or more manufacturing graph anomaly detection servers 302 configured to implement one or more example embodiments. While the server(s) 302 may be described herein in the singular, it should be appreciated that multiple servers 302 may be provided, and functionality described herein may be distributed across multiple such servers 302.
  • the manufacturing graph anomaly detection server 302 may include one or more processors (processor(s)) 304, one or more memory devices 306 (generically referred to herein as memory 306), one or more input/output (“I/O”) interface(s) 308, one or more network interfaces 310, and data storage 314.
  • the manufacturing graph anomaly detection server 302 may further include one or more buses 312 that functionally couple various components of the manufacturing graph anomaly detection server 302.
  • the bus(es) 312 may include at least one of a system bus, a memory bus, an address bus, or a message bus, and may permit the exchange of information (e.g., data (including computer-executable code), signaling, etc.) between various components of the manufacturing graph anomaly detection server 302.
  • the bus(es) 312 may include, without limitation, a memory bus or a memory controller, a peripheral bus, an accelerated graphics port, and so forth.
  • the bus(es) 312 may be associated with any suitable bus architecture including, without limitation, an Industry Standard Architecture (ISA), a Micro Channel Architecture (MCA), an Enhanced ISA (EISA), a Video Electronics Standards Association (VESA) architecture, an Accelerated Graphics Port (AGP) architecture, a Peripheral Component Interconnects (PCI) architecture, a PCI-Express architecture, a Personal Computer Memory Card International Association (PCMCIA) architecture, a Universal Serial Bus (USB) architecture, and so forth.
  • The memory 306 may include volatile memory (memory that maintains its state when supplied with power) such as random access memory (RAM) and/or non-volatile memory (memory that maintains its state even when not supplied with power) such as read-only memory (ROM), flash memory, ferroelectric RAM (FRAM), and so forth.
  • Persistent data storage may include non-volatile memory.
  • volatile memory may enable faster read/write access than non-volatile memory.
  • Certain types of non-volatile memory (e.g., FRAM) may enable faster read/write access than certain types of volatile memory.
  • the memory 306 may include multiple different types of memory such as various types of static random access memory (SRAM), various types of dynamic random access memory (DRAM), various types of unalterable ROM, and/or writeable variants of ROM such as electrically erasable programmable read-only memory (EEPROM), flash memory, and so forth.
  • the memory 306 may include main memory as well as various forms of cache memory such as instruction cache(s), data cache(s), translation lookaside buffer(s) (TLBs), and so forth.
  • Cache memory such as a data cache may be a multi-level cache organized as a hierarchy of one or more cache levels (L1, L2, etc.).
  • the data storage 314 may include removable storage and/or non-removable storage including, but not limited to, magnetic storage, optical disk storage, and/or tape storage.
  • the data storage 314 may provide non-volatile storage of computer-executable instructions and other data.
  • the memory 306 and the data storage 314, removable and/or non-removable, are examples of computer-readable storage media (CRSM) as that term is used herein.
  • the data storage 314 may store computer-executable code, instructions, or the like that may be loadable into the memory 306 and executable by the processor(s) 304 to cause the processor(s) 304 to perform or initiate various operations.
  • the data storage 314 may additionally store data that may be copied to memory 306 for use by the processor(s) 304 during the execution of the computer-executable instructions.
  • output data generated as a result of execution of the computer-executable instructions by the processor(s) 304 may be stored initially in memory 306 and may ultimately be copied to data storage 314 for non-volatile storage.
  • the data storage 314 may store one or more operating systems (O/S) 316; one or more database management systems (DBMS) 318 configured to access the memory 306 and/or one or more datastores 330; and one or more program modules, applications, engines, managers, computer-executable code, scripts, or the like such as, for example, one or more graph generation modules 320; one or more graph traversal modules 322; one or more latent representation determination modules 324; one or more graph matching modules 326; and one or more graph correction modules 328.
  • Any of the components depicted as being stored in data storage 314 may include any combination of software, firmware, and/or hardware.
  • the software and/or firmware may include computer-executable instructions (e.g., computer-executable program code) that may be loaded into the memory 306 for execution by one or more of the processor(s) 304 to perform any of the corresponding operations described earlier.
  • The data storage 314 may further store various types of data utilized by components of the manufacturing graph anomaly detection server 302 (e.g., data stored in the datastore(s) 330). Any data stored in the data storage 314 may be loaded into the memory 306 for use by the processor(s) 304 in executing computer-executable instructions. In addition, any data stored in the data storage 314 may potentially be stored in the external datastore(s) 330 and may be accessed via the DBMS 318 and loaded in the memory 306 for use by the processor(s) 304 in executing computer-executable instructions.
  • the processor(s) 304 may be configured to access the memory 306 and execute computer-executable instructions loaded therein.
  • the processor(s) 304 may be configured to execute computer-executable instructions of the various program modules, applications, engines, managers, or the like of the manufacturing graph anomaly detection server 302 to cause or facilitate various operations to be performed in accordance with one or more embodiments of the disclosure.
  • the processor(s) 304 may include any suitable processing unit capable of accepting data as input, processing the input data in accordance with stored computer-executable instructions, and generating output data.
  • the processor(s) 304 may include any type of suitable processing unit including, but not limited to, a central processing unit, a microprocessor, a Reduced Instruction Set Computer (RISC) microprocessor, a Complex Instruction Set Computer (CISC) microprocessor, a microcontroller, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a System-on-a-Chip (SoC), a digital signal processor (DSP), and so forth. Further, the processor(s) 304 may have any suitable microarchitecture design that includes any number of constituent components such as, for example, registers, multiplexers, arithmetic logic units, cache controllers for controlling read/write operations to cache memory, branch predictors, or the like.
  • microarchitecture design of the processor(s) 304 may be capable of supporting any of a variety of instruction sets.
  • the O/S 316 may be loaded from the data storage 314 into the memory 306 and may provide an interface between other application software executing on the manufacturing graph anomaly detection server 302 and hardware resources of the manufacturing graph anomaly detection server 302. More specifically, the O/S 316 may include a set of computer-executable instructions for managing hardware resources of the manufacturing graph anomaly detection server 302 and for providing common services to other application programs. In certain example embodiments, the O/S 316 may include or otherwise control the execution of one or more of the program modules, engines, managers, or the like depicted as being stored in the data storage 314. The O/S 316 may include any operating system now known or which may be developed in the future including, but not limited to, any server operating system, any mainframe operating system, or any other proprietary or non-proprietary operating system.
  • the DBMS 318 may be loaded into the memory 306 and may support functionality for accessing, retrieving, storing, and/or manipulating data stored in the memory 306, data stored in the data storage 314, and/or data stored in external datastore(s) 330.
  • the DBMS 318 may use any of a variety of database models (e.g., relational model, object model, etc.) and may support any of a variety of query languages.
  • the DBMS 318 may access data represented in one or more data schemas and stored in any suitable data repository.
  • data stored in the datastore(s) 330 may include, for example, manufacturing graph data; latent representation data; encoded manufacturing graph data; and so forth.
  • External datastore(s) 330 that may be accessible by the manufacturing graph anomaly detection server 302 via the DBMS 318 may include, but are not limited to, databases (e.g., relational, object-oriented, etc.), file systems, flat files, distributed datastores in which data is stored on more than one node of a computer network, peer-to-peer network datastores, or the like.
  • The input/output (I/O) interface(s) 308 may facilitate the receipt of input information by the manufacturing graph anomaly detection server 302 from one or more I/O devices as well as the output of information from the manufacturing graph anomaly detection server 302 to the one or more I/O devices.
  • the I/O devices may include any of a variety of components such as a display or display screen having a touch surface or touchscreen; an audio output device for producing sound, such as a speaker; an audio capture device, such as a microphone; an image and/or video capture device, such as a camera; a haptic unit; and so forth. Any of these components may be integrated into the manufacturing graph anomaly detection server 302 or may be separate.
  • the I/O devices may further include, for example, any number of peripheral devices such as data storage devices, printing devices, and so forth.
  • the I/O interface(s) 308 may also include an interface for an external peripheral device connection such as universal serial bus (USB), FireWire, Thunderbolt, Ethernet port or other connection protocol that may connect to one or more networks.
  • The I/O interface(s) 308 may also include a connection to one or more antennas to connect to one or more networks via a wireless local area network (WLAN) (such as Wi-Fi) radio, Bluetooth, and/or a wireless network radio, such as a radio capable of communication with a wireless communication network such as a Long Term Evolution (LTE) network, WiMAX network, 3G network, etc.
  • the manufacturing graph anomaly detection server 302 may further include one or more network interfaces 310 via which the manufacturing graph anomaly detection server 302 may communicate with one or more other devices or systems via one or more networks.
  • Network(s) may include, but are not limited to, any one or more different types of communications networks such as, for example, cable networks, public networks (e.g., the Internet), private networks (e.g., frame-relay networks), wireless networks, cellular networks, telephone networks (e.g., a public switched telephone network), or any other suitable private or public packet-switched or circuit-switched networks.
  • Such network(s) may have any suitable communication range associated therewith and may include, for example, global networks (e.g., the Internet), metropolitan area networks (MANs), wide area networks (WANs), local area networks (LANs), or personal area networks (PANs).
  • network(s) may include communication links and associated networking devices (e.g., link-layer switches, routers, etc.) for transmitting network traffic over any suitable type of medium including, but not limited to, coaxial cable, twisted-pair wire (e.g., twisted-pair copper wire), optical fiber, a hybrid fiber-coaxial (HFC) medium, a microwave medium, a radio frequency communication medium, a satellite communication medium, or any combination thereof.
  • The program modules/engines depicted in FIG. 3 as being stored in the data storage 314 are merely illustrative and not exhaustive, and processing described as being supported by any particular module may alternatively be distributed across multiple modules, engines, or the like, or performed by a different module, engine, or the like.
  • Various program module(s), script(s), plug-in(s), Application Programming Interface(s) (API(s)), or any other suitable computer-executable code hosted locally on the manufacturing graph anomaly detection server 302 and/or other computing devices accessible via one or more networks may be provided to support functionality provided by the modules depicted in FIG. 3 and/or additional or alternate functionality.
  • functionality may be modularized in any suitable manner such that processing described as being performed by a particular module may be performed by a collection of any number of program modules, or functionality described as being supported by any particular module may be supported, at least in part, by another module.
  • program modules that support the functionality described herein may be executable across any number of cluster members in accordance with any suitable computing model such as, for example, a client-server model, a peer-to-peer model, and so forth.
  • any of the functionality described as being supported by any of the modules depicted in FIG. 3 may be implemented, at least partially, in hardware and/or firmware across any number of devices.
  • the manufacturing graph anomaly detection server 302 may include alternate and/or additional hardware, software, or firmware components beyond those described or depicted without departing from the scope of the disclosure. More particularly, it should be appreciated that software, firmware, or hardware components depicted as forming part of the manufacturing graph anomaly detection server 302 are merely illustrative and that some components may not be present or additional components may be provided in various embodiments. While various illustrative modules have been depicted and described as software modules stored in data storage 314, it should be appreciated that functionality described as being supported by the modules may be enabled by any combination of hardware, software, and/or firmware.
  • each of the above-mentioned modules may, in various embodiments, represent a logical partitioning of supported functionality. This logical partitioning is depicted for ease of explanation of the functionality and may not be representative of the structure of software, hardware, and/or firmware for implementing the functionality. Accordingly, it should be appreciated that functionality described as being provided by a particular module may, in various embodiments, be provided at least in part by one or more other modules. Further, one or more depicted modules may not be present in certain embodiments, while in other embodiments, additional program modules and/or engines not depicted may be present and may support at least a portion of the described functionality and/or additional functionality.
  • One or more operations of the method 200 may be performed by a manufacturing graph anomaly detection server 302 having the illustrative configuration depicted in FIG. 3, or more specifically, by one or more program modules, engines, applications, or the like executable on such a device. It should be appreciated, however, that such operations may be implemented in connection with numerous other device configurations.
  • the present disclosure may be a system, a method, and/or a computer program product.
  • the computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
  • the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages.
  • the computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
  • the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block may occur out of the order noted in the figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • General Factory Administration (AREA)

Abstract

Systems, methods, and computer-readable media are described for identifying missing or mislabeled information in a manufacturing knowledge graph. Example embodiments address the technical challenge of sparsity of information in the manufacturing domain by using a collection of prior stored manufacturing knowledge graphs to learn latent representations present within the stored graphs through localized traversals of the stored graphs using a graph traversal technique such as a random walk traversal. The latent representations of the stored manufacturing knowledge graphs correspond to an encoding of the stored manufacturing knowledge graphs as a collection of encoded manufacturing knowledge graphs, which are compared to an encoded new manufacturing knowledge graph to identify a stored manufacturing knowledge graph that is most structurally similar to the new manufacturing knowledge graph. This most similar manufacturing knowledge graph can then be used to identify missing or mislabeled information in the new manufacturing knowledge graph.

Description

MISSING LABEL CLASSIFICATION AND ANOMALY DETECTION FOR SPARSELY POPULATED MANUFACTURING KNOWLEDGE GRAPHS
CROSS-REFERENCE TO RELATED APPLICATION(S)
[01] This application claims the benefit of U.S. Application No. 62/590,693, filed on November 27, 2017, the content of which is incorporated by reference herein in its entirety.
BACKGROUND
[02] Knowledge graphs are widely used in social media networks and natural language processing. In these areas, data is abundant, easy to obtain, and the relationships between entities are well-defined. Knowledge graphs in manufacturing can be useful for process planning, task scheduling, diagnostics, product design, capability analysis, resource allocation, analyzing production scenarios, and so forth.
SUMMARY
[03] In one or more example embodiments, a computer-implemented method for identifying missing or mislabeled information in a manufacturing knowledge graph is disclosed. The method includes generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents and performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph. The method further includes determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph. The method additionally includes identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph and detecting and correcting the mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
[04] In one or more other example embodiments, a system for identifying missing or mislabeled information in a manufacturing knowledge graph is disclosed. The system includes at least one memory storing computer-executable instructions and at least one processor configured to access the at least one memory and execute the computer-executable instructions to perform a set of operations. The operations include generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents and performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph. The operations further include determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph. The operations additionally include identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph and detecting and correcting the mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
[05] In one or more other example embodiments, a computer program product for identifying missing or mislabeled information in a manufacturing knowledge graph is disclosed. The computer program product includes a non-transitory storage medium readable by a processing circuit, the storage medium storing instructions executable by the processing circuit to cause a method to be performed. The method includes generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents and performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph. The method further includes determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph. The method additionally includes identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph and detecting and correcting mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
BRIEF DESCRIPTION OF THE DRAWINGS
[06] The detailed description is set forth with reference to the accompanying drawings. The drawings are provided for purposes of illustration only and merely depict example embodiments of the disclosure. The drawings are provided to facilitate understanding of the disclosure and shall not be deemed to limit the breadth, scope, or applicability of the disclosure. In the drawings, the left-most digit(s) of a reference numeral identifies the drawing in which the reference numeral first appears. The use of the same reference numerals indicates similar, but not necessarily the same or identical components. However, different reference numerals may be used to identify similar components as well. Various embodiments may utilize elements or components other than those illustrated in the drawings, and some elements and/or components may not be present in various embodiments. The use of singular terminology to describe a component or element may, depending on the context, encompass a plural number of such components or elements and vice versa.
[07] FIG. 1 is a hybrid block/data flow diagram schematically depicting a process for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments.
[08] FIG. 2 is a process flow diagram of an illustrative method for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments.
[09] FIG. 3 is a schematic diagram of an illustrative computing configuration for implementing one or more example embodiments.
DETAILED DESCRIPTION
[010] Example embodiments of the invention relate to, among other things, systems, methods, computer-readable media, techniques, and methodologies for identifying missing or mislabeled information in a manufacturing knowledge graph. Knowledge graphs in the manufacturing domain often face the challenge of sparsity of available information for constructing the knowledge graphs. In particular, there is the possibility of missing or mislabeled manufacturing data/information because the data is not always collected in digitized form and may be in the form of textual documents or handwritten notes.
[011] Example embodiments address this technical challenge by using a collection of prior stored manufacturing knowledge graphs to learn latent representations present within the stored graphs by traversing at least a portion of the stored graphs using a graph traversal technique such as a random walk traversal. In example embodiments, the latent representations constitute an encoding of the stored manufacturing knowledge graphs such that a collection of encoded stored manufacturing knowledge graphs are obtained through learning of the latent representations. In example embodiments, a new manufacturing knowledge graph is generated from manufacturing data and/or manufacturing documents. A graph traversal of at least a portion of the new manufacturing knowledge graph is performed to determine latent representation(s) of the new manufacturing knowledge graph that encode the new manufacturing knowledge graph as an encoded new manufacturing knowledge graph.
[012] In example embodiments, the encoded new manufacturing knowledge graph can be compared to the encoded stored manufacturing knowledge graphs to identify an encoded stored manufacturing knowledge graph that is most similar to the encoded new manufacturing knowledge graph. A structure of the stored manufacturing knowledge graph corresponding to the matching encoded stored manufacturing knowledge graph can then be used to identify missing and/or mislabeled information in the new manufacturing graph. The missing information can be added to the new manufacturing knowledge graph and/or the mislabeled information corrected to obtain a more accurate and complete new manufacturing graph. In example embodiments, this new manufacturing knowledge graph can then be stored for potential future matching. The graph traversal and encoded manufacturing graph representation methods according to example embodiments of the invention can be tailored to applications such as a manufacturing process plan that have a low labeled dataset. In example embodiments, the terms manufacturing knowledge graph and manufacturing graph may be used interchangeably herein.
[013] An illustrative method in accordance with example embodiments of the invention will now be described. It should be noted that any given operation of the method 200 may be performed by one or more of the program modules or the like depicted in FIG. 1, whose operation will be described in more detail later in this disclosure. These program modules may be implemented in any combination of hardware, software, and/or firmware. In certain example embodiments, one or more of these program modules may be implemented, at least in part, as software and/or firmware modules that include computer-executable instructions that when executed by a processing circuit cause one or more operations to be performed. A system or device described herein as being configured to implement example embodiments may include one or more processing circuits, each of which may include one or more processing units or nodes. Computer-executable instructions may include computer-executable program code that when executed by a processing unit may cause input data contained in or referenced by the computer-executable program code to be accessed and processed to yield output data.
[014] FIG. 1 is a hybrid block/data flow diagram schematically depicting a process for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments. FIG. 2 is a process flow diagram of an illustrative method 200 for identifying missing or mislabeled information in a manufacturing knowledge graph in accordance with example embodiments. FIGS. 1 and 2 will be described in conjunction with one another hereinafter.
[015] Referring now to FIG. 2 in conjunction with FIG. 1, at block 202 of the method 200, computer-executable instructions of one or more graph generation modules 104 may be executed to generate a new manufacturing graph 106 from manufacturing data and/or manufacturing documents 102. The new manufacturing graph 106 may include a set of nodes and a set of edges connecting the nodes. In certain example embodiments, the new manufacturing graph 106 may be representative of a manufacturing process, where each node of the graph 106 may represent a component, sub-component, process, or sub-process of the manufacturing process and each edge may represent a process set, a constraint on the manufacturing process, or the like.
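The node-and-edge structure described above can be sketched as a small in-memory data structure. This is a minimal illustration only: the class name `ManufacturingGraph`, its methods, and the example node and edge labels are hypothetical and do not come from the disclosure, and undirected edges are an assumption made for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class ManufacturingGraph:
    """Toy manufacturing graph: labeled nodes plus labeled, undirected edges."""

    # node id -> label (None marks a label that is missing)
    node_labels: dict = field(default_factory=dict)
    # node id -> {neighbor id: edge label}
    adjacency: dict = field(default_factory=dict)

    def add_node(self, node_id, label=None):
        self.node_labels[node_id] = label
        self.adjacency.setdefault(node_id, {})

    def add_edge(self, source, target, label=None):
        # Create endpoints on demand without overwriting labels already set.
        for node_id in (source, target):
            if node_id not in self.node_labels:
                self.add_node(node_id)
        self.adjacency[source][target] = label
        self.adjacency[target][source] = label


# Hypothetical example data: a component, a process, and an unlabeled node.
graph = ManufacturingGraph()
graph.add_node("housing", label="component")
graph.add_node("drill_holes", label="process")
graph.add_node("deburr")  # label could not be recovered from the documents
graph.add_edge("housing", "drill_holes", label="constraint")
graph.add_edge("drill_holes", "deburr", label="precedes")
```

Representing a missing label as `None` keeps sparsely labeled nodes in the graph so that their structure can still be traversed.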
[016] At block 204 of the method 200, computer-executable instructions of one or more graph traversal modules 108 may be executed to perform a graph traversal 110 of the new manufacturing graph 106 using a graph traversal methodology. In example embodiments, the graph traversal methodology may be a random walk graph traversal. In example embodiments, only a portion of the new manufacturing graph 106 may be traversed such as a subgraph of the graph 106.
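A random walk traversal of the kind mentioned above can be sketched as follows. The function name `random_walks`, its parameters (`walk_length`, `walks_per_node`, `seed`), and the example adjacency map are illustrative assumptions; the disclosure does not specify walk length, the number of walks, or how start nodes are chosen.

```python
import random


def random_walks(adjacency, walk_length=5, walks_per_node=2, seed=0):
    """Return a list of walks, each a list of node ids.

    `adjacency` maps each node id to an iterable of neighbor ids.
    """
    rng = random.Random(seed)  # seeded so that walks are reproducible
    walks = []
    for start in sorted(adjacency):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = sorted(adjacency[walk[-1]])
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Hypothetical three-node graph.
adjacency = {
    "housing": ["drill_holes"],
    "drill_holes": ["housing", "deburr"],
    "deburr": ["drill_holes"],
}
walks = random_walks(adjacency, walk_length=4, walks_per_node=2)
```

Because each walk only inspects the neighbors of its current node, walks from different start nodes could in principle be generated independently, which is consistent with traversing only a subgraph.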
[017] At block 206 of the method 200, computer-executable instructions of one or more latent representation modules 112 may be executed to determine a first set 114 of one or more latent representations associated with the new manufacturing graph 106 based at least in part on the traversal 110. The set of latent representation(s) 114 may be indicative of underlying relationships present in the graph 106. In example embodiments, the set of latent representation(s) 114 may reveal relationships between words in manufacturing documents 102 by treating the traversal 110 as being equivalent to sentences formed from the words of the manufacturing documents 102. The set of latent representation(s) 114 may embody or otherwise represent an encoding of the new manufacturing knowledge graph 106 as an encoded new manufacturing knowledge graph 118. For example, the encoded new manufacturing graph 118 may be a vectorized representation of the new manufacturing graph 106.
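To make the idea of a vectorized encoding concrete, the sketch below treats each walk as a "sentence" of node labels and counts adjacent label pairs. This count-based encoding is a deliberately simple stand-in chosen for illustration; it is not the learned latent representation of the disclosure, which does not name a specific model. The function name `encode_walks`, the fixed `vocabulary`, and the example data are all hypothetical.

```python
from collections import Counter
from math import sqrt


def encode_walks(walks, node_labels, vocabulary):
    """Encode walks as an L2-normalized vector of label-bigram counts.

    `vocabulary` is an ordered list of (label, label) pairs that fixes the
    layout of the output vector, so that different graphs are comparable.
    """
    counts = Counter()
    for walk in walks:
        labels = [node_labels.get(node) for node in walk]
        for left, right in zip(labels, labels[1:]):
            counts[(left, right)] += 1
    vector = [float(counts[pair]) for pair in vocabulary]
    norm = sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector


# Hypothetical labels and walks.
node_labels = {"housing": "component", "drill_holes": "process", "deburr": "process"}
walks = [["housing", "drill_holes", "deburr"], ["deburr", "drill_holes", "housing"]]
vocabulary = [
    ("component", "process"),
    ("process", "process"),
    ("process", "component"),
]
encoded = encode_walks(walks, node_labels, vocabulary)
```

A learned embedding would typically replace the counting step, but the output plays the same role: a fixed-length vector that summarizes local graph structure.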
[018] At block 208 of the method 200, computer-executable instructions of the graph traversal module(s) 108 may be executed to perform respective traversals 124 of each of a plurality of stored manufacturing graphs 120 stored, for example, in one or more datastores 122. At block 210 of the method 200, computer-executable instructions of the latent representation generation module(s) 112 may be executed to determine a respective set of latent representation(s) associated with each of the stored manufacturing graphs 120 based at least in part on a corresponding traversal 124. The sets of latent representation(s) 126 for the stored manufacturing graphs 120 may embody or otherwise represent an encoding of the stored manufacturing knowledge graphs 120 as a set of encoded stored manufacturing graphs 128.
[019] At block 212 of the method 200, computer-executable instructions of one or more matching modules 116 may be executed to perform a comparison of the encoded new manufacturing graph 118 to the set of encoded stored manufacturing graphs 128. In accordance with example embodiments of the invention, any suitable comparison algorithm (e.g., similarity algorithm) may be employed to perform the comparison at block 212. The result of the comparison at block 212 may be an identification of a stored manufacturing graph 130 that is most similar to the new manufacturing graph 106 based at least in part on the encoded latent representations contained in the encoded new manufacturing graph 118 and in the encoded stored manufacturing graph 128 that corresponds to the matching stored manufacturing graph 130.
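The comparison step can be illustrated with cosine similarity, which is one possible choice of similarity algorithm and an assumption here, since the disclosure leaves the algorithm open. The function names, graph identifiers, and vectors below are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(encoded_new, encoded_stored):
    """Return (graph_id, score) for the stored encoding closest to `encoded_new`.

    `encoded_stored` maps a stored graph id to its encoding and must not be empty.
    """
    scored = (
        (graph_id, cosine_similarity(encoded_new, vector))
        for graph_id, vector in encoded_stored.items()
    )
    return max(scored, key=lambda item: item[1])


# Hypothetical encodings of three stored graphs and one new graph.
encoded_stored = {
    "gearbox_plan": [0.9, 0.1, 0.0],
    "housing_plan": [0.4, 0.8, 0.4],
    "weld_plan": [0.0, 0.2, 0.9],
}
best_id, best_score = most_similar([0.41, 0.82, 0.41], encoded_stored)
```

In this example the new encoding points in the same direction as the `housing_plan` encoding, so that stored graph is selected as the match.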
[020] Then, at block 214 of the method 200, computer-executable instructions of one or more graph correction modules 132 may be executed to detect and correct mislabeled information and/or generate missing labels in the new manufacturing graph 106 based at least in part on a structure of the most similar stored manufacturing graph 130. The result of the operation at block 214 may be an updated new manufacturing graph 134 that contains corrected labels as well as information, in the form of new nodes and edges, that was missing from the new manufacturing graph 106. The updated new manufacturing graph 134 may then be stored in the datastore(s) 122 as a stored manufacturing graph 120 for potential future matching.
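The correction step can be sketched under a strong simplifying assumption: that nodes in the new graph and in the matched stored graph have already been aligned and share identifiers. The disclosure does not describe how such an alignment is obtained, so the function `correct_labels` and its example data are purely illustrative.

```python
def correct_labels(new_labels, reference_labels):
    """Return (corrected_labels, changes) using a matched reference graph.

    Assumes node ids already correspond between the two graphs. Each change
    is a tuple (node_id, kind, old_label, new_label).
    """
    corrected = dict(new_labels)
    changes = []
    for node_id, reference in reference_labels.items():
        current = new_labels.get(node_id)
        if node_id not in new_labels:
            changes.append((node_id, "added", None, reference))
        elif current is None:
            changes.append((node_id, "filled", None, reference))
        elif current != reference:
            changes.append((node_id, "relabeled", current, reference))
        else:
            continue  # label already agrees with the reference
        corrected[node_id] = reference
    return corrected, changes


# Hypothetical labels: one mislabeled node, one missing label, one missing node.
new_labels = {"housing": "component", "drill_holes": "component", "deburr": None}
reference_labels = {
    "housing": "component",
    "drill_holes": "process",
    "deburr": "process",
    "inspect": "process",
}
corrected, changes = correct_labels(new_labels, reference_labels)
```

The returned `changes` list records what was altered, so a practical system could present the differences for review rather than trusting the reference graph unconditionally.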
[021] Example embodiments of the invention provide techniques for latent representation learning for manufacturing domain-specific graphical data structures that are capable of addressing the challenge of sparsity of data in the manufacturing domain. In particular, techniques disclosed herein maintain performance even in the presence of a sparse amount of training data. More specifically, example embodiments of the invention provide a number of technical benefits over conventional approaches including unsupervised representation learning techniques that capture graphical structures independent of the label distribution of nodes; a learned model with low dimensionality that provides a higher generalizability and faster convergence; training from a streaming dataset that allows for decentralized training; and use of graph traversal techniques (e.g., random graph walks) that are localized, and thus, allow for distributed computing and improved algorithm scalability.
[022] One or more illustrative embodiments of the disclosure are described herein. Such embodiments are merely illustrative of the scope of this disclosure and are not intended to be limiting in any way. Accordingly, variations, modifications, and equivalents of embodiments disclosed herein are also within the scope of this disclosure. For example, the data key generation process described herein in accordance with example embodiments can be expanded to use multiple data seeds to produce one set of unique and reproducible data for each data seed.
[023] FIG. 3 is a schematic diagram of an illustrative computing configuration for implementing one or more example embodiments of the invention. In particular, FIG. 3 depicts one or more manufacturing graph anomaly detection servers 302 configured to implement one or more example embodiments. While the server(s) 302 may be described herein in the singular, it should be appreciated that multiple servers 302 may be provided, and functionality described herein may be distributed across multiple such servers 302.
[024] In an illustrative configuration, the manufacturing graph anomaly detection server 302 may include one or more processors (processor(s)) 304, one or more memory devices 306 (generically referred to herein as memory 306), one or more input/output (“I/O”) interface(s) 308, one or more network interfaces 310, and data storage 314. The manufacturing graph anomaly detection server 302 may further include one or more buses 312 that functionally couple various components of the manufacturing graph anomaly detection server 302.
[025] The bus(es) 312 may include at least one of a system bus, a memory bus, an address bus, or a message bus, and may permit the exchange of information (e.g., data (including computer-executable code), signaling, etc.) between various components of the manufacturing graph anomaly detection server 302. The bus(es) 312 may include, without limitation, a memory bus or a memory controller, a peripheral bus, an accelerated graphics port, and so forth. The bus(es) 312 may be associated with any suitable bus architecture including, without limitation, an Industry Standard Architecture (ISA), a Micro Channel Architecture (MCA), an Enhanced ISA (EISA), a Video Electronics Standards Association (VESA) architecture, an Accelerated Graphics Port (AGP) architecture, a Peripheral Component Interconnects (PCI) architecture, a PCI-Express architecture, a Personal Computer Memory Card International Association (PCMCIA) architecture, a Universal Serial Bus (USB) architecture, and so forth.
[026] The memory 306 may include volatile memory (memory that maintains its state when supplied with power) such as random access memory (RAM) and/or non-volatile memory (memory that maintains its state even when not supplied with power) such as read-only memory (ROM), flash memory, ferroelectric RAM (FRAM), and so forth. Persistent data storage, as that term is used herein, may include non-volatile memory. In certain example embodiments, volatile memory may enable faster read/write access than non-volatile memory. However, in certain other example embodiments, certain types of non-volatile memory (e.g., FRAM) may enable faster read/write access than certain types of volatile memory.
[027] In various implementations, the memory 306 may include multiple different types of memory such as various types of static random access memory (SRAM), various types of dynamic random access memory (DRAM), various types of unalterable ROM, and/or writeable variants of ROM such as electrically erasable programmable read-only memory (EEPROM), flash memory, and so forth. The memory 306 may include main memory as well as various forms of cache memory such as instruction cache(s), data cache(s), translation lookaside buffer(s) (TLBs), and so forth. Further, cache memory such as a data cache may be a multi-level cache organized as a hierarchy of one or more cache levels (L1, L2, etc.).
[028] The data storage 314 may include removable storage and/or non-removable storage including, but not limited to, magnetic storage, optical disk storage, and/or tape storage. The data storage 314 may provide non-volatile storage of computer-executable instructions and other data. The memory 306 and the data storage 314, removable and/or non-removable, are examples of computer-readable storage media (CRSM) as that term is used herein.
[029] The data storage 314 may store computer-executable code, instructions, or the like that may be loadable into the memory 306 and executable by the processor(s) 304 to cause the processor(s) 304 to perform or initiate various operations. The data storage 314 may additionally store data that may be copied to memory 306 for use by the processor(s) 304 during the execution of the computer-executable instructions. Moreover, output data generated as a result of execution of the computer-executable instructions by the processor(s) 304 may be stored initially in memory 306 and may ultimately be copied to data storage 314 for non-volatile storage.
[030] More specifically, the data storage 314 may store one or more operating systems (O/S) 316; one or more database management systems (DBMS) 318 configured to access the memory 306 and/or one or more datastores 330; and one or more program modules, applications, engines, managers, computer-executable code, scripts, or the like such as, for example, one or more graph generation modules 320; one or more graph traversal modules 322; one or more latent representation determination modules 324; one or more graph matching modules 326; and one or more graph correction modules 328. Any of the components depicted as being stored in data storage 314 may include any combination of software, firmware, and/or hardware. The software and/or firmware may include computer-executable instructions (e.g., computer-executable program code) that may be loaded into the memory 306 for execution by one or more of the processor(s) 304 to perform any of the corresponding operations described earlier.
[031] Although not depicted in FIG. 3, the data storage 314 may further store various types of data utilized by components of the manufacturing graph anomaly detection server 302 (e.g., data stored in the datastore(s) 330). Any data stored in the data storage 314 may be loaded into the memory 306 for use by the processor(s) 304 in executing computer-executable instructions. In addition, any data stored in the data storage 314 may potentially be stored in the external datastore(s) 330 and may be accessed via the DBMS 318 and loaded in the memory 306 for use by the processor(s) 304 in executing computer-executable instructions.
[032] The processor(s) 304 may be configured to access the memory 306 and execute computer-executable instructions loaded therein. For example, the processor(s) 304 may be configured to execute computer-executable instructions of the various program modules, applications, engines, managers, or the like of the manufacturing graph anomaly detection server 302 to cause or facilitate various operations to be performed in accordance with one or more embodiments of the disclosure. The processor(s) 304 may include any suitable processing unit capable of accepting data as input, processing the input data in accordance with stored computer-executable instructions, and generating output data. The processor(s) 304 may include any type of suitable processing unit including, but not limited to, a central processing unit, a microprocessor, a Reduced Instruction Set Computer (RISC) microprocessor, a Complex Instruction Set Computer (CISC) microprocessor, a microcontroller, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a System-on-a-Chip (SoC), a digital signal processor (DSP), and so forth. Further, the processor(s) 304 may have any suitable microarchitecture design that includes any number of constituent components such as, for example, registers, multiplexers, arithmetic logic units, cache controllers for controlling read/write operations to cache memory, branch predictors, or the like. The microarchitecture design of the processor(s) 304 may be capable of supporting any of a variety of instruction sets.
[033] Referring now to other illustrative components depicted as being stored in the data storage 314, the O/S 316 may be loaded from the data storage 314 into the memory 306 and may provide an interface between other application software executing on the manufacturing graph anomaly detection server 302 and hardware resources of the manufacturing graph anomaly detection server 302. More specifically, the O/S 316 may include a set of computer-executable instructions for managing hardware resources of the manufacturing graph anomaly detection server 302 and for providing common services to other application programs. In certain example embodiments, the O/S 316 may include or otherwise control the execution of one or more of the program modules, engines, managers, or the like depicted as being stored in the data storage 314. The O/S 316 may include any operating system now known or which may be developed in the future including, but not limited to, any server operating system, any mainframe operating system, or any other proprietary or non-proprietary operating system.
[034] The DBMS 318 may be loaded into the memory 306 and may support functionality for accessing, retrieving, storing, and/or manipulating data stored in the memory 306, data stored in the data storage 314, and/or data stored in external datastore(s) 330. The DBMS 318 may use any of a variety of database models (e.g., relational model, object model, etc.) and may support any of a variety of query languages. The DBMS 318 may access data represented in one or more data schemas and stored in any suitable data repository. As such, data stored in the datastore(s) 330 may include, for example, manufacturing graph data; latent representation data; encoded manufacturing graph data; and so forth. External datastore(s) 330 that may be accessible by the manufacturing graph anomaly detection server 302 via the DBMS 318 may include, but are not limited to, databases (e.g., relational, object-oriented, etc.), file systems, flat files, distributed datastores in which data is stored on more than one node of a computer network, peer-to-peer network datastores, or the like.
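As a rough illustration of the kind of storage and retrieval described for the DBMS 318, the following Python sketch keeps encoded graph vectors in a small relational table. It is only a sketch under stated assumptions: the disclosure does not name a database product, schema, or serialization, so the use of SQLite, the table name `encoded_graphs`, the JSON encoding of vectors, and the function names `open_store`, `save_encoding`, and `load_encodings` are all hypothetical choices made for this example.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a datastore (hypothetical schema) for encoded graph vectors."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS encoded_graphs ("
        "name TEXT PRIMARY KEY, vector TEXT NOT NULL)"
    )
    return conn


def save_encoding(conn, name, vector):
    """Store one latent representation, serialized as JSON (an assumption)."""
    conn.execute(
        "INSERT OR REPLACE INTO encoded_graphs (name, vector) VALUES (?, ?)",
        (name, json.dumps(vector)),
    )
    conn.commit()


def load_encodings(conn):
    """Return all stored encodings as a dict of name -> list of floats."""
    rows = conn.execute("SELECT name, vector FROM encoded_graphs ORDER BY name")
    return {name: json.loads(vector) for name, vector in rows}
```

A caller might open the store once, save an encoding for each stored manufacturing knowledge graph, and load them all when a new graph needs to be matched. An object store or a flat file would serve equally well under the description above.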
[035] Referring now to other illustrative components of the manufacturing graph anomaly detection server 302, the input/output (I/O) interface(s) 308 may facilitate the receipt of input information by the manufacturing graph anomaly detection server 302 from one or more I/O devices as well as the output of information from the manufacturing graph anomaly detection server 302 to the one or more I/O devices. The I/O devices may include any of a variety of components such as a display or display screen having a touch surface or touchscreen; an audio output device for producing sound, such as a speaker; an audio capture device, such as a microphone; an image and/or video capture device, such as a camera; a haptic unit; and so forth. Any of these components may be integrated into the manufacturing graph anomaly detection server 302 or may be separate. The I/O devices may further include, for example, any number of peripheral devices such as data storage devices, printing devices, and so forth.
[036] The I/O interface(s) 308 may also include an interface for an external peripheral device connection such as universal serial bus (USB), FireWire, Thunderbolt, Ethernet port or other connection protocol that may connect to one or more networks. The I/O interface(s) 308 may also include a connection to one or more antennas to connect to one or more networks via a wireless local area network (WLAN) (such as Wi-Fi) radio, Bluetooth, and/or a wireless network radio, such as a radio capable of communication with a wireless communication network such as a Long Term Evolution (LTE) network, WiMAX network, 3G network, etc.
[037] The manufacturing graph anomaly detection server 302 may further include one or more network interfaces 310 via which the manufacturing graph anomaly detection server 302 may communicate with one or more other devices or systems via one or more networks. Such network(s) may include, but are not limited to, any one or more different types of communications networks such as, for example, cable networks, public networks (e.g., the Internet), private networks (e.g., frame-relay networks), wireless networks, cellular networks, telephone networks (e.g., a public switched telephone network), or any other suitable private or public packet-switched or circuit-switched networks. Such network(s) may have any suitable communication range associated therewith and may include, for example, global networks (e.g., the Internet), metropolitan area networks (MANs), wide area networks (WANs), local area networks (LANs), or personal area networks (PANs). In addition, such network(s) may include communication links and associated networking devices (e.g., link-layer switches, routers, etc.) for transmitting network traffic over any suitable type of medium including, but not limited to, coaxial cable, twisted-pair wire (e.g., twisted-pair copper wire), optical fiber, a hybrid fiber-coaxial (HFC) medium, a microwave medium, a radio frequency communication medium, a satellite communication medium, or any combination thereof.
[038] It should be appreciated that the program modules/engines depicted in FIG. 3 as being stored in the data storage 314 are merely illustrative and not exhaustive and that processing described as being supported by any particular module may alternatively be distributed across multiple modules, engines, or the like, or performed by a different module, engine, or the like. In addition, various program module(s), script(s), plug-in(s), Application Programming Interface(s) (API(s)), or any other suitable computer-executable code hosted locally on the manufacturing graph anomaly detection server 302 and/or other computing devices accessible via one or more networks, may be provided to support functionality provided by the modules depicted in FIG. 3 and/or additional or alternate functionality. Further, functionality may be modularized in any suitable manner such that processing described as being performed by a particular module may be performed by a collection of any number of program modules, or functionality described as being supported by any particular module may be supported, at least in part, by another module. In addition, program modules that support the functionality described herein may be executable across any number of cluster members in accordance with any suitable computing model such as, for example, a client-server model, a peer-to-peer model, and so forth. In addition, any of the functionality described as being supported by any of the modules depicted in FIG. 3 may be implemented, at least partially, in hardware and/or firmware across any number of devices.
[039] It should further be appreciated that the manufacturing graph anomaly detection server 302 may include alternate and/or additional hardware, software, or firmware components beyond those described or depicted without departing from the scope of the disclosure. More particularly, it should be appreciated that software, firmware, or hardware components depicted as forming part of the manufacturing graph anomaly detection server 302 are merely illustrative and that some components may not be present or additional components may be provided in various embodiments. While various illustrative modules have been depicted and described as software modules stored in data storage 314, it should be appreciated that functionality described as being supported by the modules may be enabled by any combination of hardware, software, and/or firmware. It should further be appreciated that each of the above-mentioned modules may, in various embodiments, represent a logical partitioning of supported functionality. This logical partitioning is depicted for ease of explanation of the functionality and may not be representative of the structure of software, hardware, and/or firmware for implementing the functionality. Accordingly, it should be appreciated that functionality described as being provided by a particular module may, in various embodiments, be provided at least in part by one or more other modules. Further, one or more depicted modules may not be present in certain embodiments, while in other embodiments, additional program modules and/or engines not depicted may be present and may support at least a portion of the described functionality and/or additional functionality.
[040] One or more operations of the method 200 may be performed by a manufacturing graph anomaly detection server 302 having the illustrative configuration depicted in FIG. 3, or more specifically, by one or more program modules, engines, applications, or the like executable on such a device. It should be appreciated, however, that such operations may be implemented in connection with numerous other device configurations.
[041] The operations described and depicted in the illustrative method of FIG. 2 may be carried out or performed in any suitable order as desired in various example embodiments of the disclosure. Additionally, in certain example embodiments, at least a portion of the operations may be carried out in parallel. Furthermore, in certain example embodiments, fewer, more, or different operations than those depicted in FIG. 2 may be performed.
[042] Although specific embodiments of the disclosure have been described, one of ordinary skill in the art will recognize that numerous other modifications and alternative embodiments are within the scope of the disclosure. For example, any of the functionality and/or processing capabilities described with respect to a particular system, system component, device, or device component may be performed by any other system, device, or component. Further, while various illustrative implementations and architectures have been described in accordance with embodiments of the disclosure, one of ordinary skill in the art will appreciate that numerous other modifications to the illustrative implementations and architectures described herein are also within the scope of this disclosure. In addition, it should be appreciated that any operation, element, component, data, or the like described herein as being based on another operation, element, component, data, or the like may be additionally based on one or more other operations, elements, components, data, or the like. Accordingly, the phrase “based on,” or variants thereof, should be interpreted as “based at least in part on.”
[043] The present disclosure may be a system, a method, and/or a computer program product. The computer program product may include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present disclosure.
[044] The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
[045] Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
[046] Computer readable program instructions for carrying out operations of the present disclosure may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present disclosure.
[047] Aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
[048] These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
[049] The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
[050] The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.

Claims

What is claimed is:
1. A computer-implemented method for identifying missing or mislabeled information in a manufacturing knowledge graph, the method comprising: generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents; performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph; determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph; identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph; and detecting and correcting mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
2. The computer-implemented method of claim 1, wherein the graph traversal technique comprises a random walk graph traversal.
3. The computer-implemented method of claim 1, further comprising: performing, using the graph traversal technique, a respective traversal of each of a plurality of stored manufacturing knowledge graphs; and determining, based at least in part on the respective traversal of the at least a portion of the stored manufacturing knowledge graph, a respective set of one or more latent representations associated with each of the plurality of stored manufacturing knowledge graphs to encode the plurality of stored manufacturing knowledge graphs as a plurality of encoded stored manufacturing knowledge graphs including the particular encoded manufacturing knowledge graph.
4. The computer-implemented method of claim 1, wherein identifying the particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph comprises determining that the particular encoded stored manufacturing knowledge graph is most structurally similar to the encoded manufacturing knowledge graph.
5. The computer-implemented method of claim 1, wherein performing the traversal of the at least a portion of the manufacturing knowledge graph comprises performing the traversal of a subgraph of the manufacturing knowledge graph.
6. The computer-implemented method of claim 1, further comprising identifying missing labels in the manufacturing knowledge graph based at least in part on a graph structure of the particular stored manufacturing knowledge graph.
7. The computer-implemented method of claim 1, where the first set of one or more latent representations comprises a vector-based representation of the manufacturing knowledge graph.
8. A system for identifying missing or mislabeled information in a manufacturing knowledge graph, the system comprising: at least one memory storing computer-executable instructions; and at least one processor, wherein the at least one processor is configured to access the at least one memory and execute the computer-executable instructions to: generate the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents; perform, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph; determine, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph; identify a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph; and detect and correct mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
9. The system of claim 8, wherein the graph traversal technique comprises a random walk graph traversal.
10. The system of claim 8, wherein the at least one processor is further configured to execute the computer-executable instructions to: perform, using the graph traversal technique, a respective traversal of each of a plurality of stored manufacturing knowledge graphs; and determine, based at least in part on the respective traversal of the at least a portion of the stored manufacturing knowledge graph, a respective set of one or more latent representations associated with each of the plurality of stored manufacturing knowledge graphs to encode the plurality of stored manufacturing knowledge graphs as a plurality of encoded stored manufacturing knowledge graphs including the particular encoded manufacturing knowledge graph.
11. The system of claim 8, wherein the at least one processor is configured to identify the particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph by executing the computer-executable instructions to determine that the particular encoded stored manufacturing knowledge graph is most structurally similar to the encoded manufacturing knowledge graph.
12. The system of claim 8, wherein the at least one processor is configured to perform the traversal of the at least a portion of the manufacturing knowledge graph by executing the computer-executable instructions to perform the traversal of a subgraph of the manufacturing knowledge graph.
13. The system of claim 8, wherein the at least one processor is further configured to execute the computer-executable instructions to identify missing labels in the manufacturing knowledge graph based at least in part on a graph structure of the particular stored manufacturing knowledge graph.
14. The system of claim 8, where the first set of one or more latent representations comprises a vector-based representation of the manufacturing knowledge graph.
15. A computer program product for identifying missing or mislabeled information in a manufacturing knowledge graph, the computer program product comprising a computer readable storage medium readable by a processing circuit, the computer readable storage medium storing instructions executable by the processing circuit to cause a method to be performed, the method comprising: generating the manufacturing knowledge graph based at least in part on at least one of manufacturing data or manufacturing documents; performing, using a graph traversal technique, a traversal of at least a portion of the manufacturing knowledge graph; determining, based at least in part on the traversal of the at least a portion of the manufacturing knowledge graph, a first set of one or more latent representations associated with the manufacturing knowledge graph to encode the manufacturing knowledge graph as an encoded manufacturing knowledge graph; identifying a particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph; and detecting and correcting mislabeled information in the manufacturing knowledge graph based at least in part on a graph structure of a particular stored manufacturing knowledge graph corresponding to the particular encoded stored manufacturing knowledge graph.
16. The computer program product of claim 15, wherein the graph traversal technique comprises a random walk graph traversal.
17. The computer program product of claim 15, the method further comprising: performing, using the graph traversal technique, a respective traversal of each of a plurality of stored manufacturing knowledge graphs; and determining, based at least in part on the respective traversal of the at least a portion of the stored manufacturing knowledge graph, a respective set of one or more latent representations associated with each of the plurality of stored manufacturing knowledge graphs to encode the plurality of stored manufacturing knowledge graphs as a plurality of encoded stored manufacturing knowledge graphs including the particular encoded manufacturing knowledge graph.
18. The computer program product of claim 15, wherein identifying the particular encoded stored manufacturing knowledge graph that matches the encoded manufacturing knowledge graph comprises determining that the particular encoded stored manufacturing knowledge graph is most structurally similar to the encoded manufacturing knowledge graph.
19. The computer program product of claim 15, wherein performing the traversal of the at least a portion of the manufacturing knowledge graph comprises performing the traversal of a subgraph of the manufacturing knowledge graph.
20. The computer program product of claim 15, the method further comprising identifying missing labels in the manufacturing knowledge graph based at least in part on a graph structure of the particular stored manufacturing knowledge graph.
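For readers who want a concrete picture of the steps recited in the independent claims (traverse, encode, match, then detect missing or mislabeled information), the following Python sketch walks through them on toy data. It is not the applicants' implementation and makes several loudly simplifying assumptions: graphs are plain dictionaries with hypothetical `labels` and `edges` keys; the "latent representation" is replaced by a normalized histogram of label pairs seen on random walks rather than a learned embedding; matching uses cosine similarity; and node identifiers are assumed to be aligned between the query graph and the matched stored graph. All function names are illustrative.

```python
import math
import random
from collections import Counter


def random_walks(graph, walks_per_node=20, walk_length=4, seed=0):
    """Random walk traversal: several fixed-length walks from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(graph["edges"]):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = graph["edges"].get(walk[-1], [])
                if not neighbors:
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def encode(graph):
    """Stand-in encoding (assumption): frequencies of label pairs on walk steps.

    Steps that touch an unlabeled node (label None) are skipped.
    """
    counts = Counter()
    for walk in random_walks(graph):
        for a, b in zip(walk, walk[1:]):
            la, lb = graph["labels"].get(a), graph["labels"].get(b)
            if la is not None and lb is not None:
                counts[(la, lb)] += 1
    total = sum(counts.values()) or 1
    return {pair: n / total for pair, n in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def best_match(query, stored):
    """Name of the stored graph whose encoding is most similar to the query."""
    q = encode(query)
    return max(stored, key=lambda name: cosine(q, encode(stored[name])))


def repair_labels(query, reference):
    """Compare labels node by node, assuming aligned node ids (a strong assumption)."""
    missing, mislabeled = {}, {}
    for node, ref_label in reference["labels"].items():
        if node not in query["labels"]:
            continue
        current = query["labels"][node]
        if current is None:
            missing[node] = ref_label
        elif current != ref_label:
            mislabeled[node] = (current, ref_label)
    return missing, mislabeled
```

In use, `best_match` would select a stored graph for a sparsely labeled query graph, and `repair_labels` would then propose labels for unlabeled nodes and flag nodes whose labels disagree with the matched graph. A practical system would need a learned embedding and a proper node alignment step, both of which this sketch deliberately omits.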
PCT/US2018/049159 2017-11-27 2018-08-31 Missing label classification and anomaly detection for sparsely populated manufacturing knowledge graphs WO2019103778A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201762590693P 2017-11-27 2017-11-27
US62/590,693 2017-11-27

Publications (1)

Publication Number Publication Date
WO2019103778A1 true WO2019103778A1 (en) 2019-05-31

Family

ID=63684508

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2018/049159 WO2019103778A1 (en) 2017-11-27 2018-08-31 Missing label classification and anomaly detection for sparsely populated manufacturing knowledge graphs

Country Status (1)

Country Link
WO (1) WO2019103778A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140280307A1 (en) * 2013-03-15 2014-09-18 Google Inc. Question answering to populate knowledge base
US20150142704A1 (en) * 2013-11-20 2015-05-21 Justin London Adaptive Virtual Intelligent Agent
US20170308792A1 (en) * 2014-08-06 2017-10-26 Prysm, Inc. Knowledge To User Mapping in Knowledge Automation System
WO2016094335A1 (en) * 2014-12-09 2016-06-16 Microsoft Technology Licensing, Llc Method and system for determining user intent in a spoken dialog based on transforming at least one portion of a semantic knowledge graph to a probabilistic state graph

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111191821A (en) * 2019-12-17 2020-05-22 东华大学 Equipment resource allocation optimization method based on knowledge graph drive
CN111209409A (en) * 2019-12-27 2020-05-29 南京医康科技有限公司 Data matching method and device, storage medium and electronic terminal
CN111209409B (en) * 2019-12-27 2023-09-29 医渡云(北京)技术有限公司 Data matching method and device, storage medium and electronic terminal
CN118133778A (en) * 2024-05-07 2024-06-04 杭州广立微电子股份有限公司 Failure point analysis method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18778675; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18778675; Country of ref document: EP; Kind code of ref document: A1)