
WO2020232718A1 - Edge side model inference method, edge computing device, and computer readable medium - Google Patents

Edge side model inference method, edge computing device, and computer readable medium

Info

Publication number
WO2020232718A1
Authority
WO
WIPO (PCT)
Prior art keywords
inference
model
application program
model inference
subtask
Application number
PCT/CN2019/088193
Other languages
French (fr)
Chinese (zh)
Inventor
毛怿 (Mao Yi)
Original Assignee
西门子股份公司 (Siemens AG)
西门子(中国)有限公司 (Siemens Ltd., China)
Application filed by 西门子股份公司 (Siemens AG) and 西门子(中国)有限公司 (Siemens Ltd., China)
Priority to PCT/CN2019/088193 priority Critical patent/WO2020232718A1/en
Priority to CN201980091419.3A priority patent/CN113412495A/en
Publication of WO2020232718A1 publication Critical patent/WO2020232718A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Definitions

  • the present invention relates to the technical field of edge computing, and in particular to an edge-side model inference method, an edge computing device and a computer-readable medium.
  • Machine learning is an important method in big data analysis.
  • Machine learning includes two main processes: model training and model inference.
  • model inference uses a trained model to perform operations on data to obtain the result of a classification/regression/prediction problem.
  • model inference is an effective means of analyzing these data.
  • the task of model inference was mainly completed on the platform side.
  • IOT: Internet of Things
  • the embodiment of the present invention provides an edge-side model inference method, an edge computing device, and a computer-readable medium to perform model inference on the edge side of an IOT system, which meets the real-time requirements of the IOT system.
  • the edge side can include products on the user side.
  • the edge side can include industrial computers, programmable logic controllers (PLC), industrial communication equipment, motion controllers, computer numerical control (CNC) machine tools, and other special-purpose equipment.
  • PLC: programmable logic controller
  • CNC: computer numerical control machine tool
  • an edge-side model inference method which can be executed by an edge computing device in an IOT system.
  • the method may include: an application program running on an edge computing device subscribes to an inference process subtask of model inference; the application program obtains the data used to execute the inference process subtask; the application program calls the first API of a model inference service running on the edge computing device to execute the inference process subtask based on the data; the application program obtains the model inference result of the inference process subtask from the response of the first API;
  • the application program publishes the model inference result.
  • an edge computing device in an IOT system, including: an application module configured to run an application program, and a model inference service module configured to provide a model inference service; wherein the application module runs the application program to complete the following operations: subscribe to the inference process subtask of model inference; obtain the data used to execute the inference process subtask; call the first API of the model inference service to execute the inference process subtask based on the data; obtain the model inference result of the inference process subtask from the response of the first API; and publish the model inference result.
  • an edge computing device in an IOT system including: at least one processor configured to execute the method provided in the first aspect.
  • an edge computing device in an IOT system, including: at least one memory configured to store computer-readable code, and at least one processor configured to execute the method provided in the first aspect when the computer-readable code is invoked.
  • a computer-readable medium storing computer-readable code which, when executed by at least one processor, performs the method provided in the first aspect.
  • the model inference is performed by the edge computing device, so the data on which the model inference is based does not need to be transmitted between the edge side and the platform side; only the model inference result needs to be transmitted. This shortens the model inference time, reduces the occupation of IOT system bandwidth, and meets the real-time requirements of the IOT system.
  • the model inference service is integrated on the edge computing device, and the application program is designed to subscribe to the inference process subtask, obtain the data used for inference, and call the model inference service. There is no need to redesign the logic of the model inference service; model inference is implemented through API calls from a simple application program, which makes the implementation straightforward.
  • when the application program obtains the data used to execute the inference process subtask, it can obtain the address link of that data and then fetch the data from the address link. In this way, the data is acquired simply and conveniently.
  • the application program further obtains information about the model on which the inference process subtask is executed, and calls the second API of the model inference service to determine, based on that model information, whether the model inference service can execute the inference process subtask, ensuring that the subsequent model inference proceeds smoothly.
  • the application program further performs post-processing on the model inference result, and when publishing the model inference result, publishes the post-processed model inference result.
  • the application program further calls the third API of the model inference service and obtains, from the response of the third API, the meta-information of the model on which the inference process subtask is executed; when publishing the model inference result, it publishes the model inference result together with the meta-information.
  • the jointly published meta-information can help label the model inference result and provides a reference for analyzing it.
  • the application program and the model inference service run on a container image.
  • after the inference process subtask is completed, the computing power used to execute the application program and the model inference service can be released from the container image.
  • the computing power released in this way can be used to complete other subtasks, or to perform processing tasks of the edge computing device itself. Because the computing power of edge computing devices is usually small, using lightweight containers greatly reduces the occupation of their computing power; executing microservices on the container image allows the execution process to start quickly when a subtask needs to run, and the computing power to be released quickly after the subtask completes.
  • Fig. 1 is a schematic structural diagram of an edge computing system provided by an embodiment of the present invention.
  • FIG. 2 shows the MQTT communication process.
  • FIG. 3 shows the encrypted MQTT communication process in the embodiment of the present invention.
  • Fig. 4 shows the internal layered structure of the edge computing device provided by some embodiments of the present invention.
  • Fig. 5 is a flowchart of an edge-side model inference method provided by an embodiment of the present invention.
  • Fig. 6 is a schematic structural diagram of an edge computing device provided by an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of another structure of an edge computing device provided by an embodiment of the present invention.
  • Authentication key pair on the authentication server side (CA key pair), including a public key 206a and a private key 206b
  • Authentication key pair on the MQTT server side (CA server key), including a public key 208a and a private key 208b
  • Core processor (Core, a CPU can include one or more core processors)
  • Standard operating system such as: Linux operating system
  • Real-time operating system such as Debian RT Linux
  • Containers such as Docker
  • FIG. 1 is a schematic structural diagram of an edge computing system 10 provided by an embodiment of the present invention.
  • the edge computing system 10 is located on the edge of an IOT system, and can obtain data from a data source 20 also located on the edge.
  • the edge computing device 101 in the edge computing system 10 performs processing based on the obtained data. For example: model inference in the embodiment of the present invention.
  • the data obtained from the data source 20 includes but is not limited to sensor data collected during the operation of various industrial equipment, such as temperature, humidity, motor speed, pictures, images, etc.
  • the edge computing device 101 may include, but is not limited to: embedded industrial PCs (for example: Siemens IP2x7/IP4x7), high-end industrial PCs (for example: Siemens IP6x7/IP8x7), advanced industrial PCs (for example: Siemens IP5x7), industrial gateways (for example: IOT2040), programmable logic controllers (PLC), computer numerical control (CNC) machine tools, motion controllers, etc.
  • idle computing capabilities of these edge computing devices 101 can be used for model inference.
  • the data obtained from the data source 20 can be stored in the original database 103 before being preprocessed.
  • the sensor data and other data collected during the operation of industrial equipment are usually time-stamped and can be regarded as time series. Therefore, an optional implementation of the original database 103 is a time series database.
  • the edge computing device 101 may perform processing such as model inference through an application program running on it.
  • model processing such as model inference can be divided into multiple subtasks, which are completed by different applications.
  • applications can include but are not limited to:
  • Application program 21: an application program used to complete post-processing subtasks, referred to as the "post-processing program 21";
  • Application program 22: an application program used to complete preprocessing subtasks, referred to as the "preprocessing program 22";
  • Application program 23: an application program used to complete the inference process, model loading, and model update subtasks of model inference, referred to as the "model server program 23";
  • Application program 24: an application program used to complete the training process subtasks of model training, referred to as the "training server program 24";
  • Application program 25: an application program used to complete protocol conversion subtasks, referred to as the "gateway program 25";
  • Application program 26: an application program used to complete training process monitoring subtasks, referred to as the "monitoring program 26".
  • the data obtained from the data source 20 is stored in the original database 103 after the gateway program 25 performs protocol conversion.
  • the gateway program 25 can convert data belonging to different industrial protocols into a unified protocol, such as the MQTT protocol.
  • the preprocessing program 22 performs preprocessing on the data obtained from the data source 20, for example, selecting data within a certain range from the data stored in the original database 103 for model inference.
  • the preprocessed data is stored persistently, for example, it can be stored in the distributed object storage 104, or stored on a dedicated storage server.
  • the training server program 24 performs model training based on the preprocessed data.
  • the model server program 23 loads the trained model and performs model inference.
  • the post-processing program 21 can process the results of the model inference, for example organizing them in a table and presenting them to the user in a way that is easy to understand and observe.
  • one application program can also complete the functions of multiple applications among the above-mentioned application programs 21 to 26.
  • an application completes the subtasks of the inference process and the post-processing subtasks.
  • the tasks of model processing can be divided into subtasks, which are respectively completed by corresponding applications running on each edge computing device 101, and finally the entire model processing task is completed.
  • the edge computing device 101 executing the subtask sends the execution result of the subtask to trigger other edge computing devices 101 to execute subsequent subtasks.
  • by having the edge computing devices 101 trigger each other's subtasks, the entire model processing task can finally be completed.
  • the intermediate process does not require processing on the platform side of the IOT system; it is handled on the edge side, without transmitting data and processing results between the edge side and the platform side, which reduces the processing delay of tasks and improves the operating efficiency of the entire IOT system.
  • the edge computing devices 101 shown in FIG. 1 can communicate with each other through the message forwarder 102; alternatively, the edge devices 101 can form an ad hoc network in which pairs of edge computing devices 101 communicate directly.
  • An optional communication method is for each edge computing device 101 to communicate based on the MQTT protocol.
  • each edge computing device 101 acts as an MQTT client 202 and forwards messages through an MQTT server 201 (equivalent to the message forwarder 102). For example, one edge computing device 101 sends a subscription message to the MQTT server 201 to subscribe to the inference process subtask of model inference.
  • after the data on which the model inference is based has been preprocessed, another edge computing device 101 sends a publish message to the MQTT server 201 to notify that the data has been preprocessed. After receiving the publish message, the MQTT server 201 forwards it to the edge computing device 101 that subscribed to the inference process subtask.
  • each edge computing device 101 may communicate based on the MQTT protocol that uses an encryption mechanism.
  • the MQTT communication process using the Secure Socket Layer/Transport Layer Security (SSL/TLS) encryption method can be shown in Figure 3.
  • the authentication server 205 creates an authentication key pair (CA key pair) 206, which includes a public key 206a and a private key 206b.
  • the authentication server 205 creates an authentication certificate (CA certificate) 207, and uses the private key 206b to sign the authentication certificate 207.
  • the MQTT server 201 creates an authentication key pair on the MQTT server side (CA server key) 208, uses the public key 208a in that key pair to create an authentication certificate signing request 210, and sends it to the authentication server 205.
  • the authentication server 205 returns the authentication certificate (authentication server side) 207 to the MQTT server 201 via a response 211, and the MQTT server 201 signs it with the public key 208a to obtain the authentication certificate on the MQTT server side 209.
  • the authentication server 205 sends the authentication certificate (authentication server side) 207 to the MQTT client 202, and the MQTT server 201 sends the authentication certificate (MQTT server side) 209, signed with the public key 208a, to the MQTT client 202.
  • the MQTT client 202 verifies the authentication certificate-MQTT server side 209 and the authentication certificate-authentication server side 207, and obtains the public key 208a used by the MQTT server 201 to encrypt data.
  • the MQTT client 202 uses the public key 208a to encrypt the message
  • the MQTT server 201 uses the private key 208b to decrypt the message.
  • the MQTT protocol adopting an encryption mechanism ensures the security of communication between the edge computing devices 101 in the edge computing system 10.
  • the computing capability of the edge computing device is used for model inference.
  • Applications and model inference services run on edge computing devices, and applications use the services provided by the model inference service by calling the API provided by the model inference service.
  • the application program implements communication with other edge computing devices, and optionally, can also implement data pre-processing and post-processing.
  • the PLC can include the following four layers:
  • the first layer is the core processor (Core) 400
  • the S7-1500 PLC includes four core processors; the two core processors on the left run the PLC operating system 402, on which software 406 related to PLC control functions runs.
  • a standard operating system 403 can run on the third core processor, with a C/C++ runtime environment 407 on top of it.
  • a real-time operating system 404 can run on the last core processor, hosting the application program 408 related to model inference and the model inference service 409 of the embodiments of the present invention.
  • the application program 408 may include one or more of the aforementioned application programs 21 to 26 related to model inference.
  • the second layer may be a physical machine virtualization (Bare Metal virtualization) 401, for example, using a Hypervisor virtualization method.
  • the application 408 and the model inference service 409 may run in a container.
  • the edge computing device 101 obtains the image of a container 405 (for example: Docker), and then runs the application program 408 and the model inference service 409 on the container image.
  • the computing power used to execute the application program 408 and the model inference service 409 is released from the image of the container 405.
  • the computing power released in this way can be used to complete other subtasks, or to perform processing tasks of the edge computing device 101 itself, for example, a PLC completes its own production process control tasks.
  • the use of lightweight containers can greatly reduce the occupation of the computing power of the edge computing device 101.
  • executing microservices on the container image allows the execution process to start quickly when a subtask needs to run, and the computing power to be released quickly after the subtask completes.
  • the application program 408 may be implemented in a microservice manner.
  • with microservices, the application logic running on the edge computing device 101 can be simplified, and complex model inference logic can be decoupled into a series of relatively independent microservices.
  • the subtasks of model processing are completed by various microservices, and the execution speed is faster.
  • the required computing power is small, and the idle computing power of the edge computing device 101 can be fully utilized.
  • Fig. 5 is a flowchart of an edge-side model inference method provided by an embodiment of the present invention. This process can be executed by the edge computing device 101 and can include the following steps:
  • step S501 the application 408 running on the edge computing device 101 sends a subscription message to the message forwarder 102 to subscribe to the subtask of the inference process of model inference.
  • in step S502, when data is available for executing the above inference process subtask, the message forwarder 102 sends a publish message to the edge computing device 101.
  • if communication between the edge computing devices 101 is implemented based on the MQTT protocol, the publish message has the same topic as the subscription message of step S501.
  • the publish message may include the data used to execute the inference process subtask; or, when the amount of data is large, the message forwarder 102 may carry in the publish message a link to the distributed object storage 104 where the data is stored, for example: hdfs://<inference_input_file>.
  • the publish message may also include the model name, model version, and inference type (classification, regression, prediction, etc.) of the model on which the inference subtask is executed.
  • the edge computing device 101 obtains the data from the link.
  • S503: Call the second API of the model inference service to determine whether the model inference service 409 can execute the inference process subtask.
  • in step S503, the application program 408 may call the second API of the model inference service 409 to determine whether it can execute the inference process subtask; if it can, step S505 and optional step S504 are executed.
  • S504: Call the third API of the model inference service 409 to obtain the meta-information of the model.
  • in step S504, the application program 408 calls the third API of the model inference service 409 to obtain the meta-information of the model on which the inference process subtask is executed.
  • S505: Call the first API of the model inference service 409 to execute the inference process subtask.
  • in step S505, the application program 408 calls the first API of the model inference service 409 to execute the inference process subtask. If the model inference service 409 is a RESTful service, the application program 408 first converts the data on which the inference process subtask is executed into JSON format and then calls the first API (see the sketch after this list).
  • the application program 408 sends an API request to the model inference service 409, and obtains the execution result of the subtask of the inference process from the response, that is, the model inference result.
  • the application program 408 performs post-processing on the model inference result, such as visualizing the model inference result.
  • step S507 the application 408 publishes the model inference result.
  • in step S501, the application program 408 may subscribe to the inference process subtask by subscribing to an "inference input" topic.
  • the topic is "InferenceInput"
  • shopfloor represents a factory
  • datacollectionpoint1 represents a data collection point in the factory
  • forecastinput represents the data input for inference.
  • inputfile-link hdfs://<inference_input_file> represents the link to the data on which the inference process subtask is executed.
  • model-name RNN_AlertForecast represents the name of the model on which the inference process subtask is executed, model-version v1.1 represents the model version, and inference-type forecast represents the type of model inference.
  • in step S507, the application program 408 may send a publish message with an "inference output" topic, and the message includes the model inference result.
  • the subject is "InferenceOutput"
  • shopfloor represents a factory
  • datacollectionpoint1 represents a data collection point in the factory
  • forecastoutput represents the result of model inference as output.
  • outputfile-link hdfs://<inference_output_file> indicates the link where the model inference result is placed.
  • model-name RNN_AlertForecast represents the name of the model on which the inference process subtask is executed
  • model-version v1.1 represents the model version
  • inference-type forecast represents the type of model inference.
  • FIG. 6 is a schematic structural diagram of an edge computing device provided by an embodiment of the present invention.
  • the edge computing device 101 may include:
  • an application module 1011, configured to run an application program
  • a model inference service module 1012, configured to provide a model inference service
  • the application module 1011 runs the application program to complete the following operations: subscribe to the inference process subtask of model inference; obtain the data used to execute the inference process subtask; call the first API of the model inference service to execute the inference process subtask based on the data; obtain the model inference result of the inference process subtask from the response of the first API; and publish the model inference result.
  • when obtaining the data used to execute the inference process subtask, the application module 1011 runs the application program to specifically complete the following operations: obtain the address link of the data used to execute the inference process subtask; obtain the data from the address link.
  • the application module 1011 runs the application program to further complete the following operations: obtain information about the model on which the inference process subtask is executed; call the second API of the model inference service to determine, based on the model information, whether the model inference service can execute the inference process subtask.
  • the application module 1011 runs the application program to further post-process the model inference result; when publishing the model inference result, it publishes the post-processed model inference result.
  • the application module 1011 runs the application program to further complete the following operations: call the third API of the model inference service and obtain, from the response of the third API, the meta-information of the model on which the inference process subtask is executed;
  • when publishing the model inference result, the application module 1011 runs the application program to publish the model inference result together with the meta-information.
  • the application and the model inference service run on a container image.
  • FIG. 7 is a schematic diagram of another structure of an edge computing device provided by an embodiment of the present invention.
  • the edge computing device 101 may include at least one memory 1015 for storing computer-readable code, and at least one processor 1014 configured to execute the computer-readable code stored in the at least one memory 1015 so as to perform the aforementioned edge-side model inference method.
  • the modules shown in FIG. 6 can be regarded as program modules written in computer-readable code stored in the memory 1015; when these program modules are called by the processor 1014, the aforementioned edge-side model inference method can be executed.
  • the edge computing device 101 may also include a communication module 1016.
  • the edge computing device 101 communicates with other edge computing devices 101 through the communication module 1016, and in some embodiments also communicates with the MQTT message forwarder 102, the distributed object storage 104, the original database 103, and so on.
  • the at least one memory 1015, the at least one processor 1014, and the communication module 1016 may communicate with each other through a bus 1017.
  • an embodiment of the present invention also provides a computer-readable medium storing computer-readable code, and when the computer-readable code is executed by at least one processor, the aforementioned edge-side model processing method is implemented.
  • the embodiments of the present invention provide an edge-side model processing method, an edge computing device, and a computer-readable medium, which use idle computing capacity on the edge side for model inference, effectively utilizing that capacity and improving the real-time performance of the IOT system.
  • the hardware unit can be implemented mechanically or electrically.
  • a hardware unit may include permanent dedicated circuits or logic (such as dedicated processors, field-programmable gate arrays (FPGA), or application-specific integrated circuits (ASIC)) to complete the corresponding operations.
  • the hardware unit may also include programmable logic or circuits (such as general-purpose processors or other programmable processors), which may be temporarily set by software to complete corresponding operations.
  • the specific implementation may be mechanical, a dedicated permanent circuit, or a temporarily configured circuit.
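The three API calls described above (steps S503 to S505 of the method) can be illustrated with a minimal sketch against a hypothetical RESTful model inference service, since the text notes that input data is converted to JSON before the first API is called. The base address, URL paths, and JSON field names below are illustrative assumptions, not part of the patent.

```python
import requests

# Hypothetical base address of the model inference service 409
BASE = "http://localhost:8500"

def can_execute(model_name, model_version, inference_type):
    # Second API (step S503): ask whether the service can execute the subtask
    r = requests.get(f"{BASE}/models/{model_name}/capability",
                     params={"version": model_version, "type": inference_type})
    return r.ok and r.json().get("supported", False)

def model_meta(model_name, model_version):
    # Third API (step S504): fetch the meta-information of the model
    r = requests.get(f"{BASE}/models/{model_name}/metadata",
                     params={"version": model_version})
    r.raise_for_status()
    return r.json()

def infer(model_name, model_version, inputs):
    # First API (step S505): execute the inference process subtask. For a
    # RESTful service the input data is serialized to JSON before the call.
    r = requests.post(f"{BASE}/models/{model_name}/infer",
                      json={"version": model_version, "inputs": inputs})
    r.raise_for_status()
    # The model inference result is read from the response of the first API
    return r.json()
```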

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Stored Programmes (AREA)

Abstract

The present invention relates to the field of edge computing technology, and in particular, to an edge side model inference method, an edge computing device, and a computer readable medium, wherein model inference is performed at an edge side of an Internet of Things (IOT) system, so as to satisfy requirements of the IOT system for real-time performance. A model inference method provided in the embodiments of the present invention comprises: an application program running on an edge computing device subscribing to an inference process subtask of model inference; the application program obtaining data used to execute the inference process subtask; the application program invoking a first API of a model inference service running on the edge computing device, to execute the inference process subtask on the basis of the data; the application program acquiring, from a response of the first API, a model inference result of the inference process subtask; and the application program publishing the model inference result.

Description

Edge-side model inference method, edge computing device and computer-readable medium
Technical field
The present invention relates to the technical field of edge computing, and in particular to an edge-side model inference method, an edge computing device and a computer-readable medium.
Background
Machine learning is an important method in big data analysis. It includes two main processes: model training and model inference. Model inference uses a trained model to perform operations on data to obtain the result of a classification/regression/prediction problem.
A large amount of data is generated during the operation of an Internet of Things (IOT) system, and model inference is an effective means of analyzing it. In the past, the task of model inference was mainly completed on the platform side. For IOT systems with high real-time requirements, such as industrial IOT (IIOT) systems, data often needs to be collected on the edge side and sent to the platform side for processing, which usually cannot meet the real-time requirements.
Summary of the invention
The embodiments of the present invention provide an edge-side model inference method, an edge computing device, and a computer-readable medium that perform model inference on the edge side of an IOT system, meeting the real-time requirements of the IOT system.
Relative to the platform side, the edge side can include products on the user side. For example, for an IIOT system, the edge side can include industrial computers, programmable logic controllers (PLC), industrial communication equipment, motion controllers, computer numerical control (CNC) machine tools, and other special-purpose equipment.
In the first aspect, an edge-side model inference method is provided, which can be executed by an edge computing device in an IOT system. The method may include: an application program running on the edge computing device subscribes to an inference process subtask of model inference; the application program obtains the data used to execute the inference process subtask; the application program calls the first API of a model inference service running on the edge computing device to execute the inference process subtask based on the data; the application program obtains the model inference result of the inference process subtask from the response of the first API; and the application program publishes the model inference result.
In the second aspect, an edge computing device in an IOT system is provided, including an application module configured to run an application program and a model inference service module configured to provide a model inference service, wherein the application module runs the application program to complete the following operations: subscribe to the inference process subtask of model inference; obtain the data used to execute the inference process subtask; call the first API of the model inference service to execute the inference process subtask based on the data; obtain the model inference result of the inference process subtask from the response of the first API; and publish the model inference result.
In the third aspect, an edge computing device in an IOT system is provided, including at least one processor configured to execute the method provided in the first aspect.
In the fourth aspect, an edge computing device in an IOT system is provided, including at least one memory configured to store computer-readable code, and at least one processor configured to execute the method provided in the first aspect when the computer-readable code is invoked.
In the fifth aspect, a computer-readable medium is provided, which stores computer-readable code; when the computer-readable code is executed by at least one processor, the method provided in the first aspect is performed.
In any of the above aspects, the model inference is performed by the edge computing device, so the data on which the model inference is based does not need to be transmitted between the edge side and the platform side; only the model inference result needs to be transmitted. This shortens the model inference time, reduces the occupation of IOT system bandwidth, and meets the real-time requirements of the IOT system. Moreover, since model inference is implemented on the edge side, the privacy of edge-side user data is greatly protected. In addition, the model inference service is integrated on the edge computing device, and the application program is designed to subscribe to the inference process subtask, obtain the data used for inference, and call the model inference service; there is no need to redesign the logic of the model inference service, and model inference is implemented through API calls from a simple application program, which makes the implementation straightforward.
For any of the foregoing aspects, optionally, when the application program obtains the data used to execute the inference process subtask, it can obtain the address link of that data and then fetch the data from the address link. In this way, the data is acquired simply and conveniently.
For any of the foregoing aspects, optionally, the application program further obtains information about the model on which the inference process subtask is executed, and calls the second API of the model inference service to determine, based on that model information, whether the model inference service can execute the inference process subtask, ensuring that the subsequent model inference proceeds smoothly.
For any of the foregoing aspects, optionally, the application program further post-processes the model inference result and, when publishing the model inference result, publishes the post-processed result.
For any of the foregoing aspects, optionally, the application program further calls the third API of the model inference service and obtains, from the response of the third API, the meta-information of the model on which the inference process subtask is executed; when publishing the model inference result, it publishes the model inference result together with the meta-information. The jointly published meta-information can help label the model inference result and provides a reference for analyzing it.
For any of the foregoing aspects, optionally, the application program and the model inference service run on a container image.
In this way, after the edge computing device completes the inference process subtask, the computing power used to execute the application program and the model inference service can be released from the container image. The computing power released in this way can be used to complete other subtasks or to perform processing tasks of the edge computing device itself. Because the computing power of edge computing devices is usually small, using lightweight containers greatly reduces the occupation of their computing power; executing microservices on the container image allows the execution process to start quickly when a subtask needs to run, and the computing power to be released quickly after the subtask completes.
Description of the drawings
Fig. 1 is a schematic structural diagram of an edge computing system provided by an embodiment of the present invention.
Fig. 2 shows the MQTT communication process.
Fig. 3 shows the encrypted MQTT communication process in an embodiment of the present invention.
Fig. 4 shows the internal layered structure of an edge computing device provided by some embodiments of the present invention.
Fig. 5 is a flowchart of an edge-side model inference method provided by an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of an edge computing device provided by an embodiment of the present invention.
Fig. 7 is a schematic diagram of another structure of an edge computing device provided by an embodiment of the present invention.
List of reference signs:
10: Edge computing system
20: Data source
101: Edge computing device
102: Message forwarder
103: Original database
104: Distributed object storage
21-26: Application programs running on edge computing devices, used to complete model processing subtasks
201: MQTT server
202: MQTT client
203: Publish message
204: Subscribe message
205: Authentication server
206: Authentication key pair on the authentication server side (CA key pair), including a public key 206a and a private key 206b
207: Authentication certificate on the authentication server side (CA certificate)
208: Authentication key pair on the MQTT server side (CA server key), including a public key 208a and a private key 208b
209: Authentication certificate on the MQTT server side (server certificate)
210: Authentication certificate signing request
211: Authentication certificate response
400: Core processor (Core; a CPU can include one or more core processors)
401: Bare-metal virtualization of the physical machine, for example using a hypervisor
402: PLC operating system
403: Standard operating system, such as Linux
404: Real-time operating system, such as Debian RT Linux
405: Container, such as Docker
406: Parts related to PLC control functions
407: C/C++ runtime environment
408: Application program
409: Model inference service
S501: Subscribe to the inference process subtask
S502: Obtain data
S503: Call the second API of the model inference service to determine whether it can execute the inference process subtask
S504: Call the third API of the model inference service to obtain model meta-information
S505: Call the first API of the model inference service to execute the inference process subtask
S506: Post-process the model inference result
S507: Publish the model inference result
1011: Application module
1012: Model inference service module
1013: Processor
1014: Memory
1015: Communication module
1016: Bus
Detailed description of the embodiments
In the following, in order to make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in further detail with reference to the drawings. The embodiments described below are only some of the embodiments of the present invention, not all of them.
FIG. 1 is a schematic structural diagram of an edge computing system 10 provided by an embodiment of the present invention. The edge computing system 10 is located on the edge side of an IOT system and can obtain data from a data source 20 that is also located on the edge side; the edge computing device 101 in the edge computing system 10 performs processing, such as the model inference of the embodiments of the present invention, based on the obtained data.
Taking an industrial IOT (IIOT) system as an example, the data obtained from the data source 20 includes, but is not limited to, sensor data collected during the operation of various industrial equipment, such as temperature, humidity, motor speed, pictures, and images. The edge computing device 101 may include, but is not limited to: embedded industrial PCs (for example: Siemens IP2x7/IP4x7), high-end industrial PCs (for example: Siemens IP6x7/IP8x7), advanced industrial PCs (for example: Siemens IP5x7), industrial gateways (for example: IOT2040), programmable logic controllers (PLC), computer numerical control (CNC) machine tools, motion controllers, etc. In the embodiments of the present invention, the idle computing capability of these edge computing devices 101 can be used for model inference.
The data obtained from the data source 20 can be stored in the original database 103 before being preprocessed. Taking an IIOT system as an example, the sensor data collected during the operation of industrial equipment usually carries timestamps and can be regarded as time series; therefore, one optional implementation of the original database 103 is a time series database.
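A minimal sketch of writing such time-stamped sensor data into a time series database follows. InfluxDB and its Python client are used purely as an illustrative assumption; the patent does not name a specific database, and the measurement, tag, and field names are invented.

```python
from influxdb import InfluxDBClient  # pip install influxdb

client = InfluxDBClient(host="localhost", port=8086, database="raw_data")
client.create_database("raw_data")  # idempotent: no effect if it already exists

# One time-stamped sensor sample, stored as a time-series point
point = {
    "measurement": "motor",
    "tags": {"source": "datacollectionpoint1"},
    "time": "2019-05-23T08:00:00Z",
    "fields": {"temperature": 61.2, "speed_rpm": 1480.0},
}
client.write_points([point])
```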
In some embodiments of the present invention, the edge computing device 101 may perform processing such as model inference through application programs running on it. Model processing such as model inference can be divided into multiple subtasks, completed by different application programs, which can include but are not limited to:
Application program 21: an application program used to complete post-processing subtasks, referred to as the "post-processing program 21";
Application program 22: an application program used to complete preprocessing subtasks, referred to as the "preprocessing program 22";
Application program 23: an application program used to complete the inference process, model loading, and model update subtasks of model inference, referred to as the "model server program 23";
Application program 24: an application program used to complete the training process subtasks of model training, referred to as the "training server program 24";
Application program 25: an application program used to complete protocol conversion subtasks, referred to as the "gateway program 25";
Application program 26: an application program used to complete training process monitoring subtasks, referred to as the "monitoring program 26".
The data obtained from the data source 20 is stored in the original database 103 after the gateway program 25 performs protocol conversion. For example, in an IIOT system the gateway program 25 can convert data belonging to different industrial protocols into a unified protocol, such as the MQTT protocol.
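A minimal sketch of such a gateway loop follows, assuming Python with the paho-mqtt library. The protocol-specific read is a hypothetical placeholder, and the broker address and topic are illustrative assumptions.

```python
import json
import time
import paho.mqtt.client as mqtt  # pip install paho-mqtt

def read_plc_register():
    # Placeholder for a protocol-specific read (e.g. S7, Modbus or OPC UA);
    # a real gateway program 25 would use the matching fieldbus library.
    return {"temperature": 61.2, "motor_speed_rpm": 1480.0}

# paho-mqtt 1.x style constructor; 2.x additionally takes a
# mqtt.CallbackAPIVersion argument.
client = mqtt.Client()
client.connect("mqtt-server.local", 1883)  # hypothetical broker (message forwarder 102)
client.loop_start()

while True:
    sample = read_plc_register()
    sample["timestamp"] = time.time()
    # Republish under a unified MQTT topic, independent of the source protocol
    client.publish("shopfloor/datacollectionpoint1/raw", json.dumps(sample))
    time.sleep(1.0)
```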
The preprocessing program 22 preprocesses the data obtained from the data source 20, for example selecting data within a certain range from the data stored in the original database 103 for model inference. The preprocessed data is persisted, for example in the distributed object storage 104 or on a dedicated storage server.
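A minimal sketch of this preprocessing subtask, assuming the raw data has been exported with a "timestamp" column; the column names, file names, and time window are illustrative assumptions.

```python
import pandas as pd

raw = pd.read_csv("raw.csv", parse_dates=["timestamp"])

# Select the time window whose data will be used for model inference
window = raw[(raw["timestamp"] >= "2019-05-23 08:00") &
             (raw["timestamp"] < "2019-05-23 09:00")].dropna()

# Persist the preprocessed data; in the system described here it would be
# uploaded to the distributed object storage 104 (the example messages carry
# an hdfs:// link) and only the address link would then be published.
window.to_csv("inference_input_file.csv", index=False)
```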
The training server program 24 performs model training based on the preprocessed data.
The model server program 23 loads the trained model and performs model inference.
The post-processing program 21 can process the results of the model inference, for example organizing them in a table and presenting them to the user in a way that is easy to understand and observe.
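A minimal sketch of such post-processing; the result fields are invented for illustration only.

```python
import pandas as pd

# Hypothetical inference results as the post-processing program 21 might
# receive them
results = [
    {"timestamp": "2019-05-23T08:00:00Z", "forecast": 0.82, "alert": True},
    {"timestamp": "2019-05-23T08:01:00Z", "forecast": 0.31, "alert": False},
]

# Organize the model inference results into a table that is easy to read
table = pd.DataFrame(results)
print(table.to_string(index=False))
```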
In addition, one application program can also complete the functions of multiple of the above-mentioned application programs 21 to 26; for example, one application program completes both the inference process subtask and the post-processing subtask.
With the edge computing system 10 shown in FIG. 1, a model processing task can be divided into subtasks that are completed by the corresponding application programs running on the individual edge computing devices 101, ultimately completing the entire task. After a subtask is executed, the edge computing device 101 that executed it sends the execution result to trigger other edge computing devices 101 to execute subsequent subtasks. Because the edge computing devices 101 trigger each other's subtasks, the entire model processing task can be completed without any processing on the platform side of the IOT system: everything is handled on the edge side, data and processing results need not travel between the edge side and the platform side, task processing delay is reduced, and the operating efficiency of the entire IOT system is improved.
The edge computing devices 101 shown in FIG. 1 can communicate with each other through the message forwarder 102; alternatively, the edge devices 101 can form an ad hoc network in which pairs of edge computing devices 101 communicate directly. An optional communication method is for each edge computing device 101 to communicate based on the MQTT protocol. As shown in FIG. 2, each edge computing device 101 acts as an MQTT client 202 and forwards messages through an MQTT server 201 (equivalent to the message forwarder 102). For example, one edge computing device 101 sends a subscription message to the MQTT server 201 to subscribe to the inference process subtask of model inference. After the data on which the model inference is based has been preprocessed, another edge computing device 101 sends a publish message to the MQTT server 201 to notify that the data has been preprocessed. After receiving the publish message, the MQTT server 201 forwards it to the edge computing device 101 that subscribed to the inference process subtask.
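A minimal sketch of this publish/subscribe pattern using the paho-mqtt client library. The broker address is an assumption, and the slash-separated topic is composed from the InferenceInput fields listed in the Definitions section above; the exact topic layout and payload format are assumptions.

```python
import paho.mqtt.client as mqtt  # pip install paho-mqtt

TOPIC = "shopfloor/datacollectionpoint1/forecastinput"  # assumed topic layout

def on_message(client, userdata, msg):
    # Called when the MQTT server forwards a publish message for the
    # subscribed topic, i.e. when the inference process subtask is triggered
    print("inference subtask triggered:", msg.topic, msg.payload.decode())

# One edge computing device 101 subscribes to the inference process subtask
subscriber = mqtt.Client()  # paho-mqtt 1.x constructor; 2.x needs a CallbackAPIVersion
subscriber.on_message = on_message
subscriber.connect("mqtt-server.local", 1883)  # hypothetical MQTT server 201
subscriber.subscribe(TOPIC)
subscriber.loop_start()

# Another edge computing device 101 publishes once preprocessing has finished
publisher = mqtt.Client()
publisher.connect("mqtt-server.local", 1883)
publisher.loop_start()
publisher.publish(TOPIC, '{"inputfile-link": "hdfs://<inference_input_file>"}')
```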
Optionally, in the embodiments of the present invention, the edge computing devices 101 may communicate based on the MQTT protocol with an encryption mechanism. The MQTT communication process using Secure Sockets Layer/Transport Layer Security (SSL/TLS) encryption can be as shown in Figure 3. The authentication server 205 creates an authentication key pair (CA key pair) 206, which includes a public key 206a and a private key 206b. The authentication server 205 creates an authentication certificate (CA certificate) 207 and signs it with the private key 206b. The MQTT server 201 creates an authentication key pair on the MQTT server side (CA server key) 208, uses the public key 208a in that key pair to create an authentication certificate signing request 210, and sends it to the authentication server 205. The authentication server 205 returns the authentication certificate (authentication server side) 207 to the MQTT server 201 via a response 211, and the MQTT server 201 signs it with the public key 208a to obtain the authentication certificate on the MQTT server side 209. The authentication server 205 sends the authentication certificate (authentication server side) 207 to the MQTT client 202, and the MQTT server 201 sends the authentication certificate (MQTT server side) 209, signed with the public key 208a, to the MQTT client 202. The MQTT client 202 verifies the authentication certificate (MQTT server side) 209 and the authentication certificate (authentication server side) 207, and obtains the public key 208a used by the MQTT server 201 to encrypt data. In the subsequent communication between the MQTT client 202 and the MQTT server 201, the MQTT client 202 encrypts messages with the public key 208a, and the MQTT server 201 decrypts them with the private key 208b.
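In practice, MQTT over SSL/TLS is typically configured on the client by supplying the CA certificate so that the server-side certificate can be verified during the TLS handshake. A minimal sketch with paho-mqtt follows; the certificate file name and broker address are illustrative assumptions.

```python
import ssl
import paho.mqtt.client as mqtt

client = mqtt.Client()
# Give the client the CA certificate (207) so it can verify the server-side
# certificate (209) during the TLS handshake
client.tls_set(ca_certs="ca_certificate.pem", tls_version=ssl.PROTOCOL_TLS_CLIENT)
client.connect("mqtt-server.local", 8883)  # 8883 is the conventional MQTT-over-TLS port
client.loop_start()
# All subsequent MQTT messages travel over the encrypted TLS channel.
```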
The MQTT protocol with an encryption mechanism ensures the security of communication between the edge computing devices 101 in the edge computing system 10.
In the embodiments of the present invention, the computing capability of the edge computing device is used for model inference. The application program and the model inference service run on the edge computing device, and the application program uses the services provided by the model inference service by calling the APIs it exposes. The application program implements communication with other edge computing devices and, optionally, can also implement data preprocessing and post-processing.
Take the Siemens quad-core S7-1500 PLC shown in FIG. 4 as an example of the edge computing device 101. The PLC can include the following four layers:
The first layer is the core processors (Core) 400. The S7-1500 PLC includes four core processors; the two on the left run the PLC operating system 402, on which software 406 related to PLC control functions runs. The third core processor can run a standard operating system 403 with a C/C++ runtime environment 407 on top of it. The last core processor can run a real-time operating system 404, hosting the application program 408 related to model inference and the model inference service 409 of the embodiments of the present invention. The application program 408 may include one or more of the aforementioned application programs 21 to 26 related to model inference. The second layer can be bare-metal virtualization 401 of the physical machine, for example using a hypervisor.
Optionally, the application program 408 and the model inference service 409 may run in a container. The edge computing device 101 obtains the image of a container 405 (for example, a Docker image) and then runs the application program 408 and the model inference service 409 on that image. After the edge computing device 101 has completed the inference process subtask of model inference, the computing capacity used to execute the application program 408 and the model inference service 409 is released from the image of the container 405. The capacity released in this way can be used to complete other subtasks or to execute the edge computing device 101's own processing tasks; for example, a PLC carries out its own production process control. Since the computing capacity of an edge computing device 101 is usually small, lightweight containers greatly reduce the footprint on that capacity: executing microservices on a container image allows execution to start quickly when a subtask needs to run, and the capacity to be released quickly once the subtask completes.
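To make this lifecycle concrete, the following sketch starts the application program and the model inference service from container images and releases the capacity once the subtask is done. It assumes the Docker SDK for Python; the image names are made up for illustration, as the embodiment only requires a container such as Docker.

```python
# Sketch: run the application and the model inference service in containers,
# then release the computing capacity when the subtask completes.
# Assumes the Docker SDK for Python; image names are illustrative.
import docker

docker_client = docker.from_env()
service = docker_client.containers.run("edge/model-inference:latest", detach=True)
app = docker_client.containers.run("edge/inference-app:latest", detach=True)

app.wait()                # block until the inference process subtask finishes
for c in (app, service):  # free the capacity for other subtasks or PLC control
    c.stop()
    c.remove()
```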
Optionally, the application program 408 may be implemented as microservices. Microservices simplify the application logic running on the edge computing device 101: the complex model inference logic is decoupled into a series of relatively independent microservices, and the subtasks of model processing are completed by these microservices. They execute faster and require little computing capacity, so the idle computing capacity of the edge computing device 101 can be fully utilized.

FIG. 5 is a flowchart of the edge-side model inference method provided by an embodiment of the present invention. The flow can be executed by the edge computing device 101 and can include the following steps:

S501: Subscribe to an inference process subtask.

In step S501, the application program 408 running on the edge computing device 101 sends a subscription message to the message forwarder 102 to subscribe to an inference process subtask of model inference.

S502: Obtain data.

In step S502, when data that can be used to execute the above inference process subtask is available, the message forwarder 102 sends a publish message to the edge computing device 101. If the communication between the edge computing devices 101 is implemented over the MQTT protocol, this publish message has the same topic as the subscription message of step S501. The publish message may include the data used to execute the inference process subtask; alternatively, when the amount of data is large, the message forwarder 102 may carry in the publish message a link to the distributed object store 104 where that data is kept, for example: hdfs://<inference_input_file>. In addition, the publish message may also include the model name, model version, and inference type (classification, regression, prediction, and so on) of the model on which the inference subtask is based. The edge computing device 101 obtains the data from the link.
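A sketch of how the application program 408 might consume such a publish message is shown below. The JSON field names mirror the "InferenceInput" example later in this description; fetch_from_link() is a hypothetical helper standing in for whatever HDFS or object store client the deployment uses, and run_inference() refers to the step S505 sketch given further below.

```python
# Sketch: handling the publish message of step S502 (paho-mqtt callback
# signature; register with client.on_message = on_message). Field names
# follow the "InferenceInput" example below; fetch_from_link() is a
# hypothetical object-store helper.
import json

def on_message(client, userdata, msg):
    task = json.loads(msg.payload)
    if "inputfile-link" in task:
        # Large inputs are passed by reference, e.g. hdfs://<inference_input_file>
        data = fetch_from_link(task["inputfile-link"])  # hypothetical helper
    else:
        data = task["data"]  # assumed key for small, inline inputs
    # Hand the data and model identification over to step S505 (see below).
    run_inference(data, task["model-name"], task["model-version"],
                  task["inference-type"])
```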
S503: Call the second API of the model inference service to determine whether the model inference service 409 can execute the inference process subtask.

In step S503, the application program 408 may call the second API of the model inference service 409 to determine whether the model inference service 409 can execute the inference process subtask. If it is determined that the inference process subtask can be executed, step S505 is executed, as well as optional step S504.

S504: Call the third API of the model inference service 409 to obtain the meta information of the model.

In step S504, the application program 408 calls the third API of the model inference service 409 to obtain the meta information (metadata) of the model on which the inference process subtask is based.
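As a sketch, the capability check of step S503 and the metadata query of step S504 could look as follows for a Restful model inference service. The endpoint layout (modelled on TensorFlow-Serving-style URLs) and the local service address are assumptions; the embodiment only specifies that a second and a third API exist.

```python
# Sketch of steps S503/S504 against a Restful model inference service 409.
# The endpoint layout and address are assumptions, not part of the embodiment.
import requests

BASE = "http://localhost:8501/v1/models"  # assumed local service address

def can_execute(model_name, model_version):
    """Second API: can the service execute this inference process subtask?"""
    resp = requests.get("%s/%s/versions/%s" % (BASE, model_name, model_version))
    return resp.ok

def get_model_metadata(model_name, model_version):
    """Third API: fetch the meta information (metadata) of the model."""
    resp = requests.get("%s/%s/versions/%s/metadata"
                        % (BASE, model_name, model_version))
    resp.raise_for_status()
    return resp.json()
```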
S505: Call the first API of the model inference service 409 to execute the inference process subtask.

In step S505, the application program 408 calls the first API of the model inference service 409 to execute the inference process subtask. If the model inference service 409 is a Restful service, then before calling the first API the application program 408 needs to convert the data on which the inference process subtask is based into JSON format, and only then call the first API. The application program 408 sends an API request to the model inference service 409 and obtains the execution result of the inference process subtask, that is, the model inference result, from the response.
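For the Restful case, step S505 might then look like the following sketch: the input data is serialized to JSON, the first API is called over HTTP, and the model inference result is read from the response. The ":predict" endpoint and the payload schema are assumptions in the same style as the sketch above.

```python
# Sketch of step S505: JSON conversion plus the first-API call for a
# Restful service. The endpoint and payload schema are assumptions.
import requests

def run_inference(data, model_name, model_version, inference_type):
    url = "http://localhost:8501/v1/models/%s/versions/%s:predict" % (
        model_name, model_version)
    payload = {"signature_name": inference_type, "instances": data}
    resp = requests.post(url, json=payload)  # serializes the payload to JSON
    resp.raise_for_status()
    return resp.json()  # the model inference result of the subtask
```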
S506: Post-process the model inference result.

In this optional step, the application program 408 post-processes the model inference result, for example by visualizing it.

S507: Publish the model inference result.

In step S507, the application program 408 publishes the model inference result.

Specifically, in step S501 the application program 408 can subscribe to the inference process subtask by subscribing to the "inference input" topic.

An optional implementation of the "inference input" topic is as follows:
(In the published document this example appears as the embedded image PCTCN2019088193-appb-000001; its fields are described in the paragraph below.)
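A plausible reconstruction of that example from the field descriptions that follow, expressed as a Python literal, is shown below; the topic hierarchy is an assumption, since the original image is not recoverable.

```python
# Hypothetical reconstruction of the "InferenceInput" example; field values
# follow the description below, the topic hierarchy is assumed.
inference_input_example = {
    "topic": "shopfloor/datacollectionpoint1/forecastinput",
    "payload": {
        "inputfile-link": "hdfs://<inference_input_file>",
        "model-name": "RNN_AlertForecast",
        "model-version": "v1.1",
        "inference-type": "forecast",
    },
}
```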
Here the topic is "InferenceInput"; shopfloor denotes a factory, datacollectionpoint1 denotes a data collection point in that factory, and forecastinput denotes the data input used for inference. inputfile-link hdfs://<inference_input_file> is the link to the data on which the inference process subtask is based. model name RNN_AlertForecast is the name of the model on which the inference process subtask is based, model-version v1.1 is the model version, and inference-type forecast is the type of model inference.

In step S507, the application program 408 can send a publish message with the topic "inference output"; the message includes the model inference result.

An optional implementation of the "inference output" topic is as follows:
(In the published document this example appears as the embedded image PCTCN2019088193-appb-000002; its fields are described in the paragraph below.)
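A matching reconstruction for the output side, together with the publish call of step S507, is sketched below; the topic hierarchy is again assumed, and client is the paho-mqtt client from the TLS sketch above.

```python
# Hypothetical reconstruction of the "InferenceOutput" example and the
# step S507 publish call; topic hierarchy assumed, "client" as in the
# TLS sketch above.
import json

output_payload = {
    "outputfile-link": "hdfs://<inference_output_file>",
    "model-name": "RNN_AlertForecast",
    "model-version": "v1.1",
    "inference-type": "forecast",
}
client.publish("shopfloor/datacollectionpoint1/forecastoutput",
               json.dumps(output_payload))
```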
Here the topic is "InferenceOutput"; shopfloor denotes a factory, datacollectionpoint1 denotes a data collection point in that factory, and forecastoutput denotes the model inference result serving as output. outputfile-link hdfs://<inference_output_file> is the link at which the model inference result is placed. model name RNN_AlertForecast is the name of the model on which the inference process subtask is based, model-version v1.1 is the model version, and inference-type forecast is the type of model inference.
Hereinafter, the structure of the edge computing device 101 provided by the embodiments of the present invention is described in conjunction with FIG. 6 and FIG. 7. FIG. 6 is a schematic structural diagram of an edge computing device provided by an embodiment of the present invention. As shown in FIG. 6, the edge computing device 101 can include:
an application program module 1011, configured to run an application program;

a model inference service module 1012, configured to provide a model inference service.

The application program module 1011 runs the application program to complete the following operations: subscribing to an inference process subtask of model inference; obtaining the data used to execute the inference process subtask; calling the first API of the model inference service to execute the inference process subtask based on the data; obtaining the model inference result of the inference process subtask from the response of the first API; and publishing the model inference result.

Optionally, when obtaining the data used to execute the inference process subtask, the application program module 1011 runs the application program to specifically complete the following operations: obtaining an address link of the data used to execute the inference process subtask; and obtaining the data from the address link.

Optionally, the application program module 1011 runs the application program to further complete the following operations: obtaining information about the model on which the inference process subtask is based; and calling the second API of the model inference service to determine, based on the information about the model, whether the model inference service can execute the inference process subtask.

Optionally, the application program module 1011 runs the application program to further post-process the model inference result; in that case, when publishing the model inference result, the application program publishes the post-processed model inference result.

Optionally, the application program module 1011 runs the application program to further call the third API of the model inference service and obtain, from the response of the third API, the meta information of the model on which the inference process subtask is based; in that case, when publishing the model inference result, the application program publishes the model inference result and the meta information.

Optionally, the application program and the model inference service run on the image of a container.
FIG. 7 is another schematic structural diagram of an edge computing device provided by an embodiment of the present invention. As shown in FIG. 7, under this structure the edge computing device 101 can include at least one memory 1015 for storing computer-readable code, and at least one processor 1014 configured to execute the computer-readable code stored in the at least one memory 1015 so as to perform the aforementioned edge-side model inference method. The modules shown in FIG. 7 can be regarded as program modules written in the computer-readable code stored in the memory 1015; when these program modules are invoked by the processor 1014, the aforementioned edge-side model inference method can be executed. The edge computing device 101 can also include a communication module 1016, through which the edge computing device 101 communicates with other edge computing devices 101 and, in some embodiments, with the MQTT message forwarder 102, the distributed object store 104, the raw database 103, and so on. Optionally, the at least one memory 1015, the at least one processor 1014, and the communication module 1016 can communicate with one another over a bus 1017.
In addition, an embodiment of the present invention further provides a computer-readable medium that stores computer-readable code; when the computer-readable code is executed by at least one processor, the aforementioned edge-side model inference method is implemented.

In summary, the embodiments of the present invention provide an edge-side model inference method, an edge computing device, and a computer-readable medium that put the idle computing capacity on the edge side to use for model inference. This effectively utilizes the idle computing capacity on the edge side and improves the real-time performance of the IOT system.
In the above embodiments, a hardware unit can be implemented mechanically or electrically. For example, a hardware unit may include permanently dedicated circuits or logic (such as a dedicated processor, a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC)) to complete the corresponding operations. A hardware unit may also include programmable logic or circuits (such as a general-purpose processor or another programmable processor), which can be configured temporarily by software to complete the corresponding operations. The specific implementation (mechanical, dedicated permanent circuits, or temporarily configured circuits) can be determined based on cost and time considerations.

The embodiments of the present invention have been shown and described in detail above by means of the accompanying drawings and preferred embodiments. The embodiments of the present invention are, however, not limited to the embodiments disclosed; based on the above embodiments, those skilled in the art will appreciate that the optional implementations in the different embodiments can be combined to obtain further embodiments of the present invention, and these also fall within the protection scope of the embodiments of the present invention.

Claims (15)

  1. An edge-side model inference method, characterized by comprising:
    an application program running on an edge computing device in an Internet of Things (IOT) system subscribing to an inference process subtask of model inference;
    the application program obtaining the data used to execute the inference process subtask;
    the application program calling a first API of a model inference service running on the edge computing device, to execute the inference process subtask based on the data;
    the application program obtaining the model inference result of the inference process subtask from the response of the first API;
    the application program publishing the model inference result.
  2. The method according to claim 1, characterized in that the application program obtaining the data used to execute the inference process subtask comprises:
    the application program obtaining an address link of the data used to execute the inference process subtask;
    the application program obtaining the data from the address link.
  3. The method according to claim 1 or 2, characterized by further comprising:
    the application program obtaining information about the model on which the inference process subtask is based;
    the application program calling a second API of the model inference service, to determine, based on the information about the model, whether the model inference service can execute the inference process subtask.
  4. The method according to any one of claims 1 to 3, characterized in that
    the method further comprises: the application program post-processing the model inference result; and
    the application program publishing the model inference result comprises: the application program publishing the post-processed model inference result.
  5. The method according to any one of claims 1 to 4, characterized by further comprising:
    the application program calling a third API of the model inference service and obtaining, from the response of the third API, meta information of the model on which the inference process subtask is based;
    wherein the application program publishing the model inference result comprises: the application program publishing the model inference result and the meta information.
  6. The method according to any one of claims 1 to 5, characterized in that the application program and the model inference service run on the image of a container.
  7. An edge computing device (101) in an IOT system (10), characterized by comprising:
    an application program module (1011), configured to run an application program;
    a model inference service module (1012), configured to provide a model inference service;
    wherein the application program module (1011) runs the application program to complete the following operations:
    subscribing to an inference process subtask of model inference;
    obtaining the data used to execute the inference process subtask;
    calling a first API of the model inference service, to execute the inference process subtask based on the data;
    obtaining the model inference result of the inference process subtask from the response of the first API;
    publishing the model inference result.
  8. The device according to claim 7, characterized in that, when obtaining the data used to execute the inference process subtask, the application program module (1011) runs the application program to specifically complete the following operations:
    obtaining an address link of the data used to execute the inference process subtask;
    obtaining the data from the address link.
  9. The device according to claim 7 or 8, characterized in that the application program module (1011) runs the application program to further complete the following operations:
    obtaining information about the model on which the inference process subtask is based;
    calling a second API of the model inference service, to determine, based on the information about the model, whether the model inference service can execute the inference process subtask.
  10. The device according to any one of claims 7 to 9, characterized in that
    the application program module (1011) runs the application program to further complete the following operation: post-processing the model inference result; and
    when publishing the model inference result, the application program module (1011) runs the application program to complete the following operation: publishing the post-processed model inference result.
  11. The device according to any one of claims 7 to 10, characterized in that
    the application program module (1011) runs the application program to further complete the following operations: calling a third API of the model inference service, and obtaining, from the response of the third API, meta information of the model on which the inference process subtask is based; and
    when publishing the model inference result, the application program module (1011) runs the application program to complete the following operation: publishing the model inference result and the meta information.
  12. The device according to any one of claims 7 to 11, characterized in that the application program and the model inference service run on the image of a container.
  13. An edge computing device (101) in an IOT system (10), characterized by comprising:
    at least one processor (1014), configured to execute the method according to any one of claims 1 to 6.
  14. The device according to claim 13, characterized by further comprising:
    at least one memory (1015), configured to store computer-readable code;
    wherein the at least one processor (1014) is configured to execute the method according to any one of claims 1 to 6 when invoking the computer-readable code.
  15. A computer-readable medium, characterized in that the computer-readable medium stores computer-readable code, and when the computer-readable code is executed by at least one processor, the method according to any one of claims 1 to 6 is implemented.
PCT/CN2019/088193 2019-05-23 2019-05-23 Edge side model inference method, edge computing device, and computer readable medium WO2020232718A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/088193 WO2020232718A1 (en) 2019-05-23 2019-05-23 Edge side model inference method, edge computing device, and computer readable medium
CN201980091419.3A CN113412495A (en) 2019-05-23 2019-05-23 Edge model inference method, edge calculation device, and computer-readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/088193 WO2020232718A1 (en) 2019-05-23 2019-05-23 Edge side model inference method, edge computing device, and computer readable medium

Publications (1)

Publication Number Publication Date
WO2020232718A1 true WO2020232718A1 (en) 2020-11-26

Family

ID=73459511

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088193 WO2020232718A1 (en) 2019-05-23 2019-05-23 Edge side model inference method, edge computing device, and computer readable medium

Country Status (2)

Country Link
CN (1) CN113412495A (en)
WO (1) WO2020232718A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023050221A1 (en) * 2021-09-29 2023-04-06 Siemens Aktiengesellschaft System and method for subscription-based iot communication security

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170118311A1 (en) * 2015-10-22 2017-04-27 Saguna Networks Ltd. Methods Circuits Devices Systems and Functionally Associated Computer Executable Code for Facilitating Edge Computing on a Mobile Data Communication Network
CN107766889A (en) * 2017-10-26 2018-03-06 济南浪潮高新科技投资发展有限公司 A kind of the deep learning computing system and method for the fusion of high in the clouds edge calculations
CN108400917A (en) * 2018-01-22 2018-08-14 深圳大数点科技有限公司 A kind of edge calculations gateway and system towards intelligence manufacture

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109617796A (en) * 2018-11-15 2019-04-12 江苏东洲物联科技有限公司 A kind of edge calculations gateway of rule-based engine


Also Published As

Publication number Publication date
CN113412495A (en) 2021-09-17


Legal Events

Code | Title | Description
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19929611; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | Ep: pct application non-entry in european phase | Ref document number: 19929611; Country of ref document: EP; Kind code of ref document: A1