CN112817581A - Lightweight intelligent service construction and operation support method - Google Patents
Lightweight intelligent service construction and operation support method
- Publication number
- CN112817581A (application CN202110193425.9A)
- Authority
- CN
- China
- Prior art keywords
- intelligent
- service
- deployment
- container
- management
- Prior art date
- Legal status: Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a lightweight intelligent service construction and operation support method. A lightweight intelligent service is built by adapting an intelligent algorithm inference script: execution efficiency is improved by preloading the intelligent model file, and the Python WSGI HTTP server Gunicorn provides the HTTP service externally. The intelligent service and its related supporting components run in independent Docker containers; Kubernetes is responsible for container deployment, monitoring and management, and provides support for storage mounting, CPU/GPU/memory resource quotas and port mapping during deployment; Harbor serves as the image registry and provides image management support for the service. The invention can conveniently and reliably support the rapid deployment and operation of lightweight intelligent services developed with frameworks such as TensorFlow, Keras and PyTorch, and achieves effective resource allocation and monitoring of the service running state.
Description
Technical Field
The invention relates to the technical field of service construction and service operation support, in particular to a lightweight intelligent service construction and operation support method.
Background
In recent years, artificial intelligence techniques represented by deep learning have developed rapidly and are increasingly widely used in real life. The research result of an intelligent algorithm is usually a model file obtained through long training together with an execution script that loads the model file and performs inference. When such an algorithm is applied, the script first loads the model file, then performs the inference computation, releases the model data from memory after the computation finishes, and reloads the model the next time the script is called. Calling the intelligent algorithm by executing the inference script is therefore inefficient; by adapting the algorithm into a Web service so that the model stays resident in server memory, the execution efficiency of the intelligent algorithm can be greatly improved.
Web development frameworks based on the Python language mainly include Django, Flask, Pyramid and Tornado. When applied to a production environment, using these frameworks directly falls short in handling high concurrency and in robustness, so they are generally deployed together with a WSGI HTTP server. Container technology is a virtualization technology that uses computing resources more efficiently and flexibly than virtual machine technology. Docker is an open-source application container engine: developed applications and their dependencies can be packaged into images with Docker and then conveniently deployed on different machines. With the development of container technology, a number of container orchestration tools have emerged, the most popular of which is Kubernetes. Kubernetes is an open-source container cluster management system from Google; it provides container-based application deployment and maintenance, load balancing, service discovery, automatic scaling and other capabilities, improving the convenience and reliability of cluster management. Harbor is an enterprise-level Registry server for storing and distributing Docker images; it offers good performance and security and improves the efficiency of building and transferring images through its Registry.
How to adapt an intelligent algorithm to complete the construction of an intelligent service, integrate the above technologies to achieve rapid deployment and operation support of the service, and finally provide efficient, reliable and scalable services externally as a complete solution, is a problem to be solved by those skilled in the art.
Disclosure of Invention
Purpose of the invention: the invention aims to solve the technical problems of the prior art by providing a lightweight intelligent service construction and operation support method that realizes operation support functions such as rapid construction, deployment, management and monitoring of intelligent services.
The invention is realized by the following technical scheme: establishing an intelligent service construction module and a containerization deployment module;
the intelligent service construction module is used for constructing intelligent services, and the intelligent services are formed by reforming intelligent algorithm scripts based on a Web application framework. The intelligent algorithm generally refers to various algorithms in the fields of images, texts and the like based on deep learning, generally comprises a training and reasoning part, an intelligent model file for reasoning is obtained through multiple times of training and parameter adjustment in the development process of the algorithm, and intelligent service only utilizes the model for reasoning. Preloading an intelligent model file when a Web application is started, and providing HTTP service by using a Python WSGI HTTP server Gunicorn;
the containerization deployment module is used for containerization deployment and management of intelligent services based on a container arrangement engine Kubernetes (K8S for short) cluster, and mirror image management support is provided by a private mirror image warehouse Harbor.
The intelligent service is formed by adapting an intelligent algorithm script (developed by an algorithm engineer) on the basis of a Web application framework, and comprises: model preloading based on the intelligent framework, REST request content parsing, and processing of intelligent inference results. Model preloading is realized by means of Python's import mechanism; the aim is to avoid reloading the model at every service call, thereby improving the execution efficiency of algorithm inference. REST request parsing and inference result processing are implemented according to the specific application scenario, so that the intelligent service can meet external requirements.
The Kubernetes cluster comprises a master node and child nodes; services are published to the child nodes by creating a Deployment (a core K8S concept for managing application replicas) on the master node of the Kubernetes cluster.
When the master node of the Kubernetes cluster creates a Deployment, a persistent storage volume mount is specified, and the web project and intelligent algorithm model files of the service are mounted into the container where the service runs; meanwhile, an NVIDIA driver mount is specified to support calling the GPU inside the container for intelligent inference;
according to the resource requirements of the service, CPU, GPU and memory resource quotas are specified for the container;
and (4) specifying a container port, and realizing port mapping by cooperating with the establishment of service.
The intelligent services are monitored with the Kubernetes Dashboard visual management tool, including monitoring of CPU and memory resource allocation and the running state of containers; the intelligent services are managed with the kubectl command-line tool, including service deployment, deletion, rolling upgrade and dynamic scaling.
Harbor is used as the image registry for image management; the registry provides service-building base images for various CPU architectures as well as images of related supporting components such as MySQL, Redis and Nginx.
The service-building base image is preloaded with the Python modules required by intelligent services, such as Django, Flask, Gunicorn, TensorFlow, Keras and Celery, so that lightweight intelligent services can be built quickly on the basis of this image.
Beneficial effects: the invention provides a lightweight intelligent service construction and operation support method in which an intelligent algorithm is adapted into a Web service and the service is quickly built on a Docker base image; the service is deployed as containers on a Kubernetes cluster, and storage mounting, resource limits, GPU scheduling, port mapping and other functions are realized by creating a Deployment, which improves the flexibility, reliability and operating efficiency of the service and allows the service to be monitored and managed, realizing comprehensive operation support. The whole process covers all stages of the service, including construction, deployment, invocation, and operation and maintenance monitoring; it can quickly package an intelligent algorithm still in the research and development stage into a module that external systems can call, while ensuring that the service has high invocation efficiency.
Drawings
The foregoing and/or other advantages of the invention will become further apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings.
FIG. 1 is a schematic architectural diagram of the system of the present invention.
FIG. 2 is a flow chart illustrating service delivery and operation in the present invention.
Detailed Description
The invention provides a lightweight intelligent service construction and operation support method, which comprises an intelligent service construction module and a containerization deployment module;
the intelligent service construction module is used for constructing intelligent services; an intelligent service is formed by adapting an intelligent algorithm script on the basis of a Web application framework, the intelligent model file produced by the algorithm training script is preloaded when the Web application starts, and the Python WSGI HTTP server Gunicorn is used to provide the HTTP service;
the containerization deployment module is used for containerized deployment and management of the intelligent service based on a Kubernetes cluster, with image management support provided by Harbor.
Intelligent service construction is based on adapting an intelligent algorithm script and mainly comprises intelligent model preloading, REST request content parsing and processing of intelligent inference results. The construction method is applicable to various deep-learning-based algorithms. Taking an image target detection algorithm as an example, the command-line parameters that the target detection inference script requires include the paths of the model file, the label file and the picture file used for inference. Multiple model files and label files are arranged under a specific directory, and a model dictionary is defined to obtain the corresponding model file and label file paths according to the category of the detection object (such as airplane, ship, etc.). The target detection inference code is defined as a function that takes the detection object category as an input parameter and returns the target detection result as output. The code block that loads the model files is placed in the same file as the inference function.
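As a minimal sketch of such an adapted inference script (the directory layout, file names and the use of Keras-style models are illustrative assumptions, and the detection step is simplified to a single prediction), the model dictionary, module-level preloading and inference function might look like:

```python
# detector/inference.py -- minimal sketch; paths, file names and the Keras-based
# models are assumptions made for illustration only.
import os
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image as keras_image

MODEL_ROOT = "/models"  # assumed directory holding model and label files

# Model dictionary: detection-object category -> (model file, label file).
MODEL_DICT = {
    "airplane": ("airplane/model.h5", "airplane/labels.txt"),
    "ship": ("ship/model.h5", "ship/labels.txt"),
}

def _load_all():
    """Load every model and label list once; later calls reuse them."""
    loaded = {}
    for category, (model_file, label_file) in MODEL_DICT.items():
        model = load_model(os.path.join(MODEL_ROOT, model_file))
        with open(os.path.join(MODEL_ROOT, label_file)) as f:
            labels = f.read().splitlines()
        loaded[category] = (model, labels)
    return loaded

# Module-level code block: runs once when this module is first imported, which
# is what realizes model preloading through Python's import mechanism.
_LOADED = _load_all()

def detect(category, image_path):
    """Inference function: detection category and picture path in, result out."""
    model, labels = _LOADED[category]
    img = keras_image.load_img(image_path, target_size=model.input_shape[1:3])
    batch = np.expand_dims(keras_image.img_to_array(img) / 255.0, axis=0)
    scores = model.predict(batch)[0]
    return {"category": category,
            "label": labels[int(np.argmax(scores))],
            "score": float(np.max(scores))}
```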
The Django project is created with the django-admin command, and an application named objectdetection is created in it. A Serializer is defined that comprises a ChoiceField and a FileField, corresponding respectively to the detection target category and the uploaded picture file, and is used to parse the REST request body. A Django view file is then created that imports the inference function from the target detection inference script; because of this dependency, the model-loading code block in the inference script is executed directly when the Django project starts, which realizes preloading of the intelligent model files. A view DetectView is defined with a member function post that obtains the detection target category from the Serializer, saves the uploaded picture file to a designated directory on the server, calls the inference function to perform the inference computation, and finally wraps the detection result in a Response object and returns it.
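A corresponding serializer and view might be sketched as follows, assuming Django REST framework and the inference module sketched above (module paths, field names and the upload directory are illustrative assumptions):

```python
# objectdetection/views.py -- illustrative sketch assuming Django REST framework;
# importing the inference module here executes its module-level loading code,
# so the models are preloaded when the Django project starts.
from rest_framework import serializers
from rest_framework.views import APIView
from rest_framework.response import Response

from detector.inference import MODEL_DICT, detect  # assumed module from the sketch above

class DetectSerializer(serializers.Serializer):
    # ChoiceField for the detection target category, FileField for the uploaded picture.
    category = serializers.ChoiceField(choices=list(MODEL_DICT.keys()))
    picture = serializers.FileField()

class DetectView(APIView):
    def post(self, request):
        serializer = DetectSerializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        category = serializer.validated_data["category"]
        upload = serializer.validated_data["picture"]

        # Save the uploaded picture to an assumed server directory.
        saved_path = "/data/uploads/" + upload.name
        with open(saved_path, "wb") as f:
            for chunk in upload.chunks():
                f.write(chunk)

        # Call the preloaded inference function and wrap the result in a Response.
        result = detect(category, saved_path)
        return Response(result)
```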
On top of the Django project, Gunicorn is used as the HTTP server, and parameters such as bind, workers and log-level are set in a configuration file.
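A Gunicorn configuration file is itself a Python module; a minimal sketch with assumed values could be:

```python
# gunicorn_conf.py -- minimal sketch; the values are illustrative assumptions.
bind = "0.0.0.0:8000"   # listening address; matches the container port exposed later
workers = 2             # number of worker processes; each preloads the models
timeout = 120           # allow long-running inference requests
loglevel = "info"       # Gunicorn log level
accesslog = "-"         # write the access log to stdout for container logging
```

The service would then be started with a command along the lines of gunicorn -c gunicorn_conf.py <project>.wsgi:application, the WSGI module name depending on the actual Django project.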
FIG. 1 is an architectural diagram of the lightweight intelligent service construction and operation support system. Container deployment is carried out based on Kubernetes: the Kubernetes cluster comprises a master node and child nodes, and services are published to the child nodes by creating a Deployment on the Kubernetes master node. When the Deployment is created, the storage volume and NVIDIA driver mounts, resource quotas and port mapping are configured as follows:
creating a persistent storage volume of NFS by defining PersistentVolume and PersistentVolumeClaim, referring to the persistent storage volume in deployment.
The NVIDIA driver of the physical server is mounted to support calling of a GPU in a container to carry out intelligent reasoning;
according to the service operation resource requirement, resource quotas such as a CPU (Central processing Unit), a GPU (graphics processing Unit), a memory and the like in a container are appointed;
and (4) designating a container port, and realizing port mapping by cooperating with the establishment of service, so as to prevent a plurality of service ports from colliding.
An example of a Deployment configuration file is as follows:
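(The configuration file itself is not reproduced in this text; the following is a minimal sketch of such a Deployment manifest, in which the image name, mount paths and quota values are illustrative assumptions rather than the original configuration.)

```yaml
# deployment-sketch.yaml -- illustrative assumption of the described configuration
apiVersion: apps/v1
kind: Deployment
metadata:
  name: objectdetection
spec:
  replicas: 1
  selector:
    matchLabels:
      app: objectdetection
  template:
    metadata:
      labels:
        app: objectdetection
    spec:
      containers:
        - name: objectdetection
          image: harbor.example.com/intelligent/service-base:latest  # assumed Harbor image
          ports:
            - containerPort: 8000            # Gunicorn listening port
          resources:
            limits:
              cpu: "2"
              memory: 4Gi
              nvidia.com/gpu: 1              # GPU quota via the NVIDIA device plugin
          volumeMounts:
            - name: project-volume
              mountPath: /code               # web project and model files
            - name: nvidia-driver
              mountPath: /usr/local/nvidia   # NVIDIA driver from the host
      volumes:
        - name: project-volume
          persistentVolumeClaim:
            claimName: objectdetection-pvc   # NFS-backed PVC (PersistentVolume/Claim defined separately)
        - name: nvidia-driver
          hostPath:
            path: /usr/local/nvidia
```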
FIG. 2 is a flow chart of service delivery and operation. After the Deployment is created, Kubernetes creates a pod on a child node; the container in the pod runs the target detection intelligent service, its image is pulled from the Harbor image registry, the project files are mounted into the code directory, port 8000 is exposed externally, and GPU, CPU, memory and other resource quotas are specified. After Gunicorn starts, the intelligent model files are preloaded and remain resident in memory, and every time an HTTP request is received the inference function is called directly, so the model is not repeatedly loaded and released. Kubernetes monitors the running state of the pod in real time, and when the pod exits abnormally for any reason a new pod is automatically rebuilt to keep the service available. When higher load capacity is required, the replicas parameter can be modified to increase the number of pods, and service discovery and load balancing are achieved by creating a Service. Through node and pod affinity and anti-affinity scheduling strategies, multiple pods can be scheduled onto different child nodes to achieve high availability of the service. When the intelligent service needs external components such as Redis or MySQL, a new Deployment can be defined in the same way. The running state of the service can be monitored through the Dashboard interface; with the kubectl command, operations such as viewing service state, rolling upgrade, rollback, scaling out and scaling in can be performed. The above shows that the method provided by the invention improves the flexibility, reliability and operating efficiency of the service and allows the service to be monitored and managed, realizing more comprehensive operation support for the service.
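As an illustrative sketch of the Service mentioned above (its name, selector and nodePort value are assumptions), port 8000 of the pods could be exposed and load-balanced as follows:

```yaml
# service-sketch.yaml -- illustrative assumption, not the original configuration
apiVersion: v1
kind: Service
metadata:
  name: objectdetection
spec:
  type: NodePort
  selector:
    app: objectdetection          # matches the pods created by the Deployment above
  ports:
    - port: 8000                  # Service port inside the cluster
      targetPort: 8000            # Gunicorn port in the container
      nodePort: 30080             # externally reachable port on each node (assumed)
```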
The present invention provides a lightweight intelligent service construction and operation support method, and there are many specific methods and ways to implement this technical solution; the above is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make a number of improvements and refinements without departing from the principle of the invention, and these improvements and refinements should also be regarded as falling within the protection scope of the invention. All components not specified in this embodiment can be realized by the prior art.
Claims (7)
1. A lightweight intelligent service construction and operation support system, characterized in that an intelligent service construction module and a containerization deployment module are established;
the intelligent service construction module is used for constructing intelligent services; an intelligent service is formed by adapting an intelligent algorithm script on the basis of a Web application framework, the intelligent model file obtained during algorithm development is preloaded when the Web application starts, and Gunicorn is used as the HTTP server to receive and process user requests;
the containerization deployment module is used for containerized deployment and management of the intelligent service based on the container orchestration engine Kubernetes, with image management support provided by the private image registry Harbor.
2. The system of claim 1, wherein the intelligent service is adapted from an intelligent algorithm script on the basis of a Web application framework and comprises: model preloading based on the intelligent framework, REST request content parsing, and processing of intelligent inference results.
3. The system of claim 2, wherein the Kubernetes cluster includes a master node and child nodes, and the service is delivered to the child nodes by creating a Deployment on the master node of the Kubernetes cluster.
4. The system of claim 3, wherein when the master node of the Kubernetes cluster creates a Deployment, a persistent storage volume mount is specified, and the web project and intelligent algorithm model files of the service are mounted into the container where the service runs; meanwhile, an NVIDIA driver mount is specified to support calling the GPU inside the container for intelligent inference;
according to the resource requirements of the service, CPU, GPU and memory resource quotas are specified for the container;
and the container port is specified, with port mapping realized by creating a corresponding Service.
5. The system of claim 4, wherein the intelligent service is monitored with the visual management tool Kubernetes Dashboard, including monitoring of CPU and memory resource allocation and the running state of containers; and the intelligent service is managed with the kubectl command-line tool, including service deployment, deletion, rolling upgrade and dynamic scaling.
6. The system according to claim 5, wherein Harbor is used as the image registry for image management, and the registry provides service-building base images for various CPU architectures as well as images of supporting components related to MySQL, Redis and Nginx services.
7. The system of claim 6, wherein the service-building base image is preloaded with the Python modules required by intelligent services, such as Django, Flask, Gunicorn, TensorFlow, Keras and Celery.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110193425.9A CN112817581A (en) | 2021-02-20 | 2021-02-20 | Lightweight intelligent service construction and operation support method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112817581A true CN112817581A (en) | 2021-05-18 |
Family
ID=75864419
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110193425.9A Pending CN112817581A (en) | 2021-02-20 | 2021-02-20 | Lightweight intelligent service construction and operation support method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112817581A (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017045424A1 (en) * | 2015-09-18 | 2017-03-23 | 乐视控股(北京)有限公司 | Application program deployment system and deployment method |
CN109284184A (en) * | 2018-03-07 | 2019-01-29 | 中山大学 | A kind of building method of the distributed machines learning platform based on containerization technique |
CN110780914A (en) * | 2018-07-31 | 2020-02-11 | 中国移动通信集团浙江有限公司 | Service publishing method and device |
CN111488254A (en) * | 2019-01-25 | 2020-08-04 | 顺丰科技有限公司 | Deployment and monitoring device and method of machine learning model |
CN109885389A (en) * | 2019-02-19 | 2019-06-14 | 山东浪潮云信息技术有限公司 | A kind of parallel deep learning scheduling training method and system based on container |
CN110245003A (en) * | 2019-06-06 | 2019-09-17 | 中信银行股份有限公司 | A kind of machine learning uniprocessor algorithm arranging system and method |
CN111629061A (en) * | 2020-05-28 | 2020-09-04 | 苏州浪潮智能科技有限公司 | Inference service system based on Kubernetes |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113886055A (en) * | 2021-12-07 | 2022-01-04 | 中国电子科技集团公司第二十八研究所 | Intelligent model training resource scheduling method based on container cloud technology |
CN113886055B (en) * | 2021-12-07 | 2022-04-15 | 中国电子科技集团公司第二十八研究所 | Intelligent model training resource scheduling method based on container cloud technology |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20210518 |