CN117032935B - Management scheduling system and method for heterogeneous accelerator card based on K8s
- Publication number
- CN117032935B (application CN202311182106.3A)
- Authority
- CN
- China
- Prior art keywords
- resource information
- accelerator card
- resource
- card
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a K8s-based management and scheduling system and method for heterogeneous accelerator cards. The resource discovery module is used for detecting heterogeneous accelerator card resource information; the resource management module is used for acquiring the heterogeneous accelerator card resource information from the interface service module so as to manage it; the scheduling module is used for acquiring the heterogeneous accelerator card resource information from the resource management module based on the resource request sent by the interface service module, determining a target accelerator card according to the heterogeneous accelerator card resource information, scheduling the container set to the node where the target accelerator card is located, and sending the resource information of the target accelerator card to the resource discovery module for caching; the container runtime module is used for acquiring the resource information of the target accelerator card from the resource discovery module and creating containers in the container set according to the resource information of the target accelerator card. The complexity of managing and scheduling heterogeneous accelerator cards can be reduced.
Description
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a K8s-based management and scheduling system and method for heterogeneous accelerator cards.
Background
With the continuous popularization of Kubernetes (K8s for short) technology, more and more manufacturers have abandoned deploying applications on virtual machines or physical machines and have gradually begun to adopt container technology, reducing the difficulty of deploying applications or bringing them online through containerization. In the field of artificial intelligence (Artificial Intelligence, AI), most AI applications need to use heterogeneous accelerator cards (AI accelerator cards for short) to deliver their maximum service capability.
As the manufacturers and models of AI accelerator cards keep increasing, a containerization scheme, a K8s plug-in scheme, a metric monitoring scheme and a resource usage interface targeting each specific manufacturer or model need to be deployed in the K8s cluster, so an effective heterogeneous accelerator card management and scheduling scheme is needed.
Disclosure of Invention
The embodiment of the invention provides a management scheduling system and a management scheduling method for heterogeneous accelerator cards based on K8s, which can reduce the complexity of management and scheduling of the heterogeneous accelerator cards.
In a first aspect, an embodiment of the present invention provides a management scheduling system for heterogeneous accelerator cards based on K8s, comprising: a resource discovery module, a resource management module, a scheduling module, a container runtime module and an interface service module;
the resource discovery module, the resource management module and the scheduling module are all connected with the interface service module, the resource discovery module is connected with the container runtime module, and the resource management module is connected with the scheduling module;
the resource discovery module is used for detecting heterogeneous accelerator card resource information and sending the heterogeneous accelerator card resource information to the interface service module; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information;
The resource management module is used for acquiring the heterogeneous accelerator card resource information from the interface service module so as to manage the heterogeneous accelerator card resource information;
The scheduling module is used for acquiring heterogeneous acceleration card resource information from the resource management module based on the resource request sent by the interface service module, determining a target acceleration card according to the heterogeneous acceleration card resource information, scheduling a container set to a node where the target acceleration card is located, and sending the resource information of the target acceleration card to the resource discovery module for caching;
The container runtime module is used for acquiring the resource information of the target accelerator card from the resource discovery module, and creating a container in the container set according to the resource information of the target accelerator card so as to start the container set.
In a second aspect, an embodiment of the present invention further provides a management scheduling method for a heterogeneous accelerator card based on K8s, including:
acquiring a plurality of heterogeneous accelerator card resource information based on a resource request; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information;
Determining at least one target accelerator card according to the heterogeneous accelerator card resource information;
Dispatching the container set to a node where the target accelerator card is located;
And creating a container in the container set according to the resource information of the at least one target accelerator card so as to start the container set.
The embodiment of the invention discloses a K8s-based management and scheduling system and method for heterogeneous accelerator cards. The system comprises a resource discovery module, a resource management module, a scheduling module, a container runtime module and an interface service module; the resource discovery module, the resource management module and the scheduling module are all connected with the interface service module, the resource discovery module is connected with the container runtime module, and the resource management module is connected with the scheduling module; the resource discovery module is used for detecting heterogeneous accelerator card resource information and sending the heterogeneous accelerator card resource information to the interface service module; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information; the resource management module is used for acquiring the heterogeneous accelerator card resource information from the interface service module so as to manage the heterogeneous accelerator card resource information; the scheduling module is used for acquiring the heterogeneous accelerator card resource information from the resource management module based on the resource request sent by the interface service module, determining a target accelerator card according to the heterogeneous accelerator card resource information, scheduling the container set to the node where the target accelerator card is located, and sending the resource information of the target accelerator card to the resource discovery module for caching; the container runtime module is used for acquiring the resource information of the target accelerator card from the resource discovery module, and creating a container in the container set according to the resource information of the target accelerator card so as to start the container set. The management and scheduling system for heterogeneous accelerator cards provided by the embodiment of the invention can reduce the complexity of managing and scheduling heterogeneous accelerator cards.
Drawings
FIG. 1 is a schematic diagram of a management scheduling system of heterogeneous accelerator cards based on K8s according to a first embodiment of the present invention;
FIG. 2 is a schematic diagram of a resource discovery module according to a first embodiment of the invention;
FIG. 3 is a schematic diagram of a container runtime module in accordance with a first embodiment of the invention;
FIG. 4 is a block diagram of a management scheduling system for heterogeneous accelerator cards based on K8s according to a first embodiment of the present invention;
FIG. 5 is a flowchart of a management scheduling method of heterogeneous accelerator cards based on K8s in a second embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
The AI accelerator card is a module for processing the large amount of computation tasks in artificial intelligence, and is widely applied in fields such as face recognition, autonomous driving, security and unmanned aerial vehicles. The AI accelerator card is typically installed in the server as a Peripheral Component Interconnect Express (PCIe) device; by installing the corresponding driver, the operating system can identify the corresponding AI accelerator card, and programs can use the AI accelerator card through the corresponding software development kit (SDK). In a container cloud platform, since the container runtime and the Kubernetes management platform are used, the discovery of AI accelerator card resources, the management and scheduling of AI accelerator card resources, and the containerized use of AI accelerator cards are all involved.
The high-level container runtime interacts with the underlying low-level container runtime through the OCI specification, and the low-level container runtime performs the operations of creating and managing containers; that is, the binding between a container and an AI accelerator card is performed by the low-level container runtime. To support the use of AI accelerator cards in containers, each AI accelerator card vendor may implement a low-level container runtime that supports the use of its own AI accelerator cards. Before executing the container creation action, the low-level container runtime checks the incoming configuration such as environment variables; according to this interface, if a container needs to use the vendor's AI accelerator card, the container runtime injects the configuration information of the AI accelerator card into the container.
In the container cloud platform K8s, memory resources and computing resources can be requested from the K8s layer, and the scheduler in K8s schedules a container set (pod) to a suitable node according to the actual running state of each node in the cluster. To enable K8s to discover and manage AI accelerator cards, K8s provides a device resource management (device plugin) mechanism for exposing the device resource status on a node and injecting device information into a container before the container starts. Meanwhile, K8s also provides a scheduler extension mechanism, so the scheduling algorithm can be extended or customized according to scenario requirements.
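To make the device plugin idea above concrete, the following is a minimal, self-contained Go sketch of its two responsibilities — advertising device state on a node and returning the information to inject before a container starts. The real Kubernetes device plugin API is a gRPC service registered with the Kubelet; all type, field and environment-variable names here are illustrative assumptions, not that API.

```go
package main

import "fmt"

// Device mirrors the idea of a device exposed to the kubelet by a device plugin.
type Device struct {
	ID     string
	Health string
}

// DevicePlugin captures the two responsibilities described above:
// advertising device state and injecting device info before container start.
type DevicePlugin interface {
	// ListDevices reports the devices present on the node and their health.
	ListDevices() []Device
	// Allocate returns the environment variables to inject into a container
	// that was granted the given device IDs.
	Allocate(deviceIDs []string) map[string]string
}

// aiCardPlugin is an illustrative implementation for one vendor's AI accelerator card.
type aiCardPlugin struct{ cards []Device }

func (p *aiCardPlugin) ListDevices() []Device { return p.cards }

func (p *aiCardPlugin) Allocate(deviceIDs []string) map[string]string {
	// The injected variable name is an assumption for illustration only.
	env := map[string]string{}
	for i, id := range deviceIDs {
		env[fmt.Sprintf("AI_ACCELERATOR_%d", i)] = id
	}
	return env
}

func main() {
	var plugin DevicePlugin = &aiCardPlugin{cards: []Device{{ID: "card-0", Health: "Healthy"}}}
	fmt.Println(plugin.ListDevices())
	fmt.Println(plugin.Allocate([]string{"card-0"}))
}
```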
Due to internal and external factors, more and more enterprises use heterogeneous AI accelerator cards in the same cluster, for example: AI accelerator cards of different models from the same manufacturer, or AI accelerator cards of different models from different manufacturers.
Currently, to support heterogeneous AI acceleration cards in a container cloud platform or cluster, the following problems and challenges exist:
1. Containers currently created in K8s cannot dynamically specify the low-level container runtime, meaning that the pods created on a single node can only use the same default low-level container runtime. If AI accelerator cards from multiple manufacturers are installed on a node, only one type of AI accelerator card device can be used.
2. Each node of the K8s cluster needs to install different device plugin components according to the AI accelerator cards installed and the functions to be used, which can lead to device plugin conflicts and management difficulties.
3. Cluster AI accelerator card resources lack a global management perspective. There is no intuitive way to obtain the information of the different AI accelerator cards at the cluster level, such as the binding relationship between pods or containers and AI accelerator cards, the number of AI accelerator cards in use, and the computing power and memory used by each AI accelerator card.
4. The default K8s scheduler schedules only according to AI accelerator card resource usage and does not use any further information.
5. When declaring the use of AI accelerator card resources, various vendor-specific declarations need to be used.
6. When multiple types of AI accelerator cards exist in the cluster, the monitoring component needs to interface with the monitoring metrics of the various manufacturers.
Embodiment One
FIG. 1 is a schematic structural diagram of a management scheduling system of a heterogeneous accelerator card based on K8s according to a first embodiment of the present invention; as shown in FIG. 1, the system includes: a resource discovery module 110, a resource management module 120, a scheduling module 130, a container runtime module 140, and an interface service module 150.
The resource discovery module 110, the resource management module 120, and the scheduling module 130 are all connected to the interface service module 150, the resource discovery module 110 is connected to the container runtime module 140, and the resource management module 120 is connected to the scheduling module 130. The interface service module 150 may be understood as the API Server in K8s, which is the hub for data interaction and communication between the other modules.
In this embodiment, the resource discovery module 110 is configured to detect heterogeneous accelerator card resource information, and send the heterogeneous accelerator card resource information to the interface service module 150. The resource management module 120 is configured to obtain heterogeneous accelerator card resource information from the interface service module 150, so as to manage the heterogeneous accelerator card resource information. The scheduling module 130 is configured to obtain heterogeneous acceleration card resource information from the resource management module 120 based on the resource request sent by the interface service module 150, determine a target acceleration card according to the heterogeneous acceleration card resource information, schedule the container set to a node where the target acceleration card is located, and send the resource information of the target acceleration card to the resource discovery module 110 for caching. The container runtime module 140 is configured to obtain the resource information of the target accelerator card from the resource discovery module 110, and create a container in the container set according to the resource information of the target accelerator card, so as to start the container set.
The heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information. The memory resource information may be understood as the memory capacity of the accelerator card, the computing resource information may be understood as the number of computing units of the accelerator card, and the usage resource information may include the memory resources and computing resources already used by the accelerator card.
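A minimal sketch of how one per-card record covering these three categories of information might be modeled; the field names and units are assumptions for illustration, not definitions from the patent.

```go
package main

import "fmt"

// CardResources models the three kinds of resource information the
// resource discovery module reports for one accelerator card.
type CardResources struct {
	MemoryBytes  uint64 // memory resource information: total memory capacity of the card
	ComputeUnits int    // computing resource information: number of computing units
	UsedMemory   uint64 // usage resource information: memory already consumed
	UsedCompute  int    // usage resource information: computing units already consumed
}

// FreeMemory returns the memory still available, which scheduling decisions can use.
func (c CardResources) FreeMemory() uint64 { return c.MemoryBytes - c.UsedMemory }

func main() {
	card := CardResources{MemoryBytes: 16 << 30, ComputeUnits: 8, UsedMemory: 4 << 30, UsedCompute: 2}
	fmt.Printf("free memory: %d GiB\n", card.FreeMemory()>>30)
}
```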
In this embodiment, the resource discovery module 110 is deployed on each node (which may also be referred to as a server) with an AI accelerator card in the K8s cluster, and is configured to monitor the AI accelerator cards, report the AI accelerator card resource information to the interface service module 150, and respond to requests from the container runtime module 140. To better interface with K8s, the resource discovery module 110 may implement its function extension in the manner of a device plugin, and may perform data interaction with the resource management module 120 through the API Server.
Specifically, FIG. 2 is a schematic structural diagram of a resource discovery module in the present embodiment; as shown in FIG. 2, the resource discovery module 110 includes an accelerator card management unit 111, a resource reporting unit 112, a management interface unit 113, and an accelerator card service unit 114.
A plurality of accelerator card plug-ins are set in the accelerator card management unit 111, and the heterogeneous accelerator card resource information is acquired through the accelerator card plug-ins. The resource reporting unit 112 is configured to report the memory resource information and the computing resource information to the interface service module through the management interface unit 113. The accelerator card service unit 114 is configured to provide an interface for the container runtime module 140 to call for queries.
In this embodiment, the accelerator card management unit 111 adapts to AI accelerator cards of different types or models through a plug-in mechanism. When the accelerator card management unit 111 is started, the AI accelerator cards in the node are automatically scanned, the corresponding plug-ins are loaded according to the basic information (such as identification, name and model) of the scanned AI accelerator cards, and the accelerator card resource information is obtained through the loaded plug-ins. The memory resource information and the computing resource information in the accelerator card resource information are sent to the resource reporting unit 112, and the usage resource information is sent to the accelerator card service unit 114. The resource reporting unit 112 includes a memory resource (memory) reporting subunit and a computing resource (core) reporting subunit, both of which are implemented as plug-ins. The memory resource reporting subunit is used for reporting the memory resource information, and the computing resource reporting subunit is used for reporting the computing resource information. In this embodiment, the resource reporting unit 112 first reports the memory resource information and the computing resource information to the K8s component (Kubelet) through the management interface unit 113, and the K8s component then sends the memory resource information and the computing resource information to the interface service module 150. The management interface unit 113 is configured to implement communication between the resource reporting unit 112 and the K8s component, and may be implemented based on remote procedure call (Remote Procedure Call, RPC) technology. The accelerator card service unit 114 provides an interface for the container runtime module 140 to call to query the accelerator card resource information, so that the container runtime module can bind the AI accelerator card to the newly created container. The accelerator card service unit 114 also transmits the usage resource information to the interface service module 150, so that the interface service module 150 sends the usage resource information to the resource management module 120.
In this embodiment, the resource discovery module may adapt to a plurality of heterogeneous AI acceleration cards (different manufacturers, different models, etc.) through a plug-in mechanism, and expose the memory resource information and the computing resource information of the acceleration card to K8s through the memory resource reporting subunit and the computing resource reporting subunit in a device plug-in manner. In addition, the resource discovery module reports the usage resource information to the resource management module and is invoked by the container runtime module.
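A sketch of the plug-in mechanism just described — scan a card's basic information, pick the matching vendor/model adapter, and query its resources through it. The interface shape and the model-prefix matching rule are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// CardInfo is the basic information obtained when scanning a node for cards.
type CardInfo struct{ ID, Name, Model string }

// VendorPlugin is the per-vendor/per-model adapter loaded by the
// accelerator card management unit; the method set is illustrative.
type VendorPlugin interface {
	Matches(info CardInfo) bool
	QueryResources(info CardInfo) (memoryBytes uint64, computeUnits int)
}

type vendorAPlugin struct{}

func (vendorAPlugin) Matches(info CardInfo) bool            { return strings.HasPrefix(info.Model, "A-") }
func (vendorAPlugin) QueryResources(CardInfo) (uint64, int) { return 32 << 30, 16 }

// loadPlugin picks the first registered plugin that matches the scanned card,
// mirroring the "scan cards, then load the corresponding plug-in" step above.
func loadPlugin(registry []VendorPlugin, info CardInfo) (VendorPlugin, bool) {
	for _, p := range registry {
		if p.Matches(info) {
			return p, true
		}
	}
	return nil, false
}

func main() {
	registry := []VendorPlugin{vendorAPlugin{}}
	info := CardInfo{ID: "0", Name: "card0", Model: "A-100X"}
	if p, ok := loadPlugin(registry, info); ok {
		mem, cores := p.QueryResources(info)
		fmt.Println("memory bytes:", mem, "compute units:", cores)
	}
}
```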
Specifically, the resource management module 120 is further configured to group heterogeneous accelerator card resource information according to accelerator cards, obtain a plurality of accelerator card groups, and set group names of the accelerator card groups.
The heterogeneous accelerator card resource information further comprises: the name of the node where the accelerator card is located, the use state of the accelerator card, the identifier of the accelerator card, the path of the accelerator card, the model of the accelerator card, the provider to which the accelerator card belongs, the use mode of the accelerator card and the container set (pod) using the accelerator card.
The accelerator card usage state includes two states: in use and not in use. The accelerator card path can be understood as the path of the accelerator card on the node host. The accelerator card usage mode includes an exclusive mode, a shared mode and an unlimited mode, and the container set using the accelerator card includes information such as the pod name and the namespace to which it belongs. In this embodiment, the resource management module 120 may manage the heterogeneous accelerator card resource information in the form of a list, or construct a resource information model (e.g., a tree model) and manage the heterogeneous accelerator card resource information in the form of a model. The heterogeneous accelerator card resource information may be stored in the etcd storage component of K8s.
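As an illustration of the grouping step, a sketch that groups card records into accelerator card groups; grouping by vendor and model is an assumption, since the patent only states that cards are grouped and the groups are named.

```go
package main

import "fmt"

// CardRecord is a simplified per-card entry managed by the resource
// management module; the fields follow the information listed above.
type CardRecord struct {
	Node, ID, Model, Vendor string
	InUse                   bool
}

// groupCards groups records under a group name; using "vendor/model" as the
// group name is an illustrative assumption, not the patent's own rule.
func groupCards(records []CardRecord) map[string][]CardRecord {
	groups := map[string][]CardRecord{}
	for _, r := range records {
		name := r.Vendor + "/" + r.Model
		groups[name] = append(groups[name], r)
	}
	return groups
}

func main() {
	records := []CardRecord{
		{Node: "node1", ID: "card-0", Model: "X1", Vendor: "vendorA"},
		{Node: "node2", ID: "card-1", Model: "X1", Vendor: "vendorA"},
	}
	for name, g := range groupCards(records) {
		fmt.Println(name, "->", len(g), "cards")
	}
}
```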
Specifically, the scheduling module 130 is configured to obtain a preset scheduling condition, and determine a target accelerator card based on the scheduling condition and heterogeneous accelerator card resource information.
The scheduling conditions may include: the accelerator card usage mode, the accelerator card model, and the group name of the accelerator card group. The accelerator card usage mode means that accelerator cards are screened according to the usage mode preset in the pod; the accelerator card model means that accelerator cards are screened according to the model preset in the pod; and the group name of the accelerator card group means that accelerator cards are screened according to the group name preset in the pod.
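A sketch of this screening step: each set scheduling condition filters the candidate cards, and an unset condition imposes no constraint. Field names are illustrative assumptions.

```go
package main

import "fmt"

// SchedulingConditions are the pod-level constraints described above; in this
// sketch a zero value means "no constraint".
type SchedulingConditions struct {
	UsageMode string // e.g. "exclusive", "shared"
	Model     string
	GroupName string
}

type Card struct {
	ID, Model, GroupName, UsageMode string
}

// filterCards keeps only the cards that satisfy every set condition.
func filterCards(cards []Card, c SchedulingConditions) []Card {
	var out []Card
	for _, card := range cards {
		if c.UsageMode != "" && card.UsageMode != c.UsageMode {
			continue
		}
		if c.Model != "" && card.Model != c.Model {
			continue
		}
		if c.GroupName != "" && card.GroupName != c.GroupName {
			continue
		}
		out = append(out, card)
	}
	return out
}

func main() {
	cards := []Card{
		{ID: "card-0", Model: "X1", GroupName: "vendorA/X1", UsageMode: "shared"},
		{ID: "card-1", Model: "X2", GroupName: "vendorA/X2", UsageMode: "exclusive"},
	}
	fmt.Println(filterCards(cards, SchedulingConditions{Model: "X1"}))
}
```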
In this embodiment, the scheduling module 130 may be implemented through a filter plug-in (FilterPlugin), a score plug-in (ScorePlugin), a reserve plug-in (ReservePlugin), and a pre-bind plug-in (PreBindPlugin). During pod scheduling, the filter plug-in is called first; its main job is to judge whether a node meets the conditions for running the pod, and if so, it returns a success status code. When all filter plug-ins return success, the node is selected as a schedulable node. The score plug-in then runs and, according to the scores, the highest-scoring node in the schedulable node list is selected as the proposed scheduling node for the pod. Although the pod is not actually scheduled to the node at this point, if the score plug-in makes scheduling decisions based on certain saved states, those plug-in states need to be updated, such as reserving the resources needed by the pod, to prevent pods in the scheduling stage from competing for resources with pods in the binding stage. The reserve plug-in is called at this point; it provides two methods: the Reserve method is called before the pod enters the binding stage to update the plug-in state, and if a later plug-in call fails or the pod is refused scheduling, the Unreserve method is called to roll back the state updates made by the Reserve method.
In this embodiment, when scheduling multiple pods of a distributed multi-machine multi-card task, the reserve plug-in is needed to ensure that sufficient resources are allocated to the task, and to avoid the situation where resources are only partially reserved when resources are insufficient.
The pre-bind plug-in, through which the AI accelerator card information is injected into the pod, is invoked before the pod is actually bound to the proposed scheduling node. For example, an annotation such as "AI-devices/assigned-AI-of-container-x" can be used, where x denotes the sequence number of the container in the pod object configuration file and the value of the annotation is the ID of the accelerator card; if there are multiple accelerator cards, the IDs are separated by ",". Finally, the pod in the K8s cluster is updated with the updated pod object.
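A sketch of building those annotations for a pod, assuming the key format and comma-separated card IDs described above; applying the updated pod object to the K8s cluster through the client API is omitted here.

```go
package main

import (
	"fmt"
	"strings"
)

// annotateAssignedCards builds the per-container annotations described above:
// key "AI-devices/assigned-AI-of-container-<index>", value a comma-separated
// list of accelerator card IDs.
func annotateAssignedCards(assignments map[int][]string) map[string]string {
	annotations := map[string]string{}
	for containerIndex, cardIDs := range assignments {
		key := fmt.Sprintf("AI-devices/assigned-AI-of-container-%d", containerIndex)
		annotations[key] = strings.Join(cardIDs, ",")
	}
	return annotations
}

func main() {
	// Container 0 of the pod is assigned two accelerator cards.
	fmt.Println(annotateAssignedCards(map[int][]string{0: {"card-0", "card-1"}}))
}
```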
The scheduling module 130 completes the scheduling decision for the accelerator card and injects the resource information of the AI accelerator card into the pod's annotations. For a container using AI accelerator cards, what matters is whether the resource information of the AI accelerator cards on the node meets the resource conditions set by the pod.
Specifically, the container runtime module 140 is configured to create a container using an accelerator card, and provides a plug-in mechanism to facilitate the initialization process associated with the accelerator card. FIG. 3 is a schematic structural diagram of a container runtime module in this embodiment; as shown in FIG. 3, the container runtime module includes a call layer interface 141, a configuration unit 142, and an initialization plug-in 143.
Wherein the call layer (Open Container Initiative, OCI) interface 141 is used to communicate with the container management tool (containerd) and call the container management tool interface to query the configuration information of the container; the configuration unit 142 is configured to adjust configuration information based on resource information of the target accelerator card; the initializing plug-in 143 is configured to create a container in the container set and mount the running plug-in of the target accelerator card on the container based on the adjusted configuration information, and initialize the container on which the target accelerator card is mounted.
The configuration unit 142 may be understood as a container runtime library, for performing resource configuration. The initialization plug-in 143 may be plural and may be a plug-in used by a manufacturer to mount accelerator cards into a container to execute associated initialization logic.
Optionally, the configuration unit 142 is configured to write the identifier of the target accelerator card into the environment variable field of the configuration information; the configuration information further comprises a container set identifier and a container name.
By way of example, FIG. 4 is a schematic diagram of a management scheduling system of a heterogeneous accelerator card based on K8s in the present embodiment. As shown in FIG. 4, the process of creating a container may be as follows: when a pod is scheduled to a designated node, the Kubelet component invokes the container engine dockerd, dockerd invokes the container tool containerd, and the container runtime module is ultimately invoked to create the containers for the pod. The container runtime module deserializes the configuration file of the container into a configuration object; it then parses the container set identifier (pod uuid) and the container name from the configuration object, and requests the resource information list of the target accelerator card from the resource discovery module; if one pod is bound to multiple AI accelerator cards, the plug-ins of the different manufacturers are called in turn to complete the relevant configuration of the environment parameters and to mount the running plug-ins of the target accelerator cards; the container runtime module writes the configuration information into the original configuration file and finally creates the container.
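A sketch of the configuration-adjustment step in this flow: deserialize a (heavily trimmed) container configuration, write the target card identifiers into its environment variable field, and serialize it back. The JSON layout and the AI_VISIBLE_DEVICES variable name are assumptions for illustration; real OCI configuration files carry far more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ContainerConfig is a trimmed-down stand-in for the container configuration
// object the runtime module deserializes.
type ContainerConfig struct {
	PodUID        string   `json:"podUID"`
	ContainerName string   `json:"containerName"`
	Env           []string `json:"env"`
}

// injectCardEnv adds the target card identifiers to the environment variable
// field, mirroring the configuration unit's adjustment step.
func injectCardEnv(cfg *ContainerConfig, cardIDs string) {
	cfg.Env = append(cfg.Env, "AI_VISIBLE_DEVICES="+cardIDs)
}

func main() {
	raw := []byte(`{"podUID":"1234","containerName":"worker","env":["PATH=/usr/bin"]}`)
	var cfg ContainerConfig
	if err := json.Unmarshal(raw, &cfg); err != nil { // deserialize the configuration file
		panic(err)
	}
	injectCardEnv(&cfg, "card-0,card-1")
	updated, _ := json.Marshal(cfg) // write the adjusted configuration back out
	fmt.Println(string(updated))
}
```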
The management scheduling system of heterogeneous accelerator cards based on K8s of this embodiment comprises: a resource discovery module, a resource management module, a scheduling module, a container runtime module and an interface service module; the resource discovery module, the resource management module and the scheduling module are all connected with the interface service module, the resource discovery module is connected with the container runtime module, and the resource management module is connected with the scheduling module; the resource discovery module is used for detecting heterogeneous accelerator card resource information and sending the heterogeneous accelerator card resource information to the interface service module; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information; the resource management module is used for acquiring the heterogeneous accelerator card resource information from the interface service module so as to manage the heterogeneous accelerator card resource information; the scheduling module is used for acquiring the heterogeneous accelerator card resource information from the resource management module based on the resource request sent by the interface service module, determining a target accelerator card according to the heterogeneous accelerator card resource information, scheduling the container set to the node where the target accelerator card is located, and sending the resource information of the target accelerator card to the resource discovery module for caching; the container runtime module is used for acquiring the resource information of the target accelerator card from the resource discovery module, and creating a container in the container set according to the resource information of the target accelerator card so as to start the container set. The management and scheduling system for heterogeneous accelerator cards provided by this embodiment of the invention can reduce the complexity of managing and scheduling heterogeneous accelerator cards.
Embodiment Two
FIG. 5 is a flowchart of a management scheduling method of heterogeneous accelerator cards based on K8s according to a second embodiment of the present invention; as shown in FIG. 5, the method includes the following steps:
S510, acquiring a plurality of heterogeneous accelerator card resource information based on the resource request.
The heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information.
S520, determining at least one target accelerator card according to the heterogeneous accelerator card resource information.
At least one target accelerator card to be scheduled may be determined from the heterogeneous accelerator card resource information as follows: obtain a preset scheduling condition; screen the heterogeneous accelerator card resource information based on the scheduling condition to obtain candidate heterogeneous accelerator card resource information; and process the candidate heterogeneous accelerator card resource information based on a set scheduling algorithm to obtain at least one target accelerator card for scheduling (see the sketch following step S540 below).
S530, dispatching the container set to a node where the target accelerator card is located;
S540, creating a container in the container set according to the resource information of the at least one target accelerator card to start the container set.
Specifically, the manner of creating the container according to the resource information of the at least one target accelerator card may be: acquiring configuration information; adjusting an environment variable field of the configuration information according to the identification of the target accelerator card; and creating containers in the container set based on the adjusted configuration information.
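Returning to step S520, a sketch of one possible "set scheduling algorithm" for picking target cards from the screened candidates — here simply preferring the cards with the most free memory. The patent leaves the concrete algorithm open, so this ranking rule is only an assumption for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a screened accelerator card that satisfied the scheduling conditions.
type Candidate struct {
	ID          string
	FreeMemory  uint64
	FreeCompute int
}

// pickTargets sorts candidates by free memory and returns the first n,
// standing in for the set scheduling algorithm mentioned above.
func pickTargets(candidates []Candidate, n int) []Candidate {
	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].FreeMemory > candidates[j].FreeMemory
	})
	if n > len(candidates) {
		n = len(candidates)
	}
	return candidates[:n]
}

func main() {
	candidates := []Candidate{
		{ID: "card-0", FreeMemory: 8 << 30, FreeCompute: 4},
		{ID: "card-1", FreeMemory: 12 << 30, FreeCompute: 2},
	}
	fmt.Println(pickTargets(candidates, 1))
}
```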
In this embodiment, for the K8s-based management scheduling method of heterogeneous accelerator cards, reference may be made to the functions of each module of the K8s-based management scheduling system of heterogeneous accelerator cards in the foregoing embodiment, which are not repeated here.
According to the technical scheme of this embodiment, a plurality of heterogeneous accelerator card resource information is acquired based on a resource request; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information; at least one target accelerator card is determined according to the heterogeneous accelerator card resource information; the container set is scheduled to the node where the target accelerator card is located; and a container is created in the container set according to the resource information of the at least one target accelerator card to start the container set. The complexity of managing and scheduling heterogeneous accelerator cards can be reduced.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and the present invention is not limited herein.
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.
Claims (6)
1. A heterogeneous accelerator card management scheduling system based on K8s, comprising: a resource discovery module, a resource management module, a scheduling module, a container runtime module and an interface service module;
the resource discovery module, the resource management module and the scheduling module are all connected with the interface service module, the resource discovery module is connected with the container runtime module, and the resource management module is connected with the scheduling module;
the resource discovery module is used for detecting heterogeneous accelerator card resource information and sending the heterogeneous accelerator card resource information to the interface service module; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information;
the resource management module is used for acquiring heterogeneous accelerator card resource information from the interface service module so as to manage the heterogeneous accelerator card resource information;
The scheduling module is used for acquiring heterogeneous acceleration card resource information from the resource management module based on the resource request sent by the interface service module, determining a target acceleration card according to the heterogeneous acceleration card resource information, scheduling a container set to a node where the target acceleration card is located, and sending the resource information of the target acceleration card to the resource discovery module for caching;
the container runtime module is used for acquiring the resource information of the target acceleration card from the resource discovery module, and creating a container in the container set according to the resource information of the target acceleration card so as to start the container set;
The resource management module is further used for grouping the heterogeneous accelerator card resource information according to accelerator cards to obtain a plurality of accelerator card groups, and setting group names of the accelerator card groups; the heterogeneous accelerator card resource information further includes: the node name of the accelerator card, the accelerator card usage state, the accelerator card identifier, the accelerator card path, the accelerator card model, the supplier of the accelerator card, the accelerator card usage mode and the container set using the accelerator card;
the scheduling module is used for acquiring preset scheduling conditions and determining a target accelerator card based on the scheduling conditions and the heterogeneous accelerator card resource information;
The scheduling conditions include: the accelerator card usage mode, the accelerator card model, and the group name of the accelerator card group.
2. The system of claim 1, wherein the resource discovery module comprises an accelerator card management unit, a resource reporting unit, a management interface unit, and an accelerator card service unit;
the acceleration card management unit is provided with a plurality of acceleration card plug-ins, and heterogeneous acceleration card resource information is acquired through the acceleration card plug-ins;
The resource reporting unit is used for reporting the memory resource information and the computing resource information to the interface service module through the management interface unit;
The accelerator card service unit is used for providing an interface for the container runtime module to call for queries.
3. The system of claim 1, wherein the container runtime module comprises a call layer interface, a configuration unit, and an initialization plug-in;
The call layer interface is used for communicating with the container management tool and calling the container management tool interface to query the configuration information of the container; the configuration unit is used for adjusting the configuration information based on the resource information of the target accelerator card; the initialization plug-in is used for creating a container in the container set based on the adjusted configuration information, mounting the running plug-in of the target accelerator card on the container, and initializing the container on which the target accelerator card is mounted.
4. The system of claim 3, wherein the configuration unit is configured to write the identification of the target accelerator card into the environment variable field of the configuration information; the configuration information further comprises a container set identifier and a container name.
5. A method for managing and scheduling heterogeneous accelerator cards based on K8s, wherein the method is performed by the managing and scheduling system of heterogeneous accelerator cards based on K8s according to any one of claims 1 to 4, and comprises:
acquiring a plurality of heterogeneous accelerator card resource information based on a resource request; the heterogeneous accelerator card resource information comprises memory resource information, computing resource information and usage resource information;
Determining at least one target accelerator card according to the heterogeneous accelerator card resource information;
Dispatching the container set to a node where the target accelerator card is located;
Creating a container in the container set according to the resource information of the at least one target accelerator card so as to start the container set;
wherein determining at least one target accelerator card for scheduling according to the plurality of heterogeneous accelerator card resource information comprises:
Acquiring preset scheduling conditions;
screening the heterogeneous accelerator card resource information based on the scheduling conditions to obtain candidate heterogeneous accelerator card resource information;
And processing the candidate heterogeneous accelerator card resource information based on a set scheduling algorithm to obtain at least one target accelerator card for scheduling.
6. The method of claim 5, wherein creating a container from the resource information of the at least one target accelerator card comprises:
acquiring configuration information;
Adjusting an environment variable field of the configuration information according to the identification of the target accelerator card;
And creating containers in the container set based on the adjusted configuration information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311182106.3A CN117032935B (en) | 2023-09-13 | 2023-09-13 | Management scheduling system and method for heterogeneous accelerator card based on K8s |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117032935A CN117032935A (en) | 2023-11-10 |
CN117032935B (en) | 2024-05-31
Family
ID=88633900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311182106.3A Active CN117032935B (en) | 2023-09-13 | 2023-09-13 | Management scheduling system and method for heterogeneous accelerator card based on K8s |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117032935B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117971906B (en) * | 2024-04-02 | 2024-07-02 | 山东浪潮科学研究院有限公司 | Multi-card collaborative database query method, device, equipment and storage medium |
CN118426971B (en) * | 2024-07-03 | 2024-10-11 | 中国人民解放军96901部队 | AI acceleration card resource scheduling method based on double optimization models |
CN118502965B (en) * | 2024-07-16 | 2024-10-01 | 苏州元脑智能科技有限公司 | Acceleration card distribution method and device and artificial intelligent platform |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022095348A1 (en) * | 2020-11-06 | 2022-05-12 | 浪潮(北京)电子信息产业有限公司 | Remote mapping method and apparatus for computing resources, device and storage medium |
CN113377529A (en) * | 2021-05-24 | 2021-09-10 | 阿里巴巴新加坡控股有限公司 | Intelligent accelerator card and data processing method based on intelligent accelerator card |
CN116260876A (en) * | 2023-01-31 | 2023-06-13 | 苏州浪潮智能科技有限公司 | AI application scheduling method and device based on K8s and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN117032935A (en) | 2023-11-10 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |