
CN113703936B - Method for creating computing force container, computing force platform, electronic equipment and storage medium


Info

Publication number
CN113703936B
CN113703936B
Authority
CN
China
Prior art keywords: task, affinity, resource pool, cache, resources
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110397131.8A
Other languages
Chinese (zh)
Other versions
CN113703936A (en)
Inventor
查冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110397131.8A
Publication of CN113703936A
Application granted
Publication of CN113703936B
Legal status: Active


Abstract

A method for creating a computing force container, a computing force platform, electronic equipment and a storage medium are provided, relating to the field of big data processing in cloud technology. The method comprises the following steps: acquiring a target task through an affinity task cache; determining, through the affinity task cache, a first resource pool corresponding to the target task among N affinity task resource pools corresponding to N time periods respectively, according to the time period of the target task, where N is greater than 1; and creating a computing force container for the target task in the first resource pool. The method can adjust the computing power resource pool in the time dimension, prevent the resource pool from being scattered at time granularity, and reduce computing power fragmentation.

Description

Method for creating computing force container, computing force platform, electronic equipment and storage medium
Technical Field
Embodiments of the present application relate to the field of cloud technology, in particular to big data processing in cloud technology, and more particularly to a method for creating a computing force container, a computing force platform, electronic equipment, and a storage medium.
Background
To date, industry approaches to reducing computing power fragmentation have mainly worked from the resource perspective: after a user submits an AI training task, a saturation-priority algorithm is generally adopted to satisfy the user's requirements, preferentially allocating computing devices that are already fragmented. However, the saturation-priority algorithm is only effective for static, production-delivery resource allocation scenarios; in dynamic task scenarios, fragmentation still occurs even if user demands are bin-packed according to saturation priority.
Thus, there is an urgent need in the art for a more efficient method of creating a computing force container for a diversified resource allocation scenario.
Disclosure of Invention
Embodiments of the present application provide a method for creating a computing force container, a computing force platform, electronic equipment and a storage medium, which can adjust the computing power resource pool in the time dimension, prevent the resource pool from being scattered at time granularity, and reduce computing power fragmentation.
In one aspect, embodiments of the present application provide a method of creating a computing force container, the method being applied to a computing force platform and comprising the following steps:
acquiring a target task through an affinity task cache;
determining, through the affinity task cache, a first resource pool corresponding to the target task among N affinity task resource pools corresponding to N time periods respectively, according to the time period of the target task, where N is greater than 1;
a computing force container is created in the first resource pool for the target task.
In another aspect, an embodiment of the present application provides a computing platform, including:
an acquisition unit, configured to acquire the target task through the affinity task cache;
a determining unit, configured to determine, through the affinity task cache, a first resource pool corresponding to the target task among N affinity task resource pools corresponding to N time periods respectively, according to the time period of the target task, where N is greater than 1; and
a creating unit, configured to create a computing force container for the target task in the first resource pool.
In another aspect, an embodiment of the present application provides an electronic device, including:
a processor adapted to execute a computer program;
a computer readable storage medium having a computer program stored therein, which when executed by the processor, implements the method of the first aspect or the method of the second aspect.
In another aspect, embodiments of the present application provide a computer-readable storage medium storing computer instructions that, when read and executed by a processor of a computer device, cause the computer device to perform the method of the first aspect or the method of the second aspect.
In the method for creating the computing force container, the target task is acquired through the affinity task cache; a first resource pool corresponding to the target task is determined, through the affinity task cache, among N affinity task resource pools corresponding to N time periods respectively, according to the time period of the target task, where N is greater than 1; and a computing force container is created for the target task in the first resource pool.
In other words, the affinity task resource pool corresponding to the target task, namely the first resource pool, is determined through the affinity task cache, and a computing force container is created for the target task in the first resource pool. On the one hand, since the N affinity task resource pools correspond to N time periods respectively, the computing power resource pools can be adjusted in the time dimension. On the other hand, the affinity task cache determines the first resource pool among the N affinity task resource pools according to the time period of the target task, so that the target task runs in its matching affinity task resource pool; this avoids the computing power fragmentation caused by pooling tasks whose run times differ too widely, thereby reducing computing power fragmentation in the time dimension.
In addition, reducing computing power fragmentation in the time dimension facilitates the overall mobility of computing power resources, and can avoid the situation where the total computing power resources of the computing platform are sufficient at a given moment, yet the computing power resources of the particular task resource pool required by a user task cannot be satisfied.
In short, the method for creating the computing force container can adjust the computing power resource pool in the time dimension, thereby reducing computing power fragmentation in the time dimension.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an example of a system framework provided by an embodiment of the present application.
FIG. 2 is a schematic flow chart of a method of creating a computing force container provided by an embodiment of the application.
FIG. 3 is another schematic flow chart of a method of creating a computing force container provided by an embodiment of the application.
FIG. 4 is another schematic flow chart diagram of a method of creating a computing force container provided by an embodiment of the application.
FIG. 5 is a schematic block diagram of a computing platform provided by an embodiment of the present application.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The scheme provided by the application can relate to artificial intelligence technology.
Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence is thus the study of the design principles and implementation methods of various intelligent machines, enabling the machines to perceive, reason, and make decisions.
It should be appreciated that artificial intelligence techniques are a comprehensive discipline involving a wide range of fields, both hardware-level and software-level techniques. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
With research and advancement of artificial intelligence technology, research and application of artificial intelligence technology is being developed in various fields, such as common smart home, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned, automatic driving, unmanned aerial vehicles, robots, smart medical treatment, smart customer service, etc., and it is believed that with the development of technology, artificial intelligence technology will be applied in more fields and with increasing importance value.
Embodiments of the application may also relate to Machine Learning (ML) in artificial intelligence techniques. ML is a multi-domain interdiscipline involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specifically studies how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and how it reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to endow computers with intelligence; it is applied throughout the various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
In addition, the scheme provided by the application can relate to cloud technology, in particular to cloud technology big data processing; in particular to a technology for pooling computing power resources in cloud technology.
Cloud computing refers to the delivery and usage mode of IT infrastructure, meaning that required resources are obtained over a network in an on-demand, easily scalable manner; generalized cloud computing refers to the delivery and usage mode of services, meaning that required services are obtained over a network in an on-demand, easily scalable manner. Such services may be IT, software, or internet related, or other services. Cloud computing is a product of the fusion of traditional computer and network technologies such as Grid Computing, Distributed Computing, Parallel Computing, Utility Computing, Network Storage Technologies, Virtualization, and Load Balancing.
With the development of the internet, real-time data flow and diversification of connected devices, and the promotion of demands of search services, social networks, mobile commerce, open collaboration and the like, cloud computing is rapidly developed. Unlike the previous parallel distributed computing, the generation of cloud computing will promote the revolutionary transformation of the whole internet mode and enterprise management mode in concept.
Big data refers to a data set that cannot be captured, managed, and processed with conventional software tools within a certain time range; it is a massive, high-growth-rate, and diversified information asset that requires new processing modes to provide stronger decision-making, insight-discovery, and process-optimization capabilities. With the advent of the cloud era, big data has attracted more and more attention, and big data requires special techniques to effectively process large amounts of data within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the internet, and scalable storage systems.
The computing power resource pooling technology realizes unified management of hardware resources through software, changing computing resources from hardware-defined to software-defined and enabling flexible scheduling of computing power resources. Through computing power resource pooling, a user can efficiently schedule and use chip resources in a data center, thereby improving computing power utilization and reducing fragmentation and computing power cost.
Fig. 1 is an example of a system framework 100 provided by an embodiment of the present application.
As shown in FIG. 1, system framework 100 may include a task submission module 110, a task container list module 160, a pool of computing resources 101, and a data center 150. The computing resource pool 101 includes an affinity task cache 120, affinity task resource pools 1 to N, and a shared cache resource pool 140. The pool of computing resources 101 and the data center 150 may be connected by a network to receive or send messages. It should be noted that the present application does not limit the number of affinity task resource pools.
The task submission module 110 may be configured to obtain a user's request and submit tasks in the request to the computational resource pool 101. Alternatively, the task may be any task requiring computation, for example, a training task.
The affinity task cache 120 may communicate with affinity task resource pools 1 to N. Specifically, the affinity task cache 120 determines, among affinity task resource pools 1 to N, the affinity task resource pool i corresponding to a task submitted by the task submission module 110, and sends the task to affinity task resource pool i for computation. On the other hand, the affinity task cache 120 may also be configured to send the remaining resources of affinity task resource pools 1 to N and the shared cache resource pool 140 to the data center 150, so that the data center 150 dynamically adjusts the resources of affinity task resource pools 1 to N and the shared cache resource pool 140.
Affinity task resource pools 1 to N may be used to create a computing force container for the task and return it to the task container list module 160, and the task container list module 160 feeds the result back to the user.
On the one hand, the data center 150 obtains configuration instructions for initialization and sends the initialization policies of affinity task resource pools 1 to N to the affinity task cache 120; on the other hand, it may also be used to dynamically adjust the resources of affinity task resource pools 1 to N and the shared cache resource pool 140.
It should be noted that the system framework 100 is an execution subject of the method for creating a computing force container provided by the embodiments of the present application, and the system framework 100 may also be referred to as a computing power platform. The computing resource pool 101 is used to provide computing power resources for a task, and may be any device or apparatus having data processing capability, including but not limited to a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU). The data center 150 may be any network device having data computing, transfer, and storage capabilities, such as a server.
FIG. 2 is a schematic flow chart of a method 200 of creating a computing force container provided by an embodiment of the application.
It should be noted that the method 200 provided by the present application is applicable to the system framework 100, such as any computing platform having data processing capability. Optionally, the computing platform includes a pool of computing resources including, but not limited to, a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU). Optionally, the computing power resource pool includes an affinity task cache, N affinity task resource pools, and a shared cache resource pool. Optionally, the computing platform may further comprise a data center. Of course, the computing platform may also be a cloud computing platform, which is not particularly limited in the present application.
As shown in fig. 2, the method 200 may include:
S201: acquiring a target task through an affinity task cache;
S202: determining, through the affinity task cache, a first resource pool corresponding to the target task among N affinity task resource pools corresponding to N time periods respectively, according to the time period of the target task, where N is greater than 1;
S203: a computing force container is created in the first resource pool for the target task.
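For illustration only, the following is a minimal Python sketch of steps S201 to S203; the class and function names (Task, ResourcePool, AffinityTaskCache, create_container) and the use of GPU counts as the resource unit are assumptions for the example, not elements of the patent:

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    estimated_hours: float  # estimated run duration of the task
    required_gpus: int      # resources needed, counted in GPU cards here

@dataclass
class ResourcePool:
    period_hours: float     # the time period this affinity pool serves
    free_gpus: int          # remaining computing power resources

class AffinityTaskCache:
    """Routes each task to the affinity resource pool matching its time period."""
    def __init__(self, pools):  # N affinity task resource pools, N > 1
        self.pools = pools

    def select_pool(self, task):
        # S202: pick the pool whose time period is closest to the task's
        # estimated run duration (the matching rule is refined below).
        return min(self.pools, key=lambda p: abs(p.period_hours - task.estimated_hours))

def create_container(pool, task):
    # S203: create a computing force container for the task in the selected pool.
    pool.free_gpus -= task.required_gpus
    return {"task": task.name, "pool_period_h": pool.period_hours}

pools = [ResourcePool(2.0, 64), ResourcePool(8.0, 128), ResourcePool(24.0, 256)]
cache = AffinityTaskCache(pools)
task = Task("train-model", estimated_hours=7.5, required_gpus=8)  # S201
print(create_container(cache.select_pool(task), task))            # lands in the 8 h pool
```

Here the routing rule is deliberately simplified; the threshold-based matching actually applied by the affinity task cache is described in the implementations below.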
In the embodiment of the application, the N affinity task resource pools correspond to N time periods respectively. Equivalently, a time characteristic is introduced on top of the conventional resource pool. In other words, the affinity task resource pool in the present application may also be referred to as a time-affinity task resource pool, that is, an affinity task resource pool for processing the tasks whose time periods match it within a certain threshold. Correspondingly, the affinity task cache in the present application refers to a cache that can determine, using the time period of the target task, the affinity task resource pool corresponding to the target task from among the N affinity task resource pools.
In other words, the affinity task resource pool corresponding to the target task, namely the first resource pool, is determined through the affinity task cache. On the one hand, since the N affinity task resource pools correspond to N time periods respectively, the computing power resource pools can be adjusted in the time dimension. On the other hand, the affinity task cache determines the first resource pool among the N affinity task resource pools according to the time period of the target task, so that the target task runs in its matching affinity task resource pool; this avoids the computing power fragmentation caused by pooling tasks whose run times differ too widely, thereby reducing computing power fragmentation in the time dimension.
In addition, reducing computing power fragmentation in the time dimension facilitates the overall mobility of computing power resources, and can avoid the situation where the total computing power resources of the computing platform are sufficient at a given moment, yet the computing power resources of the particular task resource pool required by a user task cannot be satisfied.
It should be noted that the number N of affinity task resource pools is greater than 1, and the specific number is not limited by the present application. Further, each of the N time periods may be a time period having a start time and an end time, or may be a duration having no start time and no end time. Accordingly, the time period of the target task may be a time period having a start time and an end time, or a duration having no start time and no end time, which is not particularly limited in the present application. In summary, the present application aims to adjust the computing power resource pool in the time dimension, whereby the computing power fragmentation of the resource pools in the computing platform can be reduced in the time dimension based on the time information of tasks.
In one implementation, the computing platform obtains an instruction for creating a computing force container for a target task; based on the instruction, it determines, through the affinity task cache, the first resource pool corresponding to the target task among the N affinity task resource pools according to the time period of the target task, and creates a computing force container for the target task in the first resource pool.
In some embodiments of the application, the time period of the target task includes an estimated run period of the target task; s202 may include:
determining, through the affinity task cache, among the N time periods, the time period whose difference from the estimated run period is smaller than a first threshold, and taking the affinity task resource pool corresponding to that time period as the first resource pool. Optionally, the affinity task cache stores the N time periods corresponding to the N affinity task resource pools respectively.
In one implementation, the estimated run period of the target task may be a target time period having a start time and an end time, within which the target task is estimated to run. In this case, each of the N time periods may also be a time period having a start time and an end time; the computing platform may then determine, among the N time periods, the time period whose ending time or starting time differs from that of the estimated run period of the target task by less than the first threshold, and determine its corresponding affinity task resource pool as the first resource pool. Of course, in other implementations of the present application, the computing platform may instead determine, among the N time periods, the time period within which the estimated run period of the target task falls, and determine its corresponding affinity task resource pool as the first resource pool; the embodiments of the present application are not particularly limited in this respect.
In another implementation, the estimated time period of the target task may also be a run duration without a start time and an end time. In this case, each of the N time periods may likewise be a duration without a start time and an end time; on this basis, among the N time periods, the duration whose difference from the estimated run duration of the target task is smaller than the first threshold is determined, and the affinity task resource pool corresponding to it is determined as the first resource pool.
It should be noted that the first threshold may be set in advance or may be input by a user; the setting of the first threshold is not limited in the present application. For example, it may be 0.1 h.
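To make the two matching modes above concrete, here is a hedged sketch reusing the hypothetical ResourcePool class from the earlier sketch; the 0.1 h threshold is the example value given in the text, and the hours-since-midnight encoding of windows is an assumption:

```python
from dataclasses import dataclass

FIRST_THRESHOLD_H = 0.1  # example first threshold from the text (0.1 h)

@dataclass
class WindowPool:
    start_h: float  # window start, in hours since midnight (assumed encoding)
    end_h: float    # window end
    free_gpus: int

def select_pool_by_duration(pools, estimated_hours):
    """Periods without start/end times: match on run-duration difference."""
    for pool in pools:
        if abs(pool.period_hours - estimated_hours) < FIRST_THRESHOLD_H:
            return pool
    return None  # no affinity pool matches within the first threshold

def select_pool_by_window(pools, task_start_h, task_end_h):
    """Periods with start/end times: match on start- or end-time difference."""
    def gap(p):
        return min(abs(p.start_h - task_start_h), abs(p.end_h - task_end_h))
    best = min(pools, key=gap)
    return best if gap(best) < FIRST_THRESHOLD_H else None
```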
In some embodiments of the present application, the S203 may include:
Determining whether the resources of the first resource pool satisfy the resources required by the target task; if they do, creating a computing force container in the first resource pool; if the resources of the first resource pool cannot satisfy the resources required by the target task, acquiring a second resource from the shared cache resource pool through the affinity task cache, and creating a computing force container for the target task based on the resources of the first resource pool and the second resource. In one implementation, if the resources of the first resource pool cannot satisfy the resources required by the target task, the second resource in the shared cache resource pool is allocated to the first resource pool through the affinity task cache, and a computing force container is created for the target task in the first resource pool based on the resources of the first resource pool and the second resource acquired from the shared cache resource pool.
In other words, the computing platform determines whether the resources of the first resource pool satisfy the resources required by the target task. If they do, a computing force container is created for the target task in the first resource pool, and a container list of the task container is returned; if not, a second resource is acquired from the shared cache resource pool through the affinity task cache, a computing force container is created for the target task based on the resources of the first resource pool and the second resource, and a container list of the task container is returned. The task container may also be called a computing force container; as a computing unit, it can execute various computing tasks, including data preprocessing, training machine learning models, performing inference on unlabeled data with an existing model, and the like. Accordingly, the container list of the task container may carry the task container's computation results for the target task, such as a trained model or computed values. After the computing platform creates a computing force container for the target task, the target task can be run in the container, and the running result, namely the container list of the task container, is returned when the run finishes.
It should be noted that the second resource may be greater than or equal to the resource required by the first resource pool, which is not particularly limited by the present application.
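A hedged sketch of this fallback path, continuing the hypothetical classes above (the exception type and the GPU accounting are illustrative assumptions):

```python
class InsufficientResources(Exception):
    """Raised when neither the first pool nor the shared cache pool suffices."""

def create_container_with_fallback(first_pool, shared_pool, task):
    # If the first resource pool satisfies the task, create the container there.
    if first_pool.free_gpus >= task.required_gpus:
        first_pool.free_gpus -= task.required_gpus
    else:
        # Otherwise borrow the shortfall (the "second resource") from the
        # shared cache resource pool via the affinity task cache.
        shortfall = task.required_gpus - first_pool.free_gpus
        if shared_pool.free_gpus < shortfall:
            raise InsufficientResources(task.name)
        shared_pool.free_gpus -= shortfall  # second resource moves to the first pool
        first_pool.free_gpus = 0
    return {"task": task.name}  # stands in for the returned container list
```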
In some embodiments of the present application, the method 200 may further comprise:
If task accumulation occurs in the first resource pool, reporting a task accumulation notification to a data center through the affinity task cache; responsive to the task accumulation notification, sending, by the data center, a new allocation policy for n affinity task resource pools and the shared cache resource pool to the affinity task cache, where the n affinity task resource pools correspond to n time periods respectively, the n time periods are different from the N time periods, and n is greater than 1; and reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache based on the new allocation policy.
Here, n may or may not be equal to N, which is not particularly limited in the present application.
In one implementation, the n time periods include time periods obtained by splitting the N time periods. For example, the n time periods may include the time periods obtained by splitting the time period corresponding to the first resource pool among the N time periods.
In other words, if computing force containers need to be created for many tasks in the first resource pool, that is, when task accumulation occurs in the first resource pool, this indicates that after the data center divided the computing power resource pool into N affinity task resource pools based on time periods, the tasks of the affinity task resource pool corresponding to a certain time period are concentrated or accumulated. In this case, a task accumulation notification can be reported to the data center through the affinity task cache, so that the data center can repartition the time periods, or further split the congested time period, and resend to the affinity task cache the allocation policy of the affinity task resource pools corresponding to the repartitioned or split time periods. Correspondingly, the new allocation policy for the n affinity task resource pools and the shared cache resource pool is received from the data center through the affinity task cache, and resources are reallocated for the n affinity task resource pools and the shared cache resource pool based on the new allocation policy. It should be noted that the resources in the embodiments of the present application may be measured in units of computing power, and the present application does not limit the specific implementation. For example, the maximum number of hash operations that can be performed per second may be taken as the computing power, with units denoted hash/s; 1 MH/s is one million hashes per second, and 1 TH/s is one trillion hashes per second.
When task accumulation occurs in the first resource pool, a task accumulation notification is reported to the data center through the affinity task cache; the new allocation policies, sent by the data center, for the n affinity task resource pools and the shared cache resource pool corresponding to the n repartitioned or split time periods are received through the affinity task cache; and resources are reallocated for the n affinity task resource pools and the shared cache resource pool based on the new allocation policies. This ensures smooth allocation across the affinity task resource pools at time granularity and reduces the probability of computing resource fragmentation in the time dimension.
The new allocation policy may be new percentages of the resources of the n affinity task resource pools in the total resources of the computing platform (i.e., the computing power resource pool), new percentages of the resources of the n affinity task resource pools in the total resources of the original affinity task resource pools, or allocation amounts of the resources of the n affinity task resource pools. In other words, the new allocation policy aims at allocating or defining the resources of the n affinity task resource pools, and the present application does not limit its specific implementation. It should be noted that, since the data center originally divided the computing power resource pool based on N time periods and, after task accumulation occurs, repartitions the computing power resource pool based on n time periods or splits a certain affinity task resource pool, the n repartitioned or split time periods are different from the original N time periods.
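The backlog-handling loop might look like the following sketch, reusing the hypothetical ResourcePool class; the data_center.repartition call and the policy format (a mapping from new time periods to fractions of total platform resources) are assumptions for illustration:

```python
def handle_backlog(cache, shared_pool, data_center):
    """On task accumulation: report to the data center, receive a new
    allocation policy for n repartitioned pools, and reapply it."""
    # Report which pools are congested, identified here by their time periods.
    notice = {"congested_periods_h": [p.period_hours for p in cache.pools
                                      if getattr(p, "backlog", 0) > 0]}
    # New policy, e.g. {0.5: 0.10, 2.0: 0.25, 8.0: 0.35, 24.0: 0.30}
    policy = data_center.repartition(notice)
    total = sum(p.free_gpus for p in cache.pools) + shared_pool.free_gpus
    # Rebuild the n affinity pools for the repartitioned time periods.
    cache.pools = [ResourcePool(period_h, int(total * frac))
                   for period_h, frac in policy.items()]
    # Whatever is left over stays in the shared cache resource pool.
    shared_pool.free_gpus = total - sum(p.free_gpus for p in cache.pools)
```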
In some embodiments of the present application, the method 200 may further comprise:
Periodically sending the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool in the computing platform to the data center through the affinity task cache; receiving the adjustment policy sent by the data center through the affinity task cache; and, based on the adjustment policy, adjusting the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool through the affinity task cache.
In other words, first, the computing platform periodically sends the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center through the affinity task cache, so that the data center can determine an adjustment policy based on them; then, the data center sends the adjustment policy to the affinity task cache; correspondingly, after receiving the adjustment policy through the affinity task cache, the computing platform adjusts, based on it, the remaining resources of the N affinity task resource pools and of the shared cache resource pool. For example, after receiving the adjustment policy through the affinity task cache, the computing platform releases surplus resources among the remaining resources of the N affinity task resource pools into the shared cache resource pool; or it acquires resources from the shared cache resource pool and replenishes those affinity task resource pools whose remaining resources are insufficient.
On the one hand, the computing platform periodically sends the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center through the affinity task cache, so that the resource scheduling status of each affinity task resource pool is periodically reported to the data center, providing a data basis for the data center to adjust the remaining resources of each pool. On the other hand, the computing platform receives the adjustment policy sent by the data center through the affinity task cache and, based on it, adjusts the remaining resources of the N affinity task resource pools and of the shared cache resource pool; this allows resources to be transferred among the affinity task resource pools in the computing platform, avoiding the situation where the resources of one or several affinity task resource pools remain unscheduled or blocked for a long time. Further, this facilitates rapid iteration of tasks.
The adjustment policy may be a new percentage of the resources of the N affinity task resource pools in the total resources of a certain affinity task resource pool, a new percentage of the resources of the N affinity task resource pools in the total resources of the computing platform, or an adjustment amount of the resources of the N affinity task resource pools. In other words, the adjustment policy aims at reallocating or adjusting the adjustment amount of the resources of the N affinity task resource pools, and the present application is not limited to the specific implementation of the adjustment policy.
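The periodic report-and-adjust cycle could be sketched as follows; the report and policy shapes (one signed delta per affinity pool, drawn from or released to the shared cache pool) and the data_center.adjust call are assumptions, not fixed by the patent:

```python
import time

def report_and_adjust(cache, shared_pool, data_center, interval_s=60.0):
    """Periodically report remaining resources, then apply the data
    center's adjustment policy to the N pools and the shared pool."""
    while True:
        report = {"pools": [p.free_gpus for p in cache.pools],
                  "shared": shared_pool.free_gpus}
        deltas = data_center.adjust(report)  # e.g. [-4, 0, +4]
        for pool, delta in zip(cache.pools, deltas):
            pool.free_gpus += delta          # positive: fill from shared pool
            shared_pool.free_gpus -= delta   # negative: release surplus to it
        time.sleep(interval_s)
```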
In some embodiments of the present application, prior to S201, the method 200 may further include:
acquiring an initialization policy by using initialization data through the data center; receiving the initialization policy sent by the data center through the affinity task cache; and initializing the N affinity task resource pools and the shared cache resource pool through the affinity task cache based on the initialization policy.
In other words, after the computing platform acquires the initialization policy through the data center, the affinity task cache receives the initialization policies of the N affinity task resource pools and the shared cache resource pools sent by the data center, and allocates resources for the N affinity task resource pools and the shared cache resource pools of the computing platform based on the initialization policies through the affinity task cache.
In some implementations, the initialization data may be operation data of historical tasks, for example, the run durations of historical tasks; the computing platform may determine the amount of tasks corresponding to each of the N time periods according to the run durations of historical tasks, and then determine the initialization policy based on the amount of tasks corresponding to each time period. The initialization policy may be the percentages of the resources of the N affinity task resource pools and the shared cache resource pool in the total resources of the computing platform (i.e., the computing power resource pool), or may be the allocation amounts of the resources of the N affinity task resource pools and the shared cache resource pool. In other words, the initialization policy aims to specify the allocation amounts of the resources of the N affinity task resource pools and the shared cache resource pool. The application does not limit the specific implementation of the initialization data and the initialization policy. For ease of understanding, the following description takes as an example an initialization policy that specifies the proportions of the resources of the N affinity task resource pools and the shared cache resource pool in the total resources of the computing platform.
Assume the total resources of the computing platform (i.e., the computing power resource pool) are M, and the ratio of the total resources of all affinity task resource pools to the resources of the shared cache resource pool is A:B; then the total resources of all affinity task resource pools are M×A, and the resources of the shared cache resource pool are M×B. The data center can also determine, from summarized task data over multiple time periods, the ratio of the resources required by the tasks of each time period to the total resources required by all tasks, and take that ratio as the resource proportion of the affinity task resource pool of the corresponding time period. For example, taking 3 time periods, each containing one task, the calculated percentages of the 3 affinity task resource pools corresponding to the 3 time periods are as shown in Table 1:
TABLE 1
Time period     Affinity task resource pool     Percentage of total affinity pool resources
Time period 1   Affinity task resource pool 1   X
Time period 2   Affinity task resource pool 2   Y
Time period 3   Affinity task resource pool 3   Z
As shown in Table 1, after the computing platform receives the percentages of Table 1 sent by the data center, it determines the resources of each affinity task resource pool, that is, of the 3 affinity task resource pools, from the total resources M×A of all affinity task resource pools and the percentage of each pool; the resources are respectively M×A×X, M×A×Y and M×A×Z.
In some implementations, the configuration instructions for indicating initialization are obtained by the data center before the initialization policies are obtained by the data center using the initialization data; and responding to the configuration instruction, and acquiring the initialization strategy by the data center through the initialization data.
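Following the worked example above, a small sketch of the initialization arithmetic (M, A, B, and the Table 1 percentages X/Y/Z are the symbols from the text; the concrete numbers are made up for the demonstration):

```python
def initialize(total_m, a, b, pool_fractions):
    """Split total platform resources M as A:B between the affinity pools
    and the shared cache pool, then split the affinity share per Table 1."""
    affinity_total = total_m * a  # M*A for all affinity task resource pools
    shared = total_m * b          # M*B for the shared cache resource pool
    pools = {period: affinity_total * frac for period, frac in pool_fractions.items()}
    return pools, shared

# e.g. M = 1000 GPU-equivalents, A:B = 0.8:0.2, X/Y/Z = 0.5/0.3/0.2
pools, shared = initialize(1000, 0.8, 0.2,
                           {"period-1": 0.5, "period-2": 0.3, "period-3": 0.2})
# roughly {'period-1': 400.0, 'period-2': 240.0, 'period-3': 160.0} and 200.0,
# up to floating-point rounding
print(pools, shared)
```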
FIG. 3 is a schematic flow chart of a method 300 of creating a computing force container provided by an embodiment of the application.
It should be appreciated that the method 300 may be performed by the affinity task cache and the data center in a computing platform, which may be any computing platform having data processing capability. Optionally, the computing platform includes a computing power resource pool and a data center. The computing power resource pool includes the affinity task cache and runs on devices including, but not limited to, a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU); the data center may be any network device with data computing, transfer, and storage capabilities, such as a server.
As shown in fig. 3, the method 300 may include:
S301: the computing platform obtains a configuration instruction for indicating initialization through the data center. The configuration instructions are for instructing the data center to initialize the N affinity task resource pools and the shared cache resource pool. For example, the configuration instructions are to instruct the data center to send an initialization policy to the affinity task cache. Alternatively, as shown in fig. 1, the configuration instruction may be an instruction input by a user.
S302: the computing platform sends an initialization strategy to the affinity task cache through the data center.
S303: and the computing platform receives an initialization strategy sent by the data center through the affinity task cache, and initializes N affinity task resource pools and a shared cache resource pool of the computing platform based on the initialization strategy.
In some implementations, the initialization data may be operation data of a historical task, for example, an operation duration of the historical task, and the computing platform may determine an amount of tasks corresponding to each of the N time periods according to the operation duration of the historical task, and further determine the initialization policy based on the amount of tasks corresponding to each of the N time periods. The initialization policy may be a percentage ratio of the resources of the N affinity task resource pools in the total resources (i.e., the computing resource pools) of the computing platform, or may be an allocation amount of the resources of the N affinity task resource pools. In other words, the initialization policy aims at specifying the allocation amount of resources of the N affinity task resource pools. The application does not limit the specific implementation of the initialization data and the initialization strategy.
In some embodiments of the present application, the S202 may further include:
Determining, through the affinity task cache, based on the machine room of the target task, a first sub-resource pool corresponding to the target task from N1 sub-resource pools in the first resource pool, where the N1 sub-resource pools correspond to N1 machine rooms respectively, the N1 machine rooms include the machine room of the target task, and N1 is greater than 1; and creating a computing force container for the target task in the first sub-resource pool in the first resource pool.
In other words, through the affinity task cache, among the N1 sub-resource pools respectively corresponding to the N1 machine rooms in the first resource pool, the sub-resource pool corresponding to the machine room matching that of the target task is determined as the first sub-resource pool.
Determining, from the N1 sub-resource pools in the first resource pool, the first sub-resource pool corresponding to the machine room of the target task is equivalent to considering the machine-room affinity of the task on top of its time affinity. Thus the computing power resource pool can be adjusted in the time dimension to reduce computing power fragmentation, while the affinity task resource pool can also be adjusted in the machine-room dimension, which can accelerate task processing. In other words, on the one hand, the application considers the time characteristic of the task, i.e., its time affinity, by dividing the computing power resource pool in the time dimension; on the other hand, dividing the affinity task resource pool in the machine-room dimension considers the machine-room characteristic of the task, i.e., its machine-room affinity. On this basis, computing power fragmentation is reduced, the affinity task resource pool can be adjusted in the machine-room dimension, and task processing can be accelerated. An illustrative sketch of this sub-pool selection is given after the network variant below.
In some embodiments of the present application, the S202 may further include:
determining, through the affinity task cache, based on the network adopted by the target task, a second sub-resource pool corresponding to the target task from N2 sub-resource pools in the first resource pool, where the N2 sub-resource pools correspond to N2 networks respectively, the N2 networks include the network adopted by the target task, and N2 is greater than 1; and creating a computing force container for the target task in the second sub-resource pool in the first resource pool.
In other words, through the affinity task cache, among the N2 sub-resource pools respectively corresponding to the N2 networks in the first resource pool, the sub-resource pool corresponding to the network matching the one adopted by the target task is determined as the second sub-resource pool.
Determining, from the N2 sub-resource pools in the first resource pool, the second sub-resource pool corresponding to the network adopted by the target task is equivalent to considering the network affinity of the task on top of its time affinity. Thus the computing power resource pool can be adjusted in the time dimension to reduce computing power fragmentation, while the affinity task resource pool can also be adjusted in the network dimension, which can accelerate task processing. In other words, on the one hand, the application considers the time characteristic of the task, i.e., its time affinity, by dividing the computing power resource pool in the time dimension; on the other hand, dividing the affinity task resource pool in the network dimension considers the network characteristic of the task, i.e., its network affinity. On this basis, computing power fragmentation is reduced, the affinity task resource pool can be adjusted in the network dimension, and task processing can be accelerated.
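Both sub-pool selections follow the same lookup pattern; here is a sketch, reusing the hypothetical ResourcePool class, under the assumption that sub-pools are keyed by machine-room or network name (the keying scheme is illustrative only):

```python
def select_sub_pool(sub_pools, key):
    """Within the first resource pool, pick the sub-pool whose machine room
    (or, in the network variant, whose network) matches the target task."""
    if key not in sub_pools:
        raise KeyError(f"no sub-resource pool for {key!r}")
    return sub_pools[key]

# N1 sub-pools of the first resource pool keyed by machine room ...
room_subs = {"room-a": ResourcePool(8.0, 64), "room-b": ResourcePool(8.0, 64)}
first_sub_pool = select_sub_pool(room_subs, "room-a")
# ... or N2 sub-pools keyed by the network the task uses
net_subs = {"rdma": ResourcePool(8.0, 64), "tcp": ResourcePool(8.0, 64)}
second_sub_pool = select_sub_pool(net_subs, "rdma")
```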
FIG. 4 is a schematic flow chart of a method 400 of creating a computing force container provided by an embodiment of the application.
It should be appreciated that the method 400 is applicable to the system framework 100, such as any computing platform having data processing capability. Optionally, the computing platform includes a pool of computing resources including, but not limited to, a Graphics Processing Unit (GPU), a Central Processing Unit (CPU), a Neural Network Processing Unit (NPU), a Tensor Processing Unit (TPU), and an Accelerated Processing Unit (APU). Optionally, the computing power resource pool includes an affinity task cache, N affinity task resource pools, and a shared cache resource pool. Optionally, the computing platform may further comprise a data center. Of course, the computing platform may also be a cloud computing platform, which is not particularly limited in the present application.
As shown in fig. 4, the method 400 may include:
S401: acquiring the target task through the affinity task cache.
In one implementation, the computing platform obtains, via the affinity task cache, an instruction for creating a computing force container for the target task. Optionally, the instruction includes the estimated run period of the target task.
S402: determining, through the affinity task cache, a first resource pool corresponding to the target task among N affinity task resource pools corresponding to N time periods respectively, according to the time period of the target task, where N > 1.
Specifically, among the N time periods, the time period whose difference from the estimated run period is smaller than a first threshold is determined through the affinity task cache, and its corresponding affinity task resource pool is taken as the first resource pool.
In one implementation, the estimated run period of the target task may be a target time period having a start time and an end time, within which the target task is estimated to run. In this case, each of the N time periods may also be a time period having a start time and an end time; the computing platform may then determine, among the N time periods, the time period whose ending time or starting time differs from that of the estimated run period of the target task by less than the first threshold, and determine its corresponding affinity task resource pool as the first resource pool. Of course, in other implementations of the present application, the computing platform may instead determine, among the N time periods, the time period within which the estimated run period of the target task falls, and determine its corresponding affinity task resource pool as the first resource pool; the embodiments of the present application are not particularly limited in this respect.
In another implementation, the estimated time period of the target task may also be a run duration without a start time and an end time. In this case, each of the N time periods may likewise be a duration without a start time and an end time; on this basis, among the N time periods, the duration whose difference from the estimated run duration of the target task is smaller than the first threshold is determined, and the affinity task resource pool corresponding to it is determined as the first resource pool.
S403: do the resources of the first resource pool satisfy the resources required by the target task?
If the resources of the first resource pool meet the resources required by the target task, a computing power container is created for the target task in the first resource pool. If the resources of the first resource pool cannot meet the resources required by the target task, acquiring second resources from the shared cache resource pool through the affinity task cache; a computing power container is created for the target task based on the resources of the first resource pool and the second resources.
S404: acquiring the second resource from the shared cache resource pool through the affinity task cache.
And if the resources of the first resource pool cannot meet the resources required by the target task, acquiring second resources from the shared cache resource pool through the affinity task cache. It should be noted that the second resource may be greater than or equal to the resource required by the first resource pool, which is not particularly limited by the present application.
S405: a computing force container is created for the target task.
The computing platform creates a computing force container for the target task in the first resource pool, or creates a computing force container for the target task based on the resources of the first resource pool and the second resource.
S406: dynamically adjusting the remaining resources of the N affinity task resource pools and the remaining resources of the shared cache resource pool.
Specifically, first, the computing platform periodically sends the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center through the affinity task cache, so that the data center can determine an adjustment policy based on them; then, the data center sends the adjustment policy to the affinity task cache; correspondingly, after receiving the adjustment policy through the affinity task cache, the computing platform adjusts, based on it, the remaining resources of the N affinity task resource pools and of the shared cache resource pool. For example, after receiving the adjustment policy through the affinity task cache, the computing platform releases surplus resources among the remaining resources of the N affinity task resource pools into the shared cache resource pool; or it acquires resources from the shared cache resource pool and replenishes those affinity task resource pools whose remaining resources are insufficient.
On the one hand, the computing platform periodically sends the remaining resources of the N affinity task resource pools and of the shared cache resource pool to the data center through the affinity task cache, so that the resource scheduling status of each affinity task resource pool is periodically reported to the data center, providing a data basis for the data center to adjust the remaining resources of each pool. On the other hand, the computing platform receives the adjustment policy sent by the data center through the affinity task cache and, based on it, adjusts the remaining resources of the N affinity task resource pools and of the shared cache resource pool; this allows resources to be transferred among the affinity task resource pools in the computing platform, avoiding the situation where the resources of one or several affinity task resource pools remain unscheduled or blocked for a long time. Further, this facilitates rapid iteration of tasks.
The adjustment policy may be a new percentage of the resources of the N affinity task resource pools in the total resources of a certain affinity task resource pool, a new percentage of the resources of the N affinity task resource pools in the total resources of the computing platform, or an adjustment amount of the resources of the N affinity task resource pools. In other words, the adjustment policy aims at reallocating or adjusting the adjustment amount of the resources of the N affinity task resource pools, and the present application is not limited to the specific implementation of the adjustment policy.
S407: initialization of N affinity task resource pools and shared cache resource pools.
Specifically, an initialization policy is obtained by using initialization data through the data center; the initialization policy sent by the data center is received through the affinity task cache; and the N affinity task resource pools and the shared cache resource pool are initialized through the affinity task cache based on the initialization policy.
In other words, after the computing platform obtains the initialization policy through the data center, the affinity task cache receives the initialization policy for the resources of the N affinity task resource pools from the data center, and the computing platform allocates resources to the N affinity task resource pools and the shared cache resource pool of the computing platform through the affinity task cache based on that policy.
In some implementations, the initialization data may be operation data of historical tasks, for example their run durations. The computing platform may determine, from the run durations of the historical tasks, the task volume corresponding to each of the N time periods, and then determine the initialization policy based on those task volumes. The initialization policy may specify the percentages that the resources of the N affinity task resource pools occupy in the total resources of the computing platform (i.e., the computing resource pool), or the allocation amounts of the resources of the N affinity task resource pools. In other words, the initialization policy is intended to specify how many resources each of the N affinity task resource pools is allocated. The present application does not limit the specific implementation of the initialization data or the initialization policy.
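As a non-limiting illustration, an initialization policy could be derived from historical run durations roughly as follows; the period boundaries, the proportional-allocation rule, and all identifiers are assumptions made for the example.

```python
# Hypothetical sketch: count how many historical tasks fall into each time
# period and allocate the computing resource pool in proportion.
def init_policy(history_hours, periods, total_units):
    """history_hours: run durations of historical tasks, in hours.
    periods: list of (name, low, high) half-open duration ranges.
    Returns a mapping: pool name -> initial resource allocation."""
    counts = {name: 0 for name, _, _ in periods}
    for h in history_hours:
        for name, low, high in periods:
            if low <= h < high:
                counts[name] += 1
                break
    total = sum(counts.values()) or 1  # avoid division by zero
    return {name: total_units * c // total for name, c in counts.items()}

periods = [("0-2h", 0, 2), ("2-8h", 2, 8), (">8h", 8, float("inf"))]
print(init_policy([0.5, 1.5, 3, 6, 12, 30], periods, 100))
# {'0-2h': 33, '2-8h': 33, '>8h': 33}
```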
It should be noted that the drawings provided in the embodiments of the present application are only examples and should not be construed as limiting the present application. For example, S401, S402, and S405 shown in fig. 4 may be used to replace S201, S202, and S203 shown in fig. 2, respectively.
The preferred embodiments of the present application have been described in detail above with reference to the accompanying drawings, but the present application is not limited to the specific details of the above embodiments. Various simple modifications can be made to the technical solution of the present application within the scope of its technical concept, and all such simple modifications fall within its protection scope. For example, the specific features described in the above embodiments may be combined in any suitable manner; to avoid unnecessary repetition, the various possible combinations are not described further. As another example, the various embodiments of the present application may be combined in any way that does not depart from the spirit of the present application, and such combinations should likewise be regarded as part of the disclosure of the present application.
It should be further understood that, in the various method embodiments of the present application, the sequence numbers of the foregoing processes do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
The method provided by the embodiments of the present application is explained above; the computing platform provided by the embodiments of the present application is explained below.
Fig. 5 is a schematic block diagram of a computing platform 500 provided by an embodiment of the application.
As shown in fig. 5, the computing platform 500 includes:
an obtaining unit 501, configured to obtain a target task through an affinity task cache;
A determining unit 502, configured to determine, according to the affinity task cache, a first resource pool corresponding to the target task from N affinity task resource pools corresponding to N time periods respectively according to the time periods of the target task; n is more than 1;
A creating unit 503, configured to create a computing force container for the target task in the first resource pool.
In some embodiments of the present application, the determining unit 502 is specifically configured to:
Determining, through the affinity task cache, as the first resource pool the affinity task resource pool corresponding to the time period, among the N time periods, whose difference from the estimated run duration is smaller than a first threshold value. Optionally, the affinity task cache stores the N time periods corresponding to the N affinity task resource pools respectively.
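As a sketch only: one plausible reading of this rule represents each time period by a midpoint and picks the pool with the smallest difference below the first threshold. The midpoint representation and all names are assumptions, since the patent leaves the exact difference computation open.

```python
# Hypothetical sketch of pool selection by estimated run duration.
def select_pool(estimated_hours, pools, first_threshold):
    """pools: list of (pool_name, period_midpoint_hours).
    Returns the best-matching pool name, or None if no period is
    within the first threshold (caller may fall back elsewhere)."""
    best = None
    for name, midpoint in pools:
        diff = abs(estimated_hours - midpoint)
        if diff < first_threshold and (best is None or diff < best[1]):
            best = (name, diff)
    return best[0] if best else None

pools = [("0-2h", 1.0), ("2-8h", 5.0), (">8h", 16.0)]
print(select_pool(6.5, pools, first_threshold=4.0))  # -> 2-8h
```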
In some embodiments of the present application, the creation unit 503 is specifically configured to:
determining whether the resources of the first resource pool meet the resources required by the target task;
If the resources of the first resource pool meet the resources required by the target task, creating a computing force container for the target task in the first resource pool;
If the resources of the first resource pool cannot meet the resources required by the target task, acquiring second resources from a shared cache resource pool in the computing platform through the affinity task cache, and creating a computing force container for the target task based on the resources of the first resource pool and the second resources.
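A minimal sketch of this fallback path, assuming resources are counted in abstract units and all names are illustrative, might be:

```python
# Hypothetical sketch: satisfy the task from the first resource pool,
# borrowing any shortfall ("second resources") from the shared cache pool.
def plan_container(required, first_pool_free, shared_free):
    """Returns (from_first, from_shared), or None if even the shared
    cache resource pool cannot cover the shortfall."""
    if first_pool_free >= required:
        return required, 0                 # first pool suffices
    shortfall = required - first_pool_free
    if shared_free >= shortfall:
        return first_pool_free, shortfall  # borrow the remainder
    return None                            # e.g. queue the task instead

print(plan_container(required=8, first_pool_free=5, shared_free=10))  # (5, 3)
```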
In some embodiments of the present application, the determining unit 502 may be further configured to:
If task accumulation occurs in the first resource pool, reporting a task accumulation notification to a data center in the computing platform through the affinity task cache;
In response to the task accumulation notification, sending, by the data center, a new allocation policy for n affinity task resource pools and the shared cache resource pool of the computing platform to the affinity task cache; the n affinity task resource pools correspond to n time periods respectively, the n time periods being different from the N time periods; n is more than 1;
And reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache based on the new allocation policy.
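For illustration only, the sketch below shows one plausible accumulation check and one plausible way the n time periods could be obtained from the N original periods (here by splitting each at its midpoint, so n = 2N); the queue limit and the slicing rule are assumptions.

```python
# Hypothetical sketch of the accumulation path: detect a backlog, then
# produce a finer-grained period layout for the new allocation policy.
def accumulated(queue_len, limit=10):
    return queue_len > limit  # -> report a task accumulation notification

def slice_periods(periods):
    """Split each (low, high) period at its midpoint: N periods -> 2N."""
    sliced = []
    for low, high in periods:
        mid = (low + high) / 2
        sliced += [(low, mid), (mid, high)]
    return sliced

if accumulated(queue_len=14):
    print(slice_periods([(0, 2), (2, 8)]))
    # [(0, 1.0), (1.0, 2), (2, 5.0), (5.0, 8)]
```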
In some embodiments of the present application, the n time periods include time periods obtained by slicing the N time periods.
In some embodiments of the present application, the determining unit 502 may be further configured to:
Periodically sending the residual resources of the N affinity task resource pools and the residual resources of the shared cache resource pool in the computing platform to a data center through the affinity task cache;
Receiving an adjustment policy sent by the data center through the affinity task cache;
And adjusting, based on the adjustment policy, the residual resources of the N affinity task resource pools and the residual resources of the shared cache resource pool in the computing platform through the affinity task cache.
In some embodiments of the present application, the determining unit 502 may be further configured to:
acquiring an initialization policy by using initialization data through the data center;
receiving the initialization policy sent by the data center through the affinity task cache;
And initializing the N affinity task resource pools and the shared cache resource pool of the computing platform through the affinity task cache based on the initialization policy.
In some embodiments of the present application, the determining unit 502 may be further configured to, prior to obtaining the initialization policy by the data center using the initialization data:
acquiring a configuration instruction for indicating initialization through the data center;
And in response to the configuration instruction, acquiring the initialization policy through the data center by using the initialization data.
In some embodiments of the present application, the determining unit 502 may be further specifically configured to:
Determining a first sub-resource pool corresponding to the target task from N1 sub-resource pools in the first resource pool based on the machine room of the target task through the affinity task cache; the N1 sub-resource pools respectively correspond to N1 machine rooms, and the N1 machine rooms comprise the machine room of the target task; n1 is more than 1;
A computing force container is created for the target task in the first sub-resource pool in the first resource pool.
In some embodiments of the present application, the determining unit 502 may be further specifically configured to:
determining a second sub-resource pool corresponding to the target task in N2 sub-resource pools in the first resource pool based on a network adopted by the target task through the affinity task cache; the N2 sub-resource pools respectively correspond to N2 networks, and the N2 networks comprise networks adopted by the target task; n2 is more than 1;
a computing force container is created for the target task in the second sub-resource pool in the first resource pool.
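As a combined, non-limiting illustration of both sub-pool selections above, the sketch below keys sub-pools within the first resource pool by (machine room, network); the keys and the rdma/tcp labels are assumptions made for the example.

```python
# Hypothetical sketch: narrow the first resource pool down to a sub-pool
# matching the task's machine room and network before creating the container.
def select_sub_pool(sub_pools, machine_room, network):
    """sub_pools: dict keyed by (machine_room, network) -> free units.
    Returns the matching key, or None if it has no free resources."""
    key = (machine_room, network)
    return key if sub_pools.get(key, 0) > 0 else None

sub_pools = {("room-A", "rdma"): 12, ("room-A", "tcp"): 3,
             ("room-B", "rdma"): 0}
print(select_sub_pool(sub_pools, "room-A", "rdma"))  # ('room-A', 'rdma')
```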
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and similar descriptions may refer to the method embodiments; to avoid repetition, they are not repeated here. Specifically, the computing platform 500 may correspond to the execution subject of the methods 200, 300, and 400 of the embodiments of the present application, and the foregoing and other operations and/or functions of the respective modules in the computing platform 500 are respectively intended to implement the corresponding flows of the methods in fig. 2 to fig. 4; for brevity, they are not described further here.
It should also be understood that the units of the computing platform 500 according to the embodiments of the present application may be separately or wholly combined into one or several other units, or one (or more) of them may be further split into a plurality of functionally smaller units, without affecting the technical effects of the embodiments of the present application. The above units are divided based on logical functions; in practical applications, the function of one unit may be implemented by a plurality of units, or the functions of a plurality of units may be implemented by one unit. In other embodiments of the present application, the computing platform 500 may also include other units, and in practice these functions may be implemented with the assistance of, or cooperatively by, other units. According to another embodiment of the present application, the computing platform 500 and the method of creating a computing force container may be implemented by running a computer program (including program code) capable of executing the steps of the respective methods on a general-purpose computing device, such as a computer, that includes processing elements such as a Central Processing Unit (CPU) and storage elements such as a random access memory (RAM) and a read-only memory (ROM). The computer program may be recorded on a computer-readable storage medium, loaded into an electronic device, and executed therein to implement the corresponding method of the embodiments of the present application.
In other words, the units referred to above may be implemented in hardware, by instructions in software, or by a combination of the two. Specifically, each step of the method embodiments of the present application may be completed by an integrated logic circuit of hardware in a processor and/or by instructions in software form; the steps of the methods disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software within the decoding processor. Optionally, the software may reside in a storage medium well-established in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.
Fig. 6 is a schematic structural diagram of an electronic device 600 provided in an embodiment of the present application.
As shown in fig. 6, the electronic device 600 includes at least a processor 610 and a computer-readable storage medium 620, which may be connected by a bus or in other ways. The computer-readable storage medium 620 is used to store a computer program 621, which includes computer instructions; the processor 610 is used to execute the computer instructions stored in the computer-readable storage medium 620. The processor 610 is the computing core and control core of the electronic device 600, adapted to implement one or more computer instructions, in particular to load and execute one or more computer instructions so as to implement the corresponding method flow or function.
By way of example, the processor 610 may also be referred to as a Central Processing Unit (CPU). The processor 610 may include, but is not limited to: a general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic devices, discrete hardware components, or the like.
By way of example, the computer-readable storage medium 620 may be a high-speed RAM memory, or a non-volatile memory such as at least one disk memory; optionally, it may be at least one computer-readable storage medium located remotely from the aforementioned processor 610. In particular, computer-readable storage media include, but are not limited to, volatile memory and/or non-volatile memory. The non-volatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM).
In one implementation, the electronic device 600 may be the computing platform 500 shown in fig. 5; the computer-readable storage medium 620 has computer instructions stored therein; these computer instructions are loaded and executed by the processor 610 to implement the corresponding steps in the method embodiments shown in fig. 2 to fig. 4; the details are not repeated here for brevity.
According to another aspect of the present application, an embodiment of the present application further provides a computer-readable storage medium (memory), which is a memory device in the electronic device 600 for storing programs and data, such as the computer-readable storage medium 620. It is understood that the computer-readable storage medium 620 here may include a built-in storage medium in the electronic device 600, and may also include an extended storage medium supported by the electronic device 600. The computer-readable storage medium provides storage space that stores the operating system of the electronic device 600. Also stored in this storage space are one or more computer instructions, which may be one or more computer programs 621 (including program code), adapted to be loaded and executed by the processor 610.
According to another aspect of the present application, a computer program product or computer program is provided, comprising computer instructions stored in a computer-readable storage medium, such as the computer program 621. In this case, the electronic device 600 may be a computer; the processor 610 reads the computer instructions from the computer-readable storage medium 620 and executes them, causing the computer to perform the method of creating a computing force container provided in the various optional manners described above.
In other words, when implemented in software, the above may be realized in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions of the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means.
Those of ordinary skill in the art will appreciate that the elements and process steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It should be noted that the above is only a specific embodiment of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about the changes or substitutions within the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (12)

1. A method of creating a computing force container, the method being applied to a computing force platform and comprising:
acquiring a target task through an affinity task cache;
determining a first resource pool corresponding to the target task in N affinity task resource pools corresponding to N time periods respectively according to the time periods of the target task through the affinity task cache; n is more than 1;
creating a computing force container for the target task in the first resource pool;
If task accumulation occurs in the first resource pool, reporting a task accumulation notification to a data center in the computing platform through the affinity task cache;
In response to the task accumulation notification, sending a new allocation policy for n affinity task resource pools and a shared cache resource pool of the computing platform to the affinity task cache through the data center; the n affinity task resource pools correspond to n time periods respectively, and the n time periods are different from the N time periods; n is more than 1;
And reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache based on the new allocation policy.
2. The method of claim 1, wherein the time period of the target task comprises an estimated run duration of the target task;
The determining, by the affinity task cache, a first resource pool corresponding to the target task in N affinity task resource pools corresponding to N time periods respectively according to the time periods of the target task includes:
Determining, through the affinity task cache, as the first resource pool the affinity task resource pool corresponding to the time period, among the N time periods, whose difference from the estimated run duration is smaller than a first threshold value.
3. The method of claim 1, wherein the creating a computing force container in the first resource pool for the target task comprises:
Determining whether the resources of the first resource pool meet the resources required by the target task;
If the resources of the first resource pool meet the resources required by the target task, creating a computing force container for the target task in the first resource pool;
If the resources of the first resource pool cannot meet the resources required by the target task, acquiring second resources from a shared cache resource pool in the computing platform through the affinity task cache; and creating a computing force container for the target task based on the resources of the first resource pool and the second resources.
4. The method of claim 1, wherein the n time periods include time periods obtained by slicing the N time periods.
5. The method according to any one of claims 1 to 4, further comprising:
Periodically sending the residual resources of the N affinity task resource pools and the residual resources of the shared cache resource pool in the computing platform to a data center through the affinity task cache;
Receiving an adjustment policy sent by the data center through the affinity task cache;
And adjusting, based on the adjustment policy, the residual resources of the N affinity task resource pools and the residual resources of the shared cache resource pool in the computing platform through the affinity task cache.
6. The method according to any one of claims 1 to 4, wherein prior to the obtaining the target task by the affinity task cache, the method further comprises:
acquiring an initialization policy by using initialization data through a data center;
receiving the initialization policy sent by the data center through the affinity task cache;
And initializing the N affinity task resource pools and the shared cache resource pool of the computing platform through the affinity task cache based on the initialization policy.
7. The method of claim 6, wherein prior to obtaining the initialization policy by the data center using the initialization data, the method further comprises:
acquiring a configuration instruction for indicating initialization through a data center;
And in response to the configuration instruction, acquiring the initialization policy through the data center by using the initialization data.
8. The method of any one of claims 1 to 4, wherein the creating a computing force container in the first resource pool for the target task comprises:
Determining a first sub-resource pool corresponding to the target task from N1 sub-resource pools in the first resource pool based on a machine room of the target task through the affinity task cache; the N1 sub-resource pools respectively correspond to N1 machine rooms, and the N1 machine rooms comprise the machine room of the target task; N1 is more than 1;
A computing force container is created for the target task in the first sub-resource pool in the first resource pool.
9. The method of any one of claims 1 to 4, wherein the creating a computing force container in the first resource pool for the target task comprises:
Determining a second sub-resource pool corresponding to the target task from N2 sub-resource pools in the first resource pool based on a network adopted by the target task through the affinity task cache; the N2 sub-resource pools respectively correspond to N2 networks, and the N2 networks comprise networks adopted by the target task; n2 is more than 1;
a computing force container is created for the target task in the second sub-resource pool in the first resource pool.
10. A computing platform, comprising:
The acquisition unit is used for acquiring the target task through the affinity task cache;
The determining unit is used for determining a first resource pool corresponding to the target task in N affinity task resource pools corresponding to N time periods respectively according to the time periods of the target task through the affinity task cache; n is more than 1;
A creating unit, configured to create a computing force container for the target task in the first resource pool;
The determining unit is further configured to:
If task accumulation occurs in the first resource pool, reporting a task accumulation notification to a data center in the computing platform through the affinity task cache;
In response to the task accumulation notification, sending a new allocation policy for n affinity task resource pools and a shared cache resource pool of the computing platform to the affinity task cache through the data center; the n affinity task resource pools correspond to n time periods respectively, and the n time periods are different from the N time periods; n is more than 1;
And reallocating resources for the n affinity task resource pools and the shared cache resource pool through the affinity task cache based on the new allocation policy.
11. An electronic device, comprising:
a processor adapted to execute a computer program;
A computer readable storage medium having stored therein a computer program which, when executed by the processor, implements the method of any of claims 1 to 9.
12. A computer readable storage medium storing a computer program for causing a computer to perform the method of any one of claims 1 to 9.
CN202110397131.8A 2021-04-13 Method for creating computing force container, computing force platform, electronic equipment and storage medium Active CN113703936B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110397131.8A CN113703936B (en) 2021-04-13 Method for creating computing force container, computing force platform, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113703936A 2021-11-26
CN113703936B 2024-11-15

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231049A (en) * 2020-09-28 2021-01-15 苏州浪潮智能科技有限公司 Computing equipment sharing method, device, equipment and storage medium based on kubernets
CN112416599A (en) * 2020-12-03 2021-02-26 腾讯科技(深圳)有限公司 Resource scheduling method, device, equipment and computer readable storage medium


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant