
CN112445521B - Data processing method, related equipment and computer readable medium - Google Patents

Data processing method, related equipment and computer readable medium Download PDF

Info

Publication number
CN112445521B
Authority
CN
China
Prior art keywords
instruction
static
block
jump
data block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910829877.4A
Other languages
Chinese (zh)
Other versions
CN112445521A (en)
Inventor
Name not disclosed at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cambricon Technologies Corp Ltd
Original Assignee
Cambricon Technologies Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cambricon Technologies Corp Ltd filed Critical Cambricon Technologies Corp Ltd
Priority to CN201910829877.4A priority Critical patent/CN112445521B/en
Publication of CN112445521A publication Critical patent/CN112445521A/en
Application granted granted Critical
Publication of CN112445521B publication Critical patent/CN112445521B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30007Arrangements for executing specific machine instructions to perform operations on data operands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/3005Arrangements for executing specific machine instructions to perform operations for flow control

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

An embodiment of the invention discloses a computing device comprising a processor, a memory, and a bus. The processor and the memory are connected through the bus, the memory stores instructions, and the processor calls the instructions stored in the memory to execute the data processing method described herein, thereby improving data processing performance and efficiency.

Description

Data processing method, related equipment and computer readable medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data processing method, a related device, and a computer readable medium.
Background
When an operator processes a data block, the shape of the data block is denoted (n, c, h, w). When the system generates instructions for the operator, the shape parameters of the data block must be specified; in other words, the generated instructions only support data blocks of a specific scale. If the shape parameters of the data block change, instructions must be regenerated before the operator can process the new data block.
However, in some application scenarios, such as natural language processing (NLP), the shape of the data block fed to an operator is only determined at runtime, so the system cannot generate the corresponding instructions for the data block in advance. Such variable-data-block scenarios are therefore not covered, and the range of application is limited.
Disclosure of Invention
An embodiment of the invention provides a data processing method that addresses shortcomings of the prior art, such as the limited range of application and the inability to cover scenarios with variable data blocks.
In a first aspect, an embodiment of the present invention provides a data processing method, including: when performing operation processing on a variable data block with a preset operator, the computing device determines a target dynamic instruction block and at least one target static instruction block adopted for processing the variable data block. Each target static instruction block executes the logical operation of a static data block of a preset scale; the static data blocks corresponding to the at least one target static instruction block have mutually different scales and together piece together the variable data block. During the actual processing, if the computing device detects a pseudo jump instruction in the target dynamic instruction block, it responds to the pseudo jump instruction by jumping to and calling a first static instruction block to execute the logical operation of the corresponding static data block; the first static instruction block is an instruction block among the at least one target static instruction block. After the logical operation is executed, the computing device jumps, according to the jump instruction in the first static instruction block, to the next dynamic control instruction adjacent to the pseudo jump instruction. Following this principle, the computing device implements the operation processing of the variable data block under the control of the target dynamic instruction block.
By implementing this embodiment of the invention, a preset operator can adaptively process different variable data blocks, which solves problems of the prior art such as the limited range of application and the inability to cover scenarios with variable data blocks.
With reference to the first aspect, in some possible embodiments, the computing device may, in response to the pseudo jump instruction, add the first static instruction block indicated by the pseudo jump instruction to a current instruction list, the current instruction list including at least the target dynamic instruction block. It then sets a first jump offset in order to jump to the first static instruction block, and calls the first static instruction block to execute the logical operation of the static data block it indicates; the first jump offset is the difference between the number of instructions in the current instruction list before the first static instruction block is added and the index number of the pseudo jump instruction.
With reference to the first aspect, in some possible embodiments, the computing device may set a second jump offset according to the jump instruction in the first static instruction block in order to jump to the next dynamic control instruction in the target dynamic instruction block adjacent to the pseudo jump instruction; the second jump offset is the difference between the number of instructions in the current instruction list after the first static instruction block is added and the index number of the pseudo jump instruction plus one.
With reference to the first aspect, in some possible embodiments, the static data block is identified by a first number pair, where the first number pair is (the identification static_op_id of the target static instruction block, the static table index table_index). The static table index indicates the index number of the static data block, and the static data blocks belonging to the same preset operator are stored in the same static index table.
With reference to the first aspect, in some possible embodiments, the identification of the variable data block is used to indicate an offset of the variable data block relative to a storage base address of a preset data block, and a memory address of the variable data block is stored in a target address, where the target address is an address determined by the storage base address of the preset data block and the offset.
With reference to the first aspect, in some possible embodiments, the preset operator includes at least one expression calculation formula, the computing device may perform an operation processing on the variable data block by using the expression calculation formula to obtain an expression result, and then store the expression result in a pre-allocated expression data block, where the expression calculation formula adopts a second number pair identifier, and the second number pair is (identifier data_id of the expression data block, and identifier exp_id of the expression calculation formula), and an identifier of the expression calculation formula is used to indicate an offset of the expression result relative to the expression data block.
With reference to the first aspect, in some possible embodiments, the computing device may further utilize a variable network to split and compile a preset operator, to obtain an initial dynamic instruction block of the preset operator and at least one initial static instruction block, where the initial dynamic instruction block includes at least one dynamic control instruction for controlling to implement a logical operation of the variable data block, and the initial static instruction block includes at least one static operation instruction for executing a logical operation of the preset static data block; if any dynamic control instruction in the initial dynamic instruction blocks needs to call a second static instruction block, inserting a pseudo jump instruction after any dynamic control instruction so as to obtain a new dynamic instruction block, wherein the second static instruction block is any one of the at least one initial static instruction block, and the pseudo jump instruction is used for indicating to jump to the second static instruction block; inserting a jump instruction after the last static operation instruction in the second static instruction block so as to obtain a new static instruction block, wherein the jump instruction is used for indicating to jump to the next dynamic control instruction of any dynamic control instruction; the target dynamic instruction block is the new dynamic instruction block, and the target static instruction block is an instruction block in the new static instruction block.
In a second aspect, an embodiment of the present invention provides a data processing apparatus comprising means or modules for performing the method of the first aspect described above.
In a third aspect, an embodiment of the present invention provides another computing device, including a processor, a memory, and a bus, where the processor and the memory are connected by the bus, the memory is configured to store instructions, and the processor is configured to invoke the instructions stored in the memory, and perform the method of the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium storing a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of the first aspect described above.
By implementing the embodiment of the invention, variable data blocks of different scales can be processed adaptively with a preset operator, which solves problems of the prior art such as the limited range of application and the inability to cover scenarios with variable data blocks, and improves data processing performance.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a data processing method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of an instruction jump according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a data secondary storage according to an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of another data processing apparatus according to an embodiment of the present invention.
Fig. 6 is a schematic structural diagram of a computing device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims of this application and in the drawings, are used for distinguishing between different objects and not for describing a particular sequential order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The applicant found in the course of the present application that: at present, when an operator is used for carrying out operation processing on data blocks, corresponding instructions are required to be generated for the operator for the data blocks with different scales. This will generate a large number of instructions, taking up more memory space. And for any data block, the time cost of instruction generation is larger, and if the number of the data blocks is larger, larger data processing delay is caused, and the data processing efficiency and performance are affected. In addition, in special application scenarios such as NLP, the data block to be processed of the input operator needs to be determined in the running process, the computing device cannot generate corresponding instructions for the data block to be processed in advance, the application scenarios with variable data blocks cannot be covered, and the application range is limited.
In order to solve the above problems, the present application proposes a data processing method, and related devices and apparatuses to which the method is applicable. Fig. 1 is a schematic flow chart of a data processing method according to an embodiment of the invention. The method as shown in fig. 1 comprises the following implementation steps:
s101, splitting and compiling a preset operator by the computing equipment to obtain an initial dynamic instruction block and at least one initial static instruction block of the preset operator. The initial dynamic instruction block comprises at least one dynamic control instruction and is used for controlling the logic operation of the variable data block. The initial static instruction block comprises at least one static operation instruction and is used for executing logic operation of static data blocks with preset scale.
The computing device splits the network components contained in a network model into an operator list containing one or more operators. Take a neural network model as an example: it contains a convolution (CONV) layer, an activation layer, and the like, and the computing device splits the convolution layer of the neural network model into the corresponding convolution operator. The computing device may then pass the operator list and the network type to a network instruction generation interface inst_gen_interface, use the interface to split and iteratively compile each operator in the operator list to obtain the operator's abstract instructions, and combine these abstract instructions to obtain the corresponding initial dynamic instruction block and initial static instruction blocks. The network type reflects the type of the network model and may specifically be a static network model or a dynamic network model. Since a static network model involves no dynamic control, the embodiments of the present application are applicable to dynamic network models.
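As an illustration only, this splitting step can be sketched as follows (Python; inst_gen_interface is the only name taken from this description, and its signature, the Operator type, and the helper names are assumptions):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Operator:
    op_id: int          # e.g. 1 for CONV, matching Table 1 below
    name: str

def split_network(layers: List[str]) -> List[Operator]:
    """Split the network components (CONV layer, activation layer, ...) of a
    network model into an operator list."""
    return [Operator(op_id=i + 1, name=n) for i, n in enumerate(layers)]

def inst_gen_interface(op_list: List[Operator], network_type: str):
    """Assumed shape of the network instruction generation interface: it is
    only used for dynamic network models and iterates over the operator list,
    compiling each operator into abstract instructions (the compilation
    itself is sketched in later examples)."""
    assert network_type == "dynamic", "static network models involve no dynamic control"
    return [(op.op_id, f"abstract instructions of {op.name}") for op in op_list]

ops = split_network(["CONV", "RELU"])
print(inst_gen_interface(ops, "dynamic"))
```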
Each operator is split into two parts: a dynamic control part and a static operation part. In other words, for any (preset) operator op_id there is a dynamic control operator dynamic_op_id and a static operation operator static_op_id, each part being distinguished by its own identifier. Table 1 below shows the mapping table of the preset operator split.
TABLE 1

Operator identification op_id | (static operation operator identification, dynamic control operator identification)
1 (CONV)                      | (static_op_id1, dynamic_op_id1)
...                           | ...
In practical application, the computing device may call an operator instruction generator (a specific function) to compile the corresponding operator and obtain the abstract instructions it contains. Illustratively, the computing device invokes the operator instruction generators to compile the split operators (specifically, the dynamic control operator and the static operation operator) and correspondingly obtains an initial dynamic instruction block and at least one initial static instruction block. The initial dynamic instruction block includes one or more dynamic control instructions, and each initial static instruction block includes one or more static operation instructions.
Taking a dynamic control operator (abbreviated as a convolution dynamic operator) contained in the convolution operator as an example, the computing device calls a function Gen_conv_op_inst () to compile the convolution dynamic operator, so as to obtain an abstract instruction contained in the convolution dynamic operator, which can be specifically a dynamic control instruction. See table 2 for an exemplary mapping table of preset operators and operator instruction generators.
TABLE 2

Operator identification op_id | Operator instruction generator function
1 (CONV)                      | Gen_conv_op_inst()
...                           | ...
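The two mappings of Table 1 and Table 2 can be sketched as simple lookup tables (Python; the CONV row mirrors the tables above, while the generator body and the layer_t contents are illustrative assumptions):

```python
# Sketch of Table 1 and Table 2 held as lookup tables.
OP_SPLIT_TABLE = {
    1: ("static_op_id1", "dynamic_op_id1"),   # 1 = CONV (Table 1)
}

def gen_conv_op_inst(layer_t):
    """Stand-in for Gen_conv_op_inst(): returns the abstract instructions of
    the convolution operator compiled against the operator layer table."""
    return [("CONV_ABSTRACT_INST", layer_t)]

OP_INST_GENERATORS = {
    1: gen_conv_op_inst,                      # 1 = CONV (Table 2)
}

def compile_operator(op_id, layer_t):
    """Split an operator and invoke its operator instruction generator."""
    static_op_id, dynamic_op_id = OP_SPLIT_TABLE[op_id]
    abstract_insts = OP_INST_GENERATORS[op_id](layer_t)
    return static_op_id, dynamic_op_id, abstract_insts

print(compile_operator(1, {"input_shape": (32, 32)}))
```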
During compilation of the static operation operator, after the computing device has completed the mapping between op_id and (static_op_id, dynamic_op_id), it also needs to obtain a static scale table, which records the static data blocks of different scales supported by the system. Then, when compiling the static operation operator, the computing device uses these static data blocks as template data blocks, constructs the variable data blocks involved in the static operation operator, and generates the corresponding initial static instruction blocks so as to implement the logical operations on the static data blocks, such as the convolution operation indicated by the convolution operator.
For example, Table 3 below shows one possible static scale table.
TABLE 3

static_op_id | table_index | Static data block H×W
1            | 1           | (1, 1)
1            | 2           | (3, 3)
...          | ...         | ...
It is understood that for the same operator, the static operator of the operator may have a plurality of static data blocks h×w of different sizes. Therefore, the static table index table_index is used to distinguish or store the serial number of the static data block in the static scale table, and the first number pair (static_op_id, table_index) is used to identify the static data block, or the static instruction block corresponding to the static data block. For the same operator, the information of the operator is typically recorded by an operator layer table layer_t. The information of the operator comprises data block information, additional parameters required by the operator and the like. The data block information comprises information of variable data blocks of an input operator, information of output data blocks of an output operator, additional data blocks required in operator operation processing and the like. The additional parameters may be, for example, model layer parameters required in the convolution operator, etc. The data block information may specifically include a data block address, a data block shape (i.e., a data block size), a data block type, and a data block identification data_id.
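A minimal sketch of the static scale table and the first number pair lookup (Python; the dictionary layout is an assumption, its rows are taken from Table 3):

```python
# Sketch of Table 3: the first number pair (static_op_id, table_index)
# identifies a preset static data block of shape (H, W), and therefore also
# the static instruction block compiled for that static data block.
STATIC_SCALE_TABLE = {
    # (static_op_id, table_index): (H, W)
    (1, 1): (1, 1),
    (1, 2): (3, 3),
}

def static_block_shape(static_op_id, table_index):
    """Resolve a first number pair to its static data block shape H*W."""
    return STATIC_SCALE_TABLE[(static_op_id, table_index)]

print(static_block_shape(1, 2))   # -> (3, 3)
```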
Wherein the data block address is used to indicate the memory address allocated by the computing device for the data block, in other words the data block address is used to store the corresponding data block. The data block identifier is used for distinguishing the data blocks, and is a unique identifier of the data blocks. The data block shape is used to reflect the size of the data block. For example, the size of a data block is typically n×c×h×w, where N represents the number of data blocks, C represents the number of channels (also referred to as dimensions), H represents the height, and W represents the width. Taking a data block as an example for describing an image, N represents the number of input images, C represents the number of channels of the image, and c=3 as an example for an RGB color image. H and W represent the height and width of the image, respectively.
In practical applications, data block processing on a two-dimensional plane is generally considered, and the application uses the height H and width W directions (i.e., HW planes) as an example to illustrate the relevant content, while compressing C and N to 1. In other words, the sizes of the data blocks according to the embodiments of the present application are all denoted as h×w.
In one embodiment, S101 specifically includes the following implementation steps:
s11, the computing equipment acquires an operator identifier op_id.
S12, the computing equipment determines a dynamic control operator dynamic_op_id and a static operation operator static_op_id included in the operators. Specifically, the computing device may query and obtain the dynamic control operator dynamic_op_id and the static operation operator static_op_id included in the operator op_id according to table 1.
And S13, the computing device obtains the static scale table static_table according to the static operation operator static_op_id, so that the static data blocks configured for the static operation operator can be looked up in static_table.
And S14, the computing equipment copies an operator layer table layer_t for the dynamic control operator so as to generate a corresponding initial dynamic instruction block according to the layer_t and the dynamic control operator.
S15, the computing equipment traverses the static scale table, copies an operator layer table layer_t for a static operator static_op_id, and converts a data block (particularly, a variable data block of an input operator) in the layer_t into a static data block H x W in the static scale table so as to generate a corresponding initial static instruction block according to the layer_t and the static operator.
Specifically, the computing device converts the variable data block input in layer_t into a static data block in the static scale table, thereby obtaining at least one static data block constituting the variable data block. And then splitting and compiling the static operator according to at least one static data block related to the static operator to obtain at least one initial static instruction block. The static data blocks are in one-to-one correspondence with the static instruction blocks, and the static instruction blocks are used for indicating to execute the logic operation of the static data blocks.
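Steps S11 to S15 can be condensed into one sketch (Python; layer_t is modelled as a dictionary, and all helper names are assumptions standing in for the mechanisms described above):

```python
import copy

def split_and_compile(op_id, layer_t, op_split_table, static_scale_table, gen_inst):
    """Sketch of S11-S15 for one preset operator. gen_inst stands in for the
    operator instruction generators of Table 2."""
    # S12: look up the dynamic control operator and the static operation operator.
    static_op_id, dynamic_op_id = op_split_table[op_id]

    # S13: collect the static scale table rows belonging to this static operator.
    scales = [(idx, hw) for (sid, idx), hw in static_scale_table.items()
              if sid == static_op_id]

    # S14: copy the operator layer table and generate the initial dynamic
    # instruction block from it and the dynamic control operator.
    initial_dynamic_block = gen_inst(dynamic_op_id, copy.deepcopy(layer_t))

    # S15: for every static scale, copy layer_t, replace the input variable
    # data block by the static data block H*W, and compile an initial static
    # instruction block.
    initial_static_blocks = {}
    for table_index, (h, w) in scales:
        stat_layer = copy.deepcopy(layer_t)
        stat_layer["input_shape"] = (h, w)
        initial_static_blocks[(static_op_id, table_index)] = gen_inst(static_op_id, stat_layer)
    return initial_dynamic_block, initial_static_blocks
```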
It should be noted that, after the operator is compiled, the computing device may translate each abstract instruction into a binary instruction (also referred to as a hardware instruction) that supports the hardware implementation of the device, where the hardware instruction and the abstract instruction are set to correspond to each other one by one. For convenience of description, the abstract instruction and the binary instruction are collectively referred to as an instruction, and may be specifically referred to as a dynamic control instruction, a static operation instruction, a pseudo jump instruction, a jump instruction, and the like.
S102, if any dynamic control instruction in the initial dynamic instruction block needs to call any initial static instruction block (also called a second static instruction block), the computing device inserts a pseudo jump instruction after any dynamic control instruction, so as to obtain a new dynamic instruction block.
Since the size of the variable data block may be different from the size of the static data block in the static size table, the variable data block may be composed of a plurality of static data blocks during the actual operation process. At this time, when any dynamic control instruction in the dynamic instruction block calls the static instruction block to execute the logic operation of the static data block with a specific scale, the position of the static instruction block cannot be known, so that a pseudo jump instruction needs to be inserted after any dynamic control instruction, thereby obtaining a new dynamic instruction block. The pseudo jump instruction is used for indicating a static instruction block to be jumped to, so that the corresponding logic operation is executed on the static data block correspondingly to the static instruction block.
In practical applications, the number of pseudo jump instructions included in the new dynamic instruction block is not limited, and may be one or more.
S103, inserting a jump instruction after the last static operation instruction in the second static instruction block, so as to obtain a new static instruction block. The jump instruction is used for indicating to jump to the next dynamic control instruction of any dynamic control instruction.
In the application, if multiple static instruction blocks are required to be called in the new dynamic instruction block to realize the logic operation of the variable data block, the computing device may insert a jump instruction after each static instruction block to jump back to the new dynamic instruction block to perform the control processing of the next dynamic control instruction. Thus correspondingly a plurality of new static instruction blocks are available.
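A minimal sketch of steps S102 and S103 (Python; the tuple-based instruction encoding and the helper names are assumptions):

```python
# Sketch of S102/S103: insert pseudo jump instructions into the initial
# dynamic instruction block and a return jump into each static instruction block.

def insert_pseudo_jumps(initial_dynamic_block, calls):
    """calls maps the position of a dynamic control instruction to the first
    number pair of the static instruction block it needs to invoke."""
    new_dynamic_block = []
    for pos, inst in enumerate(initial_dynamic_block):
        new_dynamic_block.append(inst)
        if pos in calls:
            # Pseudo jump: "jump to the static instruction block identified by
            # this first number pair"; the actual offset is only set at run time.
            new_dynamic_block.append(("PSEUDO_JUMP", calls[pos]))
    return new_dynamic_block

def append_return_jump(initial_static_block):
    """Insert a jump instruction after the last static operation instruction;
    at run time it jumps back to the next dynamic control instruction."""
    return initial_static_block + [("JUMP_BACK",)]
```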
And S104, when the computing equipment utilizes a preset operator to carry out operation processing on the input variable data block, determining a target dynamic instruction block and at least one target static instruction block adopted for processing the variable data block. The target dynamic instruction block is a new dynamic instruction block and the target static instruction block is an instruction block in the new static instruction block. The static data blocks corresponding to the at least one target static instruction block jointly form a variable data block, and the sizes of the static data blocks corresponding to each target static instruction block are different from each other.
The target dynamic instruction block comprises at least one dynamic control instruction and at least one pseudo-jump instruction, wherein each pseudo-jump instruction is used for indicating to jump to a corresponding target static instruction block so as to execute the logical operation of the corresponding static data block of the target static instruction block. The target dynamic instruction block is used for controlling how to call at least one target static instruction block so as to realize the operation processing of the whole variable data block. Each target static instruction block comprises at least one static operation instruction and a jump instruction, wherein the jump instruction is used for indicating to jump back to the next dynamic control instruction in the target dynamic instruction block so as to control the operation control indicated by the next dynamic control instruction.
And S105, if the computing equipment detects a pseudo jump instruction in the target dynamic control instruction, responding to the pseudo jump instruction, jumping and calling the first static instruction block to execute logic operation of the corresponding static data block. The first static instruction block is an instruction block in at least one target static instruction block.
In the operation processing process, if the computing equipment detects a pseudo jump instruction in the target dynamic control instruction, the computing equipment can respond to the pseudo jump instruction and jump to a first static instruction block, and further call the first static instruction block to execute corresponding logic operation on a static data block indicated by the first static instruction block. The pseudo jump instruction is to instruct invoking the first static instruction block to perform a logical operation of the corresponding static data block.
Specifically, the computing device responds to the pseudo jump instruction by adding all static operation instructions included in the first static instruction block to the current instruction list; the instructions already in the current instruction list are those used to implement the operation processing of the variable data block, for example all instructions of the target dynamic instruction block, such as dynamic control instructions and pseudo jump instructions. It then sets a first jump offset, jumps according to that offset from the pseudo jump instruction in the target dynamic instruction block to the first static instruction block, and calls the first static instruction block to execute the logical operation of the corresponding static data block. The first jump offset is the difference between the number of instructions in the current instruction list before the first static instruction block is added and the index number of the pseudo jump instruction.
For example, Fig. 2 shows a schematic diagram of a current instruction list. Assume that the variable data block is of size 3×3 and that the static scale table provides only static data blocks 1×1 and 2×2. As shown in Fig. 2, the current instruction list includes instructions 1 through n+k, where instructions 1 to n form the target dynamic instruction block and instructions n+1 to n+k belong to target static instruction blocks. If the computing device currently detects that instruction i in the target dynamic instruction block is a pseudo jump instruction, the pseudo jump instruction i indicates that the first static instruction block should be called to perform the logical operation of the static data block 2×2. In response to the pseudo jump instruction i, the computing device adds the l instructions included in the first static instruction block to the current instruction list, which then contains (n+k+l) instructions. It sets a first jump offset (n+k-i) to jump to the first static instruction block and executes the logical operation of the static data block 2×2 according to the instructions of the first static instruction block.
And S106, after the logic operation is executed, the computing equipment jumps to the next dynamic control instruction adjacent to the pseudo jump instruction according to the jump instruction in the first static instruction block. According to the principle, the computing device controls the operation processing of the whole variable data block according to the target dynamic instruction block.
After the first static instruction block is called to execute the corresponding logic operation, the computing device can jump back to the next dynamic control instruction adjacent to the pseudo jump instruction in the target dynamic instruction block according to the last jump instruction included in the first static instruction block, such as the i+1th dynamic control instruction shown in fig. 2. The computing device repeatedly executes the steps of steps S105 and S106 as above, and sequentially calls at least one target static instruction block according to the instruction of the target dynamic instruction block to implement the operation processing of the whole variable data block.
In particular, the computing device may set the second jump offset according to the jump instruction in the first static instruction block so as to jump back to the next dynamic control instruction adjacent to the pseudo jump instruction in the target dynamic instruction block. The second jump offset is the difference between the number of instructions in the current instruction list after the first static instruction block is added and the index number of the pseudo jump instruction plus one. Continuing the example of Fig. 2, the current instruction list contains (n+k+l) instructions; after executing the first static instruction block, the computing device may set the second jump offset to (n+k+l-(i+1)) and thereby jump back to the (i+1)-th dynamic control instruction in the target dynamic instruction block, i.e. the dynamic control instruction following the pseudo jump instruction i.
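The run-time handling of S105/S106, including both jump offsets, can be sketched as follows (Python; the list-based instruction model is an assumption, while the offset formulas and the n, k, l, i notation come from the description and Fig. 2):

```python
# Sketch of the run-time jump handling. Before the call, the current
# instruction list holds n + k instructions and the pseudo jump instruction
# sits at index i; the first static instruction block holds l instructions.
def handle_pseudo_jump(current_list, i, first_static_block):
    # First jump offset: (list length before the block is added) - i,
    # i.e. n + k - i in the Fig. 2 example.
    first_offset = len(current_list) - i
    current_list.extend(first_static_block)   # list now holds n + k + l instructions

    # Second jump offset: (list length after the block is added) - (i + 1),
    # i.e. n + k + l - (i + 1); the jump instruction at the end of the static
    # block uses it to return to dynamic control instruction i + 1.
    second_offset = len(current_list) - (i + 1)
    return first_offset, second_offset

# Usage with the Fig. 2 notation (values chosen only to illustrate the formulas):
n, k, l, i = 6, 4, 3, 2
insts = [f"inst{j}" for j in range(n + k)]
print(handle_pseudo_jump(insts, i, [f"static{j}" for j in range(l)]))
# -> (8, 10), i.e. (n + k - i, n + k + l - (i + 1))
```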
In an alternative embodiment, when the computing device performs operation processing on the variable data block by using the preset operator, parameters (such as operator parameters, operation results or intermediate data generated by processing) involved in the operation processing process may be stored in a memory, for example, a corresponding register disposed in the memory, so as to be called or queried later.
In an alternative embodiment, since the data blocks fed to the preset operator are variable, they cannot be simulated at compile time, and the number of data blocks to be processed is not limited. During compilation, the computing device therefore pre-allocates a storage base address (base address for short) for the data block addresses; the base address is where the data block addresses are stored. When there are several data blocks, the computing device can determine the offset of a data block address relative to the base address from the data block identification data_id, read the data block address corresponding to data_id from the storage address given by the base address plus that offset, and finally read the data block corresponding to data_id from the data block address. During actual operation, the data block address is typically the memory address allocated by the computing device for the corresponding data block.
In other words, the data blocks related to the application are all stored in a two-level mapping mode. The computing device allocates a space in memory, referred to as memory space for short. The memory space stores all the data block addresses. In the actual data operation processing process, the computing device can search the corresponding data block address from the memory space, and then read the data block corresponding to the data block address.
Referring to fig. 3, a schematic diagram of a structure of a secondary storage of a data block is shown. As shown in fig. 3, the structure diagram includes a base address data_table_addr, i.e. a start address for storing all data block addresses in the memory space, and a storage address addr_table for storing all data block addresses. In the operator compiling process, the computing device can acquire a base address data_table_addr, and then calculate and store an offset of a data block address corresponding to the data_id relative to a starting address according to a data block identifier data_id to obtain a storage address of the data block address. And finally, writing the data block address into a storage address of the data block address for storage.
Accordingly, during the operation of the actual data block, the computing device allocates a memory address for the data block (e.g., the variable data block described above) as the data block address. And then reading the data block address from the storage address addr_table of the data block address to obtain a data block to be operated and processed.
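A small sketch of this two-level addressing (Python; the table entry width and the concrete addresses are assumptions):

```python
# Sketch of the two-level (secondary) storage of data blocks: the base address
# data_table_addr points to a table of data block addresses indexed by data_id;
# each table entry in turn points to the data block itself.
ENTRY_SIZE = 8                       # assumed width of one stored address

def addr_table_entry(data_table_addr, data_id):
    """Address where the data block address for data_id is stored
    (base address plus a data_id-dependent offset)."""
    return data_table_addr + data_id * ENTRY_SIZE

# Compile time: write the (later allocated) data block address into the table.
def write_block_addr(memory, data_table_addr, data_id, block_addr):
    memory[addr_table_entry(data_table_addr, data_id)] = block_addr

# Run time: read the data block address back, then read the data block itself.
def read_block_addr(memory, data_table_addr, data_id):
    return memory[addr_table_entry(data_table_addr, data_id)]

memory = {}
write_block_addr(memory, data_table_addr=0x1000, data_id=3, block_addr=0x8000)
print(hex(read_block_addr(memory, 0x1000, 3)))   # -> 0x8000
```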
In an alternative embodiment, the preset operator also involves expression calculation formulas (expressions for short); for example, the convolution operator involves a convolution calculation formula. An expression is typically related to the variable data block, for example to its height H and width W; an example expression is H+3W. Evaluating complex expressions on the chip is itself complex and carries a large time overhead. To simplify the computation and reduce calculation time, the computing device can pre-compute the result of an expression on the CPU (central processing unit) and cache the expression result at a storage address in memory for later use.
Specifically, the computing device allocates (or adds) a corresponding expression data block for each operator, in which the results computed by the operator's expression calculation formulas are stored. Because an expression data block, like the data blocks to be processed by the operator, is itself a data block, the computing device can reuse the operator data block mechanism, and each expression data block has its own unique identification data_id. The computing device assigns an expression identification exp_id to each expression in the operator and then identifies an expression by a second number pair, namely (the identification data_id of the expression data block, the identification exp_id of the expression calculation formula). The identification of the expression calculation formula indicates the offset at which the result computed by that expression is stored within the expression data block.
In other words, the present application uses the two-level storage structure shown in Fig. 3 above to store the expression result of each expression calculation formula in an operator. Specifically, the computing device allocates a memory space data_table_addr in memory for storing the expression data blocks corresponding to the operators. Following the principle shown in Fig. 3, the computing device stores the expression data block address at the storage address addr_table corresponding to data_id, the expression data block address being the memory address allocated for the corresponding expression data block during actual operation. The computing device then stores the result computed by an expression calculation formula into the expression data block corresponding to data_id: since each expression has a fixed offset within the expression data block, the computing device determines, from the identification exp_id of the expression, the offset of the corresponding expression result within the expression data block and writes the result at that offset, thereby completing the storage of the expression result.
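A sketch of the expression-result storage, reusing the same two-level idea with the second number pair (data_id, exp_id) (Python; the per-result width and the concrete values are assumptions):

```python
# Sketch of storing pre-computed expression results. The expression data block
# of an operator is itself addressed through the two-level table by data_id;
# within that block, exp_id determines the offset of one result.
RESULT_SIZE = 4                      # assumed width of one expression result

def expression_result_addr(addr_table, data_id, exp_id):
    """Resolve the second number pair (data_id, exp_id) to the memory
    address holding the pre-computed expression result."""
    exp_block_addr = addr_table[data_id]          # first level: expression data block
    return exp_block_addr + exp_id * RESULT_SIZE  # second level: offset by exp_id

def store_expression_result(memory, addr_table, data_id, exp_id, value):
    memory[expression_result_addr(addr_table, data_id, exp_id)] = value

# E.g. cache the CPU-precomputed value of the expression H + 3*W.
memory, addr_table = {}, {7: 0x9000}              # data_id 7 -> expression data block
store_expression_result(memory, addr_table, data_id=7, exp_id=0, value=10 + 3 * 20)
```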
The computing device according to the present invention includes, but is not limited to, a chip (e.g., AI chip), a smart phone (e.g., android phone, IOS phone, etc.), a personal computer, a tablet computer, a palm computer, a mobile internet device (MID, mobile Internet Devices), or a wearable smart device, etc., and the embodiments of the present invention are not limited.
By implementing the embodiment of the invention, the operation processing of variable data blocks with different scales can be realized by utilizing operator self-adaption, the problems that the application range is limited, the application scene with variable data blocks cannot be covered and the like in the prior art can be solved, and the efficiency and the performance of data processing are improved.
The apparatus and associated devices to which the present application is applicable are described below based on the embodiments described in fig. 1-3. Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. The data processing apparatus as shown in fig. 4 includes a determination unit 402, a processing unit 404, and a jumping unit 406. Wherein,
the determining unit 402 is configured to determine, when performing operation processing on a variable data block by using a preset operator, a target dynamic instruction block and at least one target static instruction block adopted when processing the variable data block, where each target static instruction block is configured to perform a logical operation of a static data block of a preset size, and the static data blocks corresponding to the at least one target static instruction block are different from each other and together form the variable data block;
the processing unit 404 is configured to, if a pseudo jump instruction in the target dynamic instruction block is detected, jump and call a first static instruction block to perform a logical operation of a corresponding static data block in response to the pseudo jump instruction, where the first static instruction block is an instruction block in the at least one target static instruction block;
The jump unit 406 is configured to jump to a next dynamic control instruction adjacent to the pseudo jump instruction according to the jump instruction in the first static instruction block after the logic operation is performed, so as to control and implement the operation processing of the variable data block.
In some possible embodiments, the processing unit 404 is specifically configured to, in response to the pseudo jump instruction, add the first static instruction block indicated by the pseudo jump instruction to a current instruction list, where the current instruction list includes at least the target dynamic instruction block; and to set a first jump offset to jump to the first static instruction block and call the first static instruction block to execute the logical operation of the static data block it indicates; the first jump offset is the difference between the number of instructions in the current instruction list before the first static instruction block is added and the index number of the pseudo jump instruction.
In some possible embodiments, the jump unit 406 is specifically configured to set a second jump offset according to the jump instruction in the first static instruction block so as to jump to the next dynamic control instruction adjacent to the pseudo jump instruction in the target dynamic instruction block; the second jump offset is the difference between the number of instructions in the current instruction list after the first static instruction block is added and the index number of the pseudo jump instruction plus one.
In some possible embodiments, the target static instruction block is identified by a first number pair, where the first number pair is (the identification static_op_id of the target static instruction block, the static table index table_index), and the static table index is used to indicate the index number of the static data block corresponding to the target static instruction block.
In some possible embodiments, the identification of the variable data block is used to indicate an offset of the variable data block relative to a storage base address of a preset data block, and the memory address of the variable data block is stored in a target address, where the target address is an address determined by the storage base address of the preset data block and the offset.
In some possible embodiments, a schematic diagram of another data processing apparatus is shown in fig. 5. The apparatus further comprises an arithmetic unit 408 and a storage unit 410. Wherein:
the operation unit 408 is configured to perform an operation process on the information of the variable data block by using the expression calculation formula, so as to obtain an expression result;
the storage unit 410 is configured to store the expression result in a pre-allocated expression data block, where the expression calculation formula uses a second number pair identifier, the second number pair is (identifier data_id of the expression data block and identifier exp_id of the expression calculation formula), and the identifier of the expression calculation formula is used to indicate an offset of the expression result relative to the expression data block.
In some possible embodiments, the apparatus may further comprise a compiling unit 412 and an inserting unit 414 as shown in fig. 5. Wherein:
the compiling unit 412 is configured to split and compile a preset operator to obtain an initial dynamic instruction block and at least one initial static instruction block of the preset operator, where the initial dynamic instruction block includes at least one dynamic control instruction for controlling a logic operation for implementing the variable data block, and the initial static instruction block includes at least one static operation instruction for executing a logic operation of a static data block of a preset scale;
the inserting unit 414 is configured to insert a pseudo jump instruction after any one of the initial dynamic instruction blocks if the second static instruction block is required to be called by any one of the initial dynamic instruction blocks, so as to obtain a new dynamic instruction block, where the second static instruction block is any one of the at least one initial static instruction block, and the pseudo jump instruction is used to instruct to jump to the second static instruction block; inserting a jump instruction after the last static operation instruction in the second static instruction block so as to obtain a new static instruction block, wherein the jump instruction is used for indicating to jump to the next dynamic control instruction of any dynamic control instruction;
The target dynamic instruction block is the new dynamic instruction block, and the target static instruction block is an instruction block in the new static instruction block.
Optionally, the memory unit in the data processing device may also store program code for implementing the relevant operations of the data processing device. In practical applications, each module or unit involved in the apparatus in the embodiment of the present invention may be implemented by a software program or by hardware. When implemented by a software program, the modules or units are software modules or software units; when implemented by hardware, they may be implemented by an application-specific integrated circuit (application-specific integrated circuit, ASIC) or a programmable logic device (programmable logic device, PLD), where the PLD may be a complex programmable logic device (complex programmable logic device, CPLD), a field-programmable gate array (field-programmable gate array, FPGA), generic array logic (generic array logic, GAL), or any combination thereof; the invention is not limited in this respect.
It should be noted that fig. 4 or fig. 5 are only one possible implementation manner of the embodiment of the present application, and in practical applications, more or fewer components may be included in the data processing apparatus, which is not limited herein. For details not shown or described in the embodiments of the present invention, reference may be made to the related descriptions in the foregoing method embodiments, which are not repeated here.
Fig. 6 is a schematic structural diagram of a computing device according to an embodiment of the invention. The computing device shown in fig. 6 includes one or more processors 601, a communication interface 602, and a memory 603, where the processors 601, the communication interface 602, and the memory 603 may be connected by a bus, or may communicate by other means such as wireless transmission. In the embodiment of the present invention, the memory 603 is used for storing instructions, and the processor 601 is used for executing the instructions stored in the memory 603, which is connected through the bus 604. The memory 603 stores program codes, and the processor 601 may call the program codes stored in the memory 603 to perform the following operations:
when a variable data block is subjected to operation processing by using a preset operator, determining a target dynamic instruction block and at least one target static instruction block adopted when the variable data block is processed, wherein each target static instruction block is used for executing logic operation of a static data block with a preset scale, and the static data blocks corresponding to the at least one target static instruction block are mutually different and jointly form the variable data block;
if a pseudo jump instruction in the target dynamic instruction block is detected, responding to the pseudo jump instruction, jumping and calling a first static instruction block to execute logic operation of a corresponding static data block, wherein the first static instruction block is an instruction block in the at least one target static instruction block;
After the logic operation is executed, according to the jump instruction in the first static instruction block, jumping to the next dynamic control instruction adjacent to the pseudo jump instruction to control and realize the operation processing of the variable data block.
For details not shown or described in the embodiment of the present invention, reference may be made to the foregoing description of the embodiment of fig. 1, which is not repeated herein.
It should be appreciated that in embodiments of the present invention, the processor 601 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The communication interface 602 may be a wired interface (e.g., an ethernet interface) or a wireless interface (e.g., a cellular network interface or using a wireless local area network interface) for communicating with other units or apparatus devices. For example, the communication interface 602 in the embodiment of the present application may be specifically configured to obtain a static instruction block or a dynamic instruction block, etc.
The memory 603 may include volatile memory (Volatile Memory), such as random access memory (Random Access Memory, RAM); it may also include non-volatile memory (Non-Volatile Memory), such as read-only memory (Read-Only Memory, ROM), flash memory (Flash Memory), a hard disk drive (Hard Disk Drive, HDD), or a solid state drive (Solid State Drive, SSD); the memory may also comprise a combination of the above types of memory. The memory may be used to store a set of program code, so that the processor can invoke the program code stored in the memory to implement the functions of the functional modules described above in the embodiments of the present invention.
It should be noted that fig. 6 is merely one possible implementation of an embodiment of the present invention, and the computing device may include more or fewer components in practical applications, which are not limited herein. For details not shown or described in the embodiments of the present invention, reference may be made to the related descriptions in the foregoing method embodiments, which are not repeated here.
Embodiments of the present invention also provide a computer readable storage medium having instructions stored therein that, when executed on a processor, implement the method flow shown in fig. 1.
Embodiments of the present invention also provide a computer program product for implementing the method flow shown in the embodiment of fig. 1 when the computer program product is run on a processor.
The computer readable storage medium may be an internal storage unit of a computing device, such as a hard disk or a memory of the computing device, as described in any of the foregoing embodiments. The computer readable storage medium may also be an external storage device of the computing device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), etc. that are provided on the computing device. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the client. The computer readable storage medium is used to store the computer program and other programs and data required by the computing device. The computer-readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clearly understood by those skilled in the art that, for convenience and brevity of description, for the specific working processes of the terminal device and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal device and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of the units is merely a logical function division, and there may be other manners of division in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiments of the present invention.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
If the integrated units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method of data processing, the method comprising:
when a variable data block is subjected to operation processing by using a preset operator, determining a target dynamic instruction block and at least one target static instruction block adopted when the variable data block is processed, wherein each target static instruction block is used for executing a logical operation on a static data block of a preset scale, and the static data blocks corresponding to the at least one target static instruction block are different from one another and together constitute the variable data block;
if a pseudo jump instruction in the target dynamic instruction block is detected, in response to the pseudo jump instruction, jumping to and calling a first static instruction block to execute a logical operation on a corresponding static data block, wherein the first static instruction block is an instruction block in the at least one target static instruction block;
after the logical operation is executed, jumping, according to the jump instruction in the first static instruction block, to the next dynamic control instruction adjacent to the pseudo jump instruction, so as to control and implement the operation processing of the variable data block.
2. The method of claim 1, wherein, in response to the pseudo jump instruction, jumping to and calling the first static instruction block to execute the logical operation on the corresponding static data block comprises:
in response to the pseudo jump instruction, adding the first static instruction block indicated by the pseudo jump instruction to a current instruction list, wherein the current instruction list at least comprises the target dynamic instruction block;
setting a first jump offset to jump to the first static instruction block, and calling the first static instruction block to execute the logical operation on the static data block indicated by the first static instruction block, wherein the first jump offset is the difference between the index number of the pseudo jump instruction and the number of instructions in the current instruction list before the first static instruction block is added.
3. The method of claim 2, wherein jumping to the next dynamic control instruction adjacent to the pseudo jump instruction according to the jump instruction in the first static instruction block comprises:
setting a second jump offset according to the jump instruction in the first static instruction block, so as to jump to the next dynamic control instruction adjacent to the pseudo jump instruction in the target dynamic instruction block, wherein the second jump offset is the difference between the index number of the pseudo jump instruction plus one and the number of instructions in the current instruction list after the first static instruction block is added.
4. The method of claim 1, wherein the target static instruction block is identified by a first number pair, the first number pair comprising an identifier static_op_id of the target static instruction block and a static table index table_index for indicating an index number of the static data block corresponding to the target static instruction block.
5. The method of claim 1, wherein the identifier of the variable data block is used to indicate an offset of the variable data block relative to a storage base address of a preset data block, and wherein a memory address of the variable data block is stored at a target address, the target address being determined by the storage base address of the preset data block and the offset.
6. The method of claim 1, wherein the preset operator comprises at least one expression calculation formula, the method further comprising:
performing operation processing on information of the variable data block by using the expression calculation formula to obtain an expression result;
storing the expression result in a pre-allocated expression data block, wherein the expression calculation formula is identified by a second number pair, the second number pair comprising an identifier data_id of the expression data block and an identifier exp_id of the expression calculation formula, and the identifier of the expression calculation formula is used for indicating an offset of the expression result relative to the expression data block.
7. The method of any one of claims 1-6, wherein the method further comprises:
splitting and compiling a preset operator to obtain an initial dynamic instruction block and at least one initial static instruction block of the preset operator, wherein the initial dynamic instruction block comprises at least one dynamic control instruction for controlling and implementing the logical operation on the variable data block, and each initial static instruction block comprises at least one static operation instruction for executing the logical operation on a static data block of a preset scale;
if any dynamic control instruction in the initial dynamic instruction block needs to call a second static instruction block, inserting a pseudo jump instruction after that dynamic control instruction so as to obtain a new dynamic instruction block, wherein the second static instruction block is any one of the at least one initial static instruction block, and the pseudo jump instruction is used for indicating a jump to the second static instruction block;
inserting a jump instruction after the last static operation instruction in the second static instruction block so as to obtain a new static instruction block, wherein the jump instruction is used for indicating a jump to the next dynamic control instruction following that dynamic control instruction;
wherein the target dynamic instruction block is the new dynamic instruction block, and the target static instruction block is an instruction block among the new static instruction blocks.
8. A computing device comprising a determination unit, a processing unit, and a jump unit, wherein,
the determination unit is configured to, when a preset operator is used to perform operation processing on a variable data block, determine a target dynamic instruction block and at least one target static instruction block adopted when the variable data block is processed, wherein each target static instruction block is used for executing a logical operation on a preset static data block, and the static data blocks corresponding to the at least one target static instruction block are different from one another and together constitute the variable data block;
the processing unit is configured to, if a pseudo jump instruction in the target dynamic instruction block is detected, respond to the pseudo jump instruction by jumping to and calling a first static instruction block to perform a logical operation on a corresponding static data block, wherein the first static instruction block is an instruction block in the at least one target static instruction block; and
the jump unit is further configured to, after the logical operation is executed, jump to the next dynamic control instruction adjacent to the pseudo jump instruction according to the jump instruction in the first static instruction block, so as to control the operation processing of the variable data block.
9. A computing device comprising a processor, a memory and a bus, the processor and the memory being connected by the bus, the memory for storing instructions, the processor for invoking the instructions stored in the memory for performing the method of any of claims 1-7.
10. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-7.
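The following is a minimal, illustrative Python sketch of one way the dynamic/static instruction-list scheme of claims 1-3 and 7 could be realized. All names (Instr, StaticBlock, run_operator, the "kind" strings) and the sign convention chosen for the jump offsets are assumptions introduced for illustration only; they are not part of the claimed method.

from __future__ import annotations
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Instr:
    # One entry in the instruction list.
    kind: str                                   # "dynamic", "pseudo_jump", "static_op", or "jump"
    action: Callable[[], None] = lambda: None   # work done by a dynamic or static instruction
    target_block: Optional[StaticBlock] = None  # only used by a pseudo jump instruction
    return_index: Optional[int] = None          # only used by the trailing jump instruction

@dataclass
class StaticBlock:
    # Compiled logical operation for one static data block of a preset scale (claim 7).
    static_op_id: int                           # identifier of the static instruction block (claim 4)
    table_index: int                            # index of the static data block it operates on (claim 4)
    ops: List[Instr] = field(default_factory=list)

def run_operator(dynamic_block: List[Instr]) -> None:
    # Execute a dynamic instruction block, expanding pseudo jump instructions on the fly.
    instr_list = list(dynamic_block)            # the "current instruction list" of claim 2
    pc = 0
    while pc < len(instr_list):
        instr = instr_list[pc]
        if instr.kind == "pseudo_jump":
            # Claim 2: append the indicated static instruction block to the current list.
            len_before = len(instr_list)
            instr_list.extend(instr.target_block.ops)
            # Claims 3 and 7: the appended block ends with a jump back to the
            # dynamic control instruction adjacent to the pseudo jump instruction.
            instr_list.append(Instr(kind="jump", return_index=pc + 1))
            # First jump offset: relates the index number of the pseudo jump (pc) to the
            # number of instructions before the block was added (sign convention assumed).
            first_jump_offset = len_before - pc
            pc = pc + first_jump_offset         # enter the first static instruction block
        elif instr.kind == "jump":
            # Second jump offset: return to the next adjacent dynamic control instruction.
            pc = instr.return_index
        else:
            instr.action()                      # dynamic control or static operation
            pc += 1

Under these assumptions, appending the static instruction block to the end of the current instruction list keeps the already-issued dynamic instructions at fixed index numbers, which is why both offsets can be expressed purely in terms of the pseudo jump instruction's index and the list length before or after the block is added.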
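Claims 4-6 describe how instruction blocks, variable data blocks, and expression results are located through number pairs and offsets. The short Python sketch below illustrates such offset-based addressing under simplifying assumptions: the names (BASE_ADDR, variable_block_address, store_expression_result) and the dictionary-based memory model are hypothetical and are not taken from the patent.

from typing import Any, Dict, List

BASE_ADDR = 0x1000  # assumed storage base address of the preset data block (claim 5)

def variable_block_address(memory: Dict[int, int], block_id: int) -> int:
    # Claim 5: the identifier of the variable data block is an offset from the storage
    # base address; the memory address of the block is stored at the target address
    # determined by base address + offset.
    target_address = BASE_ADDR + block_id
    return memory[target_address]

def store_expression_result(expression_blocks: Dict[int, List[Any]],
                            data_id: int, exp_id: int, result: Any) -> None:
    # Claim 6: the second number pair (data_id, exp_id) selects the pre-allocated
    # expression data block and the offset of the expression result inside it.
    expression_blocks[data_id][exp_id] = result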
CN201910829877.4A 2019-09-02 2019-09-02 Data processing method, related equipment and computer readable medium Active CN112445521B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910829877.4A CN112445521B (en) 2019-09-02 2019-09-02 Data processing method, related equipment and computer readable medium

Publications (2)

Publication Number Publication Date
CN112445521A CN112445521A (en) 2021-03-05
CN112445521B (en) 2024-03-26

Family

ID=74735384

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910829877.4A Active CN112445521B (en) 2019-09-02 2019-09-02 Data processing method, related equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN112445521B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant