CN109313663A

CN109313663A - Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal

Info

Publication number: CN109313663A
Application number: CN201880002144.7A
Authority: CN
Inventors: 肖梦秋
Original assignee: Shenzhen Corerain Technologies Co Ltd
Current assignee: Shenzhen Corerain Technologies Co Ltd
Priority date: 2018-01-15
Filing date: 2018-01-15
Publication date: 2019-02-05
Anticipated expiration: 2038-01-15
Also published as: WO2019136750A1; CN109313663B

Abstract

The present invention provides an artificial intelligence calculation auxiliary processing device, comprising: a plurality of storage modules storing a data matrix to be processed; a memory provided with a zero matrix; and a control module configured to take the data matrix to be processed out of the storage modules and place it in the zero matrix in the memory, so that a matrix to be convolved of size W × W can be formed centered on any first matrix element of the data matrix to be processed, for a convolution kernel matrix to perform convolution calculation according to a preset step size; wherein W is the size of the convolution kernel matrix. The invention builds a zero-padding system from a hardware structure and presets a zero matrix in the memory. Placing the data matrix to be processed into the zero matrix accomplishes the zero-padding operation without calculating parameters such as the number or the positions of the padded zeros, which greatly reduces the calculation amount of the system, improves the efficiency of the zero-padding operation, and speeds up the response of operations such as image processing.

Description

Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal

Technical Field

The invention relates to the field of artificial intelligence, and in particular to an artificial intelligence calculation auxiliary processing method, an artificial intelligence calculation auxiliary processing device, a computer-readable storage medium and a terminal.

Background

Nowadays, with the development of the artificial intelligence industry, various artificial intelligence fields have been developed. Among them, convolutional neural networks have become a research hotspot in many artificial intelligence fields.

As early as the 1960s, scientists studying the neurons responsible for local sensitivity and direction selection in the feline cerebral cortex discovered a unique network structure that could effectively reduce the complexity of feedback neural networks, and subsequently proposed convolutional neural networks. Since then, more and more researchers have devoted themselves to the study of convolutional neural networks.

In general, in order to make the size of the matrix obtained after extracting features by convolution consistent with the size of the original data matrix before convolution, a zero padding operation needs to be performed on the original data matrix.

However, in the prior art, zero padding operation can only be performed by software technology, and the calculation amount of the CPU is very large, which leads to very low zero padding efficiency.

Disclosure of Invention

In view of the above-mentioned shortcomings of the prior art, an object of the present invention is to provide an artificial intelligence calculation auxiliary processing method and device, a computer-readable storage medium, and a terminal, which solve the technical problems of low efficiency and large calculation amount of the zero padding operation in the prior art.

To achieve the above and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing device, including: a plurality of storage modules for storing a data matrix to be processed; a memory provided with a zero matrix; and a control module for taking the data matrix to be processed out of the storage modules and placing it in the zero matrix in the memory, so that a matrix to be convolved of size W × W can be formed centered on any first matrix element of the data matrix to be processed, for a convolution kernel matrix to perform convolution calculation according to a preset step size; wherein W is the size of the convolution kernel matrix.

In an embodiment of the present invention, the data matrix to be processed includes an n × m matrix, and the zero matrix includes an N × M zero matrix; wherein N = n + W - 1 and M = m + W - 1.

In an embodiment of the present invention, taking the data matrix to be processed out of the storage module and placing it in the zero matrix includes: the control module takes the n × m matrix out of the storage module and places it into the N × M zero matrix of the memory to form a filling matrix; wherein, in the filling matrix, rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero; columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero; and the remaining region holds the n × m matrix.

In an embodiment of the present invention, the extracting the to-be-processed data matrix from the storage module and placing the to-be-processed data matrix in the zero matrix includes: the n x m matrix is divided into a plurality of block matrixes with the same size; and the control module sequentially takes out each block matrix and fills each block matrix into the N x M zero matrix according to a preset placing condition.

In an embodiment of the invention, the preset placement condition includes: the control module sequentially places the block matrices into the N × M zero matrix according to a preset start address in the memory and the size of the block matrices.

In an embodiment of the invention, the storage module includes a dual storage module.

In an embodiment of the present invention, the artificial intelligence calculation auxiliary processing device includes: a multiplier for multiplying each matrix to be convolved with the convolution kernel to obtain a corresponding multiplication result matrix, wherein each multiplication result matrix corresponds to one first matrix element; and an adder for adding the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value, wherein the convolution result values, each corresponding to one first matrix element, form a convolution result matrix of size n × m.

To achieve the above and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing method, applied to a control module, the method including: taking the n × m matrix out of the storage module; and placing the n × m matrix in an N × M zero matrix of a memory, so that a matrix to be convolved of size W × W can be formed centered on any first matrix element of the n × m matrix, and providing the matrix to be convolved to the convolution kernel; wherein N = n + W - 1, M = m + W - 1, and W is the order of the convolution kernel.

In an embodiment of the present invention, the extracting, by the control module, the n × m matrix from the storage module specifically includes: the n x m matrix is divided into a plurality of block matrixes with the same size; and the control module takes out the block matrixes in sequence.

In an embodiment of the present invention, placing the n × m matrix in the N × M zero matrix of the memory specifically includes: the control module sequentially fills each block matrix into the N × M zero matrix according to the start address and the size of the block matrix.

To achieve the above and other related objects, the present invention provides a computer-readable storage medium having stored thereon a computer program, which, when executed by a processor, implements the artificial intelligence computing assistance processing method.

To achieve the above and other related objects, the present invention provides an artificial intelligence computing assistant processing terminal, comprising: a processor and a memory; the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the artificial intelligence calculation auxiliary processing method.

As described above, the artificial intelligence computing auxiliary processing method, the artificial intelligence computing auxiliary processing device, the readable computer storage medium, and the terminal according to the present invention have the following advantages: according to the artificial intelligence calculation auxiliary processing method and device, the zero filling operation system is built through the hardware structure, the zero matrix is preset in the memory, the zero filling operation can be achieved by arranging the data matrix to be processed, parameters such as the number of zero fillings or the position of zero filling are not required to be calculated, the calculated amount of the system is greatly reduced, the efficiency of the zero filling operation is improved, and the response speed of operations such as image processing is accelerated. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.

Drawings

FIG. 1 is a diagram illustrating an auxiliary processing apparatus for artificial intelligence computing according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.

FIG. 3 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.

FIG. 4 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.

FIG. 5 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.

FIG. 6 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.

FIG. 7 is a diagram of an apparatus for processing assistance in artificial intelligence computing according to an embodiment of the invention.

FIG. 8 is a diagram illustrating an artificial intelligence computing assistance processing method according to an embodiment of the invention.

Description of the element reference numerals

11 storage module

12 internal memory

13 control module

M1 matrix of data to be processed

M2 convolution kernel matrix

M3 zero matrix

M4 filling matrix

M401-M425 matrix to be convolved

M501-M505 multiplication result matrix

M6 convolution result matrix

M7 matrix of data to be processed

M8 zero matrix

71 storage module

72 internal memory

R1 rectangle dashed box

R2 rectangle dashed box

R3 rectangle dashed box

Steps S801 to S802

Detailed Description

The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.

It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.

The invention provides an artificial intelligence calculation auxiliary processing device which is used for carrying out zero filling operation on a data matrix to be processed.

Fig. 1 shows an artificial intelligence calculation auxiliary processing device according to an embodiment of the invention. The device comprises a plurality of storage modules 11, a memory 12 and a control module 13. The storage modules 11 store an n × m matrix, where n and m are natural numbers greater than or equal to 1. The memory 12 stores an N × M zero matrix. The control module 13 is configured to take out the n × m matrix and place it in the N × M zero matrix.

Preferably, the storage module 11 is a dual storage module consisting of two storage modules: while one storage module is scanning out data, the other is processing data. In the next period the two swap roles: the storage module that has finished processing starts scanning out, and the storage module that was scanning out starts processing. That is, at any time one of the two storage modules is scanning out and the other is processing data. Each frame of output data still passes through both steps, processing and scan-out, but the two steps are completed cooperatively by the two storage modules, which multiplies the efficiency of data transmission and processing.

Preferably, the control module transfers data in DMA mode. DMA stands for Direct Memory Access; a DMA controller can access data in memory directly without going through the CPU. In DMA mode, the CPU only needs to issue an instruction to the control module, which then handles the data transfer and reports back to the CPU once the transfer is finished. This greatly reduces CPU occupancy and saves system resources.
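The role swap of the dual storage module can be illustrated with a small software simulation. This is only a sketch under stated assumptions: the function name `run_ping_pong`, the `process` callback and the frame values are hypothetical, and the two steps that hardware would perform concurrently are executed one after the other within each period.

```python
# Hypothetical simulation of a dual ("ping-pong") storage module.
# Names and values are illustrative and do not come from the source.

def run_ping_pong(frames, process):
    """Alternate two buffers: one is processed while the other is scanned out."""
    buffers = [None, None]   # the two storage modules
    outputs = []             # frames scanned out, in order
    active = 0               # index of the buffer doing data processing

    # One extra period is needed to scan out the last processed frame.
    for period in range(len(frames) + 1):
        other = 1 - active
        # Scan out what the other buffer finished in the previous period.
        if buffers[other] is not None:
            outputs.append(buffers[other])
            buffers[other] = None
        # Process the next frame into the active buffer, if one is left.
        if period < len(frames):
            buffers[active] = process(frames[period])
        # Swap roles for the next period.
        active = other
    return outputs


result = run_ping_pong([1, 2, 3], lambda x: x * 10)
print(result)
```

In each period after the first, one buffer delivers a finished frame while the other receives a new one, so a frame is output every period even though each frame needs two steps.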

The N x M zero matrix is a matrix consisting of (N x M) zeros.

Specifically, N = n + W - 1 and M = m + W - 1, where W is the order of the W × W convolution kernel matrix. The convolution kernel matrix is a weight matrix used to compute a weighted average over matrix data; in a convolution calculation it acts as a filter. The order of the weight matrix is generally odd, so that the matrix has a central element by which its position is determined.

Specifically, the control module takes the n × m matrix out of the storage module and places it in the N × M zero matrix of the memory to form a filling matrix. In the filling matrix, rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero; columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero; the remaining region holds the n × m matrix.

Taking the n × m matrix out of the storage module 11 and placing it into the N × M zero matrix in the memory 12 forms a filling matrix in which a matrix to be convolved of size W × W can be formed centered on any matrix element of the n × m matrix. The following describes, in a specific embodiment, the process of performing convolution calculation on the n × m matrix with the W × W convolution kernel.
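The placement step can be sketched in software as follows. This is a minimal illustration, not the hardware implementation: it assumes an odd kernel order W and a zero border of width (W - 1)/2 on every side, which is consistent with the 5 × 5 data matrix and 3 × 3 kernel giving a 7 × 7 zero matrix in the embodiment. The function names and the data values are hypothetical.

```python
def make_zero_matrix(n, m, w):
    """Preset zero matrix of size (n + w - 1) x (m + w - 1), for odd w."""
    return [[0] * (m + w - 1) for _ in range(n + w - 1)]


def place_into_zero_matrix(data, zero_matrix, w):
    """Copy the n x m data into the centre of the zero matrix, in place."""
    offset = (w - 1) // 2          # width of the zero border on each side
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            zero_matrix[i + offset][j + offset] = value
    return zero_matrix


# Hypothetical 5 x 5 data; the actual values of M1 are not reproduced here.
data = [[5 * r + c + 1 for c in range(5)] for r in range(5)]
filling = place_into_zero_matrix(data, make_zero_matrix(5, 5, 3), 3)
for row in filling:
    print(row)
```

Note that no zero is ever written during placement: the border is already present in the preset matrix, so only the data elements are copied.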

Fig. 2 to 6 are schematic diagrams showing an auxiliary process of artificial intelligence computation according to an embodiment of the present invention. Wherein:

fig. 2 shows the pending data matrix M1 and the convolution kernel matrix M2 in this embodiment. The data matrix to be processed is a 5 × 5 order matrix, the convolution kernel matrix is a 3 × 3 order matrix, and the numerical value in each matrix is the matrix element of the matrix.

Fig. 3 shows a zero matrix M3 in this embodiment.

The zero matrix is arranged in the memory. Given n = 5 and W = 3, the zero matrix is a 7 × 7 matrix.

Fig. 4 shows a filling matrix M4 in this embodiment. The filling matrix is formed by placing the data matrix to be processed M1 into the zero matrix M3. Since rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N of the filling matrix are zero, columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero, and the remaining region holds the n × m matrix, it follows that the 1st row, the 7th row, the 1st column and the 7th column of the filling matrix M4 are all 0, and the region covering rows 2 to 6 and columns 2 to 6 holds the data matrix M1 to be processed. The filling matrix M4 is thus the data matrix M1 after the zero padding operation.

The rectangular dashed box R1 in Fig. 4 marks a 3 × 3 matrix to be convolved M401, formed centered on the matrix element 18 and shown on the right side of Fig. 4. Moving the rectangular dashed box R1 to the right with a step size of 1 yields, in turn, the matrices M401 to M405 to be convolved, each centered on one matrix element in the first row of the data matrix M1 (the second row of the filling matrix M4). The same operation is then performed on the remaining rows of the data matrix, so that 25 matrices M401 to M425 to be convolved are finally obtained.

It should be noted that the matrices M401 to M425 to be convolved correspond one to one to the matrix elements of the data matrix M1 to be processed. In addition, although the rectangular dashed box R1 moves with a step size of 1 in this embodiment, that is, by one matrix element at a time, the present invention does not limit the step size of the dashed box.
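The sliding of the dashed box can be sketched as a generator that cuts one W × W window out of the filling matrix for every data element. This is an illustrative sketch: the name `windows`, the `step` parameter and the data values are assumptions, and the filling matrix is rebuilt inline so the example is self-contained.

```python
def windows(filling, n, m, w, step=1):
    """Yield (row, col, window) for each w x w window of the filling matrix.

    (row, col) is the 0-based position, in the n x m data matrix, of the
    element that sits at the centre of the window.
    """
    for i in range(0, n, step):
        for j in range(0, m, step):
            window = [r[j:j + w] for r in filling[i:i + w]]
            yield i, j, window


# Hypothetical 5 x 5 data placed into a 7 x 7 zero matrix (W = 3).
n = m = 5
w = 3
pad = (w - 1) // 2
data = [[5 * r + c + 1 for c in range(m)] for r in range(n)]
filling = [[0] * (m + w - 1) for _ in range(n + w - 1)]
for i in range(n):
    for j in range(m):
        filling[i + pad][j + pad] = data[i][j]

all_windows = list(windows(filling, n, m, w))
print(len(all_windows))      # one window per data element
print(all_windows[0][2])     # window centred on the first data element
```

With `step=1` there is exactly one window per element of the data matrix; a larger step visits fewer centres and therefore yields fewer windows.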

Fig. 5 is a schematic diagram of the multiplication of each matrix to be convolved with the convolution kernel matrix in this embodiment. The artificial intelligence calculation auxiliary processing device comprises a multiplier, not shown in the figure, for multiplying the convolution kernel matrix with each matrix to be convolved. Specifically, each matrix element in the matrix to be convolved is multiplied by the matrix element at the same position in the convolution kernel matrix, yielding the corresponding multiplication result matrices M501 to M525. It should be noted that the multiplication result matrices M501 to M525 correspond one to one to the matrix elements of the data matrix M1 to be processed.

Fig. 6 is a schematic diagram illustrating the addition operation performed on each multiplication result matrix in this embodiment. The artificial intelligence calculation auxiliary processing device includes an adder, not shown, for performing the following operations on each of the multiplication result matrices M501 to M525: and adding matrix elements in the multiplication result matrix to obtain a corresponding convolution result value. For example, matrix elements in the multiplication result matrix M501 are added to obtain a convolution result value 32, and so on, all 25 result matrices are added to obtain a convolution result matrix M6.
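The multiplier and adder stages can be sketched together as follows. This is a minimal software illustration with hypothetical names (`multiply`, `add`, `convolve_same`) and hypothetical data and kernel values. As in the description, the kernel is applied position by position without being flipped.

```python
def multiply(window, kernel):
    """Element-wise product of a w x w window and the kernel (multiplier stage)."""
    return [[a * b for a, b in zip(wr, kr)] for wr, kr in zip(window, kernel)]


def add(product):
    """Sum of all elements of one multiplication result matrix (adder stage)."""
    return sum(sum(row) for row in product)


def convolve_same(data, kernel):
    """Zero-pad by placement, then multiply and add once per data element."""
    n, m, w = len(data), len(data[0]), len(kernel)
    pad = (w - 1) // 2
    filling = [[0] * (m + w - 1) for _ in range(n + w - 1)]
    for i in range(n):
        for j in range(m):
            filling[i + pad][j + pad] = data[i][j]
    result = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            window = [row[j:j + w] for row in filling[i:i + w]]
            result[i][j] = add(multiply(window, kernel))
    return result


# Hypothetical 3 x 3 data and kernel, chosen only for illustration.
data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
out = convolve_same(data, kernel)
print(out)
```

The result has the same size as the input data, which is the purpose of the zero padding.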

As can be seen from the foregoing embodiments, the artificial intelligence computation auxiliary processing device provided in the present invention performs convolution computation on the n × m matrix in the storage module and the convolution kernel, and then outputs the convolution result matrix with the order of n × m.

It is worth noting that the artificial intelligence computing auxiliary processing device provided by the invention builds a zero filling operation system through a hardware structure; in addition, the zero matrix is preset in the memory for the data matrix to be processed to be arranged, so that zero filling operation can be realized without calculating parameters such as the number of zero filling or the position of zero filling. Compared with the operation mode of realizing zero filling by running and processing software through a CPU (central processing unit) in the prior art, the invention greatly reduces the calculated amount of the system, improves the efficiency of zero filling operation and accelerates the response speed of operations such as image processing and the like.

Optionally, in an embodiment, the n × m matrix is divided into a plurality of block matrices of the same size, and the control module takes the n × m matrix out of the storage module and places it in the N × M zero matrix of the memory as follows: the control module takes out each block matrix in turn and fills it into the N × M zero matrix according to a preset placement condition, as described below with reference to a specific embodiment.

FIG. 7 is a schematic diagram of an artificial intelligence calculation auxiliary processing device according to an embodiment of the present invention. The processing device comprises a storage module 71, in which a data matrix M7 to be processed is stored. The data matrix M7 to be processed is a 4 × 4 matrix; taking a 2 × 2 matrix as the block matrix, it can be divided into 4 block matrices, one of which is indicated by the rectangular dashed box R2 in Fig. 7.

The processing device comprises a memory 72, in which a zero matrix M8 is arranged; the zero matrix M8 is a 6 × 6 matrix. The area of the zero matrix M8 used for storing the data matrix M7 to be processed starts from the matrix element in the second row and second column, whose storage address is 0x00220000. The control module, not shown, places the first block matrix of the data matrix M7 to be processed into the rectangular dashed box R3, using that storage address as the start address.

The control module sequentially places each block matrix at a corresponding position in the zero matrix M8 according to the initial address and the size of the block matrix. For example, the control module places a first block matrix into the zero matrix M8, and places a second block matrix into the zero matrix M8 with the storage address of 0x00220004 as the starting address; and so on until all the data matrixes to be processed are placed into the zero matrix M8.
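Block-wise placement by start address and block size can be sketched as below. The sketch rests on assumptions that are not stated in the source: a row-major layout of the zero matrix and an element size of 2 bytes, chosen because they reproduce the two addresses given above (0x00220000 and 0x00220004). The function names and the data values are hypothetical.

```python
def block_start_addresses(base, n, m, block, row_stride, elem_size):
    """Start address of each block, visiting blocks in row-major order.

    base       : address of the first data element inside the zero matrix
    row_stride : number of elements per row of the zero matrix
    elem_size  : bytes per element (assumed, not given in the source)
    """
    addresses = []
    for bi in range(0, n, block):
        for bj in range(0, m, block):
            addresses.append(base + (bi * row_stride + bj) * elem_size)
    return addresses


def place_blocks(data, zero_matrix, block, pad):
    """Copy the data into the zero matrix one block at a time."""
    n, m = len(data), len(data[0])
    for bi in range(0, n, block):
        for bj in range(0, m, block):
            for i in range(bi, bi + block):
                for j in range(bj, bj + block):
                    zero_matrix[i + pad][j + pad] = data[i][j]
    return zero_matrix


# 4 x 4 data, 2 x 2 blocks, 6 x 6 zero matrix, as in the embodiment.
addrs = block_start_addresses(0x00220000, 4, 4, 2, 6, 2)
print([hex(a) for a in addrs])

data = [[4 * r + c + 1 for c in range(4)] for r in range(4)]  # hypothetical values
m8 = place_blocks(data, [[0] * 6 for _ in range(6)], 2, 1)
for row in m8:
    print(row)
```

Because each block's destination follows from the start address and the block size alone, the control module needs no per-element bookkeeping about where the zeros are.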

According to the artificial intelligence calculation auxiliary processing method provided by the invention, the control module is used for taking out data from the storage module, and the data matrix to be processed is divided into a plurality of block matrixes with the same size, so that the data taking-out efficiency is greatly improved, and the response speed of the system is accelerated.

The invention also provides an artificial intelligence calculation auxiliary processing method, which is applied to the control module and specifically comprises the following steps:

S801: taking a data matrix to be processed out of the storage module;

S802: placing the data matrix to be processed in a zero matrix in a memory, so that a matrix to be convolved of size W × W can be formed centered on any first matrix element of the data matrix to be processed, for a convolution kernel matrix to perform convolution calculation according to a preset step size; wherein W is the size of the convolution kernel matrix.

The implementation of the artificial intelligence computing auxiliary processing method is similar to that of the artificial intelligence computing auxiliary processing device, and therefore, the description thereof is omitted.

Those of ordinary skill in the art will understand that all or part of the steps of the embodiments of the artificial intelligence calculation auxiliary processing method may be completed by hardware under the instruction of a computer program. The computer program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks or optical disks.

The invention also provides an artificial intelligence calculation auxiliary processing terminal, which comprises: a processor and a memory. The memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the artificial intelligence calculation auxiliary processing method.

The memory mentioned above may include Random Access Memory (RAM), and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.

The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

In summary, the artificial intelligence computing auxiliary processing method and device provided by the invention build a zero filling operation system through a hardware structure, and a zero matrix is preset in the memory for the data matrix to be processed to be put in, so that the zero filling operation can be realized, parameters such as the number of zero fillings or the positions of zero fillings do not need to be calculated, the calculated amount of the system is greatly reduced, the efficiency of the zero filling operation is improved, and the response speed of operations such as image processing is accelerated. Therefore, the invention effectively overcomes various defects in the prior art and has high industrial utilization value.

The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.

Claims

1. An artificial intelligence computing auxiliary processing apparatus, comprising:

a plurality of storage modules for storing a data matrix to be processed;

a memory provided with a zero matrix;

a control module for taking the data matrix to be processed out of the storage modules and placing it in the zero matrix in the memory, so that a matrix to be convolved of size W × W can be formed centered on any first matrix element of the data matrix to be processed, for a convolution kernel matrix to perform convolution calculation according to a preset step size; wherein W is the size of the convolution kernel matrix.

2. The artificial intelligence computing assistance processing apparatus of claim 1, comprising:

the data matrix to be processed comprises an n × m matrix, and the zero matrix comprises an N × M zero matrix;

wherein N = n + W - 1 and M = m + W - 1.

3. The artificial intelligence calculation auxiliary processing device according to claim 2, wherein taking the data matrix to be processed out of the storage module and placing it in the zero matrix in the memory specifically includes:

the control module takes the n × m matrix out of the storage module and places it into the N × M zero matrix of the memory to form a filling matrix; wherein, in the filling matrix:

rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero;

columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero; the remaining region holds the n × m matrix.

4. The artificial intelligence computing auxiliary processing device of claim 2, wherein the data matrix to be processed is taken out of the storage module and placed in the internal zero matrix, and the specific manner includes:

the n x m matrix is divided into a plurality of block matrixes with the same size; and the control module sequentially takes out each block matrix and fills each block matrix into the N x M zero matrix according to a preset placing condition.

5. The artificial intelligence computing assistance processing apparatus of claim 4, wherein the preset placement condition includes:

and the control module sequentially places the block matrixes into the N x M zero matrix according to a preset initial address in the memory and the sizes of the block matrixes.

6. The artificial intelligence computing assistance processing apparatus of claim 1, wherein the storage module comprises a dual storage module.

7. The artificial intelligence computing assistance processing apparatus of claim 1, comprising:

the multiplier is used for multiplying each matrix to be convolved with a convolution kernel respectively to obtain a corresponding multiplication result matrix; wherein each multiplication result matrix is aligned with each first matrix element;

the adder is used for adding the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value; and aligning each convolution result value with each first matrix element to form a convolution result matrix with the size of n x m.

8. An artificial intelligence calculation auxiliary processing method is applied to a control module, and comprises the following steps:

taking out a data matrix to be processed from the storage module;

placing the data matrix to be processed in a zero matrix in a memory, so that the data matrix to be processed can form a matrix to be convolved with the size W x W by taking any one first matrix element as a center, and a convolution kernel matrix is subjected to convolution calculation according to a preset step length; wherein W is the size of the convolution kernel matrix.

9. The artificial intelligence computing assistance processing method of claim 8, comprising:

wherein,

10. The artificial intelligence computing auxiliary processing method according to claim 8, wherein taking the data matrix to be processed out of the storage module and placing it in the zero matrix in the memory specifically includes:

and the n x m matrix is divided into a plurality of block matrixes with the same size, and the block matrixes are sequentially taken out by the control module.

11. The artificial intelligence computing auxiliary processing method according to claim 8, wherein the extracting the data matrix to be processed from the storage module and placing the extracted data matrix in the internal zero matrix specifically includes:

and the control module sequentially fills each block matrix into the N x M zero matrix according to the starting address and the size of the block matrix.

12. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the artificial intelligence computing assistance processing method of any one of claims 8 to 11.

13. An artificial intelligence calculation auxiliary processing terminal, comprising: a processor and a memory;

the memory is used for storing a computer program, and the processor is used for executing the computer program stored by the memory to cause the terminal to execute the artificial intelligence calculation auxiliary processing method according to any one of claims 8 to 11.