CN109313663A - Artificial intelligence calculates Auxiliary Processing Unit, method, storage medium and terminal - Google Patents
- Publication number
- CN109313663A (application CN201880002144.7A)
- Authority
- CN
- China
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The present invention provides an artificial intelligence calculation auxiliary processing apparatus, comprising: a plurality of storage modules storing a data matrix to be processed; a memory preset with a zero matrix; and a control module for taking the data matrix to be processed out of the storage modules and placing it in the zero matrix in the memory, so that the data matrix to be processed can form a matrix to be convolved of size W x W centered on any one of its first matrix elements, on which a convolution kernel matrix performs convolution calculation according to a preset step length; wherein W is the size of the convolution kernel matrix. The present invention builds a zero-padding operation system from a hardware structure: a zero matrix is preset in the memory, and the zero-padding operation is achieved simply by placing the data matrix to be processed into it, without calculating parameters such as the number or positions of the padded zeros. This greatly reduces the calculation load of the system, improves the efficiency of the zero-padding operation, and accelerates the response speed of operations such as image processing.
Description
Technical Field
The invention relates to the field of artificial intelligence, in particular to an artificial intelligence calculation auxiliary processing method, an artificial intelligence calculation auxiliary processing device, a readable computer storage medium and a terminal.
Background
Nowadays, with the development of the artificial intelligence industry, applications in various artificial intelligence fields have flourished. Among them, convolutional neural networks have become a research hotspot in many artificial intelligence fields.
As early as the 1960s, while studying the local sensitivity and direction selectivity of neurons in the feline cerebral cortex, scientists discovered a unique network structure that could effectively reduce the complexity of feedback neural networks, and on that basis proposed the convolutional neural network. Since then, many more researchers have devoted themselves to the study of convolutional neural networks.
In general, in order to keep the size of the feature matrix extracted by convolution consistent with the size of the original data matrix before convolution, a zero-padding operation needs to be performed on the original data matrix.
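As a concrete illustration of this point (not part of the patent text), the following Python sketch computes the output size of a stride-1 convolution with and without zero padding; the function name and values are illustrative assumptions.

```python
# Illustrative sketch (not part of the patent text): output size of a
# stride-1 convolution with and without zero padding. Names are assumed.
def conv_output_size(n, w, pad):
    """Side length of the output of a stride-1 convolution."""
    return n + 2 * pad - w + 1

n, w = 5, 3                                   # 5 x 5 data, 3 x 3 kernel
print(conv_output_size(n, w, 0))              # no padding: output shrinks to 3
print(conv_output_size(n, w, (w - 1) // 2))   # padding of (W - 1)/2: stays 5
```

With padding of (W - 1)/2 on each side, the output matches the original n x m size, which is exactly the situation the zero-padding operation targets.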
However, in the prior art, the zero-padding operation can only be performed in software, which imposes a very large calculation load on the CPU and leads to very low zero-padding efficiency.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, an object of the present invention is to provide an auxiliary processing method and apparatus for artificial intelligence calculation, a readable computer storage medium, and a terminal, which are used to solve the technical problems of low efficiency and large amount of calculation in the zero-padding operation in the prior art.
To achieve the above and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing apparatus, comprising: storage modules for storing the data matrix to be processed; a memory provided with a zero matrix; and a control module for taking the data matrix to be processed out of the storage module and placing it in the zero matrix in the memory, so that the data matrix to be processed can form a matrix to be convolved of size W x W centered on any one of its first matrix elements, on which the convolution kernel matrix performs convolution calculation according to a preset step length; wherein W is the size of the convolution kernel matrix.
In an embodiment of the present invention, the data matrix to be processed comprises an n x m matrix, and the zero matrix comprises an N x M zero matrix; wherein N = n + W - 1 and M = m + W - 1.
In an embodiment of the present invention, taking the data matrix to be processed out of the storage module and placing it in the zero matrix includes: the control module takes the n x m matrix out of the storage module and places it into the N x M zero matrix of the memory to form a fill matrix; wherein, in the fill matrix: rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero; columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero; the remaining region is the n x m matrix.
In an embodiment of the present invention, the extracting the to-be-processed data matrix from the storage module and placing the to-be-processed data matrix in the zero matrix includes: the n x m matrix is divided into a plurality of block matrixes with the same size; and the control module sequentially takes out each block matrix and fills each block matrix into the N x M zero matrix according to a preset placing condition.
In an embodiment of the invention, the preset placing condition includes: and the control module sequentially places the block matrixes into the N x M zero matrix according to a preset initial address in the memory and the sizes of the block matrixes.
In an embodiment of the invention, the memory module includes a dual memory module.
In an embodiment of the present invention, the artificial intelligence calculation auxiliary processing apparatus comprises: a multiplier for multiplying each matrix to be convolved with the convolution kernel to obtain a corresponding multiplication result matrix, wherein the multiplication result matrices correspond one-to-one to the first matrix elements; and an adder for adding the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value, each convolution result value being placed at the position of its corresponding first matrix element to form a convolution result matrix of size n x m.
In order to achieve the above objects and other related objects, the present invention provides an artificial intelligence calculation auxiliary processing method, applied to a control module, the method comprising: taking the n x m matrix out of the storage module; and placing the n x m matrix in an N x M zero matrix of an internal memory, so that the n x m matrix can form a matrix to be convolved of size W x W centered on any one of its first matrix elements, and providing the matrix to be convolved to the convolution kernel; wherein N = n + W - 1, M = m + W - 1, and W is the order of the convolution kernel.
In an embodiment of the present invention, the extracting, by the control module, the n × m matrix from the storage module specifically includes: the n x m matrix is divided into a plurality of block matrixes with the same size; and the control module takes out the block matrixes in sequence.
In an embodiment of the present invention, the placing the N × M matrix in an N × M zero matrix of a memory specifically includes: and the control module sequentially fills each block matrix into the N x M zero matrix according to the starting address and the size of the block matrix.
To achieve the above and other related objects, the present invention provides a computer-readable storage medium having stored thereon a computer program, which, when executed by a processor, implements the artificial intelligence computing assistance processing method.
To achieve the above and other related objects, the present invention provides an artificial intelligence computing assistant processing terminal, comprising: a processor and a memory; the memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the artificial intelligence calculation auxiliary processing method.
As described above, the artificial intelligence calculation auxiliary processing method, apparatus, readable computer storage medium, and terminal according to the present invention have the following advantages. The zero-padding operation system is built from a hardware structure, with a zero matrix preset in the memory; the zero-padding operation is achieved simply by placing the data matrix to be processed into the zero matrix, without calculating parameters such as the number or positions of the padded zeros. This greatly reduces the calculation load of the system, improves the efficiency of the zero-padding operation, and accelerates the response speed of operations such as image processing. The invention thus effectively overcomes various defects in the prior art and has high industrial utilization value.
Drawings
FIG. 1 is a diagram illustrating an auxiliary processing apparatus for artificial intelligence computing according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.
FIG. 3 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.
FIG. 4 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.
FIG. 5 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.
FIG. 6 is a diagram illustrating an artificial intelligence computing assistance process according to an embodiment of the invention.
FIG. 7 is a diagram of an apparatus for processing assistance in artificial intelligence computing according to an embodiment of the invention.
FIG. 8 is a diagram illustrating an artificial intelligence computing assistance processing method according to an embodiment of the invention.
Description of the element reference numerals
11 memory module
12 internal memory
13 control module
M1 matrix of data to be processed
M2 convolution kernel matrix
M3 zero matrix
M4 filling matrix
M401-M425 matrix to be convolved
M501-M505 multiplication result matrix
M6 convolution result matrix
M7 matrix of data to be processed
M8 zero matrix
71 memory module
72 internal memory
R1 rectangle dashed box
R2 rectangle dashed box
R3 rectangle dashed box
Steps S801 to S802
Detailed Description
The embodiments of the present invention are described below with reference to specific embodiments, and other advantages and effects of the present invention will be easily understood by those skilled in the art from the disclosure of the present specification. The invention is capable of other and different embodiments and of being practiced or of being carried out in various ways, and its several details are capable of modification in various respects, all without departing from the spirit and scope of the present invention. It is to be noted that the features in the following embodiments and examples may be combined with each other without conflict.
It should be noted that the drawings provided in the following embodiments are only for illustrating the basic idea of the present invention, and the components related to the present invention are only shown in the drawings rather than drawn according to the number, shape and size of the components in actual implementation, and the type, quantity and proportion of the components in actual implementation may be changed freely, and the layout of the components may be more complicated.
The invention provides an artificial intelligence calculation auxiliary processing device which is used for carrying out zero filling operation on a data matrix to be processed.
Fig. 1 shows an artificial intelligence calculation auxiliary processing device according to an embodiment of the invention. The device comprises a plurality of storage modules 11, a memory 12 and a control module 13. The storage modules 11 store an n x m matrix, where n and m are natural numbers greater than or equal to 1. The memory 12 stores an N x M zero matrix. The control module 13 is configured to take out the n x m matrix and place it in the N x M zero matrix.
Preferably, the storage module 11 is a dual storage module. The dual storage module specifically comprises two storage modules: while one storage module is scanning out data, the other is processing data; when the next cycle arrives, the storage module that has finished processing begins scanning out, and the one that was scanning out begins processing. That is, at any moment one of the two modules is scanning while the other is processing data, so that each frame of data finally output appears to have gone through both the processing step and the scan-out step, yet the two memories complete them cooperatively, achieving the technical effect of multiplying the data transmission and processing efficiency.
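The ping-pong cooperation of the two storage modules described above can be sketched in Python as follows. This is an illustrative software model only, not the patent's circuitry: the buffer list, the stand-in "processing" step (doubling each value), and all names are assumptions.

```python
# Assumed sketch of the dual (ping-pong) storage modules: each cycle, one
# buffer receives and "processes" the incoming frame while the other
# buffer's previously processed frame is scanned out; roles swap per cycle.
def ping_pong(frames):
    buffers = [None, None]   # the two storage modules
    write_idx = 0            # module currently receiving/processing data
    out = []
    for frame in frames:
        buffers[write_idx] = [x * 2 for x in frame]  # stand-in "processing"
        read_idx = 1 - write_idx
        if buffers[read_idx] is not None:
            out.append(buffers[read_idx])            # scan out the other module
        write_idx = 1 - write_idx                    # swap roles next cycle
    return out

print(ping_pong([[1], [2], [3]]))  # each frame is output one cycle late
```

The key point the model shows is that processing and scan-out overlap in time, which is where the throughput gain comes from.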
Preferably, the control module realizes data transmission in DMA mode. Specifically, DMA stands for Direct Memory Access, a mechanism by which a controller can access data in memory directly without passing through the CPU. In DMA mode, the CPU only needs to issue an instruction to the control module; the control module then handles the data transmission and feeds information back to the CPU after the transfer is finished, which greatly reduces the CPU's resource occupancy and saves system resources.
The N x M zero matrix is a matrix consisting of (N x M) zeros.
Specifically, N = n + W - 1 and M = m + W - 1, where W is the order of the W x W convolution kernel matrix. The convolution kernel matrix is a weight matrix used to perform a weighted-average calculation on matrix data; its function is equivalent to that of a filter in convolution calculation. Generally, the order of the weight matrix is odd, so that the position of the matrix can be determined by the central element of the odd-order matrix.
Specifically, the control module takes the n x m matrix out of the storage module and places it in the N x M zero matrix of the memory to form a fill matrix: rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N of the fill matrix are zero; columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero; the remaining region is the n x m matrix.
And taking the N x M matrix out of the storage module 11 and placing the N x M matrix into the N x M zero matrix in the memory 12 to form a filling matrix, so that the N x M matrix can form a matrix to be convolved with the size of W x W by taking any matrix element as the center. The following describes a process of performing convolution calculation on the n × m matrix and the W × W convolution kernel in a specific embodiment.
Fig. 2 to 6 are schematic diagrams showing an auxiliary process of artificial intelligence computation according to an embodiment of the present invention. Wherein:
fig. 2 shows the pending data matrix M1 and the convolution kernel matrix M2 in this embodiment. The data matrix to be processed is a 5 × 5 order matrix, the convolution kernel matrix is a 3 × 3 order matrix, and the numerical value in each matrix is the matrix element of the matrix.
Fig. 3 shows a zero matrix M3 in this embodiment.
The zero matrix is arranged in the memory. Based on N = n + W - 1 with n = 5 and W = 3, the zero matrix is a 7 x 7 matrix.
Fig. 4 shows the fill matrix M4 in this embodiment. The fill matrix is formed by placing the data matrix to be processed M1 into the zero matrix M3. Since rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N, and likewise columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M, of the fill matrix are zero while the remaining region holds the n x m matrix, it follows that: row 1, row 7, column 1 and column 7 of the fill matrix M4 are all 0, and the region spanning rows 2 to 6 and columns 2 to 6 holds the data matrix to be processed M1. The fill matrix M4 is thus the result of performing the zero-padding operation on the data matrix to be processed M1.
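The formation of the fill matrix can be sketched as follows. This Python model is illustrative (the function name and the toy 2 x 2 data are assumptions), but it mirrors the key point of the scheme: the data is simply written into a preset zero matrix at a fixed offset, so no padding counts or positions are calculated per matrix.

```python
# Assumed illustrative model: the n x m data matrix is written into a
# preset N x M zero matrix (N = n + W - 1, M = m + W - 1) at a fixed
# offset of (W - 1)/2, forming the fill matrix directly.
def make_fill_matrix(data, w):
    n, m = len(data), len(data[0])
    big_n, big_m = n + w - 1, m + w - 1
    offset = (w - 1) // 2                        # fixed start position
    fill = [[0] * big_m for _ in range(big_n)]   # the preset zero matrix
    for i in range(n):
        for j in range(m):
            fill[offset + i][offset + j] = data[i][j]
    return fill

fill = make_fill_matrix([[1, 2], [3, 4]], 3)  # toy 2 x 2 data, W = 3
print(fill)  # 4 x 4 matrix: zero border, data in rows/columns 2-3
```

For the embodiment's 5 x 5 matrix and W = 3, the same placement yields the 7 x 7 fill matrix M4.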
The rectangular dashed box R1 in fig. 4 represents the 3 x 3 matrix to be convolved M401, formed by taking the matrix element 18 as its center, and shown on the right side of fig. 4. Moving the rectangular dashed box R1 to the right with step length 1 yields, in turn, the matrices to be convolved M401 to M405, centered on each matrix element in the first row of the data matrix M1. Performing the same operation on each remaining row finally yields the 25 matrices to be convolved M401 to M425.
It should be noted that the matrices to be convolved M401 to M425 correspond one-to-one to the matrix elements of the data matrix to be processed M1. In addition, although the moving step length of the rectangular dashed box R1 is 1 in this embodiment, that is, it moves by only one matrix element at a time, the present invention does not limit the moving step length of the dashed box.
Fig. 5 shows a schematic diagram of the multiplication of each matrix to be convolved with the convolution kernel matrix in this embodiment. The artificial intelligence calculation auxiliary processing device includes a multiplier, not shown, for multiplying the convolution kernel matrix with each matrix to be convolved. Specifically, each matrix element in a matrix to be convolved is multiplied by the correspondingly positioned matrix element in the convolution kernel matrix, giving the corresponding multiplication result matrices M501 to M525. It should be noted that the multiplication result matrices M501 to M525 correspond one-to-one to the matrix elements of the data matrix to be processed M1.
Fig. 6 is a schematic diagram illustrating the addition operation performed on each multiplication result matrix in this embodiment. The artificial intelligence calculation auxiliary processing device includes an adder, not shown, which performs the following operation on each of the multiplication result matrices M501 to M525: the matrix elements in the multiplication result matrix are added to obtain the corresponding convolution result value. For example, adding the matrix elements in the multiplication result matrix M501 gives the convolution result value 32; doing the same for all 25 multiplication result matrices gives the 25 convolution result values that make up the convolution result matrix M6.
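The multiplier and adder stages described above amount to a sliding-window convolution over the fill matrix. The following Python sketch is an assumed software model (the function name, toy fill matrix, and all-ones kernel are illustrative, not the patent's hardware):

```python
# Assumed sketch of the multiplier/adder stages: slide a W x W window over
# the fill matrix, multiply element-wise with the kernel (multiplier stage),
# then sum the products (adder stage) into the n x m convolution result.
def convolve(fill, kernel):
    w = len(kernel)
    n = len(fill) - w + 1
    m = len(fill[0]) - w + 1
    result = [[0] * m for _ in range(n)]
    for r in range(n):
        for c in range(m):
            acc = 0
            for i in range(w):
                for j in range(w):
                    acc += fill[r + i][c + j] * kernel[i][j]  # multiplier
            result[r][c] = acc                                # adder
    return result

fill = [[0, 0, 0, 0],
        [0, 1, 2, 0],
        [0, 3, 4, 0],
        [0, 0, 0, 0]]                 # 2 x 2 data zero-padded for W = 3
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
print(convolve(fill, kernel))         # every window sums the whole data block
```

Note that because the input was zero-padded, the result has the same 2 x 2 size as the original data, matching the n x m result matrix of the embodiment.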
As can be seen from the foregoing embodiments, the artificial intelligence computation auxiliary processing device provided in the present invention performs convolution computation on the n × m matrix in the storage module and the convolution kernel, and then outputs the convolution result matrix with the order of n × m.
It is worth noting that the artificial intelligence computing auxiliary processing device provided by the invention builds a zero filling operation system through a hardware structure; in addition, the zero matrix is preset in the memory for the data matrix to be processed to be arranged, so that zero filling operation can be realized without calculating parameters such as the number of zero filling or the position of zero filling. Compared with the operation mode of realizing zero filling by running and processing software through a CPU (central processing unit) in the prior art, the invention greatly reduces the calculated amount of the system, improves the efficiency of zero filling operation and accelerates the response speed of operations such as image processing and the like.
Optionally, in an embodiment, the N × M matrix is divided into a plurality of block matrices with the same size, and a specific manner in which the control module takes the N × M matrix out of the storage module and places the N × M matrix in the N × M zero matrix of the memory includes: the control module sequentially takes out each block matrix, and fills each block matrix into the N × M zero matrix according to a preset placement condition, which is described below with reference to a specific embodiment.
FIG. 7 is a schematic diagram of an auxiliary processing device for artificial intelligence calculation according to an embodiment of the present invention. The processing device comprises a storage module 71, in which a data matrix to be processed M7 is arranged. The data matrix to be processed M7 is a 4 x 4 matrix; taking a 2 x 2 matrix as the block unit, it can be divided into 4 block matrices, one of which is indicated by the rectangular dashed box R2 in fig. 7.
The processing device comprises a memory 72, in which a zero matrix M8 is arranged; the zero matrix M8 is a 6 x 6 matrix. The area of the zero matrix M8 used to store the data matrix to be processed M7 starts from the matrix element in the second row and second column, whose storage address is 0x00220000. The control module, not shown, places the first block matrix of the data matrix to be processed M7 into the rectangular dashed box R3, with that storage address as the start address.
The control module sequentially places each block matrix at a corresponding position in the zero matrix M8 according to the initial address and the size of the block matrix. For example, the control module places a first block matrix into the zero matrix M8, and places a second block matrix into the zero matrix M8 with the storage address of 0x00220004 as the starting address; and so on until all the data matrixes to be processed are placed into the zero matrix M8.
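The block-wise placement described above can be sketched as follows. In this assumed Python model, (row, column) offsets stand in for the memory addresses (such as 0x00220000) used by the control module, and the function name and base offset are illustrative:

```python
# Assumed sketch of block-wise placement: a data matrix is split into
# fixed-size blocks, and each block is copied into the preset zero matrix
# starting from a base offset, one block at a time (DMA-style transfers).
def place_blocks(data, zero, block, base):
    n = len(data)
    for br in range(0, n, block):            # iterate over block rows
        for bc in range(0, n, block):        # ...and block columns
            for i in range(block):           # copy one block
                for j in range(block):
                    zero[base + br + i][base + bc + j] = data[br + i][bc + j]
    return zero

data = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4 data
zero = [[0] * 6 for _ in range(6)]                        # 6 x 6 zero matrix
placed = place_blocks(data, zero, 2, 1)  # 2 x 2 blocks, base offset (1, 1)
print(placed[1][1], placed[4][4])        # corners of the placed data
```

Each block's destination follows directly from the base address and the block size, which is why no per-element address calculation is needed.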
According to the artificial intelligence calculation auxiliary processing method provided by the invention, the control module is used for taking out data from the storage module, and the data matrix to be processed is divided into a plurality of block matrixes with the same size, so that the data taking-out efficiency is greatly improved, and the response speed of the system is accelerated.
The invention also provides an artificial intelligence calculation auxiliary processing method, which is applied to the control module and specifically comprises the following steps:
s801: taking out a data matrix to be processed from the storage module;
s802: placing the data matrix to be processed in a zero matrix in a memory, so that the data matrix to be processed can form a matrix to be convolved with the size W x W by taking any one first matrix element as a center, and a convolution kernel matrix is subjected to convolution calculation according to a preset step length; wherein W is the size of the convolution kernel matrix.
The implementation of the artificial intelligence computing auxiliary processing method is similar to that of the artificial intelligence computing auxiliary processing device, and therefore, the description thereof is omitted.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiment may be implemented by hardware controlled by a computer program. The aforementioned computer program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the method embodiment described above. The aforementioned storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
The invention also provides an artificial intelligence calculation auxiliary processing terminal, which comprises: a processor and a memory. The memory is used for storing computer programs, and the processor is used for executing the computer programs stored by the memory so as to enable the terminal to execute the artificial intelligence calculation auxiliary processing method.
The memory mentioned above may include Random Access Memory (RAM), and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
In summary, the artificial intelligence calculation auxiliary processing method and apparatus provided by the present invention build a zero-padding operation system from a hardware structure: a zero matrix is preset in the memory for the data matrix to be processed to be placed into, so the zero-padding operation is achieved without calculating parameters such as the number or positions of the padded zeros. This greatly reduces the calculation load of the system, improves the efficiency of the zero-padding operation, and accelerates the response speed of operations such as image processing. The invention thus effectively overcomes various defects in the prior art and has high industrial utilization value.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art can modify or change the above-mentioned embodiments without departing from the spirit and scope of the present invention. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical spirit of the present invention be covered by the claims of the present invention.
Claims (13)
1. An artificial intelligence computing auxiliary processing apparatus, comprising:
the storage modules are used for storing the data matrix to be processed;
the memory is provided with a zero matrix;
the control module is used for taking the data matrix to be processed out of the storage module and placing the data matrix to be processed in the memory zero matrix so as to enable the data matrix to be processed to form a matrix to be convolved with the size W x W by taking any one first matrix element of the data matrix as a center, and the convolution kernel matrix is used for performing convolution calculation according to a preset step length; wherein W is the size of the convolution kernel matrix.
2. The artificial intelligence computing assistance processing apparatus of claim 1, comprising:
the data matrix to be processed comprises an n x m matrix, and the zero matrix comprises an N x M zero matrix;
wherein N = n + W - 1 and M = m + W - 1.
3. The artificial intelligence calculation auxiliary processing apparatus according to claim 2, wherein taking the data matrix to be processed out of the storage module and placing it in the zero matrix in the memory specifically comprises:
the control module takes the N x M matrix out of the storage module and places the N x M matrix into an N x M zero matrix of the memory to form a filling matrix; wherein the fill matrix comprises:
rows 1 to (W - 1)/2 and rows N - (W - 1)/2 + 1 to N are zero;
columns 1 to (W - 1)/2 and columns M - (W - 1)/2 + 1 to M are zero; the remaining region is the n x m matrix.
4. The artificial intelligence calculation auxiliary processing apparatus according to claim 2, wherein taking the data matrix to be processed out of the storage module and placing it in the zero matrix in the memory specifically comprises:
the n x m matrix is divided into a plurality of block matrixes with the same size; and the control module sequentially takes out each block matrix and fills each block matrix into the N x M zero matrix according to a preset placing condition.
5. The artificial intelligence computing auxiliary processing apparatus of claim 4, wherein the preset placement condition comprises:
the control module places the block matrices into the N x M zero matrix in turn, according to a preset start address in the memory and the size of each block matrix.
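The block-wise placement of claims 4 and 5 might look like the sketch below. This is hypothetical: the row-wise partitioning, the start-address arithmetic, and the NumPy layout are illustration assumptions, not details taken from the patent.

```python
import numpy as np

def fill_by_blocks(data: np.ndarray, w: int, block_rows: int) -> np.ndarray:
    """Move the data matrix into the zero matrix one equally sized
    row-block at a time, each block landing at an offset derived from
    the block index and the block size (the 'preset start address')."""
    n, m = data.shape
    pad = (w - 1) // 2
    fill = np.zeros((n + w - 1, m + w - 1), dtype=data.dtype)
    for i in range(0, n, block_rows):          # take out each block in turn
        block = data[i:i + block_rows, :]      # equally sized block matrix
        start_row = pad + i                    # placement from base address + offset
        fill[start_row:start_row + block.shape[0], pad:pad + m] = block
    return fill

data = np.arange(12).reshape(4, 3)
filled = fill_by_blocks(data, w=3, block_rows=2)  # two 2 x 3 blocks
```

The end result is the same filling matrix as a single copy would produce; block-wise transfer only changes how the data moves from the storage module into the memory.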
6. The artificial intelligence computing auxiliary processing apparatus of claim 1, wherein the storage module comprises a dual storage module.
7. The artificial intelligence computing auxiliary processing apparatus of claim 1, further comprising:
a multiplier, configured to multiply each matrix to be convolved by the convolution kernel to obtain a corresponding multiplication result matrix; wherein each multiplication result matrix is aligned with its first matrix element;
an adder, configured to add up the second matrix elements in each multiplication result matrix to obtain a corresponding convolution result value; the convolution result values, each aligned with its first matrix element, form a convolution result matrix of size n x m.
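The multiplier/adder datapath of claim 7 corresponds to a window-wise multiply-accumulate over the filling matrix. A minimal sketch, assuming stride 1 and an odd W; the function and variable names are illustrative, not from the patent:

```python
import numpy as np

def convolve(data: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """For each first matrix element, take the W x W window centered on it
    in the filling matrix, multiply elementwise by the kernel (multiplier),
    and sum the products (adder) into the n x m result matrix."""
    w = kernel.shape[0]
    n, m = data.shape
    pad = (w - 1) // 2
    fill = np.zeros((n + w - 1, m + w - 1), dtype=float)
    fill[pad:pad + n, pad:pad + m] = data
    out = np.empty((n, m))                       # n x m convolution result matrix
    for i in range(n):
        for j in range(m):
            window = fill[i:i + w, j:j + w]      # matrix to be convolved
            out[i, j] = np.sum(window * kernel)  # multiply, then add
    return out

data = np.ones((3, 3))
result = convolve(data, np.ones((3, 3)))
```

Because the zero matrix supplies the border, the output keeps the n x m shape of the input; border outputs simply accumulate fewer nonzero products.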
8. An artificial intelligence computing auxiliary processing method, applied to a control module and comprising the following steps:
taking a data matrix to be processed out of a storage module;
placing the data matrix to be processed into a zero matrix in a memory, so that a matrix to be convolved of size W x W can be formed with any first matrix element of the data matrix as its center, for convolution calculation with a convolution kernel matrix at a preset step length; wherein W is the size of the convolution kernel matrix.
9. The artificial intelligence computing auxiliary processing method of claim 8, wherein:
the data matrix to be processed comprises an n x m matrix, and the zero matrix comprises an N x M zero matrix;
wherein N = n + W - 1 and M = m + W - 1.
10. The artificial intelligence computing auxiliary processing method of claim 8, wherein taking the data matrix to be processed out of the storage module and placing it into the zero matrix in the memory specifically comprises:
dividing the n x m matrix into a plurality of block matrices of equal size, which the control module takes out in turn.
11. The artificial intelligence computing auxiliary processing method of claim 8, wherein taking the data matrix to be processed out of the storage module and placing it into the zero matrix in the memory specifically comprises:
the control module fills each block matrix into the N x M zero matrix in turn, according to a start address and the size of the block matrix.
12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the artificial intelligence computing auxiliary processing method of any one of claims 8 to 11.
13. An artificial intelligence computing auxiliary processing terminal, comprising: a processor and a memory;
the memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, so that the terminal performs the artificial intelligence computing auxiliary processing method of any one of claims 8 to 11.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/072662 WO2019136750A1 (en) | 2018-01-15 | 2018-01-15 | Artificial intelligence-based computer-aided processing device and method, storage medium, and terminal |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109313663A true CN109313663A (en) | 2019-02-05 |
CN109313663B CN109313663B (en) | 2023-03-31 |
Family
ID=65221779
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201880002144.7A Active CN109313663B (en) | 2018-01-15 | 2018-01-15 | Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109313663B (en) |
WO (1) | WO2019136750A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111257913B (en) * | 2019-11-29 | 2024-04-30 | 交通运输部长江通信管理局 | Beidou satellite signal capturing method and device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1777082A (en) * | 2005-12-08 | 2006-05-24 | 西安电子科技大学 | Encoder of parallel-convolution LDPC code based on precoding and its fast encoding method |
CN101192833A (en) * | 2006-11-28 | 2008-06-04 | 华为技术有限公司 | A device and method for low-density checksum LDPC parallel coding |
WO2011116785A1 (en) * | 2010-03-23 | 2011-09-29 | Max-Planck-Gesellschaft Zur Förderung Der... | Method and device for reconstructing a sequence of mr images using a regularized nonlinear inverse reconstruction process |
CN104104394A (en) * | 2014-06-13 | 2014-10-15 | 哈尔滨工业大学 | Signal reconstruction method for acquiring random demodulation system perception matrix based on MLS sequence and system thereof |
CN105334542A (en) * | 2015-10-23 | 2016-02-17 | 中南大学 | Rapid and high-precision forward modeling method for gravitational field of arbitrary density distribution complex geological body |
CN106447030A (en) * | 2016-08-30 | 2017-02-22 | 深圳市诺比邻科技有限公司 | Computing resource optimization method and system of convolutional neural network |
CN107451654A (en) * | 2017-07-05 | 2017-12-08 | 深圳市自行科技有限公司 | Acceleration operation method, server and the storage medium of convolutional neural networks |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6134571A (en) * | 1998-04-29 | 2000-10-17 | Hewlett-Packard Company | Implicit DST-based filter operating in the DCT domain |
CN1145267C (en) * | 2001-03-09 | 2004-04-07 | 华为技术有限公司 | High-efficiency convolution coding method |
CN104574277A (en) * | 2015-01-30 | 2015-04-29 | 京东方科技集团股份有限公司 | Image interpolation method and image interpolation device |
CN107301668B (en) * | 2017-06-14 | 2019-03-15 | 成都四方伟业软件股份有限公司 | A kind of picture compression method based on sparse matrix, convolutional neural networks |
- 2018-01-15 CN CN201880002144.7A patent/CN109313663B/en active Active
- 2018-01-15 WO PCT/CN2018/072662 patent/WO2019136750A1/en active Application Filing
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112396175A (en) * | 2019-08-16 | 2021-02-23 | 脸谱公司 | Mapping convolutions to matrix processor units |
CN112825151A (en) * | 2019-11-20 | 2021-05-21 | 上海商汤智能科技有限公司 | Data processing method, device and equipment |
CN112825151B (en) * | 2019-11-20 | 2024-09-13 | 上海商汤智能科技有限公司 | Data processing method, device and equipment |
WO2021120036A1 (en) * | 2019-12-18 | 2021-06-24 | 华为技术有限公司 | Data processing apparatus and data processing method |
CN114730331A (en) * | 2019-12-18 | 2022-07-08 | 华为技术有限公司 | Data processing apparatus and data processing method |
CN111553224A (en) * | 2020-04-21 | 2020-08-18 | 中国电子科技集团公司第五十四研究所 | Large remote sensing image block distribution method |
CN112561943A (en) * | 2020-12-23 | 2021-03-26 | 清华大学 | Image processing method based on data multiplexing of pulse array convolution operation |
CN112561943B (en) * | 2020-12-23 | 2022-11-22 | 清华大学 | Image processing method based on data multiplexing of pulse array convolution operation |
CN117574036A (en) * | 2024-01-16 | 2024-02-20 | 北京壁仞科技开发有限公司 | Computing device, method of operation, and machine-readable storage medium |
CN117574036B (en) * | 2024-01-16 | 2024-04-12 | 北京壁仞科技开发有限公司 | Computing device, method of operation, and machine-readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2019136750A1 (en) | 2019-07-18 |
CN109313663B (en) | 2023-03-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109313663B (en) | Artificial intelligence calculation auxiliary processing device, method, storage medium and terminal | |
US10394929B2 (en) | Adaptive execution engine for convolution computing systems | |
US10585621B2 (en) | Statically-schedulable feed and drain structure for systolic array architecture | |
WO2019119301A1 (en) | Method and device for determining feature image in convolutional neural network model | |
WO2019201656A1 (en) | Method for accelerating operations and accelerator apparatus | |
CN107862650A (en) | The method of speed-up computation two dimensional image CNN convolution | |
EP3093757B1 (en) | Multi-dimensional sliding window operation for a vector processor | |
JP6955598B2 (en) | Parallel extraction method of image data in multiple convolution windows, devices, equipment and computer readable storage media | |
CN109416755B (en) | Artificial intelligence parallel processing method and device, readable storage medium and terminal | |
US11164032B2 (en) | Method of performing data processing operation | |
WO2024193337A1 (en) | Convolutional neural network acceleration method and system, storage medium, apparatus, and device | |
CN110738317A (en) | FPGA-based deformable convolution network operation method, device and system | |
US20230267740A1 (en) | Video data processing method and system, and relevant assemblies | |
KR20230081697A (en) | Method and apparatus for accelerating dilatational convolution calculation | |
CN109313723B (en) | Artificial intelligence convolution processing method and device, readable storage medium and terminal | |
KR20200043617A (en) | Artificial neural network module and scheduling method thereof for highly effective operation processing | |
US11874898B2 (en) | Streaming-based artificial intelligence convolution processing method and apparatus, readable storage medium and terminal | |
CN117851742B (en) | Data storage method, data processing method, data memory and data processor | |
TWI788257B (en) | Method and non-transitory computer readable medium for compute-in-memory macro arrangement, and electronic device applying the same | |
CN115563443A (en) | Convolution operation method and device, convolution processing method and device and storage medium | |
CN111831207B (en) | Data processing method, device and equipment thereof | |
CN116109481A (en) | Scaling method, chip, storage medium and electronic device | |
CN114741650A (en) | Tensor calculation device, data processor, tensor calculation method, and storage medium | |
CN114662647A (en) | Processing data for layers of a neural network | |
US20240296520A1 (en) | Parameter optimizing method of neural network and computing apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||