CN109784484A - Neural network acceleration method and apparatus, neural network acceleration chip, and storage medium - Google Patents
- Publication number: CN109784484A
- Application number: CN201910100514.7A
- Authority: CN (China)
- Legal status: Pending
Abstract
The invention discloses a neural network acceleration method and apparatus, a neural network acceleration chip, and a storage medium. The method comprises: for a neural network to be accelerated, performing the following steps until it is determined that acceleration of the neural network is complete: performing acceleration processing on the current layer to be accelerated using the parameters of the current layer, while scheduling the parameters of the next layer after the current layer; and, when the acceleration processing of the current layer is complete, determining the next layer as the current layer to be accelerated and performing acceleration processing on it. In the present invention, while the neural network acceleration chip performs acceleration processing on the current layer of the neural network, it can schedule the parameters of the next layer in parallel, which shortens the overall acceleration time of the neural network and improves its acceleration efficiency.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a neural network acceleration method and apparatus, a neural network acceleration chip, and a storage medium.
Background
As the precision of neural network algorithms, represented by deep learning, has improved, the overall artificial intelligence market has gradually expanded, and its huge potential has attracted numerous chip, algorithm, and application vendors. Because artificial intelligence requires a large amount of computation for model training and inference, and traditional computing chips, limited by the characteristics of the algorithms and of the computation itself, cannot meet this demand, chip manufacturers have produced dedicated chips for neural network algorithms, namely neural network accelerators.
At runtime, a neural network accelerator obtains the parameters of the network model layer by layer from an external processor; that is, the external processor configures the parameters of the network model to the neural network accelerator layer by layer over a bus. Each time the neural network accelerator finishes processing one layer of data, it obtains the parameters of the next layer of the network model from the external processor. As a result, in the interval between the completion of one layer's processing and the arrival of the next layer's parameters, i.e., the parameter scheduling interval, the neural network accelerator performs no layer processing, which makes the overall neural network acceleration time long and the efficiency low.
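Purely to illustrate the cost of this serial fetch-then-compute pattern, the following sketch compares it against the overlapped schedule proposed below; the per-layer compute time, fetch time, and layer count are hypothetical numbers chosen for the example.

```python
# Illustrative timing model with hypothetical numbers: serial fetch-then-compute
# versus overlapping the next layer's parameter fetch with the current layer's
# compute, as the method below proposes.
COMPUTE_MS = 5.0   # assumed per-layer compute time
FETCH_MS = 2.0     # assumed per-layer parameter-fetch time over the bus
NUM_LAYERS = 10

serial = NUM_LAYERS * (FETCH_MS + COMPUTE_MS)
# Overlapped: only the first fetch is exposed; every later fetch (2 ms)
# completes inside the 5 ms compute window of the previous layer.
overlapped = FETCH_MS + NUM_LAYERS * max(COMPUTE_MS, FETCH_MS)

print(f"serial:     {serial:.1f} ms")      # 70.0 ms
print(f"overlapped: {overlapped:.1f} ms")  # 52.0 ms
```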
Summary of the invention
The present invention provides a neural network acceleration method and apparatus, a neural network acceleration chip, and a storage medium, to solve the problem in the prior art that neural network acceleration takes a long time and has low efficiency.
The present invention provides a neural network acceleration method, applied to a neural network acceleration chip, the method comprising:
for a neural network to be accelerated, performing the following steps until it is determined that acceleration of the neural network is complete:
performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer;
when the acceleration processing of the current layer is complete, determining the next layer as the current layer to be accelerated and performing acceleration processing on it.
Further, if the current layer to be accelerated is the last layer, scheduling the parameters of the next layer after the current layer comprises:
scheduling the parameters of the first layer.
Further, scheduling the parameters of the next layer after the current layer comprises:
scheduling the parameters of the next layer after the current layer that are saved in an on-chip memory.
Further, scheduling the parameters of the next layer after the current layer that are saved in the on-chip memory comprises:
scheduling, through a REG file (register file), the parameters of the next layer after the current layer that are saved in the on-chip memory.
Further, before performing acceleration processing on the current layer using the parameters of the current layer to be accelerated and scheduling the parameters of the next layer after the current layer, the method further comprises:
extracting, from the neural network to be accelerated, the parameters each layer requires for acceleration processing, and saving them in the on-chip memory.
Further, the on-chip memory comprises a ROM.
The present invention provides a neural network acceleration apparatus, applied to a neural network acceleration chip, the apparatus comprising:
an acceleration scheduling module, configured to, for a neural network to be accelerated, perform the following steps until it is determined that acceleration of the neural network is complete: performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer; and
a determining module, configured to, when the acceleration processing of the current layer is complete, determine the next layer as the current layer to be accelerated for acceleration processing.
Further, the acceleration scheduling module is specifically configured to, if the current layer to be accelerated is the last layer, schedule the parameters of the first layer.
Further, the acceleration scheduling module is specifically configured to schedule the parameters of the next layer after the current layer that are saved in an on-chip memory.
Further, the acceleration scheduling module is specifically configured to schedule, through a REG file, the parameters of the next layer after the current layer that are saved in the on-chip memory.
Further, the apparatus further comprises:
an extraction and preservation module, configured to extract, from the neural network to be accelerated, the parameters each layer requires for acceleration processing, and save them in the on-chip memory.
Further, the on-chip memory comprises a ROM.
The present invention provides a neural network acceleration chip, comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of any of the methods described above.
The present invention provides a computer-readable storage medium storing a computer program executable by a neural network acceleration chip; when the program runs on the neural network acceleration chip, it causes the neural network acceleration chip to perform the steps of any of the methods described above.
The present invention provides a neural network acceleration method and apparatus, a neural network acceleration chip, and a storage medium. The method comprises: for a neural network to be accelerated, performing the following steps until it is determined that acceleration of the neural network is complete: performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer; and, when the acceleration processing of the current layer is complete, determining the next layer as the current layer to be accelerated and performing acceleration processing on it. In the present invention, while the neural network acceleration chip performs acceleration processing on the current layer of the neural network, it can schedule the parameters of the next layer in parallel, which shortens the overall acceleration time of the neural network and improves its acceleration efficiency.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a neural network acceleration method provided by Embodiment 1 of the present invention;
Fig. 2 is a schematic structural diagram of a neural network acceleration chip provided by Embodiment 6 of the present invention;
Fig. 3 is a schematic diagram of a neural network acceleration apparatus provided by an embodiment of the present invention.
Detailed description of the embodiments
In order to shorten the overall acceleration time of a neural network and improve its acceleration efficiency, the embodiments of the present invention provide a neural network acceleration method and apparatus, a neural network acceleration chip, and a storage medium.
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
Embodiment 1:
Fig. 1 is a schematic diagram of a neural network acceleration method provided by an embodiment of the present invention; the method includes the following steps:
S101: for a neural network to be accelerated, perform the following steps until it is determined that acceleration of the neural network is complete.
The neural network acceleration method provided by the embodiment of the present invention is applied to a neural network acceleration chip. The neural network acceleration chip may be a GPU (Graphics Processing Unit), an AI (Artificial Intelligence) chip, an FPGA (Field-Programmable Gate Array) chip, or any other chip capable of performing neural network acceleration. Specifically, the method may be applied to a computing unit in the neural network acceleration chip.
The neural network acceleration chip stores the algorithm for performing acceleration processing on a neural network; therefore, for a neural network to be accelerated, the chip can perform acceleration processing on it according to the following steps.
The neural network acceleration chip can determine whether the acceleration of the neural network is complete; the process of making this determination belongs to the prior art and is not repeated in the embodiments of the present invention.
The neural network referred to in the embodiments of the present invention includes deep learning neural network models.
S102: perform acceleration processing on the current layer using the parameters of the current layer to be accelerated, and schedule the parameters of the next layer after the current layer.
The neural network acceleration chip can determine the current layer to be accelerated and perform acceleration processing on it using the parameters of the current layer that have been scheduled.
The process of performing acceleration processing on a layer using its parameters can be realized with the prior art and is not repeated in the embodiments of the present invention.
The current layer may be the first layer, the last layer, or any other layer of the neural network; the term merely refers to the layer currently undergoing acceleration processing, without limiting it to a specific layer. A layer referred to in the embodiments of the present invention is usually a convolutional layer of the neural network.
If the current layer is the first layer, its parameters are scheduled before acceleration processing of the current layer begins; specifically, they may be scheduled immediately after system startup, i.e., as soon as the neural network to be accelerated has been determined.
After determining the current layer, the neural network acceleration chip can determine the next layer after the current layer. Specifically, the neural network acceleration chip stores information about each layer's next layer. For example, it may directly store a layer association table in which each layer's next layer is recorded, as sketched below; alternatively, each layer may be named by a serial number, and the chip determines each layer's next layer according to the order of the serial numbers.
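A minimal sketch of the layer association table option, with hypothetical layer names; the serial-number alternative would simply advance an index instead.

```python
# Hypothetical layer association table: each entry records a layer's next
# layer, with the last layer wrapping back to the first (see Embodiment 2).
layer_table = {"conv1": "conv2", "conv2": "conv3", "conv3": "conv1"}

def next_layer(current):
    return layer_table[current]

assert next_layer("conv3") == "conv1"
```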
While performing acceleration processing on the current layer using the parameters of the current layer, the neural network acceleration chip schedules the parameters of the next layer after the current layer in parallel.
The parameters each layer requires may be stored outside the neural network acceleration chip, i.e., in the external processor, or in the internal storage module of the neural network acceleration chip. Accordingly, the chip may schedule the next layer's parameters either from the external processor or from its own internal storage module.
S103: when the acceleration processing of the current layer is complete, determine the next layer as the current layer to be accelerated and perform acceleration processing on it.
The neural network acceleration chip can determine whether the acceleration processing of the current layer is complete; this process belongs to the prior art and is not repeated in the embodiments of the present invention.
When the neural network acceleration chip determines that the acceleration processing of the current layer is complete, it continues acceleration processing with the next layer as the current layer to be accelerated; thus, while the acceleration of the neural network is not complete, acceleration processing is performed on each layer in a loop.
In order to facilitate understanding, the neural network acceleration process is described below as a loop (a code sketch follows the list):
A: for the neural network to be accelerated, take the first layer as the current layer to be accelerated and schedule the parameters of the current layer.
B: perform acceleration processing on the current layer using the parameters of the current layer, and schedule the parameters of the next layer after the current layer.
C: judge whether the acceleration of the neural network is complete; if not, go to D; if so, go to E.
D: when the acceleration processing of the current layer is complete, determine the next layer as the current layer to be accelerated and the next layer's parameters as the parameters of the current layer to be accelerated, and return to B.
E: determine that the acceleration of the neural network is complete.
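The loop A to E can be sketched in Python as follows. This is only an illustration: fetch_params(), compute_layer(), the delays, and the pass-count completion check are hypothetical stand-ins, since the patent leaves the actual acceleration processing and the completion determination to the prior art.

```python
import threading
import time

# Hypothetical stand-ins for the hardware primitives: parameter scheduling
# and layer acceleration processing, with illustrative delays.
def fetch_params(layer):
    time.sleep(0.002)                      # models the bus/memory transfer
    return f"params[{layer}]"

def compute_layer(layer, params):
    time.sleep(0.005)                      # models the acceleration processing

def accelerate(num_layers, num_passes):
    cur = 0
    cur_params = fetch_params(cur)         # step A: schedule the first layer
    for _ in range(num_layers * num_passes):   # steps C/E: completion check
        nxt = (cur + 1) % num_layers       # the last layer wraps to the first
        box = {}
        fetcher = threading.Thread(
            target=lambda: box.update(p=fetch_params(nxt)))
        fetcher.start()                    # step B: fetch the next layer's
        compute_layer(cur, cur_params)     # parameters in parallel with the
        fetcher.join()                     # current layer's processing
        cur, cur_params = nxt, box["p"]    # step D: next layer becomes current

accelerate(num_layers=4, num_passes=2)
```

The thread models the parallel scheduling of step B; in the chip itself this would be a bus or memory transfer overlapping the computing unit's work.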
In the embodiment of the present invention, while performing acceleration processing on the current layer of the neural network, the neural network acceleration chip can schedule the parameters of the next layer in parallel, which shortens the overall acceleration time of the neural network and improves its acceleration efficiency.
Embodiment 2:
On the basis of the above embodiment, in the embodiment of the present invention, if the current layer to be accelerated is the last layer, scheduling the parameters of the next layer after the current layer comprises:
scheduling the parameters of the first layer.
Before the acceleration of the neural network is complete, acceleration processing must be performed on each layer of the neural network cyclically, layer by layer. Therefore, if the current layer is the last layer, the loop takes the first layer as the next layer and performs acceleration processing on it.
Scheduling the parameters of the next layer after the last layer is therefore specifically scheduling the parameters of the first layer.
Since, in the embodiment of the present invention, the parameters of the first layer are scheduled as the parameters of the layer after the last layer when the current layer is the last layer, cyclic layer-by-layer acceleration processing of the neural network before its acceleration is complete is ensured, which guarantees that the overall acceleration time of the neural network is shortened and its acceleration efficiency improved.
Embodiment 3:
On the basis of the above embodiments, in the embodiment of the present invention, scheduling the parameters of the next layer after the current layer comprises:
scheduling the parameters of the next layer after the current layer that are saved in an on-chip memory.
In order to further improve the acceleration efficiency of the neural network, the parameters of each layer are stored in advance in the internal storage module of the neural network acceleration chip, i.e., in an on-chip memory, rather than in the external processor, so that the next layer's parameters can be scheduled more quickly.
Specifically, the neural network acceleration chip may schedule the next layer's parameters directly from the on-chip memory, or indirectly through another file, for example a REG file.
The on-chip memory includes a read-only memory (ROM, Read-Only Memory); of course, it may also be another module with a storage function.
When the parameters of each layer are stored in the on-chip memory, each layer's parameters may be stored in a corresponding space of its own; preferably, the spaces corresponding to adjacent layers are contiguous, so that the next layer's parameters can be scheduled quickly, as sketched below.
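A minimal sketch of this contiguous per-layer layout; the layer parameter sizes (in bytes) are hypothetical.

```python
# Hypothetical per-layer parameter sizes in the on-chip memory image.
layer_sizes = [1024, 4096, 4096, 512]

# With adjacent layers in adjacent spaces, the next layer's parameters
# start exactly where the current layer's end:
# offset[i+1] = offset[i] + size[i].
offsets = [0]
for size in layer_sizes[:-1]:
    offsets.append(offsets[-1] + size)

def param_space(layer):
    """Start and end addresses of a layer's space in the on-chip memory."""
    return offsets[layer], offsets[layer] + layer_sizes[layer]

print([param_space(i) for i in range(len(layer_sizes))])
# [(0, 1024), (1024, 5120), (5120, 9216), (9216, 9728)]
```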
Since, in the embodiment of the present invention, the parameters of each layer are stored in advance in the on-chip memory of the neural network acceleration chip, the next layer's parameters can be scheduled more quickly, which further improves the acceleration efficiency of the neural network.
Embodiment 4:
On the basis of the above embodiments, in the embodiment of the present invention, scheduling the parameters of the next layer after the current layer that are saved in the on-chip memory comprises:
scheduling, through a REG file, the parameters of the next layer after the current layer that are saved in the on-chip memory.
In order to further improve the acceleration efficiency of the neural network, with the parameters of each layer saved in the on-chip memory, the parameters of the layer to be processed are placed in a REG file, so that the computing unit can schedule them directly from the REG file. This saves the time of first determining the next layer and then looking up and scheduling its parameters.
Specifically, as soon as the parameters of a certain layer are taken out of the REG file, the parameters of the next layer after that layer are immediately read from the on-chip memory and saved into the REG file.
Therefore, the REG file holds only one layer's parameters at a time.
The embodiment of the present invention is illustrated below with a specific example. Suppose a neural network has M convolutional layers. First, the parameters of the M convolutional layers are stored in spaces 1 to M of the ROM, respectively. After system startup, the parameters in space 1 are read out and stored in the REG file, and the computing unit is told that the parameters of layer 1 are ready. After the computing unit takes away the parameters of the first layer, the parameters in space 2 of the ROM are immediately read out and stored in the REG file, and the computing unit is told that the parameters of layer 2 are ready, and so on. When the computing unit has taken away the parameters of layer M, the process starts again from layer 1, until the training of the neural network is complete.
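The ROM-to-REG-file handoff in this example can be sketched as a producer-consumer pair; the layer count, the parameter contents, and the semaphore-based ready/taken signalling are hypothetical stand-ins for the chip's internal handshake.

```python
import threading

M = 3
rom = [f"layer{i}-params" for i in range(M)]   # ROM spaces 1..M

reg_file = None
ready = threading.Semaphore(0)   # "this layer's parameters are prepared"
taken = threading.Semaphore(1)   # "the computing unit has taken them away"

def loader(total):
    """Refill the REG file from ROM the moment it is emptied."""
    global reg_file
    for i in range(total):
        taken.acquire()              # wait until the REG file is free
        reg_file = rom[i % M]        # read space (i mod M) from the ROM
        ready.release()              # tell the computing unit it is ready

def computing_unit(total):
    global reg_file
    for _ in range(total):
        ready.acquire()
        params, reg_file = reg_file, None  # take the parameters away
        taken.release()                    # the loader refills immediately
        # ... acceleration processing with `params` would happen here ...

n = 2 * M  # e.g. cycle through every layer twice before completion
t1 = threading.Thread(target=loader, args=(n,))
t2 = threading.Thread(target=computing_unit, args=(n,))
t1.start(); t2.start(); t1.join(); t2.join()
```

Because taken starts at 1 and ready at 0, the loader refills the REG file the moment the computing unit takes a layer's parameters away, so the REG file holds exactly one layer's parameters at a time, as described above.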
Since, in the embodiment of the present invention, the neural network acceleration chip schedules through the REG file the parameters of the next layer after the current layer that are saved in the on-chip memory, the acceleration efficiency of the neural network is further improved.
Embodiment 5:
On the basis of the above embodiments, in the embodiment of the present invention, before performing acceleration processing on the current layer using the parameters of the current layer to be accelerated and scheduling the parameters of the next layer after the current layer, the method further comprises:
extracting, from the neural network to be accelerated, the parameters each layer requires for acceleration processing, and saving them in the on-chip memory.
The neural network acceleration chip can automatically extract the required parameters from the neural network to be accelerated, e.g. a deep learning neural network model, and save them in the on-chip memory, so as to improve the acceleration efficiency of the neural network.
To extract the parameters each layer requires for acceleration processing from the neural network to be accelerated, the neural network acceleration chip may extract them from the neural network according to keywords corresponding to the parameters; alternatively, the parameters of each layer may be configured and saved in the neural network in advance, and the neural network acceleration chip extracts them directly from the parameter save locations in the neural network.
After extracting from the neural network the parameters each layer requires for acceleration processing, the neural network acceleration chip stores the parameters of each layer of the neural network in its internal on-chip memory.
By automatically extracting the parameters the accelerator requires from the deep learning neural network model and storing the parameter file in the on-chip memory (ROM) of the accelerator, the configuration parameters are read layer by layer when the neural network accelerator runs. This realizes automatic configuration of the network model parameters of the neural network accelerator and facilitates the reading of layer parameters in the accelerator, thereby improving the neural network acceleration efficiency.
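A minimal sketch of the keyword-based extraction and packing described in this embodiment; the model format, the layer keywords, and the byte contents are hypothetical.

```python
# Hypothetical model: parameter names mapped to raw bytes.
model = {
    "conv1.weight": b"\x01" * 8,
    "conv1.bias":   b"\x02" * 2,
    "conv2.weight": b"\x03" * 8,
}

def extract_layer_params(model, layer_keyword):
    """Collect every parameter whose name matches the layer keyword."""
    return b"".join(v for k, v in sorted(model.items())
                    if k.startswith(layer_keyword))

# Pack the layers contiguously into one on-chip ROM image (see Embodiment 3).
rom_image = b"".join(extract_layer_params(model, kw)
                     for kw in ("conv1", "conv2"))
print(len(rom_image))  # 18 bytes in this toy example
```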
Embodiment 6:
On the basis of the above embodiments, the embodiment of the present invention further provides a neural network acceleration chip, as shown in Fig. 2, comprising: a processor 201, a communication interface 202, a memory 203, and a communication bus 204, wherein the processor 201, the communication interface 202, and the memory 203 communicate with one another through the communication bus 204.
The memory 203 stores a computer program which, when executed by the processor 201, causes the processor 201 to perform the following steps:
for a neural network to be accelerated, performing the following steps until it is determined that acceleration of the neural network is complete:
performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer;
when the acceleration processing of the current layer is complete, determining the next layer as the current layer to be accelerated and performing acceleration processing on it.
The communication bus mentioned for the above neural network acceleration chip may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface 202 is used for communication between the above neural network acceleration chip and other devices.
The memory may include a random access memory (RAM) or a non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit, a network processor (NP), or the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
In the embodiment of the present invention, when the processor executes the program stored in the memory, it realizes that, while acceleration processing is performed on the current layer of the neural network, the parameters of the next layer can be scheduled in parallel, which shortens the overall acceleration time of the neural network and improves its acceleration efficiency.
Embodiment 7:
On the basis of the above embodiments, the embodiment of the present invention further provides a computer-readable storage medium storing a computer program executable by a neural network acceleration chip; when the program runs on the neural network acceleration chip, it causes the neural network acceleration chip to perform the following steps:
for a neural network to be accelerated, performing the following steps until it is determined that acceleration of the neural network is complete:
performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer;
when the acceleration processing of the current layer is complete, determining the next layer as the current layer to be accelerated and performing acceleration processing on it.
The above computer-readable storage medium may be any usable medium or data storage device accessible to the processor in the neural network acceleration chip, including but not limited to magnetic memories such as floppy disks, hard disks, magnetic tapes, and magneto-optical disks (MO); optical memories such as CDs, DVDs, BDs, and HVDs; and semiconductor memories such as ROMs, EPROMs, EEPROMs, non-volatile memories (NAND FLASH), and solid-state drives (SSDs).
The computer-readable storage medium provided in the embodiment of the present invention stores a computer program which, when executed by a processor, realizes that, while acceleration processing is performed on the current layer of the neural network, the parameters of the next layer can be scheduled in parallel, which shortens the overall acceleration time of the neural network and improves its acceleration efficiency.
Fig. 3 is a schematic diagram of a neural network acceleration apparatus provided by an embodiment of the present invention, applied to a neural network acceleration chip, the apparatus comprising:
an acceleration scheduling module 301, configured to, for a neural network to be accelerated, perform the following steps until it is determined that acceleration of the neural network is complete: performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer;
a determining module 302, configured to, when the acceleration processing of the current layer is complete, determine the next layer as the current layer to be accelerated for acceleration processing.
The acceleration scheduling module 301 is specifically configured to, if the current layer to be accelerated is the last layer, schedule the parameters of the first layer.
The acceleration scheduling module 301 is specifically configured to schedule the parameters of the next layer after the current layer that are saved in an on-chip memory.
The acceleration scheduling module 301 is specifically configured to schedule, through a REG file, the parameters of the next layer after the current layer that are saved in the on-chip memory.
The apparatus further comprises:
an extraction and preservation module 303, configured to extract, from the neural network to be accelerated, the parameters each layer requires for acceleration processing, and save them in the on-chip memory.
The on-chip memory comprises a ROM.
In the embodiment of the present invention, while performing acceleration processing on the current layer of the neural network, the neural network acceleration chip can schedule the parameters of the next layer in parallel, which shortens the overall acceleration time of the neural network and improves its acceleration efficiency.
As for the system/apparatus embodiments, since they are substantially similar to the method embodiments, their description is relatively simple; for relevant parts, reference may be made to the description of the method embodiments.
It should be noted that, in this document, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, without necessarily requiring or implying any actual relationship or order between these entities or operations.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk memories, CD-ROMs, optical memories, and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured article including an instruction apparatus that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present application.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include these modifications and variations.
Claims (14)
1. A neural network acceleration method, characterized in that it is applied to a neural network acceleration chip, the method comprising:
for a neural network to be accelerated, performing the following steps until it is determined that acceleration of the neural network is complete:
performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer;
when the acceleration processing of the current layer is complete, determining the next layer as the current layer to be accelerated and performing acceleration processing on it.
2. The method of claim 1, characterized in that, if the current layer to be accelerated is the last layer, scheduling the parameters of the next layer after the current layer comprises:
scheduling the parameters of the first layer.
3. The method of claim 1, characterized in that scheduling the parameters of the next layer after the current layer comprises:
scheduling the parameters of the next layer after the current layer that are saved in an on-chip memory.
4. The method of claim 3, characterized in that scheduling the parameters of the next layer after the current layer that are saved in the on-chip memory comprises:
scheduling, through a REG file, the parameters of the next layer after the current layer that are saved in the on-chip memory.
5. The method of claim 3 or 4, characterized in that, before performing acceleration processing on the current layer using the parameters of the current layer to be accelerated and scheduling the parameters of the next layer after the current layer, the method further comprises:
extracting, from the neural network to be accelerated, the parameters each layer requires for acceleration processing, and saving them in the on-chip memory.
6. The method of claim 3 or 4, characterized in that the on-chip memory comprises a read-only memory (ROM).
7. A neural network acceleration apparatus, characterized in that it is applied to a neural network acceleration chip, the apparatus comprising:
an acceleration scheduling module, configured to, for a neural network to be accelerated, perform the following steps until it is determined that acceleration of the neural network is complete: performing acceleration processing on the current layer using the parameters of the current layer to be accelerated, and scheduling the parameters of the next layer after the current layer; and
a determining module, configured to, when the acceleration processing of the current layer is complete, determine the next layer as the current layer to be accelerated for acceleration processing.
8. The apparatus of claim 7, characterized in that the acceleration scheduling module is specifically configured to, if the current layer to be accelerated is the last layer, schedule the parameters of the first layer.
9. The apparatus of claim 7, characterized in that the acceleration scheduling module is specifically configured to schedule the parameters of the next layer after the current layer that are saved in an on-chip memory.
10. The apparatus of claim 9, characterized in that the acceleration scheduling module is specifically configured to schedule, through a REG file, the parameters of the next layer after the current layer that are saved in the on-chip memory.
11. The apparatus of claim 9 or 10, characterized in that the apparatus further comprises:
an extraction and preservation module, configured to extract, from the neural network to be accelerated, the parameters each layer requires for acceleration processing, and save them in the on-chip memory.
12. The apparatus of claim 9 or 10, characterized in that the on-chip memory comprises a read-only memory (ROM).
13. A neural network acceleration chip, characterized by comprising: a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the method of any one of claims 1 to 6.
14. A computer-readable storage medium, characterized in that it stores a computer program executable by a neural network acceleration chip; when the program runs on the neural network acceleration chip, it causes the neural network acceleration chip to perform the steps of the method of any one of claims 1 to 6.
Priority Applications (1)
- CN201910100514.7A, filed 2019-01-31: Neural network acceleration method and apparatus, neural network acceleration chip, and storage medium
Publications (1)
- CN109784484A, published 2019-05-21