CN103369326A - Transform coder applicable to the HEVC (High Efficiency Video Coding) standard - Google Patents
- Publication number
- CN103369326A (application number CN201310283390A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Complex Calculations (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a transform coder applicable to the HEVC (High Efficiency Video Coding) standard, which mainly addresses the heavy use of multipliers and the complicated circuitry of the prior art. The transform coder comprises a one-dimensional DCT (discrete cosine transform) module (1), a transpose buffer module (2) and a top-level control module (3). The one-dimensional DCT module (1) uses several butterfly units and several odd-coefficient processing units to perform every DCT size in the HEVC standard. The odd-coefficient processing units decompose the complicated multiplications into multi-stage circuits built from shifters, adders and subtracters; that is, multi-stage shifters, adders and subtracters replace the matrix multiplier, which simplifies the circuit structure. The transform coder has a simple and regular structure, high reusability, a short critical path and a high clock frequency, and is easy to integrate. It efficiently performs transform coding of video residual data without using any multipliers.
Description
Technical field
The invention belongs to the field of electronic circuit technology and specifically relates to the structure of the transform coder in the video compression coding standard HEVC. It is applicable to VLSI (very large scale integration) design.
Background technology
With the development of the electronics and information industry, digital video technology is used ever more widely. As image resolution keeps rising, however, the corresponding data volume rises with it, and the conflict between this mass of data and the available disk capacity and channel capacity becomes increasingly prominent. High data rates and large data volumes therefore pose a major challenge to existing compression algorithms and have become a major bottleneck for high-resolution video applications. How to reduce the data volume while losing no information, or as little as possible, is an active research problem, and many image and video compression algorithms have been proposed in succession.
Among these, HEVC is the latest video compression coding standard and adopts many efficient image compression algorithms. Compared with the H.264 video compression coding standard, it uses a finer tree-structured partitioning, so that images are divided into blocks more finely, and the basic block size grows from the 16×16 used in H.264 to 64×64, which makes it better suited to compressing large images. The higher compression efficiency, however, comes with greatly increased computational complexity. As the basic block size grows, so does the size of the HEVC transform unit, which must support four DCT sizes: 4×4, 8×8, 16×16 and 32×32. The number of multipliers in the corresponding circuit therefore rises sharply and the transform circuit becomes very complicated, which makes it a difficult point for hardware implementation. Designing an efficient transform coder is thus very important.
To reduce the number of multipliers in the transform coding module and lower its complexity, two main transform coding structures have been proposed so far:
The first is the structure used in the HEVC test model, which combines a partial butterfly with matrix multipliers. It exploits the symmetry of the basis matrix in transform coding and reduces the number of multipliers by a factor of 3. The structure consists of four butterfly structures and four matrix multipliers. Each butterfly structure is made up of a series of adders and subtracters and splits the computation into an even part and an odd part: the even part is computed by reusing the transform circuit of the next smaller block size, while the odd part is computed with a matrix multiplier. Even after this optimization, the matrix multipliers still contain many multipliers, so the structure is hard to implement in hardware.
The second is the patent application "Transform coder suitable for the HEVC standard" filed by Xidian University (application number 201210251115.9, publication number CN102857756A). That invention discloses a transform coder suitable for the HEVC standard, mainly intended to solve the excessive use of multipliers in the combined partial-butterfly and matrix-multiplier structure. The structure comprises a one-dimensional DCT/DST module, a transpose buffer module and a top-level control unit. The one-dimensional DCT/DST module combines butterfly structures with a matrix multiplication array to perform the various HEVC transforms. The transpose buffer module transposes the transform data by exploiting the path delays between registers and by writing to and reading from memory in different orders. The top-level control unit generates the reset and enable signals of the one-dimensional DCT/DST module and the transpose buffer module and coordinates the two modules. The one-dimensional transform module in this structure, however, still uses 48 multipliers; its circuit structure is complicated, which hinders efficient hardware implementation, and it needs many clock cycles for the larger transform sizes.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art described above by proposing a transform coder suitable for the High Efficiency Video Coding standard HEVC that reduces the complexity of the circuit structure, reduces the number of clock cycles needed for transform coding, is easy to implement in hardware, and meets the high-performance implementation requirements of the HEVC coding standard.
The technical idea of the invention is to decompose the matrix multiplication of the combined partial-butterfly and matrix-multiplier structure: the complicated multiplications are spread over a multi-stage circuit and carried out with simple shifters and adders, so that the computational complexity of each stage drops sharply. This shortens the critical path, raises the clock frequency and coding efficiency of the transform coding circuit, and finally yields a multiplier-free transform coder suitable for HEVC.
Following this idea, the transform coder of the invention comprises a one-dimensional DCT module, a transpose buffer module and a top-level control module. The data output of the one-dimensional DCT module is connected to the data input of the transpose buffer module, and its data input is connected to the data output of the transpose buffer module. The top-level control module is connected to the reset and enable terminals of the one-dimensional DCT module and to the reset and enable terminals of the transpose buffer module. The transform coder is characterized in that:
The one-dimensional DCT module comprises:
A 32-point butterfly unit, which performs pairwise addition and pairwise subtraction on the input coefficients to be transformed, feeds the 16 sums to the 16-point butterfly unit, and feeds the 16 differences to the 32-point odd-coefficient processing unit;
A 16-point butterfly unit, which performs pairwise addition and pairwise subtraction on the 16 data input from the 32-point butterfly unit, feeds the 8 sums to the 8-point butterfly unit, and feeds the 8 differences to the 16-point odd-coefficient processing unit;
A 32-point odd-coefficient processing unit, which computes, for each of the 16 data input from the 32-point butterfly unit, the sums of that datum and left-shifted copies of itself, then shifts, adds and subtracts these sums according to 16 different sets of shift amounts to obtain 16 transform coefficients, which it outputs to the transpose buffer module;
An 8-point butterfly unit, which performs pairwise addition and pairwise subtraction on the 8 data input from the 16-point butterfly unit, feeds the 4 sums to the 4-point butterfly unit, and feeds the 4 differences to the 8-point odd-coefficient processing unit;
A 16-point odd-coefficient processing unit, which computes, for each of the 8 data input from the 16-point butterfly unit, the sums of that datum and left-shifted copies of itself, then shifts, adds and subtracts these sums according to 8 different sets of shift amounts to obtain 8 transform coefficients, which it outputs to the transpose buffer module;
A 4-point butterfly unit, which performs pairwise addition and pairwise subtraction on the 4 data input from the 8-point butterfly unit, feeds the 2 sums to the 4-point even-coefficient processing unit, and feeds the 2 differences to the 4-point odd-coefficient processing unit;
An 8-point odd-coefficient processing unit, which computes, for each of the 4 data input from the 8-point butterfly unit, the sums of that datum and left-shifted copies of itself, then shifts, adds and subtracts these sums according to 4 different sets of shift amounts to obtain 4 transform coefficients, which it outputs to the transpose buffer module;
A 4-point even-coefficient processing unit, which delays the 2 data input from the 4-point butterfly unit and performs shift, addition and subtraction operations on them to obtain 2 transform coefficients, which it outputs to the transpose buffer module;
A 4-point odd-coefficient processing unit, which computes, for each of the 2 data input from the 4-point butterfly unit, the sums of that datum and left-shifted copies of itself, then shifts, adds and subtracts these sums according to 2 different sets of shift amounts to obtain 2 transform coefficients, which it outputs to the transpose buffer module;
A reset-enable control unit, which is connected to the top-level control module, receives the reset and enable signals output by the top-level control module, and uses them to control the reset and enable of every unit in the one-dimensional DCT module.
The present invention compared with prior art has the following advantages:
First, the invention adopts a unified transform structure in which the same coder circuit performs the DCT for all 4 block sizes, which increases circuit reuse and greatly reduces circuit scale.
Second, the one-dimensional DCT module of the invention spreads the complicated multiplications over a multi-stage circuit and uses multiplier-free odd-coefficient processing units to carry them out, which reduces the complexity of each stage, raises the system clock frequency, and makes the design better suited to hardware implementation.
Description of drawings
Fig. 1 is the overall block diagram of the transform coder of the invention;
Fig. 2 is a schematic diagram of the transpose buffer module of the invention;
Fig. 3 is the block diagram of the one-dimensional DCT module of the invention;
Fig. 4 shows the structure and interconnection of the 32-point, 16-point, 8-point and 4-point butterfly units of the invention;
Fig. 5 is the structure diagram of the 4-point even-coefficient processing unit of the invention;
Fig. 6 is the structure diagram of the 4-point odd-coefficient processing unit of the invention;
Fig. 7 is the structure diagram of the 8-point coefficient addition subunit of the invention;
Fig. 8 is the structure diagram of the 16-point coefficient addition subunit of the invention;
Fig. 9 is the structure diagram of the 32-point coefficient addition subunit of the invention.
Embodiment
The invention improves on the one-dimensional transform structure of existing HEVC implementations: it reduces the computational complexity of each pipeline stage, raises the system clock frequency, and lends itself to parallel hardware implementation.
The invention is described in detail below with reference to the drawings and embodiments.
With reference to Fig. 1, the transform coder for the High Efficiency Video Coding standard HEVC consists of a one-dimensional DCT module 1, a transpose buffer module 2 and a top-level control module 3. The output of the top-level control module 3 splits into two paths: the first is connected to the one-dimensional DCT module 1 and the second to the transpose buffer module 2. The data input of the one-dimensional DCT module 1 has two sources: the first is the external input data and the second is the data output of the transpose buffer module 2. The data output of the one-dimensional DCT module 1 is connected to the data input of the transpose buffer module 2. The data output of the transpose buffer module 2 splits into two paths: the first is connected to the data input of the one-dimensional DCT module 1 and the second to the external output. Specifically:
The top-level control module 3 comprises a reset-enable module 30 and a data-flow control module 31. The reset-enable module 30 is connected to the reset-enable control unit 19 of the one-dimensional DCT module 1 and to the transpose reset-enable unit 20 of the transpose buffer module 2, and supplies the enable and reset signals of these two modules. The data-flow control module 31 is connected to the address control unit 22 of the transpose buffer module 2 and generates the control signals that set the read/write mode and the read/write order of the transpose buffer module 2. Both the reset-enable module 30 and the data-flow control module 31 consist of counters and logic circuits. Based on the counter state and the transform type currently being performed, the logic circuits generate the reset and enable signals of the one-dimensional DCT module 1 and the reset, enable and data-flow control signals of the transpose buffer module 2. These signals direct the one-dimensional DCT module 1 to perform the one-dimensional row transform on the input data of the transform coder and direct the transpose buffer module 2 to receive the row-transform results. After all rows have been processed, the transpose buffer module 2 is directed to output the transposed row-transform results to the one-dimensional DCT module 1 for the one-dimensional column transform.
With reference to Fig. 2, the transpose buffer module 2 comprises a transpose reset-enable unit 20, a RAM memory 21 and an address control unit 22. The transpose reset-enable unit 20 consists of logic circuits; it receives the reset and enable signals sent by the top-level control module 3 and generates the control signals that reset and enable the RAM memory 21 and the address control unit 22. The RAM memory 21 consists of 8 memory arrays, each of which is connected to the one-dimensional DCT module 1. The address control unit 22 is connected to the address port of each memory array in the RAM memory 21 and generates the input and output enables and the input and output addresses of each memory, so that the DCT results input from the one-dimensional DCT module 1 are stored across the 8 memory arrays and then output by row or by column.
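The row pass, transposition and column pass described above can be sketched in Python as follows. This is only an illustrative software model: the names `TransposeBuffer`, `transform_2d` and the placeholder argument `transform_1d` are assumptions introduced here, and the sketch does not model the 8 memory arrays, the address generation or the timing of the actual circuit.

```python
class TransposeBuffer:
    """Toy model of a transpose buffer: written row by row, read column by column."""

    def __init__(self, size):
        self.size = size
        self.mem = [[0] * size for _ in range(size)]

    def write_row(self, row_index, values):
        # Store one row of one-dimensional transform results.
        self.mem[row_index] = list(values)

    def read_column(self, col_index):
        # Reading a column of the stored rows yields one row of the transpose.
        return [self.mem[r][col_index] for r in range(self.size)]


def transform_2d(block, transform_1d):
    """Row pass, transposition, then column pass with one shared 1-D transform.

    Entry c of the returned list is the 1-D transform of column c of the
    row-transformed block.
    """
    size = len(block)
    buf = TransposeBuffer(size)
    for r, row in enumerate(block):            # one-dimensional row transform
        buf.write_row(r, transform_1d(row))
    # One-dimensional column transform on the transposed data.
    return [transform_1d(buf.read_column(c)) for c in range(size)]


# Usage with an identity "transform", which exposes the pure transposition.
block = [[1, 2], [3, 4]]
transposed = transform_2d(block, lambda v: list(v))
```

Sharing a single `transform_1d` between both passes mirrors the reuse of one one-dimensional DCT module for the row and column transforms.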
The one-dimensional DCT module 1 performs the 4-point, 8-point, 16-point and 32-point one-dimensional DCTs of the HEVC standard; its structure is shown in Fig. 3.
With reference to Fig. 3, the one-dimensional DCT module 1 comprises a 32-point butterfly unit 10, a 16-point butterfly unit 11, a 32-point odd-coefficient processing unit 12, an 8-point butterfly unit 13, a 16-point odd-coefficient processing unit 14, a 4-point butterfly unit 15, an 8-point odd-coefficient processing unit 16, a 4-point even-coefficient processing unit 17, a 4-point odd-coefficient processing unit 18 and a reset-enable control unit 19, wherein:
The reset-enable control unit 19 consists of logic circuits and is connected to the reset-enable module 30 of the top-level control module 3 and to every unit of the one-dimensional DCT module 1. It receives the reset and enable signals output by the top-level control module 3 and uses them to control the reset and enable of every unit in the one-dimensional DCT module 1.
The 32-point butterfly unit 10 consists of 16 adders and 16 subtracters. The 16 adders are connected to the 16-point butterfly unit 11 and the 16 subtracters are connected to the 32-point odd-coefficient processing unit 12, as shown in Fig. 4.
The 16 adders sum the 32 data received at the input of the one-dimensional DCT module 1 in head-to-tail pairs: the 1st and 32nd data give the sum E_0, the 2nd and 31st data give the sum E_1, and so on, until the 16th and 17th data give the sum E_15. The 16 sums E_0 to E_15 are input to the 16-point butterfly unit 11.
The 16 subtracters take head-to-tail pairwise differences of the same 32 input data: the 1st and 32nd data give the difference O_0, the 2nd and 31st data give the difference O_1, and so on, until the 16th and 17th data give the difference O_15. The 16 differences O_0 to O_15 are input to the 32-point odd-coefficient processing unit 12.
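The head-to-tail pairing performed by the 32-point butterfly unit can be illustrated with a short Python sketch. The function name `butterfly` is introduced here for illustration, and the subtraction order (earlier datum minus later datum) is an assumption, since the text only speaks of "the difference" of each pair.

```python
def butterfly(data):
    """Head-to-tail butterfly: return (sums, differences) of paired entries.

    Pair k combines data[k] with data[n - 1 - k]. The difference is taken as
    data[k] - data[n - 1 - k], which is an assumed sign convention.
    """
    n = len(data)
    half = n // 2
    sums = [data[k] + data[n - 1 - k] for k in range(half)]
    diffs = [data[k] - data[n - 1 - k] for k in range(half)]
    return sums, diffs


# Example: 32 input data 0..31 give 16 sums E_0..E_15 and 16 differences O_0..O_15.
E, O = butterfly(list(range(32)))
```

In this model the list `E` would feed the 16-point butterfly unit and the list `O` the 32-point odd-coefficient processing unit.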
The 16-point butterfly unit 11 consists of 8 adders and 8 subtracters. The 8 adders are connected to the 8-point butterfly unit 13 and the 8 subtracters are connected to the 16-point odd-coefficient processing unit 14, as shown in Fig. 4.
The 8 adders sum the data E_0 to E_15 input from the 32-point butterfly unit 10 in head-to-tail pairs: E_0 and E_15 give the sum EE_0, E_1 and E_14 give the sum EE_1, and so on, until E_7 and E_8 give the sum EE_7. The 8 sums EE_0 to EE_7 are input to the 8-point butterfly unit 13.
The 8 subtracters take head-to-tail pairwise differences of the data E_0 to E_15: E_0 and E_15 give the difference EO_0, E_1 and E_14 give the difference EO_1, and so on, until E_7 and E_8 give the difference EO_7. The 8 differences EO_0 to EO_7 are input to the 16-point odd-coefficient processing unit 14.
The 8-point butterfly unit 13 consists of 4 adders and 4 subtracters. The 4 adders are connected to the 4-point butterfly unit 15 and the 4 subtracters are connected to the 8-point odd-coefficient processing unit 16, as shown in Fig. 4.
The 4 adders sum the data EE_0 to EE_7 input from the 16-point butterfly unit 11 in head-to-tail pairs: EE_0 and EE_7 give the sum EEE_0, EE_1 and EE_6 give the sum EEE_1, and so on, until EE_3 and EE_4 give the sum EEE_3. The 4 sums EEE_0 to EEE_3 are input to the 4-point butterfly unit 15.
The 4 subtracters take head-to-tail pairwise differences of the data EE_0 to EE_7: EE_0 and EE_7 give the difference EEO_0, EE_1 and EE_6 give the difference EEO_1, and so on, until EE_3 and EE_4 give the difference EEO_3. The 4 differences EEO_0 to EEO_3 are input to the 8-point odd-coefficient processing unit 16.
The 4-point butterfly unit 15 consists of 2 adders and 2 subtracters. The 2 adders are connected to the 4-point even-coefficient processing unit 17 and the 2 subtracters are connected to the 4-point odd-coefficient processing unit 18, as shown in Fig. 4.
The 2 adders compute the sum EEEE_0 of the data EEE_0 and EEE_3 input from the 8-point butterfly unit 13 and the sum EEEE_1 of the input data EEE_1 and EEE_2. The 2 sums EEEE_0 and EEEE_1 are input to the 4-point even-coefficient processing unit 17.
The 2 subtracters compute the difference EEEO_0 of the input data EEE_0 and EEE_3 and the difference EEEO_1 of the input data EEE_1 and EEE_2. The 2 differences EEEO_0 and EEEO_1 are input to the 4-point odd-coefficient processing unit 18.
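Taken together, the four butterfly units form a cascade in which each unit passes its sums to the next smaller butterfly and its differences to an odd-coefficient processing unit. The following Python sketch models only this routing; the function names and the dictionary keys are illustrative assumptions, as is the subtraction order (earlier datum minus later datum).

```python
def butterfly(data):
    """Head-to-tail pairwise sums and differences (assumed order: head minus tail)."""
    half = len(data) // 2
    sums = [data[k] + data[-1 - k] for k in range(half)]
    diffs = [data[k] - data[-1 - k] for k in range(half)]
    return sums, diffs


def butterfly_cascade(samples):
    """Run the 32-, 16-, 8- and 4-point butterflies in sequence.

    Returns the data that leave the cascade: the differences of every stage
    (inputs of the odd-coefficient units) and the final sums EEEE (input of
    the 4-point even-coefficient unit).
    """
    out = {}
    e, out["O"] = butterfly(samples)             # 32-point unit: 16 sums, 16 differences
    ee, out["EO"] = butterfly(e)                 # 16-point unit: 8 sums, 8 differences
    eee, out["EEO"] = butterfly(ee)              # 8-point unit: 4 sums, 4 differences
    out["EEEE"], out["EEEO"] = butterfly(eee)    # 4-point unit: 2 sums, 2 differences
    return out


stages = butterfly_cascade(list(range(32)))
sizes = {name: len(values) for name, values in stages.items()}
```

For a shorter transform, a model like this would simply enter the cascade at the matching butterfly, which is one way to read the claim that a single circuit serves all four DCT sizes.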
With reference to Fig. 5, the 4-point even-coefficient processing unit 17 consists of a delay subunit 170, a 2-point butterfly subunit 171 and a shift subunit 172.
The delay subunit 170 delays the data EEEE_0 and EEEE_1 input from the 4-point butterfly unit 15 by 2 clock cycles to obtain the delayed data EEEE_0_0 and EEEE_1_0, and sends these 2 data to the 2-point butterfly subunit 171.
The 2-point butterfly subunit 171 consists of 1 adder and 1 subtracter. It adds and subtracts the delayed data EEEE_0_0 and EEEE_1_0 input from the delay subunit 170 to obtain the sum EEEEE and the difference EEEEO, and sends them to the shift subunit 172.
The shift subunit 172 consists of 2 shifters. It shifts the data EEEEE and EEEEO input from the 2-point butterfly subunit 171 left by 6 bits and outputs the 2 resulting coefficients to the transpose buffer module 2.
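As a rough software analogue of this unit, the sketch below performs the 2-point butterfly followed by the 6-bit left shift, which amounts to multiplying the sum and the difference by 64. The function name `even4_unit` is an assumption, and the 2-cycle delay is omitted because it affects only timing, not the computed values.

```python
def even4_unit(eeee0, eeee1):
    """Model of the 4-point even-coefficient unit (delay omitted).

    Returns the two result coefficients: 64 * (eeee0 + eeee1) and
    64 * (eeee0 - eeee1), each obtained with one add or subtract and one shift.
    """
    eeeee = eeee0 + eeee1        # 2-point butterfly: sum
    eeeeo = eeee0 - eeee1        # 2-point butterfly: difference
    return eeeee << 6, eeeeo << 6   # left shift by 6 bits replaces the multiplication by 64


coeff_sum, coeff_diff = even4_unit(3, 5)
```

In Python a left shift of a negative integer behaves as multiplication by a power of two, so the sketch also covers negative differences.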
With reference to Fig. 6, the 4-point odd-coefficient processing unit 18 consists of one 4-point coefficient operation subunit 180 and two 4-point coefficient addition subunits 181.
The 4-point coefficient operation subunit 180 is a cascade of registers, shifters and adders. It delays the data EEEO_0 and EEEO_1 input from the 4-point butterfly unit 15 to obtain the delayed coefficients EEEO_0_0 and EEEO_1_0, and computes, for EEEO_0 and for EEEO_1, the sum of the datum and a copy of itself shifted left by different numbers of bits, namely:
The sum of EEEO_0 and EEEO_0 shifted left by 1 bit, which gives the first 4-point sum coefficient EEEO_0_1;
The sum of EEEO_1 and EEEO_1 shifted left by 1 bit, which gives the second 4-point sum coefficient EEEO_1_1;
The sum of EEEO_0 and EEEO_0 shifted left by 2 bits, which gives the third 4-point sum coefficient EEEO_0_2;
The sum of EEEO_1 and EEEO_1 shifted left by 2 bits, which gives the fourth 4-point sum coefficient EEEO_1_2.
The delayed coefficients and sum coefficients are then input to each 4-point coefficient addition subunit 181.
Each 4-point coefficient addition subunit 181 is a cascade of shifters, adders and subtracters and computes one result coefficient of the DCT. It merges, in 3 stages, the two delayed coefficients EEEO_0_0 and EEEO_1_0 and the four sum coefficients EEEO_0_1, EEEO_0_2, EEEO_1_1 and EEEO_1_2 input from the 4-point coefficient operation subunit 180, wherein:
Stage 1 performs a first merge on the following three groups of coefficients simultaneously:
Group 1: the two delayed coefficients EEEO_0_0 and EEEO_1_0 are each shifted left and then added or subtracted, giving the first 4-point stage-1 merged coefficient COE_4_101;
Group 2: the two sum coefficients EEEO_0_1 and EEEO_1_1 are each shifted left and then added or subtracted, giving the second 4-point stage-1 merged coefficient COE_4_102;
Group 3: the two sum coefficients EEEO_0_2 and EEEO_1_2 are each shifted left and then added or subtracted, giving the third 4-point stage-1 merged coefficient COE_4_103.
Stage 2 performs a second merge on the three merged coefficients obtained in stage 1:
The first stage-1 merged coefficient COE_4_101 and the second stage-1 merged coefficient COE_4_102 are each shifted left and then added or subtracted, giving the first 4-point stage-2 merged coefficient COE_4_201;
The third stage-1 merged coefficient COE_4_103 is shifted left, giving the second 4-point stage-2 merged coefficient COE_4_202.
Stage 3 merges the two merged coefficients obtained in stage 2: the first stage-2 merged coefficient COE_4_201 and the second stage-2 merged coefficient COE_4_202 are each shifted left and then added or subtracted, giving one 4-point result coefficient COEFF_4, which is output to the transpose buffer module 2.
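To make the three-stage merge concrete, the following Python sketch builds the two odd outputs of a 4-point transform from the delayed and sum coefficients using only shifts, additions and subtractions. Several things here are assumptions rather than statements from the text: the target constants 83 and 36 are taken from the HEVC 4-point core transform matrix, the specific shift amounts and add/subtract choices are one possible decomposition chosen for illustration (the text does not list them), and the function names are hypothetical. Stages 2 and 3 use a shift of 0 in this particular decomposition.

```python
def coeff_ops(x):
    """Coefficient operation subunit: delayed copy x, x + (x << 1) = 3x, x + (x << 2) = 5x."""
    return x, x + (x << 1), x + (x << 2)


def odd4_unit(eeeo0, eeeo1):
    """Multiplier-free model of the 4-point odd part (assumed shift amounts).

    Returns (83*a + 36*b, 36*a - 83*b) for a = eeeo0 and b = eeeo1.
    """
    a0, a1, a2 = coeff_ops(eeeo0)    # a, 3a, 5a
    b0, b1, b2 = coeff_ops(eeeo1)    # b, 3b, 5b

    # First addition subunit: 83a + 36b
    coe_101 = (a0 << 6) + (b0 << 2)  # stage 1, group 1: 64a + 4b
    coe_102 = (a1 << 3) + (b1 << 2)  # stage 1, group 2: 24a + 12b
    coe_103 = (b2 << 2) - a2         # stage 1, group 3: 20b - 5a
    coe_201 = coe_101 + coe_102      # stage 2: 88a + 16b
    coe_202 = coe_103                # stage 2: passed through (shift by 0)
    out0 = coe_201 + coe_202         # stage 3: 83a + 36b

    # Second addition subunit: 36a - 83b
    coe_101 = (a0 << 2) - (b0 << 6)  # stage 1, group 1: 4a - 64b
    coe_102 = (a1 << 2) - (b1 << 3)  # stage 1, group 2: 12a - 24b
    coe_103 = (a2 << 2) + b2         # stage 1, group 3: 20a + 5b
    coe_201 = coe_101 + coe_102      # stage 2: 16a - 88b
    coe_202 = coe_103                # stage 2: passed through (shift by 0)
    out1 = coe_201 + coe_202         # stage 3: 36a - 83b
    return out0, out1


first, second = odd4_unit(1, 0)
```

Because each stage contains at most one shift and one addition or subtraction per path, a pipelined version of such a decomposition would have a short combinational path per stage, which is the effect the text attributes to the multi-stage design.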
The 8-point odd-coefficient processing unit 16 consists of one 8-point coefficient operation subunit 160 and four 8-point coefficient addition subunits 161.
The 8-point coefficient operation subunit 160 is a cascade of registers, shifters and adders. It delays each of the data EEO_0 to EEO_3 input from the 8-point butterfly unit 13 to obtain the delayed coefficients EEO_0_0 to EEO_3_0, and computes, for each of the data EEO_0 to EEO_3, the sum of the datum and a copy of itself shifted left by different numbers of bits, namely:
The sum of EEO_0 and EEO_0 shifted left by 1 bit, which gives the first 8-point sum coefficient EEO_0_1;
The sum of EEO_1 and EEO_1 shifted left by 1 bit, which gives the second 8-point sum coefficient EEO_1_1;
The sum of EEO_2 and EEO_2 shifted left by 1 bit, which gives the third 8-point sum coefficient EEO_2_1;
The sum of EEO_3 and EEO_3 shifted left by 1 bit, which gives the fourth 8-point sum coefficient EEO_3_1;
The sum of EEO_0 and EEO_0 shifted left by 2 bits, which gives the fifth 8-point sum coefficient EEO_0_2;
The sum of EEO_1 and EEO_1 shifted left by 2 bits, which gives the sixth 8-point sum coefficient EEO_1_2;
The sum of EEO_2 and EEO_2 shifted left by 2 bits, which gives the seventh 8-point sum coefficient EEO_2_2;
The sum of EEO_3 and EEO_3 shifted left by 2 bits, which gives the eighth 8-point sum coefficient EEO_3_2.
These eight sum coefficients are sent to each 8-point coefficient addition subunit 161.
Each 8 dot factor addition subelement 161 is made of shift unit, adder and subtracter cascade, is used for trying to achieve one of dct transform as a result coefficient, namely divides 4 grades to the coefficient EEO by 160 inputs of 8 dot factor operator unit
0_0~EEO
3_0, EEO
0_1~EEO
3_1And EEO
0_2~EEO
3_2Carry out shifter-adder or displacement and subtract each other, wherein:
The 1st grade is that following six groups of coefficients are once merged respectively simultaneously:
First group is with EEO
0_0And EEO
1_0After these two retardation coefficients move to left respectively, carry out again addition or subtract each other, obtain first merge coefficient COE of 8: 0 1st grades
8_101
Second group is with EEO
2_0And EEO
3_0After these two retardation coefficients move to left respectively, carry out again addition or subtract each other, obtain second merge coefficient COE of 8: 0 1st grades
8_102
The 3rd group is with EEO
0_1And EEO
1_1These two summations are after coefficients move to left respectively, carry out addition again or subtract each other, and obtain 8: 0 1st grades the 3rd merge coefficient COE
8_103
The 4th group is with EEO
2_1And EEO
3_1These two summations are after coefficients move to left respectively, carry out addition again or subtract each other, and obtain 8: 0 1st grades the 4th merge coefficient COE
8_104
The 5th group is with EEO
0_2And EEO
1_2These two summations are after coefficients move to left respectively, carry out addition again or subtract each other, and obtain 8: 0 1st grades the 5th merge coefficient COE
8_105
The 6th group is with EEO
2_2And EEO
3_2These two summations are after coefficients move to left respectively, carry out addition again or subtract each other, and obtain 8: 0 1st grades the 6th merge coefficient COE
8_106
Stage 2 merges the six stage-1 merged coefficients, in three pairs, simultaneously:
Group 1: the two merged coefficients COE_8_101 and COE_8_102 are each left-shifted and then added or subtracted, giving the first 8-point stage-2 merged coefficient COE_8_201;
Group 2: the two merged coefficients COE_8_103 and COE_8_104 are each left-shifted and then added or subtracted, giving the second 8-point stage-2 merged coefficient COE_8_202;
Group 3: the two merged coefficients COE_8_105 and COE_8_106 are each left-shifted and then added or subtracted, giving the third 8-point stage-2 merged coefficient COE_8_203;
Stage 3 processes the three stage-2 merged coefficients simultaneously:
The first 8-point stage-2 merged coefficient COE_8_201 and the second 8-point stage-2 merged coefficient COE_8_202 are each left-shifted and then added or subtracted, giving the first 8-point stage-3 merged coefficient COE_8_301;
The third 8-point stage-2 merged coefficient COE_8_203 is left-shifted, giving the second 8-point stage-3 merged coefficient COE_8_302;
Stage 4 merges the two stage-3 merged coefficients: the first 8-point stage-3 merged coefficient COE_8_301 and the second 8-point stage-3 merged coefficient COE_8_302 are each left-shifted and then added or subtracted, giving one 8-point result coefficient COEFF_8, which is output to the transpose buffer module 2, as shown in Figure 7.
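The four merge stages of one 8-point coefficient addition subunit can be sketched in software as follows. This is only an illustrative model, not the patented circuit: the function names and the `plan` dictionary are hypothetical, and the shift counts and add/subtract choices are left as parameters because the text says they are chosen by experiment.

```python
def merge(a, b, shift_a, shift_b, subtract):
    """One merge cell: left-shift both inputs, then add or subtract."""
    left, right = a << shift_a, b << shift_b
    return left - right if subtract else left + right


def addition_subunit_8(coeffs, plan):
    """Four-stage merge of the 12 coefficients fed to one addition subunit.

    coeffs: [EEO_0_0..EEO_3_0, EEO_0_1..EEO_3_1, EEO_0_2..EEO_3_2]
    plan:   hypothetical configuration; each cell is (shift_a, shift_b, subtract)
            'stage1': 6 cells, 'stage2': 3 cells, 'stage3': 1 cell,
            'pass_shift': shift applied to COE_8_203, 'stage4': 1 cell
    """
    assert len(coeffs) == 12
    # Stage 1: six pairs -> COE_8_101..COE_8_106
    s1 = [merge(coeffs[2 * i], coeffs[2 * i + 1], *plan["stage1"][i]) for i in range(6)]
    # Stage 2: three pairs -> COE_8_201..COE_8_203
    s2 = [merge(s1[2 * i], s1[2 * i + 1], *plan["stage2"][i]) for i in range(3)]
    # Stage 3: one merge plus one shifted pass-through -> COE_8_301, COE_8_302
    coe_301 = merge(s2[0], s2[1], *plan["stage3"])
    coe_302 = s2[2] << plan["pass_shift"]
    # Stage 4: final merge -> COEFF_8
    return merge(coe_301, coe_302, *plan["stage4"])


def neutral_plan():
    """A plan with zero shifts and additions only (for checking the wiring)."""
    cell = (0, 0, False)
    return {"stage1": [cell] * 6, "stage2": [cell] * 3,
            "stage3": cell, "pass_shift": 0, "stage4": cell}
```

With the neutral plan the tree simply sums its 12 inputs, which is a convenient way to check the wiring before real shift counts are substituted.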
The 16-point odd coefficient processing unit 14 consists of one 16-point coefficient operation subunit 140 and eight 16-point coefficient addition subunits 141;
The 16-point coefficient operation subunit 140 is a cascade of registers, shifters and adders. It delays the data EO_0~EO_7 input by the 16-point butterfly unit 11 to obtain the delayed coefficients EO_0_0~EO_7_0, and computes the sum of each of EO_0~EO_7 and the same datum left-shifted by different numbers of bits, namely:
The sum of EO_0 and EO_0 left-shifted by 1 bit gives the first 16-point summation coefficient EO_0_1;
The sum of EO_1 and EO_1 left-shifted by 1 bit gives the second 16-point summation coefficient EO_1_1;
and so on;
The sum of EO_7 and EO_7 left-shifted by 1 bit gives the eighth 16-point summation coefficient EO_7_1;
The sum of EO_0 and EO_0 left-shifted by 2 bits gives the ninth 16-point summation coefficient EO_0_2;
The sum of EO_1 and EO_1 left-shifted by 2 bits gives the tenth 16-point summation coefficient EO_1_2;
and so on;
The sum of EO_7 and EO_7 left-shifted by 2 bits gives the sixteenth 16-point summation coefficient EO_7_2;
These 16 summation coefficients are sent to each 16-point coefficient addition subunit 141;
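In arithmetic terms, adding a datum to itself shifted left by 1 bit yields 3 times the datum, and adding it to itself shifted left by 2 bits yields 5 times the datum. A minimal sketch of the operation subunit under that reading follows; the function name and return layout are hypothetical, and the register delay is modelled as a plain copy because it changes timing, not values.

```python
def operation_subunit(data, shifts=(1, 2)):
    """Return the delayed copies plus one list of x + (x << s) per shift count s."""
    delayed = list(data)  # EO_i_0: delayed, value unchanged
    sums = [[x + (x << s) for x in data] for s in shifts]  # EO_i_1, EO_i_2
    return delayed, sums
```

Passing `shifts=(1, 2, 3)` would model a subunit that also forms the sum with a 3-bit shift, which equals 9 times the datum.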
Each 16-point coefficient addition subunit 141 is a cascade of shifters, adders and subtracters and computes one result coefficient of the DCT. In 5 stages, it applies shift-and-add or shift-and-subtract operations to the coefficients EO_0_0~EO_7_0, EO_0_1~EO_7_1 and EO_0_2~EO_7_2 input by the 16-point coefficient operation subunit 140, wherein:
Stage 1 merges the following 12 groups of coefficients simultaneously:
Group 1: the two delayed coefficients EO_0_0 and EO_1_0 are each left-shifted and then added or subtracted, giving the first 16-point stage-1 merged coefficient COE_16_101;
Group 2: the two delayed coefficients EO_2_0 and EO_3_0 are each left-shifted and then added or subtracted, giving the second 16-point stage-1 merged coefficient COE_16_102;
Group 3: the two delayed coefficients EO_4_0 and EO_5_0 are each left-shifted and then added or subtracted, giving the third 16-point stage-1 merged coefficient COE_16_103;
Group 4: the two delayed coefficients EO_6_0 and EO_7_0 are each left-shifted and then added or subtracted, giving the fourth 16-point stage-1 merged coefficient COE_16_104;
Group 5: the two summation coefficients EO_0_1 and EO_1_1 are each left-shifted and then added or subtracted, giving the fifth 16-point stage-1 merged coefficient COE_16_105;
Group 6: the two summation coefficients EO_2_1 and EO_3_1 are each left-shifted and then added or subtracted, giving the sixth 16-point stage-1 merged coefficient COE_16_106;
and so on;
Group 11: the two summation coefficients EO_4_2 and EO_5_2 are each left-shifted and then added or subtracted, giving the eleventh 16-point stage-1 merged coefficient COE_16_111;
Group 12: the two summation coefficients EO_6_2 and EO_7_2 are each left-shifted and then added or subtracted, giving the twelfth 16-point stage-1 merged coefficient COE_16_112;
Stage 2 merges the following six groups of coefficients simultaneously:
Group 1: the two merged coefficients COE_16_101 and COE_16_102 are each left-shifted and then added or subtracted, giving the first 16-point stage-2 merged coefficient COE_16_201;
Group 2: the two merged coefficients COE_16_103 and COE_16_104 are each left-shifted and then added or subtracted, giving the second 16-point stage-2 merged coefficient COE_16_202;
Group 3: the two merged coefficients COE_16_105 and COE_16_106 are each left-shifted and then added or subtracted, giving the third 16-point stage-2 merged coefficient COE_16_203;
Group 4: the two merged coefficients COE_16_107 and COE_16_108 are each left-shifted and then added or subtracted, giving the fourth 16-point stage-2 merged coefficient COE_16_204;
Group 5: the two merged coefficients COE_16_109 and COE_16_110 are each left-shifted and then added or subtracted, giving the fifth 16-point stage-2 merged coefficient COE_16_205;
Group 6: the two merged coefficients COE_16_111 and COE_16_112 are each left-shifted and then added or subtracted, giving the sixth 16-point stage-2 merged coefficient COE_16_206;
Stage 3 merges the following three pairs of merged coefficients simultaneously:
Group 1: the two merged coefficients COE_16_201 and COE_16_202 are each left-shifted and then added or subtracted, giving the first 16-point stage-3 merged coefficient COE_16_301;
Group 2: the two merged coefficients COE_16_203 and COE_16_204 are each left-shifted and then added or subtracted, giving the second 16-point stage-3 merged coefficient COE_16_302;
Group 3: the two merged coefficients COE_16_205 and COE_16_206 are each left-shifted and then added or subtracted, giving the third 16-point stage-3 merged coefficient COE_16_303;
Stage 4 processes the three stage-3 merged coefficients simultaneously:
The first 16-point stage-3 merged coefficient COE_16_301 and the second 16-point stage-3 merged coefficient COE_16_302 are each left-shifted and then added or subtracted, giving the first 16-point stage-4 merged coefficient COE_16_401;
The third 16-point stage-3 merged coefficient COE_16_303 is left-shifted, giving the second 16-point stage-4 merged coefficient COE_16_402;
Stage 5 merges the two stage-4 merged coefficients: the first 16-point stage-4 merged coefficient COE_16_401 and the second 16-point stage-4 merged coefficient COE_16_402 are each left-shifted and then added or subtracted, giving one 16-point result coefficient COEFF_16, which is output to the transpose buffer module 2, as shown in Figure 8.
The 32-point odd coefficient processing unit 12 consists of one 32-point coefficient operation subunit 120 and sixteen 32-point coefficient addition subunits 121;
The 32-point coefficient operation subunit 120 is a cascade of registers, shifters and adders. It delays the data O_0~O_15 input by the 32-point butterfly unit 10 to obtain the delayed coefficients O_0_0~O_15_0, and computes the sum of each of the input data O_0~O_15 and the same datum left-shifted by different numbers of bits, namely:
The sum of O_0 and O_0 left-shifted by 1 bit gives the first 32-point summation coefficient O_0_1;
The sum of O_1 and O_1 left-shifted by 1 bit gives the second 32-point summation coefficient O_1_1;
and so on;
The sum of O_15 and O_15 left-shifted by 1 bit gives the sixteenth 32-point summation coefficient O_15_1;
The sum of O_0 and O_0 left-shifted by 2 bits gives the seventeenth 32-point summation coefficient O_0_2;
The sum of O_1 and O_1 left-shifted by 2 bits gives the eighteenth 32-point summation coefficient O_1_2;
and so on;
The sum of O_15 and O_15 left-shifted by 2 bits gives the thirty-second 32-point summation coefficient O_15_2;
The sum of O_0 and O_0 left-shifted by 3 bits gives the thirty-third 32-point summation coefficient O_0_3;
The sum of O_1 and O_1 left-shifted by 3 bits gives the thirty-fourth 32-point summation coefficient O_1_3;
and so on;
The sum of O_15 and O_15 left-shifted by 3 bits gives the forty-eighth 32-point summation coefficient O_15_3;
These 48 summation coefficients are sent to each 32-point coefficient addition subunit 121;
Each 32-point coefficient addition subunit 121 computes one result coefficient of the DCT. It is a cascade of shifters, adders and subtracters and, in 6 stages, applies shift-and-add or shift-and-subtract operations to the coefficients O_0_0~O_15_0, O_0_1~O_15_1, O_0_2~O_15_2 and O_0_3~O_15_3 input by the 32-point coefficient operation subunit 120, wherein:
Stage 1 merges the following 32 groups of coefficients simultaneously:
Group 1: the two delayed coefficients O_0_0 and O_1_0 are each left-shifted and then added or subtracted, giving the first 32-point stage-1 merged coefficient COE_32_101;
Group 2: the two delayed coefficients O_2_0 and O_3_0 are each left-shifted and then added or subtracted, giving the second 32-point stage-1 merged coefficient COE_32_102;
Group 3: the two delayed coefficients O_4_0 and O_5_0 are each left-shifted and then added or subtracted, giving the third 32-point stage-1 merged coefficient COE_32_103;
and so on;
Group 8: the two delayed coefficients O_14_0 and O_15_0 are each left-shifted and then added or subtracted, giving the eighth 32-point stage-1 merged coefficient COE_32_108;
Group 9: the two summation coefficients O_0_1 and O_1_1 are each left-shifted and then added or subtracted, giving the ninth 32-point stage-1 merged coefficient COE_32_109;
Group 10: the two summation coefficients O_2_1 and O_3_1 are each left-shifted and then added or subtracted, giving the tenth 32-point stage-1 merged coefficient COE_32_110;
and so on;
Group 31: the two summation coefficients O_12_3 and O_13_3 are each left-shifted and then added or subtracted, giving the thirty-first 32-point stage-1 merged coefficient COE_32_131;
Group 32: the two summation coefficients O_14_3 and O_15_3 are each left-shifted and then added or subtracted, giving the thirty-second 32-point stage-1 merged coefficient COE_32_132;
Stage 2 merges the following 16 groups of coefficients simultaneously:
Group 1: the two merged coefficients COE_32_101 and COE_32_102 are each left-shifted and then added or subtracted, giving the first 32-point stage-2 merged coefficient COE_32_201;
Group 2: the two merged coefficients COE_32_103 and COE_32_104 are each left-shifted and then added or subtracted, giving the second 32-point stage-2 merged coefficient COE_32_202;
Group 3: the two merged coefficients COE_32_105 and COE_32_106 are each left-shifted and then added or subtracted, giving the third 32-point stage-2 merged coefficient COE_32_203;
Group 4: the two merged coefficients COE_32_107 and COE_32_108 are each left-shifted and then added or subtracted, giving the fourth 32-point stage-2 merged coefficient COE_32_204;
Group 5: the two merged coefficients COE_32_109 and COE_32_110 are each left-shifted and then added or subtracted, giving the fifth 32-point stage-2 merged coefficient COE_32_205;
Group 6: the two merged coefficients COE_32_111 and COE_32_112 are each left-shifted and then added or subtracted, giving the sixth 32-point stage-2 merged coefficient COE_32_206;
and so on;
Group 15: the two merged coefficients COE_32_129 and COE_32_130 are each left-shifted and then added or subtracted, giving the fifteenth 32-point stage-2 merged coefficient COE_32_215;
Group 16: the two merged coefficients COE_32_131 and COE_32_132 are each left-shifted and then added or subtracted, giving the sixteenth 32-point stage-2 merged coefficient COE_32_216;
Stage 3 merges the following eight groups of coefficients simultaneously:
Group 1: the two merged coefficients COE_32_201 and COE_32_202 are each left-shifted and then added or subtracted, giving the first 32-point stage-3 merged coefficient COE_32_301;
Group 2: the two merged coefficients COE_32_203 and COE_32_204 are each left-shifted and then added or subtracted, giving the second 32-point stage-3 merged coefficient COE_32_302;
Group 3: the two merged coefficients COE_32_205 and COE_32_206 are each left-shifted and then added or subtracted, giving the third 32-point stage-3 merged coefficient COE_32_303;
and so on;
Group 7: the two merged coefficients COE_32_213 and COE_32_214 are each left-shifted and then added or subtracted, giving the seventh 32-point stage-3 merged coefficient COE_32_307;
Group 8: the two merged coefficients COE_32_215 and COE_32_216 are each left-shifted and then added or subtracted, giving the eighth 32-point stage-3 merged coefficient COE_32_308;
Stage 4 merges the following four pairs of merged coefficients simultaneously:
Group 1: the two merged coefficients COE_32_301 and COE_32_302 are each left-shifted and then added or subtracted, giving the first 32-point stage-4 merged coefficient COE_32_401;
Group 2: the two merged coefficients COE_32_303 and COE_32_304 are each left-shifted and then added or subtracted, giving the second 32-point stage-4 merged coefficient COE_32_402;
Group 3: the two merged coefficients COE_32_305 and COE_32_306 are each left-shifted and then added or subtracted, giving the third 32-point stage-4 merged coefficient COE_32_403;
Group 4: the two merged coefficients COE_32_307 and COE_32_308 are each left-shifted and then added or subtracted, giving the fourth 32-point stage-4 merged coefficient COE_32_404;
Stage 5 merges the four stage-4 merged coefficients, in two pairs, simultaneously:
The first 32-point stage-4 merged coefficient COE_32_401 and the second 32-point stage-4 merged coefficient COE_32_402 are each left-shifted and then added or subtracted, giving the first 32-point stage-5 merged coefficient COE_32_501;
The third 32-point stage-4 merged coefficient COE_32_403 and the fourth 32-point stage-4 merged coefficient COE_32_404 are each left-shifted and then added or subtracted, giving the second 32-point stage-5 merged coefficient COE_32_502;
Stage 6 merges the two stage-5 merged coefficients: the first 32-point stage-5 merged coefficient COE_32_501 and the second 32-point stage-5 merged coefficient COE_32_502 are each left-shifted and then added or subtracted, giving one 32-point result coefficient COEFF_32, which is output to the transpose buffer module 2, as shown in Figure 9.
In every merge stage of the 4-point coefficient addition subunits 181, 8-point coefficient addition subunits 161, 16-point coefficient addition subunits 141 and 32-point coefficient addition subunits 121 described above, the shift counts and the choice between adder and subtracter are determined by experiment according to actual requirements.
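The stage counts quoted for the addition subunits (4, 5 and 6 stages) follow from halving the number of coefficients at each stage, with an unpaired coefficient passed through after shifting. The sketch below checks this. The function name is hypothetical, and the input counts assume 4, 8 and 16 input data with 3, 3 and 4 coefficients per datum respectively, as in the description.

```python
def merge_stage_widths(n_inputs):
    """Number of values remaining after each pairwise-merge stage.

    An odd value count leaves one value unpaired; it is carried to the next stage.
    """
    widths = []
    n = n_inputs
    while n > 1:
        n = (n + 1) // 2  # pairs merge into one value; a leftover passes through
        widths.append(n)
    return widths


# 8-point:  4 data x (delayed + 2 sums)  = 12 coefficients
# 16-point: 8 data x (delayed + 2 sums)  = 24 coefficients
# 32-point: 16 data x (delayed + 3 sums) = 64 coefficients
for n in (12, 24, 64):
    print(n, merge_stage_widths(n))
```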
Claims (10)
1. A transform coder suitable for the High Efficiency Video Coding (HEVC) standard, comprising a one-dimensional DCT module (1), a transpose buffer module (2) and a top-level control module (3), wherein the data output of the one-dimensional DCT module (1) is connected to the data input of the transpose buffer module (2), and its data input is connected to the data output of the transpose buffer module (2); the top-level control module (3) is connected to the reset and enable terminals of the one-dimensional DCT module (1) and to the reset and enable terminals of the transpose buffer module (2), characterized in that:
The one-dimensional DCT module (1) comprises:
a 32-point butterfly unit (10), which adds and subtracts the input coefficients to be transformed in pairs, inputs the 16 data obtained by addition to the 16-point butterfly unit (11), and inputs the 16 data obtained by subtraction to the 32-point odd coefficient processing unit (12);
a 16-point butterfly unit (11), which adds and subtracts in pairs the 16 data input by the 32-point butterfly unit (10), inputs the 8 data obtained by addition to the 8-point butterfly unit (13), and inputs the 8 data obtained by subtraction to the 16-point odd coefficient processing unit (14);
a 32-point odd coefficient processing unit (12), which computes the sums of the 16 data input by the 32-point butterfly unit (10) and left-shifted copies of those same data, shifts, adds and subtracts the summed results according to 16 different sets of shift counts to obtain 16 transform data, and inputs them to the transpose buffer module (2);
an 8-point butterfly unit (13), which adds and subtracts in pairs the 8 data input by the 16-point butterfly unit (11), inputs the 4 data obtained by addition to the 4-point butterfly unit (15), and inputs the 4 data obtained by subtraction to the 8-point odd coefficient processing unit (16);
a 16-point odd coefficient processing unit (14), which computes the sums of the 8 data input by the 16-point butterfly unit (11) and left-shifted copies of those same data, shifts, adds and subtracts the summed results according to 8 different sets of shift counts to obtain 8 transform data, and inputs them to the transpose buffer module (2);
a 4-point butterfly unit (15), which adds and subtracts in pairs the 4 data input by the 8-point butterfly unit (13), inputs the 2 data obtained by addition to the 4-point even coefficient processing unit (17), and inputs the 2 data obtained by subtraction to the 4-point odd coefficient processing unit (18);
an 8-point odd coefficient processing unit (16), which computes the sums of the 4 data input by the 8-point butterfly unit (13) and left-shifted copies of those same data, shifts, adds and subtracts the summed results according to 4 different sets of shift counts to obtain 4 transform data, and inputs them to the transpose buffer module (2);
a 4-point even coefficient processing unit (17), which delays the 2 data input by the 4-point butterfly unit (15), applies shift, add and subtract operations to them to obtain 2 transform data, and inputs them to the transpose buffer module (2);
a 4-point odd coefficient processing unit (18), which computes the sums of the 2 data input by the 4-point butterfly unit (15) and left-shifted copies of those same data, shifts, adds and subtracts the summed results according to 2 different sets of shift counts to obtain 2 transform data, and inputs them to the transpose buffer module (2);
a reset-enable control unit (19), connected to the top-level control module (3), which receives the reset and enable signals output by the top-level control module (3) and uses them to control the reset and enabling of each unit in the one-dimensional DCT module (1).
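The data routing between the butterfly units in this claim can be illustrated with a short software model. It is a sketch only: the function names are hypothetical, the pairing is the head-to-tail pairing stated in the dependent claims, and the odd and even coefficient processing units are not modelled, only the data handed to them.

```python
def butterfly(values):
    """Head-to-tail pairwise sums and differences of an even-length list."""
    half = len(values) // 2
    sums = [values[i] + values[-1 - i] for i in range(half)]
    diffs = [values[i] - values[-1 - i] for i in range(half)]
    return sums, diffs


def cascade(x):
    """Route 32 inputs through the 32-, 16-, 8- and 4-point butterfly units."""
    assert len(x) == 32
    e, o = butterfly(x)          # 16 sums to unit (11), 16 differences to unit (12)
    ee, eo = butterfly(e)        # 8 sums to unit (13), 8 differences to unit (14)
    eee, eeo = butterfly(ee)     # 4 sums to unit (15), 4 differences to unit (16)
    eeee, eeeo = butterfly(eee)  # 2 sums to unit (17), 2 differences to unit (18)
    return {"O": o, "EO": eo, "EEO": eeo, "EEEO": eeeo, "EEEE": eeee}
```

The five lists handed to the coefficient processing units hold 16 + 8 + 4 + 2 + 2 = 32 values, one per transform output.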
2. The transform coder according to claim 1, characterized in that: the 32-point butterfly unit (10) consists of 16 adders and 16 subtracters; the 16 adders sum the input data in head-to-tail pairs and input the 16 resulting sums E_0~E_15 to the 16-point butterfly unit (11); the 16 subtracters take the differences of the input coefficients in head-to-tail pairs and input the 16 resulting differences O_0~O_15 to the 32-point odd coefficient processing unit (12).
3. The transform coder according to claim 1, characterized in that: the 16-point butterfly unit (11) consists of 8 adders and 8 subtracters; the 8 adders sum the data E_0~E_15 input by the 32-point butterfly unit (10) in head-to-tail pairs and input the 8 resulting sums EE_0~EE_7 to the 8-point butterfly unit (13); the 8 subtracters take the differences of the data E_0~E_15 in head-to-tail pairs and input the 8 resulting differences EO_0~EO_7 to the 16-point odd coefficient processing unit (14).
4. The transform coder according to claim 1, characterized in that: the 32-point odd coefficient processing unit (12) consists of one 32-point coefficient operation subunit (120) cascaded with sixteen 32-point coefficient addition subunits (121);
The 32-point coefficient operation subunit (120) is a cascade of registers, shifters and adders; it delays the data O_0~O_15 input by the 32-point butterfly unit (10) to obtain the delayed coefficients O_0_0~O_15_0, and computes the sums O_0_1~O_15_1, O_0_2~O_15_2 and O_0_3~O_15_3 of O_0~O_15 and the same data left-shifted by 1, 2 and 3 bits respectively; these coefficients are sent to each 32-point coefficient addition subunit (121);
Each 32-point coefficient addition subunit (121) is a cascade of shifters, adders and subtracters; it applies shift-and-add or shift-and-subtract operations to the coefficients O_0_0~O_15_0, O_0_1~O_15_1, O_0_2~O_15_2 and O_0_3~O_15_3 input by the 32-point coefficient operation subunit (120), and outputs the 1 datum finally obtained to the transpose buffer module (2).
5. The transform coder according to claim 1, characterized in that: the 8-point butterfly unit (13) consists of 4 adders and 4 subtracters; the 4 adders sum the data EE_0~EE_7 input by the 16-point butterfly unit (11) in head-to-tail pairs and input the 4 resulting sums EEE_0~EEE_3 to the 4-point butterfly unit (15); the 4 subtracters take the differences of the data EE_0~EE_7 in head-to-tail pairs and input the 4 resulting differences EEO_0~EEO_3 to the 8-point odd coefficient processing unit (16).
6. The transform coder according to claim 1, characterized in that: the 16-point odd coefficient processing unit (14) consists of one 16-point coefficient operation subunit (140) cascaded with eight 16-point coefficient addition subunits (141);
The 16-point coefficient operation subunit (140) is a cascade of registers, shifters and adders; it delays the data EO_0~EO_7 input by the 16-point butterfly unit (11) to obtain the delayed coefficients EO_0_0~EO_7_0, and computes the sums EO_0_1~EO_7_1 of EO_0~EO_7 and the same data left-shifted by 1 bit and the sums EO_0_2~EO_7_2 of EO_0~EO_7 and the same data left-shifted by 2 bits; these coefficients are sent to each 16-point coefficient addition subunit (141);
Each 16-point coefficient addition subunit (141) is a cascade of shifters, adders and subtracters; it applies shift-and-add or shift-and-subtract operations to the coefficients EO_0_0~EO_7_0, EO_0_1~EO_7_1 and EO_0_2~EO_7_2 input by the 16-point coefficient operation subunit (140), and outputs the 1 datum finally obtained to the transpose buffer module (2).
7. The transform coder according to claim 1, characterized in that: the 4-point butterfly unit (15) consists of 2 adders and 2 subtracters; the 2 adders compute the sum EEEE_0 of the data EEE_0 and EEE_3 input by the 8-point butterfly unit (13) and the sum EEEE_1 of the input data EEE_1 and EEE_2, and input these 2 sums EEEE_0 and EEEE_1 to the 4-point even coefficient processing unit (17); the 2 subtracters compute the difference EEEO_0 of the input data EEE_0 and EEE_3 and the difference EEEO_1 of the input data EEE_1 and EEE_2, and input these 2 differences EEEO_0 and EEEO_1 to the 4-point odd coefficient processing unit (18).
8. The transform coder according to claim 1, characterized in that: the 8-point odd coefficient processing unit (16) consists of one 8-point coefficient operation subunit (160) cascaded with four 8-point coefficient addition subunits (161);
The 8-point coefficient operation subunit (160) is a cascade of registers, shifters and adders; it delays the data EEO_0~EEO_3 input by the 8-point butterfly unit (13) to obtain the delayed coefficients EEO_0_0~EEO_3_0, and computes the sums EEO_0_1~EEO_3_1 of EEO_0~EEO_3 and the same data left-shifted by 1 bit and the sums EEO_0_2~EEO_3_2 of EEO_0~EEO_3 and the same data left-shifted by 2 bits; these coefficients are sent to each 8-point coefficient addition subunit (161);
Each 8-point coefficient addition subunit (161) is a cascade of shifters, adders and subtracters; it applies shift-and-add or shift-and-subtract operations to the coefficients EEO_0_0~EEO_3_0, EEO_0_1~EEO_3_1 and EEO_0_2~EEO_3_2 input by the 8-point coefficient operation subunit (160), and outputs the 1 datum finally obtained to the transpose buffer module (2).
9. The transform coder according to claim 1, characterized in that: the 4-point even coefficient processing unit (17) consists of a delay subunit (170), a 2-point butterfly subunit (171) and a shift subunit (172) in cascade;
The delay subunit (170) delays the data EEEE_0 and EEEE_1 input by the 4-point butterfly unit (15) by 2 clock cycles to obtain the delayed data EEEE_0_0 and EEEE_1_0, and sends these 2 data to the 2-point butterfly subunit (171);
The 2-point butterfly subunit (171) consists of 1 adder and 1 subtracter; it adds and subtracts the delayed data EEEE_0_0 and EEEE_1_0 input by the delay subunit (170), obtaining the sum EEEEE and the difference EEEEO, and sends them to the shift subunit (172);
The shift subunit (172) consists of 2 shifters; it left-shifts the data EEEEE and EEEEO input by the 2-point butterfly subunit (171) and outputs the 2 resulting data to the transpose buffer module (2).
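A minimal numerical sketch of the 4-point even coefficient processing unit follows. The claim does not state the shift count; the default of 6 bits is an assumption chosen because the HEVC core transform uses the coefficient 64 for these two outputs. The function name is hypothetical, and the 2-cycle delay is omitted because it affects timing only.

```python
def even_unit_4(eeee0, eeee1, shift=6):
    """2-point butterfly followed by a left shift of both results.

    shift=6 is an assumption (multiplication by 64), not a value from the claim.
    """
    eeeee = eeee0 + eeee1  # adder of the 2-point butterfly subunit
    eeeeo = eeee0 - eeee1  # subtracter of the 2-point butterfly subunit
    return eeeee << shift, eeeeo << shift
```

Under that assumption the two outputs equal 64 times the sum and 64 times the difference of the inputs, so no multiplier is needed.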
10. The transform coder according to claim 1, characterized in that: the 4-point odd coefficient processing unit (18) consists of one 4-point coefficient operation subunit (180) and two 4-point coefficient addition subunits (181);
The 4-point coefficient operation subunit (180) is a cascade of registers, shifters and adders; it delays the data EEEO_0 and EEEO_1 input by the 4-point butterfly unit (15) to obtain the delayed coefficients EEEO_0_0 and EEEO_1_0, and computes the sum of each of EEEO_0 and EEEO_1 and the same datum left-shifted by different numbers of bits, namely:
the sum EEEO_0_1 of EEEO_0 and EEEO_0 left-shifted by 1 bit,
the sum EEEO_1_1 of EEEO_1 and EEEO_1 left-shifted by 1 bit,
the sum EEEO_0_2 of EEEO_0 and EEEO_0 left-shifted by 2 bits,
the sum EEEO_1_2 of EEEO_1 and EEEO_1 left-shifted by 2 bits,
and these coefficients are input to each 4-point coefficient addition subunit (181);
Each 4-point coefficient addition subunit (181) is a cascade of shifters, adders and subtracters; it applies shift-and-add or shift-and-subtract operations to the coefficients EEEO_0_0, EEEO_1_0, EEEO_0_1, EEEO_1_1, EEEO_0_2 and EEEO_1_2 input by the 4-point coefficient operation subunit (180), and outputs the 1 datum finally obtained to the transpose buffer module (2).
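To make the shift-and-add idea concrete, the sketch below computes the two odd outputs of a 4-point HEVC transform, which use the standard coefficients 83 and 36, without any multiplication. The particular decomposition (83 = 5·16 + 3 and 36 = 3·8 + 3·4) is an assumption made for illustration: the claim leaves the shift counts and add/subtract choices open, and this sketch does not use all six coefficients. The function name is hypothetical.

```python
def odd_pair_4(eeeo0, eeeo1):
    """Multiplier-free 83/36 outputs built from x + (x << 1) and x + (x << 2)."""
    a, b = eeeo0, eeeo1
    a3, b3 = a + (a << 1), b + (b << 1)  # sums with a 1-bit shift: 3x
    a5, b5 = a + (a << 2), b + (b << 2)  # sums with a 2-bit shift: 5x
    # 83*a + 36*b = (5a << 4) + 3a + (3b << 3) + (3b << 2)
    out_first = (a5 << 4) + a3 + (b3 << 3) + (b3 << 2)
    # 36*a - 83*b = (3a << 3) + (3a << 2) - (5b << 4) - 3b
    out_second = (a3 << 3) + (a3 << 2) - (b5 << 4) - b3
    return out_first, out_second
```

Each output corresponds to one addition subunit, which is consistent with the unit containing two of them.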
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310283390.3A CN103369326B (en) | 2013-07-05 | 2013-07-05 | Transform coder suitable for the High Efficiency Video Coding (HEVC) standard |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310283390.3A CN103369326B (en) | 2013-07-05 | 2013-07-05 | Transform coder suitable for the High Efficiency Video Coding (HEVC) standard |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103369326A (en) | 2013-10-23 |
CN103369326B (en) | 2016-06-29 |
Family
ID=49369731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310283390.3A Expired - Fee Related CN103369326B (en) | Transform coder suitable for the High Efficiency Video Coding (HEVC) standard | 2013-07-05 | 2013-07-05 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103369326B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013010386A1 (en) * | 2011-07-18 | 2013-01-24 | Mediatek Singapore Pte. Ltd. | Method and apparatus for compressing coding unit in high efficiency video coding |
CN102857756A (en) * | 2012-07-19 | 2013-01-02 | 西安电子科技大学 | Transfer coder adaptive to high efficiency video coding (HEVC) standard |
CN103024389A (en) * | 2012-12-24 | 2013-04-03 | 芯原微电子(北京)有限公司 | HEVC (high efficiency video coding) decoding device and method |
CN103067718A (en) * | 2013-01-30 | 2013-04-24 | 上海交通大学 | One-dimensional inverse discrete cosine transform (IDCT) module circuit suitable for digital video coding/decoding |
CN103092559A (en) * | 2013-01-30 | 2013-05-08 | 上海交通大学 | Multiplying unit structure for discrete cosine transformation (DCT)/inverse discrete cosine transformation (IDCT) circuit under high efficiency video coding (HEVC) standard |
Non-Patent Citations (2)
Title |
---|
CHUNXIAO FAN ET AL.: "A Low Complexity Multiplierless Transform Coding for HEVC", 《ADVANCES IN MULTIMEDIA INFORMATION PROCESSING-PCM 2012》 * |
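ZHU Xiuchang et al.: "A New Generation Video Coding Standard: HEVC", 《Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition)》 * |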
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105100811B (en) * | 2014-05-14 | 2018-04-03 | 北京君正集成电路股份有限公司 | Method and device for implementing video transformation |
CN105100811A (en) * | 2014-05-14 | 2015-11-25 | 北京君正集成电路股份有限公司 | Video transformation realizing method and device |
CN105791871A (en) * | 2014-12-25 | 2016-07-20 | 炬芯(珠海)科技有限公司 | Discrete cosine transform DCT device and application method |
CN106028049A (en) * | 2016-07-06 | 2016-10-12 | 电子科技大学 | Two-dimensional DCT image processor |
CN106028049B (en) * | 2016-07-06 | 2018-11-13 | 电子科技大学 | A two-dimensional DCT image processor |
CN107181963A (en) * | 2017-03-31 | 2017-09-19 | 武汉斗鱼网络科技有限公司 | A video compression method and device |
CN107181963B (en) * | 2017-03-31 | 2019-10-22 | 武汉斗鱼网络科技有限公司 | A video compression method and device |
CN107027039B (en) * | 2017-04-14 | 2019-08-27 | 西安电子科技大学 | Discrete cosine transform implementation method based on efficient video coding standard |
CN107027039A (en) * | 2017-04-14 | 2017-08-08 | 西安电子科技大学 | Discrete cosine transform implementation method based on efficient video coding standard |
CN109521994A (en) * | 2017-09-19 | 2019-03-26 | 华为技术有限公司 | Multiplication hardware circuit, system on chip and electronic equipment |
WO2019057093A1 (en) * | 2017-09-19 | 2019-03-28 | 华为技术有限公司 | Multiplication circuit, system on chip, and electronic device |
CN109521994B (en) * | 2017-09-19 | 2020-11-10 | 华为技术有限公司 | Multiplication hardware circuit, system on chip and electronic equipment |
US11249721B2 (en) | 2017-09-19 | 2022-02-15 | Huawei Technologies Co., Ltd. | Multiplication circuit, system on chip, and electronic device |
CN108184127A (en) * | 2018-01-13 | 2018-06-19 | 福州大学 | A configurable multi-size DCT transform hardware multiplexing architecture |
CN108184127B (en) * | 2018-01-13 | 2020-06-12 | 福州大学 | Configurable multi-size DCT (discrete cosine transform) transformation hardware multiplexing architecture |
CN116366248A (en) * | 2023-05-31 | 2023-06-30 | 山东大学 | Kyber implementation method and system based on compact instruction set expansion |
CN116366248B (en) * | 2023-05-31 | 2023-09-29 | 山东大学 | Kyber implementation method and system based on compact instruction set expansion |
Also Published As
Publication number | Publication date |
---|---|
CN103369326B (en) | 2016-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103369326A (en) | Transition coder applicable to HEVC (high efficiency video coding) standards | |
Gong et al. | New cost-effective VLSI implementation of a 2-D discrete cosine transform and its inverse | |
CN102857756B (en) | Transfer coder adaptive to high efficiency video coding (HEVC) standard | |
CN105183425A (en) | Fixed-bit-width multiplier with high accuracy and low complexity properties | |
CN101729893A (en) | MPEG multi-format compatible decoding method based on software and hardware coprocessing and device thereof | |
CN101426134A (en) | Hardware device and method for video encoding and decoding | |
CN102300092B (en) | Lifting scheme-based 9/7 wavelet inverse transformation image decompressing method | |
Kim et al. | An area efficient DCT architecture for MPEG-2 video encoder | |
CN103237219A (en) | Two-dimensional discrete cosine transformation (DCT)/inverse DCT circuit and method | |
CN103092559B (en) | For the multiplier architecture of DCT/IDCT circuit under HEVC standard | |
CN110766136B (en) | Compression method of sparse matrix and vector | |
CN114758209B (en) | Convolution result obtaining method and device, computer equipment and storage medium | |
CN103179398A (en) | FPGA (field programmable gate array) implement method for lifting wavelet transform | |
CN102447898B (en) | Method for realizing KLT (Karhunen-Loeve Transform) by means of FPGA (Field Programmable Gate Array) | |
CN210109863U (en) | Multiplier, device, neural network chip and electronic equipment | |
KR100444729B1 (en) | Fast fourier transform apparatus using radix-8 single-path delay commutator and method thereof | |
CN101316367B (en) | Two-dimension inverse transformation method of video encoding and decoding standard, and its implementing circuit | |
CN105898334B (en) | A DC prediction circuit and method applied to video encoding and decoding | |
Senthilkumar et al. | Power Reduction in DCT Implementation using Comparative Input Method | |
KR20150050680A (en) | Device and method for discrete cosine transform | |
KR100202567B1 (en) | An arithmetic apparatus for high speed idct | |
KR960014197B1 (en) | Distributed arithmetic unit | |
Chetan et al. | Performance Analysis of Modified Architecture of DA-DWT and Lifting based Scheme DWT for Image Compression. | |
CN114327637A (en) | Data conversion method, apparatus, electronic device, medium, and computer program product | |
Kumar et al. | POWER AND AREA OPTIMAL IMPLEMENTATION OF 2D-CSDA FOR MULTI STANDARD CORE |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20160629 |