US20190342551A1 - Rate control - Google Patents
- Publication number
- US20190342551A1 (application US16/511,839)
- Authority
- US
- United States
- Prior art keywords
- coding parameter
- input frame
- bitrate
- parameter values
- rate control
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/0001—Systems modifying transmission characteristics according to link quality, e.g. power backoff
- H04L1/0014—Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the source coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L1/00—Arrangements for detecting or preventing errors in the information received
- H04L1/0001—Systems modifying transmission characteristics according to link quality, e.g. power backoff
- H04L1/0015—Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the adaptation strategy
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0852—Delays
- H04L43/087—Jitter
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/149—Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W24/00—Supervisory, monitoring or testing arrangements
- H04W24/02—Arrangements for optimising operational condition
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
- H04L43/0882—Utilisation of link capacity
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/19—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/189—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
- H04N19/196—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
Definitions
- the present disclosure relates to data encoding and, more particularly, to a method and apparatus for rate control, a multi-rate encoding apparatus, and a transmitting terminal.
- a condition of a channel, such as the channel bandwidth, varies over time, especially for a wireless channel.
- there are many factors affecting the wireless channel, such as the physical distance, the relative position, and obstacles/occlusions between the receiving and transmitting terminals, immediate electromagnetic interference, and the like.
- a data source of the transmission also varies over time.
- the source time-variation and the channel time-variation are independent of each other and are difficult to predict, which causes difficulties in adapting the source encoding to the channel bandwidth in real time. For example, when the channel is stable, a sudden movement of the camera or a large movement of an object in the camera view leads to a sudden change in the size of the encoded data stream.
- if the size of the encoded data stream doubles, the transmission latency/delay is doubled accordingly.
- even if the size of the data stream remains constant, a sudden channel variation can still cause transmission jitter (transmission latency that varies over time). If the channel bandwidth is reduced by one half, the transmission latency is doubled accordingly.
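The latency relationships described above follow from latency being the ratio of frame size to channel bandwidth. A minimal illustration (the function name and the figures used are illustrative, not from the disclosure):

```python
def transmission_latency(frame_bits: float, bandwidth_bps: float) -> float:
    """Time to push one encoded frame through the channel, in seconds."""
    return frame_bits / bandwidth_bps

# A 100 kbit frame over a 2 Mbit/s channel takes 0.05 s.
base = transmission_latency(100_000, 2_000_000)
# Halving the channel bandwidth doubles the latency ...
halved_bandwidth = transmission_latency(100_000, 1_000_000)
# ... and so does doubling the frame size on a stable channel.
doubled_frame = transmission_latency(200_000, 2_000_000)
```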
- Rate control technologies that adapt encoding rate to the channel bandwidth in real-time have been widely used in the wireless video transmission applications to ensure a smooth transmission over unreliable channels.
- Conventional rate control technologies only control the overall average bitrate of a group of frames (e.g., multiple frames). Because only one sample point including two elements, e.g., a coding parameter value and a corresponding bitrate value, is generated per frame, several sample points need to be generated from multiple frames over a given time period to estimate the parameters of the rate control model.
- the conventional rate control technologies stabilize the average bitrate over a given time period (e.g., multiple frames) at an expected bitrate to ensure that the overall jitter averaged over multiple frames or a period of time is small.
- low-latency video transmission, however, requires stabilizing the bitrate per frame within a certain range to avoid large transmission jitter that causes the playback to stop frequently at the receiving terminal.
- a rate control method including encoding a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, updating a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encoding a second input frame based on the updated rate control model.
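Because a single input frame yields several (coding parameter, bitrate) sample points at once, the rate control model can be re-estimated every frame rather than over a group of frames. A hedged sketch, assuming the commonly used quadratic rate model R(Q) = a/Q + b/Q² (the model form and all names here are illustrative, not mandated by the claims):

```python
def fit_rq_model(qps, bitrates):
    """Least-squares fit of R(Q) = a/Q + b/Q^2 from per-frame samples.

    Each multi-rate encoding pass over one frame produces several
    (QP, bitrate) pairs, so the two model parameters can be solved
    from the 2x2 normal equations every frame.
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for q, r in zip(qps, bitrates):
        x1, x2 = 1.0 / q, 1.0 / (q * q)
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        t1 += x1 * r
        t2 += x2 * r
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

def predict_bitrate(a, b, q):
    """Evaluate the fitted model at coding parameter value q."""
    return a / q + b / (q * q)

def qp_for_target(a, b, target):
    """Invert the model: solve b*x^2 + a*x - target = 0 for x = 1/Q."""
    disc = a * a + 4.0 * b * target
    return 2.0 * b / (-a + disc ** 0.5) if b else a / target
```

The second frame can then be encoded with `qp_for_target` evaluated at the bitrate currently supported by the channel.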
- a rate control apparatus including one or more memories storing instructions and one or more processors coupled to the one or more memories.
- the one or more processors are configured to encode a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, update a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encode a second input frame based on the updated rate control model.
- FIG. 1 is a schematic diagram showing a transmitting terminal according to exemplary embodiments of the disclosure.
- FIG. 2 is a schematic block diagram showing a multi-rate encoding apparatus according to an exemplary embodiment of the disclosure.
- FIG. 3 is a schematic block diagram showing a single-rate encoder according to exemplary embodiments of the disclosure.
- FIG. 4 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure.
- FIG. 5 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure.
- FIG. 6 is a schematic diagram illustrating a process of updating a rate control model per frame according to exemplary embodiments of the disclosure.
- FIG. 7 is flow chart of a rate control method according to exemplary embodiments of the disclosure.
- FIG. 8 is a flow chart showing a process of iteratively updating a rate control model according to exemplary embodiments of the disclosure.
- FIG. 9 schematically shows a variation of a bitrate versus quantization parameter (QP) curve (R-Q curve) between frames according to exemplary embodiments of the disclosure.
- FIG. 1 is a schematic diagram showing an exemplary transmitting terminal 100 consistent with the disclosure.
- the transmitting terminal 100 is configured to capture images and encode the images according to a plurality of coding parameter values to generate a plurality of encoded data streams, also referred to as multiple channels of encoded data streams.
- the images may be still images, e.g., pictures, and/or moving images, e.g., videos.
- image is used to refer to either a still image or a moving image.
- the coding parameter refers to a parameter associated with the encoding process, such as a quantization parameter (QP), a coding mode selection, a packet size, or the like.
- Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of a plurality of bitrate values.
- the transmitting terminal 100 is further configured to select one of the plurality of encoded data streams as an output data stream for transmitting over a transmission channel.
- the transmitting terminal 100 may be integrated in a mobile body, such as an unmanned aerial vehicle (UAV), a driverless car, a mobile robot, a driverless boat, a submarine, a spacecraft, a satellite, or the like.
- the transmitting terminal 100 may be a hosted payload carried by the mobile body that operates independently but may share the power supply of the mobile body.
- the transmission channel may use any form of communication connection, such as an Internet connection, a cable television connection, a telephone connection, a wireless connection, or another connection capable of supporting the transmission of images.
- the transmission channel can be a wireless channel.
- the transmission channel may use any type of physical transmission medium, such as cable (e.g., twisted-pair wire cable and fiber-optic cable), air, water, space, or any combination of the above media.
- when the transmitting terminal 100 is integrated in a UAV, one or more of the multiple channels of encoded data streams can be over air.
- when the transmitting terminal 100 is a hosted payload carried by a commercial satellite, one or more of the multiple channels of encoded data streams can be over space and air.
- when the transmitting terminal 100 is a hosted payload carried by a submarine, one or more of the multiple channels of encoded data streams can be over water.
- the transmitting terminal 100 includes an image capturing device 110 , a multi-rate encoding apparatus 130 coupled to the image capturing device 110 , and a transceiver 150 coupled to the multi-rate encoding apparatus 130 .
- the image capturing device 110 includes an image sensor and a lens or a lens set, and is configured to capture images.
- the image sensor may be, for example, an opto-electronic sensor, such as a charge-coupled device (CCD) sensor, a complementary metal-oxide-semiconductor (CMOS) sensor, or the like.
- the image capturing device 110 is further configured to send the captured images to the multi-rate encoding apparatus 130 for encoding.
- the image capturing device 110 may include a memory for storing, either temporarily or permanently, the captured images.
- the multi-rate encoding apparatus 130 is configured to receive the images captured by the image capturing device 110 , and encode the images according to the plurality of coding parameter values to generate the plurality of encoded data streams. Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of the plurality of bitrate values. As shown in FIG. 1 , the multi-rate encoding apparatus 130 includes a multi-rate encoder 1301 , a rate controller 1303 , and a rate selector 1305 coupled to each other. Further, the multi-rate encoder 1301 is coupled to the image capturing device 110 . The rate selector 1305 is coupled to the transceiver 150 .
- the multi-rate encoder 1301 may receive and encode the images captured by the image capturing device 110 according to any suitable video coding standard, also referred to as video compression standard, such as Windows Media Video (WMV) standard, Society of Motion Picture and Television Engineers (SMPTE) 421-M standard, Moving Picture Experts Group (MPEG) standard, e.g., MPEG-1, MPEG-2, or MPEG-4, H.26x standard, e.g., H.261, H.262, H.263, or H.264, or another standard.
- the video coding standard may be selected according to the video coding standard supported by a decoder, the channel conditions, the image quality requirement, and/or the like. For example, an image encoded using the MPEG standard needs to be decoded by a corresponding decoder adapted to support the appropriate MPEG standard.
- a lossless compression format may be used to achieve a high image quality requirement, and a lossy compression format may be used to adapt to limited transmission channel bandwidth.
- the multi-rate encoder 1301 may implement one or more different codec algorithms.
- the selection of the codec algorithm may be based on encoding complexity, encoding speed, encoding ratio, encoding efficiency, and/or the like. For example, a fast codec algorithm may be performed in real time on low-end hardware. A high encoding ratio algorithm may be desirable for a transmission channel with a small bandwidth.
- the multi-rate encoder 1301 may further perform at least one of encryption, error-correction encoding, format conversion, or the like.
- the encryption may be performed before transmission or storage to protect confidentiality.
- FIG. 2 is a schematic block diagram showing an example of the multi-rate encoding apparatus 130 consistent with the disclosure.
- the multi-rate encoder 1301 includes a plurality of single-rate encoders for generating the plurality of encoded data streams.
- Each single-rate encoder can generate one of the plurality of encoded data streams having a corresponding one of the plurality of bitrates according to one of the plurality of coding parameter values.
- the plurality of single-rate encoders may be separate parts or partially separate parts sharing one or more common circuits.
- FIG. 3 is a schematic block diagram showing an exemplary single-rate encoder consistent with the disclosure.
- the single-rate encoder includes a “forward path” connected by solid-line arrows and an “inverse path” connected by dashed-line arrows in the figure.
- the “forward path” includes conducting an encoding process on an entire image frame or a block, e.g., a macroblock (MB), of the image frame.
- the “inverse path” includes implementing a reconstruction process, which generates context 301 for prediction of a next image frame or a next block of the next image frame.
- An image frame refers to a complete image.
- the terms “frame,” “image,” and “image frame” are used interchangeably.
- the size and type of the block of the image frame may be determined according to the encoding standard that is employed. For example, a fixed-size MB covering 16×16 pixels is the basic syntax and processing unit employed in the H.264 standard. H.264 also allows the subdivision of an MB into smaller sub-blocks, down to a size of 4×4 pixels, for motion-compensation prediction.
- an MB may be split into sub-blocks in one of four manners: 16×16, 16×8, 8×16, or 8×8.
- the 8×8 sub-block may be further split in one of four manners: 8×8, 8×4, 4×8, or 4×4. Therefore, when the H.264 standard is used, the size of the block of the image frame can range from 16×16 down to 4×4, with many options between the two as described above.
- the “forward path” includes a prediction process 302 , a transformation process 303 , a quantization process 304 , and an entropy encoding process 305 .
- a predicted block can be generated according to a prediction mode.
- the prediction mode can be selected from a plurality of intra-prediction modes and/or a plurality of inter-prediction modes that are supported by the video encoding standard that is employed. Taking H.264 as an example, H.264 supports nine intra-prediction modes for luminance 4×4 and 8×8 blocks, including eight directional modes and an intra direct component (DC) mode that is a non-directional mode.
- for 16×16 luminance blocks, H.264 supports four intra-prediction modes, i.e., Vertical mode, Horizontal mode, DC mode, and Plane mode. Further, H.264 supports all possible combinations of inter-prediction modes, such as variable block sizes (i.e., 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, or 4×4) used in inter-frame motion estimation, different inter-frame motion estimation modes (i.e., use of integer, half, or quarter pixel motion estimation), and multiple reference frames.
- the predicted block is created using a previously encoded block from the current frame.
- the previously encoded block from a past or a future frame (a neighboring frame) is stored in the context 301 and used as a reference for inter-prediction.
- a weighted sum of two or more previously encoded blocks from one or more past frames and/or one or more future frames can be stored in the context 301 for inter-prediction.
- the prediction process 302 can also include a prediction mode selection process (not shown).
- the prediction mode selection process can include determining whether to apply the intra-prediction or the inter-prediction on the block. In some embodiments, which one of the intra-prediction or inter-prediction to be applied on the block can be determined according to the position of the block. For example, if the block is in the first image frame of a video or in an image frame at one of the random access points of the video, the block may be intra-coded. On the other hand, if the block is in one of the remaining frames, i.e., image frames other than the first image frame, of the video or in an image frame between two random access points, the block may be inter-coded.
- An access point may refer to, e.g., a point in the stream of the video from which encoding or transmission of the video is started, or from which encoding or transmission of the video is resumed.
- which one of the intra-prediction or inter-prediction to be employed on the block can be determined according to a transmission error, a sudden change of channel conditions, or the like. For example, if a transmission error occurs or a sudden change of channel conditions occurs when the block is generated, the block can be intra-predicted.
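The position-based and error-driven rules above amount to a small decision procedure. A hedged sketch (the function name, parameters, and set-based representation of random access points are assumptions, not from the disclosure):

```python
def choose_prediction_type(frame_index, random_access_points,
                           transmission_error=False, channel_changed=False):
    """Decide intra vs. inter coding for a block, per the rules above.

    `random_access_points` is a set of frame indices from which decoding
    may (re)start.
    """
    if frame_index == 0 or frame_index in random_access_points:
        return "intra"   # first frame, or a frame at a random access point
    if transmission_error or channel_changed:
        return "intra"   # refresh with intra-prediction to stop error propagation
    return "inter"       # otherwise predict from neighboring frames
```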
- the prediction mode selection process can further include selecting an intra-prediction mode for the block from the plurality of intra-prediction modes when intra-prediction is determined to be employed and an inter-prediction mode from the plurality of inter-prediction modes when inter-prediction is determined to be employed.
- Any suitable prediction mode selection technique may be used here.
- H.264 uses a Rate-Distortion Optimization (RDO) technique to select the intra-prediction mode or the inter-prediction mode that has a least rate-distortion (RD) cost for the block.
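As an illustration of RDO mode selection, each candidate mode's Lagrangian cost J = D + λ·R can be evaluated and the minimum taken. The λ(QP) heuristic below is the one commonly cited for H.264 reference software, not a value specified in this disclosure, and the mode names are illustrative:

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits

def select_mode(candidates, qp):
    """Pick the candidate mode with the least RD cost.

    `candidates` maps mode name -> (distortion, rate_bits). The
    lambda(QP) relation 0.85 * 2^((QP - 12) / 3) is a common H.264
    heuristic used here only for illustration.
    """
    lam = 0.85 * 2.0 ** ((qp - 12) / 3.0)
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))
```

At low QP (small λ) the cheaper-distortion mode wins; at high QP (large λ) the cheaper-rate mode wins.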
- the predicted block is subtracted from the block to generate a residual block.
- the residual block is transformed into a representation in the spatial-frequency domain (also referred to as spatial-spectrum domain), in which the residual block can be expressed in terms of a plurality of spatial-frequency domain components, e.g., cycles per spatial unit in X and Y directions.
- Coefficients associated with the spatial-frequency domain components in the spatial-frequency domain expression are also referred to as transform coefficients.
- Any suitable transformation method, such as a discrete cosine transform (DCT), a wavelet transform, or the like, can be used here. Taking H.264 as an example, the residual block is transformed using a 4×4 or 8×8 integer transform derived from the DCT.
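For reference, the H.264 4×4 forward core transform can be written as Y = Cf · X · Cfᵀ with an integer matrix Cf; the scaling that makes it match the true DCT is folded into the quantization stage. A minimal sketch (helper names are illustrative):

```python
# The H.264 4x4 forward core transform matrix, an integer
# approximation of the DCT (normalization deferred to quantization).
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]

def matmul(a, b):
    """4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_transform(residual_4x4):
    """Y = Cf . X . Cf^T applied to a 4x4 residual block."""
    ct = [[CF[j][i] for j in range(4)] for i in range(4)]  # Cf transposed
    return matmul(matmul(CF, residual_4x4), ct)
```

A constant (flat) residual block produces only a DC coefficient, with all other spatial-frequency components zero.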
- quantized transform coefficients can be obtained by dividing the transform coefficients by a quantization step size (Q step ) to associate the transform coefficients with a finite set of quantization steps.
- a QP can be used to determine the Q step .
- an expected bitrate can be achieved by adjusting the value of a coding parameter, for example, the value of QP.
- Small values of QP can more accurately approximate the spatial frequency spectrum of the residual block, i.e., more spatial detail can be retained, but at the cost of more bits and higher bitrates in the encoded data stream.
- Large values of QP represent big step sizes that crudely approximate the spatial frequency spectrum of the residual block, such that most of the spatial detail of the residual block can be captured by only a few quantized transform coefficients. That is, as the value of QP increases, some spatial detail is aggregated such that the bitrate drops, but at the price of loss of quality.
- H.264 allows a total of 52 possible values of QP, which are 0, 1, 2, . . . , 51, and each unit increase of QP lengthens the Q step by 12% and reduces the bitrate by roughly 12%.
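The QP-to-step-size relation above is commonly approximated as Q step ≈ 0.625 · 2^(QP/6), so the step size doubles every 6 QP units and grows about 12% per unit increase. A hedged sketch of quantization and rescaling with that approximation (the exact H.264 tables and scaling differ slightly):

```python
def q_step(qp: int) -> float:
    """Approximate H.264 quantization step size: doubles every 6 QP units,
    i.e. grows by a factor of 2**(1/6) ~ 1.12 (about 12%) per QP unit."""
    return 0.625 * 2.0 ** (qp / 6.0)

def quantize(coeff: float, qp: int) -> int:
    """Map a transform coefficient to a quantization level (round-to-nearest)."""
    return round(coeff / q_step(qp))

def rescale(level: int, qp: int) -> float:
    """Inverse quantization: multiply the level back by Q step."""
    return level * q_step(qp)
```

The quantize/rescale round trip loses at most half a step size per coefficient, which is the source of the rate/quality trade-off described above.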
- the quantized transform coefficients are entropy encoded.
- the quantized transform coefficients may be reordered (not shown) before entropy encoding.
- the entropy encoding can convert symbols into binary codes, e.g., a data stream or a bitstream, which can be easily stored and transmitted.
- context-adaptive variable-length coding (CAVLC) is used in the H.264 standard to generate data streams.
- the symbols that are to be entropy encoded include, but are not limited to, the quantized transform coefficients, information for enabling the decoder to recreate the prediction (e.g., selected prediction mode, partition size, and the like), information about the structure of the data stream, information about a complete sequence (e.g., MB headers), and the like.
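As a concrete example of converting symbols into variable-length binary codes, H.264 encodes many header syntax elements with zero-order Exponential-Golomb codes, which assign shorter codewords to smaller (more probable) values; a minimal encoder:

```python
def exp_golomb(n: int) -> str:
    """Zero-order Exp-Golomb codeword for an unsigned integer n >= 0.

    Write n + 1 in binary, then prefix it with as many zeros as there
    are bits after the leading one, so smaller values get shorter codes.
    """
    binary = bin(n + 1)[2:]            # e.g. n = 3 -> '100'
    return "0" * (len(binary) - 1) + binary

# Codewords for n = 0..4: 1, 010, 011, 00100, 00101
```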
- the “inverse path” includes an inverse quantization process 306 , an inverse transformation process 307 , and a reconstruction process 308 .
- the quantized transform coefficients are inversely quantized and inversely transformed to generate a reconstructed residual block.
- the inverse quantization is also referred to as a re-scaling process, where the quantized transform coefficients are multiplied by Q step to obtain rescaled coefficients, respectively.
- the rescaled coefficients are inversely transformed to generate the reconstructed residual block.
- An inverse transformation method corresponding to the transformation method used in the transformation process 303 can be used here.
- an inverse integer DCT can be used in the inverse transformation process 307 .
- the reconstructed residual block is added to the predicted block in the reconstruction process 308 to create a reconstructed block, which is stored in the context 301 as a reference for prediction of the next block.
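The reconstruction step can be sketched as adding the decoded residual back to the prediction and clipping to the valid pixel range; both the encoder's inverse path and the decoder perform this step so that they predict from identical reference pixels. Function and parameter names here are illustrative:

```python
def reconstruct_block(predicted, reconstructed_residual, bit_depth=8):
    """Reconstructed block = clip(predicted + residual) to [0, 2^bit_depth - 1].

    Takes two equally sized blocks as lists of rows of pixel values.
    """
    hi = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), hi)
             for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, reconstructed_residual)]
```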
- the single-rate encoder may be a codec. That is, the single-rate encoder may also include a decoder (not shown).
- the decoder conceptually works in a reverse manner including an entropy decoder (not shown) and the processing elements defined within the reconstruction process, shown by the “inverse path” in FIG. 3 . The detailed description thereof is omitted here.
- FIG. 4 is a schematic block diagram showing another example of the multi-rate encoding apparatus 130 consistent with the disclosure.
- the multi-rate encoder 1301 includes the plurality of single-rate encoders that share a common circuit 310 and have separate processing circuits 330 to generate the plurality of encoded data streams with different bitrates.
- the processing circuit 330 can perform the transform process 303 , the quantization process 304 , the entropy encoding process 305 , the inverse quantization process 306 , the inverse transform process 307 , and the reconstruction process 308 .
- the common circuit 310 can perform the prediction process 302 , of which the computational complexity and the computing resource consumption may account for about 70% of the overall calculations of the single-rate encoder. As such, the multi-rate encoder 1301 with the structure shown in FIG. 4 and described above can reduce resource consumption.
- the rate controller 1303 is configured to adjust the plurality of coding parameter values of the multi-rate encoder 1301 to control the plurality of bitrate values of the plurality of encoded data streams generated by the multi-rate encoder 1301 , according to a rate control model.
- the rate control model characterizes a correspondence between coding parameter and bitrate.
- the rate controller 1303 can implement a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below.
- the rate controller 1303 can be coupled to the plurality of single-rate encoders and can be configured to adjust the coding parameter value of each single-rate encoder to control the bitrate value of the encoded data stream generated by each single-rate encoder, according to the rate control model.
- the rate selector 1305 is configured to select one of the plurality of encoded data streams as the output data stream based on, for example, a current channel capacity, a current channel bandwidth, a transmission latency, and/or the like, and send the output data stream to the transceiver 150 for transmitting.
- the rate selector 1305 can be also configured to obtain feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from the transceiver 150 .
- the rate selector 1305 can be coupled to the plurality of single-rate encoders and can be configured to select one of the plurality of encoded data streams as the output data stream from the corresponding single-rate encoder based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like.
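One plausible selection policy for the rate selector 1305, offered only as a sketch and not taken from the disclosure, is to pick the highest-bitrate encoded stream that fits the currently reported channel bandwidth, falling back to the lowest-rate stream when none fits:

```python
def select_stream(stream_bitrates, channel_bandwidth):
    """Return the index of the encoded stream to transmit.

    `stream_bitrates` holds the measured bitrate of each parallel
    encoded stream; `channel_bandwidth` comes from the feedback
    information (e.g., reported by the receiving terminal).
    """
    fitting = [i for i, r in enumerate(stream_bitrates)
               if r <= channel_bandwidth]
    if fitting:
        # Highest quality that still fits the channel.
        return max(fitting, key=lambda i: stream_bitrates[i])
    # Nothing fits: degrade gracefully to the lowest-rate stream.
    return min(range(len(stream_bitrates)), key=lambda i: stream_bitrates[i])
```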
- the transceiver 150 is configured to obtain the output data stream from the rate selector 1305 and transmit the output data stream over the transmission channel.
- the transceiver 150 is further configured to receive the feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from a receiving terminal over the transmission channel, and send the feedback information to the rate selector 1305 .
- the transceiver 150 can include a transmitter and a receiver, and can be configured to have two-way communications capability, i.e., can both transmit and receive data.
- the transmitter and the receiver may share common circuitry.
- the transmitter and the receiver may be separate parts sharing a single housing.
- the transceiver 150 may work in any suitable frequency band, for example, the microwave band, millimeter-wave band, centimeter-wave band, optical wave band, or the like.
- the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 can be separate devices, or any two or all of them can be integrated in one device.
- the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 are separate devices that can be connected or coupled to each other through wired or wireless means.
- the image capturing device 110 can be a camera, a camcorder, or a smartphone having a camera function.
- FIG. 5 is a schematic block diagram showing another example of multi-rate encoding apparatus 130 consistent with the disclosure. As shown in FIG.
- the multi-rate encoding apparatus 130 includes one or more processors 130 - 1 and one or more memories 130 - 2 .
- the one or more processors 130 - 1 can include any suitable hardware processor, such as a microprocessor, a micro-controller, a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the one or more memories 130 - 2 store computer program codes that, when executed by the one or more processors, control the one or more processors to perform a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below, and the encoding functions of the method consistent with the disclosure.
- the one or more memories can include a non-transitory computer-readable storage medium, such as a random access memory (RAM), a read only memory, a flash memory, a volatile memory, a hard disk storage, or an optical medium.
- the transceiver 150 can be an independent device combining a transmitter and a receiver in a single package.
- the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 are integrated in a same electronic device.
- the image capturing device 110 may include an image sensor and a lens or a lens set of the electronic device.
- the multi-rate encoding apparatus 130 may be implemented by one or more single-chip encoders, one or more single-chip codecs, one or more image processors, one or more image processing engines, or the like, which are integrated in the electronic device.
- the transceiver 150 may be implemented by an integrated circuit, a chip, or a chipset that is integrated in the electronic device.
- the electronic device may be a smartphone having a built-in camera and a motherboard that integrates the multi-rate encoding apparatus 130 and the transceiver 150 .
- any two of the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 are integrated in a same electronic device.
- the image capturing device 110 can be a camera or a camcorder that is coupled to an electronic device having a motherboard that integrates the multi-rate encoding apparatus 130 and the transceiver 150 .
- a rate control method consistent with the disclosure can be implemented in a multi-rate encoding apparatus consistent with the disclosure.
- the multi-rate encoding apparatus can be configured as a portion of a transmitting terminal.
- the multi-rate encoding apparatus and the transmitting terminal can be, for example, the multi-rate encoding apparatus 130 and the transmitting terminal 100 described above.
- the bitrate of an encoded data stream can be controlled by controlling a coding parameter, such as a quantization parameter, used for encoding image frames.
- the coding parameter can be selected according to a rate control model describing correspondences between coding parameters and bitrates.
- the rate control model can also be updated during the encoding process based on calculation/encoding results during the encoding process.
- the rate control model can be updated based on the encoding process of one frame or based on the encoding process of a plurality of frames.
- FIG. 6 schematically illustrates a process for updating the rate control model per frame consistent with the disclosure.
- a frame 610 is encoded using a plurality of coding parameter values (denoted using letters CP 1 , CP 2 , . . . , and CP N in FIG. 6 ) to generate a plurality of encoded data streams 630 having a plurality of bitrate values (denoted using letters R 1 , R 2 , . . . , and R N in FIG. 6 ).
- CP 1 corresponds to R 1
- CP 2 corresponds to R 2
- CP N corresponds to R N
- the plurality of (CP i , R i ) pairs form a plurality of sample points 650 , which can then be applied to a rate control model 670 for determining/updating parameters of the rate control model 670 according to, for example, a fitting method.
- the parameters of the rate control model can be updated or estimated per frame.
- a frame-level rate control that can stabilize the bitrate per frame at an expected bitrate can be achieved.
- the frame-level rate control can avoid frequent playback stops at the receiving terminal due to large transmission jitter.
- the overall perceptual quality of a video can be enhanced and the user experience can be improved.
- FIG. 7 is a flow chart of an exemplary rate control method 700 consistent with the disclosure.
- a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above, can control a plurality of coding parameter values of a multi-rate encoder, such as the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above, according to which a plurality of encoded data streams having a corresponding plurality of bitrate values can be generated by the multi-rate encoder.
- a rate selector such as the rate selector 1305 of the multi-rate encoding apparatus 130 described above, can select one of the plurality of encoded data streams as the output data stream based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like.
- the first plurality of coding parameter values may include a plurality of coding parameter values for encoding a first input frame.
- the first input frame can be a first one of image frames captured by an image capturing device and sent to a multi-rate encoder for encoding.
- the image capturing device can be, for example, the image capturing device 110 described above.
- the multi-rate encoder can be, for example, the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above.
- the first input frame can be the image frame in a video stream at which encoding or transmission of the video starts, or at which encoding or transmission of the video resumes. In some other embodiments, the first input frame can be any one of the image frames captured by the image capturing device or any image frame in a video stream.
- the first plurality of coding parameter values are provided by a rate control model based at least in part on an expected bitrate for the first input frame (also referred to as a “first expected bitrate”). That is, one of the first plurality of coding parameter values is provided by the rate control model based on an expected bitrate for the first input frame, which can also be referred to as a “first main coding parameter value.”
- the remaining ones of the first plurality of coding parameter values i.e., those of the first plurality of coding parameter values other than the first main coding parameter value, can be referred to as first auxiliary coding parameter values.
- the rate control model can include a quantizer-domain (Q-domain) rate control model (also referred to as a rate-quantization (R-Q) model) that characterizes the relationship between bitrate and QP, a rho-domain (ρ-domain) rate control model that characterizes the relationship between bitrate and the parameter ρ (the percentage of zeros among the quantized transform coefficients), or a lambda-domain (λ-domain) rate control model (also referred to as a rate-lambda (R-λ) model) that characterizes the relationship between bitrate and the Lagrange multiplier λ corresponding to the QP of each frame.
- the rate control model may have initial parameters that are pre-stored in a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above.
- a coding parameter value corresponding to the expected bitrate value for the first input frame obtained according to the rate control model can be set as the first main coding parameter value.
- the coding parameter can be the QP, and the R-Q model can be expressed as an exponential of a second-order polynomial: R = e^(a·Q^2 + b·Q + c), where:
- R represents the value of bitrate
- Q represents the value of QP
- a, b, and c represent parameters.
- the coding parameter value corresponding to the expected bitrate value for the first input frame can be calculated from the above exponential of the second-order polynomial with initial a, b, and c values.
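As an illustration, taking the exponential-of-a-second-order-polynomial R-Q form as R = exp(a·Q² + b·Q + c), the coding parameter value for an expected bitrate can be computed by solving the quadratic a·Q² + b·Q + (c − ln R) = 0. The sketch below is hypothetical: the function name, the H.264-style QP range 0-51, and the parameter values are illustrative assumptions, not taken from the disclosure.

```python
import math

def qp_from_bitrate(r_expected, a, b, c, qp_min=0.0, qp_max=51.0):
    """Solve R = exp(a*Q^2 + b*Q + c) for Q given a target bitrate R,
    i.e. find the root of a*Q^2 + b*Q + (c - ln R) = 0 that lies
    inside the valid QP range."""
    k = c - math.log(r_expected)
    disc = b * b - 4.0 * a * k
    if disc < 0:
        raise ValueError("expected bitrate is unreachable under this model")
    roots = [(-b - math.sqrt(disc)) / (2.0 * a),
             (-b + math.sqrt(disc)) / (2.0 * a)]
    valid = [q for q in roots if qp_min <= q <= qp_max]
    if not valid:
        raise ValueError("no root falls in the valid QP range")
    return valid[0]
```

With illustrative parameters a = 0.002, b = −0.3, c = 10, a bitrate of e^2.8 maps back to QP = 30, confirming the inversion.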
- the expected bitrate for the first input frame can be a preset bitrate. In some embodiments, the expected bitrate for the first input frame can be obtained from a user input. In some other embodiments, the expected bitrate for the first input frame can be determined based on, for example, the channel capacity, the channel bandwidth, the transmission latency, or the like.
- the first auxiliary coding parameter values can be gradually deviated from the first main coding parameter value at a coding parameter interval.
- the first auxiliary coding parameter values can be gradually stepped down or stepped up from the first main coding parameter value and arranged at the coding parameter interval.
- one or some of the first auxiliary coding parameter values can be obtained by gradually stepping down from the first main coding parameter value and one or some of the first auxiliary coding parameter values can be obtained by gradually stepping up from the first main coding parameter value.
- the determination of the preset coding parameter interval may be a tradeoff between the computation complexity and the estimation accuracy of the parameters of the rate control model. For example, a large coding parameter interval leads to a small number of coding parameter values, which can reduce the computational burden, but may increase the estimation error of the parameters of the rate control model. On the other hand, a fine coding parameter interval can generate a plurality of coding parameter values that are densely distributed over a certain range, which can reduce the estimation error of the parameters of the rate control model, but may increase the computational burden.
- the coding parameter interval can be a constant interval, i.e., the interval between each pair of neighboring coding parameter values is the same.
- the coding parameter interval can be a variable interval, i.e., the interval between each pair of neighboring coding parameter values may vary from pair to pair, or may be the same among some pairs of neighboring coding parameter values but different among some other pairs.
- the interval can be, for example, varied with the curvature of a curve of the coding parameter versus the bitrate.
- a large interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively small curvature
- a fine interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively large curvature
- the first auxiliary coding parameter values are obtained by first obtaining the first main coding parameter value and then calculating the first auxiliary coding parameter values according to a preset interval.
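The stepping-down/stepping-up construction above can be sketched as follows; the helper name and the constant interval are illustrative assumptions (the disclosure also allows variable intervals), and the QP range 0-51 assumes an H.264-style coder.

```python
def auxiliary_qps(main_qp, count, interval=2, qp_min=0, qp_max=51):
    """Derive auxiliary QP values by alternately stepping down and up
    from the main QP at a constant interval, skipping values that fall
    outside the valid QP range."""
    values, step = [], 1
    while len(values) < count and step * interval <= (qp_max - qp_min):
        for sign in (-1, 1):
            qp = main_qp + sign * step * interval
            if qp_min <= qp <= qp_max and len(values) < count:
                values.append(qp)
        step += 1
    return values
```

Alternating the sign keeps the auxiliary values distributed on both sides of the main value, which gives the later model fit sample points bracketing the operating point.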
- the first auxiliary coding parameter values can be obtained based on selected bitrates for the first input frame. Such selected bitrates for the first input frame are also referred to as “first auxiliary bitrate values.”
- the first auxiliary bitrate values can be gradually deviated from the expected bitrate for the first input frame at a bitrate interval. Similar to the coding parameter interval, the bitrate interval also can be constant or variable, and can be determined in a similar manner as determining the coding parameter interval.
- the first auxiliary bitrate values can be obtained by gradually stepping down and/or stepping up from the expected bitrate for the first input frame and arranged at the bitrate interval.
- the first auxiliary coding parameter values corresponding to the first auxiliary bitrate values can be calculated according to the rate control model.
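The bitrate-stepping alternative can be sketched similarly: the auxiliary bitrates step down from the expected bitrate at a constant bitrate interval, and each one is mapped back to a QP through an assumed exponential second-order-polynomial R-Q model (the function names, the choice of the smaller quadratic root, and all numeric values are illustrative assumptions).

```python
import math

def qps_from_stepped_bitrates(r_expected, num_aux, interval, a, b, c):
    """Step auxiliary bitrate values down from the expected bitrate at a
    constant interval, then map each bitrate to a QP by inverting
    R = exp(a*Q^2 + b*Q + c); the smaller quadratic root is the one in
    the usable QP range for the parameter values assumed here."""
    def solve_qp(r):
        k = c - math.log(r)
        disc = b * b - 4.0 * a * k
        return (-b - math.sqrt(disc)) / (2.0 * a)
    aux_bitrates = [r_expected - i * interval for i in range(1, num_aux + 1)]
    return [solve_qp(r) for r in aux_bitrates]
```

Because the model is decreasing in Q over the usable range, successively lower auxiliary bitrates map to successively higher QPs.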
- the first input frame is encoded using the first plurality of coding parameter values to generate a first plurality of encoded data streams.
- Each of the first plurality of encoded data streams is generated using a corresponding one of the first plurality of coding parameter values and has a corresponding one of a first plurality of bitrate values.
- the first input frame can be intra-encoded using the first plurality of coding parameter values to generate the first plurality of encoded data streams.
- encoding the first input frame using one of the first plurality of coding parameter values can include a prediction process, a transformation process, a quantization process, and an entropy encoding process.
- the encoding processes of the first input frame using the first plurality of coding parameter values can be separate processes and implemented in parallel.
- the multi-rate encoding apparatus 130 can include a plurality of separate single-rate encoders, each single-rate encoder can be used to encode the first input frame using one of the first plurality of coding parameter values to generate a corresponding one of the first plurality of encoded data streams.
- the encoding processes of the first input frame using the first plurality of coding parameter values can include at least one common process.
- the encoding processes of the first input frame using the first plurality of coding parameter values can share a common prediction process in the common circuit 310 and use separate transformation processes, separate quantization processes, and separate entropy encoding processes in the separate processing circuits 330 - 1 , 330 - 2 , . . . 330 -N to generate the first plurality of encoded data streams.
- the computational complexity and the computing resource consumption can be reduced by sharing the common prediction circuit 310 .
- the rate control model can be updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. That is, updated parameters of the rate control model can be obtained based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. Thus, an updated rate control model can be generated with the updated parameters.
- an updated R-Q curve can be modeled by the least-squares fitting of the exponential of the second-order polynomial described above based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, such that the updated values of a, b, and c can be obtained.
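A minimal sketch of that fit, assuming the R-Q form R = exp(a·Q² + b·Q + c): taking logarithms makes the model linear in (a, b, c), so the ordinary-least-squares normal equations can be solved directly (a pure-Python 3x3 solve; the function name is hypothetical).

```python
import math

def fit_rq_model(qps, bitrates):
    """Least-squares fit of ln R = a*Q^2 + b*Q + c from (QP, bitrate)
    sample points, solving the 3x3 normal equations by Gauss-Jordan
    elimination with partial pivoting."""
    ys = [math.log(r) for r in bitrates]
    # Design matrix columns: Q^2, Q, 1
    X = [[q * q, q, 1.0] for q in qps]
    # Normal equations: (X^T X) p = X^T y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(3)]
    M = [r[:] + [v] for r, v in zip(XtX, Xty)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(3):
            if i != col:
                f = M[i][col] / M[col][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[col])]
    a, b, c = (M[i][3] / M[i][i] for i in range(3))
    return a, b, c
```

Feeding in sample points generated from known parameters recovers those parameters, which is exactly the per-frame update step: the N (CP, R) pairs from one frame suffice to re-estimate a, b, and c.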
- the rate control model is updated once using one input frame and then used for encoding the next input frame.
- one updating may not be enough to create an updated rate control model that closely models the actual correspondence relationship between the coding parameter and the bitrate.
- the degree of approximation between the updated rate control model and the actual correspondence relationship between the coding parameter and the bitrate can be determined by a difference between a first actual bitrate value and the expected bitrate value.
- the first actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value. Therefore, in some other embodiments, the rate control model can be iteratively updated using the first input frame until a difference between the first actual bitrate value and the expected bitrate value is within a preset range, e.g., smaller than a preset value.
- one of the first plurality of encoded data streams is selected as an output data stream for the first input frame.
- the selection of the one of the first plurality of encoded data streams can be based on, for example, the expected bitrate value for the first input frame, the channel capacity, the channel bandwidth, the transmission latency, and/or the like.
- the output data stream for the first input frame can be selected from the first plurality of encoded streams according to the expected bitrate value for the first input frame.
- the one of the first plurality of encoded streams obtained by encoding the first input frame using the first main coding parameter value (also referred to as a “first main encoded data stream”) can be directly selected as the output data stream.
- the bitrate of the first main encoded data stream may differ from the expected bitrate for the first input frame by a relatively large value.
- the expected bitrate for the first input frame can be applied to the updated rate control model to obtain a new coding parameter value for encoding the first input frame, and the resulting encoded data stream can be output as the output data stream for the first input frame.
- the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is closest to the expected bitrate value for the first input frame.
- the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is not more than and closest to the expected bitrate value for the first input frame.
- the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that has a difference from the expected bitrate value for the first input frame within a preset range.
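The first two selection rules above can be sketched as one hypothetical helper (names and the mode strings are illustrative assumptions):

```python
def select_stream(bitrates, r_expected, mode="closest"):
    """Pick the index of the encoded stream whose bitrate best matches
    the expected bitrate: either the closest overall, or the closest one
    that does not exceed it."""
    if mode == "closest":
        return min(range(len(bitrates)),
                   key=lambda i: abs(bitrates[i] - r_expected))
    if mode == "closest_not_above":
        candidates = [i for i, r in enumerate(bitrates) if r <= r_expected]
        if not candidates:
            return None  # every stream exceeds the expected bitrate
        return max(candidates, key=lambda i: bitrates[i])
    raise ValueError(mode)
```

The "closest, not above" mode is the conservative choice when overshooting the expected bitrate risks exceeding the channel capacity.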
- the output data stream for the first input frame can be selected from the first plurality of encoded streams according to a current channel bandwidth.
- the output data stream for the first input frame can be one of the first plurality of encoded streams that matches the current channel bandwidth.
- the output data stream can be adapted to the time-varying channel bandwidth in real-time. That is, when the channel bandwidth varies with time, the output data stream can match the channel bandwidth in real-time.
- the output data stream for the first input frame may be selected according to the current channel bandwidth and a target latency.
- the target latency may also be referred to as a control target of the latency, which represents an expected transmission latency.
- the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is closest to the target latency.
- the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is not more than and closest to the target latency.
- the output data stream for the first input frame may be one of the first plurality of encoded streams with the highest bitrate among the first plurality of encoded streams, with which the difference the target latency and the transmission latency under the current channel bandwidth is within a preset range. Because a higher bitrate generally correspond to a higher encoding quality, this approach can ensure that the encoded data with the highest encoding quality can be selected when the target latency is satisfied.
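The latency-constrained variant can be sketched as follows, approximating a stream's per-frame transmission latency as its frame size divided by the current channel bandwidth (this latency model, like the names, is an illustrative assumption):

```python
def select_by_latency(frame_sizes_bits, bandwidth_bps, target_latency_s, tol_s):
    """Among streams whose per-frame transmission latency (frame size /
    current bandwidth) is within tol_s of the target latency, pick the
    largest frame, i.e. the highest bitrate and hence highest quality."""
    ok = [i for i, s in enumerate(frame_sizes_bits)
          if abs(s / bandwidth_bps - target_latency_s) <= tol_s]
    if not ok:
        return None  # no stream satisfies the latency constraint
    return max(ok, key=lambda i: frame_sizes_bits[i])
```

For example, with a 10 Mbps channel, a 300 kbit frame transmits in 30 ms; if the target latency is 25 ms with a 10 ms tolerance, that frame qualifies and beats any smaller qualifying frame.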
- the output data stream for the first input frame may be selected according to the channel bandwidth, the target latency, and the encoding quality. That is, the selection of the output data stream for the first input frame can be based on a combination of the requirements of the channel bandwidth, the target latency, and the encoding quality.
- a cost function may be determined according to the channel bandwidth, the target latency, the encoding quality, and a target bitrate.
- the output data stream for the first input frame may be one of the first plurality of encoded data with the smallest value of the cost function.
- the cost function may be as follows:
- Cost = A·|R−R t |+B·|T−T t |, where R is the bitrate of an encoded data stream, R t is the target bitrate, T is the transmission latency under the current channel bandwidth, and T t is the target latency.
- the values of A and B can be adjusted to bias towards the requirement of the encoding quality or the requirement of the latency control, e.g., the values of A and B can be adjusted to give more weight to the requirement of the encoding quality or to the requirement of the latency control in the calculation of Cost.
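As a sketch, one plausible instantiation of such a cost is a weighted sum of the deviation from the target bitrate (a proxy for encoding quality) and the deviation of the per-frame transmission latency from the target latency; the exact form, the latency model (frame size / bandwidth), and all names here are assumptions.

```python
def select_by_cost(bitrates, frame_sizes_bits, bandwidth_bps,
                   r_target, t_target, A=1.0, B=1.0):
    """Return the index of the stream minimizing a weighted cost:
    A biases toward the target bitrate (quality), B toward the
    target latency."""
    def cost(i):
        latency = frame_sizes_bits[i] / bandwidth_bps
        return A * abs(bitrates[i] - r_target) + B * abs(latency - t_target)
    return min(range(len(bitrates)), key=cost)
```

Raising B relative to A makes the selector favor streams that hit the target latency even at the cost of a larger bitrate deviation, matching the weighting behavior described above.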
- a reconstructed frame obtained from the output data stream for the first input frame can be used as the context of a second input frame. That is, a reconstructed frame obtained from the output data stream for the first input frame can be used as a reference for the prediction of the second input frame.
- a second input frame is encoded based on the updated rate control model. For example, a second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model, and the second input frame can be encoded using the second plurality of coding parameter values.
- the second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model and an expected bitrate for the second input frame (also referred to as a “second expected bitrate”).
- one of the second plurality of coding parameter values can be determined based on the updated rate control model and the expected bitrate for the second input frame, which can be referred to as a second main coding parameter value.
- the remaining ones of the second plurality of coding parameter values, i.e., the coding parameter values of the second plurality of coding parameter values other than the second main coding parameter value can be referred to as second auxiliary coding parameter values.
- the coding parameter value corresponding to the expected bitrate for the second input frame calculated from the exponential of the second-order polynomial with the updated a, b, and c parameters can be set as the second main coding parameter value.
- the second main coding parameter value and the first main coding parameter value are for a same encoding channel.
- the same encoding channel refers to, for example, a same single-rate encoder included in the multi-rate encoder as shown in FIG. 2 , or a same processing circuit included in the multi-rate encoder as shown in FIG. 4 .
- the second auxiliary coding parameter values can be gradually deviated from the second main coding parameter value at a coding parameter interval.
- the second auxiliary coding parameter values can be obtained by gradually stepping up and/or stepping down from the second main coding parameter value and arranged at the coding parameter interval.
- the coding parameter interval for the second auxiliary coding parameter values can be a constant interval or a variable interval.
- the coding parameter interval for the second auxiliary coding parameter values can be determined in a similar manner as that for determining the coding parameter interval for the first auxiliary coding parameter values, and thus the detailed description thereof is omitted.
- the second auxiliary coding parameter values can be obtained based on selected bitrates for the second input frame, also referred to as "second auxiliary bitrate values." Similar to the first auxiliary bitrate values, the second auxiliary bitrate values can be gradually deviated from the expected bitrate for the second input frame at a bitrate interval. The second auxiliary coding parameter values corresponding to the second auxiliary bitrate values can be calculated based on the rate control model. For example, the second auxiliary bitrate values can be obtained by gradually stepping up and/or stepping down from the expected bitrate for the second input frame.
- the bitrate interval for the second auxiliary bitrate values can be a constant interval or a variable interval.
- the bitrate interval for the second auxiliary bitrate values can be determined in a similar manner as that for determining the bitrate interval for the first auxiliary bitrate values, and thus the detailed description thereof is omitted.
- the second input frame can be inter-encoded and/or intra-encoded using the second plurality of coding parameter values to generate a second plurality of encoded data streams.
- encoding the second input frame using one of the second plurality of coding parameter values can include the prediction process, the transformation process, the quantization process, and the entropy encoding process.
- One of the second plurality of encoded data streams can be selected as an output data stream corresponding to the second input frame.
- the selection of the output data stream corresponding to the second input frame can be similar to the selection of the output data stream corresponding to the first input frame, and thus detailed description thereof is omitted.
- the rate control model may also vary between image frames. That is, the updated rate control model obtained based on the first input frame may not accurately reflect the correspondence relationship between coding parameter and bitrate in the second input frame.
- the rate control model can be further updated based on the second plurality of coding parameter values and the second plurality of bitrate values respectively corresponding to the second plurality of encoded data streams.
- the second main coding parameter value may be iteratively adjusted until the difference between a second actual bitrate and the expected bitrate value for the second input frame is within a preset range.
- the second actual bitrate value refers to a bitrate value obtained by encoding the second input frame using the second main coding parameter value.
- FIG. 8 is a flow chart showing a process of iteratively updating the rate control model consistent with the disclosure.
- the rate control model can be first updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams [denoted as letters (CP 1 i , R 1 i )] obtained according to the approaches ( 705 ) described above.
- the second plurality of coding parameter values (denoted as CP 2 i ) can be determined based on the updated rate control model and the expected bitrate for the second input frame as described above.
- the coding parameter value calculated from the expected bitrate for the second input frame using the updated rate control model is the second main coding parameter value.
- the second frame can be encoded using CP 2 i to generate the second plurality of data streams having a plurality of actual bitrates (denoted as R 2 i ), including the data stream having the second actual bitrate value generated by encoding the second frame using the second main coding parameter value. If the difference between the second actual bitrate value and the expected bitrate value for the second input frame falls outside the preset range, the rate control model is updated according to the (CP 2 i , R 2 i ) pairs.
- the second main coding parameter value can be updated to a coding parameter corresponding to the expected bitrate value for the second input frame calculated from the further updated rate control model.
- if the difference between the second actual bitrate value and the expected bitrate value for the second input frame falls within the preset range, the iterative adjustment process can be stopped and one of the second plurality of data streams is output as the output data stream, as shown in FIG. 8 .
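The FIG. 8 loop can be sketched end to end with a mock frame encoder standing in for the real one; the frame's true R-Q behavior, all function names, and all numeric values are illustrative assumptions.

```python
import math

def true_bitrate(qp):
    """Stand-in for actually encoding the frame at a given QP: the
    frame's real R-Q behavior, which the controller does not know."""
    return math.exp(0.002 * qp * qp - 0.32 * qp + 10.5)

def qp_for(r, a, b, c):
    """Invert R = exp(a*Q^2 + b*Q + c); the smaller root is the usable
    QP for the parameter values assumed here."""
    disc = b * b - 4.0 * a * (c - math.log(r))
    return (-b - math.sqrt(disc)) / (2.0 * a)

def fit_model(qps, rates):
    """Least-squares fit of ln R = a*Q^2 + b*Q + c (3x3 normal equations)."""
    ys = [math.log(r) for r in rates]
    X = [[q * q, q, 1.0] for q in qps]
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(3)]
    M = [r[:] + [v] for r, v in zip(XtX, Xty)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(M[i][col]))
        M[col], M[piv] = M[piv], M[col]
        for i in range(3):
            if i != col:
                f = M[i][col] / M[col][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[col])]
    return tuple(M[i][3] / M[i][i] for i in range(3))

def iterate_until_converged(r_expected, a, b, c, tol=0.10, max_iters=10):
    """Encode at the main QP; while the actual bitrate is outside the
    tolerance around the expected bitrate, refit the model from the
    (QP, bitrate) pairs of the main and auxiliary streams and retry."""
    for _ in range(max_iters):
        main_qp = round(qp_for(r_expected, a, b, c))
        actual = true_bitrate(main_qp)
        if abs(actual - r_expected) / r_expected <= tol:
            break
        qps = [main_qp + d for d in (-4, -2, 0, 2, 4)]
        a, b, c = fit_model(qps, [true_bitrate(q) for q in qps])
    return actual, (a, b, c)
```

Starting from stale model parameters, the first encode overshoots the expected bitrate, the refit pulls the model onto the frame's actual curve, and the next main QP lands within the tolerance.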
- FIG. 9 schematically shows a variation of a bitrate-versus-QP curve (R-Q curve or R-Q model, i.e., an example of the rate control model) between frames.
- curve 1 represents the R-Q model obtained/updated based on the first input frame, i.e., a curve created by fitting the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams.
- the R-Q model may move from curve 1 to curve 2 .
- Curve 2 is an actual R-Q curve corresponding to the second input frame, which is yet unknown. As shown in FIG.
- curve 1 gives a corresponding second main coding parameter value QP e .
- QP e is used for encoding the second input frame
- the obtained encoded data stream will have an actual bitrate according to curve 2 that is different from the expected bitrate R e for the second input frame.
- Curve 1 can be iteratively updated according to, e.g., the method described above in connection with FIG. 8 to obtain curve 2 or a curve close to curve 2 . Thereafter, according to the obtained curve 2 or the obtained curve close to curve 2 , an updated second main coding parameter value QP e1 that can result in the expected bitrate R e for the second input frame can be obtained.
- obtaining the first plurality of coding parameter values can further include iteratively adjusting the first main coding parameter value until the difference between an actual bitrate and the expected bitrate value for the first input frame is within a preset range.
- An actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value.
Description
- This application is a continuation of International Application No. PCT/CN2018/072444, filed Jan. 12, 2018, which claims priority to International Application No. PCT/CN2017/071491, filed Jan. 18, 2017, the entire contents of both of which are incorporated herein by reference.
- A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
- The present disclosure relates to data encoding and, more particularly, to a method and apparatus for rate control, a multi-rate encoding apparatus, and a transmitting terminal.
- One challenge in a low-latency video/image transmission system is that the condition of the channel, such as the channel bandwidth, varies over time, particularly for a wireless channel. Many factors affect a wireless channel, such as the physical distance, relative position, and obstacles/occlusions between the receiving and transmitting terminals, immediate electromagnetic interference, and the like. Furthermore, the data source of the transmission also varies over time. The source time-variation and the channel time-variation are independent of each other and are difficult to predict, which causes difficulties in adapting the source encoding to the channel bandwidth in real time. For example, when the channel is stable, a sudden movement of the camera or a large movement of an object in the camera view leads to a sudden change in the size of the encoded data stream. If the size of the data stream is doubled, the transmission latency/delay is doubled accordingly. When the source is stable, the size of the data stream remains constant, but a sudden channel variation can still cause transmission jitter (transmission latency that varies over time). If the channel bandwidth is reduced by half, the transmission latency is doubled accordingly.
- Rate control technologies that adapt the encoding rate to the channel bandwidth in real time have been widely used in wireless video transmission applications to ensure smooth transmission over unreliable channels. Conventional rate control technologies only control the overall average bitrate of a group of frames (e.g., multiple frames). Because only one sample point including two elements, e.g., a coding parameter value and a corresponding bitrate value, is generated per frame, several sample points need to be generated from multiple frames over a given time period to estimate the parameters of the rate control model. As such, conventional rate control technologies stabilize the average bitrate over a given time period (e.g., multiple frames) at an expected bitrate to ensure that the overall jitter averaged over multiple frames or a period of time is small. However, low-latency video transmission requires stabilizing the bitrate of each individual frame within a certain range to avoid large transmission jitter that causes the playback to stop frequently at the receiving terminal.
- In accordance with the disclosure, there is provided a rate control method including encoding a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, updating a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encoding a second input frame based on the updated rate control model.
- Also in accordance with the disclosure, there is provided a rate control apparatus including one or more memories storing instructions and one or more processors coupled to the one or more memories. The one or more processors are configured to encode a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, update a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encode a second input frame based on the updated rate control model.
-
FIG. 1 is a schematic diagram showing a transmitting terminal according to exemplary embodiments of the disclosure. -
FIG. 2 is a schematic block diagram showing a multi-rate encoding apparatus according to an exemplary embodiment of the disclosure. -
FIG. 3 is a schematic block diagram showing a single-rate encoder according to exemplary embodiments of the disclosure. -
FIG. 4 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure. -
FIG. 5 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure. -
FIG. 6 is a schematic diagram illustrating a process of updating a rate control model per frame according to exemplary embodiments of the disclosure. -
FIG. 7 is a flow chart of a rate control method according to exemplary embodiments of the disclosure. -
FIG. 8 is a flow chart showing a process of iteratively updating a rate control model according to exemplary embodiments of the disclosure. -
FIG. 9 schematically shows a variation of a bitrate versus quantization parameter (QP) curve (R-Q curve) between frames according to exemplary embodiments of the disclosure. - Hereinafter, embodiments consistent with the disclosure will be described with reference to the drawings, which are merely examples for illustrative purposes and are not intended to limit the scope of the disclosure. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
-
FIG. 1 is a schematic diagram showing an exemplary transmitting terminal 100 consistent with the disclosure. The transmitting terminal 100 is configured to capture images and encode the images according to a plurality of coding parameter values to generate a plurality of encoded data streams. The images may be still images, e.g., pictures, and/or moving images, e.g., videos. Hereinafter, the term “image” is used to refer to either a still image or a moving image. The coding parameter refers to a parameter associated with the encoding process, such as a quantization parameter (QP), a coding mode selection, a packet size, or the like. Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of a plurality of bitrate values. The transmitting terminal 100 is further configured to select one of the plurality of encoded data streams as an output data stream for transmitting over a transmission channel. - In some embodiments, the
transmitting terminal 100 may be integrated in a mobile body, such as an unmanned aerial vehicle (UAV), a driverless car, a mobile robot, a driverless boat, a submarine, a spacecraft, a satellite, or the like. In some other embodiments, the transmitting terminal 100 may be a hosted payload carried by the mobile body that operates independently but may share the power supply of the mobile body. - The transmission channel may use any form of communication connection, such as an Internet connection, cable television connection, telephone connection, wireless connection, or another connection capable of supporting the transmission of images. For example, if the
transmitting terminal 100 is integrated in a UAV, the transmission channel can be a wireless channel. The transmission channel may use any type of physical transmission medium, such as cable (e.g., twisted-pair wire cable and fiber-optic cable), air, water, space, or any combination of the above media. For example, if the transmitting terminal 100 is integrated in a UAV, one or more of the multiple channels of encoded data streams can be over air. If the transmitting terminal 100 is a hosted payload carried by a commercial satellite, one or more of the multiple channels of encoded data streams can be over space and air. If the transmitting terminal 100 is a hosted payload carried by a submarine, one or more of the multiple channels of encoded data streams can be over water. - As shown in
FIG. 1 , the transmitting terminal 100 includes an image capturing device 110, a multi-rate encoding apparatus 130 coupled to the image capturing device 110, and a transceiver 150 coupled to the multi-rate encoding apparatus 130. - The image capturing
device 110 includes an image sensor and a lens or a lens set, and is configured to capture images. The image sensor may be, for example, an opto-electronic sensor, such as a charge-coupled device (CCD) sensor, a complementary metal-oxide-semiconductor (CMOS) sensor, or the like. The image capturing device 110 is further configured to send the captured images to the multi-rate encoding apparatus 130 for encoding. In some embodiments, the image capturing device 110 may include a memory for storing, either temporarily or permanently, the captured images. - The
multi-rate encoding apparatus 130 is configured to receive the images captured by the image capturing device 110, and encode the images according to the plurality of coding parameter values to generate the plurality of encoded data streams. Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of the plurality of bitrate values. As shown in FIG. 1 , the multi-rate encoding apparatus 130 includes a multi-rate encoder 1301, a rate controller 1303, and a rate selector 1305 coupled to each other. Further, the multi-rate encoder 1301 is coupled to the image capturing device 110. The rate selector 1305 is coupled to the transceiver 150. - The
multi-rate encoder 1301 may receive and encode the images captured by the image capturing device 110 according to any suitable video coding standard, also referred to as a video compression standard, such as the Windows Media Video (WMV) standard, Society of Motion Picture and Television Engineers (SMPTE) 421-M standard, Moving Picture Experts Group (MPEG) standard, e.g., MPEG-1, MPEG-2, or MPEG-4, H.26x standard, e.g., H.261, H.262, H.263, or H.264, or another standard.
- In some embodiments, the
multi-rate encoder 1301 may implement one or more different codec algorithms. The selection of the codec algorithm may be based on encoding complexity, encoding speed, encoding ratio, encoding efficiency, and/or the like. For example, a fast codec algorithm may be performed in real time on low-end hardware. A high encoding ratio algorithm may be desirable for a transmission channel with a small bandwidth. - In some other embodiments, the
multi-rate encoder 1301 may further perform at least one of encryption, error-correction encoding, format conversion, or the like. For example, when the images captured by the image capturing device 110 contain confidential information, encryption may be performed before transmission or storage to protect confidentiality. -
FIG. 2 is a schematic block diagram showing an example of the multi-rate encoding apparatus 130 consistent with the disclosure. As shown in FIG. 2 , the multi-rate encoder 1301 includes a plurality of single-rate encoders for generating the plurality of encoded data streams. Each single-rate encoder can generate one of the plurality of encoded data streams having a corresponding one of the plurality of bitrates according to one of the plurality of coding parameter values. In some embodiments, the plurality of single-rate encoders may be separate parts or partially separate parts sharing one or more common circuits. -
FIG. 3 is a schematic block diagram showing an exemplary single-rate encoder consistent with the disclosure. As shown inFIG. 3 , the single-rate encoder includes a “forward path” connected by solid-line arrows and an “inverse path” connected by dashed-line arrows in the figure. The “forward path” includes conducting an encoding process on an entire image frame or a block, e.g., a macroblock (MB), of the image frame, and the “inverse path” includes implementing a reconstruction process, which generatescontext 301 for prediction of a next image frame or a next block of the next image frame. An image frame refers to a complete image. Hereinafter, the terms “frame,” “image,” and “image frame” are used interchangeably. - The size and type of the block of the image frame may be determined according to the encoding standard that is employed. For example, a fixed-sized MB covering 16×16 pixels is the basic syntax and processing unit employed in H.264 standard. H.264 also allows the subdivision of an MB into smaller sub-blocks, down to a size of 4×4 pixels, for motion-compensation prediction. An MB may be split into sub-blocks in one of four manners: 16×16, 16×8, 8×16, or 8×8. The 8×8 sub-block may be further split in one of four manners: 8×8, 8×4, 4×8, or 4×4. Therefore, when H.264 standard is used, the size of the block of the image frame can range from 16×16 to 4×4 with many options between the two as described above.
- In some embodiments, as shown in
FIG. 3 , the “forward path” includes aprediction process 302, atransformation process 303, aquantization process 304, and anentropy encoding process 305. In theprediction process 302, a predicted block can be generated according to a prediction mode. The prediction mode can be selected from a plurality of intra-prediction modes and/or a plurality of inter-prediction modes that are supported by the video encoding standard that is employed. Taking H.264 for an example, H.264 supports nine intra-prediction modes for luminance 4×4 and 8×8 blocks, including eight directional modes and an intra direct component (DC) mode that is a non-directional mode. For luminance 16×16 blocks, H.264 supports four intra-prediction modes, i.e., Vertical mode, Horizontal mode, DC mode, and Plane mode. Further, H.264 supports all possible combination of inter-prediction modes, such as variable block sizes (i.e., 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4) used in inter-frame motion estimation, different inter-frame motion estimation modes (i.e., use of integer, half, or quarter pixel motion estimation), multiple reference frames. - In the plurality of intra-prediction modes, the predicted block is created using a previously encoded block from the current frame. In the plurality of inter-prediction modes, the previously encoded block from a past or a future frame (a neighboring frame) is stored in the
context 301 and used as a reference for inter-prediction. In some embodiments, a weighted sum of two or more previously encoded blocks from one or more past frames and/or one or more future frames can be stored in thecontext 301 for inter-prediction. - In some embodiments, the
prediction process 302 can also include a prediction mode selection process (not shown). In some embodiments, the prediction mode selection process can include determining whether to apply intra-prediction or inter-prediction to the block. In some embodiments, whether intra-prediction or inter-prediction is applied to the block can be determined according to the position of the block. For example, if the block is in the first image frame of a video or in an image frame at one of the random access points of the video, the block may be intra-coded. On the other hand, if the block is in one of the remaining frames, i.e., image frames other than the first image frame, of the video or in an image frame between two random access points, the block may be inter-coded. An access point may refer to, e.g., a point in the stream of the video from which encoding or transmission of the video starts, or from which encoding or transmission of the video resumes. In some other embodiments, whether intra-prediction or inter-prediction is employed on the block can be determined according to a transmission error, a sudden change of channel conditions, or the like. For example, if a transmission error or a sudden change of channel conditions occurs when the block is generated, the block can be intra-predicted.
- The predicted block is subtracted from the block to generate a residual block.
- In the
transformation process 303, the residual block is transformed into a representation in the spatial-frequency domain (also referred to as spatial-spectrum domain), in which the residual block can be expressed in terms of a plurality of spatial-frequency domain components, e.g., cycles per spatial unit in X and Y directions. Coefficients associated with the spatial-frequency domain components in the spatial-frequency domain expression are also referred to as transform coefficients. Any suitable transformation method, such as a discrete cosine transform (DCT), a wavelet transform, or the like, can be used here. Taking H.264 as an example, the residual block is transformed using a 4×4 or 8×8 integer transform derived from the DCT. - In the
quantization process 304 , quantized transform coefficients can be obtained by dividing the transform coefficients by a quantization step size (Qstep), which associates the transform coefficients with a finite set of quantization steps. In some embodiments, a QP can be used to determine the Qstep. The relation between the value of QP and Qstep may be linear or exponential according to different encoding standards. Taking H.263 as an example, the relationship between the value of QP and Qstep is Qstep = 2×QP. Taking H.264 as another example, the relationship between the value of QP and Qstep is Qstep = 2^(QP/6), i.e., Qstep doubles for every increase of 6 in QP.
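The linear and exponential QP-to-Qstep relations above can be written out directly. This is a minimal sketch for illustration; the function names are not part of the disclosure, and the H.264 relation is the commonly cited normalized form in which Qstep doubles every 6 QP units.

```python
def qstep_h263(qp):
    # H.263: linear relation, Qstep = 2 * QP
    return 2.0 * qp

def qstep_h264(qp):
    # H.264: exponential relation, Qstep = 2**(QP / 6);
    # each +1 in QP multiplies Qstep by 2**(1/6), i.e., roughly +12%
    return 2.0 ** (qp / 6.0)
```

Note that 2**(1/6) is approximately 1.12, which is the origin of the rule of thumb that each unit increase of QP lengthens the Qstep by about 12%.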
- In the
entropy encoding process 305, the quantized transform coefficients are entropy encoded. In some embodiments, the quantized transform coefficients may be reordered (not shown) before entropy encoding. The entropy encoding can convert symbols into binary codes, e.g., a data stream or a bitstream, which can be easily stored and transmitted. For example, context-adaptive variable-length coding (CAVLC) is used in H.264 standard to generate data streams. The symbols that are to be entropy encoded include, but are not limited to, the quantized transform coefficients, information for enabling the decoder to recreate the prediction (e.g., selected prediction mode, partition size, and the like), information about the structure of the data stream, information about a complete sequence (e.g., MB headers), and the like. - In some embodiments, as shown in
FIG. 3 , the “inverse path” includes aninverse quantization process 306, aninverse transformation process 307, and areconstruction process 308. The quantized transform coefficients are inversely quantized and inversely transformed to generate a reconstructed residual block. The inverse quantization is also referred to as a re-scaling process, where the quantized transform coefficients are multiplied by Qstep to obtain rescaled coefficients, respectively. The rescaled coefficients are inversely transformed to generate the reconstructed residual block. An inverse transformation method corresponding to the transformation method used in thetransformation process 303 can be used here. For example, if an integer DCT is used in thetransformation process 303, a reverse integer DCT can be used in thereverse transformation process 307. The reconstructed residual block is added to the predicted block in thereconstruction process 308 to create a reconstructed block, which is stored in thecontext 301 as a reference for prediction of the next block. - In some embodiments, the single-rate encoder may be a codec. That is, the single-rate encoder may also include a decoder (not shown). The decoder conceptually works in a reverse manner including an entropy decoder (not shown) and the processing elements defined within the reconstruction process, shown by the “inverse path” in
FIG. 3 . The detailed description thereof is omitted here. -
FIG. 4 is a schematic block diagram showing another example of the multi-rate encoding apparatus 130 consistent with the disclosure. As shown in FIG. 4 , the multi-rate encoder 1301 includes the plurality of single-rate encoders that share a common circuit 310 and have separate processing circuits 330 to generate the plurality of encoded data streams with different bitrates. Referring again to FIG. 3 , the processing circuit 330 can perform the transformation process 303 , the quantization process 304 , the entropy encoding process 305 , the inverse quantization process 306 , the inverse transformation process 307 , and the reconstruction process 308 . The common circuit 310 can perform the prediction process 302 , of which the computational complexity and the computing resource consumption may account for about 70% of the overall calculations of the single-rate encoder. As such, the multi-rate encoder 1301 with the structure shown in FIG. 4 and described above can reduce resource consumption.
FIGS. 2 and 4 , therate controller 1303 is configured to adjust the plurality of coding parameter values of themulti-rate encoder 1301 to control the plurality of bitrate values of the plurality of encoded data streams generated by themulti-rate encoder 1301, according to a rate control model. The rate control model characterizes a correspondence between coding parameter and bitrate. In some embodiments, therate controller 1303 can implement a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below. - In some embodiments, as shown in
FIGS. 2 and 4 , when themulti-rate encoder 1301 includes the plurality of single-rate encoders, therate controller 1303 can be coupled to the plurality of single-rate encoders and can be configured to adjust the coding parameter value of each single-rate encoder to control the bitrate value of the encoded data stream generated by each single-rate encoder, according to the rate control model. - The
rate selector 1305 is configured to select one of the plurality of encoded data streams as the output data stream based on, for example, a current channel capacity, a current channel bandwidth, a transmission latency, and/or the like, and send the output data stream to thetransceiver 150 for transmitting. In some embodiments, therate selector 1305 can be also configured to obtain feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from thetransceiver 150. - In some embodiments, as shown in
FIGS. 2 and 4 , when themulti-rate encoder 1301 includes the plurality of single-rate encoders, therate selector 1305 can be coupled to the plurality of single-rate encoders and can be configured to select one of the plurality of encoded data streams as the output data stream from the corresponding single-rate encoder based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like. - Referring again to
FIG. 1 , thetransceiver 150 is configured to obtain the output data stream from therate selector 1305 and transmit the output data stream over the transmission channel. In some embodiments, thetransceiver 150 is further configured to receive the feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from a receiving terminal over the transmission channel, and send the feedback information to therate selector 1305. - The
transceiver 150 can include a transmitter and a receiver, and can be configured to have two-way communications capability, i.e., can both transmit and receive data. In some embodiments, the transmitter and the receiver may share common circuitry. In some other embodiments, the transmitter and the receiver may be separate parts sharing a single housing. Thetransceiver 150 may work in any suitable frequency band, for example, the microwave band, millimeter-wave band, centimeter-wave band, optical wave band, or the like. - According to the disclosure, the
image capturing device 110, themulti-rate encoding apparatus 130, and thetransceiver 150 can be separate devices, or any two or all of them can be integrated in one device. In some embodiments, theimage capturing device 110, themulti-rate encoding apparatus 130, and thetransceiver 150 are separate devices that can be connected or coupled to each other through wired or wireless means. For example, theimage capturing device 110 can be a camera, a camcorder, or a smartphone having a camera function.FIG. 5 is a schematic block diagram showing another example ofmulti-rate encoding apparatus 130 consistent with the disclosure. As shown inFIG. 5 , themulti-rate encoding apparatus 130 includes one or more processors 130-1 and one or more memories 130-2. The one or more processors 130-1 can include any suitable hardware processor, such as a microprocessor, a micro-controller, a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device, discrete gate or transistor logic device, discrete hardware component. The one or more memories 130-2 store computer program codes that, when executed by the one or more processors, control the one or more processors to perform a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below, and the encoding functions of the method consistent with the disclosure. The one or more memories can include a non-transitory computer-readable storage medium, such as a random access memory (RAM), a read only memory, a flash memory, a volatile memory, a hard disk storage, or an optical medium. Thetransceiver 150 can be an independent device combining a transmitter and a receiver in a single package. - In some other embodiments, the
image capturing device 110, themulti-rate encoding apparatus 130, and thetransceiver 150 are integrated in a same electronic device. For example, theimage capturing device 110 may include an image sensor and a lens or a lens set of the electronic device. The multi-rate encoding apparatus 113 may be implemented by one or more single-chip encoders, one or more single-chip codecs, one or more image processor, one or more image processing engine, or the like, which are integrated in the electronic device. Thetransceiver 150 may be implemented by an integrated circuit, a chip, or a chipset that is integrated in the electronic device. For example, the electronic device may be a smartphone having a built-in camera and a motherboard that integrates themulti-rate encoding apparatus 130 and thetransceiver 150. - In some other embodiments, any two of the
image capturing device 110, themulti-rate encoding apparatus 130, and thetransceiver 150 are integrated in a same electronic device. For example, theimage capturing device 110 can be a camera or a camcorder that is coupled to an electronic device having a motherboard that integrates themulti-rate encoding apparatus 130 and thetransceiver 150. - Exemplary rate control methods consistent with the disclosure will be described in more detail below. A rate control method consistent with the disclosure can be implemented in a multi-rate encoding apparatus consistent with the disclosure. The multi-rate encoding apparatus can be configured as a portion of a transmitting terminal. The multi-rate encoding apparatus and the transmitting terminal can be, for example, the
multi-rate encoding apparatus 130 and the transmittingterminal 100 described above. - As described above, the bitrate of an encoded data stream can be controlled by controlling a coding parameter, such as a quantization parameter, used for encoding image frames. To obtain an encoded data stream having a desired bitrate (also referred to as an “expected bitrate”), the coding parameter can be selected according to a rate control model describing correspondences between coding parameters and bitrates. The rate control model can also be updated during the encoding process based on calculation/encoding results during the encoding process. In some embodiments, the rate control model can be updated based on the encoding process of one frame or based on the encoding process of a plurality of frames.
-
FIG. 6 schematically illustrates a process for updating the rate control model per frame consistent with the disclosure. As shown in FIG. 6 , a frame 610 is encoded using a plurality of coding parameter values (denoted CP1, CP2, . . . , and CPN in FIG. 6 ) to generate a plurality of encoded data streams 630 having a plurality of bitrate values (denoted R1, R2, . . . , and RN in FIG. 6 ). Each of the plurality of coding parameter values CPi corresponds to one of the plurality of bitrate values Ri (i=1, 2, . . . , N). For example, CP1 corresponds to R1, CP2 corresponds to R2, CPN corresponds to RN, and so on. The plurality of (CPi, Ri) pairs form a plurality of sample points 650, which can then be applied to a rate control model 670 for determining/updating parameters of the rate control model 670 according to, for example, a fitting method.
FIG. 6 , the parameters of the rate control model can be updated or estimated per frame. Thus, a frame-level rate control that can stabilize the bitrate per frame at an expected bitrate can be achieved. The frame-level rate control can avoid frequent playback stops at the receiving terminal due to large transmission jitter. The overall perceptual quality of a video can be enhanced and the user experience can be improved. -
FIG. 7 is a flow chart of an exemplary rate control method 700 consistent with the disclosure. According to the rate control method 700, a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above, can control a plurality of coding parameter values of a multi-rate encoder, such as the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above, according to which a plurality of encoded data streams having a corresponding plurality of bitrate values can be generated by the multi-rate encoder. A rate selector, such as the rate selector 1305 of the multi-rate encoding apparatus 130 described above, can select one of the plurality of encoded data streams as the output data stream based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like. - As shown in
FIG. 7, at 701, a first plurality of coding parameter values are obtained. The first plurality of coding parameter values may include a plurality of coding parameter values for encoding a first input frame. In some embodiments, the first input frame can be a first one of image frames captured by an image capturing device and sent to a multi-rate encoder for encoding. The image capturing device can be, for example, the image capturing device 110 described above. The multi-rate encoder can be, for example, the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above. In some other embodiments, the first input frame can be an image frame in the stream of a video from which encoding or transmission of the video starts, or from which encoding or transmission of the video resumes. In some other embodiments, the first input frame can be any one of image frames captured by the image capturing device or any image frame in the stream of a video. - In some embodiments, the first plurality of coding parameter values are provided by a rate control model based at least in part on an expected bitrate for the first input frame (also referred to as a "first expected bitrate"). That is, one of the first plurality of coding parameter values, which can be referred to as a "first main coding parameter value," is provided by the rate control model based on the expected bitrate for the first input frame. The remaining ones of the first plurality of coding parameter values, i.e., those of the first plurality of coding parameter values other than the first main coding parameter value, can be referred to as first auxiliary coding parameter values.
- Any suitable rate control model can be used here. For example, the rate control model can include a quantizer-domain (Q-domain) rate control model (also referred to as a rate-quantization (R-Q) model) that characterizes the relationship between bitrate and QP, a rho-domain (ρ-domain) rate control model that characterizes the relationship between bitrate and the parameter ρ (the percentage of zeros among the quantized transform coefficients), or a lambda-domain (λ-domain) rate control model (also referred to as a rate-lambda (R-λ) model) that characterizes the relationship between bitrate and the Lagrange multiplier λ corresponding to the QP for each frame.
- In some embodiments, the rate control model may have initial parameters that are pre-stored in a rate controller, such as the
rate controller 1303 of the multi-rate encoding apparatus 130 described above. A coding parameter value corresponding to the expected bitrate value for the first input frame obtained according to the rate control model can be set as the first main coding parameter value. Taking an R-Q model as an example, the coding parameter can be the QP and the R-Q model can be expressed as an exponential of the second-order polynomial: -
R(Q)=exp(a·Q²+b·Q+c) - where R represents the value of the bitrate, Q represents the value of the QP, and a, b, and c represent model parameters. The coding parameter value corresponding to the expected bitrate value for the first input frame can be calculated from the above exponential of the second-order polynomial with initial a, b, and c values.
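The inversion of this model can be sketched in a few lines. The following Python snippet, an illustration rather than the patent's implementation, solves a·Q²+b·Q+c = ln(R) for Q and keeps the root inside a typical integer QP range such as H.264's 0-51; the parameter values used below are assumptions.

```python
import math

def qp_for_bitrate(expected_rate, a, b, c, qp_range=(0, 51)):
    """Invert R(Q) = exp(a*Q^2 + b*Q + c): solve the quadratic
    a*Q^2 + b*Q + (c - ln(R)) = 0 and return a root inside the valid
    QP range, rounded since QP is an integer in codecs such as H.264."""
    k = c - math.log(expected_rate)
    disc = b * b - 4.0 * a * k
    if disc < 0:
        raise ValueError("expected bitrate not reachable under this model")
    roots = [(-b + s * math.sqrt(disc)) / (2.0 * a) for s in (1.0, -1.0)]
    lo, hi = qp_range
    for q in roots:
        if lo <= q <= hi:
            return round(q)
    raise ValueError("no root in the valid QP range")
```

Of the two quadratic roots, only the one inside the codec's QP range is meaningful; the other typically falls far outside it.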
- In some embodiments, the expected bitrate for the first input frame can be a preset bitrate. In some embodiments, the expected bitrate for the first input frame can be obtained from a user input. In some other embodiments, the expected bitrate for the first input frame can be determined based on, for example, the channel capacity, the channel bandwidth, the transmission latency, or the like.
- In some embodiments, the first auxiliary coding parameter values can be gradually deviated from the first main coding parameter value at a coding parameter interval. For example, the first auxiliary coding parameter values can be gradually stepped down or stepped up from the first main coding parameter value and arranged at the coding parameter interval. As another example, one or some of the first auxiliary coding parameter values can be obtained by gradually stepping down from the first main coding parameter value and one or some of the first auxiliary coding parameter values can be obtained by gradually stepping up from the first main coding parameter value.
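The constant-interval stepping scheme described above can be sketched as follows; the function name and step counts are illustrative assumptions.

```python
def auxiliary_values(main_value, interval, n_down, n_up):
    """Generate auxiliary coding parameter values by stepping down and/or
    up from the main value at a constant interval, returned in ascending
    order around (but excluding) the main value."""
    down = [main_value - interval * i for i in range(n_down, 0, -1)]
    up = [main_value + interval * i for i in range(1, n_up + 1)]
    return down + up

# e.g. main QP 30, interval 2, two steps each way -> [26, 28, 32, 34]
```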
- In some embodiments, the determination of the preset coding parameter interval may be a tradeoff between the computation complexity and the estimation accuracy of the parameters of the rate control model. For example, a large coding parameter interval leads to a small number of coding parameter values, which can reduce the computational burden, but may increase the estimation error of the parameters of the rate control model. On the other hand, a fine coding parameter interval can generate a plurality of coding parameter values that are densely distributed over a certain range, which can reduce the estimation error of the parameters of the rate control model, but may increase the computational burden.
- In some embodiments, the coding parameter interval can be a constant interval, i.e., the interval between each pair of neighboring coding parameter values is the same. In some other embodiments, the coding parameter interval can be a variable interval, i.e., the interval between each pair of neighboring coding parameter values may vary from pair to pair, or may be the same among some pairs of neighboring coding parameter values but different among some other pairs. In the embodiments of varying coding parameter interval, the interval can be, for example, varied with the curvature of a curve of the coding parameter versus the bitrate. For example, a large interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively small curvature, and a fine interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively large curvature.
- In the embodiments described above, the first auxiliary coding parameter values are obtained by first obtaining the first main coding parameter value and then calculating the first auxiliary coding parameter values according to a preset interval. In some other embodiments, the first auxiliary coding parameter values can be obtained based on selected bitrates for the first input frame. Such selected bitrates for the first input frame are also referred to as “first auxiliary bitrate values.” For example, the first auxiliary bitrate values can be gradually deviated from the expected bitrate for the first input frame at a bitrate interval. Similar to the coding parameter interval, the bitrate interval also can be constant or variable, and can be determined in a similar manner as determining the coding parameter interval. For example, the first auxiliary bitrate values can be obtained by gradually stepping down and/or stepping up from the expected bitrate for the first input frame and arranged at the bitrate interval. The first auxiliary coding parameter values corresponding to the first auxiliary bitrate values can be calculated according to the rate control model.
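The bitrate-stepping alternative can be sketched similarly. Here `cp_for_rate`, standing in for the inverse mapping of the rate control model, and the toy model used in the example are assumptions for illustration only.

```python
def auxiliary_cps_from_bitrates(expected_rate, rate_interval, n_down, n_up,
                                cp_for_rate):
    """Derive auxiliary coding parameter values from auxiliary bitrate
    values stepped away from the expected bitrate at a constant interval,
    mapping each bitrate through the rate control model's inverse."""
    rates = [expected_rate + rate_interval * i
             for i in range(-n_down, n_up + 1) if i != 0]
    return [cp_for_rate(r) for r in rates]

# Toy inverse model for illustration only: one step down and one up.
cps = auxiliary_cps_from_bitrates(1000, 100, 1, 1, lambda r: 100000 // r)
```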
- At 703, the first input frame is encoded using the first plurality of coding parameter values to generate a first plurality of encoded data streams. Each of the first plurality of encoded data streams is generated using a corresponding one of the first plurality of coding parameter values and has a corresponding one of a first plurality of bitrate values.
- In some embodiments, the first input frame can be intra-encoded using the first plurality of coding parameter values to generate the first plurality of encoded data streams. In some embodiments, encoding the first input frame using one of the first plurality of coding parameter values can include a prediction process, a transformation process, a quantization process, and an entropy encoding process.
- In some embodiments, the encoding processes of the first input frame using the first plurality of coding parameter values can be separate processes and implemented in parallel. For example, as shown in
FIG. 2, the multi-rate encoding apparatus 130 can include a plurality of separate single-rate encoders, each of which can be used to encode the first input frame using one of the first plurality of coding parameter values to generate a corresponding one of the first plurality of encoded data streams. - In some other embodiments, the encoding processes of the first input frame using the first plurality of coding parameter values can include at least one common process. For example, as shown in
FIGS. 3 and 4, the encoding processes of the first input frame using the first plurality of coding parameter values can share a common prediction process in the common circuit 310 and use separate transformation processes, separate quantization processes, and separate entropy encoding processes in the separate processing circuits 330-1, 330-2, . . . 330-N to generate the first plurality of encoded data streams. The computational complexity and the computing resource consumption can be reduced by sharing the common prediction circuit 310. - At 705, the rate control model can be updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. That is, updated parameters of the rate control model can be obtained based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. Thus, an updated rate control model can be generated with the updated parameters. Taking the R-Q model described above as an example, an updated R-Q curve can be modeled by the least-squares fitting of the exponential of the second-order polynomial described above based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, such that the updated values of a, b, and c can be obtained.
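As a sketch of the update at 705, the least-squares fit of ln(R) = a·Q² + b·Q + c over the (CPi, Ri) sample points can be written in pure Python; a real implementation would more likely use a library routine such as numpy.polyfit.

```python
import math

def fit_rq_model(samples):
    """samples: list of (Q, R) pairs. Fit ln(R) = a*Q^2 + b*Q + c by
    least squares and return (a, b, c)."""
    # Build the 3x3 normal equations X^T X p = X^T y for y = ln(R),
    # with design rows [Q^2, Q, 1].
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for q, r in samples:
        row = [q * q, q, 1.0]
        y = math.log(r)
        for i in range(3):
            xty[i] += row[i] * y
            for j in range(3):
                xtx[i][j] += row[i] * row[j]
    # Solve by Gauss-Jordan elimination with partial pivoting.
    m = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r_: abs(m[r_][col]))
        m[col], m[piv] = m[piv], m[col]
        for r_ in range(3):
            if r_ != col:
                f = m[r_][col] / m[col][col]
                m[r_] = [v - f * w for v, w in zip(m[r_], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c
```

With at least three sample points at distinct coding parameter values, the normal equations are well posed and the fit recovers the model parameters.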
- In the embodiments described above, the rate control model is updated once using one input frame and then used for encoding the next input frame. In some scenarios, one updating may not be enough to create an updated rate control model that closely models the actual correspondence relationship between the coding parameter and the bitrate. The degree of approximation between the updated rate control model and the actual correspondence relationship between the coding parameter and the bitrate can be determined by a difference between a first actual bitrate value and the expected bitrate value. The first actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value. Therefore, in some other embodiments, the rate control model can be iteratively updated using the first input frame until a difference between the first actual bitrate value and the expected bitrate value is within a preset range, e.g., smaller than a preset value.
- At 707, one of the first plurality of encoded data streams is selected as an output data stream for the first input frame. The selection of the one of the first plurality of encoded data streams can be based on, for example, the expected bitrate value for the first input frame, the channel capacity, the channel bandwidth, the transmission latency, and/or the like.
- In some embodiments, the output data stream for the first input frame can be selected from the first plurality of encoded streams according to the expected bitrate value for the first input frame. For example, the one of the first plurality of encoded streams obtained by encoding the first input frame using the first main coding parameter value (also referred to as a "first main encoded data stream") can be directly selected as the output data stream. In some scenarios, the bitrate of the first main encoded data stream may differ from the expected bitrate for the first input frame by a relatively large value. In these scenarios, the expected bitrate for the first input frame can be fed back into the updated rate control model to obtain a new coding parameter value for encoding the first input frame, and the resulting encoded data stream can be output as the output data stream for the first input frame.
- As another example, the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is closest to the expected bitrate value for the first input frame.
- As another example, the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is not more than and closest to the expected bitrate value for the first input frame.
- As another example, the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that has a difference from the expected bitrate value for the first input frame within a preset range.
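The "closest" and "not more than and closest" selection rules above can be sketched as follows; the `(bitrate, payload)` stream representation is an assumption for illustration.

```python
def select_stream(streams, expected_rate, rule="closest"):
    """streams: list of (bitrate, payload) pairs.
    'closest'  : pick the stream whose bitrate is nearest the expected value.
    'not_above': pick the highest bitrate not exceeding the expected value."""
    if rule == "closest":
        return min(streams, key=lambda s: abs(s[0] - expected_rate))
    if rule == "not_above":
        eligible = [s for s in streams if s[0] <= expected_rate]
        return max(eligible, key=lambda s: s[0]) if eligible else None
    raise ValueError(rule)
```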
- In some embodiments, the output data stream for the first input frame can be selected from the first plurality of encoded streams according to a current channel bandwidth. For example, the output data stream for the first input frame can be one of the first plurality of encoded streams that matches the current channel bandwidth. As such, the output data stream can be adapted to the time-varying channel bandwidth in real-time. That is, when the channel bandwidth varies with time, the output data stream can match the channel bandwidth in real-time.
- In some other embodiments, the output data stream for the first input frame may be selected according to the current channel bandwidth and a target latency. The target latency may also be referred to as a control target of the latency, which represents an expected transmission latency.
- For example, the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is closest to the target latency.
- As another example, the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is not more than and closest to the target latency.
- As a further example, the output data stream for the first input frame may be the one of the first plurality of encoded streams with the highest bitrate among those for which the difference between the target latency and the transmission latency under the current channel bandwidth is within a preset range. Because a higher bitrate generally corresponds to a higher encoding quality, this approach can ensure that the encoded data with the highest encoding quality is selected while the target latency is satisfied.
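The latency-based rules can be sketched in the same way, approximating the transmission latency of a stream under the current bandwidth as bitrate/bandwidth; this is an illustrative assumption, as the patent does not fix the latency model.

```python
def select_by_latency(streams, bandwidth, target_latency, rule="closest"):
    """streams: list of (bitrate, payload) pairs.
    'closest'  : stream whose latency is nearest the target latency.
    'not_above': highest-bitrate stream whose latency does not exceed it."""
    latency = lambda s: s[0] / bandwidth  # assumed latency model
    if rule == "closest":
        return min(streams, key=lambda s: abs(latency(s) - target_latency))
    if rule == "not_above":
        ok = [s for s in streams if latency(s) <= target_latency]
        return max(ok, key=lambda s: s[0]) if ok else None
    raise ValueError(rule)
```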
- In some embodiments, the output data stream for the first input frame may be selected according to the channel bandwidth, the target latency, and the encoding quality. That is, the selection of the output data stream for the first input frame can be based on a combination of the requirements of the channel bandwidth, the target latency, and the encoding quality.
- In some embodiments, a cost function may be determined according to the channel bandwidth, the target latency, the encoding quality, and a target bitrate. The output data stream for the first input frame may be the one of the first plurality of encoded data streams with the smallest value of the cost function.
- For example, the cost function may be as follows:
-
Cost=A×|bitrate/bandwidth−target latency|+B×encoding quality -
- where Cost represents the cost, and A and B represent weights.
- According to the requirements of different application scenarios, the values of A and B can be adjusted to bias towards the requirement of the encoding quality or the requirement of the latency control, e.g., the values of A and B can be adjusted to give more weight to the requirement of the encoding quality or to the requirement of the latency control in the calculation of Cost.
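A minimal sketch of selection by this cost function follows. The `distortion` argument stands in for the "encoding quality" term under the assumption that a lower value is better; the stream representation and the weight values are illustrative.

```python
def stream_cost(bitrate, bandwidth, target_latency, distortion, A, B):
    """Cost = A*|bitrate/bandwidth - target_latency| + B*quality term,
    where `distortion` is the quality term (assumed lower-is-better)."""
    return A * abs(bitrate / bandwidth - target_latency) + B * distortion

def select_by_cost(streams, bandwidth, target_latency, A=1.0, B=1.0):
    """streams: list of (bitrate, distortion, payload) tuples; return the
    entry minimizing the cost function."""
    return min(streams, key=lambda s: stream_cost(s[0], bandwidth,
                                                  target_latency, s[1], A, B))
```

Raising A biases the selection toward latency control; raising B biases it toward encoding quality, mirroring the weighting discussion above.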
- In some embodiments, a reconstructed frame obtained from the output data stream for the first input frame can be used as the context of a second input frame. That is, a reconstructed frame obtained from the output data stream for the first input frame can be used as a reference for the prediction of the second input frame.
- At 709, a second input frame is encoded based on the updated rate control model. For example, a second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model, and the second input frame can be encoded using the second plurality of coding parameter values.
- In some embodiments, the second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model and an expected bitrate for the second input frame (also referred to as a “second expected bitrate”). For example, one of the second plurality of coding parameter values can be determined based on the updated rate control model and the expected bitrate for the second input frame, which can be referred to as a second main coding parameter value. The remaining ones of the second plurality of coding parameter values, i.e., the coding parameter values of the second plurality of coding parameter values other than the second main coding parameter value, can be referred to as second auxiliary coding parameter values. Taking the R-Q model described above as an example, the coding parameter value corresponding to the expected bitrate for the second input frame calculated from the exponential of the second-order polynomial with the updated a, b, and c parameters can be set as the second main coding parameter value.
- In some embodiments, the second main coding parameter value and the first main coding parameter value are for a same encoding channel. The same encoding channel refers to, for example, a same single-rate encoder included in the multi-rate encoder as shown in
FIG. 2, or a same processing circuit included in the multi-rate encoder as shown in FIG. 4. - In some embodiments, similar to the first auxiliary coding parameter values, the second auxiliary coding parameter values can be gradually deviated from the second main coding parameter value at a coding parameter interval. For example, the second auxiliary coding parameter values can be obtained by gradually stepping up and/or stepping down from the second main coding parameter value and arranged at the coding parameter interval. The coding parameter interval for the second auxiliary coding parameter values can be a constant interval or a variable interval. The coding parameter interval for the second auxiliary coding parameter values can be determined in a similar manner as that for determining the coding parameter interval for the first auxiliary coding parameter values, and thus the detailed description thereof is omitted.
- In some other embodiments, the second auxiliary coding parameter values can be obtained based on selected bitrates for the second input frame, also referred to as "second auxiliary bitrate values." Similar to the first auxiliary bitrate values, the second auxiliary bitrate values can be gradually deviated from the expected bitrate for the second input frame at a bitrate interval. The second auxiliary coding parameter values corresponding to the second auxiliary bitrate values can be calculated based on the rate control model. For example, the second auxiliary bitrate values can be obtained by gradually stepping up and/or stepping down from the expected bitrate for the second input frame. The bitrate interval for the second auxiliary bitrate values can be a constant interval or a variable interval. The bitrate interval for the second auxiliary bitrate values can be determined in a similar manner as that for determining the bitrate interval for the first auxiliary bitrate values, and thus the detailed description thereof is omitted.
- In some embodiments, the second input frame can be inter-encoded and/or intra-encoded using the second plurality of coding parameter values to generate a second plurality of encoded data streams. In some embodiments, encoding the second input frame using one of the second plurality of coding parameter values can include the prediction process, the transformation process, the quantization process, and the entropy encoding process.
- One of the second plurality of encoded data streams can be selected as an output data stream corresponding to the second input frame. The selection of the output data stream corresponding to the second input frame can be similar to the selection of the output data stream corresponding to the first input frame, and thus detailed description thereof is omitted.
- In some embodiments, the rate control model may also vary between image frames. That is, the updated rate control model obtained based on the first input frame may not accurately reflect the correspondence relationship between coding parameter and bitrate in the second input frame. In these embodiments, the rate control model can be further updated based on the second plurality of coding parameter values and the second plurality of bitrate values respectively corresponding to the second plurality of encoded data streams. To do so, the second main coding parameter value may be iteratively adjusted until the difference between a second actual bitrate and the expected bitrate value for the second input frame is within a preset range. The second actual bitrate value refers to a bitrate value obtained by encoding the second input frame using the second main coding parameter value.
-
FIG. 8 is a flow chart showing a process of iteratively updating the rate control model consistent with the disclosure. As shown in FIG. 8, the rate control model can be first updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams [denoted as (CP1i, R1i)] obtained according to the approaches (705) described above. The second plurality of coding parameter values (denoted as CP2i) can be determined based on the updated rate control model and the expected bitrate for the second input frame as described above. As described above, the coding parameter value calculated from the expected bitrate for the second input frame using the updated rate control model is the second main coding parameter value. The second frame can be encoded using CP2i to generate the second plurality of data streams having a plurality of actual bitrates (denoted as R2i), including the data stream having the second actual bitrate value generated by encoding the second frame using the second main coding parameter value. If the difference between the second actual bitrate value and the expected bitrate value for the second input frame falls outside the preset range, the rate control model is updated according to the (CP2i, R2i) pairs. The second main coding parameter value can be updated to a coding parameter corresponding to the expected bitrate value for the second input frame calculated from the further updated rate control model. On the other hand, if the above difference between the second actual bitrate value and the expected bitrate value for the second input frame is within the preset range, the iterative adjustment process can be stopped and one of the second plurality of data streams is output as the output data stream, as shown in FIG. 8.
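The iterative loop of FIG. 8 can be sketched as follows; `encode_at` and `update` are hypothetical stand-ins for the encoder and the model refit, and the convergence test mirrors the preset-range check described above.

```python
def iterate_rate_control(frame, expected_rate, model, encode_at, update,
                         tolerance, max_iters=8):
    """Derive the main coding parameter from the model, encode, and refit
    the model until the actual bitrate is within `tolerance` of the
    expected bitrate. `model` maps an expected bitrate to a coding
    parameter; `update(model, cp, actual)` returns a refitted model."""
    for _ in range(max_iters):
        cp = model(expected_rate)
        actual = encode_at(frame, cp)
        if abs(actual - expected_rate) <= tolerance:
            return cp, actual
        model = update(model, cp, actual)
    return cp, actual  # best effort after max_iters
```

In the sketch, a single-sample `update` that rescales the model already converges for a proportional toy encoder; the full scheme refits all (CP2i, R2i) pairs.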
FIG. 9 schematically shows a variation of a bitrate-versus-QP curve (R-Q curve or R-Q model, i.e., an example of the rate control model) between frames. In FIG. 9, curve 1 represents the R-Q model obtained/updated based on the first input frame, i.e., a curve created by fitting the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. As shown in FIG. 9, when, for example, the complexities of the first input frame and the second input frame are different, the R-Q model may move from curve 1 to curve 2. Curve 2 is an actual R-Q curve corresponding to the second input frame, which is yet unknown. As shown in FIG. 9, if an expected bitrate Re for the second input frame is desired, curve 1 gives a corresponding second main coding parameter value QPe. However, if QPe is used for encoding the second input frame, the obtained encoded data stream will have, according to curve 2, an actual bitrate different from the expected bitrate Re. Curve 1 can be iteratively updated according to, e.g., the method described above in connection with FIG. 8 to obtain curve 2 or a curve close to curve 2. Thereafter, according to the obtained curve 2 or the obtained curve close to curve 2, the second main coding parameter value QPe1 that can result in the expected bitrate Re for the second input frame can be obtained. - In some embodiments, obtaining the first plurality of coding parameter values (at 701) can further include iteratively adjusting the first main coding parameter value until the difference between an actual bitrate value and the expected bitrate value for the first input frame is within a preset range. An actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value.
- Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. It is intended that the specification and examples be considered as exemplary only and not to limit the scope of the disclosure, with a true scope and spirit of the invention being indicated by the following claims.
Claims (20)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/071491 WO2018132964A1 (en) | 2017-01-18 | 2017-01-18 | Method and apparatus for transmitting coded data, computer system, and mobile device |
CNPCT/CN2017/071491 | 2017-01-18 | ||
PCT/CN2018/072444 WO2018133734A1 (en) | 2017-01-18 | 2018-01-12 | Rate control |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/072444 Continuation WO2018133734A1 (en) | 2017-01-18 | 2018-01-12 | Rate control |
Publications (1)
Publication Number | Publication Date |
---|---|
US20190342551A1 true US20190342551A1 (en) | 2019-11-07 |
Family
ID=59613570
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/511,839 Abandoned US20190342551A1 (en) | 2017-01-18 | 2019-07-15 | Rate control |
US16/514,559 Active US11159796B2 (en) | 2017-01-18 | 2019-07-17 | Data transmission |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/514,559 Active US11159796B2 (en) | 2017-01-18 | 2019-07-17 | Data transmission |
Country Status (5)
Country | Link |
---|---|
US (2) | US20190342551A1 (en) |
EP (1) | EP3571840B1 (en) |
JP (1) | JP6862633B2 (en) |
CN (2) | CN107078852B (en) |
WO (2) | WO2018132964A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200045312A1 (en) * | 2016-10-05 | 2020-02-06 | Interdigital Vc Holdings, Inc. | Method and apparatus for encoding a picture |
US11159796B2 (en) * | 2017-01-18 | 2021-10-26 | SZ DJI Technology Co., Ltd. | Data transmission |
CN113660488A (en) * | 2021-10-18 | 2021-11-16 | 腾讯科技(深圳)有限公司 | Method and device for carrying out flow control on multimedia data and training flow control model |
US20220093119A1 (en) * | 2020-09-22 | 2022-03-24 | International Business Machines Corporation | Real-time vs non-real time audio streaming |
US11343501B2 (en) | 2018-10-12 | 2022-05-24 | Central South University | Video transcoding method and device, and storage medium |
US20220375133A1 (en) * | 2020-02-07 | 2022-11-24 | Huawei Technologies Co., Ltd. | Image processing method and related device |
US11627307B2 (en) * | 2018-09-28 | 2023-04-11 | Intel Corporation | Transport controlled video coding |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108521869B (en) * | 2017-09-06 | 2020-12-25 | 深圳市大疆创新科技有限公司 | Wireless data transmission method and device |
WO2019047059A1 (en) * | 2017-09-06 | 2019-03-14 | 深圳市大疆创新科技有限公司 | Method and device for transmitting wireless data |
US11683550B2 (en) | 2017-09-18 | 2023-06-20 | Intel Corporation | Apparatus, system and method of video encoding |
WO2019119175A1 (en) * | 2017-12-18 | 2019-06-27 | 深圳市大疆创新科技有限公司 | Bit rate control method, bit rate control device and wireless communication device |
CN108986829B (en) * | 2018-09-04 | 2020-12-15 | 北京猿力未来科技有限公司 | Data transmission method, device, equipment and storage medium |
US11368692B2 (en) * | 2018-10-31 | 2022-06-21 | Ati Technologies Ulc | Content adaptive quantization strength and bitrate modeling |
US11838796B2 (en) * | 2021-08-18 | 2023-12-05 | Corning Research & Development Corporation | Compression and decompression between elements of a wireless communications system (WCS) |
CN117615141B (en) * | 2023-11-23 | 2024-08-02 | 镕铭微电子(济南)有限公司 | Video coding method, system, equipment and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050180500A1 (en) * | 2001-12-31 | 2005-08-18 | Stmicroelectronics Asia Pacific Pte Ltd | Video encoding |
US20100111163A1 (en) * | 2006-09-28 | 2010-05-06 | Hua Yang | Method for p-domain frame level bit allocation for effective rate control and enhanced video encoding quality |
US20120230400A1 (en) * | 2011-03-10 | 2012-09-13 | Microsoft Corporation | Mean absolute difference prediction for video encoding rate control |
US20130010859A1 (en) * | 2011-07-07 | 2013-01-10 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V | Model parameter estimation for a rate- or distortion-quantization model function |
US20140328384A1 (en) * | 2013-05-02 | 2014-11-06 | Magnum Semiconductor, Inc. | Methods and apparatuses including a statistical multiplexer with global rate control |
US20180139450A1 (en) * | 2016-11-15 | 2018-05-17 | City University Of Hong Kong | Systems and methods for rate control in video coding using joint machine learning and game theory |
Family Cites Families (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3265696B2 (en) * | 1993-04-01 | 2002-03-11 | 松下電器産業株式会社 | Image compression coding device |
JPH07212757A (en) * | 1994-01-24 | 1995-08-11 | Toshiba Corp | Picture compression coder |
JP3149673B2 (en) * | 1994-03-25 | 2001-03-26 | 松下電器産業株式会社 | Video encoding device, video encoding method, video reproducing device, and optical disc |
US6366614B1 (en) * | 1996-10-11 | 2002-04-02 | Qualcomm Inc. | Adaptive rate control for digital video compression |
US7062445B2 (en) | 2001-01-26 | 2006-06-13 | Microsoft Corporation | Quantization loop with heuristic approach |
US7062429B2 (en) * | 2001-09-07 | 2006-06-13 | Agere Systems Inc. | Distortion-based method and apparatus for buffer control in a communication system |
US20070013561A1 (en) | 2005-01-20 | 2007-01-18 | Qian Xu | Signal coding |
US20090225829A2 (en) | 2005-07-06 | 2009-09-10 | Do-Kyoung Kwon | Method and apparatus for operational frame-layer rate control in video encoder
US7539612B2 (en) | 2005-07-15 | 2009-05-26 | Microsoft Corporation | Coding and decoding scale factor information |
US8077775B2 (en) | 2006-05-12 | 2011-12-13 | Freescale Semiconductor, Inc. | System and method of adaptive rate control for a video encoder |
JP2008283560A (en) * | 2007-05-11 | 2008-11-20 | Canon Inc | Information processing apparatus and method thereof |
WO2010005691A1 (en) * | 2008-06-16 | 2010-01-14 | Dolby Laboratories Licensing Corporation | Rate control model adaptation based on slice dependencies for video coding |
WO2010030569A2 (en) * | 2008-09-09 | 2010-03-18 | Dilithium Networks, Inc. | Method and apparatus for transmitting video |
JP5257215B2 (en) * | 2009-04-16 | 2013-08-07 | ソニー株式会社 | Image coding apparatus and image coding method |
CN102036062B (en) * | 2009-09-29 | 2012-12-19 | 华为技术有限公司 | Video coding method and device and electronic equipment |
CN101800885A (en) * | 2010-02-26 | 2010-08-11 | 北京新岸线网络技术有限公司 | Method and system for distributing video data |
CN101888542B (en) * | 2010-06-11 | 2013-01-09 | 北京数码视讯科技股份有限公司 | Control method for frame level bit-rate of video transcoding and transcoder |
CN102843351B (en) * | 2012-03-31 | 2016-01-27 | 华为技术有限公司 | A kind of processing method of streaming media service, streaming media server and system |
CN103379362B (en) * | 2012-04-24 | 2017-07-07 | 腾讯科技(深圳)有限公司 | VOD method and system |
US20130322516A1 (en) * | 2012-05-31 | 2013-12-05 | Broadcom Corporation | Systems and methods for generating multiple bitrate streams using a single encoding engine |
CN102970540B (en) * | 2012-11-21 | 2016-03-02 | 宁波大学 | Based on the multi-view video rate control of key frame code rate-quantitative model |
US9560361B2 (en) | 2012-12-05 | 2017-01-31 | Vixs Systems Inc. | Adaptive single-field/dual-field video encoding |
US9621902B2 (en) * | 2013-02-28 | 2017-04-11 | Google Inc. | Multi-stream optimization |
US20140334553A1 (en) * | 2013-05-07 | 2014-11-13 | Magnum Semiconductor, Inc. | Methods and apparatuses including a statistical multiplexer with bitrate smoothing |
EP2879339A1 (en) * | 2013-11-27 | 2015-06-03 | Thomson Licensing | Method for distributing available bandwidth of a network amongst ongoing traffic sessions run by devices of the network, corresponding device. |
KR102249819B1 (en) | 2014-05-02 | 2021-05-10 | 삼성전자주식회사 | System on chip and data processing system including the same |
CN105208390B (en) * | 2014-06-30 | 2018-07-20 | 杭州海康威视数字技术股份有限公司 | The bit rate control method and its system of Video coding |
US10165272B2 (en) * | 2015-01-29 | 2018-12-25 | Arris Enterprises Llc | Picture-level QP rate control performance improvements for HEVC encoding |
US9749178B2 (en) * | 2015-09-18 | 2017-08-29 | Whatsapp Inc. | Techniques to dynamically configure target bitrate for streaming network connections |
US20170094301A1 (en) * | 2015-09-28 | 2017-03-30 | Cybrook Inc. | Initial Bandwidth Estimation For Real-time Video Transmission |
CN105898211A (en) * | 2015-12-21 | 2016-08-24 | 乐视致新电子科技(天津)有限公司 | Multimedia information processing method and device |
CN106170089B (en) * | 2016-08-25 | 2020-05-22 | 上海交通大学 | H.265-based multi-path coding method |
WO2018132964A1 (en) * | 2017-01-18 | 2018-07-26 | 深圳市大疆创新科技有限公司 | Method and apparatus for transmitting coded data, computer system, and mobile device |
2017

- 2017-01-18 WO PCT/CN2017/071491 patent/WO2018132964A1/en active Application Filing
- 2017-01-18 CN CN201780000112.9A patent/CN107078852B/en not_active Expired - Fee Related

2018

- 2018-01-12 JP JP2019537365A patent/JP6862633B2/en active Active
- 2018-01-12 CN CN201880006102.0A patent/CN110169066A/en active Pending
- 2018-01-12 WO PCT/CN2018/072444 patent/WO2018133734A1/en unknown
- 2018-01-12 EP EP18741391.9A patent/EP3571840B1/en active Active

2019
- 2019-07-15 US US16/511,839 patent/US20190342551A1/en not_active Abandoned
- 2019-07-17 US US16/514,559 patent/US11159796B2/en active Active
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200045312A1 (en) * | 2016-10-05 | 2020-02-06 | Interdigital Vc Holdings, Inc. | Method and apparatus for encoding a picture |
US10841582B2 (en) * | 2016-10-05 | 2020-11-17 | Interdigital Vc Holdings, Inc. | Method and apparatus for encoding a picture |
US11159796B2 (en) * | 2017-01-18 | 2021-10-26 | SZ DJI Technology Co., Ltd. | Data transmission |
US11627307B2 (en) * | 2018-09-28 | 2023-04-11 | Intel Corporation | Transport controlled video coding |
US11343501B2 (en) | 2018-10-12 | 2022-05-24 | Central South University | Video transcoding method and device, and storage medium |
US20220375133A1 (en) * | 2020-02-07 | 2022-11-24 | Huawei Technologies Co., Ltd. | Image processing method and related device |
US20220093119A1 (en) * | 2020-09-22 | 2022-03-24 | International Business Machines Corporation | Real-time vs non-real time audio streaming |
US11355139B2 (en) * | 2020-09-22 | 2022-06-07 | International Business Machines Corporation | Real-time vs non-real time audio streaming |
CN113660488A (en) * | 2021-10-18 | 2021-11-16 | 腾讯科技(深圳)有限公司 | Method and device for carrying out flow control on multimedia data and training flow control model |
Also Published As
Publication number | Publication date |
---|---|
CN107078852A (en) | 2017-08-18 |
JP2020505830A (en) | 2020-02-20 |
WO2018132964A1 (en) | 2018-07-26 |
EP3571840B1 (en) | 2021-09-15 |
JP6862633B2 (en) | 2021-04-21 |
EP3571840A4 (en) | 2020-01-22 |
US20190342771A1 (en) | 2019-11-07 |
US11159796B2 (en) | 2021-10-26 |
CN107078852B (en) | 2019-03-08 |
CN110169066A (en) | 2019-08-23 |
EP3571840A1 (en) | 2019-11-27 |
WO2018133734A1 (en) | 2018-07-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3571840B1 (en) | Rate control | |
US8331449B2 (en) | Fast encoding method and system using adaptive intra prediction | |
JP5384694B2 (en) | Rate control for multi-layer video design | |
EP1549074A1 (en) | A bit-rate control method and device combined with rate-distortion optimization | |
WO2020253858A1 (en) | An encoder, a decoder and corresponding methods | |
CN110870311A (en) | Fractional quantization parameter offset in video compression | |
US8340172B2 (en) | Rate control techniques for video encoding using parametric equations | |
US20130235938A1 (en) | Rate-distortion optimized transform and quantization system | |
US9560386B2 (en) | Pyramid vector quantization for video coding | |
US20210014486A1 (en) | Image transmission | |
WO2021136056A1 (en) | Encoding method and encoder | |
KR101959490B1 (en) | Method for video bit rate control and apparatus thereof | |
JP2018067808A (en) | Picture encoder, imaging apparatus, picture coding method, and program | |
CN113132726B (en) | Encoding method and encoder | |
US11800097B2 (en) | Method for image processing and apparatus for implementing the same | |
US9392286B2 (en) | Apparatuses and methods for providing quantized coefficients for video encoding | |
US20200374553A1 (en) | Image processing | |
CN112055211A (en) | Video encoder and QP setting method | |
WO2019148320A1 (en) | Video data encoding | |
US12149697B2 (en) | Encoding method and encoder | |
WO2023172616A1 (en) | Systems and methods for division-free probability regularization for arithmetic coding | |
KR101307469B1 (en) | Video encoder, video decoder, video encoding method, and video decoding method | |
KR20150102874A (en) | Method for coding image by using adaptive coding scheme and device for coding image using the method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: SZ DJI TECHNOLOGY CO., LTD., CHINA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ZHU, LEI;REEL/FRAME:049760/0529; Effective date: 20190708 |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |