
US20190342551A1 - Rate control - Google Patents

Rate control

Info

Publication number
US20190342551A1
Authority
US
United States
Prior art keywords
coding parameter
input frame
bitrate
parameter values
rate control
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US16/511,839
Inventor
Lei Zhu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd
Assigned to SZ DJI Technology Co., Ltd. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ZHU, LEI
Publication of US20190342551A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/0001Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0014Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the source coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00Arrangements for detecting or preventing errors in the information received
    • H04L1/0001Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0015Systems modifying transmission characteristics according to link quality, e.g. power backoff characterised by the adaptation strategy
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852Delays
    • H04L43/087Jitter
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146Data rate or code amount at the encoder output
    • H04N19/149Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W24/00Supervisory, monitoring or testing arrangements
    • H04W24/02Arrangements for optimising operational condition
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/08Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876Network utilisation, e.g. volume of load or congestion level
    • H04L43/0882Utilisation of link capacity
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/196Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters

Definitions

  • the present disclosure relates to data encoding and, more particularly, to a method and apparatus for rate control, a multi-rate encoding apparatus, and a transmitting terminal.
  • A condition of a channel, such as the channel bandwidth, usually varies over time.
  • This is especially true for a wireless channel. There are many factors affecting the wireless channel, such as the physical distance, the relative position, and obstacles/occlusions between the receiving and transmitting terminals, instantaneous electromagnetic interference, and the like.
  • The data source of the transmission also varies over time.
  • The source time-variation and the channel time-variation are independent of each other and are difficult to predict, which causes difficulties in adapting source encoding to the channel bandwidth in real-time. For example, when the channel is stable, a sudden movement of the camera or a large movement of the object in the camera view leads to a sudden change in the size of the encoded data stream.
  • If the size of the encoded data stream doubles, the transmission latency/delay is doubled accordingly.
  • In other cases, the size of the data stream remains constant, but a sudden channel variation can still cause transmission jitter (transmission latency that varies over time). If the channel bandwidth is reduced by one half, the transmission latency is doubled accordingly.
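The bandwidth-halving example above follows directly from the relation latency = frame size / bandwidth; a minimal sketch (the frame size and bandwidth figures are hypothetical):

```python
def transmission_latency(frame_bits: float, bandwidth_bps: float) -> float:
    """Seconds needed to transmit one encoded frame over the channel."""
    return frame_bits / bandwidth_bps

# A hypothetical 100-kbit frame over a 2-Mbps channel:
base = transmission_latency(100_000, 2_000_000)       # 0.05 s
# If the channel bandwidth is reduced by one half, latency doubles:
degraded = transmission_latency(100_000, 1_000_000)   # 0.10 s
assert degraded == 2 * base
```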
  • Rate control technologies that adapt encoding rate to the channel bandwidth in real-time have been widely used in the wireless video transmission applications to ensure a smooth transmission over unreliable channels.
  • Conventional rate control technologies only control the overall average bitrate of a group of frames (e.g., multiple frames). Because only one sample point including two elements, e.g., a coding parameter value and a corresponding bitrate value, is generated per frame, several sample points need to be generated from multiple frames over a given time period for estimating the parameters of the rate control model.
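Such a conventional multi-frame estimate can be sketched as follows, assuming a hyperbolic model R(q) = a/q + b fit by least squares; the model form and the sample values are illustrative assumptions, not the specific model of any particular encoder:

```python
def fit_rq_model(qp_samples, rate_samples):
    """Least-squares fit of the hyperbolic model R(q) = a/q + b,
    i.e., linear regression of rate on x = 1/q."""
    xs = [1.0 / q for q in qp_samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_r = sum(rate_samples) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, rate_samples))
    a = cov_xr / var_x
    b = mean_r - a * mean_x
    return a, b

# One (QP, bitrate) sample point is produced per encoded frame, so the
# model is estimated from a window spanning several frames:
qps = [20, 24, 28, 32]
rates = [5.0, 4.2, 3.7, 3.3]      # Mbps, hypothetical measurements
a, b = fit_rq_model(qps, rates)
predicted_rate = a / 26 + b       # interpolate the fitted curve at QP 26
```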
  • the conventional rate control technologies stabilize the average bitrate over a given time period (e.g., multiple frames) at an expected bitrate to ensure that the overall jitter averaged over multiple frames or a period of time is small.
  • Low-latency video transmission, however, requires stabilizing the bitrate per frame within a certain range to avoid large transmission jitter that causes the playback to stop frequently at the receiving terminal.
  • a rate control method including encoding a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, updating a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encoding a second input frame based on the updated rate control model.
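The per-frame flow described above can be sketched as follows; the `encode` callback and the `model.update` method are hypothetical placeholders standing in for a real multi-rate encoder and rate control model:

```python
def rate_control_step(frame, model, candidate_qps, encode):
    """Encode one input frame at several coding parameter values, then
    refresh the rate control model from the resulting sample points."""
    streams, samples = [], []
    for qp in candidate_qps:
        stream, bitrate = encode(frame, qp)     # one encoded stream per QP
        streams.append((qp, stream, bitrate))
        samples.append((qp, bitrate))           # several samples per frame
    model.update(samples)                       # per-frame model update
    return streams
```

Because one frame yields several (coding parameter, bitrate) sample points at once, the model can be updated before the next input frame is encoded, rather than waiting for samples to accumulate over multiple frames.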
  • a rate control apparatus including one or more memories storing instructions and one or more processors coupled to the one or more memories.
  • the one or more processors are configured to encode a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, update a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encode a second input frame based on the updated rate control model.
  • FIG. 1 is a schematic diagram showing a transmitting terminal according to exemplary embodiments of the disclosure.
  • FIG. 2 is a schematic block diagram showing a multi-rate encoding apparatus according to an exemplary embodiment of the disclosure.
  • FIG. 3 is a schematic block diagram showing a single-rate encoder according to exemplary embodiments of the disclosure.
  • FIG. 4 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure.
  • FIG. 5 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure.
  • FIG. 6 is a schematic diagram illustrating a process of updating a rate control model per frame according to exemplary embodiments of the disclosure.
  • FIG. 7 is a flow chart of a rate control method according to exemplary embodiments of the disclosure.
  • FIG. 8 is a flow chart showing a process of iteratively updating a rate control model according to exemplary embodiments of the disclosure.
  • FIG. 9 schematically shows a variation of a bitrate versus quantization parameter (QP) curve (R-Q curve) between frames according to exemplary embodiments of the disclosure.
  • FIG. 1 is a schematic diagram showing an exemplary transmitting terminal 100 consistent with the disclosure.
  • the transmitting terminal 100 is configured to capture images and encode the images according to a plurality of coding parameter values to generate a plurality of encoded data streams, also referred to as multiple channels of encoded data streams.
  • the images may be still images, e.g., pictures, and/or moving images, e.g., videos.
  • image is used to refer to either a still image or a moving image.
  • the coding parameter refers to a parameter associated with the encoding process, such as a quantization parameter (QP), a coding mode selection, a packet size, or the like.
  • Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of a plurality of bitrate values.
  • the transmitting terminal 100 is further configured to select one of the plurality of encoded data streams as an output data stream for transmitting over a transmission channel.
  • the transmitting terminal 100 may be integrated in a mobile body, such as an unmanned aerial vehicle (UAV), a driverless car, a mobile robot, a driverless boat, a submarine, a spacecraft, a satellite, or the like.
  • the transmitting terminal 100 may be a hosted payload carried by the mobile body that operates independently but may share the power supply of the mobile body.
  • the transmission channel may use any form of communication connection, such as an Internet connection, a cable television connection, a telephone connection, a wireless connection, or another connection capable of supporting the transmission of images.
  • the transmission channel can be a wireless channel.
  • the transmission channel may use any type of physical transmission medium, such as cable (e.g., twisted-pair wire cable and fiber-optic cable), air, water, space, or any combination of the above media.
  • When the transmitting terminal 100 is integrated in a UAV, one or more of the multiple channels of encoded data streams can be transmitted over air.
  • When the transmitting terminal 100 is a hosted payload carried by a commercial satellite, one or more of the multiple channels of encoded data streams can be transmitted over space and air.
  • When the transmitting terminal 100 is a hosted payload carried by a submarine, one or more of the multiple channels of encoded data streams can be transmitted over water.
  • the transmitting terminal 100 includes an image capturing device 110 , a multi-rate encoding apparatus 130 coupled to the image capturing device 110 , and a transceiver 150 coupled to the multi-rate encoding apparatus 130 .
  • the image capturing device 110 includes an image sensor and a lens or a lens set, and is configured to capture images.
  • the image sensor may be, for example, an opto-electronic sensor, such as a charge-coupled device (CCD) sensor, a complementary metal-oxide-semiconductor (CMOS) sensor, or the like.
  • the image capturing device 110 is further configured to send the captured images to the multi-rate encoding apparatus 130 for encoding.
  • the image capturing device 110 may include a memory for storing, either temporarily or permanently, the captured images.
  • the multi-rate encoding apparatus 130 is configured to receive the images captured by the image capturing device 110 , and encode the images according to the plurality of coding parameter values to generate the plurality of encoded data streams. Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of the plurality of bitrate values. As shown in FIG. 1 , the multi-rate encoding apparatus 130 includes a multi-rate encoder 1301 , a rate controller 1303 , and a rate selector 1305 coupled to each other. Further, the multi-rate encoder 1301 is coupled to the image capturing device 110 . The rate selector 1305 is coupled to the transceiver 150 .
  • the multi-rate encoder 1301 may receive and encode the images captured by the image capturing device 110 according to any suitable video coding standard, also referred to as video compression standard, such as Windows Media Video (WMV) standard, Society of Motion Picture and Television Engineers (SMPTE) 421-M standard, Moving Picture Experts Group (MPEG) standard, e.g., MPEG-1, MPEG-2, or MPEG-4, H.26x standard, e.g., H.261, H.262, H.263, or H.264, or another standard.
  • the video coding standard may be selected according to the video coding standard supported by a decoder, the channel conditions, the image quality requirement, and/or the like. For example, an image encoded using the MPEG standard needs to be decoded by a corresponding decoder adapted to support the appropriate MPEG standard.
  • a lossless compression format may be used to achieve a high image quality requirement, and a lossy compression format may be used to adapt to limited transmission channel bandwidth.
  • the multi-rate encoder 1301 may implement one or more different codec algorithms.
  • the selection of the codec algorithm may be based on encoding complexity, encoding speed, encoding ratio, encoding efficiency, and/or the like. For example, a fast codec algorithm may be performed in real-time on low-end hardware. A high encoding ratio algorithm may be desirable for a transmission channel with a small bandwidth.
  • the multi-rate encoder 1301 may further perform at least one of encryption, error-correction encoding, format conversion, or the like.
  • the encryption may be performed before transmission or storage to protect confidentiality.
  • FIG. 2 is a schematic block diagram showing an example of the multi-rate encoding apparatus 130 consistent with the disclosure.
  • the multi-rate encoder 1301 includes a plurality of single-rate encoders for generating the plurality of encoded data streams.
  • Each single-rate encoder can generate one of the plurality of encoded data streams having a corresponding one of the plurality of bitrates according to one of the plurality of coding parameter values.
  • the plurality of single-rate encoders may be separate parts or partially separate parts sharing one or more common circuits.
  • FIG. 3 is a schematic block diagram showing an exemplary single-rate encoder consistent with the disclosure.
  • the single-rate encoder includes a “forward path” connected by solid-line arrows and an “inverse path” connected by dashed-line arrows in the figure.
  • the “forward path” includes conducting an encoding process on an entire image frame or on a block, e.g., a macroblock (MB), of the image frame.
  • the “inverse path” includes implementing a reconstruction process, which generates context 301 for prediction of a next image frame or a next block of the next image frame.
  • An image frame refers to a complete image.
  • the terms “frame,” “image,” and “image frame” are used interchangeably.
  • the size and type of the block of the image frame may be determined according to the encoding standard that is employed. For example, a fixed-size MB covering 16×16 pixels is the basic syntax and processing unit employed in the H.264 standard. H.264 also allows the subdivision of an MB into smaller sub-blocks, down to a size of 4×4 pixels, for motion-compensation prediction.
  • An MB may be split into sub-blocks in one of four manners: 16×16, 16×8, 8×16, or 8×8.
  • The 8×8 sub-block may be further split in one of four manners: 8×8, 8×4, 4×8, or 4×4. Therefore, when the H.264 standard is used, the size of the block of the image frame can range from 16×16 to 4×4, with many options between the two as described above.
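The two-level splitting scheme above can be enumerated programmatically; a short sketch listing every block size it permits:

```python
def h264_block_sizes():
    """All (width, height) block sizes reachable by splitting a 16x16 MB
    under H.264's two-level partitioning."""
    mb_splits = [(16, 16), (16, 8), (8, 16), (8, 8)]
    sub_splits = [(8, 8), (8, 4), (4, 8), (4, 4)]   # further splits of 8x8
    sizes = set(mb_splits)
    sizes.update(sub_splits)                         # set dedupes (8, 8)
    return sorted(sizes, reverse=True)

# → [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]
```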
  • the “forward path” includes a prediction process 302 , a transformation process 303 , a quantization process 304 , and an entropy encoding process 305 .
  • a predicted block can be generated according to a prediction mode.
  • the prediction mode can be selected from a plurality of intra-prediction modes and/or a plurality of inter-prediction modes that are supported by the video encoding standard that is employed. Taking H.264 as an example, H.264 supports nine intra-prediction modes for luminance 4×4 and 8×8 blocks, including eight directional modes and an intra direct component (DC) mode that is a non-directional mode.
  • For luminance 16×16 blocks, H.264 supports four intra-prediction modes, i.e., Vertical mode, Horizontal mode, DC mode, and Plane mode. Further, H.264 supports all possible combinations of inter-prediction modes, such as variable block sizes (i.e., 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4) used in inter-frame motion estimation, different inter-frame motion estimation modes (i.e., use of integer, half, or quarter pixel motion estimation), and multiple reference frames.
  • the predicted block is created using a previously encoded block from the current frame.
  • the previously encoded block from a past or a future frame (a neighboring frame) is stored in the context 301 and used as a reference for inter-prediction.
  • a weighted sum of two or more previously encoded blocks from one or more past frames and/or one or more future frames can be stored in the context 301 for inter-prediction.
  • the prediction process 302 can also include a prediction mode selection process (not shown).
  • the prediction mode selection process can include determining whether to apply the intra-prediction or the inter-prediction on the block. In some embodiments, which one of the intra-prediction or inter-prediction is to be applied on the block can be determined according to the position of the block. For example, if the block is in the first image frame of a video or in an image frame at one of the random access points of the video, the block may be intra-coded. On the other hand, if the block is in one of the remaining frames, i.e., image frames other than the first image frame, of the video or in an image frame between two random access points, the block may be inter-coded.
  • An access point may refer to, e.g., a point in the stream of the video from which encoding or transmission of the video starts, or from which encoding or transmission of the video resumes.
  • Whether intra-prediction or inter-prediction is employed on the block can also be determined according to a transmission error, a sudden change of channel conditions, or the like. For example, if a transmission error or a sudden change of channel conditions occurs when the block is generated, the block can be intra-predicted.
  • the prediction mode selection process can further include selecting an intra-prediction mode for the block from the plurality of intra-prediction modes when intra-prediction is determined to be employed and an inter-prediction mode from the plurality of inter-prediction modes when inter-prediction is determined to be employed.
  • Any suitable prediction mode selection technique may be used here.
  • H.264 uses a Rate-Distortion Optimization (RDO) technique to select the intra-prediction mode or the inter-prediction mode that has the least rate-distortion (RD) cost for the block.
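In its general form, RDO selects the mode minimizing the cost J = D + λ·R; a minimal sketch, assuming a hypothetical `evaluate` callback that returns the distortion and rate measured for a candidate mode:

```python
def select_mode_rdo(modes, evaluate, lam):
    """Return the prediction mode with the least RD cost J = D + lam * R."""
    best_mode, best_cost = None, float("inf")
    for mode in modes:
        distortion, rate = evaluate(mode)
        cost = distortion + lam * rate
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode
```

A larger Lagrange multiplier λ penalizes rate more heavily, biasing the selection toward cheaper (if more distorted) modes.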
  • the predicted block is subtracted from the block to generate a residual block.
  • the residual block is transformed into a representation in the spatial-frequency domain (also referred to as spatial-spectrum domain), in which the residual block can be expressed in terms of a plurality of spatial-frequency domain components, e.g., cycles per spatial unit in X and Y directions.
  • Coefficients associated with the spatial-frequency domain components in the spatial-frequency domain expression are also referred to as transform coefficients.
  • Any suitable transformation method, such as a discrete cosine transform (DCT), a wavelet transform, or the like, can be used here. Taking H.264 as an example, the residual block is transformed using a 4×4 or 8×8 integer transform derived from the DCT.
  • quantized transform coefficients can be obtained by dividing the transform coefficients by a quantization step size (Q step) for associating the transformed coefficients with a finite set of quantization steps.
  • a QP can be used to determine the Q step .
  • an expected bitrate can be achieved by adjusting the value of a coding parameter, for example, the value of QP.
  • Small values of QP can more accurately approximate the spatial frequency spectrum of the residual block, i.e., more spatial detail can be retained, but at the cost of more bits and higher bitrates in the encoded data stream.
  • Large values of QP represent big step sizes that crudely approximate the spatial frequency spectrum of the residual block, such that most of the spatial detail of the residual block can be captured by only a few quantized transform coefficients. That is, as the value of QP increases, some spatial detail is aggregated such that the bitrate drops, but at the price of loss of quality.
  • H.264 allows a total of 52 possible values of QP, which are 0, 1, 2, . . . , 51, and each unit increase of QP increases the Q step by approximately 12% and reduces the bitrate by roughly 12%.
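The roughly 12%-per-unit growth follows from the Q step doubling for every increase of 6 in QP (2^(1/6) ≈ 1.122), with a step size of about 0.625 at QP 0; a sketch of this widely documented mapping (the function name is illustrative):

```python
def h264_qstep(qp: int) -> float:
    """Approximate H.264 quantization step size: the step doubles for
    every increase of 6 in QP, starting from about 0.625 at QP 0."""
    assert 0 <= qp <= 51, "H.264 allows QP values 0..51"
    return 0.625 * 2 ** (qp / 6)

# Each unit increase of QP grows the step by about 12%:
ratio = h264_qstep(1) / h264_qstep(0)   # 2**(1/6), roughly 1.122
```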
  • the quantized transform coefficients are entropy encoded.
  • the quantized transform coefficients may be reordered (not shown) before entropy encoding.
  • the entropy encoding can convert symbols into binary codes, e.g., a data stream or a bitstream, which can be easily stored and transmitted.
  • Context-adaptive variable-length coding (CAVLC) is used in the H.264 standard to generate data streams.
  • the symbols that are to be entropy encoded include, but are not limited to, the quantized transform coefficients, information for enabling the decoder to recreate the prediction (e.g., selected prediction mode, partition size, and the like), information about the structure of the data stream, information about a complete sequence (e.g., MB headers), and the like.
  • the “inverse path” includes an inverse quantization process 306 , an inverse transformation process 307 , and a reconstruction process 308 .
  • the quantized transform coefficients are inversely quantized and inversely transformed to generate a reconstructed residual block.
  • the inverse quantization is also referred to as a re-scaling process, where the quantized transform coefficients are multiplied by Q step to obtain rescaled coefficients, respectively.
  • the rescaled coefficients are inversely transformed to generate the reconstructed residual block.
  • An inverse transformation method corresponding to the transformation method used in the transformation process 303 can be used here.
  • an inverse integer DCT can be used in the inverse transformation process 307 .
  • the reconstructed residual block is added to the predicted block in the reconstruction process 308 to create a reconstructed block, which is stored in the context 301 as a reference for prediction of the next block.
  • the single-rate encoder may be a codec. That is, the single-rate encoder may also include a decoder (not shown).
  • the decoder conceptually works in a reverse manner including an entropy decoder (not shown) and the processing elements defined within the reconstruction process, shown by the “inverse path” in FIG. 3 . The detailed description thereof is omitted here.
  • FIG. 4 is a schematic block diagram showing another example of the multi-rate encoding apparatus 130 consistent with the disclosure.
  • the multi-rate encoder 1301 includes the plurality of single-rate encoders that share a common circuit 310 and have separate processing circuits 330 to generate the plurality of encoded data streams with different bitrates.
  • the processing circuit 330 can perform the transform process 303 , the quantization process 304 , the entropy encoding process 305 , the inverse quantization process 306 , the inverse transform process 307 , and the reconstruction process 308 .
  • the common circuit 310 can perform the prediction process 302 , of which the computational complexity and the computing resource consumption may account for about 70% of the overall calculations of the single-rate encoder. As such, the multi-rate encoder 1301 with the structure shown in FIG. 4 and described above can reduce resource consumption.
  • the rate controller 1303 is configured to adjust the plurality of coding parameter values of the multi-rate encoder 1301 to control the plurality of bitrate values of the plurality of encoded data streams generated by the multi-rate encoder 1301 , according to a rate control model.
  • the rate control model characterizes a correspondence between coding parameter and bitrate.
  • the rate controller 1303 can implement a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below.
  • the rate controller 1303 can be coupled to the plurality of single-rate encoders and can be configured to adjust the coding parameter value of each single-rate encoder to control the bitrate value of the encoded data stream generated by each single-rate encoder, according to the rate control model.
  • the rate selector 1305 is configured to select one of the plurality of encoded data streams as the output data stream based on, for example, a current channel capacity, a current channel bandwidth, a transmission latency, and/or the like, and send the output data stream to the transceiver 150 for transmitting.
  • the rate selector 1305 can be also configured to obtain feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from the transceiver 150 .
  • the rate selector 1305 can be coupled to the plurality of single-rate encoders and can be configured to select one of the plurality of encoded data streams as the output data stream from the corresponding single-rate encoder based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like.
  • the transceiver 150 is configured to obtain the output data stream from the rate selector 1305 and transmit the output data stream over the transmission channel.
  • the transceiver 150 is further configured to receive the feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from a receiving terminal over the transmission channel, and send the feedback information to the rate selector 1305 .
  • the transceiver 150 can include a transmitter and a receiver, and can be configured to have two-way communications capability, i.e., can both transmit and receive data.
  • the transmitter and the receiver may share common circuitry.
  • the transmitter and the receiver may be separate parts sharing a single housing.
  • the transceiver 150 may work in any suitable frequency band, for example, the microwave band, millimeter-wave band, centimeter-wave band, optical wave band, or the like.
  • the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 can be separate devices, or any two or all of them can be integrated in one device.
  • the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 are separate devices that can be connected or coupled to each other through wired or wireless means.
  • the image capturing device 110 can be a camera, a camcorder, or a smartphone having a camera function.
  • FIG. 5 is a schematic block diagram showing another example of the multi-rate encoding apparatus 130 consistent with the disclosure.
  • the multi-rate encoding apparatus 130 includes one or more processors 130 - 1 and one or more memories 130 - 2 .
  • the one or more processors 130 - 1 can include any suitable hardware processor, such as a microprocessor, a micro-controller, a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the one or more memories 130 - 2 store computer program codes that, when executed by the one or more processors, control the one or more processors to perform a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below, and the encoding functions of the method consistent with the disclosure.
  • the one or more memories can include a non-transitory computer-readable storage medium, such as a random access memory (RAM), a read only memory, a flash memory, a volatile memory, a hard disk storage, or an optical medium.
  • the transceiver 150 can be an independent device combining a transmitter and a receiver in a single package.
  • the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 are integrated in a same electronic device.
  • the image capturing device 110 may include an image sensor and a lens or a lens set of the electronic device.
  • the multi-rate encoding apparatus 130 may be implemented by one or more single-chip encoders, one or more single-chip codecs, one or more image processors, one or more image processing engines, or the like, which are integrated in the electronic device.
  • the transceiver 150 may be implemented by an integrated circuit, a chip, or a chipset that is integrated in the electronic device.
  • the electronic device may be a smartphone having a built-in camera and a motherboard that integrates the multi-rate encoding apparatus 130 and the transceiver 150 .
  • any two of the image capturing device 110 , the multi-rate encoding apparatus 130 , and the transceiver 150 are integrated in a same electronic device.
  • the image capturing device 110 can be a camera or a camcorder that is coupled to an electronic device having a motherboard that integrates the multi-rate encoding apparatus 130 and the transceiver 150 .
  • a rate control method consistent with the disclosure can be implemented in a multi-rate encoding apparatus consistent with the disclosure.
  • the multi-rate encoding apparatus can be configured as a portion of a transmitting terminal.
  • the multi-rate encoding apparatus and the transmitting terminal can be, for example, the multi-rate encoding apparatus 130 and the transmitting terminal 100 described above.
  • the bitrate of an encoded data stream can be controlled by controlling a coding parameter, such as a quantization parameter, used for encoding image frames.
  • a coding parameter such as a quantization parameter
  • the coding parameter can be selected according to a rate control model describing correspondences between coding parameters and bitrates.
  • the rate control model can also be updated during the encoding process based on calculation/encoding results during the encoding process.
  • the rate control model can be updated based on the encoding process of one frame or based on the encoding process of a plurality of frames.
  • FIG. 6 schematically illustrates a process for updating the rate control model per frame consistent with the disclosure.
  • a frame 610 is encoded using a plurality of coding parameter values (denoted using letters CP 1 , CP 2 , . . . , and CP N in FIG. 6 ) to generate a plurality of encoded data streams 630 having a plurality of bitrate values (denoted using letters R 1 , R 2 , . . . , and R N in FIG. 6 ).
  • CP 1 corresponds to R 1
  • CP 2 corresponds to R 2
  • CP N corresponds to R N
  • the plurality of (CP i , R i ) pairs form a plurality of sample points 650 , which can then be applied to a rate control model 670 for determining/updating parameters of the rate control model 670 according to, for example, a fitting method.
  • the parameters of the rate control model can be updated or estimated per frame.
  • a frame-level rate control that can stabilize the bitrate per frame at an expected bitrate can be achieved.
  • the frame-level rate control can avoid frequent playback stops at the receiving terminal due to large transmission jitter.
  • the overall perceptual quality of a video can be enhanced and the user experience can be improved.
  • FIG. 7 is a flow chart of an exemplary rate control method 700 consistent with the disclosure.
  • a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above, can control a plurality of coding parameter values of a multi-rate encoder, such as the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above, according to which a plurality of encoded data streams having a corresponding plurality of bitrate values can be generated by the multi-rate encoder.
  • a rate selector such as the rate selector 1305 of the multi-rate encoding apparatus 130 described above, can select one of the plurality of encoded data streams as the output data stream based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like.
  • the first plurality of coding parameter values may include a plurality of coding parameter values for encoding a first input frame.
  • the first input frame can be a first one of image frames captured by an image capturing device and sent to a multi-rate encoder for encoding.
  • the image capturing device can be, for example, the image capturing device 110 described above.
  • the multi-rate encoder can be, for example, the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above.
  • the first input frame can be an image frame in the stream of a video from which the video is started to be encoded or transmitted, or from which the video is resumed to be encoded or transmitted. In some other embodiments, the first input frame can be any one of image frames captured by the image capturing device or any image frame in the stream of a video.
  • the first plurality of coding parameter values are provided by a rate control model based at least in part on an expected bitrate for the first input frame (also referred to as a “first expected bitrate”). That is, one of the first plurality of coding parameter values is provided by the rate control model based on an expected bitrate for the first input frame, which can also be referred to as a “first main coding parameter value.”
  • the remaining ones of the first plurality of coding parameter values, i.e., those of the first plurality of coding parameter values other than the first main coding parameter value, can be referred to as first auxiliary coding parameter values.
  • the rate control model can include a quantizer-domain (Q-domain) rate control model (also referred to as a rate-quantization (R-Q) model) that characterizes the relationship between bitrate and QP, a rho-domain (ρ-domain) rate control model that characterizes the relationship between bitrate and the parameter ρ (the percentage of zeros among the quantized transform coefficients), or a lambda-domain (λ-domain) rate control model (also referred to as a rate-lambda (R-λ) model) that characterizes the relationship between bitrate and the Lagrange multiplier λ corresponding to the QP for each frame.
  • the rate control model may have initial parameters that are pre-stored in a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above.
  • a coding parameter value corresponding to the expected bitrate value for the first input frame obtained according to the rate control model can be set as the first main coding parameter value.
  • the coding parameter can be the QP and the R-Q model can be expressed as an exponential of a second-order polynomial, for example R = e^(a·Q² + b·Q + c), in which:
  • R represents the value of bitrate
  • Q represents the value of QP
  • a, b, and c represent parameters.
  • the coding parameter value corresponding to the expected bitrate value for the first input frame can be calculated from the above exponential of the second-order polynomial with initial a, b, and c values.
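The calculation just described can be sketched as follows; the closed-form inversion of the model R = exp(a·Q² + b·Q + c) and the parameter values a, b, c used here are illustrative assumptions, not values from the disclosure:

```python
import math

def qp_from_bitrate(r_expected, a, b, c, qp_range=(0, 51)):
    """Solve R = exp(a*Q^2 + b*Q + c) for Q given an expected bitrate.

    Taking logarithms turns the model into the quadratic
    a*Q^2 + b*Q + (c - ln R) = 0, solved here in closed form; the root
    inside the valid QP range is kept, clamped, and rounded."""
    k = c - math.log(r_expected)
    disc = b * b - 4.0 * a * k
    if disc < 0:
        raise ValueError("expected bitrate unreachable under this model")
    roots = [(-b + s * math.sqrt(disc)) / (2.0 * a) for s in (1.0, -1.0)]
    lo, hi = qp_range
    inside = [q for q in roots if lo <= q <= hi]
    q = inside[0] if inside else roots[1]  # fall back to the smaller root
    return int(round(min(max(q, lo), hi)))

# Illustrative initial parameters: bitrate falls as QP rises on [0, 51].
a, b, c = 0.001, -0.12, 10.0
qp = qp_from_bitrate(5000.0, a, b, c)  # main coding parameter value
```

The QP range of 0 to 51 follows common codec conventions and is likewise an assumption.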
  • the expected bitrate for the first input frame can be a preset bitrate. In some embodiments, the expected bitrate for the first input frame can be obtained from a user input. In some other embodiments, the expected bitrate for the first input frame can be determined based on, for example, the channel capacity, the channel bandwidth, the transmission latency, or the like.
  • the first auxiliary coding parameter values can be gradually deviated from the first main coding parameter value at a coding parameter interval.
  • the first auxiliary coding parameter values can be gradually stepped down or stepped up from the first main coding parameter value and arranged at the coding parameter interval.
  • one or some of the first auxiliary coding parameter values can be obtained by gradually stepping down from the first main coding parameter value and one or some of the first auxiliary coding parameter values can be obtained by gradually stepping up from the first main coding parameter value.
  • the determination of the preset coding parameter interval may be a tradeoff between the computation complexity and the estimation accuracy of the parameters of the rate control model. For example, a large coding parameter interval leads to a small number of coding parameter values, which can reduce the computational burden, but may increase the estimation error of the parameters of the rate control model. On the other hand, a fine coding parameter interval can generate a plurality of coding parameter values that are densely distributed over a certain range, which can reduce the estimation error of the parameters of the rate control model, but may increase the computational burden.
  • the coding parameter interval can be a constant interval, i.e., the interval between each pair of neighboring coding parameter values is the same.
  • the coding parameter interval can be a variable interval, i.e., the interval between each pair of neighboring coding parameter values may vary from pair to pair, or may be the same among some pairs of neighboring coding parameter values but different among some other pairs.
  • the interval can be, for example, varied with the curvature of a curve of the coding parameter versus the bitrate.
  • a large interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively small curvature
  • a fine interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively large curvature
  • the first auxiliary coding parameter values are obtained by first obtaining the first main coding parameter value and then calculating the first auxiliary coding parameter values according to a preset interval.
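The interval-based derivation just described can be sketched as follows; the alternating down/up order, the default interval of 2, and the QP range are illustrative assumptions:

```python
def auxiliary_qps(main_qp, count, interval=2, qp_range=(0, 51)):
    """Derive auxiliary QP values by alternately stepping down and up
    from the main QP at a fixed interval, skipping candidates that
    fall outside the codec's valid QP range."""
    lo, hi = qp_range
    aux = []
    step = 1
    while len(aux) < count and step * interval <= hi - lo:
        for cand in (main_qp - step * interval, main_qp + step * interval):
            if lo <= cand <= hi and len(aux) < count:
                aux.append(cand)
        step += 1
    return aux

# Main QP plus four auxiliaries arranged around it.
qps = [30] + auxiliary_qps(30, count=4)
```

A variable interval could be substituted by making `interval` a function of the step index, e.g. widening where the R-Q curve is flatter.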
  • the first auxiliary coding parameter values can be obtained based on selected bitrates for the first input frame. Such selected bitrates for the first input frame are also referred to as “first auxiliary bitrate values.”
  • the first auxiliary bitrate values can be gradually deviated from the expected bitrate for the first input frame at a bitrate interval. Similar to the coding parameter interval, the bitrate interval also can be constant or variable, and can be determined in a similar manner as determining the coding parameter interval.
  • the first auxiliary bitrate values can be obtained by gradually stepping down and/or stepping up from the expected bitrate for the first input frame and arranged at the bitrate interval.
  • the first auxiliary coding parameter values corresponding to the first auxiliary bitrate values can be calculated according to the rate control model.
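A sketch of this bitrate-driven alternative follows, assuming a hypothetical exponential R-Q model R = exp(a·q² + b·q + c) and auxiliary bitrates stepped at 20% of the expected bitrate; the inversion uses bisection, which relies only on the model decreasing monotonically over the QP range:

```python
import math

def model_bitrate(q, a, b, c):
    """Hypothetical R-Q model: R = exp(a*q^2 + b*q + c)."""
    return math.exp(a * q * q + b * q + c)

def qp_for_bitrate(r, a, b, c, lo=0.0, hi=51.0):
    """Invert the model by bisection, assuming the bitrate decreases
    monotonically with QP over [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if model_bitrate(mid, a, b, c) > r:
            lo = mid  # bitrate still too high -> need a larger QP
        else:
            hi = mid
    return round((lo + hi) / 2.0)

a, b, c = 0.001, -0.12, 10.0      # illustrative model parameters
r_expected = 5000.0
# Auxiliary bitrates stepped down/up from the expected bitrate in 20% steps.
aux_rates = [r_expected * f for f in (0.6, 0.8, 1.2, 1.4)]
aux_qps = [qp_for_bitrate(r, a, b, c) for r in aux_rates]
```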
  • the first input frame is encoded using the first plurality of coding parameter values to generate a first plurality of encoded data streams.
  • Each of the first plurality of encoded data streams is generated using a corresponding one of the first plurality of coding parameter values and has a corresponding one of a first plurality of bitrate values.
  • the first input frame can be intra-encoded using the first plurality of coding parameter values to generate the first plurality of encoded data streams.
  • encoding the first input frame using one of the first plurality of coding parameter values can include a prediction process, a transformation process, a quantization process, and an entropy encoding process.
  • the encoding processes of the first input frame using the first plurality of coding parameter values can be separate processes and implemented in parallel.
  • the multi-rate encoding apparatus 130 can include a plurality of separate single-rate encoders, each single-rate encoder can be used to encode the first input frame using one of the first plurality of coding parameter values to generate a corresponding one of the first plurality of encoded data streams.
  • the encoding processes of the first input frame using the first plurality of coding parameter values can include at least one common process.
  • the encoding processes of the first input frame using the first plurality of coding parameter values can share a common prediction process in the common circuit 310 and use separate transformation processes, separate quantization processes, and separate entropy encoding processes in the separate processing circuits 330 - 1 , 330 - 2 , . . . 330 -N to generate the first plurality of encoded data streams.
  • the computational complexity and the computing resource consumption can be reduced by sharing the common prediction circuit 310 .
  • the rate control model can be updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. That is, updated parameters of the rate control model can be obtained based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. Thus, an updated rate control model can be generated with the updated parameters.
  • an updated R-Q curve can be modeled by the least-squares fitting of the exponential of the second-order polynomial described above based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, such that the updated values of a, b, and c can be obtained.
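The least-squares fit amounts to a linear regression once the bitrate is taken in the log domain. Below is a stdlib-only sketch solving the 3×3 normal equations by Cramer's rule, exercised on synthetic sample points; a library polynomial-fit routine would serve equally well:

```python
import math

def fit_rq_model(samples):
    """Least-squares fit of ln R = a*Q^2 + b*Q + c to (QP, bitrate)
    pairs, i.e. a quadratic fit after taking the log of the rate.
    Solves the 3x3 normal equations by Cramer's rule."""
    s = [sum(q ** k for q, _ in samples) for k in range(5)]            # sum Q^k
    t = [sum((q ** k) * math.log(r) for q, r in samples) for k in range(3)]
    m = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    v = [t[2], t[1], t[0]]

    def det3(x):
        return (x[0][0] * (x[1][1] * x[2][2] - x[1][2] * x[2][1])
              - x[0][1] * (x[1][0] * x[2][2] - x[1][2] * x[2][0])
              + x[0][2] * (x[1][0] * x[2][1] - x[1][1] * x[2][0]))

    d = det3(m)
    out = []
    for col in range(3):
        mc = [list(row) for row in m]
        for i in range(3):
            mc[i][col] = v[i]
        out.append(det3(mc) / d)
    return tuple(out)  # (a, b, c)

# Synthetic (QP, bitrate) samples generated from known parameters.
true_params = (0.001, -0.12, 10.0)
pts = [(q, math.exp(true_params[0] * q * q + true_params[1] * q + true_params[2]))
       for q in (20, 24, 28, 32, 36)]
a, b, c = fit_rq_model(pts)  # recovers approximately (0.001, -0.12, 10.0)
```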
  • the rate control model is updated once using one input frame and then used for encoding the next input frame.
  • one updating may not be enough to create an updated rate control model that closely models the actual correspondence relationship between the coding parameter and the bitrate.
  • the degree of approximation between the updated rate control model and the actual correspondence relationship between the coding parameter and the bitrate can be determined by a difference between a first actual bitrate value and the expected bitrate value.
  • the first actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value. Therefore, in some other embodiments, the rate control model can be iteratively updated using the first input frame until a difference between the first actual bitrate value and the expected bitrate value is within a preset range, e.g., smaller than a preset value.
  • one of the first plurality of encoded data streams is selected as an output data stream for the first input frame.
  • the selection of the one of the first plurality of encoded data streams can be based on, for example, the expected bitrate value for the first input frame, the channel capacity, the channel bandwidth, the transmission latency, and/or the like.
  • the output data stream for the first input frame can be selected from the first plurality of encoded streams according to the expected bitrate value for the first input frame.
  • the one of the first plurality of encoded streams obtained by encoding the first input frame using the first main coding parameter value (also referred to as a “first main encoded data stream”) can be directly selected as the output data stream.
  • the bitrate of the first main encoded data stream may differ from the expected bitrate for the first input frame by a relatively large value.
  • the expected bitrate for the first input frame can be applied again to the updated rate control model to obtain a new coding parameter value for encoding the first input frame, and the resulting encoded data stream can be output as the output data stream for the first input frame.
  • the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is closest to the expected bitrate value for the first input frame.
  • the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is not more than and closest to the expected bitrate value for the first input frame.
  • the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that has a difference from the expected bitrate value for the first input frame within a preset range.
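The "not more than and closest" selection policy described above can be sketched as follows; the fall-back to the overall closest stream when every bitrate exceeds the expected value is an assumption, not specified in the text:

```python
def select_stream(bitrates, r_expected):
    """Pick the index of the encoded stream whose bitrate is closest
    to the expected bitrate without exceeding it; if every stream
    exceeds the expected bitrate, fall back to the overall closest."""
    under = [i for i, r in enumerate(bitrates) if r <= r_expected]
    pool = under if under else range(len(bitrates))
    return min(pool, key=lambda i: abs(bitrates[i] - r_expected))

idx = select_stream([3000, 4200, 5100, 6400], r_expected=5000)
```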
  • the output data stream for the first input frame can be selected from the first plurality of encoded streams according to a current channel bandwidth.
  • the output data stream for the first input frame can be one of the first plurality of encoded streams that matches the current channel bandwidth.
  • the output data stream can be adapted to the time-varying channel bandwidth in real-time. That is, when the channel bandwidth varies with time, the output data stream can match the channel bandwidth in real-time.
  • the output data stream for the first input frame may be selected according to the current channel bandwidth and a target latency.
  • the target latency may also be referred to as a control target of the latency, which represents an expected transmission latency.
  • the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is closest to the target latency.
  • the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is not more than and closest to the target latency.
  • the output data stream for the first input frame may be the one with the highest bitrate among the first plurality of encoded streams for which the difference between the target latency and the transmission latency under the current channel bandwidth is within a preset range. Because a higher bitrate generally corresponds to a higher encoding quality, this approach can ensure that the encoded data with the highest encoding quality is selected when the target latency is satisfied.
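A sketch of this highest-bitrate-within-latency-tolerance selection follows; approximating the transmission latency as frame size divided by channel bandwidth is a simplifying assumption:

```python
def select_by_latency(bitrates, frame_bits, bandwidth, target_latency, tol):
    """Among streams whose transmission latency under the current
    bandwidth is within `tol` of the target latency, return the index
    of the one with the highest bitrate (highest quality).  Latency is
    approximated as frame size / channel bandwidth."""
    candidates = [
        i for i, bits in enumerate(frame_bits)
        if abs(bits / bandwidth - target_latency) <= tol
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda i: bitrates[i])

best = select_by_latency([1500, 2500, 3500, 5000],      # bitrates (kbit/s)
                         [15000, 25000, 35000, 50000],  # frame sizes (bits)
                         bandwidth=1e6, target_latency=0.03, tol=0.01)
```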
  • the output data stream for the first input frame may be selected according to the channel bandwidth, the target latency, and the encoding quality. That is, the selection of the output data stream for the first input frame can be based on a combination of the requirements of the channel bandwidth, the target latency, and the encoding quality.
  • a cost function may be determined according to the channel bandwidth, the target latency, the encoding quality, and a target bitrate.
  • the output data stream for the first input frame may be one of the first plurality of encoded data with the smallest value of the cost function.
  • the cost function may be, for example, Cost = A·|R − R_target| + B·|L − L_target|, where R and L denote the bitrate and the transmission latency under the current channel bandwidth of an encoded data stream, R_target and L_target denote the target bitrate and the target latency, and A and B are weighting coefficients.
  • the values of A and B can be adjusted to bias towards the requirement of the encoding quality or the requirement of the latency control, e.g., the values of A and B can be adjusted to give more weight to the requirement of the encoding quality or to the requirement of the latency control in the calculation of Cost.
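One plausible realization of such a cost-based selection is sketched below, using a weighted sum of a bitrate error (standing in for the encoding-quality term) and a latency error; the exact cost form, the latency approximation, and all numeric values are illustrative assumptions:

```python
def select_by_cost(bitrates, frame_bits, bandwidth, r_target, l_target, a_w, b_w):
    """Score each stream with Cost = a_w*|R - R_target| + b_w*|L - L_target|
    (latency approximated as frame size / bandwidth) and return the
    index of the minimum-cost stream.  a_w biases toward quality,
    b_w toward latency control."""
    def cost(i):
        latency = frame_bits[i] / bandwidth
        return a_w * abs(bitrates[i] - r_target) + b_w * abs(latency - l_target)
    return min(range(len(bitrates)), key=cost)

best = select_by_cost([3000, 4200, 5100, 6400],          # bitrates (kbit/s)
                      [100000, 140000, 170000, 213000],  # frame sizes (bits)
                      bandwidth=5e6, r_target=5000, l_target=0.03,
                      a_w=1.0, b_w=100000.0)
```

Raising `b_w` relative to `a_w` shifts the selection toward streams that meet the target latency even at the cost of a larger bitrate error, and vice versa.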
  • a reconstructed frame obtained from the output data stream for the first input frame can be used as the context of a second input frame. That is, a reconstructed frame obtained from the output data stream for the first input frame can be used as a reference for the prediction of the second input frame.
  • a second input frame is encoded based on the updated rate control model. For example, a second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model, and the second input frame can be encoded using the second plurality of coding parameter values.
  • the second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model and an expected bitrate for the second input frame (also referred to as a “second expected bitrate”).
  • an expected bitrate for the second input frame also referred to as a “second expected bitrate”.
  • one of the second plurality of coding parameter values can be determined based on the updated rate control model and the expected bitrate for the second input frame, which can be referred to as a second main coding parameter value.
  • the remaining ones of the second plurality of coding parameter values, i.e., the coding parameter values of the second plurality of coding parameter values other than the second main coding parameter value can be referred to as second auxiliary coding parameter values.
  • the coding parameter value corresponding to the expected bitrate for the second input frame calculated from the exponential of the second-order polynomial with the updated a, b, and c parameters can be set as the second main coding parameter value.
  • the second main coding parameter value and the first main coding parameter value are for a same encoding channel.
  • the same encoding channel refers to, for example, a same single-rate encoder included in the multi-rate encoder as shown in FIG. 2 , or a same processing circuit included in the multi-rate encoder as shown in FIG. 4 .
  • the second auxiliary coding parameter values can be gradually deviated from the second main coding parameter value at a coding parameter interval.
  • the second auxiliary coding parameter values can be obtained by gradually stepping up and/or stepping down from the second main coding parameter value and arranged at the coding parameter interval.
  • the coding parameter interval for the second auxiliary coding parameter values can be a constant interval or a variable interval.
  • the coding parameter interval for the second auxiliary coding parameter values can be determined in a similar manner as that for determining the coding parameter interval for the first auxiliary coding parameter values, and thus the detailed description thereof is omitted.
  • the second auxiliary coding parameter values can be obtained based on selected bitrates for the second input frame, also referred to as “second auxiliary bitrate values.” Similar to the first auxiliary bitrate values, the second auxiliary bitrate values can be gradually deviated from the expected bitrate for the second input frame at a bitrate interval. The second auxiliary coding parameter values corresponding to the second auxiliary bitrate values can be calculated based on the rate control model. For example, the second auxiliary bitrate values can be obtained by gradually stepping up and/or stepping down from the expected bitrate for the second input frame.
  • the bitrate interval for the second auxiliary bitrate values can be a constant interval or a variable interval.
  • the bitrate interval for the second auxiliary bitrate values can be determined in a similar manner as that for determining the bitrate interval for the first auxiliary bitrate values, and thus the detailed description thereof is omitted.
  • the second input frame can be inter-encoded and/or intra-encoded using the second plurality of coding parameter values to generate a second plurality of encoded data streams.
  • encoding the second input frame using one of the second plurality of coding parameter values can include the prediction process, the transformation process, the quantization process, and the entropy encoding process.
  • One of the second plurality of encoded data streams can be selected as an output data stream corresponding to the second input frame.
  • the selection of the output data stream corresponding to the second input frame can be similar to the selection of the output data stream corresponding to the first input frame, and thus detailed description thereof is omitted.
  • the rate control model may also vary between image frames. That is, the updated rate control model obtained based on the first input frame may not accurately reflect the correspondence relationship between coding parameter and bitrate in the second input frame.
  • the rate control model can be further updated based on the second plurality of coding parameter values and the second plurality of bitrate values respectively corresponding to the second plurality of encoded data streams.
  • the second main coding parameter value may be iteratively adjusted until the difference between a second actual bitrate and the expected bitrate value for the second input frame is within a preset range.
  • the second actual bitrate value refers to a bitrate value obtained by encoding the second input frame using the second main coding parameter value.
  • FIG. 8 is a flow chart showing a process of iteratively updating the rate control model consistent with the disclosure.
  • the rate control model can be first updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams [denoted as letters (CP 1 i , R 1 i )] obtained according to the approaches ( 705 ) described above.
  • the second plurality of coding parameter values (denoted as CP 2 i ) can be determined based on the updated rate control model and the expected bitrate for the second input frame as described above.
  • the coding parameter value calculated from the expected bitrate for the second input frame using the updated rate control model is the second main coding parameter value.
  • the second frame can be encoded using CP 2 i to generate the second plurality of data streams having a plurality of actual bitrates (denoted as R 2 i ), including the data stream having the second actual bitrate value generated by encoding the second frame using the second main coding parameter value. If the difference between the second actual bitrate value and the expected bitrate value for the second input frame falls outside the preset range, the rate control model is updated according to the (CP 2 i , R 2 i ) pairs.
  • the second main coding parameter value can be updated to a coding parameter corresponding to the expected bitrate value for the second input frame calculated from the further updated rate control model.
  • when the difference between the second actual bitrate value and the expected bitrate value for the second input frame falls within the preset range, the iterative adjustment process can be stopped and one of the second plurality of data streams is output as the output data stream, as shown in FIG. 8 .
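The iterative adjustment can be illustrated with a simplified feedback loop standing in for the full model refit of FIG. 8: the main QP is stepped until the measured bitrate falls within a preset tolerance of the expected bitrate. The stand-in "encoder" curve and the tolerance are illustrative assumptions:

```python
import math

def actual_rate(qp):
    """Stand-in for encoding the frame at the given QP and measuring
    the resulting bitrate (the unknown 'curve 2' of FIG. 9)."""
    return math.exp(0.0012 * qp * qp - 0.14 * qp + 10.5)

def converge_main_qp(qp_start, r_expected, tol, qp_range=(0, 51)):
    """Step the main QP until the measured bitrate is within `tol` of
    the expected bitrate: a simplified +/-1-QP feedback loop.  `tol`
    must exceed the bitrate change of a single QP step, or the loop
    may oscillate until the iteration bound is hit."""
    lo, hi = qp_range
    qp = qp_start
    for _ in range(hi - lo + 1):
        r = actual_rate(qp)
        if abs(r - r_expected) <= tol:
            break
        # Too many bits -> coarser quantization (raise QP); too few -> lower it.
        qp = min(max(qp + (1 if r > r_expected else -1), lo), hi)
    return qp, actual_rate(qp)

qp, rate = converge_main_qp(22, r_expected=5000.0, tol=300.0)
```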
  • FIG. 9 schematically shows a variation of a bitrate-versus-QP curve (R-Q curve or R-Q model, i.e., an example of the rate control model) between frames.
  • curve 1 represents the R-Q model obtained/updated based on the first input frame, i.e., a curve created by fitting the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams.
  • the R-Q model may move from curve 1 to curve 2 .
  • Curve 2 is an actual R-Q curve corresponding to the second input frame, which is yet unknown.
  • as shown in FIG. 9 , curve 1 gives a corresponding second main coding parameter value QP e for the expected bitrate.
  • if QP e is used for encoding the second input frame, the obtained encoded data stream will have an actual bitrate according to curve 2 that is different from the expected bitrate R e for the second input frame.
  • Curve 1 can be iteratively updated according to, e.g., the method described above in connection with FIG. 8 to obtain curve 2 or a curve close to curve 2 . Thereafter, according to the obtained curve 2 or the obtained curve close to curve 2 , the second main coding parameter value QP e1 that can result in the expected bitrate R e for the second input frame can be obtained.
  • obtaining the first plurality of coding parameter values can further include iteratively adjusting the first main coding parameter value until the difference between an actual bitrate and the expected bitrate value for the first input frame is within a preset range.
  • An actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value.
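The iterative per-frame adjustment described in the bullets above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: `encode`, the one-step QP update, and the stopping rule are all stand-ins for the model-driven update the disclosure describes.

```python
# Minimal sketch (assumptions, not the patented method) of iteratively
# adjusting the main coding parameter until the actual bitrate falls
# within a preset range of the expected bitrate.
def iterate_main_qp(encode, expected_rate, qp, tolerance, max_iters=8):
    """encode(qp) -> actual bitrate of the stream produced with that QP."""
    rate = encode(qp)
    for _ in range(max_iters):
        if abs(rate - expected_rate) <= tolerance:
            break
        # Crude stand-in for a model update: a larger QP lowers the bitrate.
        qp = qp + 1 if rate > expected_rate else qp - 1
        rate = encode(qp)
    return qp, rate

# Toy "encoder" whose bitrate falls linearly with QP (purely illustrative).
qp, rate = iterate_main_qp(lambda q: 1000 - 40 * q, 500, 10, 30)
```

In the disclosure, each failed comparison also refines the rate control model from the newly observed (coding parameter, bitrate) pairs; the sketch replaces that refinement with a unit QP step for brevity.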


Abstract

A method for rate control includes encoding a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, updating a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encoding a second input frame based on the updated rate control model.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of International Application No. PCT/CN2018/072444, filed Jan. 12, 2018, which claims priority to International Application No. PCT/CN2017/071491, filed Jan. 18, 2017, the entire contents of both of which are incorporated herein by reference.
  • COPYRIGHT NOTICE
  • A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
  • TECHNICAL FIELD
  • The present disclosure relates to data encoding and, more particularly, to a method and apparatus for rate control, a multi-rate encoding apparatus, and a transmitting terminal.
  • BACKGROUND
  • One challenge in a low-latency video/image transmission system is that a condition of a channel, such as channel bandwidth, varies over time, particularly for a wireless channel. Many factors affect the wireless channel, such as physical distance, relative position, and obstacles/occlusions between receiving and transmitting terminals, immediate electromagnetic interference, and the like. Furthermore, a data source of the transmission also varies over time. The source time-variation and the channel time-variation are independent of each other and are difficult to predict, causing difficulties in adapting source encoding to the channel bandwidth in real time. For example, when the channel is stable, a sudden movement of the camera or a large movement of an object in the camera view leads to a sudden change in the size of the encoded data stream. If the size of the data stream is doubled, the transmission latency/delay is doubled accordingly. When the source is stable, the size of the data stream remains constant, but a sudden channel variation can still cause transmission jitter (transmission latency that varies over time). If the channel bandwidth is reduced by half, the transmission latency doubles accordingly.
  • Rate control technologies that adapt encoding rate to the channel bandwidth in real time have been widely used in wireless video transmission applications to ensure smooth transmission over unreliable channels. Conventional rate control technologies only control the overall average bitrate of a group of frames (e.g., multiple frames). Because only one sample point including two elements, e.g., a coding parameter value and a corresponding bitrate value, is generated per frame, several sample points need to be generated from multiple frames over a given time period for estimating the parameters of the rate control model. As such, the conventional rate control technologies stabilize the average bitrate over a given time period (e.g., multiple frames) at an expected bitrate to ensure that the overall jitter averaged over multiple frames or a period of time is small. However, low-latency video transmission requires stabilizing the bitrate per frame within a certain range to avoid large transmission jitter that causes the playback to stop frequently at the receiving terminal.
  • SUMMARY
  • In accordance with the disclosure, there is provided a rate control method including encoding a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, updating a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encoding a second input frame based on the updated rate control model.
  • Also in accordance with the disclosure, there is provided a rate control apparatus including one or more memories storing instructions and one or more processors coupled to the one or more memories. The one or more processors are configured to encode a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values, update a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, and encode a second input frame based on the updated rate control model.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic diagram showing a transmitting terminal according to exemplary embodiments of the disclosure.
  • FIG. 2 is a schematic block diagram showing a multi-rate encoding apparatus according to an exemplary embodiment of the disclosure.
  • FIG. 3 is a schematic block diagram showing a single-rate encoder according to exemplary embodiments of the disclosure.
  • FIG. 4 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure.
  • FIG. 5 is a schematic block diagram showing a multi-rate encoding apparatus according to another exemplary embodiment of the disclosure.
  • FIG. 6 is a schematic diagram illustrating a process of updating a rate control model per frame according to exemplary embodiments of the disclosure.
  • FIG. 7 is a flow chart of a rate control method according to exemplary embodiments of the disclosure.
  • FIG. 8 is a flow chart showing a process of iteratively updating a rate control model according to exemplary embodiments of the disclosure.
  • FIG. 9 schematically shows a variation of a bitrate versus quantization parameter (QP) curve (R-Q curve) between frames according to exemplary embodiments of the disclosure.
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • Hereinafter, embodiments consistent with the disclosure will be described with reference to the drawings, which are merely examples for illustrative purposes and are not intended to limit the scope of the disclosure. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
  • FIG. 1 is a schematic diagram showing an exemplary transmitting terminal 100 consistent with the disclosure. The transmitting terminal 100 is configured to capture images and encode the images according to a plurality of coding parameter values to generate a plurality of encoded data streams. The images may be still images, e.g., pictures, and/or moving images, e.g., videos. Hereinafter, the term “image” is used to refer to either a still image or a moving image. The coding parameter refers to a parameter associated with the encoding process, such as a quantization parameter (QP), a coding mode selection, a packet size, or the like. Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of a plurality of bitrate values. The transmitting terminal 100 is further configured to select one of the plurality of encoded data streams as an output data stream for transmitting over a transmission channel.
  • In some embodiments, the transmitting terminal 100 may be integrated in a mobile body, such as an unmanned aerial vehicle (UAV), a driverless car, a mobile robot, a driverless boat, a submarine, a spacecraft, a satellite, or the like. In some other embodiments, the transmitting terminal 100 may be a hosted payload carried by the mobile body that operates independently but may share the power supply of the mobile body.
  • The transmission channel may use any form of communication connection, such as the Internet connection, cable television connection, telephone connection, wireless connection, or another connection capable of supporting the transmission of images. For example, if the transmitting terminal 100 is integrated in a UAV, the transmission channel can be a wireless channel. The transmission channel may use any type of physical transmission medium, such as cable (e.g., twisted-pair wire cable and fiber-optic cable), air, water, space, or any combination of the above media. For example, if the transmitting terminal 100 is integrated in a UAV, one or more of the multiple channels of encoded data streams can be over air. If the transmitting terminal 100 is a hosted payload carried by a commercial satellite, one or more of the multiple channels of encoded data streams can be over space and air. If the transmitting terminal 100 is a hosted payload carried by a submarine, one or more of the multiple channels of encoded data streams can be over water.
  • As shown in FIG. 1, the transmitting terminal 100 includes an image capturing device 110, a multi-rate encoding apparatus 130 coupled to the image capturing device 110, and a transceiver 150 coupled to the multi-rate encoding apparatus 130.
  • The image capturing device 110 includes an image sensor and a lens or a lens set, and is configured to capture images. The image sensor may be, for example, an opto-electronic sensor, such as a charge-coupled device (CCD) sensor, a complementary metal-oxide-semiconductor (CMOS) sensor, or the like. The image capturing device 110 is further configured to send the captured images to the multi-rate encoding apparatus 130 for encoding. In some embodiments, the image capturing device 110 may include a memory for storing, either temporarily or permanently, the captured images.
  • The multi-rate encoding apparatus 130 is configured to receive the images captured by the image capturing device 110, and encode the images according to the plurality of coding parameter values to generate the plurality of encoded data streams. Each of the plurality of encoded data streams is generated using a corresponding one of the plurality of coding parameter values and corresponds to one of the plurality of bitrate values. As shown in FIG. 1, the multi-rate encoding apparatus 130 includes a multi-rate encoder 1301, a rate controller 1303, and a rate selector 1305 coupled to each other. Further, the multi-rate encoder 1301 is coupled to the image capturing device 110. The rate selector 1305 is coupled to the transceiver 150.
  • The multi-rate encoder 1301 may receive and encode the images captured by the image capturing device 110 according to any suitable video coding standard, also referred to as video compression standard, such as Windows Media Video (WMV) standard, Society of Motion Picture and Television Engineers (SMPTE) 421-M standard, Moving Picture Experts Group (MPEG) standard, e.g., MPEG-1, MPEG-2, or MPEG-4, H.26x standard, e.g., H.261, H.262, H.263, or H.264, or another standard.
  • In some embodiments, the video coding standard may be selected according to the video coding standard supported by a decoder, the channel conditions, the image quality requirement, and/or the like. For example, an image encoded using the MPEG standard needs to be decoded by a corresponding decoder adapted to support the appropriate MPEG standard. A lossless compression format may be used to achieve a high image quality requirement, and a lossy compression format may be used to adapt to limited transmission channel bandwidth.
  • In some embodiments, the multi-rate encoder 1301 may implement one or more different codec algorithms. The selection of the codec algorithm may be based on encoding complexity, encoding speed, encoding ratio, encoding efficiency, and/or the like. For example, a fast codec algorithm may be performed in real-time on a low-end hardware. A high encoding ratio algorithm may be desirable for a transmission channel with a small bandwidth.
  • In some other embodiments, the multi-rate encoder 1301 may further perform at least one of encryption, error-correction encoding, format conversion, or the like. For example, when the images captured by the image capturing device 110 contain confidential information, the encryption may be performed before transmission or storage to protect confidentiality.
  • FIG. 2 is a schematic block diagram showing an example of the multi-rate encoding apparatus 130 consistent with the disclosure. As shown in FIG. 2, the multi-rate encoder 1301 includes a plurality of single-rate encoders for generating the plurality of encoded data streams. Each single-rate encoder can generate one of the plurality of encoded data streams having a corresponding one of the plurality of bitrates according to one of the plurality of coding parameter values. In some embodiments, the plurality of single-rate encoders may be separate parts or partially separate parts sharing one or more common circuits.
  • FIG. 3 is a schematic block diagram showing an exemplary single-rate encoder consistent with the disclosure. As shown in FIG. 3, the single-rate encoder includes a “forward path” connected by solid-line arrows and an “inverse path” connected by dashed-line arrows in the figure. The “forward path” includes conducting an encoding process on an entire image frame or a block, e.g., a macroblock (MB), of the image frame, and the “inverse path” includes implementing a reconstruction process, which generates context 301 for prediction of a next image frame or a next block of the next image frame. An image frame refers to a complete image. Hereinafter, the terms “frame,” “image,” and “image frame” are used interchangeably.
  • The size and type of the block of the image frame may be determined according to the encoding standard that is employed. For example, a fixed-sized MB covering 16×16 pixels is the basic syntax and processing unit employed in H.264 standard. H.264 also allows the subdivision of an MB into smaller sub-blocks, down to a size of 4×4 pixels, for motion-compensation prediction. An MB may be split into sub-blocks in one of four manners: 16×16, 16×8, 8×16, or 8×8. The 8×8 sub-block may be further split in one of four manners: 8×8, 8×4, 4×8, or 4×4. Therefore, when H.264 standard is used, the size of the block of the image frame can range from 16×16 to 4×4 with many options between the two as described above.
  • In some embodiments, as shown in FIG. 3, the “forward path” includes a prediction process 302, a transformation process 303, a quantization process 304, and an entropy encoding process 305. In the prediction process 302, a predicted block can be generated according to a prediction mode. The prediction mode can be selected from a plurality of intra-prediction modes and/or a plurality of inter-prediction modes that are supported by the video encoding standard that is employed. Taking H.264 as an example, H.264 supports nine intra-prediction modes for luminance 4×4 and 8×8 blocks, including eight directional modes and an intra direct component (DC) mode that is a non-directional mode. For luminance 16×16 blocks, H.264 supports four intra-prediction modes, i.e., Vertical mode, Horizontal mode, DC mode, and Plane mode. Further, H.264 supports all possible combinations of inter-prediction modes, such as variable block sizes (i.e., 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4) used in inter-frame motion estimation, different inter-frame motion estimation modes (i.e., use of integer, half, or quarter pixel motion estimation), and multiple reference frames.
  • In the plurality of intra-prediction modes, the predicted block is created using a previously encoded block from the current frame. In the plurality of inter-prediction modes, the previously encoded block from a past or a future frame (a neighboring frame) is stored in the context 301 and used as a reference for inter-prediction. In some embodiments, a weighted sum of two or more previously encoded blocks from one or more past frames and/or one or more future frames can be stored in the context 301 for inter-prediction.
  • In some embodiments, the prediction process 302 can also include a prediction mode selection process (not shown). In some embodiments, the prediction mode selection process can include determining whether to apply the intra-prediction or the inter-prediction on the block. In some embodiments, whether intra-prediction or inter-prediction is to be applied on the block can be determined according to the position of the block. For example, if the block is in the first image frame of a video or in an image frame at one of the random access points of the video, the block may be intra-coded. On the other hand, if the block is in one of the remaining frames, i.e., image frames other than the first image frame, of the video or in an image frame between two random access points, the block may be inter-coded. An access point may refer to, e.g., a point in the stream of the video from which encoding or transmission of the video starts or resumes. In some other embodiments, whether intra-prediction or inter-prediction is to be employed on the block can be determined according to a transmission error, a sudden change of channel conditions, or the like. For example, if a transmission error or a sudden change of channel conditions occurs when the block is generated, the block can be intra-predicted.
  • In some embodiments, the prediction mode selection process can further include selecting an intra-prediction mode for the block from the plurality of intra-prediction modes when intra-prediction is determined to be employed and an inter-prediction mode from the plurality of inter-prediction modes when inter-prediction is determined to be employed. Any suitable prediction mode selection technique may be used here. For example, H.264 uses a Rate-Distortion Optimization (RDO) technique to select the intra-prediction mode or the inter-prediction mode that has a least rate-distortion (RD) cost for the block.
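The RDO selection mentioned above can be illustrated schematically: pick the candidate whose Lagrangian cost J = D + λ·R is smallest. The candidate list, distortion values, and λ below are made-up numbers; real encoders obtain distortion (e.g., SSD) and rate from actual trial encodings.

```python
# Schematic sketch of Rate-Distortion Optimization: choose the prediction
# mode minimizing the Lagrangian cost J = D + lambda * R.
def select_mode(candidates, lam):
    """candidates: iterable of (mode_name, distortion, rate_in_bits)."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]

# Hypothetical candidates for one macroblock: intra costs 120 + 0.5*90 = 165,
# inter costs 80 + 0.5*140 = 150, so inter wins.
best = select_mode([("intra_dc", 120.0, 90), ("inter_16x16", 80.0, 140)], lam=0.5)
```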
  • The predicted block is subtracted from the block to generate a residual block.
  • In the transformation process 303, the residual block is transformed into a representation in the spatial-frequency domain (also referred to as spatial-spectrum domain), in which the residual block can be expressed in terms of a plurality of spatial-frequency domain components, e.g., cycles per spatial unit in X and Y directions. Coefficients associated with the spatial-frequency domain components in the spatial-frequency domain expression are also referred to as transform coefficients. Any suitable transformation method, such as a discrete cosine transform (DCT), a wavelet transform, or the like, can be used here. Taking H.264 as an example, the residual block is transformed using a 4×4 or 8×8 integer transform derived from the DCT.
  • In the quantization process 304, quantized transform coefficients can be obtained by dividing the transform coefficients by a quantization step size (Qstep) for associating the transformed coefficients with a finite set of quantization steps. In some embodiments, a QP can be used to determine the Qstep. The relation between the value of QP and Qstep may be linear or exponential according to different encoding standards. Taking H.263 as an example, the relationship between the value of QP and Qstep is that Qstep = 2×QP. Taking H.264 as another example, the relationship between the value of QP and Qstep is that Qstep = 2^(QP/6).
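The two QP-to-Qstep relations can be written directly in code. The helpers below are illustrative only (not part of any codec API), and for H.264 only the proportionality Qstep ∝ 2^(QP/6) is modeled, which is exactly what yields roughly 12% growth in Qstep per unit increase of QP.

```python
# Illustrative QP-to-Qstep mappings for the two standards mentioned above.
def qstep_h263(qp):
    return 2 * qp          # linear: Qstep = 2 x QP

def qstep_h264(qp):
    return 2 ** (qp / 6)   # exponential: Qstep doubles every 6 QP steps

# Each unit increase of QP in H.264 lengthens Qstep by 2**(1/6) - 1, i.e. ~12%.
growth = qstep_h264(21) / qstep_h264(20)  # ~1.122
```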
  • In some embodiments, an expected bitrate can be achieved by adjusting the value of a coding parameter, for example, the value of QP. Small values of QP can more accurately approximate the spatial frequency spectrum of the residual block, i.e., more spatial detail can be retained, but at the cost of more bits and higher bitrates in the encoded data stream. Large values of QP represent big step sizes that crudely approximate the spatial frequency spectrum of the residual block such that most of the spatial detail of the residual block can be captured by only a few quantized transform coefficients. That is, as the value of QP increases, some spatial detail is aggregated such that the bitrate drops, but at the price of loss of quality. For example, H.264 allows a total of 52 possible values of QP, which are 0, 1, 2, . . . , 51, and each unit increase of QP lengthens the Qstep by 12% and reduces the bitrate by roughly 12%.
  • In the entropy encoding process 305, the quantized transform coefficients are entropy encoded. In some embodiments, the quantized transform coefficients may be reordered (not shown) before entropy encoding. The entropy encoding can convert symbols into binary codes, e.g., a data stream or a bitstream, which can be easily stored and transmitted. For example, context-adaptive variable-length coding (CAVLC) is used in H.264 standard to generate data streams. The symbols that are to be entropy encoded include, but are not limited to, the quantized transform coefficients, information for enabling the decoder to recreate the prediction (e.g., selected prediction mode, partition size, and the like), information about the structure of the data stream, information about a complete sequence (e.g., MB headers), and the like.
  • In some embodiments, as shown in FIG. 3, the “inverse path” includes an inverse quantization process 306, an inverse transformation process 307, and a reconstruction process 308. The quantized transform coefficients are inversely quantized and inversely transformed to generate a reconstructed residual block. The inverse quantization is also referred to as a re-scaling process, where the quantized transform coefficients are multiplied by Qstep to obtain rescaled coefficients, respectively. The rescaled coefficients are inversely transformed to generate the reconstructed residual block. An inverse transformation method corresponding to the transformation method used in the transformation process 303 can be used here. For example, if an integer DCT is used in the transformation process 303, an inverse integer DCT can be used in the inverse transformation process 307. The reconstructed residual block is added to the predicted block in the reconstruction process 308 to create a reconstructed block, which is stored in the context 301 as a reference for prediction of the next block.
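A toy round trip through the quantization and re-scaling steps can make the forward/inverse pairing concrete. This assumes a plain uniform quantizer; real codecs add rounding offsets, scaling matrices, and integer arithmetic.

```python
# Toy uniform quantizer illustrating the quantization (forward path) and
# re-scaling / inverse quantization (inverse path) described above.
def quantize(coeffs, qstep):
    return [round(c / qstep) for c in coeffs]

def rescale(levels, qstep):
    return [lv * qstep for lv in levels]

levels = quantize([12.4, -3.9, 0.6], qstep=2)
recon = rescale(levels, qstep=2)
# Quantization is lossy: recon approximates, but does not equal, the input.
```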
  • In some embodiments, the single-rate encoder may be a codec. That is, the single-rate encoder may also include a decoder (not shown). The decoder conceptually works in a reverse manner including an entropy decoder (not shown) and the processing elements defined within the reconstruction process, shown by the “inverse path” in FIG. 3. The detailed description thereof is omitted here.
  • FIG. 4 is a schematic block diagram showing another example of the multi-rate encoding apparatus 130 consistent with the disclosure. As shown in FIG. 4, the multi-rate encoder 1301 includes the plurality of single-rate encoders that share a common circuit 310 and have separate processing circuits 330 to generate the plurality of encoded data streams with different bitrates. Referring again to FIG. 3, the processing circuit 330 can perform the transformation process 303, the quantization process 304, the entropy encoding process 305, the inverse quantization process 306, the inverse transformation process 307, and the reconstruction process 308. The common circuit 310 can perform the prediction process 302, of which the computational complexity and the computing resource consumption may account for about 70% of the overall calculations of the single-rate encoder. As such, the multi-rate encoder 1301 with the structure shown in FIG. 4 and described above can reduce resource consumption.
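The split between the common circuit and the per-rate processing circuits can be mimicked in software. The function below is a hypothetical sketch: `predict` stands in for the shared prediction stage and `branch` for the per-QP transform/quantize/entropy-code stage; neither name comes from the disclosure.

```python
# Hypothetical sketch of FIG. 4's structure: the prediction stage runs once
# per frame (common circuit 310), and each per-rate branch (processing
# circuits 330) consumes the same residual with its own coding parameter.
def encode_multirate(frame, qps, predict, branch):
    residual = predict(frame)                    # shared work, done once
    return [branch(residual, qp) for qp in qps]  # one stream per QP

# Toy stages: "prediction" halves each sample, each branch divides by its QP.
streams = encode_multirate([8, 4], [1, 2, 4],
                           predict=lambda f: [x // 2 for x in f],
                           branch=lambda r, qp: [x // qp for x in r])
```

Because the shared stage runs once instead of once per rate, the saving grows with the number of output streams, which is the point of the FIG. 4 structure.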
  • Referring again to FIGS. 2 and 4, the rate controller 1303 is configured to adjust the plurality of coding parameter values of the multi-rate encoder 1301 to control the plurality of bitrate values of the plurality of encoded data streams generated by the multi-rate encoder 1301, according to a rate control model. The rate control model characterizes a correspondence between coding parameter and bitrate. In some embodiments, the rate controller 1303 can implement a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below.
  • In some embodiments, as shown in FIGS. 2 and 4, when the multi-rate encoder 1301 includes the plurality of single-rate encoders, the rate controller 1303 can be coupled to the plurality of single-rate encoders and can be configured to adjust the coding parameter value of each single-rate encoder to control the bitrate value of the encoded data stream generated by each single-rate encoder, according to the rate control model.
  • The rate selector 1305 is configured to select one of the plurality of encoded data streams as the output data stream based on, for example, a current channel capacity, a current channel bandwidth, a transmission latency, and/or the like, and send the output data stream to the transceiver 150 for transmitting. In some embodiments, the rate selector 1305 can be also configured to obtain feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from the transceiver 150.
  • In some embodiments, as shown in FIGS. 2 and 4, when the multi-rate encoder 1301 includes the plurality of single-rate encoders, the rate selector 1305 can be coupled to the plurality of single-rate encoders and can be configured to select one of the plurality of encoded data streams as the output data stream from the corresponding single-rate encoder based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like.
  • Referring again to FIG. 1, the transceiver 150 is configured to obtain the output data stream from the rate selector 1305 and transmit the output data stream over the transmission channel. In some embodiments, the transceiver 150 is further configured to receive the feedback information including, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like, from a receiving terminal over the transmission channel, and send the feedback information to the rate selector 1305.
  • The transceiver 150 can include a transmitter and a receiver, and can be configured to have two-way communications capability, i.e., can both transmit and receive data. In some embodiments, the transmitter and the receiver may share common circuitry. In some other embodiments, the transmitter and the receiver may be separate parts sharing a single housing. The transceiver 150 may work in any suitable frequency band, for example, the microwave band, millimeter-wave band, centimeter-wave band, optical wave band, or the like.
  • According to the disclosure, the image capturing device 110, the multi-rate encoding apparatus 130, and the transceiver 150 can be separate devices, or any two or all of them can be integrated in one device. In some embodiments, the image capturing device 110, the multi-rate encoding apparatus 130, and the transceiver 150 are separate devices that can be connected or coupled to each other through wired or wireless means. For example, the image capturing device 110 can be a camera, a camcorder, or a smartphone having a camera function. FIG. 5 is a schematic block diagram showing another example of the multi-rate encoding apparatus 130 consistent with the disclosure. As shown in FIG. 5, the multi-rate encoding apparatus 130 includes one or more processors 130-1 and one or more memories 130-2. The one or more processors 130-1 can include any suitable hardware processor, such as a microprocessor, a micro-controller, a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA), another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The one or more memories 130-2 store computer program codes that, when executed by the one or more processors, control the one or more processors to perform a rate control method consistent with the disclosure, such as one of the exemplary rate control methods described below, and the encoding functions of the method consistent with the disclosure. The one or more memories can include a non-transitory computer-readable storage medium, such as a random access memory (RAM), a read-only memory, a flash memory, a volatile memory, a hard disk storage, or an optical medium. The transceiver 150 can be an independent device combining a transmitter and a receiver in a single package.
  • In some other embodiments, the image capturing device 110, the multi-rate encoding apparatus 130, and the transceiver 150 are integrated in a same electronic device. For example, the image capturing device 110 may include an image sensor and a lens or a lens set of the electronic device. The multi-rate encoding apparatus 130 may be implemented by one or more single-chip encoders, one or more single-chip codecs, one or more image processors, one or more image processing engines, or the like, which are integrated in the electronic device. The transceiver 150 may be implemented by an integrated circuit, a chip, or a chipset that is integrated in the electronic device. For example, the electronic device may be a smartphone having a built-in camera and a motherboard that integrates the multi-rate encoding apparatus 130 and the transceiver 150.
  • In some other embodiments, any two of the image capturing device 110, the multi-rate encoding apparatus 130, and the transceiver 150 are integrated in a same electronic device. For example, the image capturing device 110 can be a camera or a camcorder that is coupled to an electronic device having a motherboard that integrates the multi-rate encoding apparatus 130 and the transceiver 150.
  • Exemplary rate control methods consistent with the disclosure will be described in more detail below. A rate control method consistent with the disclosure can be implemented in a multi-rate encoding apparatus consistent with the disclosure. The multi-rate encoding apparatus can be configured as a portion of a transmitting terminal. The multi-rate encoding apparatus and the transmitting terminal can be, for example, the multi-rate encoding apparatus 130 and the transmitting terminal 100 described above.
  • As described above, the bitrate of an encoded data stream can be controlled by controlling a coding parameter, such as a quantization parameter, used for encoding image frames. To obtain an encoded data stream having a desired bitrate (also referred to as an “expected bitrate”), the coding parameter can be selected according to a rate control model describing correspondences between coding parameters and bitrates. The rate control model can also be updated during the encoding process based on calculation/encoding results during the encoding process. In some embodiments, the rate control model can be updated based on the encoding process of one frame or based on the encoding process of a plurality of frames.
  • FIG. 6 schematically illustrates a process for updating the rate control model per frame consistent with the disclosure. As shown in FIG. 6, a frame 610 is encoded using a plurality of coding parameter values (denoted using letters CP1, CP2, . . . , and CPN in FIG. 6) to generate a plurality of encoded data streams 630 having a plurality of bitrate values (denoted using letters R1, R2, . . . , and RN in FIG. 6). Each of the plurality of coding parameter values CPi corresponds to one of the plurality of bitrate values Ri (i=1, 2, . . . , N). For example, CP1 corresponds to R1, CP2 corresponds to R2, . . . , and CPN corresponds to RN. The plurality of (CPi, Ri) pairs form a plurality of sample points 650, which can then be applied to a rate control model 670 for determining/updating parameters of the rate control model 670 according to, for example, a fitting method.
  • According to the process shown in FIG. 6, the parameters of the rate control model can be updated or estimated per frame. Thus, a frame-level rate control that can stabilize the bitrate per frame at an expected bitrate can be achieved. The frame-level rate control can avoid frequent playback stops at the receiving terminal due to large transmission jitter. The overall perceptual quality of a video can be enhanced and the user experience can be improved.
  • FIG. 7 is a flow chart of an exemplary rate control method 700 consistent with the disclosure. According to the rate control method 700, a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above, can control a plurality of coding parameter values of a multi-rate encoder, such as the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above, according to which a plurality of encoded data streams having a corresponding plurality of bitrate values can be generated by the multi-rate encoder. A rate selector, such as the rate selector 1305 of the multi-rate encoding apparatus 130 described above, can select one of the plurality of encoded data streams as the output data stream based on, for example, the current channel capacity, the current channel bandwidth, the transmission latency, and/or the like.
  • As shown in FIG. 7, at 701, a first plurality of coding parameter values are obtained. The first plurality of coding parameter values may include a plurality of coding parameter values for encoding a first input frame. In some embodiments, the first input frame can be a first one of image frames captured by an image capturing device and sent to a multi-rate encoder for encoding. The image capturing device can be, for example, the image capturing device 110 described above. The multi-rate encoder can be, for example, the multi-rate encoder 1301 of the multi-rate encoding apparatus 130 described above. In some other embodiments, the first input frame can be an image frame in the stream of a video at which encoding or transmission of the video starts, or at which encoding or transmission of the video resumes. In some other embodiments, the first input frame can be any one of image frames captured by the image capturing device or any image frame in the stream of a video.
  • In some embodiments, the first plurality of coding parameter values are provided by a rate control model based at least in part on an expected bitrate for the first input frame (also referred to as a “first expected bitrate”). That is, one of the first plurality of coding parameter values is provided by the rate control model based on an expected bitrate for the first input frame, which can also be referred to as a “first main coding parameter value.” The remaining ones of the first plurality of coding parameter values, i.e., those of the first plurality of coding parameter values other than the first main coding parameter value, can be referred to as first auxiliary coding parameter values.
  • Any suitable rate control model can be used here. For example, the rate control model can include a quantizer-domain (Q-domain) rate control model (also referred to as a rate-quantization (R-Q) model) that characterizes the relationship between bitrate and QP, a rho-domain (ρ-domain) rate control model that characterizes the relationship between bitrate and the parameter ρ (the percentage of zeros among the quantized transform coefficients), or a lambda-domain (λ-domain) rate control model (also referred to as a rate-lambda (R-λ) model) that characterizes the relationship between bitrate and the Lagrange multiplier λ corresponding to the QP for each frame.
  • In some embodiments, the rate control model may have initial parameters that are pre-stored in a rate controller, such as the rate controller 1303 of the multi-rate encoding apparatus 130 described above. A coding parameter value corresponding to the expected bitrate value for the first input frame obtained according to the rate control model can be set as the first main coding parameter value. Taking an R-Q model as an example, the coding parameter can be the QP and the R-Q model can be expressed as an exponential of the second-order polynomial:

  • R(Q)=exp(a·Q²+b·Q+c)
  • where R represents the value of bitrate, Q represents the value of QP, and a, b, and c represent model parameters. The coding parameter value corresponding to the expected bitrate value for the first input frame can be calculated from the above exponential of the second-order polynomial with initial a, b, and c values.
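  • As an illustrative sketch (not part of the disclosure; the function name and the 0–51 clamping range, borrowed from common codecs, are assumptions), the main coding parameter value can be computed from the expected bitrate by solving ln R = a·Q² + b·Q + c for Q:

```python
import math

def qp_for_bitrate(rate, a, b, c):
    """Solve R(Q) = exp(a*Q^2 + b*Q + c) for Q given a target bitrate.

    Rearranging gives the quadratic a*Q^2 + b*Q + (c - ln(rate)) = 0.
    The root on the physically meaningful branch (bitrate decreases
    as QP increases, i.e. 2*a*Q + b < 0) is returned.
    """
    c0 = c - math.log(rate)
    disc = b * b - 4.0 * a * c0
    if disc < 0:
        raise ValueError("target bitrate unreachable under the current model")
    roots = [(-b + math.sqrt(disc)) / (2.0 * a),
             (-b - math.sqrt(disc)) / (2.0 * a)]
    q = next(r for r in roots if 2.0 * a * r + b < 0)
    return min(max(round(q), 0), 51)  # clamp to the common 0..51 QP range
```

For example, with toy parameters a=0.001, b=−0.2, c=10, a target bitrate of exp(4.9) maps back to QP 30.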
  • In some embodiments, the expected bitrate for the first input frame can be a preset bitrate. In some embodiments, the expected bitrate for the first input frame can be obtained from a user input. In some other embodiments, the expected bitrate for the first input frame can be determined based on, for example, the channel capacity, the channel bandwidth, the transmission latency, or the like.
  • In some embodiments, the first auxiliary coding parameter values can be gradually deviated from the first main coding parameter value at a coding parameter interval. For example, the first auxiliary coding parameter values can be gradually stepped down or stepped up from the first main coding parameter value and arranged at the coding parameter interval. As another example, one or some of the first auxiliary coding parameter values can be obtained by gradually stepping down from the first main coding parameter value and one or some of the first auxiliary coding parameter values can be obtained by gradually stepping up from the first main coding parameter value.
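  • A minimal sketch of this stepping scheme (the helper name and the fixed-interval assumption are illustrative, not taken from the disclosure):

```python
def auxiliary_values(main_value, interval, steps_down, steps_up):
    """Deviate auxiliary values from the main coding parameter value.

    Values are gradually stepped down and/or up from the main value and
    arranged at a fixed interval; pass zero steps for one direction to
    step in the other direction only.
    """
    down = [main_value - interval * k for k in range(steps_down, 0, -1)]
    up = [main_value + interval * k for k in range(1, steps_up + 1)]
    return down + up
```

For instance, auxiliary_values(30, 2, 2, 2) yields [26, 28, 32, 34] around a main value of 30.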
  • In some embodiments, the determination of the preset coding parameter interval may be a tradeoff between the computation complexity and the estimation accuracy of the parameters of the rate control model. For example, a large coding parameter interval leads to a small number of coding parameter values, which can reduce the computational burden, but may increase the estimation error of the parameters of the rate control model. On the other hand, a fine coding parameter interval can generate a plurality of coding parameter values that are densely distributed over a certain range, which can reduce the estimation error of the parameters of the rate control model, but may increase the computational burden.
  • In some embodiments, the coding parameter interval can be a constant interval, i.e., the interval between each pair of neighboring coding parameter values is the same. In some other embodiments, the coding parameter interval can be a variable interval, i.e., the interval between each pair of neighboring coding parameter values may vary from pair to pair, or may be the same among some pairs of neighboring coding parameter values but different among some other pairs. In the embodiments of varying coding parameter interval, the interval can be, for example, varied with the curvature of a curve of the coding parameter versus the bitrate. For example, a large interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively small curvature, and a fine interval can be used for one or some of the first auxiliary coding parameter values falling on a portion of the curve that has a relatively large curvature.
  • In the embodiments described above, the first auxiliary coding parameter values are obtained by first obtaining the first main coding parameter value and then calculating the first auxiliary coding parameter values according to a preset interval. In some other embodiments, the first auxiliary coding parameter values can be obtained based on selected bitrates for the first input frame. Such selected bitrates for the first input frame are also referred to as “first auxiliary bitrate values.” For example, the first auxiliary bitrate values can be gradually deviated from the expected bitrate for the first input frame at a bitrate interval. Similar to the coding parameter interval, the bitrate interval also can be constant or variable, and can be determined in a similar manner as determining the coding parameter interval. For example, the first auxiliary bitrate values can be obtained by gradually stepping down and/or stepping up from the expected bitrate for the first input frame and arranged at the bitrate interval. The first auxiliary coding parameter values corresponding to the first auxiliary bitrate values can be calculated according to the rate control model.
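  • This bitrate-domain alternative can be sketched as follows (a hypothetical helper; `model_inverse` stands in for whatever rate control model maps a bitrate back to a coding parameter value):

```python
def auxiliary_qps_from_bitrates(expected_rate, rate_interval, n_each_side, model_inverse):
    """Derive auxiliary coding parameter values from auxiliary bitrates.

    Auxiliary bitrate values step down and up from the expected bitrate
    at a fixed interval; each is then mapped to a coding parameter value
    through the rate control model (model_inverse: bitrate -> parameter).
    """
    rates = [expected_rate + rate_interval * k
             for k in range(-n_each_side, n_each_side + 1) if k != 0]
    return [model_inverse(r) for r in rates]
```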
  • At 703, the first input frame is encoded using the first plurality of coding parameter values to generate a first plurality of encoded data streams. Each of the first plurality of encoded data streams is generated using a corresponding one of the first plurality of coding parameter values and has a corresponding one of a first plurality of bitrate values.
  • In some embodiments, the first input frame can be intra-encoded using the first plurality of coding parameter values to generate the first plurality of encoded data streams. In some embodiments, encoding the first input frame using one of the first plurality of coding parameter values can include a prediction process, a transformation process, a quantization process, and an entropy encoding process.
  • In some embodiments, the encoding processes of the first input frame using the first plurality of coding parameter values can be separate processes and implemented in parallel. For example, as shown in FIG. 2, the multi-rate encoding apparatus 130 can include a plurality of separate single-rate encoders, and each single-rate encoder can be used to encode the first input frame using one of the first plurality of coding parameter values to generate a corresponding one of the first plurality of encoded data streams.
  • In some other embodiments, the encoding processes of the first input frame using the first plurality of coding parameter values can include at least one common process. For example, as shown in FIGS. 3 and 4, the encoding processes of the first input frame using the first plurality of coding parameter values can share a common prediction process in the common circuit 310 and use separate transformation processes, separate quantization processes, and separate entropy encoding processes in the separate processing circuits 330-1, 330-2, . . . 330-N to generate the first plurality of encoded data streams. The computational complexity and the computing resource consumption can be reduced by sharing the common prediction circuit 310.
  • At 705, the rate control model can be updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. That is, updated parameters of the rate control model can be obtained based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. Thus, an updated rate control model can be generated with the updated parameters. Taking the R-Q model described above as an example, an updated R-Q curve can be modeled by the least-squares fitting of the exponential of the second-order polynomial described above based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams, such that the updated values of a, b, and c can be obtained.
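  • The fitting step can be sketched as follows (an illustrative pure-Python implementation, not the disclosed one: taking the logarithm of R(Q) = exp(a·Q² + b·Q + c) reduces the fit to ordinary polynomial least squares, solved here via the 3×3 normal equations):

```python
import math

def fit_rq_model(qps, rates):
    """Least-squares fit of ln(R) = a*Q^2 + b*Q + c to (QP, bitrate) samples."""
    ys = [math.log(r) for r in rates]
    rows = [[q * q, q, 1.0] for q in qps]        # basis [Q^2, Q, 1]
    # Normal equations: (A^T A) x = A^T y.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [ata[i] + [atb[i]] for i in range(3)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for j in range(col, 4):
                m[r][j] -= f * m[col][j]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][j] * x[j] for j in range(r + 1, 3))) / m[r][r]
    return tuple(x)  # (a, b, c)
```

Feeding the fitted (a, b, c) back into the model yields the updated R-Q curve used for the next frame.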
  • In the embodiments described above, the rate control model is updated once using one input frame and then used for encoding the next input frame. In some scenarios, one updating may not be enough to create an updated rate control model that closely models the actual correspondence relationship between the coding parameter and the bitrate. The degree of approximation between the updated rate control model and the actual correspondence relationship between the coding parameter and the bitrate can be determined by a difference between a first actual bitrate value and the expected bitrate value. The first actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value. Therefore, in some other embodiments, the rate control model can be iteratively updated using the first input frame until a difference between the first actual bitrate value and the expected bitrate value is within a preset range, e.g., smaller than a preset value.
  • At 707, one of the first plurality of encoded data streams is selected as an output data stream for the first input frame. The selection of the one of the first plurality of encoded data streams can be based on, for example, the expected bitrate value for the first input frame, the channel capacity, the channel bandwidth, the transmission latency, and/or the like.
  • In some embodiments, the output data stream for the first input frame can be selected from the first plurality of encoded streams according to the expected bitrate value for the first input frame. For example, the one of the first plurality of encoded streams obtained by encoding the first input frame using the first main coding parameter value (also referred to as a "first main encoded data stream") can be directly selected as the output data stream. In some scenarios, the bitrate of the first main encoded data stream may differ from the expected bitrate for the first input frame by a relatively large value. In these scenarios, the expected bitrate for the first input frame can be fed back into the updated rate control model to obtain a new coding parameter value for encoding the first input frame, and the resulting encoded data stream can be output as the output data stream for the first input frame.
  • As another example, the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is closest to the expected bitrate value for the first input frame.
  • As another example, the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that is not more than and closest to the expected bitrate value for the first input frame.
  • As another example, the output data stream for the first input frame may be the one of the first plurality of encoded streams having a corresponding bitrate of the first plurality of bitrate values that has a difference from the expected bitrate value for the first input frame within a preset range.
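  • The first two selection rules above can be sketched as follows (illustrative helpers; the fallback behavior when every stream exceeds the expected bitrate is an assumption, since the disclosure does not specify it):

```python
def closest_stream(bitrates, expected):
    """Index of the encoded stream whose bitrate is closest to the expected bitrate."""
    return min(range(len(bitrates)), key=lambda i: abs(bitrates[i] - expected))

def closest_not_exceeding(bitrates, expected):
    """Index of the stream with the highest bitrate not above the expected
    bitrate; falls back to the lowest-bitrate stream if all streams exceed it."""
    under = [i for i, r in enumerate(bitrates) if r <= expected]
    if under:
        return max(under, key=lambda i: bitrates[i])
    return min(range(len(bitrates)), key=lambda i: bitrates[i])
```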
  • In some embodiments, the output data stream for the first input frame can be selected from the first plurality of encoded streams according to a current channel bandwidth. For example, the output data stream for the first input frame can be one of the first plurality of encoded streams that matches the current channel bandwidth. As such, the output data stream can be adapted to the time-varying channel bandwidth in real-time. That is, when the channel bandwidth varies with time, the output data stream can match the channel bandwidth in real-time.
  • In some other embodiments, the output data stream for the first input frame may be selected according to the current channel bandwidth and a target latency. The target latency may also be referred to as a control target of the latency, which represents an expected transmission latency.
  • For example, the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is closest to the target latency.
  • As another example, the output data stream for the first input frame may be one of the first plurality of encoded streams of which the transmission latency under the current channel bandwidth is not more than and closest to the target latency.
  • As a further example, the output data stream for the first input frame may be the one of the first plurality of encoded streams with the highest bitrate among the first plurality of encoded streams for which the difference between the target latency and the transmission latency under the current channel bandwidth is within a preset range. Because a higher bitrate generally corresponds to a higher encoding quality, this approach can ensure that the encoded data stream with the highest encoding quality is selected when the target latency is satisfied.
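  • This latency-constrained rule can be sketched as follows (an illustrative helper; the per-frame transmission latency is approximated as bitrate divided by bandwidth, matching the cost function used later in this disclosure):

```python
def highest_rate_within_latency(bitrates, bandwidth, target_latency, tolerance):
    """Among streams whose estimated transmission latency (bitrate/bandwidth)
    is within `tolerance` of the target latency, return the index of the
    stream with the highest bitrate; return None if no stream qualifies."""
    ok = [i for i, r in enumerate(bitrates)
          if abs(r / bandwidth - target_latency) <= tolerance]
    return max(ok, key=lambda i: bitrates[i]) if ok else None
```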
  • In some embodiments, the output data stream for the first input frame may be selected according to the channel bandwidth, the target latency, and the encoding quality. That is, the selection of the output data stream for the first input frame can be based on a combination of the requirements of the channel bandwidth, the target latency, and the encoding quality.
  • In some embodiments, a cost function may be determined according to the channel bandwidth, the target latency, the encoding quality, and a target bitrate. The output data stream for the first input frame may be one of the first plurality of encoded data with the smallest value of the cost function.
  • For example, the cost function may be as follows:

  • Cost = A×|bitrate/bandwidth − target latency| + B×encoding quality
      • where Cost represents the cost, and A and B represent weights.
  • According to the requirements of different application scenarios, the values of A and B can be adjusted to bias towards the requirement of the encoding quality or the requirement of the latency control, e.g., the values of A and B can be adjusted to give more weight to the requirement of the encoding quality or to the requirement of the latency control in the calculation of Cost.
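  • A sketch of this cost-based selection (illustrative; the disclosure leaves the sign convention of the quality term open, so `encoding_quality` is assumed here to be a distortion-like measure where smaller is better):

```python
def stream_cost(bitrate, quality, bandwidth, target_latency, a_weight, b_weight):
    """Cost = A*|bitrate/bandwidth - target latency| + B*encoding quality."""
    return a_weight * abs(bitrate / bandwidth - target_latency) + b_weight * quality

def select_min_cost(bitrates, qualities, bandwidth, target_latency, a_weight, b_weight):
    """Index of the encoded data stream with the smallest cost value."""
    return min(range(len(bitrates)),
               key=lambda i: stream_cost(bitrates[i], qualities[i], bandwidth,
                                         target_latency, a_weight, b_weight))
```

Raising a_weight biases the selection toward meeting the target latency; raising b_weight biases it toward encoding quality.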
  • In some embodiments, a reconstructed frame obtained from the output data stream for the first input frame can be used as the context of a second input frame. That is, a reconstructed frame obtained from the output data stream for the first input frame can be used as a reference for the prediction of the second input frame.
  • At 709, a second input frame is encoded based on the updated rate control model. For example, a second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model, and the second input frame can be encoded using the second plurality of coding parameter values.
  • In some embodiments, the second plurality of coding parameter values for encoding the second input frame can be determined based on the updated rate control model and an expected bitrate for the second input frame (also referred to as a “second expected bitrate”). For example, one of the second plurality of coding parameter values can be determined based on the updated rate control model and the expected bitrate for the second input frame, which can be referred to as a second main coding parameter value. The remaining ones of the second plurality of coding parameter values, i.e., the coding parameter values of the second plurality of coding parameter values other than the second main coding parameter value, can be referred to as second auxiliary coding parameter values. Taking the R-Q model described above as an example, the coding parameter value corresponding to the expected bitrate for the second input frame calculated from the exponential of the second-order polynomial with the updated a, b, and c parameters can be set as the second main coding parameter value.
  • In some embodiments, the second main coding parameter value and the first main coding parameter value are for a same encoding channel. The same encoding channel refers to, for example, a same single-rate encoder included in the multi-rate encoder as shown in FIG. 2, or a same processing circuit included in the multi-rate encoder as shown in FIG. 4.
  • In some embodiments, similar to the first auxiliary coding parameter values, the second auxiliary coding parameter values can be gradually deviated from the second main coding parameter value at a coding parameter interval. For example, the second auxiliary coding parameter values can be obtained by gradually stepping up and/or stepping down from the second main coding parameter value and arranged at the coding parameter interval. The coding parameter interval for the second auxiliary coding parameter values can be a constant interval or a variable interval. The coding parameter interval for the second auxiliary coding parameter values can be determined in a similar manner as that for determining the coding parameter interval for the first auxiliary coding parameter values, and thus the detailed description thereof is omitted.
  • In some other embodiments, the second auxiliary coding parameter values can be obtained based on selected bitrates for the second input frame, also referred to as "second auxiliary bitrate values." Similar to the first auxiliary bitrate values, the second auxiliary bitrate values can be gradually deviated from the expected bitrate for the second input frame at a bitrate interval. The second auxiliary coding parameter values corresponding to the second auxiliary bitrate values can be calculated based on the rate control model. For example, the second auxiliary bitrate values can be obtained by gradually stepping up and/or stepping down from the expected bitrate for the second input frame. The bitrate interval for the second auxiliary bitrate values can be a constant interval or a variable interval. The bitrate interval for the second auxiliary bitrate values can be determined in a similar manner as that for determining the bitrate interval for the first auxiliary bitrate values, and thus the detailed description thereof is omitted.
  • In some embodiments, the second input frame can be inter-encoded and/or intra-encoded using the second plurality of coding parameter values to generate a second plurality of encoded data streams. In some embodiments, encoding the second input frame using one of the second plurality of coding parameter values can include the prediction process, the transformation process, the quantization process, and the entropy encoding process.
  • One of the second plurality of encoded data streams can be selected as an output data stream corresponding to the second input frame. The selection of the output data stream corresponding to the second input frame can be similar to the selection of the output data stream corresponding to the first input frame, and thus detailed description thereof is omitted.
  • In some embodiments, the rate control model may also vary between image frames. That is, the updated rate control model obtained based on the first input frame may not accurately reflect the correspondence relationship between coding parameter and bitrate in the second input frame. In these embodiments, the rate control model can be further updated based on the second plurality of coding parameter values and the second plurality of bitrate values respectively corresponding to the second plurality of encoded data streams. To do so, the second main coding parameter value may be iteratively adjusted until the difference between a second actual bitrate and the expected bitrate value for the second input frame is within a preset range. The second actual bitrate value refers to a bitrate value obtained by encoding the second input frame using the second main coding parameter value.
  • FIG. 8 is a flow chart showing a process of iteratively updating the rate control model consistent with the disclosure. As shown in FIG. 8, the rate control model can be first updated based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams [denoted as (CP1i, R1i) pairs] obtained according to the approaches (705) described above. The second plurality of coding parameter values (denoted as CP2i) can be determined based on the updated rate control model and the expected bitrate for the second input frame as described above. As described above, the coding parameter value calculated from the expected bitrate for the second input frame using the updated rate control model is the second main coding parameter value. The second frame can be encoded using CP2i to generate the second plurality of data streams having a plurality of actual bitrates (denoted as R2i), including the data stream having the second actual bitrate value generated by encoding the second frame using the second main coding parameter value. If the difference between the second actual bitrate value and the expected bitrate value for the second input frame falls outside the preset range, the rate control model is further updated according to the (CP2i, R2i) pairs. The second main coding parameter value can then be updated to a coding parameter value corresponding to the expected bitrate value for the second input frame calculated from the further updated rate control model. On the other hand, if the difference between the second actual bitrate value and the expected bitrate value for the second input frame is within the preset range, the iterative adjustment process can be stopped and one of the second plurality of data streams is output as the output data stream, as shown in FIG. 8.
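  • The iterative adjustment described above can be sketched as follows (a toy driver; the encoder and the model fitting are abstracted as callables, and the ±2 auxiliary spread and the iteration cap are illustrative assumptions, not from the disclosure):

```python
def converge_main_qp(expected_rate, init_params, encode_at, fit_model,
                     qp_from_rate, tolerance, max_iters=8):
    """Iteratively refine the main coding parameter value until the actual
    bitrate is within `tolerance` of the expected bitrate.

    encode_at(qps)        -> actual bitrates produced at those QPs
    fit_model(qps, rates) -> updated rate control model parameters
    qp_from_rate(rate, p) -> QP the model predicts for a bitrate
    """
    params = init_params
    qp = qp_from_rate(expected_rate, params)
    for _ in range(max_iters):
        qps = [qp - 2, qp, qp + 2]      # main value plus two auxiliaries
        rates = encode_at(qps)
        if abs(rates[1] - expected_rate) <= tolerance:
            break                        # main stream is already close enough
        params = fit_model(qps, rates)   # further update the model
        qp = qp_from_rate(expected_rate, params)
    return qp, params
```

With a toy linear "true" encoder (rate = 1000 − 10·QP) and a deliberately wrong initial model, the loop recovers the QP that hits the expected bitrate within two iterations.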
  • FIG. 9 schematically shows a variation of a bitrate-versus-QP curve (R-Q curve or R-Q model, i.e., an example of the rate control model) between frames. In FIG. 9, curve 1 represents the R-Q model obtained/updated based on the first input frame, i.e., a curve created by fitting the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams. As shown in FIG. 9, when, for example, the complexities of the first input frame and the second input frame are different, the R-Q model may move from curve 1 to curve 2. Curve 2 is an actual R-Q curve corresponding to the second input frame, which is yet unknown. As shown in FIG. 9, if an expected bitrate Re is desired for the second input frame, curve 1 gives a corresponding second main coding parameter value QPe. However, if QPe is used for encoding the second input frame, the obtained encoded data stream will have a different actual bitrate according to curve 2, rather than the expected bitrate Re. Curve 1 can be iteratively updated according to, e.g., the method described above in connection with FIG. 8 to obtain curve 2 or a curve close to curve 2. Thereafter, according to the obtained curve 2 or the obtained curve close to curve 2, an updated second main coding parameter value QPe1 that can result in the expected bitrate Re can be obtained.
  • In some embodiments, obtaining the first plurality of coding parameter values (at 701) can further include iteratively adjusting the first main coding parameter value until the difference between an actual bitrate value and the expected bitrate value for the first input frame is within a preset range. The actual bitrate value refers to a bitrate value obtained by encoding the first input frame using the first main coding parameter value.
  • Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. It is intended that the specification and examples be considered as exemplary only and not to limit the scope of the disclosure, with a true scope and spirit of the invention being indicated by the following claims.

Claims (20)

What is claimed is:
1. A method for rate control, comprising:
encoding a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values;
updating a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams; and
encoding a second input frame based on the updated rate control model.
2. The method according to claim 1, wherein the first plurality of coding parameter values are provided by the rate control model based at least in part on an expected bitrate for the first input frame.
3. The method according to claim 2, wherein one of the first plurality of coding parameter values is provided by the rate control model based on the expected bitrate for the first input frame, and others of the first plurality of coding parameter values are gradually deviated from the one of the first plurality of coding parameter values at a constant interval or a variable interval.
4. The method according to claim 2, wherein:
the first plurality of coding parameter values include a main coding parameter value calculated from the expected bitrate value for the first input frame according to the rate control model, and
updating the rate control model comprises iteratively updating the rate control model using the first input frame until a difference between the expected bitrate value for the first input frame and an actual bitrate value obtained by encoding the first input frame using the main coding parameter value is within a preset range.
5. The method according to claim 1, wherein encoding the first input frame comprises implementing a plurality of separate encoding processes on the first input frame to generate the first plurality of encoded data streams, each of the plurality of separate encoding processes including encoding the first input frame using one of the first plurality of coding parameter values to generate a corresponding one of the first plurality of encoded data streams.
6. The method according to claim 1, wherein encoding the second input frame comprises determining a coding parameter value for the second input frame based on the updated rate control model and an expected bitrate for the second input frame.
7. The method according to claim 1, further comprising:
determining a second plurality of coding parameter values for encoding the second input frame based at least in part on the updated rate control model; and
encoding the second input frame using the second plurality of coding parameter values.
8. The method according to claim 7, wherein determining the second plurality of coding parameter values for the second input frame comprises:
determining one of the second plurality of coding parameter values for the second input frame based on an expected bitrate for the second input frame according to the updated rate control model, and
gradually deviating others of the second plurality of coding parameter values from the one of the second plurality of coding parameter values at a constant interval or a variable interval.
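One way to realize the constant-interval or variable-interval spread of claims 3 and 8 is symmetric offsets around the main value; the widening scheme used for the "variable" mode below is purely illustrative.

```python
def candidate_qps(main_qp, count=5, mode="constant", step=2):
    """Spread candidate QPs around the main QP at a constant interval,
    or at a variable (progressively widening) interval."""
    qps = [main_qp]
    for i in range(1, count // 2 + 1):
        d = step * i if mode == "constant" else step * (2 ** i - 1)
        qps = [main_qp - d] + qps + [main_qp + d]
    return qps
```

For example, `candidate_qps(30)` yields `[26, 28, 30, 32, 34]`, while the variable mode widens the outer offsets to probe more of the rate-QP curve per frame.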
9. The method according to claim 1, further comprising:
selecting an encoded data stream as an output data stream for the first input frame from the first plurality of encoded data streams based on at least one of an expected bitrate value for the first input frame, a channel bandwidth, a transmission latency, or an encoding quality.
10. The method according to claim 9, wherein:
the first plurality of coding parameter values include a main coding parameter value calculated from the expected bitrate value for the first input frame according to the rate control model, and
selecting the encoded data stream as the output data stream comprises selecting one of the first plurality of encoded data streams obtained by encoding the first input frame using the main coding parameter value as the output data stream.
11. The method according to claim 10, wherein selecting the encoded data stream as the output data stream comprises selecting one of the first plurality of encoded data streams having a corresponding bitrate of the first plurality of bitrate values that is closest to the expected bitrate value for the first input frame as the output data stream.
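In the simplest reading, the selection step of claims 9 through 11 reduces to a closest-bitrate pick with an optional bandwidth cap. This sketch assumes streams arrive as (coding parameter, bitrate) pairs; it is an illustration, not the patent's actual selection logic.

```python
def select_output_stream(streams, target_rate, max_bandwidth=None):
    """Pick the (qp, bitrate) stream whose bitrate is closest to the
    expected bitrate, optionally excluding streams above the channel bandwidth."""
    pool = [s for s in streams if max_bandwidth is None or s[1] <= max_bandwidth]
    if not pool:                        # every stream too large: take the smallest
        pool = [min(streams, key=lambda s: s[1])]
    return min(pool, key=lambda s: abs(s[1] - target_rate))
```

The bandwidth filter stands in for the claim's "channel bandwidth" criterion; latency and quality criteria would add further filters or a weighted cost in the same spot.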
12. An apparatus for rate control in data coding, comprising:
one or more processors; and
one or more memories coupled to the one or more processors and storing instructions that, when executed by the one or more processors, cause the one or more processors to:
encode a first input frame using a first plurality of coding parameter values to generate a first plurality of encoded data streams, each of the first plurality of encoded data streams being generated using a corresponding coding parameter value of the first plurality of coding parameter values and each of the first plurality of encoded data streams having a corresponding bitrate of a first plurality of bitrate values;
update a rate control model representing a correspondence between coding parameter and bitrate based on the first plurality of coding parameter values and the first plurality of bitrate values respectively corresponding to the first plurality of encoded data streams; and
encode a second input frame based on the updated rate control model.
13. The apparatus according to claim 12, wherein the instructions further cause the one or more processors to:
provide the first plurality of coding parameter values by the rate control model based at least in part on an expected bitrate for the first input frame.
14. The apparatus according to claim 13, wherein the instructions further cause the one or more processors to:
provide one of the first plurality of coding parameter values by the rate control model based on the expected bitrate for the first input frame, and gradually deviate others of the first plurality of coding parameter values from the one of the first plurality of coding parameter values at a constant interval or a variable interval.
15. The apparatus according to claim 13, wherein:
the first plurality of coding parameter values include a main coding parameter value calculated from the expected bitrate value for the first input frame according to the rate control model, and
the instructions further cause the one or more processors to iteratively update the rate control model using the first input frame until a difference between the expected bitrate value for the first input frame and an actual bitrate value obtained by encoding the first input frame using the main coding parameter value is within a preset range.
16. The apparatus according to claim 12, wherein the instructions further cause the one or more processors to:
implement a plurality of separate encoding processes on the first input frame to generate the first plurality of encoded data streams, each of the plurality of separate encoding processes including encoding the first input frame using one of the first plurality of coding parameter values to generate a corresponding one of the first plurality of encoded data streams.
17. The apparatus according to claim 12, wherein the instructions further cause the one or more processors to:
determine a coding parameter value for the second input frame based on the updated rate control model and an expected bitrate for the second input frame.
18. The apparatus according to claim 12, wherein the instructions further cause the one or more processors to:
determine a second plurality of coding parameter values for encoding the second input frame based at least in part on the updated rate control model; and
encode the second input frame using the second plurality of coding parameter values.
19. The apparatus according to claim 18, wherein the instructions further cause the one or more processors to:
determine one of the second plurality of coding parameter values for the second input frame based on an expected bitrate for the second input frame according to the updated rate control model, and
gradually deviate others of the second plurality of coding parameter values from the one of the second plurality of coding parameter values at a constant interval or a variable interval.
20. The apparatus according to claim 12, wherein the instructions further cause the one or more processors to:
select an encoded data stream as an output data stream for the first input frame from the first plurality of encoded data streams based on at least one of an expected bitrate value for the first input frame, a channel bandwidth, a transmission latency, or an encoding quality.
US16/511,839 2017-01-18 2019-07-15 Rate control Abandoned US20190342551A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
PCT/CN2017/071491 WO2018132964A1 (en) 2017-01-18 2017-01-18 Method and apparatus for transmitting coded data, computer system, and mobile device
CNPCT/CN2017/071491 2017-01-18
PCT/CN2018/072444 WO2018133734A1 (en) 2017-01-18 2018-01-12 Rate control

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/072444 Continuation WO2018133734A1 (en) 2017-01-18 2018-01-12 Rate control

Publications (1)

Publication Number Publication Date
US20190342551A1 true US20190342551A1 (en) 2019-11-07

Family

ID=59613570

Family Applications (2)

Application Number Title Priority Date Filing Date
US16/511,839 Abandoned US20190342551A1 (en) 2017-01-18 2019-07-15 Rate control
US16/514,559 Active US11159796B2 (en) 2017-01-18 2019-07-17 Data transmission

Family Applications After (1)

Application Number Title Priority Date Filing Date
US16/514,559 Active US11159796B2 (en) 2017-01-18 2019-07-17 Data transmission

Country Status (5)

Country Link
US (2) US20190342551A1 (en)
EP (1) EP3571840B1 (en)
JP (1) JP6862633B2 (en)
CN (2) CN107078852B (en)
WO (2) WO2018132964A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108521869B (en) * 2017-09-06 2020-12-25 深圳市大疆创新科技有限公司 Wireless data transmission method and device
WO2019047059A1 (en) * 2017-09-06 2019-03-14 深圳市大疆创新科技有限公司 Method and device for transmitting wireless data
US11683550B2 (en) 2017-09-18 2023-06-20 Intel Corporation Apparatus, system and method of video encoding
WO2019119175A1 (en) * 2017-12-18 2019-06-27 深圳市大疆创新科技有限公司 Bit rate control method, bit rate control device and wireless communication device
CN108986829B (en) * 2018-09-04 2020-12-15 北京猿力未来科技有限公司 Data transmission method, device, equipment and storage medium
US11368692B2 (en) * 2018-10-31 2022-06-21 Ati Technologies Ulc Content adaptive quantization strength and bitrate modeling
US11838796B2 (en) * 2021-08-18 2023-12-05 Corning Research & Development Corporation Compression and decompression between elements of a wireless communications system (WCS)
CN117615141B (en) * 2023-11-23 2024-08-02 镕铭微电子(济南)有限公司 Video coding method, system, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050180500A1 (en) * 2001-12-31 2005-08-18 Stmicroelectronics Asia Pacific Pte Ltd Video encoding
US20100111163A1 (en) * 2006-09-28 2010-05-06 Hua Yang Method for p-domain frame level bit allocation for effective rate control and enhanced video encoding quality
US20120230400A1 (en) * 2011-03-10 2012-09-13 Microsoft Corporation Mean absolute difference prediction for video encoding rate control
US20130010859A1 (en) * 2011-07-07 2013-01-10 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V Model parameter estimation for a rate- or distortion-quantization model function
US20140328384A1 (en) * 2013-05-02 2014-11-06 Magnum Semiconductor, Inc. Methods and apparatuses including a statistical multiplexer with global rate control
US20180139450A1 (en) * 2016-11-15 2018-05-17 City University Of Hong Kong Systems and methods for rate control in video coding using joint machine learning and game theory

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3265696B2 (en) * 1993-04-01 2002-03-11 松下電器産業株式会社 Image compression coding device
JPH07212757A (en) * 1994-01-24 1995-08-11 Toshiba Corp Picture compression coder
JP3149673B2 (en) * 1994-03-25 2001-03-26 松下電器産業株式会社 Video encoding device, video encoding method, video reproducing device, and optical disc
US6366614B1 (en) * 1996-10-11 2002-04-02 Qualcomm Inc. Adaptive rate control for digital video compression
US7062445B2 (en) 2001-01-26 2006-06-13 Microsoft Corporation Quantization loop with heuristic approach
US7062429B2 (en) * 2001-09-07 2006-06-13 Agere Systems Inc. Distortion-based method and apparatus for buffer control in a communication system
US20070013561A1 (en) 2005-01-20 2007-01-18 Qian Xu Signal coding
US20090225829A2 2005-07-06 2009-09-10 Do-Kyoung Kwon Method and apparatus for operational frame-layer rate control in video encoder
US7539612B2 (en) 2005-07-15 2009-05-26 Microsoft Corporation Coding and decoding scale factor information
US8077775B2 (en) 2006-05-12 2011-12-13 Freescale Semiconductor, Inc. System and method of adaptive rate control for a video encoder
JP2008283560A (en) * 2007-05-11 2008-11-20 Canon Inc Information processing apparatus and method thereof
WO2010005691A1 (en) * 2008-06-16 2010-01-14 Dolby Laboratories Licensing Corporation Rate control model adaptation based on slice dependencies for video coding
WO2010030569A2 (en) * 2008-09-09 2010-03-18 Dilithium Networks, Inc. Method and apparatus for transmitting video
JP5257215B2 (en) * 2009-04-16 2013-08-07 ソニー株式会社 Image coding apparatus and image coding method
CN102036062B (en) * 2009-09-29 2012-12-19 华为技术有限公司 Video coding method and device and electronic equipment
CN101800885A (en) * 2010-02-26 2010-08-11 北京新岸线网络技术有限公司 Video data distribution method and system method and system for distributing video data
CN101888542B (en) * 2010-06-11 2013-01-09 北京数码视讯科技股份有限公司 Control method for frame level bit-rate of video transcoding and transcoder
CN102843351B (en) * 2012-03-31 2016-01-27 华为技术有限公司 A kind of processing method of streaming media service, streaming media server and system
CN103379362B (en) * 2012-04-24 2017-07-07 腾讯科技(深圳)有限公司 VOD method and system
US20130322516A1 (en) * 2012-05-31 2013-12-05 Broadcom Corporation Systems and methods for generating multiple bitrate streams using a single encoding engine
CN102970540B (en) * 2012-11-21 2016-03-02 宁波大学 Based on the multi-view video rate control of key frame code rate-quantitative model
US9560361B2 (en) 2012-12-05 2017-01-31 Vixs Systems Inc. Adaptive single-field/dual-field video encoding
US9621902B2 (en) * 2013-02-28 2017-04-11 Google Inc. Multi-stream optimization
US20140334553A1 (en) * 2013-05-07 2014-11-13 Magnum Semiconductor, Inc. Methods and apparatuses including a statistical multiplexer with bitrate smoothing
EP2879339A1 (en) * 2013-11-27 2015-06-03 Thomson Licensing Method for distributing available bandwidth of a network amongst ongoing traffic sessions run by devices of the network, corresponding device.
KR102249819B1 (en) 2014-05-02 2021-05-10 삼성전자주식회사 System on chip and data processing system including the same
CN105208390B (en) * 2014-06-30 2018-07-20 杭州海康威视数字技术股份有限公司 The bit rate control method and its system of Video coding
US10165272B2 (en) * 2015-01-29 2018-12-25 Arris Enterprises Llc Picture-level QP rate control performance improvements for HEVC encoding
US9749178B2 (en) * 2015-09-18 2017-08-29 Whatsapp Inc. Techniques to dynamically configure target bitrate for streaming network connections
US20170094301A1 (en) * 2015-09-28 2017-03-30 Cybrook Inc. Initial Bandwidth Estimation For Real-time Video Transmission
CN105898211A (en) * 2015-12-21 2016-08-24 乐视致新电子科技(天津)有限公司 Multimedia information processing method and device
CN106170089B (en) * 2016-08-25 2020-05-22 上海交通大学 H.265-based multi-path coding method
WO2018132964A1 (en) * 2017-01-18 2018-07-26 深圳市大疆创新科技有限公司 Method and apparatus for transmitting coded data, computer system, and mobile device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200045312A1 (en) * 2016-10-05 2020-02-06 Interdigital Vc Holdings, Inc. Method and apparatus for encoding a picture
US10841582B2 (en) * 2016-10-05 2020-11-17 Interdigital Vc Holdings, Inc. Method and apparatus for encoding a picture
US11159796B2 (en) * 2017-01-18 2021-10-26 SZ DJI Technology Co., Ltd. Data transmission
US11627307B2 (en) * 2018-09-28 2023-04-11 Intel Corporation Transport controlled video coding
US11343501B2 (en) 2018-10-12 2022-05-24 Central South University Video transcoding method and device, and storage medium
US20220375133A1 (en) * 2020-02-07 2022-11-24 Huawei Technologies Co., Ltd. Image processing method and related device
US20220093119A1 (en) * 2020-09-22 2022-03-24 International Business Machines Corporation Real-time vs non-real time audio streaming
US11355139B2 (en) * 2020-09-22 2022-06-07 International Business Machines Corporation Real-time vs non-real time audio streaming
CN113660488A (en) * 2021-10-18 2021-11-16 腾讯科技(深圳)有限公司 Method and device for carrying out flow control on multimedia data and training flow control model

Also Published As

Publication number Publication date
CN107078852A (en) 2017-08-18
JP2020505830A (en) 2020-02-20
WO2018132964A1 (en) 2018-07-26
EP3571840B1 (en) 2021-09-15
JP6862633B2 (en) 2021-04-21
EP3571840A4 (en) 2020-01-22
US20190342771A1 (en) 2019-11-07
US11159796B2 (en) 2021-10-26
CN107078852B (en) 2019-03-08
CN110169066A (en) 2019-08-23
EP3571840A1 (en) 2019-11-27
WO2018133734A1 (en) 2018-07-26

Similar Documents

Publication Publication Date Title
EP3571840B1 (en) Rate control
US8331449B2 (en) Fast encoding method and system using adaptive intra prediction
JP5384694B2 (en) Rate control for multi-layer video design
EP1549074A1 (en) A bit-rate control method and device combined with rate-distortion optimization
WO2020253858A1 (en) An encoder, a decoder and corresponding methods
CN110870311A (en) Fractional quantization parameter offset in video compression
US8340172B2 (en) Rate control techniques for video encoding using parametric equations
US20130235938A1 (en) Rate-distortion optimized transform and quantization system
US9560386B2 (en) Pyramid vector quantization for video coding
US20210014486A1 (en) Image transmission
WO2021136056A1 (en) Encoding method and encoder
KR101959490B1 (en) Method for video bit rate control and apparatus thereof
JP2018067808A (en) Picture encoder, imaging apparatus, picture coding method, and program
CN113132726B (en) Encoding method and encoder
US11800097B2 (en) Method for image processing and apparatus for implementing the same
US9392286B2 (en) Apparatuses and methods for providing quantized coefficients for video encoding
US20200374553A1 (en) Image processing
CN112055211A (en) Video encoder and QP setting method
WO2019148320A1 (en) Video data encoding
US12149697B2 (en) Encoding method and encoder
WO2023172616A1 (en) Systems and methods for division-free probability regularization for arithmetic coding
KR101307469B1 (en) Video encoder, video decoder, video encoding method, and video decoding method
KR20150102874A (en) Method for coding image by using adaptive coding scheme and device for coding image using the method

Legal Events

Date Code Title Description
AS Assignment

Owner name: SZ DJI TECHNOLOGY CO., LTD., CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ZHU, LEI;REEL/FRAME:049760/0529

Effective date: 20190708

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: ADVISORY ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION