DYNAMIC QUALITY ADAPTATION USING PERSONAL QUALITY PROFILES AND COMPOSITE PERFORMANCE METRIC
This application claims the benefit of priority from United States
Provisional Patent Application no. 60/242,855 filed October 24, 2000.
Technical Field
This invention relates generally to multimedia communications and more particularly to methods and systems for adjusting the quality of the communication of video information transmitted over the Internet according to available resources.
Background
With the availability of high-performance personal computers and the popularity of broadband Internet connections, the demand for Internet-based video applications such as video conferencing, video messaging, video-on-demand, etc. is rapidly increasing. To reduce transmission and storage costs, improved bit-rate compression/decompression ("codec") systems are needed. Image, video, and audio signals are amenable to compression due to considerable statistical redundancy in the signals. Within a single image or a single video frame, there exists significant correlation among neighboring samples, giving rise to what is generally termed "spatial correlation". Also, in moving images, such as full motion video, there is significant correlation among samples in different segments of time such as successive frames. This correlation is generally referred to as "temporal correlation". There is a need for an improved, cost-effective system and method that removes the redundancy in the video to achieve high compression in transmission and to maintain good to excellent image quality, while adapting to change in
the available bandwidth of the transmission channel and to the limitations of the receiving resources of the users.
The Internet is inherently a heterogeneous environment with computers having different computing resources and networking capabilities. Furthermore, both the characteristics (such as loss rate, delay, jitter, etc.) of the connection and load-factor of the computer dynamically change; the bandwidth of the transmission channel is subject to change during transmission, and users/clients with varying receiver resources may join or leave the network during transmission. When multiple users engage in a real-time multimedia application session, it is desirable that the client application makes best utilization of the available resources of each user at any one point of time.
Although there exist methods and systems in the existing technology that purport to continually measure and adjust video compression and transmission to be commensurate with the currently available bandwidth of the transmission channel and to the receiver resources of the "least capable receiver", known methods and systems are quite crude, and ultimately adjust the quality of compression and transmission to suit only the lowest common denominator, to the detriment of all other users. What is needed is a dynamic quality adaptation method and system that makes better use of available resources for the overall benefit of the entire group of users.
Summary of Invention
This invention relates to a new method and system for dynamic adaptation in the transmittal or streaming quality of media wherein the dynamic quality adaptation occurs as a function of transmission-related parameter tracking as well as the handling capabilities of the recipient systems. In one embodiment of the present invention, a sending computer system (termed a "controller") transmits media over a connection or information network to one or more receiving computer systems (each termed a "follower"). A follower will provide transmission-related information, such as packet loss rate, round trip transmission time, and CPU load, to the controller. The controller gathers this information from all followers and calculates a "composite performance metric" value from such information, providing a cumulative measure of the quality of communications to all users; the controller then uses that composite performance metric in the dynamic adaptation process. When a calculated composite performance metric value exceeds a given threshold, the resource load due to the current quality level of the media stream (that is, the media transmission) is considered also to overrun an acceptable threshold or limit. The present invention adapts the media stream by reducing the quality of the media stream and thereby bringing resource utilization within the acceptable limit. When a calculated composite performance metric value falls short of the given threshold, the resource load is considered not to approach the acceptable resource utilization limit established, which may indicate that the quality of the media stream could be improved without exceeding this established resource threshold/limit. The present invention, in this situation, may adapt the media stream by increasing the quality of the media stream.
In this regard, the present invention also preferably uses a sequence of pre-established quality levels for the transmission of streamed media termed a "personal quality profile" ("PQP"). A PQP has an established sequence of operating qualities organized in levels, and may be set by individual users or by groups of users having similar requirements. Multiple PQPs may be used according to various implementations of the present invention. For example, one PQP may be used for
all users having narrowband connections while another PQP may be used for all users having broadband connections. In another example, each individual user may have an individual PQP. The PQP is used when the present invention adjusts the media stream quality as a result of a calculated composite performance metric value. When quality is to be improved, the next higher level in the PQP is used to determine the new transmission quality settings. In the reverse, when quality is to be decreased, the next lower level in the PQP is used to determine the new transmission quality settings. Each PQP level may contain various transmission settings. For example, in one embodiment of the present invention, a given level in a PQP may contain transmission settings for picture quality, frame rate, and audio quality.
The individual PQPs or group PQPs, as the case may be, cooperate with the composite performance metric of the present invention to provide a sophisticated adaptation procedure that makes better use of available resources than do methods that merely adjust quality downward to suit the least capable receiver, an approach that underutilizes resources in respect of the group of users as a whole. The method and system of the present invention use the composite performance metric to assess and adjust the overall quality of communications to the group of users as a whole, and then adjust the quality of communications to individual users up or down within the self-declared preference levels in the PQP of each user or group of users. By doing so, one individual user's limited resources do not unduly affect the quality of communications to other users within the group.
Brief Description of Drawings
In the Figures, which illustrate non-limiting embodiments of the invention:
FIG. 1 is a schematic representation of a "controller-follower" peer-to-peer video communication system according to the present invention.
FIG. 2 is a functional block diagram of the controller-follower system of FIG. 1.
FIG. 3 is a state diagram of the adaptation procedure according to an embodiment of the present invention.
FIG. 4A is a flow chart of the "STEADY" state of FIG. 3.
FIG. 4B is a flow chart of the "MEASUREMENT" state of FIG. 3.
FIG. 4C is a flow chart of the "HYSTERESIS" state of FIG. 3.
Description
Throughout the following description, specific details are set forth in order to provide a more thorough understanding of the invention. However, the invention may be practiced without these particulars. In other instances, well known elements have not been shown or described in detail to avoid unnecessarily obscuring the present invention. Accordingly, the specification and drawings are to be regarded in an illustrative, rather than a restrictive, sense.
This invention provides a new method and system for dynamic adaptation of media quality based on user input in the form of "personal quality profiles" ("PQP") and a composite set of monitored performance parameters. Although the invention will be described with a two-way video conference example, the techniques also generalize to any interactive multi-way application.
Video Compression Prerequisite
Although the method and system of the invention are independent of the choice of video codecs, it is desirable that the applied video codec supports the following features:
• A scalable video codec in terms of both network bandwidth and computing power requirement is highly desirable. This means that the same video stream can serve users having either broadband or narrowband network connections. Users having broadband network connections may receive the full video stream providing very good overall quality. Users using narrowband network connections receive only part of the video bitstream providing lower overall quality. Similarly, high-performance clients receive and process the full video bitstream while low-performance clients may only receive and process part of the bitstream.
• The video codec should have low computational complexity in order to support software-only implementation for encoding and decoding.
• Finally, the applied video codec should provide high transmission error resilience in order to support recovery from transmission errors and minimization of visual distortion. This should be accomplished without any retransmission support from the networking infrastructure since: (1) retransmission introduces delay undesirable for time-sensitive multimedia data; and (2) it does not scale well for multicast video transmission.
The Controller-Follower Model
The method and system of the invention use the controller-follower system model as shown in Figure 1. In this model, for multi-client applications, one user/client is selected to be the controller and the rest of the users/clients become followers. The method and system of the invention provide dynamic adaptation of quality of audio and video media based on the following concepts:
• all users, namely the controller and all followers, monitor a few performance parameters and report them to the controller;
• the controller uses the aggregation of monitored performance parameters as a composite performance metric in order to make adaptation decisions; and
• all users/clients enforce these adaptation decisions.
In one implementation, the controller and the follower(s) monitor performance parameters such as network delay, packet loss rate, and CPU load. Based upon this information, the controller makes decisions such as "do not change operating quality", "move to higher operating quality", or "move to lower operating quality".
There are many ways a controller may be selected in a multi-client multimedia session. One way can be to select the session initiator (for example, the conference caller for a video conference session).
Another way could be to choose the user/client with highest CPU capacity. Yet another way could be to choose the user/client with the lowest number for its IP address.
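The three selection rules mentioned above can be sketched as follows. This is only an illustration: the `Client` record, its `cpu_capacity` score, and the sample addresses are hypothetical names and values introduced for the sketch, and the description does not prescribe any particular data structure.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    ip: str
    cpu_capacity: float  # hypothetical relative score; higher means faster
    is_initiator: bool = False


def select_by_initiator(clients):
    """Pick the session initiator (for example, the conference caller)."""
    return next(c for c in clients if c.is_initiator)


def select_by_cpu(clients):
    """Pick the client with the highest CPU capacity."""
    return max(clients, key=lambda c: c.cpu_capacity)


def select_by_lowest_ip(clients):
    """Pick the client whose IP address has the lowest numeric value."""
    return min(clients, key=lambda c: int(ipaddress.ip_address(c.ip)))


# Invented sample session with three clients.
clients = [
    Client("alice", "192.168.1.20", 2.4, is_initiator=True),
    Client("bob", "172.16.5.1", 3.2),
    Client("carol", "10.0.0.7", 1.8),
]
```

Any rule works as long as every client applies the same rule to the same client list, so that all of them agree on which client is the controller.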
Figure 2 is a functional block diagram of an exemplary adaptive controller-follower multimedia system with two clients: the controller and one follower. Each client has media objects such as "Video Sender", "Audio Sender", "Video Receiver", and "Audio Receiver". The "Adaptation Manager" of the follower consists of an object termed the QMonitor, which communicates with the Video Receiver and Audio Receiver objects in order to monitor a few performance parameters. The Adaptation Manager of the controller includes two objects, the QMonitor and the QController. The QController receives messages from all QMonitors and executes an adaptation procedure in order to decide whether and how the adaptation should proceed. An adaptation decision is then sent to the Audio Sender and Video Sender objects (namely, encoders) of all clients for enforcement.
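The message flow of Figure 2 might be sketched roughly as below. The class and field names (`Report`, `Sender`, `receive`, `broadcast`) are assumptions made for this sketch, the `Sender` class is only a stand-in for an encoder object, and how the QController arrives at a decision is deliberately left out here.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    HOLD = "do not change operating quality"
    RAISE = "move to higher operating quality"
    LOWER = "move to lower operating quality"


@dataclass(frozen=True)
class Report:
    """One QMonitor message (field names are hypothetical)."""
    client_id: str
    loss_rate: float
    rtt_ms: float
    cpu_load: float


class Sender:
    """Stand-in for a Video Sender or Audio Sender (encoder) object."""

    def __init__(self):
        self.level = 0

    def enforce(self, decision):
        if decision is Decision.RAISE:
            self.level += 1
        elif decision is Decision.LOWER:
            self.level = max(0, self.level - 1)


class QController:
    def __init__(self, senders):
        self.senders = list(senders)
        self.latest = {}  # most recent report per client

    def receive(self, report):
        self.latest[report.client_id] = report

    def broadcast(self, decision):
        """Send one adaptation decision to every sender for enforcement."""
        for sender in self.senders:
            sender.enforce(decision)
```

In this sketch the controller keeps only the most recent report from each QMonitor, which is one simple policy among several possible ones.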
Personal Quality Profile
This invention introduces the concept of "personal quality profile" ("PQP"). The PQP provides a way for each user or group of users to specify an input in order to control the direction of adaptation. It is a sequence of acceptable operating qualities in increasing order of preference (from minimum acceptable quality to highest desired quality). For example, a PQP with eight quality levels is shown in Table I:
Table I: Personal quality profile.
It is important to note that raising PQP level means higher media quality (in terms of frame rate, picture quality, and audio quality), implying increase in network bandwidth and CPU capacity requirements. Similarly, reducing PQP level means lower media quality requiring lower transmission bandwidth and lower CPU capacity. The objective of the adaptation procedure is to provide the highest possible PQP operating level for given network conditions and computing power.
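A PQP can be modelled as an ordered list of levels with a pointer to the current one. The sketch below assumes each level carries a frame rate, a picture quality, and an audio bit-rate; the field names, the 1 to 10 picture-quality scale, and the three sample levels are invented for illustration and are not the values of Table I.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityLevel:
    frame_rate_fps: int
    picture_quality: int  # hypothetical scale: 1 (coarse) to 10 (fine)
    audio_kbps: float


class PersonalQualityProfile:
    """Ordered levels, from minimum acceptable to highest desired quality."""

    def __init__(self, levels, start=0):
        if not levels:
            raise ValueError("a PQP needs at least one level")
        self.levels = list(levels)
        self.current = start

    @property
    def settings(self):
        return self.levels[self.current]

    def raise_level(self):
        """Move one level up; return False if already at the top."""
        if self.current + 1 < len(self.levels):
            self.current += 1
            return True
        return False

    def lower_level(self):
        """Move one level down; return False if already at the bottom."""
        if self.current > 0:
            self.current -= 1
            return True
        return False


# Illustrative three-level profile; the values are invented for this sketch.
pqp = PersonalQualityProfile([
    QualityLevel(5, 3, 2.4),
    QualityLevel(10, 5, 2.4),
    QualityLevel(15, 7, 32.0),
])
```

Clamping at the top and bottom levels is a choice made here; the description does not say what happens when a raise or a drop is requested at the end of the profile.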
The present invention is independent of how users specify their PQPs. For example, in one implementation, users can choose from a few preset choices designed by application programmers in order to meet most requirements. In another implementation, users may use a graphical user interface to specify their customized PQP. However, all users (controller and all followers) within a given controller-follower system will preferably use PQPs having the same number of quality levels, in order to lower the computation complexity of the adaptation algorithm that simultaneously moves each user (or group of users) up or down levels within that user's PQP in response to changes in the composite performance metric. Having the same number of levels makes it easier to move all users up a PQP level or down a PQP level.
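The benefit of equal-length PQPs can be seen in a small sketch: a single shared level index selects the settings for every user at once. The function name and the tuple layout (frame rate, picture quality, audio kbps) are assumptions of this sketch, and the sample profiles are invented.

```python
def settings_for_all(pqps, level):
    """Return each user's settings at one shared PQP level index.

    `pqps` maps a user name to that user's list of levels.
    """
    sizes = {len(levels) for levels in pqps.values()}
    if len(sizes) != 1:
        raise ValueError("all PQPs must have the same number of levels")
    (size,) = sizes
    if not 0 <= level < size:
        raise IndexError("level out of range")
    return {user: levels[level] for user, levels in pqps.items()}


# Invented profiles: (frame rate in fps, picture quality, audio kbps).
pqps = {
    "alice": [(5, 3, 2.4), (10, 5, 2.4), (15, 7, 32.0)],
    "bob": [(2, 2, 2.4), (5, 3, 2.4), (8, 4, 2.4)],
}
```

With profiles of different lengths, the controller would need an extra mapping between each user's levels, which is the added complexity the paragraph above seeks to avoid.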
Each user's PQP will define what that user wishes to receive and, depending on the user's own resource limitations, the user will subscribe to the suitable level of qualities in selecting its PQP. The QController compares the PQPs of all users and determines how best to encode media into multiple layers in order to support the users to send and receive video information according to their desired quality levels. Some grouping of PQPs may be needed in some applications, and this invention supports grouped PQPs.
In one implementation involving group PQPs, two sets of PQPs are designed. The first PQP set is designed for all users having narrowband connections including dial-up modem, wireless connection, etc. The other PQP set is designed to accommodate all broadband users connected over ISDN, xDSL, cable modem, T1, etc. It is possible to merge the two sets of quality profiles into one set accommodating both narrowband and broadband users. At this stage, however, having two separate sets of PQPs appears to be more efficient for this particular application. It is important to note that a computer program according to the invention can automatically detect the network connection speed and decide whether to use the narrowband PQP or broadband PQP, being totally transparent to the user.
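The automatic choice between the two PQP sets could look like the following sketch. The 100 kbps cut-off is a hypothetical value chosen only so that a dial-up modem falls on the narrowband side and ISDN, xDSL, cable modem, and T1 fall on the broadband side; the description gives no threshold, and the speed measurement itself is not shown.

```python
NARROWBAND_LIMIT_KBPS = 100.0  # hypothetical cut-off, not taken from the text


def choose_pqp_set(measured_kbps, limit_kbps=NARROWBAND_LIMIT_KBPS):
    """Return which PQP set to use for a measured connection speed."""
    if measured_kbps < 0:
        raise ValueError("connection speed cannot be negative")
    return "narrowband" if measured_kbps < limit_kbps else "broadband"
```

For example, `choose_pqp_set(56)` selects the narrowband set and `choose_pqp_set(1544)` selects the broadband set.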
This particular example involving grouping of PQPs does not change the fact that individual users can still create their own customized individual PQP. However, when network bandwidth and properties are widely different for two general types of interfaces, it may be more efficient in some circumstances simply to specify two general categories of PQP from which users can make minor customizations.
It is conceivable that a user will create a customized PQP with unrealistic expectations of quality. However, the user's choices in setting up PQP levels nevertheless define that user's relative priorities regarding quality. Accordingly, even an unrealistic user PQP can be effectively used to define the relative importance to that user of different aspects of media quality (for example, frame rate versus picture quality versus audio quality). The method and system of the invention are therefore still able to make use of that user's PQP to adjust quality according to available resources, while still accommodating to some extent the stated preferences of that user.
Adapted Media Quality Parameters
There are a number of media quality parameters that may be dynamically adjusted in order to make optimal use of the available resources. One implementation of the invention uses the following parameters:
• Video Frame Rate: The frame rate can be set in two ways. If the applied video codec supports temporal scalability, the decoder temporal resolution is changed. If the video codec does not support temporal scalability, the encoder rate is increased or decreased.
• Video Picture Quality: The picture quality can also be set in two ways. If the video codec does not support scalable video representation, e.g., H.263, the quantization parameter is changed. If the video codec supports scalable video representation, the combination of both spatial resolution and SNR resolution is used to set the picture quality.
• Audio Quality: Since most of the popular audio codecs can operate only at one fixed bit-rate, audio quality adaptation is implemented by switching audio codecs. The selection of codecs ranges from MELP (mixed excitation linear prediction), providing intelligible speech quality at 2.4 kilobits per second (kbps), to ADPCM (adaptive differential pulse code modulation), providing excellent audio quality at 32 kbps.
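Audio adaptation by codec switching can be illustrated with a small lookup. Only the two codecs and bit-rates named above are listed; the idea of an audio "budget" in kbps and the function name are assumptions of this sketch, and a real implementation would likely list intermediate codecs as well.

```python
# (codec name, bit-rate in kbps), ordered from lowest to highest rate.
AUDIO_CODECS = [("MELP", 2.4), ("ADPCM", 32.0)]


def pick_audio_codec(budget_kbps):
    """Return the highest-rate codec that fits the budget.

    Falls back to the lowest-rate codec when nothing fits (an assumption).
    """
    best = AUDIO_CODECS[0]
    for codec in AUDIO_CODECS:
        if codec[1] <= budget_kbps:
            best = codec
    return best
```

A PQP level would then store an audio budget (or directly a codec name), and moving between levels switches the codec rather than retuning a single one.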
Composite Performance Metric
The composite performance metric serves to map several monitored parameters from the participating users to a decision on how to change the operating quality of a session (the PQP level). One implementation uses the following monitored parameters both at the controller and followers:
• Packet loss rate: If the packet loss rate is too high, it will influence the composite performance metric towards removing one PQP layer. If the packet loss rate is very low, it will push the composite performance metric towards adding one PQP layer. While raising the PQP level means increasing bandwidth, lowering the PQP level will decrease the required bandwidth, which reduces network congestion.
• Round trip time (RTT): Low RTT is necessary because videoconferencing is an interactive application. Increasing RTT may also indicate that the network buffers are slowly filling up, introducing larger delays and eventually packet losses. High RTT will push the composite performance metric towards removing one PQP layer, and low RTT will push it towards adding one PQP layer. Reducing a PQP layer will decrease the network load and will result in lower RTT.
• CPU load: If the applied video codecs provide symmetric encoder/decoder complexity, it is sufficient to measure the decoder time only. Otherwise (in the case of H.263, etc.) both the encoder and decoder time are measured. High CPU load will force the composite performance metric to decrease one PQP layer. Low CPU load will result in adding one PQP layer. Reducing one PQP layer will result in lower computational cost. Adding one PQP layer will result in higher computational cost.
These parameters are monitored by all the QMonitors and fed to the QController, which calculates the composite performance metric.
By way of example, the following formula can be used for the calculation of the composite performance metric using the monitored performance parameters:
M = A [ l(c) + l(f) ] + B [ r(c) + r(f) ] + C [ p(c) + p(f) ]
where
M is the composite performance metric;
l(c) and l(f) denote the packet loss rate at the controller and follower(s), respectively;
r(c) and r(f) denote the RTT of the controller and follower(s), respectively;
p(c) and p(f) denote the CPU load at the controller and follower(s), respectively; and
A, B and C are adjustable constants weighting the packet loss rate, RTT, and CPU load, respectively.
Of course, this formula may be modified to include other monitored performance parameters, as desired.
By using a threshold T (reflecting the limit of available resources), this composite performance metric is mapped to a binary decision. M > T indicates overrun, and so the media quality level shall be decreased. M < T indicates that no resource limit has been reached, and so the media quality level shall be increased.
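The example formula and the threshold test can be written out as a short sketch. Several points are assumptions made here and not statements from the description: the follower terms are summed when there is more than one follower, the inputs are taken to be already scaled to comparable ranges, the weights and sample values are invented, and the case M = T (which the text does not cover) is treated as "hold".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    loss_rate: float  # l: fraction of packets lost
    rtt: float        # r: round trip time, assumed pre-scaled
    cpu_load: float   # p: fraction of CPU in use


def composite_metric(controller, followers, A, B, C):
    """M = A[l(c) + l(f)] + B[r(c) + r(f)] + C[p(c) + p(f)]."""
    loss = controller.loss_rate + sum(f.loss_rate for f in followers)
    rtt = controller.rtt + sum(f.rtt for f in followers)
    cpu = controller.cpu_load + sum(f.cpu_load for f in followers)
    return A * loss + B * rtt + C * cpu


def decide(M, T):
    """Map the metric to a decision using threshold T."""
    if M > T:
        return "decrease"  # overrun
    if M < T:
        return "increase"  # no resource limit reached
    return "hold"          # M == T is not covered by the text


# Invented sample: one controller, one follower, weights A=10, B=1, C=1.
M = composite_metric(Sample(0.02, 0.10, 0.50), [Sample(0.04, 0.20, 0.70)],
                     A=10, B=1, C=1)
```

With these invented numbers M is about 2.1 (0.6 from loss, 0.3 from RTT, 1.2 from CPU load), so a threshold of 2.0 signals overrun while a threshold of 3.0 does not.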
Adaptation Procedure
The general idea of the adaptation procedure is the following:
(1) Monitor the performance of the application for a period of time using some timer.
(2) If monitored performance is better than a minimum acceptable value (there are sufficient resources available), raise the operating quality by moving to a higher PQP level, and then go back to Step 1.
(3) If monitored performance is worse than a minimum acceptable value (resource overrun), lower the operating quality by moving to a lower PQP level, and then go back to Step 1.
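Steps (1) to (3) amount to a simple loop, sketched below. The sketch assumes a single scalar `performance` reading per timer period where higher means better, clamps the level at both ends of the profile, and leaves the level unchanged when the reading equals the minimum acceptable value; all three are assumptions, and the sample readings are invented.

```python
def adapt_once(level, performance, minimum_acceptable, top_level):
    """Apply step (2) or step (3) once and return the new PQP level."""
    if performance > minimum_acceptable:
        return min(level + 1, top_level)  # step (2): resources available
    if performance < minimum_acceptable:
        return max(level - 1, 0)          # step (3): resource overrun
    return level


def run(start_level, samples, minimum_acceptable, top_level):
    """Step (1) repeated: one sample stands for one monitoring period."""
    level = start_level
    history = []
    for performance in samples:
        level = adapt_once(level, performance, minimum_acceptable, top_level)
        history.append(level)
    return history
```

For instance, `run(0, [0.9, 0.9, 0.2, 0.9], 0.5, 7)` climbs two levels, drops one on the poor reading, and climbs again.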
According to an embodiment of the invention, the adaptation procedure uses three states: the STEADY state, the MEASUREMENT
state, and the HYSTERESIS state. The PQP level may be raised in the STEADY state and may be lowered in the MEASUREMENT state.
According to this embodiment, the adaptation procedure uses two timers whose timeout triggers change of states: the join-timer TJ used for transition to a higher PQP level and the detection-timer TD used for transition to a lower PQP level. The detection-timer preferably uses a fixed value for timeout duration. The value for join-timer timeout duration, however, is preferably dependent on the PQP level. The join-timer timeout duration is preferably updated based on the outcome of the previous PQP level transition experiment.
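A per-level join-timer table might be kept as in the sketch below. The factor of two follows the example given in the state descriptions of this embodiment; the initial value, the lower and upper bounds, the method names, and the reading that the doubled entry is the one for the level dropped to are all assumptions of this sketch.

```python
class JoinTimeouts:
    """Join-timer timeout duration per PQP level (seconds, hypothetical)."""

    def __init__(self, num_levels, initial=4.0, minimum=1.0, maximum=64.0):
        self.values = [initial] * num_levels
        self.minimum = minimum
        self.maximum = maximum

    def on_raise(self, new_level):
        """A raise to new_level: halve the timeouts of all lower levels."""
        for level in range(new_level):
            self.values[level] = max(self.minimum, self.values[level] / 2)

    def on_drop(self, level):
        """An overrun forced a drop to `level`: double its timeout."""
        self.values[level] = min(self.maximum, self.values[level] * 2)
```

The effect is a back-off: a level from which raises keep failing is left alone for longer before the next attempt, while levels that have been passed are retried more quickly.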
Figure 3 is a state diagram of the adaptation procedure according to this embodiment of the invention, and can be better understood with reference to the flow charts in Figures 4A, 4B, and 4C. M denotes the composite performance metric that includes the packet loss rate, CPU overload, and RTT of both the controller and followers. T denotes the composite performance metric threshold described above. M > T indicates overrun.
The adaptation procedure starts at a predetermined PQP level. The starting PQP level may be chosen in many ways. A simple choice would be to start with the lowest PQP level. Another choice would be to find a suitable starting level using a mix of performance measurement and information from past experiences. The adaptation procedure starts by setting a join-timer.
• STEADY State: The flow chart of the STEADY state is shown in Figure 4A. At any time after a previously set join-timer expires, the adaptation procedure enters the STEADY state. The composite performance metric is then calculated. If the composite performance metric does not indicate overrun, a transition experiment is started by raising the PQP by one level and the join-timer timeout durations of all lower levels are reduced (for example, by a factor of two). If the composite performance metric indicates overrun, no transition experiment is started and the PQP layer is left unchanged. Finally, the detection-timer is set, and the adaptation procedure is set to proceed to the MEASUREMENT state upon expiry of the detection-timer.
• MEASUREMENT State: The flow chart of the MEASUREMENT state is shown in Figure 4B. When the previously-set detection-timer expires, the system enters the MEASUREMENT state. The composite performance metric is then calculated. If the composite performance metric indicates overrun, the PQP level is lowered by one level, and the join-timer timeout duration for that level is increased (for example, by a factor of two). If there was no overrun, the PQP level is not changed. Then, the join-timer for the next level is set regardless of whether or not the level was raised. If the join-timer timeout duration is longer than the detection-timer timeout duration, the detection-timer is set as well and the adaptation procedure is set to proceed to the HYSTERESIS state when the detection-timer expires. Otherwise, only the join-timer is set, and the adaptation procedure proceeds to the STEADY state after the join-timer expires.
• HYSTERESIS State: The flow chart of the HYSTERESIS state is shown in Figure 4C. This state is used only when the join-timer timeout duration is longer than the detection-timer timeout duration at the end of the MEASUREMENT state. In such a case, both timers have been set, and when the detection-timer expires first, the adaptation procedure enters the HYSTERESIS state. The composite performance metric is then calculated. If there is no overrun, the detection-timer is set again and the adaptation procedure continues in the HYSTERESIS state when the detection-timer next expires. In the absence of an overrun, the adaptation procedure will continue in the HYSTERESIS state until the join-timer expires, whereupon it will immediately transfer to the STEADY state. However, if an overrun is detected during the HYSTERESIS state, the current join-transition experiment is immediately cancelled and the join-timer is deactivated; in that case, the detection-timer is still set, but the adaptation procedure is set to proceed to the MEASUREMENT state when the detection-timer next expires instead of continuing in the HYSTERESIS state.
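The three states can be exercised with a small simulation driven by a logical clock in place of real timers. This is a sketch under stated assumptions and not the flow charts of Figures 4A to 4C: the class and attribute names are invented, the caller supplies the overrun flag (M > T) at each expiry, the join-timer uses the timeout stored for the current level, the level is clamped at both ends, and a join-timer that expires at the same instant as the detection-timer takes precedence.

```python
class Adapter:
    def __init__(self, num_levels, detect_timeout=1.0, join_timeout=2.0):
        self.level = 0
        self.num_levels = num_levels
        self.detect_timeout = detect_timeout
        self.join_timeouts = [join_timeout] * num_levels
        self.now = 0.0
        # The procedure starts by setting a join-timer.
        self.join_deadline = self.join_timeouts[self.level]
        self.detect_deadline = None
        self.on_detect = None  # state entered when the detection-timer expires

    def step(self, overrun):
        """Advance to the next timer expiry and return the state entered."""
        join_first = self.join_deadline is not None and (
            self.detect_deadline is None
            or self.join_deadline <= self.detect_deadline)
        if join_first:
            self.now = self.join_deadline
            self.join_deadline = self.detect_deadline = None
            return self._steady(overrun)
        self.now = self.detect_deadline
        self.detect_deadline = None
        if self.on_detect == "MEASUREMENT":
            return self._measurement(overrun)
        return self._hysteresis(overrun)

    def _steady(self, overrun):
        if not overrun and self.level < self.num_levels - 1:
            self.level += 1  # start a transition experiment
            for lower in range(self.level):
                self.join_timeouts[lower] /= 2
        self.detect_deadline = self.now + self.detect_timeout
        self.on_detect = "MEASUREMENT"
        return "STEADY"

    def _measurement(self, overrun):
        if overrun and self.level > 0:
            self.level -= 1
            self.join_timeouts[self.level] *= 2
        join = self.join_timeouts[self.level]
        self.join_deadline = self.now + join
        if join > self.detect_timeout:
            self.detect_deadline = self.now + self.detect_timeout
            self.on_detect = "HYSTERESIS"
        return "MEASUREMENT"

    def _hysteresis(self, overrun):
        self.detect_deadline = self.now + self.detect_timeout
        if overrun:
            self.join_deadline = None  # cancel the pending join experiment
            self.on_detect = "MEASUREMENT"
        return "HYSTERESIS"
```

Feeding the sequence no overrun, overrun, no overrun, no overrun to a fresh `Adapter(3)` visits STEADY, MEASUREMENT, HYSTERESIS, and STEADY in turn: the first raise is undone by the overrun, and the second raise happens only after the doubled join-timer has run out.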
As will be apparent to those skilled in the art in the light of the foregoing disclosure, many alterations and modifications are possible in the practice of this invention without departing from the scope thereof. Accordingly, the scope of the invention is to be construed in accordance with the substance defined by the following claims.