WO2015076694A9 - Method for delivering personalized interactive video stream - Google Patents
Method for delivering personalized interactive video stream
- Publication number
- WO2015076694A9 (PCT/RU2013/001057)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- client device
- buffer
- server
- communication
- user input
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/24—Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth, upstream requests
- H04N21/2401—Monitoring of the client buffer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/61—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
- H04L65/612—Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
- H04L65/75—Media network packet handling
- H04L65/762—Media network packet handling at the source
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44004—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving video buffer management, e.g. video decoder buffer or video display buffer
Definitions
- the present invention relates to the field of Internet TV and can be used to deliver a personalized interactive video stream over the Internet.
- Interactive TV systems have gained popularity in recent years, said systems enabling a user to interact with information displayed to him on a screen and to tailor the content to his or her preferences.
- the user can select a television broadcast and a time of viewing it.
- the user can rewind video in an accelerated mode or decelerate it, look at additional information and entertainment messages (for example, forecast of the weather, road traffic data, exchange quotations, rates of currencies, etc.) on top of video, or look at the second video program in a small window simultaneously with the first one.
- Interactive interaction is carried out as follows: a user presses a button on the remote control of his or her TV set or set-top box (or presses a mouse button of a computer or a tablet touch screen), and in response, a screen image changes. A menu appears on a screen, and the user is able to select a menu item or message on top of video or other video, etc.
- the remote control sends a signal to a client device, that is, a user device (a computer, a TV set, etc.).
- a server generates a response to this user input and transmits said information to the user device.
- Traditional Internet TV (IPTV and OTT) is constructed as follows: a producer generates a video stream in a TV studio and transmits it to a client. Video stream generation is taken to mean video shooting, video and audio joining, video compositing (drawing of information messages and widgets on top of video, picture-in-picture, etc.), and video rendering. A generated video stream is encoded and transmitted to a client via the Internet (OTT) or via an individual cable communication operator network (IPTV). A client device receives video, decodes it and displays it on a screen.
- the video stream generation process can be organized in another way.
- the video stream generation can be conceptually and technologically divided into two portions.
- the first portion is the video and audio shooting, while the second portion is the video compositing, video rendering and video encoding.
- An owner of a TV channel can independently carry out the first portion and transfer organization of the second portion to external services. This essentially accelerates and simplifies creation of the TV channel.
- the present invention is of current interest for a case where production of TV channel is divided into two portions by the way described above.
- the video stream rendering, compositing and encoding are carried out by a service external with respect to the creator of the TV channel.
- such a service can be placed on the Internet and be available for use by many TV channel creators and many clients simultaneously. In doing so, the problem of delayed response to user actions occurs. This problem will be described later.
- Fig. 1 depicts the general structure of the Internet TV according to the prior art in which the video and audio shooting are conceptually and technologically separated from the video rendering and compositing: a client device 1 exchanges signals with a server 2 via the Internet, said server carrying out the video stream rendering and compositing.
- the server receives commands from a program 4 describing video stream scenes, receives data from video and audio source(s) 3, carries out the rendering and compositing, encodes the video and sends the encoded video stream via the Internet to the client device 1.
- Fig. 2 illustrates a result of a user input, particularly, of pressing a button of a remote control or a tablet touch screen, etc., as known from the prior art.
- The upper screen in Fig. 2 shows the video, and the lower screen in Fig. 2 shows that, after the user input 5, a menu 6 has appeared on top of the video on the same screen.
- the buffering technology is in common use in the Internet TV.
- This technology allows storage of data packets (video frames, audio data, and any other data) to use in future.
- the question is storage of video and audio data so that said data can be displayed later.
- FIG. 3 illustrates the present technology.
- a server 7 sends data packets to a client device 8 where a video displaying program (video player) 9 is installed.
- a buffer of non-decoded frames 10 stores several seconds of encoded video.
- Data packets from this buffer arrive at a decoder 11 that can be either a software or a hardware one and can either comprise or not comprise its own buffer.
- Decoded frames from the decoder arrive at a next buffer 12 containing the decoded frames.
- the video player transmits the data from this latter buffer for demonstration to a screen 13.
- Said buffers smooth delays in the data delivery and make it possible to protect the user against fluctuations in the network throughput.
- the server generates a response to a user input and transmits said response to the client device.
- the client device comprises buffers containing several seconds of video.
- the video is divided into frames that are in a linear order, that is, one after another.
- the frames arrive at the decoder in the same order as they are arranged in the buffer: one after another.
- the server response to the user input is a new encoded video frame that is positioned at the end of the frame queue (if the buffer has N frames, then the new frame becomes the (N+1)-th one). The new frame arrives at the decoder after all frames contained in the buffer prior to it.
- the user receives the response to his or her input after viewing the video from the buffer, that is, with a delay equal to a buffer length. This delay essentially slows down the user's interaction with video content.
- the reduction of an input video stream buffer in size increases the malfunction probability in playback of the video stream, while the enlargement of the input video stream buffer in size increases a time delay between a time when the video stream enters the client device and a time of a user response to displayed information contained in the video stream, with reduction in the interactivity.
- the present invention is of current interest for packetized data, that is, data which is delivered from a server to a client device in the packetized form.
- the present invention solves the problem of a response to a user input being delayed by the time necessary to play out the data from a buffer.
- the present invention allows reduction in a delay of demonstrating a determined frame on a screen of the client device, and generally allows control of a time of demonstrating the frame on the client device. Owing to this, apart from reduction in the interaction delay, it is possible to use the invention for urgent demonstration of new frames on the screen, for example, to broadcast news urgently.
- Disclosure is made of a method for delivering a personalized interactive video stream over a network from a server, where all information on all entities forming a video scene at any time is stored, to a client device comprising a decoder and an incoming video stream buffer and being in communication with a display device, said method comprising the steps of:
- new reference IDR frame i.e., such a reference frame that frames following it can be decoded without taking frames arrived prior to the reference IDR frame into account
- new reference IDR frame being sent by the server to the client device along with a command to flush buffers and to display new frame
- the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a remote control panel of the client device or the display device being in communication with the client device.
- the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key of a mouse of the client device or the display device being in communication with the client device, or moving a cursor of the mouse of the client device or the display device being in communication with the client device to a predetermined area of a screen of the client device or the display device being in communication with the client device.
- the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key of a keypad of the client device or the display device being in communication with the client device.
- the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key arranged on a housing of the client device or the display device being in communication with the client device.
- the user input carried out at the client device or at the display device being in communication with the client device in the method is a command issued by the user - by means of voice, mimics or gestures - to the client device or the display device being in communication with the client device.
- the user input carried out at the client device or at the display device being in communication with the client device in the method is a command issued by the user - by means of touching a touch panel, a contactless touch controller (Kinect), a Leap Motion controller or any other user input device - to the client device or the display device being in communication with the client device.
- the decoder comprises a buffer whose size can be controlled, wherein a "flush" command is sent to a non-decoded data buffer, a decoder buffer and a decoded data buffer.
- the decoder comprises a buffer whose size and content cannot be controlled by a command from outside, while a "flush" command is sent to a non-decoded data buffer and a decoded data buffer.
- the decoder does not comprise a buffer, while a "flush" command is sent to a non-decoded data buffer and a decoded data buffer.
- the method according to the present invention consists of the steps of: re-setting a local time in the server backwardly to a state to be seen by a client in the device; and outputting a buffer flush command, as arrived from the server, to the client device.
- the client device sends the user input information to the server.
- the user input carried out at the client device is an action selected from the group consisting of pressing a remote control panel, pressing a key of a mouse, moving a cursor of the mouse to a predetermined area of a screen, touching a touch screen, touching a key of a keypad, pressing a key arranged on a housing of the client device, and of a command issued by the user by means of voice, a command issued by the user by means of gestures.
- the client device also sends information to the server, which information is on a size of the buffer present in the client device at a time when the user input was made.
- the server stores all information on all entities forming a video scene at any time, said entities being contained in the user buffer.
- said entities can describe a video frame, creeping lines, information solids, etc.
- at the time of carrying out the user input, the server generates a frame to be received by the client device after reception of all frames from the buffer (if the client device displays a frame No. 0 at the display device, and the buffer has N frames, then the server generates a frame No. N+1).
- having received the user input information, the server returns the scene and all entities therein to the state seen by the client on the screen at this time. This state corresponds to a frame No. 0+k, where k is the number of frames that elapsed while the user input information was delivered to the server.
- the server then generates a new reference IDR frame (i.e., such a reference frame that frames following it can be decoded without taking frames arrived prior to the reference IDR frame into account) that is to appear on the screen as a result of the user input.
- the server sends this frame to the client device along with a buffer flush and new frame display command. Having received this command, the client device removes all frames present in the client device buffer from said buffer and displays the new frame arrived from the server.
- the client thus sees a response to his or her own input practically without a delay.
- upon execution of the command, the buffer stays empty, and the client is not protected against network problems. To solve this problem, a data stream from the server can be accelerated in order to accelerate the filling of the buffer therewith.
- the server usually generates a constant number of frames per second and sends them to the client device; the same number is displayed on the screen for this period.
- upon execution of the command, the server can generate frames and send them to the display device more frequently for some period in order to fill the buffer more quickly.
- Fig. 1 depicts the general structure of the Internet TV according to the prior art in which the video and audio shooting is conceptually and technologically separated from the video rendering and compositing.
- Fig. 2 illustrates a result of a user input (pressing a button of a remote control or a tablet touch screen, etc.), as known from the prior art.
- Fig. 3 illustrates structure of buffers in a program at a client device, as known from the prior art.
- Fig. 4 illustrates the sending of a user input message to a server and a response of the server to a user input.
- Figs. 5a-5d illustrate a buffer flush process after reception of a "flush" command if a decoder does not comprise a buffer.
- Figs. 6a-6f illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content can be controlled by commands from outside.
- Figs. 7a-7e illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside.
- Fig. 8 illustrates the hasting - accelerated frame sending - process.
- Figs. 9a-9d illustrate the buffer filling process during hasting.
- a user presses an input device that sends a signal to a signal reception point in a TV viewing device.
- Said device transmits user input data to a program used to view the Internet TV.
- the program processes the signal and sends user input information to a server after providing said information with additional information, specifically, information on:
- a number of a frame which is displayed at a time when a user input signal arrives; a number of a frame which has last arrived at an input buffer of the program (a buffer with non-decoded data); a time of sending the user input information to the server.
- Upon reception of a command, the server also receives information on a command-arrival-at-server time and on a number of a frame last sent to the program.
- the server calculates a number of frames to return backwardly a scene to be generated and sends said number to the server.
- the calculation is carried out according to one of techniques as follows:
- 1. N_final = N_last_sent - N_on_screen, where: N_final is the number of frames to return the scene backwardly; N_last_sent is the number of the frame sent last to the device by the server; N_on_screen is the number of the frame displayed on the screen at the time of sending a user input message;
- 2. N_final = N_last_recieved - N_on_screen, where: N_final is the number of frames to return the scene backwardly; N_last_recieved is the number of the frame received last by the program; N_on_screen is the number of the frame demonstrated on the screen at the time of sending the user input message;
- 3. N_final = (t_recieved - t_sent) * frequency_of_frames + N_on_screen, where: N_final is the number of frames to return the scene backwardly; t_sent is the time of sending the user input information by the program to the server; t_recieved is the time when the server receives the user input information; frequency_of_frames is a frequency of showing frames on the screen; N_on_screen is the number of the frame demonstrated on the screen at the time of sending the user input message.
- the method 2 is used in systems which use a program at the client device, said program sufficiently reliably sending the number of the last received frame (N_last_recieved), and which have no systematic error in measurement of said number.
- the method 1 may be used in systems with very high network quality. In the majority of cases, the method 3 is used because it is the most reliable.
- the server carries out actions as follows:
- a. receives (from internal or external sources) mathematical, that is, formulaic descriptions of all entities constituting a video scene: solids containing information, and widgets of pictures, rectangles within which the video should be played, etc.; b. receives descriptions of laws of modifying the entities described above in a.;
- c. receives information on particular data sources and on video files, live broadcasts, information message texts and other contents of the video channel to be received;
- d. receives the information described above in c. from the sources described above in c. and brings said information into coincidence with the entities described above in a., that is, superimposes video information messages, places text, logos, etc., and then modifies said information in accordance with the laws described above in b.; this process is referred to as the rendering and compositing;
- e. encodes the resulting video stream; f. sends the encoded video stream to the client device in the form of data packets; g. stores all information described above in a. to c. for several seconds and then updates it using new information in accordance with modifications of the scene that should be shown on the screen of the client device.
- the server calculates how many frames back the scene needs to be returned in accordance with one of the formulae described above, and then refers to the information stored in g. and begins formation of data from the respective frame stored in a server memory while adding entities and data thereto which should be displayed on the user screen as a response to the user input (for example, the server adds a new entity "Menu" on top of all other entities on the screen).
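The storage and rewind just described amount to keeping a short rolling history of scene states keyed by frame number. The following is a minimal sketch under assumed names: the scene-state object is left abstract and the capacity is illustrative; none of the identifiers come from the patent.

```python
# Sketch of the rolling scene-state history kept in step g.: the server keeps the
# last few seconds of scene states, keyed by frame number, so that it can later
# return the scene to the frame the user actually saw. Capacity is illustrative.
from collections import OrderedDict

class SceneHistory:
    def __init__(self, capacity_frames: int = 250):   # roughly 10 s at 25 fps
        self.capacity = capacity_frames
        self._states = OrderedDict()                   # frame number -> scene state

    def remember(self, frame_number: int, state) -> None:
        """Store the state used to render frame `frame_number`, dropping old entries."""
        self._states[frame_number] = state
        while len(self._states) > self.capacity:
            self._states.popitem(last=False)           # discard the oldest frame

    def state_at(self, frame_number: int):
        """Return the stored state for `frame_number`, or None if already discarded."""
        return self._states.get(frame_number)
```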
- the server generates a data packet containing a new encoded reference frame (a reference frame is a frame that can be decoded and shown on the screen without use of information on other frames) and also containing a "flush" command, that is, a command to flush the buffer.
- the server sends this data packet to the client device.
- the client device transmits this data packet to the program. Packets arriving at the program are handled by a handler that logs whether or not the packet contains the "flush" command.
- the program sends the packet to the non-decoded data buffer from which it arrives at the decoder and then to the decoded data buffer and next is displayed on the screen.
- If the packet contains the "flush" command, the handler records this fact and sends signals "flush data" to the buffers. In doing so, the following variants are possible with respect to a (hardware or software) decoder buffer (a short sketch of this dispatch is given after this passage):
- if the decoder does not contain any buffer, a data flush signal is sent to the non-decoded data buffer and to the decoded data buffer;
- if the decoder comprises a buffer whose size can be controlled by commands from outside, a data flush signal is sent to the non-decoded data buffer, the decoder buffer, and to the decoded data buffer;
- if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside, a data flush signal is sent to the non-decoded data buffer and to the decoded data buffer.
- the buffer flushes its data.
- a packet containing the new frame and the "flush" command arrives at the non-decoded data buffer and is handled exactly as a usual frame: the new frame arrives at the decoder (if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside, then said frame becomes the last one in the queue of frames in the decoder buffer), next arrives at the decoded frame buffer, and then is displayed on the screen.
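The dispatch described above can be summarised in a few lines. This is only a sketch under stated assumptions, not the patented implementation: the packet is assumed to be a simple dictionary, the buffers plain FIFO queues, and a decoder buffer is passed in only when it accepts outside commands.

```python
# Sketch of the client-side "flush" handling (all names are illustrative).
from collections import deque

def handle_flush_packet(packet, encoded_buf, decoded_buf, decoder_buf=None):
    """Process one packet from the server; packet is assumed to look like
    {"frame": bytes, "flush": bool}. Which buffers are cleared mirrors the
    three decoder variants listed above."""
    if packet.get("flush"):
        encoded_buf.clear()                # non-decoded data buffer
        decoded_buf.clear()                # decoded data buffer
        if decoder_buf is not None:        # only a controllable decoder buffer
            decoder_buf.clear()
    # The packet that carried the command is then queued like any other frame
    # and reaches the screen as soon as it has been decoded.
    encoded_buf.append(packet["frame"])

# Example: a flush arrives while a couple of frames are still buffered.
encoded, decoded = deque([b"frame1", b"frame2"]), deque([b"frame0"])
handle_flush_packet({"frame": b"idr", "flush": True}, encoded, decoded)
print(list(encoded), list(decoded))        # [b'idr'] []
```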
- Fig. 4 illustrates the sending of a user input message to a server and a response of the server to the user input.
- a user inputs a command to a command source 14 that sends a signal to a client device 15.
- the client device sends a user input signal to a server 16 while providing said signal with information on a number of a frame on the screen and a command sending time.
- the server generates a response to the user input: a new frame and a "flush" command, in other words, a command "flush buffer," and sends a packet containing data of a necessary response to the client device 15.
- Figs. 5a-5d illustrate a buffer flush process after reception of a "flush"; command if a decoder does not comprise own buffer.
- Fig. 5a illustrates a situation in which a program 19 demonstrating video on a client device 18 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 21 and in a decoded frame buffer 23 and having arrived thereto from a decoder 22.
- the data from the decoded frame buffer arrives at a screen 24.
- a server 17 has sent a data packet containing a "flush” command, that is, "flush buffer” to the client device. This packet is marked by a flag in the diagram.
- the client device has sent the present packet along with other data sent from the server to a "flush" command handler 20 from which they will enter the non-decoded frame buffer.
- Fig. 5b illustrates a situation in which the program 19 on the client device 18 has received the packet with the "flush" command and transmitted it to the "flush" command handler 20, which logs whether or not the "flush" command is present.
- the "flush” command handler has logged the "flush” command.
- Data from the "flush" command handler then arrives at the non-decoded frame buffer 21 from which the data enters the decoder 22 and then arrives at the decoded frame buffer 23, after which the program demonstrates it on the screen 24.
- Fig. 5c illustrates a situation in which the program 19 on the client device 18 has received the packet with the "flush" command from the server 17, and the "flush" command handler 20 has sent a command "clear buffer" to the buffers.
- the non-decoded frame buffer 21 and the decoded frame buffer 23 become empty, while a frame with the "flush" command arrives at the non-decoded frame buffer.
- the decoder 22 does not comprise a buffer. New frames do not arrive at the screen 24.
- Fig. 5d illustrates a situation in which the frame with the "flush" command, as sent by the server 17 to the program 19 on the client device 18 and passed through the "flush" command handler 20 and the non-decoded frame buffer 21, entered the decoder 22 and from it the empty decoded frame buffer 23 in order to be displayed on the screen 24 in future. During this period, the non-decoded frame buffer has been filled with new data packets that came from the server.
- Figs. 6a-6f illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content can be controlled by commands from outside.
- Fig. 6a illustrates a situation in which a program 27 demonstrating video on a client device 26 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 29, a decoder buffer 30, and in a decoded frame buffer 31.
- Data from the decoded frame buffer arrives at a screen 32.
- a server 25 has sent a data packet containing a "flush” command, that is, "flush buffer” to the client device. This packet is marked by a flag in the diagram.
- the client device has sent the present packet along with other data sent from the server to a "flush" command handler 28 from which they will enter the non-decoded frame buffer 29.
- Fig. 6b illustrates a situation in which the program 27 on the client device 26 has received a packet with a "flush” command from the server 25 and transmitted it to the "flush” command handler 28 logging whether or not the "flush” command is present in the data.
- the "flush” command handler has logged the "flush” command.
- Data from the "flush” command handler then arrives at the non-decoded frame buffer 29 from which the data enters the decoder buffer 30 and then arrives at the decoded frame buffer 31 after which the program demonstrates it on the screen 32.
- Fig. 6c illustrates a situation in which the program 27 on the client device 26 has received the packet with the "flush" command from the server 25, and the "flush" command handler 28 has sent a command "flush buffer" to the buffers.
- the decoder comprises the buffer 30 whose size and content can be controlled by commands from outside.
- the non-decoded frame buffer 29, the decoder buffer 30, and the decoded frame buffer 31 are cleared. New frames do not arrive at the screen 32.
- Fig. 6d illustrates a situation in which, after arrival of the frame with the "flush" command from the server 25 at the client device 26 to the program 27 and detection of said command by the "flush" command handler 28, the decoder buffer 30 and the decoded frame buffer 31 are empty, new frames do not arrive at the screen 32, and the frame with the "flush" command arrived at the empty non-decoded frame buffer 29.
- Fig. 6e illustrates a situation in which, after arrival of the frame with the "flush" command from the server 25 at the client device 26 to the program 27 and detection of said command by the "flush" command handler 28, the decoded frame buffer 31 is empty, new frames do not arrive at the screen 32, and the frame with the "flush" command arrived at the empty decoder buffer 30 while the filling of the non-decoded frame buffer 29 with new frames coming from the server started.
- Fig. 6f illustrates a situation in which, after arrival of the frame with the "flush" command from the server 25 at the client device 26 to the program 27 and detection of said command by the "flush" command handler 28, the frame with the "flush" command arrived at the empty decoded frame buffer 31 in order to be then displayed to the screen 32 by the program 27. During this period, the non-decoded frame buffer 29 is filled with new frames coming from the server and transmits them to the decoder buffer 30.
- Figs. 7a-7e illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside.
- Fig. 7a illustrates a situation in which a program 35 demonstrating video on a client device 34 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 37, a decoder buffer 38, and in a decoded frame buffer 39.
- Data from the decoded frame buffer arrives at a screen 40.
- a server 33 has sent a data packet containing a "flush" command, that is, "flush buffer" to the client device.
- This packet is marked by a flag in the diagram.
- the client device has sent the present packet along with other data sent from the server to a "flush" command handler 36 from which they will enter the non-decoded frame buffer.
- Fig. 7b illustrates a situation in which the program 35 on the client device 34 has received a packet with a "flush" command from the server 33 and transmitted it to the "flush" command handler 36 logging whether or not the "flush" command is present in the data.
- the "flush” command handler has logged the "flush” command.
- Data from the "flush" command handler then arrives at the non-decoded frame buffer 37 from which the data enters the decoder buffer 38 and then arrives at the decoded frame buffer 39, after which the program 35 demonstrates it on the screen 40.
- Fig. 7c illustrates a situation in which the program 35 on the client device 34 has received the packet with the "flush" command from the server 33, and the "flush" command handler 36 has sent a command "flush buffer" to the buffers.
- the non-decoded frame buffer 37 and the decoded frame buffer 39 become empty, and a frame with the "flush" command arrives at the non-decoded frame buffer 37. In the present case, the decoder buffer 38 contains several frames. New frames do not arrive at the screen 40.
- Fig. 7d illustrates a situation in which, after arrival of the frame with the "flush" command from the server 33 to the program 35 at the client device 34 and detection of said command by the "flush" command handler 36, frames from the decoder buffer 38 arrived at the empty decoded frame buffer 39, said frames being displayed to the screen 40, and the frame with the "flush" command arrived at the decoder buffer while the filling of the non-decoded frame buffer 37 with new frames coming from the server started.
- Fig. 7e illustrates a situation in which, after arrival of the frame with the "flush" command from the server 33 to the program 35 at the client device 34 and detection of said command by the "flush" command handler 36, the frame with the "flush" command arrived at the decoded frame buffer 39 in order to be then displayed to the screen 40 by the program after all other frames present in said buffer 39.
- the non-decoded frame buffer 37 is filled with new frames coming from the server and transmits them to the decoder buffer 38.
- Fig. 8 illustrates the hasting - accelerated frame sending - process.
- the upper portion of the Figure demonstrates the usual mode, in which a server 41 generates, encodes, and sends a certain number (K1) of frames per second to a client device 42; the lower portion of the Figure demonstrates the hasting mode: the server 41 generates, encodes, and sends a larger number (K2) of frames per second to the client device 42 (K2 > K1).
- Figs. 9a-9d illustrate the buffer filling process during hasting in the basic case when the decoder does not comprise a buffer.
- Fig. 9a illustrates a situation in which a program 45 demonstrating video on a client device 44 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 47 and in a decoded frame buffer 49 and having arrived thereto from a decoder 48.
- the data from the decoded frame buffer 49 arrives at a screen 50.
- a server 43 has sent a data packet containing a "flush" command, that is, a command to flush the buffer to the client device. This packet is marked by a flag in the diagram.
- Fig. 9b illustrates a situation in which the program 45 on the client device 44 has received the packet with the "flush” command and transmitted it to the "flush” command handler 46 which logs whether or not the "flush” command is present.
- the "flush” command handler 46 has logged the "flush” command.
- Data from the "flush” command handler 46 then arrives at the non-decoded frame buffer 47 from which the data enters the decoder 48 and then arrives at the decoded frame buffer 49 after which the program demonstrates it on the screen 50.
- the server begins the sending of frames in the "hasting" mode, that is, in the accelerated mode.
- Fig. 9c illustrates a situation in which the program 45 on the client device 44 has received the packet with the "flush" command from the server 43, and the "flush" command handler 46 has sent a command "flush buffer" to the buffers.
- the non- decoded frame buffer 47 and the decoded frame buffer 49 become empty, while a frame with the "flush” command arrives at the non-decoded frame buffer 47.
- the decoder 48 does not comprise a buffer. New frames do not arrive at the screen 50.
- the server sends frames in the "hasting" mode, that is, in the accelerated mode.
- Fig. 9d illustrates a situation in which the frame with the "flush" command, as sent by the server 43 to the program 45 on the client device 44 and passed through the "flush" command handler 46 and the non-decoded frame buffer 47, entered the decoder 48 and from it the empty decoded frame buffer 49 in order to be displayed on the screen 50 in future.
- the non-decoded frame buffer 47 is filled with new data packets faster, because the server sends said packets faster than in the usual mode.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The present invention relates to the field of the Internet TV and can be used to deliver a personalized interactive video stream over the Internet. The present invention allows reduction in a delay of demonstrating a determined frame on a screen of a client device, and generally allows control of a time of demonstrating the frame on the client device. Owing to this, apart from reduction in the interactive interaction delay, it is possible to use the invention for urgent demonstration of new frames on the screen, for example, to broadcast news urgently. The method according to the present invention consists of the steps of: re-setting a local time in the server backwardly to a state to be seen by a client in the device; and outputting a buffer flush command, as arrived from the server, to the client device.
Description
METHOD FOR DELIVERING PERSONALIZED INTERACTIVE VIDEO STREAM
Field of the Invention
The present invention relates to the field of Internet TV and can be used to deliver a personalized interactive video stream over the Internet.
Background of the Invention
Interactive TV systems have gained popularity in recent years, said systems enabling a user to interact with information displayed to him on a screen and to tailor the content to his or her preferences. The user can select a television broadcast and a time of viewing it. Depending upon functionality of the Internet TV, the user can rewind video in an accelerated mode or decelerate it, look at additional information and entertainment messages (for example, forecast of the weather, road traffic data, exchange quotations, rates of currencies, etc.) on top of video, or look at a second video program in a small window simultaneously with the first one.
Interactive interaction is carried out as follows: a user presses a button on the remote control of his or her TV set or set-top box (or presses a mouse button of a computer or a tablet touch screen), and in response, a screen image changes. A menu appears on a screen, and the user is able to select a menu item or message on top of video or other video, etc. When the user presses a button of the remote control, the remote control sends a signal to a client device, that is, a user device (a computer, a TV set, etc.). User input information is transmitted therefrom to a server. The server generates a response to this user input and transmits said information to the user device.
Traditional Internet TV (IPTV and OTT) is constructed as follows: a producer generates a video stream in a TV studio and transmits it to a client. Video stream generation is taken to mean video shooting, video and audio joining, video compositing (drawing of information messages and widgets on top of video, picture-in-picture, etc.), and video rendering. A generated video stream is encoded and transmitted to a client via the Internet (OTT) or via an individual cable communication operator network (IPTV). A client device receives video, decodes it and displays it on a screen.
The video stream generation process can be organized in another way. The video stream generation can be conceptually and technologically divided into two portions. The first portion is the video and audio shooting, while the second portion is the video compositing, video rendering and video encoding. An owner of a TV channel can
independently carry out the first portion and transfer organization of the second portion to external services. This essentially accelerates and simplifies creation of the TV channel.
The present invention is of current interest for a case where production of a TV channel is divided into two portions in the way described above. In this case, the video stream rendering, compositing and encoding are carried out by a service external with respect to the creator of the TV channel. Such a service can be placed on the Internet and be available for use by many TV channel creators and many clients simultaneously. In doing so, the problem of delayed response to user actions occurs. This problem will be described later.
Fig. 1 depicts the general structure of the Internet TV according to the prior art in which the video and audio shooting are conceptually and technologically separated from the video rendering and compositing: a client device 1 exchanges signals with a server 2 via the Internet, said server carrying out the video stream rendering and compositing. The server receives commands from a program 4 describing video stream scenes, receives data from video and audio source(s) 3, carries out the rendering and compositing, encodes the video and sends the encoded video stream via the Internet to the client device 1.
Fig. 2 illustrates a result of a user input, particularly, of pressing a button of a remote control or a tablet touch screen, etc., as known from the prior art. The upper screen in Fig. 2 shows the video, and the lower screen in Fig. 2 shows that, after the user input 5, a menu 6 has appeared on top of the video on the same screen.
The closest similar prior art to the present invention is the technical solution disclosed in the International Publication WO 2013/024441 A1 which defines an apparatus coordinating streaming of multimedia content from a streaming server to a client device. While the client device is receiving a first multicast stream from the streaming server, the apparatus coordinating the streaming of the multimedia content receives notifications of a recent or impending occurrence of a predefined event associated with a characteristic of interest of the multimedia content in a second multicast stream. Responsive to receiving such a notification, the apparatus coordinating the streaming of the content directs the streaming server to stream to the client device a third stream whose video content comprises windowed video content of
the first and second multicast streams. This windowed video content comprises video content of the first and second multicast streams arranged in respective windows. Such event-triggered streaming thus enables a user of the client device to simultaneously watch both the first and second streams, as desired, by effectively receiving only a single third stream.
However, it is impossible to monitor the quality of network in the Internet, to be exact, it is impossible to monitor a data loss, a data bandwidth and delay time of delivery of data to an end user (latency). Depending upon a channel quality, data arrives at an end device non-uniformly, and the end device can demonstrate a non-uniform, twitching video.
To solve this problem, the buffering technology is in common use in the Internet TV. This technology allows storage of data packets (video frames, audio data, and any other data) to use in future. In case of the Internet TV, the question is storage of video and audio data so that said data can be displayed later.
Fig. 3 illustrates the present technology. A server 7 sends data packets to a client device 8 where a video displaying program (video player) 9 is installed. To make a client able to watch video without delays, several buffers are formed in the program. A buffer of non-decoded frames 10 stores several seconds of encoded video. Data packets from this buffer arrive at a decoder 11 that can be either a software or a hardware one and can either comprise or not comprise its own buffer. Decoded frames from the decoder arrive at a next buffer 12 containing the decoded frames. The video player transmits the data from this latter buffer for demonstration to a screen 13. Said buffers smooth delays in the data delivery and make it possible to protect the user against fluctuations in the network throughput.
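The prior-art buffer chain of Fig. 3 can be pictured as two FIFO queues around a decoder. The sketch below is only an illustration of that pipeline; the decoding step is a stand-in, not any particular codec, and the class name is hypothetical.

```python
# Minimal sketch of the player pipeline of Fig. 3:
# non-decoded buffer (10) -> decoder (11) -> decoded buffer (12) -> screen (13).
from collections import deque

class PlayerPipeline:
    def __init__(self):
        self.encoded = deque()   # buffer of non-decoded (still encoded) frames
        self.decoded = deque()   # buffer of decoded frames ready for display

    def receive_packet(self, packet: bytes) -> None:
        """Queue a packet arriving from the server for later decoding."""
        self.encoded.append(packet)

    def decode_one(self) -> None:
        """Move one frame through the stand-in decoder."""
        if self.encoded:
            self.decoded.append(self.encoded.popleft())  # a real codec would decode here

    def present_one(self):
        """Hand the oldest decoded frame to the screen, or None if the buffer is empty."""
        return self.decoded.popleft() if self.decoded else None
```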
However, the buffering technology creates real challenges for rapid data transmission during interaction. As indicated above, the server generates a response to a user input and transmits said response to the client device. The client device comprises buffers containing several seconds of video. The video is divided into frames that are in a linear order, that is, one after another. The frames arrive at the decoder in the same order as they are arranged in the buffer: one after another. The server response to the user input is a new encoded video frame that is positioned at the end of the frame queue (if the buffer has N frames, then the new frame becomes the (N+1)-th one). The new
frame arrives at the decoder after all frames contained in the buffer prior to it. Thus, the user receives the response to his or her input after viewing the video from the buffer, that is, with a delay equal to a buffer length. This delay essentially slows down the user's interaction with video content.
Thus, the reduction of an input video stream buffer in size increases the malfunction probability in playback of the video stream, while the enlargement of the input video stream buffer in size increases a time delay between a time when the video stream enters the client device and a time of a user response to displayed information contained in the video stream, with reduction in the interactivity.
Problem to be Solved by the Invention
The present invention is of current interest for packetized data, that is, data which is delivered from a server to a client device in the packetized form. The present invention solves the problem of a response to a user input being delayed by the time necessary to play out the data from a buffer.
The present invention allows reduction in a delay of demonstrating a determined frame on a screen of the client device, and generally allows control of a time of demonstrating the frame on the client device. Owing to this, apart from reduction in the interaction delay, it is possible to use the invention for urgent demonstration of new frames on the screen, for example, to broadcast news urgently.
Summary of the Invention
Disclosure is made of a method for delivering a personalized interactive video stream over a network from a server, where all information on all entities forming a video scene at any time is stored, to a client device comprising a decoder and an incoming video stream buffer and being in communication with a display device, said method comprising the steps of:
carrying out a user input by a user at the client device or at the display device being in communication with the client device;
transmitting a user input signal by the client device to a program used to view the Internet TV;
sending user input information by said program to the server, said user input having been carried out at the client device or at the display device being in communication with the client device, after supplementing said information with additional information on a number of a frame demonstrated in the display device at a time when the user input was made, on a number of a frame last arrived at an input buffer of the program with non-decoded data, and on a time of sending said user input information to the server;
receiving, at the server, said information as well as information on a number of a frame sent last to the program and on a time of arrival of the user input information at the server;
returning, by the server, both the scene and all entities represented therein to the state displayed to the user on a screen of the display device being in communication with the client device at the time when the user input was made;
generating, by the server, new reference IDR frame (i.e., such a reference frame that frames following it can be decoded without taking frames arrived prior to the reference IDR frame into account) that is to be displayed to the user on the screen of the display device being in communication with the client device as a result of carrying out the user input, said new reference IDR frame being sent by the server to the client device along with a command to flush buffers and to display new frame;
after reception of said command to flush buffers and to display new frame, removing, by the client device, all frames present in the client device buffer and outputting, by the client device, new reference IDR frame received from the server to the display device being in communication with the client device.
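For concreteness, the exchange set out in the steps above can be pictured as two small messages. The field names below are purely illustrative assumptions; the patent does not prescribe any particular wire format.

```python
# Illustrative sketch of the two messages exchanged in the claimed method.
# All names and types are hypothetical.
from dataclasses import dataclass

@dataclass
class UserInputReport:
    """Sent by the client program to the server when a user input occurs."""
    input_kind: str            # e.g. "remote_key", "mouse_click", "touch", "voice"
    frame_on_screen: int       # number of the frame displayed when the input was made
    frame_last_received: int   # number of the frame last placed in the non-decoded buffer
    sent_at: float             # client time at which the report was sent, in seconds

@dataclass
class FlushResponse:
    """Sent by the server back to the client device."""
    idr_frame: bytes                      # new reference IDR frame, decodable on its own
    flush: bool = True                    # instructs the client to empty its buffers
    display_immediately: bool = True      # show the IDR frame as soon as it is decoded
```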
According to an additional embodiment, the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a remote control panel of the client device or the display device being in communication with the client device.
According to an additional embodiment, the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key of a mouse of the client device or the display device being in communication with the client device, or moving a cursor of the mouse of the client device or the display device being in communication with the client device to a predetermined area of a screen of the client device or the display device being in communication with the client device.
According to an additional embodiment, the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key of a keypad of the client device or the display device being in communication with the client device.
According to an additional embodiment, the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key arranged on a housing of the client device or the display device being in communication with the client device.
According to an additional embodiment, the user input carried out at the client device or at the display device being in communication with the client device in the method is a command issued by the user - by means of voice, mimics or gestures - to the client device or the display device being in communication with the client device.
According to an additional embodiment, the user input carried out at the client device or at the display device being in communication with the client device in the method is a command issued by the user - by means of touching a touch panel, a contactless touch controller (Kinect), a Leap Motion controller or any other user input device - to the client device or the display device being in communication with the client device.
According to an additional embodiment, after removing by the client device all frames present in its buffer, there is the step of increasing a stream transmission rate of the video data stream to be transmitted from the server to the client device in order to accelerate the filling of buffers thereby to reduce the malfunction probability in playback of the video stream.
According to an additional embodiment, the decoder comprises a buffer whose size can be controlled, wherein a "flush" command is sent to a non-decoded data buffer, a decoder buffer and a decoded data buffer.
According to an additional embodiment, the decoder comprises a buffer whose size and content cannot be controlled by a command from outside, while a "flush" command is sent to a non-decoded data buffer and a decoded data buffer.
According to an additional embodiment, the decoder does not comprise a buffer, while a "flush" command is sent to a non-decoded data buffer and a decoded data buffer.
Alternatively, the method according to the present invention consists of the steps of: re-setting a local time in the server backwardly to a state to be seen by a client in the device; and outputting a buffer flush command, as arrived from the server, to the client device.
More specifically, the client device sends the user input information to the server. At the same time, the user input carried out at the client device is an action selected from the group consisting of pressing a remote control panel, pressing a key of a mouse, moving a cursor of the mouse to a predetermined area of a screen, touching a touch screen, touching a key of a keypad, pressing a key arranged on a housing of the client device, and of a command issued by the user by means of voice, a command issued by the user by means of gestures. The client device also sends information to the server, which information is on a size of the buffer present in the client device at a time when the user input was made. The server stores all information on all entities forming a video scene at any time, said entities being contained in the user buffer. (Said entities can describe a video frame, creeping lines, information solids, etc.) At the time of carrying out the user input, the server generates a frame to be received by the client device after reception of all frames from the buffer (if the client device displays a frame No. 0 at the display device, and the buffer has N frames, then the server generates a frame No. N+1). Having received the user input information, the server returns the scene and all entities therein to the state seen by the client on the screen at this time. This state corresponds to a frame No. 0+k, where k is the number of frames that elapsed while the user input information was delivered to the server.
Then the server generates a new reference IDR frame (i.e., such a reference frame that frames following it can be decoded without taking frames arrived prior to the reference IDR frame into account) that is to appear on the screen as a result of the user input. The server sends this frame to the client device along with a buffer flush and new frame display command. Having received this command, the client device removes all frames present in the client device buffer from said buffer and displays the new frame arrived from the server.
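A compressed sketch of this server-side response is given below. Only the control flow (rewind by k frames, add the response entity, emit an IDR frame together with a flush command) follows the text; the scene history, renderer and encoder objects are placeholders, and the report fields reuse the illustrative UserInputReport from the earlier sketch.

```python
# Sketch of the server response to a user input (all names are illustrative).
def respond_to_user_input(report, now, frames_per_second,
                          scene_history, renderer, encoder, send):
    # k ~ frames that elapsed while the report travelled to the server
    # (method 3 of the calculation techniques given in the detailed description).
    k = int(round((now - report.sent_at) * frames_per_second))

    # 1. Return the scene to the state the user actually saw (frame No. 0 + k).
    state = scene_history.state_at(report.frame_on_screen + k)

    # 2. Add the entity that answers the input, e.g. a "Menu" on top of the video.
    state = renderer.add_entity(state, "menu")

    # 3. Render, composite and encode the result as a reference (IDR) frame,
    #    decodable without any earlier frames.
    idr_frame = encoder.encode_idr(renderer.render(state))

    # 4. Send it to the client together with the command to flush its buffers.
    send({"frame": idr_frame, "flush": True})
```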
Thus, the client sees a response to his or her own input practically without a delay.
Upon execution of the command, the buffer stays empty, and the client is not protected against network problems. To solve this problem, a data stream from the server can be accelerated in order to accelerate the filling of the buffer therewith. The server usually generates a constant number of frames per second and sends them to the client device; the same number is displayed on the screen for this period. Upon execution of the command, the server can generate frames and send them to the display device more frequently for some period in order to fill the buffer more quickly.
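This accelerated refill ("hasting", Fig. 8) amounts to raising the send rate for a short while. A minimal sketch, assuming the server knows its nominal frame rate and how many frames it wants back in the client buffer; the parameters are illustrative, not prescribed by the patent.

```python
# Sketch of the "hasting" rate: after a flush the server temporarily sends frames
# faster than real time so that the client buffer refills while playback continues.
def hasting_rate(k1: float, target_buffer_frames: int, refill_seconds: float) -> float:
    """Frames per second to send so that, while the client keeps consuming k1
    frames per second, its buffer regains target_buffer_frames frames within
    refill_seconds seconds. This is the K2 > K1 rate of Fig. 8."""
    return k1 + target_buffer_frames / refill_seconds

# Example: at 25 fps, refilling a 50-frame (two-second) buffer over 5 seconds
# requires sending at 25 + 50 / 5 = 35 frames per second for those 5 seconds.
print(hasting_rate(25.0, 50, 5.0))   # 35.0
```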
List of Drawing Figures
Fig. 1 depicts the general structure of the Internet TV according to the prior art in which the video and audio shooting is conceptually and technologically separated from the video rendering and compositing.
Fig. 2 illustrates a result of a user input (pressing a button of a remote control or a tablet touch screen, etc.), as known from the prior art.
Fig. 3 illustrates structure of buffers in a program at a client device, as known from the prior art.
Fig. 4 illustrates the sending of a user input message to a server and a response of the server to a user input.
Figs. 5a-5d illustrate a buffer flush process after reception of a "flush" command if a decoder does not comprise a buffer.
Figs. 6a-6f illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content can be controlled by commands from outside.
Figs. 7a-7e illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside.
Fig. 8 illustrates the hasting (accelerated frame sending) process.
Figs. 9a-9d illustrate the buffer filling process during hasting.
Detailed Description of Embodiments of the Invention
According to the claimed method, a user presses an input device that sends a signal to a signal reception point in a TV viewing device. Said device transmits user input data to a program used to view the Internet TV. The program processes the signal and sends user input information to a server after providing said information with additional information, specifically, information on:
a number of a frame which is displayed at a time when a user input signal arrives;
a number of a frame which has last arrived at an input buffer of the program (a buffer with non-decoded data); and
a time of sending the user input information to the server.
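For illustration only, such a user input message might be packaged as in the following Python sketch; the field names and the JSON encoding are assumptions made for the example, not part of this disclosure.

```python
import json
import time

def build_user_input_message(frame_on_screen, last_frame_in_input_buffer):
    """Package the user input together with the playback state at the moment
    the input was made, as listed above."""
    return json.dumps({
        "type": "user_input",
        "frame_on_screen": frame_on_screen,                        # frame displayed when the input arrived
        "last_frame_in_input_buffer": last_frame_in_input_buffer,  # last frame in the non-decoded data buffer
        "sent_at": time.time(),                                    # time of sending the message to the server
    })
```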
Upon reception of the user input information, the server also receives information on the time of arrival of said information at the server and on a number of a frame last sent to the program.
Then the server calculates the number of frames by which the scene to be generated is to be returned backwards. The calculation is carried out according to one of the following techniques:
1. N_final = N_last_sent - N_on_screen, where: N_final is the number of frames by which the scene is to be returned backwards; N_last_sent is the number of the frame sent last to the device by the server; N_on_screen is the number of the frame displayed on the screen at the time of sending the user input message;
2. N_final = N_last_received - N_on_screen, where: N_final is the number of frames by which the scene is to be returned backwards; N_last_received is the number of the frame received last by the program; N_on_screen is the number of the frame demonstrated on the screen at the time of sending the user input message;
3. N_final = (t_received - t_sent) * frequency_of_frames + N_on_screen, where: N_final is the number of frames by which the scene is to be returned backwards; t_sent is the time of sending the user input information by the program to the server; t_received is the time when the server receives the user input information; frequency_of_frames is the frequency of showing frames on the screen; N_on_screen is the number of the frame demonstrated on the screen at the time of sending the user input message.
Method 2 is used in systems in which the program at the client device reports the number of the last received frame (N_last_received) sufficiently reliably and without a systematic measurement error. Method 1 may be used in systems with very high network quality. In the majority of cases, method 3 is used because it is the most reliable.
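Purely as a sketch, the three techniques can be written as the following Python functions, using the variable names defined above; the function names themselves are illustrative assumptions.

```python
def n_final_method_1(n_last_sent, n_on_screen):
    # Method 1: usable only with very high network quality.
    return n_last_sent - n_on_screen

def n_final_method_2(n_last_received, n_on_screen):
    # Method 2: relies on the client program reporting N_last_received reliably.
    return n_last_received - n_on_screen

def n_final_method_3(t_received, t_sent, frequency_of_frames, n_on_screen):
    # Method 3: estimates from the delivery time of the user input message;
    # the most reliable of the three in practice.
    return (t_received - t_sent) * frequency_of_frames + n_on_screen
```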
In the case of forming the Internet TV as described above (the case where the video and audio shooting is conceptually and technologically separated from creation of a TV channel), the server carries out actions as follows:
a. receives (from internal or external sources) mathematical, that is, formulaic descriptions of all entities constituting a video scene: solids containing information, picture widgets, rectangles within which the video should be played, etc.;
b. receives descriptions of laws of modifying the entities described above in a.;
c. receives information on particular data sources and on video files, live broadcasts, information message texts and other contents of the video channel to be received;
d. receives the information described above in c. from the sources described above in c. and brings said information into coincidence with the entities described above in a., that is, superimposes video and information messages, places text, logos, etc., and then modifies said information in accordance with the laws described above in b.; this process is referred to as rendering and compositing;
e. encodes the resultant video stream;
f. sends the encoded video stream to the client device in the form of data packets;
g. stores all information described above in a. to c. for several seconds and then updates it using new information in accordance with modifications of the scene that should be shown on the screen of the client device.
Having received the user input signal, the server calculates how many frames back the scene needs to be returned in accordance with one of the formulae described above, then refers to the information stored in g. and begins formation of data from a respective frame stored in a server memory, while adding entities and data thereto which should be shown on the user screen as a response to the user input (for example, the server adds to the screen a new entity "Menu" on top of all other entities).
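The storage step g. and the scene return described above could be sketched, for illustration only, as the following Python fragment; the class, the default retention of a few hundred frames, and the "Menu" overlay are assumptions made for the example.

```python
from collections import deque

class SceneHistory:
    """Keep the scene description for the last few seconds, keyed by frame
    number, so the scene can be returned backwards when a user input arrives."""

    def __init__(self, fps=25, seconds=10):
        self._states = deque(maxlen=fps * seconds)  # (frame_number, entities)

    def store(self, frame_number, entities):
        self._states.append((frame_number, dict(entities)))

    def rewind_to(self, frame_number):
        for number, entities in self._states:
            if number == frame_number:
                return dict(entities)
        raise KeyError("frame %d is no longer stored" % frame_number)


def respond_to_user_input(history, n_last_sent, n_final, render_reference_frame):
    # Return the scene N_final frames back from the last sent frame, add the
    # entity that answers the user input (here a hypothetical "Menu" widget),
    # and render a new reference frame from the result.
    entities = history.rewind_to(n_last_sent - n_final)
    entities["Menu"] = {"on_top": True}
    return render_reference_frame(entities)
```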
The server generates a data packet containing a new encoded reference frame (a reference frame is a frame that can be decoded and shown on the screen without use of information on other frames) and also containing a "flush" command, that is, a command to flush the buffer. The server sends this data packet to the client device. The client device transmits this data packet to the program. Packets arriving at the program are handled by a handler that logs whether or not the packet contains the "flush" command.
If the packet does not contain the "flush" command, the program sends the packet to the non-decoded data buffer, from which it arrives at the decoder, then at the decoded data buffer, and next is displayed on the screen.
If the packet contains the "flush" command, the handler records this fact and sends "flush data" signals to the buffers. In doing so, the following variants are possible with respect to a (hardware or software) decoder buffer:
1. the decoder does not comprise any buffer: a data flush signal is sent to the non-decoded data buffer and to the decoded data buffer;
2. the decoder comprises a buffer whose size can be controlled by commands from outside: a data flush signal is sent to the non-decoded data buffer, to the decoder buffer, and to the decoded data buffer;
3. the decoder comprises a buffer whose size and content cannot be controlled by commands from outside: a data flush signal is sent to the non-decoded data buffer and to the decoded data buffer.
Having received said signal, the buffer flushes its data. A packet containing the new frame and the "flush" command arrives at the non-decoded data buffer and is handled exactly as a usual frame: the new frame arrives at the decoder (if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside, then said frame becomes the last one in a queue of frames in the decoder buffer) and next arrives at the decoded frame buffer and then is displayed on the screen.
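A rough Python sketch of this handler logic follows, covering the three decoder-buffer variants listed above; the packet structure and the decoder attributes are assumptions made for the example, not part of the disclosure.

```python
def handle_packet(packet, non_decoded_buffer, decoder, decoded_buffer):
    """Dispatch a packet arriving from the server: on a "flush" command, clear
    the buffers that can be cleared for the given decoder type, then queue the
    new reference frame like any ordinary frame."""
    if packet.get("flush"):
        non_decoded_buffer.clear()   # non-decoded data buffer
        decoded_buffer.clear()       # decoded data buffer
        # The decoder buffer can only be flushed when it exists and accepts
        # commands from outside (variant 2 above).
        if decoder.buffer is not None and decoder.buffer_is_controllable:
            decoder.buffer.clear()
    non_decoded_buffer.append(packet["frame"])
```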
Embodiments of the present invention will be described below with reference to the accompanying drawings, when necessary.
Fig. 4 illustrates the sending of a user input message to a server and a response of the server. A user inputs a command to a command source 14 that sends a signal to a client device 15. The client device sends a user input signal to a server 16 while providing said signal with information on a number of a frame on a screen and a command sending time. The server generates a response to the user input: a new frame and a "flush" command, in other words, a "flush buffer" command, and sends a packet containing data of a necessary response to the client device 15.
Figs. 5a-5d illustrate a buffer flush process after reception of a "flush" command if a decoder does not comprise its own buffer.
Fig. 5a illustrates a situation in which a program 19 demonstrating video on a client device 18 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 21 and in a decoded frame buffer 23 and having arrived thereto from a decoder 22. The data from the decoded frame buffer arrives at a screen 24. A server 17 has sent a data packet containing a "flush" command, that is, "flush buffer" to the client device. This packet is marked by a flag in
the diagram. The client device has sent the present packet along with other data sent from the server to a "flush" command handler 20 from which they will enter the non-decoded frame buffer.
Fig. 5b illustrates a situation in which the program 19 on the client device 18 has received the packet with the "flush" command and transmitted it to the "flush" command handler 20, which logs whether or not the "flush" command is present. In the present case, the "flush" command handler has logged the "flush" command. Data from the "flush" command handler then arrives at the non-decoded frame buffer 21, from which the data enters the decoder 22 and then arrives at the decoded frame buffer 23, after which the program demonstrates it on the screen 24.
Fig. 5c illustrates a situation in which the program 19 on the client device 18 has received the packet with the "flush" command from the server 17, and the "flush" command handler 20 has sent a "flush buffer" command to the buffers. The non-decoded frame buffer 21 and the decoded frame buffer 23 become empty, while a frame with the "flush" command arrives at the non-decoded frame buffer. In the present case, the decoder 22 does not comprise a buffer. New frames do not arrive at the screen 24.
Fig. 5d illustrates a situation in which the frame with the "flush" command, as sent by the server 17 to the program 19 on the client device 18 and passed through the "flush" command handler 20 and the non-decoded frame buffer 21, entered the decoder 22 and from it the empty decoded frame buffer 23 in order to be displayed on the screen 24 in the future. During this period, the non-decoded frame buffer has been filled with new data packets that came from the server.
Figs. 6a-6f illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content can be controlled by commands from outside.
Fig. 6a illustrates a situation in which a program 27 demonstrating video on a client device 26 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 29, a decoder buffer 30, and in a decoded frame buffer 31. Data from the decoded frame buffer arrives at a screen 32. A server 25 has sent a data packet containing a "flush" command, that is, "flush buffer" to the client device. This packet is marked by a flag in the diagram. The client device has sent the
present packet along with other data sent from the server to a "flush" command handler 28 from which they will enter the non-decoded frame buffer 29.
Fig. 6b illustrates a situation in which the program 27 on the client device 26 has received a packet with a "flush" command from the server 25 and transmitted it to the "flush" command handler 28 logging whether or not the "flush" command is present in the data. In the present case, the "flush" command handler has logged the "flush" command. Data from the "flush" command handler then arrives at the non-decoded frame buffer 29 from which the data enters the decoder buffer 30 and then arrives at the decoded frame buffer 31 after which the program demonstrates it on the screen 32.
Fig. 6c illustrates a situation in which the program 27 on the client device 26 has received the packet with the "flush" command from the server 25, and the "flush" command handler 28 has sent a "flush buffer" command to the buffers. In the present case, the decoder comprises the buffer 30 whose size and content can be controlled by commands from outside. As a result, the non-decoded frame buffer 29, the decoder buffer 30, and the decoded frame buffer 31 are cleared. New frames do not arrive at the screen 32.
Fig. 6d illustrates a situation in which, after arrival of the frame with the "flush" command from the server 25 at the client device 26 to the program 27 and detection of said command by the "flush" command handler 28, the decoder buffer 30 and the decoded frame buffer 31 are empty, new frames do not arrive at the screen 32, and the frame with the "flush" command arrived at the empty non-decoded frame buffer 29.
Fig. 6e illustrates a situation in which, after arrival of the frame with the "flush" command from the server 25 at the client device 26 to the program 27 and detection of said command by the "flush" command handler 28, the decoded frame buffer 31 is empty, new frames do not arrive at the screen 32, and the frame with the "flush" command arrived at the empty decoder buffer 30 while the filling of the non-decoded frame buffer 29 with new frames coming from the server started.
Fig. 6f illustrates a situation in which, after arrival of the frame with the "flush" command from the server 25 at the client device 26 to the program 27 and detection of said command by the "flush" command handler 28, the frame with the "flush" command arrived at the empty decoded frame buffer 31 in order to be then displayed on the screen 32 by the program 27. During this period, the non-decoded frame buffer 29 is filled with new frames coming from the server and transmits them to the decoder buffer 30.
Figs. 7a-7e illustrate the buffer flush process after reception of a "flush" command if the decoder comprises a buffer whose size and content cannot be controlled by commands from outside.
Fig. 7a illustrates a situation in which a program 35 demonstrating video on a client device 34 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 37, a decoder buffer 38, and in a decoded frame buffer 39. Data from the decoded frame buffer arrives at a screen 40. A server 33 has sent a data packet containing a "flush" command, that is, "flush buffer," to the client device. This packet is marked by a flag in the diagram. The client device has sent the present packet along with other data sent from the server to a "flush" command handler 36 from which they will enter the non-decoded frame buffer.
Fig. 7b illustrates a situation in which the program 35 on the client device 34 has received a packet with a "flush" command from the server 33 and transmitted it to the "flush" command handler 36 logging whether or not the "flush" command is present in the data. In the present case, the "flush" command handler has logged the "flush" command. Data from the "flush" command handler then arrives at the non-decoded frame buffer 37 from which the data enters the decoder buffer 38 and then arrives at the decoded frame buffer 39 after which the program 35 demonstrates it on the screen 40.
Fig. 7c illustrates a situation in which the program 35 on the client device 34 has received the packet with the "flush" command from the server 33, and the "flush" command handler 36 has sent a "flush buffer" command to the buffers. The non-decoded frame buffer 37 and the decoded frame buffer 39 become empty, and a frame with the "flush" command arrives at the non-decoded frame buffer 37. In the present case, the decoder buffer 38 contains several frames. New frames do not arrive at the screen 40.
Fig. 7d illustrates a situation in which, after arrival of the frame with the "flush" command from the server 33 to the program 35 at the client device 34 and detection of said command by the "flush" command handler 36, frames from the decoder buffer 38 arrived at the empty decoded frame buffer 39, said frames being displayed on the screen 40, and the frame with the "flush" command arrived at the decoder buffer while the filling of the non-decoded frame buffer 37 with new frames coming from the server started.
Fig. 7e illustrates a situation in which, after arrival of the frame with the "flush" command from the server 33 to the program 35 at the client device 34 and detection of said command by the "flush" command handler 36, the frame with the "flush" command arrived at the decoded frame buffer 39 in order to be then displayed on the screen 40 by the program, after all other frames present in said buffer 39. During this period, the non-decoded frame buffer 37 is filled with new frames coming from the server and transmits them to the decoder buffer 38.
Fig. 8 illustrates the hasting (accelerated frame sending) process. The upper portion of the Figure demonstrates a usual frame sending mode: a server 41 generates, encodes, and sends a constant number (K1) of frames per second to a client device 42. The lower portion of the Figure demonstrates the hasting mode: the server 41 generates, encodes, and sends a larger number (K2) of frames per second to the client device 42 (K2 > K1).
Figs. 9a-9d illustrate the buffer filling process during hasting in the basic case when the decoder does not comprise a buffer.
Fig. 9a illustrates a situation in which a program 45 demonstrating video on a client device 44 has collected some data volume in progress of its operation, said data being present in a non-decoded frame buffer 47 and in a decoded frame buffer 49 and having arrived thereto from a decoder 48. The data from the decoded frame buffer 49 arrives at a screen 50. A server 43 has sent a data packet containing a "flush" command, that is, a command to flush the buffer, to the client device. This packet is marked by a flag in the diagram. The client device has sent the present packet along with other data sent from the server to a "flush" command handler 46 from which they will enter the non-decoded frame buffer 47.
Fig. 9b illustrates a situation in which the program 45 on the client device 44 has received the packet with the "flush" command and transmitted it to the "flush" command handler 46 which logs whether or not the "flush" command is present. In the present case, the "flush" command handler 46 has logged the "flush" command. Data from the "flush" command handler 46 then arrives at the non-decoded frame buffer 47 from which the data enters the decoder 48 and then arrives at the decoded frame buffer 49 after which the program demonstrates it on the screen 50. Right after sending a
frame with the "flush" command, the server begins the sending of frames in the "hasting" mode, that is, in the accelerated mode.
Fig. 9c illustrates a situation in which the program 45 on the client device 44 has received the packet with the "flush" command from the server 43, and the "flush" command handler 46 has sent a "flush buffer" command to the buffers. The non-decoded frame buffer 47 and the decoded frame buffer 49 become empty, while a frame with the "flush" command arrives at the non-decoded frame buffer 47. In the present case, the decoder 48 does not comprise a buffer. New frames do not arrive at the screen 50. The server sends frames in the "hasting" mode, that is, in the accelerated mode.
Fig. 9d illustrates a situation in which the frame with the "flush" command, as sent by the server 43 to the program 45 on the client device 44 and passed through the "flush" command handler 46 and the non-decoded frame buffer 47, entered the decoder 48 and from it the empty decoded frame buffer 49 in order to be displayed on the screen 50 in the future. During this period, the non-decoded frame buffer 47 is filled faster with new data packets, because the server sends said packets faster than in the usual mode.
The embodiments stated above should be understood as explanatory examples of the invention. It is necessary to understand that any feature described in respect of any embodiment can be used individually or in combination with other described features, and can be used together with one or more features of any other one of the embodiments or any combination of any other embodiments. Furthermore, equivalents and modifications not described above may also be employed without departing from the scope of the invention as defined in the appended set of claims.
Claims
1. A method for delivering a personalized interactive video stream over a network from a server where all information on all entities forming a video scene at any time is stored to a client device comprising a decoder and an incoming video stream buffer and being in communication with a display device, said method comprising the steps of: carrying out a user input by a user at the client device or at the display device being in communication with the client device;
transmitting a user input signal by the client device to a program used to view the Internet TV;
sending user input information by said program to the server, said user input having been carried out at the client device or at the display device being in communication with the client device, after provision of said information with additional information on a number of a frame demonstrated in the display device at a time when the user input was made, on a number of a frame last arrived at an input buffer of the program with non-decoded data, and on a time of sending said user input information to the server;
receiving, at the server, information on a time of arrival of the user input at the server and information on a number of a frame sent last to the program;
returning, by the server, both the scene and all entities represented therein to the state displayed to the user on a screen of the display device being in communication with the client device at the time when the user input was made;
generating, by the server, a new reference frame that is to be displayed to the user on the screen of the display device being in communication with the client device as a result of carrying out the user input, said new reference frame being an IDR frame, i.e. such a reference frame that frames following it can be decoded without taking frames arrived prior to said reference frame into account;
sending, by the server, the new reference IDR frame along with a command to flush buffers and to display the new frame to the client device;
after reception of said command to flush buffers and to display the new frame, removing, by the client device, all frames present in the client device buffer and outputting, by the client device, the new reference IDR frame received from the server to the display device being in communication with the client device.
2. The method according to claim 1 wherein the user input carried out at the client device or at the display device being in communication with the client device is an action of pressing a remote control panel of the client device or the display device being in communication with the client device.
3. The method according to claim 1 wherein the user input carried out at the client device or at the display device being in communication with the client device is an action of pressing a key of a mouse of the client device or the display device being in communication with the client device, or moving a cursor of the mouse of the client device or the display device being in communication with the client device to a predetermined area of a screen of the client device or the display device being in communication with the client device.
4. The method according to claim 1 wherein the user input carried out at the client device or at the display device being in communication with the client device is an action of pressing a key of a keypad of the client device or the display device being in communication with the client device.
5. The method according to claim 1 wherein the user input carried out at the client device or at the display device being in communication with the client device in the method is an action of pressing a key arranged on a housing of the client device or the display device being in communication with the client device.
6. The method according to claim 1 wherein the user input carried out at the client device or at the display device being in communication with the client device is a command issued by the user - by means of voice, mimics or gestures - to the client device or the display device being in communication with the client device.
7. The method according to claim 1 wherein the user input carried out at the client device or at the display device being in communication with the client device is a command issued by the user - by means of touching a touch panel, a contactless touch controller or by means of any other user input device - to the client device or the display device being in communication with the client device.
8. The method according to claim 1 wherein, after removing by the client device all frames present in its buffer, there is the step of increasing a stream transmission rate of the video data stream to be transmitted from the server to the client device in order to accelerate the filling of buffers thereby to reduce the malfunction probability in
playback of the video stream.
9. The method according to claim 1 wherein the decoder comprises a buffer whose size can be controlled, wherein a "flush" command is sent to a non-decoded data buffer, a decoder buffer and a decoded data buffer.
10. The method according to claim 1 wherein the decoder comprises a buffer whose size and content cannot be controlled by commands from outside, while a "flush" command is sent to a non-decoded data buffer and a decoded data buffer.
11. The method according to claim 1 wherein the decoder does not comprise a buffer, while a "flush" command is sent to a non-decoded data buffer and a decoded data buffer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/RU2013/001057 WO2015076694A1 (en) | 2013-11-25 | 2013-11-25 | Method for delivering personalized interactive video stream |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2015076694A1 WO2015076694A1 (en) | 2015-05-28 |
WO2015076694A9 true WO2015076694A9 (en) | 2015-08-13 |
Family
ID=51014607
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/RU2013/001057 WO2015076694A1 (en) | 2013-11-25 | 2013-11-25 | Method for delivering personalized interactive video stream |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2015076694A1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10148582B2 (en) | 2016-05-24 | 2018-12-04 | Samsung Electronics Co., Ltd. | Managing buffers for rate pacing |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5963202A (en) * | 1997-04-14 | 1999-10-05 | Instant Video Technologies, Inc. | System and method for distributing and managing digital video information in a video distribution network |
WO2003051051A1 (en) * | 2001-12-13 | 2003-06-19 | Koninklijke Philips Electronics N.V. | Recommending media content on a media system |
US9485546B2 (en) * | 2010-06-29 | 2016-11-01 | Qualcomm Incorporated | Signaling video samples for trick mode video representations |
US20130046856A1 (en) | 2011-08-15 | 2013-02-21 | Telefonaktiebolaget L M Ericsson (Publ) | Event-triggered streaming of windowed video content |
US9736476B2 (en) * | 2012-04-27 | 2017-08-15 | Qualcomm Incorporated | Full random access from clean random access pictures in video coding |
Also Published As
Publication number | Publication date |
---|---|
WO2015076694A1 (en) | 2015-05-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13866514 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13866514 Country of ref document: EP Kind code of ref document: A1 |