CN111263192A - Video processing method and related equipment

Info

Publication number: CN111263192A
Application number: CN201811462770.2A
Authority: CN
Inventors: 杨少石; 陈雨辰; 王峰; 胡磊; 司源; 魏岳军
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2018-11-30
Filing date: 2018-11-30
Publication date: 2020-06-09

Abstract

The embodiments of the present application disclose a video processing method and a device for implementing the method. The method dynamically sets the coding strategy of a video according to the content characteristics of the video and historical feedback information. Compared with the prior art, in which only a fixed coding strategy is used for coding, the target code stream obtained by coding according to the dynamically set coding strategy adapts better to the channel state of the wireless channel, which helps improve the transmission performance of the target code stream in the wireless channel.

Description

Video processing method and related equipment

Technical Field

The present application relates to the field of wireless transmission, and in particular, to a video processing method and related device.

Background

With the rapid development of computer networks and image processing technologies, video transmission has been widely applied in various scenarios. Video transmission is divided into wired video transmission and wireless video transmission. Compared with wired video transmission, wireless video transmission has the following advantages: (1) low cost: a wireless video transmission system does not require laying cables or digging cable trenches, which saves manpower and material resources; (2) wide application range: in special geographical environments such as mountainous regions, lakes, and forest zones, wireless transmission is hardly limited by the terrain; (3) good scalability: expanding a wireless video transmission system only requires adding front-end equipment, transmitters, and receivers, whereas a wired video transmission system must be rewired; (4) high mobility: a wired video transmission system cannot support mobile scenarios, whereas a wireless video transmission system offers high mobility within the coverage of the wireless network.

In current wireless video transmission systems, the video sending end encodes video data with a fixed coding strategy and sends the resulting video code stream to the wireless channel sending end, which then sends the video code stream through a wireless channel. Because the bandwidth resources of the wireless channel are limited and its channel state varies over time, a video code stream encoded in this way cannot adapt well to the time-varying channel state, and its transmission performance over the wireless channel is low. How to improve the transmission performance of the video code stream in the wireless channel is therefore a technical problem that urgently needs to be solved.

Disclosure of Invention

The embodiment of the application provides a video processing method and related equipment, which are beneficial to improving the transmission performance of a target code stream in a wireless channel.

In a first aspect, an embodiment of the present application provides a video processing method, which may be applied to a video processing device. The method includes: acquiring original video data of a video to be transmitted, and setting a coding strategy for the video according to content characteristics of the video and historical feedback information, where the historical feedback information includes a historical transmission result and/or content characteristics fed back by a user, and the coding strategy includes a coding mode and/or coding parameters; and encoding the original video data based on the coding strategy to obtain a target code stream, and sending the target code stream to a wireless transmission device so that the wireless transmission device transmits the target code stream through a wireless channel.

In this technical solution, the coding strategy of the video can be set dynamically according to the content characteristics of the video and the historical feedback information. As a result, the target code stream obtained by encoding the original video data according to the dynamically set coding strategy adapts better to the current channel state of the wireless channel, which improves the transmission performance of the target code stream in the wireless channel.
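The flow of the first aspect can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function name `process_video` and the three callables (`set_strategy`, `encode`, `send`) are hypothetical placeholders, and the application does not prescribe any particular interface.

```python
from typing import Any, Callable, Sequence


def process_video(
    raw_frames: Sequence[Any],
    content_features: Any,
    feedback: Any,
    set_strategy: Callable[[Any, Any], Any],
    encode: Callable[[Sequence[Any], Any], Any],
    send: Callable[[Any], None],
) -> Any:
    """Set a strategy, encode with it, and hand the result to the sender.

    All three callables are hypothetical stand-ins for the steps named in
    the first aspect; they are supplied by the caller.
    """
    # Step 1: set the coding strategy from content features and feedback.
    strategy = set_strategy(content_features, feedback)
    # Step 2: encode the original video data based on that strategy.
    target_stream = encode(raw_frames, strategy)
    # Step 3: pass the target code stream to the wireless transmission device.
    send(target_stream)
    return target_stream
```

Keeping the three steps as injected callables reflects that the strategy may consist of a coding mode, coding parameters, or both, without fixing how either is represented.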

In one implementation, the historical transmission result may include the transmission condition of the wireless channel, and the coding mode may include a group of pictures (GOP) structure. The GOP structure of the video may be a first GOP structure or a second GOP structure, both of which are composed of video frames. The video frames may include first video frames, and the proportion of first video frames in the first GOP structure may be higher than the proportion of first video frames in the second GOP structure, where the proportion of first video frames in a GOP structure is the ratio of the number of first video frames in that GOP structure to the number of video frames. Setting the coding strategy of the video according to the content characteristics of the video and the historical feedback information may be implemented as follows: if the transmission condition of the wireless channel meets a preset degradation condition, the GOP structure of the video is set to the first GOP structure.

In this technical solution, when the transmission condition of the wireless channel meets the preset degradation condition, the GOP structure of the video is set to the first GOP structure, so that first video frames make up a large proportion of the video frames lost in transmission. When a first video frame is not referenced by other video frames, losing it does not affect the normal decoding of other video frames. This reduces the impact of packet loss on decoding at the video receiving device as far as possible and, in turn, helps keep video playback on the video receiving device smooth.

In one implementation, the historical transmission result may include the transmission condition of the wireless channel, and the coding mode may include a group of pictures (GOP) structure. The GOP structure of the video may be a first GOP structure or a second GOP structure, both of which are composed of video frames. The video frames may include second video frames, and the proportion of second video frames in the first GOP structure may be higher than the proportion of second video frames in the second GOP structure, where the proportion of second video frames in a GOP structure is the ratio of the number of second video frames in that GOP structure to the number of video frames. The data amount of a second video frame may be larger than the data amount of a first video frame. Setting the coding strategy of the video according to the content characteristics of the video and the historical feedback information may be implemented as follows: if the transmission condition of the wireless channel does not meet the preset degradation condition, the GOP structure of the video is set to the second GOP structure.

In this technical solution, the proportion of second video frames in the first GOP structure is higher than that in the second GOP structure, and the data amount of a second video frame is larger than that of a first video frame. Therefore, when the first GOP structure and the second GOP structure have the same GOP length, the code rate of the target code stream encoded based on the second GOP structure is lower than that of the target code stream encoded based on the first GOP structure. In other words, transmitting the target code stream encoded based on the second GOP structure over the wireless channel consumes fewer bandwidth resources than transmitting the target code stream encoded based on the first GOP structure.
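The selection between the two GOP structures can be illustrated with a minimal sketch. Several things here are assumptions rather than content of the application: the degradation condition is modelled as a packet loss rate reaching a threshold of 0.05, frames are given the abstract labels "first", "second", and "other", and the two example layouts are invented only so that both proportions described above are higher in the first GOP structure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GopStructure:
    name: str
    frames: Tuple[str, ...]  # abstract labels: "first", "second", "other"

    def share(self, label: str) -> float:
        """Ratio of frames with this label to all frames in the GOP."""
        return self.frames.count(label) / len(self.frames)


# Hypothetical layouts of equal GOP length, not taken from the figures.
FIRST_GOP = GopStructure(
    "first", ("second", "second", "first", "first", "first", "other")
)
SECOND_GOP = GopStructure(
    "second", ("second", "first", "other", "other", "other", "other")
)


def is_degraded(packet_loss_rate: float, threshold: float = 0.05) -> bool:
    """Assumed degradation test: loss rate at or above a preset threshold."""
    return packet_loss_rate >= threshold


def choose_gop(packet_loss_rate: float) -> GopStructure:
    """Pick the first GOP structure on a degraded channel, else the second."""
    return FIRST_GOP if is_degraded(packet_loss_rate) else SECOND_GOP
```

With these example layouts, `choose_gop(0.2)` returns the first GOP structure and `choose_gop(0.01)` returns the second.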

In one implementation, the historical transmission results may include transmission conditions of the wireless channel, and the encoding parameters may include quantization parameters; the specific implementation of setting the video encoding policy according to the content characteristics of the video and the historical feedback information may be as follows: if the transmission condition of the wireless channel meets the preset degradation condition, setting the quantization parameter of the video according to the content characteristics fed back by the user; and/or if the transmission condition of the wireless channel does not meet the preset degradation condition, setting the quantization parameter of the video according to the content characteristic of the video and the content characteristic fed back by the user; and/or if the content characteristics of the video are different from the content characteristics fed back by the user, setting the quantization parameters of the video according to the content characteristics fed back by the user.

In this technical solution, when the transmission condition of the wireless channel meets the preset degradation condition, the channel quality is poor, that is, the amount of data the wireless channel can carry is small. In this case, the video processing device sets the quantization parameter of the video according to the content characteristics fed back by the user, so that the limited wireless resources are used to transmit more detail of the image objects the user actually pays attention to, which improves the user's viewing experience. When the transmission condition of the wireless channel does not meet the preset degradation condition, the channel quality is good, that is, the amount of data the wireless channel can carry is large. In this case, the video processing device sets the quantization parameter of the video according to both the content characteristics of the video and the content characteristics fed back by the user, so that more detail can be transmitted both for the image objects the user is likely to pay attention to and for those the user actually pays attention to, which helps improve the user's viewing experience. Compared with the content characteristics of the video, the content characteristics fed back by the user indicate more accurately which image objects the user pays attention to. Therefore, when the two differ, setting the quantization parameter of the video according to the content characteristics fed back by the user lets the wireless channel carry more detail of the image objects the user pays attention to, which improves the user's viewing experience.
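The quantization parameter rules above can be sketched as follows. The sketch assumes that content characteristics are represented as sets of category names, that a lower quantization parameter (finer quantization) is assigned to regions of interest, and that values are clamped to 0..51 as in H.264/HEVC 8-bit coding. The function names, `base_qp`, `roi_delta`, and the `prefer_user_on_conflict` flag are all hypothetical.

```python
from typing import Set


def focus_categories(
    video_features: Set[str],
    user_features: Set[str],
    degraded: bool,
    prefer_user_on_conflict: bool = False,
) -> Set[str]:
    """Decide which categories get finer quantization.

    Degraded channel: only what the user reported.
    Otherwise: the union of detected and user-reported categories.
    Optional rule: follow the user whenever the two sets differ.
    """
    if degraded:
        return set(user_features)
    if prefer_user_on_conflict and video_features != user_features:
        return set(user_features)
    return set(video_features) | set(user_features)


def region_qp(category: str, focus: Set[str],
              base_qp: int = 32, roi_delta: int = 6) -> int:
    """Lower the QP for focus categories; clamp to the assumed 0..51 range."""
    qp = base_qp - roi_delta if category in focus else base_qp
    return max(0, min(51, qp))
```

For example, with detected features `{"face"}` and user feedback `{"ball"}`, a degraded channel yields the focus set `{"ball"}`, so a "ball" region is coded at `base_qp - roi_delta` while other regions keep `base_qp`.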

In one implementation, the method may further include: analyzing the original video data to obtain a target video scene of the video, and determining the content characteristics of the video based on the target video scene.

In this technical solution, the content characteristics of the video can indicate the image objects in the target video scene that the user is likely to pay more attention to, so more image detail can be retained for those image objects during encoding. After the video receiving device recovers the original video data of the video, those image objects are output at higher video quality, which helps improve the user's viewing experience.

In one implementation, the target code stream may include a first video data packet and a second video data packet, and the method may further include setting importance of the first video data packet and setting importance of the second video data packet based on content characteristics of the video, content characteristics fed back by a user, and/or an encoding policy of the video.

In the technical scheme, the importance is set for the first video data packet and the second video data packet, so that the video data packets with higher importance can be transmitted by fully utilizing the limited wireless channel resources.

In one implementation, the content characteristics of the video may include a first category and a second category, and the content characteristics fed back by the user may include a third category. Setting the importance of the first video data packet and the importance of the second video data packet based on the content characteristics of the video, the content characteristics fed back by the user, and/or the coding strategy of the video may be implemented as follows: acquiring a first number of times that the video frame corresponding to the first video data packet is referenced, and setting the importance of the first video data packet according to the first number of times; acquiring a second number of times that the video frame corresponding to the second video data packet is referenced, and setting the importance of the second video data packet according to the second number of times; and/or if the category to which the video data in the first video data packet belongs is the first category, setting the importance of the first video data packet according to the importance of the first category, and if the category to which the video data in the second video data packet belongs is the second category, setting the importance of the second video data packet according to the importance of the second category; and/or if the category to which the video data in the first video data packet belongs is the same as the third category, setting the importance of the first video data packet according to the importance of the third category, and if the category to which the video data in the second video data packet belongs is different from the third category, setting the importance of the second video data packet according to the category to which the video data in the second video data packet belongs.

In this technical solution, the more often a video frame is referenced, the more it affects the decoding of other video frames. The video processing device sets the importance of a video data packet according to the number of times the corresponding video frame is referenced, so that the importance better reflects how much that video frame affects the decoding success rate of the video receiving device. The content characteristics of the video can indicate the categories of image objects the user is likely to pay attention to. Setting the importance of video data packets according to the importance of those categories, and transmitting the more important video data packets first, allows the video recovered by the video receiving device to include more of the image objects the user is likely to pay attention to, which improves the user's viewing experience. The content characteristics fed back by the user can indicate the categories of image objects the user actually pays attention to. Setting the importance of video data packets according to the importance of those categories, and transmitting the more important video data packets first, allows the video receiving device to recover a video that contains more of the image objects the user actually pays attention to, which helps improve the user's viewing experience.

In one implementation, if the first number is higher than the second number, the importance of the first video data packet may be higher than the importance of the second video data packet; alternatively, if the importance of the first category is higher than the importance of the second category, the importance of the first video data packet may be higher than the importance of the second video data packet; alternatively, if the category to which the video data in the first video data packet belongs is the same as the third category and the category to which the video data in the second video data packet belongs is different from the third category, the importance of the first video data packet may be higher than that of the second video data packet.

In this technical solution, the more often the video frame corresponding to a video data packet is referenced, the higher the importance of the video data packet. Transmitting the more important video data packets first therefore improves the decoding success rate of the video receiving device. Likewise, the higher the importance of the category to which the video data in a video data packet belongs, the higher the importance of the video data packet. Transmitting the more important video data packets first therefore allows the video recovered by the video receiving device to include more of the image objects the user pays attention to, which helps improve the user's viewing experience. Finally, when video data packets whose category matches the user feedback are given higher importance and transmitted first, the video recovered by the video receiving device includes more of the image objects the user actually pays attention to, which also helps improve the user's viewing experience.
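The three ways of assigning importance described above can be sketched as three small functions. This is an illustration only: the `VideoPacket` fields, the category names, the numeric importance table, and the `user_importance` value are hypothetical, and the application does not specify how importance is encoded.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class VideoPacket:
    packet_id: str
    times_referenced: int  # how often the packet's frame is referenced
    category: str          # category of the video data in the packet


def importance_by_references(pkt: VideoPacket) -> int:
    """More references to the frame means higher importance."""
    return pkt.times_referenced


def importance_by_category(pkt: VideoPacket, table: Dict[str, int]) -> int:
    """Importance of the packet's category; unknown categories get 0."""
    return table.get(pkt.category, 0)


def importance_by_user_focus(
    pkt: VideoPacket,
    user_category: str,
    user_importance: int,
    table: Dict[str, int],
) -> int:
    """Use the user-reported category's importance when the packet matches it,
    otherwise fall back to the packet's own category."""
    if pkt.category == user_category:
        return user_importance
    return table.get(pkt.category, 0)
```

The criteria are joined by "and/or" in the text, so a real system might use one of them or combine several; how they would be combined is left open here.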

In one implementation, the method may further include: acquiring a historical video scene, and if the historical video scene is different from the target video scene, adjusting the coding strategy.

In this technical solution, a difference between the currently identified target video scene and the historical video scene indicates that the current image differs greatly from the previous frame. If the coding strategy were not adjusted and the historical coding strategy were still used for encoding, the information loss in the reference process would be large and the recovery quality of the video would be low. Adjusting the coding strategy when the scene changes helps avoid this.

In one implementation, adjusting the coding strategy if the historical video scene is different from the target video scene may be implemented as follows: if the historical video scene is different from the target video scene, the number of second video frames in the GOP structure of the video is increased.

In this technical solution, the data amount of a second video frame is greater than that of a first video frame, so a second video frame can record more image detail than a first video frame. When the video scene changes, increasing the number of second video frames (such as I frames) in the GOP structure of the video reduces the information loss in the reference process and improves the recovery quality of the video.
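The scene-change adjustment can be illustrated with a short sketch. It assumes that a GOP is represented as a list of frame-type labels, that scenes are compared by a simple label, and that "increasing the number of second video frames" is realised by turning the frame at the scene change into an I frame. The function name and the representation are hypothetical.

```python
from typing import List


def adjust_gop_on_scene_change(
    gop: List[str],
    historical_scene: str,
    target_scene: str,
    change_index: int,
) -> List[str]:
    """Return a copy of the GOP with an I frame at the scene change.

    If the scenes match, the GOP is returned unchanged (as a copy).
    """
    adjusted = list(gop)
    if historical_scene != target_scene:
        # Assumed realisation: code the first frame of the new scene as an
        # I frame so later frames do not reference the old scene.
        adjusted[change_index] = "I"
    return adjusted
```

For instance, a GOP `["I", "P", "P", "P"]` with a scene change detected at index 2 becomes `["I", "P", "I", "P"]`, raising the I-frame count from one to two.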

In one implementation, the method may further include: receiving a source code stream of the video, and decoding the source code stream to obtain the original video data.

In this technical solution, the source code stream is transcoded into the target code stream, which can be transmitted over the wireless channel better than the source code stream itself.

In a second aspect, an embodiment of the present application provides another video processing method, which may be applied to a wireless transmission device, where the method includes: receiving a target code stream of a video to be transmitted, which is sent by video processing equipment, wherein the target code stream is obtained by the video processing equipment through coding processing on original video data of the video to be transmitted based on a coding strategy, the coding strategy is set by the video processing equipment according to content characteristics of the video to be transmitted and historical feedback information, the historical feedback information comprises a historical transmission result and/or content characteristics fed back by a user, and the coding strategy comprises a coding mode and/or coding parameters; and transmitting the target code stream through a wireless channel.

In the technical scheme, the video coding strategy is dynamically set according to the content characteristics of the video and the historical feedback information, and the target code stream obtained based on the coding of the dynamically set coding strategy is transmitted through the wireless channel, so that the transmission performance of the target code stream in the wireless channel is improved.

In one implementation, the method may further include: acquiring content characteristics fed back by a user, and counting historical transmission results in a preset time period; and sending the historical transmission result and the content characteristics fed back by the user to the video processing equipment.

In this technical solution, the historical transmission result and the content characteristics fed back by the user are sent to the video processing device, so that the video processing device can set the coding strategy dynamically based on them. This helps improve the transmission performance, in the wireless channel, of the target code stream obtained by encoding based on the dynamically set coding strategy.
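Counting the historical transmission result over a preset time period can be sketched with a sliding window. The application does not say which statistic is collected; the packet loss rate used here is an assumption, and the class name `TransmissionStats` and its methods are hypothetical.

```python
from collections import deque
from typing import Deque, Tuple


class TransmissionStats:
    """Keeps per-packet outcomes for the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self.events: Deque[Tuple[float, bool]] = deque()  # (timestamp, delivered)

    def record(self, timestamp: float, delivered: bool) -> None:
        """Store one outcome and drop outcomes older than the window."""
        self.events.append((timestamp, delivered))
        while self.events and self.events[0][0] <= timestamp - self.window:
            self.events.popleft()

    def loss_rate(self) -> float:
        """Fraction of packets in the window that were not delivered."""
        if not self.events:
            return 0.0
        lost = sum(1 for _, ok in self.events if not ok)
        return lost / len(self.events)
```

A wireless transmission device could report `loss_rate()` to the video processing device together with the content characteristics fed back by the user.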

In one implementation, the target code stream may include a first video data packet and a second video data packet; the specific implementation of transmitting the target code stream through the wireless channel may be as follows: screening a target code stream to obtain a code stream to be transmitted based on the transmission condition of the wireless channel, the importance of the first video data packet and the importance of the second video data packet, wherein the code stream to be transmitted comprises the first video data packet and/or the second video data packet; and transmitting the code stream to be transmitted through a wireless channel.

In this technical solution, compared with transmitting all video data packets in the target code stream, screening the target code stream and actively discarding some of the video data packets reduces or even avoids the loss of more important video data packets during transmission, which in turn reduces the impact of packet loss on the quality of the recovered video as far as possible.
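One possible way to screen the target code stream is a greedy selection by importance under a capacity budget. This is a sketch under assumptions: the transmission condition is reduced to a byte budget `capacity_bytes`, packets are `(importance, size_bytes, packet_id)` tuples, and the greedy rule is a stand-in since the application does not specify the screening algorithm.

```python
from typing import List, Tuple

Packet = Tuple[int, int, str]  # (importance, size_bytes, packet_id)


def screen_packets(packets: List[Packet], capacity_bytes: int) -> List[Packet]:
    """Keep the most important packets that fit into the assumed byte budget.

    Packets that do not fit are actively discarded before transmission.
    """
    kept: List[Packet] = []
    used = 0
    # Visit packets from most to least important.
    for pkt in sorted(packets, key=lambda p: p[0], reverse=True):
        if used + pkt[1] <= capacity_bytes:
            kept.append(pkt)
            used += pkt[1]
    return kept
```

With packets `(3, 500, "a")`, `(1, 400, "b")`, `(2, 300, "c")` and a budget of 900 bytes, packets "a" and "c" are kept and "b" is discarded.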

In one implementation, the importance of the first video data packet may be higher than that of the second video data packet, and transmitting the code stream to be transmitted through the wireless channel may be implemented as follows: if the transmission condition of the wireless channel meets the preset degradation condition, transmitting the first video data packet through the wireless channel; and/or if the transmission condition of the wireless channel meets the preset degradation condition, transmitting the first video data packet and the second video data packet sequentially in descending order of importance; and/or modulating the first video data packet onto a first constellation point, modulating the second video data packet onto a second constellation point, and transmitting the modulated first video data packet and the modulated second video data packet, where the reliability of the first constellation point may be higher than that of the second constellation point.

In this technical solution, a transmission condition that meets the preset degradation condition indicates that the channel quality of the wireless channel is poor and that packet loss is likely during wireless transmission. By actively discarding the less important second video data packet, the first wireless transmission device prevents the second video data packet from affecting the transmission of the first video data packet. Because the transmission condition of the wireless channel varies over time, the first wireless transmission device transmits the first video data packet and the second video data packet sequentially in descending order of importance, which better ensures reliable transmission of the more important first video data packet and prevents it from being lost after the transmission condition of the wireless channel deteriorates. Compared with the second video data packet modulated onto the second constellation point, the first video data packet modulated onto the first constellation point has a lower bit error rate after transmission through the wireless channel, which helps ensure reliable transmission of the more important first video data packet.
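The three transmission options can be combined into one illustrative planning function. The combination itself is an assumption, since the text joins the options with "and/or". Packets are hypothetical dictionaries with `id` and `importance` keys, and the labels "first" and "second" merely name the more and less reliable constellation points without modelling any real modulation scheme.

```python
from typing import Dict, List, Tuple


def plan_transmission(
    packets: List[Dict], degraded: bool
) -> List[Tuple[str, str]]:
    """Return (packet_id, constellation_label) pairs in send order.

    Packets are sent in descending order of importance. On a degraded
    channel only the most important packets are kept. The most important
    packets are mapped to the more reliable "first" constellation point.
    """
    if not packets:
        return []
    ordered = sorted(packets, key=lambda p: p["importance"], reverse=True)
    top = ordered[0]["importance"]
    if degraded:
        # Actively discard everything below the highest importance level.
        ordered = [p for p in ordered if p["importance"] == top]
    return [
        (p["id"], "first" if p["importance"] == top else "second")
        for p in ordered
    ]
```

With two packets of importance 5 and 1, a non-degraded channel yields both packets in importance order with different constellation labels, while a degraded channel yields only the more important packet.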

In a third aspect, an embodiment of the present application provides a video processing apparatus, which has the function of implementing the video processing method according to the first aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function.

In a fourth aspect, an embodiment of the present application provides another video processing apparatus, which has the function of implementing the video processing method according to the second aspect. The function may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the function.

In a fifth aspect, an embodiment of the present application provides a video processing system, which includes the video processing apparatus of the third aspect and the video processing apparatus of the fourth aspect.

In a sixth aspect, an embodiment of the present application provides a video processing apparatus, which includes a memory and a processor, where the memory stores program instructions, and the processor calls the program instructions stored in the memory to implement the video processing method according to the first aspect.

In a seventh aspect, an embodiment of the present application provides a wireless transmission device, where the wireless transmission device includes a memory and a processor, where the memory stores program instructions, and the processor calls the program instructions stored in the memory to implement the video processing method according to the second aspect.

In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium for storing computer program instructions used by the video processing apparatus according to the third aspect, the instructions including a program for executing the method of the first aspect.

In a ninth aspect, an embodiment of the present application provides a computer-readable storage medium for storing computer program instructions used by the video processing apparatus according to the fourth aspect, the instructions including a program for executing the method of the second aspect.

In a tenth aspect, an embodiment of the present application provides a computer program product, which includes a program that, when executed, implements the method described in the first aspect.

In an eleventh aspect, the present application provides a computer program product, which includes a program, and when executed, the program implements the method of the second aspect.

In a twelfth aspect, an embodiment of the present application further provides a processor, where the processor includes at least one circuit, and is configured to set a video encoding policy according to content characteristics of a video and historical feedback information, and the processor further includes at least one circuit, and is configured to perform encoding processing on original video data based on the encoding policy to obtain a target code stream. The processor may be a chip, and may execute instructions or a program for implementing the method according to the first aspect.

In a thirteenth aspect, an embodiment of the present application further provides a processor, where the processor includes at least one circuit, configured to receive a target code stream of a video to be transmitted, where the target code stream is sent by a video processing device, and the processor further includes at least one circuit, configured to transmit the target code stream through a wireless channel. The processor may be a chip, and may execute instructions or a program for implementing the method according to the second aspect.

In a fourteenth aspect, an embodiment of the present application further provides a chip system, which includes a processor, is applied, for example, in a video processing device, and is configured to implement the functions or the method of the first aspect. In a possible implementation, the chip system further includes a memory for storing program instructions and data necessary for implementing the functions of the method according to the first aspect. The chip system may consist of a chip, or may include a chip and other discrete devices.

In a fifteenth aspect, an embodiment of the present application further provides a chip system, which includes a processor, is applied, for example, in a wireless transmission device, and is configured to implement the functions or the method of the second aspect. In a possible implementation, the chip system further includes a memory for storing program instructions and data necessary for implementing the functions of the method according to the second aspect. The chip system may consist of a chip, or may include a chip and other discrete devices.

Drawings

Fig. 1 is a schematic diagram of an architecture of a wireless video transmission system according to an embodiment of the present application;

Fig. 2 is a schematic flow chart of a video processing method disclosed in an embodiment of the present application;

Fig. 3 is a schematic flow chart of another video processing method disclosed in an embodiment of the present application;

Fig. 3a is a schematic structural diagram of a first GOP structure disclosed in an embodiment of the present application;

Fig. 3b is a schematic structural diagram of a second GOP structure disclosed in an embodiment of the present application;

Fig. 3c is a schematic structural diagram of another first GOP structure disclosed in an embodiment of the present application;

Fig. 3d is a schematic structural diagram of another second GOP structure disclosed in an embodiment of the present application;

Fig. 3e is a schematic structural diagram of a new second GOP structure obtained after an I frame is inserted into the second GOP structure shown in fig. 3b according to an embodiment of the present application;

Fig. 4 is a schematic flow chart of another video processing method disclosed in an embodiment of the present application;

Fig. 5 is a schematic flow chart of another video processing method disclosed in an embodiment of the present application;

Fig. 6 is a schematic structural diagram of a video processing apparatus disclosed in an embodiment of the present application;

Fig. 7 is a schematic structural diagram of another video processing apparatus disclosed in an embodiment of the present application;

Fig. 8 is a schematic structural diagram of a video processing apparatus disclosed in an embodiment of the present application;

Fig. 9 is a schematic structural diagram of a wireless transmission device disclosed in an embodiment of the present application.

Detailed Description

In order to solve the problem that the transmission performance of a video code stream in a wireless channel is low in the prior art, the embodiment of the present application provides a solution based on the wireless video transmission system shown in fig. 1, so as to improve the transmission performance of the video code stream in the wireless channel.

Referring to fig. 1, fig. 1 is a schematic diagram of an architecture of a wireless video transmission system according to an embodiment of the present application. As shown in fig. 1, the system includes a video processing device 101, a first wireless transmission device 102, a second wireless transmission device 103, and a video receiving device 104. The video processing device 101 and the first wireless transmission device 102, as well as the second wireless transmission device 103 and the video receiving device 104, are connected in a wired manner, and the first wireless transmission device 102 is connected to the second wireless transmission device 103 in a wireless manner. The video processing device 101 may be configured to dynamically set an encoding policy of a video according to content characteristics of the video and historical feedback information, and to encode original video data of the video based on the set encoding policy, so that the target code stream obtained by encoding is suitable for transmission in a wireless channel.
In one implementation, a video storage module (not shown) may be integrated in the video processing device 101, where the video storage module may be configured to store original video data of a video required by a user, and when the user needs to watch the video, the video processing device 101 may obtain the original video data of the video required by the user from the video storage module, encode the obtained original video data based on a dynamically set encoding policy to obtain a target code stream, and then send the target code stream to the first wireless transmission device 102; after receiving the target code stream, the first wireless transmission device 102 may transmit the target code stream to the second wireless transmission device 103 through a wireless channel; the second wireless transmission device 103 may forward the target code stream to the video receiving device 104, so that the video receiving device 104 performs decoding processing on the target code stream to recover the original video data, and outputs the original video data obtained by decoding for the user to watch.

Because the video processing device 101 can dynamically set the encoding policy of the video according to the content characteristics of the video and the historical feedback information, compared with the prior art that only a fixed encoding policy is adopted, the target code stream obtained by the video processing device 101 based on the encoding of the dynamically set encoding policy can better adapt to the channel state of the wireless channel, which is beneficial to improving the transmission performance of the target code stream in the wireless channel.

In one implementation, the video processing device may be an encoding device that encodes original video data of a video based on an encoding policy, or the video processing device may be a transcoding device. Specifically, if no video storage module is integrated in the video processing device 101, when a user needs to watch a video, the video processing device 101 may receive a source code stream of the required video sent by a video sending device (not shown), transcode the source code stream, and send the target code stream obtained by transcoding to the first wireless transmission device 102. In one implementation, the video processing device 101 may transcode the source code stream into the target code stream as follows: the video processing device 101 decodes the source code stream to obtain original video data of the video, and encodes the original video data based on the set encoding policy to obtain the target code stream. The video sending device may be a video server.

It should be noted that the video processing device 101, the first wireless transmission device 102, the second wireless transmission device 103, and the video receiving device 104 are shown in fig. 1 as separate physical entities by way of example only, which does not constitute a limitation on the embodiments of the present application. In other possible implementations, the video processing device 101 and the first wireless transmission device 102 may be integrated in the same physical entity, and the second wireless transmission device 103 and the video receiving device 104 may be integrated in the same physical entity.

It can be understood that the wireless video transmission system described in the embodiment of the present application is intended to illustrate the technical solution of the embodiment of the present application more clearly, and does not constitute a limitation on the technical solution provided in the embodiment of the present application. A person of ordinary skill in the art knows that, as the system architecture evolves and new service scenarios emerge, the technical solution provided in the embodiment of the present application is also applicable to similar technical problems.

The terms used in the embodiments of the present application are described below:

The content feature of the video may be used to indicate an image object in the video that the user may pay more attention to in the target video scene, or may be used to indicate a category of such an image object. The image objects may include, but are not limited to, people, faces, human bodies, animals, vehicles, tables, chairs, plants, buildings, roads, or other objects. The categories of the image objects may include, but are not limited to, a motion class and a still class. Image objects of the motion class may include, but are not limited to, people, faces, human bodies, animals, vehicles, or other moving objects, and image objects of the still class may include, but are not limited to, tables, chairs, plants, buildings, roads, or other still objects.

The content feature fed back by the user is used to indicate the image object in the video that the user actually pays attention to, or may be used to indicate the category of the image object in the video that the user actually pays attention to.

The historical transmission result may include a transmission condition of the wireless channel, where the transmission condition of the wireless channel may be used to evaluate the channel quality of the wireless channel, and if the transmission condition of the wireless channel satisfies a predetermined degradation condition, it indicates that the channel quality of the wireless channel is poor, and if the transmission condition of the wireless channel does not satisfy the predetermined degradation condition, it indicates that the channel quality of the wireless channel is good.

The encoding policy is used to encode video data into a target code stream, and target code streams obtained by encoding according to different encoding policies are different. The target code stream refers to encoded video data transmitted in a unit time. The encoding policy may include an encoding mode and/or an encoding parameter.

The encoding mode may include a group of pictures (GOP) structure. A GOP is a group of consecutive pictures, and a GOP structure includes multiple types of video frames; for example, the GOP structure may include I frames and/or P frames, or may include I frames, P frames, and B frames. An I frame is encoded using only the information of the frame itself, without reference to other video frames, and can be decoded independently. A P frame records the difference from the adjacent previous frame; it is encoded with reference to the previous frame, and it can be decoded only by superimposing the difference defined by the frame on the previously buffered picture, where the previous frame referenced by the P frame may be an I frame or a P frame. A B frame records the differences between the current frame and both the adjacent previous frame and the adjacent next frame; it is encoded with reference to both of these frames, and it can be decoded only by superimposing the difference defined by the frame on the pictures decoded from the previous frame and the next frame. In the embodiment of the present application, different types of video frames differ in how much they affect whether the video receiving device can successfully decode the target code stream. For example, if an I frame is lost during transmission, none of the video frames in the GOP containing the I frame can be decoded, so the I frame has a very large influence on whether the video receiving device can successfully decode the target code stream.
For another example, if a B frame or P frame is lost during transmission and the lost frame is not referenced by any other video frame, the loss does not affect the normal decoding of the other video frames. A video frame that is not referenced by other video frames therefore has a smaller influence on whether the video receiving device can successfully decode the target code stream.
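The effect of frame loss described above can be illustrated with a short Python sketch. It assumes a simplified model in which a frame is undecodable if it is lost or if any frame it references is undecodable; the `Frame` class, the `undecodable_after_loss` function, and the five-frame example GOP are hypothetical and are used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    index: int                                 # position of the frame in the GOP
    kind: str                                  # "I", "P", or "B"
    refs: list = field(default_factory=list)   # indices of the frames this frame references


def undecodable_after_loss(gop, lost):
    """Return the sorted indices of all frames that cannot be decoded
    once the frames whose indices are in `lost` are lost."""
    broken = set(lost)
    changed = True
    while changed:                             # repeat until no further frame is affected
        changed = False
        for frame in gop:
            if frame.index not in broken and any(r in broken for r in frame.refs):
                broken.add(frame.index)
                changed = True
    return sorted(broken)


# Hypothetical GOP (assumption): I0, B1, P2, B3, P4
example_gop = [
    Frame(0, "I"),
    Frame(1, "B", [0, 2]),
    Frame(2, "P", [0]),
    Frame(3, "B", [2, 4]),
    Frame(4, "P", [2]),
]

print(undecodable_after_loss(example_gop, {0}))  # loss of the I frame
print(undecodable_after_loss(example_gop, {1}))  # loss of an unreferenced B frame
```

In this example, losing the I frame makes all five frames undecodable, whereas losing the unreferenced B frame at index 1 affects only that frame.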

The GOP structure of the video may include, but is not limited to, a first GOP structure and a second GOP structure, which in one implementation may each include a first video frame. The first video frame may refer to a video frame that is not referenced by other video frames in the GOP, e.g., the first video frame may be a P-frame or a B-frame. In one implementation, the GOP length of the first GOP structure may be the same as or different from the GOP length of the second GOP structure, and the number of the first video frames in the first GOP structure may be the same as or different from the number of the first video frames in the second GOP structure. The GOP length may refer to the number of video frames included in a GOP. In one implementation, the percentage of the first video frames in the first GOP structure may be higher than the percentage of the first video frames in the second GOP structure, where the percentage of the first video frames in the first GOP structure is a ratio between the number of the first video frames in the first GOP structure and the number of the video frames in the first GOP structure, and the percentage of the first video frames in the second GOP structure is a ratio between the number of the first video frames in the second GOP structure and the number of the video frames in the second GOP structure.

In one implementation, the first GOP structure and the second GOP structure may each include a second video frame. In a GOP, the amount of data of the second video frame may be greater than that of the other video frames; for example, it may be greater than the amount of data of the first video frame. In one implementation, the second video frame may be an I frame. In one implementation, the percentage of the second video frames in the first GOP structure may be higher than the percentage of the second video frames in the second GOP structure, where the percentage of the second video frames in the first GOP structure is the ratio of the number of the second video frames in the first GOP structure to the number of video frames in the first GOP structure, and the percentage of the second video frames in the second GOP structure is the ratio of the number of the second video frames in the second GOP structure to the number of video frames in the second GOP structure.
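The two percentages defined above can be computed as in the following Python sketch. It assumes that a GOP is represented as a list of `(kind, refs)` tuples, that a first video frame is a P frame or B frame not referenced by any other frame, and that a second video frame is an I frame; the function name `frame_ratios` and both example GOPs are hypothetical.

```python
def frame_ratios(gop):
    """Return (share of first video frames, share of second video frames).

    gop: list of (kind, refs) tuples, where `refs` holds the positions
    of the frames that this frame references.
    """
    referenced = {r for _, refs in gop for r in refs}
    total = len(gop)
    # First video frames: P or B frames that no other frame references.
    first = sum(
        1 for i, (kind, _) in enumerate(gop)
        if kind in ("P", "B") and i not in referenced
    )
    # Second video frames: I frames (one possible implementation).
    second = sum(1 for kind, _ in gop if kind == "I")
    return first / total, second / total


# Hypothetical GOP in which every P frame references the I frame directly.
star_gop = [("I", []), ("P", [0]), ("P", [0]), ("P", [0])]
# Hypothetical GOP in which each P frame references the previous frame.
chain_gop = [("I", []), ("P", [0]), ("P", [1]), ("P", [2])]

print(frame_ratios(star_gop))
print(frame_ratios(chain_gop))
```

Here the first example has a higher share of unreferenced frames (3 of 4) than the second (1 of 4), while both have the same share of I frames.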

The encoding parameters may include GOP parameters and/or quantization parameters. The GOP parameters may include, but are not limited to, GOP length, number of first video frames, number of second video frames, and number of other video frames (i.e., video frames other than the first video frames and the second video frames).

The quantization parameter reflects the degree to which the spatial detail of a video frame is compressed: the smaller the quantization parameter, the more image detail information is retained, and the larger the quantization parameter, the less image detail information is retained. The quantization parameter therefore affects the quality of the decoded video. Reducing the quantization parameter yields higher decoded video quality, but also increases the video code rate.

Based on the schematic architecture of the wireless video transmission system shown in fig. 1, please refer to fig. 2, and fig. 2 is a schematic flow chart of a video processing method provided in the embodiment of the present application, which can be applied to the wireless video transmission system, and the method can include, but is not limited to, the following steps:

Step S201: the video processing device obtains original video data of a video to be transmitted. Specifically, the original video data of the video to be transmitted may be stored in the video processing device, that is, the video processing device may obtain the original video data locally. In one implementation, the original video data of the video to be transmitted may be stored in a video sending device. When a user needs to watch the video, the video sending device may obtain the original video data locally, encode it to obtain a source code stream of the video to be transmitted, and send the source code stream to the video processing device. After receiving the source code stream, the video processing device may decode it to recover the original video data of the video, and then re-encode the original video data based on the encoding policy to obtain a target code stream. The target code stream obtained by transcoding the source code stream in this way can be transmitted in the wireless channel better than the source code stream itself.

Step S202: the video processing device sets the encoding policy of the video according to the content characteristics of the video and historical feedback information, where the historical feedback information includes historical transmission results and/or content characteristics fed back by a user, and the encoding policy includes an encoding mode and/or encoding parameters. After acquiring the original video data, the video processing device needs to encode it to obtain a target code stream suitable for transmission in a wireless channel. Because a wireless channel has limited transmission bandwidth and a time-varying channel state, a target code stream obtained by encoding the original video data with a fixed encoding policy under all conditions cannot adapt well to the wireless channel, and its transmission performance in the wireless channel is low. To solve this problem, the video processing device may dynamically set the encoding policy of the video according to the content characteristics of the video and the historical feedback information before encoding the original video data of the video. In this way, the target code stream obtained by encoding the original video data according to the dynamically set encoding policy can better adapt to the current channel state of the wireless channel, which improves the transmission performance of the target code stream in the wireless channel.

The history feedback information may be directly sent to the video processing device by the first wireless transmission device, or may be obtained by processing, by the video processing device, the history information sent by the first wireless transmission device, which is not limited in this embodiment of the present application.

In one implementation, the content feature of the video may be obtained by the video processing device by analyzing the original video data of the video. Specifically, the video processing device may analyze the original video data of the video to obtain a target video scene of the video, and determine the content feature of the video based on the target video scene. In one implementation, the video processing device may pre-store a correspondence between video scenes and content features. After identifying the target video scene, the video processing device determines the content feature corresponding to the target video scene through the correspondence, and sets a smaller quantization parameter for the image objects, indicated by the content feature, that the user may pay more attention to. That is, more image detail information is retained during encoding for the image objects that the user may pay more attention to, so that when the video receiving device recovers and outputs the original video data of the video, the video quality of those image objects is higher, which helps improve the viewing experience of the user. For example, in a video surveillance scene, if the image objects in the surveillance video include a person, a vehicle, a plant, a building, and a road surface, the user may pay more attention to moving objects such as the person and the vehicle, and less attention to static objects such as the plant, the building, or the road surface. For another example, in a video teaching scene, if the image objects in the teaching video include a teacher, a table and a chair, and a lecture, the user may pay more attention to the teaching content, that is, more attention to the lecture than to the teacher or the table and chair.
For another example, in an outdoor live scene, if the image objects in the live video include a human face, a human body, and a background, the user may pay more attention to the human face and the human body, and less attention to the background.

In one implementation, the content features of the video may include one or more image objects, or one or more categories. When the content feature of the video includes a plurality of image objects or categories, the content feature of the video may further include a degree of attention corresponding to each image object or category, and the degrees of attention corresponding to different image objects or categories may be the same or different, which is not limited in this embodiment of the application. For example, if the target video scene is a video surveillance scene, the image objects in the surveillance video include a person, a vehicle, a plant, a building, and a road surface, and the content features of the surveillance video include the person, the vehicle, the plant, and the building, the attention of the person may be higher than the attention of the vehicle, the plant, and the building, and the attention of the plant and the building may be the same. In one implementation, the video processing device may set a smaller quantization parameter for image objects with higher attention and a larger quantization parameter for image objects with lower attention.
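The relationship between attention and quantization parameter described above can be sketched in Python as follows. The linear mapping, the normalized attention values in the range 0 to 1, and the bounds `qp_min=22` and `qp_max=40` are assumptions chosen only for illustration; the embodiment does not prescribe a specific mapping.

```python
def assign_qp(attention, qp_min=22, qp_max=40):
    """Map each image object's attention (0.0 to 1.0) to a quantization
    parameter: higher attention gives a smaller quantization parameter.

    The linear mapping and the bounds are illustrative assumptions.
    """
    span = qp_max - qp_min
    return {obj: round(qp_max - level * span) for obj, level in attention.items()}


# Hypothetical attention values for a video surveillance scene.
attention = {"person": 1.0, "vehicle": 0.5, "plant": 0.0, "building": 0.0}
qp = assign_qp(attention)
print(qp)
```

With these assumed values, the person receives the smallest quantization parameter, the vehicle an intermediate one, and the plant and the building, which have the same attention, receive the same largest one.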

In one implementation, the video processing device may receive historical feedback information sent by the first wireless transmission device. In one implementation, the historical feedback information may include content characteristics fed back by the user. Because the content characteristics of the video are predicted by the video processing device by analyzing the original video data of the video, they may not accurately indicate the image object or category that the user actually pays attention to. By receiving the content characteristics fed back by the user, the video processing device can learn which image object or category the user actually pays more attention to, and can then set the quantization parameter accordingly, for example, by setting a smaller quantization parameter for an image object that the user actually pays more attention to. In one implementation, the content characteristics fed back by the user may be determined by the video receiving device based on user operations. In one implementation, the content characteristics fed back by the user may be sent to the first wireless transmission device by the video receiving device and fed back to the video processing device by the first wireless transmission device, or the content characteristics fed back by the user may be sent to the second wireless transmission device by the video receiving device, sent to the first wireless transmission device by the second wireless transmission device, and fed back to the video processing device by the first wireless transmission device.
In the embodiment of the present application, both the first wireless transmission device and the second wireless transmission device may have a wireless receiving function and a wireless transmitting function. In one implementation, the historical feedback information may include historical transmission results, which may include transmission conditions of the wireless channel. In one implementation, the first wireless transmission device may determine the transmission condition of the wireless channel according to the packet loss rate, the transmission delay, or the average data transmission rate, or the first wireless transmission device may determine the transmission condition of the wireless channel comprehensively according to the packet loss rate, the transmission delay, and the average data transmission rate. For example, if the packet loss rate is greater than a preset packet loss rate threshold, the first wireless transmission device may determine that the transmission condition of the wireless channel satisfies a preset degradation condition; if the packet loss rate is less than or equal to the preset packet loss rate threshold, the first wireless transmission device may determine that the transmission condition of the wireless channel does not satisfy the preset degradation condition. In an implementation manner, the preset packet loss rate threshold may be set by a default of the video processing device, or may be determined by the video processing device according to an input operation of a user, which is not limited in this embodiment of the present application. In one implementation, the packet loss rate and/or the transmission delay may be counted by the second wireless transmission device and fed back to the first wireless transmission device, and the average data transmission rate may be counted by the first wireless transmission device.
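The packet-loss-based check in the example above can be sketched in Python as follows. The `ChannelStats` container, the function name `meets_degradation_condition`, and the default threshold of 0.05 are hypothetical; only the comparison rule (strictly greater than the preset threshold means degraded) is taken from the description.

```python
from dataclasses import dataclass


@dataclass
class ChannelStats:
    packet_loss_rate: float    # fraction of packets lost, 0.0 to 1.0
    delay_ms: float            # transmission delay (not used in this sketch)
    avg_rate_kbps: float       # average data transmission rate (not used in this sketch)


def meets_degradation_condition(stats, loss_threshold=0.05):
    """Return True if the packet loss rate is greater than the preset threshold.

    A rate equal to the threshold does not meet the degradation condition.
    The default threshold value is an illustrative assumption.
    """
    return stats.packet_loss_rate > loss_threshold


poor = ChannelStats(packet_loss_rate=0.08, delay_ms=120.0, avg_rate_kbps=900.0)
good = ChannelStats(packet_loss_rate=0.05, delay_ms=30.0, avg_rate_kbps=4000.0)
print(meets_degradation_condition(poor), meets_degradation_condition(good))
```

A combined decision that also weighs the transmission delay and the average data transmission rate could be built in the same way; the description does not specify how the three quantities are combined, so that variant is omitted here.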

In one implementation, the reference relationships among the various types of video frames may differ between GOP structures. When different GOP structures are used to encode the original video data, the resulting target code streams have different code rates, or the videos recovered by the video receiving device by decoding the target code streams have different quality. The code rate refers to the number of data bits transmitted per unit time when video data is transmitted.

In one implementation, the specific implementation manner in which the video processing device dynamically sets the encoding policy of the video according to the content characteristics of the video and the historical feedback information may be: the video processing device dynamically sets the encoding mode and/or encoding parameters of the video according to the content characteristics of the video and the historical transmission result, for example, sets quantization parameters according to the content characteristics of the video, and sets the GOP structure of the video to GOP structure 1 if the historical transmission result is the transmission condition of the wireless channel and the transmission condition of the wireless channel meets the preset degradation condition; and if the transmission condition of the wireless channel does not meet the preset degradation condition, setting the GOP structure of the video as a GOP structure 2, wherein the code rate of the target code stream 1 obtained by coding according to the GOP structure 1 is lower than the code rate of the target code stream 2 obtained by coding according to the GOP structure 2. In one implementation, a video processing device may dynamically set encoding parameters for a video according to content characteristics of the video and content characteristics of user feedback.
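The selection rule in the example above can be sketched in Python as follows. The `EncodingPolicy` class and the `set_encoding_policy` function are hypothetical names, and the quantization parameters passed in are assumed to have been derived from the content characteristics beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingPolicy:
    gop_structure: str       # encoding mode: which GOP structure to use
    qp_by_object: tuple      # encoding parameters: (image object, quantization parameter) pairs


def set_encoding_policy(qp_by_object, channel_degraded):
    """Pick GOP structure 1 (lower code rate) when the transmission condition
    meets the preset degradation condition, and GOP structure 2 otherwise."""
    gop = "GOP structure 1" if channel_degraded else "GOP structure 2"
    return EncodingPolicy(gop, tuple(sorted(qp_by_object.items())))


# Hypothetical quantization parameters derived from the content characteristics.
qp_by_object = {"person": 22, "building": 40}
print(set_encoding_policy(qp_by_object, channel_degraded=True))
print(set_encoding_policy(qp_by_object, channel_degraded=False))
```

Keeping the GOP structure and the quantization parameters in one immutable object makes it straightforward to compare or store a policy later.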

In one implementation, the video processing device may dynamically set the encoding policy of the video according to the content characteristics of the video and the historical feedback information as follows: the video processing device acquires a historical encoding policy, which includes a historical encoding mode and historical encoding parameters, and acquires the content characteristics and the historical feedback information that were referenced when the historical encoding policy was set. The video processing device then determines whether the content characteristics of the video are the same as the content characteristics referenced when the historical encoding policy was set, and whether the historical feedback information is the same as the historical feedback information referenced when the historical encoding policy was set. If both are the same, the video processing device may set the encoding mode of the video to the historical encoding mode and set the encoding parameters of the video to the historical encoding parameters. If the content characteristics of the video are different from the content characteristics referenced when the historical encoding policy was set, and/or the historical feedback information is different from the historical feedback information referenced when the historical encoding policy was set, the video processing device may reset the encoding mode and the encoding parameters of the video according to the content characteristics of the video and the historical feedback information.
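The reuse of a historical encoding policy can be sketched in Python as follows. The `PolicySelector` class, its `rebuilds` counter, and the `build_policy` callback are hypothetical; the sketch assumes that content characteristics and historical feedback information are represented as values that can be compared for equality.

```python
class PolicySelector:
    """Reuse the historical encoding policy while its inputs are unchanged."""

    def __init__(self, build_policy):
        self._build_policy = build_policy   # callback that sets a new policy (assumption)
        self._last_inputs = None            # inputs referenced when the historical policy was set
        self._last_policy = None            # the historical encoding policy
        self.rebuilds = 0                   # number of times a policy was (re)set

    def select(self, content_features, feedback):
        inputs = (content_features, feedback)
        if inputs != self._last_inputs:
            # Content characteristics and/or feedback differ: reset the policy.
            self._last_policy = self._build_policy(content_features, feedback)
            self._last_inputs = inputs
            self.rebuilds += 1
        # Otherwise both are the same: keep the historical mode and parameters.
        return self._last_policy


selector = PolicySelector(lambda content, feedback: {"content": content, "feedback": feedback})
first = selector.select(("person", "vehicle"), "degraded")
second = selector.select(("person", "vehicle"), "degraded")   # reuses the historical policy
third = selector.select(("person", "vehicle"), "good")        # feedback changed: policy is reset
print(selector.rebuilds)
```

In this usage, the second call returns the same policy object as the first, and only the third call, whose feedback differs, triggers a reset.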

Step S203: the video processing device encodes the original video data based on the encoding policy to obtain a target code stream. Specifically, the video processing device may encode the original video data based on the dynamically set encoding policy to obtain the target code stream, and send the target code stream to the first wireless transmission device, so that the first wireless transmission device transmits the target code stream through the wireless channel.

Step S204: the video processing device sends the target code stream to the first wireless transmission device. Specifically, after obtaining the target code stream, the video processing device may send it to the first wireless transmission device, so that the first wireless transmission device can convert the target code stream into a wireless signal for transmission in the wireless channel.

In one implementation, if the original video data acquired by the video processing device is obtained by decoding a source code stream from the video sending device, and the source code stream was encoded by the video sending device based on an original encoding policy, then when sending the target code stream to the first wireless transmission device, the video processing device may also send the original encoding policy to the first wireless transmission device, so that the first wireless transmission device forwards the original encoding policy to the video receiving device, and the video receiving device can recover the source code stream of the video based on the original encoding policy.

Step S205: the first wireless transmission device transmits the target code stream through a wireless channel. Because the target code stream is obtained by encoding based on a dynamically set encoding policy, and the encoding policy is dynamically set according to the content characteristics of the video and the historical feedback information, transmitting this target code stream through the wireless channel helps improve its transmission performance in the wireless channel. Specifically, when a wireless receiving module is integrated in the video receiving device, the first wireless transmission device may directly transmit the target code stream to the video receiving device through the wireless channel, so that the video receiving device decodes the target code stream to obtain the original video data of the video, and outputs the original video data for the user to watch. The wireless receiving module in the video receiving device is used to receive the target code stream transmitted by the first wireless transmission device through the wireless channel. In one implementation, when no wireless receiving module is integrated in the video receiving device, the first wireless transmission device may transmit the target code stream to the second wireless transmission device through the wireless channel, so that the second wireless transmission device forwards the received target code stream to the video receiving device.

In one implementation, if the target code stream is obtained by the video processing device transcoding a source code stream from the video sending device, the first wireless transmission device may transmit the target code stream directly to the video receiving device through the wireless channel, so that the video receiving device transcodes the target code stream to obtain the source code stream of the video and sends the source code stream to the video application device. Specifically, the video receiving device may transcode the target code stream into the source code stream as follows: the video receiving device decodes the target code stream to obtain the original video data, and encodes the original video data based on the original encoding policy to obtain the source code stream of the video, thereby restoring the target code stream to the original coding format that existed before transcoding in the video processing device. This solves the problem that the video application device cannot decode the target code stream after receiving it when the coding strategies supported by the video application device and the video processing device are different. Because the video receiving device transcodes the target code stream into the source code stream in the original coding format before sending it to the video application device, no additional requirement is placed on the decoding capability of the video application device, and the transcoding process is transparent to the video sending device and the video application device, that is, any video sending device and any video application device can be supported.
In an implementation manner, the original encoding policy may also be sent to the video receiving device by the video application device, and in this manner, the target code stream may be transcoded into an encoding format supported by the video application device, so as to ensure that the video application device can successfully decode the source code stream sent by the video receiving device.

In one implementation, the first wireless transmission device may obtain content characteristics fed back by the user, count historical transmission results within a preset time period, and send the historical transmission results and the content characteristics fed back by the user to the video processing device. The historical transmission result in the preset time period may include the number of lost packets in the preset time period, the average transmission delay in the preset time period, and/or the average data transmission rate in the preset time period. The content feature of the user feedback may be directly sent to the first wireless transmission device by the video receiving device, or may be sent to the second wireless transmission device by the video receiving device and forwarded to the first wireless transmission device by the second wireless transmission device, which is not limited in this embodiment of the present application. The first wireless transmission device sends the historical transmission result and the content characteristics fed back by the user to the video processing device, so that the video processing device can dynamically set the encoding strategy based on the historical transmission result and the content characteristics fed back by the user, and the transmission performance of the target code stream in a wireless channel, which is obtained by encoding based on the dynamically set encoding strategy, can be improved.
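The statistics described above can be sketched as follows. This is only an illustrative sketch: the record layout `PacketRecord`, the function name `summarize_history`, and the use of bits per second as the rate unit are assumptions made for the example and are not defined in this embodiment.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PacketRecord:
    """One transmitted packet (hypothetical record layout)."""
    size_bytes: int
    sent_at: float                 # send time in seconds
    received_at: Optional[float]   # None means the packet was lost


def summarize_history(records: List[PacketRecord],
                      window_start: float,
                      window_end: float) -> Dict[str, Optional[float]]:
    """Summarize the packets sent within [window_start, window_end)."""
    window = [r for r in records if window_start <= r.sent_at < window_end]
    delivered = [r for r in window if r.received_at is not None]

    lost_packets = len(window) - len(delivered)
    # Average one-way delay over delivered packets; undefined if none arrived.
    avg_delay = (sum(r.received_at - r.sent_at for r in delivered) / len(delivered)
                 if delivered else None)
    # Average delivered data rate over the window, in bits per second.
    duration = window_end - window_start
    avg_rate = sum(r.size_bytes for r in delivered) * 8 / duration if duration > 0 else 0.0

    return {"lost_packets": lost_packets,
            "avg_delay_s": avg_delay,
            "avg_rate_bps": avg_rate}


records = [
    PacketRecord(1000, 0.0, 0.25),
    PacketRecord(1000, 0.5, None),   # lost
    PacketRecord(1000, 1.0, 1.75),
    PacketRecord(1000, 5.0, 5.1),    # outside the window used below
]
summary = summarize_history(records, 0.0, 2.0)
```

In this sketch the first wireless transmission device would send `summary`, together with the content characteristics fed back by the user, to the video processing device.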

Therefore, by implementing the embodiment of the application, the encoding strategy of the video can be dynamically set according to the content characteristics of the video and the historical feedback information, and by adopting the method, the target code stream obtained by encoding the original video data according to the dynamically set encoding strategy can better adapt to the channel state of the current wireless channel, and the transmission performance of the target code stream in the wireless channel can be improved.

Referring to fig. 3, fig. 3 is a schematic flow chart of another video processing method provided in this embodiment of the present application, which can be applied to a wireless video transmission system, and which details how to set a GOP structure of a video, and the method can include, but is not limited to, the following steps:

Step S301: The video processing device obtains original video data of a video to be transmitted. It should be noted that, for the execution process of step S301, reference may be made to the specific description of step S201 in fig. 2, which is not repeated herein.

In one implementation, when the GOP length of the first GOP structure and the GOP length of the second GOP structure are both 4, that is, the first GOP structure and the second GOP structure each include 4 video frames, the first GOP structure may be as shown in fig. 3a, and the second GOP structure may be as shown in fig. 3b. In fig. 3a and fig. 3b, I represents an I frame; X represents a video frame that is referenced by other video frames, and an X frame may be an I frame or a P frame; Y represents a video frame that is not referenced by other video frames (i.e., a first video frame), and a Y frame may be a P frame or a B frame; the numeral following each letter is the number of the video frame and is used to distinguish different video frames; the arrow direction represents the reference relationship between two video frames, e.g., in fig. 3a, the arrow between I0 and Y1 indicates that Y1 references I0. As can be seen from fig. 3a and fig. 3b, the number of first video frames in the first GOP structure is 2, and the number of first video frames in the second GOP structure is 1, i.e., the proportion of first video frames in the first GOP structure is higher than that in the second GOP structure. It should be noted that the first GOP structure shown in fig. 3a and the second GOP structure shown in fig. 3b are only examples and do not limit the embodiments of the present application; in other possible implementations, the first GOP structure and the second GOP structure may also take other forms.

Step S302: If the transmission condition of the wireless channel meets a preset degradation condition, the video processing device sets the GOP structure of the video to the first GOP structure. Specifically, when the transmission condition of the wireless channel meets the preset degradation condition, the channel quality of the wireless channel is poor and packet loss is likely during transmission. Because the proportion of first video frames in the first GOP structure is higher than that in the second GOP structure, setting the GOP structure of the video to the first GOP structure rather than the second GOP structure makes first video frames account for a larger share of the video frames lost during transmission. Because a first video frame is not referenced by other video frames, losing it does not affect the normal decoding of other video frames. This reduces the influence of packet loss on decoding at the video receiving device as much as possible, and in turn helps keep video playback on the video receiving device smooth.

For example, if 2 video frames are lost when a target code stream encoded based on the first GOP structure shown in fig. 3a is transmitted in a wireless channel, and the 2 lost video frames are both Y frames (i.e., first video frames), then, because the loss of a first video frame does not affect the normal decoding of other video frames, the video receiving device can still successfully decode the received video frames and play the decoded video smoothly. However, if 2 video frames are lost when a target code stream encoded based on the second GOP structure shown in fig. 3b is transmitted in the wireless channel, at least 1 of the 2 lost video frames is not a first video frame, that is, it is an I frame or an X frame in the second GOP structure. Because both the I frame and the X frames are referenced by other video frames, losing an I frame or an X frame means that the video frames referencing it cannot be decoded by the video receiving device. The decoding success rate of the video receiving device is therefore low, which causes stuttering and similar problems when the video receiving device plays the video.
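The effect of frame loss on decoding in this example can be sketched as follows. The exact arrows of fig. 3a and fig. 3b are not reproduced in the text, so the two reference layouts below are assumptions chosen only to match the stated frame counts (two Y frames in the first structure, one Y frame in the second); the function names are likewise hypothetical.

```python
from typing import Dict, List, Set

# Assumed reference layouts: frame -> frames it references, listed in decoding order.
FIG_3A_LIKE: Dict[str, List[str]] = {"I0": [], "Y1": ["I0"], "X2": ["I0"], "Y3": ["X2"]}
FIG_3B_LIKE: Dict[str, List[str]] = {"I0": [], "X1": ["I0"], "X2": ["X1"], "Y3": ["X2"]}


def decodable_frames(references: Dict[str, List[str]], lost: Set[str]) -> Set[str]:
    """Return the frames that can be decoded when the frames in `lost` never arrive."""
    decodable: Set[str] = set()
    for frame, refs in references.items():   # dict order is the decoding order
        # A frame decodes only if it arrived and all of its references decoded.
        if frame not in lost and all(ref in decodable for ref in refs):
            decodable.add(frame)
    return decodable


def undecodable_received(references: Dict[str, List[str]], lost: Set[str]) -> Set[str]:
    """Return frames that arrived but cannot be decoded because a reference is missing."""
    received = set(references) - set(lost)
    return received - decodable_frames(references, lost)


# Two Y frames lost in the first structure: every received frame still decodes.
wasted_a = undecodable_received(FIG_3A_LIKE, {"Y1", "Y3"})
# Two frames lost in the second structure, one of them an X frame: X2 arrives but is useless.
wasted_b = undecodable_received(FIG_3B_LIKE, {"X1", "Y3"})
```

Under these assumed layouts, `wasted_a` is empty while `wasted_b` contains `X2`, which mirrors the comparison made in the paragraph above.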

In one implementation, the types of video frames in different GOPs belonging to the same GOP structure may or may not be identical. For example, when the GOP length of the first GOP structure and the GOP length of the second GOP structure are both 4, that is, each GOP of the first GOP structure and each GOP of the second GOP structure includes 4 video frames, the first GOP structure may be as shown in fig. 3c, and the second GOP structure may be as shown in fig. 3d. In fig. 3c and fig. 3d, GOP1 represents the first of two adjacent GOPs, and GOP2 represents the second of the two adjacent GOPs. In fig. 3c, the first GOP and the second GOP both belong to the first GOP structure, but the types of video frames they include are not identical: the first GOP includes an I frame, but the second GOP does not. In fig. 3d, the first GOP and the second GOP both belong to the second GOP structure, and both include video frames of the same types. I represents a second video frame; X represents a video frame that is referenced by other video frames, and an X frame may be an I frame or a P frame; Y represents a first video frame, and a Y frame may be a P frame or a B frame; the numeral following each letter is the number of the video frame and is used to distinguish different video frames; the arrow direction represents the reference relationship between two video frames. As can be seen from fig. 3c and fig. 3d, in the first GOP structure, a video frame in the second GOP (i.e., X4) can reference the second video frame in the first GOP (i.e., I0), i.e., the number of second video frames in the second GOP is 0, while in the second GOP structure, the video frames in the second GOP do not reference the video frames in the first GOP, and the numbers of second video frames in the first GOP and in the second GOP are both 1, i.e., the occupation ratio of the second video frames in the first GOP structure is higher than that of the second video frames in the second GOP structure. It should be noted that the first GOP structure shown in fig. 3c and the second GOP structure shown in fig. 3d are only examples and do not limit the embodiments of the present application; in other possible implementations, the first GOP structure and the second GOP structure may also take other forms.

In one implementation, the video processing apparatus may set the GOP structure of the video to a second GOP structure if the transmission condition of the wireless channel does not satisfy a preset degradation condition. When the transmission condition of the wireless channel does not meet the preset degradation condition, the channel quality of the wireless channel is good, and packet loss is not easy to generate in the transmission process, because the proportion of the second video frame in the first GOP structure is higher than that of the second video frame in the second GOP structure, and the data volume of the second video frame is larger than that of the first video frame, when the GOP lengths of the first GOP structure and the second GOP structure are the same, the code rate of the target code stream obtained based on the second GOP structure is lower than that of the target code stream obtained based on the first GOP structure, in other words, when the target code stream obtained based on the second GOP structure is transmitted in the wireless channel, the consumed bandwidth resource is less than that when the target code stream obtained based on the first GOP structure is transmitted in the wireless channel.

In one implementation, the video processing device may set default GOP parameters for the first GOP structure and set default GOP parameters for the second GOP structure. In an implementation manner, after the GOP structure of the video is set as the first GOP structure, the video processing device may acquire a default GOP parameter corresponding to the first GOP structure, and encode the original video data according to the first GOP structure and the default GOP parameter corresponding to the first GOP structure to obtain the target code stream. In an implementation manner, after the GOP structure of the video is set as the second GOP structure, the video processing device may acquire a default GOP parameter corresponding to the second GOP structure, and encode the original video data according to the second GOP structure and the default GOP parameter corresponding to the second GOP structure to obtain the target code stream. For example, the default GOP parameters for the first GOP structure can be as shown in table 1, and the default GOP parameters for the second GOP structure can be as shown in table 2.

TABLE 1 Default GOP parameters for the first GOP Structure

GOP length (frames)	Number of first video frames (frames)	Number of second video frames (frames)	Number of other video frames (frames)
4	2	1	1

TABLE 2 Default GOP parameters for the second GOP Structure

GOP length (frames)	Number of first video frames (frames)	Number of second video frames (frames)	Number of other video frames (frames)
4	1	1	2
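The default parameters in tables 1 and 2, together with the selection between the two GOP structures, can be sketched as follows. The class name `GopParams`, the dictionary `DEFAULT_GOP_PARAMS`, and the function `select_gop` are hypothetical names introduced for illustration; how the preset degradation condition itself is evaluated is outside the sketch and is passed in as a boolean.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class GopParams:
    """Default parameters of one GOP structure (all values in frames)."""
    length: int
    first_frames: int    # frames not referenced by other frames
    second_frames: int   # second video frames (e.g., I frames)
    other_frames: int

    def __post_init__(self) -> None:
        # The three frame counts must add up to the GOP length.
        if self.first_frames + self.second_frames + self.other_frames != self.length:
            raise ValueError("frame counts do not add up to the GOP length")


# Values taken from table 1 and table 2.
DEFAULT_GOP_PARAMS: Dict[str, GopParams] = {
    "first": GopParams(length=4, first_frames=2, second_frames=1, other_frames=1),
    "second": GopParams(length=4, first_frames=1, second_frames=1, other_frames=2),
}


def select_gop(channel_degraded: bool) -> Tuple[str, GopParams]:
    """Pick the first GOP structure on a degraded channel, otherwise the second."""
    name = "first" if channel_degraded else "second"
    return name, DEFAULT_GOP_PARAMS[name]


name, params = select_gop(channel_degraded=True)
first_frame_ratio = params.first_frames / params.length
```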

In one implementation, before the video processing device sets the GOP structure of the video to the first GOP structure (or the second GOP structure), it may analyze the original video data to obtain a target video scene of the video, store the target video scene, and determine the content characteristics of the video based on the target video scene. In one implementation, the video processing device may obtain different target video scenes by analyzing different original video data of the video, where a piece of original video data may be one frame of image. The video processing device may store the target video scene identified from the previous frame of image as the historical video scene. When the video processing device identifies the current image to obtain a target video scene, it may retrieve the stored historical video scene and judge whether the currently identified target video scene is the same as the historical video scene. If they are the same, the video scene of the video has not changed, i.e., the difference between the previous frame of image and the current image is small; if they are different, the video scene of the video has changed, i.e., the difference between the previous frame of image and the current image is large.
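The comparison between the currently identified target video scene and the stored historical video scene can be sketched as follows. The class name `SceneTracker` and the use of string labels for scenes are assumptions for illustration; how a scene label is obtained from an image is not covered by the sketch.

```python
from typing import Optional


class SceneTracker:
    """Remember the scene of the previous image and report scene changes."""

    def __init__(self) -> None:
        self.historical_scene: Optional[str] = None

    def update(self, target_scene: str) -> bool:
        """Store `target_scene` and return True if it differs from the stored one."""
        changed = (self.historical_scene is not None
                   and target_scene != self.historical_scene)
        # The current scene becomes the historical scene for the next image.
        self.historical_scene = target_scene
        return changed


tracker = SceneTracker()
changes = [tracker.update(scene) for scene in ["indoor", "indoor", "street"]]
```

The first image has no stored historical scene, so the sketch treats it as unchanged; this is a design choice of the example.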

In one implementation, if the currently identified target video scene is the same as the historical video scene, the video processing device may determine the historical encoding policy as the encoding policy of the video. In one implementation, if the currently identified target video scene is different from the historical video scene, the video processing device may adjust the encoding policy. A difference between the currently identified target video scene and the historical video scene indicates that the difference between the previous frame of image and the current image is large; if the encoding policy is not adjusted and the historical encoding policy is still used for encoding, the information loss generated in the reference process is large, so that the recovery quality of the video is low. Specifically, the video processing device may adjust the encoding policy by increasing the number of second video frames included in the GOP structure of the video, where a second video frame may be an I frame. When the original image of the current frame is compressed into a video frame (such as a P frame or a B frame) by referencing other original images during encoding and is then transmitted, information loss occurs in the reference process, so that when the video frame is restored into an image in the video receiving device, an error exists between the restored image and the original image. If the difference between the referenced original image and the original image of the current frame is small, the information loss generated in the reference process is small; similarly, if the difference is large, the information loss generated in the reference process is large. Because the data volume of a second video frame is greater than that of a first video frame, a second video frame can record more image detail information than a first video frame. Therefore, when the video scene changes, increasing the number of second video frames (such as I frames) included in the GOP structure of the video reduces the information loss generated in the reference process and improves the recovery quality of the video.

In one implementation, when detecting that the video scene has changed, the video processing device may insert an I frame into the set GOP structure, where the I frame may be inserted before the next frame of the current video frame. Taking fig. 3b as an example, if the historical video scene is obtained by analyzing and identifying the original image corresponding to the X1 frame, the current target video scene is obtained by analyzing and identifying the original image corresponding to the X2 frame, and the target video scene is different from the historical video scene, this indicates that the difference between the original image corresponding to the X1 frame and the original image corresponding to the X2 frame is large. In this case, the video processing device cannot accurately and completely record the difference between the two original images in the X2 frame, so the image obtained by the video receiving device by decoding the X2 frame differs greatly from the original image corresponding to the X2 frame, that is, most of the image information of the original image corresponding to the X2 frame is lost, and the quality of the video recovered by the video receiving device is low. To avoid this problem, the video processing device may insert an I frame before the next frame of the current video frame (i.e., the X1 frame). The new second GOP structure obtained after inserting the I frame into the second GOP structure shown in fig. 3b may be as shown in fig. 3e, in which the dotted-line portion is the newly inserted I frame (i.e., I2). In one implementation, besides inserting an I frame into the second GOP structure, the video processing device may alternatively modify the type of a video frame in the second GOP structure. Taking fig. 3b as an example, the video processing device may replace the X2 frame in the second GOP structure with an I2 frame, which does not change the GOP length of the second GOP structure.
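The two options described above, inserting an I frame or replacing a frame with an I frame, can be sketched as operations on a list of frame types. The list representation and the function names are assumptions for illustration, and frame numbering after the change is not modeled.

```python
from typing import List


def insert_i_frame(frame_types: List[str], current_index: int) -> List[str]:
    """Insert an I frame before the frame that follows the current frame."""
    result = list(frame_types)
    result.insert(current_index + 1, "I")   # the GOP grows by one frame
    return result


def replace_with_i_frame(frame_types: List[str], current_index: int) -> List[str]:
    """Turn the frame that follows the current frame into an I frame."""
    result = list(frame_types)
    if current_index + 1 < len(result):     # nothing to replace after the last frame
        result[current_index + 1] = "I"     # the GOP length stays the same
    return result


# Second GOP structure of fig. 3b as a list of types: I0, X1, X2, Y3.
gop = ["I", "X", "X", "Y"]
# The scene change is detected between X1 (index 1) and X2.
inserted = insert_i_frame(gop, current_index=1)
replaced = replace_with_i_frame(gop, current_index=1)
```

Insertion lengthens the GOP by one frame, while replacement keeps the GOP length at 4, which matches the distinction drawn in the text.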

Step S303: The video processing device encodes the original video data based on the first GOP structure to obtain a target code stream. Specifically, when the transmission condition of the wireless channel meets the preset degradation condition, the video processing device encodes the original video data based on the first GOP structure to obtain the target code stream, which reduces the influence of packet loss on decoding at the video receiving device as much as possible and in turn helps keep video playback on the video receiving device smooth.

In one implementation, when the transmission condition of the wireless channel does not satisfy the preset degradation condition, the video processing device may perform encoding processing on the original video data based on the second GOP structure to obtain the target code stream, and when the target code stream obtained based on the second GOP structure is transmitted in the wireless channel, less bandwidth resources are consumed compared to when the target code stream obtained based on the first GOP structure is transmitted in the wireless channel.

In one implementation, when the transmission condition of the wireless channel does not satisfy the preset degradation condition, the video processing device may retain more detailed information of the image objects that the user pays more attention to when encoding the original video data based on the second GOP structure. Compared with transmitting a target code stream encoded based on the first GOP structure in the wireless channel, this yields, at an unchanged code rate of the target code stream, higher video quality for the image objects that the user pays more attention to in the video obtained by the video receiving device, which helps improve the viewing experience of the user.

Step S304: The video processing device sends the target code stream to the first wireless transmission device.

Step S305: The first wireless transmission device transmits the target code stream through a wireless channel.

It should be noted that the execution processes of steps S304 to S305 can respectively refer to the detailed descriptions of steps S204 to S205 in fig. 2, which are not repeated herein.

It can be seen that, by implementing the embodiment of the present application, when the transmission condition of the wireless channel meets the preset degradation condition, the GOP structure of the video is set to the first GOP structure, so that first video frames account for a large proportion of the video frames lost during transmission. Because a first video frame is not referenced by other video frames, the loss of a first video frame does not affect the normal decoding of other video frames, so that the influence of packet loss on decoding at the video receiving device can be reduced as much as possible, which in turn helps keep video playback on the video receiving device smooth.

Referring to fig. 4, fig. 4 is a flow chart of another video processing method provided in this embodiment of the present application, which may be applied to a wireless video transmission system, and which details how to set quantization parameters of a video, and the method may include, but is not limited to, the following steps:

Step S401: The video processing device obtains original video data of a video to be transmitted. It should be noted that, for the execution process of step S401, reference may be made to the specific description of step S201 in fig. 2, which is not repeated herein.

In order to improve the recovery quality of the video, a smaller quantization parameter may be set for the video. However, decreasing the quantization parameter increases the video bitrate, and if the bitrate exceeds the transmission bandwidth of the wireless channel, packet loss during wireless transmission may increase. Therefore, the video processing device can set a more appropriate quantization parameter for the video by referring to the transmission condition of the wireless channel, and encode the original video data based on the set quantization parameter to obtain the target code stream.

In one implementation, if the transmission condition of the wireless channel satisfies a preset degradation condition, the video processing device may set a quantization parameter of the video according to the content characteristics fed back by the user. Specifically, when the transmission condition of the wireless channel meets the preset degradation condition, it is indicated that the channel quality of the wireless channel is poor, that is, the data volume which can be transmitted by the wireless channel is small, at this time, the video processing device sets the quantization parameter of the video according to the content characteristics fed back by the user, so that the limited wireless resources can be fully utilized to transmit more detailed information of the image object which is actually concerned by the user, and the viewing experience of the user is improved.

In one implementation, the content feature fed back by the user may be the category of the image objects that the user actually pays more attention to. When the transmission condition of the wireless channel satisfies the preset degradation condition, the video processing device may set the quantization parameter of the video according to the content feature fed back by the user as follows: the video processing device identifies, in the original video data, target image objects whose category is the same as the category of the image objects that the user actually pays attention to, sets a smaller quantization parameter for the video data constituting the target image objects, and sets a larger quantization parameter for the video data constituting the image objects other than the target image objects. For example, if the image objects in the original video image include a person, a vehicle, and a background, and the content feature fed back by the user indicates that the category of image objects the user pays attention to is the motion category, the video processing device may identify the person and the vehicle as image objects belonging to the motion category, set a smaller quantization parameter for the video data constituting the person and the vehicle, and set a larger quantization parameter for the video data constituting the background (the background being a still-category image object). In this way, more image detail information of the moving image objects can be transmitted, so that the video quality of the moving image objects in the video obtained by the video receiving device is higher, which improves the viewing experience of the user.
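The category-based assignment in the example above can be sketched as follows. The function name `qp_by_category` and the two quantization parameter values are hypothetical and serve only to show "smaller" versus "larger"; the embodiment does not specify concrete values.

```python
from typing import Dict


def qp_by_category(object_categories: Dict[str, str],
                   focus_category: str,
                   focus_qp: int = 24,
                   other_qp: int = 36) -> Dict[str, int]:
    """Give objects of the category the user focuses on a smaller quantization parameter."""
    if focus_qp >= other_qp:
        raise ValueError("focus_qp must be smaller than other_qp")
    return {name: (focus_qp if category == focus_category else other_qp)
            for name, category in object_categories.items()}


objects = {"person": "motion", "vehicle": "motion", "background": "still"}
qps = qp_by_category(objects, focus_category="motion")
```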

In one implementation, the content feature fed back by the user may also be an image object that is actually focused by the user. In one implementation, the number of the image objects that are fed back by the user and are actually concerned with may be one or more, when the number of the image objects that are fed back by the user and are actually concerned with is multiple, each image object may correspond to a degree of attention, the degrees of attention of different image objects may be different, and specifically, the video processing device may set a quantization parameter of video data that constitutes each image object according to the degree of attention. For example, if the image objects in the original video image include a person, a vehicle, and a background, and the content characteristics fed back by the user indicate that the image objects of greater interest to the user are the person and the vehicle, and the degree of interest of the person is higher than the degree of interest of the vehicle, the video processing apparatus may set a minimum quantization parameter for the video data constituting the person, a smaller quantization parameter for the video data constituting the vehicle, and a larger quantization parameter for the video data constituting the background.
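When each image object carries a degree of attention, the assignment can be sketched as a ranking: the higher the degree of attention, the smaller the quantization parameter. The function name `qp_by_attention`, the numeric attention scores, and the parameter values below are assumptions for illustration only.

```python
from typing import Dict, List


def qp_by_attention(attention: Dict[str, float],
                    all_objects: List[str],
                    min_qp: int = 22,
                    step: int = 4,
                    background_qp: int = 38) -> Dict[str, int]:
    """Assign smaller quantization parameters to objects with a higher degree of attention."""
    # Rank the objects the user cares about, highest degree of attention first.
    ranked = sorted(attention, key=lambda name: attention[name], reverse=True)
    # Each rank costs `step` more, but never more than the background value.
    qps = {name: min(min_qp + rank * step, background_qp)
           for rank, name in enumerate(ranked)}
    # Objects without a degree of attention get the largest quantization parameter.
    for name in all_objects:
        qps.setdefault(name, background_qp)
    return qps


attention_qps = qp_by_attention({"person": 0.9, "vehicle": 0.6},
                                ["person", "vehicle", "background"])
```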

In one implementation, if the transmission condition of the wireless channel does not satisfy the preset degradation condition, the video processing device may set a quantization parameter of the video according to a content feature of the video and a content feature fed back by a user. Specifically, when the transmission condition of the wireless channel does not meet the preset degradation condition, it is indicated that the channel quality of the wireless channel is good, that is, the data volume which can be transmitted by the wireless channel is large, at this time, the video processing device sets the quantization parameter of the video according to the content feature of the video and the content feature fed back by the user, can transmit more detailed information of the image object which the user may pay attention to through the wireless channel, and transmit more detailed information of the image object which the user actually pays attention to, which is beneficial to improving the viewing experience of the user.

Step S402: If the content characteristics of the video are different from the content characteristics fed back by the user, the video processing device sets the quantization parameter of the video according to the content characteristics fed back by the user. Specifically, the video processing device may determine whether the content characteristics of the video are the same as the content characteristics fed back by the user. If they are the same, the video processing device may set the quantization parameter of the video according to either the content characteristics of the video or the content characteristics fed back by the user; if they are different, the video processing device may set the quantization parameter of the video according to the content characteristics fed back by the user. Compared with the content characteristics of the video, the content characteristics fed back by the user indicate more accurately the image objects the user pays attention to. Therefore, when the two differ, setting the quantization parameter of the video according to the content characteristics fed back by the user allows more detailed information of the image objects the user pays attention to to be transmitted through the wireless channel, which improves the viewing experience of the user.

In one implementation, if the content characteristics of the video are different from the content characteristics fed back by the user and the transmission condition of the wireless channel meets the preset degradation condition, the video processing device may set the quantization parameter of the video according to the content characteristics fed back by the user. In one implementation, if the content characteristics of the video are different from the content characteristics fed back by the user and the transmission condition of the wireless channel does not satisfy the preset degradation condition, the video processing device may set the quantization parameter of the video according to both the content characteristics of the video and the content characteristics fed back by the user. For example, if the image objects in the original video image include a person, a vehicle, and a background, the content characteristics obtained by analyzing the original video image indicate that the image object the user is likely to pay attention to is the person, and the content characteristics fed back by the user indicate that the image object the user actually pays attention to is the vehicle, the video processing device may set a minimum quantization parameter for the video data constituting the vehicle, a smaller quantization parameter for the video data constituting the person, and a larger quantization parameter for the video data constituting the background.
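The decision described above, namely which content characteristics drive the quantization parameter, can be sketched as a small priority rule. The function name `focus_priority` and the representation of each content characteristic as a single object name are simplifying assumptions; the returned list is ordered from the object that should receive the smallest quantization parameter onward.

```python
from typing import List


def focus_priority(video_focus: str, user_focus: str, channel_degraded: bool) -> List[str]:
    """Return the focus objects in priority order (first entry gets the smallest QP)."""
    if video_focus == user_focus:
        # Both sources agree, so either one may be used.
        return [user_focus]
    if channel_degraded:
        # Limited capacity: follow only the content characteristics fed back by the user.
        return [user_focus]
    # Good channel: use both, with the user's feedback taking precedence.
    return [user_focus, video_focus]


good_channel = focus_priority("person", "vehicle", channel_degraded=False)
bad_channel = focus_priority("person", "vehicle", channel_degraded=True)
```

With a good channel the sketch yields the vehicle first and the person second, matching the example in which the vehicle receives the minimum quantization parameter and the person a smaller one.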

Step S403: The video processing device encodes the original video data based on the set quantization parameters to obtain a target code stream. Specifically, when the content characteristics of the video are different from the content characteristics fed back by the user, the video processing device sets the quantization parameters of the video according to the content characteristics fed back by the user and encodes the original video data based on the set quantization parameters to obtain the target code stream. In this way, more detailed information about the image object that the user is actually concerned with can be transmitted through the wireless channel, which improves the viewing experience of the user.

Step S404: The video processing device sends the target code stream to the first wireless transmission device.

Step S405: The first wireless transmission device transmits the target code stream through a wireless channel.

It should be noted that the execution processes of steps S404 to S405 can respectively refer to the detailed descriptions of steps S204 to S205 in fig. 2, which are not repeated herein.

Therefore, by implementing the embodiment of the application, when the content characteristics of the video are different from the content characteristics fed back by the user, the quantization parameters of the video can be set according to the content characteristics fed back by the user, and by the method, more detailed information of the image object which is actually concerned by the user can be transmitted through the wireless channel, so that the watching experience of the user can be improved.

Referring to fig. 5, fig. 5 is a flow chart of another video processing method provided in this embodiment, which may be applied to a wireless video transmission system, and details how to set the importance of a video data packet in a target code stream, where the method may include, but is not limited to, the following steps:

Step S501: The video processing device obtains original video data of a video to be transmitted.

Step S502: the video processing equipment sets the coding strategy of the video according to the content characteristics of the video and historical feedback information, the historical feedback information comprises historical transmission results and/or content characteristics fed back by a user, and the coding strategy comprises a coding mode and/or coding parameters. It should be noted that the execution processes of steps S501 to S502 can be referred to the specific descriptions of steps S201 to S202 in fig. 2, which are not repeated herein.

Step S503: the video processing equipment carries out coding processing on the original video data based on the coding strategy to obtain a target code stream, wherein the target code stream comprises a first video data packet and a second video data packet. Here, the first video data packet (or the second video data packet) may include all video data constituting one image, or the first video data packet (or the second video data packet) may include video data constituting a partial image area in one image. For example, the first video data packet may include an I-frame, or the first video data packet may include a portion of data in an I-frame and the second video data packet may include another portion of data in an I-frame.

Step S504: the video processing apparatus sets the importance of the first video data packet and sets the importance of the second video data packet based on the content characteristics of the video, the content characteristics fed back by the user, and/or the encoding policy of the video. The importance of the video data packet may be used to indicate the attention of a user to the video data in the video data packet, or the importance of the video data packet may be used to indicate the influence of the video data in the video data packet on the quality of the video recovered by the video receiving device. For example, if the importance of the video data packet is higher, it indicates that the user is more concerned about the video data in the video data packet. As another example, if the importance of the video data packet is higher, it indicates that the video data in the video data packet has a greater influence on the video recovery quality. By setting importance for the first video data packet and the second video data packet, it is possible to transmit the video data packets having higher importance by making full use of limited wireless channel resources.

In one implementation, the video processing device may determine the number of times each video frame in the GOP structure is referenced based on the encoding policy of the video. For example, if the GOP structure in the video encoding strategy is the second GOP structure shown in fig. 3b, and the encoding parameters in the video encoding strategy are the default GOP parameters corresponding to the second GOP structure shown in table 2, the video processing apparatus may determine that, in the second GOP structure, the I0 frame is referred to 3 times, the X1 frame is referred to 1 time, the X2 frame is referred to 1 time, and the Y3 frame is referred to 0 time.

In one implementation, the video processing device may set the importance of the first video data packet and the importance of the second video data packet based on the encoding policy of the video as follows: the video processing device obtains, based on the encoding policy of the video, a first number of times that the video frame corresponding to the first video data packet is referenced, and sets the importance of the first video data packet according to the first number of times; the video processing device obtains, based on the encoding policy of the video, a second number of times that the video frame corresponding to the second video data packet is referenced, and sets the importance of the second video data packet according to the second number of times. When the video receiving device decodes a video frame, it may need to refer to images decoded from other video frames, and if a referenced video frame is lost, the video receiving device cannot decode the frame that depends on it. Setting the importance of a video data packet according to the number of times its corresponding video frame is referenced therefore makes the importance better reflect the influence of that video frame on the decoding success rate of the video receiving device.

In one implementation, the video frame corresponding to the first video data packet may be the same as or different from the video frame corresponding to the second video data packet. When the first video data packet and the second video data packet both include all video data constituting one image, a video frame corresponding to the first video data packet is different from a video frame corresponding to the second video data packet. In one implementation, when the video frame corresponding to the first video data packet is the same as the video frame corresponding to the second video data packet, the first number of times is the same as the second number of times, that is, the importance of the first video data packet is the same as the importance of the second video data packet. In one implementation, when the video frame corresponding to the first video data packet is different from the video frame corresponding to the second video data packet, the first number of times may be the same as or different from the second number of times, that is, the importance of the first video data packet may be the same as or different from that of the second video data packet.

In one implementation, if the first number is higher than the second number, the importance of the first video data packet may be higher than that of the second video data packet, in other words, if the number of times that the video frame corresponding to the video data packet is referred to is larger, the importance of the video data packet is higher. By the method, the video data packets with higher importance can be transmitted preferentially, and the decoding success rate of the video receiving equipment is improved.
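A minimal sketch of the reference-count approach is given below. The reference lists are hypothetical and were chosen only so that the counts match the example quoted above (I0 referenced 3 times, X1 once, X2 once, Y3 not at all); the actual dependencies of the second GOP structure are defined by fig. 3b and table 2, which are not reproduced here. Using the count directly as the importance value is also an assumption.

```python
from collections import Counter

# Hypothetical reference lists: frame -> frames it references when decoding.
references = {
    "I0": [],
    "X1": ["I0"],
    "X2": ["I0", "X1"],
    "Y3": ["I0", "X2"],
}

def reference_counts(refs):
    """Count how many times each frame is referenced by other frames."""
    counts = Counter({frame: 0 for frame in refs})
    for targets in refs.values():
        counts.update(targets)
    return dict(counts)

counts = reference_counts(references)

# Assumed mapping from packets to frames; importance is the reference count.
packet_frames = {"first_packet": "I0", "second_packet": "Y3"}
importance = {pkt: counts[frame] for pkt, frame in packet_frames.items()}
```

Under these assumptions the first video data packet, which carries the I0 frame, receives a higher importance than the second video data packet.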

In one implementation, the video processing device may set the importance of a video data packet according to the category to which the video data in the video data packet belongs. In one implementation, the content characteristics of the video may include a first category and a second category, where the first category and the second category may be determined by the video processing device after identifying a target video scene of the video, and the first category and/or the second category may change under different target video scenes. In one implementation, the importance of the first category may be the same as or different from the importance of the second category; the embodiment of the present application is described by taking the case in which the two are different as an example. In one implementation, the importance of the first category and the importance of the second category may be set by the video processing device.

In one implementation, the video processing device may set the importance of the first video data packet and the importance of the second video data packet based on the content characteristics of the video as follows: the video processing device analyzes the video data in the first video data packet to obtain the category to which that video data belongs, and if the category is the first category, the video processing device may set the importance of the first video data packet according to the importance of the first category; the video processing device analyzes the video data in the second video data packet to obtain the category to which that video data belongs, and if the category is the second category, the video processing device may set the importance of the second video data packet according to the importance of the second category. The content characteristics of the video can indicate the categories of image objects that the user is likely to be concerned with. Setting the importance of video data packets according to the importance of those categories, and preferentially transmitting the video data packets with higher importance, allows the video recovered by the video receiving device to include more of the image objects that the user is concerned with, which helps improve the viewing experience of the user.

In one implementation, the importance of the first video packet may be higher than the importance of the second video packet if the importance of the first category is higher than the importance of the second category, in other words, the importance of the video packet is higher if the importance of the category to which the video data in the video packet belongs is higher. By the method, the video data packets with higher importance can be preferentially transmitted, so that the video recovered by the video receiving equipment comprises more image objects which are concerned by the user, and the watching experience of the user is improved.

In an implementation manner, if a category to which video data in a video data packet belongs is different from a category included in content features of a video, the video processing device may set a default importance for the video data packet, where the default importance may be set by the video processing device by default, or may be determined by the video processing device according to an input operation of a user, which is not limited in this embodiment of the present application.
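The category-based setting, including the default importance for categories that are not among the content characteristics of the video, can be sketched as a table lookup. The category names, the numeric importance values, and the function name packet_importance are assumptions for illustration only.

```python
# Hypothetical importance table for the first and second categories.
category_importance = {"person": 3, "vehicle": 2}

# Hypothetical default used when a packet's category is not in the table.
DEFAULT_IMPORTANCE = 1

def packet_importance(packet_category, table=category_importance,
                      default=DEFAULT_IMPORTANCE):
    """Return the importance of a packet from the category of its video data."""
    return table.get(packet_category, default)

first_importance = packet_importance("person")
second_importance = packet_importance("vehicle")
other_importance = packet_importance("background")
```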

In one implementation, the video processing device may also set the importance of a video data packet according to the image object that the video data in the video data packet constitutes. In one implementation, the content characteristics of the video may include a first image object and a second image object, which may be determined by the video processing device after identifying a target video scene of the video, and the first image object and/or the second image object may change under different target video scenes. For example, when the target video scene is a video surveillance scene, the first image object and the second image object may be a person and a vehicle, respectively; when the target video scene is a video teaching scene, the first image object and the second image object may be the lecture material and a person, respectively. In one implementation, the importance of the first image object may be the same as or different from the importance of the second image object; this embodiment of the present application is described by taking the case in which the two are different as an example. In one implementation, the importance of the first image object and the importance of the second image object may be set by the video processing device, and the importance of the first image object and/or the importance of the second image object may change under different target video scenes. For example, when the target video scene is a video surveillance scene, the importance of the person is the highest; when the target video scene is a video teaching scene, the importance of the person is low.

In one implementation, the specific implementation manner of the video processing device setting the importance of the first video data packet and setting the importance of the second video data packet based on the content characteristics of the video may be that: the video processing device analyzes the video data in the first video data packet to obtain an image object composed of the video data in the first video data packet, and if the image object composed of the video data in the first video data packet is the first image object, the video processing device can set the importance of the first video data packet according to the importance of the first image object; the video processing device analyzes the video data in the second video data packet to obtain an image object composed of the video data in the second video data packet, and if the image object composed of the video data in the second video data packet is the second image object, the video processing device may set the importance of the second video data packet according to the importance of the second image object. In one implementation, the importance of the first video data packet may be higher than the importance of the second video data packet if the importance of the first image object is higher than the importance of the second image object, in other words, the importance of the video data packet is higher if the importance of the image object composed of the video data in the video data packet is higher. By the method, the video data packets with higher importance can be preferentially transmitted, so that the video recovered by the video receiving equipment comprises more image objects which are concerned by the user, and the watching experience of the user is improved.

In one implementation, the content feature of the user feedback may include a third category, and the third category may be determined by the video receiving device according to an input operation of the user, for example, during a video playing process, if the video receiving device detects a continuous pressing operation of the user, a category to which the continuously pressed image object belongs may be determined as the third category. In one implementation, the specific implementation manner of the video processing device setting the importance of the first video data packet and setting the importance of the second video data packet based on the content characteristics fed back by the user may be that: the video processing device analyzes the video data in the first video data packet to obtain the category to which the video data in the first video data packet belongs, and if the category to which the video data in the first video data packet belongs is the same as the third category, the video processing device can set the importance of the first video data packet according to the importance of the third category; the video processing device analyzes the video data in the second video data packet to obtain a category to which the video data in the second video data packet belongs, and if the category to which the video data in the second video data packet belongs is different from the third category, the video processing device may set the importance of the second video data packet according to the category to which the video data in the second video data packet belongs. 
The content characteristics fed back by the user can be used for indicating the type of the image object which is actually concerned by the user, the importance of the video data packet is set according to the importance of the type of the image object which is actually concerned by the user, and the video data packet with higher importance is preferentially transmitted, so that the video receiving equipment can recover the video which contains more image objects which are actually concerned by the user, and the watching experience of the user is favorably improved.

In one implementation, the video processing device may set the importance of the third category when it receives the third category. The video processing device may set an importance for the third category that is higher than the importance of the first category and the importance of the second category. In one implementation, the importance of the third category may instead be set by the video receiving device and sent to the video processing device. In one implementation, the video processing device may set a default importance for each category; specifically, the video processing device may obtain the default importance of the category to which the video data in the second video data packet belongs and use it as the importance of the second video data packet.

In one implementation, if the category to which the video data in the first video data packet belongs is the same as the third category and the category to which the video data in the second video data packet belongs is different from the third category, the importance of the first video data packet may be higher than the importance of the second video data packet. In other words, if the category to which the video data in the video data packet belongs is a category that the user actually pays more attention to, the importance of the video data packet is higher; if the category to which the video data in the video data packet belongs is different from the category that the user actually pays more attention to, the importance of the video data packet is low. By the method, the video data packets with higher importance can be preferentially transmitted, so that the video recovered by the video receiving equipment comprises more image objects which are actually concerned by the user, and the watching experience of the user is favorably improved.
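One possible way to give the category fed back by the user the highest importance is sketched below. Raising the third category to one above the current maximum is an assumption of this sketch; the description only requires that its importance be the highest. The function name and table values are hypothetical.

```python
def apply_user_feedback(table, third_category):
    """Return a copy of the importance table in which the category fed back
    by the user is strictly more important than every other category."""
    updated = dict(table)
    updated[third_category] = max(table.values(), default=0) + 1
    return updated

# Hypothetical default importance per category.
defaults = {"person": 3, "vehicle": 2}

# The user continuously presses on a vehicle during playback.
with_feedback = apply_user_feedback(defaults, "vehicle")
```

Packets whose category differs from the third category keep the default importance of their own category, as described above.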

In one implementation, the content feature of the user feedback may include a third image object, and the third image object may be determined by the video receiving device according to an input operation of the user. In one implementation, the specific implementation manner of the video processing device setting the importance of the first video data packet and setting the importance of the second video data packet based on the content characteristics fed back by the user may be that: the video processing device analyzes the video data in the first video data packet to obtain an image object composed of the video data in the first video data packet, and if the image object composed of the video data in the first video data packet is the same as the third image object, the video processing device can set the importance of the first video data packet according to the importance of the third image object; the video processing device analyzes the video data in the second video data packet to obtain an image object composed of the video data in the second video data packet, and if the image object composed of the video data in the second video data packet is different from the third image object, the video processing device may set the importance of the second video data packet according to the image object composed of the video data in the second video data packet. The content characteristics fed back by the user can be used for indicating the image objects which are actually concerned by the user, the importance of the video data packet is set according to the importance of the image objects which are actually concerned by the user, and then the video data packet with higher importance is preferentially transmitted, so that the video recovered by the video receiving equipment comprises more image objects which are actually concerned by the user, and the watching experience of the user is favorably improved.

In one implementation, the video processing device may set the importance of the third image object when it receives the third image object. The video processing device may set an importance for the third image object that is higher than the importance of the first image object and the importance of the second image object. In one implementation, the importance of the third image object may instead be set by the video receiving device and sent to the video processing device. In one implementation, the video processing device may set a default importance for each image object; specifically, the video processing device may obtain the default importance of the image object constituted by the video data in the second video data packet and use it as the importance of the second video data packet.

In one implementation, the video processing device may collectively set the importance of the first video data packet and collectively set the importance of the second video data packet based on the content characteristics of the video, the content characteristics fed back by the user, and the encoding policy of the video.

Step S505: and the video processing equipment sends the target code stream to the first wireless transmission equipment. It should be noted that, the execution process of step S505 can refer to the specific description of step S204 in fig. 2, which is not described herein again.

Step S506: The first wireless transmission device screens the target code stream based on the transmission condition of the wireless channel, the importance of the first video data packet, and the importance of the second video data packet to obtain a code stream to be transmitted, where the code stream to be transmitted includes the first video data packet and/or the second video data packet. Because the transmission bandwidth of the wireless channel is very limited and the target code stream contains a large amount of video data, packets will be lost during transmission when the transmission bandwidth of the wireless channel is smaller than the bandwidth required for transmitting the target code stream; if a video data packet with higher importance is lost, the quality of the video recovered by the video receiving device will be low. To avoid this problem, the first wireless transmission device may screen the target code stream to determine which video data packets in the target code stream are to be transmitted. The video data packets selected for transmission may include video data of image elements that the user is more concerned with, or may have a greater influence on the quality of the video recovered by the video receiving device. Compared with transmitting all video data packets in the target code stream, actively giving up transmission of some video data packets reduces or even avoids the loss of video data packets with higher importance during transmission, and thereby reduces as far as possible the influence of packet loss on the quality of the recovered video. 
It should be noted that the importance of each video data packet may be carried in the video data packet, and after receiving the video data packet, the first wireless transmission device may parse the data in the video data packet to obtain the importance of the video data packet. For example, the importance of each video data packet may be recorded in an extension field of a header of the video data packet.
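As an illustration of carrying the importance in a header field, the following sketch packs and parses a packet with a made-up header. The layout (16-bit sequence number, 8-bit importance, 16-bit payload length, network byte order) is entirely hypothetical; the description does not specify the header format or the position of the extension field.

```python
import struct

# Hypothetical header: sequence number (uint16), importance (uint8),
# payload length (uint16), all in network byte order without padding.
HEADER = struct.Struct("!HBH")

def pack_packet(seq, importance, payload):
    """Video processing device side: prepend the header to the payload."""
    return HEADER.pack(seq, importance, len(payload)) + payload

def read_importance(packet):
    """Wireless transmission device side: parse only the importance field."""
    _seq, importance, _length = HEADER.unpack_from(packet)
    return importance

packet = pack_packet(seq=7, importance=3, payload=b"abc")
parsed_importance = read_importance(packet)
```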

In one implementation, the screening method adopted by the first wireless transmission device may include, but is not limited to, the following two methods: first, the first wireless transmission device may determine whether to transmit the first video data packet based on a transmission condition of the wireless channel and an importance of the first video data packet, add the first video data packet to the code stream to be transmitted if it is determined to transmit the first video data packet, determine whether to transmit the second video data packet based on the transmission condition of the wireless channel and the importance of the second video data packet, and add the second video data packet to the code stream to be transmitted if it is determined to transmit the second video data packet. For example, if the importance of the first video data packet is higher than a preset importance threshold, the first wireless transmission device determines to transmit the first video data packet. The preset importance threshold may be determined based on the transmission condition of the wireless channel, for example, when the transmission condition of the wireless channel is better, more data may be transmitted through the wireless channel, and at this time, the first wireless transmission device may set the preset importance threshold smaller so as to transmit more video data packets through the wireless channel.
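The first screening method can be sketched as a per-packet threshold test. The channel states and the threshold assigned to each state are assumptions for illustration; the description only states that a better transmission condition may correspond to a smaller threshold.

```python
# Hypothetical mapping from channel state to importance threshold:
# the better the channel, the smaller the threshold.
THRESHOLD_BY_CHANNEL = {"good": 0, "fair": 1, "poor": 2}

def screen_by_threshold(packets, channel_state):
    """packets: list of (name, importance). Each packet is judged on its own:
    it is kept only if its importance is higher than the threshold."""
    threshold = THRESHOLD_BY_CHANNEL[channel_state]
    return [name for name, importance in packets if importance > threshold]

target_stream = [("p1", 3), ("p2", 1), ("p3", 2)]
to_send_poor = screen_by_threshold(target_stream, "poor")
to_send_good = screen_by_threshold(target_stream, "good")
```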

Second, the first wireless transmission device may determine whether to transmit the first video data packet based on the transmission condition of the wireless channel and the importance of all video data packets in the target code stream, and add the first video data packet to the code stream to be transmitted if it determines to transmit the first video data packet; it may likewise determine whether to transmit the second video data packet based on the transmission condition of the wireless channel and the importance of all video data packets in the target code stream, and add the second video data packet to the code stream to be transmitted if it determines to transmit the second video data packet. For example, when the target code stream includes M video data packets, the first wireless transmission device may determine the N video data packets with the highest importance in the target code stream and transmit those N video data packets through the wireless channel. The value N (M ≥ N) may be determined based on the transmission condition of the wireless channel. For example, when the transmission condition of the wireless channel is better, more data can be transmitted through the wireless channel, and the first wireless transmission device may set the value N larger so as to transmit more video data packets through the wireless channel.
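The second screening method, which ranks all M packets and keeps the N most important, can be sketched as follows. Deriving N from an available bit budget and a fixed packet size is an assumption of this sketch, as are the function names; the description only states that N depends on the transmission condition of the wireless channel.

```python
def packets_that_fit(available_bits, bits_per_packet):
    """Hypothetical rule for N: how many fixed-size packets the channel can carry."""
    return max(0, available_bits // bits_per_packet)

def screen_top_n(packets, n):
    """packets: list of (name, importance). Keep the n most important packets
    and return their names in the original stream order."""
    n = min(n, len(packets))
    ranked = sorted(range(len(packets)),
                    key=lambda i: packets[i][1], reverse=True)
    keep = sorted(ranked[:n])
    return [packets[i][0] for i in keep]

target_stream = [("p1", 3), ("p2", 1), ("p3", 2), ("p4", 3)]
n = packets_that_fit(available_bits=3000, bits_per_packet=1500)
to_send = screen_top_n(target_stream, n)
```

Unlike the threshold test, this method compares packets with each other, so the same packet may be kept or dropped depending on what else is in the target code stream.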

Step S507: the first wireless transmission equipment transmits the code stream to be transmitted through a wireless channel. Specifically, when the importance of the first video data packet is higher than that of the second video data packet, and the to-be-transmitted code stream includes the first video data packet and the second video data packet, the specific implementation manner of the first wireless transmission device transmitting the to-be-transmitted code stream through the wireless channel may be: if the transmission condition of the wireless channel meets the preset degradation condition, the first wireless transmission device can transmit the first video data packet through the wireless channel. The transmission condition of the wireless channel meets the preset degradation condition, which indicates that the channel quality of the wireless channel is poor and packet loss is easily generated in the wireless transmission process, and the first wireless transmission equipment can avoid the influence of the second video data packet on the transmission process of the first video data packet by actively discarding the second video data packet with smaller importance.

In one implementation, when the importance of the first video data packet is higher than that of the second video data packet and the code stream to be transmitted includes the first video data packet and the second video data packet, the first wireless transmission device may transmit the code stream to be transmitted through the wireless channel as follows: if the transmission condition of the wireless channel satisfies the preset degradation condition, the first wireless transmission device may transmit the first video data packet and the second video data packet sequentially in descending order of importance. Because the transmission condition of the wireless channel varies with time, transmitting the packets in descending order of importance better ensures reliable transmission of the first video data packet, which has the higher importance, and prevents it from being lost if the transmission condition of the wireless channel deteriorates further.
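The two alternatives described for step S507 under a degraded channel, discarding the less important packet or sending both in descending order of importance, can be sketched with a single helper. The function name and the boolean switch are hypothetical.

```python
def plan_transmission(packets, drop_low_importance):
    """packets: list of (name, importance). Return packet names in send order.

    drop_low_importance=True  -> send only the most important packet(s).
    drop_low_importance=False -> send everything, most important first.
    """
    ordered = sorted(packets, key=lambda p: p[1], reverse=True)
    if drop_low_importance and ordered:
        top = ordered[0][1]
        ordered = [p for p in ordered if p[1] == top]
    return [name for name, _ in ordered]

stream = [("second_packet", 1), ("first_packet", 3)]
drop_plan = plan_transmission(stream, drop_low_importance=True)
ordered_plan = plan_transmission(stream, drop_low_importance=False)
```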

In one implementation, when the importance of the first video data packet is higher than that of the second video data packet and the code stream to be transmitted includes the first video data packet and the second video data packet, the first wireless transmission device may transmit the code stream to be transmitted through the wireless channel as follows: the first wireless transmission device modulates the first video data packet onto a first constellation position, modulates the second video data packet onto a second constellation position, and transmits the modulated first video data packet and the modulated second video data packet. The first constellation position and the second constellation position may be different constellation positions in the same constellation diagram, and the reliability of the first constellation position may be higher than that of the second constellation position. The constellation diagram may be a distribution diagram of signal vector endpoints. Because the reliability of the first constellation position is higher than that of the second constellation position, the error rate of the first video data packet after transmission in the wireless channel is lower than the error rate of the second video data packet, which helps ensure reliable transmission of the first video data packet with higher importance.
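One well-known scheme with this property is hierarchical 16-QAM, in which two bits select the quadrant and two bits select the point inside the quadrant. The sketch below uses it purely as an assumed example; the description does not name a specific modulation scheme, and the spacing parameters d1 and d2 are hypothetical. Bits of the more important packet are mapped to the quadrant (decision distance d1 - d2), and bits of the less important packet to the position within the quadrant (decision distance d2).

```python
def modulate(hp_bits, lp_bits, d1=3.0, d2=1.0):
    """hp_bits, lp_bits: equal-length lists of (bit_i, bit_q) pairs.
    High-priority bits choose the sign; low-priority bits choose the magnitude."""
    symbols = []
    for (hi, hq), (li, lq) in zip(hp_bits, lp_bits):
        i = (1 - 2 * hi) * (d1 + (1 - 2 * li) * d2)
        q = (1 - 2 * hq) * (d1 + (1 - 2 * lq) * d2)
        symbols.append(complex(i, q))
    return symbols

def demodulate(symbols, d1=3.0):
    """Hard decisions: sign gives the high-priority bit, magnitude the low-priority bit."""
    hp, lp = [], []
    for s in symbols:
        hp.append((int(s.real < 0), int(s.imag < 0)))
        lp.append((int(abs(s.real) < d1), int(abs(s.imag) < d1)))
    return hp, lp

sent = modulate([(0, 1)], [(0, 0)])
clean_hp, clean_lp = demodulate(sent)

# A disturbance of 1.5 per axis pushes the symbol across the inner boundary only.
noisy = [sent[0] + complex(-1.5, 1.5)]
noisy_hp, noisy_lp = demodulate(noisy)
```

With d1 = 3 and d2 = 1 the quadrant decision tolerates a displacement of up to 2 per axis while the in-quadrant decision tolerates only 1, so the disturbance above corrupts the low-priority bits and leaves the high-priority bits intact.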

In one implementation manner, when the importance of the first video data packet is higher than that of the second video data packet, and the to-be-transmitted code stream includes the first video data packet and the second video data packet, a specific implementation manner in which the first wireless transmission device transmits the to-be-transmitted code stream through the wireless channel may be: the first wireless transmission device may transmit the modulated first video data packet with a higher transmit power than the second video data packet; alternatively, the first wireless transmission device may transmit the first video data packet using a more reliable channel; or, the first wireless transmission device may adopt a lower channel coding rate for the first video data packet, that is, the first wireless transmission device may add more redundancy in the first video data packet, so as to enhance the interference resistance of the first video data packet when transmitting in the wireless channel; alternatively, the first wireless transmission device may apply a lower modulation order to the first video data packet in order to reduce the error rate of the first video data packet; alternatively, the first wireless transmission device may add more forward error correction codes to the first video data packet in order to enhance the error correction capability of the first video data packet.
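The alternatives listed above (transmit power, channel coding rate, modulation order) can be viewed as selecting a more robust transmission profile for the more important packet. In the sketch below the profile values and names are hypothetical and serve only to show the direction of each adjustment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TxProfile:
    code_rate: float        # lower rate means more redundancy
    modulation_order: int   # lower order means fewer bits per symbol, fewer errors
    power_dbm: float        # higher transmit power

# Hypothetical profiles; real values depend on the radio technology in use.
ROBUST = TxProfile(code_rate=0.5, modulation_order=4, power_dbm=20.0)
EFFICIENT = TxProfile(code_rate=0.75, modulation_order=64, power_dbm=17.0)

def select_profile(importance, other_importance):
    """Give the robust profile to the packet with strictly higher importance."""
    return ROBUST if importance > other_importance else EFFICIENT

first_profile = select_profile(importance=3, other_importance=1)
second_profile = select_profile(importance=1, other_importance=3)
```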

Therefore, by implementing the embodiment of the application, the target code stream can be screened based on the transmission condition of the wireless channel, the importance of the first video data packet and the importance of the second video data packet, and the code stream to be transmitted obtained by screening is transmitted through the wireless channel.

The method of the embodiments of the present application is set forth above in detail and the apparatus of the embodiments of the present application is provided below.

Referring to fig. 6, fig. 6 is a schematic structural diagram of a video processing apparatus 60 according to an embodiment of the present disclosure, where the video processing apparatus 60 is configured to perform the steps performed by the video processing device in the method embodiments corresponding to fig. 2 to fig. 5, and the video processing apparatus 60 may include:

an obtaining module 601, configured to obtain original video data of a video to be transmitted;

the processing module 602 is configured to set a video encoding policy according to content characteristics of a video and historical feedback information, where the historical feedback information includes a historical transmission result and/or content characteristics fed back by a user, and the encoding policy includes an encoding mode and/or an encoding parameter;

the processing module 602 is further configured to perform encoding processing on the original video data based on an encoding policy to obtain a target code stream;

the sending module 603 is configured to send the target code stream to the wireless transmission device, so that the wireless transmission device transmits the target code stream through a wireless channel.

In one implementation, the historical transmission results may include transmission conditions of the wireless channel; the encoding mode may include a picture group structure, the picture group structure of the video may include a first picture group structure or a second picture group structure, both the first picture group structure and the second picture group structure are composed of video frames, the video frames may include first video frames, and the proportion of the first video frames in the first picture group structure may be higher than the proportion of the first video frames in the second picture group structure, where the proportion of the first video frames in the first picture group structure is the ratio of the number of first video frames in the first picture group structure to the total number of video frames in that structure, and the proportion of the first video frames in the second picture group structure is the ratio of the number of first video frames in the second picture group structure to the total number of video frames in that structure; when setting the video encoding policy according to the content characteristics of the video and the historical feedback information, the processing module 602 is specifically configured to set the picture group structure of the video to the first picture group structure if the transmission condition of the wireless channel meets a preset degradation condition.

In one implementation, the historical transmission results may include transmission conditions of the wireless channel; the encoding mode may include a picture group structure, the picture group structure of the video may include a first picture group structure or a second picture group structure, both the first picture group structure and the second picture group structure are composed of video frames, the video frames may include second video frames, and the proportion of the second video frames in the first picture group structure may be higher than the proportion of the second video frames in the second picture group structure, where the proportion of the second video frames in the first picture group structure is the ratio of the number of second video frames in the first picture group structure to the total number of video frames in that structure, and the proportion of the second video frames in the second picture group structure is the ratio of the number of second video frames in the second picture group structure to the total number of video frames in that structure; the amount of data constituting the second video frame may be larger than the amount of data constituting the first video frame; when setting the video encoding policy according to the content characteristics of the video and the historical feedback information, the processing module 602 is specifically configured to set the picture group structure of the video to the second picture group structure if the transmission condition of the wireless channel does not meet the preset degradation condition.
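The selection between the two picture group structures can be sketched as follows. This is a simplified illustration: the degradation condition is assumed to be a packet loss rate above a threshold, and the frame counts, the threshold, and all names are hypothetical. Only the proportion of first video frames is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GopStructure:
    name: str
    total_frames: int  # number of video frames in the picture group
    first_frames: int  # number of "first video frames" among them

    @property
    def first_frame_ratio(self):
        return self.first_frames / self.total_frames


# Illustrative counts: the first structure has the higher first-frame ratio.
FIRST_GOP = GopStructure("first", total_frames=30, first_frames=29)
SECOND_GOP = GopStructure("second", total_frames=15, first_frames=12)


def is_degraded(packet_loss_rate, threshold=0.05):
    """Assumed form of the preset degradation condition."""
    return packet_loss_rate > threshold


def select_gop(packet_loss_rate):
    """Pick the first structure on a degraded channel, else the second."""
    return FIRST_GOP if is_degraded(packet_loss_rate) else SECOND_GOP


chosen = select_gop(packet_loss_rate=0.2)
```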

In one implementation, the historical transmission results may include transmission conditions of the wireless channel, and the encoding parameters may include quantization parameters; when setting the video encoding policy according to the content characteristics of the video and the historical feedback information, the processing module 602 is specifically configured to: set the quantization parameters of the video according to the content characteristics fed back by the user if the transmission condition of the wireless channel meets a preset degradation condition; and/or set the quantization parameters of the video according to the content characteristics of the video and the content characteristics fed back by the user if the transmission condition of the wireless channel does not meet the preset degradation condition; and/or set the quantization parameters of the video according to the content characteristics fed back by the user if the content characteristics of the video are different from the content characteristics fed back by the user.
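A rough sketch of the first two branches is given below. It assumes that content characteristics are category labels attached to picture regions and that content of interest receives a lower quantization parameter (finer quantization); both are assumptions made for illustration, and `BASE_QP`, `FOCUS_QP_OFFSET`, and the function names are hypothetical.

```python
BASE_QP = 32          # hypothetical default quantization parameter
FOCUS_QP_OFFSET = -6  # hypothetical: finer quantization for content of interest


def features_for_qp(degraded, video_features, user_features):
    """Choose which content characteristics drive the quantization parameter."""
    if degraded:
        # Degraded channel: rely only on what the user fed back.
        return set(user_features)
    # Otherwise combine the analysed characteristics with the user feedback.
    return set(video_features) | set(user_features)


def region_qp(region_label, degraded, video_features, user_features):
    """Return the quantization parameter for one labelled picture region."""
    focus = features_for_qp(degraded, video_features, user_features)
    return BASE_QP + FOCUS_QP_OFFSET if region_label in focus else BASE_QP


qp = region_qp("face", degraded=True, video_features={"text"}, user_features={"face"})
```

The third branch, in which the user feedback takes precedence whenever it differs from the analysed characteristics, would replace the union with `set(user_features)` in that case.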

In one implementation, the processing module 602 may be further configured to analyze the original video data to obtain a target video scene of the video, and determine content characteristics of the video based on the target video scene.

In an implementation manner, the target code stream may include a first video data packet and a second video data packet, and the processing module 602 may further be configured to set the importance of the first video data packet and set the importance of the second video data packet based on the content characteristics of the video, the content characteristics fed back by the user, and/or the encoding policy of the video.

In one implementation, the content features of the video may include a first category and a second category, and the content features fed back by the user may include a third category; when setting the importance of the first video data packet and the importance of the second video data packet based on the content features of the video, the content features fed back by the user, and/or the video encoding policy, the processing module 602 may be specifically configured to: obtain a first number of times that a video frame corresponding to the first video data packet is referred to, and set the importance of the first video data packet according to the first number of times; obtain a second number of times that a video frame corresponding to the second video data packet is referred to, and set the importance of the second video data packet according to the second number of times; and/or if the category to which the video data in the first video data packet belongs is the first category, set the importance of the first video data packet according to the importance of the first category, and if the category to which the video data in the second video data packet belongs is the second category, set the importance of the second video data packet according to the importance of the second category; and/or if the category to which the video data in the first video data packet belongs is the same as the third category, set the importance of the first video data packet according to the importance of the third category, and if the category to which the video data in the second video data packet belongs is different from the third category, set the importance of the second video data packet according to the category to which the video data in the second video data packet belongs.
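The importance rules above can be sketched as two small helpers. The numeric importance scale, the assumption that importance grows with the reference count, and all names are hypothetical; the embodiment only states which inputs the importance depends on.

```python
def importance_from_references(ref_count):
    """Assumed monotonic rule: a frame referred to more often matters more."""
    return ref_count


def importance_from_category(category, category_importance,
                             user_category=None, user_importance=0):
    """Set importance from the packet's category and optional user feedback.

    If the packet's category matches the category fed back by the user, the
    importance of that fed-back category is used; otherwise the importance
    of the packet's own category applies.
    """
    if user_category is not None and category == user_category:
        return user_importance
    return category_importance.get(category, 0)


table = {"face": 1, "background": 0}  # illustrative per-category importance
first = importance_from_category("face", table, user_category="face", user_importance=5)
second = importance_from_category("background", table, user_category="face", user_importance=5)
```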

In one implementation, the obtaining module 601 may be further configured to obtain a historical video scene; the processing module 602 may also be configured to adjust the encoding strategy if the historical video scene is different from the target video scene.

In an implementation, when adjusting the encoding policy if the historical video scene is different from the target video scene, the processing module 602 is specifically configured to increase the number of second video frames in the picture group structure of the video.
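A minimal sketch of this adjustment follows, assuming scenes are compared by label and the count is raised by a fixed step; the function name and the `step` parameter are hypothetical.

```python
def adjust_second_frames(second_frames, historical_scene, target_scene, step=1):
    """Increase the number of second video frames when the scene has changed."""
    if historical_scene != target_scene:
        return second_frames + step
    return second_frames


updated = adjust_second_frames(1, historical_scene="indoor", target_scene="outdoor")
```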

In one implementation, the video processing apparatus 60 may further include a receiving module 604 for receiving a source code stream of a video; the processing module 602 may also be configured to perform decoding processing on the source code stream to obtain original video data.

It should be noted that details that are not mentioned in the embodiment corresponding to fig. 6 and specific implementation manners of the steps executed by each module may refer to the embodiments shown in fig. 2 to fig. 5 and the foregoing details, and are not described again here.

Referring to fig. 7, fig. 7 is a schematic structural diagram of a video processing apparatus 70 according to an embodiment of the present disclosure, where the video processing apparatus 70 is configured to perform the steps performed by the first wireless transmission device in the method embodiments corresponding to fig. 2 to fig. 5, and the video processing apparatus 70 may include:

the receiving module 701 is configured to receive a target code stream of a video to be transmitted that is sent by a video processing device, where the target code stream is obtained by the video processing device by encoding original video data of the video to be transmitted based on an encoding policy, the encoding policy is set by the video processing device according to content characteristics of the video to be transmitted and historical feedback information, the historical feedback information includes a historical transmission result and/or content characteristics fed back by a user, and the encoding policy includes an encoding mode and/or encoding parameters;

a sending module 702, configured to transmit the target code stream through a wireless channel.

In an implementation manner, the video processing apparatus 70 may further include an obtaining module 703, configured to obtain content characteristics fed back by the user, and count historical transmission results within a preset time period; the sending module 702 may be further configured to send the historical transmission result and the content characteristics fed back by the user to the video processing device.
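Counting historical transmission results within a preset time period can be sketched with a sliding window. The statistic chosen here (packet loss rate), the class name, and the time unit are assumptions made for illustration.

```python
from collections import deque


class TransmissionHistory:
    """Keep per-packet delivery results for the most recent time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, delivered) in arrival order

    def record(self, timestamp, delivered):
        self._events.append((timestamp, delivered))

    def loss_rate(self, now):
        # Drop results that are older than the preset time period.
        while self._events and self._events[0][0] < now - self.window_seconds:
            self._events.popleft()
        if not self._events:
            return 0.0
        lost = sum(1 for _, ok in self._events if not ok)
        return lost / len(self._events)


history = TransmissionHistory(window_seconds=10)
history.record(0, False)
history.record(5, True)
history.record(8, False)
rate = history.loss_rate(now=12)  # only the results at t=5 and t=8 remain
```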

In one implementation, the target code stream may include a first video data packet and a second video data packet; when transmitting the target code stream through the wireless channel, the sending module 702 is specifically configured to: screen the target code stream based on the transmission condition of the wireless channel, the importance of the first video data packet, and the importance of the second video data packet to obtain a code stream to be transmitted, where the code stream to be transmitted includes the first video data packet and/or the second video data packet; and transmit the code stream to be transmitted through the wireless channel.

In an implementation manner, the importance of the first video data packet may be higher than that of the second video data packet, and when transmitting the code stream to be transmitted through the wireless channel, the sending module 702 may be specifically configured to: transmit the first video data packet through the wireless channel if the transmission condition of the wireless channel meets a preset degradation condition; and/or if the transmission condition of the wireless channel meets the preset degradation condition, sequentially transmit the first video data packet and the second video data packet in descending order of importance; and/or modulate the first video data packet to a first constellation position, modulate the second video data packet to a second constellation position, and transmit the modulated first video data packet and the modulated second video data packet, where the reliability of the first constellation position may be higher than that of the second constellation position.
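The screening and ordering behaviour can be sketched as follows. Packets are modelled as `(name, importance)` tuples; the `min_importance` cut-off and the choice to sort in both channel states are assumptions of this sketch rather than requirements of the embodiment.

```python
def screen_packets(packets, degraded, min_importance):
    """Select and order the packets of the code stream to be transmitted.

    packets: list of (name, importance) tuples.
    On a degraded channel only packets at or above min_importance are kept;
    the kept packets are then ordered from most to least important.
    """
    if degraded:
        kept = [p for p in packets if p[1] >= min_importance]
    else:
        kept = list(packets)
    return sorted(kept, key=lambda p: p[1], reverse=True)


stream = [("second", 1), ("first", 2)]
to_send = screen_packets(stream, degraded=True, min_importance=2)
```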

It should be noted that details that are not mentioned in the embodiment corresponding to fig. 7 and specific implementation manners of the steps executed by each module may refer to the embodiments shown in fig. 2 to fig. 5 and the foregoing details, and are not described again here.

In one implementation, the relevant functions implemented by the various modules in fig. 6 may be implemented in conjunction with a processor and a transceiver. Referring to fig. 8, fig. 8 is a schematic structural diagram of a video processing apparatus provided in an embodiment of the present application, where the video processing apparatus 80 includes: a transceiver 801, a processor 802 and a memory 803, the transceiver 801, the processor 802 and the memory 803 being connected by one or more communication buses, or by other means.

The transceiver 801 may be used to transmit data and/or signaling as well as receive data and/or signaling. In the present embodiment, the transceiver 801 may be used to transmit the target code stream to a wireless transmission device or receive the source code stream of the video.

The processor 802 is configured to perform the respective functions of the video processing device in the methods described in fig. 2-5. The processor 802 may include one or more processors, for example, the processor 802 may be one or more Central Processing Units (CPUs), Network Processors (NPs), hardware chips, or any combination thereof. In the case where the processor 802 is a CPU, the CPU may be a single-core CPU or a multi-core CPU.

The memory 803 is used to store program code and the like. The memory 803 may include a volatile memory, such as a random access memory (RAM); the memory 803 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 803 may also include a combination of memories of the kinds described above.

The processor 802 may call program code stored in the memory 803 to perform the following operations:

acquiring original video data of a video to be transmitted;

setting a video coding strategy according to the content characteristics of the video and historical feedback information, wherein the historical feedback information comprises historical transmission results and/or content characteristics fed back by a user, and the coding strategy comprises a coding mode and/or coding parameters;

encoding the original video data based on the encoding strategy to obtain a target code stream, and calling the transceiver 801 to send the target code stream to the wireless transmission device, so that the wireless transmission device transmits the target code stream through a wireless channel.

In one implementation, the historical transmission results may include transmission conditions of the wireless channel; the encoding mode may include a picture group structure, the picture group structure of the video may include a first picture group structure or a second picture group structure, both the first picture group structure and the second picture group structure are composed of video frames, the video frames may include first video frames, and the proportion of the first video frames in the first picture group structure may be higher than the proportion of the first video frames in the second picture group structure, where the proportion of the first video frames in the first picture group structure is the ratio of the number of first video frames in the first picture group structure to the total number of video frames in that structure, and the proportion of the first video frames in the second picture group structure is the ratio of the number of first video frames in the second picture group structure to the total number of video frames in that structure; when the processor 802 sets the video encoding policy according to the content characteristics of the video and the historical feedback information, the following operation is specifically executed: if the transmission condition of the wireless channel meets the preset degradation condition, setting the picture group structure of the video to the first picture group structure.

In one implementation, the historical transmission results may include transmission conditions of the wireless channel; the encoding mode may include a picture group structure, the picture group structure of the video may include a first picture group structure or a second picture group structure, both the first picture group structure and the second picture group structure are composed of video frames, the video frames may include second video frames, and the proportion of the second video frames in the first picture group structure may be higher than the proportion of the second video frames in the second picture group structure, where the proportion of the second video frames in the first picture group structure is the ratio of the number of second video frames in the first picture group structure to the total number of video frames in that structure, and the proportion of the second video frames in the second picture group structure is the ratio of the number of second video frames in the second picture group structure to the total number of video frames in that structure; the amount of data constituting the second video frame may be larger than the amount of data constituting the first video frame; when the processor 802 sets the video encoding policy according to the content characteristics of the video and the historical feedback information, the following operation is specifically executed: if the transmission condition of the wireless channel does not meet the preset degradation condition, setting the picture group structure of the video to the second picture group structure.

In one implementation, the historical transmission results may include transmission conditions of the wireless channel, and the encoding parameters may include quantization parameters; when the processor 802 sets the video encoding policy according to the content characteristics of the video and the historical feedback information, the following operations are specifically executed: if the transmission condition of the wireless channel meets the preset degradation condition, setting the quantization parameters of the video according to the content characteristics fed back by the user; and/or if the transmission condition of the wireless channel does not meet the preset degradation condition, setting the quantization parameters of the video according to the content characteristics of the video and the content characteristics fed back by the user; and/or if the content characteristics of the video are different from the content characteristics fed back by the user, setting the quantization parameters of the video according to the content characteristics fed back by the user.

In one implementation, the processor 802 may also perform the following operations: analyzing the original video data to obtain a target video scene of the video, and determining the content characteristics of the video based on the target video scene.

In one implementation, the target code stream may include a first video data packet and a second video data packet, and the processor 802 may further perform the following operations: setting the importance of the first video data packet and the importance of the second video data packet based on the content characteristics of the video, the content characteristics fed back by the user, and/or the encoding policy of the video.

In one implementation, the content features of the video may include a first category and a second category, and the content features fed back by the user may include a third category; when the processor 802 sets the importance of the first video data packet and the importance of the second video data packet based on the content features of the video, the content features fed back by the user, and/or the video encoding policy, the following operations may be specifically performed: acquiring a first number of times that a video frame corresponding to the first video data packet is referred to, and setting the importance of the first video data packet according to the first number of times; acquiring a second number of times that a video frame corresponding to the second video data packet is referred to, and setting the importance of the second video data packet according to the second number of times; and/or if the category to which the video data in the first video data packet belongs is the first category, setting the importance of the first video data packet according to the importance of the first category, and if the category to which the video data in the second video data packet belongs is the second category, setting the importance of the second video data packet according to the importance of the second category; and/or if the category to which the video data in the first video data packet belongs is the same as the third category, setting the importance of the first video data packet according to the importance of the third category, and if the category to which the video data in the second video data packet belongs is different from the third category, setting the importance of the second video data packet according to the category to which the video data in the second video data packet belongs.

In one implementation, the processor 802 may also perform the following operations: acquiring a historical video scene, and adjusting the encoding strategy if the historical video scene is different from the target video scene.

In one implementation, when the processor 802 adjusts the encoding policy because the historical video scene is different from the target video scene, the following operation may be specifically performed: increasing the number of second video frames in the picture group structure of the video.

In one implementation, the processor 802 may also perform the following operations: calling the transceiver 801 to receive a source code stream of a video, and decoding the source code stream to obtain original video data.

Further, the processor 802 may also perform operations corresponding to the video processing devices in the embodiments shown in fig. 2 to fig. 5, which may specifically refer to the description in the method embodiments and will not be described herein again.

In one implementation, the relevant functions implemented by the various modules in fig. 7 may be implemented in conjunction with a processor and a transceiver. Referring to fig. 9, fig. 9 is a schematic structural diagram of a wireless transmission device according to an embodiment of the present application, where the wireless transmission device 90 includes: a transceiver 901, a processor 902 and a memory 903, the transceiver 901, the processor 902 and the memory 903 being connected by one or more communication buses, or may be connected by other means.

The transceiver 901 may be used to transmit data and/or signaling as well as receive data and/or signaling. In this embodiment, the transceiver 901 may be used to receive a target code stream of a video to be transmitted, which is sent by a video processing device.

The processor 902 is configured to perform the respective functions of the first wireless transmission device in the methods described in fig. 2-5. The processor 902 may include one or more processors, for example, the processor 902 may be one or more Central Processing Units (CPUs), Network Processors (NPs), hardware chips, or any combination thereof. In the case where the processor 902 is a single CPU, the CPU may be a single-core CPU or a multi-core CPU.

The memory 903 is used to store program code and the like. The memory 903 may include a volatile memory, such as a random access memory (RAM); the memory 903 may also include a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 903 may also include a combination of memories of the kinds described above.

The processor 902 may call program code stored in the memory 903 to perform the following operations:

calling a transceiver 901 to receive a target code stream of a video to be transmitted, which is sent by a video processing device, wherein the target code stream is obtained by the video processing device by encoding original video data of the video to be transmitted based on an encoding strategy, the encoding strategy is set by the video processing device according to content characteristics of the video to be transmitted and historical feedback information, the historical feedback information comprises a historical transmission result and/or content characteristics fed back by a user, and the encoding strategy comprises an encoding mode and/or encoding parameters;

the transceiver 901 is invoked to transmit the target code stream through a wireless channel.

In one implementation, the processor 902 may also perform the following operations: acquiring content characteristics fed back by a user, and counting historical transmission results within a preset time period; and invoking the transceiver 901 to transmit the historical transmission results and the content characteristics fed back by the user to the video processing device.

In one implementation, the target code stream may include a first video data packet and a second video data packet; when the processor 902 transmits the target code stream through the wireless channel, the following operations may be specifically executed: screening the target code stream based on the transmission condition of the wireless channel, the importance of the first video data packet, and the importance of the second video data packet to obtain a code stream to be transmitted, where the code stream to be transmitted includes the first video data packet and/or the second video data packet; and transmitting the code stream to be transmitted through the wireless channel.

In an implementation manner, the importance of the first video data packet may be higher than that of the second video data packet, and when the processor 902 transmits the code stream to be transmitted through the wireless channel, the following operations may be specifically executed: if the transmission condition of the wireless channel meets the preset degradation condition, transmitting the first video data packet through the wireless channel; and/or if the transmission condition of the wireless channel meets the preset degradation condition, sequentially transmitting the first video data packet and the second video data packet in descending order of importance; and/or modulating the first video data packet to a first constellation position, modulating the second video data packet to a second constellation position, and transmitting the modulated first video data packet and the modulated second video data packet, where the reliability of the first constellation position may be higher than that of the second constellation position.

Further, the processor 902 may also perform operations corresponding to the first wireless transmission device in the embodiments shown in fig. 2 to fig. 5, which may specifically refer to the description in the method embodiments and will not be described herein again.

The embodiment of the present application further provides a video processing system, where the video processing system includes the aforementioned video processing apparatus shown in fig. 6 and the aforementioned video processing apparatus shown in fig. 7, or the video processing system includes the aforementioned video processing device shown in fig. 8 and the aforementioned wireless transmission device shown in fig. 9.

Embodiments of the present application further provide a computer-readable storage medium, which can be used to store computer software instructions for the video processing apparatus in the embodiment shown in fig. 6, and which contains a program designed to be executed by the video processing device in the above-mentioned embodiments.

An embodiment of the present application further provides a computer-readable storage medium, which can be used to store computer software instructions for the video processing apparatus in the embodiment shown in fig. 7, and which contains a program designed to be executed by the first wireless transmission device in the above-mentioned embodiment.

The computer readable storage medium includes, but is not limited to, flash memory, hard disk, solid state disk.

The embodiments of the present application further provide a computer program product, which, when executed by a computing device, can execute the video processing method designed for the video processing device in the foregoing embodiments of fig. 2 to 5.

An embodiment of the present application further provides a computer program product, which when executed by a computing device, can execute the video processing method designed for the first wireless transmission device in the foregoing embodiments of fig. 2 to 5.

The embodiment of the application further provides a processor, which comprises at least one circuit used for setting a video coding strategy according to the content characteristics of the video and the historical feedback information, and at least one circuit used for coding the original video data based on the coding strategy to obtain a target code stream. The processor may be a chip, and may execute instructions or programs designed for implementing the video processing device in the above embodiments.

The embodiment of the application also provides a processor, which comprises at least one circuit used for receiving the target code stream of the video to be transmitted, which is sent by the video processing equipment, and at least one circuit used for transmitting the target code stream through a wireless channel. The processor may be a chip, and may execute instructions or programs designed for implementing the first wireless transmission device in the above embodiments.

An embodiment of the present application further provides a chip system, where the chip system includes a processor, and the processor is configured to implement the video processing method designed for the video processing device in the embodiments of fig. 2 to fig. 5. In one possible implementation, the system-on-chip further includes a memory for storing program instructions and data necessary to implement the functions of the video processing device. The chip system may be formed by a chip, or may include a chip and other discrete devices.

An embodiment of the present application further provides a chip system, where the chip system includes a processor, and the processor is configured to implement the video processing method designed for the first wireless transmission device in the embodiments of fig. 2 to fig. 5. In one possible implementation, the chip system further includes a memory for storing program instructions and data necessary to implement the functions of the first wireless transmission device. The chip system may be formed by a chip, or may include a chip and other discrete devices.

An embodiment of the present application further provides a chip including a processor and a memory, where the memory is used to store a computer program, the processor is used to call and run the computer program from the memory, and the computer program is used to implement the method in the above method embodiments.

Those of ordinary skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in, or transmitted over, a computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or a wireless manner (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.

The above description covers only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.

Claims

1. A video processing method applied to a video processing apparatus, the method comprising:

acquiring original video data of a video to be transmitted;

setting a coding strategy of the video according to the content characteristics of the video and historical feedback information, wherein the historical feedback information comprises historical transmission results and/or content characteristics fed back by a user, and the coding strategy comprises a coding mode and/or coding parameters;

and coding the original video data based on the coding strategy to obtain a target code stream, and sending the target code stream to wireless transmission equipment so that the wireless transmission equipment transmits the target code stream through a wireless channel.

2. The method of claim 1, wherein the historical transmission results comprise transmission conditions of the wireless channel; the coding mode comprises a picture group structure, the picture group structure of the video comprises a first picture group structure or a second picture group structure, the first picture group structure and the second picture group structure are both composed of video frames, the video frames comprise first video frames, the occupation ratio of the first video frames in the first picture group structure is higher than that of the first video frames in the second picture group structure, wherein the occupation ratio of the first video frames in the first picture group structure is the ratio between the number of the first video frames in the first picture group structure and the number of the video frames, and the occupation ratio of the first video frames in the second picture group structure is the ratio between the number of the first video frames in the second picture group structure and the number of the video frames;

the setting of the video coding strategy according to the content characteristics of the video and the historical feedback information comprises the following steps:

and if the transmission condition of the wireless channel meets a preset degradation condition, setting the picture group structure of the video as the first picture group structure.

3. The method of claim 1, wherein the historical transmission results comprise transmission conditions of the wireless channel; the coding mode comprises a picture group structure, the picture group structure of the video comprises a first picture group structure or a second picture group structure, the first picture group structure and the second picture group structure are both composed of video frames, the video frames comprise second video frames, the occupation ratio of the second video frames in the first picture group structure is higher than that of the second video frames in the second picture group structure, wherein the occupation ratio of the second video frames in the first picture group structure is the ratio between the number of the second video frames in the first picture group structure and the number of the video frames, and the occupation ratio of the second video frames in the second picture group structure is the ratio between the number of the second video frames in the second picture group structure and the number of the video frames; the amount of data constituting the second video frame is larger than the amount of data constituting the first video frame;

and if the transmission condition of the wireless channel does not meet the preset degradation condition, setting the picture group structure of the video as the second picture group structure.
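
By way of non-limiting illustration, the selection recited in claims 2 and 3 might be sketched as follows. The `ChannelReport` type, the use of packet loss rate as the transmission condition, and the `LOSS_RATE_THRESHOLD` value are assumptions made for this sketch only; the claims require only that some preset degradation condition be tested.

```python
from dataclasses import dataclass
from enum import Enum


class GopStructure(Enum):
    FIRST = "first"    # structure selected when the channel is degraded
    SECOND = "second"  # structure selected otherwise


@dataclass
class ChannelReport:
    """Hypothetical summary of the historical transmission results."""
    packet_loss_rate: float  # fraction of packets lost, 0.0 to 1.0


# Assumed threshold standing in for the "preset degradation condition".
LOSS_RATE_THRESHOLD = 0.05


def is_degraded(report: ChannelReport) -> bool:
    """Return True when the assumed degradation condition is met."""
    return report.packet_loss_rate > LOSS_RATE_THRESHOLD


def select_gop_structure(report: ChannelReport) -> GopStructure:
    """Pick the first structure on a degraded channel, else the second."""
    return GopStructure.FIRST if is_degraded(report) else GopStructure.SECOND
```

In this sketch, a report with a loss rate above the assumed threshold selects the first picture group structure, and any other report selects the second.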

4. The method of claim 1, wherein the historical transmission results comprise transmission conditions of the wireless channel, and wherein the encoding parameters comprise quantization parameters;

the setting of the video coding strategy according to the content characteristics of the video and the historical feedback information comprises one or more of the following steps:

if the transmission condition of the wireless channel meets a preset degradation condition, setting a quantization parameter of the video according to the content characteristics fed back by the user;

if the transmission condition of the wireless channel does not meet the preset degradation condition, setting the quantization parameter of the video according to the content characteristic of the video and the content characteristic fed back by the user;

and if the content characteristics of the video are different from the content characteristics fed back by the user, setting the quantization parameters of the video according to the content characteristics fed back by the user.
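
As a hedged illustration of the first two alternatives in claim 4, the quantization parameter could be chosen as sketched below. The characteristic labels, the `QP_BY_CHARACTERISTIC` table and its values, and the rule of taking the smaller quantization parameter when both characteristics are used are all assumptions of this sketch and are not taken from the claims.

```python
# Hypothetical mapping from a content-characteristic label to a quantization
# parameter (a lower QP means finer quantization). Values are illustrative.
QP_BY_CHARACTERISTIC = {"face": 24, "text": 26, "landscape": 32}
DEFAULT_QP = 30


def qp_for(characteristic: str) -> int:
    """Look up the assumed QP for a label, falling back to a default."""
    return QP_BY_CHARACTERISTIC.get(characteristic, DEFAULT_QP)


def select_qp(degraded: bool, video_characteristic: str,
              user_characteristic: str) -> int:
    if degraded:
        # Degraded channel: follow the user-fed-back characteristic only.
        return qp_for(user_characteristic)
    # Non-degraded channel: use both characteristics. Taking the smaller QP
    # (finer quantization) is an assumed combination rule.
    return min(qp_for(video_characteristic), qp_for(user_characteristic))
```

Other combination rules, such as a weighted average of the two values, would equally satisfy the wording of the claim.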

5. The method of claim 3, wherein before the setting the encoding policy of the video according to the content characteristics of the video and the historical feedback information, the method further comprises:

analyzing the original video data to obtain a target video scene of the video;

based on the target video scene, determining content characteristics of the video.

6. The method according to any one of claims 1 to 5, wherein the target code stream comprises a first video data packet and a second video data packet, the method further comprising:

setting the importance of the first video data packet and setting the importance of the second video data packet based on the content characteristics of the video, the content characteristics fed back by the user and/or the encoding strategy of the video.

7. The method of claim 6, wherein the content characteristics of the video comprise a first category and a second category, and wherein the content characteristics of the user feedback comprise a third category;

the setting the importance of the first video data packet and the importance of the second video data packet based on the content characteristics of the video, the content characteristics of the user feedback and/or the encoding policy of the video comprises one or more of the following steps:

acquiring a first number of times that a video frame corresponding to the first video data packet is referenced, and setting the importance of the first video data packet according to the first number of times; acquiring a second number of times that a video frame corresponding to the second video data packet is referenced, and setting the importance of the second video data packet according to the second number of times;

if the category to which the video data in the first video data packet belongs is the first category, setting the importance of the first video data packet according to the importance of the first category, and if the category to which the video data in the second video data packet belongs is the second category, setting the importance of the second video data packet according to the importance of the second category;

if the category to which the video data in the first video data packet belongs is the same as the third category, setting the importance of the first video data packet according to the importance of the third category, and if the category to which the video data in the second video data packet belongs is different from the third category, setting the importance of the second video data packet according to the category to which the video data in the second video data packet belongs.

8. The method of claim 7,

if the first number of times is higher than the second number of times, the importance of the first video data packet is higher than that of the second video data packet; or,

if the importance of the first category is higher than the importance of the second category, the importance of the first video data packet is higher than the importance of the second video data packet; or,

if the category to which the video data in the first video data packet belongs is the same as the third category and the category to which the video data in the second video data packet belongs is different from the third category, the importance of the first video data packet is higher than that of the second video data packet.
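
The three alternative importance rules of claims 7 and 8 might be sketched as follows, purely as an example. The `VideoPacket` fields, the category names, and the numeric weights are hypothetical; the only property the sketch relies on is that the weight assumed for the user-fed-back category exceeds every per-category weight.

```python
from dataclasses import dataclass


@dataclass
class VideoPacket:
    """Hypothetical packet descriptor; field names are not from the claims."""
    name: str
    reference_count: int  # times the packet's video frame is referenced
    category: str         # content category of the packet's video data


# Assumed per-category importance weights (illustrative values only).
CATEGORY_IMPORTANCE = {"foreground": 2, "background": 1}
# Assumed weight of the user-fed-back (third) category; chosen to exceed
# every value in CATEGORY_IMPORTANCE.
USER_FEEDBACK_IMPORTANCE = 10


def importance_by_references(packet: VideoPacket) -> int:
    # Frames referenced more often are treated as more important.
    return packet.reference_count


def importance_by_category(packet: VideoPacket) -> int:
    # Unknown categories get the lowest importance in this sketch.
    return CATEGORY_IMPORTANCE.get(packet.category, 0)


def importance_by_user_feedback(packet: VideoPacket, user_category: str) -> int:
    # A packet matching the user-fed-back category takes that category's
    # importance; any other packet keeps its own category's importance.
    if packet.category == user_category:
        return USER_FEEDBACK_IMPORTANCE
    return importance_by_category(packet)
```

With these assumed weights, a packet whose category matches the user feedback always ranks above one whose category does not, which corresponds to the third alternative of claim 8.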

9. The method of claim 5, further comprising:

acquiring a historical video scene;

and if the historical video scene is different from the target video scene, adjusting the encoding strategy.

10. The method of claim 9, wherein adjusting the encoding strategy if the historical video scene is different from the target video scene comprises:

and if the historical video scene is different from the target video scene, increasing the number of the second video frames in the picture group structure of the video.

11. The method according to any one of claims 1 to 5 and 7 to 10, wherein before obtaining original video data of a video to be transmitted, the method further comprises:

receiving a source code stream of the video;

and decoding the source code stream to obtain the original video data.

12. A video processing method applied to a wireless transmission device, the method comprising:

receiving a target code stream of a video to be transmitted, which is sent by video processing equipment, wherein the target code stream is obtained by the video processing equipment through coding processing on original video data of the video to be transmitted based on a coding strategy, the coding strategy is set by the video processing equipment according to content characteristics of the video to be transmitted and historical feedback information, the historical feedback information comprises a historical transmission result and/or content characteristics fed back by a user, and the coding strategy comprises a coding mode and/or coding parameters;

and transmitting the target code stream through a wireless channel.

13. The method according to claim 12, wherein before receiving the target code stream of the video to be transmitted sent by the video processing device, the method further comprises:

acquiring content characteristics fed back by the user, and counting historical transmission results in a preset time period;

and sending the historical transmission result and the content characteristics fed back by the user to the video processing equipment.

14. The method of claim 12 or 13, wherein the target code stream comprises a first video data packet and a second video data packet;

transmitting the target code stream through a wireless channel, including:

screening the target code stream to obtain a code stream to be transmitted based on the transmission condition of the wireless channel, the importance of the first video data packet and the importance of the second video data packet, wherein the code stream to be transmitted comprises the first video data packet and/or the second video data packet;

and transmitting the code stream to be transmitted through the wireless channel.

15. The method of claim 14, wherein the importance of the first video data packet is higher than the importance of the second video data packet, and wherein transmitting the code stream to be transmitted through the wireless channel comprises one or more of:

if the transmission condition of the wireless channel meets a preset degradation condition, transmitting the first video data packet through the wireless channel;

if the transmission condition of the wireless channel meets the preset degradation condition, sequentially transmitting the first video data packet and the second video data packet in descending order of importance;

and modulating the first video data packet onto a first constellation point, modulating the second video data packet onto a second constellation point, and transmitting the modulated first video data packet and the modulated second video data packet, wherein the reliability of the first constellation point is higher than that of the second constellation point.
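
For illustration only, the screening of claim 14 and the first two alternatives of claim 15 could be sketched as below. The `Packet` tuple layout, the function name, and the `drop_when_degraded` switch are hypothetical, and sending every packet in descending order of importance on a non-degraded channel is an assumption of this sketch rather than something the claims state.

```python
from typing import List, Tuple

# (hypothetical packet name, importance); a larger value is more important.
Packet = Tuple[str, int]


def packets_to_transmit(packets: List[Packet], degraded: bool,
                        drop_when_degraded: bool = True) -> List[Packet]:
    """Return the packets to send, most important first."""
    ordered = sorted(packets, key=lambda p: p[1], reverse=True)
    if degraded and drop_when_degraded:
        # First alternative: on a degraded channel, send only the most
        # important packet.
        return ordered[:1]
    # Second alternative (or assumed non-degraded behaviour): send all
    # packets in descending order of importance.
    return ordered
```

Calling `packets_to_transmit(packets, degraded=True)` keeps only the more important packet, while passing `drop_when_degraded=False` keeps both packets and orders them by importance.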

16. A video processing apparatus, characterized in that the apparatus comprises means for performing the method according to any of claims 1-11.

17. A video processing apparatus, characterized in that the apparatus comprises means for performing the method according to any of claims 12-15.

18. A video processing system comprising the video processing apparatus of claim 16 and the video processing apparatus of claim 17.

19. A video processing apparatus comprising a memory having stored therein program instructions and a processor coupled to the memory via a bus, the processor executing the program instructions stored in the memory to cause the video processing apparatus to perform the method of any of claims 1-11.

20. A wireless transmission device comprising a memory having stored therein program instructions and a processor coupled to the memory via a bus, the processor executing the program instructions stored in the memory to cause the wireless transmission device to perform the method of any of claims 12-15.

21. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to carry out the method according to any one of claims 1 to 11.

22. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 12 to 15.