
WO2020211422A1 - Video processing method and apparatus, and device - Google Patents


Info

Publication number
WO2020211422A1
WO2020211422A1, PCT/CN2019/126757, CN2019126757W
Authority
WO
WIPO (PCT)
Prior art keywords
images
video
image
frames
posture
Prior art date
Application number
PCT/CN2019/126757
Other languages
French (fr)
Chinese (zh)
Inventor
卢艺帆 (Lu Yifan)
Original Assignee
北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京字节跳动网络技术有限公司 (Beijing ByteDance Network Technology Co., Ltd.)
Publication of WO2020211422A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects

Definitions

  • the embodiments of the present disclosure relate to the field of computer technology, and in particular to a video processing method, device, and equipment.
  • special effects can be added to the video.
  • special effects can include adding light flicker, adding preset sounds, and so on in the video.
  • when adding special effects to a video, a person usually watches the video to locate preset actions.
  • a special effect is associated with the playback moment corresponding to the preset action; during playback, when that moment is reached, the corresponding special effect is displayed in the video. For example, if a clapping action is manually observed at the 10th second of the video, a special effect is associated with the 10th second, and when the video plays to the 10th second, an applause-related special effect is displayed.
  • because special effects are added according to the playback time of the video, there may be a deviation between the moment the preset action appears in the video and the moment the corresponding special effect is displayed, resulting in poor accuracy of the special effects added to the video.
  • the embodiments of the present disclosure provide a video processing method, device, and equipment, which improve the accuracy of special effects added in a video.
  • embodiments of the present disclosure provide a video processing method, including:
  • adding special effects to the video according to the posture distribution of the first object and the N frames of images includes:
  • determining the posture distribution of the first object according to the posture type of the first object in each frame of image includes:
  • the N frames of images are grouped to obtain at least two groups of images, each group of images including consecutive M frames of images, where M is an integer greater than 1;
  • the posture distribution of the first object is determined according to the posture type corresponding to each group of images.
  • determining the posture type of the first object in the first image includes:
  • the object area is processed to determine the posture type of the first object in the first image.
  • detecting the object area in the first image includes:
  • the data representing the first image is input into a first recognition model to obtain the object area; the first recognition model is obtained by learning multiple groups of first samples, each group of first samples including a sample image and a sample object area in the sample image, and the sample image includes an image corresponding to the first object.
  • processing the object area to determine the posture type of the first object in the first image includes:
  • the video is a video being shot; acquiring consecutive N frames of images in the video includes:
  • acquiring N frames of to-be-processed images in the video, where the N frames of to-be-processed images are the last N frames of images that have been captured in the video;
  • determining whether each of the N frames of to-be-processed images includes the first object, and if so, determining the N frames of to-be-processed images as the N frames of images.
  • adding the target special effect to the video according to the N frames of images includes:
  • the video is a video that has been filmed; the obtaining of consecutive N frames of images in the video includes:
  • the to-be-processed image selection operation includes: acquiring, starting from a preset image of the video, consecutive N frames of to-be-processed images in the video;
  • the N-frame image determination operation includes: determining whether each of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images as the N frames of images, and if not, updating the preset image to a frame of image after the preset image in the video;
  • adding the target special effect to the video according to the N frames of images includes:
  • the special effect is added to at least one of the N frames of images.
  • the acquiring consecutive N frames of images in the video includes:
  • the N frames of images are determined in the video.
  • before acquiring the consecutive N frames of images in the video, the method further includes:
  • an embodiment of the present disclosure provides a video processing device, including an acquisition module, a first determination module, a second determination module, and an addition module, wherein:
  • the acquiring module is configured to acquire consecutive N frames of images in a video, each frame of the image includes a first object, and the N is an integer greater than 1;
  • the first determining module is configured to determine the posture type of the first object in each frame of image;
  • the second determining module is configured to determine the posture distribution of the first object according to the posture type of the first object in each frame of image, and the posture distribution is used to indicate the law of change of the posture of the first object;
  • the adding module is configured to add special effects to the video according to the posture distribution of the first object and the N frames of images.
  • the adding module is specifically used for:
  • the second determining module is specifically configured to:
  • the N frames of images are grouped to obtain at least two groups of images, each group of images including consecutive M frames of images, where M is an integer greater than 1;
  • the first determining module is specifically configured to:
  • the object area is processed to obtain the posture type of the first object in the first image.
  • the first determining module is specifically configured to:
  • the data representing the first image is input into a first recognition model to obtain the object area; the first recognition model is obtained by learning multiple groups of first samples, each group of first samples including a sample image and a sample object area in the sample image, and the sample image includes an image corresponding to the first object.
  • the first determining module is specifically configured to:
  • the video is a video being shot;
  • the acquisition module is specifically configured to:
  • acquiring N frames of to-be-processed images in the video, where the N frames of to-be-processed images are the last N frames of images that have been captured in the video;
  • determining whether each of the N frames of to-be-processed images includes the first object, and if so, determining the N frames of to-be-processed images as the N frames of images.
  • the adding module is specifically used for:
  • the video is a completed video;
  • the acquisition module is specifically configured to:
  • the to-be-processed image selection operation includes: acquiring, from a preset image of the video, consecutive N frames of to-be-processed images in the video;
  • the operation of determining N frames of images includes: determining whether each of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images as the N frames of images, and if not, updating the preset image to a frame of image after the preset image in the video;
  • the adding module is specifically used for:
  • the special effect is added to at least one of the N frames of images.
  • the acquisition module is specifically configured to:
  • the N frames of images are determined in the video.
  • the device further includes a third determining module, wherein:
  • the third determining module is configured to determine that the target special effect is not added to the N frames of images before the acquiring module acquires consecutive N frames of images in the video.
  • an embodiment of the present disclosure provides an electronic device, including: a processor coupled with a memory;
  • the memory is used to store a computer program
  • the processor is configured to execute the computer program stored in the memory, so that the terminal device executes the method according to any one of the foregoing first aspects.
  • an embodiment of the present disclosure provides a readable storage medium, including a program or instruction, and when the program or instruction runs on a computer, the method described in any one of the foregoing first aspect is executed.
  • with the video processing method, device and equipment, when a special effect corresponding to the first object needs to be added to the video, consecutive N frames of images including the first object are determined in the video, the posture type of the first object in each frame of image is obtained, the posture distribution of the first object is obtained according to the posture type of the first object in each frame of image, and the special effect is added to the video according to the posture distribution of the first object and the N frames of images.
  • the posture distribution of the first object in the video is determined with the video frame as the unit. According to the posture distribution of the first object in the video, whether a preset action appears in the video can be accurately determined, and thus whether to add special effects to the video can be accurately determined.
  • the special effects are added to the video based on consecutive N frames of images, that is, the special effects can be added to the video at the granularity of the video frame, which improves the accuracy of adding the special effects.
  • FIG. 1 is an architecture diagram of video processing provided by an embodiment of the disclosure
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the disclosure
  • FIG. 3A is a schematic diagram of a video frame provided by an embodiment of the disclosure.
  • FIG. 3B is a schematic diagram of another video frame provided by an embodiment of the disclosure.
  • FIG. 4A is a schematic diagram of another video frame provided by an embodiment of the disclosure.
  • FIG. 4B is a schematic diagram of another video frame provided by an embodiment of the disclosure.
  • FIG. 5 is a schematic flowchart of another video processing method provided by an embodiment of the disclosure.
  • FIG. 6 is a schematic diagram of a video processing process provided by an embodiment of the application.
  • FIG. 7 is a schematic structural diagram of a video processing device provided by an embodiment of the disclosure.
  • FIG. 8 is a schematic structural diagram of another video processing device provided by an embodiment of the disclosure.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the disclosure.
  • FIG. 1 is an architecture diagram of video processing provided by an embodiment of the disclosure.
  • when adding special effects to a video, it is usually judged whether a preset action (for example, clapping, shaking the head, etc.) appears in the video; when it is determined that the preset action appears in the video, the special effect corresponding to the preset action is added to the video.
  • suppose a special effect corresponding to a preset action needs to be added to the video (assuming that the preset action corresponds to the first object, that is, the preset action is performed by the first object; the first object can be hands, legs, a head, a vehicle, etc.).
  • Each extracted image can be recognized to obtain the posture type of the first object in each image, and the posture distribution of the first object can be obtained according to the posture type of the first object in each frame of image.
  • the posture distribution of the object satisfies the preset distribution, it can be determined that the preset action appears in the video, and the special effect corresponding to the preset action is added to the video.
  • the posture distribution of the first object in the video is determined with the video frame as the unit. According to the posture distribution of the first object in the video, whether a preset action appears in the video can be accurately determined, and thus whether to add special effects to the video can be accurately determined.
  • the special effects are added to the video based on consecutive N frames of images, that is, the special effects can be added to the video at the granularity of the video frame, which improves the accuracy of adding the special effects.
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the disclosure. See Figure 2. The method can include:
  • the execution subject of the embodiments of the present disclosure may be an electronic device, or may be a video processing device provided in the electronic device.
  • the video processing device can be implemented by software, or by a combination of software and hardware.
  • the electronic device can be a mobile phone, a computer, a video camera with processing functions, and other devices.
  • each frame of image includes the first object, and N is an integer greater than 1.
  • Each frame of image includes complete video content.
  • the N frames of images are all key frames in the video.
  • the first object may be a hand, leg, head, vehicle, airplane, etc.
  • the special effect to be added to the video may be determined first, the first object corresponding to the special effect to be added to the video is determined, and N frames of images are determined in the video according to the first object.
  • the preset action corresponding to the special effect to be added in the video may be determined first, and the object performing the preset action is determined as the first object.
  • the special effect to be added in the video is a light special effect
  • the preset action corresponding to the light special effect is a clapping action
  • the object performing the clapping action is a hand. Therefore, the first object can be determined to be a hand, and accordingly, N consecutive frames of images that all include hands are determined in the video.
  • the process of determining consecutive N frames of images is also different.
  • it may include at least the following two possible application scenarios:
  • the video is a video being shot, that is, while the video is being shot, special effects are added to the video being shot.
  • N frames of to-be-processed images are obtained from the video, where the N frames of to-be-processed images are the last N frames of images that have been captured. It is determined whether each of the N frames of to-be-processed images includes the first object; if so, the N frames of to-be-processed images are determined as the N frames of images; if not, they are not. After a new image is captured, the N frames of to-be-processed images are updated, and the above process is repeated until the N frames of images are obtained.
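  The first scenario above can be sketched as a sliding window over frames as they are captured. The following is a minimal illustration only; the function name, frame representation, and `contains_object` predicate are assumptions for demonstration, not part of the patent:

  ```python
  from collections import deque

  def find_live_window(frame_stream, contains_object, n=6):
      """Scenario 1 (video being shot): keep a window of the last n
      captured frames; once every frame in the window contains the
      first object, those frames are the consecutive N frames."""
      window = deque(maxlen=n)
      for frame in frame_stream:
          window.append(frame)
          if len(window) == n and all(contains_object(f) for f in window):
              return list(window)
      return None  # still shooting: keep waiting for more frames

  # Demo mirroring FIG. 3B: frames are labelled 75..83 and frame 77
  # lacks the hand, so the window only closes at frames 78-83.
  demo = find_live_window(range(75, 84), lambda f: f != 77, n=6)
  ```

  The `deque(maxlen=n)` automatically discards the oldest frame, which matches the "update the N to-be-processed frames after a new image is captured" step.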
  • FIG. 3A is a schematic diagram of a video frame provided by an embodiment of the disclosure.
  • the first object is a hand
  • N is 6. Please refer to Figure 3A.
  • hands are included in the 75th through 80th frames. Since the last 6 captured frames (the 75th through 80th frames) all include hands, the 75th through 80th frames can be determined as the 6 consecutive frames of images.
  • FIG. 3B is a schematic diagram of another video frame provided by an embodiment of the disclosure.
  • the first object is a hand
  • N is 6. Please refer to Figure 3B.
  • the last captured image is the 80th frame, where the 75th-76th and 78th-80th frames include the hand and the 77th frame does not. Because one of the last 6 captured frames does not include a hand, shooting continues until, at time T2, the last captured frame is the 83rd frame and the 78th through 83rd frames all include hands;
  • the 78th through 83rd frames are then determined to be the 6 consecutive frames of images.
  • the video is a completed video, that is, special effects are added to the completed video.
  • continuous N frames of images can be obtained through the following feasible implementations: perform a to-be-processed image selection operation.
  • the to-be-processed image selection operation includes: starting from the preset image of the video, obtaining continuous images in the video N frames of images to be processed.
  • the N-frame image determination operation is performed. The N-frame image determination operation includes: judging whether each of the N frames of to-be-processed images includes the image corresponding to the first object; if so, the N frames of to-be-processed images are determined as the N frames of images; if not, the preset image is updated to a frame after the preset image in the video. The to-be-processed image selection operation and the N-frame image determination operation are repeated until the N frames of images are obtained.
  • the preset image can be updated to a frame of image after the preset image in the video.
  • the preset image may be updated to the next frame image of the second image, the second image being the last image that does not include the first object in the N frames of images to be processed.
  • FIG. 4A is a schematic diagram of another video frame provided by an embodiment of the disclosure.
  • N is 6.
  • the preset image is the first frame of image.
  • therefore, it is determined that the N frames of to-be-processed images are the 1st through 6th frames. Since the 3rd frame among the 1st through 6th frames does not include the hand, the preset image is updated to the 2nd frame, and accordingly, the N frames of to-be-processed images are updated to the 2nd through 7th frames.
  • since the 3rd frame among the 2nd through 7th frames still does not include the hand, the preset image is updated to the 3rd frame, and correspondingly, the N frames of to-be-processed images are updated to the 3rd through 8th frames. Since the 3rd frame among the 3rd through 8th frames does not include hands, the preset image is updated to the 4th frame, and correspondingly, the N frames of to-be-processed images are updated to the 4th through 9th frames; since the 4th through 9th frames all include hands, the 4th through 9th frames are determined to be the 6 consecutive frames of images.
  • FIG. 4B is a schematic diagram of another video frame provided by an embodiment of the disclosure.
  • N is 6.
  • the preset image is the first frame of image.
  • the preset image is the first frame of image; therefore, it is determined that the N frames of to-be-processed images are the 1st through 6th frames. Since the 3rd frame among the 1st through 6th frames does not include hands, the second image is determined among the 1st through 6th frames: the 3rd frame is the last frame that does not include hands, so the 3rd frame is determined to be the second image. Therefore, the preset image is updated to the 4th frame (the frame following the second image), and accordingly, the N frames of to-be-processed images are updated to the 4th through 9th frames; since the 4th through 9th frames all include hands, the 4th through 9th frames are determined to be the 6 consecutive frames of images.
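  The second scenario, with the skip-ahead behavior of this example (jumping past the last frame that lacks the object rather than advancing one frame at a time), can be sketched as follows. The function name and the boolean encoding of "frame contains the first object" are illustrative assumptions:

  ```python
  def find_offline_window(frames, contains_object, n=6):
      """Scenario 2 (completed video), FIG. 4B variant: take n consecutive
      frames starting at the preset index; if any frame lacks the object,
      restart just after the LAST such frame (the 'second image')."""
      start = 0
      while start + n <= len(frames):
          window = frames[start:start + n]
          misses = [i for i, f in enumerate(window) if not contains_object(f)]
          if not misses:
              return start, window
          start += misses[-1] + 1  # frame after the last frame without the object
      return None

  # Demo mirroring FIG. 4B: frame 3 (1-indexed) lacks the hand, so the
  # search jumps straight from frames 1-6 to frames 4-9.
  has_hand = [True, True, False, True, True, True, True, True, True]
  demo = find_offline_window(list(range(1, 10)), lambda f: has_hand[f - 1], n=6)
  ```

  Replacing `misses[-1] + 1` with `1` would give the simpler FIG. 4A behavior of always advancing by a single frame.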
  • the obtained N frames of images are images without added target special effects (special effects to be added to the video).
  • S202 Determine the posture type of the first object in each frame of image.
  • multiple posture types of the first object may be preset.
  • the posture type of the hand may include: open hands, put both hands together, and make a fist.
  • the posture type of the head may include: head up, head down, left head tilted, right head tilted, and so on.
  • the process of obtaining the posture type of the first object in each frame of image is the same. In the following, the process of obtaining the posture type of the first object in the first image will be described.
  • the object area can be detected in the first image, where the object area includes the part of the first image corresponding to the first object, and the object area is processed to obtain the posture type of the first object in the first image.
  • the object area can be detected in the first image through the following feasible implementation: input data representing the first image into the first recognition model to obtain the object area; the first recognition model is obtained by learning multiple groups of first samples, each group of first samples including a sample image and a sample object area in the sample image, and the sample image includes an image corresponding to the first object.
  • the data input representing the first image may be the first image, a grayscale image of the first image, or the like.
  • the object area may be a rectangular area including the first object in the first image.
  • since the first recognition model is learned from a large number of first samples, the first recognition model can accurately detect the object area in the first image.
  • the object area can be determined based on the output of the first recognition model.
  • the output of the first recognition model may be an image corresponding to the object area in the first image, or may be the positions (for example, coordinates) of at least two vertices of the object area in the first image.
  • when the output of the first recognition model is two vertices of the object area, the two vertices are two vertices on a diagonal of the area.
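  For illustration, two diagonal vertices are enough to recover the rectangular object area. A minimal sketch, assuming an image stored as nested lists of pixel rows and (x, y) vertex coordinates (both assumptions for demonstration):

  ```python
  def crop_object_area(image, p1, p2):
      """Turn two diagonal vertices output by a detection model into the
      rectangular object-area crop of an image (a list of pixel rows)."""
      (x1, y1), (x2, y2) = p1, p2
      left, right = sorted((x1, x2))   # vertex order is not guaranteed
      top, bottom = sorted((y1, y2))
      return [row[left:right + 1] for row in image[top:bottom + 1]]

  # Demo on a 4x4 "image" of pixel ids: crop the area between (1, 1) and (2, 2).
  demo = crop_object_area(
      [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]],
      (1, 1), (2, 2))
  ```

  The crop (or the vertex coordinates themselves) is then what would be fed to the second recognition model.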
  • the posture type of the first object in the first image can be obtained through the following feasible implementation: input data representing the object area into the second recognition model to obtain the posture type of the first object in the first image;
  • the second recognition model is obtained by learning multiple groups of second samples, each group of second samples including a sample object area and the sample posture type recognized in the sample object area, and the sample object area includes an image corresponding to the first object.
  • the data representing the object area may be an image corresponding to the object area, or the positions (for example, coordinates) of at least two vertices of the object area in the first image.
  • when the data representing the object area is two vertices of the object area, the two vertices are two vertices on a diagonal of the area.
  • the posture type of the first object in the first image can be determined according to the output of the second recognition model.
  • the output of the second recognition model may be characters (for example, numbers, letters, etc.) representing the posture type.
  • the posture type of the first object can be accurately determined in the object area through the second recognition model.
  • S203 Determine the posture distribution of the first object according to the posture type of the first object in each frame of image.
  • the posture distribution of the first object is used to indicate the law of change of the posture of the first object.
  • the first object is a hand
  • N is 6.
  • the posture types of the first object in the 6 frames of images are, in order: hands open facing each other, hands open facing each other, hands open facing each other, hands together, hands together, hands together.
  • the posture distribution of the first object can thus be obtained as: from hands open to hands together.
  • the posture distribution of the first object may be obtained through the following feasible implementation: the N frames of images are grouped according to the order of the N frames of images in the video to obtain at least two groups of images, each group of images including consecutive M frames of images, where M is an integer greater than 1; the posture type corresponding to each group of images is determined according to the posture type of the first object in each image in the group; and the posture distribution of the first object is obtained according to the posture type corresponding to each group of images.
  • when the number of images in a group of images whose posture type is a first posture type is greater than or equal to a first threshold, it is determined that the posture type corresponding to the group of images is the first posture type.
  • for example, when the number of images in a group of images whose posture type is hands together is greater than or equal to the first threshold, it is determined that the posture type corresponding to the group of images is hands together.
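  This first-threshold rule can be sketched as a vote over the frame-level posture types within one group. The helper below is illustrative only; the type labels and threshold value are assumptions:

  ```python
  from collections import Counter

  def group_posture_type(types, threshold):
      """Posture type of one group of M frames: the most common
      frame-level type, provided its count reaches the first threshold;
      otherwise no type is assigned to the group."""
      posture, count = Counter(types).most_common(1)[0]
      return posture if count >= threshold else None

  # Demo: two of three frames show "together", so with threshold 2 the
  # group's posture type is "together".
  demo = group_posture_type(["together", "together", "open"], threshold=2)
  ```

  Voting per group is also what gives the method its error tolerance: a single misrecognized frame cannot change the group's posture type.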
  • Table 1 merely illustrates the grouping of images by way of example, showing the posture type corresponding to each image.
  • S204 Add special effects to the video according to the posture distribution of the first object and the N frames of images.
  • it is determined whether the posture distribution of the first object satisfies a preset posture distribution; when the posture distribution of the first object satisfies the preset posture distribution, the target special effect corresponding to the preset posture distribution is obtained, and the target special effect is added to the video according to the N frames of images.
  • the process of adding target special effects to the video according to N frames of images is also different.
  • the video is a video being shot, that is, while the video is being shot, special effects are added to the video being shot.
  • special effects can be added to the Nth frame of the N frames of images.
  • a special effect is added to the video at the playback time corresponding to the Nth frame of image, and the display time of the special effect may be a preset duration.
  • the video is a completed video, that is, special effects are added to the completed video.
  • special effects can be added to at least one of the N frames of images.
  • special effects can be added to all N frames of images, that is, special effects can be added to the video between the playback moments corresponding to the N frames of images.
  • special effects are added to some of the N frames of images, that is, special effects are added to the video between playback moments corresponding to the partial images of the N frames of images.
  • consecutive N frames of images including the first object are determined in the video, and the posture type of the first object in each frame of image is obtained.
  • the posture distribution of the first object is obtained according to the posture type of the first object in each frame of image, and special effects are added to the video according to the posture distribution of the first object and N frames of images.
  • the posture distribution of the first object in the video is determined with the video frame as the unit. According to the posture distribution of the first object in the video, whether a preset action appears in the video can be accurately determined, and thus whether to add special effects to the video can be accurately determined.
  • special effects are added to the video based on the consecutive N frames of images, that is, special effects are added to the video with the video frame as the granularity, which improves the accuracy of adding the special effects.
  • FIG. 5 is a schematic flowchart of another video processing method provided by an embodiment of the disclosure. Referring to Figure 5, the method may include:
  • S501 Acquire consecutive N frames of images in the video.
  • S502 Group the N frames of images according to the order of the N frames of images in the video to obtain at least two sets of images.
  • each group of images includes consecutive M frames of images, and M is an integer greater than 1.
  • the first M consecutive frames of images are grouped into one group,
  • the (M+1)th through 2Mth frames of images are grouped into one group,
  • and so on, until all N frames of images are grouped.
  • N is an integer multiple of M.
  • S503 Determine the posture type of the first object in each image in each group of images.
  • S504 Determine a posture type corresponding to each group of images according to the posture type of the first object in each image in each group of images.
  • when the number of images in a group of images whose posture type is a first posture type is greater than or equal to a first threshold, it is determined that the posture type corresponding to the group of images is the first posture type.
  • for example, when the number of images in a group of images whose posture type is hands together is greater than or equal to the first threshold, it is determined that the posture type corresponding to the group of images is hands together.
  • S505 Determine the posture distribution of the first object according to the posture type corresponding to each group of images.
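  Steps S502 through S505 can be sketched end to end: grouping the frame-level posture types into runs of M, voting one type per group, and reading off the change law. Collapsing consecutive duplicate group types into the distribution is an assumption about how the "law of change" is encoded, as are the type labels:

  ```python
  from collections import Counter

  def posture_distribution(frame_types, m, threshold):
      """S502-S505: group N frame-level posture types into runs of M,
      vote one type per group (first-threshold rule), then collapse
      consecutive duplicates to obtain the change law."""
      assert len(frame_types) % m == 0, "N is assumed an integer multiple of M"
      dist = []
      for i in range(0, len(frame_types), m):
          posture, count = Counter(frame_types[i:i + m]).most_common(1)[0]
          if count >= threshold and (not dist or dist[-1] != posture):
              dist.append(posture)
      return dist

  # Demo matching FIG. 6: N=6, M=3; the first group votes "open" and the
  # second votes "together", giving the distribution open -> together.
  demo = posture_distribution(
      ["open", "open", "open", "together", "together", "together"],
      m=3, threshold=2)
  ```

  Matching against a preset distribution (S506) then reduces to comparing two such sequences of posture types.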
  • the first object is a hand
  • S506 Determine whether the posture distribution of the first object meets the preset posture distribution.
  • when the law of change of the posture of the first object indicated by the posture distribution of the first object is the same as the law of change indicated by the preset posture distribution, it is determined that the posture distribution of the first object meets the preset posture distribution.
  • the correspondence between posture distributions and special effects can be preset, and accordingly, the target special effect can be determined according to the preset posture distribution and the correspondence.
  • the posture distribution of the first object in the video is determined in units of video frames. According to the posture distribution of the first object in the video, whether a preset action appears in the video can be accurately determined, and thus whether to add special effects to the video can be accurately determined.
  • the special effects are added to the video based on consecutive N frames of images, that is, the special effects can be added to the video at the granularity of the video frame, which improves the accuracy of adding the special effects. Further, even if the posture type of the first object in the individual image is incorrectly recognized, the correct posture distribution of the first object can still be obtained, so that the error tolerance performance of the video processing is higher.
  • FIG. 6 is a schematic diagram of a video processing process provided by an embodiment of the application. Assume that the first object is a hand, N is 6, and the special effect to be added is flower spreading. Referring to FIG. 6, assume that the six acquired images are P1, P2, ..., P6.
  • As shown in FIG. 6, P1, P2, and P3 are divided into one group of images, and P4, P5, and P6 are divided into another group of images.
  • the data representing the 6 images is respectively input into the first preset model to obtain the object area in each image, where the object area includes the hand.
  • the posture types determined for the six images are: hands open facing each other, hands open facing each other, hands folded, hands folded, hands folded, hands folded. From this, it can be determined that the posture type corresponding to the first group of images is hands open facing each other, and the posture type corresponding to the second group of images is hands folded.
  • the posture distribution corresponding to the first object (the hand) is therefore: hands open facing each other, then hands folded. If this posture distribution satisfies the preset posture distribution, the flower-spreading special effect is added to the 6 images. Of course, the flower-spreading special effect may also be added to only some of the 6 images.
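The FIG. 6 example can be put together as a small end-to-end sketch (the labels, group size, and preset distribution follow the example above; all names are invented here for illustration):

```python
from collections import Counter

# Hypothetical pipeline for the FIG. 6 example: six per-frame posture
# labels, grouped in threes, a majority vote per group, and a comparison
# against the preset "hands open -> hands folded" distribution.
def posture_distribution(labels, m):
    groups = [labels[i:i + m] for i in range(0, len(labels), m)]
    return [Counter(g).most_common(1)[0][0] for g in groups]

# One misrecognized frame in the first group is absorbed by the vote.
labels = ["open", "open", "folded", "folded", "folded", "folded"]
distribution = posture_distribution(labels, 3)
add_flower_effect = distribution == ["open", "folded"]
```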
  • FIG. 7 is a schematic structural diagram of a video processing device provided by an embodiment of the disclosure.
  • the video processing device 10 may include an acquiring module 11, a first determining module 12, a second determining module 13, and an adding module 14.
  • the acquiring module 11 is configured to acquire consecutive N frames of images in a video, each frame of the image includes a first object, and the N is an integer greater than 1;
  • the first determining module 12 is configured to determine the posture type of the first object in each frame of image
  • the second determining module 13 is configured to determine the posture distribution of the first object according to the posture type of the first object in each frame of image, and the posture distribution is used to indicate the posture of the first object. Law of change
  • the adding module 14 is configured to add special effects to the video according to the posture distribution of the first object and the N frames of images.
  • the video processing device provided in the embodiments of the present disclosure can execute the technical solutions shown in the foregoing method embodiments, and the implementation principles and beneficial effects are similar, and details are not described herein again.
  • the adding module 14 is specifically configured to:
  • the second determining module 13 is specifically configured to:
  • the N frames of images are grouped to obtain at least two groups of images, each group of images including consecutive M frames of images, where M is an integer greater than 1;
  • the first determining module 12 is specifically configured to:
  • the object area is processed to obtain the posture type of the first object in the first image.
  • the first determining module 12 is specifically configured to:
  • the data representing the first image is input to a first recognition model to obtain the object area; the first recognition model is obtained by learning multiple groups of first samples, where each group of first samples includes a sample image and a sample object area in the sample image, and the sample image includes an image corresponding to the first object.
  • the first determining module 12 is specifically configured to:
  • the video is a video being shot;
  • the acquisition module 11 is specifically configured to:
  • acquiring N frames of to-be-processed images in the video, where the N frames of to-be-processed images include the last N frames of images that have been captured in the video;
  • each of the N frames of images to be processed includes the first object, and if so, the N frames of images to be processed are determined as the N frames of images.
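For a video that is still being shot, the selection of the last N captured frames can be sketched as a sliding window; the class and the `has_object` flag below are illustrative stand-ins for real frame data and an object detector:

```python
from collections import deque

# Hypothetical sliding window over a live stream: keep the last N frames
# and report the window only when every frame in it contains the first
# object (represented here by a precomputed has_object flag).
class FrameWindow:
    def __init__(self, n):
        self._frames = deque(maxlen=n)
        self._n = n

    def push(self, frame, has_object):
        self._frames.append((frame, has_object))
        if len(self._frames) == self._n and all(h for _, h in self._frames):
            return [f for f, _ in self._frames]  # the N frames of images
        return None  # keep waiting for a valid window
```

Using `deque(maxlen=n)` means the oldest frame is dropped automatically, so the check always runs over the last N captured frames.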
  • the adding module 14 is specifically configured to:
  • the video is a completed video; the acquisition module 11 is specifically configured to:
  • the to-be-processed image selection operation includes: acquiring, from a preset image of the video, consecutive N frames of to-be-processed images in the video;
  • the operation of determining N frames of images includes: determining whether each frame of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images to be the N frames of images, and if not, updating the preset image to a frame of image after the preset image in the video;
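For a completed video, the repeated select-then-check loop above amounts to sliding the preset start frame forward; a compact sketch with the detector results abstracted as a boolean list (names invented for illustration):

```python
# Hypothetical scan of a finished video: advance the preset start frame
# until N consecutive frames all contain the first object.
def find_n_frames(has_object, n):
    for start in range(len(has_object) - n + 1):
        if all(has_object[start:start + n]):
            return start  # index of the first frame of the N-frame window
    return None  # no valid window exists in this video
```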
  • the adding module 14 is specifically configured to:
  • the special effect is added to at least one of the N frames of images.
  • the obtaining module 11 is specifically configured to:
  • the N frames of images are determined in the video.
  • FIG. 8 is a schematic structural diagram of another video processing device provided by an embodiment of the disclosure. Based on the embodiment shown in FIG. 7, referring to FIG. 8, the video processing device 10 further includes a third determining module 15, where:
  • the third determining module 15 is configured to determine that the target special effect is not added to the N frames of images before the acquiring module 11 acquires consecutive N frames of images in the video.
  • the video processing device provided in the embodiments of the present disclosure can execute the technical solutions shown in the foregoing method embodiments, and the implementation principles and beneficial effects are similar, and details are not described herein again.
  • FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the disclosure.
  • the electronic device 20 may be a terminal device or a server.
  • terminal devices may include, but are not limited to, mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA for short), tablets (Portable Android Device, PAD for short), portable multimedia players (Portable Media Player, PMP for short), mobile terminals such as vehicle-mounted terminals (for example, vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers.
  • the electronic device shown in FIG. 9 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
  • the electronic device 20 may include a processing device (such as a central processing unit, a graphics processor, etc.) 21, which may execute various appropriate actions and processing based on a program stored in a read-only memory (Read Only Memory, ROM for short) 22 or a program loaded from a storage device 28 into a random access memory (Random Access Memory, RAM for short) 23.
  • the RAM 23 also stores various programs and data required for the operation of the electronic device 20.
  • the processing device 21, the ROM 22, and the RAM 23 are connected to each other through a bus 24.
  • An input/output (I/O) interface 25 is also connected to the bus 24.
  • the following devices can be connected to the I/O interface 25: an input device 26 including, for example, a touch screen, touch panel, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; an output device 27 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; a storage device 28 including a magnetic tape, a hard disk, etc.; and a communication device 29.
  • the communication device 29 may allow the electronic device 20 to perform wireless or wired communication with other devices to exchange data.
  • Although FIG. 9 shows an electronic device 20 having various devices, it should be understood that it is not required to implement or have all of the illustrated devices; more or fewer devices may alternatively be implemented or provided.
  • the process described above with reference to the flowchart can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 29, or installed from the storage device 28, or installed from the ROM 22.
  • When the computer program is executed by the processing device 21, the above-mentioned functions defined in the method of the embodiments of the present disclosure are executed.
  • the aforementioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, and a computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • the computer-readable signal medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to: wire, optical cable, RF (Radio Frequency), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • the foregoing computer-readable medium carries one or more programs, and when the foregoing one or more programs are executed by the electronic device, the electronic device is caused to execute the method shown in the foregoing embodiment.
  • the computer program code used to perform the operations of the present disclosure may be written in one or more programming languages or a combination thereof.
  • the above-mentioned programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or Wide Area Network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram can represent a module, program segment, or part of code, and the module, program segment, or part of code contains one or more executable instructions for realizing the specified logical function.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagram and/or flowchart, and combinations of blocks in the block diagram and/or flowchart, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented in a software manner, or may be implemented in a hardware manner.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

Provided in embodiments of the present disclosure are a video processing method and apparatus, and a device; the method comprises: acquiring N continuous frames of an image from within a video, each frame of the image comprising a first object, and N being an integer greater than one; determining the posture type of the first objects in each frame of the image, and according to the posture type of the first objects in each frame of the image, determining the posture distribution of the first objects, the posture distribution being used to indicate a change rule for the posture of the first objects; and according to the posture distribution of the first objects and the N frames of the image, adding a special effect in the video. The accuracy of adding a special effect in a video is improved.

Description

Video processing method, device and equipment
Cross-reference to related applications
This application claims priority to Chinese patent application No. 201910304462.5, entitled "Video processing method, device and equipment", filed on April 16, 2019, the entirety of which is incorporated herein by reference.
Technical field
The embodiments of the present disclosure relate to the field of computer technology, and in particular to a video processing method, device, and equipment.
Background
In order to improve the video playback effect, special effects can be added to a video; for example, special effects can include adding light flicker, adding a preset sound, and so on.
When adding special effects to a video, a person usually watches the video; when it is determined that a preset action appears in the video, a special effect is associated with the playback moment corresponding to the preset action; during playback, when that moment is reached, the corresponding special effect is displayed in the video. For example, if it is manually observed that a clapping action occurs at the 10th second of the video, a special effect is associated with the 10th second, and when the video plays to the 10th second, a special effect related to the clapping is displayed. However, in the above process, special effects are added according to the playback time of the video, and there may be a deviation between the moment when the preset action appears in the video and the moment when the special effect corresponding to the preset action is displayed, resulting in poor accuracy of the special effects added to the video.
Summary
The embodiments of the present disclosure provide a video processing method, device, and equipment, which improve the accuracy of special effects added to a video.
In a first aspect, embodiments of the present disclosure provide a video processing method, including:
acquiring consecutive N frames of images in a video, where each frame of the images includes a first object, and N is an integer greater than 1;
determining the posture type of the first object in each frame of image, and determining the posture distribution of the first object according to the posture type of the first object in each frame of image, where the posture distribution is used to indicate the change rule of the posture of the first object;
adding a special effect to the video according to the posture distribution of the first object and the N frames of images.
In a possible implementation manner, adding a special effect to the video according to the posture distribution of the first object and the N frames of images includes:
judging whether the posture distribution of the first object meets a preset posture distribution;
when the posture distribution of the first object meets the preset posture distribution, acquiring a target special effect corresponding to the preset posture distribution, and adding the target special effect to the video according to the N frames of images.
In a possible implementation manner, determining the posture distribution of the first object according to the posture type of the first object in each frame of image includes:
grouping the N frames of images according to their order in the video to obtain at least two groups of images, where each group of images includes consecutive M frames of images, and M is an integer greater than 1;
determining the posture type corresponding to each group of images according to the posture type of the first object in each image of each group of images;
determining the posture distribution of the first object according to the posture type corresponding to each group of images.
In a possible implementation manner, for any first image in the N frames of images, determining the posture type of the first object in the first image includes:
detecting an object area in the first image, where the object area includes a part of the first image corresponding to the first object;
processing the object area to determine the posture type of the first object in the first image.
In a possible implementation manner, detecting the object area in the first image includes:
inputting data representing the first image into a first recognition model to obtain the object area, where the first recognition model is obtained by learning multiple groups of first samples, each group of first samples includes a sample image and a sample object area in the sample image, and the sample image includes an image corresponding to the first object.
In a possible implementation manner, processing the object area to determine the posture type of the first object in the first image includes:
inputting data representing the object area into a second recognition model to obtain the posture type of the first object in the first image, where the second recognition model is obtained by learning multiple groups of second samples, each group of second samples includes a sample object area and a sample posture type recognized in the sample object area, and the sample object area includes an image corresponding to the first object.
In a possible implementation manner, the video is a video being shot, and acquiring consecutive N frames of images in the video includes:
acquiring N frames of to-be-processed images in the video, where the N frames of to-be-processed images include the last N frames of images that have been captured in the video;
judging whether each frame of the N frames of to-be-processed images includes the first object, and if so, determining the N frames of to-be-processed images to be the N frames of images.
In a possible implementation manner, adding the target special effect to the video according to the N frames of images includes:
adding the special effect to the Nth frame of the N frames of images.
In a possible implementation manner, the video is a video whose shooting has been completed, and acquiring consecutive N frames of images in the video includes:
performing a to-be-processed image selection operation, which includes: acquiring consecutive N frames of to-be-processed images in the video, starting from a preset image of the video;
performing an N-frame image determination operation, which includes: judging whether each frame of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images to be the N frames of images, and if not, updating the preset image to a frame of image after the preset image in the video;
repeating the to-be-processed image selection operation and the N-frame image determination operation until the N frames of images are determined.
In a possible implementation manner, adding the target special effect to the video according to the N frames of images includes:
adding the special effect to at least one of the N frames of images.
In a possible implementation manner, acquiring consecutive N frames of images in the video includes:
determining the special effect to be added to the video;
determining the first object corresponding to the special effect to be added to the video;
determining the N frames of images in the video according to the first object.
In a possible implementation manner, before acquiring consecutive N frames of images in the video, the method further includes:
determining that the target special effect has not been added to the N frames of images.
In a second aspect, an embodiment of the present disclosure provides a video processing device, including an acquiring module, a first determining module, a second determining module, and an adding module, where:
the acquiring module is configured to acquire consecutive N frames of images in a video, where each frame of the images includes a first object, and N is an integer greater than 1;
the first determining module is configured to determine the posture type of the first object in each frame of image;
the second determining module is configured to determine the posture distribution of the first object according to the posture type of the first object in each frame of image, where the posture distribution is used to indicate the change rule of the posture of the first object;
the adding module is configured to add a special effect to the video according to the posture distribution of the first object and the N frames of images.
In a possible implementation manner, the adding module is specifically configured to:
judge whether the posture distribution of the first object meets a preset posture distribution;
when the posture distribution of the first object meets the preset posture distribution, acquire a target special effect corresponding to the preset posture distribution, and add the target special effect to the video according to the N frames of images.
In a possible implementation manner, the second determining module is specifically configured to:
group the N frames of images according to their order in the video to obtain at least two groups of images, where each group of images includes consecutive M frames of images, and M is an integer greater than 1;
determine the posture type corresponding to each group of images according to the posture type of the first object in each image of each group of images;
obtain the posture distribution of the first object according to the posture type corresponding to each group of images.
In a possible implementation manner, for any first image in the N frames of images, the first determining module is specifically configured to:
detect an object area in the first image, where the object area includes a part of the first image corresponding to the first object;
process the object area to obtain the posture type of the first object in the first image.
In a possible implementation manner, the first determining module is specifically configured to:
input data representing the first image into a first recognition model to obtain the object area, where the first recognition model is obtained by learning multiple groups of first samples, each group of first samples includes a sample image and a sample object area in the sample image, and the sample image includes an image corresponding to the first object.
In a possible implementation manner, the first determining module is specifically configured to:
input data representing the object area into a second recognition model to obtain the posture type of the first object in the first image, where the second recognition model is obtained by learning multiple groups of second samples, each group of second samples includes a sample object area and a sample posture type recognized in the sample object area, and the sample object area includes an image corresponding to the first object.
In a possible implementation manner, the video is a video being shot, and the acquiring module is specifically configured to:
acquire N frames of to-be-processed images in the video, where the N frames of to-be-processed images include the last N frames of images that have been captured in the video;
judge whether each frame of the N frames of to-be-processed images includes the first object, and if so, determine the N frames of to-be-processed images to be the N frames of images.
In a possible implementation manner, the adding module is specifically configured to:
add the special effect to the Nth frame of the N frames of images.
In a possible implementation manner, the video is a video whose shooting has been completed, and the acquiring module is specifically configured to:
perform a to-be-processed image selection operation, which includes: acquiring consecutive N frames of to-be-processed images in the video, starting from a preset image of the video;
perform an N-frame image determination operation, which includes: judging whether each frame of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images to be the N frames of images, and if not, updating the preset image to a frame of image after the preset image in the video;
repeat the to-be-processed image selection operation and the N-frame image determination operation until the N frames of images are determined.
In a possible implementation manner, the adding module is specifically configured to:
add the special effect to at least one of the N frames of images.
在一种可能的实施方式中,所述获取模块具体用于:In a possible implementation manner, the acquisition module is specifically configured to:
确定待在所述视频中增加的特效;Determine the special effects to be added to the video;
确定待在所述视频中增加的特效对应的所述第一对象;Determine the first object corresponding to the special effect to be added in the video;
根据所述第一对象,在所述视频中确定所述N帧图像。According to the first object, the N frames of images are determined in the video.
在一种可能的实施方式中,所述装置还包括第三确定模块,其中,In a possible implementation manner, the device further includes a third determining module, wherein:
所述第三确定模块用于,在所述获取模块在视频中获取连续的N帧图像之前,确定未在所述N帧图像中增加所述目标特效。The third determining module is configured to determine that the target special effect is not added to the N frames of images before the acquiring module acquires consecutive N frames of images in the video.
第三方面,本公开实施例提供一种电子设备,包括:处理器,所述处理器与存储器耦合;In a third aspect, an embodiment of the present disclosure provides an electronic device, including: a processor coupled with a memory;
所述存储器用于,存储计算机程序;The memory is used to store a computer program;
所述处理器用于,执行所述存储器中存储的计算机程序,以使得所述终端设备执行上述第一方面任一项所述的方法。The processor is configured to execute the computer program stored in the memory, so that the terminal device executes the method according to any one of the foregoing first aspects.
第四方面,本公开实施例提供一种可读存储介质,包括程序或指令,当所述程序或指令在计算机上运行时,如上述第一方面任意一项所述的方法被执行。In a fourth aspect, an embodiment of the present disclosure provides a readable storage medium, including a program or instruction, and when the program or instruction runs on a computer, the method described in any one of the foregoing first aspect is executed.
本公开实施例提供的视频处理方法、装置及设备，当需要在视频中增加第一对象对应的特效时，在视频中确定连续的、包括第一对象的N帧图像，获取每帧图像中的第一对象的姿势类型，并根据每帧图像中的第一对象的姿势类型，获取第一对象的姿势分布，根据第一对象的姿势分布和N帧图像，在视频中增加特效。在上述过程中，以视频帧为单位，确定视频中第一对象的姿势分布，根据视频中第一对象的姿势分布，可以准确的确定得到视频中是否出现预设动作，进而可以准确的确定得到是否在视频中增加特效。在确定在视频中增加特效时，根据连续的N帧图像在视频中增加特效，即，可以以视频帧为粒度在视频中增加特效，提高了增加特效的精确度。In the video processing method, apparatus and device provided by the embodiments of the present disclosure, when a special effect corresponding to a first object needs to be added to a video, N consecutive frames of images including the first object are determined in the video, the posture type of the first object in each frame is obtained, the posture distribution of the first object is obtained according to the posture type of the first object in each frame, and the special effect is added to the video according to the posture distribution of the first object and the N frames of images. In the above process, the posture distribution of the first object in the video is determined on a per-frame basis; according to this posture distribution, whether a preset action appears in the video can be determined accurately, and hence whether to add the special effect to the video can be determined accurately. When it is determined that the special effect is to be added, it is added on the basis of the N consecutive frames of images, that is, at the granularity of video frames, which improves the precision of adding special effects.
附图说明Description of the drawings
为了更清楚地说明本公开实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本发明的一些实施例,对于本领域普通技术人员来讲,在 不付出创造性劳动性的前提下,还可以根据这些附图获得其他的附图。In order to more clearly explain the technical solutions in the embodiments of the present disclosure or the prior art, the following will briefly introduce the drawings that need to be used in the description of the embodiments or the prior art. Obviously, the drawings in the following description These are some embodiments of the present invention. For those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative labor.
图1为本公开实施例提供的视频处理的架构图;FIG. 1 is an architecture diagram of video processing provided by an embodiment of the disclosure;
图2为本公开实施例提供的一种视频处理方法的流程示意图;2 is a schematic flowchart of a video processing method provided by an embodiment of the disclosure;
图3A为本公开实施例提供的一种视频帧的示意图;3A is a schematic diagram of a video frame provided by an embodiment of the disclosure;
图3B为本公开实施例提供的另一种视频帧的示意图;FIG. 3B is a schematic diagram of another video frame provided by an embodiment of the disclosure;
图4A为本公开实施例提供的又一种视频帧的示意图;4A is a schematic diagram of another video frame provided by an embodiment of the disclosure;
图4B为本公开实施例提供的另一种视频帧的示意图;4B is a schematic diagram of another video frame provided by an embodiment of the disclosure;
图5为本公开实施例提供的另一种视频处理方法的流程示意图;FIG. 5 is a schematic flowchart of another video processing method provided by an embodiment of the disclosure;
图6为本申请实施例提供的视频处理过程示意图;FIG. 6 is a schematic diagram of a video processing process provided by an embodiment of the application;
图7为本公开实施例提供的一种视频处理装置的结构示意图;FIG. 7 is a schematic structural diagram of a video processing device provided by an embodiment of the disclosure;
图8为本公开实施例提供的另一种视频处理装置的结构示意图;FIG. 8 is a schematic structural diagram of another video processing device provided by an embodiment of the disclosure;
图9为本公开实施例提供的电子设备的结构示意图。FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the disclosure.
具体实施方式detailed description
为使本公开实施例的目的、技术方案和优点更加清楚，下面将结合本公开实施例中的附图，对本公开实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例是本发明一部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例，都属于本发明保护的范围。To make the objectives, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of the present invention.
图1为本公开实施例提供的视频处理的架构图。在视频中增加特效时，通常是判断视频中是否出现了预设动作（例如，鼓掌、摇头等），当确定视频中出现了预设动作之后，则在视频中增加预设的动作对应的特效。请参见图1，当需要在视频中增加预设动作（假设预设动作对应第一对象，即，由第一对象执行预设动作，第一对象可以为手、腿、头、车辆等）对应的特效时，可以在视频中进行图像提取，以得到N张连续的图像（图像1、图像2、……、图像N）。可以对提取到的每张图像进行识别处理，以得到每张图像中第一对象的姿势类型，并根据每帧图像中的第一对象的姿势类型，获取第一对象的姿势分布，在第一对象的姿势分布满足预设分布时，则可以确定视频中出现了预设动作，则在视频中增加预设动作对应的特效。FIG. 1 is an architecture diagram of video processing provided by an embodiment of the disclosure. When adding a special effect to a video, it is usually determined whether a preset action (for example, clapping, shaking the head, etc.) appears in the video; once it is determined that the preset action appears, the special effect corresponding to the preset action is added to the video. Referring to FIG. 1, when a special effect corresponding to a preset action needs to be added to the video (assuming the preset action corresponds to a first object, i.e., the preset action is performed by the first object, which may be a hand, a leg, a head, a vehicle, etc.), images can be extracted from the video to obtain N consecutive images (image 1, image 2, ..., image N). Each extracted image can be recognized to obtain the posture type of the first object in that image, and the posture distribution of the first object is obtained according to the posture type of the first object in each image. When the posture distribution of the first object satisfies a preset distribution, it can be determined that the preset action appears in the video, and the special effect corresponding to the preset action is added to the video.
在上述过程中，以视频帧为单位，确定视频中第一对象的姿势分布，根据视频中第一对象的姿势分布，可以准确的确定得到视频中是否出现预设动作，进而可以准确的确定得到是否在视频中增加特效。在确定在视频中增加特效时，根据连续的N帧图像在视频中增加特效，即，可以以视频帧为粒度在视频中增加特效，提高了增加特效的精确度。In the above process, the posture distribution of the first object in the video is determined on a per-frame basis; according to this posture distribution, whether a preset action appears in the video can be determined accurately, and hence whether to add the special effect can be determined accurately. When it is determined that the special effect is to be added, it is added on the basis of the N consecutive frames of images, that is, at the granularity of video frames, which improves the precision of adding special effects.
下面,通过具体实施例对本申请所示的技术方案进行详细说明。需要说明的是,下面几个具体实施例可以相互结合,对于相同或相似的内容,在不同的实施例中不再进行重复说明。Hereinafter, the technical solution shown in this application will be described in detail through specific embodiments. It should be noted that the following specific embodiments can be combined with each other, and the same or similar content will not be repeated in different embodiments.
图2为本公开实施例提供的一种视频处理方法的流程示意图。请参见图2,该方法可以包括:FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the disclosure. See Figure 2. The method can include:
S201、在视频中获取连续的N帧图像。S201: Acquire consecutive N frames of images in the video.
本公开实施例的执行主体可以为电子设备,也可以为设置在电子设备中的视频处理装置。可选的,视频处理装置可以通过软件实现,也可以通过软件和硬件的结合实现。The execution subject of the embodiments of the present disclosure may be an electronic device, or may be a video processing device provided in the electronic device. Optionally, the video processing device can be implemented by software, or by a combination of software and hardware.
可选的,电子设备可以为手机、电脑、具有处理功能的摄像机等设备。Optionally, the electronic device can be a mobile phone, a computer, a video camera with processing functions, and other devices.
其中,每帧图像中均包括第一对象,N为大于1的整数。Wherein, each frame of image includes the first object, and N is an integer greater than 1.
每帧图像中均包括完整的视频内容,例如,当视频为经过压缩处理的视频时,则N帧图像均为视频中的关键帧。Each frame of image includes complete video content. For example, when the video is a compressed video, the N frames of images are all key frames in the video.
可选的,第一对象可以为手、腿、头、车辆、飞机等。Optionally, the first object may be a hand, leg, head, vehicle, airplane, etc.
可选的,可以先确定待在视频中增加的特效,确定待在视频中增加的特效对应的第一对象,并根据第一对象,在视频中确定N帧图像。例如,在确定待在视频中增加的特效对应的第一对象时,可以先确定待在视频中增加的特效对应的预设动作,确定执行该预设动作的对象为第一对象。Optionally, the special effect to be added to the video may be determined first, the first object corresponding to the special effect to be added to the video is determined, and N frames of images are determined in the video according to the first object. For example, when determining the first object corresponding to the special effect to be added in the video, the preset action corresponding to the special effect to be added in the video may be determined first, and the object performing the preset action is determined as the first object.
例如，假设待在视频中增加的特效为灯光特效，灯光特效对应的预设动作为鼓掌动作，执行鼓掌动作的对象为手，因此，可以确定第一对象为手，相应的，在视频中确定的连续N张图像均包括手。For example, suppose the special effect to be added to the video is a lighting effect, the preset action corresponding to the lighting effect is clapping, and the object performing the clapping action is a hand. Therefore, it can be determined that the first object is a hand, and accordingly each of the N consecutive images determined in the video includes a hand.
当视频处理的应用场景不同时,确定连续的N帧图像的过程也不同,例如,可以包括至少如下两种可能的应用场景:When the application scenarios of video processing are different, the process of determining consecutive N frames of images is also different. For example, it may include at least the following two possible application scenarios:
一种可能的应用场景:视频为正在拍摄的视频,即,一边进行视频拍摄,一边在正在拍摄的视频中增加特效。A possible application scenario: the video is a video being shot, that is, while the video is being shot, special effects are added to the video being shot.
在该种可能的应用场景中，可以通过如下可行的实现方式获取连续的N帧图像：在视频中获取N帧待处理图像，N帧待处理图像中包括视频中已拍摄的最后N帧图像，判断N帧待处理图像中是否每帧待处理图像中均包括第一对象，若是，则将N帧待处理图像确定为所述N帧图像。若否，则不将该N帧待处理图像确定为所述N帧图像，可以在拍摄得到新的图像之后，更新N帧待处理图像，并重复上述过程，直至确定得到所述N帧图像。In this possible application scenario, N consecutive frames of images can be obtained through the following feasible implementation: N frames of to-be-processed images are obtained from the video, the N frames of to-be-processed images including the last N frames that have been captured; it is determined whether each of the N frames of to-be-processed images includes the first object, and if so, the N frames of to-be-processed images are determined as the N frames of images. If not, the N frames of to-be-processed images are not determined as the N frames of images; instead, after a new image is captured, the N frames of to-be-processed images are updated and the above process is repeated until the N frames of images are obtained.
下面,结合图3A-图3B,对在该种应用场景中确定连续的N帧图像的过程进行详细说明。In the following, the process of determining consecutive N frames of images in this application scenario will be described in detail with reference to FIGS. 3A-3B.
图3A为本公开实施例提供的一种视频帧的示意图。假设第一对象为手，N为6。请参见图3A，假设当前拍摄的最后一帧图像为第80帧图像，第75帧图像至第80帧图像中均包括手，由于拍摄得到的最后6帧图像（第75帧图像至第80帧图像）中均包括手，则可以将第75帧图像至第80帧图像确定为连续的6帧图像。FIG. 3A is a schematic diagram of a video frame provided by an embodiment of the disclosure. Assume the first object is a hand and N is 6. Referring to FIG. 3A, assume the last frame captured so far is frame 80 and frames 75 to 80 all include a hand. Since the last six captured frames (frames 75 to 80) all include a hand, frames 75 to 80 can be determined as the six consecutive frames of images.
图3B为本公开实施例提供的另一种视频帧的示意图。假设第一对象为手，N为6。请参见图3B，在T1时刻，拍摄的最后一帧图像为第80帧图像，其中，第75-76、78-80帧图像中包括手，第77帧图像中不包括手，由于最后拍摄得到的6帧图像中存在不包括手的图像，则继续进行拍摄，直至在T2时刻时，拍摄得到的最后一帧图像为第83帧图像，且第78帧图像至第83帧图像中均包括手，则将第78帧图像至第83帧图像确定为连续的6帧图像。FIG. 3B is a schematic diagram of another video frame provided by an embodiment of the disclosure. Assume the first object is a hand and N is 6. Referring to FIG. 3B, at time T1 the last captured frame is frame 80, where frames 75-76 and 78-80 include a hand and frame 77 does not. Since one of the last six captured frames does not include a hand, shooting continues until, at time T2, the last captured frame is frame 83 and frames 78 to 83 all include a hand; frames 78 to 83 are then determined as the six consecutive frames of images.
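The live-capture selection above can be sketched as a rolling check over per-frame detection results. The function name and the boolean flag list are illustrative stand-ins, assuming some upstream detector reports whether the first object appears in each captured frame; this is a sketch of the scheme, not part of the disclosure itself.

```python
from collections import deque

def find_live_window(frame_has_object, n):
    """Scan a stream of per-frame detection flags (True if the first
    object, e.g. a hand, was detected in that frame) and return the
    index of the last frame of the first window in which the most
    recent n frames all contain the object, or None otherwise."""
    last_n = deque(maxlen=n)            # rolling buffer of the last n flags
    for idx, has_object in enumerate(frame_has_object):
        last_n.append(has_object)
        if len(last_n) == n and all(last_n):
            return idx                  # frames idx-n+1 .. idx are the N frames
    return None

# Scenario of Fig. 3B (re-indexed from 0): the third flag (frame 77)
# lacks the hand, so the window only closes once six later frames
# (frames 78-83) have all been captured.
flags = [True, True, False, True, True, True, True, True, True]
```

With `n = 6`, `find_live_window(flags, 6)` returns index 8, i.e. the window ends at the ninth captured frame, matching the Fig. 3B behavior.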
另一种可能的应用场景:视频为拍摄完成的视频,即,在已拍摄完成的视频中增加特效。Another possible application scenario: the video is a completed video, that is, special effects are added to the completed video.
在该种可能的应用场景中，可以通过如下可行的实现方式获取连续的N帧图像：执行待处理图像选择操作，待处理图像选择操作包括：从视频的预设图像起，在视频中获取连续的N帧待处理图像。执行N帧图像确定操作，N帧图像确定操作包括：判断N帧待处理图像中是否每帧待处理图像中均包括第一对象对应的图像，若是，则将N帧待处理图像确定为N帧图像，若否，则将预设图像更新为视频中预设图像之后的一帧图像。重复执行待处理图像选择操作和N帧图像确定操作，直至确定得到N帧图像。In this possible application scenario, N consecutive frames of images can be obtained through the following feasible implementation: a to-be-processed image selection operation is performed, which includes: starting from a preset image of the video, acquiring N consecutive frames of to-be-processed images from the video. An N-frame image determination operation is performed, which includes: determining whether each of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images as the N frames of images, and if not, updating the preset image to an image frame after the preset image in the video. The two operations are repeated until the N frames of images are obtained.
可选的,可以将预设图像更新为视频中预设图像之后的一帧图像。或者,可以将预设图像更新为第二图像的后一帧图像,所述第二图像为所述N帧待处理图像中最后一个不包括第一对象的图像。Optionally, the preset image can be updated to a frame of image after the preset image in the video. Alternatively, the preset image may be updated to the next frame image of the second image, the second image being the last image that does not include the first object in the N frames of images to be processed.
下面,结合图4A-图4B,对在该种应用场景中确定连续的N帧图像的过程进行详细说明。Hereinafter, the process of determining consecutive N frames of images in this kind of application scenario will be described in detail with reference to FIGS. 4A-4B.
图4A为本公开实施例提供的又一种视频帧的示意图。假设第一对象为手，N为6，预设图像为第一帧图像。请参见图4A，初始时，预设图像为第一帧图像，因此，确定N帧待处理图像为第1帧图像至第6帧图像。由于第1帧图像至第6帧图像中的第3帧图像不包括手，则将预设图像更新为第二帧图像，相应的，N帧待处理图像更新为第2帧图像至第7帧图像。由于第2帧图像至第7帧图像中的第3帧图像中不包括手，则将预设图像更新为第三帧图像，相应的，N帧待处理图像更新为第3帧图像至第8帧图像。由于第3帧图像至第8帧图像中的第3帧图像中不包括手，则将预设图像更新为第四帧图像，相应的，N帧待处理图像更新为第4帧图像至第9帧图像，由于第4帧图像至第9帧图像中均包括手，则将第4帧图像至第9帧图像确定为连续的6帧图像。FIG. 4A is a schematic diagram of another video frame provided by an embodiment of the disclosure. Assume the first object is a hand, N is 6, and the preset image is the first frame. Referring to FIG. 4A, initially the preset image is frame 1, so the N frames of to-be-processed images are determined as frames 1 to 6. Since frame 3 among frames 1 to 6 does not include a hand, the preset image is updated to frame 2, and accordingly the N frames of to-be-processed images are updated to frames 2 to 7. Since frame 3 among frames 2 to 7 still does not include a hand, the preset image is updated to frame 3, and the N frames of to-be-processed images are updated to frames 3 to 8. Since frame 3 among frames 3 to 8 still does not include a hand, the preset image is updated to frame 4, and the N frames of to-be-processed images are updated to frames 4 to 9. Since frames 4 to 9 all include a hand, frames 4 to 9 are determined as the six consecutive frames of images.
图4B为本公开实施例提供的另一种视频帧的示意图。假设第一对象为手，N为6，预设图像为第一帧图像。请参见图4B，初始时，预设图像为第一帧图像，因此，确定N帧待处理图像为第1帧图像至第6帧图像。由于第1帧图像至第6帧图像中的第3帧图像不包括手，则在第1帧图像至第6帧图像中确定第二图像，由于第3帧图像中不包括手，因此，将第3帧图像确定为第二图像，因此，将预设图像更新为第4帧图像（第二图像的后一帧图像），相应的，N帧待处理图像更新为第4帧图像至第9帧图像，由于第4帧图像至第9帧图像中均包括手，则将第4帧图像至第9帧图像确定为连续的6帧图像。FIG. 4B is a schematic diagram of another video frame provided by an embodiment of the disclosure. Assume the first object is a hand, N is 6, and the preset image is the first frame. Referring to FIG. 4B, initially the preset image is frame 1, so the N frames of to-be-processed images are determined as frames 1 to 6. Since frame 3 among frames 1 to 6 does not include a hand, the second image is determined among frames 1 to 6; as frame 3 does not include a hand, frame 3 is determined as the second image. The preset image is therefore updated to frame 4 (the frame following the second image), and accordingly the N frames of to-be-processed images are updated to frames 4 to 9. Since frames 4 to 9 all include a hand, frames 4 to 9 are determined as the six consecutive frames of images.
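The completed-video search above, including the Fig. 4B skip-ahead where the preset image jumps past the last frame missing the object, can be sketched as a sliding-window scan over per-frame detection flags. The function name and flag list are hypothetical stand-ins; this is a sketch of the scheme, not part of the disclosure.

```python
def find_window(frame_has_object, n, start=0):
    """Search a finished video's per-frame detection flags for n
    consecutive frames that all contain the first object, the
    candidate window starting at `start` (the preset image). On
    failure, advance past the last frame in the window that lacks
    the object (the Fig. 4B variant; the Fig. 4A variant would
    simply advance by one frame instead)."""
    while start + n <= len(frame_has_object):
        window = frame_has_object[start:start + n]
        if all(window):
            return start                # frames start .. start+n-1
        # "second image": last frame in the window without the object
        last_missing = max(i for i, ok in enumerate(window) if not ok)
        start += last_missing + 1       # preset image jumps past it
    return None

# Fig. 4B scenario (0-indexed): frame 3 of the video lacks the hand.
flags = [True, True, False, True, True, True, True, True, True]
```

Here `find_window(flags, 6)` returns 3, i.e. the window is frames 4 to 9 in the 1-indexed numbering of Fig. 4B, found in a single jump rather than three one-frame steps.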
可选的,为了避免在相同的视频帧中增加重复的特效,则确定得到的该N帧图像为未增加目标特效(待在视频中增加的特效)的图像。Optionally, in order to avoid adding repeated special effects to the same video frame, it is determined that the obtained N frames of images are images without added target special effects (special effects to be added to the video).
S202、确定每帧图像中的第一对象的姿势类型。S202: Determine the posture type of the first object in each frame of image.
可选的，可以预先设置第一对象的多种姿势类型，例如，当第一对象为手时，则手的姿势类型可以包括：双手正对打开、双手合十、握拳等。例如，当第一对象为头时，则头的姿势类型可以包括：抬头、低头、左侧偏头、右侧偏头等。Optionally, multiple posture types of the first object can be preset. For example, when the first object is a hand, the posture types of the hand can include: hands open facing each other, hands pressed together, fist, etc. When the first object is a head, the posture types of the head can include: head raised, head lowered, head tilted left, head tilted right, etc.
获取每帧图像中的第一对象的姿势类型的过程相同,下面,以获取第一 图像中的第一对象的姿势类型的过程进行说明。The process of obtaining the posture type of the first object in each frame of image is the same. In the following, the process of obtaining the posture type of the first object in the first image will be described.
针对N帧图像中的任意的第一图像，可以在第一图像中检测对象区域，对象区域中包括第一图像中与第一对象对应的部分，并对对象区域进行处理，以获取第一图像中的第一对象的姿势类型。For any first image among the N frames of images, an object region can be detected in the first image, the object region including the part of the first image corresponding to the first object, and the object region is processed to obtain the posture type of the first object in the first image.
可选的，可以通过如下可行的实现方式在第一图像中检测对象区域：将表示第一图像的数据输入至第一识别模型，以获取对象区域；其中，第一识别模型为对多组第一样本进行学习得到的，每组第一样本包括样本图像和样本图像中的样本对象区域，样本图像中包括第一对象对应的图像。Optionally, the object region can be detected in the first image through the following feasible implementation: data representing the first image is input into a first recognition model to obtain the object region; the first recognition model is obtained by learning multiple sets of first samples, each set of first samples including a sample image and a sample object region in the sample image, the sample image including an image corresponding to the first object.
表示第一图像的数据可以为第一图像、第一图像的灰度图像等。对象区域可以为第一图像中包括第一对象的一个矩形区域。The data representing the first image can be the first image itself, a grayscale image of the first image, etc. The object region can be a rectangular region of the first image that includes the first object.
由于第一识别模型为对大量的第一样本学习得到的,因此,通过第一识别模型可以在第一图像中准确的检测对象区域。Since the first recognition model is learned from a large number of first samples, the first recognition model can accurately detect the target area in the first image.
可以根据第一识别模型的输出确定对象区域。第一识别模型的输出可以为第一图像中对象区域对应的图像,也可以为对象区域的至少两个顶点在第一图像中的位置(例如坐标)。当第一识别模型的输出为对象区域的两个顶点时,该两个顶点为对角线上的两个顶点。The target area can be determined based on the output of the first recognition model. The output of the first recognition model may be an image corresponding to the object area in the first image, or may be the positions (for example, coordinates) of at least two vertices of the object area in the first image. When the output of the first recognition model is two vertices of the target area, the two vertices are two vertices on a diagonal line.
可选的，可以通过如下可行的实现方式获取第一图像中的第一对象的姿势类型：将表示对象区域的数据输入至第二识别模型，以获取第一图像中的第一对象的姿势类型；其中，第二识别模型为对多组第二样本进行学习得到的，每组第二样本包括样本对象区域和在样本对象区域中识别得到的样本姿势类型，样本对象区域中包括第一对象对应的图像。Optionally, the posture type of the first object in the first image can be obtained through the following feasible implementation: data representing the object region is input into a second recognition model to obtain the posture type of the first object in the first image; the second recognition model is obtained by learning multiple sets of second samples, each set of second samples including a sample object region and the sample posture type recognized in the sample object region, the sample object region including an image corresponding to the first object.
表示对象区域的数据可以为对象区域对应的图像,或者对象区域的至少两个顶点在第一图像中的位置(例如坐标)。当表示对象区域的数据为对象区域的两个顶点时,该两个顶点为对角线上的两个顶点。The data representing the object area may be an image corresponding to the object area, or the positions (for example, coordinates) of at least two vertices of the object area in the first image. When the data representing the target area are two vertices of the target area, the two vertices are two vertices on a diagonal line.
可以根据第二识别模型的输出确定第一图像中的第一对象的姿势类型。第二识别模型的输出可以为表示姿势类型的字符(例如,数字、字母等)。The posture type of the first object in the first image can be determined according to the output of the second recognition model. The output of the second recognition model may be characters (for example, numbers, letters, etc.) representing the type of gesture.
由于第二识别模型为对大量的第二样本学习得到的,因此,通过第二识别模型可以在对象区域中准确的确定得到第一对象的姿势类型。Since the second recognition model is learned from a large number of second samples, the posture type of the first object can be accurately determined in the object area through the second recognition model.
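The two-stage recognition described above can be sketched as a pipeline that crops the detected object region and classifies the crop. The `detect_region` and `classify_region` callables below are placeholders for the trained first and second recognition models, which the disclosure does not specify further; the toy image and stand-in models exist only to make the data flow concrete.

```python
def classify_posture(image, detect_region, classify_region):
    """Two-stage recognition sketch: a detection model returns the
    object region as two diagonal corners, the rectangular region is
    cropped out of the image, and a second model maps the crop to a
    posture-type label."""
    (x1, y1), (x2, y2) = detect_region(image)    # diagonal corners
    crop = [row[x1:x2] for row in image[y1:y2]]  # rectangular object region
    return classify_region(crop)

# Toy stand-ins: "detect" a fixed 2x2 region; "classify" by the sum
# of the cropped pixels. Real models would be learned from samples.
detect = lambda img: ((1, 1), (3, 3))
classify = lambda crop: "hands_open" if sum(map(sum, crop)) > 2 else "fist"
image = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
```

Passing the region as corner coordinates rather than a cropped image mirrors the note above that the first model's output may be either the region image or the positions of two diagonal vertices.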
S203、根据每帧图像中的第一对象的姿势类型,确定第一对象的姿势分布。S203: Determine the posture distribution of the first object according to the posture type of the first object in each frame of image.
其中,第一对象的姿势分布用于指示第一对象的姿势的变化规律。Wherein, the posture distribution of the first object is used to indicate the law of change of the posture of the first object.
例如，假设第一对象为手，N为6，该6帧图像中的第一对象的姿势类型依次为：双手正对打开、双手正对打开、双手正对打开、双手合十、双手合十、双手合十。由此，可以得到第一对象的姿势分布为：双手正对打开到双手合十。For example, assume the first object is a hand and N is 6, and the posture types of the first object in the six frames of images are, in order: hands open facing each other, hands open facing each other, hands open facing each other, hands pressed together, hands pressed together, hands pressed together. The posture distribution of the first object is thus obtained as: from hands open facing each other to hands pressed together.
可选的，为了提高获取第一对象的姿势分布的准确性，可以通过如下可行的实现方式获取第一对象的姿势分布：按照N帧图像在视频中的顺序，对N帧图像进行分组，得到至少两组图像，每组图像中包括连续的M帧图像，M为大于1的整数；根据每组图像中每个图像中的第一对象的姿势类型，确定每组图像对应的姿势类型；根据每组图像对应的姿势类型，获取第一对象的姿势分布。Optionally, to improve the accuracy of obtaining the posture distribution of the first object, the posture distribution can be obtained through the following feasible implementation: the N frames of images are grouped according to their order in the video to obtain at least two groups of images, each group including M consecutive frames, M being an integer greater than 1; the posture type corresponding to each group of images is determined according to the posture type of the first object in each image of the group; and the posture distribution of the first object is obtained according to the posture type corresponding to each group of images.
可选的，针对任意一组图像，若该组图像中大于或等于第一阈值个图像的姿势类型为第一姿势类型，则确定该组图像对应的姿势类型为第一姿势类型。Optionally, for any group of images, if the posture type of a number of images in the group greater than or equal to a first threshold is a first posture type, the posture type corresponding to the group is determined to be the first posture type.
例如，假设M为3，第一阈值为2，则当一组图像中存在2个或3个图像对应的姿势类型为双手合十类型时，则确定该组图像对应的姿势类型为双手合十类型。For example, assume M is 3 and the first threshold is 2. When the posture type of two or three images in a group is the hands-pressed-together type, the posture type corresponding to that group is determined to be the hands-pressed-together type.
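The threshold rule above amounts to majority voting within each group of M frames. A minimal sketch, with hypothetical posture-type labels:

```python
from collections import Counter

def group_posture(frame_types, threshold):
    """Posture type of one group of M frames: if some type occurs in
    at least `threshold` of the frames, it becomes the group's type;
    otherwise no type is assigned (None)."""
    label, count = Counter(frame_types).most_common(1)[0]
    return label if count >= threshold else None
```

With M = 3 and a threshold of 2, a single mis-recognized frame does not change the group's result, which is the fault-tolerance property noted below.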
例如，假设N为9，该9帧图像分别记为图像1、图像2、……、图像9，M为3，则对该9帧图像的分组、以及确定得到的各图像组对应的姿势类型可以为表1所示：For example, assume N is 9, the nine frames of images are denoted image 1, image 2, ..., image 9, and M is 3. The grouping of the nine frames and the posture type determined for each group can then be as shown in Table 1:
表1Table 1
（表1在公开文本中以图片形式给出，此处无法复现；其列出了图像1至图像9按M=3分成的各图像组及每组确定的姿势类型。）(Table 1 appears as images in the published application and cannot be reproduced here; it lists the groups of images 1 to 9 divided with M = 3 and the posture type determined for each group.)
需要说明的是，表1只是以示例的形式示意对图像进行的分组，以及各图像组对应的姿势类型。It should be noted that Table 1 merely illustrates, by way of example, the grouping of the images and the posture type corresponding to each image group.
在上述过程中,即使对个别图像中的第一对象的姿势类型识别错误,依然可以获取得到正确的第一对象的姿势分布,使得视频处理的容错性能较高。In the above process, even if the posture type of the first object in the individual image is incorrectly recognized, the correct posture distribution of the first object can still be obtained, so that the error tolerance performance of the video processing is higher.
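One way to read the "posture distribution" described above is as the ordered sequence of distinct posture types across the groups. This sketch, with made-up labels, collapses consecutive duplicate group types and can then be compared against a preset distribution such as the hands-open-to-hands-together example; the representation is an assumption, since the disclosure does not fix a data format.

```python
def posture_distribution(group_types):
    """Collapse the per-group posture types into the object's posture
    distribution, i.e. the order in which distinct postures appear
    (consecutive duplicates merged, undecided groups skipped)."""
    dist = []
    for t in group_types:
        if t is not None and (not dist or dist[-1] != t):
            dist.append(t)
    return dist

# A preset distribution in the style of the example above:
# hands open facing each other, then hands pressed together.
preset = ["open", "together"]
```

A distribution of `["open", "open", "together"]` across three groups collapses to `["open", "together"]` and matches `preset`, at which point the corresponding special effect would be added.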
S204、根据第一对象的姿势分布和N帧图像,在视频中增加特效。S204: Add special effects to the video according to the posture distribution of the first object and the N frames of images.
可选的，可以判断第一对象的姿势分布是否满足预设姿势分布，在第一对象的姿势分布满足预设姿势分布时，获取预设姿势分布对应的目标特效，并根据N帧图像在视频中增加目标特效。Optionally, it can be determined whether the posture distribution of the first object satisfies a preset posture distribution; when it does, the target special effect corresponding to the preset posture distribution is obtained, and the target special effect is added to the video according to the N frames of images.
可选的,当视频处理的应用场景不同时,根据N帧图像在视频中增加目标特效的过程也不同。Optionally, when the application scenarios of video processing are different, the process of adding target special effects to the video according to N frames of images is also different.
一种可能的应用场景:视频为正在拍摄的视频,即,一边进行视频拍摄,一边在正在拍摄的视频中增加特效。A possible application scenario: the video is a video being shot, that is, while the video is being shot, special effects are added to the video being shot.
在该种应用场景下，可以在N帧图像中的第N帧图像中增加特效。或者，在第N帧图像对应的播放时刻在视频中增加特效，特效的显示时长可以为预设时长。In this application scenario, the special effect can be added to the Nth frame of the N frames of images. Alternatively, the special effect is added to the video at the playback moment corresponding to the Nth frame, and the display duration of the special effect can be a preset duration.
另一种可能的应用场景:视频为拍摄完成的视频,即,在已拍摄完成的视频中增加特效。Another possible application scenario: the video is a completed video, that is, special effects are added to the completed video.
在该种应用场景下,可以在N帧图像中的至少一帧图像中增加特效。例如,可以在N帧图像中全部增加特效,即,在该N帧图像对应的播放时刻之间在视频中增加特效。或者,在该N帧图像中的部分图像中增加特效,即,在该N帧图像中的部分图像对应的播放时刻之间在视频中增加特效。In this application scenario, special effects can be added to at least one of the N frames of images. For example, special effects can be added to all N frames of images, that is, special effects can be added to the video between the playback moments corresponding to the N frames of images. Alternatively, special effects are added to some of the N frames of images, that is, special effects are added to the video between playback moments corresponding to the partial images of the N frames of images.
本公开实施例提供的视频处理方法，当需要在视频中增加第一对象对应的特效时，在视频中确定连续的、包括第一对象的N帧图像，获取每帧图像中的第一对象的姿势类型，并根据每帧图像中的第一对象的姿势类型，获取第一对象的姿势分布，根据第一对象的姿势分布和N帧图像，在视频中增加特效。在上述过程中，以视频帧为单位，确定视频中第一对象的姿势分布，根据视频中第一对象的姿势分布，可以准确的确定得到视频中是否出现预设动作，进而可以准确的确定得到是否在视频中增加特效。在确定在视频中增加特效时，根据连续的N帧图像在视频中增加特效，即，可以以视频帧为粒度在视频中增加特效，提高了增加特效的精确度。In the video processing method provided by the embodiments of the present disclosure, when a special effect corresponding to a first object needs to be added to a video, N consecutive frames of images including the first object are determined in the video, the posture type of the first object in each frame is obtained, the posture distribution of the first object is obtained according to the posture type of the first object in each frame, and the special effect is added to the video according to the posture distribution of the first object and the N frames of images. In the above process, the posture distribution of the first object in the video is determined on a per-frame basis; according to this posture distribution, whether a preset action appears in the video can be determined accurately, and hence whether to add the special effect can be determined accurately. When it is determined that the special effect is to be added, it is added on the basis of the N consecutive frames of images, that is, at the granularity of video frames, which improves the precision of adding special effects.
在上述任意一个实施例的基础上,下面,通过图5所示的实施例,对视频处理方法进行详细说明。On the basis of any one of the above embodiments, the following describes the video processing method in detail through the embodiment shown in FIG. 5.
图5为本公开实施例提供的另一种视频处理方法的流程示意图。请参见图5,该方法可以包括:FIG. 5 is a schematic flowchart of another video processing method provided by an embodiment of the disclosure. Referring to Figure 5, the method may include:
S501、在视频中获取连续的N帧图像。S501: Acquire consecutive N frames of images in the video.
需要说明的是,S501的执行过程可以参见S202的执行过程,此处不再进行赘述。It should be noted that, for the execution process of S501, refer to the execution process of S202, which will not be repeated here.
S502、按照N帧图像在视频中的顺序,对N帧图像进行分组,得到至少两组图像。S502: Group the N frames of images according to the order of the N frames of images in the video to obtain at least two sets of images.
其中,每组图像中包括连续的M帧图像,M为大于1的整数。Wherein, each group of images includes consecutive M frames of images, and M is an integer greater than 1.
从N帧图像中的第一帧图像起,依次将连续的M帧图像分为一组,得到至少两组图像。例如,N帧图像中的第1帧图像至第M帧图像分为一组,第M+1帧图像至第2M帧图像分为一组,依次类推,直至将N帧图像分组完毕。Starting from the first frame of the N frame images, successively divide the consecutive M frame images into one group to obtain at least two groups of images. For example, the first frame image to the Mth frame image in the N frame images are grouped into one group, the M+1th frame image to the 2Mth frame image are grouped into one group, and so on, until the N frame images are grouped.
可选的,N为M的整数倍。Optionally, N is an integer multiple of M.
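The grouping in S502 can be sketched in a few lines of Python. This is an illustrative sketch only; the function name and the list-of-frames representation are assumptions, not part of the disclosure:

```python
def group_frames(frames, m):
    """Split N consecutive frames into groups of M, keeping video order.

    Assumes len(frames) is a multiple of m, matching the optional
    condition that N is an integer multiple of M.
    """
    return [frames[i:i + m] for i in range(0, len(frames), m)]

# N = 6 frames, M = 3: frames 1..M form one group, M+1..2M the next
groups = group_frames(["P1", "P2", "P3", "P4", "P5", "P6"], 3)
print(groups)  # [['P1', 'P2', 'P3'], ['P4', 'P5', 'P6']]
```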
S503、确定每组图像中每个图像中的第一对象的姿势类型。S503: Determine the posture type of the first object in each image in each group of images.
需要说明的是,S503的执行过程可以参见S202的执行过程,此处不再进行赘述。It should be noted that the execution process of S503 can be referred to the execution process of S202, which will not be repeated here.
S504、根据每组图像中每个图像中的第一对象的姿势类型,确定每组图像对应的姿势类型。S504: Determine a posture type corresponding to each group of images according to the posture type of the first object in each image in each group of images.
针对任意一组图像,若该组图像中大于或等于第一阈值个图像的姿势类型为第一姿势类型,则确定该组图像对应的姿势类型为第一姿势类型。For any group of images, if the posture type of the images in the group of images greater than or equal to the first threshold is the first posture type, it is determined that the posture type corresponding to the group of images is the first posture type.
例如，假设M为3，第一阈值为2，则当一组图像中存在2个或3个图像对应的姿势类型为双手合十类型时，则确定该组图像对应的姿势类型为双手合十类型。For example, assuming that M is 3 and the first threshold is 2, when the posture type corresponding to 2 or 3 images in a group of images is the hands-folded type, the posture type corresponding to the group of images is determined to be the hands-folded type.
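The per-group decision of S504 amounts to threshold voting over the per-image posture types. A minimal sketch (function and label names are assumptions for illustration):

```python
from collections import Counter

def group_pose_type(pose_types, first_threshold):
    """Return the pose type of a group when at least `first_threshold`
    of its images share that type; otherwise return None."""
    pose, count = Counter(pose_types).most_common(1)[0]
    return pose if count >= first_threshold else None

# M = 3, first threshold = 2: two of three frames show hands folded,
# so the group's pose type is "hands folded"
print(group_pose_type(["hands folded", "hands folded", "hands open"], 2))
# hands folded
```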
S505、根据每组图像对应的姿势类型,确定第一对象的姿势分布。S505: Determine the posture distribution of the first object according to the posture type corresponding to each group of images.
例如，若第一对象为手，在S502中确定得到2组图像，假设第一组图像对应的姿势类型为双手正对打开，第二组图像对应的姿势类型为双手合十，则第一对象的姿势分布为双手正对打开到双手合十。For example, if the first object is a hand and 2 groups of images are obtained in S502, assuming that the posture type corresponding to the first group of images is hands open facing each other and the posture type corresponding to the second group of images is hands folded, the posture distribution of the first object is from hands open facing each other to hands folded.
S506、判断第一对象的姿势分布是否满足预设姿势分布。S506: Determine whether the posture distribution of the first object meets the preset posture distribution.
若是,则执行S507-S508。If yes, execute S507-S508.
若否,则执行S501。If not, execute S501.
可选的，若第一对象的姿势分布所指示的第一对象的姿势的变化规律，与预设姿势分布所指示的第一对象的姿势的变化规律相同，则确定第一对象的姿势分布满足预设姿势分布。Optionally, if the change pattern of the posture of the first object indicated by the posture distribution of the first object is the same as the change pattern indicated by the preset posture distribution, it is determined that the posture distribution of the first object satisfies the preset posture distribution.
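The comparison in S506 can be sketched as follows. One reading of "same change pattern" is that consecutive duplicate posture types are collapsed so only the order of posture changes is compared; this interpretation, like the function names, is an assumption for illustration:

```python
def matches_preset(pose_sequence, preset_sequence):
    """Check whether the observed per-group pose types change in the
    same order as the preset posture distribution (consecutive
    duplicates collapsed, so only the pattern of change matters)."""
    def collapse(seq):
        out = []
        for p in seq:
            if not out or out[-1] != p:
                out.append(p)
        return out
    return collapse(pose_sequence) == collapse(preset_sequence)

# "open then folded" matches a preset that also goes open -> folded
print(matches_preset(["hands open", "hands folded"],
                     ["hands open", "hands open", "hands folded"]))  # True
```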
S507、获取预设姿势分布对应的目标特效。S507: Obtain a target special effect corresponding to the preset posture distribution.
可选的，可以预先设置姿势分布与特效之间的对应关系，相应的，可以根据预设姿势分布和该对应关系确定目标特效。Optionally, a correspondence between posture distributions and special effects may be preset; accordingly, the target special effect may be determined according to the preset posture distribution and this correspondence.
S508、根据N帧图像在视频中增加目标特效。S508: Add target special effects to the video according to the N frames of images.
需要说明的是,S508的执行过程可以参见S204的执行过程,此处不再进行赘述。It should be noted that, for the execution process of S508, refer to the execution process of S204, which will not be repeated here.
在图5所示的实施例中，以视频帧为单位，确定视频中第一对象的姿势分布，根据视频中第一对象的姿势分布，可以准确的确定得到视频中是否出现预设动作，进而可以准确的确定得到是否在视频中增加特效。在确定在视频中增加特效时，根据连续的N帧图像在视频中增加特效，即，可以以视频帧为粒度在视频中增加特效，提高了增加特效的精确度。进一步的，即使对个别图像中的第一对象的姿势类型识别错误，依然可以获取得到正确的第一对象的姿势分布，使得视频处理的容错性能较高。In the embodiment shown in FIG. 5, the posture distribution of the first object in the video is determined in units of video frames. According to this posture distribution, whether a preset action appears in the video can be accurately determined, and therefore whether to add the special effect to the video can be accurately determined. When it is determined that the special effect is to be added, it is added based on N consecutive frames of images, that is, at the granularity of video frames, which improves the accuracy of adding the special effect. Further, even if the posture type of the first object is incorrectly recognized in an individual image, the correct posture distribution of the first object can still be obtained, so that the video processing has high fault tolerance.
在上述任意一个实施例的基础上,下面,结合图6,通过具体示例,对上述方法实施例所示的视频处理方法进行详细说明。On the basis of any one of the foregoing embodiments, the video processing method shown in the foregoing method embodiment will be described in detail below with reference to FIG. 6 through specific examples.
图6为本申请实施例提供的视频处理过程示意图。假设第一对象为手,N为6,待增加的特效为撒花。请参见图6,假设确定得到的6张图像分别为P1、P2、……、P6。FIG. 6 is a schematic diagram of a video processing process provided by an embodiment of the application. Assuming that the first object is a hand, N is 6, and the special effect to be added is flower spreading. Please refer to Fig. 6, assuming that the six images obtained are P1, P2, ..., P6.
请参见图6，将P1、P2和P3分为一组图像，将P4、P5和P6分为一组图像。分别将表示该6张图像的数据输入至第一预设模型，得到每张图像中的对象区域，其中，对象区域中包括手。分别将表示6张图像中的对象区域输入至第二预设模型，得到手的姿势类型，例如，确定得到的手的姿势类型分别为：双手正对打开、双手正对打开、双手合十、双手合十、双手合十、双手合十。由此可以确定第一组图像对应的姿势类型为双手正对打开，第二组图像对应的姿势类型为双手合十，因此，可以确定第一对象（手）对应的姿势分布为：双手正对打开到双手合十，确定该姿势分布满足预设姿势分布，则在该6张图像中增加撒花特效。当然，还可以在该6张图像中的部分图像中增加撒花特效。Referring to FIG. 6, P1, P2, and P3 are divided into one group of images, and P4, P5, and P6 into another group. The data representing the 6 images are respectively input into the first preset model to obtain the object region in each image, where the object region includes the hand. The object regions in the 6 images are then respectively input into the second preset model to obtain the posture type of the hand; for example, the determined posture types are: hands open facing each other, hands open facing each other, hands folded, hands folded, hands folded, hands folded. From this it can be determined that the posture type corresponding to the first group of images is hands open facing each other and that corresponding to the second group is hands folded; therefore, the posture distribution corresponding to the first object (the hand) is: from hands open facing each other to hands folded. If it is determined that this posture distribution satisfies the preset posture distribution, the flower-spreading special effect is added to the 6 images. Of course, the special effect may also be added to only some of the 6 images.
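Putting the FIG. 6 example together, the overall flow can be sketched in Python. Here `detect_region` and `classify_pose` are hypothetical stand-ins for the first and second preset models, and all names are illustrative assumptions:

```python
from collections import Counter

def process_clip(frames, m, threshold, preset, detect_region, classify_pose):
    """Sketch of the FIG. 6 flow: per-frame region detection and pose
    classification, per-group threshold voting, then a comparison of
    the resulting posture distribution with the preset one. Returns
    True when the target effect should be added to the frames."""
    # Stage 1: per-frame pose type via the two preset models
    poses = [classify_pose(detect_region(f)) for f in frames]
    # Stage 2: per-group pose type by threshold voting
    group_types = []
    for i in range(0, len(poses), m):
        pose, count = Counter(poses[i:i + m]).most_common(1)[0]
        group_types.append(pose if count >= threshold else None)
    # Stage 3: compare the posture distribution with the preset one
    return group_types == preset

# Toy stand-ins for the two preset models (identity functions)
detect = lambda frame: frame
classify = lambda region: region
clip = ["open", "open", "folded", "folded", "folded", "folded"]
print(process_clip(clip, 3, 2, ["open", "folded"], detect, classify))  # True
```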
在图6所示的实施例中，以视频帧为单位，确定视频中第一对象的姿势分布，根据视频中第一对象的姿势分布，可以准确的确定得到视频中是否出现预设动作，进而可以准确的确定得到是否在视频中增加特效。在确定在视频中增加特效时，根据连续的N帧图像在视频中增加特效，即，可以以视频帧为粒度在视频中增加特效，提高了增加特效的精确度。进一步的，即使对个别图像中的第一对象的姿势类型识别错误，依然可以获取得到正确的第一对象的姿势分布，使得视频处理的容错性能较高。In the embodiment shown in FIG. 6, the posture distribution of the first object in the video is determined in units of video frames. According to this posture distribution, whether a preset action appears in the video can be accurately determined, and therefore whether to add the special effect to the video can be accurately determined. When it is determined that the special effect is to be added, it is added based on N consecutive frames of images, that is, at the granularity of video frames, which improves the accuracy of adding the special effect. Further, even if the posture type of the first object is incorrectly recognized in an individual image, the correct posture distribution of the first object can still be obtained, so that the video processing has high fault tolerance.
图7为本公开实施例提供的一种视频处理装置的结构示意图。请参见图7,该视频处理装置10可以包括获取模块11、第一确定模块12、第二确定模块13和增加模块14,其中,FIG. 7 is a schematic structural diagram of a video processing device provided by an embodiment of the disclosure. Referring to FIG. 7, the video processing device 10 may include an acquiring module 11, a first determining module 12, a second determining module 13, and an adding module 14.
所述获取模块11用于,在视频中获取连续的N帧图像,每帧所述图像中均包括第一对象,所述N为大于1的整数;The acquiring module 11 is configured to acquire consecutive N frames of images in a video, each frame of the image includes a first object, and the N is an integer greater than 1;
所述第一确定模块12用于,确定每帧图像中的所述第一对象的姿势类型;The first determining module 12 is configured to determine the posture type of the first object in each frame of image;
所述第二确定模块13用于，根据每帧图像中的所述第一对象的姿势类型，确定所述第一对象的姿势分布，所述姿势分布用于指示所述第一对象的姿势的变化规律；The second determining module 13 is configured to determine the posture distribution of the first object according to the posture type of the first object in each frame of image, where the posture distribution is used to indicate the change pattern of the posture of the first object;
所述增加模块14用于,根据所述第一对象的姿势分布和所述N帧图像,在所述视频中增加特效。The adding module 14 is configured to add special effects to the video according to the posture distribution of the first object and the N frames of images.
本公开实施例提供的视频处理装置可以执行上述方法实施例所示的技术方案,其实现原理以及有益效果类似,此处不再进行赘述。The video processing device provided in the embodiments of the present disclosure can execute the technical solutions shown in the foregoing method embodiments, and the implementation principles and beneficial effects are similar, and details are not described herein again.
在一种可能的实施方式中,所述增加模块14具体用于:In a possible implementation manner, the adding module 14 is specifically configured to:
判断所述第一对象的姿势分布是否满足预设姿势分布;Judging whether the posture distribution of the first object meets a preset posture distribution;
在所述第一对象的姿势分布满足预设姿势分布时,获取所述预设姿势分布对应的目标特效,并根据所述N帧图像在所述视频中增加所述目标特效。When the posture distribution of the first object satisfies a preset posture distribution, obtain a target special effect corresponding to the preset posture distribution, and add the target special effect to the video according to the N frames of images.
在一种可能的实施方式中,所述第二确定模块13具体用于:In a possible implementation manner, the second determining module 13 is specifically configured to:
按照所述N帧图像在所述视频中的顺序，对所述N帧图像进行分组，得到至少两组图像，每组图像中包括连续的M帧图像，所述M为大于1的整数；According to the order of the N frames of images in the video, the N frames of images are grouped to obtain at least two groups of images, where each group of images includes consecutive M frames of images, and M is an integer greater than 1;
根据每组图像中每个图像中的所述第一对象的姿势类型,确定每组图像对应的姿势类型;Determine the posture type corresponding to each group of images according to the posture type of the first object in each image in each group of images;
根据每组图像对应的姿势类型,获取所述第一对象的姿势分布。Obtain the posture distribution of the first object according to the posture type corresponding to each group of images.
在一种可能的实施方式中,针对所述N帧图像中的任意的第一图像,所述第一确定模块12具体用于:In a possible implementation manner, for any first image in the N frames of images, the first determining module 12 is specifically configured to:
在所述第一图像中检测对象区域,所述对象区域中包括所述第一图像中与所述第一对象对应的部分;Detecting an object area in the first image, where the object area includes a part of the first image corresponding to the first object;
对所述对象区域进行处理,以获取所述第一图像中的所述第一对象的姿势类型。The object area is processed to obtain the posture type of the first object in the first image.
在一种可能的实施方式中,所述第一确定模块12具体用于:In a possible implementation manner, the first determining module 12 is specifically configured to:
将表示所述第一图像的数据输入至第一识别模型，以获取所述对象区域；其中，所述第一识别模型为对多组第一样本进行学习得到的，每组第一样本包括样本图像和所述样本图像中的样本对象区域，所述样本图像中包括所述第一对象对应的图像。The data representing the first image is input into a first recognition model to obtain the object area; the first recognition model is obtained by learning multiple groups of first samples, each group of first samples including a sample image and a sample object area in the sample image, where the sample image includes an image corresponding to the first object.
在一种可能的实施方式中,所述第一确定模块12具体用于:In a possible implementation manner, the first determining module 12 is specifically configured to:
将表示所述对象区域的数据输入至第二识别模型，以获取所述第一图像中的所述第一对象的姿势类型；其中，所述第二识别模型为对多组第二样本进行学习得到的，每组第二样本包括样本对象区域和在所述样本对象区域中识别得到的样本姿势类型，所述样本对象区域中包括所述第一对象对应的图像。The data representing the object area is input into a second recognition model to obtain the posture type of the first object in the first image; the second recognition model is obtained by learning multiple groups of second samples, each group of second samples including a sample object area and a sample posture type recognized in the sample object area, where the sample object area includes an image corresponding to the first object.
在一种可能的实施方式中,所述视频为正在拍摄的视频;所述获取模块11具体用于:In a possible implementation manner, the video is a video being shot; the acquisition module 11 is specifically configured to:
在所述视频中获取N帧待处理图像,所述N帧待处理图像中包括所述视频中已拍摄的最后N帧图像;Acquiring N frames of to-be-processed images in the video, and the N frames of to-be-processed images include the last N frames of images that have been taken in the video;
判断所述N帧待处理图像中是否每帧待处理图像中均包括所述第一对象,若是,则将所述N帧待处理图像确定为所述N帧图像。It is determined whether each of the N frames of images to be processed includes the first object, and if so, the N frames of images to be processed are determined as the N frames of images.
在一种可能的实施方式中,所述增加模块14具体用于:In a possible implementation manner, the adding module 14 is specifically configured to:
在所述N帧图像中的第N帧图像中增加所述特效。Adding the special effect to the Nth frame of the N frames of images.
在一种可能的实施方式中，所述视频为拍摄完成的视频；所述获取模块11具体用于：In a possible implementation manner, the video is a completed video; the acquisition module 11 is specifically configured to:
执行待处理图像选择操作,所述待处理图像选择操作包括:从所述视频的预设图像起,在所述视频中获取连续的N帧待处理图像;Performing a to-be-processed image selection operation, the to-be-processed image selection operation includes: acquiring, from a preset image of the video, consecutive N frames of to-be-processed images in the video;
执行N帧图像确定操作，所述N帧图像确定操作包括：判断所述N帧待处理图像中是否每帧待处理图像中均包括所述第一对象对应的图像，若是，则将所述N帧待处理图像确定为所述N帧图像，若否，则将所述预设图像更新为所述视频中所述预设图像之后的一帧图像；Performing an N-frame image determining operation, which includes: determining whether each of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images as the N frames of images; if not, updating the preset image to a frame of image after the preset image in the video;
重复执行所述待处理图像选择操作和所述N帧图像确定操作,直至确定得到所述N帧图像。Repeat the operation of selecting the image to be processed and the operation of determining the N frames of images until it is determined that the N frames of images are obtained.
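The repeated selection and determination operations above amount to a sliding search over the finished video. A minimal sketch, assuming the preset image advances by one frame on each failed check (the function and predicate names are illustrative assumptions):

```python
def find_n_frames(video, n, contains_first_object, start=0):
    """Sliding search over a finished video: starting at a preset
    frame, take N consecutive to-be-processed frames; if every frame
    contains the first object, return them as the N frames of images,
    otherwise advance the start and repeat until the video is exhausted."""
    while start + n <= len(video):
        window = video[start:start + n]
        if all(contains_first_object(f) for f in window):
            return window
        start += 1
    return None

video = ["bg", "hand", "hand", "hand", "bg"]
print(find_n_frames(video, 3, lambda f: f == "hand"))
# ['hand', 'hand', 'hand']
```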
在一种可能的实施方式中,所述增加模块14具体用于:In a possible implementation manner, the adding module 14 is specifically configured to:
在所述N帧图像中的至少一帧图像中增加所述特效。The special effect is added to at least one of the N frames of images.
在一种可能的实施方式中,所述获取模块11具体用于:In a possible implementation manner, the obtaining module 11 is specifically configured to:
确定待在所述视频中增加的特效;Determine the special effects to be added to the video;
确定待在所述视频中增加的特效对应的所述第一对象;Determine the first object corresponding to the special effect to be added in the video;
根据所述第一对象,在所述视频中确定所述N帧图像。According to the first object, the N frames of images are determined in the video.
图8为本公开实施例提供的另一种视频处理装置的结构示意图。在图7所示实施例的基础上,请参见图8,视频处理装置10还包括第三确定模块15,其中,FIG. 8 is a schematic structural diagram of another video processing device provided by an embodiment of the disclosure. Based on the embodiment shown in FIG. 7, referring to FIG. 8, the video processing device 10 further includes a third determining module 15, where:
所述第三确定模块15用于,在所述获取模块11在视频中获取连续的N帧图像之前,确定未在所述N帧图像中增加所述目标特效。The third determining module 15 is configured to determine that the target special effect is not added to the N frames of images before the acquiring module 11 acquires consecutive N frames of images in the video.
本公开实施例提供的视频处理装置可以执行上述方法实施例所示的技术方案,其实现原理以及有益效果类似,此处不再进行赘述。The video processing device provided in the embodiments of the present disclosure can execute the technical solutions shown in the foregoing method embodiments, and the implementation principles and beneficial effects are similar, and details are not described herein again.
图9为本公开实施例提供的电子设备的结构示意图。电子设备20可以为终端设备或服务器。其中,终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、个人数字助理(Personal Digital Assistant,简称PDA)、平板电脑(Portable Android Device,简称PAD)、便携式多媒体播放器(Portable Media Player,简称PMP)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。图9示出的电子设备仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。FIG. 9 is a schematic structural diagram of an electronic device provided by an embodiment of the disclosure. The electronic device 20 may be a terminal device or a server. Among them, terminal devices may include, but are not limited to, mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA for short), tablets (Portable Android Device, PAD for short), portable multimedia players (Portable Media Player, PMP for short), mobile terminals such as vehicle-mounted terminals (for example, vehicle navigation terminals), and fixed terminals such as digital TVs and desktop computers. The electronic device shown in FIG. 9 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
请参见图9，电子设备20可以包括处理装置（例如中央处理器、图形处理器等）21，其可以根据存储在只读存储器（Read Only Memory，简称ROM）22中的程序或者从存储装置28加载到随机访问存储器（Random Access Memory，简称RAM）23中的程序而执行各种适当的动作和处理。在RAM 23中，还存储有电子设备20操作所需的各种程序和数据。处理装置21、ROM 22以及RAM 23通过总线24彼此相连。输入/输出（I/O）接口25也连接至总线24。Referring to FIG. 9, the electronic device 20 may include a processing device (such as a central processing unit, a graphics processor, etc.) 21, which may execute various appropriate actions and processing according to a program stored in a read-only memory (ROM) 22 or a program loaded from a storage device 28 into a random-access memory (RAM) 23. The RAM 23 also stores various programs and data required for the operation of the electronic device 20. The processing device 21, the ROM 22, and the RAM 23 are connected to each other through a bus 24. An input/output (I/O) interface 25 is also connected to the bus 24.
通常，以下装置可以连接至I/O接口25：包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置26；包括例如液晶显示器（Liquid Crystal Display，简称LCD）、扬声器、振动器等的输出装置27；包括例如磁带、硬盘等的存储装置28；以及通信装置29。通信装置29可以允许电子设备20与其他设备进行无线或有线通信以交换数据。虽然图9示出了具有各种装置的电子设备20，但是应理解的是，并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。Generally, the following devices may be connected to the I/O interface 25: input devices 26 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; output devices 27 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; storage devices 28 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 29. The communication device 29 may allow the electronic device 20 to perform wireless or wired communication with other devices to exchange data. Although FIG. 9 shows the electronic device 20 having various devices, it should be understood that it is not required to implement or have all the illustrated devices; more or fewer devices may alternatively be implemented or provided.
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置29从网络上被下载和安装,或者从存储装置28被安装,或者从ROM22被安装。在该计算机程序被处理装置21执行时,执行本公开实施例的方法中限定的上述功能。In particular, according to an embodiment of the present disclosure, the process described above with reference to the flowchart can be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from the network through the communication device 29, or installed from the storage device 28, or installed from the ROM 22. When the computer program is executed by the processing device 21, the above-mentioned functions defined in the method of the embodiment of the present disclosure are executed.
需要说明的是，本公开上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件，或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于：具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器（RAM）、只读存储器（ROM）、可擦式可编程只读存储器（EPROM或闪存）、光纤、便携式紧凑磁盘只读存储器（CD-ROM）、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中，计算机可读存储介质可以是任何包含或存储程序的有形介质，该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开中，计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号，其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式，包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质，该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输，包括但不限于：电线、光缆、RF（射频）等等，或者上述的任意合适的组合。It should be noted that the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to: a wire, an optical cable, RF (radio frequency), etc., or any suitable combination of the above.
上述计算机可读介质可以是上述电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。The above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备执行上述实施例所示的方法。The foregoing computer-readable medium carries one or more programs, and when the foregoing one or more programs are executed by the electronic device, the electronic device is caused to execute the method shown in the foregoing embodiment.
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码，上述程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++，还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中，远程计算机可以通过任意种类的网络——包括局域网（Local Area Network，简称LAN）或广域网（Wide Area Network，简称WAN）—连接到用户计算机，或者，可以连接到外部计算机（例如利用因特网服务提供商来通过因特网连接）。The computer program code used to perform the operations of the present disclosure may be written in one or more programming languages or a combination thereof; these programming languages include object-oriented programming languages, such as Java, Smalltalk, and C++, as well as conventional procedural programming languages, such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
附图中的流程图和框图，图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上，流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分，该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意，在有些作为替换的实现中，方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如，两个接连地表示的方框实际上可以基本并行地执行，它们有时也可以按相反的顺序执行，这依所涉及的功能而定。也要注意的是，框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合，可以用执行规定的功能或操作的专用的基于硬件的系统来实现，或者可以用专用硬件与计算机指令的组合来实现。The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code that contains one or more executable instructions for realizing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in a different order from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
描述于本公开实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。The units involved in the embodiments described in the present disclosure may be implemented in a software manner, or may be implemented in a hardware manner.
最后应说明的是：以上各实施例仅用以说明本公开实施例的技术方案，而非对其限制；尽管参照前述各实施例对本公开实施例进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分或者全部技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本公开实施例方案的范围。Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the embodiments of the present disclosure, not to limit them. Although the embodiments of the present disclosure have been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or equivalently replace some or all of the technical features therein; and these modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the solutions of the embodiments of the present disclosure.

Claims (15)

  1. 一种视频处理方法,其特征在于,包括:A video processing method, characterized by comprising:
    在视频中获取连续的N帧图像,每帧所述图像中均包括第一对象,所述N为大于1的整数;Acquire consecutive N frames of images in the video, each frame of the image includes the first object, and the N is an integer greater than 1;
    确定每帧图像中的所述第一对象的姿势类型,并根据每帧图像中的所述第一对象的姿势类型,确定所述第一对象的姿势分布,所述姿势分布用于指示所述第一对象的姿势的变化规律;Determine the posture type of the first object in each frame of image, and determine the posture distribution of the first object according to the posture type of the first object in each frame of image, and the posture distribution is used to indicate the The law of change of the posture of the first object;
    根据所述第一对象的姿势分布和所述N帧图像,在所述视频中增加特效。According to the posture distribution of the first object and the N frames of images, special effects are added to the video.
  2. 根据权利要求1所述的方法,其特征在于,根据所述第一对象的姿势分布和所述N帧图像,在所述视频中增加特效,包括:The method according to claim 1, wherein adding special effects to the video according to the posture distribution of the first object and the N frames of images comprises:
    判断所述第一对象的姿势分布是否满足预设姿势分布;Judging whether the posture distribution of the first object meets a preset posture distribution;
    在所述第一对象的姿势分布满足预设姿势分布时,获取所述预设姿势分布对应的目标特效,并根据所述N帧图像在所述视频中增加所述目标特效。When the posture distribution of the first object satisfies a preset posture distribution, obtain a target special effect corresponding to the preset posture distribution, and add the target special effect to the video according to the N frames of images.
  3. 根据权利要求1或2所述的方法,其特征在于,根据每帧图像中的所述第一对象的姿势类型,确定所述第一对象的姿势分布,包括:The method according to claim 1 or 2, wherein the determining the posture distribution of the first object according to the posture type of the first object in each frame of image comprises:
    按照所述N帧图像在所述视频中的顺序，对所述N帧图像进行分组，得到至少两组图像，每组图像中包括连续的M帧图像，所述M为大于1的整数；According to the order of the N frames of images in the video, the N frames of images are grouped to obtain at least two groups of images, where each group of images includes consecutive M frames of images, and M is an integer greater than 1;
    根据每组图像中每个图像中的所述第一对象的姿势类型,确定每组图像对应的姿势类型;Determine the posture type corresponding to each group of images according to the posture type of the first object in each image in each group of images;
    根据每组图像对应的姿势类型,确定所述第一对象的姿势分布。The posture distribution of the first object is determined according to the posture type corresponding to each group of images.
  4. 根据权利要求1-3任一项所述的方法,其特征在于,针对所述N帧图像中的任意的第一图像,确定所述第一图像中的所述第一对象的姿势类型,包括:The method according to any one of claims 1-3, wherein for any first image in the N frames of images, determining the posture type of the first object in the first image comprises :
    在所述第一图像中检测对象区域,所述对象区域中包括所述第一图像中与所述第一对象对应的部分;Detecting an object area in the first image, where the object area includes a part of the first image corresponding to the first object;
    对所述对象区域进行处理,以确定所述第一图像中的所述第一对象的姿势类型。The object area is processed to determine the posture type of the first object in the first image.
  5. 根据权利要求4所述的方法,其特征在于,在所述第一图像中检测对象区域,包括:The method of claim 4, wherein detecting an object area in the first image comprises:
    将表示所述第一图像的数据输入至第一识别模型，以获取所述对象区域；其中，所述第一识别模型为对多组第一样本进行学习得到的，每组第一样本包括样本图像和所述样本图像中的样本对象区域，所述样本图像中包括所述第一对象对应的图像。The data representing the first image is input into a first recognition model to obtain the object area; the first recognition model is obtained by learning multiple groups of first samples, each group of first samples including a sample image and a sample object area in the sample image, where the sample image includes an image corresponding to the first object.
  6. 根据权利要求4或5所述的方法,其特征在于,对所述对象区域进行处理,以确定所述第一图像中的所述第一对象的姿势类型,包括:The method according to claim 4 or 5, wherein processing the object area to determine the posture type of the first object in the first image comprises:
    将表示所述对象区域的数据输入至第二识别模型，以获取所述第一图像中的所述第一对象的姿势类型；其中，所述第二识别模型为对多组第二样本进行学习得到的，每组第二样本包括样本对象区域和在所述样本对象区域中识别得到的样本姿势类型，所述样本对象区域中包括所述第一对象对应的图像。The data representing the object area is input into a second recognition model to obtain the posture type of the first object in the first image; the second recognition model is obtained by learning multiple groups of second samples, each group of second samples including a sample object area and a sample posture type recognized in the sample object area, where the sample object area includes an image corresponding to the first object.
  7. The method according to claim 2, wherein the video is a video being shot, and acquiring N consecutive frames of images from the video comprises:
    acquiring N frames of to-be-processed images from the video, wherein the N frames of to-be-processed images include the last N frames already shot in the video;
    determining whether each of the N frames of to-be-processed images includes the first object, and if so, determining the N frames of to-be-processed images as the N frames of images.
  8. The method according to claim 7, wherein adding the target special effect to the video according to the N frames of images comprises:
    adding the special effect to the N-th frame of the N frames of images.
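For a video still being shot (claims 7-8), the last N shot frames form a sliding window: the effect is considered only when every frame in the window contains the first object, and it is then added to the newest, i.e. N-th, frame. A sketch under those assumptions (`contains_object` and `add_effect` are hypothetical callables standing in for the detector and the renderer):

```python
from collections import deque

def process_live_stream(frames, n, contains_object, add_effect):
    """Maintain the last n shot frames; when all of them contain the
    first object (claim 7), add the effect to the newest frame of the
    window (claim 8). Returns the indices of decorated frames."""
    window = deque(maxlen=n)  # always holds the last n frames already shot
    decorated = []
    for idx, frame in enumerate(frames):
        window.append(frame)
        if len(window) == n and all(contains_object(f) for f in window):
            add_effect(frame)  # the N-th, i.e. newest, frame of the window
            decorated.append(idx)
    return decorated

# Toy run: each frame is a boolean meaning "first object present", n = 3.
frames = [True, True, True, False, True, True, True]
print(process_live_stream(frames, 3, lambda f: f, lambda f: None))  # [2, 6]
```

Decorating only the newest frame fits the live setting: earlier frames of the window have already been displayed and can no longer be modified.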
  9. The method according to claim 2, wherein the video is a video whose shooting has been completed, and acquiring N consecutive frames of images from the video comprises:
    performing a to-be-processed image selection operation, which includes: acquiring N consecutive frames of to-be-processed images from the video, starting from a preset image of the video;
    performing an N-frame image determination operation, which includes: determining whether each of the N frames of to-be-processed images includes the image corresponding to the first object; if so, determining the N frames of to-be-processed images as the N frames of images; if not, updating the preset image to an image after the preset image in the video;
    repeating the to-be-processed image selection operation and the N-frame image determination operation until the N frames of images are determined.
  10. The method according to claim 9, wherein adding the target special effect to the video according to the N frames of images comprises:
    adding the special effect to at least one of the N frames of images.
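For a completed video (claims 9-10), the search starts at a preset frame, and whenever any frame in the current window lacks the first object, the preset frame advances and the check repeats until N consecutive qualifying frames are found. A sketch under those assumptions (`contains_object` is again a hypothetical detector; advancing by exactly one frame is an assumed reading of "an image after the preset image"):

```python
def find_n_consecutive(frames, n, start, contains_object):
    """Scan a finished video from the preset frame index `start` (claim 9):
    take n consecutive frames; if each contains the first object, return
    their indices; otherwise advance the preset frame and retry."""
    while start + n <= len(frames):
        window = frames[start:start + n]
        if all(contains_object(f) for f in window):
            return list(range(start, start + n))  # the determined N frames
        start += 1  # update the preset image to the next frame in the video
    return None  # no run of n qualifying frames after the preset image

# Toy run: each frame is a boolean meaning "first object present", n = 3.
frames = [True, False, True, True, True, False]
print(find_n_consecutive(frames, 3, 0, lambda f: f))  # [2, 3, 4]
```

Unlike the live case, every frame of the finished video can still be edited, which is why claim 10 allows the effect to be added to any one or more of the N frames.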
  11. The method according to any one of claims 1-10, wherein acquiring N consecutive frames of images from the video comprises:
    determining a special effect to be added to the video;
    determining the first object corresponding to the special effect to be added to the video;
    determining the N frames of images in the video according to the first object.
  12. The method according to claim 2, wherein before acquiring N consecutive frames of images from the video, the method further comprises:
    determining that the target special effect has not been added to the N frames of images.
  13. A video processing apparatus, comprising an acquisition module, a first determination module, a second determination module, and an addition module, wherein:
    the acquisition module is configured to acquire N consecutive frames of images from a video, each frame of the images including a first object, where N is an integer greater than 1;
    the first determination module is configured to determine the posture type of the first object in each frame of the images;
    the second determination module is configured to determine, according to the posture type of the first object in each frame of the images, a posture distribution of the first object, where the posture distribution indicates the change pattern of the posture of the first object;
    the addition module is configured to add a special effect to the video according to the posture distribution of the first object and the N frames of images.
  14. An electronic device, comprising at least one processor and a memory, wherein:
    the memory stores computer-executable instructions; and
    the at least one processor executes the computer-executable instructions stored in the memory, causing the at least one processor to perform the video processing method according to any one of claims 1-12.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the video processing method according to any one of claims 1-12.
PCT/CN2019/126757 2019-04-16 2019-12-19 Video processing method and apparatus, and device WO2020211422A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910304462.5 2019-04-16
CN201910304462.5A CN109889893A (en) 2019-04-16 2019-04-16 Method for processing video frequency, device and equipment

Publications (1)

Publication Number Publication Date
WO2020211422A1 true WO2020211422A1 (en) 2020-10-22

Family

ID=66937553

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/126757 WO2020211422A1 (en) 2019-04-16 2019-12-19 Video processing method and apparatus, and device

Country Status (2)

Country Link
CN (1) CN109889893A (en)
WO (1) WO2020211422A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109889893A (en) * 2019-04-16 2019-06-14 北京字节跳动网络技术有限公司 Method for processing video frequency, device and equipment
CN110223325B (en) * 2019-06-18 2021-04-27 北京字节跳动网络技术有限公司 Object tracking method, device and equipment
CN112396676B (en) * 2019-08-16 2024-04-02 北京字节跳动网络技术有限公司 Image processing method, device, electronic equipment and computer-readable storage medium
CN111416991B (en) * 2020-04-28 2022-08-05 Oppo(重庆)智能科技有限公司 Special effect processing method and apparatus, and storage medium
CN112199016B (en) * 2020-09-30 2023-02-21 北京字节跳动网络技术有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN112929743B (en) * 2021-01-22 2023-03-21 广州光锥元信息科技有限公司 Method and device for adding video special effect to specified object in video and mobile terminal

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004112112A (en) * 2002-09-13 2004-04-08 Sony Corp Information processing apparatus
US20130201328A1 (en) * 2012-02-08 2013-08-08 Hing Ping Michael CHUNG Multimedia processing as a service
CN107481327A (en) * 2017-09-08 2017-12-15 腾讯科技(深圳)有限公司 On the processing method of augmented reality scene, device, terminal device and system
CN108289180A (en) * 2018-01-30 2018-07-17 广州市百果园信息技术有限公司 Method, medium and the terminal installation of video are handled according to limb action
CN108833818A (en) * 2018-06-28 2018-11-16 腾讯科技(深圳)有限公司 video recording method, device, terminal and storage medium
CN109089058A (en) * 2018-07-06 2018-12-25 广州华多网络科技有限公司 Video pictures processing method, electric terminal and device
CN109889893A (en) * 2019-04-16 2019-06-14 北京字节跳动网络技术有限公司 Method for processing video frequency, device and equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104902212B (en) * 2015-04-30 2019-05-10 努比亚技术有限公司 A kind of video communication method and device
US20160365116A1 (en) * 2015-06-11 2016-12-15 Yaron Galant Video editing apparatus with participant sharing
CN106385591B (en) * 2016-10-17 2020-05-15 腾讯科技(上海)有限公司 Video processing method and video processing device
CN109391792B (en) * 2017-08-03 2021-10-29 腾讯科技(深圳)有限公司 Video communication method, device, terminal and computer readable storage medium
CN108712661B (en) * 2018-05-28 2022-02-25 广州虎牙信息科技有限公司 Live video processing method, device, equipment and storage medium
CN109618183B (en) * 2018-11-29 2019-10-25 北京字节跳动网络技术有限公司 A kind of special video effect adding method, device, terminal device and storage medium
CN109462776B (en) * 2018-11-29 2021-08-20 北京字节跳动网络技术有限公司 Video special effect adding method and device, terminal equipment and storage medium


Also Published As

Publication number Publication date
CN109889893A (en) 2019-06-14

Similar Documents

Publication Publication Date Title
WO2020211422A1 (en) Video processing method and apparatus, and device
US20210029305A1 (en) Method and apparatus for adding a video special effect, terminal device and storage medium
WO2020082870A1 (en) Real-time video display method and apparatus, and terminal device and storage medium
CN110070063B (en) Target object motion recognition method and device and electronic equipment
JP7199527B2 (en) Image processing method, device, hardware device
US20230421716A1 (en) Video processing method and apparatus, electronic device and storage medium
WO2020248900A1 (en) Panoramic video processing method and apparatus, and storage medium
CN109600559B (en) Video special effect adding method and device, terminal equipment and storage medium
CN114245028B (en) Image display method and device, electronic equipment and storage medium
CN110781823A (en) Screen recording detection method and device, readable medium and electronic equipment
CN110072047A (en) Control method, device and the hardware device of image deformation
WO2023185391A1 (en) Interactive segmentation model training method, labeling data generation method, and device
US20240119082A1 (en) Method, apparatus, device, readable storage medium and product for media content processing
CN111862349A (en) Virtual brush implementation method and device and computer readable storage medium
CN112258622B (en) Image processing method and device, readable medium and electronic equipment
US11880919B2 (en) Sticker processing method and apparatus
WO2023202590A1 (en) Page switching method and apparatus, and interaction method for terminal device
CN110022493B (en) Playing progress display method and device, electronic equipment and storage medium
CN113535105B (en) Media file processing method, device, equipment, readable storage medium and product
WO2021227953A1 (en) Image special effect configuration method, image recognition method, apparatuses, and electronic device
US12177534B2 (en) Method, system and device for playing effect in live room
CN116527993A (en) Video processing method, apparatus, electronic device, storage medium and program product
WO2025002075A1 (en) Video generation method and apparatus, electronic device, and storage medium
CN113129360B (en) Method and device for positioning object in video, readable medium and electronic equipment
WO2020207083A1 (en) Information sharing method and apparatus, and electronic device and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19924945

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.02.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19924945

Country of ref document: EP

Kind code of ref document: A1