
CN108289201B - Video data processing method and device and electronic equipment - Google Patents

Video data processing method and device and electronic equipment

Info

Publication number
CN108289201B
CN108289201B
Authority
CN
China
Prior art keywords
video
behavior
data processing
subsequent
pose
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810067998.5A
Other languages
Chinese (zh)
Other versions
CN108289201A (en)
Inventor
武锐 (Wu Rui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd filed Critical Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201810067998.5A
Publication of CN108289201A
Application granted
Publication of CN108289201B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/40: Scenes; Scene-specific elements in video content
    • G06V 20/46: Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/61: Control of cameras or camera modules based on recognised objects
    • H04N 23/611: Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00: Details of television systems
    • H04N 5/76: Television signal recording
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00: Details of television systems
    • H04N 5/76: Television signal recording
    • H04N 5/91: Television signal processing therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed is a video data processing method including: acquiring, by a first image capture device, a first video including an object, the first image capture device having a first resolution; detecting an initial pose of the object from an initial video frame of the first video; determining whether the initial pose of the object meets a first predetermined criterion; and, in response to the initial pose of the object meeting the first predetermined criterion, triggering acquisition of a second video including the object by a second image capture device, the second image capture device having a second resolution, the second resolution being higher than the first resolution. Thus, efficient processing of video data is achieved.

Description

Video data processing method and device and electronic equipment
Technical Field
The present application relates to the field of video technologies, and in particular, to a video data processing method, a video data processing apparatus, and an electronic device.
Background
Images and videos are important data in people's daily life and production, and video monitoring is an important application of them. The basic function of video monitoring is to provide real-time video acquisition and to record, transmit and store the acquired pictures for later review. As cameras for video monitoring are deployed ever more widely and their resolution increases year by year, the resulting data becomes around-the-clock and massive in volume, and the corresponding data processing, storage, query and other tasks all face new challenges.
Accordingly, there is a need for improved video data processing schemes.
Disclosure of Invention
The present application is proposed to solve the above-mentioned technical problems. Embodiments of the present application provide a video data processing method, a video data processing apparatus, and an electronic device, which can implement efficient processing of video data.
According to an aspect of the present application, there is provided a video data processing method including: acquiring, by a first image capture device, a first video including an object, the first image capture device having a first resolution; detecting an initial pose of the object from an initial video frame of the first video; determining whether the initial pose of the object meets a first predetermined criterion; and, in response to the initial pose of the object meeting the first predetermined criterion, triggering acquisition of a second video including the object by a second image capture device, the second image capture device having a second resolution, the second resolution being higher than the first resolution.
According to another aspect of the present application, there is provided a video data processing apparatus including: an acquisition unit configured to acquire a first video including an object by a first image capturing apparatus having a first resolution; a detection unit configured to detect an initial pose of the object from an initial video frame of the first video; a determination unit for determining whether the initial pose of the object meets a first predetermined criterion; and a control unit for triggering acquisition of a second video comprising the object by a second image capturing device in response to the initial pose of the object meeting the first predetermined criterion, the second image capturing device having a second resolution, the second resolution being higher than the first resolution.
According to another aspect of the present application, there is provided an electronic device including: a processor; and a memory in which are stored computer program instructions which, when executed by the processor, cause the processor to perform the video data processing method as described above.
According to yet another aspect of the present application, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the video data processing method as described above.
Compared with the prior art, with the video data processing method, the video data processing apparatus and the electronic device described above, a first video including an object can be acquired by a first image capture device having a first resolution; an initial pose of the object is detected from an initial video frame of the first video; it is determined whether the initial pose of the object meets a first predetermined criterion; and, in response to the initial pose of the object meeting the first predetermined criterion, acquisition of a second video including the object is triggered by a second image capture device having a second resolution higher than the first resolution. Accordingly, an important part of the video can be determined through pose recognition of the object, thereby achieving efficient processing of the video data.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.
Fig. 1 illustrates a schematic diagram of an application scenario of video data processing according to an embodiment of the present application.
Fig. 2 illustrates a flow chart of a video data processing method according to an embodiment of the present application.
FIG. 3 illustrates a schematic diagram of a single frame estimation technique of human body pose according to an embodiment of the application.
FIG. 4 illustrates a schematic diagram of a human gesture-based behavior recognition technique according to an embodiment of the application.
Fig. 5 illustrates a flow chart of a video data acquisition process according to an embodiment of the application.
Fig. 6 illustrates a flow chart of a video data storage process according to an embodiment of the present application.
Fig. 7 illustrates a block diagram of a video data processing apparatus according to an embodiment of the present application.
FIG. 8 illustrates a block diagram of an electronic device in accordance with an embodiment of the present application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.
Summary of the application
As described above, a large amount of video data is generated in people's daily life and production. However, not all frames in this video data are equally important, and this is especially apparent in the surveillance field. In general, abnormal behavior of a person within the imaging range is sparse, yet it is the most important part of the entire monitoring data. Therefore, by applying human pose and behavior recognition techniques, key frames in the video can be identified, so that the video data can be processed efficiently.
In view of the technical problem, the present application provides a video data processing method, a video data processing apparatus, and an electronic device, which can determine an important part in a video through gesture recognition of an object and perform different processing for the important part and an unimportant part in the video, thereby achieving efficient processing of video data.
It should be noted that the above basic concept of the present application can be applied not only to gesture recognition of a human body, but also to gesture recognition of other objects, such as a non-human moving object, or a human-like robot.
Having described the general principles of the present application, various non-limiting embodiments of the present application will now be described with reference to the accompanying drawings.
Exemplary System
Fig. 1 illustrates a schematic diagram of an application scenario of video data processing according to an embodiment of the present application.
As shown in fig. 1, an application scenario for video data processing includes a first image capture device 110, a second image capture device 120, and an object 130.
The first image capture device 110 may be any type of image capture device, such as a camera. For example, the image data acquired by the camera may be a continuous sequence of image frames (i.e., a video stream) or a discrete sequence of image frames (i.e., a set of images sampled at predetermined sampling time points). The camera may be a monocular camera, a binocular camera, a multi-view camera, and so on; it may capture grayscale images or color images carrying color information. Of course, any other type of camera known in the art or developed in the future may be applied to the present application; the present application places no particular limitation on the manner in which an image is captured, as long as grayscale or color information of the input image can be obtained. To reduce the amount of computation in subsequent operations, in one embodiment, a color image may be converted to grayscale before analysis and processing. Of course, to preserve more information, in another embodiment, the color image may also be analyzed and processed directly.
The second image acquisition device 120 may also be any type of image acquisition device, which may be of a different type than the first image acquisition device 110, or which may have different imaging parameters than the first image acquisition device 110.
The object 130 may be any type of imaging object. For example, the object 130 may be a human, or other moving object, including a robot, etc.
Exemplary method
Fig. 2 illustrates a flow chart of a video data processing method according to an embodiment of the present application.
As shown in fig. 2, a video data processing method according to an embodiment of the present application includes: s210, acquiring a first video including an object through a first image acquisition device, wherein the first image acquisition device has a first resolution; s220, detecting an initial posture of the object from an initial video frame of the first video; s230, determining whether the initial posture of the object meets a first preset standard; and S240, in response to the initial pose of the object meeting the first predetermined criterion, triggering acquisition of a second video including the object by a second image acquisition device, the second image acquisition device having a second resolution, the second resolution being higher than the first resolution.
Hereinafter, each step will be described in detail.
In step S210, a first video including an object is acquired by a first image capturing device 110, for example, a first camera. Here, in order to reduce the amount of computation and memory space for gesture recognition, the first image capturing device may have a lower first resolution.
In step S220, an initial pose of the object is detected from an initial video frame of the first video.
FIG. 3 illustrates a schematic diagram of a single frame estimation technique of human body pose according to an embodiment of the application.
As shown in fig. 3, the pose of a human body may be detected from a single video frame of the video. Specifically, pose estimation of a human body, a human face, and the like can be performed with modern computer vision techniques. For example, the skeleton keypoints of the human body (head, shoulders, elbows, wrists, hips, knees, ankles, etc.) and the head pose (left-right rotation, pitch, and the corresponding angles) can be identified with high accuracy from images acquired by a monocular or binocular camera, thereby estimating the spatial relationship of the limbs as well as the subject's build and height.
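The pose criterion can be grounded in such skeleton keypoints. As a minimal illustration (the keypoint format and the 30-degree threshold are assumptions for this sketch, not taken from the patent), a forward-leaning pose might be flagged from the shoulder and hip keypoints:

```python
import math

def torso_lean_angle(shoulder_xy, hip_xy):
    """Angle of the shoulder-hip line from vertical, in degrees.

    Keypoints are (x, y) image coordinates with y increasing downward,
    as a typical pose estimator would output (an assumption here)."""
    dx = shoulder_xy[0] - hip_xy[0]
    dy = hip_xy[1] - shoulder_xy[1]  # hip is normally below the shoulder
    return math.degrees(math.atan2(abs(dx), dy))

def is_forward_leaning(shoulder_xy, hip_xy, threshold_deg=30.0):
    # A pose requiring attention could be a torso tilted more than
    # `threshold_deg` from vertical (an illustrative threshold).
    return torso_lean_angle(shoulder_xy, hip_xy) > threshold_deg
```

An upright torso (shoulder directly above hip) yields an angle of zero and is not flagged, while a strongly tilted shoulder-hip line exceeds the threshold.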
In step S230, it is determined whether the initial pose of the object meets a first predetermined criterion. That is, by determining whether the initial pose of the object meets the first predetermined criterion, it can be determined whether the initial pose is a specific pose requiring attention, such as a forward-leaning pose or a lying pose of a human body in a home-care scenario.
Subsequently, in step S240, if the initial pose of the object is a specific pose requiring attention, the acquisition of a second video including the object by the second image capture device 120 is triggered. Here, the second image capture device may have a higher resolution in order to obtain a clearer image of the content of interest.
Therefore, in the video data processing method according to the embodiment of the present application, depending on whether a specific pose of an object is detected, a first video and a second video of the object are respectively acquired by a first image capturing device and a second image capturing device having different resolutions. In this way, the processing of the key frame including the key pose of the object can be performed by the second video having the higher second resolution, thereby improving the processing efficiency of the video data.
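The trigger flow of steps S210-S240 can be sketched as follows; `meets_first_criterion` and the pose labels are hypothetical stand-ins, since the embodiment does not fix a concrete pose-estimation interface:

```python
def meets_first_criterion(pose):
    # First predetermined criterion: poses requiring attention, e.g. a
    # forward-leaning or lying pose in a home-care scenario (illustrative).
    return pose in {"forward_leaning", "lying"}

def process_first_video(poses):
    """Scan the poses detected from the low-resolution first video
    (steps S210/S220, one pose label per frame) and report whether the
    high-resolution second camera should be triggered (steps S230/S240)."""
    for pose in poses:
        if meets_first_criterion(pose):  # step S230
            return True                  # step S240: trigger second camera
    return False
```

Ordinary poses pass through without triggering; the first attention-worthy pose triggers the second camera.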
However, when only the pose of the object is recognized, there may be cases in which the behavior of the object cannot be clearly determined. For example, when a forward-leaning pose of a human body is detected, the subsequent behavior cannot be determined with certainty: it may be an active running behavior or a passive falling behavior.
Therefore, in the video data processing method according to the embodiment of the present application, behaviors (e.g., a human body climbing up, sitting down, or falling down) are further recognized from the sequence of video frames, using the pose recognition results of adjacent frames.
FIG. 4 illustrates a schematic diagram of a human gesture-based behavior recognition technique according to an embodiment of the application. As shown in fig. 4, based on the human body gestures recognized in the consecutive frames, the behavior of the human body can be recognized through the gesture sequence.
The video data processing method according to an embodiment of the present application may include two examples with respect to the identification of the behavior of the object.
First, in a first example, the acquisition of the second video by the second image capture device may be triggered only upon determining that a particular behavior of the object has been detected. In this way, the probability of falsely activating the high-resolution second image capture device can be reduced. For example, if the initial pose detected in the initial video frame is determined to meet the first predetermined criterion, it may be further determined whether a first subsequent pose, detected in a first subsequent video frame after the initial video frame, also meets the first predetermined criterion. Then, when the specific pose is detected across a plurality of consecutive video frames, the behavior of the object is identified based on the pose sequence. That is, in the video data processing method according to the embodiment of the present application, the behavior of the object is recognized based on the initial pose of the initial video frame and the first subsequent pose of the first subsequent video frame.
From the behavior recognition result, the priority of the behavior of the object can be determined. That is, in the video data processing method according to the embodiment of the present application, whether the behavior of the object is a high-priority behavior is determined according to the behavior category information of that behavior. Here, a high-priority behavior is a behavior of the object that requires significant attention, for example, a falling behavior of a human body in a home-care scenario. Accordingly, a low-priority behavior is a common behavior of the object, e.g., a walking behavior of a human body.
In this way, acquisition of the second video by the second image capture device is triggered only when a high-priority behavior of the object is determined to have been detected, ensuring that the second image capture device is not falsely started by an unimportant behavior of the object.
Therefore, in the video data processing method according to the embodiment of the present application, triggering acquisition of a second video including the object by a second image capture device in response to the initial pose of the object meeting the first predetermined criterion (step S240) may include: in response to the initial pose of the object meeting the first predetermined criterion, detecting a first subsequent pose of the object from a first subsequent video frame after the initial video frame; determining whether the first subsequent pose of the object meets the first predetermined criterion; in response to the first subsequent pose of the object meeting the first predetermined criterion, determining that the behavior of the object is a high-priority behavior having a priority greater than a predetermined threshold; and in response to determining that the behavior of the object is the high-priority behavior, triggering acquisition, by the second image capture device, of a second video that includes the object.
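The confirmation logic of this first example reduces to requiring the criterion to hold in both the initial frame and the first subsequent frame before the trigger fires. A minimal sketch (the function name and pose labels are illustrative, not from the patent):

```python
def should_trigger_high_res(initial_pose, first_subsequent_pose, criterion):
    """First example: trigger the high-resolution camera only after the
    attention-worthy pose is seen in both the initial frame and the first
    subsequent frame, i.e. the behavior is judged high priority."""
    if not criterion(initial_pose):
        return False
    # High-priority behavior is confirmed only if the pose persists.
    return criterion(first_subsequent_pose)
```

A pose that appears in only one of the two frames is treated as noise and does not start the second camera.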
However, in the above case, if the behavior of the object lasts a long time, the second image capture device may be started too late, and key information may be missed.
Thus, alternatively, in the second example, once a specific pose of the object is detected, for example, a forward-leaning pose of a human body, the high-resolution second image capture device is immediately triggered to acquire the second video of the object. It is then determined, based on subsequent video frames of the first video, whether a high-priority behavior of the object is detected. If a high-priority behavior is detected, the acquisition of the second video by the second image capture device is maintained. If no high-priority behavior is detected, i.e., in the case of a false start based on the initial pose of the object, the acquisition of the second video by the second image capture device is stopped.
Here, as can be understood by those skilled in the art, in the second example, the process of identifying the high-priority behavior of the object based on the gesture sequence in the video frame is the same as that in the first example, and is not described here again to avoid redundancy.
Therefore, in the video data processing method according to the embodiment of the present application, the method may further include: detecting a first subsequent pose of the object from a first subsequent video frame subsequent to the initial video frame; determining whether a first subsequent pose of the object meets the first predetermined criteria; in response to a first subsequent pose of the object meeting the first predetermined criteria, determining that the behavior of the object is a high priority behavior having a priority greater than a predetermined threshold; and in response to determining that the behavior of the object is the high priority behavior, maintaining acquisition, by the second image capture device, of a second video that includes the object.
In addition, the video data processing method may further include: determining that the behavior of the object is not the high priority behavior in response to a first subsequent pose of the object not meeting the first predetermined criteria; and in response to the behavior of the object not being the high priority behavior, ceasing acquisition of a second video including the object by the second image capture device.
In the video data processing method according to the embodiment of the application, after the second video is triggered to be acquired by the second image acquisition device, the first video may still be acquired by the first image acquisition device. In this way, the pose of the object may be detected from the video frames of the first video, thereby determining the behavior of the object. Since the first video has a low resolution, this can reduce the time and resource consumption required for gesture recognition.
While the second video is being acquired by the second image capture device, a second subsequent pose of the object continues to be detected from a second subsequent video frame after the first subsequent video frame, and it is determined whether the second subsequent pose meets a second predetermined criterion. Here, the first predetermined criterion described above determines whether the pose of the object is a specific pose requiring attention, while the second predetermined criterion determines whether the pose of the object indicates that the high-priority behavior requiring attention has ended. For example, a person stands up again after falling, and the person's pose goes through a process of returning from the forward-leaning pose to a standing pose. Alternatively, a person stops after running, and the pose likewise goes through a process of returning from a running pose to a walking or standing pose. Thus, by determining whether the second subsequent pose of the object meets the second predetermined criterion, it can be determined whether the high-priority behavior requiring attention has ended. And, when the high-priority behavior of the object ends, the acquisition of the second video including the object by the second image capture device is stopped.
That is, in the video data processing method according to the embodiment of the present application, the method may further include: detecting a second subsequent pose of the object from a second subsequent video frame subsequent to the first subsequent video frame; determining whether a second subsequent pose of the object meets a second predetermined criterion; in response to a second subsequent pose of the object meeting the second predetermined criteria, determining that the high priority behavior of the object is over; and in response to the high priority behavior of the object ending, ceasing acquisition of a second video including the object by the second image capture device.
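Taken together, the second example describes a small state machine: trigger on the first matching pose, confirm or cancel on the next frame, and stop when the second criterion signals the end of the behavior. A sketch under these assumptions (class name, pose labels, and criteria are illustrative):

```python
class HighResController:
    """State machine for the second example: trigger high-resolution
    capture on the initial matching pose, then maintain, cancel, or end
    recording based on subsequent poses."""

    def __init__(self, first_criterion, second_criterion):
        self.first = first_criterion    # pose requiring attention
        self.second = second_criterion  # pose marking the end of the behavior
        self.recording = False
        self.confirmed = False

    def on_pose(self, pose):
        """Feed one pose per frame; return whether high-res capture is on."""
        if not self.recording:
            if self.first(pose):        # possible start: trigger immediately
                self.recording = True
        elif not self.confirmed:
            if self.first(pose):        # behavior confirmed: maintain capture
                self.confirmed = True
            else:                       # false start: stop capture
                self.recording = False
        elif self.second(pose):         # behavior ended: stop capture
            self.recording = False
        return self.recording
```

For a fall-and-recover sequence, capture starts on the first forward-leaning frame, is maintained while the lying pose persists, and stops once a standing pose is seen again; a single anomalous frame followed by a normal pose is cancelled as a false start.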
Here, it can be understood by those skilled in the art that the above describes detecting the start and end of a specific behavior of the object based on the first video acquired with the first image capture device. However, once the second image capture device has been triggered to acquire the second video, the start and end of the specific behavior of the object may also be determined based on the second video.
After the first video is acquired by the first image capturing device and the second video is acquired by the second image capturing device, the first video and the second video may be further stored for subsequent video processing, querying, and the like.
Additionally, as described above, if a first subsequent pose detected in a first subsequent video frame of the first video does not meet the first predetermined criterion, then the behavior of the object is not the high-priority behavior. In this case, the second video may be deleted to reduce storage-space occupation, since the behavior of the object stored in the second video needs no special attention.
That is, in the above video data processing method, the method may further include: determining that the behavior of the object is not the high priority behavior in response to a first subsequent pose of the object not meeting the first predetermined criteria; and deleting the second video in response to the behavior of the object not being the high priority behavior.
Fig. 5 illustrates a flow chart of a video data acquisition process according to an embodiment of the application. As shown in fig. 5, in step S310, the low-resolution camera is activated. In step S320, a low-resolution image is acquired with the low-resolution camera. In step S330, pose information of the object is obtained from the low-resolution image. In step S340, based on preset pose and behavior classification standards and priority definitions of behaviors, the behavior category information and priority information of the behavior of the object are obtained through pattern recognition and behavior-level classification. Next, in step S350, it is determined whether the behavior of the object is a high-priority behavior. In step S360, in response to the behavior of the object being a high-priority behavior, the high-resolution camera is activated. In step S370, a high-resolution image is captured with the high-resolution camera. Finally, in step S380, the high-resolution image is stored. In addition, also at step S380, in response to the behavior of the object not being a high-priority behavior, the low-resolution image is stored. The behavior category information and priority information obtained in step S340 may also be stored in step S380, as will be further described later.
That is, in the above video data acquisition process, the high-resolution and low-resolution cameras work together to acquire image information. The low-resolution camera collects and stores image information in real time while performing pose recognition and segment-level behavior recognition on the object; based on preset abnormality criteria (such as lying down, falling down, and the like), it assigns and stores priority information, the pose result, a behavior tag representing the criticality of the video frame (i.e., behavior category information and/or priority information), and the related video segment information of the behavior. In addition, if the abnormality level is high, the high-resolution camera is triggered to collect images, and the tag information of the behavior is stored.
In addition, when storing the first video and the second video, since the second video includes key information about object behaviors that require attention while the first video mostly includes common object behaviors, the second video is stored at a low compression rate and the first video at a high compression rate, for better storage-space utilization.
That is, in video storage, when the capacity of the storage device is limited, differentiated compression can be performed according to the key information of the acquired video. Video frames with low abnormality levels are stored at a high compression rate, and video frames with high abnormality levels at a low compression rate. In this way, storing the compressed video frees capacity on the storage device.
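A sketch of this differentiated-compression rule follows; the 0.9 and 0.1 ratios and the priority threshold are illustrative assumptions, not values from the application.

```python
def compressed_size(raw_size, priority, high_priority_threshold=5):
    """Key (high-priority) video is encoded losslessly or at a low compression
    rate; ordinary video at a high compression rate. The 0.9 / 0.1 ratios are
    hypothetical placeholders for the two encoder settings."""
    ratio = 0.9 if priority > high_priority_threshold else 0.1
    return int(raw_size * ratio)
```

With these placeholder settings, a high-priority segment keeps most of its bytes while an ordinary segment shrinks to a tenth of its raw size.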
And, if the storage capacity of the storage device is below a predetermined threshold, the second video may be preferentially retained while at least a portion of the first video is deleted.
As described above, in the process of storing the first video and the second video, in addition to the behavior category information and the priority information of the object behaviors of the first video and the second video, the time information of the first video and the second video may be stored. In this way, when deleting at least a part of the first video, one or more video frames of the first video that are earlier in time, i.e., one or more video frames having time information representing an earlier time, may be deleted based on the time information of the first video.
That is, when, even with high-compression-rate encoding, the storage capacity of the storage device still cannot meet the requirement of continuous recording, stored content must be deleted to free space. The image frames and video segments are ordered by recording time and priority; video frame information with high priority is retained, and video frame information with low priority and an early recording time is deleted, thereby freeing storage capacity on the storage device.
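Under the assumption that each stored frame is indexed as a `(timestamp, priority, size_bytes)` tuple, this prune-by-time-and-priority rule might be sketched as:

```python
def prune(frames, bytes_to_free, high_priority_threshold=5):
    """Free storage by deleting the earliest low-priority frames first.
    High-priority frames are always retained; deletion stops once enough
    bytes have been freed. Returns (kept_frames, bytes_freed)."""
    freed = 0
    kept = []
    for ts, prio, size in sorted(frames):  # chronological order: earliest first
        if prio <= high_priority_threshold and freed < bytes_to_free:
            freed += size  # drop this early, low-priority frame
        else:
            kept.append((ts, prio, size))
    return kept, freed
```

The priority threshold is the same hypothetical cut-off used throughout these sketches to separate high-priority from ordinary behaviors.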
Fig. 6 illustrates a flow chart of a video data storage process according to an embodiment of the present application. As shown in fig. 6, first, in step S410, the priority information of the video and the behavior is acquired. Then, in step S420, it is determined whether the behavior of the object is a high-priority behavior. If it is a high-priority behavior, lossless or low-compression-rate encoding is performed; if it is not, lossy or high-compression-rate encoding is performed, and the encoded information is obtained in step S430. Next, in step S440, the currently available storage space of the storage device is acquired. Then, in step S450, it is determined whether the storage capacity is insufficient. In step S460, if the storage capacity is determined to be insufficient, the video frames are ordered according to the time information of the video. Finally, in step S470, the currently earliest low-priority video frame is deleted while the high-priority video frames are retained, and the process returns to step S440 to acquire the currently available storage space of the storage device.
Here, it may be understood by those skilled in the art that, in the video data processing method according to the embodiment of the present application, it is not necessary to store time information of the first video and the second video, or to store behavior category information and priority information of behaviors of the objects in the first video and the second video. If the above information does not need to be applied in the subsequent video data processing, the above information does not need to be stored. For example, if the storage capacity of the storage device is large and there is no need to delete video frames that are earlier in time, then there is no need to store the time information of the video.
Further, in the case where the behavior category information and the priority information of the behaviors of the first video and the second video are stored, a behavior tag of the behavior of the object may be generated based on the information, thereby identifying the second video in which a specific behavior of the object is recorded. And, based on the behavior tag of the behavior of the object, thumbnail information of the second video, for example, a thumbnail of the second video, may be further generated.
Specifically, based on the behavior category information of the behavior of the object and the priority information, at least one of an initial video frame or a first subsequent video frame in the first video in which a specific gesture of the object is recorded may be selected as the thumbnail information of the second video.
Therefore, in the video data processing method according to the embodiment of the application, in the process of viewing the video, the representative frame or the representative segment can be generated according to the key tag and the posture and behavior information which are stored while the video is stored, so as to facilitate quick browsing.
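One way to realize the representative-frame selection described above might look like the following sketch, where `frames` is an assumed chronological list of `(frame_id, pose)` pairs from the first video and `tag` is an assumed behavior-tag dict with a `"gesture"` key; neither structure is prescribed by the application.

```python
def thumbnail_for(frames, tag):
    """Return the id of the first frame in which the tagged gesture appears,
    i.e. the initial (or first subsequent) video frame recording the specific
    gesture, to serve as the thumbnail of the second video."""
    for frame_id, pose in frames:
        if pose == tag["gesture"]:
            return frame_id
    return None  # no frame records the tagged gesture
```

Because the tag is stored together with the video, the thumbnail can be produced without re-running pose detection on the footage.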
Also, when storing the video, the video may be segmented based on at least one of time information, behavior category information, and priority information of the video. For example, a video is divided into a plurality of video segments in chronological order, wherein individual video segments represent behavior of a certain category or behavior with different priorities. For example, during a person's fall, a video representing the person's normal walking behavior before the fall, a video of the person's fall process (including the process of the person climbing up from the ground), and a video of the person's normal walking continuing after the person rises are stored in accordance with the time information, the behavior category information, and the priority information, respectively.
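The segmentation rule in the fall example above — normal walking, then the fall process, then walking again, stored as separate segments — can be sketched as follows, assuming frames are indexed as chronological `(timestamp, behavior)` pairs.

```python
def segment(frames):
    """Group consecutive frames with the same behavior category into segments.
    frames: chronological list of (timestamp, behavior) pairs.
    Returns a list of {"behavior", "start", "end"} dicts."""
    segments = []
    for ts, behavior in frames:
        if segments and segments[-1]["behavior"] == behavior:
            segments[-1]["end"] = ts  # extend the current segment
        else:
            segments.append({"behavior": behavior, "start": ts, "end": ts})
    return segments
```

A walk / fall / walk sequence thus yields three segments, each carrying its own behavior category for later retrieval.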
Therefore, in subsequent video retrieval, according to the key label, the posture and the behavior information when the video is stored, the quick retrieval can be carried out based on the posture and the behavior of the object. Specifically, a video representing a specific behavior of a subject, e.g., a fall of a person, may be retrieved based on behavior category information of the behavior of the subject. Furthermore, videos including object behaviors that require special attention may also be retrieved based on high priority information of the object's behaviors. Therefore, the manual labor consumed in video retrieval can be reduced.
That is, in the video data processing method according to the embodiment of the present application, it may further include: retrieving the stored first video and the second video based on at least one of the temporal information, the behavior category information, and the priority information.
For example, the video content or thumbnail images or thumbnail videos to be retrieved may be viewed by time or time period, by the kind of gesture, or by the priority of the behavior.
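Retrieval over the three criteria can be sketched as a simple filter over a stored index; the dict keys `time`, `behavior`, and `priority` are assumed names for the stored time information, behavior category information, and priority information.

```python
def search(index, time_range=None, behavior=None, min_priority=None):
    """Filter stored video records by any combination of time period,
    behavior category, and minimum priority, so that, e.g., a fall can be
    found from its behavior tag without manually scanning the footage."""
    results = index
    if time_range is not None:
        lo, hi = time_range
        results = [r for r in results if lo <= r["time"] <= hi]
    if behavior is not None:
        results = [r for r in results if r["behavior"] == behavior]
    if min_priority is not None:
        results = [r for r in results if r["priority"] >= min_priority]
    return results
```

Each criterion is optional, matching the "at least one of" wording of the retrieval step.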
Therefore, according to the video data processing method of the embodiment of the application, key video frames or video clips can be detected based on gesture recognition and behavior recognition of an object, such as a human body, and the video can be differentially acquired, compressed, and stored, storage capacity can be released, and quick viewing and retrieval can be performed, all based on the criticality of the video frames or video clips. In this way, the effective storage capacity of the storage device is increased and its storage burden is reduced. In addition, the stored video clips are easy to query and retrieve, which improves the user experience.
Exemplary devices
Fig. 7 illustrates a block diagram of a video data processing apparatus according to an embodiment of the present application.
As shown in fig. 7, the video data processing apparatus 500 according to the embodiment of the present application includes: an obtaining unit 510 for obtaining a first video comprising an object by a first image capturing device, the first image capturing device having a first resolution; a detecting unit 520, configured to detect an initial pose of the object from an initial video frame of the first video acquired by the acquiring unit 510; a determining unit 530 for determining whether the initial posture of the object detected by the detecting unit 520 meets a first predetermined criterion; and a control unit 540, configured to trigger acquisition of a second video comprising the object by a second image capturing device in response to the determination unit 530 determining that the initial pose of the object meets the first predetermined criterion, the second image capturing device having a second resolution, the second resolution being higher than the first resolution.
In one example, in the above video data processing apparatus 500, the detecting unit 520 is further configured to detect a first subsequent pose of the object from a first subsequent video frame after the initial video frame in response to the initial pose of the object meeting the first predetermined criterion; the determining unit 530 is further configured to determine whether a first subsequent pose of the object meets the first predetermined criterion, and in response to the first subsequent pose of the object meeting the first predetermined criterion, determine that the behavior of the object is a high priority behavior with a priority greater than a predetermined threshold; and the control unit 540 is configured to trigger acquisition of a second video including an object by the second image capturing device in response to determining that the behavior of the object is the high priority behavior.
In one example, in the above video data processing apparatus 500, the detecting unit 520 is further configured to detect a first subsequent pose of the object from a first subsequent video frame after the initial video frame; the determining unit 530 is further configured to determine whether a first subsequent pose of the object meets the first predetermined criterion, and in response to the first subsequent pose of the object meeting the first predetermined criterion, determine that the behavior of the object is a high priority behavior with a priority greater than a predetermined threshold; and the control unit 540 is configured to keep acquiring, by the second image capturing device, a second video including the object in response to determining that the behavior of the object is the high priority behavior.
In one example, in the above video data processing apparatus 500, the determining unit 530 is further configured to determine that the behavior of the object is not the high priority behavior in response to a first subsequent gesture of the object not meeting the first predetermined criterion; and the control unit 540 is configured to stop acquiring the second video including the object by the second image capturing device in response to the behavior of the object not being the high priority behavior.
In one example, in the video data processing apparatus 500, the determining unit 530 is configured to: identifying a behavior of the object based on an initial pose of the initial video frame and a first subsequent pose of the first subsequent video frame; and determining that the behavior of the object is the high-priority behavior based on the behavior category information of the behavior of the object.
In one example, in the video data processing apparatus 500, the control unit 540 is further configured to keep acquiring the first video through the first image capturing device after triggering acquisition of the second video through the second image capturing device.
In one example, in the above video data processing apparatus 500, the detecting unit 520 is further configured to detect a second subsequent pose of the object from a second subsequent video frame subsequent to the first subsequent video frame; the determining unit 530 is further configured to determine whether a second subsequent pose of the object meets a second predetermined criterion, and in response to the second subsequent pose of the object meeting the second predetermined criterion, determine that the high priority behavior of the object is ended; and the control unit 540 is further configured to stop acquiring the second video comprising the object by the second image capturing device in response to the high priority behavior of the object ending.
In one example, in the video data processing apparatus 500 described above, a storage unit is further included for storing the first video and the second video.
In one example, in the above video data processing apparatus 500, the determining unit 530 is further configured to determine that the behavior of the object is not the high priority behavior in response to a first subsequent gesture of the object not meeting the first predetermined criterion; and the control unit 540 is configured to delete the second video in response to the behavior of the object not being the high priority behavior.
In one example, in the above-described video data processing apparatus 500, the storage unit is configured to store the first video at a high compression rate; and storing the second video at a low compression rate.
In one example, in the video data processing apparatus 500, the storage unit is further configured to store at least one of: time information of the first video and the second video; and behavior category information and priority information of behaviors of the object in the first video and the second video.
In one example, in the above-described video data processing apparatus 500, a thumbnail information generating unit is further included for generating thumbnail information of the second video based on the behavior category information and the priority information.
In one example, in the above-described video data processing apparatus 500, the thumbnail information generation unit is configured to: selecting at least one of an initial video frame and a first subsequent video frame of the first video as thumbnail information of the second video based on the behavior category information and the priority information.
In one example, in the above video data processing apparatus 500, the storage unit is further configured to: determining whether the storage capacity is below a predetermined threshold; and deleting at least a portion of the first video in response to the storage capacity being below a predetermined threshold.
In one example, in the above video data processing apparatus 500, at least a portion of the first video is one or more video frames of the first video having temporal information representing an earlier time.
In one example, in the above video data processing apparatus 500, a retrieving unit is further included for retrieving the stored first video and the second video based on at least one of the time information, the behavior category information, and the priority information.
The specific functions and operations of the respective units and modules in the video data processing apparatus 500 described above have been described in detail in the video data processing method described above with reference to fig. 1 to 6, and thus, a repetitive description thereof will be omitted.
As described above, the video data processing apparatus 500 according to the embodiment of the present application may be implemented in a video data processing device, which may be either or both of the first image capturing device 110 and the second image capturing device 120 shown in fig. 1, or a stand-alone device independent therefrom.
In one example, the video data processing apparatus 500 according to the embodiment of the present application may be integrated into the video data processing device as a software module and/or a hardware module. For example, the video data processing apparatus 500 may be a software module in an operating system of the video data processing device, or may be an application developed for the video data processing device; of course, the video data processing apparatus 500 may also be one of many hardware modules of the video data processing device.
Alternatively, in another example, the video data processing apparatus 500 and the video data processing device may be separate devices, and the video data processing apparatus 500 may be connected to the video data processing device through a wired and/or wireless network and transmit interactive information in an agreed data format.
Exemplary electronic device
Next, an electronic apparatus according to an embodiment of the present application is described with reference to fig. 8. The electronic device may be integrated with either or both of the first image capturing device 110 and the second image capturing device 120, or a stand-alone device independent thereof, which may communicate with the first image capturing device and the second image capturing device to receive the captured input signals therefrom.
FIG. 8 illustrates a block diagram of an electronic device in accordance with an embodiment of the present application.
As shown in fig. 8, the electronic device 10 includes one or more processors 11 and memory 12.
The processor 11 may be a Central Processing Unit (CPU) or another form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 10 to perform desired functions.
The memory 12 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, Read-Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 11 to implement the video data processing methods of the various embodiments of the present application described above and/or other desired functions. Various contents such as the first video, the second video, behavior category information of the behavior of the object, and priority information may also be stored in the computer-readable storage medium.
In one example, the electronic device 10 may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other form of connection mechanism (not shown).
For example, when the electronic device is integrated with the first image capturing apparatus 110 and the second image capturing apparatus 120, the input device 13 may be the first image capturing apparatus 110 and the second image capturing apparatus 120, such as a camera, for capturing a video including an object. When the electronic device is a stand-alone device, the input means 13 may be a communication network connector for receiving the captured input signals from the first image capturing device 110 and the second image capturing device 120.
The input device 13 may also include, for example, a keyboard, a mouse, and the like.
The output device 14 can output various information including video data and information of the behavior of an object, etc. to the outside. The output devices 14 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.
Of course, for simplicity, only some of the components of the electronic device 10 relevant to the present application are shown in fig. 8, and components such as buses, input/output interfaces, and the like are omitted. In addition, the electronic device 10 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the video data processing method according to various embodiments of the present application described in the "exemplary methods" section of this specification, supra.
The computer program product may be written with program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the video data processing method according to various embodiments of the present application described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.
The block diagrams of devices, apparatuses, and systems referred to in this application are given only as illustrative examples and are not intended to require or imply that connections, arrangements, or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, or configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The word "or" as used herein means, and is used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The word "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (15)

1. A video data processing method, comprising:
acquiring, by a first image capture device, a first video including an object, the first image capture device having a first resolution;
detecting an initial pose of the object from an initial video frame of the first video;
determining whether the initial pose of the object meets a first predetermined criterion;
triggering acquisition of a second video including the object by a second image acquisition device in response to the initial pose of the object meeting the first predetermined criteria, the second image acquisition device having a second resolution, the second resolution being higher than the first resolution; and
after the second video is triggered to be acquired through the second image acquisition device, the first video is kept acquired through the first image acquisition device;
further comprising:
detecting a first subsequent pose of the object from a first subsequent video frame subsequent to the initial video frame;
determining whether the first subsequent pose of the object meets the first predetermined criteria;
in response to the first subsequent pose of the object meeting the first predetermined criteria, determining that the behavior of the object is a high priority behavior having a priority greater than a predetermined threshold; and
in response to determining that the behavior of the object is the high priority behavior, maintaining acquisition, by the second image capture device, of the second video including the object.
2. The video data processing method of claim 1, further comprising:
determining that the behavior of the object is not the high priority behavior in response to the first subsequent pose of the object not meeting the first predetermined criteria; and
in response to the behavior of the object not being the high priority behavior, ceasing acquisition of the second video including the object by the second image capture device.
3. The video data processing method of claim 1, wherein determining that the behavior of the object is a high priority behavior with a priority greater than a predetermined threshold comprises:
identifying a behavior of the object based on an initial pose of the initial video frame and the first subsequent pose of the first subsequent video frame; and
determining that the behavior of the object is the high-priority behavior based on behavior category information of the behavior of the object.
4. The video data processing method of claim 1, further comprising:
detecting a second subsequent pose of the object from a second subsequent video frame subsequent to the first subsequent video frame;
determining whether a second subsequent pose of the object meets a second predetermined criterion; in response to a second subsequent pose of the object meeting the second predetermined criteria, determining that the high priority behavior of the object is over; and
in response to the high priority behavior of the object ending, ceasing acquisition of the second video including the object by the second image capture device.
5. The video data processing method of claim 1, further comprising:
storing the first video and the second video.
6. The video data processing method of claim 5, further comprising:
determining that the behavior of the object is not the high priority behavior in response to the first subsequent pose of the object not meeting the first predetermined criteria; and
deleting the second video in response to the behavior of the object not being the high priority behavior.
7. The video data processing method of claim 5, wherein storing the first video and the second video comprises:
storing the first video at a high compression rate; and
storing the second video at a low compression rate.
8. The video data processing method of claim 5, wherein storing the first video and the second video comprises storing at least one of:
time information of the first video and the second video; and
behavior category information and priority information of behaviors of the object in the first video and the second video.
9. The video data processing method of claim 8, further comprising:
generating thumbnail information of the second video based on the behavior category information and the priority information.
10. The video data processing method of claim 9, wherein generating thumbnail information for the second video based on the behavior category information and the priority information comprises:
selecting at least one of an initial video frame and a first subsequent video frame of the first video as thumbnail information of the second video based on the behavior category information and the priority information.
11. The video data processing method of claim 8, further comprising:
determining whether the storage capacity is below a predetermined threshold;
deleting at least a portion of the first video in response to the storage capacity being below a predetermined threshold.
12. The video data processing method of claim 11, wherein the at least a portion of the first video is one or more video frames of the first video having temporal information representing earlier times.
13. The video data processing method of claim 8, further comprising:
retrieving the stored first video and the second video based on at least one of the temporal information, the behavior category information, and the priority information.
14. A video data processing apparatus comprising:
an acquisition unit configured to acquire a first video including an object by a first image capturing apparatus having a first resolution;
a detection unit configured to detect an initial pose of the object from an initial video frame of the first video;
a determination unit for determining whether the initial pose of the object meets a first predetermined criterion; and
a control unit for triggering acquisition of a second video comprising an object by a second image acquisition device in response to the initial pose of the object meeting the first predetermined criterion, the second image acquisition device having a second resolution, the second resolution being higher than the first resolution,
after the control unit triggers acquisition of the second video through the second image acquisition device, the acquisition unit keeps acquiring the first video through the first image acquisition device; and
The control unit detects a first subsequent pose of the object from a first subsequent video frame subsequent to the initial video frame; determining whether the first subsequent pose of the object meets the first predetermined criteria; in response to the first subsequent pose of the object meeting the first predetermined criteria, determining that the behavior of the object is a high priority behavior having a priority greater than a predetermined threshold; and in response to determining that the behavior of the object is the high priority behavior, maintaining acquisition, by the second image capture device, of the second video including the object.
15. An electronic device, comprising:
a processor; and
memory having stored therein computer program instructions which, when executed by the processor, cause the processor to carry out the video data processing method of any of claims 1-13.
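[Editor's illustration] The trigger-and-hold behavior recited in claims 14–15 — analyze the low-resolution stream, start high-resolution capture when a pose meets the predetermined criterion, and keep it running once a subsequent matching pose marks the behavior as high priority — can be sketched as a small state machine. This is not the patent's implementation; the class name, the use of a numeric pose score as a stand-in for the "first predetermined criterion," and all threshold values are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class DualCameraController:
    """State machine for the dual-resolution capture scheme: a pose in the
    low-resolution stream that meets the criterion triggers high-resolution
    capture, and a matching pose in a subsequent frame upgrades the behavior
    to high priority, which keeps the high-resolution capture running."""
    pose_threshold: float      # stand-in for the "first predetermined criterion"
    priority_threshold: float  # the "predetermined threshold" for priority
    high_res_active: bool = False
    behavior_priority: float = 0.0

    def on_low_res_frame(self, pose_score: float) -> bool:
        """Consume one low-resolution frame's pose score; return whether the
        second (high-resolution) camera should currently be recording."""
        meets_criterion = pose_score >= self.pose_threshold
        if not self.high_res_active:
            # Initial pose meeting the criterion triggers the second camera.
            if meets_criterion:
                self.high_res_active = True
        elif meets_criterion:
            # A subsequent matching pose marks the behavior as high priority.
            self.behavior_priority = 1.0
        elif self.behavior_priority <= self.priority_threshold:
            # No matching pose and the behavior is low priority: stop capture.
            self.high_res_active = False
        return self.high_res_active
```

Note that once `behavior_priority` exceeds the threshold, a non-matching subsequent pose no longer stops the high-resolution capture, mirroring the "maintaining acquisition" limitation of claim 14.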
CN201810067998.5A 2018-01-24 2018-01-24 Video data processing method and device and electronic equipment Active CN108289201B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810067998.5A CN108289201B (en) 2018-01-24 2018-01-24 Video data processing method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN108289201A CN108289201A (en) 2018-07-17
CN108289201B CN108289201B (en) 2020-12-25

Family

ID=62835601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810067998.5A Active CN108289201B (en) 2018-01-24 2018-01-24 Video data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN108289201B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP7283037B2 (en) * 2018-07-26 2023-05-30 ソニーグループ株式会社 Information processing device, information processing method, and program
WO2020034083A1 (en) 2018-08-14 2020-02-20 Huawei Technologies Co., Ltd. Image processing apparatus and method for feature extraction
CN109259726A (en) * 2018-08-19 2019-01-25 天津大学 A kind of automatic grasp shoot method of tongue picture based on bidirectional imaging
WO2020172809A1 (en) * 2019-02-27 2020-09-03 深圳市大疆创新科技有限公司 Method for video playback by photography apparatus and photography apparatus
CN110826506A (en) * 2019-11-11 2020-02-21 上海秒针网络科技有限公司 Target behavior identification method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN202026403U (en) * 2011-01-24 2011-11-02 南京壹进制信息技术有限公司 Monitoring device capable of dynamically regulating resolution based on sensitive information
CN104394389A (en) * 2014-12-15 2015-03-04 成都鼎智汇科技有限公司 Intelligent distribution type video monitoring system
CN106303390A (en) * 2015-06-02 2017-01-04 联想(北京)有限公司 Image-pickup method and device thereof, image transfer method and device thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101593268B (en) * 2008-05-30 2012-03-07 深圳市普罗巴克科技股份有限公司 Method and system for multi-resolution fingerprint identification
CN102880854B (en) * 2012-08-16 2015-02-18 北京理工大学 Distributed processing and Hash mapping-based outdoor massive object identification method and system
TWI521473B (en) * 2014-03-19 2016-02-11 晶睿通訊股份有限公司 Device, method for image analysis and computer-readable medium
CN104598897B (en) * 2015-02-12 2018-06-12 杭州摩图科技有限公司 Visual sensor, image processing method and device, visual interactive equipment
CN105227929A (en) * 2015-10-16 2016-01-06 中国民航科学技术研究院 A kind of safety monitoring device for airport
CN107516074B (en) * 2017-08-01 2020-07-24 广州杰赛科技股份有限公司 Authentication identification method and system

Also Published As

Publication number Publication date
CN108289201A (en) 2018-07-17

Similar Documents

Publication Publication Date Title
CN108289201B (en) Video data processing method and device and electronic equipment
US9665777B2 (en) System and method for object and event identification using multiple cameras
JP5227911B2 (en) Surveillance video retrieval device and surveillance system
US9560323B2 (en) Method and system for metadata extraction from master-slave cameras tracking system
CN110089104B (en) Event storage device, event search device, and event alarm device
US8965061B2 (en) Person retrieval apparatus
CN111860140B (en) Target event detection method, device, computer equipment and storage medium
JP5500303B1 (en) MONITORING SYSTEM, MONITORING METHOD, MONITORING PROGRAM, AND RECORDING MEDIUM CONTAINING THE PROGRAM
US8854474B2 (en) System and method for quick object verification
KR20080058171A (en) Camera tampering detection
KR20160010338A (en) A method of video analysis
US20120114177A1 (en) Image processing system, image capture apparatus, image processing apparatus, control method therefor, and program
WO2022041484A1 (en) Human body fall detection method, apparatus and device, and storage medium
CN105681749A (en) Method, device and system for previewing videos and computer readable media
JP4959592B2 (en) Network video monitoring system and monitoring device
CN110956648A (en) Video image processing method, device, equipment and storage medium
WO2022228325A1 (en) Behavior detection method, electronic device, and computer readable storage medium
JP6203188B2 (en) Similar image search device
CN116264617A (en) Image transmission method and image display method
CN112419639A (en) Video information acquisition method and device
CN112489087A (en) Method for detecting shaking of suspension type operation platform for high-rise building construction
EP3035238A1 (en) Video surveillance system and method for fraud detection
CN112419638B (en) Method and device for acquiring alarm video
JP7202995B2 (en) Spatio-temporal event prediction device, spatio-temporal event prediction method, and spatio-temporal event prediction system
CN113673318B (en) Motion detection method, motion detection device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant