WO2020215552A1 - Multi-target tracking method, apparatus, computer device, and storage medium - Google Patents
- Publication number
- WO2020215552A1 (PCT/CN2019/102318)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- area
- preset
- image
- detected
- head
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- This application relates to the technical field of target tracking, in particular to a multi-target tracking method, device, computer equipment and storage medium.
- the intelligent video analysis monitoring system can automatically identify different objects, find abnormal situations in the monitoring screen, and can issue alarms and provide useful information in the fastest and best way, so as to more effectively assist security personnel in dealing with crises.
- Target detection is a basic function of video analysis technology, which is of great significance to the realization of follow-up target tracking, target recognition and behavior analysis applications, especially in the field of real-time target event monitoring, its importance is self-evident.
- As a non-rigid body, the human body has various morphological changes and is prone to occlusion, and video scenes change in complex and diverse ways, which makes effective detection and tracking of pedestrians in video very difficult.
- In practical application scenarios, there are problems such as varied pedestrian poses, occlusion of the human body, sudden light changes, and disturbance of the background environment, so how to track targets quickly and accurately in a video with a complex background, especially when there are occlusions between multiple targets, remains an important and difficult point in the field of video image processing technology.
- the first aspect of the present application provides a multi-target tracking method, the method includes:
- the method further includes:
- the pedestrian corresponding to the physical area is determined as the target tracking object
- the pedestrian corresponding to the head area is determined as the target tracking object
- said segmenting the occluded pedestrians according to the head area and the body area includes:
- the occluded pedestrians are segmented according to the enlarged body area.
- the method further includes:
- the central axis of the two head regions is used as the dividing line, and the key points of the shoulders are used as the boundary to divide the blocked pedestrians.
- the method of parallel processing is used to simultaneously call the preset first detection model to detect the head area in the image to be detected and call the preset second detection model to detect the body area in the image to be detected.
- the calling the preset first detection model to detect the head region in the image to be detected includes:
- the head region corresponding to each human body in the image to be detected is determined according to the multiple human body nodes of each human body.
- the calculating the area ratio based on the head area and the body area includes:
- the area ratio is calculated based on the first area and the second area.
- a second aspect of the present application provides a multi-target tracking device, the device including:
- the acquisition module is used to acquire an image to be detected including multiple targets;
- the detection module is configured to call a preset first detection model to detect the head region in the image to be detected;
- the detection module is further configured to call a preset second detection model to detect the shape area in the image to be detected;
- a calculation module configured to calculate an area ratio based on the head area and the body area
- a judging module for judging whether there is an area ratio smaller than a preset first threshold in the area ratio, wherein the preset first threshold is less than 1;
- the segmentation module is used to determine that a pedestrian in the image to be detected is occluded when there is an area ratio smaller than the preset first threshold, and to segment the occluded pedestrian according to the head area and the body area;
- the tracking module is used to call a preset tracking algorithm to track the segmented pedestrians that are blocked and those that are not blocked.
- a third aspect of the present application provides a computer device that includes a processor, and the processor is configured to implement the multi-target tracking method when executing computer-readable instructions stored in a memory.
- a fourth aspect of the present application provides a non-volatile readable storage medium having computer readable instructions stored thereon, and when the computer readable instructions are executed by a processor, the multi-target tracking method is implemented.
- the multi-target tracking method, device, computer equipment, and storage medium described in this application first acquire an image to be detected that contains multiple targets with occlusion, and respectively call a preset first detection model and a preset second detection model to detect the head area and the body area in the image to be detected.
- an area ratio of the head area to the body area is then calculated; when there is an area ratio smaller than the preset first threshold, it is determined that a pedestrian in the image to be detected is occluded, the occluded pedestrian is segmented according to the head area and the body area, and finally the preset tracking algorithm is called to track the segmented occluded pedestrian and the unoccluded pedestrians.
- This application uses the area ratio to measure how much a pedestrian is occluded, so that occluded pedestrians can be detected; in addition, the target tracking object is determined by combining the head area and the body area, which reduces missed detections or false detections caused by the pedestrian's body being occluded and improves the effect of target tracking. It can therefore be applied in scenes with complex backgrounds, and can track targets quickly and accurately, especially when there are occlusions between multiple targets, which has high practical value.
- FIG. 1 is a flowchart of a multi-target tracking method provided in Embodiment 1 of the present application.
- FIG. 2 is a structural diagram of a multi-target tracking device provided in Embodiment 2 of the present application.
- FIG. 3 is a schematic diagram of the structure of a computer device provided in Embodiment 3 of the present application.
- FIG. 1 is a flowchart of a multi-target tracking method provided in Embodiment 1 of the present application.
- the multi-target tracking method can be applied to a computer device.
- the multi-target tracking function provided by the method of this application can be directly integrated on the computer device, or it can run on the computer device in the form of a Software Development Kit (SDK).
- the multi-target tracking method specifically includes the following steps. According to different requirements, the order of the steps in the flowchart can be changed, and some of the steps can be omitted.
- the image to be detected may be any suitable image that requires target tracking, for example, an image collected for a monitored area.
- the image to be detected may be a static image collected by an image collection device such as a camera, or any video frame in a video collected by an image collection device such as a camera.
- the image to be detected may be an original image or an image obtained after preprocessing the original image.
- the image to be detected contains multiple pedestrians, and the body parts of the multiple pedestrians may overlap significantly. That is, the target tracking object is determined even when the body parts of multiple pedestrians overlap heavily, so as to prevent misdetection or missed detection caused by a pedestrian being blocked by other pedestrians.
- the first detection model can be trained in advance, and by directly calling the pre-trained first detection model, multiple human body nodes of each human body in the image to be detected can be detected directly and quickly.
- the preset first detection model may be various detection models based on deep learning, for example, a detection model based on a neural network, or a detection model based on a residual network.
- the method further includes:
- the first detection model is trained in advance, wherein the training process of the first detection model includes:
- tools such as OpenPose or PoseMachine are used to label multiple human body nodes in the head region of the human body pictures, for example, the left-eye node, right-eye node, left-ear node, and right-ear node.
- Extract a first preset ratio of human body pictures as the sample picture set to be trained (referred to as the training set), and extract the second preset ratio of human body pictures as the sample picture set to be verified (referred to as the verification set for short).
- the number of human body pictures in the training set is much larger than the number of human body pictures in the verification set; for example, 80% of the human body pictures are used as the training set, and the remaining 20% are used as the verification set.
- the parameters of the neural network initially adopt the default parameters, and the parameters are then adjusted continuously during the training process.
- the generated first detection model is verified using the human body pictures in the verification set. If the verification pass rate is greater than or equal to a preset threshold, for example, the pass rate is greater than or equal to 98%, the training ends.
- the first detection model obtained by this training is used to identify the human body node. If the verification pass rate is less than the preset threshold, for example, less than 98%, the number of human body pictures participating in the training is increased, and the above steps are performed again until the verification pass rate is greater than or equal to the preset threshold.
- the first detection model obtained by training is used to identify the human body nodes in the human body pictures in the verification set, and the recognition result is compared with the labeled human body nodes of those pictures to evaluate the recognition effect of the trained first detection model.
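- As a rough illustration of the training loop described above (an assumption-laden sketch, not the application's exact procedure), the 80/20 split and the 98% pass-rate criterion could be wired up as follows; `train_round`, `pass_rate` and `get_more_pictures` are hypothetical callables supplied by the caller:

```python
# Minimal sketch of the pre-training loop for the first detection model.
import random

def train_first_detection_model(pictures, train_round, pass_rate, get_more_pictures,
                                target_pass_rate=0.98, train_fraction=0.8):
    """pictures: labelled human-body pictures; train_round builds/updates the model,
    pass_rate evaluates it, get_more_pictures adds training data (all hypothetical)."""
    random.shuffle(pictures)
    split = int(train_fraction * len(pictures))       # e.g. 80% as the training set
    train_set, val_set = pictures[:split], pictures[split:]

    model = None
    while True:
        model = train_round(model, train_set)         # parameters tuned during training
        if pass_rate(model, val_set) >= target_pass_rate:
            return model                              # verification pass rate >= 98%
        train_set = train_set + get_more_pictures()   # otherwise add pictures, retrain
```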
- the calling the preset first detection model to detect the head region in the image to be detected includes:
- multiple human body nodes of each human body in the image to be detected are detected through the preset first detection model, for example, a neural network model.
- the human body node may be an important position of the human body such as the joint points of the human body and the facial features.
- the multiple human body nodes include at least multiple nodes of the head and the neck.
- the multiple human body nodes include: one or more of a neck node, a nose tip node, a left eye node, a right eye node, a left ear node, and a right ear node.
- the multiple human body nodes determined by the preset first detection model further include at least a wrist node, an elbow node, and a shoulder node.
- Each human body node represents the human body area including the node, for example, the left eye node represents the entire left eye area of the human body, rather than just a specific pixel.
- the head area is an area determined according to multiple nodes of the head and the neck to characterize the human head.
- the head area of the human body is determined according to the neck node, nose tip node, left eye node, right eye node, left ear node, and right ear node.
- the determined shape of the head region can be rectangular, circular, oval, or any other regular or irregular shapes. This application does not specifically limit the shape of the determined head region.
- the process of pre-training the first detection model may be an offline training process.
- the process of calling the first detection model to detect the head region in the image to be detected may be an online detection process. That is, the image to be detected is used as the input of the first detection model, and the output is the human node information in the image to be detected, for example, the top of the head, eyes, mouth, chin, ears, neck, etc.
- According to the multiple human body nodes, the human head is framed with a geometric figure, such as a rectangular frame, and this rectangular frame is called the head frame.
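- A minimal sketch of how a rectangular head frame might be derived from such keypoints; the keypoint names and the padding factor are illustrative assumptions, not values specified by the application:

```python
# Build a rectangular "head frame" from head/neck keypoints returned by a pose model.
def head_frame(keypoints):
    """keypoints: dict mapping names such as 'nose', 'left_eye', 'right_eye',
    'left_ear', 'right_ear', 'neck' to (x, y) pixel coordinates."""
    names = ("nose", "left_eye", "right_eye", "left_ear", "right_ear", "neck")
    pts = [keypoints[n] for n in names if n in keypoints]
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    # Pad the tight keypoint box a little so the frame covers the whole head.
    pad_x = 0.3 * (max(xs) - min(xs) + 1)
    pad_y = 0.3 * (max(ys) - min(ys) + 1)
    return (min(xs) - pad_x, min(ys) - pad_y, max(xs) + pad_x, max(ys) + pad_y)
```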
- the preset second detection model is called to detect the shape area in the image to be detected.
- the preset second detection model can be implemented using an accelerated version of the region-based convolutional neural network (Faster-RCNN).
- the preset second detection model is pre-trained using a large number of human images.
- the preset second detection model may be trained before acquiring the image to be detected including multiple targets.
- the process of training the second detection model in advance is similar to the process of training the first detection model in advance, and will not be repeated here.
- the shape area in the image to be detected is recognized by inputting the image to be detected into the second detection model.
- the process of pre-training the second detection model may be an offline training process.
- the process of calling the preset second detection model to detect the body area in the image to be detected may be an online detection process. That is, the image to be detected is used as the input of the second detection model, and the output is the human body information in the image to be detected. According to the human body information, the human body's shape area is framed with a rectangular frame, which is called a pedestrian frame.
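- As one possible realization, an off-the-shelf Faster R-CNN from a recent torchvision release can produce such pedestrian frames; the application does not prescribe this library, and the score threshold below is an arbitrary example:

```python
# Obtain pedestrian frames with a pre-trained Faster R-CNN (COCO label 1 = "person").
import torch, torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def pedestrian_frames(image_tensor, score_threshold=0.5):
    """image_tensor: float tensor of shape (3, H, W) scaled to [0, 1]."""
    with torch.no_grad():
        out = model([image_tensor])[0]
    keep = (out["labels"] == 1) & (out["scores"] >= score_threshold)
    return out["boxes"][keep].tolist()   # [x1, y1, x2, y2] pedestrian frames
```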
- the method of parallel processing is used to simultaneously call the preset first detection model to detect the head area in the image to be detected and call the preset second detection model to detect the body area in the image to be detected.
- the parallel processing method is used to simultaneously input the image to be detected into the preset first detection model to detect the head area and into the preset second detection model to detect the body area, which saves processing time and improves processing efficiency.
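- A small sketch of this parallel invocation using a thread pool; `detect_heads` and `detect_bodies` stand in for the preset first and second detection models:

```python
# Run the two detection models on the same image concurrently.
from concurrent.futures import ThreadPoolExecutor

def detect_in_parallel(image, detect_heads, detect_bodies):
    with ThreadPoolExecutor(max_workers=2) as pool:
        head_future = pool.submit(detect_heads, image)
        body_future = pool.submit(detect_bodies, image)
        return head_future.result(), body_future.result()
```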
- an area ratio can be calculated based on the head region and the body region.
- the area ratio refers to the ratio of the area of the intersection of the head region and the body region to the area of the head region.
- the determining the area ratio based on the head area and the body area includes:
- the area ratio is calculated based on the first area and the second area.
- the position coordinate system is established with the upper left corner of the image to be detected as the origin, the upper edge of the image as the X axis, and the left side of the image as the Y axis.
- the first position coordinates of each vertex of the head frame corresponding to the head area (taking a rectangular frame as an example) are obtained, and the second position coordinates of each vertex of the body frame corresponding to the body area (likewise taking a rectangular frame as an example) are obtained.
- the first area of the head region is determined according to the first position coordinates; the intersection region of the head region and the body region is determined according to the first position coordinates and the second position coordinates; then the third position coordinates of each vertex of the intersection region are obtained, and the second area of the intersection region is determined according to the third position coordinates.
- an area ratio (Intersection over Union, IOU) is calculated according to the first area and the second area.
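- The area ratio defined above (area of the intersection of the head frame and the pedestrian frame, divided by the area of the head frame) can be computed directly from the vertex coordinates; boxes below are assumed to be (x1, y1, x2, y2) tuples in the image coordinate system described above:

```python
# Area ratio: intersection(head frame, pedestrian frame) / area(head frame).
def area_ratio(head_box, body_box):
    hx1, hy1, hx2, hy2 = head_box
    bx1, by1, bx2, by2 = body_box
    head_area = max(0.0, hx2 - hx1) * max(0.0, hy2 - hy1)      # first area
    ix1, iy1 = max(hx1, bx1), max(hy1, by1)
    ix2, iy2 = min(hx2, bx2), min(hy2, by2)
    inter_area = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)     # second area
    return inter_area / head_area if head_area > 0 else 0.0

# e.g. a head frame half covered by the pedestrian frame gives 0.5
print(area_ratio((10, 10, 20, 20), (15, 0, 60, 100)))          # -> 0.5
```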
- S15 Determine whether there is an area ratio that is smaller than a preset first threshold in the area ratios, where the preset first threshold is less than 1.
- the head area is contained in the body area, that is, the head frame is contained in the pedestrian frame.
- if the pedestrian is not occluded, the pedestrian's head area is completely contained in the body area, and the calculated area ratio should be 1.
- if the pedestrian is partially occluded, the pedestrian's head area is only partially contained in the body area, and the calculated area ratio at this time is less than 1.
- if the pedestrian's body area is completely occluded, the pedestrian's head area is not contained in the body area at all, and the calculated area ratio at this time is 0.
- a first threshold may be preset, and the preset first threshold is less than 1, which may be, for example, 0.7.
- the ratio of the intersection of the head frame and the pedestrian frame to the head frame is used to measure the overlap of the head frame and the pedestrian frame or to determine whether the head frame matches the pedestrian frame.
- the larger the area ratio, the greater the overlap between the head frame and the pedestrian frame, and the better the head frame matches the pedestrian frame.
- the magnitude relationship between each area ratio and the preset first threshold can be determined. If there is a target area ratio that is less than the preset first threshold in the multiple area ratios, it indicates that the pedestrian corresponding to the target area ratio in the image to be tested is severely blocked. If each of the multiple area ratios is greater than or equal to the preset first threshold, it indicates that multiple pedestrians in the image to be tested are not blocked or are not seriously blocked.
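- A hedged sketch of this occlusion check; the rule used to pair each head frame with a pedestrian frame (the body frame with the largest area ratio) is an assumption for the example, and `area_ratio` is the helper sketched earlier:

```python
# Flag head/body pairs whose area ratio falls below the first threshold.
FIRST_THRESHOLD = 0.7   # example value from the text, must be < 1

def find_occluded(head_boxes, body_boxes, threshold=FIRST_THRESHOLD):
    occluded = []
    for head in head_boxes:
        best_body = max(body_boxes, key=lambda b: area_ratio(head, b))
        ratio = area_ratio(head, best_body)
        if ratio < threshold:
            occluded.append((head, best_body, ratio))   # severely occluded pedestrian
    return occluded
```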
- the severely occluded pedestrian in the image to be detected may be first segmented according to the head region and the shape region.
- the segmenting the occluded pedestrian according to the head area and the shape area includes:
- the central axis of the two head areas is used as the dividing line, and the key points of the shoulders are used as the boundary to divide the shaded pedestrian.
- a second threshold may be preset, and the preset second threshold is smaller than the preset first threshold, for example, 0.3.
- when the area ratio is greater than the preset second threshold but less than the preset first threshold, although the corresponding pedestrian is severely occluded, the pedestrian's head area has still been accurately detected.
- at this time, the pedestrian's detected body does not match the human head, the confidence of such a detection is relatively low, and it is easy to be screened out as a false detection during post-processing.
- therefore, the corresponding body area is expanded according to a preset scale factor (for example, 1.5) before segmentation, which improves the detection confidence of the occluded pedestrian and thereby reduces the risk of the occluded pedestrian being filtered out during post-processing screening.
- Pedestrian A and Pedestrian B share a human body frame, but correspond to two head frames.
- the human body frame can be marked as a double-person frame, and Pedestrian A is separated out along the central axis of the two head frames, with the key points of the shoulders used as the left and right boundaries of the human body.
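- The two segmentation branches just described might be sketched as follows; the 0.3 second threshold and the 1.5 scale factor are the example values from the text, while the assumption that Pedestrian A stands on the left and the exact use of the shoulder keypoints are illustrative simplifications:

```python
SECOND_THRESHOLD = 0.3   # example value; smaller than the first threshold
SCALE_FACTOR = 1.5       # example expansion factor from the text

def expand_box(box, factor=SCALE_FACTOR):
    """Enlarge a (x1, y1, x2, y2) box about its center by `factor`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = (x2 - x1) * factor, (y2 - y1) * factor
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def segment_pedestrian_a(ratio, body_box, head_box_a, head_box_b, shoulder_xs_a):
    """shoulder_xs_a: x coordinates of Pedestrian A's two shoulder keypoints."""
    x1, y1, x2, y2 = body_box
    if ratio > SECOND_THRESHOLD:
        # Head reliably detected: enlarge the body frame and segment from it.
        return expand_box(body_box)
    # Two pedestrians share one (double) body frame: cut along the central axis
    # between the two head frames, with the shoulder keypoints as boundaries.
    axis_x = (head_box_a[0] + head_box_a[2] + head_box_b[0] + head_box_b[2]) / 4
    left = min(min(shoulder_xs_a), x1)      # assumes Pedestrian A is on the left
    return (left, y1, axis_x, y2)
```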
- the preset tracking algorithm may be a multi-target tracking algorithm. After the pedestrians in the image to be tested are segmented, the segmented pedestrians and the pedestrians that are not blocked can be tracked.
- the multi-target tracking algorithm is an existing technology, and this article will not elaborate on it.
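- Since the application leaves the concrete multi-target tracking algorithm open, the sketch below shows one common choice, a simple greedy IoU association between the previous frame's tracks and the current frame's segmented boxes; it is not presented as the method claimed here:

```python
# Greedy IoU association between existing tracks and current detections.
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1, ix2, iy2 = max(ax1, bx1), max(ay1, by1), min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def associate(tracks, detections, min_iou=0.3):
    """tracks: dict track_id -> last box; detections: list of boxes.
    Returns dict track_id -> matched detection index."""
    matches, used = {}, set()
    for tid, tbox in tracks.items():
        best = max((i for i in range(len(detections)) if i not in used),
                   key=lambda i: iou(tbox, detections[i]), default=None)
        if best is not None and iou(tbox, detections[best]) >= min_iou:
            matches[tid] = best
            used.add(best)
    return matches
```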
- the method further includes:
- the pedestrian corresponding to the physical area is determined as the target tracking object
- the pedestrian corresponding to the head area is determined as the target tracking object
- when the area ratio is greater than or equal to the preset first threshold, that is, when the pedestrians in the image to be detected are not occluded or not severely occluded, it is necessary to further determine whether the area ratio is 1 in order to distinguish whether a pedestrian in the image to be detected is unoccluded or only slightly occluded.
- if an area ratio is 1, it means that the head area of the corresponding pedestrian is completely contained in the body area, that is, the pedestrian is not occluded; since the determined body area covers the entire pedestrian, the pedestrian corresponding to the body area is taken as the target tracking object, and the tracking effect is better. If an area ratio is not 1, it indicates that the corresponding pedestrian is slightly occluded; because the head area is more distinctive, the pedestrian corresponding to the head area is taken as the target tracking object, and the tracking effect is better.
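- A compact sketch of this selection rule (values from this embodiment; in practice a small tolerance might replace the exact comparison with 1):

```python
# Choose the box to track when the pedestrian is not severely occluded.
FIRST_THRESHOLD = 0.7    # example value from the text

def select_tracking_box(ratio, head_box, body_box, first_threshold=FIRST_THRESHOLD):
    if ratio < first_threshold:
        raise ValueError("severely occluded: segment the pedestrian first")
    # ratio == 1: head frame fully inside the pedestrian frame, track the body;
    # otherwise the pedestrian is slightly occluded, track the more distinct head.
    return body_box if ratio == 1 else head_box
```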
- the multi-target tracking method described in this application first acquires an image to be detected that contains multiple targets with occlusion, calls the preset first detection model and the preset second detection model respectively to detect the head area and the body area in the image to be detected, and calculates an area ratio of the head area to the body area; when there is an area ratio smaller than the preset first threshold, it is determined that a pedestrian in the image to be detected is occluded, the occluded pedestrian is then segmented according to the head area and the body area, and finally the preset tracking algorithm is called to track the segmented occluded pedestrian and the unoccluded pedestrians.
- This application uses the area ratio to measure how much a pedestrian is occluded, so that occluded pedestrians can be detected; in addition, the target tracking object is determined by combining the head area and the body area, which reduces missed detections or false detections caused by the pedestrian's body being occluded and improves the effect of target tracking. It can therefore be applied in scenes with complex backgrounds, and can track targets quickly and accurately, especially when there are occlusions between multiple targets, which has high practical value.
- FIG. 2 is a structural diagram of a multi-target tracking device provided in Embodiment 2 of the present application.
- the multi-target tracking device 20 may include multiple functional modules composed of computer-readable instruction code segments.
- the code of each computer-readable instruction code segment in the multi-target tracking device 20 may be stored in the memory of the computer device and executed by the at least one processor to detect and track multiple targets with occlusion (see FIG. 1 for details).
- the multi-target tracking device 20 can be divided into multiple functional modules according to the functions it performs.
- the functional modules may include: an acquisition module 201, a detection module 202, a training module 203, a calculation module 204, a judgment module 205, a determination module 206, a segmentation module 207, and a tracking module 208.
- the module referred to in this application refers to a series of computer-readable instruction segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory. In this embodiment, the function of each module will be described in detail in subsequent embodiments.
- the acquisition module 201 is used to acquire a to-be-detected image including multiple targets.
- the image to be detected may be any suitable image that requires target tracking, for example, an image collected for a monitored area.
- the image to be detected may be a static image collected by an image collection device such as a camera, or any video frame in a video collected by an image collection device such as a camera.
- the image to be detected may be an original image or an image obtained after preprocessing the original image.
- the image to be detected contains multiple pedestrians, and the body parts of the multiple pedestrians may overlap significantly. That is, the target tracking object is determined even when the body parts of multiple pedestrians overlap heavily, so as to prevent misdetection or missed detection caused by a pedestrian being blocked by other pedestrians.
- the detection module 202 is configured to call a preset first detection model to detect the head region in the image to be detected.
- the first detection model can be trained in advance, and by directly calling the pre-trained first detection model, multiple human body nodes of each human body in the image to be detected can be detected directly and quickly.
- the preset first detection model may be various detection models based on deep learning, for example, a detection model based on a neural network, or a detection model based on a residual network.
- the training module 203 is configured to pre-train the first detection model, where the training process of the first detection model includes:
- tools such as OpenPose or PoseMachine are used to label multiple human body nodes in the head region of the human body pictures, for example, the left-eye node, right-eye node, left-ear node, and right-ear node.
- Extract a first preset ratio of human body pictures as the sample picture set to be trained (referred to as the training set), and extract the second preset ratio of human body pictures as the sample picture set to be verified (referred to as the verification set for short).
- the number of human body pictures in the training set is much larger than the number of human body pictures in the verification set; for example, 80% of the human body pictures are used as the training set, and the remaining 20% are used as the verification set.
- the parameters of the neural network initially adopt the default parameters, and the parameters are then adjusted continuously during the training process.
- the generated first detection model is verified using the human body pictures in the verification set. If the verification pass rate is greater than or equal to a preset threshold, for example, the pass rate is greater than or equal to 98%, the training ends.
- the first detection model obtained by this training is used to identify the human body node. If the verification pass rate is less than the preset threshold, for example, less than 98%, the number of human body pictures participating in the training is increased, and the above steps are performed again until the verification pass rate is greater than or equal to the preset threshold.
- the first detection model obtained by training is used to identify the human body nodes in the human body pictures in the verification set, and the recognition result is compared with the labeled human body nodes of those pictures to evaluate the recognition effect of the trained first detection model.
- the detection module 202 calling a preset first detection model to detect the head region in the image to be detected includes:
- multiple human body nodes of each human body in the image to be detected are detected through the preset first detection model, for example, a neural network model.
- the human body node may be an important position of the human body such as the joint points of the human body and the facial features.
- the multiple human body nodes include at least multiple nodes of the head and the neck.
- the multiple human body nodes include: one or more of a neck node, a nose tip node, a left eye node, a right eye node, a left ear node, and a right ear node.
- the multiple human body nodes determined by the preset first detection model further include at least a wrist node, an elbow node, and a shoulder node.
- Each human body node represents the human body area including the node, for example, the left eye node represents the entire left eye area of the human body, rather than just a specific pixel.
- the head area is an area determined according to multiple nodes of the head and the neck to characterize the human head.
- the head area of the human body is determined according to the neck node, nose tip node, left eye node, right eye node, left ear node, and right ear node.
- the determined shape of the head region can be rectangular, circular, oval, or any other regular or irregular shapes. This application does not specifically limit the shape of the determined head region.
- the process of pre-training the first detection model may be an offline training process.
- the process of calling the first detection model to detect the head region in the image to be detected may be an online detection process. That is, the image to be detected is used as the input of the first detection model, and the output is the human node information in the image to be detected, for example, the top of the head, eyes, mouth, chin, ears, neck, etc.
- According to the multiple human body nodes, the human head is framed with a geometric figure, such as a rectangular frame, and this rectangular frame is called the head frame.
- the detection module 202 is further configured to call a preset second detection model to detect the shape area in the image to be detected.
- the preset second detection model is called to detect the shape area in the image to be detected.
- the preset second detection model can be implemented using an accelerated version of the region-based convolutional neural network (Faster-RCNN).
- the preset second detection model is pre-trained using a large number of human images.
- the preset second detection model may be trained before acquiring the image to be detected including multiple targets.
- the process of pre-training the second detection model is similar to the foregoing process of pre-training the first detection model, and will not be repeated here.
- the shape area in the image to be detected is recognized by inputting the image to be detected into the second detection model.
- the process of pre-training the second detection model may be an offline training process.
- the process of calling the preset second detection model to detect the body area in the image to be detected may be an online detection process. That is, the image to be detected is used as the input of the second detection model, and the output is the human body information in the image to be detected. According to the human body information, the human body's shape area is framed with a rectangular frame, which is called a pedestrian frame.
- the method of parallel processing is used to simultaneously call the preset first detection model to detect the head area in the image to be detected and call the preset second detection model to detect the body area in the image to be detected.
- the parallel processing method is used to simultaneously input the image to be detected into the preset first detection model to detect the head area and into the preset second detection model to detect the body area, which saves processing time and improves processing efficiency.
- the calculation module 204 is configured to calculate an area ratio based on the head area and the body area.
- an area ratio can be calculated based on the head region and the body region.
- the area ratio refers to the ratio of the area of the intersection of the head region and the body region to the area of the head region.
- the calculation module 204 determining the area ratio according to the head area and the body area includes:
- the area ratio is calculated based on the first area and the second area.
- the position coordinate system is established with the upper left corner of the image to be detected as the origin, the upper edge of the image as the X axis, and the left side of the image as the Y axis.
- the first position coordinates of each vertex of the head frame corresponding to the head area (taking a rectangular frame as an example) are obtained, and the second position coordinates of each vertex of the body frame corresponding to the body area (likewise taking a rectangular frame as an example) are obtained.
- the first area of the head region is determined according to the first position coordinates; the intersection region of the head region and the body region is determined according to the first position coordinates and the second position coordinates; then the third position coordinates of each vertex of the intersection region are obtained, and the second area of the intersection region is determined according to the third position coordinates.
- an area ratio (Intersection over Union, IOU) is calculated according to the first area and the second area.
- the judging module 205 is used to judge whether there is an area ratio smaller than a preset first threshold in the area ratios, wherein the preset first threshold is less than one.
- the head area is contained in the body area, that is, the head frame is contained in the pedestrian frame.
- if the pedestrian is not occluded, the pedestrian's head area is completely contained in the body area, and the calculated area ratio should be 1.
- if the pedestrian is partially occluded, the pedestrian's head area is only partially contained in the body area, and the calculated area ratio at this time is less than 1.
- if the pedestrian's body area is completely occluded, the pedestrian's head area is not contained in the body area at all, and the calculated area ratio at this time is 0.
- a first threshold may be preset, and the preset first threshold is less than 1, which may be, for example, 0.7.
- the ratio of the intersection of the head frame and the pedestrian frame to the head frame is used to measure the overlap of the head frame and the pedestrian frame or to determine whether the head frame matches the pedestrian frame.
- the larger the area ratio, the greater the overlap between the head frame and the pedestrian frame, and the better the head frame matches the pedestrian frame.
- the determining module 206 is configured to determine that a pedestrian is blocked in the image to be detected when there is an area ratio in the area ratio that is less than the preset first threshold.
- the magnitude relationship between each area ratio and the preset first threshold can be determined. If there is a target area ratio that is less than the preset first threshold in the multiple area ratios, it indicates that the pedestrian corresponding to the target area ratio in the image to be tested is severely blocked. If each of the multiple area ratios is greater than or equal to the preset first threshold, it indicates that multiple pedestrians in the image to be tested are not blocked or are not seriously blocked.
- the segmentation module 207 is used to segment the occluded pedestrians according to the head area and the body area.
- the severely occluded pedestrian in the image to be detected may be first segmented according to the head region and the shape region.
- the segmentation module 207 segmenting the occluded pedestrian according to the head area and the body area includes:
- the central axis of the two head areas is used as the dividing line, and the key points of the shoulders are used as the boundary to divide the shaded pedestrian.
- a second threshold may be preset, and the preset second threshold is smaller than the preset first threshold, for example, 0.3.
- when the area ratio is greater than the preset second threshold but less than the preset first threshold, although the corresponding pedestrian is severely occluded, the pedestrian's head area has still been accurately detected.
- at this time, the pedestrian's detected body does not match the human head, the confidence of such a detection is relatively low, and it is easy to be screened out as a false detection during post-processing.
- therefore, the corresponding body area is expanded according to a preset scale factor (for example, 1.5) before segmentation, which improves the detection confidence of the occluded pedestrian and thereby reduces the risk of the occluded pedestrian being filtered out during post-processing screening.
- Pedestrian A and Pedestrian B share a human body frame, but correspond to two head frames.
- the human body frame can be marked as a double-person frame, and Pedestrian A is separated out along the central axis of the two head frames, with the key points of the shoulders used as the left and right boundaries of the human body.
- the tracking module 208 is configured to call a preset tracking algorithm to track the segmented pedestrians that are blocked and those that are not blocked.
- the preset tracking algorithm may be a multi-target tracking algorithm. After the pedestrians in the image to be tested are segmented, the segmented pedestrians and the pedestrians that are not blocked can be tracked.
- the multi-target tracking algorithm is an existing technology, and this article will not elaborate on it.
- the judging module 205 is further configured to judge whether the area ratio is 1 when the area ratio is greater than or equal to the preset first threshold.
- the determining module 206 is further configured to determine the pedestrian corresponding to the physical area as the target tracking object when the area ratio is 1.
- the determining module 206 is further configured to determine the pedestrian corresponding to the head area as the target tracking object when the area ratio is not 1.
- the tracking module 208 is further configured to call the preset tracking algorithm to track the target tracking object.
- when the area ratio is greater than or equal to the preset first threshold, that is, when the pedestrians in the image to be detected are not occluded or not severely occluded, it is necessary to further determine whether the area ratio is 1 in order to distinguish whether a pedestrian in the image to be detected is unoccluded or only slightly occluded.
- if an area ratio is 1, it means that the head area of the corresponding pedestrian is completely contained in the body area, that is, the pedestrian is not occluded; since the determined body area covers the entire pedestrian, the pedestrian corresponding to the body area is taken as the target tracking object, and the tracking effect is better. If an area ratio is not 1, it indicates that the corresponding pedestrian is slightly occluded; because the head area is more distinctive, the pedestrian corresponding to the head area is taken as the target tracking object, and the tracking effect is better.
- the multi-target tracking device described in this application first acquires an image to be detected that contains multiple targets with occlusion, respectively calls the preset first detection model and the preset second detection model to detect the head area and the body area in the image to be detected, and calculates an area ratio of the head area to the body area; when there is an area ratio smaller than the preset first threshold, it is determined that a pedestrian in the image to be detected is occluded, the occluded pedestrian is then segmented according to the head area and the body area, and finally the preset tracking algorithm is called to track the segmented occluded pedestrian and the unoccluded pedestrians.
- This application uses the area ratio to measure how much a pedestrian is occluded, so that occluded pedestrians can be detected; in addition, the target tracking object is determined by combining the head area and the body area, which reduces missed detections or false detections caused by the pedestrian's body being occluded and improves the effect of target tracking. It can therefore be applied in scenes with complex backgrounds, and can track targets quickly and accurately, especially when there are occlusions between multiple targets, which has high practical value.
- the computer device 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
- the structure shown in FIG. 3 does not constitute a limitation of the embodiment of the present application; the connection may be a bus-type structure or a star structure.
- the computer device 3 may also include more or less hardware or software than shown in the figure, or a different arrangement of components.
- the computer device 3 is a device that can automatically perform numerical calculation and/or information processing in accordance with preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices.
- the computer device 3 may also include a client device; the client device includes, but is not limited to, any electronic product that can interact with a client through a keyboard, a mouse, a remote control, a touch panel, or a voice control device, for example, personal computers, tablet computers, smart phones, digital cameras, etc.
- the computer device 3 is only an example; other existing or future electronic products that can be adapted to this application should also be included in the scope of protection of this application and are incorporated herein by reference.
- the memory 31 is used to store computer-readable instruction codes and various data, such as the multi-target tracking device 20 installed in the computer device 3, and realizes high-speed, automatic access to computer-readable instructions or data during the operation of the computer device 3.
- the memory 31 includes Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electrically-Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage, tape storage, or any other computer-readable medium that can be used to carry or store data.
- the at least one processor 32 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple integrated circuits with the same function or different functions, including one or a combination of several central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and various control chips.
- the at least one processor 32 is the control core (Control Unit) of the computer device 3; it uses various interfaces and lines to connect the components of the entire computer device 3, runs or executes the computer-readable instructions or modules stored in the memory 31, and calls the data stored in the memory 31 to perform the various functions of the computer device 3 and to process data, for example, to perform multi-target tracking.
- the at least one communication bus 33 is configured to implement connection and communication between the memory 31 and the at least one processor 32 and the like.
- the computer device 3 may also include a power supply for supplying power to various components.
- the power supply may be logically connected to the at least one processor 32 through a power management device, so that functions such as charge management, discharge management, and power consumption management are realized through the power management device.
- the computer device 3 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
- the above-mentioned integrated unit implemented in the form of a software function module can be stored in a non-volatile readable storage medium and includes several instructions to enable a computer device (which can be a personal computer, a computer device, or a network device, etc.) or a processor to execute part of the method described in each embodiment of the present application.
- the at least one processor 32 can execute the operating system of the computer device 3 and various installed applications (such as the multi-target tracking device 20), for example, the various modules mentioned above.
- the memory 31 stores computer readable instruction codes
- the at least one processor 32 can call the computer readable instruction codes stored in the memory 31 to perform related functions.
- the various modules described in FIG. 2 are computer-readable instruction codes stored in the memory 31 and executed by the at least one processor 32, so as to realize the functions of the various modules and achieve the goal of multi-target tracking.
- the memory 31 stores multiple instructions, and the multiple instructions are executed by the at least one processor 32 to achieve multi-target tracking.
- the at least one processor 32 executes the multiple instructions to achieve multi-target tracking.
- the modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional modules.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Traffic Control Systems (AREA)
Abstract
A multi-target tracking method, a multi-target tracking apparatus, a computer device, and a storage medium, comprising: acquiring an image to be examined containing a plurality of targets (S11); using a preset first detection model to detect head regions in the image to be examined (S12); using a preset second detection model to detect figure regions in the image to be examined (S13); on the basis of the head regions and the figure regions, calculating regional ratios (S14); determining whether a regional ratio smaller than a preset first threshold is present in the regional ratios, the preset first threshold being less than 1 (S15); when a regional ratio less than the preset first threshold is present, determining that an obstructed pedestrian is present in the image to be examined (S16); on the basis of the head region and the figure region, separating out the obstructed pedestrian (S17); and using a preset tracking algorithm to track the separated obstructed pedestrian and an unobstructed pedestrian (S18). In the present method, tracking targets are determined by means of combining both a head region and a figure region, and said method therefore features an excellent tracking effect when multiple obstructed targets are present.
Description
This application claims the priority of the Chinese patent application filed with the Chinese Patent Office on April 26, 2019, with application number 201910345956.8 and the invention title "Multi-target tracking method, device, computer equipment and storage medium", the entire content of which is incorporated herein by reference.
This application relates to the technical field of target tracking, in particular to a multi-target tracking method, device, computer equipment and storage medium.
With the continuous progress of society and the rapid development of economic construction, video surveillance is increasingly used in various industries and fields. An intelligent video analysis and monitoring system can automatically identify different objects, find abnormal situations in the monitoring screen, and issue alarms and provide useful information in the fastest and best way, so as to more effectively assist security personnel in dealing with crises.
Target detection is a basic function of video analysis technology and is of great significance to subsequent applications such as target tracking, target recognition, and behavior analysis; its importance is self-evident, especially in the field of real-time target event monitoring.
As a non-rigid body, the human body has various morphological changes and is prone to occlusion, and video scenes change in complex and diverse ways, which makes effective detection and tracking of pedestrians in video very difficult. In practical application scenarios, there are problems such as varied pedestrian poses, occlusion of the human body, sudden light changes, and disturbance of the background environment, so how to track targets quickly and accurately in a video with a complex background, especially when there are occlusions between multiple targets, remains an important and difficult point in the field of video image processing technology.
Summary of the invention
In view of the above, it is necessary to propose a multi-target tracking method, device, computer equipment and storage medium, which aim to solve the problem of tracking multiple occluded targets; by combining the head area and the body area to determine the target tracking object, the effect of target tracking can be improved.
The first aspect of the present application provides a multi-target tracking method, the method including:
obtaining an image to be detected including multiple targets;
calling a preset first detection model to detect the head area in the image to be detected;
calling a preset second detection model to detect the body area in the image to be detected;
calculating an area ratio based on the head area and the body area;
judging whether there is an area ratio smaller than a preset first threshold among the area ratios, where the preset first threshold is less than 1;
when there is an area ratio smaller than the preset first threshold among the area ratios, determining that a pedestrian in the image to be detected is occluded;
segmenting the occluded pedestrian according to the head area and the body area;
calling a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
Preferably, when the area ratio is greater than or equal to the preset first threshold, the method further includes:
judging whether the area ratio is 1;
when the area ratio is 1, determining the pedestrian corresponding to the body area as the target tracking object;
when the area ratio is not 1, determining the pedestrian corresponding to the head area as the target tracking object;
calling the preset tracking algorithm to track the target tracking object.
Preferably, segmenting the occluded pedestrian according to the head area and the body area includes:
judging whether the area ratio is greater than a preset second threshold, where the preset second threshold is less than the preset first threshold;
when the area ratio is greater than the preset second threshold, expanding the body area according to a preset scale factor;
segmenting the occluded pedestrian according to the expanded body area.
Preferably, when the area ratio is less than or equal to the preset second threshold, the method further includes:
segmenting the occluded pedestrian using the central axis of the two head areas as the dividing line and the key points of the shoulders as the boundary.
Preferably, the preset first detection model is called to detect the head area in the image to be detected and the preset second detection model is called to detect the body area in the image to be detected simultaneously, in a parallel processing manner.
Preferably, calling the preset first detection model to detect the head area in the image to be detected includes:
calling the preset first detection model to detect multiple human body nodes of each human body in the image to be detected;
determining the head area corresponding to each human body in the image to be detected according to the multiple human body nodes of that human body.
Preferably, calculating the area ratio based on the head area and the body area includes:
establishing a position coordinate system according to the image to be detected;
obtaining the first area of the head region in the position coordinate system;
obtaining the second area, in the position coordinate system, of the intersection region of the head region and the body region;
calculating the area ratio based on the first area and the second area.
The second aspect of the present application provides a multi-target tracking device, the device including:
an acquisition module, used to obtain an image to be detected including multiple targets;
a detection module, used to call a preset first detection model to detect the head area in the image to be detected;
the detection module being further used to call a preset second detection model to detect the body area in the image to be detected;
a calculation module, used to calculate an area ratio based on the head area and the body area;
a judging module, used to judge whether there is an area ratio smaller than a preset first threshold among the area ratios, where the preset first threshold is less than 1;
a segmentation module, used to determine that a pedestrian in the image to be detected is occluded when there is an area ratio smaller than the preset first threshold among the area ratios, and to segment the occluded pedestrian according to the head area and the body area;
a tracking module, used to call a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
本申请的第三方面提供一种计算机设备,所述计算机设备包括处理器,所述处理器用于执行存储器中存储的计算机可读指令时实现所述多目标跟踪方法。A third aspect of the present application provides a computer device that includes a processor, and the processor is configured to implement the multi-target tracking method when executing computer-readable instructions stored in a memory.
本申请的第四方面提供一种非易失性可读存储介质,其上存储有计算机可读指令,所述计算机可读指令被处理器执行时实现所述多目标跟踪方法。A fourth aspect of the present application provides a non-volatile readable storage medium having computer readable instructions stored thereon, and when the computer readable instructions are executed by a processor, the multi-target tracking method is implemented.
In summary, the multi-target tracking method, apparatus, computer device, and storage medium described in this application first acquire an image to be detected that contains multiple targets, some of which may be occluded, invoke a preset first detection model and a preset second detection model to detect the head regions and body regions in the image respectively, and calculate an area ratio for each head region and body region. When an area ratio is smaller than the preset first threshold, it is determined that a pedestrian in the image is occluded, the occluded pedestrian is segmented according to the head region and the body region, and a preset tracking algorithm is finally invoked to track the segmented occluded pedestrian and the unoccluded pedestrians. The application measures the occlusion of pedestrians through the area ratio, so that occluded pedestrians can still be detected; in addition, the target tracking object is determined jointly from the head region and the body region, which reduces missed and false detections caused by a pedestrian's body being occluded and improves the tracking effect. The method can therefore be applied to scenes with complex backgrounds and, in particular, can track targets quickly and accurately when multiple targets occlude one another, which gives it high practical value.
FIG. 1 is a flowchart of the multi-target tracking method provided in Embodiment 1 of the present application.
FIG. 2 is a structural diagram of the multi-target tracking apparatus provided in Embodiment 2 of the present application.
FIG. 3 is a schematic structural diagram of the computer device provided in Embodiment 3 of the present application.
The following specific embodiments further illustrate the present application in conjunction with the above drawings.
In order that the above objectives, features, and advantages of the application can be understood more clearly, the application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the application and the features in the embodiments may be combined with one another.
Embodiment 1
FIG. 1 is a flowchart of the multi-target tracking method provided in Embodiment 1 of the present application.
In this embodiment, the multi-target tracking method can be applied to a computer device. For a computer device that needs to perform multi-target tracking, the multi-target tracking function provided by the method of this application can be integrated directly on the computer device, or can run on the computer device in the form of a software development kit (SDK).
As shown in FIG. 1, the multi-target tracking method specifically includes the following steps. Depending on requirements, the order of the steps in the flowchart may be changed and some steps may be omitted.
S11: acquire an image to be detected that contains multiple targets.
In this embodiment, the image to be detected may be any suitable image on which target tracking needs to be performed, for example an image captured of a monitored area. The image to be detected may be a static image captured by an image acquisition device such as a camera, or any video frame of a video captured by such a device.
The image to be detected may be an original image, or an image obtained after preprocessing an original image.
In this embodiment, the image to be detected contains multiple pedestrians whose body parts may overlap considerably. The target tracking objects are therefore determined even when the body parts of multiple pedestrians overlap substantially, so that a pedestrian is not misdetected or missed because it is occluded by other pedestrians.
S12: invoke a preset first detection model to detect the head regions in the image to be detected.
In this embodiment, the first detection model can be trained in advance. By directly invoking the pre-trained first detection model, the multiple human body nodes of each human body in the image to be detected can be detected directly and quickly. The preset first detection model may be any of various deep-learning-based detection models, for example a detection model based on a neural network or a detection model based on a residual network.
Preferably, before acquiring the image to be detected that contains multiple targets, the method further includes:
training the first detection model in advance, where the training process of the first detection model includes:
1) acquiring multiple human body pictures, and manually annotating multiple human body nodes in the head region of each human body picture to obtain a sample picture set;
2) extracting a first preset proportion of the human body pictures from the sample picture set as the sample picture set to be trained, and extracting a second preset proportion of the human body pictures from the sample picture set as the sample picture set to be verified;
3) training a preset neural network with the sample picture set to be trained to obtain the first detection model, and verifying the trained first detection model with the sample picture set to be verified;
4) if the verification pass rate is greater than or equal to a preset threshold, the training of the first detection model is complete; otherwise, increasing the number of human body pictures in the sample picture set to be trained and training and verifying the first detection model again.
For example, assume that 100,000 human body pictures are acquired and that tools such as OpenPose or PoseMachine are used to annotate multiple human body nodes in the head region of each picture, for example the left-eye node, right-eye node, left-ear node, and right-ear node. A first preset proportion of the human body pictures is extracted as the sample picture set to be trained (the training set for short), and a second preset proportion is extracted as the sample picture set to be verified (the verification set for short). The number of pictures in the training set is much larger than that in the verification set; for example, 80% of the human body pictures are used as the training set and the remaining 20% as the verification set.
When the neural network is trained for the first time to obtain the first detection model, the network uses its default parameters; the parameters are then adjusted continuously during training. After the first detection model has been generated, it is verified with the human body pictures in the verification set. If the verification pass rate is greater than or equal to a preset threshold, for example 98%, training ends and the trained first detection model is used to recognize human body nodes. If the verification pass rate is below the preset threshold, for example below 98%, the number of human body pictures participating in training is increased and the above steps are repeated until the verification pass rate reaches the preset threshold.
During testing, the trained first detection model is used to recognize human body nodes in the pictures of the verification set, and the recognition results are compared with the annotated nodes of those pictures to evaluate the recognition performance of the trained first detection model.
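As an illustrative, non-limiting sketch, the train-verify-retrain loop described above could be organized as follows in Python. The callables train_model, pass_rate, and add_samples are placeholders for the model training, verification, and additional data-collection steps, which this application does not prescribe.

```python
import random

def train_until_verified(samples, train_model, pass_rate, add_samples,
                         train_ratio=0.8, pass_threshold=0.98):
    """Split the annotated pictures into a training set and a verification
    set (e.g. 80/20), train, and keep enlarging the training set until the
    verification pass rate reaches the preset threshold (e.g. 98%)."""
    random.shuffle(samples)
    split = int(len(samples) * train_ratio)
    train_set, verify_set = samples[:split], samples[split:]

    while True:
        model = train_model(train_set)
        if pass_rate(model, verify_set) >= pass_threshold:
            return model           # verification passed; training is complete
        # Verification failed: add more annotated pictures and retrain.
        train_set = train_set + add_samples()
```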
Preferably, invoking the preset first detection model to detect the head regions in the image to be detected includes:
1) invoking the preset first detection model to detect multiple human body nodes of each human body in the image to be detected;
In this embodiment, the multiple human body nodes of each human body in the image to be detected are detected by the preset first detection model, for example a neural network model.
A human body node may be an important location on the human body, such as a joint or a facial feature. The multiple human body nodes include at least several nodes of the head and neck. For example, the multiple human body nodes include one or more of a neck node, a nose-tip node, a left-eye node, a right-eye node, a left-ear node, and a right-ear node. In other embodiments, the multiple human body nodes determined by the preset first detection model further include at least a wrist node, an elbow node, and a shoulder node.
Each human body node represents the body region that contains that node; for example, the left-eye node represents the entire left-eye region of the human body rather than a single specific pixel.
2) determining the head region corresponding to each human body in the image to be detected according to the multiple human body nodes of that human body.
In this embodiment, the head region is a region determined from the multiple nodes of the head and neck and used to characterize the human head. For example, the head region is determined from the neck node, nose-tip node, left-eye node, right-eye node, left-ear node, and right-ear node. The determined head region may be rectangular, circular, elliptical, or of any other regular or irregular shape; this application does not specifically limit the shape of the determined head region.
In this embodiment, training the first detection model in advance may be an offline process, while invoking the first detection model to detect the head regions in the image to be detected may be an online process. That is, the image to be detected is used as the input of the first detection model, and the output is the human body node information in the image; for example, the top of the head, the eyes, the mouth, the chin, the ears, and the neck are each presented as a human body node. The human head is then enclosed by a geometric figure, such as a rectangular box, determined from these nodes; this rectangular box is called the head frame.
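As an illustrative sketch only, a rectangular head frame could be derived from the detected head and neck nodes as follows; the dictionary layout of `nodes` and the padding margin are assumptions made for the example and are not fixed by this application.

```python
def head_frame(nodes, margin=0.2):
    """Derive a rectangular head frame from the head/neck nodes (e.g. nose
    tip, eyes, ears, neck). `nodes` maps node names to (x, y) pixel
    coordinates; the margin pads the tight bounding box slightly."""
    xs = [x for x, _ in nodes.values()]
    ys = [y for _, y in nodes.values()]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    pad_x = (right - left) * margin
    pad_y = (bottom - top) * margin
    # (left, top, right, bottom) of the head frame in image coordinates.
    return (left - pad_x, top - pad_y, right + pad_x, bottom + pad_y)
```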
S13: invoke a preset second detection model to detect the body regions in the image to be detected.
In this embodiment, after the image to be detected is acquired, a preset second detection model is invoked to detect the body regions in the image to be detected. The preset second detection model may be implemented with an accelerated region-based convolutional neural network (Faster-RCNN).
The preset second detection model is trained in advance on a large number of human body images, and may be trained before the image to be detected containing multiple targets is acquired. The process of training the second detection model in advance is similar to the process of training the first detection model described above and is not repeated here.
The body regions in the image to be detected are recognized by inputting the image to be detected into the second detection model.
In this embodiment, training the second detection model in advance may be an offline process, while invoking the preset second detection model to detect the body regions in the image to be detected may be an online process. That is, the image to be detected is used as the input of the second detection model, and the output is the human body information in the image; according to this information, each human body region is enclosed by a rectangular box, which is called the pedestrian frame.
Preferably, the preset first detection model is invoked to detect the head regions and the preset second detection model is invoked to detect the body regions in the image to be detected simultaneously, in a parallel processing manner. In this embodiment, feeding the image to be detected into the preset first detection model to determine the head regions and, at the same time, into the preset second detection model to determine the body regions saves processing time and improves processing efficiency.
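A minimal sketch of this parallel step, assuming the two models expose simple callable inference interfaces (detect_heads and detect_bodies are stand-ins, not interfaces defined by this application):

```python
from concurrent.futures import ThreadPoolExecutor

def detect_heads_and_bodies(image, detect_heads, detect_bodies):
    """Run the preset first and second detection models on the same image
    at the same time, returning the head frames and pedestrian frames."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        head_future = pool.submit(detect_heads, image)
        body_future = pool.submit(detect_bodies, image)
        return head_future.result(), body_future.result()
```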
S14: calculate area ratios based on the head regions and the body regions.
In this embodiment, after the multiple head regions and multiple body regions in the image to be detected have been determined, an area ratio can be calculated for each head region and body region.
The area ratio is the ratio of the area of the intersection of a head region and a body region to the area of the head region.
Preferably, determining the area ratio based on the head region and the body region includes:
establishing a position coordinate system based on the image to be detected;
acquiring a first area, namely the area occupied by the head region in the position coordinate system;
acquiring a second area, namely the area occupied by the intersection of the head region and the body region in the position coordinate system;
calculating the area ratio based on the first area and the second area.
In this embodiment, the position coordinate system is established with the upper-left corner of the image to be detected as the origin, the top edge of the image as the X axis, and the left edge of the image as the Y axis.
After the position coordinate system has been established, the first position coordinates of the vertices of the head frame corresponding to the head region (taking a rectangular frame as an example) and the second position coordinates of the vertices of the body frame corresponding to the body region (again taking a rectangular frame as an example) are acquired. The first area of the head region is determined from the first position coordinates, the intersection of the head region and the body region is determined from the first and second position coordinates, the third position coordinates of the vertices of the intersection are then acquired, and the second area of the intersection is determined from the third position coordinates. Finally, the area ratio (Intersection over Union, IOU) is calculated from the first area and the second area.
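The area-ratio computation described above can be sketched as follows; the (left, top, right, bottom) box representation is an assumption of the example.

```python
def area_ratio(head_box, body_box):
    """Ratio of the intersection of the head frame and the pedestrian frame
    to the area of the head frame, with boxes given as (left, top, right,
    bottom) in a coordinate system whose origin is the top-left corner."""
    hl, ht, hr, hb = head_box
    bl, bt, br, bb = body_box

    head_area = max(0.0, hr - hl) * max(0.0, hb - ht)        # first area
    inter_w = max(0.0, min(hr, br) - max(hl, bl))
    inter_h = max(0.0, min(hb, bb) - max(ht, bt))
    intersection = inter_w * inter_h                          # second area

    return intersection / head_area if head_area > 0 else 0.0
```

For an unoccluded pedestrian whose head frame lies entirely inside the pedestrian frame the function returns 1, and it returns 0 when the two frames are disjoint, matching the cases discussed in the next step.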
S15: determine whether any of the area ratios is smaller than a preset first threshold, where the preset first threshold is less than 1.
In general, for a given pedestrian the head region is contained in the body region, that is, the head frame is contained in the pedestrian frame. When the pedestrian is not occluded, the head region is completely contained in the body region and the calculated area ratio should be 1. When the pedestrian is partially occluded, the head region is only partially contained in the body region and the calculated area ratio is less than 1. When the pedestrian's body region is completely occluded, the head region is not contained in the body region at all and the calculated area ratio is 0.
In this embodiment, a first threshold can be preset; the preset first threshold is less than 1 and may be, for example, 0.7.
Whether any pedestrian in the image to be detected is occluded is determined by comparing the calculated area ratios with the preset threshold. In other words, the ratio of the intersection of the head frame and the pedestrian frame to the head frame measures how much the head frame and the pedestrian frame overlap, or equivalently whether the head frame matches the pedestrian frame. The larger the area ratio, the larger the overlap between the head frame and the pedestrian frame, and the better the head frame matches the pedestrian frame.
S16: when an area ratio is smaller than the preset first threshold, determine that a pedestrian in the image to be detected is occluded.
In this embodiment, if multiple area ratios are calculated, each area ratio can be compared with the preset first threshold. If a target area ratio among them is smaller than the preset first threshold, the pedestrian corresponding to that target area ratio in the image to be detected is severely occluded. If every area ratio is greater than or equal to the preset first threshold, the pedestrians in the image to be detected are either not occluded or not severely occluded.
S17: segment the occluded pedestrian according to the head region and the body region.
In this embodiment, when it is determined that a pedestrian in the image to be detected is severely occluded, the severely occluded pedestrian can first be segmented from the image according to the head region and the body region.
Specifically, segmenting the occluded pedestrian according to the head region and the body region includes:
determining whether the area ratio is greater than a preset second threshold, where the preset second threshold is less than the preset first threshold;
when the area ratio is greater than the preset second threshold, enlarging the body region according to a preset scale factor and segmenting the occluded pedestrian according to the enlarged body region;
when the area ratio is less than or equal to the preset second threshold, segmenting the occluded pedestrian by taking the central axis of the two head regions as the dividing line and the shoulder key points as the boundary.
Where crowds gather, there is a high probability that a pedestrian A will be occluded by a pedestrian B. In that case pedestrian B is detected without any problem, but because the body of pedestrian A is partially blocked by pedestrian B, two situations can arise: in the first, part of pedestrian A remains unoccluded; in the second, pedestrian A is almost completely occluded. A second threshold can be preset; the preset second threshold is less than the preset first threshold and may be, for example, 0.3. By further comparing the area ratio with the preset second threshold, it can be determined whether pedestrian A is almost completely occluded.
For the first situation: when the area ratio is greater than the preset second threshold but less than the preset first threshold, the corresponding pedestrian is severely occluded; the pedestrian's head region has been detected accurately, but it does not match the pedestrian's body. The confidence of such a detection is relatively low, and it is easily discarded as a false detection during post-processing. Enlarging the corresponding body region by a preset scale factor (for example, 1.5) before segmentation raises the detection confidence of the occluded pedestrian and thus reduces the risk of the detection being filtered out during post-processing.
For the second situation: when the area ratio is less than or equal to the preset second threshold, pedestrian A and pedestrian B share one body frame but correspond to two head frames. In this case the body frame can be marked as double, and pedestrian A is separated out along the central axis of the two head frames, with the shoulder key points serving as the left and right boundaries of the body.
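An illustrative sketch of the two segmentation cases, using the example values 0.3 and 1.5 from the text; the box representation (left, top, right, bottom) and the shoulder key points passed as (x, y) tuples are assumptions of the example.

```python
def segment_occluded_pedestrian(ratio, body_box, head_box_a, head_box_b,
                                shoulders_a, second_threshold=0.3, scale=1.5):
    """Return a box for the occluded pedestrian A. Case 1 enlarges the body
    frame about its centre; case 2 splits a shared body frame along the
    central axis between the two head frames, bounded by A's shoulders."""
    left, top, right, bottom = body_box

    if ratio > second_threshold:
        # Case 1: partially occluded; enlarge the body frame by the preset
        # scale factor before segmentation to raise its detection confidence.
        cx, cy = (left + right) / 2.0, (top + bottom) / 2.0
        w, h = (right - left) * scale, (bottom - top) * scale
        return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

    # Case 2: almost completely occluded; A and B share one body frame but
    # have two head frames, so split along the axis between the heads with
    # A's shoulder key points as the outer left/right boundary.
    centre_a = (head_box_a[0] + head_box_a[2]) / 2.0
    centre_b = (head_box_b[0] + head_box_b[2]) / 2.0
    axis_x = (centre_a + centre_b) / 2.0
    shoulder_xs = [x for x, _ in shoulders_a]
    if centre_a < centre_b:                      # pedestrian A is on the left
        return (min(shoulder_xs), top, axis_x, bottom)
    return (axis_x, top, max(shoulder_xs), bottom)
```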
S18: invoke a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
In this embodiment, the preset tracking algorithm may be a multi-target tracking algorithm. After the pedestrians in the image to be detected have been segmented, target tracking can be performed on the segmented pedestrians and the unoccluded pedestrians.
The multi-target tracking algorithm itself is existing technology and is not elaborated here.
Preferably, when the area ratio is greater than or equal to the preset first threshold, the method further includes:
determining whether the area ratio is 1;
when the area ratio is 1, determining the pedestrian corresponding to the body region as the target tracking object;
when the area ratio is not 1, determining the pedestrian corresponding to the head region as the target tracking object;
invoking the preset tracking algorithm to track the target tracking object.
In this embodiment, when an area ratio is greater than or equal to the preset first threshold, that is, when the pedestrians in the image to be detected are either not occluded or not severely occluded, it is further determined whether the area ratio is 1, in order to distinguish pedestrians that are not occluded from pedestrians that are only slightly occluded.
If every area ratio is 1, the head region of each pedestrian in the image to be detected is completely contained in the body region, that is, the pedestrian is not occluded. Since the determined body region then covers the whole pedestrian, tracking with the pedestrian corresponding to the body region as the target tracking object gives a better result. If an area ratio is not 1, a pedestrian in the image to be detected is slightly occluded; since the head region is more distinctive, tracking with the pedestrian corresponding to the head region as the target tracking object gives a better result in that case.
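A minimal sketch of this selection rule, using the example threshold 0.7; the region objects are whatever representation the preset tracking algorithm consumes, and severely occluded pedestrians (ratio below the threshold) are assumed to have already been handled by the segmentation step above.

```python
def choose_target(ratio, head_region, body_region, first_threshold=0.7):
    """Pick the target tracking object for a pedestrian that is not, or only
    slightly, occluded."""
    if ratio < first_threshold:
        return None          # severely occluded: handled by segmentation instead
    if ratio == 1:
        return body_region   # not occluded: the body region covers the whole pedestrian
    return head_region       # slightly occluded: the head region is more distinctive
```

The returned region would then be handed to the preset tracking algorithm as the target tracking object.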
In summary, the multi-target tracking method described in this application first acquires an image to be detected that contains multiple targets, some of which may be occluded, invokes a preset first detection model and a preset second detection model to detect the head regions and body regions in the image respectively, and calculates an area ratio for each head region and body region. When an area ratio is smaller than the preset first threshold, it is determined that a pedestrian in the image is occluded, the occluded pedestrian is segmented according to the head region and the body region, and a preset tracking algorithm is finally invoked to track the segmented occluded pedestrian and the unoccluded pedestrians. The application measures the occlusion of pedestrians through the area ratio, so that occluded pedestrians can still be detected; in addition, the target tracking object is determined jointly from the head region and the body region, which reduces missed and false detections caused by a pedestrian's body being occluded and improves the tracking effect. The method can therefore be applied to scenes with complex backgrounds and, in particular, can track targets quickly and accurately when multiple targets occlude one another, which gives it high practical value.
Embodiment 2
FIG. 2 is a structural diagram of the multi-target tracking apparatus provided in Embodiment 2 of the present application.
In some embodiments, the multi-target tracking apparatus 20 may include multiple functional modules composed of computer-readable instruction code segments. The code of each computer-readable instruction code segment in the multi-target tracking apparatus 20 may be stored in the memory of the computer device and executed by the at least one processor in order to detect multiple targets that occlude one another (see the description of FIG. 1 for details).
In this embodiment, the multi-target tracking apparatus 20 can be divided into multiple functional modules according to the functions it performs. The functional modules may include an acquisition module 201, a detection module 202, a training module 203, a calculation module 204, a judgment module 205, a determination module 206, a segmentation module 207, and a tracking module 208. A module referred to in this application is a series of computer-readable instruction segments that can be executed by at least one processor, can complete a fixed function, and are stored in a memory. The functions of the modules are described in detail below.
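Purely as an illustration of how these modules could be composed, the sketch below wires them together in Python; the method names on the module objects are assumptions made for the example (the training module 203 is omitted because it runs offline), not interfaces defined by this application.

```python
class MultiTargetTrackingApparatus:
    """Illustrative composition of the functional modules of apparatus 20."""

    def __init__(self, acquisition, detection, calculation, judgment,
                 determination, segmentation, tracking):
        self.acquisition = acquisition      # module 201
        self.detection = detection          # module 202
        self.calculation = calculation      # module 204
        self.judgment = judgment            # module 205
        self.determination = determination  # module 206
        self.segmentation = segmentation    # module 207
        self.tracking = tracking            # module 208

    def run(self, source):
        image = self.acquisition.acquire(source)
        heads, bodies = self.detection.detect(image)
        ratios = self.calculation.area_ratios(heads, bodies)
        if self.judgment.any_below_first_threshold(ratios):
            self.determination.mark_occluded(image, ratios)
            targets = self.segmentation.segment(image, heads, bodies)
        else:
            targets = bodies
        self.tracking.track(targets)
```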
The acquisition module 201 is configured to acquire an image to be detected that contains multiple targets.
In this embodiment, the image to be detected may be any suitable image on which target tracking needs to be performed, for example an image captured of a monitored area. The image to be detected may be a static image captured by an image acquisition device such as a camera, or any video frame of a video captured by such a device.
The image to be detected may be an original image, or an image obtained after preprocessing an original image.
In this embodiment, the image to be detected contains multiple pedestrians whose body parts may overlap considerably. The target tracking objects are therefore determined even when the body parts of multiple pedestrians overlap substantially, so that a pedestrian is not misdetected or missed because it is occluded by other pedestrians.
The detection module 202 is configured to invoke a preset first detection model to detect the head regions in the image to be detected.
In this embodiment, the first detection model can be trained in advance. By directly invoking the pre-trained first detection model, the multiple human body nodes of each human body in the image to be detected can be detected directly and quickly. The preset first detection model may be any of various deep-learning-based detection models, for example a detection model based on a neural network or a detection model based on a residual network.
The training module 203 is configured to train the first detection model in advance, where the training process of the first detection model includes:
1) acquiring multiple human body pictures, and manually annotating multiple human body nodes in the head region of each human body picture to obtain a sample picture set;
2) extracting a first preset proportion of the human body pictures from the sample picture set as the sample picture set to be trained, and extracting a second preset proportion of the human body pictures from the sample picture set as the sample picture set to be verified;
3) training a preset neural network with the sample picture set to be trained to obtain the first detection model, and verifying the trained first detection model with the sample picture set to be verified;
4) if the verification pass rate is greater than or equal to a preset threshold, the training of the first detection model is complete; otherwise, increasing the number of human body pictures in the sample picture set to be trained and training and verifying the first detection model again.
For example, assume that 100,000 human body pictures are acquired and that tools such as OpenPose or PoseMachine are used to annotate multiple human body nodes in the head region of each picture, for example the left-eye node, right-eye node, left-ear node, and right-ear node. A first preset proportion of the human body pictures is extracted as the sample picture set to be trained (the training set for short), and a second preset proportion is extracted as the sample picture set to be verified (the verification set for short). The number of pictures in the training set is much larger than that in the verification set; for example, 80% of the human body pictures are used as the training set and the remaining 20% as the verification set.
When the neural network is trained for the first time to obtain the first detection model, the network uses its default parameters; the parameters are then adjusted continuously during training. After the first detection model has been generated, it is verified with the human body pictures in the verification set. If the verification pass rate is greater than or equal to a preset threshold, for example 98%, training ends and the trained first detection model is used to recognize human body nodes. If the verification pass rate is below the preset threshold, for example below 98%, the number of human body pictures participating in training is increased and the above steps are repeated until the verification pass rate reaches the preset threshold.
During testing, the trained first detection model is used to recognize human body nodes in the pictures of the verification set, and the recognition results are compared with the annotated nodes of those pictures to evaluate the recognition performance of the trained first detection model.
Preferably, the detection module 202 invoking the preset first detection model to detect the head regions in the image to be detected includes:
1) invoking the preset first detection model to detect multiple human body nodes of each human body in the image to be detected;
In this embodiment, the multiple human body nodes of each human body in the image to be detected are detected by the preset first detection model, for example a neural network model.
A human body node may be an important location on the human body, such as a joint or a facial feature. The multiple human body nodes include at least several nodes of the head and neck. For example, the multiple human body nodes include one or more of a neck node, a nose-tip node, a left-eye node, a right-eye node, a left-ear node, and a right-ear node. In other embodiments, the multiple human body nodes determined by the preset first detection model further include at least a wrist node, an elbow node, and a shoulder node.
Each human body node represents the body region that contains that node; for example, the left-eye node represents the entire left-eye region of the human body rather than a single specific pixel.
2) determining the head region corresponding to each human body in the image to be detected according to the multiple human body nodes of that human body.
In this embodiment, the head region is a region determined from the multiple nodes of the head and neck and used to characterize the human head. For example, the head region is determined from the neck node, nose-tip node, left-eye node, right-eye node, left-ear node, and right-ear node. The determined head region may be rectangular, circular, elliptical, or of any other regular or irregular shape; this application does not specifically limit the shape of the determined head region.
In this embodiment, training the first detection model in advance may be an offline process, while invoking the first detection model to detect the head regions in the image to be detected may be an online process. That is, the image to be detected is used as the input of the first detection model, and the output is the human body node information in the image; for example, the top of the head, the eyes, the mouth, the chin, the ears, and the neck are each presented as a human body node. The human head is then enclosed by a geometric figure, such as a rectangular box, determined from these nodes; this rectangular box is called the head frame.
The detection module 202 is further configured to invoke a preset second detection model to detect the body regions in the image to be detected.
In this embodiment, after the image to be detected is acquired, a preset second detection model is invoked to detect the body regions in the image to be detected. The preset second detection model may be implemented with an accelerated region-based convolutional neural network (Faster-RCNN).
The preset second detection model is trained in advance on a large number of human body images, and may be trained before the image to be detected containing multiple targets is acquired. The process of training the second detection model in advance is similar to the process of training the first detection model described above and is not repeated here.
The body regions in the image to be detected are recognized by inputting the image to be detected into the second detection model.
In this embodiment, training the second detection model in advance may be an offline process, while invoking the preset second detection model to detect the body regions in the image to be detected may be an online process. That is, the image to be detected is used as the input of the second detection model, and the output is the human body information in the image; according to this information, each human body region is enclosed by a rectangular box, which is called the pedestrian frame.
Preferably, the preset first detection model is invoked to detect the head regions and the preset second detection model is invoked to detect the body regions in the image to be detected simultaneously, in a parallel processing manner. In this embodiment, feeding the image to be detected into the preset first detection model to determine the head regions and, at the same time, into the preset second detection model to determine the body regions saves processing time and improves processing efficiency.
The calculation module 204 is configured to calculate area ratios based on the head regions and the body regions.
In this embodiment, after the multiple head regions and multiple body regions in the image to be detected have been determined, an area ratio can be calculated for each head region and body region.
The area ratio is the ratio of the area of the intersection of a head region and a body region to the area of the head region.
Preferably, the calculation module 204 determining the area ratio based on the head region and the body region includes:
establishing a position coordinate system based on the image to be detected;
acquiring a first area, namely the area occupied by the head region in the position coordinate system;
acquiring a second area, namely the area occupied by the intersection of the head region and the body region in the position coordinate system;
calculating the area ratio based on the first area and the second area.
In this embodiment, the position coordinate system is established with the upper-left corner of the image to be detected as the origin, the top edge of the image as the X axis, and the left edge of the image as the Y axis.
After the position coordinate system has been established, the first position coordinates of the vertices of the head frame corresponding to the head region (taking a rectangular frame as an example) and the second position coordinates of the vertices of the body frame corresponding to the body region (again taking a rectangular frame as an example) are acquired. The first area of the head region is determined from the first position coordinates, the intersection of the head region and the body region is determined from the first and second position coordinates, the third position coordinates of the vertices of the intersection are then acquired, and the second area of the intersection is determined from the third position coordinates. Finally, the area ratio (Intersection over Union, IOU) is calculated from the first area and the second area.
The judgment module 205 is configured to determine whether any of the area ratios is smaller than a preset first threshold, where the preset first threshold is less than 1.
In general, for a given pedestrian the head region is contained in the body region, that is, the head frame is contained in the pedestrian frame. When the pedestrian is not occluded, the head region is completely contained in the body region and the calculated area ratio should be 1. When the pedestrian is partially occluded, the head region is only partially contained in the body region and the calculated area ratio is less than 1. When the pedestrian's body region is completely occluded, the head region is not contained in the body region at all and the calculated area ratio is 0.
In this embodiment, a first threshold can be preset; the preset first threshold is less than 1 and may be, for example, 0.7.
Whether any pedestrian in the image to be detected is occluded is determined by comparing the calculated area ratios with the preset threshold. In other words, the ratio of the intersection of the head frame and the pedestrian frame to the head frame measures how much the head frame and the pedestrian frame overlap, or equivalently whether the head frame matches the pedestrian frame. The larger the area ratio, the larger the overlap between the head frame and the pedestrian frame, and the better the head frame matches the pedestrian frame.
The determination module 206 is configured to determine, when an area ratio is smaller than the preset first threshold, that a pedestrian in the image to be detected is occluded.
In this embodiment, if multiple area ratios are calculated, each area ratio can be compared with the preset first threshold. If a target area ratio among them is smaller than the preset first threshold, the pedestrian corresponding to that target area ratio in the image to be detected is severely occluded. If every area ratio is greater than or equal to the preset first threshold, the pedestrians in the image to be detected are either not occluded or not severely occluded.
The segmentation module 207 is configured to segment the occluded pedestrian according to the head region and the body region.
In this embodiment, when it is determined that a pedestrian in the image to be detected is severely occluded, the severely occluded pedestrian can first be segmented from the image according to the head region and the body region.
Specifically, the segmentation module 207 segmenting the occluded pedestrian according to the head region and the body region includes:
determining whether the area ratio is greater than a preset second threshold, where the preset second threshold is less than the preset first threshold;
when the area ratio is greater than the preset second threshold, enlarging the body region according to a preset scale factor and segmenting the occluded pedestrian according to the enlarged body region;
when the area ratio is less than or equal to the preset second threshold, segmenting the occluded pedestrian by taking the central axis of the two head regions as the dividing line and the shoulder key points as the boundary.
Where crowds gather, there is a high probability that a pedestrian A will be occluded by a pedestrian B. In that case pedestrian B is detected without any problem, but because the body of pedestrian A is partially blocked by pedestrian B, two situations can arise: in the first, part of pedestrian A remains unoccluded; in the second, pedestrian A is almost completely occluded. A second threshold can be preset; the preset second threshold is less than the preset first threshold and may be, for example, 0.3. By further comparing the area ratio with the preset second threshold, it can be determined whether pedestrian A is almost completely occluded.
For the first situation: when the area ratio is greater than the preset second threshold but less than the preset first threshold, the corresponding pedestrian is severely occluded; the pedestrian's head region has been detected accurately, but it does not match the pedestrian's body. The confidence of such a detection is relatively low, and it is easily discarded as a false detection during post-processing. Enlarging the corresponding body region by a preset scale factor (for example, 1.5) before segmentation raises the detection confidence of the occluded pedestrian and thus reduces the risk of the detection being filtered out during post-processing.
For the second situation: when the area ratio is less than or equal to the preset second threshold, pedestrian A and pedestrian B share one body frame but correspond to two head frames. In this case the body frame can be marked as double, and pedestrian A is separated out along the central axis of the two head frames, with the shoulder key points serving as the left and right boundaries of the body.
The tracking module 208 is configured to invoke a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
In this embodiment, the preset tracking algorithm may be a multi-target tracking algorithm. After the pedestrians in the image to be detected have been segmented, target tracking can be performed on the segmented pedestrians and the unoccluded pedestrians.
The multi-target tracking algorithm itself is existing technology and is not elaborated here.
Preferably, the judgment module 205 is further configured to determine whether the area ratio is 1 when the area ratio is greater than or equal to the preset first threshold.
Preferably, the determination module 206 is further configured to determine the pedestrian corresponding to the body region as the target tracking object when the area ratio is 1.
Preferably, the determination module 206 is further configured to determine the pedestrian corresponding to the head region as the target tracking object when the area ratio is not 1.
Preferably, the tracking module 208 is further configured to invoke the preset tracking algorithm to track the target tracking object.
In this embodiment, when an area ratio is greater than or equal to the preset first threshold, that is, when the pedestrians in the image to be detected are either not occluded or not severely occluded, it is further determined whether the area ratio is 1, in order to distinguish pedestrians that are not occluded from pedestrians that are only slightly occluded.
If every area ratio is 1, the head region of each pedestrian in the image to be detected is completely contained in the body region, that is, the pedestrian is not occluded. Since the determined body region then covers the whole pedestrian, tracking with the pedestrian corresponding to the body region as the target tracking object gives a better result. If an area ratio is not 1, a pedestrian in the image to be detected is slightly occluded; since the head region is more distinctive, tracking with the pedestrian corresponding to the head region as the target tracking object gives a better result in that case.
In summary, the multi-target tracking apparatus described in this application first acquires an image to be detected that contains multiple targets, some of which may be occluded, invokes a preset first detection model and a preset second detection model to detect the head regions and body regions in the image respectively, and calculates an area ratio for each head region and body region. When an area ratio is smaller than the preset first threshold, it is determined that a pedestrian in the image is occluded, the occluded pedestrian is segmented according to the head region and the body region, and a preset tracking algorithm is finally invoked to track the segmented occluded pedestrian and the unoccluded pedestrians. The apparatus measures the occlusion of pedestrians through the area ratio, so that occluded pedestrians can still be detected; in addition, the target tracking object is determined jointly from the head region and the body region, which reduces missed and false detections caused by a pedestrian's body being occluded and improves the tracking effect. The apparatus can therefore be applied to scenes with complex backgrounds and, in particular, can track targets quickly and accurately when multiple targets occlude one another, which gives it high practical value.
Embodiment 3
Refer to FIG. 3, which is a schematic structural diagram of the computer device provided in Embodiment 3 of this application. In a preferred embodiment of this application, the computer device 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
Those skilled in the art should understand that the structure of the computer device shown in FIG. 3 does not limit the embodiments of this application: the computer device 3 may use either a bus topology or a star topology, and it may include more or fewer hardware or software components than shown, or a different arrangement of components.
In some embodiments, the computer device 3 is a device capable of automatically performing numerical calculation and/or information processing according to instructions set or stored in advance; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices. The computer device 3 may also include a client device, which includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote control, a touch panel, a voice-control device, or the like, for example a personal computer, a tablet computer, a smartphone, or a digital camera.
It should be noted that the computer device 3 is only an example; other existing or future electronic products that can be adapted to this application are also included in the scope of protection of this application and are incorporated herein by reference.
In some embodiments, the memory 31 is used to store computer-readable instruction code and various data, for example the multi-target tracking apparatus 20 installed in the computer device 3, and to provide high-speed, automatic access to the instructions or data while the computer device 3 is running. The memory 31 includes read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.
In some embodiments, the at least one processor 32 consists of one or more integrated circuits, for example a single packaged integrated circuit or several packaged integrated circuits with the same or different functions, and includes one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The at least one processor 32 is the control unit of the computer device 3: it connects the components of the whole computer device 3 through various interfaces and lines, and it executes the various functions of the computer device 3 and processes data, for example performing multi-target tracking, by running or executing the computer-readable instructions or modules stored in the memory 31 and by calling the data stored in the memory 31.
In some embodiments, the at least one communication bus 33 is configured to implement connection and communication between the memory 31, the at least one processor 32, and the other components.
Although not shown, the computer device 3 may also include a power supply that powers its components. Preferably, the power supply is logically connected to the at least one processor 32 through a power management device, so that charging, discharging, power-consumption management, and similar functions are handled by the power management device. The computer device 3 may further include various sensors, a Bluetooth module, a Wi-Fi module, and so on, which are not described in detail here.
The integrated units implemented in the form of software function modules described above may be stored in a non-volatile readable storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a computer device, or a network device) or a processor to execute parts of the methods described in the embodiments of this application.
In a further embodiment, with reference to FIG. 2, the at least one processor 32 can execute the operating system of the computer device 3 as well as the installed applications (such as the multi-target tracking apparatus 20), for example the modules described above.
The memory 31 stores computer-readable instruction code, and the at least one processor 32 can call the computer-readable instruction code stored in the memory 31 to perform the related functions. For example, the modules described in FIG. 2 are computer-readable instruction code stored in the memory 31 and executed by the at least one processor 32, so that the functions of these modules are realized and multi-target tracking is achieved.
In one embodiment of this application, the memory 31 stores multiple instructions, and the multiple instructions are executed by the at least one processor 32 to implement multi-target tracking. For the specific way in which the at least one processor 32 implements these instructions, reference may be made to the description of the relevant steps in the embodiment corresponding to FIG. 1, which is not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the division into modules is only a division by logical function, and other divisions may be used in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, the functional modules in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software function modules.
From whatever point of view, the embodiments should be regarded as exemplary and non-restrictive. The scope of this application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and scope of equivalents of the claims are therefore intended to be included in this application. Furthermore, the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or apparatuses recited in the apparatus claims may also be implemented by a single unit or apparatus through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of this application and not to limit them. Although this application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of this application may be modified or equivalently replaced without departing from the spirit and scope of the technical solutions of this application.
Claims (20)
- A multi-target tracking method, characterized in that the method comprises: acquiring an image to be detected that contains multiple targets; calling a preset first detection model to detect the head areas in the image to be detected; calling a preset second detection model to detect the body areas in the image to be detected; calculating area ratios from the head areas and the body areas; determining whether any of the area ratios is smaller than a preset first threshold, wherein the preset first threshold is less than 1; when an area ratio is smaller than the preset first threshold, determining that an occluded pedestrian exists in the image to be detected; segmenting the occluded pedestrian according to the head area and the body area; and calling a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
- The method according to claim 1, characterized in that, when the area ratio is greater than or equal to the preset first threshold, the method further comprises: determining whether the area ratio is 1; when the area ratio is 1, determining the pedestrian corresponding to the body area as the target tracking object; when the area ratio is not 1, determining the pedestrian corresponding to the head area as the target tracking object; and calling the preset tracking algorithm to track the target tracking object.
- The method according to claim 1, characterized in that segmenting the occluded pedestrian according to the head area and the body area comprises: determining whether the area ratio is greater than a preset second threshold, wherein the preset second threshold is less than the preset first threshold; when the area ratio is greater than the preset second threshold, expanding the body area according to a preset proportional coefficient; and segmenting the occluded pedestrian according to the expanded body area.
- The method according to claim 3, characterized in that, when the area ratio is less than or equal to the preset second threshold, the method further comprises: segmenting the occluded pedestrian by taking the central axis between the two head areas as the dividing line and the key points of the shoulders as the boundary.
- The method according to claim 1, characterized in that calling the preset first detection model to detect the head areas in the image to be detected and calling the preset second detection model to detect the body areas in the image to be detected are performed simultaneously by means of parallel processing.
- The method according to claim 1, characterized in that calling the preset first detection model to detect the head areas in the image to be detected comprises: calling the preset first detection model to detect multiple human-body nodes of each human body in the image to be detected; and determining the head area of each human body in the image to be detected according to the multiple human-body nodes of that human body.
- The method according to any one of claims 1 to 6, characterized in that calculating the area ratio from the head area and the body area comprises: establishing a position coordinate system according to the image to be detected; obtaining a first area of the head area in the position coordinate system; obtaining a second area, in the position coordinate system, of the intersection of the head area and the body area; and calculating the area ratio from the first area and the second area.
- A multi-target tracking apparatus, characterized in that the apparatus comprises: an acquiring module, configured to acquire an image to be detected that contains multiple targets; a detection module, configured to call a preset first detection model to detect the head areas in the image to be detected, and further configured to call a preset second detection model to detect the body areas in the image to be detected; a calculation module, configured to calculate area ratios from the head areas and the body areas; a judging module, configured to determine whether any of the area ratios is smaller than a preset first threshold, wherein the preset first threshold is less than 1; a segmentation module, configured to determine that an occluded pedestrian exists in the image to be detected when an area ratio is smaller than the preset first threshold, and to segment the occluded pedestrian according to the head area and the body area; and a tracking module, configured to call a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
- A computer device, characterized in that the computer device comprises a processor and a memory, the memory stores computer-readable instructions, and the processor executes the computer-readable instructions to implement the following steps: acquiring an image to be detected that contains multiple targets; calling a preset first detection model to detect the head areas in the image to be detected; calling a preset second detection model to detect the body areas in the image to be detected; calculating area ratios from the head areas and the body areas; determining whether any of the area ratios is smaller than a preset first threshold, wherein the preset first threshold is less than 1; when an area ratio is smaller than the preset first threshold, determining that an occluded pedestrian exists in the image to be detected; segmenting the occluded pedestrian according to the head area and the body area; and calling a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
- The computer device according to claim 9, characterized in that, when the area ratio is greater than or equal to the preset first threshold, the processor further executes the computer-readable instructions to implement the following steps: determining whether the area ratio is 1; when the area ratio is 1, determining the pedestrian corresponding to the body area as the target tracking object; when the area ratio is not 1, determining the pedestrian corresponding to the head area as the target tracking object; and calling the preset tracking algorithm to track the target tracking object.
- The computer device according to claim 9, characterized in that, when executing the computer-readable instructions to segment the occluded pedestrian according to the head area and the body area, the processor implements the following steps: determining whether the area ratio is greater than a preset second threshold, wherein the preset second threshold is less than the preset first threshold; when the area ratio is greater than the preset second threshold, expanding the body area according to a preset proportional coefficient; and segmenting the occluded pedestrian according to the expanded body area.
- The computer device according to claim 11, characterized in that, when the area ratio is less than or equal to the preset second threshold, the processor further executes the computer-readable instructions to implement the following step: segmenting the occluded pedestrian by taking the central axis between the two head areas as the dividing line and the key points of the shoulders as the boundary.
- The computer device according to claim 9, characterized in that, when executing the computer-readable instructions to call the preset first detection model to detect the head areas in the image to be detected, the processor implements the following steps: calling the preset first detection model to detect multiple human-body nodes of each human body in the image to be detected; and determining the head area of each human body in the image to be detected according to the multiple human-body nodes of that human body.
- The computer device according to any one of claims 9 to 13, characterized in that, when executing the computer-readable instructions to calculate the area ratio from the head area and the body area, the processor implements the following steps: establishing a position coordinate system according to the image to be detected; obtaining a first area of the head area in the position coordinate system; obtaining a second area, in the position coordinate system, of the intersection of the head area and the body area; and calculating the area ratio from the first area and the second area.
- A non-volatile readable storage medium storing computer-readable instructions, characterized in that, when the computer-readable instructions are executed by a processor, the following steps are implemented: acquiring an image to be detected that contains multiple targets; calling a preset first detection model to detect the head areas in the image to be detected; calling a preset second detection model to detect the body areas in the image to be detected; calculating area ratios from the head areas and the body areas; determining whether any of the area ratios is smaller than a preset first threshold, wherein the preset first threshold is less than 1; when an area ratio is smaller than the preset first threshold, determining that an occluded pedestrian exists in the image to be detected; segmenting the occluded pedestrian according to the head area and the body area; and calling a preset tracking algorithm to track the segmented occluded pedestrian and the unoccluded pedestrians.
- The storage medium according to claim 15, characterized in that, when the area ratio is greater than or equal to the preset first threshold, the computer-readable instructions, when executed by the processor, further implement the following steps: determining whether the area ratio is 1; when the area ratio is 1, determining the pedestrian corresponding to the body area as the target tracking object; when the area ratio is not 1, determining the pedestrian corresponding to the head area as the target tracking object; and calling the preset tracking algorithm to track the target tracking object.
- The storage medium according to claim 15, characterized in that, when the computer-readable instructions are executed by the processor to segment the occluded pedestrian according to the head area and the body area, the following steps are implemented: determining whether the area ratio is greater than a preset second threshold, wherein the preset second threshold is less than the preset first threshold; when the area ratio is greater than the preset second threshold, expanding the body area according to a preset proportional coefficient; and segmenting the occluded pedestrian according to the expanded body area.
- The storage medium according to claim 17, characterized in that, when the area ratio is less than or equal to the preset second threshold, the computer-readable instructions, when executed by the processor, further implement the following step: segmenting the occluded pedestrian by taking the central axis between the two head areas as the dividing line and the key points of the shoulders as the boundary.
- The storage medium according to claim 15, characterized in that, when the computer-readable instructions are executed by the processor to call the preset first detection model to detect the head areas in the image to be detected, the following steps are implemented: calling the preset first detection model to detect multiple human-body nodes of each human body in the image to be detected; and determining the head area of each human body in the image to be detected according to the multiple human-body nodes of that human body.
- The storage medium according to any one of claims 15 to 19, characterized in that, when the computer-readable instructions are executed by the processor to calculate the area ratio from the head area and the body area, the following steps are implemented: establishing a position coordinate system according to the image to be detected; obtaining a first area of the head area in the position coordinate system; obtaining a second area, in the position coordinate system, of the intersection of the head area and the body area; and calculating the area ratio from the first area and the second area.
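The following Python sketch is offered purely as a reading aid for the segmentation described in claims 3, 4, 11, 12, 17 and 18; it is not part of the claims and does not reproduce the disclosed implementation. The threshold and coefficient values, the (x1, y1, x2, y2) box format, and the straight vertical-axis split are assumptions, and the shoulder-key-point boundary of claims 4, 12 and 18 is simplified away.

```python
def expand_box(box, coeff=1.2):
    # Mild occlusion (ratio above the second threshold): enlarge the body box
    # around its centre by a preset proportional coefficient so that the whole
    # pedestrian is covered. The value 1.2 is illustrative only.
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w, half_h = (x2 - x1) * coeff / 2.0, (y2 - y1) * coeff / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

def split_along_head_axis(body_box, head_box_a, head_box_b):
    # Heavier occlusion: split the merged body box along the vertical axis midway
    # between the two head boxes. The claims additionally bound the split with the
    # shoulder key points; that refinement is omitted in this sketch.
    centre_a = (head_box_a[0] + head_box_a[2]) / 2.0
    centre_b = (head_box_b[0] + head_box_b[2]) / 2.0
    axis = (centre_a + centre_b) / 2.0
    x1, y1, x2, y2 = body_box
    left = (x1, y1, min(axis, x2), y2)
    right = (max(axis, x1), y1, x2, y2)
    return (left, right) if centre_a <= centre_b else (right, left)

def segment_occluded(body_box, head_box, other_head_box, ratio,
                     second_threshold=0.4, coeff=1.2):
    # Dispatcher mirroring claims 3 and 4: expand the body box for mild occlusion,
    # otherwise split between the two heads. second_threshold is a placeholder
    # value that must stay below the preset first threshold.
    if ratio > second_threshold:
        return [expand_box(body_box, coeff)]
    return list(split_along_head_axis(body_box, head_box, other_head_box))
```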
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910345956.8A CN110210302B (en) | 2019-04-26 | 2019-04-26 | Multi-target tracking method, device, computer equipment and storage medium |
CN201910345956.8 | | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020215552A1 true WO2020215552A1 (en) | 2020-10-29 |
Family
ID=67786374
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/102318 WO2020215552A1 (en) | 2019-04-26 | 2019-08-23 | Multi-target tracking method, apparatus, computer device, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110210302B (en) |
WO (1) | WO2020215552A1 (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111080697B (en) * | 2019-10-29 | 2024-04-09 | 京东科技信息技术有限公司 | Method, apparatus, computer device and storage medium for detecting direction of target object |
CN112101139B (en) * | 2020-08-27 | 2024-05-03 | 普联国际有限公司 | Human shape detection method, device, equipment and storage medium |
CN112330714B (en) * | 2020-09-29 | 2024-01-09 | 深圳大学 | Pedestrian tracking method and device, electronic equipment and storage medium |
CN112926410B (en) * | 2021-02-03 | 2024-05-14 | 深圳市维海德技术股份有限公司 | Target tracking method, device, storage medium and intelligent video system |
CN114220119B (en) * | 2021-11-10 | 2022-08-12 | 深圳前海鹏影数字软件运营有限公司 | Human body posture detection method, terminal device and computer readable storage medium |
CN117876968B (en) * | 2024-03-11 | 2024-05-28 | 盛视科技股份有限公司 | Dense pedestrian detection method combining multiple targets |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9792491B1 (en) * | 2014-03-19 | 2017-10-17 | Amazon Technologies, Inc. | Approaches for object tracking |
CN105303191A (en) * | 2014-07-25 | 2016-02-03 | 中兴通讯股份有限公司 | Method and apparatus for counting pedestrians in foresight monitoring scene |
- 2019-04-26: CN application CN201910345956.8A, patent CN110210302B (en), status: Active
- 2019-08-23: WO application PCT/CN2019/102318, publication WO2020215552A1 (en), status: Active, Application Filing
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150104066A1 (en) * | 2013-10-10 | 2015-04-16 | Canon Kabushiki Kaisha | Method for improving tracking in crowded situations using rival compensation |
CN108256404A (en) * | 2016-12-29 | 2018-07-06 | 北京旷视科技有限公司 | Pedestrian detection method and device |
CN108062536A (en) * | 2017-12-29 | 2018-05-22 | 纳恩博(北京)科技有限公司 | A kind of detection method and device, computer storage media |
CN108920997A (en) * | 2018-04-10 | 2018-11-30 | 国网浙江省电力有限公司信息通信分公司 | Judge that non-rigid targets whether there is the tracking blocked based on profile |
CN109035295A (en) * | 2018-06-25 | 2018-12-18 | 广州杰赛科技股份有限公司 | Multi-object tracking method, device, computer equipment and storage medium |
CN109446942A (en) * | 2018-10-12 | 2019-03-08 | 北京旷视科技有限公司 | Method for tracking target, device and system |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112308073A (en) * | 2020-11-06 | 2021-02-02 | 中冶赛迪重庆信息技术有限公司 | Method, system, equipment and medium for identifying loading and unloading transshipment state of scrap steel train |
CN112308073B (en) * | 2020-11-06 | 2023-08-25 | 中冶赛迪信息技术(重庆)有限公司 | Method, system, equipment and medium for identifying loading and unloading and transferring states of scrap steel train |
CN112530059A (en) * | 2020-11-24 | 2021-03-19 | 厦门熵基科技有限公司 | Channel gate inner draw-bar box judgment method, device, equipment and storage medium |
CN112489086A (en) * | 2020-12-11 | 2021-03-12 | 北京澎思科技有限公司 | Target tracking method, target tracking device, electronic device, and storage medium |
CN113158732A (en) * | 2020-12-31 | 2021-07-23 | 深圳市商汤科技有限公司 | Image processing method and related device |
CN113052049A (en) * | 2021-03-18 | 2021-06-29 | 国网内蒙古东部电力有限公司 | Off-duty detection method and device based on artificial intelligence tool identification |
CN113052049B (en) * | 2021-03-18 | 2023-12-19 | 国网内蒙古东部电力有限公司 | Off-duty detection method and device based on artificial intelligent tool identification |
CN113253357B (en) * | 2021-03-29 | 2023-06-30 | 航天信息股份有限公司 | Method and system for determining action state of target object based on light curtain |
CN113253357A (en) * | 2021-03-29 | 2021-08-13 | 航天信息股份有限公司 | Method and system for determining action state of target object based on light curtain |
CN113312995B (en) * | 2021-05-18 | 2023-02-14 | 华南理工大学 | Anchor-free vehicle-mounted pedestrian detection method based on central axis |
CN113312995A (en) * | 2021-05-18 | 2021-08-27 | 华南理工大学 | Anchor-free vehicle-mounted pedestrian detection method based on central axis |
CN113516093A (en) * | 2021-07-27 | 2021-10-19 | 浙江大华技术股份有限公司 | Marking method and device of identification information, storage medium and electronic device |
CN113516092A (en) * | 2021-07-27 | 2021-10-19 | 浙江大华技术股份有限公司 | Method and device for determining target behavior, storage medium and electronic device |
CN114332924A (en) * | 2021-12-17 | 2022-04-12 | 河北鼎联科技有限公司 | Information processing method, device, electronic equipment and storage medium |
CN115131827A (en) * | 2022-06-29 | 2022-09-30 | 珠海视熙科技有限公司 | Passenger flow human body detection method and device, storage medium and passenger flow statistical camera |
CN116912230A (en) * | 2023-08-11 | 2023-10-20 | 海格欧义艾姆(天津)电子有限公司 | Patch welding quality detection method and device, electronic equipment and storage medium |
CN117935171A (en) * | 2024-03-19 | 2024-04-26 | 中国联合网络通信有限公司湖南省分公司 | Target tracking method and system based on gesture key points |
Also Published As
Publication number | Publication date |
---|---|
CN110210302A (en) | 2019-09-06 |
CN110210302B (en) | 2023-06-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020215552A1 (en) | Multi-target tracking method, apparatus, computer device, and storage medium | |
CN109508688B (en) | Skeleton-based behavior detection method, terminal equipment and computer storage medium | |
CN107358149B (en) | Human body posture detection method and device | |
CN102831439B (en) | Gesture tracking method and system | |
US20220180534A1 (en) | Pedestrian tracking method, computing device, pedestrian tracking system and storage medium | |
CN109598229B (en) | Monitoring system and method based on action recognition | |
US8675917B2 (en) | Abandoned object recognition using pedestrian detection | |
US10803604B1 (en) | Layered motion representation and extraction in monocular still camera videos | |
US11062126B1 (en) | Human face detection method | |
CN113177469A (en) | Training method and device for human body attribute detection model, electronic equipment and medium | |
TWI776176B (en) | Device and method for scoring hand work motion and storage medium | |
WO2022252737A1 (en) | Image processing method and apparatus, processor, electronic device, and storage medium | |
CN111753724A (en) | Abnormal behavior identification method and device | |
CN111325133A (en) | Image processing system based on artificial intelligence recognition | |
WO2021022698A1 (en) | Following detection method and apparatus, and electronic device and storage medium | |
CN113378836A (en) | Image recognition method, apparatus, device, medium, and program product | |
CN116524435A (en) | Online invigilation method based on electronic fence and related equipment | |
CN110348272B (en) | Dynamic face recognition method, device, system and medium | |
CN111985331B (en) | Detection method and device for preventing trade secret from being stolen | |
CN111597889B (en) | Method, device and system for detecting target movement in video | |
EP4273816A1 (en) | Reducing false positive identifications during video conferencing tracking and detection | |
CN113762221B (en) | Human body detection method and device | |
Bharathi et al. | A Conceptual Real-Time Deep Learning Approach for Object Detection, Tracking and Monitoring Social Distance using Yolov5 | |
CN111325132A (en) | Intelligent monitoring system | |
KR100729265B1 (en) | A face detection method using difference image and color information |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19926699; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 19926699; Country of ref document: EP; Kind code of ref document: A1 |