CN112101139A

CN112101139A - Human shape detection method, device, equipment and storage medium

Info

Publication number: CN112101139A
Application number: CN202010875179.0A
Authority: CN
Inventors: 肖传利
Original assignee: Pulian International Co ltd
Current assignee: Pulian International Co ltd
Priority date: 2020-08-27
Filing date: 2020-08-27
Publication date: 2020-12-18
Anticipated expiration: 2040-08-27
Also published as: CN112101139B

Abstract

The invention discloses a human shape detection method, apparatus, device and storage medium. The method comprises: acquiring an image to be detected; extracting features from the image to be detected to obtain a feature map of the image to be detected; performing human shape detection on the feature map to obtain a target human shape region of the image to be detected, wherein the target human shape region comprises a target whole-body region, a target upper-body region and a target head-shoulder region; and marking the target human shape region in the image to be detected. With the embodiments of the invention, effective human shape detection can be achieved even when the human shape target is incomplete, which improves the accuracy of human shape detection.

Description

Human shape detection method, device, equipment and storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a human form detection method, apparatus, device, and storage medium.

Background

With the development of computing technology, cameras have become part of everyday life, and human shape detection is now widely used in mobile devices such as cameras. Compared with face detection and similar techniques, human shape detection can effectively raise alerts on targets under different postures and lighting conditions.

At present, a human shape detection result is generally obtained by detecting the complete human shape in the image to be detected. However, in the process of implementing the present invention, the inventor found that, in practice, a human shape target is easily occluded by other objects or other pedestrians. Because existing human shape detection methods can only detect a complete human shape, they fail to detect a human shape target that is partially occluded, and their detection accuracy is therefore low.

Disclosure of Invention

The embodiments of the invention provide a human shape detection method, apparatus, device and storage medium, which can detect a target whole-body region, a target upper-body region and a target head-shoulder region at the same time during human shape detection, thereby improving the accuracy of human shape detection.

An embodiment of the present invention provides a human form detection method, including:

acquiring an image to be detected;

extracting features from the image to be detected to obtain a feature map of the image to be detected;

performing human shape detection on the feature map of the image to be detected to obtain a target human shape region of the image to be detected; wherein the target human shape region comprises a target whole-body region, a target upper-body region and a target head-shoulder region;

and marking the target human-shaped area in the image to be detected.

As an improvement of the above scheme, performing human shape detection on the feature map of the image to be detected to obtain the target human shape region of the image to be detected specifically includes:

performing human-shaped whole-body region detection on the feature map of the image to be detected through a pre-trained whole-body detector to obtain a whole-body region detection result;

when the whole-body region detection result comprises a plurality of candidate whole-body regions, determining a first candidate upper body region and a first candidate head-shoulder region in the plurality of candidate whole-body regions according to the positions and the sizes of a preset upper body detection region and a head-shoulder detection region in a standard whole-body template based on the length-width proportional relation between the plurality of candidate whole-body regions and the preset standard whole-body template;

judging whether a first candidate upper body region in the plurality of candidate whole body regions contains a humanoid upper body through a pre-trained first upper body detector to obtain first judgment results corresponding to the plurality of candidate whole body regions;

judging whether a first candidate head-shoulder area in the candidate whole-body areas contains a humanoid head-shoulder through a pre-trained first head-shoulder detector to obtain second judgment results corresponding to the candidate whole-body areas;

determining that the corresponding candidate whole-body region with the first judgment result and the second judgment result being both yes is the target whole-body region of the image to be detected;

performing human-shaped upper body region detection on the feature map of the image to be detected through a pre-trained second upper body detector and a pre-trained second head-shoulder detector to obtain a target upper body region of the image to be detected;

and performing human-shaped head-shoulder region detection on the feature map of the image to be detected through a pre-trained third head-shoulder detector to obtain a target head-shoulder region of the image to be detected.

As an improvement of the above scheme, performing human-shaped upper body region detection on the feature map of the image to be detected through the pre-trained second upper body detector and the pre-trained second head-shoulder detector to obtain the target upper body region of the image to be detected specifically includes:

performing human-shaped upper body region detection on the feature map of the image to be detected through the pre-trained second upper body detector to obtain an upper body region detection result;

when the upper body area detection result comprises a plurality of second candidate upper body areas, determining second candidate head and shoulder areas in the plurality of candidate upper body areas according to the positions and sizes of the head and shoulder detection areas in the upper body detection area based on the length-width proportional relation between the plurality of second candidate upper body areas and the upper body detection area in the standard whole-body template;

judging whether a second candidate head-shoulder area in the plurality of second candidate upper body areas contains a human-shaped head-shoulder through a pre-trained second head-shoulder detector to obtain a third judgment result corresponding to the plurality of second candidate upper body areas;

and determining that the corresponding second candidate upper body area with the third judgment result being yes is the target upper body area of the image to be detected.

As an improvement of the above scheme, before the human shape detection is performed on the feature map of the image to be detected to obtain the target human shape region of the image to be detected, the method further includes:

acquiring a pedestrian data training set; the pedestrian data training set comprises a plurality of human-shaped whole body sample graphs with the sizes unified to preset standard sizes, and each human-shaped whole body sample graph is marked with a human-shaped upper body area and a head-shoulder area through a circumscribed rectangle frame;

determining the central positions and the sizes of the head and shoulder detection windows according to the information of the upper body areas and the head and shoulder areas marked in the plurality of human-shaped whole body sample images;

zooming the human-shaped whole body sample images according to the preset three sizes respectively to obtain a plurality of human-shaped whole body sample images corresponding to the three sizes respectively; wherein the three dimensions include a first dimension, a second dimension, and a third dimension, the first dimension being greater than the second dimension, the second dimension being greater than the third dimension;

based on the proportional relation between the standard sizes and the three sizes, according to the central position and the size of the head and shoulder detection window, performing head and shoulder region extraction on the plurality of humanoid whole body sample pictures respectively corresponding to the three sizes to obtain a plurality of humanoid head and shoulder sample pictures respectively corresponding to the three sizes;

respectively extracting upper body regions of the plurality of humanoid whole body sample diagrams with the third size and the plurality of humanoid whole body sample diagrams with the second size according to the central position and the size of the upper body detection window on the basis of the proportional relation between the standard size and the third size and the proportional relation between the standard size and the second size, and obtaining the plurality of humanoid upper body sample diagrams with the third size and the plurality of humanoid upper body sample diagrams with the second size;

training six pre-acquired preset classifiers according to the plurality of human-shaped whole body sample diagrams of the third size, the plurality of human-shaped upper body sample diagrams of the second size and the plurality of human-shaped head and shoulder sample diagrams respectively corresponding to the three sizes to obtain the whole body detector, the first upper body detector, the second upper body detector, the first head and shoulder detector, the second head and shoulder detector and the third head and shoulder detector.
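As a rough illustration of the proportional extraction described above, the following Python sketch scales a detection window defined on the standard-size sample onto a rescaled sample and converts it to a crop box. The function names, the 64x128 standard size, the three training sizes and the window values are all hypothetical; the source does not specify any of them.

```python
def scale_window(window, standard_size, target_size):
    """Map a window defined on the standard-size sample onto a rescaled sample.

    window: (cx, cy, w, h), centre and size in standard-size pixel coordinates.
    standard_size, target_size: (width, height) of the sample images.
    """
    sx = target_size[0] / standard_size[0]
    sy = target_size[1] / standard_size[1]
    cx, cy, w, h = window
    return (cx * sx, cy * sy, w * sx, h * sy)


def window_to_crop_box(window):
    """Convert (cx, cy, w, h) into integer (left, top, right, bottom)."""
    cx, cy, w, h = window
    left = int(round(cx - w / 2))
    top = int(round(cy - h / 2))
    return (left, top, left + int(round(w)), top + int(round(h)))


# Hypothetical values chosen only for illustration.
STANDARD = (64, 128)
SIZES = {"first": (128, 256), "second": (64, 128), "third": (32, 64)}
head_shoulder_window = (32, 20, 40, 32)  # illustrative centre and size

crops = {
    name: window_to_crop_box(scale_window(head_shoulder_window, STANDARD, size))
    for name, size in SIZES.items()
}
```

Each entry of `crops` is the box that would be cut out of the sample image rescaled to the corresponding size.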

As a modification of the above scheme, the size of the preset standard whole-body template is equal to the preset standard size;

before performing human shape detection on the feature map of the image to be detected to obtain the target human shape region of the image to be detected, and after acquiring the pedestrian data training set, the method further includes:

determining the central positions of the upper body detection area and the head and shoulder detection area in the standard whole-body template according to the information of the upper body area and the head and shoulder area marked in the plurality of human-shaped whole-body sample images;

and determining the sizes of the upper body detection area and the head and shoulder detection area according to the information of the upper body area and the head and shoulder area marked in the plurality of human-shaped whole body sample images.

As an improvement of the above scheme, the determining the sizes of the upper body detection region and the head-shoulder detection region according to the information of the upper body region and the head-shoulder region marked in the plurality of human-shaped whole body sample drawings specifically includes:

calculating the minimum width of the upper body detection area that satisfies a first preset condition, and taking that minimum width as the width of the upper body detection area; wherein the first preset condition is that the proportion of whole-body sample images whose upper body region is completely contained within the upper body detection area in the transverse range, relative to the total number of the plurality of human-shaped whole-body sample images, is greater than a preset proportion threshold;

calculating the minimum height of the upper body detection area that satisfies a second preset condition, and taking that minimum height as the height of the upper body detection area; wherein the second preset condition is that the proportion of whole-body sample images whose upper body region is completely contained within the upper body detection area in both the transverse range and the longitudinal range, relative to the total number of the plurality of human-shaped whole-body sample images, is greater than the preset proportion threshold;

calculating the minimum width of the head-shoulder detection area that satisfies a third preset condition, and taking that minimum width as the width of the head-shoulder detection area; wherein the third preset condition is that the proportion of whole-body sample images whose head-shoulder region is completely contained within the head-shoulder detection area in the transverse range, relative to the total number of the plurality of human-shaped whole-body sample images, is greater than the preset proportion threshold;

calculating the minimum height of the head-shoulder detection area that satisfies a fourth preset condition, and taking that minimum height as the height of the head-shoulder detection area; wherein the fourth preset condition is that the proportion of whole-body sample images whose head-shoulder region is completely contained within the head-shoulder detection area in both the transverse range and the longitudinal range, relative to the total number of the plurality of human-shaped whole-body sample images, is greater than the preset proportion threshold.
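The four conditions above share one pattern: find the smallest extent for which more than a threshold fraction of the samples fit. A minimal Python sketch of that pattern follows. It assumes the centre of the detection area has already been determined, that annotations are (left, top, right, bottom) boxes on standard-size samples, and that the height is searched with the width already fixed; the function names and the sample values are hypothetical.

```python
import math


def min_extent(required, total, ratio_threshold):
    """Smallest extent such that strictly more than ratio_threshold of all
    `total` samples fit; returns None when no extent can satisfy it."""
    needed = math.floor(ratio_threshold * total) + 1
    if needed > len(required):
        return None
    return sorted(required)[needed - 1]


def fit_detection_region(boxes, center, ratio_threshold):
    """boxes: annotated part regions, one per whole-body sample.
    center: (cx, cy) of the detection area (assumed to be known already)."""
    cx, cy = center
    n = len(boxes)
    # Width each sample needs for its box to fit horizontally around cx.
    need_w = [2 * max(cx - l, r - cx) for (l, t, r, b) in boxes]
    width = min_extent(need_w, n, ratio_threshold)
    if width is None:
        return None
    # Height needed, counted only for samples already covered horizontally,
    # so that the sample is contained in both ranges.
    need_h = [
        2 * max(cy - t, b - cy)
        for (l, t, r, b), w in zip(boxes, need_w)
        if w <= width
    ]
    height = min_extent(need_h, n, ratio_threshold)
    return None if height is None else (width, height)


# Hypothetical upper-body annotations on four samples.
sample_boxes = [(20, 10, 44, 70), (18, 12, 46, 66), (22, 8, 40, 72), (10, 5, 54, 80)]
region_half = fit_detection_region(sample_boxes, center=(32, 40), ratio_threshold=0.5)
region_most = fit_detection_region(sample_boxes, center=(32, 40), ratio_threshold=0.9)
```

With a threshold of 0.5 the outlier sample is excluded and the area stays compact; raising the threshold to 0.9 forces the area to grow until it also covers the widest annotation.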

As an improvement of the above scheme, the marking a target human-shaped region in the image to be detected specifically includes:

merging the target human-shaped areas in the images to be detected to obtain merged target human-shaped areas;

and marking the combined target human-shaped area in the image to be detected.

Another embodiment of the present invention provides a human form detecting apparatus, including:

the image acquisition module is used for acquiring an image to be detected;

the feature extraction module is used for extracting features from the image to be detected to obtain a feature map of the image to be detected;

the human shape detection module is used for performing human shape detection on the feature map of the image to be detected to obtain a target human shape region of the image to be detected; wherein the target human shape region comprises a target whole-body region, a target upper-body region and a target head-shoulder region;

and the human-shaped marking module is used for marking the target human-shaped area in the image to be detected.

Another embodiment of the present invention further provides a human form detection device, which includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, and the processor implements the human form detection method according to any one of the above items when executing the computer program.

Another embodiment of the present invention further provides a computer-readable storage medium, which includes a stored computer program, where when the computer program runs, the computer-readable storage medium is controlled to execute the human form detection method according to any one of the above.

Compared with the prior art, the human shape detection method, apparatus, device and storage medium provided by the embodiments of the invention extract features from the acquired image to be detected to obtain its feature map, perform human shape detection on the feature map to obtain the target human shape region of the image to be detected, and then mark the target human shape region in the image to be detected, thereby realizing human shape detection. As the above analysis shows, during human shape detection on the feature map the embodiments detect not only the target whole-body region but also the target upper-body region and the target head-shoulder region. Therefore, when the part of a human shape target below the head and shoulders is occluded, effective human shape detection can still be achieved, and the accuracy of human shape detection is improved.

Drawings

Fig. 1 is a schematic flow chart of a human form detection method according to an embodiment of the present invention.

Fig. 2 is a schematic structural diagram of a human shape detection apparatus according to an embodiment of the present invention.

Fig. 3 is a schematic structural diagram of a human shape detection device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The human form detection method provided by the embodiment of the invention comprises the following steps:

S11, acquiring an image to be detected.

S12, performing feature extraction on the image to be detected to obtain a feature map of the image to be detected.

It should be noted that features can be extracted in various ways, for example ACF feature extraction, HOG feature extraction, or feature extraction using a convolutional neural network. In a specific implementation, the feature extraction method may be selected according to the actual situation and is not limited herein.

S13, detecting the human shape of the characteristic diagram of the image to be detected to obtain a target human shape area of the image to be detected; wherein the target humanoid region comprises a target whole body region, a target upper body region and a target head-shoulder region.

Specifically, for example, human-shaped whole-body region detection may be performed on the feature map of the image to be detected by a pre-trained whole-body detector to obtain the target whole-body region; human-shaped upper body region detection may be performed on the feature map by a pre-trained upper body detector to obtain the target upper body region; and human-shaped head-shoulder region detection may be performed on the feature map by a pre-trained head-shoulder detector to obtain the target head-shoulder region. In this way, when the part of a human-shaped target below the head and shoulders is occluded, the head-shoulder and the upper body can still be detected, so that a relatively accurate target human-shaped region is obtained, human-shaped targets in the image are not missed, and the accuracy of human shape detection is improved. The whole-body detector is a pre-trained classifier that detects the human-shaped whole-body regions contained in an input image from the feature map of that image; the upper body detector is a pre-trained classifier that detects the human-shaped upper body regions in the same way; and the head-shoulder detector is a pre-trained classifier that detects the human-shaped head-shoulder regions in the same way. In one embodiment, detection by the whole-body detector, the upper body detector and the head-shoulder detector may be performed simultaneously to improve the efficiency of human shape detection.
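The simultaneous execution mentioned above could be structured as in the following Python sketch, which submits every detector to a thread pool and collects the results by region name. The detectors here are placeholder callables, not the trained classifiers, and all names are hypothetical. Note also that in CPython, threads only speed up detectors that release the interpreter lock (for example native code); the sketch shows the structure rather than a measured gain.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_humanoid_regions(feature_map, detectors):
    """Run every detector on the same feature map and collect results by name.

    detectors: dict mapping a region name to a callable that takes the
    feature map and returns a list of (x, y, w, h) boxes.
    """
    with ThreadPoolExecutor(max_workers=len(detectors)) as pool:
        futures = {
            name: pool.submit(detector, feature_map)
            for name, detector in detectors.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Placeholder detectors standing in for the trained classifiers.
stub_detectors = {
    "whole_body": lambda fm: [],  # e.g. legs occluded, nothing found
    "upper_body": lambda fm: [(40, 30, 48, 64)],
    "head_shoulder": lambda fm: [(48, 30, 32, 28)],
}
regions = detect_humanoid_regions(feature_map=None, detectors=stub_detectors)
```

In this stubbed example the whole-body detector finds nothing, yet the upper body and head-shoulder results still report the person.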

S14, marking the target human-shaped area in the image to be detected.

After the target human-shaped area of the image to be detected is obtained, the target human-shaped area in the image to be detected can be marked with a circumscribed rectangular frame.

Specifically, step S14 includes:

S141, merging the target human-shaped areas in the image to be detected to obtain merged target human-shaped areas;

S142, marking the merged target human-shaped areas in the image to be detected.

It should be noted that, in the human shape detection process, a single human-shaped target is very likely to be marked as multiple regions, which easily causes tracking errors and tracking loss in subsequent tracking. The target human-shaped regions are therefore merged before marking. For example, whether two target human-shaped regions need to be merged may be determined by checking whether the distance between their centers is within a threshold range, or by checking whether the two regions intersect; the regions determined to need merging are then merged to obtain the merged target human-shaped region.
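A minimal Python sketch of the two merge criteria is given below. The source does not say how two regions are combined once they qualify; taking their bounding union is an assumption made here, as are the function names and the (x, y, w, h) box layout.

```python
import math


def center_distance(a, b):
    """Euclidean distance between the centres of two (x, y, w, h) boxes."""
    ax, ay = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx, by = b[0] + b[2] / 2, b[1] + b[3] / 2
    return math.hypot(ax - bx, ay - by)


def intersects(a, b):
    """True when two (x, y, w, h) boxes overlap with non-zero area."""
    return (a[0] < b[0] + b[2] and b[0] < a[0] + a[2]
            and a[1] < b[1] + b[3] and b[1] < a[1] + a[3])


def union(a, b):
    """Smallest box containing both inputs (assumed merge rule)."""
    x1, y1 = min(a[0], b[0]), min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x1, y1, x2 - x1, y2 - y1)


def merge_regions(boxes, max_center_dist=None):
    """Merge qualifying pairs repeatedly until no pair qualifies.

    With max_center_dist=None the intersection test is used; otherwise
    two boxes merge when their centre distance is within the threshold.
    """
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if max_center_dist is None:
                    should_merge = intersects(boxes[i], boxes[j])
                else:
                    should_merge = center_distance(boxes[i], boxes[j]) <= max_center_dist
                if should_merge:
                    boxes[i] = union(boxes[i], boxes[j])
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


detections = [(10, 10, 40, 80), (20, 10, 30, 40), (200, 50, 30, 30)]
merged_regions = merge_regions(detections)
```

Here a whole-body box and an upper body box of the same person collapse into one region, while the distant third box is left untouched.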

In the human shape detection method, apparatus, device and storage medium described above, features are extracted from the acquired image to be detected to obtain its feature map; human shape detection is then performed on the feature map by a pre-trained whole-body detector, a pre-trained upper body detector and a pre-trained head-shoulder detector to obtain the target human shape region of the image to be detected; and the target human shape region is then marked in the image to be detected, thereby realizing human shape detection. As the above analysis shows, during human shape detection on the feature map the embodiment detects not only the target whole-body region but also the target upper-body region and the target head-shoulder region. Therefore, when the part of a human shape target below the head and shoulders is occluded, effective human shape detection can still be achieved, and the accuracy of human shape detection is improved.

In this embodiment of the present invention, the step S13 specifically includes:

S131, performing human-shaped whole-body region detection on the feature map of the image to be detected through a pre-trained whole-body detector to obtain a whole-body region detection result.

It can be understood that detecting human-shaped whole-body regions in the feature map of the image to be detected with the pre-trained whole-body detector has one of the following outcomes: first, no human-shaped whole-body region is detected, and the output whole-body region detection result contains no candidate whole-body region; second, a plurality of candidate whole-body regions are detected, and the output whole-body region detection result contains information such as the positions and sizes of the plurality of candidate whole-body regions.

S132, when the whole body region detection result comprises a plurality of candidate whole body regions, determining a first candidate upper body region and a first candidate head-shoulder region in the plurality of candidate whole body regions according to the positions and the sizes of a preset upper body detection region and a head-shoulder detection region in a standard whole body template based on the length-width proportional relation between the plurality of candidate whole body regions and the preset standard whole body template.

Specifically, a standard whole-body template marked with an upper body detection region and a head-shoulder detection region may be preset. When the whole-body region detection result includes a plurality of candidate whole-body regions, the position and size of the region to be checked for an upper body and of the region to be checked for a head-shoulder in each candidate whole-body region are determined from the positions and sizes of the upper body detection region and the head-shoulder detection region in the standard whole-body template, based on the length-width proportional relationship between that candidate whole-body region and the preset standard whole-body template. This gives the position and size of the first candidate upper body region and of the first candidate head-shoulder region in each candidate whole-body region.
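The proportional mapping from the standard whole-body template to a candidate whole-body region can be sketched as follows in Python. The template size and the two sub-region boxes are hypothetical values chosen for illustration; boxes are (x, y, w, h) with the sub-regions given in template coordinates.

```python
def map_template_region(candidate, template_size, template_region):
    """Project a sub-region of the standard template into a candidate box.

    candidate: (x, y, w, h) whole-body box in image coordinates.
    template_size: (width, height) of the standard whole-body template.
    template_region: (x, y, w, h) of the sub-region inside the template.
    """
    cand_x, cand_y, cand_w, cand_h = candidate
    sx = cand_w / template_size[0]  # width ratio candidate / template
    sy = cand_h / template_size[1]  # height ratio candidate / template
    tx, ty, tw, th = template_region
    return (cand_x + tx * sx, cand_y + ty * sy, tw * sx, th * sy)


# Hypothetical template layout, for illustration only.
TEMPLATE_SIZE = (64, 128)
UPPER_BODY_IN_TEMPLATE = (8, 4, 48, 64)
HEAD_SHOULDER_IN_TEMPLATE = (12, 4, 40, 32)

candidate_whole_body = (100, 50, 32, 64)
first_candidate_upper = map_template_region(
    candidate_whole_body, TEMPLATE_SIZE, UPPER_BODY_IN_TEMPLATE)
first_candidate_head_shoulder = map_template_region(
    candidate_whole_body, TEMPLATE_SIZE, HEAD_SHOULDER_IN_TEMPLATE)
```

Because the width and height ratios are computed separately, the mapping also handles candidates whose aspect ratio differs slightly from the template.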

S133, judging whether a first candidate upper body region in the plurality of candidate whole body regions contains the humanoid upper body through a pre-trained first upper body detector, and obtaining first judgment results corresponding to the plurality of candidate whole body regions.

To avoid spending time on repeated feature extraction, the feature map of the first candidate upper body region in each of the plurality of candidate whole body regions may be cropped from the feature map of the image to be detected according to the position and size of that first candidate upper body region, so that features need not be extracted from the image again. The feature maps of the first candidate upper body regions are then input into the first upper body detector, which judges whether the first candidate upper body region in each candidate whole body region contains a humanoid upper body, thereby obtaining the first judgment results corresponding to the plurality of candidate whole body regions.

S134, judging whether a first candidate head-shoulder area in the candidate whole-body areas contains a humanoid head-shoulder through a pre-trained first head-shoulder detector, and obtaining second judgment results corresponding to the candidate whole-body areas.

Likewise, to avoid spending time on repeated feature extraction, the feature map of the first candidate head-shoulder region in each of the plurality of candidate whole body regions may be cropped from the feature map of the image to be detected according to the position and size of that first candidate head-shoulder region, so that features need not be extracted from the image again. The feature maps of the first candidate head-shoulder regions are then input into the first head-shoulder detector, which judges whether the first candidate head-shoulder region in each candidate whole body region contains a human-shaped head-shoulder, thereby obtaining the second judgment results corresponding to the plurality of candidate whole body regions.
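Reusing the already computed feature map amounts to slicing out the cells that cover a candidate region. The Python sketch below assumes the feature map is a list of rows and that each feature cell covers `stride` image pixels in each direction; both the layout and the stride are assumptions, since the source does not fix a feature representation.

```python
def crop_feature_map(feature_map, region, stride=1):
    """Return the part of an existing feature map that covers `region`.

    feature_map: list of rows, each a list of feature values or vectors.
    region: (x, y, w, h) in image pixels.
    stride: image pixels per feature cell (assumed layout).
    """
    x, y, w, h = region
    col0, row0 = int(x // stride), int(y // stride)
    # Always keep at least one cell so that tiny regions are not empty.
    col1 = max(col0 + 1, int((x + w) // stride))
    row1 = max(row0 + 1, int((y + h) // stride))
    return [row[col0:col1] for row in feature_map[row0:row1]]


# Toy 8x8 feature map; with stride 4 it corresponds to a 32x32 image.
toy_map = [[r * 10 + c for c in range(8)] for r in range(8)]
patch = crop_feature_map(toy_map, region=(8, 4, 16, 12), stride=4)
```

The returned patch can be passed directly to a part detector, so the cost of feature extraction is paid only once per image.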

S135, determining each candidate whole-body region for which the first judgment result and the second judgment result are both yes as a target whole-body region of the image to be detected.

When the first judgment result and the second judgment result corresponding to a candidate whole-body region are both yes, that candidate whole-body region is judged to be a target whole-body region of the image to be detected.
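The acceptance rule of steps S133 to S135 can be sketched in Python as below. The two checks are passed in as callables that stand in for the trained first upper body detector and first head-shoulder detector; the candidate dictionary layout and the stub rules are hypothetical. Skipping the head-shoulder check once the upper body check fails is an optimization assumed here, not something the steps above require.

```python
def verify_whole_body_candidates(candidates, upper_ok, head_shoulder_ok):
    """Keep the candidates for which both part checks answer yes.

    candidates: list of dicts with keys 'box', 'upper' and 'head_shoulder',
    each holding an (x, y, w, h) region.
    upper_ok, head_shoulder_ok: callables returning True or False.
    """
    targets = []
    for cand in candidates:
        first_result = upper_ok(cand["upper"])
        # Evaluate the second check only when the first one passed.
        second_result = first_result and head_shoulder_ok(cand["head_shoulder"])
        if first_result and second_result:
            targets.append(cand["box"])
    return targets


# Stub checks used only to exercise the control flow.
stub_upper_ok = lambda region: region[2] >= 20
stub_head_shoulder_ok = lambda region: region[3] >= 10

whole_body_candidates = [
    {"box": (100, 50, 32, 64), "upper": (104, 52, 24, 32),
     "head_shoulder": (106, 52, 20, 16)},
    {"box": (300, 80, 16, 32), "upper": (302, 81, 12, 16),
     "head_shoulder": (303, 81, 10, 8)},
]
target_whole_bodies = verify_whole_body_candidates(
    whole_body_candidates, stub_upper_ok, stub_head_shoulder_ok)
```

Only the first candidate survives, because the second fails the stub upper body check.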

S136, performing human-shaped upper body region detection on the feature map of the image to be detected through a pre-trained second upper body detector and a pre-trained second head-shoulder detector to obtain the target upper body region of the image to be detected.

The feature map of the image to be detected is input into the second upper body detector to detect human-shaped upper body regions. A second candidate head-shoulder region is obtained from the positional relation between the upper body region and the head-shoulder region, and the second head-shoulder detector judges whether the second candidate head-shoulder region contains a human-shaped head-shoulder, giving a third judgment result. The candidate upper body region whose third judgment result is yes is determined and output as the target upper body region of the image to be detected.

S137, performing human-shaped head-shoulder region detection on the feature map of the image to be detected through a pre-trained third head-shoulder detector to obtain the target head-shoulder region of the image to be detected.

The feature map of the image to be detected is input into the third head-shoulder detector to detect human-shaped head-shoulder regions, and the target head-shoulder region of the image to be detected is output.

Illustratively, the whole-body detector may be used to detect a distant human-shaped whole-body target, the first upper body detector to detect the upper body region in the distant human-shaped whole-body target, and the first head-shoulder detector to detect the head-shoulder region in the distant human-shaped whole-body target. The second upper body detector is used to detect a human-shaped upper body target at a short distance, and the second head-shoulder detector to detect the head-shoulder region in that short-distance upper body target. The third head-shoulder detector is used to detect a human-shaped head-shoulder target at an even shorter distance. Accordingly, the detection size of the first upper body detector is smaller than that of the second upper body detector, the detection size of the first head-shoulder detector is smaller than that of the second head-shoulder detector, and the detection size of the second head-shoulder detector is smaller than that of the third head-shoulder detector.

It can be understood that, when a distant pedestrian target is detected, the target is small and its feature dimension is low. A low-resolution image contains less information than a high-resolution one, so the false detection rate rises and the accuracy of whole-body human shape detection is not high. Therefore, to reduce the false alarm rate and improve the accuracy of whole-body detection, this embodiment works from the whole body to the parts: candidate whole-body regions are detected first, and the upper body features and head-shoulder features within each candidate whole-body region are then checked to decide whether the candidate is a target whole-body region. When a pedestrian target at a short distance is detected, the image resolution is relatively high and the feature dimension is high, so the false alarm rate of the classifier itself is low. For such targets this embodiment does not use multiple classifiers to further reduce false alarms, which reduces the resources occupied by short-distance targets and the detection time. In addition, because the whole-body detector, the first upper body detector and the first head-shoulder detector form a cascade structure, the detection time for non-human targets does not increase significantly relative to a single classifier. When a human-shaped whole-body target is detected, the relative positions of the upper body, the head-shoulder and the whole body are used, so unrelated regions are not examined and a large number of irrelevant windows are filtered out. Even when human-shaped targets are present, the classification time does not become excessive, and detection efficiency remains high.

Further, the step S136 specifically includes:

S1361, performing human-shaped upper body area detection on the feature map of the image to be detected through a pre-trained second upper body detector to obtain an upper body area detection result.

It can be understood that the following situations exist in the process of detecting the human-shaped upper body region of the feature map of the image to be detected by the second upper body detector trained in advance: firstly, if the existence of the human-shaped upper body region is not detected, the output upper body region detection result does not contain a second candidate upper body region; secondly, a plurality of second candidate upper body regions are detected, and the output upper body region detection result comprises information such as positions and sizes of the second candidate upper body regions.

S1362, when the upper body area detection result includes a plurality of second candidate upper body areas, determining the second candidate head-shoulder areas in the plurality of second candidate upper body areas according to the position and size of the head-shoulder detection area within the upper body detection area of a preset standard whole-body template, based on the length-width proportional relationship between the plurality of second candidate upper body areas and the upper body detection area in the standard whole-body template.

Specifically, a standard whole body template marked with an upper body detection area and a head and shoulder detection area may be preset, and when the upper body area detection result includes a plurality of second candidate upper body areas, based on a length-width proportional relationship between each second candidate upper body area and an upper body detection area in the preset standard whole body template, the position and size of a head and shoulder position detection area in each second candidate upper body area are determined according to the position and size of the head and shoulder detection area in the upper body detection area of the standard whole body template, so as to obtain the position and size of the second candidate head and shoulder area in each second candidate upper body area.

S1363, judging whether a second candidate head-shoulder area in the second candidate upper body areas contains a human-shaped head-shoulder through a pre-trained second head-shoulder detector, and obtaining a third judgment result corresponding to the second candidate upper body areas.

Specifically, to reduce the time spent extracting feature maps, the head-shoulder region may be cropped from the feature map of the image to be detected according to the position and size of the second candidate head-shoulder area in each second candidate upper body area. This yields the feature map of each second candidate head-shoulder area without repeating feature extraction. The feature maps of the second candidate head-shoulder areas are then input to the second head-shoulder detector one by one to determine whether each second candidate head-shoulder area contains a human-shaped head-shoulder, so as to obtain the third judgment result corresponding to each second candidate upper body area.

S1364, determining that each second candidate upper body area whose corresponding third judgment result is yes is a target upper body area of the image to be detected.

And when the third judgment result corresponding to the second candidate upper body area is yes, judging that the second candidate upper body area is the target upper body area of the image to be detected.
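Steps S1362 to S1364 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the embodiment's implementation: boxes are `(x, y, width, height)` tuples anchored at the lower-left vertex, the template areas are given in the standard template's own coordinates, and the head-shoulder detector is a hypothetical callable that receives the mapped box (a real detector would receive the feature map cropped at that box). The function names are hypothetical.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), lower-left origin


def map_roi(candidate: Box, template_upper: Box, template_head: Box) -> Box:
    """Map the template's head-shoulder area into a candidate upper body area.

    The mapping uses the width and height ratios between the candidate and
    the upper body detection area of the standard whole-body template.
    """
    cx, cy, cw, ch = candidate
    ux, uy, uw, uh = template_upper
    hx, hy, hw, hh = template_head
    sx, sy = cw / uw, ch / uh  # length-width proportional relationship
    return (cx + (hx - ux) * sx, cy + (hy - uy) * sy, hw * sx, hh * sy)


def select_target_upper_bodies(candidates: List[Box], template_upper: Box,
                               template_head: Box,
                               head_shoulder_detector: Callable[[Box], bool]) -> List[Box]:
    """Keep only candidates whose mapped head-shoulder area is confirmed."""
    targets = []
    for candidate in candidates:
        roi = map_roi(candidate, template_upper, template_head)
        if head_shoulder_detector(roi):  # third judgment result is "yes"
            targets.append(candidate)
    return targets


template_upper = (0.0, 0.0, 10.0, 20.0)
template_head = (2.0, 10.0, 6.0, 10.0)
roi = map_roi((100.0, 50.0, 20.0, 40.0), template_upper, template_head)
```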

It is understood that, generally, when a pedestrian target at a long distance is detected, the pedestrian target is small, and a low-resolution image carries less information than a high-resolution one, so the false detection rate rises and the accuracy of human-shaped upper body detection is low. Therefore, to reduce the false alarm rate and improve the accuracy of human-shaped upper body detection, in this embodiment the second candidate upper body region is detected first, and the head-shoulder feature within the second candidate upper body region is then detected to determine whether the second candidate upper body region is a target upper body region.

Still further, before the step S13, the method further includes:

S21, acquiring a pedestrian data training set; the pedestrian data training set comprises a plurality of human-shaped whole body sample maps whose sizes are unified to a preset standard size, and in each human-shaped whole body sample map the upper body area and the head-shoulder area of the human shape are marked with circumscribed rectangular frames.

The pedestrian data training set can be obtained by cropping complete pedestrian targets from a plurality of pedestrian pictures. Each marked circumscribed rectangular frame may retain a margin of edge pixels in proportion to the target size. The preset standard size is width_body × height_body.

S22, determining the center positions and the sizes of the upper body detection window and the head-shoulder detection window according to the information of the upper body areas and the head-shoulder areas marked in the plurality of human-shaped whole body sample maps.

Wherein, the step S22 specifically includes:

determining the center position of the upper body detection window according to the coordinates of the lower-left vertex and the size of the upper body area marked in each of the plurality of human-shaped whole body sample maps; wherein the abscissa of the center position of the upper body detection window is equal to the average, over the plurality of human-shaped whole body sample maps, of the sum of the abscissa of the lower-left vertex of the marked upper body area and half of its width; and the ordinate of the center position of the upper body detection window is equal to the average, over the plurality of human-shaped whole body sample maps, of the sum of the ordinate of the lower-left vertex of the marked upper body area and half of its height;

determining the center position of the head-shoulder detection window according to the coordinates of the lower-left vertex and the size of the head-shoulder area marked in each of the plurality of human-shaped whole body sample maps; wherein the abscissa of the center position of the head-shoulder detection window is equal to the average, over the plurality of human-shaped whole body sample maps, of the sum of the abscissa of the lower-left vertex of the marked head-shoulder area and half of its width; and the ordinate of the center position of the head-shoulder detection window is equal to the average, over the plurality of human-shaped whole body sample maps, of the sum of the ordinate of the lower-left vertex of the marked head-shoulder area and half of its height;

determining the width and the height of the upper body detection window according to the width and the height of the upper body area marked in the plurality of human-shaped whole body sample images respectively; wherein the width of the upper body detection window is equal to the average of the widths of the upper body regions marked in the plurality of human-shaped whole body sample drawings, and the height of the upper body detection window is equal to the average of the heights of the upper body regions marked in the plurality of human-shaped whole body sample drawings;

determining the width and the height of the head and shoulder detection window according to the width and the height of the head and shoulder regions marked in the human-shaped whole body sample drawings respectively; the width of the head and shoulder detection window is equal to the average value of the widths of the head and shoulder regions marked in the human-shaped whole body sample drawings, and the height of the head and shoulder detection window is equal to the average value of the heights of the head and shoulder regions marked in the human-shaped whole body sample drawings.

Illustratively, taking the head-shoulder detection window as an example, let the head-shoulder position recorded in the i-th of the K human-shaped whole-body sample maps be (x_head_i, y_head_i, width_head_i, height_head_i). With the horizontal rightward direction as the positive x-axis direction and the vertical upward direction as the positive y-axis direction, the coordinates (center_x_head, center_y_head) of the center position of the head-shoulder detection window are calculated as follows:

$$\mathrm{center\_x\_head} = \frac{1}{K}\sum_{i=1}^{K}\left(\mathrm{x\_head}_i + \frac{\mathrm{width\_head}_i}{2}\right)$$

$$\mathrm{center\_y\_head} = \frac{1}{K}\sum_{i=1}^{K}\left(\mathrm{y\_head}_i + \frac{\mathrm{height\_head}_i}{2}\right)$$

wherein center_x_head is the abscissa of the center position of the head-shoulder detection window, K is the number of the plurality of human-shaped whole body sample maps, x_head_i is the abscissa of the lower-left vertex of the head-shoulder area marked in the i-th human-shaped whole-body sample map, width_head_i is the width of the head-shoulder area marked in the i-th human-shaped whole-body sample map, center_y_head is the ordinate of the center position of the head-shoulder detection window, y_head_i is the ordinate of the lower-left vertex of the head-shoulder area marked in the i-th human-shaped whole-body sample map, and height_head_i is the height of the head-shoulder area marked in the i-th human-shaped whole-body sample map;

the standard width and height of the head-shoulder detection window are calculated as follows:

$$\mathrm{width\_head} = \frac{1}{K}\sum_{i=1}^{K}\mathrm{width\_head}_i$$

$$\mathrm{height\_head} = \frac{1}{K}\sum_{i=1}^{K}\mathrm{height\_head}_i$$

wherein width_head is the width of the head-shoulder detection window, and height_head is the height of the head-shoulder detection window.

Similarly, the formulas for calculating the center position (center_x_upperbody, center_y_upperbody) and the size width_upperbody × height_upperbody of the upper body detection window may be derived in the same way, which is not described herein again.
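The averaging described in step S22 can be sketched in Python as follows. The sketch assumes that each annotation is an `(x, y, width, height)` tuple anchored at the lower-left vertex, as in the example above; the function name `detection_window` is hypothetical. The same function applies to head-shoulder and upper body annotations.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), lower-left origin


def detection_window(boxes: List[Box]) -> Tuple[float, float, float, float]:
    """Return (center_x, center_y, width, height) averaged over K annotations."""
    k = len(boxes)
    if k == 0:
        raise ValueError("at least one annotated sample is required")
    center_x = sum(x + w / 2 for x, _, w, _ in boxes) / k
    center_y = sum(y + h / 2 for _, y, _, h in boxes) / k
    width = sum(w for _, _, w, _ in boxes) / k
    height = sum(h for _, _, _, h in boxes) / k
    return center_x, center_y, width, height


# Two hypothetical head-shoulder annotations in standard-size sample maps.
window = detection_window([(10, 60, 20, 30), (14, 64, 24, 34)])
```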

S23, scaling the plurality of human-shaped whole body sample maps according to three preset sizes respectively to obtain a plurality of human-shaped whole body sample maps corresponding to each of the three sizes; wherein the three sizes include a first size, a second size and a third size, the first size being greater than the second size, and the second size being greater than the third size.

Specifically, the plurality of human-shaped whole body sample maps are scaled to the preset first size, the preset second size and the preset third size respectively, so as to obtain a plurality of human-shaped whole body sample maps of the first size, a plurality of human-shaped whole body sample maps of the second size and a plurality of human-shaped whole body sample maps of the third size.

S24, based on the proportional relation between the standard size and each of the three sizes, and according to the center position and the size of the head-shoulder detection window, performing head-shoulder region extraction on the plurality of human-shaped whole body sample maps respectively corresponding to the three sizes to obtain a plurality of human-shaped head-shoulder sample maps respectively corresponding to the three sizes.

Specifically, based on a proportional relation between the standard size and the first size, head and shoulder regions in a plurality of human-shaped whole body sample images of the first size are determined according to the central position and the size of a head and shoulder detection window, and a plurality of human-shaped head and shoulder sample images of the first size are extracted according to the head and shoulder regions in the plurality of human-shaped whole body sample images of the first size; determining head and shoulder regions in the plurality of human-shaped whole body sample images with the second size according to the central position and the size of the head and shoulder detection window based on the proportional relation between the standard size and the second size, and extracting the plurality of human-shaped head and shoulder sample images with the second size according to the head and shoulder regions in the plurality of human-shaped whole body sample images with the second size; and determining head and shoulder regions in the plurality of human-shaped whole body sample images with the third dimension according to the central position and the size of the head and shoulder detection window based on the proportional relation between the standard dimension and the third dimension, and extracting the plurality of human-shaped head and shoulder sample images with the third dimension according to the head and shoulder regions in the plurality of human-shaped whole body sample images with the third dimension.

S25, based on the proportional relation between the standard size and the third size and the proportional relation between the standard size and the second size, performing upper body region extraction on the plurality of human-shaped whole body sample maps of the third size and the plurality of human-shaped whole body sample maps of the second size respectively according to the center position and the size of the upper body detection window, to obtain a plurality of human-shaped upper body sample maps of the third size and a plurality of human-shaped upper body sample maps of the second size.

Specifically, based on the proportional relation between the standard size and the second size, determining upper body areas in the plurality of human-shaped whole body sample images with the second size according to the central position and the size of the upper body detection window, and extracting the plurality of human-shaped upper body sample images with the second size according to the upper body areas in the plurality of human-shaped whole body sample images with the second size; and determining upper body areas in the plurality of human-shaped whole body sample images with the third size according to the central position and the size of the upper body detection window based on the proportional relation between the standard size and the third size, and extracting the upper body areas in the plurality of human-shaped whole body sample images with the third size to obtain the plurality of human-shaped upper body sample images with the third size.
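The region extraction in steps S24 and S25 amounts to rescaling a detection window defined at the standard size and cropping at the rescaled position. A minimal Python sketch of the box computation is given below. It assumes a single uniform scale factor (target size divided by standard size) and a lower-left origin; the function name `scaled_crop_box` is hypothetical, and the actual pixel cropping is left out.

```python
from typing import Tuple

Window = Tuple[float, float, float, float]  # (center_x, center_y, width, height)
Box = Tuple[float, float, float, float]     # (x, y, width, height), lower-left origin


def scaled_crop_box(window: Window, scale: float) -> Box:
    """Convert a standard-size detection window into a crop box at another size."""
    center_x, center_y, width, height = window
    # Lower-left vertex of the window at the standard size.
    x = center_x - width / 2
    y = center_y - height / 2
    # Positions and sizes scale by the same factor as the sample map.
    return (x * scale, y * scale, width * scale, height * scale)


# Head-shoulder window from the standard size, applied to a half-size sample map.
crop = scaled_crop_box((23.0, 78.0, 22.0, 32.0), 0.5)
```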

S26, training six preset classifiers according to the plurality of human-shaped whole body sample maps of the third size, the plurality of human-shaped upper body sample maps of the third size and of the second size, and the plurality of human-shaped head-shoulder sample maps respectively corresponding to the three sizes, to obtain the whole body detector, the first upper body detector, the second upper body detector, the first head-shoulder detector, the second head-shoulder detector and the third head-shoulder detector.

Specifically, a negative sample data set may be obtained in advance, with non-pedestrian targets selected as negative samples as needed. Feature extraction is performed on the negative samples in the negative sample data set, the plurality of human-shaped whole body sample maps of the third size, the plurality of human-shaped upper body sample maps of the third size and of the second size, and the plurality of human-shaped head-shoulder sample maps respectively corresponding to the three sizes. Then, a preset first classifier is trained according to the negative sample feature maps in the negative sample data set and the feature maps of the plurality of human-shaped whole body sample maps of the third size to obtain the trained whole body detector; a preset second classifier is trained according to the negative sample feature maps and the feature maps of the plurality of human-shaped upper body sample maps of the third size to obtain the trained first upper body detector; a preset third classifier is trained according to the negative sample feature maps and the feature maps of the plurality of human-shaped upper body sample maps of the second size to obtain the trained second upper body detector; a preset fourth classifier is trained according to the negative sample feature maps and the feature maps of the plurality of human-shaped head-shoulder sample maps of the third size to obtain the trained first head-shoulder detector; a preset fifth classifier is trained according to the negative sample feature maps and the feature maps of the plurality of human-shaped head-shoulder sample maps of the second size to obtain the trained second head-shoulder detector; and a preset sixth classifier is trained according to the negative sample feature maps and the feature maps of the plurality of human-shaped head-shoulder sample maps of the first size to obtain the trained third head-shoulder detector.
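Reading step S26 together with steps S23 to S25, the pairing of detectors with training data appears to be the one tabulated in the Python sketch below. The sketch is only an illustration: `TRAINING_PLAN`, `train_all` and the `fit` callable are hypothetical names, and the classifier type is deliberately left open because the text does not fix it here.

```python
from typing import Any, Callable, Dict, List, Tuple

# Detector name -> (body part, sample size) of its positive training samples.
TRAINING_PLAN: Dict[str, Tuple[str, str]] = {
    "whole_body_detector": ("whole_body", "third"),
    "first_upper_body_detector": ("upper_body", "third"),
    "second_upper_body_detector": ("upper_body", "second"),
    "first_head_shoulder_detector": ("head_shoulder", "third"),
    "second_head_shoulder_detector": ("head_shoulder", "second"),
    "third_head_shoulder_detector": ("head_shoulder", "first"),
}


def train_all(positives: Dict[Tuple[str, str], List[Any]],
              negatives: List[Any],
              fit: Callable[[List[Any], List[Any]], Any]) -> Dict[str, Any]:
    """Train one classifier per detector on shared negatives.

    `fit` is a hypothetical training routine taking positive and negative
    feature maps and returning a trained classifier.
    """
    return {name: fit(positives[key], negatives)
            for name, key in TRAINING_PLAN.items()}


# Toy run with placeholder "feature maps" and a stand-in for real training.
toy_positives = {key: [key] for key in set(TRAINING_PLAN.values())}
models = train_all(toy_positives, ["negative"], lambda pos, neg: (pos[0], len(neg)))
```

The smallest (third) size feeds the detectors meant for distant whole-body targets, while the largest (first) size feeds only the third head-shoulder detector.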

Note that, in step S23, the human-shaped whole-body sample maps obtained for the three sizes have dimensions (scale_i × width_body, scale_i × height_body), where scale_i (i = 1, 2, 3) corresponds to the first size, the second size and the third size, respectively. Because the first size is greater than the second size and the second size is greater than the third size, scale_1 > scale_2 > scale_3. The selected detection target regions have an inclusion relationship: the human-shaped whole body region contains the human-shaped upper body region, and the human-shaped upper body region contains the human-shaped head-shoulder region. Illustratively, the selection rule of scale_i is as follows:

For the human-shaped whole-body region, a complete human-shaped whole-body target of size (scale_3 × width_body, scale_3 × height_body) can appear in the video, but a complete human-shaped whole-body target of size (scale_2 × width_body, scale_2 × height_body) cannot.

For the human-shaped upper body region, a complete human-shaped upper body region of size (scale_2 × width_upperbody, scale_2 × height_upperbody) can appear in the video, but a complete human-shaped upper body region of size (scale_1 × width_upperbody, scale_1 × height_upperbody) cannot.

For the human-shaped head-shoulder region, a complete human-shaped head-shoulder region of size (scale_1 × width_head, scale_1 × height_head) can appear in the video.

Further, the size of the preset standard whole-body template is equal to the preset standard size.

Then, before the step S13, after the step S21, the method further includes:

S31, determining the center positions of the upper body detection area and the head-shoulder detection area in the standard whole-body template according to the information of the upper body areas and the head-shoulder areas marked in the plurality of human-shaped whole body sample maps.

Specifically, the center positions of the upper body detection region and the head-shoulder detection region in the standard whole-body template are the same as the center positions of the head-shoulder detection window and the upper body detection window in the humanoid whole-body sample map, that is, the center positions of the upper body detection region and the head-shoulder detection region in the standard whole-body template are calculated in the same manner as the center positions of the head-shoulder detection window and the upper body detection window.

S32, determining the sizes of the upper body detection area and the head-shoulder detection area according to the information of the upper body areas and the head-shoulder areas marked in the plurality of human-shaped whole body sample maps.

Specifically, the step S32 specifically includes:

When the abscissa of the lower-left vertex of one region is greater than the difference between the abscissa of the center position of the other region and half of the width of the other region, and the abscissa of the lower-right vertex of the region is less than the sum of the abscissa of the center position of the other region and half of the width of the other region, it is determined that the region is completely contained by the other region within the lateral range;

when the ordinate of the lower-left vertex of one region is greater than the difference between the ordinate of the center position of the other region and half of the height of the other region, and the ordinate of the upper-left vertex of the region is less than the sum of the ordinate of the center position of the other region and half of the height of the other region, it is determined that the region is completely contained by the other region within the longitudinal range.

Illustratively, the standard width and height calculation method of the head and shoulder detection area is as follows:

Let R denote the set of positions of the head-shoulder regions marked in the human-shaped whole body sample maps. A width is selected such that the proportion of head-shoulder regions in R that are completely contained within the lateral range is at least α, and the minimum width satisfying this condition is taken as roi_width_head. After the width is selected, the head-shoulder regions completely contained within the lateral range form a set R'. The height is selected in a similar way: the proportion of head-shoulder regions in R' that are completely contained within the longitudinal range must be at least α, and the minimum height satisfying this condition is taken as roi_height_head.

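The minimum width of the head-shoulder detection area that satisfies the third preset condition is calculated as:

$$\mathrm{roi\_width\_head} = \min\left\{\mathrm{width} \;:\; \frac{1}{K}\sum_{i=1}^{K} M_i \ge \alpha\right\}$$

wherein width represents the width of the head-shoulder detection area; K is the number of the human-shaped whole body sample maps; α is a preset proportional threshold; and, for the i-th of the K human-shaped whole body sample maps,

$$M_i = \begin{cases} 1, & \mathrm{x\_head}_i > \mathrm{center\_x\_head} - \dfrac{\mathrm{width}}{2} \ \text{and}\ \mathrm{x\_head}_i + \mathrm{width\_head}_i < \mathrm{center\_x\_head} + \dfrac{\mathrm{width}}{2} \\ 0, & \text{otherwise} \end{cases}$$

Removing from R the elements for which M_i = 0 gives the set R'. Assuming that the number of elements in R' is K', the minimum height of the head-shoulder detection area that satisfies the fourth preset condition is calculated as:

$$\mathrm{roi\_height\_head} = \min\left\{\mathrm{height} \;:\; \frac{1}{K'}\sum_{i=1}^{K'} \mathbf{1}\!\left[\mathrm{y\_head}_i > \mathrm{center\_y\_head} - \frac{\mathrm{height}}{2} \ \text{and}\ \mathrm{y\_head}_i + \mathrm{height\_head}_i < \mathrm{center\_y\_head} + \frac{\mathrm{height}}{2}\right] \ge \alpha\right\}$$

wherein, for the i-th of the K' elements in R', the indicator equals 1 when the bracketed condition holds and 0 otherwise, and height represents the height of the head-shoulder detection area.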
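The two-stage selection of the head-shoulder detection area size can be sketched in Python as follows. This is a sketch under stated assumptions: candidate widths and heights are searched over integer pixel values up to a caller-supplied `max_extent` (the text does not specify the search grid), containment uses the strict inequalities of the lateral and longitudinal conditions, and the proportion threshold is applied as "at least alpha". All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), lower-left origin


def contained_1d(low: float, length: float, center: float, extent: float) -> bool:
    """True if [low, low + length] lies strictly inside the centered range."""
    return low > center - extent / 2 and low + length < center + extent / 2


def min_extent(intervals: List[Tuple[float, float]], center: float,
               alpha: float, max_extent: int) -> Optional[int]:
    """Smallest integer extent containing at least a fraction alpha of intervals."""
    for extent in range(1, max_extent + 1):
        hits = sum(contained_1d(low, length, center, extent)
                   for low, length in intervals)
        if hits / len(intervals) >= alpha:
            return extent
    return None  # no extent up to max_extent satisfies the condition


def roi_size(boxes: List[Box], center_x: float, center_y: float,
             alpha: float, max_extent: int) -> Optional[Tuple[int, Optional[int]]]:
    """Select the width over R first, then the height over the filtered set R'."""
    width = min_extent([(x, w) for x, _, w, _ in boxes], center_x, alpha, max_extent)
    if width is None:
        return None
    # R': regions completely contained within the lateral range.
    kept = [b for b in boxes if contained_1d(b[0], b[2], center_x, width)]
    height = min_extent([(y, h) for _, y, _, h in kept], center_y, alpha, max_extent)
    return width, height


# Four hypothetical head-shoulder annotations around the center (15, 15).
samples = [(10, 10, 10, 10), (8, 8, 14, 14), (0, 0, 30, 30), (11, 11, 8, 8)]
size = roi_size(samples, 15, 15, alpha=0.75, max_extent=100)
```

In this toy example the oversized third annotation is excluded once three of the four regions fit laterally, so it no longer influences the height.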

Correspondingly, the embodiment of the invention also provides a human shape detection device, which can implement all the processes of the human shape detection method.

Fig. 2 is a schematic structural diagram of a human-shaped detecting device according to an embodiment of the present invention.

The human shape detection device provided by the embodiment of the invention comprises:

the image acquisition module 21 is used for acquiring an image to be detected;

the feature extraction module 22 is configured to perform feature extraction on the image to be detected to obtain a feature map of the image to be detected;

the human shape detection module 23 is configured to perform human shape detection on the feature map of the image to be detected to obtain a target human shape region of the image to be detected; the target humanoid region comprises a target whole body region, a target upper body region and a target head-shoulder region;

and the human-shaped marking module 24 is used for marking a target human-shaped area in the image to be detected.

The principle of realizing human shape detection by the human shape detection device is the same as that of the embodiment of the method, and the description is omitted here.
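The division into four modules can be pictured with a minimal Python sketch. It assumes that each module is a plain callable and wires them in the order described above; the class name, field names and toy stand-ins are hypothetical, and the sketch does not reflect how the device is actually built.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class HumanShapeDetectionDevice:
    """Four modules chained in the order of the method steps."""
    acquire_image: Callable[[], Any]                 # image acquisition module 21
    extract_features: Callable[[Any], Any]           # feature extraction module 22
    detect_human_shapes: Callable[[Any], List[Any]]  # human shape detection module 23
    mark_regions: Callable[[Any, List[Any]], Any]    # human-shaped marking module 24

    def run(self) -> Any:
        image = self.acquire_image()
        feature_map = self.extract_features(image)
        regions = self.detect_human_shapes(feature_map)
        return self.mark_regions(image, regions)


# Toy stand-ins that only show the data flow between modules.
device = HumanShapeDetectionDevice(
    acquire_image=lambda: "image",
    extract_features=lambda image: f"features({image})",
    detect_human_shapes=lambda feature_map: [("whole_body", feature_map)],
    mark_regions=lambda image, regions: (image, regions),
)
result = device.run()
```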

The human shape detection device provided by the embodiment of the invention obtains the characteristic diagram of the image to be detected by extracting the characteristics of the obtained image to be detected, then carries out human shape detection on the characteristic diagram of the image to be detected respectively through a whole body detector, an upper body detector and a head and shoulder detector which are trained in advance to obtain the target human shape area of the image to be detected, and then marks the target human shape area in the image to be detected to realize human shape detection. In the embodiment of the invention, in the process of detecting the human shape of the characteristic diagram of the image to be detected, not only the whole target body region is detected, but also the upper body region and the head and shoulder region of the target are detected, so that the human shape detection result is obtained, therefore, when the part below the head and shoulder of the human shape target is shielded, the effective human shape detection can still be realized, and the human shape detection accuracy is improved.

Further, the human shape detection module specifically comprises a whole body region detection unit, an upper body region detection unit and a head and shoulder region detection unit:

the whole body region detection unit is used for carrying out humanoid whole body region detection on the characteristic diagram of the image to be detected through a pre-trained whole body detector to obtain a whole body region detection result;

the whole-body region detection unit is further configured to, when the whole-body region detection result includes a plurality of candidate whole-body regions, determine, based on a length-width proportional relationship between the plurality of candidate whole-body regions and a preset standard whole-body template, a first candidate upper body region and a first candidate head-shoulder region in the plurality of candidate whole-body regions according to positions and sizes of a preset upper body detection region and a head-shoulder detection region in the standard whole-body template;

the whole body region detection unit is further configured to judge whether a first candidate upper body region of the plurality of candidate whole body regions includes a humanoid upper body through a pre-trained first upper body detector, and obtain first judgment results corresponding to the plurality of candidate whole body regions;

the whole body region detection unit is further configured to judge whether a first candidate head-shoulder region of the plurality of candidate whole body regions includes a humanoid head-shoulder through a pre-trained first head-shoulder detector, and obtain second judgment results corresponding to the plurality of candidate whole body regions;

the whole body region detection unit is further configured to determine that the candidate whole body region where the corresponding first judgment result and the second judgment result are both yes is the target whole body region of the image to be detected;

the upper body area detection unit is used for carrying out human-shaped upper body area detection on the characteristic diagram of the image to be detected through a pre-trained second upper body detector and a pre-trained second head and shoulder detector to obtain a target upper body area of the image to be detected;

and the head and shoulder area detection unit is used for carrying out human-shaped head and shoulder area detection on the characteristic diagram of the image to be detected through a pre-trained third head and shoulder detector to obtain a target head and shoulder area of the image to be detected.

Still further, the upper body area detecting unit is specifically configured to:

Furthermore, the apparatus further includes a classifier training module, and the classifier training module is specifically configured to:

Still further, the size of the preset standard whole-body template is equal to the preset standard size;

then, the apparatus further includes a detection area acquisition module, where the detection area acquisition module is specifically configured to:

Specifically, the human-shaped marking module is specifically configured to:

and marking the combined target human-shaped area in the image to be detected.

Fig. 3 is a schematic diagram of a human-shaped detecting apparatus according to an embodiment of the present invention.

The human form detection device provided by the embodiment of the invention comprises a processor 31, a memory 32 and a computer program stored in the memory 32 and configured to be executed by the processor 31, wherein the processor 31 implements the human form detection method according to any one of the above embodiments when executing the computer program.

The processor 31, when executing the computer program, implements the steps in the above-described human form detection method embodiment, for example, all the steps of the human form detection method shown in fig. 1. Alternatively, the processor 31, when executing the computer program, implements the functions of the modules/units in the human shape detecting device embodiment, for example, the functions of the modules of the human shape detecting device shown in fig. 2.

Illustratively, the computer program may be divided into one or more modules, which are stored in the memory 32 and executed by the processor 31 to accomplish the present invention. The one or more modules may be a series of computer program instruction segments capable of performing certain functions, which are used to describe the execution of the computer program in the human form detection device. For example, the computer program may be divided into an image acquisition module, a feature extraction module, a human shape detection module, and a human shape marking module, each of which functions specifically as follows: the image acquisition module is used for acquiring an image to be detected; the characteristic extraction module is used for extracting the characteristics of the image to be detected to obtain a characteristic diagram of the image to be detected; the human shape detection module is used for carrying out human shape detection on the characteristic diagram of the image to be detected to obtain a target human shape area of the image to be detected; the target humanoid region comprises a target whole body region, a target upper body region and a target head-shoulder region; and the human-shaped marking module is used for marking the target human-shaped area in the image to be detected.

The human-shaped detection device can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing devices. The human-shaped detection device may include, but is not limited to, a processor 31, a memory 32. It will be appreciated by those skilled in the art that the schematic diagram is merely an example of a human form detection device and does not constitute a limitation of a human form detection device and may include more or fewer components than shown, or some components may be combined, or different components, for example the human form detection device may also include input output devices, network access devices, buses, etc.

The processor 31 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 31 is the control center of the human shape detection device and connects the various parts of the whole device through various interfaces and lines.

The memory 32 may be used to store the computer programs and/or modules, and the processor 31 implements the various functions of the human shape detection device by running or executing the computer programs and/or modules stored in the memory 32 and invoking data stored in the memory 32. The memory 32 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like; the data storage area may store data (such as audio data or a phone book) created through use of the human shape detection device, and the like. In addition, the memory 32 may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

If the integrated module/unit of the human shape detection device is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like.

It should be noted that the above-described device embodiments are merely illustrative, where the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. In addition, in the drawings of the embodiment of the apparatus provided by the present invention, the connection relationship between the modules indicates that there is a communication connection between them, and may be specifically implemented as one or more communication buses or signal lines. One of ordinary skill in the art can understand and implement it without inventive effort.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

Claims

1. A human form detection method is characterized by comprising the following steps:

acquiring an image to be detected;

and marking the target human-shaped area in the image to be detected.

2. The human form detection method as claimed in claim 1, wherein the human form detection of the feature map of the image to be detected to obtain the target human form region of the image to be detected specifically comprises:

3. The human form detection method as claimed in claim 2, wherein the human form upper body region detection is performed on the feature map of the image to be detected through a second upper body detector trained in advance and a second head and shoulder detector trained in advance to obtain the target upper body region of the image to be detected, and specifically comprises:

4. The human form detection method as claimed in claim 3, wherein before the human form detection is performed on the feature map of the image to be detected to obtain the target human form region of the image to be detected, the method further comprises:

5. The human form detection method as claimed in claim 4, wherein the size of the preset standard whole-body template is equal to the preset standard size;

then, before the human shape detection is performed on the feature map of the image to be detected to obtain the target human shape region of the image to be detected, after the training set of pedestrian data is obtained, the method further includes:

6. The human form detection method according to claim 5, wherein the determining the size of the upper body detection region and the size of the head-shoulder detection region according to the information of the upper body region and the head-shoulder region labeled in the plurality of human form whole body sample drawings specifically comprises:

7. The human form detection method as claimed in claim 1, wherein said marking of the target human form region in the image to be detected specifically comprises:

and marking the combined target human-shaped area in the image to be detected.

8. A human form detection device, comprising:

the image acquisition module is used for acquiring an image to be detected;

9. A human form detection apparatus comprising a processor, a memory and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the human form detection method as claimed in any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, comprising a stored computer program, wherein the computer program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the human form detection method according to any one of claims 1 to 7.