
CN111429487B - Method and device for segmenting adhesion foreground of depth image - Google Patents


Info

Publication number
CN111429487B
Authority
CN
China
Prior art keywords: blob, proportion, target tracking, patch, depth image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010191067.3A
Other languages
Chinese (zh)
Other versions
CN111429487A (en)
Inventor
王磊
李骊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing HJIMI Technology Co Ltd
Original Assignee
Beijing HJIMI Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing HJIMI Technology Co Ltd filed Critical Beijing HJIMI Technology Co Ltd
Priority to CN202010191067.3A priority Critical patent/CN111429487B/en
Publication of CN111429487A publication Critical patent/CN111429487A/en
Application granted granted Critical
Publication of CN111429487B publication Critical patent/CN111429487B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/254 Analysis of motion involving subtraction of images
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and a device for segmenting the adhesion foreground of a depth image. In the method, after a target depth image to be segmented is obtained, it is separated into a background and a foreground where target tracking objects are located, and connected-region segmentation is performed to obtain each connected region (blob) contained in the target depth image. Each blob is then classified to obtain each type of blob; blobs of preset adhesion types are divided into different small connected regions (patches) according to a preset division rule; and finally each patch is traversed, and all patches belonging to the same target tracking object are aggregated one by one to obtain all complete target tracking objects in the target depth image. Accurate segmentation of the adhesion foreground is thus achieved when only a depth image is available, the segmentation cost and computation load are reduced, the real-time performance of segmentation is improved, and the method has wide application scope.

Description

Method and device for segmenting adhesion foreground of depth image
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for segmenting the adhesion foreground of a depth image.
Background
With the development of portable and inexpensive depth cameras, depth images are becoming increasingly important in research and applications in the field of image processing. Applying depth image information can improve the performance of related research and applications in machine vision, such as image segmentation, object tracking, image recognition, and image reconstruction.
When a target tracking object moving in a depth image comes into contact with other objects or with other tracked objects, that is, when foreground regions adhere to one another, accurately segmenting the target tracking object is a precondition for its continued tracking and for gesture recognition. Existing methods for segmenting the adhesion foreground of a depth image mainly rely on registered color image information, for example, using a neural network to segment the adhesion foreground on the color image. Although a neural network can achieve high segmentation accuracy, it usually requires a large amount of manually annotated data, its cost is high, its computation load is heavy, and real-time segmentation cannot be achieved. In addition, because registered color image information is required, the adhesion foreground cannot be segmented when only a depth image is available.
Disclosure of Invention
The embodiments of the present application mainly aim to provide a method and a device for segmenting the adhesion foreground of a depth image, which can accurately segment the adhesion foreground when only a depth image is available, reduce the segmentation cost and computation load, improve the real-time performance of segmentation, and have wide application scope.
In a first aspect, an embodiment of the present application provides a method for segmenting the adhesion foreground of a depth image, including:
acquiring a target depth image to be segmented; the target depth image comprises a background and a foreground where a target tracking object is located;
acquiring a background in the target depth image and a foreground where a target tracking object is located, and carrying out connected region segmentation on the target depth image to obtain each connected region blob contained in the target depth image;
classifying each blob contained in the target depth image to obtain each type of blob contained in the target depth image;
dividing blobs of a preset type into different small connected regions (patches) according to a preset division rule;
and traversing each patch, and aggregating all patches belonging to the same target tracking object one by one to obtain all complete target tracking objects in the target depth image.
Optionally, the classifying each blob contained in the target depth image to obtain each type of blob contained in the target depth image includes:
S1: when the foreground proportion in the blob is judged to be smaller than a foreground proportion threshold, determining that the type of the blob is a blob containing only a background;
S2: when the foreground proportion in the blob is judged to be not smaller than the foreground proportion threshold: if the proportion of the total blob area occupied by the part of one target tracking object's previous-frame region that appears in the blob is greater than a first proportion threshold, determining that the type of the blob is a blob containing only one target tracking object; if that proportion is smaller than a second proportion threshold, determining that the type of the blob is a blob containing only a background, the second proportion threshold being much smaller than the first proportion threshold; and if that proportion is smaller than the first proportion threshold but not smaller than the second proportion threshold, determining that the type of the blob is a blob containing one target tracking object adhered to the background;
S3: when the foreground proportion in the blob is judged to be not smaller than the foreground proportion threshold and the previous-frame regions of at least two target tracking objects appear in the blob: if the number of valid target tracking objects, whose proportion of the total blob area is greater than a third proportion threshold and whose normalized proportion value is greater than a normalization proportion threshold, is 0, determining that the type of the blob is a blob containing only a background; if that number is 1, repeatedly executing step S2 to determine the type of the blob; and if that number is greater than 1, summing the proportions of the total blob area occupied by the valid target tracking objects' previous-frame regions appearing in the blob: if the summed proportion is greater than a fourth proportion threshold, determining that the type of the blob is a blob containing at least two target tracking objects adhered to one another; and if the summed proportion is not greater than the fourth proportion threshold, determining that the type of the blob is a blob containing at least two target tracking objects adhered to the background.
Optionally, the preset types of blobs include the following three adhesion-type blobs:
a blob containing one target tracking object adhered to the background;
a blob containing at least two target tracking objects adhered to one another;
a blob containing at least two target tracking objects adhered to the background.
Optionally, dividing blobs of a preset type into different small connected regions (patches) according to a preset division rule includes:
taking each depth pixel point in the preset-type blob as an independent patch, and allocating a patch data structure object to each pixel, where the patch data structure includes the patch's number, pixel count, and depth value;
taking each pair of adjacent patches in the blob as one edge, where the structure of an edge includes the positions of its two endpoints and a weight;
arranging all edges in the blob in ascending order of weight;
and, according to the relation between each edge's weight and the depth values of its two endpoints, merging the patches containing the two endpoints of each edge that meets a preset condition into one patch.
Optionally, traversing each patch and aggregating all patches belonging to the same target tracking object one by one to obtain complete target tracking objects includes:
in a first traversal over the patches, assigning each patch with high attribution confidence to the target tracking object it belongs to; analyzing row by row each patch with lower attribution confidence but larger area, and dividing it among the corresponding target tracking objects according to the proportion of each row segment's length occupied by pixels of the previous frame's target tracking objects; and marking each patch with lower attribution confidence and smaller area as a patch to be processed;
and, in a second traversal over the patches to be processed, assigning each patch to be processed to the target tracking object closest to it in three-dimensional distance.
In a second aspect, an embodiment of the present application further provides a device for segmenting the adhesion foreground of a depth image, including:
the acquisition unit is used for acquiring a target depth image to be segmented; the target depth image comprises a background and a foreground where a target tracking object is located;
the segmentation unit is used for acquiring a background in the target depth image and a foreground where a target tracking object is located, and carrying out connected region segmentation on the target depth image to obtain each connected region blob contained in the target depth image;
the classification unit is used for classifying each blob contained in the target depth image to obtain each type of blob contained in the target depth image;
the dividing unit is used for dividing blobs of a preset type into different small connected regions (patches) according to a preset division rule;
the obtaining unit is used for traversing each patch and aggregating all patches belonging to the same target tracking object one by one to obtain all complete target tracking objects in the target depth image.
Optionally, the classification unit includes:
a first determining subunit, configured to determine, when the foreground proportion in the blob is judged to be smaller than the foreground proportion threshold, that the type of the blob is a blob containing only a background;
a second determining subunit, configured to, when the foreground proportion in the blob is judged to be not smaller than the foreground proportion threshold: determine that the type of the blob is a blob containing only one target tracking object if the proportion of the total blob area occupied by the part of one target tracking object's previous-frame region appearing in the blob is greater than the first proportion threshold; determine that the type of the blob is a blob containing only a background if that proportion is smaller than the second proportion threshold, the second proportion threshold being much smaller than the first; and determine that the type of the blob is a blob containing one target tracking object adhered to the background if that proportion is smaller than the first proportion threshold but not smaller than the second;
a third determining subunit, configured to, when the foreground proportion in the blob is judged to be not smaller than the foreground proportion threshold and the previous-frame regions of at least two target tracking objects appear in the blob: determine that the type of the blob is a blob containing only a background if the number of valid target tracking objects, whose proportion of the total blob area is greater than the third proportion threshold and whose normalized proportion value is greater than the normalization proportion threshold, is 0; invoke the second determining subunit to determine the type of the blob if that number is 1; and, if that number is greater than 1, sum the proportions of the total blob area occupied by the valid target tracking objects' previous-frame regions appearing in the blob, determining that the type of the blob is a blob containing at least two target tracking objects adhered to one another if the summed proportion is greater than the fourth proportion threshold, and a blob containing at least two target tracking objects adhered to the background otherwise.
Optionally, the preset types of blobs include the following three adhesion-type blobs:
a blob containing one target tracking object adhered to the background;
a blob containing at least two target tracking objects adhered to one another;
a blob containing at least two target tracking objects adhered to the background.
Optionally, the dividing unit includes:
an allocation subunit, used for taking each depth pixel point in the preset-type blob as an independent patch and allocating a patch data structure object to each pixel, where the patch data structure includes the patch's number, pixel count, and depth value;
a subunit used for taking each pair of adjacent patches in the blob as one edge, where the structure of an edge includes the positions of its two endpoints and a weight;
an arrangement subunit, configured to arrange all edges in the blob in ascending order of weight;
and a merging subunit, used for merging, according to the relation between each edge's weight and the depth values of its two endpoints, the patches containing the two endpoints of each edge that meets the preset condition into one patch.
Optionally, the obtaining unit includes:
a first traversal subunit, used for, in a first traversal over the patches, assigning each patch with high attribution confidence to the target tracking object it belongs to; analyzing row by row each patch with lower attribution confidence but larger area, and dividing it among the corresponding target tracking objects according to the proportion of each row segment's length occupied by pixels of the previous frame's target tracking objects; and marking each patch with lower attribution confidence and smaller area as a patch to be processed;
and a second traversal subunit, used for, in a second traversal over the patches to be processed, assigning each patch to be processed to the target tracking object closest to it in three-dimensional distance.
An embodiment of the present application further provides a device for segmenting the adhesion foreground of a depth image, comprising: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform any implementation of the above method for segmenting the adhesion foreground of a depth image.
An embodiment of the present application further provides a computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to execute any implementation of the above method for segmenting the adhesion foreground of a depth image.
According to the method and the device for segmenting the adhesion foreground of a depth image, after a target depth image to be segmented, comprising a background and a foreground where target tracking objects are located, is obtained, the background and the foreground are first separated and connected-region segmentation is performed on the target depth image to obtain each connected region (blob) it contains; each blob is then classified to obtain each type of blob contained in the image; blobs of preset adhesion types are then divided into different small connected regions (patches) according to a preset division rule; and finally, by traversing each patch, all patches belonging to the same target tracking object are aggregated one by one. Thus, in the embodiments of the present application, each blob contained in the depth image is classified, blobs with adhered foreground are divided into several patches, the patches are matched with the target tracking objects one by one, and all patches belonging to the same tracked object are aggregated to obtain all complete target tracking objects. Accurate segmentation of the adhesion foreground is therefore achieved when only a depth image is available, the segmentation cost and computation load are reduced, the real-time performance of segmentation is improved, and the method has wide application scope.
Drawings
To more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
Fig. 1 is a schematic flow chart of a method for segmenting an adhesion foreground of a depth image according to an embodiment of the present application;
fig. 2 is a schematic diagram of the background and the foreground where the target tracking object is located in an obtained target depth image according to an embodiment of the present application;
fig. 3 is a schematic diagram of each connected region blob included in a target depth image according to an embodiment of the present application;
FIG. 4 is a schematic flow chart of classifying blobs included in a target depth image according to an embodiment of the present application;
FIG. 5 is a schematic flow chart of dividing a blob of a preset type into different small connected regions according to a preset division rule according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the effect of dividing a blob of a preset type into different small connected regions according to a preset division rule according to an embodiment of the present application;
FIG. 7 is a schematic flow chart of partitioning a blob containing one target tracking object adhered to the background according to an embodiment of the application;
FIG. 8 is a schematic diagram of the effect of dividing a blob containing an adhesion between two target tracking objects according to an embodiment of the application;
fig. 9 is a schematic diagram of a composition of a device for adhering foreground segmentation of a depth image according to an embodiment of the present application.
Detailed Description
At present, pedestrian tracking and gesture recognition based on depth images is a basic technology that can be used in fields such as human-computer interaction, somatosensory games, and human behavior analysis. When a target tracking object moving in a depth image comes into contact with other objects or with other tracked objects, that is, when foreground regions adhere to one another, accurately segmenting the target tracking object is a precondition for its continued tracking and for gesture recognition. Existing methods for segmenting the adhesion foreground of a depth image mainly rely on registered color image information, for example, using a neural network to segment the adhesion foreground on the color image. Although a neural network can achieve high segmentation accuracy, it usually requires a large amount of manually annotated data, its cost is high, its computation load is heavy, and real-time segmentation cannot be achieved. In addition, because registered color image information is required, the adhesion foreground cannot be segmented when only a depth image is available.
To overcome the above drawbacks, an embodiment of the present application provides a method for segmenting the adhesion foreground of a depth image. After a target depth image to be segmented, comprising a background and a foreground where target tracking objects are located, is obtained, the background and the foreground are first separated and connected-region segmentation is performed on the target depth image to obtain each connected region (blob) it contains; each blob is then classified to obtain each type of blob contained in the image; blobs of preset adhesion types are then divided into different small connected regions (patches) according to a preset division rule; and finally, by traversing each patch, all patches belonging to the same target tracking object are aggregated one by one, yielding all complete target tracking objects in the target depth image. Thus, in the embodiments of the present application, each blob contained in the depth image is classified, blobs with adhered foreground are divided into several patches, the patches are matched with the target tracking objects one by one, and all patches belonging to the same tracked object are aggregated to obtain all complete target tracking objects, so that the adhesion foreground is accurately segmented when only a depth image is available, the segmentation cost and computation load are reduced, the real-time performance of segmentation is improved, and the method has wide application scope.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
First embodiment
Referring to fig. 1, a flow chart of the method for segmenting the adhesion foreground of a depth image according to the present embodiment is provided; the method includes the following steps:
s101: acquiring a target depth image to be segmented; the target depth image comprises a background and a foreground where a target tracking object is located.
In this embodiment, any depth image whose adhesion foreground is to be segmented using this embodiment is defined as a target depth image, where the target depth image includes a background and a foreground where target tracking objects are located. A depth image, also called a range image, is an image whose pixel values are the distances (depths) from the image collector to points in the scene; it directly reflects the geometric shape of the scene's visible surfaces. A target tracking object is a moving object in the target depth image, such as a moving person in the scene; correspondingly, the foreground where the target tracking object is located is the pixel region of the target depth image occupied by the target tracking object, such as the pixel region corresponding to a moving person. The background is everything in the image outside the target pixel regions of interest, i.e., the pixel region outside the foreground where the target tracking objects are located. The adhesion foreground refers to a foreground in which a target tracking object is in contact with the background, or in which several target tracking objects are in contact with one another, or in which several target tracking objects are in contact with one another and with the background; for example, a pedestrian (target tracking object) in the scene holding a chair (background), or two pedestrians (two target tracking objects) walking side by side. It should be noted that the target depth image may be captured by a camera or depth camera equipped with a depth module, which directly provides accurate three-dimensional coordinates of the target.
S102: and acquiring a background in the target depth image and a foreground where the target tracking object is located, and carrying out connected region segmentation on the target depth image to obtain each connected region blob contained in the target depth image.
In this embodiment, after the target depth image to be segmented is obtained in step S101, any existing or future foreground segmentation algorithm may be used to separate the foreground and background of the target depth image, for example the codebook algorithm, a Gaussian mixture model, or the ViBe algorithm, so as to accurately separate the background of the target depth image from the foreground where the target tracking objects are located. As shown in fig. 2, the black area represents the background and the white area represents the foreground. It can be seen that the foreground segmentation result contains some noise: parts of the background area may be mistakenly segmented as foreground, and areas that belong to the foreground may be segmented as background.
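For illustration only, the following is a minimal sketch of one such depth-based foreground segmentation, not the patent's own algorithm: it assumes a static scene, learns the background as the per-pixel median depth over frames of the empty scene, and treats zero depth as invalid; the tolerance tol_mm is an assumed parameter.

```python
import numpy as np

def build_background(depth_frames, valid_min=1):
    # Per-pixel median over a stack of depth frames of the empty scene;
    # zeros are treated as invalid (no depth measurement).
    stack = np.stack(depth_frames).astype(np.float32)
    stack[stack < valid_min] = np.nan
    return np.nanmedian(stack, axis=0)

def segment_foreground(depth, background, tol_mm=80):
    # A pixel is foreground if it is measurably closer to the
    # camera than the learned background depth.
    valid = (depth > 0) & np.isfinite(background)
    return valid & (background - depth > tol_mm)
```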
Then, the target depth image can be segmented into connected regions using any existing or future connected-domain algorithm, obtaining each connected region (blob) contained in the target depth image. For example, a one-pass, two-pass, or multi-pass algorithm may be used, so that each distinct target tracking object in the target depth image and each independent individual in the background is segmented into its own connected-domain part, e.g., a complete pedestrian (target tracking object), a chair (background), a wall (background), etc. As shown in fig. 3, different color blocks represent different connected domains. It can be seen that most independent targets are segmented into one complete blob, but when the depth values inside a target vary greatly, the target may be split into several blobs, and when targets are in contact with one another, they may be merged into one complete blob.
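As a hedged sketch of this connected-region step (the patent names only one-pass/two-pass/multi-pass labeling; the 4-connectivity and the depth tolerance used here are assumptions), pixels are grouped into one blob when they are adjacent and their depths are nearly continuous:

```python
from collections import deque
import numpy as np

def connected_blobs(depth, depth_tol=50):
    """Label pixels into blobs: 4-neighbors join the same blob when
    their depth values differ by less than depth_tol (zero = invalid)."""
    h, w = depth.shape
    labels = np.full((h, w), -1, dtype=np.int32)
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1 or depth[sy, sx] == 0:
                continue
            labels[sy, sx] = next_id          # start a new blob
            q = deque([(sy, sx)])
            while q:                          # breadth-first flood fill
                y, x = q.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny, nx] == -1
                            and depth[ny, nx] > 0
                            and abs(int(depth[ny, nx]) - int(depth[y, x])) < depth_tol):
                        labels[ny, nx] = next_id
                        q.append((ny, nx))
            next_id += 1
    return labels, next_id
```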
S103: and classifying the blobs contained in the target depth image to obtain the blobs of each type contained in the target depth image.
In this embodiment, after each blob contained in the target depth image is obtained in step S102, note that when a target tracking object is in contact with another object (such as the background or another target tracking object), the two may have been segmented into one complete blob in step S102; in addition, one complete target tracking object may have been split into several blobs. Each blob contained in the target depth image therefore needs to be classified, yielding each type of blob contained in the target depth image.
In an alternative implementation manner of the embodiment of the present application, the specific implementation process of the present step S103 may include the following steps S1 to S3:
Step S1: when the foreground proportion in the blob is judged to be smaller than the foreground proportion threshold, the type of the blob is determined to be a blob containing only the background.
Step S2: when the foreground proportion in the blob is judged to be not smaller than the foreground proportion threshold: if the proportion of the total blob area occupied by the part of one target tracking object's previous-frame region that appears in the blob is greater than the first proportion threshold, the type of the blob is determined to be a blob containing only one target tracking object; if that proportion is smaller than the second proportion threshold (which is much smaller than the first), the type of the blob is determined to be a blob containing only the background; and if that proportion lies between the second and first proportion thresholds, the type of the blob is determined to be a blob containing one target tracking object adhered to the background.
Step S3: when the foreground proportion in the blob is judged to be not smaller than the foreground proportion threshold and the previous-frame regions of at least two target tracking objects appear in the blob: if the number of valid target tracking objects, whose proportion of the total blob area is greater than the third proportion threshold and whose normalized proportion value is greater than the normalization proportion threshold, is 0, the type of the blob is determined to be a blob containing only the background; if that number is 1, step S2 is executed again to determine the type of the blob; and if that number is greater than 1, the proportions of the total blob area occupied by the valid target tracking objects' previous-frame regions appearing in the blob are summed: if the summed proportion is greater than the fourth proportion threshold, the type of the blob is determined to be a blob containing at least two target tracking objects adhered to one another; otherwise, it is determined to be a blob containing at least two target tracking objects adhered to the background.
Specifically, as shown in fig. 4, after each blob contained in the target depth image is acquired in step S102, the foreground information in each blob is first counted; if the foreground proportion in the blob (defined herein as for_ratio) is smaller than the foreground proportion threshold (defined herein as for_ratio_threshold), the type of the blob is determined to be a blob containing only the background. This is because if the foreground proportion in a blob is too small, it is most likely noise from the foreground segmentation falling inside a background blob, and therefore cannot be attributed to any target tracking object. It should be noted that for_ratio_threshold is set according to the performance of the foreground segmentation algorithm, the frame rate, the motion speed of the target tracking object, and the camera parameters; the specific value may be set according to the actual situation and is not limited in this embodiment, for example, for_ratio_threshold may be set to 0.1.
In addition, when the foreground proportion for_ratio in the blob is not smaller than for_ratio_threshold, for each target tracking object currently being tracked, the proportion of the current blob's total area occupied by the part of the object's previous-frame region that appears in the current blob is counted (defined herein as pre_object_ratio). For convenience of subsequent processing and expression, the proportion of the previous frame's background region appearing in the current blob is denoted pre_object_ratio[0], i.e., the background is regarded as the tracked object with ID 0, and the remaining target tracking objects are numbered 1, 2, …, N. Next, the number of tracked targets with pre_object_ratio[i] > 0 (i > 0) in the current blob is counted (defined herein as pre_object_num). It will be appreciated that if pre_object_num is 0, i.e., no tracked target information from the previous frame appears in the current blob, the type of the blob is determined to be a blob containing only the background.
As shown in fig. 4, when for_ratio is not smaller than for_ratio_threshold and pre_object_num equals 1, exactly one target tracking object, say target tracking object No. 1, has pre_object_ratio greater than 0. If the proportion of the total blob area occupied by the part of its previous-frame region appearing in the blob, pre_object_ratio, is greater than the first proportion threshold (defined herein as pre_object_ratio_threshold1), the blob is considered to belong entirely to target tracking object No. 1, i.e., the type of the blob is determined to be a blob containing only one target tracking object.
The type of the blob is considered to be a background-only blob if the proportion of the part of the previous-frame region appearing in the blob, pre_object_ratio, is smaller than the second proportion threshold (defined herein as pre_object_ratio_threshold2), where pre_object_ratio_threshold2 is much smaller than pre_object_ratio_threshold1. If pre_object_ratio is smaller than pre_object_ratio_threshold1 but not smaller than pre_object_ratio_threshold2, the type of the blob is determined to be a blob containing target tracking object No. 1 adhered to the background, and object No. 1 then needs to be segmented from the background through the following steps S104-S105.
It should be noted that pre_object_ratio_threshold1 should be a large value, i.e., most of the blob's area should coincide with the area of target tracking object No. 1 in the previous frame, whereas pre_object_ratio_threshold2 should be a small value, i.e., most of the blob's area should coincide with the background area of the previous frame. Both thresholds are set according to the motion speed of the target tracking object; the specific values may be set according to the actual situation and are not limited in this embodiment, for example, pre_object_ratio_threshold1 may be set to 0.9 and pre_object_ratio_threshold2 to 0.1.
As shown in fig. 4, when for_ratio is not smaller than for_ratio_threshold and pre_object_num is greater than 1, several target tracking objects have pre_object_ratio greater than 0. The pre_object_ratio values are first normalized to obtain normalized proportion values (defined herein as pre_object_ratio_norm). The number of valid target tracking objects, whose pre_object_ratio is greater than the third proportion threshold (defined herein as pre_object_ratio_threshold3) and whose pre_object_ratio_norm is greater than the normalization proportion threshold (defined herein as pre_object_ratio_norm_threshold), is then counted (defined herein as valid_pre_object_num). It should be noted that pre_object_ratio_threshold3 and pre_object_ratio_norm_threshold are set according to the motion speed of the target tracking objects, the number pre_object_num of previous-frame target tracking objects appearing in the current blob, and the camera frame rate; the specific values may be set according to the actual situation and are not limited in this embodiment, for example, pre_object_ratio_threshold3 may be set to 0.05 and pre_object_ratio_norm_threshold to 0.15.
If valid_pre_object_num equals 0, the previous frame's target tracking objects occupy only an extremely small proportion of the current blob, so the type of the current blob can be determined to be a blob containing only the background.
If valid_pre_object_num equals 1, there is only one valid target tracking object. Its pre_object_ratio is compared with pre_object_ratio_threshold1 and pre_object_ratio_threshold2: if it is greater than pre_object_ratio_threshold1, the type of the blob is determined to be a blob containing only one target tracking object; if it is smaller than pre_object_ratio_threshold2, the type of the blob is determined to be a blob containing only the background; otherwise, the blob belongs jointly to the target tracking object and the background, so its type is determined to be a blob containing one target tracking object adhered to the background, and the object then needs to be segmented from the background through the following steps S104-S105.
If valid_pre_object_num is greater than 1, several valid target tracking objects each occupy a non-negligible proportion. In this case, the proportions of the total blob area occupied by the valid target tracking objects' previous-frame regions appearing in the blob are summed. If the summed proportion is greater than a fourth proportion threshold (defined herein as pre_object_ratio_threshold4), the blob belongs jointly to several target tracking objects, so its type is determined to be a blob containing at least two target tracking objects adhered to one another; if the summed proportion is not greater than pre_object_ratio_threshold4, the blob belongs jointly to several target tracking objects and the background, so its type is determined to be a blob containing at least two target tracking objects adhered to the background. In both cases, the target tracking objects need to be segmented from one another and from the background through the subsequent steps S104-S105.
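A hypothetical Python condensation of this S1-S3 decision tree is shown below; the function and label names are illustrative, and the value of pre_object_ratio_threshold4 (t4) is an assumption, since the text does not specify one:

```python
def classify_blob(for_ratio, pre_object_ratio,
                  for_ratio_threshold=0.1,
                  t1=0.9, t2=0.1, t3=0.05, norm_t=0.15, t4=0.9):
    """Hypothetical sketch of the S1-S3 decision tree.
    pre_object_ratio: {object_id: overlap ratio}, where ID 0 is the
    background and IDs 1..N are the target tracking objects."""
    if for_ratio < for_ratio_threshold:
        return "background_only"                      # S1
    objects = {i: r for i, r in pre_object_ratio.items() if i > 0 and r > 0}
    if not objects:
        return "background_only"                      # no previous-frame object
    total = sum(objects.values())
    valid = [i for i, r in objects.items() if r > t3 and r / total > norm_t]
    if len(valid) == 0:
        return "background_only"                      # S3, 0 valid objects
    if len(valid) == 1:                               # S2, or S3 with 1 valid object
        r = objects[valid[0]]
        if r > t1:
            return "single_object"
        if r < t2:
            return "background_only"
        return "object_adhered_to_background"
    summed = sum(objects[i] for i in valid)           # S3, >1 valid objects
    if summed > t4:
        return "objects_adhered_to_each_other"
    return "objects_adhered_to_background"
```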
S104: and dividing the preset type of the adhered blob into different small connected areas patch according to a preset dividing rule.
In this embodiment, after each type of blob contained in the target depth image is obtained in step S103, for blobs of the preset adhesion types, median filtering is first performed to reduce the influence of noise while preserving the edge information inside the blob. The blob of a preset type can then be partitioned into several different small connected regions (patches) based on the depth value information inside the blob. The patches are subsequently attributed according to the preset division rule, the attribution situation of the blob, the depth information of the patch, the foreground information in the patch, and the previous frame's target tracking object information.
The preset types of blobs comprise three adhesion types, namely: a blob containing one target tracking object adhered to the background; a blob containing at least two target tracking objects adhered to one another; and a blob containing at least two target tracking objects adhered to the background.
In an alternative implementation manner of the embodiment of the present application, the specific implementation process of the present step S104 may include the following steps A1-A4:
Step A1: each depth pixel point in the preset-type blob is taken as an independent patch, and a patch data structure object is allocated to each pixel; the patch data structure includes the patch's number, pixel count, and depth value.
Step A2: each pair of adjacent patches in the blob is taken as one edge; the structure of an edge includes the positions of its two endpoints and a weight.
Step A3: all edges in the blob are arranged in ascending order of weight.
Step A4: according to the relation between each edge's weight and the depth values of its two endpoints, the patches containing the two endpoints of each edge that meets the preset condition are merged into one patch.
Specifically, as shown in fig. 5, after the blobs of each preset type contained in the target depth image are obtained in step S103, for each such blob, each depth pixel point in the blob is first taken as an independent patch, and a patch data structure object is allocated to each pixel; the patch data structure includes the patch's number (id), its pixel count (how many pixels the patch contains), and its depth value. At initialization, each pixel in the blob is an independent patch whose id is the pixel's sequence number within the blob, whose size is 1, and whose representative depth is that pixel's depth value.
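A minimal sketch of this initialization, assuming the flat per-blob pixel ordering described above (the class and function names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Patch:
    id: int        # sequence number of the pixel within the blob
    size: int      # number of pixels merged into the patch
    depth: float   # representative depth value

def init_patches(blob_depths):
    # One single-pixel patch per depth pixel in the blob;
    # blob_depths is the flat list of depth values inside the blob.
    return [Patch(id=i, size=1, depth=d) for i, d in enumerate(blob_depths)]
```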
Then, each pair of adjacent depth pixels in the blob is taken as one edge (defined herein as edge); for example, the current patch and the patch above it can form an edge. The structure of an edge includes three main members: the positions of the edge's two endpoints (defined herein as a and b) and a weight, where the weight is the absolute value of the difference between the depth values of the two endpoints (defined herein as w).
Then, all edges in the blob are arranged in ascending order of the weight w, with the edges of smaller weight placed first.
Finally, all edges in the blob are traversed, and for each edge it is decided, according to the relation between the edge's weight w and the patches containing its two endpoints a and b, whether to merge those patches into one. Specifically, if the edge's weight is smaller than both the threshold threshold_a determined by the depth value depth_a of patch_a containing endpoint a and the threshold threshold_b determined by the depth value depth_b of patch_b containing endpoint b, then patch_a and patch_b are merged into patch_c, whose number is the smaller of the numbers of patch_a and patch_b, whose pixel count is the sum of the two, and whose depth value is the smaller of the depth values of patch_a and patch_b; otherwise, patch_a and patch_b remain unchanged. The thresholds threshold_a and threshold_b are determined by the patch's depth value and the measurement error of the target depth image: in general, the larger the depth value, the larger the corresponding measurement error, and thus the larger the threshold. For example, if the depth value of endpoint a is 2 m and that of endpoint b is 3 m, with threshold_a being 5 mm and threshold_b being 10 mm, then the 1 m difference between the two depth values is greater than 5 mm, so the two endpoints cannot be merged into one patch.
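The following union-find sketch illustrates this merging pass; the linear error model in depth_threshold is an assumption chosen to roughly match the 5 mm at 2 m figure above, not a formula from the text:

```python
class DisjointPatches:
    """Union-find over single-pixel patches; each root stores the
    patch's pixel count and representative (smallest) depth."""
    def __init__(self, depths):
        self.parent = list(range(len(depths)))
        self.size = [1] * len(depths)
        self.depth = list(depths)

    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if rb < ra:                  # keep the smaller id as the patch number
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.depth[ra] = min(self.depth[ra], self.depth[rb])

def depth_threshold(depth_mm):
    # Assumed error model: tolerance grows linearly with depth,
    # giving 5 mm at 2 m as in the example above.
    return 0.0025 * depth_mm

def merge_patches(depths, edges):
    # edges: iterable of (weight, a, b) with weight = |depth[a] - depth[b]|;
    # processing in ascending weight order merges flat regions first.
    dp = DisjointPatches(depths)
    for w, a, b in sorted(edges):
        ra, rb = dp.find(a), dp.find(b)
        if ra != rb and w < min(depth_threshold(dp.depth[ra]),
                                depth_threshold(dp.depth[rb])):
            dp.union(ra, rb)
    return dp
```

Keeping the smaller id and the smaller depth at the root reproduces the patch_c numbering and representative-depth rule described above.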
In addition, all merged patches can be traversed once more, and any patch whose pixel count is too small (for example, fewer than 50 pixels) is merged into the neighboring patch whose representative depth value is closest to its own.
As shown in fig. 6, the left-hand image shows two pedestrians (i.e., two target tracking objects) in contact with each other, segmented into one complete blob at the blob-segmentation stage; the right-hand image is the result of re-segmenting that blob into patches, where different gray levels represent different patches. It can be seen that one large-area blob is partitioned into many small-area patches, and that the areas on the two sides of the contact boundary are partitioned into different patches, because the depth values still change more sharply across the contact boundary than in other areas.
S105: and traversing each patch, and aggregating all patches belonging to the same target tracking object one by one to obtain all complete target tracking objects in the target depth image.
In this embodiment, after the preset-type blobs are divided into different small connected regions (patches) in step S104, each patch can be traversed and all patches belonging to the same target tracking object aggregated one by one, so as to obtain all complete target tracking objects in the target depth image.
In an alternative implementation manner, the specific implementation process of this step S105 may include the following steps B1-B2:
Step B1: in a first traversal over the patches, each patch with high attribution confidence is assigned to the target tracking object it belongs to; each patch with lower attribution confidence but larger area is analyzed row by row and divided among the corresponding target tracking objects according to the proportion of each row segment's length occupied by pixels of the previous frame's target tracking objects; and each patch with lower attribution confidence and smaller area is marked as a patch to be processed.
Step B2: in a second traversal over the patches to be processed, each patch to be processed is assigned to the target tracking object closest to it in three-dimensional distance.
The attribution confidence of a patch refers to the proportion of the total area of the blob containing the patch that is occupied by the part of a target tracking object's previous-frame region appearing in that blob. It will be appreciated that the higher this proportion, the higher the attribution confidence of the corresponding patch, and conversely, the lower the proportion, the lower the attribution confidence.
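In other words, for a candidate object mask from the previous frame, this quantity can be sketched as follows (names are illustrative):

```python
import numpy as np

def attribution_confidence(prev_object_mask, blob_mask):
    # Fraction of the blob's area covered by the object's region
    # from the previous frame: high overlap means confident attribution.
    overlap = np.logical_and(prev_object_mask, blob_mask).sum()
    return overlap / max(blob_mask.sum(), 1)
```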
It should be noted that, for patches in the three different preset types of blobs (a blob containing one target tracking object adhered to the background, a blob containing at least two target tracking objects adhered to one another, and a blob containing at least two target tracking objects adhered to the background), the overall idea of the traversal and aggregation process is the same, but the details of the specific implementations differ. The traversal and aggregation processes for the three cases are introduced in turn below:
(1) The first case concerns patches in a blob containing one target tracking object adhered to the background. As shown in FIG. 7, since the blob has been determined to belong jointly to one target tracking object and the background, each of its patches belongs either entirely to the target tracking object, entirely to the background, or to both at once; most patches belong entirely to one or the other, and only a few belong to both. First, an unprocessed patch is taken and its foreground proportion for_ratio is computed; if this proportion is smaller than a preset threshold for_ratio_threshold1, the patch necessarily belongs to the background. Otherwise, the proportion pre_object_ratio of the current patch's area occupied by the part of the target tracking object's previous-frame region appearing in the patch is computed; if pre_object_ratio is greater than a high threshold pre_object_ratio_threshold1, the patch is attributed to the target tracking object. Otherwise, pre_object_ratio is further compared with a smaller threshold pre_object_ratio_threshold2: if it is smaller, then according to the relation between the current patch's area and an area threshold patch_area_threshold, a patch with larger area is attributed to the background, while a patch with smaller area is marked as a patch to be processed in the second traversal. A patch whose pre_object_ratio is greater than the smaller threshold pre_object_ratio_threshold2 is analyzed row by row (specifically, after the bounding rectangle of the patch is determined, the patch pixels contained in each row of the rectangle are analyzed row by row). For any row of the patch, the row's pixels are first divided into several segments of consecutive pixels according to whether they are contiguous; for each segment, the proportion of the current segment's length occupied by pixels of the previous frame's tracked target is counted; if the proportion is greater than a threshold pre_object_ratio_threshold3, the segment is marked as belonging to the target tracking object, otherwise it is marked as belonging to the background. After all patches of the blob have been processed, the second traversal begins: for each patch to be processed, the closest three-dimensional distance to the adjacent patches already marked as belonging to the target tracking object is computed; if this distance is smaller than a threshold range_threshold, the patch is marked as belonging to the target tracking object, otherwise it is marked as belonging to the background.
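A hedged sketch of the row-segment analysis for a single target tracking object (the mask-based representation and the segment threshold value are assumptions):

```python
import numpy as np

def split_row_segments(row_mask):
    # Split one row of a patch into runs of consecutive pixels.
    cols = np.flatnonzero(row_mask)
    if cols.size == 0:
        return []
    breaks = np.flatnonzero(np.diff(cols) > 1) + 1
    return np.split(cols, breaks)

def attribute_rows(patch_mask, prev_object_mask, ratio_threshold=0.5):
    """For each row segment of the patch, mark it as object pixels when
    the previous frame's object covers a large enough share of it;
    ratio_threshold stands in for pre_object_ratio_threshold3."""
    out = np.zeros_like(patch_mask, dtype=bool)
    for y in range(patch_mask.shape[0]):
        for seg in split_row_segments(patch_mask[y]):
            overlap = prev_object_mask[y, seg].sum()
            if overlap / seg.size > ratio_threshold:
                out[y, seg] = True   # segment belongs to the object
    return out
```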
(2) The second case concerns patches in a blob containing at least two target tracking objects adhered to one another. The traversal and aggregation flow is basically the same as the flow described in (1) for a blob containing one target tracking object adhered to the background, with two differences. First, since the blob has been determined to belong jointly to several target tracking objects, each patch belongs either to one target tracking object or to several, and cannot belong to the background; the foreground proportion inside the patch therefore need not be counted, and the patch need not be attributed on the basis of that proportion. Second, where in case (1) the attribution of a patch or line segment is decided from the proportion of the single target tracking object's previous-frame pixel area covering the current patch or segment, here the attribution is decided from the relative proportions of the several target tracking objects' previous-frame pixel areas in the current patch or segment: if the pre_object_ratio of some target tracking object is far higher than the threshold pre_object_ratio_threshold1, the patch is marked as belonging to that object; otherwise, a patch with larger area is judged segment by segment according to its area, and a patch with smaller area is assigned, in the second traversal, to the target tracking object closest in three-dimensional distance.
FIG. 8 shows the result of re-segmenting the tracked pedestrians of FIG. 6 (i.e., the two target tracking objects adhered to each other). Most of the segmented regions are correct; only a few pixels near the contact boundary are wrongly assigned to the other target, which has no effect on subsequent pedestrian tracking and other data processing, such as human gesture recognition.
(3) The third case concerns the patches in a blob containing at least two target tracking objects adhered to the background. The traversal and aggregation flow is basically the same as the flow described in (1) for the patches of a blob containing one target tracking object adhered to the background, except that, provided the foreground proportion of the patch is larger than the threshold fore_ratio_threshold, the background is treated as tracked target No. 0 and its pre_object_ratio is compared with those of the other tracked targets; the largest of them is found, and according to the relation between this largest pre_object_ratio and the thresholds pre_object_ratio_threshold1 and pre_object_ratio_threshold2, the patch is attributed to the background or to a foreground object, or is processed further row by row.
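Continuing the same hedged sketch, case (3) can reuse the multi-object comparison by letting the background compete as pseudo tracked target No. 0; the winner would then pass through the same threshold cascade and row-by-row analysis as in case (1).

```python
def attribute_with_background(patch_pixels, prev_masks, bg_mask, fore_ratio, t):
    """Case (3): let the background compete as pseudo target No. 0.

    bg_mask is the set of (y, x) pixels taken as the background reference;
    threshold names are assumptions. The returned winner would still pass
    through the threshold cascade / row-by-row analysis of case (1)."""
    if fore_ratio < t['fore_ratio_threshold']:
        return 0                                 # mostly background pixels
    masks = {0: bg_mask}
    masks.update(prev_masks)                     # real targets keep their ids
    ratios = {oid: len(patch_pixels & m) / len(patch_pixels)
              for oid, m in masks.items()}
    return max(ratios, key=ratios.get)           # largest pre_object_ratio wins
```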
In summary, in the method for adhesion foreground segmentation of a depth image provided in this embodiment, after a target depth image to be segmented, comprising a background and a foreground where the target tracking objects are located, is acquired, the background and the foreground are first obtained and connected region segmentation is performed on the target depth image to obtain each connected region blob it contains; each blob is then classified to obtain the blobs of each type contained in the target depth image; blobs of the preset types are then divided into different small connected regions, patches, according to a preset division rule; and finally all patches belonging to the same target tracking object are aggregated one by one by traversing each patch, yielding all the complete target tracking objects in the target depth image. In this way, by classifying each blob contained in the depth image, dividing the blobs with adhered foreground into several patches, matching the patches with the target tracking objects one by one, and aggregating all patches belonging to the same tracked object into complete target tracking objects, the adhered foreground is accurately segmented with only a depth image, the segmentation cost and computation are reduced, the real-time performance of segmentation is improved, and the method has wide application space.
Second embodiment
This embodiment describes a device for adhesion foreground segmentation of a depth image; for related content, refer to the method embodiment above.
Referring to FIG. 9, which is a schematic composition diagram of the device for adhesion foreground segmentation of a depth image according to this embodiment, the device includes:
an acquisition unit 901, configured to acquire a target depth image to be segmented; the target depth image comprises a background and a foreground where a target tracking object is located;
the segmentation unit 902 is configured to obtain the background in the target depth image and the foreground where the target tracking object is located, and to perform connected region segmentation on the target depth image, so as to obtain each connected region blob contained in the target depth image;
a classification unit 903, configured to classify each blob contained in the target depth image, so as to obtain each type of blob contained in the target depth image;
a dividing unit 904, configured to divide a blob of a preset type into different small connected areas patch according to a preset dividing rule;
and an obtaining unit 905, configured to aggregate, by traversing each patch, all patches belonging to the same target tracking object one by one, to obtain all complete target tracking objects in the target depth image.
In one implementation of this embodiment, the classification unit 903 includes:
a first determining subunit, configured to determine, when it is determined that the foreground proportion in the blob is smaller than the foreground proportion threshold, that the type of the blob is a blob that only includes a background;
a second determining subunit, configured to, when it is determined that the foreground proportion in the blob is not less than the foreground proportion threshold: determine that the type of the blob is a blob containing only one target tracking object if the proportion of the part of the target tracking object's previous-frame region appearing in the blob to the total blob area is judged to be greater than the first proportion threshold; determine that the type of the blob is a blob containing only a background if that proportion is judged to be smaller than a second proportion threshold, the second proportion threshold being much smaller than the first proportion threshold; and determine that the type of the blob is a blob containing one target tracking object adhered to the background if that proportion is judged to be smaller than the first proportion threshold and not smaller than the second proportion threshold;
a third determining subunit, configured to, when it is determined that the foreground proportion in the blob is not less than the foreground proportion threshold: determine that the type of the blob is a blob containing only a background if the proportion of the parts of the previous-frame regions of at least two target tracking objects appearing in the blob to the total blob area is judged to be greater than a third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than a normalization proportion threshold is 0; invoke the second determining subunit to determine the type of the blob if that proportion is judged to be greater than the third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than the normalization proportion threshold is 1; and, if that proportion is judged to be greater than the third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than the normalization proportion threshold is greater than 1, add up the proportions of the parts of the previous-frame regions of the effective target tracking objects appearing in the blob to the total blob area, determine that the type of the blob is a blob containing adhesion between at least two target tracking objects if the summed proportion value is greater than a fourth proportion threshold, and determine that the type of the blob is a blob containing at least two target tracking objects adhered to the background if the summed proportion value is not greater than the fourth proportion threshold. An illustrative sketch of this classification cascade is given below.
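As a reading aid, the cascade implemented by the three determining subunits can be condensed as follows; the type labels and threshold keys are placeholders, and the control flow is one plausible reading of the steps above rather than the embodiment's fixed logic.

```python
def classify_blob(fore_ratio, overlaps, t):
    """fore_ratio -- foreground proportion of the blob
    overlaps   -- {object_id: share of the blob area covered by that
                  object's previous-frame region}
    t          -- thresholds; keys 'fore', 'first', 'second', 'third',
                  'norm', 'fourth' are placeholders."""
    if fore_ratio < t['fore']:                   # S1
        return 'background_only'
    if len(overlaps) == 1:                       # S2: a single candidate object
        r = next(iter(overlaps.values()))
        if r > t['first']:
            return 'single_object'
        if r < t['second']:
            return 'background_only'
        return 'object_stuck_to_background'
    total = sum(overlaps.values())               # S3: several candidate objects
    if total <= t['third']:
        return 'background_only'
    valid = [oid for oid, r in overlaps.items() if r / total > t['norm']]
    if not valid:
        return 'background_only'
    if len(valid) == 1:                          # degenerate case: back to S2
        return classify_blob(fore_ratio, {valid[0]: overlaps[valid[0]]}, t)
    merged = sum(overlaps[oid] for oid in valid)
    return ('objects_stuck_together' if merged > t['fourth']
            else 'objects_stuck_to_background')
```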
In one implementation of this embodiment, the preset type of blob includes the following three adhesion-type blobs:
a blob containing one target tracking object adhered to the background;
a blob containing adhesion between at least two target tracking objects;
a blob containing at least two target tracking objects adhered to the background.
In one implementation of this embodiment, the dividing unit 904 includes:
the allocation subunit is configured to take each depth pixel point in the preset type of blob as an independent patch and allocate a patch data structure object to each pixel; the patch data structure comprises the number of the patch, the number of pixel points, and a depth value;
the edge subunit is configured to take each pair of patches with an adjacency relation in the blob as one edge; the structure of an edge comprises the positions and weights of its two endpoints;
the arrangement subunit is configured to arrange all edges in the blob in ascending order of weight;
and the merging subunit is configured to merge the patches where the two endpoints of an edge meeting a preset condition are located into one patch, according to the relation between the weight of each edge in the blob and the depth values of its two endpoints. A sketch of this edge-driven merging is given below.
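The division flow above (single-pixel patches, weighted edges between adjacent patches, ascending sort, conditional merging) is in the spirit of graph-based segmentation. A minimal union-find sketch follows; the merge predicate with a per-patch adaptive tolerance k is an assumption standing in for the unspecified "preset condition", and in the embodiment the loop would run only over the pixels of one blob rather than a whole depth array.

```python
import numpy as np

def merge_patches(depth, k=50.0):
    """Graph-based merging over a 2D depth array (illustrative only)."""
    h, w = depth.shape
    n = h * w
    parent = list(range(n))                      # one single-pixel patch each
    size = [1] * n
    tol = [k] * n                                # per-patch internal tolerance

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]        # path halving
            i = parent[i]
        return i

    edges = []                                   # (weight, a, b) for 4-neighbors
    d = depth.astype(np.float64)
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                edges.append((abs(d[y, x] - d[y, x + 1]), i, i + 1))
            if y + 1 < h:
                edges.append((abs(d[y, x] - d[y + 1, x]), i, i + w))
    edges.sort(key=lambda e: e[0])               # ascending by weight

    for wgt, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb and wgt <= min(tol[ra], tol[rb]):
            parent[rb] = ra                      # merge the two patches
            size[ra] += size[rb]
            tol[ra] = wgt + k / size[ra]         # tighten tolerance as it grows
    return [find(i) for i in range(n)]           # patch label per pixel
```

With this choice, small depth discontinuities inside one surface are absorbed while larger jumps between surfaces keep patches separate, which is the behavior the division rule is after.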
In one implementation of the present embodiment, the obtaining unit 905 includes:
the first traversal subunit is configured to, after a first traversal of each patch, divide each patch with high attribution confidence into the target tracking object to which it belongs; analyze row by row each patch with lower attribution confidence but larger area, dividing it into the corresponding target tracking objects according to the confidence of the different segments of each row; and mark each patch with lower attribution confidence but smaller area as a patch to be processed;
and the second traversal subunit is configured to, after a second traversal of each patch to be processed, divide each patch to be processed into the target tracking object closest to it in three-dimensional distance. A sketch of this three-dimensional distance computation is given below.
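The second traversal needs a three-dimensional distance between a pending patch and the patches already labeled as objects. One way to obtain it, assuming a pinhole camera model with intrinsics fx, fy, cx, cy (the embodiment does not fix the back-projection model), is to back-project the depth pixels and take the minimum pairwise Euclidean distance.

```python
import numpy as np

def backproject(y, x, z, fx, fy, cx, cy):
    """Pinhole back-projection of a depth pixel to a 3D point (assumed model)."""
    return np.array([(x - cx) * z / fx, (y - cy) * z / fy, z])

def nearest_object(patch_pts, object_pts, fx, fy, cx, cy):
    """patch_pts  -- [(y, x, depth)] pixels of the pending patch
    object_pts -- {object_id: [(y, x, depth)] pixels already labeled}
    Returns (object id with the smallest 3D distance, that distance)."""
    patch_xyz = np.array([backproject(y, x, z, fx, fy, cx, cy)
                          for (y, x, z) in patch_pts])
    best_id, best_d = None, np.inf
    for oid, pts in object_pts.items():
        obj_xyz = np.array([backproject(y, x, z, fx, fy, cx, cy)
                            for (y, x, z) in pts])
        # minimum pairwise Euclidean distance between the two point sets
        dists = np.linalg.norm(patch_xyz[:, None, :] - obj_xyz[None, :, :],
                               axis=2)
        d = float(dists.min())
        if d < best_d:
            best_id, best_d = oid, d
    return best_id, best_d
```

In case (1) the returned distance would be compared with range_threshold, while in the multi-object cases the pending patch would simply be assigned to the returned object id.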
In summary, in the device for adhesion foreground segmentation of a depth image provided in this embodiment, after a target depth image to be segmented, comprising a background and a foreground where the target tracking objects are located, is acquired, the background and the foreground are first obtained and connected region segmentation is performed on the target depth image to obtain each connected region blob it contains; each blob is then classified to obtain the blobs of each type contained in the target depth image; blobs of the preset types are then divided into different small connected regions, patches, according to a preset division rule; and finally all patches belonging to the same target tracking object are aggregated one by one by traversing each patch, yielding all the complete target tracking objects in the target depth image. In this way, by classifying each blob contained in the depth image, dividing the blobs with adhered foreground into several patches, matching the patches with the target tracking objects one by one, and aggregating all patches belonging to the same tracked object into complete target tracking objects, the adhered foreground is accurately segmented with only a depth image, the segmentation cost and computation are reduced, the real-time performance of segmentation is improved, and the device has wide application space.
Further, the embodiment of the application also provides a device for adhering foreground segmentation of the depth image, which comprises: a processor, memory, system bus;
the processor and the memory are connected through the system bus;
the memory is configured to store one or more programs, the one or more programs comprising instructions which, when executed by the processor, cause the processor to perform any implementation of the adhesion foreground segmentation method of a depth image described above.
Further, an embodiment of the application also provides a computer-readable storage medium, wherein the computer-readable storage medium stores instructions which, when run on a terminal device, cause the terminal device to execute any implementation of the adhesion foreground segmentation method of a depth image described above.
From the above description of the embodiments, it will be apparent to those skilled in the art that all or part of the steps of the above example methods may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product, which may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, and which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network communication device such as a media gateway) to execute the methods described in the embodiments, or in some parts of the embodiments, of the present application.
It should be noted that the embodiments in this description are described in a progressive manner, each embodiment focusing on its differences from the others; for identical or similar parts between the embodiments, reference may be made to each other. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and reference may be made to the description of the method section for the relevant points.
It is further noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element preceded by the phrase "comprising a" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (4)

1. A method for adhesion foreground segmentation of a depth image, characterized by comprising the following steps:
acquiring a target depth image to be segmented; the target depth image comprises a background and a foreground where a target tracking object is located;
acquiring a background in the target depth image and a foreground where a target tracking object is located, and carrying out connected region segmentation on the target depth image to obtain each connected region blob contained in the target depth image;
classifying each blob contained in the target depth image to obtain each type of blob contained in the target depth image;
dividing a blob of a preset type into different small connected areas patch according to a preset dividing rule;
traversing each patch, and aggregating all patches belonging to the same target tracking object one by one to obtain all complete target tracking objects in the target depth image;
the classifying each blob contained in the target depth image to obtain each type of blob contained in the target depth image includes:
S1: when it is judged that the foreground proportion in a blob is smaller than a foreground proportion threshold, determining the type of the blob as a blob containing only a background;
S2: when it is judged that the foreground proportion in the blob is not smaller than the foreground proportion threshold: if the proportion of the part of the target tracking object's previous-frame region appearing in the blob to the total blob area is judged to be greater than a first proportion threshold, determining the type of the blob as a blob containing only one target tracking object; if that proportion is judged to be smaller than a second proportion threshold, determining the type of the blob as a blob containing only a background, the second proportion threshold being less than the first proportion threshold; and if that proportion is judged to be smaller than the first proportion threshold and not smaller than the second proportion threshold, determining the type of the blob as a blob containing one target tracking object adhered to the background;
S3: when it is judged that the foreground proportion in the blob is not smaller than the foreground proportion threshold: if the proportion of the parts of the previous-frame regions of at least two target tracking objects appearing in the blob to the total blob area is judged to be greater than a third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than a normalization proportion threshold is 0, determining the type of the blob as a blob containing only a background; if that proportion is judged to be greater than the third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than the normalization proportion threshold is 1, repeating step S2 to determine the type of the blob; and if that proportion is judged to be greater than the third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than the normalization proportion threshold is greater than 1, adding up the proportions of the parts of the previous-frame regions of the effective target tracking objects appearing in the blob to the total blob area, determining the type of the blob as a blob containing adhesion between at least two target tracking objects if the summed proportion value is greater than a fourth proportion threshold, and determining the type of the blob as a blob containing at least two target tracking objects adhered to the background if the summed proportion value is not greater than the fourth proportion threshold;
wherein the dividing of the blob of the preset type into different small connected areas patch according to the preset division rule comprises:
each depth pixel point in the preset type of blob is respectively used as an independent patch, and a patch data structure object is allocated to each pixel; the patch data structure comprises the number of the patch, the number of pixel points and a depth value;
taking all patch pairs with adjacent relations in the blob as one edge respectively; the structure of the edge comprises the positions and weights of two endpoints;
all edges in the blob are arranged in ascending order according to weight;
merging the patches where the two endpoints of an edge meeting the preset condition are located into one patch according to the relation between the weight of each edge in the blob and the depth values of its two endpoints;
wherein the aggregating, by traversing each patch, of all patches belonging to the same target tracking object one by one to obtain all complete target tracking objects in the target depth image comprises:
dividing, after a first traversal of each patch, each patch with high attribution confidence into the target tracking object to which it belongs; analyzing row by row each patch with lower attribution confidence but larger area, and dividing it into the corresponding target tracking objects according to the proportion of the number of pixels of the previous frame's target tracking object appearing in the different segments of each row to the length of the current segment; and marking each patch with lower attribution confidence but smaller area as a patch to be processed;
and dividing, after a second traversal of each patch to be processed, each patch to be processed into the target tracking object closest to it in three-dimensional distance.
2. A device for adhesion foreground segmentation of a depth image, characterized by comprising:
the acquisition unit is used for acquiring a target depth image to be segmented; the target depth image comprises a background and a foreground where a target tracking object is located;
the segmentation unit is used for acquiring a background in the target depth image and a foreground where a target tracking object is located, and carrying out connected region segmentation on the target depth image to obtain each connected region blob contained in the target depth image;
the classification unit is used for classifying the blobs contained in the target depth image to obtain blobs of all types contained in the target depth image;
the dividing unit is used for dividing the blob of a preset type into different small connected areas patch according to a preset division rule;
the obtaining unit is used for aggregating all patches belonging to the same target tracking object one by one by traversing each patch, to obtain all complete target tracking objects in the target depth image;
The classification unit includes:
a first determining subunit, configured to determine, when it is determined that the foreground proportion in the blob is smaller than the foreground proportion threshold, that the type of the blob is a blob that only includes a background;
a second determining subunit, configured to, when it is determined that the foreground proportion in the blob is not less than the foreground proportion threshold: determine that the type of the blob is a blob containing only one target tracking object if the proportion of the part of the target tracking object's previous-frame region appearing in the blob to the total blob area is judged to be greater than the first proportion threshold; determine that the type of the blob is a blob containing only a background if that proportion is judged to be smaller than a second proportion threshold, the second proportion threshold being less than the first proportion threshold; and determine that the type of the blob is a blob containing one target tracking object adhered to the background if that proportion is judged to be smaller than the first proportion threshold and not smaller than the second proportion threshold;
a third determining subunit, configured to, when it is determined that the foreground proportion in the blob is not less than the foreground proportion threshold: determine that the type of the blob is a blob containing only a background if the proportion of the parts of the previous-frame regions of at least two target tracking objects appearing in the blob to the total blob area is judged to be greater than a third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than a normalization proportion threshold is 0; invoke the second determining subunit to determine the type of the blob if that proportion is judged to be greater than the third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than the normalization proportion threshold is 1; and, if that proportion is judged to be greater than the third proportion threshold and the number of effective target tracking objects whose normalized proportion value is greater than the normalization proportion threshold is greater than 1, add up the proportions of the parts of the previous-frame regions of the effective target tracking objects appearing in the blob to the total blob area, determine that the type of the blob is a blob containing adhesion between at least two target tracking objects if the summed proportion value is greater than a fourth proportion threshold, and determine that the type of the blob is a blob containing at least two target tracking objects adhered to the background if the summed proportion value is not greater than the fourth proportion threshold;
The dividing unit includes:
the allocation subunit is used for respectively taking each depth pixel point in the preset type of blob as an independent patch and allocating a patch data structure object for each pixel; the patch data structure comprises the number of the patch, the number of pixel points and a depth value;
an edge subunit, configured to take each pair of patches with an adjacency relation in the blob as one edge, wherein the structure of an edge comprises the positions and weights of its two endpoints;
an arrangement subunit, configured to arrange all edges in the blob in ascending order according to weight sizes;
a merging subunit, configured to merge the patches where the two endpoints of an edge meeting a preset condition are located into one patch, according to the relation between the weight of each edge in the blob and the depth values of its two endpoints;
the obtaining unit includes:
a first traversal subunit, configured to, after a first traversal of each patch, divide each patch with high attribution confidence into the target tracking object to which it belongs; analyze row by row each patch with lower attribution confidence but larger area, dividing it into the corresponding target tracking objects according to the proportion of the number of pixels of the previous frame's target tracking object appearing in the different segments of each row to the length of the current segment; and mark each patch with lower attribution confidence but smaller area as a patch to be processed;
and a second traversal subunit, configured to, after a second traversal of each patch to be processed, divide each patch to be processed into the target tracking object closest to it in three-dimensional distance.
3. An adhesion foreground segmentation device for a depth image, comprising: a processor, a memory, and a system bus;
the processor and the memory are connected through the system bus;
the memory is for storing one or more programs, the one or more programs comprising instructions, which when executed by the processor, cause the processor to perform the method of claim 1.
4. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein instructions, which when run on a terminal device, cause the terminal device to perform the method of claim 1.
CN202010191067.3A 2020-03-18 2020-03-18 Method and device for segmenting adhesion foreground of depth image Active CN111429487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010191067.3A CN111429487B (en) 2020-03-18 2020-03-18 Method and device for segmenting adhesion foreground of depth image

Publications (2)

Publication Number Publication Date
CN111429487A CN111429487A (en) 2020-07-17
CN111429487B (en) 2023-10-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant