CN118262101A - Image processing method and electronic equipment - Google Patents
- Publication number
- CN118262101A CN118262101A CN202211699467.0A CN202211699467A CN118262101A CN 118262101 A CN118262101 A CN 118262101A CN 202211699467 A CN202211699467 A CN 202211699467A CN 118262101 A CN118262101 A CN 118262101A
- Authority
- CN
- China
- Prior art keywords
- image
- target
- current frame
- frame image
- preset
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/103—Static body considered as a whole, e.g. static pedestrian or occupant recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Abstract
The application discloses an image processing method and an electronic device for avoiding the problem that, when segmenting a target in an image, the target cannot be segmented because it occupies too small a proportion of the image, thereby improving the target segmentation effect. The image processing method provided by the application comprises the following steps: acquiring a current frame image and determining a target region of the current frame image; when it is determined that the current frame image needs to be cropped, cropping the current frame image according to the target region to obtain a cropped image, corresponding to the current frame image, that contains the target; and performing target segmentation based on the cropped image according to a preset image size required by a preset target segmentation algorithm, thereby extracting a target image.
Description
Technical Field
The present application relates to the field of image technologies, and in particular, to an image processing method and an electronic device.
Background
Portrait segmentation algorithms are widely used, for example in background blurring and background replacement. In recent years, as the accuracy and speed of detection algorithms have improved, it has become increasingly common to combine a human detection algorithm with a portrait segmentation algorithm. This combination enables segmentation of a specified portrait: for example, when an image contains several people, the target person can be identified by the human detection algorithm, and the portrait of that person can then be segmented by the portrait segmentation algorithm. Because the background area is reduced, the segmentation accuracy of the algorithm also improves to some extent.
Disclosure of Invention
The embodiments of the present application provide an image processing method and an electronic device for avoiding the problem that, when segmenting a target in an image, the target cannot be segmented because it occupies too small a proportion of the image, and for improving the target segmentation effect.
The image processing method provided by the embodiment of the application comprises the following steps:
acquiring a current frame image and determining a target area of the current frame image;
when it is determined that the current frame image needs to be cropped, cropping the current frame image according to the target region to obtain a cropped image, corresponding to the current frame image, that contains the target;
and performing target segmentation based on the cropped image according to a preset image size required by a preset target segmentation algorithm, thereby extracting a target image.
By cropping the image and performing target segmentation on the cropped image, the method enlarges the proportion of the image occupied by the target, avoids the situation where the target is too small to be segmented, and improves the target segmentation effect.
In some embodiments, the method further comprises:
when it is determined that the current frame image does not need to be cropped, directly scaling the current frame image to the preset image size required by the preset target segmentation algorithm, performing target segmentation on the scaled image, and extracting a target image.
In some embodiments, determining that the current frame image needs to be cropped includes:
if the proportion of the previous frame image occupied by the target region of that previous frame is smaller than a preset threshold, determining that the current frame image needs to be cropped.
In some embodiments, the method further comprises:
determining the proportion of the current frame image occupied by the target region, and comparing this proportion with a preset threshold;
when the proportion of the current frame image occupied by the target region is smaller than the preset threshold, cropping the next frame image according to the target region of that next frame image to obtain a cropped image, corresponding to the next frame image, that contains the target; and performing target segmentation based on the cropped image corresponding to the next frame image according to the preset image size required by the preset target segmentation algorithm, thereby extracting a target image.
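The proportion check carried from one frame to the next can be sketched as follows. This is a minimal illustration with hypothetical names and an illustrative threshold; the patent does not fix concrete values.

```python
def target_ratio(box, frame_shape):
    """Proportion of the frame occupied by a detection box (x, y, w, h)."""
    x, y, w, h = box
    fh, fw = frame_shape[:2]
    return (w * h) / (fw * fh)

def needs_crop(prev_frame_ratio, threshold=0.05):
    """Decide whether to crop the CURRENT frame from the proportion
    measured on the PREVIOUS frame, exploiting the small change
    between adjacent video frames. The 0.05 threshold is illustrative."""
    return prev_frame_ratio < threshold

ratio = target_ratio((0, 0, 64, 48), (480, 640))
print(ratio)              # 0.01
print(needs_crop(ratio))  # True
```

Because the decision for the current frame uses only the previous frame's ratio, detection and segmentation of the current frame can proceed in parallel.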
In some embodiments, cropping the current frame image according to the target region to obtain a cropped image that contains the target includes:
judging whether the ratio of the size of the target region to the preset image size required by the preset target segmentation algorithm is larger than a preset threshold;
if so, cropping an image containing the target from the current frame image at the preset image size;
otherwise, cropping the current frame image at a preset cropping size to obtain a cropped image containing the target, such that the proportion of the cropped image occupied by the target region is larger than the preset threshold.
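One way to realize this size choice is sketched below, under the assumption (not stated in the patent) that the fallback crop window is shrunk just enough for the target to exceed the threshold; names, sizes, and the threshold are illustrative.

```python
import math

def choose_crop_size(box_wh, net_wh, threshold=0.05):
    """If the target box already occupies more than `threshold` of the
    preset network input size, crop directly at that size; otherwise
    shrink the crop window (keeping the network aspect ratio) until the
    target's proportion of the crop exceeds the threshold."""
    bw, bh = box_wh
    nw, nh = net_wh
    if (bw * bh) / (nw * nh) > threshold:
        return nw, nh
    # smallest uniform scale of (nw, nh) that pushes the ratio above threshold
    scale = math.sqrt((bw * bh) / (threshold * nw * nh))
    return max(bw, int(nw * scale)), max(bh, int(nh * scale))

print(choose_crop_size((256, 256), (512, 512)))  # (512, 512)
print(choose_crop_size((32, 32), (512, 512)))    # shrunken window, same aspect
```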
In some embodiments, when the cropped image containing the target is cropped from the current frame image at the preset image size, performing target segmentation based on the cropped image includes: performing target segmentation directly on the cropped image.
In some embodiments, when the cropped image containing the target is cropped from the current frame image at a preset cropping size, the aspect ratio of the cropped image is the same as the aspect ratio of the preset image size.
In some embodiments, when the cropped image containing the target is cropped from the current frame image at a preset cropping size, performing target segmentation based on the cropped image includes:
enlarging the cropped image proportionally to the preset image size, and performing target segmentation on the enlarged image.
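Because the crop shares the network's aspect ratio, the proportional enlargement is a plain resize and cannot deform the target. A dependency-free sketch using nearest-neighbour indexing (an assumption; any interpolation would do):

```python
import numpy as np

def upscale_to_net(crop, net_wh):
    """Proportionally enlarge the cropped image to the network inference
    size (w, h). Nearest-neighbour row/column indexing keeps the example
    self-contained; the patent does not prescribe an interpolation."""
    nw, nh = net_wh
    ch, cw = crop.shape[:2]
    rows = np.arange(nh) * ch // nh
    cols = np.arange(nw) * cw // nw
    return crop[rows][:, cols]

small = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
big = upscale_to_net(small, (12, 8))
print(big.shape)  # (8, 12, 3)
```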
In some embodiments, the method further comprises:
when it is determined that the current frame image does not need to be cropped, determining a target region image of the current frame image;
performing, according to the preset image size and with the boundary of the target region image as a reference, pixel expansion on the boundary pixels of the target region image to obtain an expanded image of the preset image size that contains the target region image;
performing target segmentation on the expanded image to extract a target image.
Because the boundary pixels of the target region image are expanded outward from its boundary according to the preset image size, rather than the image being resized, the shape of the target is not changed and deformation of the segmented target is avoided.
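NumPy's `linear_ramp` padding mode implements exactly this kind of boundary expansion: pixel values ramp from the region's edge value down to 0. A minimal sketch; centering the region is an assumption, since the patent only requires expansion from the region's boundary.

```python
import numpy as np

def expand_to_size(region, net_wh):
    """Pad the target-region image out to the preset size (w, h) by
    ramping the boundary pixel values linearly down to 0, instead of
    resizing, so the target keeps its shape."""
    nw, nh = net_wh
    h, w = region.shape[:2]
    top, left = (nh - h) // 2, (nw - w) // 2
    pad = ((top, nh - h - top), (left, nw - w - left))
    pad += ((0, 0),) * (region.ndim - 2)  # leave channel axes unpadded
    return np.pad(region, pad, mode="linear_ramp", end_values=0)

img = np.full((4, 4), 100, dtype=np.uint8)
out = expand_to_size(img, (8, 8))
print(out.shape)             # (8, 8)
print(out[0, 0], out[2, 2])  # 0 100
```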
Another embodiment of the present application provides an electronic device, including a memory for storing program instructions and a processor for calling the program instructions stored in the memory, and executing any one of the methods according to the obtained program.
Furthermore, according to an embodiment, a computer program product is provided, comprising software code portions for performing the steps of the method defined above when the product is run on a computer. The computer program product may include a computer-readable medium on which the software code portions are stored. Furthermore, the computer program product may be loaded directly into the internal memory of the computer and/or transmitted via a network by at least one of an upload procedure, a download procedure, and a push procedure.
Another embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions for causing the computer to perform any of the methods described above.
Drawings
In order to describe the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings described below illustrate only some embodiments of the present application; other drawings may be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of image resizing that deforms the target, provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of image resizing by direct filling with 0 pixel values, provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of performing portrait segmentation first and then using a human detection algorithm to extract the target region from the portrait segmentation result image, provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a target that cannot be segmented because it occupies too small a proportion of the image, provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of expanding a portrait frame, provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of keeping the aspect ratio of the RGB small image unchanged while extending the RGB three-channel pixel values outward at the four boundaries of the portrait frame, provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of pixel expansion, provided by an embodiment of the present application;
fig. 8 is a schematic flowchart of an image processing method according to an embodiment of the present application;
FIG. 9 is a schematic diagram of calculating a target area duty ratio according to an embodiment of the present application;
FIG. 10 is a schematic diagram of the scaling of an original image to a network inference size provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of image cropping when the target region is small, provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of image cropping and proportional enlargement when the target region is small, provided by an embodiment of the present application;
FIG. 13 is a flowchart illustrating another embodiment of an image processing method according to the present application;
fig. 14 is a general flow chart of an image processing method according to an embodiment of the present application;
FIG. 15 is a flowchart of an image processing method when the current frame image does not need to be cropped, provided by an embodiment of the present application;
Fig. 16 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The embodiments of the present application provide an image processing method and an electronic device for avoiding the problem that, when segmenting a target in an image, the target cannot be segmented because it occupies too small a proportion of the image, and for improving the target segmentation effect.
The method and the device are based on the same inventive concept. Since the principles by which they solve the problem are similar, the implementations of the device and the method may refer to each other, and repeated description is omitted.
The terms first, second and the like in the description and in the claims of embodiments of the application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The following examples and embodiments are to be construed as illustrative only. Although the specification may refer to "an", "one", or "some" example or embodiment(s) at several points, this does not mean that each such reference is related to the same example or embodiment, nor that the feature is applicable to only a single example or embodiment. Individual features of different embodiments may also be combined to provide further embodiments. Furthermore, terms such as "comprising" and "including" should be understood not to limit the described embodiments to consist of only those features already mentioned; such examples and embodiments may also include features, structures, units, modules, etc. that are not specifically mentioned.
The term "module" refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware or/and software code that is capable of performing the function associated with that element.
Various embodiments of the application are described in detail below with reference to the drawings attached to the specification. It should be noted that, the display sequence of the embodiments of the present application only represents the sequence of the embodiments, and does not represent the advantages or disadvantages of the technical solutions provided by the embodiments.
First, two image processing algorithms described in the embodiments of the present application are described:
Portrait segmentation algorithm:
The portrait segmentation algorithm is a convolutional neural network (CNN) model, typically an encoder-decoder model comprising an encoder module and a decoder module. The encoder module is responsible for image feature extraction: it processes the image and outputs a feature map of lower resolution. The decoder module is responsible for upsampling this lower-resolution feature map back to the original image size. Finally, each pixel value of the feature map is mapped to a probability value by a softmax function and output as a mask image. For example, the input of the portrait segmentation algorithm is a red-green-blue (RGB) image; the algorithm then outputs a mask for the portrait in that image, in which the target region is 1, the background region is 0, and pixels on the boundary between portrait and background take values between 0 and 1. The portrait mask is a single-channel black-and-white image; in this way the portrait in the RGB image is segmented.
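The final mapping from decoder output to mask can be illustrated in a few lines. This is a generic sketch, not the patent's model: it assumes a two-channel (background, portrait) logit map, which is one common convention.

```python
import numpy as np

def logits_to_mask(logits):
    """Map a 2-channel (background, portrait) logit map from the decoder
    to the single-channel mask described above: a per-pixel portrait
    probability in [0, 1], via a softmax over the channel axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numeric stability
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)
    return probs[..., 1]  # portrait-probability channel

logits = np.zeros((2, 2, 2))
logits[0, 0, 1] = 10.0  # strong "portrait" evidence at one pixel
mask = logits_to_mask(logits)
```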
Human body detection algorithm:
The human body detection algorithm is a type of object detection algorithm; for example, the YOLO family of algorithms can be used. Its input is an RGB image and its output is target-region information, i.e., the position of each target region, generally given as the top-left vertex coordinates (x, y) of the bounding rectangle of the target region together with the width and height (w, h) of that rectangle.
In terms of how they are used together, human detection and portrait segmentation can be combined in two ways: the first crops the target region first and then segments the target; the second segments the whole image first and then extracts the target region. Specifically:
In the first way, the detection region (i.e., the target region) is extracted before segmentation: the target region is cropped out by the human detection algorithm, and the cropped target region is then input to the portrait segmentation algorithm, which segments the portrait from it.
In the second way, the mask of a specified region is extracted after portrait segmentation: the rectangular frame obtained by the human detection algorithm (the bounding rectangle of the target region) is used to crop the mask directly from the portrait segmentation result.
Both of the above approaches have certain drawbacks. For example:
Referring to fig. 1 and fig. 2, regarding the first mode: the network inference size of the portrait segmentation algorithm (i.e., the image size it requires as input) is generally fixed, while the size of the cropped region containing the person is not known in advance. The cropped image must therefore be resized to the network inference size, i.e., from an arbitrary size to a fixed size. This, however, deforms the target in the image: as shown in fig. 1, the target region obtained by the human detection algorithm is visibly deformed after being resized to the network inference size. Alternatively, to reach the network inference size without deforming the target, the image can be padded directly with 0 pixel values, as shown in fig. 2, where the black area is the zero-filled region. This, however, creates an abrupt pixel change between the target region and the other (black) region, producing a sharp gradient in pixel values that can reduce the accuracy of portrait segmentation.
Regarding the second mode, as shown in fig. 3, portrait segmentation is performed first to obtain a segmentation result image, and the human detection algorithm is then used to extract the target region (i.e., the portrait region) from it, namely the rectangular region around the person in the segmentation result image of fig. 3. However, referring to fig. 4, when the portrait occupies only a small proportion of the image, the segmentation algorithm, which segments the whole image, generally struggles to segment such a small region. As a result, when the person is far from the camera, that is, when the target person's portrait is small, the portrait cannot be segmented and does not appear in the segmentation result image, as shown in fig. 4. When the human detection algorithm is then used to crop the target person's region from the segmentation result, only an all-black area is obtained, which cannot be used for subsequent operations.
To address these problems, the embodiments of the present application provide two implementation schemes for the combined use of human detection and portrait segmentation:
In the first, a human detection algorithm is applied to the image (the RGB original image) to determine the target region, and a portrait segmentation algorithm then segments the portrait from that target region; that is, the human detection algorithm and the portrait segmentation algorithm are processed serially.
In the second, the human detection algorithm and the portrait segmentation algorithm are applied to the image (the RGB original image) in parallel, and the image of the specified target is finally cropped from the portrait segmentation result image.
For the serial processing mode, the target region specified by the human detection result is first cropped out to obtain an RGB small image (i.e., the target region image) containing only a single portrait. The RGB small image is then expanded at its four boundaries (upward, downward, leftward, and rightward) using the boundary pixels as a reference, with the pixel values decreasing gradually to zero during the expansion, until the image reaches the network inference size. This leaves the target shape unchanged, and because the edge pixels decrease gradually rather than abruptly, it also avoids the loss of segmentation accuracy or quality caused by large changes in edge pixel values.
Compared with the serial mode, the parallel processing mode runs the portrait segmentation algorithm and the human detection algorithm in parallel, so the target image is obtained faster; it is therefore broadly suitable for target segmentation tasks on video. In addition, to address the difficulty of fully segmenting a target that occupies only a small proportion of the whole image, which easily leads to a cropped mask containing no portrait, this scheme adds a check of the target region's proportion of the whole image while the algorithms run in parallel. If the proportion is larger than a preset threshold, the RGB original image is directly resized to the network inference size, the whole image is segmented, and the target region is then cropped from the segmentation result. If the proportion is smaller than the preset threshold, an image whose aspect ratio matches the network inference size is cropped from the whole image, centered on the target region, to obtain a cropped image; because the cropped image already matches the network inference size, no further resizing is needed.
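The threshold decision of the parallel mode can be sketched as follows; the tag strings, the threshold, and the choice to center the crop on the target are all illustrative assumptions.

```python
def plan_segmentation(box, frame_shape, net_wh, threshold=0.05):
    """If the target's proportion of the whole frame exceeds the
    threshold, resize the whole frame to the network size and segment
    it; otherwise crop a network-sized window centered on the target
    (no resize needed). Returns a tag plus the (x, y, w, h) region to
    feed the network."""
    x, y, w, h = box
    fh, fw = frame_shape[:2]
    if (w * h) / (fw * fh) > threshold:
        return "resize_whole_frame", (0, 0, fw, fh)
    nw, nh = net_wh
    cx, cy = x + w // 2, y + h // 2
    x0 = min(max(cx - nw // 2, 0), max(fw - nw, 0))  # clamp to frame
    y0 = min(max(cy - nh // 2, 0), max(fh - nh, 0))
    return "crop_at_net_size", (x0, y0, nw, nh)

print(plan_segmentation((300, 200, 40, 40), (1080, 1920), (512, 512)))
```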
Furthermore, because the human detection algorithm and the portrait segmentation algorithm run in parallel, when the proportion of the whole image occupied by the target region of the first frame is smaller than the preset threshold, the second frame image can be cropped directly, improving image processing efficiency. The change between adjacent frames of the original video is small and does not affect the overall result: the first frame is detected and segmented, and if its target region occupies a small proportion of the whole image, then, because inter-frame change is small, the target region of the second frame can also be assumed to occupy a proportion smaller than the preset threshold, so the second frame can be cropped and enlarged directly when it is detected and segmented. The cropped image is input to the portrait segmentation algorithm for portrait segmentation, and the target region is finally cropped from the segmentation result image.
In summary, this scheme effectively avoids the problem of a target being too small to segment and improves image processing efficiency.
Specific implementation flow examples are given below for the two combinations of human detection algorithm and portrait segmentation algorithm provided by the embodiments of the present application.
1. Target detection followed by single-target segmentation (the serial mode described above):
Step one, human detection:
Human detection is first performed on the RGB original image using a human detection algorithm to obtain a human detection result, i.e., the portrait frame of each person in the RGB original image and its corresponding information, where the portrait frame is the bounding rectangle of the portrait.
The information corresponding to each person's portrait frame includes:
the portrait position, i.e., the coordinates of the portrait frame (e.g., the center coordinates or the vertex coordinates of the rectangular frame);
the portrait size, i.e., the width and height of the area occupied by the portrait frame (the width and height of the rectangular frame).
Step two, cropping the target RGB small image:
After the human detection result is obtained in step one, one target may be selected for segmentation, or several targets may be selected.
In the following, a single target is taken as an example; the multi-target case follows a similar principle and is not described again.
In some embodiments, based on the human detection result, for any portrait frame region, the region is first enlarged: the dashed frame (the original portrait frame) in fig. 5 is enlarged to the solid frame. This prevents a small edge portion of the target region detected by human detection from being cut off by the frame, and it also ensures that the four boundaries of the region contain no person pixels, so that the pixel filling performed next does not fill with portrait-region pixels. The target region image, i.e., the target RGB small image, is then obtained from the enlarged solid frame.
By how much the four sides of the portrait frame are enlarged can be determined according to actual needs; the embodiments of the present application do not limit this.
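Such a frame enlargement can be sketched as below; the 10% margin is an assumption, since the patent explicitly leaves the amount open.

```python
def expand_box(box, frame_shape, margin=0.1):
    """Enlarge a detected portrait frame on all four sides (dashed to
    solid frame in fig. 5), clamped to the image borders, so that the
    boundary pixels of the crop are background rather than person
    pixels. box and return value are (x, y, w, h)."""
    x, y, w, h = box
    fh, fw = frame_shape[:2]
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(x - dx, 0), max(y - dy, 0)
    x1, y1 = min(x + w + dx, fw), min(y + h + dy, fh)
    return x0, y0, x1 - x0, y1 - y0

print(expand_box((100, 100, 50, 80), (480, 640)))  # (95, 92, 60, 96)
```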
Filling to the network reasoning size:
The RGB small image obtained in the second step is different from the size (namely the network reasoning size) of the input image required by the portrait segmentation algorithm, and the RGB small image needs to be adjusted to the network reasoning size. In order to ensure that the targets in the RGB small drawings are not deformed, the method is adjusted to be the network reasoning size according to the size of the RGB small drawings under the condition that the length-width ratio of the RGB small drawings is unchanged. As shown in fig. 6, the aspect ratio of the RGB small image is kept unchanged, the pixel values of the RGB three channels of the four boundaries of the human frame are respectively expanded outwards (i.e. the pixel expansion is performed towards the direction indicated by the arrow in fig. 6), the principle that the pixel value and the like are reduced to 0 is followed in the process of pixel expansion, that is, from the boundary of the human frame, the pixel value gradually becomes 0 until the pixel value is expanded to the network reasoning size, for example, as shown in fig. 7, assuming that the pixel value of one pixel on the right boundary is 58, and the pixel is expanded by 5 pixels on the line, then the pixel values of the sequentially expanded pixels are 38, 18, 0 and 0, respectively.
As shown in fig. 7, starting from the boundary pixels of the human body frame, the expanded pixel values decrease gradually until they reach 0. The pixels do not drop to 0 at once but fall off slowly, which relaxes the gradient of the boundary pixel change. As a result, the target is not deformed, most background pixels are 0, and the segmentation accuracy can be improved. Because the pixels at the expanded boundary change gradually, there is no abrupt pixel jump, which effectively reduces the influence of edge pixel mutation on segmentation. If an adjacent pixel value changed too sharply, for example a boundary pixel of 200 next to an outer pixel of 0, the resulting large pixel-value gradient would cause the border of the human body frame itself to be segmented into the output portrait.
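The tapered padding described in steps three can be sketched as below. This is a simplified illustration, assuming a numpy (H, W, 3) crop padded symmetrically and a fixed decrement of 20 per pixel to match the fig. 7 example (58 → 38 → 18 → 0); the function name and the centering choice are assumptions, not from the patent.

```python
import numpy as np

def taper_pad(img, out_h, out_w, step=20):
    """Pad `img` (H, W, 3) to (out_h, out_w, 3) by extending each border
    outward with pixel values that decrease by `step` per pixel until they
    reach 0, avoiding the abrupt edge that would otherwise create a large
    spurious gradient at the frame boundary."""
    h, w, c = img.shape
    top = (out_h - h) // 2
    left = (out_w - w) // 2
    out = np.zeros((out_h, out_w, c), dtype=img.dtype)
    out[top:top + h, left:left + w] = img
    f = img.astype(np.int32)
    # left / right: taper horizontally from the boundary columns
    for k in range(1, left + 1):
        out[top:top + h, left - k] = np.clip(f[:, 0] - step * k, 0, 255)
    for k in range(1, out_w - left - w + 1):
        out[top:top + h, left + w - 1 + k] = np.clip(f[:, -1] - step * k, 0, 255)
    # top / bottom: taper vertically from the (already padded) boundary rows
    g = out.astype(np.int32)
    for k in range(1, top + 1):
        out[top - k] = np.clip(g[top] - step * k, 0, 255)
    for k in range(1, out_h - top - h + 1):
        out[top + h - 1 + k] = np.clip(g[top + h - 1] - step * k, 0, 255)
    return out
```

Because the crop is only padded, never stretched, its aspect ratio is preserved and the target is not deformed.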
Step four, human image segmentation:
The expanded image obtained in step three is input into the portrait segmentation algorithm to segment the portrait, and a mask image containing only the target portrait is output.
In summary, the overall flow of the above method is shown in fig. 8. The method performs target detection first and target segmentation second, so the background complexity of the image input into the target segmentation network (the expanded image) is far lower than that of the original image; meanwhile, the target portrait keeps its original aspect ratio and is not deformed. Furthermore, the edge pixels of the target area are filled in a manner that decreases to 0, which avoids the problem that abrupt changes in edge pixel values degrade the segmentation, and effectively improves the target segmentation accuracy.
2. A scheme of intercepting the target mask image from the segmentation result of the cropped image (i.e., the parallel mode):
The frame-to-frame variation of a video is very small. Making use of this characteristic, the human body detection algorithm and the portrait segmentation algorithm in this scheme run in parallel. The specific implementation flow includes the following steps:
Step one, parallel processing of the portrait segmentation algorithm and the human body detection algorithm:
A current frame image, i.e., an RGB image, is acquired, and human body detection and portrait segmentation are performed on it simultaneously.
Specifically, the current frame image is duplicated into two copies: one is subjected to human body detection to obtain the target area, and the other is subjected to portrait segmentation. This parallel processing can effectively improve the overall processing speed of the algorithm.
Step two, calculating the proportion of the target area.
The product W'H' of the width and height of the target area obtained by human body detection is divided by the product WH of the width and height of the original image, as shown in fig. 9, yielding the proportion of the target area on the current frame image to the current frame image (referred to simply as the target area ratio S).
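The ratio S of step two reduces to one line of arithmetic; the sketch below assumes the detection result is an `(x1, y1, x2, y2)` box and the helper name is illustrative.

```python
def target_area_ratio(box, img_w, img_h):
    """Target area ratio S: the product of the target box's width and
    height divided by the product of the frame's width and height."""
    x1, y1, x2, y2 = box
    return ((x2 - x1) * (y2 - y1)) / (img_w * img_h)
```

For example, a 100 x 100 box in a 200 x 200 frame gives S = 0.25.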
Step three, adjusting the image according to the threshold.
When the calculated target area ratio S is greater than a preset threshold, the target area is relatively large and there is no risk that the portrait cannot be segmented. In this case, as shown in fig. 10, the current frame image is directly scaled to the network inference size, and target segmentation is then performed on the scaled image, i.e., segmentation is carried out over the whole image.
When the calculated target area ratio S is smaller than a preset threshold (which may be the same as or different from the threshold above), the target area is relatively small and the target portrait may fail to be segmented. For this situation, the target area ratio is raised with reference to the network inference size, that is, the image is cropped to obtain a cropped image containing the target, which increases the target area ratio. Also, in some embodiments, the aspect ratio of the cropped image is kept consistent with that of the network inference size, because as long as the aspect ratio is consistent, scaling the image up or down does not deform the target. For example, suppose the network inference size is 300 x 500, i.e., an aspect ratio of 3:5. In the original image, centered on the target area, a region is cropped whose target area ratio is at least not lower than the preset threshold and whose aspect ratio is 3:5, and the cropped image of this region is input into the portrait segmentation algorithm. When cropping, if the target area lies at the image boundary, a crop centered on the target area would extend beyond the image; therefore, when the target area is at the image boundary, the crop window can be adjusted appropriately and need not be centered on the target area.
Figs. 10, 11 and 12 illustrate the image adjustment in several cases. As shown in fig. 10, when the target area is relatively large, the image is directly scaled to the network inference size in equal proportion, without cropping. As shown in fig. 11, when the target area is relatively small, an image containing the target may be cropped from the original image directly at the network inference size; the resulting cropped image is used directly as the input of the portrait segmentation algorithm, and the segmented portrait, i.e., the target image, is finally output. Alternatively, the crop may be chosen only so that it lies entirely within the original image and the portrait ratio in the cropped image is greater than the preset threshold, i.e., the crop need not match the network inference size, see fig. 12. In this case the cropped image may be smaller than the network inference size, and it is then scaled to the network inference size with the aspect ratio unchanged; since the aspect ratio is consistent, the target is not deformed.
In other words, regardless of whether the crop is centered on the target area, the image containing the target may be cropped from the original image either at the network inference size or at some other size. As long as the target ratio in the cropped image is greater than the preset threshold, the cropped image can be enlarged or reduced to the network inference size to meet the requirements of portrait segmentation.
A specific image cropping scheme is, for example, as follows:
Judge whether the ratio of the size of the target area to the preset image size required by the preset target segmentation algorithm is greater than a preset threshold; that is, judge whether the target's proportion of the network inference size is greater than the preset threshold.
If yes, crop an image containing the target from the current frame image at the preset image size, directly obtaining a cropped image of the network inference size.
Otherwise, crop the current frame image at a preset cropping size to obtain a cropped image containing the target, where the proportion of the target area to the cropped image is greater than the preset threshold; that is, the cropped image is smaller than the network inference size, but the target ratio on it exceeds the preset threshold.
The preset cropping size may be determined according to actual needs, and the embodiment of the present application does not limit it.
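The threshold-based cropping scheme above can be sketched as follows. This is a minimal illustration under stated assumptions: numpy images, `(x1, y1, x2, y2)` boxes, a crop centered on the target where possible and shifted back inside the frame at borders, and a crop window with the network aspect ratio sized so the target ratio stays at or above the threshold; the function name and the sizing rule are illustrative, not a definitive implementation.

```python
import numpy as np

def crop_for_segmentation(image, box, net_w, net_h, threshold=0.3):
    """If the target already fills more than `threshold` of a window at the
    network inference size, crop exactly (net_h, net_w); otherwise crop a
    smaller window with the same aspect ratio in which the target ratio
    exceeds the threshold (to be scaled up afterwards)."""
    x1, y1, x2, y2 = box
    bw, bh = x2 - x1, y2 - y1
    if bw * bh / (net_w * net_h) > threshold:
        cw, ch = net_w, net_h              # crop directly at the inference size
    else:
        # window with the network aspect ratio whose area keeps the
        # target ratio around the threshold
        scale = (bw * bh / (threshold * net_w * net_h)) ** 0.5
        cw, ch = round(net_w * scale), round(net_h * scale)
    cx, cy = (x1 + x2) // 2, (y1 + y2) // 2
    h, w = image.shape[:2]
    # shift the window back inside the frame if the target sits at a border
    left = min(max(0, cx - cw // 2), max(0, w - cw))
    top = min(max(0, cy - ch // 2), max(0, h - ch))
    return image[top:top + ch, left:left + cw]
```

Because the smaller crop keeps the network aspect ratio, enlarging it to the inference size afterwards does not deform the target.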
In some embodiments, when the cropped image containing the target is obtained from the current frame image at the preset image size, target segmentation is performed directly on the cropped image.
In some embodiments, when the cropped image containing the target is obtained from the current frame image at a preset cropping size, the aspect ratio of the cropped image is the same as that of the preset image size, so the target is not deformed when the cropped image is enlarged in equal proportion.
In some embodiments, when the cropped image containing the target is obtained from the current frame image at a preset cropping size, the cropped image is enlarged in equal proportion to the preset image size, and target segmentation is performed on the enlarged cropped image.
Step four, inputting the cropped image into the portrait segmentation algorithm to segment the portrait.
The size of the cropped image, or of the cropped image after enlargement, is consistent with the network inference size, so it can be input directly into the portrait segmentation algorithm for portrait segmentation.
Step five, intercepting the target mask image.
In some embodiments, because there may be multiple portraits in a cropped image, if a specified target among them needs to be extracted, that target must be intercepted from the mask image output by the segmentation, thereby realizing segmentation of the specified target.
In summary, as shown in fig. 13, in the whole flow the human body detection algorithm and the portrait segmentation algorithm run in parallel. When the proportion of the target area in the whole image is greater than the preset threshold, the image input to the portrait segmentation algorithm is not cropped. When the proportion is smaller than the threshold, because the two algorithms are processed in parallel, the current frame image may already have been processed by the portrait segmentation algorithm; therefore the cropping is applied when the next frame image undergoes target segmentation, so that the proportion of the next frame's target area in the cropped image is greater than the preset threshold.
As can be seen, referring to fig. 14, an image processing method provided in an embodiment of the present application includes:
S101, acquiring a current frame image and determining a target area of the current frame image;
the target area is, for example, a human body area obtained by human body detection, i.e., a human body frame.
S102, when determining that the current frame image needs to be cut, cutting the current frame image according to the target area to obtain a cut image containing a target corresponding to the current frame image;
And S103, performing target segmentation based on the cut image containing the target corresponding to the current frame image according to the preset image size required by a preset target segmentation algorithm, and extracting a target image.
The preset image size is, for example, the above-mentioned network reasoning size.
Therefore, by cropping the image and performing target segmentation on the cropped image, the method can enlarge the target ratio, avoid the situation where the target is too small to be segmented, and improve the target segmentation effect.
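Steps S101 to S103 can be sketched end to end as below. This is a schematic under stated assumptions: the detection result arrives as an `(x1, y1, x2, y2)` box, `segment` stands in for the portrait segmentation model, `resize_nn` is a nearest-neighbour stand-in for a real resize, and the crop in the small-target branch is simplified to the box itself (the patent's scheme additionally preserves the network aspect ratio).

```python
import numpy as np

def resize_nn(img, out_h, out_w):
    """Nearest-neighbour resize (stand-in for a library resize call)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def process_frame(image, box, segment, net_h, net_w, threshold=0.3):
    """S101: compute the target ratio; S102: crop if the target is small;
    S103: run target segmentation at the network inference size."""
    h, w = image.shape[:2]
    x1, y1, x2, y2 = box
    s = (x2 - x1) * (y2 - y1) / (w * h)
    if s > threshold:                         # target large enough: no crop
        return segment(resize_nn(image, net_h, net_w))
    crop = image[y1:y2, x1:x2]                # simplified S102 crop
    return segment(resize_nn(crop, net_h, net_w))
```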
In some embodiments, the method further comprises:
When it is determined that the current frame image does not need to be cut, the current frame image is directly scaled to the preset image size required by the preset target segmentation algorithm, target segmentation is performed on the scaled image, and a target image is extracted.
In some embodiments, the determining that clipping the current frame image is required includes:
If the proportion of the target area of the previous frame image of the current frame image to the previous frame image is smaller than a preset threshold value, determining that the current frame image needs to be cut.
That is, the embodiment of the present application exploits the characteristic that the inter-frame variation of a video is small: with the human body detection algorithm and the portrait segmentation algorithm running in parallel, when the proportion of the target area of the previous frame image to the previous frame image is smaller than the preset threshold, it can be directly determined that the current frame image needs to be cut.
In some embodiments, the method further comprises:
Determining the proportion of the target area to the current frame image, and comparing the proportion with a preset threshold value;
When the proportion of the target area to the current frame image is smaller than a preset threshold value, cutting a next frame image of the current frame image according to the target area of the next frame image to obtain a cut image which corresponds to the next frame image and contains a target; and performing target segmentation based on the cut image corresponding to the next frame image according to the preset image size required by a preset target segmentation algorithm, and extracting a target image.
Because the change between frames is small, the judgment result of the previous frame can be used directly for cropping, omitting the step of determining the proportion of the target area to the whole image and comparing it with the preset threshold, thereby improving efficiency.
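The reuse of the previous frame's judgment can be sketched as a small stateful helper; the class and method names are illustrative, and the first frame (with no previous ratio yet) is assumed to be segmented without cropping.

```python
class ParallelModeDecider:
    """Parallel mode: detection and segmentation run on the same frame, so
    the crop/no-crop decision for the current frame reuses the target
    ratio computed on the previous frame (inter-frame change is small)."""

    def __init__(self, threshold=0.3):
        self.threshold = threshold
        self.prev_ratio = None          # no previous frame yet

    def should_crop(self):
        """Decide for the current frame using the previous frame's ratio."""
        return self.prev_ratio is not None and self.prev_ratio < self.threshold

    def update(self, box, img_w, img_h):
        """Record this frame's target ratio for the next frame's decision."""
        x1, y1, x2, y2 = box
        self.prev_ratio = (x2 - x1) * (y2 - y1) / (img_w * img_h)
```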
In some embodiments, cropping the current frame image according to the target area to obtain a cropped image containing a target corresponding to the current frame image, including:
judging whether the ratio of the size of the target area to the preset image size required by a preset target segmentation algorithm is larger than a preset threshold value or not;
if yes, cutting out a cut image containing a target from the current frame image according to the preset image size;
Otherwise, according to a preset clipping size, clipping the current frame image to obtain a clipping image containing a target, wherein the proportion of the target area to the clipping image is larger than a preset threshold value.
In some embodiments, when a cropping image containing a target is obtained by cropping from the current frame image according to the preset image size, the target segmentation is performed based on the cropping image containing the target corresponding to the current frame image, including: and directly carrying out target segmentation on the clipping image.
In some embodiments, when a cropping image containing a target is cropped from the current frame image according to a preset cropping size, the aspect ratio of the cropping image is the same as the aspect ratio of the preset image size.
In some embodiments, when a cropping image containing a target is obtained by cropping from the current frame image according to a preset cropping size, the target segmentation is performed based on the cropping image containing the target corresponding to the current frame image, including:
And the clipping image is amplified to the preset image size in equal proportion, and target segmentation is carried out on the amplified clipping image.
In some embodiments, the method further comprises:
when the proportion of the target area to the current frame image is larger than a preset threshold value, scaling the current frame image to a preset image size required by a preset target segmentation algorithm, and performing target segmentation on the scaled image to extract a target image.
In some embodiments, referring to fig. 15, the method further comprises:
S201, when determining that clipping is not needed to be carried out on the current frame image, determining a target area image of the current frame image;
S202, according to a preset image size, taking the boundary of the target area image as a reference, performing pixel expansion on boundary pixels of the target area image to obtain an expanded image with the preset image size, wherein the expanded image comprises the target area image;
S203, performing target segmentation on the expanded image to extract a target image.
That is, when it is determined that the current frame image does not need to be cut, the embodiment of the present application can process it in, for example, the following two ways:
Mode one: the current frame image is directly scaled to the preset image size required by the preset target segmentation algorithm, and target segmentation is performed on the scaled image to extract a target image.
Mode two:
Determining a target area image of the current frame image; for example, first a target area designated based on the human body detection result is intercepted, obtaining an RGB small image (i.e., the target area image) containing only a single portrait;
According to a preset image size, taking the boundary of the target area image as a reference, and performing pixel expansion on boundary pixels of the target area image to obtain an expanded image with the preset image size, wherein the expanded image comprises the target area image;
Performing target segmentation on the expanded image to extract a target image.
In the second mode, the boundary pixels of the target area image are expanded outward from the boundary of the target area image according to the preset image size, obtaining an expanded image of the preset image size, so the shape of the target is not changed.
In some embodiments, during pixel expansion of boundary pixels of the target region image, the expanded pixel values taper down to zero.
Therefore, by expanding the edge pixels in a gradually decreasing manner, the method effectively avoids the problem that an excessive change in edge pixel values degrades the segmentation accuracy or effect.
In some embodiments, the aspect ratio of the target region image remains unchanged during pixel expansion of boundary pixels of the target region image.
Therefore, the method enables the target portrait to keep the original aspect ratio and avoids the deformation of the portrait.
In some embodiments, determining the target area image of the current frame image includes:
Performing target detection on the current frame image to obtain a target detection frame;
and enlarging the target detection frame, and determining an image in the enlarged target detection frame as a target area image.
Therefore, the method prevents a small edge portion of the target area from being cut off by the frame, and at the same time ensures that the pixels on the four boundaries of the target area are not target pixels, so that the subsequent pixel filling does not start from target-pixel values.
The following describes a device or apparatus provided by an embodiment of the present application, where explanation or illustration of the same or corresponding technical features as those described in the above method is omitted.
Referring to fig. 16, an electronic device provided in an embodiment of the present application includes:
The processor 600 is configured to read the program in the memory 620 and perform the following procedure:
acquiring a current frame image and determining a target area of the current frame image;
When it is determined that the current frame image needs to be cut, cutting the current frame image according to the target area to obtain a cut image containing a target corresponding to the current frame image;
And according to the preset image size required by a preset target segmentation algorithm, performing target segmentation based on the cut image containing the target corresponding to the current frame image, and extracting a target image.
According to the method, the image is cropped and the cropped image is subjected to target segmentation, so that the target ratio can be enlarged, the situation where the target is too small to be segmented is avoided, and the target segmentation effect is improved.
In some embodiments, the processor 600 is further configured to read the program in the memory 620, and perform the following procedure:
When it is determined that the current frame image does not need to be cut, the current frame image is directly scaled to the preset image size required by the preset target segmentation algorithm, target segmentation is performed on the scaled image, and a target image is extracted.
In some embodiments, the determining that clipping the current frame image is required includes:
If the proportion of the target area of the previous frame image of the current frame image to the previous frame image is smaller than a preset threshold value, determining that the current frame image needs to be cut.
In some embodiments, the processor 600 is further configured to read the program in the memory 620, and perform the following procedure:
Determining the proportion of the target area to the current frame image, and comparing the proportion with a preset threshold value;
When the proportion of the target area to the current frame image is smaller than a preset threshold value, cutting a next frame image of the current frame image according to the target area of the next frame image to obtain a cut image which corresponds to the next frame image and contains a target; and performing target segmentation based on the cut image corresponding to the next frame image according to the preset image size required by a preset target segmentation algorithm, and extracting a target image.
In some embodiments, cropping the current frame image according to the target area to obtain a cropped image containing a target corresponding to the current frame image, including:
judging whether the ratio of the size of the target area to the preset image size required by a preset target segmentation algorithm is larger than a preset threshold value or not;
if yes, cutting out a cut image containing a target from the current frame image according to the preset image size;
Otherwise, according to a preset clipping size, clipping the current frame image to obtain a clipping image containing a target, wherein the proportion of the target area to the clipping image is larger than a preset threshold value.
In some embodiments, when a cropping image containing a target is obtained by cropping from the current frame image according to the preset image size, the target segmentation is performed based on the cropping image containing the target corresponding to the current frame image, including: and directly carrying out target segmentation on the clipping image.
In some embodiments, when a cropping image containing a target is cropped from the current frame image according to a preset cropping size, the aspect ratio of the cropping image is the same as the aspect ratio of the preset image size.
In some embodiments, when a cropping image containing a target is obtained by cropping from the current frame image according to a preset cropping size, the target segmentation is performed based on the cropping image containing the target corresponding to the current frame image, including:
And the clipping image is amplified to the preset image size in equal proportion, and target segmentation is carried out on the amplified clipping image.
In some embodiments, the processor 600 is further configured to read the program in the memory 620, and perform the following procedure:
when it is determined that the current frame image does not need to be cut, determining a target area image of the current frame image;
According to a preset image size, taking the boundary of the target area image as a reference, and performing pixel expansion on boundary pixels of the target area image to obtain an expanded image with the preset image size, wherein the expanded image comprises the target area image;
and carrying out target segmentation on the expanded image to extract a target image.
In some embodiments, the electronic device provided by embodiments of the present application further includes a transceiver 610 for receiving and transmitting data under the control of the processor 600.
In FIG. 16, the bus architecture may comprise any number of interconnected buses and bridges, linking together various circuits of one or more processors, represented by processor 600, and memory, represented by memory 620. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators, power management circuits, etc., which are well known in the art and, therefore, will not be described further herein. The bus interface provides an interface. Transceiver 610 may be a number of elements, including a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium.
In some embodiments, the device further includes a user interface 630, which may be an interface capable of externally connecting the required equipment, including but not limited to a keypad, display, speaker, microphone, joystick, etc.
The processor 600 is responsible for managing the bus architecture and general processing, and the memory 620 may store data used by the processor 600 in performing operations.
In some embodiments, the processor 600 may be a CPU (Central Processing Unit), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a CPLD (Complex Programmable Logic Device).
It should be noted that, in the embodiment of the present application, the division of the modules is schematic, which is merely a logic function division, and other division manners may be implemented in actual implementation. In addition, each functional module in the embodiments of the present application may be integrated in one processing unit, or each module may exist alone physically, or two or more modules may be integrated in one unit. The integrated modules may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
The embodiment of the application provides a computing device, which may be a desktop computer, a portable computer, a smart phone, a tablet computer, a personal digital assistant (PDA) and the like. The computing device may include a central processing unit (CPU), memory, input/output devices, etc.; the input devices may include a keyboard, mouse, touch screen, etc., and the output devices may include a display device, such as a liquid crystal display (LCD) or cathode ray tube (CRT).
The memory may include Read Only Memory (ROM) and Random Access Memory (RAM) and provides the processor with program instructions and data stored in the memory. In the embodiment of the present application, the memory may be used to store a program of any of the methods provided in the embodiment of the present application.
The processor is configured to execute any of the methods provided by the embodiments of the present application according to the obtained program instructions by calling the program instructions stored in the memory.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the method of any of the above embodiments. The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random Access Memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
An embodiment of the present application provides a computer readable storage medium storing computer program instructions for use in an apparatus provided in the embodiment of the present application, where the computer program instructions include a program for executing any one of the methods provided in the embodiment of the present application. The computer readable storage medium may be a non-transitory computer readable medium.
The computer-readable storage medium can be any available medium or data storage device that can be accessed by a computer, including, but not limited to, magnetic storage (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MO), etc.), optical storage (e.g., CD, DVD, BD, HVD, etc.), and semiconductor storage (e.g., ROM, EPROM, EEPROM, nonvolatile memory (NAND flash), solid state disk (SSD)), etc.
It should be understood that:
The access technology via which an entity in the communication network communicates traffic may be any suitable current or future technology, such as WLAN (wireless local area network), WiMAX (worldwide interoperability for microwave access), LTE, LTE-A, 5G, Bluetooth, infrared, etc.; in addition, embodiments may also apply wired technologies, e.g., IP-based access technologies, such as wired networks or fixed lines.
Embodiments suitable for implementation as software code or a portion thereof and operation using a processor or processing function are software code independent and may be specified using any known or future developed programming language, such as a high-level programming language, such as Objective-C, C, C++, C#, Java, Python, JavaScript, other scripting languages, etc., or a low-level programming language, such as a machine language or assembler.
The implementation of the embodiments is hardware-independent and may be implemented using any known or future developed hardware technology or any hybrid thereof, such as microprocessors or CPUs (central processing units), MOS (metal oxide semiconductors), CMOS (complementary MOS), biMOS (bipolar MOS), biCMOS (bipolar CMOS), ECL (emitter coupled logic), and/or TTL (transistor-transistor logic).
Embodiments may be implemented as a single device, apparatus, unit, component, or function, or in a distributed fashion; for example, one or more processors or processing functions may be used or shared in a process, or one or more processing segments or portions may be used and shared in a process, where one or more physical processors may implement one or more processing portions dedicated to a particular process as described.
The apparatus may be implemented by a semiconductor chip, a chipset, or a (hardware) module comprising such a chip or chipset.
Embodiments may also be implemented as any combination of hardware and software, such as an ASIC (application-specific integrated circuit) component, an FPGA (field-programmable gate array) or CPLD (complex programmable logic device) component, or a DSP (digital signal processor) component.
Embodiments may also be implemented as a computer program product comprising a computer usable medium having a computer readable program code embodied therein, the computer readable program code adapted to perform a process as described in the embodiments, wherein the computer usable medium may be a non-transitory medium.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (10)
1. An image processing method, the method comprising:
acquiring a current frame image and determining a target area of the current frame image;
when it is determined that the current frame image needs to be cropped, cropping the current frame image according to the target area to obtain a cropped image containing a target corresponding to the current frame image;
and performing target segmentation, according to a preset image size required by a preset target segmentation algorithm, based on the cropped image containing the target corresponding to the current frame image, and extracting a target image.
2. The method according to claim 1, wherein the method further comprises:
when it is determined that the current frame image does not need to be cropped, directly scaling the current frame image to the preset image size required by the preset target segmentation algorithm, performing target segmentation on the scaled image, and extracting a target image.
3. The method of claim 1, wherein determining that the current frame image needs to be cropped comprises:
if the proportion of the target area of a previous frame image of the current frame image to the previous frame image is smaller than a preset threshold, determining that the current frame image needs to be cropped.
4. The method according to claim 1, wherein the method further comprises:
determining the proportion of the target area to the current frame image, and comparing the proportion with a preset threshold;
when the proportion of the target area to the current frame image is smaller than the preset threshold, cropping a next frame image of the current frame image according to the target area of the next frame image to obtain a cropped image containing a target corresponding to the next frame image; and performing target segmentation, according to the preset image size required by the preset target segmentation algorithm, based on the cropped image corresponding to the next frame image, and extracting a target image.
5. The method according to claim 1, wherein cropping the current frame image according to the target area to obtain a cropped image containing a target corresponding to the current frame image comprises:
determining whether the ratio of the size of the target area to the preset image size required by the preset target segmentation algorithm is larger than a preset threshold;
if so, cropping a cropped image containing a target from the current frame image according to the preset image size;
otherwise, cropping the current frame image according to a preset cropping size to obtain a cropped image containing a target, wherein the proportion of the target area to the cropped image is larger than the preset threshold.
6. The method according to claim 5, wherein, when the cropped image containing a target is cropped from the current frame image according to the preset image size, the performing target segmentation based on the cropped image containing the target corresponding to the current frame image comprises: directly performing target segmentation on the cropped image.
7. The method according to claim 5, wherein, when a cropped image containing a target is cropped from the current frame image according to the preset cropping size, an aspect ratio of the cropped image is the same as an aspect ratio of the preset image size.
8. The method of claim 7, wherein the performing target segmentation based on the cropped image containing the target corresponding to the current frame image comprises:
enlarging the cropped image proportionally to the preset image size, and performing target segmentation on the enlarged cropped image.
9. The method according to claim 1, wherein the method further comprises:
when it is determined that the current frame image does not need to be cropped, determining a target area image of the current frame image;
according to the preset image size, performing pixel expansion on boundary pixels of the target area image, with the boundary of the target area image as a reference, to obtain an expanded image of the preset image size, wherein the expanded image contains the target area image;
performing target segmentation on the expanded image to extract a target image.
10. An electronic device, comprising:
a memory for storing program instructions;
a processor for invoking the program instructions stored in the memory to perform, according to the obtained program, the method of any one of claims 1 to 9.
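The adaptive crop-then-segment pipeline of claims 1 to 9 can be sketched as follows. This is a minimal illustration, not the patented implementation: the preset size (256×256), the area-ratio threshold (0.5), the crop-size choice, the nearest-neighbour scaling, and the `segment` stand-in are all hypothetical values and helpers chosen for the example; the claims leave them unspecified.

```python
import numpy as np

# Illustrative values; the claims leave the "preset image size" and
# "preset threshold" unspecified.
PRESET_SIZE = (256, 256)        # (height, width) required by the segmentation model
AREA_RATIO_THRESHOLD = 0.5      # "preset threshold" from claims 3-5

def segment(image):
    """Stand-in for the preset target segmentation algorithm (hypothetical)."""
    h, w = image.shape[:2]
    assert (h, w) == PRESET_SIZE, "segmentation expects the preset image size"
    return np.zeros((h, w), dtype=bool)  # dummy mask

def resize_nn(image, out_h, out_w):
    """Nearest-neighbour resize via integer index maps (stands in for real scaling)."""
    h, w = image.shape[:2]
    ys = np.arange(out_h) * h // out_h
    xs = np.arange(out_w) * w // out_w
    return image[ys][:, xs]

def crop_around(image, box, crop_h, crop_w):
    """Crop a (crop_h, crop_w) window centred on the target box, clamped to the frame."""
    x, y, bw, bh = box
    h, w = image.shape[:2]
    cx, cy = x + bw // 2, y + bh // 2
    x0 = min(max(cx - crop_w // 2, 0), max(w - crop_w, 0))
    y0 = min(max(cy - crop_h // 2, 0), max(h - crop_h, 0))
    return image[y0:y0 + crop_h, x0:x0 + crop_w]

def pad_to_preset(image):
    """Claim 9: expand boundary pixels outward (edge replication) to the preset size."""
    ph, pw = PRESET_SIZE
    h, w = image.shape[:2]
    pads = ((0, max(ph - h, 0)), (0, max(pw - w, 0))) + ((0, 0),) * (image.ndim - 2)
    return np.pad(image, pads, mode="edge")

def process_frame(frame, target_box, needs_crop):
    """Claims 1-8: crop (or scale) the frame and segment at the preset size."""
    ph, pw = PRESET_SIZE
    if not needs_crop:
        # Claim 2: no crop needed -> scale the whole frame to the preset size.
        return segment(resize_nn(frame, ph, pw))
    x, y, bw, bh = target_box
    if (bw * bh) / (ph * pw) > AREA_RATIO_THRESHOLD:
        # Claims 5-6: target area is large relative to the preset size;
        # crop directly at the preset size and segment without scaling.
        cropped = crop_around(frame, target_box, ph, pw)
    else:
        # Claims 5, 7-8: crop at a smaller "preset cropping size" with the
        # same aspect ratio as the preset size, then enlarge proportionally.
        ch = max(2 * bh, 1)             # illustrative crop-size choice
        cw = ch * pw // ph              # keep the preset aspect ratio
        cropped = crop_around(frame, target_box, ch, cw)
        cropped = resize_nn(cropped, ph, pw)
    return segment(cropped)
```

The design point of the claims is that a small target is segmented from an enlarged crop rather than from the full downscaled frame, so the target occupies more of the model's fixed input and segmentation detail improves without changing the segmentation algorithm itself.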
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211699467.0A CN118262101A (en) | 2022-12-28 | 2022-12-28 | Image processing method and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118262101A true CN118262101A (en) | 2024-06-28 |
Family
ID=91606633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211699467.0A Pending CN118262101A (en) | 2022-12-28 | 2022-12-28 | Image processing method and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021068618A1 (en) | Method and device for image fusion, computing processing device, and storage medium | |
US11481908B2 (en) | Data processing method and computing device | |
CN111275034B (en) | Method, device, equipment and storage medium for extracting text region from image | |
CN112602088B (en) | Method, system and computer readable medium for improving quality of low light images | |
CN110991310B (en) | Portrait detection method, device, electronic equipment and computer readable medium | |
WO2020207134A1 (en) | Image processing method, device, apparatus, and computer readable medium | |
CN112750139A (en) | Image processing method and device, computing equipment and storage medium | |
US12056897B2 (en) | Target detection method, computer device and non-transitory readable storage medium | |
CN112752158A (en) | Video display method and device, electronic equipment and storage medium | |
US11127141B2 (en) | Image processing apparatus, image processing method, and a non-transitory computer readable storage medium | |
CN112218005A (en) | Video editing method based on artificial intelligence | |
CN113658196B (en) | Ship detection method and device in infrared image, electronic equipment and medium | |
CN111563517A (en) | Image processing method, image processing device, electronic equipment and storage medium | |
CN113506305A (en) | Image enhancement method, semantic segmentation method and device for three-dimensional point cloud data | |
CN118262102A (en) | Image processing method and electronic equipment | |
CN113436068A (en) | Image splicing method and device, electronic equipment and storage medium | |
CN116897532A (en) | Depth image restoration method and device, camera component and electronic equipment | |
CN111161240A (en) | Blood vessel classification method, computer device and readable storage medium | |
CN111476740B (en) | Image processing method, device, storage medium and electronic equipment | |
KR101922978B1 (en) | Method for facilitating view of input image and output image and apparatus using the same | |
CN115147297A (en) | Image processing method and device | |
CN115809959A (en) | Image processing method and device | |
CN118096593A (en) | Image processing method and electronic equipment | |
CN111626935B (en) | Pixel map scaling method, game content generation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||