CN118135635A - Face image recognition method and device, electronic equipment and medium - Google Patents
- Publication number
- CN118135635A (application CN202410254312.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- corrected
- face
- image sequence
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06V40/172 — Human faces: classification, e.g. identification
- G06T5/90 — Dynamic range modification of images or parts thereof
- G06V10/20 — Image preprocessing
- G06V10/242 — Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
- G06V10/806 — Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
- G06V10/82 — Image or video recognition or understanding using neural networks
- G06V40/161 — Human faces: detection; localisation; normalisation
- G06V40/168 — Human faces: feature extraction; face representation
- G06T2207/10016 — Image acquisition modality: video; image sequence
- G06T2207/30196 — Subject of image: human being; person
- G06T2207/30201 — Subject of image: face
Abstract
Embodiments of the invention disclose a face image recognition method and device, electronic equipment, and a medium. The method comprises: acquiring a first image sequence and a second image sequence, which are image sequences of the same target person in a vehicle captured from different angles; determining, based on the first and second image sequences, a first image to be corrected and a second image to be corrected corresponding to the target person; correcting the first and second images to be corrected to obtain a first corrected image sequence and a second corrected image sequence; and performing face recognition on the first and second corrected image sequences to obtain a face recognition result for the target person. By implementing the embodiments of the invention, face images can be corrected and the accuracy of face recognition improved.
Description
Technical Field
The present invention relates to the field of artificial intelligence, in particular to image processing technologies, and more specifically to a face image recognition method and device, an electronic device, and a medium.
Background
With the development of electronic technology, cameras are deployed in many scenes for monitoring, both to maintain public order and to protect people's legal rights; face recognition is then performed on the monitoring data to track relevant information. However, the quality of image sequences captured in complex scenes is unstable, so face recognition results based on such sequences are poor. How to improve face recognition accuracy is therefore an urgent problem to be solved.
Disclosure of Invention
Embodiments of the invention provide a face image recognition method and device, electronic equipment, and a medium, which can correct face images and thereby improve the accuracy of face recognition.
In one aspect, an embodiment of the invention provides a face image recognition method, comprising: acquiring a first image sequence and a second image sequence, which are image sequences of the same target person in a vehicle captured from different angles; determining a first image to be corrected and a second image to be corrected corresponding to the target person based on the first and second image sequences; correcting the first and second images to be corrected to obtain a first corrected image sequence and a second corrected image sequence; and performing face recognition on the first and second corrected image sequences to obtain a face recognition result for the target person.
In one possible implementation, acquiring the first and second image sequences comprises: capturing the first image sequence from a first preset angle with a first camera and the second image sequence from a second preset angle with a second camera, where the first preset angle is a top-down (overhead) angle and the second preset angle is an eye-level or upward-looking angle.
In one possible implementation, correcting the first and second images to be corrected to obtain the first and second corrected image sequences comprises: registering the two views based on their angles to obtain a position conversion relation between the first and second images to be corrected; traversing a first preset range of the first image to be corrected with a first sliding window until a first vehicle region is detected; extracting regional image features of the first vehicle region and determining N feature points of the target vehicle from them; determining the first A-pillar position and the windshield region of the target vehicle from the N feature points; determining a first target face region based on the size of the windshield region and a preset relative position relation; performing face detection in the first target face region to obtain a first face image; determining first binocular position coordinates of the eyes in the first face image; determining a first eye line from the first binocular position coordinates; determining a first A-pillar line from the first A-pillar position, where the A-pillar line runs along the longer edge of the A-pillar; determining a first included angle between the first eye line and the first A-pillar line; determining, from the first target face region and the position conversion relation, a second target face region in the second image to be corrected captured at the same time as the first; performing face detection in the second target face region to obtain a second face image; determining second binocular position coordinates of the eyes in the second face image; determining a second eye line from the second binocular position coordinates; determining a second A-pillar line from the first A-pillar line and the position conversion relation; determining a second included angle between the second eye line and the second A-pillar line; and judging whether both included angles lie within a first preset angle range. If so, no face correction is needed; if not, the first image to be corrected is rotated based on the first included angle and the second image to be corrected is rotated based on the second included angle, yielding the first and second corrected image sequences.
In a possible implementation, the first image to be corrected further includes a ground area and the second image to be corrected further includes a sky area, and the method further comprises: calculating the average brightness of the ground area to obtain a first brightness value; calculating the average brightness of the sky area to obtain a second brightness value; dividing the first brightness value by a first preset brightness value to obtain a first ratio; dividing the second brightness value by a second preset brightness value to obtain a second ratio; obtaining a brightness conversion coefficient from the first and second ratios; and performing brightness correction on the first and second face images according to the brightness conversion coefficient to obtain corrected first and second face images, where the brightness of the corrected face images decreases as the brightness conversion coefficient increases.
In this implementation, therefore, the brightness of the face image is adjusted according to the ambient light, so that face-image brightness is corrected and the accuracy of subsequent face recognition is improved.
In one possible implementation, performing face recognition on the corrected image sequences to obtain a recognition result comprises: converting the second corrected image sequence into the YUV color space, extracting features from the Y, U, and V channels, and merging them to obtain a first feature; extracting a second feature from the second corrected image sequence using an optical flow method; extracting a third image from the first corrected image sequence and a fourth image, captured at the same time, from the second corrected image sequence; registering the third and fourth images and extracting a third feature that represents depth information of the face image; splicing the first, second, and third features to obtain an intermediate feature; and performing face recognition on the intermediate feature to obtain the face recognition result for the target person.
Performing color space conversion on the original image and extracting features from both the luminance channel and the chrominance channels yields richer color features, which helps the system understand the image and improves detection efficiency. In addition, fusing the temporal features extracted by the optical flow method and the depth features extracted from the two views with the color features gives a multi-faceted representation of the face, improving recognition accuracy in a variety of scenes, particularly dynamic ones.
In a possible implementation, extracting the second feature from the second corrected image sequence using an optical flow method comprises: the second corrected image sequence contains a first frame image, the fourth image, and a second frame image, which are consecutive; predicting the optical flow from the fourth image to the first frame image and the optical flow from the fourth image to the second frame image; and obtaining the second feature by fusing the bidirectional optical flow information with a splicing operation.
Because the optical flow to the first frame image and the optical flow to the second frame image both start from the same fourth image, the alignment errors of conventional unidirectional optical flow are avoided and the accuracy of optical flow estimation is improved.
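As a sketch of this fusion step: assuming two dense flow fields have already been estimated from the shared fourth (middle) image by any optical-flow method, the "splicing" can be read as channel-wise concatenation. The array names and shapes below are illustrative, not taken from the patent:

```python
import numpy as np

def fuse_bidirectional_flow(flow_to_next, flow_to_prev):
    """Fuse two dense optical-flow fields that both originate from the
    same middle frame (the 'fourth image'):
      flow_to_next: flow from the middle frame to the second frame, (H, W, 2)
      flow_to_prev: flow from the middle frame to the first frame,  (H, W, 2)
    'Splicing' is read here as channel-wise concatenation, giving an
    (H, W, 4) motion feature."""
    assert flow_to_next.shape == flow_to_prev.shape
    assert flow_to_next.shape[-1] == 2
    return np.concatenate([flow_to_next, flow_to_prev], axis=-1)

# Toy example with constant flow fields (stand-ins for estimator output).
h, w = 4, 4
to_next = np.full((h, w, 2), 1.0)   # middle frame -> second frame
to_prev = np.full((h, w, 2), -1.0)  # middle frame -> first frame
feat = fuse_bidirectional_flow(to_next, to_prev)
print(feat.shape)  # (4, 4, 4)
```

Because both fields share the middle frame as their reference, the two flow vectors at each pixel describe the same point's motion forward and backward in time, which is what makes the simple concatenation meaningful.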
In a possible implementation, performing face recognition on the first and second corrected image sequences to obtain the face recognition result for the target person comprises: inputting the intermediate feature into a first module, a second module, and a third module, whose structures differ, to obtain a fourth, fifth, and sixth feature respectively; fusing the fourth and fifth features and inputting the result into a transformer structure to obtain a seventh feature; fusing the fifth and sixth features to obtain an eighth feature, and inputting it into a fourth module to obtain a ninth feature, where the fourth module comprises a convolution layer and two transformer layers connected in sequence; fusing the seventh and ninth features and inputting the result into a transformer layer to obtain a tenth feature; fusing the fourth and tenth features to obtain a final feature; and obtaining the recognition result from the final feature.
Because the first, second, and third modules have different structures, the multi-layer features interact repeatedly, which makes feature extraction more complete, prevents feature loss, and enhances adaptability to different scenes.
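A minimal sketch of the wiring just described, with every module replaced by a fixed linear map plus tanh; the patent does not specify the internals of the modules or the fusion operator, so these stand-ins (and the element-wise-addition reading of "fusing") only illustrate the data flow:

```python
import numpy as np

D = 16  # feature dimension (illustrative)

def make_module(seed):
    """Stand-in for a learned module: a fixed linear map plus tanh.
    The text only states that the modules have different structures."""
    w = np.random.default_rng(seed).normal(size=(D, D)) / np.sqrt(D)
    return lambda x: np.tanh(x @ w)

module1, module2, module3 = make_module(1), make_module(2), make_module(3)
transformer_a = make_module(4)  # placeholder for the transformer structure
module4 = make_module(5)        # conv + two transformer layers in the text
transformer_b = make_module(6)  # placeholder for the final transformer layer

def fuse(a, b):
    # "Fusing" is read here as element-wise addition; concatenation
    # plus a projection would be an equally plausible reading.
    return a + b

x = np.random.default_rng(0).normal(size=(1, D))  # intermediate feature
f4, f5, f6 = module1(x), module2(x), module3(x)   # fourth..sixth features
f7 = transformer_a(fuse(f4, f5))                  # seventh feature
f9 = module4(fuse(f5, f6))                        # eighth -> ninth feature
f10 = transformer_b(fuse(f7, f9))                 # tenth feature
final = fuse(f4, f10)                             # final feature
print(final.shape)  # (1, 16)
```

The point of the wiring is that the fourth feature re-enters at the last fusion, so an early representation survives alongside the deeply processed one.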
In another aspect, an embodiment of the invention provides a face image recognition device, comprising: an acquisition unit configured to acquire a first image sequence and a second image sequence, which are image sequences of the same target person in a vehicle captured from different angles; a determining unit configured to determine, based on the first and second image sequences, a first image to be corrected and a second image to be corrected corresponding to the target person; a correction unit configured to correct the first and second images to be corrected to obtain a first corrected image sequence and a second corrected image sequence; and a recognition unit configured to perform face recognition on the first and second corrected image sequences to obtain a face recognition result for the target person.
In still another aspect, an embodiment of the present invention provides an electronic device, including: a processor adapted to implement one or more instructions; and a computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the steps of the method for recognizing a face image according to the first aspect of the present invention.
In yet another aspect, an embodiment of the present invention provides a computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform the steps of the method for identifying a face image according to the first aspect of the present invention.
In the embodiments of the invention, the electronic device first acquires an image sequence; then determines a target face image based on the image sequence; then corrects the target face image to obtain an initial corrected image sequence; and finally performs face recognition on the initial corrected image sequence to obtain a recognition result. Because the acquired target face image is corrected before recognition, the features used for face recognition are more accurate, and recognition accuracy improves.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a block diagram of a system according to an embodiment of the present invention;
fig. 2 is a flow chart of a face image recognition method according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention.
As shown in fig. 1, fig. 1 is a system architecture diagram provided in an embodiment of the present application, and the face image recognition method provided herein may be applied to this architecture. The architecture includes a camera C1, a camera C2, and a person A1; relative to person A1, camera C1 is at a top-down angle and camera C2 at an eye-level or upward angle. Specifically, person A1 may be located in a vehicle; camera C1 captures an image sequence containing the road surface and person A1 in the vehicle, while camera C2 captures an image sequence containing the sky and person A1 in the vehicle. Both cameras transmit their image sequences to a server, which executes the face image recognition method.
It should be noted that the terms "first," "second," and the like in the description and the claims of the embodiments of the present invention and the above-described drawings are used for distinguishing between different objects and not for describing a particular sequential order.
Based on the above description, the embodiment of the present invention proposes a face image recognition method, which may be executed by an electronic device. Referring to fig. 2, the method for recognizing a face image may include the following steps S201 to S204:
s201, acquiring a first image sequence and a second image sequence; the first image sequence and the second image sequence are respectively image sequences of the same target person object under different angles in the vehicle;
The first and second image sequences are captured by cameras arranged at different angles; the cameras may be 2D or 3D cameras. The first image sequence is captured by the first camera and the second image sequence by the second camera, both arranged along an intersection road. The first image sequence is composed of first images to be corrected and the second image sequence of second images to be corrected; both contain the vehicle and face images of the person in the vehicle.
In a preferred embodiment, besides the vehicle and the face image of the person in the vehicle, the first image to be corrected further includes a ground area and the second image to be corrected further includes a sky area; this spatial information assists in correcting the images.
S202, determining a first image to be corrected and a second image to be corrected corresponding to the target person based on the first image sequence and the second image sequence;
In the embodiment of the invention, the first and second image sequences can be processed by a face detection model, such as a neural network model, to obtain the first and second images to be corrected corresponding to the target person.
The principle of determining the images to be corrected from an image sequence is as follows: if several faces appear in the sequence at the same moment, they necessarily belong to different persons; if only one face appears, consecutive detections very likely belong to the same person; and when face tracking is sufficiently robust, a tracked face necessarily corresponds to the same person. Therefore, by combining face detection, face tracking, and related techniques, the face image of the target person is automatically extracted from the image sequence and automatically annotated, yielding the first and second images to be corrected while ensuring both the accuracy and the efficiency of extraction.
Specifically, in this embodiment, face detection is performed on image frames in the first and second image sequences, and face tracking then follows the detected faces through the sequence, so face detection need not be run on every frame the sequences contain.
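The detect-once-then-track loop can be sketched as follows, with `detect` and `track` as hypothetical callables standing in for any concrete detector and tracker pair:

```python
def extract_target_faces(frames, detect, track):
    """Run the (expensive) face detector only until a face is found,
    then follow it with a (cheap) tracker on the remaining frames.
    `detect(frame) -> box | None` and `track(frame, box) -> box` are
    placeholders for any detector/tracker implementation."""
    boxes, box = [], None
    for frame in frames:
        # Detect until the first hit; track from then on.
        box = track(frame, box) if box is not None else detect(frame)
        boxes.append(box)
    return boxes
```

This is why the text says detection need not run on every frame: after the first successful detection, only the tracker touches subsequent frames.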
S203, correcting the first image to be corrected and the second image to be corrected to obtain a first corrected image sequence and a second corrected image sequence;
In a possible implementation, registration is performed based on the viewing angles to obtain the position conversion relation between the first and second images to be corrected, where the angles are the first and second preset angles, determined respectively by the shooting angles of the first and second cameras relative to the person. A first preset range of the first image to be corrected is traversed with a first sliding window until a first vehicle region is detected; regional image features of the first vehicle region are extracted and N feature points of the target vehicle are determined from them; the first A-pillar position and the windshield region of the target vehicle are determined from the N feature points; a first target face region is determined based on the size of the windshield region and a preset relative position relation; face detection in the first target face region yields a first face image; first binocular position coordinates of the eyes in the first face image are determined; a first eye line is determined from the first binocular position coordinates; a first A-pillar line is determined from the first A-pillar position, the A-pillar line running along the longer edge of the A-pillar; a first included angle between the first eye line and the first A-pillar line is determined; a second target face region is determined in the second image to be corrected, captured at the same time as the first, using the first target face region and the position conversion relation; face detection in the second target face region yields a second face image; second binocular position coordinates of the eyes in the second face image are determined; a second eye line is determined from them; a second A-pillar line is determined from the first A-pillar line and the position conversion relation; a second included angle between the second eye line and the second A-pillar line is determined; and whether both included angles lie within the first preset angle range is judged. If so, no face correction is needed; if not, the first image to be corrected is rotated based on the first included angle and the second image to be corrected is rotated based on the second included angle, yielding the first and second corrected image sequences.
It should be noted that when the camera position is fixed, the image area occupied by the road is also fixed; therefore only the first preset range covering the road needs to be searched for vehicle detection and positioning, rather than the whole image, which reduces the amount of data processed and improves face recognition efficiency.
If the driver's head is not tilted while driving, the line between the driver's eyes is perpendicular to the A-pillar line; the first preset angle range may therefore be 80 to 90 degrees.
Determining the preset range and detecting the vehicle therefore reduce the amount of computation, speed up locating the face image within the frame, and improve the efficiency of the whole face recognition process.
Second, whether the face is tilted is judged from the included angle between the line along the vehicle's A-pillar and the line through key facial feature points while the vehicle is being driven, and the angle is then corrected, making subsequent face image recognition more accurate and improving face recognition accuracy.
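The tilt test can be sketched as follows; the 80-90 degree range and the use of the angle between the inter-eye line and the A-pillar line follow the text, while the coordinates and function names are illustrative:

```python
import numpy as np

def tilt_angle_deg(eye_left, eye_right, pillar_top, pillar_bottom):
    """Angle between the inter-eye line and the A-pillar line, in
    degrees. All arguments are (x, y) image coordinates."""
    eye_vec = np.subtract(eye_right, eye_left).astype(float)
    pillar_vec = np.subtract(pillar_bottom, pillar_top).astype(float)
    cos = abs(np.dot(eye_vec, pillar_vec)) / (
        np.linalg.norm(eye_vec) * np.linalg.norm(pillar_vec))
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

def needs_correction(angle_deg, lo=80.0, hi=90.0):
    # The text gives 80-90 degrees as the "no correction needed" range.
    return not (lo <= angle_deg <= hi)

# Near-upright head: eye line horizontal, A-pillar almost vertical.
a = tilt_angle_deg((100, 50), (140, 50), (60, 0), (50, 200))
print(round(a), needs_correction(a))  # 87 False
```

When the angle falls outside the range, the image would be rotated by the difference from perpendicular before recognition; the rotation itself is a standard image-warp step and is omitted here.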
In a possible implementation, the first image to be corrected further includes a ground area and the second image to be corrected further includes a sky area, and the method further comprises: calculating the average brightness of the ground area to obtain a first brightness value; calculating the average brightness of the sky area to obtain a second brightness value; dividing the first brightness value by a first preset brightness value to obtain a first ratio; dividing the second brightness value by a second preset brightness value to obtain a second ratio; obtaining a brightness conversion coefficient from the two ratios; and performing brightness correction on the first and second face images according to the brightness conversion coefficient to obtain corrected first and second face images, whose brightness decreases as the brightness conversion coefficient increases.
The first preset brightness value is the brightness of the ground area under standard illumination, i.e. the ground-area brightness at which the first face image needs no brightness correction.
The second preset brightness value is the brightness of the sky area under standard illumination, i.e. the sky-area brightness at which the second face image needs no brightness correction.
For example, if the current time point is noon, the sunlight intensity is high, so the face image, the road surface and the sky all have high brightness values, and correction consists of lowering the brightness. At noon the first camera captures the road surface and the second camera captures blue sky and/or clouds; the average brightness of the road surface (the first brightness value) exceeds the first preset brightness value, and the average brightness of the sky/clouds (the second brightness value) exceeds the second preset brightness value, so the first ratio and the second ratio are both greater than 1. The brightness conversion coefficient may specifically be C = (first ratio + second ratio) / 2, and an optional correction is: corrected face-image brightness = pre-correction face-image brightness × (2 − C²).
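The brightness-correction rule in the worked example above can be sketched as follows. The coefficient formula C = (first ratio + second ratio) / 2 and the (2 − C²) correction come from the paragraph itself; the function names, sample brightness values and preset values are illustrative assumptions only.

```python
def luminance_conversion_coefficient(first_ratio, second_ratio):
    # C = (first ratio + second ratio) / 2, as in the noon example
    return (first_ratio + second_ratio) / 2

def correct_luminance(pixel_luminance, coefficient):
    # corrected = original * (2 - C^2): brightness drops as C grows above 1,
    # and C = 1 (scene at standard brightness) leaves the image unchanged
    return pixel_luminance * (2 - coefficient ** 2)

# Noon example: ground and sky are both brighter than their preset values
first_ratio = 120 / 100    # first brightness value / first preset brightness value
second_ratio = 150 / 100   # second brightness value / second preset brightness value
C = luminance_conversion_coefficient(first_ratio, second_ratio)   # 1.35
corrected = correct_luminance(200, C)                             # 200 * (2 - 1.8225) = 35.5
```

Note that for C between 1 and √2 the factor (2 − C²) stays between 0 and 1, so bright scenes are darkened without inverting the image.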
Therefore, in this implementation the brightness of the face image can be adjusted according to the ambient light level, realizing brightness correction of the face image and improving the accuracy of subsequent face recognition.
S204, performing face recognition on the first correction image sequence and the second correction image sequence to obtain a recognition result;
in one possible implementation manner, performing face recognition on the first correction image sequence and the second correction image sequence to obtain the face recognition result of the target person object includes: converting the second correction image sequence into the YUV color space, extracting features of the Y, U and V channels, and combining the extracted features to obtain a first feature; extracting a second feature from the second correction image sequence using an optical flow method; extracting a third image from the first correction image sequence and a fourth image captured at the same time from the second correction image sequence; registering the third image and the fourth image and extracting a third feature, where the third feature represents depth information of the face image; splicing the first feature, the second feature and the third feature to obtain an intermediate feature; and performing face recognition on the intermediate feature to obtain the face recognition result of the target person object.
By performing color space conversion on the original image and extracting features from the brightness channel and the color channels, richer color features are obtained, which helps the system understand the image and improves detection efficiency.
In addition, fusing the temporal features extracted by the optical flow method and the depth features extracted from the two images with the color features expresses the face from multiple aspects, improving recognition accuracy in various scenes, particularly dynamic ones.
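The color-space step above can be sketched as follows. The patent does not fix a conversion matrix, so the BT.601 coefficients are an assumption, and raw Y/U/V channels stand in for the learned per-channel features; all names are illustrative.

```python
import numpy as np

def rgb_to_yuv(image):
    # Full-range BT.601 conversion matrix (an assumption; the patent does not
    # specify which RGB-to-YUV matrix is used)
    m = np.array([[ 0.299,  0.587,  0.114],
                  [-0.169, -0.331,  0.500],
                  [ 0.500, -0.419, -0.081]])
    return image @ m.T

def first_feature(frames):
    # Convert every frame to YUV, then splice the Y, U and V channel
    # "features" together along the last axis to form the first feature
    yuv = np.stack([rgb_to_yuv(f) for f in frames])   # (T, H, W, 3)
    y, u, v = yuv[..., 0], yuv[..., 1], yuv[..., 2]
    return np.concatenate([y, u, v], axis=-1)         # (T, H, 3*W)

frames = [np.random.rand(4, 4, 3) for _ in range(3)]  # toy 3-frame sequence
feat = first_feature(frames)
```

In a real system each channel would pass through a learned feature extractor before splicing; the raw-channel version only shows the data flow.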
In a possible implementation manner, extracting the second feature from the second correction image sequence using the optical flow method includes: the second correction image sequence includes a first frame image, the fourth image and a second frame image, and the three are consecutive; predicting the optical flow from the fourth image to the first frame image and the optical flow from the fourth image to the second frame image; and fusing the bidirectional optical flow information by a splicing operation to obtain the second feature.
Because the optical flow to the first frame image and the optical flow to the second frame image both start from the same fourth image, errors caused by the misalignment of traditional unidirectional optical flows are avoided and the accuracy of optical flow estimation is improved.
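The splicing of the two flow fields can be sketched as below. Predicting the flows themselves requires an optical-flow model, which is out of scope here; the placeholder arrays and function names are assumptions, and only the bidirectional fusion step is shown.

```python
import numpy as np

def fuse_bidirectional_flow(flow_to_prev, flow_to_next):
    """Splice two flow fields that both start from the same middle (fourth) image.

    Each flow field has shape (H, W, 2) holding per-pixel (dx, dy) displacements;
    concatenating along the channel axis yields the (H, W, 4) second feature.
    """
    assert flow_to_prev.shape == flow_to_next.shape
    return np.concatenate([flow_to_prev, flow_to_next], axis=-1)

# Placeholder flows; in practice these come from an optical-flow predictor
h, w = 8, 8
flow_prev = np.zeros((h, w, 2))   # fourth image -> first frame image
flow_next = np.ones((h, w, 2))    # fourth image -> second frame image
second_feature = fuse_bidirectional_flow(flow_prev, flow_next)
```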
In a possible implementation manner, performing face recognition on the first correction image sequence and the second correction image sequence to obtain the face recognition result of the target person object includes: inputting the intermediate feature into a first module, a second module and a third module respectively to obtain a fourth feature, a fifth feature and a sixth feature, where the structures of the first, second and third modules differ; inputting the fused fourth and fifth features into a transformer structure to obtain a seventh feature; fusing the fifth and sixth features to obtain an eighth feature, and inputting the eighth feature into a fourth module to obtain a ninth feature, where the fourth module comprises a convolution layer and two transformer layers connected in sequence; inputting the fused seventh and ninth features into a transformer layer to obtain a tenth feature; fusing the fourth feature and the tenth feature to obtain a final feature; and obtaining a recognition result from the final feature.
Specifically, the first module consists of a convolution layer, a ReLU layer and a convolution layer connected in sequence; the second module consists of a convolution layer, a ReLU layer and a transformer structure connected in sequence; and the third module consists of a convolution layer, a ReLU layer, a transformer structure, a depthwise separable convolution layer and a transformer structure connected in sequence.
The different structures of the first, second and third modules realize multiple interactions among the multi-layer features, ensuring more complete feature extraction, preventing feature loss and enhancing adaptability to different scenes.
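The fusion topology described above can be sketched at shape level as follows. Each conv/ReLU/transformer module is replaced by a placeholder linear-plus-ReLU map, and sum fusion is assumed (the patent does not state the fusion operator); every name here is hypothetical and only the wiring between features is faithful to the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_module(dim_in, dim_out):
    # Placeholder for a conv/ReLU/transformer module: one random linear map + ReLU
    w = rng.standard_normal((dim_in, dim_out))
    return lambda x: np.maximum(x @ w, 0.0)

d = 16
module1, module2, module3 = (make_module(d, d) for _ in range(3))
module4 = make_module(d, d)          # stands in for conv layer + two transformer layers
transformer_a = make_module(d, d)    # transformer structure taking fused features 4 and 5
transformer_b = make_module(d, d)    # transformer layer taking fused features 7 and 9

def final_feature(intermediate):
    f4, f5, f6 = module1(intermediate), module2(intermediate), module3(intermediate)
    f7 = transformer_a(f4 + f5)      # seventh feature from fused fourth and fifth
    f9 = module4(f5 + f6)            # eighth feature (f5 + f6) -> fourth module -> ninth
    f10 = transformer_b(f7 + f9)     # tenth feature from fused seventh and ninth
    return f4 + f10                  # fuse fourth and tenth features

out = final_feature(rng.standard_normal((1, d)))
```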
Based on the foregoing embodiments of the face image recognition method, an embodiment of the present invention also discloses a face image recognition device, which may be a computer program (including program code) running in an electronic device. The face image recognition device may run the following units:
An acquisition unit configured to acquire a first image sequence and a second image sequence; the first image sequence and the second image sequence are respectively image sequences of the same target person object under different angles in the vehicle;
A determining unit, configured to determine a first image to be corrected and a second image to be corrected corresponding to the target person object based on the first image sequence and the second image sequence;
The correction unit is used for correcting the first image to be corrected and the second image to be corrected to obtain a first correction image sequence and a second correction image sequence;
The recognition unit is used for recognizing the face of the first correction image sequence and the second correction image sequence to obtain a face recognition result of the target person object.
In a possible implementation, the acquisition unit is specifically configured to: acquire the first image sequence from a first preset angle through a first camera and the second image sequence from a second preset angle through a second camera; the first preset angle is an overhead (top-down) angle, and the second preset angle is an eye-level or upward-looking angle.
In one possible implementation manner, the determining unit is specifically configured to: detect the first image sequence and the second image sequence with a face detection model, such as a neural network model, to obtain the first image to be corrected and the second image to be corrected corresponding to the target person object.
In one possible implementation, the correction unit is specifically configured to: register the two images based on their angles to obtain a position conversion relation between the first image to be corrected and the second image to be corrected; traverse a first preset range of the first image to be corrected with a first sliding window until a first vehicle area is detected; extract regional image features of the first vehicle area and determine N feature points of the target vehicle from them; determine a first A-pillar position and a windshield area of the target vehicle from the N feature points; determine a first target face area based on the size of the windshield area and a preset relative position relation; perform face detection in the first target face area to obtain a first face image; determine first binocular position coordinates of the eyes in the first face image; determine a first connecting line between the eyes based on the first binocular position coordinates; determine a first A-pillar line from the first A-pillar position, where the A-pillar line runs along one longer side of the A-pillar; determine a first included angle between the first connecting line and the first A-pillar line; determine, from the first target face area and the position conversion relation, a second target face area in the second image to be corrected captured at the same time as the first image to be corrected; perform face detection in the second target face area to obtain a second face image; determine second binocular position coordinates of the eyes in the second face image; determine a second connecting line between the eyes based on the second binocular position coordinates; determine a second A-pillar line from the first A-pillar line and the position conversion relation; determine a second included angle between the second connecting line and the second A-pillar line; judge whether the first included angle and the second included angle are both within a first preset angle range; if so, no face correction is needed; if not, rotate the first image to be corrected based on the first included angle and the second image to be corrected based on the second included angle to obtain the first correction image sequence and the second correction image sequence.
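The included-angle check at the heart of the correction step can be sketched as follows. Computing the angle between the eye connecting line and the A-pillar line from point coordinates is standard geometry; the preset angle range (80°–100°) and all names are illustrative assumptions, since the patent does not disclose concrete thresholds.

```python
import numpy as np

def line_angle_deg(p1, p2):
    # Direction angle of the line through two points, in degrees
    return np.degrees(np.arctan2(p2[1] - p1[1], p2[0] - p1[0]))

def included_angle(eye_left, eye_right, pillar_top, pillar_bottom):
    # Acute/right included angle between the eye connecting line
    # and the A-pillar line (lines have no direction, so fold to [0, 90])
    a = line_angle_deg(eye_left, eye_right)
    b = line_angle_deg(pillar_top, pillar_bottom)
    angle = abs(a - b) % 180.0
    return min(angle, 180.0 - angle)

def needs_rotation(angle, lo=80.0, hi=100.0):
    # Skip correction when the angle already lies in the preset range
    return not (lo <= angle <= hi)

# Level eye line vs vertical A-pillar: 90 degrees, no rotation needed
theta = included_angle((0, 0), (10, 0), (5, 0), (5, 12))
```

When `needs_rotation` returns true, the image would be rotated by the deviation from the target angle before re-running detection.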
In one possible implementation manner, the recognition unit is specifically configured to: convert the second correction image sequence into the YUV color space, extract features of the Y, U and V channels, and combine the extracted features to obtain a first feature; extract a second feature from the second correction image sequence using an optical flow method; extract a third image from the first correction image sequence and a fourth image captured at the same time from the second correction image sequence; register the third image and the fourth image and extract a third feature, where the third feature represents depth information of the face image; splice the first, second and third features to obtain an intermediate feature; and perform face recognition on the intermediate feature to obtain the face recognition result of the target person object.
In a possible implementation, the identification unit is configured to extract the second feature from the second correction image sequence using the optical flow method, and is specifically configured to: the second correction image sequence includes a first frame image, the fourth image and a second frame image, and the three are consecutive; predict the optical flow from the fourth image to the first frame image and the optical flow from the fourth image to the second frame image; and fuse the bidirectional optical flow information by a splicing operation to obtain the second feature.
In a possible implementation manner, the recognition unit is configured to perform face recognition on the intermediate feature to obtain a recognition result, and is specifically configured to: input the intermediate feature into a first module, a second module and a third module respectively to obtain a fourth feature, a fifth feature and a sixth feature, where the structures of the first, second and third modules differ; input the fused fourth and fifth features into a transformer structure to obtain a seventh feature; fuse the fifth and sixth features to obtain an eighth feature, and input the eighth feature into a fourth module to obtain a ninth feature, where the fourth module comprises a convolution layer and two transformer layers connected in sequence; input the fused seventh and ninth features into a transformer layer to obtain a tenth feature; fuse the fourth feature and the tenth feature to obtain a final feature; and obtain a recognition result from the final feature.
In the embodiment of the invention, the electronic equipment firstly acquires a first image sequence and a second image sequence; the first image sequence and the second image sequence are respectively image sequences of the same target person object under different angles in the vehicle; secondly, determining a first image to be corrected and a second image to be corrected corresponding to the target person object based on the first image sequence and the second image sequence; thirdly, correcting the first image to be corrected and the second image to be corrected to obtain a first corrected image sequence and a second corrected image sequence; and finally, carrying out face recognition on the first correction image sequence and the second correction image sequence to obtain a face recognition result of the target person object. Therefore, the embodiment of the invention corrects the acquired target face image, and carries out face recognition based on the corrected image, so that the characteristics of face recognition are more accurate, and the accuracy of face recognition is improved.
The embodiment of the invention also provides a computer storage medium (memory), which is a memory device in the electronic device for storing programs and data. It is understood that the computer storage medium here may include both the built-in storage medium of the electronic device and any extended storage medium it supports. The computer storage medium provides a storage space that stores the operating system of the electronic device. Also stored in this storage space are one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor 201. The computer storage medium here may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory; optionally, it may be at least one computer storage medium remote from the processor.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The device comprises: at least one processor 301, such as a central processing unit (central processing unit, CPU), at least one memory 302, and at least one bus 303.
The memory 302 may store program instructions, and the processor 301 may be configured to invoke the program instructions to perform a method for recognizing a face image.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the above embodiments may be implemented by hardware associated with a program, which may be stored in a computer-readable storage medium including a read-only memory (ROM), a random access memory (RAM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), a one-time programmable read-only memory (OTPROM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM), a solid state disk (solid state drive, SSD), or any other medium that can be used to carry or store data.
It should be noted that all steps of the embodiments of the present application are performed in compliance with law, that is, all steps of the embodiments of the present application are performed with proper authorization.
Claims (10)
1. A method for recognizing a face image, the method comprising:
acquiring a first image sequence and a second image sequence; the first image sequence and the second image sequence are respectively image sequences of the same target person object under different angles in the vehicle;
determining a first image to be corrected and a second image to be corrected corresponding to the target person object based on the first image sequence and the second image sequence;
Correcting the first image to be corrected and the second image to be corrected to obtain a first corrected image sequence and a second corrected image sequence;
And carrying out face recognition on the first correction image sequence and the second correction image sequence to obtain a face recognition result of the target person object.
2. The method of claim 1, wherein the acquiring the first image sequence and the second image sequence comprises:
acquiring the first image sequence from a first preset angle through a first camera and the second image sequence from a second preset angle through a second camera; the first preset angle is an overhead (top-down) angle, and the second preset angle is an eye-level or upward-looking angle.
3. The method of claim 1, wherein correcting the first image to be corrected and the second image to be corrected to obtain a first sequence of corrected images and a second sequence of corrected images comprises:
registering based on the angle to obtain a position conversion relation between the first image to be corrected and the second image to be corrected;
traversing a first preset range of the first image to be corrected based on a first sliding window until a first vehicle area is detected;
Extracting regional image features of the first vehicle region, and determining N feature points of a target vehicle based on the regional image features;
determining a first A-pillar position of the target vehicle and a windshield area according to the N characteristic points;
determining a first target face area based on the size of the windshield area and a preset relative position relation;
performing face detection in the first target face area to obtain a first face image;
determining first binocular position coordinates of the eyes in the first face image;
determining a first connecting line between the eyes based on the first binocular position coordinates;
determining a first A-pillar line according to the first A-pillar position, wherein the A-pillar line runs along one longer side of the A-pillar;
determining a first included angle between the first connecting line and the first A column connecting line;
determining a second target face region in the second image to be corrected at the same time as the first image to be corrected according to the first target face region and the position conversion relation;
performing face detection in the second target face area to obtain a second face image;
determining second binocular position coordinates of eyes in the second face image;
determining a second connecting line between the eyes based on the second binocular position coordinates;
determining a second A column connecting line according to the first A column connecting line and the position conversion relation;
Determining a second included angle between the second connecting line and the second A column connecting line;
Judging whether the first included angle and the second included angle are both in a first preset angle range or not;
if yes, face correction is not needed;
If not, rotating the first image to be corrected based on the first included angle, and rotating the second image to be corrected based on the second included angle, so as to obtain the first correction image sequence and the second correction image sequence.
4. A method according to claim 3, wherein the first image to be corrected further comprises a ground area; the second image to be corrected further comprises a sky area, and the method further comprises:
Calculating the average brightness value of the ground area to obtain a first brightness value;
calculating the average brightness value of the sky area to obtain a second brightness value;
determining the ratio of the first brightness value to a first preset brightness value to obtain a first ratio;
determining the ratio of the second brightness value to a second preset brightness value to obtain a second ratio;
Obtaining a brightness conversion coefficient according to the first ratio and the second ratio;
performing brightness correction on the first face image and the second face image according to the brightness conversion coefficient to obtain a corrected first face image and a corrected second face image; the brightness of the corrected first face image and the brightness of the corrected second face image decrease with an increase in the brightness conversion coefficient.
5. The method of claim 1, wherein the performing face recognition on the first corrected image sequence and the second corrected image sequence to obtain a face recognition result of the target person object comprises:
converting the second correction image sequence into a YUV color space, extracting the characteristics of a Y channel, a U channel and a V channel from the YUV color space, and combining the extracted characteristics to obtain a first characteristic;
extracting a second feature from the second corrected image sequence using an optical flow method;
extracting a third image from the first corrected image sequence; extracting a fourth image from the second correction image sequence at the same time; registering the third image and the fourth image and extracting a third feature, wherein the third feature represents depth information of a face image;
splicing the first feature, the second feature and the third feature to obtain an intermediate feature;
and carrying out face recognition on the intermediate features to obtain a face recognition result of the target person object.
6. The method of claim 5, wherein the extracting the second feature from the second corrected image sequence using an optical flow method comprises:
The second correction image sequence includes a first frame image, the fourth image, and a second frame image; the first frame image, the fourth image, and the second frame image are continuous;
Predicting optical flow of the fourth image to the first frame image, optical flow of the fourth image to the second frame image;
And acquiring the second characteristic by adopting a splicing operation to fuse the bidirectional optical flow information.
7. The method of claim 5, wherein performing face recognition on the first and second corrected image sequences to obtain a face recognition result of the target person object comprises:
Inputting the intermediate features into a first module, a second module and a third module respectively to obtain a fourth feature, a fifth feature and a sixth feature, wherein the structures of the first module, the second module and the third module are different;
Inputting the fused fourth feature and fifth feature into a transformer structure to obtain a seventh feature;
the fifth feature and the sixth feature are fused to obtain an eighth feature, and the eighth feature is input into a fourth module to obtain a ninth feature; the fourth module comprises a convolution layer and two transformer layers which are sequentially connected;
inputting the fused seventh feature and the fused ninth feature into a transformer layer to obtain a tenth feature;
Fusing the fourth feature and the tenth feature to obtain a final feature;
and obtaining the face recognition result of the target person object according to the final characteristics.
8. A face image recognition apparatus, comprising:
An acquisition unit configured to acquire a first image sequence and a second image sequence; the first image sequence and the second image sequence are respectively image sequences of the same target person object under different angles in the vehicle;
A determining unit, configured to determine a first image to be corrected and a second image to be corrected corresponding to the target person object based on the first image sequence and the second image sequence;
The correction unit is used for correcting the first image to be corrected and the second image to be corrected to obtain a first correction image sequence and a second correction image sequence;
The recognition unit is used for recognizing the face of the first correction image sequence and the second correction image sequence to obtain a face recognition result of the target person object.
9. An electronic device, comprising:
a processor adapted to implement one or more instructions; and
A computer storage medium storing one or more instructions adapted to be loaded by the processor and to perform the method of face image recognition as claimed in any one of claims 1 to 7.
10. A computer storage medium storing one or more instructions adapted to be loaded by a processor and to perform a method of face image recognition according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410254312.9A CN118135635A (en) | 2024-03-06 | 2024-03-06 | Face image recognition method and device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118135635A true CN118135635A (en) | 2024-06-04 |
Family
ID=91245541
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410254312.9A Pending CN118135635A (en) | 2024-03-06 | 2024-03-06 | Face image recognition method and device, electronic equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118135635A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118314587A (en) * | 2024-06-11 | 2024-07-09 | 江苏魔视智能科技有限公司 | Semantic map fusion method, semantic map fusion device, semantic map fusion equipment, semantic map fusion storage medium and semantic map fusion program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111160172B (en) | Parking space detection method, device, computer equipment and storage medium | |
CN110650292B (en) | Method and device for assisting user in shooting vehicle video | |
US20150138310A1 (en) | Automatic scene parsing | |
CN111507210A (en) | Traffic signal lamp identification method and system, computing device and intelligent vehicle | |
CN111860352B (en) | Multi-lens vehicle track full tracking system and method | |
US20080013837A1 (en) | Image Comparison | |
CN110570456A (en) | Motor vehicle track extraction method based on fusion of YOLO target detection algorithm and optical flow tracking algorithm | |
CN111837158A (en) | Image processing method and device, shooting device and movable platform | |
CN114973028B (en) | Aerial video image real-time change detection method and system | |
CN112598743B (en) | Pose estimation method and related device for monocular vision image | |
CN111967396A (en) | Processing method, device and equipment for obstacle detection and storage medium | |
CN118135635A (en) | Face image recognition method and device, electronic equipment and medium | |
CN109754034A (en) | A kind of terminal device localization method and device based on two dimensional code | |
JP2019075097A (en) | Method and apparatus for data reduction of feature-based peripheral information of driver assistance system | |
CN116052090A (en) | Image quality evaluation method, model training method, device, equipment and medium | |
CN107506753B (en) | Multi-vehicle tracking method for dynamic video monitoring | |
CN111832345A (en) | Container monitoring method, device and equipment and storage medium | |
CN115965934A (en) | Parking space detection method and device | |
CN115346155A (en) | Ship image track extraction method for visual feature discontinuous interference | |
CN112444251A (en) | Vehicle driving position determining method and device, storage medium and computer equipment | |
CN111860050B (en) | Loop detection method and device based on image frames and vehicle-mounted terminal | |
CN113784026B (en) | Method, apparatus, device and storage medium for calculating position information based on image | |
CN115565155A (en) | Training method of neural network model, generation method of vehicle view and vehicle | |
CN113449574A (en) | Method and device for identifying content on target, storage medium and computer equipment | |
CN114550124A (en) | Method for detecting obstacle in parking space and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||