
WO2020029406A1 - Human face emotion identification method and device, computer device and storage medium - Google Patents


Info

Publication number
WO2020029406A1
WO2020029406A1 (PCT/CN2018/108251)
Authority
WO
WIPO (PCT)
Prior art keywords
image
facial
emotion
training sample
feature vector
Prior art date
Application number
PCT/CN2018/108251
Other languages
French (fr)
Chinese (zh)
Inventor
吴壮伟
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020029406A1 publication Critical patent/WO2020029406A1/en

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 Facial expression recognition

Definitions

  • the present application relates to the field of computer technology, and in particular, to a method, a device, a computer device, and a storage medium for facial emotion recognition.
  • facial expressions are an important carrier of human communication and an important form of non-verbal communication; they can express human emotions well.
  • human emotions affect human behavior to a certain extent. For example, when a driver is in negative emotions such as anger, sadness, or anxiety, it is easy to ignore the surrounding road conditions and respond more slowly to emergencies, resulting in a higher incidence of traffic accidents. Based on this, the behavior of drivers and other personnel can be guided by recognizing facial emotions. For example, if a driver is recognized as being in a negative emotional state, the driver can be prompted to adjust his or her emotional state to avoid a traffic accident. Therefore, how to accurately recognize facial emotions has become an urgent technical problem.
  • the present application provides a facial emotion recognition method, device, computer equipment, and storage medium to accurately recognize facial emotions.
  • the present application provides a facial emotion recognition method, which includes: acquiring video images collected in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if so, taking the images whose energy feature vectors have Euclidean distance values exceeding the preset threshold as key frame images, where the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing the facial emotion in each of the key frame images based on the emotion recognition model; and obtaining the facial emotion corresponding to the video images according to the facial emotions in all the key frame images, to complete the recognition of the facial emotion.
  • the present application provides a facial emotion recognition device, which includes: an acquiring unit for acquiring video images collected in real time; a transform unit for performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; a distance calculation unit configured to obtain a standard energy feature vector and calculate a Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference calculation method; a distance judgment unit for determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; a key frame acquisition unit for taking, if any Euclidean distance value exceeds the preset threshold, the images whose energy feature vectors have Euclidean distance values exceeding the preset threshold as key frame images, where the number of key frame images is at least one; an emotion recognition unit configured to obtain a pre-stored emotion recognition model and recognize the facial emotion in each of the key frame images based on the emotion recognition model; and an emotion acquiring unit for obtaining the facial emotion corresponding to the video images according to the facial emotions in all the key frame images, to complete the recognition of the facial emotion.
  • the present application further provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the facial emotion recognition method provided in the first aspect.
  • the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the facial emotion recognition method provided in the first aspect.
  • the application provides a method, a device, a computer device, and a storage medium for facial emotion recognition. This method can accurately recognize facial emotions.
  • FIG. 1 is a schematic flowchart of a facial emotion recognition method according to an embodiment of the present application
  • FIGS. 2 to 6 are further schematic flowcharts of a facial emotion recognition method provided by an embodiment of the present application.
  • FIGS. 7 and 8 are specific schematic flowcharts of a facial emotion recognition method provided by an embodiment of the present application.
  • FIG. 9 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • FIGS. 11 to 15 are further schematic block diagrams of a facial emotion recognition device according to an embodiment of the present application.
  • FIG. 16 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • the facial emotion recognition method can be applied to a facial emotion recognition system, and the facial emotion recognition system can be installed in a device with a camera function such as a mobile phone or a car.
  • the facial emotion recognition system can exist in the device as an independent system, or it can be embedded in other systems of the device.
  • a facial emotion recognition system can be embedded in a car driving system to identify the driver's emotions.
  • the facial emotion recognition system may be embedded in an application program of a mobile phone to assist the application program to realize a facial emotion recognition function.
  • the facial emotion recognition method includes steps S101 to S107.
  • the device where the facial emotion recognition system is located invokes a camera to perform real-time image acquisition of the user.
  • the device acquires video images collected within a certain period of time through the camera, for example, the video captured during the last 10 seconds of real-time acquisition. It can be understood that the video will include multiple frames of images.
  • because the facial emotion recognition method relies on information such as a neutral expression image, a standard energy feature vector, and an emotion recognition model, the facial emotion recognition system must also perform the following operations before the user uses it for facial emotion recognition, that is, before step S101:
  • FIG. 2 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101a, S101b, and S101c are also included.
  • S101b Perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
  • a neutral expression image and a standard energy feature vector need to be prepared in advance.
  • the neutral expression may be a facial expression of the user in a relatively stable mood.
  • the facial expressions commonly used by users when taking ID photos can be understood as neutral expressions.
  • the device may issue a voice prompt or a text prompt to prompt the user to make a neutral expression.
  • the image of the user's neutral expression is captured by the camera to obtain a neutral expression image.
  • neutral expression images can also be obtained in other ways.
  • a neutral expression image, such as an ID photo input by the user, is acquired.
  • the user transfers the image of the neutral expression taken in the past into the device where the facial emotion recognition system is located as a neutral expression image.
  • the identity information input by the user is obtained, and the ID photo corresponding to the identity information is then obtained from a background server as the neutral expression image. The background server may be the background server of the vehicle system, of a mobile phone application, or of the facial emotion recognition system, and it can store the ID photos corresponding to users' identity information, or, after obtaining the identity information, call a third-party server or use technologies such as web crawlers to obtain the ID photo corresponding to the user's identity information from network data.
  • the neutral expression image is subjected to Gabor wavelet transform to obtain the corresponding standard energy feature vector, and the neutral expression image and the corresponding standard energy feature vector are stored, so that they can be called for facial emotion recognition whenever the user uses the facial emotion recognition system.
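To make the Gabor step concrete, the following is a minimal sketch of how a standard energy feature vector might be computed from a grayscale neutral expression image. It assumes Python with OpenCV and NumPy; the filter-bank parameters (three kernel sizes, eight orientations, and the sigma/lambda choices) are illustrative assumptions, not values specified by the patent.

```python
import cv2
import numpy as np

def gabor_energy_vector(gray_face, ksizes=(5, 9, 13), n_orientations=8):
    """Filter a grayscale face image with a small Gabor bank and return
    one energy value (sum of squared responses) per scale/orientation."""
    img = gray_face.astype(np.float32)
    energies = []
    for ksize in ksizes:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernel = cv2.getGaborKernel(ksize=(ksize, ksize), sigma=ksize / 3.0,
                                        theta=theta, lambd=ksize / 2.0,
                                        gamma=0.5, psi=0.0)
            response = cv2.filter2D(img, cv2.CV_32F, kernel)
            energies.append(float(np.sum(response ** 2)))  # sub-band energy
    return np.asarray(energies)

# The standard energy feature vector is computed once from the stored
# neutral expression image and reused for every later comparison:
# standard_vector = gabor_energy_vector(neutral_gray)
```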
  • FIG. 3 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101d and S101e are also included.
  • S101d Obtain an emotional training sample image set, where the emotional training sample image set includes a plurality of emotional training sample images and an emotional label of a human face in the emotional training sample image.
  • S101e input the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and store the emotion recognition model.
  • an emotion recognition model needs to be prepared in advance.
  • the facial emotion recognition system needs to acquire a set of emotional training sample images.
  • the emotional training sample image set includes a large number of emotional training sample images and an emotional label of a human face corresponding to each emotional training sample image. It should be noted that the emotional labels of the faces in each of the emotional training sample images can be labeled manually, or can be labeled by other methods, which are not specifically limited here.
  • the emotion training sample images and the corresponding facial emotion labels are input into a convolutional neural network (CNN) model for machine learning to obtain the emotion recognition model, which is then stored in the device where the facial emotion recognition system is located, so that it can be called for emotion recognition when the system is subsequently used.
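As a rough illustration of the kind of convolutional neural network described, here is a compact sketch in PyTorch. The 7-class output matches the seven preset emotions discussed later; the input size, layer shapes, and optimizer settings are assumptions for the sketch, not details taken from the patent.

```python
import torch
import torch.nn as nn

class EmotionCNN(nn.Module):
    """Small CNN mapping a 48x48 grayscale face crop to 7 emotion classes."""
    def __init__(self, n_classes=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 12 * 12, 128), nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):      # x: (batch, 1, 48, 48)
        return self.net(x)     # logits over the 7 preset emotions

model = EmotionCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()  # trained on (image, emotion label) pairs
```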
  • it may happen that the camera of the device where the facial emotion recognition system is located cannot properly capture the user in real time; for example, if the camera angle is wrong, the user's face may not appear in the video images collected in real time, or only half of the face may be captured. Such video images will inevitably reduce the accuracy of subsequent facial emotion recognition. Therefore, to ensure that a usable video image of the face can be captured and to improve the accuracy of subsequent facial emotion recognition, the camera needs to be calibrated before the real-time video images are acquired.
  • FIG. 4 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • steps S101f, S101g, S101h and S101j are also included.
  • S101g Extract a preset number of frames as a calibration image from a plurality of frames of the calibration video image according to a preset rule.
  • S101h Based on a pre-stored face detection and recognition model, determine whether face information exists in the calibration image in each frame.
  • step S101j: if there is no face information in at least one frame of the calibration image, issue a prompt message so that the user adjusts the camera angle according to the prompt information, and after the angle is adjusted, return to step S101f until face information exists in each frame of the calibration image.
  • the facial emotion recognition system needs to acquire a segment of a calibration video image collected in real time.
  • the calibration video image includes multiple frames of images. Then, according to a preset extraction rule, an image with a preset number of frames is extracted from a plurality of frames of the calibration video image as a calibration image.
  • the preset extraction rule may extract one image as a calibration image every 1 second.
  • the preset number of frames may be set to 100, for example, and can be adjusted according to actual requirements.
  • the preset extraction rule may not be limited to the above-mentioned rules, and may be set according to actual requirements, and is not limited here.
  • if face information exists in each frame of the calibration image, step S101 can be performed, that is, the step of acquiring the video images collected in real time.
  • a prompt message can be sent by voice or on a display, so that the user can readjust the camera angle according to the prompt.
  • the process then returns to step S101f, that is, the step of obtaining a calibration video image collected in real time, until face information is present in each frame of the calibration image, thereby completing the calibration of the camera angle.
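A minimal sketch of this calibration loop follows, assuming Python with OpenCV. OpenCV's pretrained frontal-face Haar cascade stands in here for the patent's own pre-stored face detection and recognition model, and the sampling interval is an illustrative assumption.

```python
import cv2

# Pretrained cascade used as a stand-in for the patent's face model.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def sample_calibration_frames(capture, n_samples=100, every=30):
    """Keep every `every`-th frame from the live capture as a calibration
    image until `n_samples` frames have been collected."""
    samples, i = [], 0
    while len(samples) < n_samples:
        ok, frame = capture.read()
        if not ok:
            break
        if i % every == 0:
            samples.append(frame)
        i += 1
    return samples

def camera_angle_ok(samples):
    """True only if the detector finds a face in every calibration image;
    otherwise the user should be prompted to adjust the camera angle."""
    for frame in samples:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) == 0:
            return False
    return True
```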
  • FIG. 5 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101k, S101m, S101n, and S101p are also included.
  • S101k Obtain a training sample image set, where the training sample image set includes multiple training sample images and a face label used to characterize whether or not face information exists in the training sample image.
  • S101m Obtain a face Haar feature vector of each training sample image.
  • S101n Input the face Haar feature vector and the face label corresponding to each training sample image into an AdaBoost boosting model based on a decision tree model for training, to obtain a face detection and recognition model.
  • a face detection recognition model needs to be prepared in advance, so as to be used when performing camera angle calibration.
  • a training sample image set is first obtained, where the training sample image set includes multiple training sample images, and a face label corresponding to each training sample image.
  • the face label is used to characterize whether there is face information in a corresponding face sample image.
  • Haar feature extraction is performed on each training sample image to obtain the face Haar feature vector corresponding to each training sample image.
  • the face Haar feature vector corresponding to each training sample image and the corresponding face label are input into an AdaBoost boosting model based on a decision tree model for training, and the face detection and recognition model is obtained.
  • the face detection and recognition model is stored in the device where the facial emotion recognition system is located.
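The training step itself can be sketched with scikit-learn, whose AdaBoostClassifier over shallow decision trees is a direct instance of an AdaBoost boosting model based on a decision tree model. The feature dimensionality, the number of estimators, and the placeholder training data below are assumptions for illustration only; the `estimator` keyword assumes scikit-learn 1.2 or later (older releases call it `base_estimator`).

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder training data: one Haar feature vector per sample image and
# a face label (1 = face information present, 0 = absent).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))
y_train = rng.integers(0, 2, size=200)

face_model = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision-tree weak learner
    n_estimators=200,                               # illustrative value
)
face_model.fit(X_train, y_train)

# During calibration: does this frame's Haar feature vector contain a face?
haar_vector = X_train[0]
has_face = face_model.predict(haar_vector.reshape(1, -1))[0] == 1
```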
  • wavelet transform is required for all frame images in the video image to obtain the energy feature vector corresponding to each frame image.
  • the wavelet transform may be, for example, a Gabor wavelet transform.
  • the wavelet transform may also adopt other methods, which is not limited herein.
  • the standard energy feature vector is an energy feature vector obtained by performing wavelet transform on a neutral expression image of a user collected in advance.
  • the standard energy feature vector is stored in advance in a device where the facial emotion recognition system is located. Because the standard energy feature vector is stored in the device in advance, obtaining the standard energy feature vector is specifically to obtain the previously stored standard energy feature vector.
  • the Euclidean distance value between each energy feature vector obtained in step S102 and the standard energy feature vector is calculated according to an image difference calculation method.
  • since the standard energy feature vector is stored in advance in the device where the facial emotion recognition system is located, it can be directly called in step S103, thereby reducing the occupation of CPU resources of the device and reducing the calculation time.
  • alternatively, the device where the facial emotion recognition system is located may store only the neutral expression image in advance. In this case, when the standard energy feature vector is obtained in step S103, the previously stored neutral expression image is obtained first, and then wavelet transform is performed on it to obtain the standard energy feature vector; the time at which the standard energy feature vector is calculated is not limited here.
  • after the Euclidean distance value between each energy feature vector and the standard energy feature vector has been calculated in step S103, a plurality of Euclidean distance values are obtained, and it is then determined whether any of them exceeds a preset threshold. If a Euclidean distance value exceeds the preset threshold, it indicates that the facial expression in the corresponding image differs greatly from the neutral expression, and step S105 is executed.
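The threshold test in steps S103 to S105 amounts to a simple filter over the per-frame energy vectors; a sketch, with the threshold value left as a free parameter, might look as follows.

```python
import numpy as np

def select_key_frames(frames, energy_vectors, standard_vector, threshold):
    """Keep the frames whose energy feature vector lies farther than
    `threshold` (in Euclidean distance) from the neutral standard vector."""
    key_frames = []
    for frame, vec in zip(frames, energy_vectors):
        if np.linalg.norm(np.asarray(vec) - standard_vector) > threshold:
            key_frames.append(frame)
    # An empty result means no expression deviated from neutral; step S108
    # then falls back to the stored neutral expression image.
    return key_frames
```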
  • FIG. 6 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • if step S104 determines that no Euclidean distance value exceeds the preset threshold, step S108 is executed, that is, the neutral expression image corresponding to the standard energy feature vector is used as the key frame image, and the subsequent steps S106 and S107 are then performed.
  • the facial emotion corresponding to the video image may also be set as a neutral emotion to complete the recognition of the facial emotion.
  • an image whose energy feature vector has a Euclidean distance value exceeding the preset threshold is used as a key frame image, where the number of key frame images is at least one.
  • the number of Euclidean distance values exceeding a preset threshold may be one, or may be two or more. At this time, the number of key frame images is at least one.
  • the emotion recognition model is a model for recognizing facial emotions obtained by performing machine learning training in advance
  • the emotion recognition model may be, for example, a convolutional neural network model.
  • the device where the facial emotion recognition system is located first obtains the emotion recognition model, and then inputs the key frame image as an input value into the emotion recognition model.
  • the emotion recognition model performs emotion recognition on the key frame images to output the facial emotion in each key frame image.
  • FIG. 7 is a specific schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • This step S106 includes steps S1061 to S1063.
  • each of the key frame images is sequentially input as an input value into the emotion recognition model, and the emotion recognition model then outputs the probability values of each key frame image on the various preset emotions.
  • the multiple preset emotions include 7 preset emotions, namely fear, anger, sadness, disgust, joy, surprise, and neutrality.
  • the emotion recognition model will output, for the face in each key frame image, a probability on each of these 7 preset emotions. For example, for a certain key frame image, the probabilities over the above 7 preset emotions may be 10%, 70%, 15%, 5%, 0%, 0%, and 0%.
  • the emotion corresponding to the largest probability value among the multiple probability values for each key frame image is used as the facial emotion in that key frame image. In the example above, anger, with the largest probability value of 70%, is taken as the facial emotion of that key frame image.
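In code, this per-frame decision is a plain argmax over the model's probability output. The ordering of the seven emotions below is an assumption for illustration; only the set of seven and the example probabilities come from the text.

```python
import numpy as np

# Assumed ordering of the seven preset emotions.
EMOTIONS = ["fear", "anger", "sadness", "disgust", "joy", "surprise", "neutral"]

def frame_emotion(probabilities):
    """Return the preset emotion with the largest model probability."""
    return EMOTIONS[int(np.argmax(probabilities))]

frame_emotion([0.10, 0.70, 0.15, 0.05, 0.0, 0.0, 0.0])  # -> "anger"
```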
  • FIG. 8 is a specific schematic flowchart of a facial emotion recognition method according to an embodiment of the present application.
  • This step S107 includes steps S1071 to S1072.
  • the facial emotion with the highest probability of occurrence is used as the facial emotion corresponding to the video image, to complete the recognition of the facial emotion.
  • for example, suppose the number of key frame images is 10, and the emotion recognition model identifies the facial emotion of 8 key frame images as anger, of 1 key frame image as disgust, and of 1 key frame image as fear. Probability statistics over the facial emotions of the 10 key frame images then give anger a probability of occurrence of 80%, disgust 10%, and fear 10%. The anger emotion, having the highest probability of occurrence, is used as the facial emotion corresponding to the entire video image, thereby completing the recognition of the facial emotion within the time period corresponding to the video image.
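This statistics step is effectively a majority vote over the per-key-frame emotions; a sketch follows, reproducing the 8/1/1 example above.

```python
from collections import Counter

def video_emotion(frame_emotions):
    """Return the facial emotion that occurs most often across the key
    frames, together with its share of the frames."""
    emotion, count = Counter(frame_emotions).most_common(1)[0]
    return emotion, count / len(frame_emotions)

video_emotion(["anger"] * 8 + ["disgust", "fear"])  # -> ("anger", 0.8)
```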
  • FIG. 9 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. After step S107, steps S109 to S112 are also included.
  • S111 Determine whether a probability of a facial emotion belonging to a preset emotion category exceeds a preset probability value.
  • the preset emotion category is the negative emotion category.
  • the facial emotions included in the negative emotion category are four types of fear, anger, sadness, and disgust.
  • suppose the number of video images in the emotion list within the two-minute window is 100; there will then be 100 facial emotions, and the probability of facial emotions belonging to the negative emotion category among these 100 facial emotions is counted, for example, 99%.
  • if the probability of facial emotions belonging to the negative emotion category exceeds the preset probability value of 80%, it means that the user has been in a negative emotional state during these 2 minutes.
  • in this case, the preset prompt mode and the preset prompt information are obtained, and the user is prompted with the preset prompt information according to the preset prompt mode.
  • the preset prompt mode may be, for example, a voice prompt mode, a text display mode, a voice prompt and vibration combination mode, and the like.
  • the preset prompt information may be, for example, "Your current mood is low, please pay attention to driving safely" and the like.
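Putting the monitoring step together, a sketch of the negative-emotion check over the emotion list might look as follows; the `print` call stands in for whatever preset prompt mode (voice, text, vibration) the system actually uses.

```python
NEGATIVE_EMOTIONS = {"fear", "anger", "sadness", "disgust"}

def should_prompt(emotion_list, preset_probability=0.8):
    """emotion_list holds one facial emotion per video segment recorded in
    the preset time period; prompt when the negative-emotion share exceeds
    the preset probability value."""
    share = sum(e in NEGATIVE_EMOTIONS for e in emotion_list) / len(emotion_list)
    return share > preset_probability

if should_prompt(["anger"] * 99 + ["neutral"]):  # 99% negative in the window
    print("Your current mood is low, please pay attention to driving safely.")
```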
  • the facial emotion recognition method in this embodiment can accurately recognize facial emotions.
  • An embodiment of the present application further provides a facial emotion recognition device, which is configured to execute any one of the foregoing facial emotion recognition methods.
  • FIG. 10 is a schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition device 300 may be installed in a device such as a car or a mobile phone.
  • the facial emotion recognition device 300 includes an acquisition unit 301, a transformation unit 302, a distance calculation unit 303, a distance judgment unit 304, a key frame acquisition unit 305, an emotion recognition unit 306 and an emotion acquisition unit 307.
  • the obtaining unit 301 is configured to obtain a video image collected in real time.
  • FIG. 11 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition apparatus 300 further includes a storage unit 308.
  • the obtaining unit 301 is further configured to obtain a neutral expression image.
  • the transform unit 302 is further configured to perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
  • the storage unit 308 is configured to store the neutral expression image and the standard energy feature vector.
  • FIG. 12 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition apparatus 300 further includes an emotion model training unit 309.
  • the obtaining unit 301 is further configured to obtain an emotional training sample image set, where the emotional training sample image set includes a plurality of emotional training sample images and an emotional label of a human face in the emotional training sample image.
  • An emotional model training unit 309 is configured to input the emotional training sample image and a corresponding emotional label into a convolutional neural network model and perform machine learning to obtain an emotional recognition model, and store the emotional recognition model.
  • FIG. 13 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition device 300 further includes an extraction unit 310, a face determination unit 311, and a prompting unit 312.
  • the obtaining unit 301 is further configured to obtain a calibration video image collected in real time.
  • the extraction unit 310 is configured to extract an image of a preset number of frames from a plurality of frames of the calibration video image according to a preset rule as a calibration image.
  • the face judging unit 311 is configured to determine whether face information exists in the calibration image in each frame based on a face detection and recognition model stored in advance.
  • the obtaining unit 301 is further configured to obtain real-time collected video images if face information exists in the calibration image in each frame.
  • a prompting unit 312 is configured to issue a prompt message if face information does not exist in at least one frame of the calibration image, so that the user can adjust the camera angle according to the prompt information; after the camera angle is adjusted, the obtaining unit 301 returns to performing the step of acquiring a calibration video image collected in real time, until face information exists in each frame of the calibration image.
  • FIG. 14 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition apparatus 300 further includes a vector acquisition unit 313 and a face model training unit 314.
  • the obtaining unit 301 is further configured to obtain a training sample image set, where the training sample image set includes a plurality of training sample images and a face label used to characterize whether there is face information in the training sample image.
  • a vector obtaining unit 313 is configured to obtain a face Haar feature vector of each training sample image.
  • a face model training unit 314 is configured to input the face Haar feature vector and the face label corresponding to each training sample image into an AdaBoost boosting model based on a decision tree model for training, to obtain a face detection and recognition model, and to store the face detection and recognition model.
  • a transformation unit 302 is configured to perform wavelet transformation on all frame images in the video image to obtain corresponding energy feature vectors.
  • the distance calculation unit 303 is configured to obtain a standard energy feature vector, and calculate a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method.
  • the distance judging unit 304 is configured to judge whether there is an Euclidean distance value among the plurality of Euclidean distance values that exceeds a preset threshold.
  • a key frame obtaining unit 305 is configured to use, as key frame images, the images whose energy feature vectors have Euclidean distance values exceeding the preset threshold, if such Euclidean distance values exist among the plurality of Euclidean distance values.
  • the key frame obtaining unit 305 is further configured to, if there is no Euclidean distance value exceeding the preset threshold value among the plurality of Euclidean distance values, convert the neutral expression image corresponding to the standard energy feature vector As the key frame image.
  • the emotion recognition unit 306 is configured to obtain a pre-stored emotion recognition model, and recognize a facial emotion in each of the key frame images based on the emotion recognition model.
  • the emotion recognition unit 306 is specifically configured to: sequentially input each of the key frame images as an input value into the emotion recognition model; obtain the probability values of each key frame image output by the emotion recognition model on the multiple preset emotions; and use the emotion corresponding to the largest probability value among the multiple probability values corresponding to each key frame image as the facial emotion in that key frame image.
  • the emotion acquiring unit 307 is configured to acquire the facial emotion corresponding to the video image according to the facial emotions in all the key frame images to complete the recognition of the facial emotion.
  • the emotion acquiring unit 307 is specifically configured to: perform probability statistics on the facial emotions in all the key frame images; and use the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, to complete the recognition of the facial emotion.
  • FIG. 15 is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application.
  • the facial emotion recognition device 300 further includes a recording unit 315, a statistics unit 316, a probability judgment unit 317, and an information prompting unit 318.
  • the recording unit 315 is configured to record a time period corresponding to the video image and a facial emotion corresponding to the video image into an emotion list.
  • the statistics unit 316 is configured to count, according to the emotion list, a probability of a facial emotion belonging to a preset emotional category among facial emotions corresponding to all the video images in a preset time period.
  • the probability judging unit 317 is configured to judge whether a probability of a facial emotion belonging to a preset emotion category exceeds a preset probability value.
  • an information prompting unit 318 is configured to obtain a preset prompt mode and preset prompt information if the probability of a facial emotion belonging to the preset emotion category exceeds the preset probability value, and to prompt the user with the preset prompt information according to the preset prompt mode.
  • the facial emotion recognition device 300 in this embodiment can accurately recognize facial emotions.
  • the above-mentioned facial emotion recognition device can be implemented in the form of a computer program, which can be run on a computer device as shown in FIG. 16.
  • FIG. 16 is a schematic block diagram of a computer device according to an embodiment of the present application.
  • the computer device 500 may be a terminal such as a mobile phone, or may be a device used in a car.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501.
  • the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
  • the computer program 5032 includes program instructions. When the program instructions are executed, the processor 502 can execute a method for facial emotion recognition.
  • the processor 502 is used to provide computing and control capabilities to support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for running a computer program 5032 in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, the processor 502 can execute a method for facial emotion recognition.
  • the network interface 505 is used for network communication, such as sending assigned tasks.
  • FIG. 16 is only a block diagram of the part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
  • the specific computer equipment 500 may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
  • the processor 502 is configured to run a computer program 5032 stored in the memory to achieve the following functions: acquiring video images collected in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if so, taking the images whose energy feature vectors have Euclidean distance values exceeding the preset threshold as key frame images, where the number of key frame images is at least one; acquiring a pre-stored emotion recognition model, and recognizing the facial emotion in each of the key frame images based on the emotion recognition model; and acquiring the facial emotion corresponding to the video images according to the facial emotions in all the key frame images, to complete the recognition of the facial emotion.
  • before executing the acquisition of the video images collected in real time, the processor 502 also implements the following functions: acquiring a neutral expression image; performing wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
  • the processor 502 also implements the following function before acquiring video images acquired in real time: acquiring an emotional training sample image set, wherein the emotional training sample image set includes a plurality of emotional training sample images and the Emotion labels of human faces in the emotion training sample image; and inputting the emotion training sample images and corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
  • the processor 502 also implements the following functions before acquiring the video images collected in real time: acquiring a training sample image set, where the training sample image set includes multiple training sample images and face labels used to characterize whether face information exists in the training sample images; obtaining the face Haar feature vector of each training sample image; inputting the face Haar feature vector and the face label corresponding to each training sample image into an AdaBoost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
  • before executing the acquisition of the video images collected in real time, the processor 502 also implements the following functions: acquiring a calibration video image collected in real time; extracting a preset number of frames from the multiple frames of the calibration video image as calibration images according to a preset rule; determining, based on a pre-stored face detection and recognition model, whether face information exists in each frame of the calibration image; if face information exists in each frame, performing the step of acquiring the video images collected in real time; and if face information does not exist in at least one frame, issuing a prompt message so that the user adjusts the camera angle according to the prompt information, and returning to the step of acquiring a calibration video image collected in real time.
  • when recognizing the facial emotion in each of the key frame images based on the emotion recognition model, the processor 502 specifically implements the following functions: sequentially inputting each of the key frame images as an input value into the emotion recognition model; obtaining the probability values of each key frame image output by the emotion recognition model on the multiple preset emotions; and using the emotion corresponding to the largest probability value among the multiple probability values corresponding to each key frame image as the facial emotion in that key frame image.
  • when acquiring the facial emotion corresponding to the video image according to the facial emotions in all the key frame images to complete the recognition of the facial emotion, the processor 502 specifically implements the following functions: performing probability statistics on the facial emotions in all the key frame images; and using the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, to complete the recognition of the facial emotion.
  • the processor 502 may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • a general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • a person of ordinary skill in the art can understand that all or part of the processes in the embodiment of the method for recognizing facial emotions described above can be performed by a computer program instructing related hardware.
  • the computer program may be stored in a computer-readable storage medium.
  • the computer program is executed by at least one processor in the computer system to implement the process steps of the embodiment including the facial emotion recognition method as described above.
  • the storage medium may be various media that can store program codes, such as a U disk, a mobile hard disk, a read-only memory (ROM, Read-Only Memory), a magnetic disk, or an optical disk.
  • each unit is only a logical function division, and there may be another division manner in actual implementation.
  • the steps in the method of the embodiment of the present application can be adjusted, combined, and deleted according to actual needs.
  • the units in the apparatus of the embodiment of the present application may be combined, divided, and deleted according to actual needs.
  • Each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • the above integrated unit may be implemented in the form of hardware or in the form of software functional unit.
  • the integrated unit When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a storage medium.
  • the technical solution of this application, in essence or in the part that contributes to the existing technology, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes instructions for causing a computer device (which may be a personal computer, a terminal, a network device, or the like) to perform all or part of the steps of the method described in the embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a human face emotion identification method, a device, a computer device and a storage medium. The method comprises: obtaining an energy feature vector of each frame of a video image; calculating a Euclidean distance value between each energy feature vector and a standard energy feature vector; screening out key frame images according to the Euclidean distance values; identifying the human face emotion in each key frame image; and obtaining, according to the human face emotions in all key frame images, a human face emotion corresponding to the video image to complete identification of the human face emotion.

Description

Facial emotion recognition method and device, computer device and storage medium

This application claims priority to the Chinese patent application filed with the Chinese Patent Office on August 7, 2018, with application number 201810892915.6 and the invention title "Facial emotion recognition method, device, computer equipment and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field

The present application relates to the field of computer technology, and in particular, to a facial emotion recognition method, a device, a computer device, and a storage medium.

Background Art

In people's daily life, 7% of information is conveyed through language, 38% through voice, and as much as 55% through facial expressions. Facial expressions are thus an important carrier of human communication and an important form of non-verbal communication; they can express human emotional states well.

In general, human emotions affect human behavior to a certain extent. For example, when a driver is in negative emotions such as anger, sadness, or anxiety, it is easy to ignore the surrounding road conditions and respond more slowly to emergencies, resulting in a higher incidence of traffic accidents. Based on this, the behavior of drivers and other personnel can be guided by recognizing facial emotions. For example, if a driver is recognized as being in a negative emotional state, the driver can be prompted to adjust his or her emotional state to avoid a traffic accident. Therefore, how to accurately recognize facial emotions has become an urgent technical problem.
Summary of the Invention

The present application provides a facial emotion recognition method, device, computer device, and storage medium, so as to accurately recognize facial emotions.

In a first aspect, the present application provides a facial emotion recognition method, which includes: acquiring video images collected in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference calculation method; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if so, taking the images whose energy feature vectors have Euclidean distance values exceeding the preset threshold as key frame images, where the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing the facial emotion in each of the key frame images based on the emotion recognition model; and obtaining the facial emotion corresponding to the video images according to the facial emotions in all the key frame images, to complete the recognition of the facial emotion.

In a second aspect, the present application provides a facial emotion recognition device, which includes: an acquiring unit for acquiring video images collected in real time; a transform unit for performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; a distance calculation unit configured to obtain a standard energy feature vector and calculate a Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference calculation method; a distance judgment unit for determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; a key frame acquisition unit for taking, if any Euclidean distance value exceeds the preset threshold, the images whose energy feature vectors have Euclidean distance values exceeding the preset threshold as key frame images, where the number of key frame images is at least one; an emotion recognition unit configured to obtain a pre-stored emotion recognition model and recognize the facial emotion in each of the key frame images based on the emotion recognition model; and an emotion acquiring unit for obtaining the facial emotion corresponding to the video images according to the facial emotions in all the key frame images, to complete the recognition of the facial emotion.

In a third aspect, the present application further provides a computer device including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the facial emotion recognition method provided in the first aspect.

In a fourth aspect, the present application also provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the facial emotion recognition method provided in the first aspect.

The present application provides a facial emotion recognition method, device, computer device, and storage medium. The method can accurately recognize facial emotions.
Brief Description of the Drawings

In order to explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description are some embodiments of the present application; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.

FIG. 1 is a schematic flowchart of a facial emotion recognition method according to an embodiment of the present application;

FIGS. 2 to 6 are further schematic flowcharts of a facial emotion recognition method according to an embodiment of the present application;

FIGS. 7 and 8 are specific schematic flowcharts of a facial emotion recognition method according to an embodiment of the present application;

FIG. 9 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application;

FIG. 10 is a schematic block diagram of a facial emotion recognition device according to an embodiment of the present application;

FIGS. 11 to 15 are further schematic block diagrams of a facial emotion recognition device according to an embodiment of the present application;

FIG. 16 is a schematic block diagram of a computer device according to an embodiment of the present application.
具体实施方式detailed description
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例是本申请一部分实施例,而不是全部的实施例。基于本申请中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本申请保护的范围。In the following, the technical solutions in the embodiments of the present application will be clearly and completely described with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the protection scope of the present application.
请参阅图1,图1是本申请实施例提供的一种人脸情绪识别方法的示意流程图。该人脸情绪识别方法可以应用于人脸情绪识别系统,该人脸情绪识别系统可以安装于手机、汽车等具备摄像功能的设备中。该人脸情绪识别系统可以作为独立的系统存在于设备中,也可以嵌入至设备的其他系统中。譬如,人脸情绪识别系统可内嵌至车驾驶系统中,以识别司机的情绪。又譬如,该人脸情绪识别系统可内嵌至手机的某个应用程序中,以辅助该应用程序实现人脸情绪识别功能等。如图1所示,该人脸情绪识别方法包括步骤S101~S107。Please refer to FIG. 1, which is a schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. The facial emotion recognition method can be applied to a facial emotion recognition system, and the facial emotion recognition system can be installed in a device with a camera function such as a mobile phone or a car. The facial emotion recognition system can exist in the device as an independent system, or it can be embedded in other systems of the device. For example, a facial emotion recognition system can be embedded in a car driving system to identify the driver's emotions. For another example, the facial emotion recognition system may be embedded in an application program of a mobile phone to assist the application program to realize a facial emotion recognition function. As shown in FIG. 1, the facial emotion recognition method includes steps S101 to S107.
S101、获取实时采集的视频图像。S101. Obtain a video image collected in real time.
当用户开启人脸情绪识别系统以进行人脸情绪识别时,该人脸情绪识别系统所在的设备调用摄像头以对用户进行实时的图像采集。该设备通过摄像头获取实时采集的一定时间段内的视频图像。譬如,获取实时采集的10秒内的视频 图像。可以理解的是,该视频图像将包括多帧图像。When the user turns on the facial emotion recognition system to perform facial emotion recognition, the device where the facial emotion recognition system is located invokes a camera to perform real-time image acquisition of the user. The device acquires video images collected within a certain period of time through a camera. For example, capture video images within 10 seconds of real-time capture. It can be understood that the video image will include multiple frames of images.
由于该人脸情绪识别方法在进行人脸情绪识别时,需要使用到中性表情图像、标准能量特征向量、情绪识别模型等信息,因此,在用户使用该人脸情绪识别系统进行人脸情绪识别之前,即,在步骤S101之前,人脸情绪识别系统还需执行以下操作:Since the facial emotion recognition method needs to use information such as a neutral expression image, a standard energy feature vector, and an emotion recognition model when performing facial emotion recognition, the user uses the facial emotion recognition system for facial emotion recognition. Before, that is, before step S101, the facial emotion recognition system also needs to perform the following operations:
在一实施例中,如图2所示,图2为本申请实施例提供的一种人脸情绪识别方法的另一示意流程图。在步骤S101之前,还包括步骤S101a、S101b和S101c。In an embodiment, as shown in FIG. 2, FIG. 2 is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application. Prior to step S101, steps S101a, S101b, and S101c are also included.
S101a、获取中性表情图像。S101a. Acquire a neutral expression image.
S101b、对所述中性表情图像进行小波变换以得到对应的标准能量特征向量。S101b: Perform wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
S101c、存储所述中性表情图像和标准能量特征向量。S101c. Store the neutral expression image and a standard energy feature vector.
In the embodiment shown in FIG. 2, the neutral expression image and the standard energy feature vector need to be prepared before facial emotion recognition is performed. The neutral expression may be the user's facial expression in a relatively calm emotional state; for example, the expression typically adopted when taking an ID photo can be understood as a neutral expression.
When the user uses the facial emotion recognition system for the first time, the device may issue a voice prompt, a text prompt or the like asking the user to assume a neutral expression; once the user has done so, the camera captures an image of the user's neutral expression to obtain the neutral expression image.
Of course, the neutral expression image may also be obtained in other ways. For example, when the user uses the system for the first time, a neutral expression image input by the user, such as an ID photo, may be acquired; that is, the user transfers an image of a neutral expression taken previously to the device where the system is located. As another example, when the user uses the system for the first time, the identity information input by the user may be acquired, and the ID photo corresponding to that identity information is then obtained from a back-end server as the neutral expression image, where the back-end server may be that of the in-vehicle system, of a mobile phone application, of the facial emotion recognition system itself, and so on. The back-end server may store the ID photos corresponding to users' identity information, or, after receiving the identity information, may call a third-party server or retrieve the corresponding ID photo from network data through technologies such as web crawlers. The manner of obtaining the neutral expression image is not limited here.
After the neutral expression image is obtained, a wavelet transform, such as a Gabor wavelet transform, is applied to it to obtain the corresponding standard energy feature vector, and the neutral expression image and the corresponding standard energy feature vector are both stored, so that when the user later performs facial emotion recognition with the system, they can be called directly for the recognition.
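The embodiment specifies only that a Gabor wavelet transform is applied; as one hedged reading, the energy feature vector could collect one energy value per Gabor sub-band, with the filter-bank size and parameter values below being illustrative assumptions:

```python
# Sketch: Gabor filter bank whose per-band mean response magnitude forms the
# "energy feature vector"; scales, orientations and wavelength are assumed.
import cv2
import numpy as np

def gabor_energy_vector(gray, ksize=31, sigmas=(2.0, 4.0, 8.0), n_orient=6):
    img = gray.astype(np.float32)
    energies = []
    for sigma in sigmas:
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            # args: kernel size, sigma, orientation, wavelength, aspect ratio
            kernel = cv2.getGaborKernel((ksize, ksize), sigma, theta, 10.0, 0.5)
            response = cv2.filter2D(img, cv2.CV_32F, kernel)
            energies.append(float(np.mean(np.abs(response))))
    return np.asarray(energies, dtype=np.float32)
```

The same helper can produce both the standard energy feature vector of the neutral expression image and, later, the per-frame vectors of step S102.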
In an embodiment, as shown in FIG. 3, which is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, steps S101d and S101e are further included before step S101.
S101d. Acquire an emotion training sample image set, where the set includes multiple emotion training sample images and an emotion label of the face in each emotion training sample image.
S101e. Input the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and store the emotion recognition model.
In the embodiment shown in FIG. 3, the emotion recognition model needs to be prepared before facial emotion recognition is performed. Specifically, the facial emotion recognition system acquires an emotion training sample image set comprising a large number of emotion training sample images and, for each, an emotion label of the face it contains. It should be noted that the emotion labels may be assigned manually or by other methods, which is not specifically limited here.
After the emotion training sample image set is obtained, the sample images and the corresponding face emotion labels are input into a convolutional neural network (CNN) model for machine learning, thereby obtaining the emotion recognition model, which is then stored in the device where the facial emotion recognition system is located, so that it can be called for emotion recognition in subsequent use of the system.
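The description specifies only "a convolutional neural network model"; as a hedged sketch of this training step, the following Keras code assumes 48x48 grayscale face crops and the seven preset emotions listed further below, with `train_images` and `train_labels` standing in for the labelled sample set:

```python
# Sketch of S101e: train and store a small CNN emotion classifier.
import tensorflow as tf

def build_emotion_model(input_shape=(48, 48, 1), n_classes=7):
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

model = build_emotion_model()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# train_images, train_labels: the labelled emotion training sample set (assumed)
# model.fit(train_images, train_labels, epochs=20)
# model.save("emotion_model.h5")   # stored for later recognition calls
```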
In an embodiment, when the user activates the facial emotion recognition system, the camera of the device may fail to capture the user properly. For example, if the camera angle is wrong, the video images collected in real time may contain no facial information at all, or only half of the user's face; such video images would inevitably reduce the accuracy of subsequent facial emotion recognition. Therefore, to ensure that good facial video images can be captured and to improve the accuracy of subsequent recognition, the camera needs to be calibrated before the real-time video images are acquired.
Specifically, as shown in FIG. 4, which is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, steps S101f, S101g, S101h and S101j are further included before step S101.
S101f. Acquire a calibration video image collected in real time.
S101g. Extract a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images.
S101h. Based on a pre-stored face detection and recognition model, determine whether face information exists in every frame of the calibration images.
S101j. If face information is absent from at least one frame of the calibration images, issue prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the camera angle is adjusted, return to step S101f until face information exists in every frame of the calibration images.
In the embodiment shown in FIG. 4, the facial emotion recognition system acquires a segment of calibration video collected in real time; it can be understood that this calibration video comprises multiple frames. A preset number of frames are then extracted from it according to a preset extraction rule as calibration images.
In an embodiment, the preset extraction rule may be to extract one image per second as a calibration image, and the preset number of frames may be set to 100. Both can be set according to actual requirements, and the extraction rule is not limited to the one above. After the calibration images are obtained, a pre-stored face detection and recognition model is retrieved, which is used to identify whether face information exists in each calibration image.
If the face detection and recognition model determines that face information exists in every calibration image, the current camera angle is good and suitable facial video images can be captured; at this point, step S101, that is, acquiring the video image collected in real time, can be performed.
If face information is absent from at least one calibration frame, the current camera angle is unsuitable and needs adjustment. In this case, prompt information may be issued by voice, on a display, or in another way, so that the user readjusts the camera angle according to the prompt; after the adjustment, the process returns to step S101f, that is, acquiring a calibration video image collected in real time, until face information exists in every calibration frame, thereby completing the angle calibration of the camera.
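A minimal sketch of this calibration loop is given below, with OpenCV's bundled frontal-face Haar cascade standing in for the pre-stored face detection and recognition model described in connection with FIG. 5; the one-image-per-second rule and the 100-frame budget follow the example above:

```python
# Sketch of S101f-S101j: sample calibration images and check each for a face.
import cv2

face_model = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def calibration_ok(source=0, max_frames=100):
    cap = cv2.VideoCapture(source)
    fps = int(cap.get(cv2.CAP_PROP_FPS)) or 25   # fall back if FPS is unknown
    ok, index, taken = True, 0, 0
    while taken < max_frames:
        ret, frame = cap.read()
        if not ret:
            break
        if index % fps == 0:                      # one calibration image per second
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = face_model.detectMultiScale(gray, 1.1, 5)
            taken += 1
            if len(faces) == 0:                   # a faceless frame fails the check
                ok = False
                break
        index += 1
    cap.release()
    return ok   # False: prompt the user to adjust the camera, then retry
```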
Since the face detection and recognition model is needed when the camera angle is calibrated, it must be generated in advance and stored in the device where the facial emotion recognition system is located. In an embodiment, as shown in FIG. 5, which is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, steps S101k, S101m, S101n and S101p are further included before step S101.
S101k. Acquire a training sample image set, where the set includes multiple training sample images and a face label characterizing whether face information exists in each training sample image.
S101m. Acquire the face Haar feature vector of each training sample image.
S101n. Input the face Haar feature vectors and face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model.
S101p. Store the face detection and recognition model.
In the embodiment shown in FIG. 5, the face detection and recognition model is prepared before facial emotion recognition is performed, so that it can be used during camera angle calibration. Specifically, a training sample image set is first acquired, containing multiple training sample images and, for each, a face label characterizing whether face information exists in that sample image. Haar feature extraction is then performed on each training sample image to obtain its face Haar feature vector. The Haar feature vector and face label corresponding to each training sample image are input into an Adaboost boosting model based on decision trees for training, which yields the face detection and recognition model. Finally, the model is stored in the device where the facial emotion recognition system is located.
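Assuming the Haar feature vectors have already been extracted into an (n_samples, n_features) array, the training step might be sketched with scikit-learn, whose AdaBoost classifier uses depth-1 decision trees as its default weak learner, matching the decision-tree-based boosting model described here; `haar_features` and `labels` are placeholders for the prepared training data:

```python
# Sketch of S101k-S101p: train and store the face detection model.
import joblib
from sklearn.ensemble import AdaBoostClassifier

def train_face_detector(haar_features, labels):
    # default weak learner: a depth-1 decision tree (decision stump)
    model = AdaBoostClassifier(n_estimators=200)
    model.fit(haar_features, labels)   # labels: 1 = face present, 0 = absent
    return model

# detector = train_face_detector(haar_features, labels)
# joblib.dump(detector, "face_detector.joblib")   # stored for calibration
```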
S102. Perform a wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors.
After the video image is obtained in step S101, a wavelet transform is applied to all of its frames to obtain the energy feature vector corresponding to each frame. In an embodiment, the wavelet transform may be, for example, a Gabor wavelet transform; of course, other wavelet transforms may also be used, which is not limited here.
S103. Acquire a standard energy feature vector, and calculate the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method.
The standard energy feature vector is the energy feature vector obtained by wavelet-transforming the user's pre-collected neutral expression image. In this embodiment, it is stored in advance in the device where the facial emotion recognition system is located; since the device pre-stores the vector, acquiring the standard energy feature vector specifically means acquiring the pre-stored standard energy feature vector.
After the standard energy feature vector is obtained, the Euclidean distance value between it and each of the energy feature vectors from step S102 is calculated according to the image difference operation method.
It should be noted that, because the standard energy feature vector is pre-stored in the device in this embodiment, it can be called directly in step S103, which reduces the occupation of the device's CPU resources and shortens the calculation time. Of course, in other embodiments, the device may pre-store only the neutral expression image; in that case, when the standard energy feature vector is acquired in step S103, the pre-stored neutral expression image is first retrieved and then wavelet-transformed to obtain the standard energy feature vector. The time at which the standard energy feature vector is calculated is not limited here.
S104. Determine whether a Euclidean distance value exceeding a preset threshold exists among the multiple Euclidean distance values.
After the Euclidean distance value between each energy feature vector and the standard energy feature vector is calculated in step S103, multiple Euclidean distance values are obtained, and it is then determined whether any of them exceeds the preset threshold. If such a value exists, the difference between the facial expression in the video image and the neutral expression is large, and step S105 is executed.
In an embodiment, as shown in FIG. 6, which is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, when step S104 determines that no Euclidean distance value exceeds the preset threshold, the difference between the facial expression in the current video image and the neutral expression is small. In that case, step S108 is executed, that is, the neutral expression image corresponding to the standard energy feature vector is taken as the key frame image, after which the subsequent steps S106, S107 and so on are performed. Of course, in other embodiments, if step S104 determines that no Euclidean distance value exceeds the preset threshold, the facial emotion corresponding to the video image may instead be set directly to a neutral emotion, thereby completing the recognition of the facial emotion.
S105. If Euclidean distance values exceeding the preset threshold exist among the multiple Euclidean distance values, take the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one.
In this embodiment, there may be one, two or more Euclidean distance values exceeding the preset threshold, in which case the number of key frame images is correspondingly at least one.
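Steps S102 to S105 can be summarized in one hedged sketch, reusing the `gabor_energy_vector` helper sketched earlier; the threshold value is an assumption to be tuned in practice:

```python
# Sketch of S102-S105: per-frame energy vectors, Euclidean distances to the
# standard vector, and selection of above-threshold frames as key frames.
import cv2
import numpy as np

def select_key_frames(frames, standard_vector, threshold):
    key_frames = []
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        vec = gabor_energy_vector(gray)                    # see earlier sketch
        distance = np.linalg.norm(vec - standard_vector)   # Euclidean distance
        if distance > threshold:
            key_frames.append(frame)
    return key_frames   # may be empty if every frame stays close to neutral
```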
S106. Acquire the pre-stored emotion recognition model, and recognize the facial emotion in each key frame image based on the emotion recognition model.
In this embodiment, the emotion recognition model is a model for recognizing facial emotions obtained through prior machine learning training, for example a convolutional neural network model. The device where the facial emotion recognition system is located first retrieves the model and then inputs each key frame image into it as an input value; the model performs emotion recognition on the key frame images and outputs the facial emotion in each of them.
Specifically, in an embodiment, as shown in FIG. 7, which is a specific schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, step S106 includes steps S1061 to S1063.
S1061. Sequentially input each key frame image as an input value into the emotion recognition model.
S1062. Acquire the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions.
S1063. Take the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
In the embodiment shown in FIG. 7, each key frame image is sequentially input into the emotion recognition model as an input value, and the model outputs the probability values of each key frame image over multiple preset emotions. For example, the preset emotions may be seven: fear, anger, sadness, disgust, happiness, surprise and neutral. The model identifies, for the face in each key frame image, the probability over these seven preset emotions; for instance, it may output probabilities of 10%, 70%, 15%, 5%, 0%, 0% and 0% respectively for a certain key frame image.
Then, the emotion corresponding to the largest of the probability values of each key frame image is taken as the facial emotion in that key frame image; in the example above, anger, with the largest probability value of 70%, is taken as the facial emotion of that key frame image.
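Continuing the hedged Keras sketch from above (the stored model is assumed to output one softmax probability per preset emotion), steps S1061 to S1063 reduce to an argmax over the model output:

```python
# Sketch of S1061-S1063: pick the preset emotion with the largest probability.
import numpy as np

EMOTIONS = ["fear", "anger", "sadness", "disgust",
            "happiness", "surprise", "neutral"]   # the seven preset emotions

def recognize_key_frame(model, key_frame):
    # key_frame: preprocessed input, e.g. a 48x48x1 array in the sketch above
    probs = model.predict(key_frame[np.newaxis, ...])[0]
    return EMOTIONS[int(np.argmax(probs))]
```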
S107. Acquire the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
Specifically, in an embodiment, as shown in FIG. 8, which is a specific schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, step S107 includes steps S1071 to S1072.
S1071. Perform probability statistics on the facial emotions in all the key frame images.
S1072. Take the facial emotion with the largest probability of occurrence as the facial emotion corresponding to the video image, so as to complete the recognition of the facial emotion.
For example, if there are 10 key frame images and the emotion recognition model identifies anger in eight of them, disgust in one and fear in one, probability statistics over the facial emotions of the 10 key frame images show that anger occurs with a probability of 80%, disgust with 10% and fear with 10%. Anger, the emotion with the largest probability of occurrence, is then taken as the facial emotion corresponding to the whole video image, thereby completing the recognition of the facial emotion within the time period to which the video image corresponds.
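As this example shows, step S107 amounts to simple frequency statistics; one minimal sketch:

```python
# Sketch of S1071-S1072: majority vote over the per-key-frame emotions.
from collections import Counter

def video_emotion(key_frame_emotions):
    # e.g. 8x "anger", 1x "disgust", 1x "fear" -> "anger" (80%)
    return Counter(key_frame_emotions).most_common(1)[0][0]
```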
In an embodiment, as shown in FIG. 9, which is another schematic flowchart of a facial emotion recognition method according to an embodiment of the present application, steps S109 to S112 are further included after step S107.
S109. Record the time period corresponding to the video image and the facial emotion corresponding to the video image into an emotion list.
S110. According to the emotion list, count the probability that the facial emotions corresponding to all the video images within a preset time period belong to a preset emotion class.
S111. Determine whether the probability of the facial emotions belonging to the preset emotion class exceeds a preset probability value.
S112. If the probability of the facial emotions belonging to the preset emotion class exceeds the preset probability value, acquire a preset prompt mode and preset prompt information, and present the preset prompt information to the user according to the preset prompt mode.
For example, suppose the preset time period is 2 minutes and the preset emotion class is a negative emotion class comprising four facial emotions: fear, anger, sadness and disgust. If the emotion list contains 100 video images within those 2 minutes, there are 100 facial emotions; the proportion of them belonging to the negative emotion class is then counted, say 99%. When this proportion exceeds the preset probability value of 80%, the user has been in a negative emotional state throughout these 2 minutes, so the preset prompt mode and preset prompt information are acquired and the information is presented to the user accordingly. The preset prompt mode may be, for example, a voice prompt, a text display, or a combination of voice prompt and vibration; the preset prompt information may be, for example, "Your current mood is low; please drive safely."
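A hedged sketch of this monitoring logic follows, with the 2-minute window, the four negative emotions and the 80% trigger taken from the example above; the emotion-list layout and the prompt text are assumptions:

```python
# Sketch of S109-S112: alert when recent emotions are predominantly negative.
NEGATIVE = {"fear", "anger", "sadness", "disgust"}

def check_emotion_list(emotion_list, window_s=120, trigger=0.8):
    # emotion_list: [(timestamp_seconds, emotion), ...] in time order
    if not emotion_list:
        return
    latest = emotion_list[-1][0]
    recent = [e for t, e in emotion_list if latest - t <= window_s]
    if sum(e in NEGATIVE for e in recent) / len(recent) > trigger:
        print("Your current mood is low; please drive safely.")  # preset prompt
```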
The facial emotion recognition method in this embodiment can accurately recognize facial emotions.
An embodiment of the present application further provides a facial emotion recognition device, configured to execute any one of the foregoing facial emotion recognition methods. Specifically, please refer to FIG. 10, which is a schematic block diagram of a facial emotion recognition device according to an embodiment of the present application. The facial emotion recognition device 300 may be installed in a device such as a car or a mobile phone.
As shown in FIG. 10, the facial emotion recognition device 300 includes an acquisition unit 301, a transformation unit 302, a distance calculation unit 303, a distance judgment unit 304, a key frame acquisition unit 305, an emotion recognition unit 306 and an emotion acquisition unit 307.
The acquisition unit 301 is configured to acquire a video image collected in real time.
In an embodiment, as shown in FIG. 11, which is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application, the facial emotion recognition device 300 further includes a storage unit 308.
The acquisition unit 301 is further configured to acquire a neutral expression image.
The transformation unit 302 is further configured to perform a wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector.
The storage unit 308 is configured to store the neutral expression image and the standard energy feature vector.
In an embodiment, as shown in FIG. 12, which is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application, the facial emotion recognition device 300 further includes an emotion model training unit 309.
The acquisition unit 301 is further configured to acquire an emotion training sample image set, where the set includes multiple emotion training sample images and an emotion label of the face in each emotion training sample image.
The emotion model training unit 309 is configured to input the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and to store the emotion recognition model.
In an embodiment, as shown in FIG. 13, which is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application, the facial emotion recognition device 300 further includes an extraction unit 310, a face judgment unit 311 and a prompting unit 312.
The acquisition unit 301 is further configured to acquire a calibration video image collected in real time.
The extraction unit 310 is configured to extract a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images.
The face judgment unit 311 is configured to determine, based on a pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images.
The acquisition unit 301 is further configured to acquire the video image collected in real time if face information exists in every frame of the calibration images.
The prompting unit 312 is configured to issue prompt information if face information is absent from at least one frame of the calibration images, so that the user adjusts the angle of the camera according to the prompt information; after the camera angle is adjusted, the acquisition unit 301 returns to the step of acquiring a calibration video image collected in real time, until face information exists in every frame of the calibration images.
Correspondingly, in an embodiment, as shown in FIG. 14, which is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application, the facial emotion recognition device 300 further includes a vector acquisition unit 313 and a face model training unit 314.
The acquisition unit 301 is further configured to acquire a training sample image set, where the set includes multiple training sample images and a face label characterizing whether face information exists in each training sample image.
The vector acquisition unit 313 is configured to acquire the face Haar feature vector of each training sample image.
The face model training unit 314 is configured to input the face Haar feature vectors and face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model, and to store the face detection and recognition model.
The transformation unit 302 is configured to perform a wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors.
The distance calculation unit 303 is configured to acquire the standard energy feature vector, and calculate the Euclidean distance value between each energy feature vector and the standard energy feature vector according to the image difference operation method.
The distance judgment unit 304 is configured to determine whether a Euclidean distance value exceeding a preset threshold exists among the multiple Euclidean distance values.
The key frame acquisition unit 305 is configured to, if Euclidean distance values exceeding the preset threshold exist among the multiple Euclidean distance values, take the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one.
In an embodiment, the key frame acquisition unit 305 is further configured to take the neutral expression image corresponding to the standard energy feature vector as the key frame image if no Euclidean distance value exceeding the preset threshold exists among the multiple Euclidean distance values.
The emotion recognition unit 306 is configured to acquire the pre-stored emotion recognition model, and recognize the facial emotion in each key frame image based on the emotion recognition model.
Specifically, in an embodiment, the emotion recognition unit 306 is configured to: sequentially input each key frame image as an input value into the emotion recognition model; acquire the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions; and take the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
The emotion acquisition unit 307 is configured to acquire the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
Specifically, in an embodiment, the emotion acquisition unit 307 is configured to: perform probability statistics on the facial emotions in all the key frame images; and take the facial emotion with the largest probability of occurrence as the facial emotion corresponding to the video image, so as to complete the recognition of the facial emotion.
In an embodiment, as shown in FIG. 15, which is another schematic block diagram of a facial emotion recognition device according to an embodiment of the present application, the facial emotion recognition device 300 further includes a recording unit 315, a statistics unit 316, a probability judgment unit 317 and an information prompting unit 318.
The recording unit 315 is configured to record the time period corresponding to the video image and the facial emotion corresponding to the video image into an emotion list.
The statistics unit 316 is configured to count, according to the emotion list, the probability that the facial emotions corresponding to all the video images within a preset time period belong to a preset emotion class.
The probability judgment unit 317 is configured to determine whether the probability of the facial emotions belonging to the preset emotion class exceeds a preset probability value.
The information prompting unit 318 is configured to acquire a preset prompt mode and preset prompt information if the probability of the facial emotions belonging to the preset emotion class exceeds the preset probability value, and to present the preset prompt information to the user according to the preset prompt mode.
It should be noted that a person skilled in the art can clearly understand that, for the specific implementation processes of the facial emotion recognition device 300 and its units, reference may be made to the corresponding descriptions in the foregoing embodiments of the facial emotion recognition method; for convenience and brevity of description, they are not repeated here.
The facial emotion recognition device 300 in this embodiment can accurately recognize facial emotions.
The above facial emotion recognition device may be implemented in the form of a computer program, which can run on a computer device as shown in FIG. 16.
Please refer to FIG. 16, which is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device 500 may be a terminal such as a mobile phone, or a device applied in a car.
Referring to FIG. 16, the computer device 500 includes a processor 502, a memory and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 may store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions which, when executed, cause the processor 502 to perform a facial emotion recognition method. The processor 502 is configured to provide computing and control capabilities to support the operation of the entire computer device 500. The internal memory 504 provides an environment for running the computer program 5032 in the non-volatile storage medium 503; when the computer program 5032 is executed by the processor 502, the processor 502 can perform a facial emotion recognition method. The network interface 505 is used for network communication, such as sending assigned tasks. A person skilled in the art can understand that the structure shown in FIG. 16 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the computer device 500 to which the solution is applied; a specific computer device 500 may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
The processor 502 is configured to run the computer program 5032 stored in the memory to implement the following functions: acquiring a video image collected in real time; performing a wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors; acquiring a standard energy feature vector, and calculating the Euclidean distance value between each energy feature vector and the standard energy feature vector according to an image difference operation method; determining whether a Euclidean distance value exceeding a preset threshold exists among the multiple Euclidean distance values; if Euclidean distance values exceeding the preset threshold exist among the multiple Euclidean distance values, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, where the number of key frame images is at least one; acquiring a pre-stored emotion recognition model, and recognizing the facial emotion in each key frame image based on the emotion recognition model; and acquiring the facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
In an embodiment, before acquiring the video image collected in real time, the processor 502 further implements the following functions: acquiring a neutral expression image; performing a wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
In an embodiment, before acquiring the video image collected in real time, the processor 502 further implements the following functions: acquiring an emotion training sample image set, where the set includes multiple emotion training sample images and an emotion label of the face in each emotion training sample image; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
In an embodiment, before acquiring the video image collected in real time, the processor 502 further implements the following functions: acquiring a training sample image set, where the set includes multiple training sample images and a face label characterizing whether face information exists in each training sample image; acquiring the face Haar feature vector of each training sample image; inputting the face Haar feature vectors and face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
In an embodiment, before acquiring the video image collected in real time, the processor 502 further implements the following functions: acquiring a calibration video image collected in real time; extracting a preset number of frames from the multiple frames of the calibration video image according to a preset rule as calibration images; determining, based on a pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images; if face information exists in every frame of the calibration images, performing the step of acquiring the video image collected in real time; and if face information is absent from at least one frame of the calibration images, issuing prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the camera angle is adjusted, returning to the step of acquiring a calibration video image collected in real time, until face information exists in every frame of the calibration images.
In an embodiment, when recognizing the facial emotion in each key frame image based on the emotion recognition model, the processor 502 specifically implements the following functions: sequentially inputting each key frame image as an input value into the emotion recognition model; acquiring the probability values, output by the emotion recognition model, of each key frame image over multiple preset emotions; and taking the emotion corresponding to the largest of the multiple probability values of each key frame image as the facial emotion in that key frame image.
In an embodiment, when acquiring the facial emotion corresponding to the video image according to the facial emotions in all the key frame images to complete the recognition of the facial emotion, the processor 502 specifically implements the following functions: performing probability statistics on the facial emotions in all the key frame images; and taking the facial emotion with the largest probability of occurrence as the facial emotion corresponding to the video image, so as to complete the recognition of the facial emotion.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit, and may also be another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
A person of ordinary skill in the art can understand that all or part of the processes in the above embodiments of the facial emotion recognition method may be implemented by a computer program instructing the related hardware. The computer program may be stored in a computer-readable storage medium and is executed by at least one processor in the computer system to implement the process steps of the embodiments of the facial emotion recognition methods described above.
The storage medium may be any of various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disc.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the division into units is only a division by logical function, and there may be other division manners in actual implementation. The steps in the methods of the embodiments of the present application may be reordered, combined and pruned according to actual needs, and the units in the devices of the embodiments may likewise be combined, divided and pruned according to actual needs. The functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes several instructions for causing a computer device (which may be a personal computer, a terminal, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present application.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any person familiar with the technical field can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

  1. A facial emotion recognition method, comprising:
    acquiring a video image collected in real time;
    performing a wavelet transform on all frame images in the video image to obtain corresponding energy feature vectors;
    acquiring a standard energy feature vector, and calculating a Euclidean distance value between each of the energy feature vectors and the standard energy feature vector according to an image difference operation method;
    determining whether a Euclidean distance value exceeding a preset threshold exists among the plurality of Euclidean distance values;
    if Euclidean distance values exceeding the preset threshold exist among the plurality of Euclidean distance values, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of the key frame images is at least one;
    acquiring a pre-stored emotion recognition model, and recognizing a facial emotion in each of the key frame images based on the emotion recognition model; and
    acquiring a facial emotion corresponding to the video image according to the facial emotions in all the key frame images, so as to complete the recognition of the facial emotion.
  2. The facial emotion recognition method according to claim 1, wherein before the acquiring a video image collected in real time, the method further comprises: acquiring a neutral expression image; performing a wavelet transform on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
  3. The facial emotion recognition method according to claim 1, wherein before the acquiring a video image collected in real time, the method further comprises: acquiring an emotion training sample image set, wherein the emotion training sample image set includes a plurality of emotion training sample images and an emotion label of the face in each emotion training sample image; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
  4. The facial emotion recognition method according to claim 1, wherein before the acquiring a video image collected in real time, the method further comprises: acquiring a training sample image set, wherein the training sample image set includes a plurality of training sample images and a face label characterizing whether face information exists in each training sample image; acquiring a face Haar feature vector of each training sample image; inputting the face Haar feature vectors and the face labels corresponding to the training sample images into an Adaboost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
  5. The facial emotion recognition method according to claim 4, wherein before the acquiring a video image collected in real time, the method further comprises: acquiring a calibration video image collected in real time; extracting a preset number of frames from the plurality of frames of the calibration video image according to a preset rule as calibration images; determining, based on a pre-stored face detection and recognition model, whether face information exists in every frame of the calibration images; if face information exists in every frame of the calibration images, performing the step of acquiring a video image collected in real time; and if face information is absent from at least one frame of the calibration images, issuing prompt information so that the user adjusts the angle of the camera according to the prompt information, and after the angle of the camera is adjusted, returning to the step of acquiring a calibration video image collected in real time, until face information exists in every frame of the calibration images.
  6. 根据权利要求1所述的人脸情绪识别方法,其中,所述基于所述情绪识别模型识别每个所述关键帧图像中的人脸情绪,包括:依次将每个所述关键帧图像作为输入值输入至所述情绪识别模型中;获取所述情绪识别模型输出的每个所述关键帧图像在多种预设情绪上的概率值;以及将每个所述关键帧图像对应的多个概率值中较大的概率值对应的情绪作为所述关键帧图像中的人脸情绪。The method for facial emotion recognition according to claim 1, wherein the identifying facial emotions in each of the key frame images based on the emotion recognition model comprises: sequentially using each of the key frame images as an input Inputting values into the emotion recognition model; obtaining probability values of each of the key frame images output by the emotion recognition model on a plurality of preset emotions; and a plurality of probabilities corresponding to each of the key frame images The emotion corresponding to the larger probability value among the values is used as the facial emotion in the key frame image.
7. The facial emotion recognition method according to claim 1, wherein the obtaining, from the facial emotions in all the key frame images, of the facial emotion corresponding to the video image to complete the facial emotion recognition comprises: performing probability statistics on the facial emotions in all the key frame images; and taking the facial emotion with the highest probability of occurrence as the facial emotion corresponding to the video image, to complete the facial emotion recognition.
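The video-level decision in claim 7 is a frequency statistic over the per-keyframe emotions; a simple majority vote, as sketched here, is one way to read it.

```python
# Claim 7 as a majority vote: the video emotion is the most frequent
# keyframe emotion.
from collections import Counter

def video_emotion(keyframe_emotions):
    """keyframe_emotions: e.g. ["happy", "happy", "neutral"]."""
    return Counter(keyframe_emotions).most_common(1)[0][0]
```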
8. The facial emotion recognition method according to claim 2, wherein after the determining of whether any of the plurality of Euclidean distance values exceeds the preset threshold, the method further comprises: if none of the plurality of Euclidean distance values exceeds the preset threshold, taking the neutral expression image corresponding to the standard energy feature vector as the key frame image, and returning to the step of obtaining the pre-stored emotion recognition model and recognizing, based on the emotion recognition model, the facial emotion in each of the key frame images.
9. The facial emotion recognition method according to claim 2, wherein after the determining of whether any of the plurality of Euclidean distance values exceeds the preset threshold, the method further comprises: if none of the plurality of Euclidean distance values exceeds the preset threshold, setting the facial emotion corresponding to the video image to a neutral emotion, to complete the facial emotion recognition.
10. The facial emotion recognition method according to claim 1, wherein after the obtaining, from the facial emotions in all the key frame images, of the facial emotion corresponding to the video image, the method further comprises: recording the time period corresponding to the video image and the facial emotion corresponding to the video image in an emotion list; counting, according to the emotion list, the probability that the facial emotions corresponding to all the video images within a preset time period belong to a preset emotion category; determining whether the probability of the facial emotions belonging to the preset emotion category exceeds a preset probability value; and if the probability of the facial emotions belonging to the preset emotion category exceeds the preset probability value, obtaining a preset prompt mode and preset prompt information, and presenting the preset prompt information to the user in the preset prompt mode.
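The monitoring step in claim 10 can be sketched as a sliding time window over a recorded emotion list; the ten-minute window, the negative-emotion category, and the 0.5 threshold are illustrative assumptions, not values from the application.

```python
# Illustrative monitoring logic for claim 10 (window, category and
# threshold are assumptions).
from datetime import datetime, timedelta

emotion_list = []  # entries: (timestamp of the video segment, emotion)

def record(emotion):
    emotion_list.append((datetime.now(), emotion))

def should_prompt(category=frozenset({"angry", "sad"}),
                  window_minutes=10, threshold=0.5):
    cutoff = datetime.now() - timedelta(minutes=window_minutes)
    recent = [e for t, e in emotion_list if t >= cutoff]
    if not recent:
        return False
    ratio = sum(e in category for e in recent) / len(recent)
    return ratio > threshold  # True -> show the preset prompt message
```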
11. A facial emotion recognition device, comprising:
    an acquisition unit, configured to acquire video images captured in real time;
    a transformation unit, configured to perform wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors;
    a distance calculation unit, configured to obtain a standard energy feature vector and to calculate, according to an image difference operation method, the Euclidean distance value between each of the energy feature vectors and the standard energy feature vector;
    a distance judgment unit, configured to determine whether any of the plurality of Euclidean distance values exceeds a preset threshold;
    a key frame acquisition unit, configured to take, if any of the plurality of Euclidean distance values exceeds the preset threshold, the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of key frame images is at least one;
    an emotion recognition unit, configured to obtain a pre-stored emotion recognition model and to recognize, based on the emotion recognition model, the facial emotion in each of the key frame images; and
    an emotion acquisition unit, configured to obtain, from the facial emotions in all the key frame images, the facial emotion corresponding to the video images, to complete the facial emotion recognition.
12. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps: acquiring video images captured in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating, according to an image difference operation method, the Euclidean distance value between each of the energy feature vectors and the standard energy feature vector; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if any of the plurality of Euclidean distance values exceeds the preset threshold, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing, based on the emotion recognition model, the facial emotion in each of the key frame images; and obtaining, from the facial emotions in all the key frame images, the facial emotion corresponding to the video images, to complete the facial emotion recognition.
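Claims 1, 12, and 17 all recite the same keyframe pipeline: a wavelet-derived energy feature vector per frame, compared against a standard vector by Euclidean distance. A sketch with PyWavelets follows; the Haar wavelet, the two-level decomposition, and the threshold are assumptions the claims leave open.

```python
# Hedged sketch of the keyframe selection recited in claims 1/12/17.
import numpy as np
import pywt

def energy_feature_vector(gray_frame, wavelet="haar", level=2):
    """One energy value per wavelet subband of the frame."""
    coeffs = pywt.wavedec2(gray_frame.astype(float), wavelet, level=level)
    bands = [coeffs[0]] + [b for triple in coeffs[1:] for b in triple]
    return np.array([np.sum(b ** 2) for b in bands])

def select_keyframes(frames, standard_vector, threshold):
    """Keep frames whose energy vector is far from the neutral standard."""
    keyframes = []
    for frame in frames:
        v = energy_feature_vector(frame)
        if np.linalg.norm(v - standard_vector) > threshold:  # Euclidean
            keyframes.append(frame)
    return keyframes
```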
13. The computer device according to claim 12, wherein the processor, before executing the acquiring of the video images captured in real time, further implements the following steps: acquiring a neutral expression image; performing wavelet transformation on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
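Claim 13's setup step then reduces to computing and persisting the standard vector from a stored neutral expression image, sketched here by reusing energy_feature_vector from the previous sketch; the file names are hypothetical.

```python
# Illustrative setup for claim 13, reusing energy_feature_vector above.
import cv2
import numpy as np

neutral = cv2.imread("neutral_face.png", cv2.IMREAD_GRAYSCALE)  # stored image
standard_vector = energy_feature_vector(neutral)
np.save("standard_vector.npy", standard_vector)  # persist for later runs
```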
14. The computer device according to claim 12, wherein the processor, before executing the acquiring of the video images captured in real time, further implements the following steps: acquiring an emotion training sample image set, wherein the emotion training sample image set comprises a plurality of emotion training sample images and emotion labels of the faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
15. The computer device according to claim 12, wherein the processor, before executing the acquiring of the video images captured in real time, further implements the following steps: acquiring a training sample image set, wherein the training sample image set comprises a plurality of training sample images and face labels indicating whether face information is present in the training sample images; obtaining the face Haar feature vector of each training sample image; inputting the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
16. The computer device according to claim 15, wherein the processor, before executing the acquiring of the video images captured in real time, further implements the following steps: acquiring a calibration video image captured in real time; extracting, from the frames of the calibration video image and according to a preset rule, a preset number of frames as calibration images; determining, based on the pre-stored face detection and recognition model, whether face information is present in every frame of the calibration images; if face information is present in every frame of the calibration images, performing the step of acquiring the video images captured in real time; and if face information is absent from at least one frame of the calibration images, issuing prompt information so that the user adjusts the camera angle according to the prompt information, and, after the camera angle has been adjusted, returning to the step of acquiring a calibration video image captured in real time, until face information is present in every frame of the calibration images.
17. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to perform the following steps: acquiring video images captured in real time; performing wavelet transformation on all frame images in the video images to obtain corresponding energy feature vectors; obtaining a standard energy feature vector, and calculating, according to an image difference operation method, the Euclidean distance value between each of the energy feature vectors and the standard energy feature vector; determining whether any of the plurality of Euclidean distance values exceeds a preset threshold; if any of the plurality of Euclidean distance values exceeds the preset threshold, taking the images corresponding to the energy feature vectors whose Euclidean distance values exceed the preset threshold as key frame images, wherein the number of key frame images is at least one; obtaining a pre-stored emotion recognition model, and recognizing, based on the emotion recognition model, the facial emotion in each of the key frame images; and obtaining, from the facial emotions in all the key frame images, the facial emotion corresponding to the video images, to complete the facial emotion recognition.
18. The computer-readable storage medium according to claim 17, wherein the computer program, before being executed by the processor to acquire the video images captured in real time, further causes the processor to perform the following steps: acquiring a neutral expression image; performing wavelet transformation on the neutral expression image to obtain a corresponding standard energy feature vector; and storing the neutral expression image and the standard energy feature vector.
19. The computer-readable storage medium according to claim 17, wherein the computer program, before being executed by the processor to acquire the video images captured in real time, further causes the processor to perform the following steps: acquiring an emotion training sample image set, wherein the emotion training sample image set comprises a plurality of emotion training sample images and emotion labels of the faces in the emotion training sample images; and inputting the emotion training sample images and the corresponding emotion labels into a convolutional neural network model for machine learning to obtain an emotion recognition model, and storing the emotion recognition model.
20. The computer-readable storage medium according to claim 17, wherein the computer program, before being executed by the processor to acquire the video images captured in real time, further causes the processor to perform the following steps: acquiring a training sample image set, wherein the training sample image set comprises a plurality of training sample images and face labels indicating whether face information is present in the training sample images; obtaining the face Haar feature vector of each training sample image; inputting the face Haar feature vectors and the face labels corresponding to the training sample images into an AdaBoost boosting model based on a decision tree model for training, to obtain a face detection and recognition model; and storing the face detection and recognition model.
PCT/CN2018/108251 2018-08-07 2018-09-28 Human face emotion identification method and device, computer device and storage medium WO2020029406A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810892915.6 2018-08-07
CN201810892915.6A CN109190487A (en) 2018-08-07 2018-08-07 Face Emotion identification method, apparatus, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2020029406A1

Family ID=64921037

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/108251 WO2020029406A1 (en) 2018-08-07 2018-09-28 Human face emotion identification method and device, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN109190487A (en)
WO (1) WO2020029406A1 (en)


Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784277B (en) * 2019-01-17 2023-04-28 南京大学 An emotion recognition method based on smart glasses
CN109934097A (en) * 2019-01-23 2019-06-25 深圳市中银科技有限公司 A kind of expression and mental health management system based on artificial intelligence
CN109816893B (en) * 2019-01-23 2022-11-04 深圳壹账通智能科技有限公司 Information transmission method, information transmission device, server, and storage medium
CN109919001A (en) * 2019-01-23 2019-06-21 深圳壹账通智能科技有限公司 Customer service monitoring method, device, device and storage medium based on emotion recognition
CN109871807B (en) * 2019-02-21 2023-02-10 百度在线网络技术(北京)有限公司 Face image processing method and device
CN109992505B (en) * 2019-03-15 2024-07-02 平安科技(深圳)有限公司 Application program testing method and device, computer equipment and storage medium
CN110047588A (en) * 2019-03-18 2019-07-23 平安科技(深圳)有限公司 Method of calling, device, computer equipment and storage medium based on micro- expression
CN110175526B (en) * 2019-04-28 2024-06-21 平安科技(深圳)有限公司 Training method and device for dog emotion recognition model, computer equipment and storage medium
CN110097004B (en) * 2019-04-30 2022-03-29 北京字节跳动网络技术有限公司 Facial expression recognition method and device
CN110414323A (en) * 2019-06-14 2019-11-05 平安科技(深圳)有限公司 Emotion detection method, device, electronic device and storage medium
CN110399837B (en) * 2019-07-25 2024-01-05 深圳智慧林网络科技有限公司 User emotion recognition method, device and computer readable storage medium
CN110751381A (en) * 2019-09-30 2020-02-04 东南大学 Road rage vehicle risk assessment and prevention and control method
CN110991427B (en) * 2019-12-25 2023-07-14 北京百度网讯科技有限公司 Emotion recognition method and device for video and computer equipment
CN111401198B (en) * 2020-03-10 2024-04-23 广东九联科技股份有限公司 Audience emotion recognition method, device and system
CN111767779B (en) * 2020-03-18 2024-10-22 北京沃东天骏信息技术有限公司 Image processing method, device, equipment and computer readable storage medium
CN111783587B (en) * 2020-06-22 2024-11-29 腾讯数码(天津)有限公司 Interaction method, device and storage medium
CN111859025A (en) * 2020-07-03 2020-10-30 广州华多网络科技有限公司 Expression instruction generation method, device, device and storage medium
CN112541425B (en) * 2020-12-10 2024-09-03 深圳地平线机器人科技有限公司 Emotion detection method, emotion detection device, emotion detection medium and electronic equipment
TWI811605B (en) 2020-12-31 2023-08-11 宏碁股份有限公司 Method and system for mental index prediction
CN114005153A (en) * 2021-02-01 2022-02-01 南京云思创智信息科技有限公司 A real-time recognition method for personalized micro-expressions based on facial diversity
CN113076813B (en) * 2021-03-12 2024-04-12 首都医科大学宣武医院 Training method and device for mask face feature recognition model
CN113128399B (en) * 2021-04-19 2022-05-17 重庆大学 Speech image key frame extraction method for emotion recognition
CN113505665B (en) * 2021-06-28 2023-06-20 哈尔滨工业大学(深圳) Method and device for interpreting students' emotions in school based on video
CN114239640B (en) * 2021-11-15 2025-03-21 国网江西省电力有限公司吉安供电分公司 Substation secondary circuit signal detection method based on wavelet decomposition rolling learning
CN114694234B (en) * 2022-06-02 2023-02-03 杭州智诺科技股份有限公司 Emotion recognition method, system, electronic device and storage medium
CN116563915B (en) * 2023-04-28 2024-07-26 深圳大器时代科技有限公司 Face state recognition method and device based on deep learning algorithm
CN116343314B (en) * 2023-05-30 2023-08-25 之江实验室 Expression recognition method and device, storage medium and electronic equipment
CN118799943B (en) * 2024-07-12 2025-03-18 安徽字节互连科技有限公司 A special identification method and system for AI intelligent platform

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104298682B (en) * 2013-07-18 2018-02-23 广州华久信息科技有限公司 A kind of evaluation method and mobile phone of the information recommendation effect based on Facial Expression Image
CN103870820A (en) * 2014-04-04 2014-06-18 南京工程学院 Illumination normalization method for extreme illumination face recognition
CN106295566B (en) * 2016-08-10 2019-07-09 北京小米移动软件有限公司 Facial expression recognizing method and device
CN106980811A (en) * 2016-10-21 2017-07-25 商汤集团有限公司 Facial expression recognition method and facial expression recognition device
CN106951856A (en) * 2017-03-16 2017-07-14 腾讯科技(深圳)有限公司 Bag extracting method of expressing one's feelings and device
CN107358169A (en) * 2017-06-21 2017-11-17 厦门中控智慧信息技术有限公司 A kind of facial expression recognizing method and expression recognition device
CN107633203A (en) * 2017-08-17 2018-01-26 平安科技(深圳)有限公司 Facial emotions recognition methods, device and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104008391A (en) * 2014-04-30 2014-08-27 首都医科大学 Face micro-expression capturing and recognizing method based on nonlinear dimension reduction
CN105354527A (en) * 2014-08-20 2016-02-24 南京普爱射线影像设备有限公司 Negative expression recognizing and encouraging system
CN107403142A (en) * 2017-07-05 2017-11-28 山东中磁视讯股份有限公司 A kind of detection method of micro- expression
CN107665074A (en) * 2017-10-18 2018-02-06 维沃移动通信有限公司 A kind of color temperature adjusting method and mobile terminal

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111627086A (en) * 2020-06-03 2020-09-04 上海商汤智能科技有限公司 Head portrait display method and device, computer equipment and storage medium
CN111860407A (en) * 2020-07-29 2020-10-30 华侨大学 A method, device, device and storage medium for facial expression recognition of characters in video
CN111860407B (en) * 2020-07-29 2023-04-25 华侨大学 Method, device, equipment and storage medium for identifying expression of character in video
CN111950447A (en) * 2020-08-11 2020-11-17 合肥工业大学 Emotion recognition method and system based on walking gesture, storage medium
CN111950447B (en) * 2020-08-11 2023-08-22 合肥工业大学 Emotion recognition method and system based on walking posture, storage medium
CN112016437A (en) * 2020-08-26 2020-12-01 中国科学院重庆绿色智能技术研究院 A method of living body detection based on key frames of face video
CN112016437B (en) * 2020-08-26 2023-02-10 中国科学院重庆绿色智能技术研究院 Living body detection method based on face video key frame
CN112364736A (en) * 2020-10-30 2021-02-12 深圳点猫科技有限公司 Dynamic facial expression recognition method, device and equipment
CN112507824A (en) * 2020-11-27 2021-03-16 长威信息科技发展股份有限公司 Method and system for identifying video image features
CN112418146B (en) * 2020-12-02 2024-04-30 深圳市优必选科技股份有限公司 Expression recognition method, apparatus, service robot, and readable storage medium
CN112418146A (en) * 2020-12-02 2021-02-26 深圳市优必选科技股份有限公司 Expression recognition method and device, service robot and readable storage medium
CN112507959A (en) * 2020-12-21 2021-03-16 中国科学院心理研究所 Method for establishing emotion perception model based on individual face analysis in video
CN112699774A (en) * 2020-12-28 2021-04-23 深延科技(北京)有限公司 Method and device for recognizing emotion of person in video, computer equipment and medium
CN112699774B (en) * 2020-12-28 2024-05-24 深延科技(北京)有限公司 Emotion recognition method and device for characters in video, computer equipment and medium
CN112712022A (en) * 2020-12-29 2021-04-27 华南理工大学 Pressure detection method, system and device based on image recognition and storage medium
CN112712022B (en) * 2020-12-29 2023-05-23 华南理工大学 Pressure detection method, system, device and storage medium based on image recognition
CN112633239A (en) * 2020-12-31 2021-04-09 中国工商银行股份有限公司 Micro-expression identification method and device
CN112686195A (en) * 2021-01-07 2021-04-20 风变科技(深圳)有限公司 Emotion recognition method and device, computer equipment and storage medium
CN113434647A (en) * 2021-06-18 2021-09-24 竹间智能科技(上海)有限公司 Man-machine interaction method, system and storage medium
CN113434647B (en) * 2021-06-18 2024-01-12 竹间智能科技(上海)有限公司 A human-computer interaction method, system and storage medium
CN114360053A (en) * 2021-12-15 2022-04-15 中国科学院深圳先进技术研究院 Motion recognition method, terminal and storage medium
CN114943924A (en) * 2022-06-21 2022-08-26 深圳大学 Pain assessment method, system, device and medium based on facial expression video
CN114943924B (en) * 2022-06-21 2024-05-14 深圳大学 Pain assessment method, system, device and medium based on facial expression video
CN115177252A (en) * 2022-07-13 2022-10-14 承德石油高等专科学校 Intelligent endowment service psychology of can convenient operation is dredged device
CN115177252B (en) * 2022-07-13 2024-01-12 承德石油高等专科学校 Intelligent pension service psychological dispersion device capable of being operated conveniently
CN115019374B (en) * 2022-07-18 2022-10-11 北京师范大学 Method and system for low-consumption detection of students' concentration in smart classroom based on artificial intelligence
CN115019374A (en) * 2022-07-18 2022-09-06 北京师范大学 Method and system for low-consumption detection of students' concentration in smart classroom based on artificial intelligence
CN114998440B (en) * 2022-08-08 2022-11-11 广东数业智能科技有限公司 Multi-mode-based evaluation method, device, medium and equipment
CN114998440A (en) * 2022-08-08 2022-09-02 广东数业智能科技有限公司 Multi-mode-based evaluation method, device, medium and equipment
CN115429271A (en) * 2022-09-09 2022-12-06 北京理工大学 Autism spectrum disorder screening system and method based on eye movement and facial expression
CN116453159A (en) * 2023-04-06 2023-07-18 华院计算技术(上海)股份有限公司 An emotion recognition method, electronic device and medium
CN116665281A (en) * 2023-06-28 2023-08-29 湖南创星科技股份有限公司 Key emotion extraction method based on doctor-patient interaction
CN116665281B (en) * 2023-06-28 2024-05-10 湖南创星科技股份有限公司 Key emotion extraction method based on doctor-patient interaction
CN117312992A (en) * 2023-11-30 2023-12-29 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Emotion recognition method and system for fusion of multi-view face features and audio features
CN117312992B (en) * 2023-11-30 2024-03-12 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Emotion recognition method and system for fusion of multi-view face features and audio features
CN119026923A (en) * 2024-10-28 2024-11-26 福建省万物智联科技有限公司 An identity recognition management method and related equipment for interactive education

Also Published As

Publication number Publication date
CN109190487A (en) 2019-01-11

Similar Documents

Publication Publication Date Title
WO2020029406A1 (en) Human face emotion identification method and device, computer device and storage medium
CN111488433B (en) Artificial intelligence interactive system suitable for bank and capable of improving field experience
WO2020211388A1 (en) Behavior prediction method and device employing prediction model, apparatus, and storage medium
WO2020098249A1 (en) Electronic device, response conversation technique recommendation method and computer readable storage medium
WO2020007129A1 (en) Context acquisition method and device based on voice interaction
WO2025077380A1 (en) Data processing method and apparatus, and electronic device and storage medium
WO2021051602A1 (en) Lip password-based face recognition method and system, device, and storage medium
CN112581938B (en) Speech breakpoint detection method, device and equipment based on artificial intelligence
CN114120425A (en) Emotion recognition method and device, electronic equipment and storage medium
WO2024188277A1 (en) Text semantic matching method and refrigeration device system
CN111832651B (en) Video multi-mode emotion inference method and device
US20210166685A1 (en) Speech processing apparatus and speech processing method
WO2023272833A1 (en) Data detection method, apparatus and device and readable storage medium
CN116010545A (en) Data processing method, device and equipment
CN115661889A (en) Audio and video multimode-based specific character deep forgery detection method
CN114639152A (en) Multimodal voice interaction method, device, device and medium based on face recognition
CN111708988B (en) Infringement video identification method and device, electronic equipment and storage medium
CN118629676A (en) Visitation management method and electronic device
CN111522943A (en) Automatic test method, device, equipment and storage medium for logic node
CN117370934A (en) Multi-mode data enhancement method of sensitive information discovery model
CN113642503B (en) Window service scoring method and system based on image and voice recognition
CN116089906A (en) Multi-mode classification method and system based on dynamic context representation and mode fusion
CN110084143A (en) A kind of emotional information guard method and system for recognition of face
CN111970311B (en) Session segmentation method, electronic device and computer readable medium
CN115394294A (en) Voice recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18929826

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18929826

Country of ref document: EP

Kind code of ref document: A1