WO2019196312A1 - Method and apparatus for adjusting sound volume by robot, computer device and storage medium - Google Patents
- Publication number
- WO2019196312A1 (PCT/CN2018/102853)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- distance
- user
- robot
- image
- volume
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/60—Analysis of geometric attributes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/18—Eye characteristics, e.g. of the iris
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- the present application relates to the field of robotics, and in particular, to a method and apparatus for automatically adjusting a volume by a robot, and a computer device and a storage medium storing computer readable instructions.
- the purpose of the present application is to solve at least one of the above technical drawbacks, in particular, a technical defect of poor interaction efficiency.
- the present application provides a method for a robot to automatically adjust a volume.
- the robot has a camera, a speaker, and an ambient microphone for collecting ambient sound.
- A first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D.
- The method includes the steps of: acquiring an image by the camera and detecting an image feature of a second user in the image; calculating a height h of the second user and a distance d of the second user relative to the robot according to the second user image feature; determining a height gain k_h according to the relationship between h and H, and a distance gain k_d according to the relationship between d and D; collecting the ambient volume through the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e according to v_e and a preset correspondence; and determining the speaker volume V_m according to k_h, k_d, k_e, and V.
- The present application also provides an apparatus for a robot to automatically adjust volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D.
- The apparatus includes: a first calculation module, configured to acquire an image by the camera, detect a second user image feature in the image, calculate a height h of the second user and a distance d relative to the robot according to the second user image feature, and determine a height gain k_h according to the relationship between h and H and a distance gain k_d according to the relationship between d and D;
- a second calculation module, configured to collect the ambient volume through the ambient microphone to obtain an ambient noise value v_e and determine a corresponding environment gain k_e according to v_e and a preset correspondence; and a volume calculation module, configured to determine the speaker volume V_m according to k_h, k_d, k_e, and V.
- The application also provides a computer device comprising a memory and a processor, the memory storing computer readable instructions that, when executed by the processor, cause the processor to perform a method for a robot to automatically adjust volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D.
- The method includes the steps of: acquiring an image by the camera and detecting an image feature of a second user in the image; calculating a height h of the second user and a distance d relative to the robot according to the second user image feature; determining a height gain k_h according to the relationship between h and H, and a distance gain k_d according to the relationship between d and D; collecting the ambient volume through the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e according to v_e and a preset correspondence; and determining the speaker volume V_m according to k_h, k_d, k_e, and V.
- The present application also provides a non-volatile storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform a method for a robot to automatically adjust volume.
- The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D.
- The method comprises the steps of: acquiring an image by the camera and detecting an image feature of a second user in the image; calculating a height h of the second user and a distance d relative to the robot according to the second user image feature; determining a height gain k_h according to the relationship between h and H, and a distance gain k_d according to the relationship between d and D; collecting the ambient volume through the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e according to v_e and a preset correspondence; and determining the speaker volume V_m according to k_h, k_d, k_e, and V.
- The above method, apparatus, computer device and storage medium for a robot to automatically adjust volume determine the speaker volume V_m from the user's height h, the user's distance d relative to the robot, and the ambient noise value v_e measured by the ambient microphone, so that the robot can intelligently adjust the speaker volume according to the actual situation, giving the user the most suitable volume level in any environment and improving interaction efficiency and user experience.
- FIG. 1 is a schematic diagram showing the internal structure of a computer device in an embodiment
- FIG. 2 is a schematic flow chart of a method for automatically adjusting a volume of a robot according to an embodiment
- Figure 3 is a top plan view of the spatial position between the robot and the user of one embodiment
- FIG. 4 is a schematic diagram of a device module for automatically adjusting a volume of a robot according to an embodiment.
- FIG. 1 is a schematic diagram showing the internal structure of a computer device in an embodiment.
- the computer device includes a processor, a non-volatile storage medium, a memory, and a network interface connected by a system bus.
- the non-volatile storage medium of the computer device stores an operating system, a database, and computer readable instructions.
- the database may store a sequence of control information.
- when the computer readable instructions are executed by the processor, the processor implements a method in which the robot automatically adjusts the volume.
- the processor of the computer device is used to provide computing and control capabilities to support the operation of the entire computer device.
- Computer readable instructions may be stored in the memory of the computer device, the computer readable instructions being executable by the processor to cause the processor to perform a method of automatically adjusting the volume by the robot.
- the network interface of the computer device is used to communicate with a connected terminal. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only a block diagram of a part of the structure related to the solution of the present application and does not constitute a limitation of the computer device to which the solution of the present application is applied.
- a specific computer device may include more or fewer components than those shown in the figure, or combine some components, or have a different component arrangement.
- the method of automatically adjusting the volume of the robot described below can be applied to an intelligent robot such as a customer service robot, a child education robot, and the like.
- FIG. 2 is a schematic flow chart of a method for automatically adjusting a volume of a robot according to an embodiment.
- the present application provides a method for a robot to automatically adjust volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound (and also a microphone for collecting the user's voice); a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D.
- the method includes the following steps:
- Step S100: acquiring an image by the camera and detecting an image feature of the second user in the image, calculating a height h of the second user and a distance d relative to the robot according to the second user image feature, determining a height gain k_h according to the relationship between h and H, and determining a distance gain k_d according to the relationship between d and D.
- the face detection method can be used for face detection to detect the second user in the image.
- Since the robot's camera may capture multiple faces, some of which belong to background persons who are not interacting with the robot (for example, not in a conversation with it), only the person facing the camera and talking to the robot needs to be considered.
- the camera is usually placed in the direction of the robot facing the user. For example, if the robot has a head, it can be placed at the position of the forehead or face of the head; if the robot has a torso, it can also be placed at the position of the front torso.
- the setting position of the camera is not limited; it is only necessary to ensure that the camera can capture the second user when the second user talks to the robot.
- the captured image (picture or video frame) is fixed in size, and the preset rectangular position can be defined as the face recognition area at the center of the picture, and the face detection is performed only in this face recognition area.
- For example, for a 1920×1080 image, a 1000×1000 rectangle at the center of the picture can be delineated as the face recognition area.
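- As a small illustration of this cropping step (class and parameter names are assumptions, not taken from the patent), the centered recognition area can be computed as follows:

```java
import android.graphics.Rect;

// Hedged sketch: build a centered face-recognition region of interest inside a
// fixed-size camera frame, e.g. a 1000x1000 region in a 1920x1080 image.
public final class FaceRoiSketch {
    private FaceRoiSketch() {}

    public static Rect centeredRoi(int imageWidth, int imageHeight, int roiWidth, int roiHeight) {
        int left = (imageWidth - roiWidth) / 2;
        int top = (imageHeight - roiHeight) / 2;
        return new Rect(left, top, left + roiWidth, top + roiHeight);
    }
}
```

- For example, centeredRoi(1920, 1080, 1000, 1000) yields the 1000×1000 rectangle centered in the frame.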
- Face detection technology, that is, the technique of detecting the faces present in an image through image analysis and accurately framing the position of each face with a rectangular box, is the basis of facial feature point detection and face recognition.
- Commonly used face detection data sets include FDDB (Face Detection Data Set and Benchmark). With the rapid development of deep learning in recent years, many excellent face detection methods have emerged.
- Many excellent face detection methods have been submitted on the FDDB benchmark, such as cascaded CNN (Convolutional Neural Network) face detection (A Convolutional Neural Network Cascade), an improved Faster R-CNN for face detection (Face Detection using Deep Learning: An Improved Faster RCNN Approach), and Finding Tiny Faces, which is very successful at detecting small faces. In addition, libraries such as OpenCV, dlib, and libfacedetect also provide face detection interfaces.
- The size of a face in the image varies. To adapt to this variation, the best approach is to use an image pyramid and scale the image to be detected to different sizes for multi-scale face detection. Non-maximum suppression (NMS) is then applied to all face candidate boxes detected at the different scales to obtain the final face detection result.
- If the method for automatically adjusting the volume in this embodiment is applied on the Android system, the FaceDetector class may be used to determine whether the image captured by the camera contains a face image.
- Android has a built-in face recognition API: FaceDetector, which can perform face recognition with a small amount of code, but this recognition is the most basic recognition, that is, only the face in the image can be recognized.
- the face recognition technology in Android requires the underlying library: android/external/neven/, architecture layer: frameworks/base/media/java/android/media/FaceDetector.java.
- The main methods provided by the Neven library are: A. android.media.FaceDetector.FaceDetector(int width, int height, int maxFaces); B. int android.media.FaceDetector.findFaces(Bitmap bitmap, Face[] faces).
- the two-eye position of the face image can be obtained by the FaceDetector class, and the position of the face image on the desktop is determined according to the position of the two eyes.
- the specific steps may be: acquiring the center point between the two eyes of the face image, acquiring the distance between the two eyes of the face image (the interpupillary distance), and drawing a rectangular area (rectangular frame) from the center point of the two-eye position and the distance between the two eyes, with this rectangular area taken as the location of the face image on the desktop.
- the center point of the two-eye position of the face image can be obtained by the following code:
- mFace[i].getMidPoint(eyeMidPoint);
- the two-eye spacing of the face image can be obtained by the following code:
- eyesDistance = mFace[i].eyesDistance();
- the rectangular area can be drawn by the following code:
- myEyesDistance = face.eyesDistance(); // get the center point between the two eyes and the eye spacing parameter, and draw a frame for each face
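- For illustration, the fragments above can be assembled into the following minimal Android sketch; apart from the android.media.FaceDetector API itself, the class name, helper method, and box scale are assumptions:

```java
import android.graphics.Bitmap;
import android.graphics.PointF;
import android.graphics.Rect;
import android.media.FaceDetector;

// Hedged sketch of the FaceDetector usage quoted above. The bitmap must be in
// RGB_565 format for android.media.FaceDetector to work.
public class FaceBoxSketch {
    public static Rect[] detectFaceBoxes(Bitmap rgb565Bitmap, int maxFaces) {
        FaceDetector detector =
                new FaceDetector(rgb565Bitmap.getWidth(), rgb565Bitmap.getHeight(), maxFaces);
        FaceDetector.Face[] faces = new FaceDetector.Face[maxFaces];
        int found = detector.findFaces(rgb565Bitmap, faces);

        Rect[] boxes = new Rect[found];
        PointF eyeMidPoint = new PointF();
        for (int i = 0; i < found; i++) {
            faces[i].getMidPoint(eyeMidPoint);            // center point between the two eyes
            float eyesDistance = faces[i].eyesDistance(); // interpupillary distance in pixels
            // Frame the face with a rectangle derived from the eye midpoint and spacing;
            // using the eye distance as the half-size is an illustrative choice only.
            boxes[i] = new Rect(
                    (int) (eyeMidPoint.x - eyesDistance),
                    (int) (eyeMidPoint.y - eyesDistance),
                    (int) (eyeMidPoint.x + eyesDistance),
                    (int) (eyeMidPoint.y + eyesDistance));
        }
        return boxes;
    }
}
```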
- In this embodiment, the face detection method is not limited. After a face is detected, the detected second user's height h and distance d relative to the robot are calculated, the height gain k_h is determined according to the relationship between h and H, and the distance gain k_d is determined according to the relationship between d and D.
- the distance d of the second user relative to the robot can be calculated and determined by the sensed values of the associated ranging sensors, such as infrared ranging by infrared sensors, laser ranging using laser sensors, and the like.
- In this embodiment, d is determined by an image analysis method.
- The calculation of d is based on two assumptions. First, for the vast majority of people, the interpupillary distance differs little from person to person (about ±2 cm). Second, for a user talking to the robot, the distance between the user and the robot varies only within a relatively small range.
- The principle is to estimate the distance by comparing the interpupillary distance in the captured picture with the interpupillary distance in a calibration picture: the closer the user's face is to the camera, the larger the face appears, and this relationship is approximately linear.
- In this embodiment, the second user image feature includes the interpupillary distance in the image. It is predefined that the interpupillary distance in the image is A1 when the first user is at a distance D1 from the robot, and A2 when the first user is at a distance D2 from the robot. The distance d of the second user relative to the robot is then calculated by the following formula:
- d = k(a - A1) + D1
- where k is the slope determined by the two calibration points, and a is the second user's interpupillary distance detected in the image when the face detection method is applied.
- After d is determined, the distance gain k_d is determined from the relationship between d and D.
- The distance gain k_d has a positive relationship with d, for example a proportional relationship.
- In this embodiment, the distance gain k_d = d/D.
- In other embodiments, other calculations may be used, such as k_d = d/D + m (where m is a preset coefficient), which are not described in detail here.
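- The linear estimate and distance gain above can be written, for example, as the following sketch; the calibration pairs (D1, A1) and (D2, A2), the reference distance D, and the class name are illustrative:

```java
// Hedged sketch of the linear distance estimate d = k(a - A1) + D1 and the
// distance gain k_d = d / D described in the text.
public class DistanceSketch {
    private final double d1, a1, slopeK, refD;

    public DistanceSketch(double d1, double a1, double d2, double a2, double refD) {
        this.d1 = d1;
        this.a1 = a1;
        this.slopeK = (d2 - d1) / (a2 - a1); // linear fit through the two calibration points
        this.refD = refD;
    }

    /** Estimated distance d from the interpupillary distance a measured in the image. */
    public double distance(double a) {
        return slopeK * (a - a1) + d1;
    }

    /** Distance gain k_d = d / D, as in this embodiment. */
    public double distanceGain(double a) {
        return distance(a) / refD;
    }
}
```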
- In this embodiment, it is predefined that when the first user's actual interpupillary distance is C, the corresponding interpupillary distance in the image is c. The second user's height h is then calculated from these values together with the camera height H1 and Δh, the pixel difference between the center of the face rectangle detected by the face detection method and the center of the image.
- Step S200: collecting the ambient volume through the ambient microphone to obtain the ambient noise value v_e, and determining the corresponding environment gain k_e according to v_e and the preset correspondence.
- Specifically, the ambient noise value v_e is obtained by collecting the ambient volume through the ambient microphone, and the environment gain k_e corresponding to the interval range in which v_e falls is determined.
- A plurality of interval ranges can be preset, each with a corresponding preset environment gain: the interval (v_1, v_2) corresponds to environment gain k_1, the interval (v_2, v_3) corresponds to k_2, and so on, up to the interval (v_{n-1}, v_n) corresponding to k_{n-1}.
- The environment gain k_e can also be determined from v_e and a preset calculation formula.
- In this embodiment, the ambient microphone includes at least a first microphone and a second microphone located on the two sides of the robot (on either side of the robot's front baseline), for example on both sides of the robot's head or on both sides of the robot's torso, see FIG. 3. The process of collecting the ambient volume through the ambient microphones to obtain the ambient noise value v_e includes:
- collecting the ambient volume through the first microphone to obtain a first ambient noise value v_1;
- collecting the ambient volume through the second microphone to obtain a second ambient noise value v_2;
- determining v_e from v_1 and v_2, querying the data table for the interval range in which v_e falls, and then obtaining the environment gain k_e corresponding to that interval range.
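- A minimal sketch of the interval-table lookup follows; averaging v_1 and v_2 into v_e is an assumed combination rule, since the text does not state how v_e is derived from the two readings, and the table values are placeholders:

```java
// Hedged sketch of the environment-gain lookup described above.
public class EnvironmentGainSketch {
    private final double[] bounds; // v_1 < v_2 < ... < v_n, interval boundaries
    private final double[] gains;  // k_1 ... k_{n-1}, one gain per interval

    public EnvironmentGainSketch(double[] bounds, double[] gains) {
        this.bounds = bounds;
        this.gains = gains;
    }

    /** Combine the two ambient microphone readings into v_e (assumed: average). */
    public double ambientNoise(double v1, double v2) {
        return (v1 + v2) / 2.0;
    }

    /** Look up the environment gain k_e for the interval containing v_e. */
    public double environmentGain(double ve) {
        for (int i = 0; i < gains.length; i++) {
            if (ve > bounds[i] && ve <= bounds[i + 1]) {
                return gains[i];
            }
        }
        // Clamp to the first or last interval when v_e falls outside the table.
        return ve <= bounds[0] ? gains[0] : gains[gains.length - 1];
    }
}
```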
- Step S300: determining the speaker volume V_m according to k_h, k_d, k_e, and V.
- In this embodiment, k_h, k_d, and k_e all have a positive relationship (for example, a proportional relationship) with the speaker volume V_m, with V serving as the reference (source) volume and the combination of k_h, k_d, and k_e acting as the total gain. Any appropriate variation of the calculation formula for the speaker volume V_m that preserves this positive relationship can be considered reasonable and is not described further here.
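- The exact combination formula is not reproduced here; a simple multiplicative reading that satisfies the stated positive relationships is sketched below as an assumption:

```java
// Hedged sketch: V_m = k_h * k_d * k_e * V is an assumed combination rule that is
// consistent with each gain having a positive (e.g. proportional) relationship
// with the speaker volume; it is not quoted from the patent.
public final class SpeakerVolumeSketch {
    private SpeakerVolumeSketch() {}

    public static double speakerVolume(double kh, double kd, double ke, double referenceVolumeV) {
        return kh * kd * ke * referenceVolumeV;
    }

    public static void main(String[] args) {
        // Illustrative values only: a taller, farther user in a noisier environment
        // raises the volume relative to the reference V.
        System.out.println(speakerVolume(1.1, 1.4, 1.2, 50.0)); // 92.4
    }
}
```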
- The above method for a robot to automatically adjust volume determines the speaker volume V_m from the second user's height h, the second user's distance d relative to the robot, and the ambient noise value v_e measured by the ambient microphone, so that the robot can intelligently adjust the speaker volume according to the actual situation.
- This gives the user the optimum volume level regardless of the environment, improving interaction efficiency and user experience.
- FIG. 4 is a schematic diagram of a device module for automatically adjusting a volume of a robot according to an embodiment.
- the present application also provides an apparatus for a robot to automatically adjust volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound (and also a microphone for collecting the user's voice);
- a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D.
- the device includes a first calculation module 100, a second calculation module 200, and a volume calculation module 300.
- The first calculation module 100 is configured to acquire an image by the camera and detect an image feature of a second user in the image, calculate a height h of the second user and a distance d relative to the robot according to the second user image feature, and determine a height gain k_h according to the relationship between h and H and a distance gain k_d according to the relationship between d and D;
- the second calculation module 200 is configured to collect the ambient volume through the ambient microphone to obtain the ambient noise value v_e, and to determine the corresponding environment gain k_e according to v_e and the preset correspondence;
- the volume calculation module 300 is configured to determine the speaker volume V_m according to k_h, k_d, k_e, and V.
- The first calculation module 100 acquires an image by the camera and detects a second user image feature in the image, calculates a height h of the second user and a distance d relative to the robot according to the second user image feature, determines the height gain k_h according to the relationship between h and H, and determines the distance gain k_d according to the relationship between d and D.
- the first calculation module 100 may perform face detection using a face detection method to detect a second user in the image.
- Since the robot's camera may capture multiple faces, some of which belong to background persons who are not interacting with the robot (for example, not in a conversation with it), only the person facing the camera and talking to the robot needs to be considered.
- the camera is usually placed in the direction of the robot facing the user. For example, if the robot has a head, it can be placed at the position of the forehead or face of the head; if the robot has a torso, it can also be placed at the position of the front torso.
- the setting position of the camera is not limited; it is only necessary to ensure that the camera can capture the second user when the second user talks to the robot.
- the captured image (picture or video frame) is fixed in size, and the preset rectangular position can be defined as the face recognition area at the center of the picture, and the face detection is performed only in this face recognition area.
- For example, for a 1920×1080 image, a 1000×1000 rectangle at the center of the picture can be delimited as the face recognition area.
- Face detection technology, that is, the technique of detecting the faces present in an image through image analysis and accurately framing the position of each face with a rectangular box, is the basis of facial feature point detection and face recognition.
- Commonly used face detection data sets include FDDB (Face Detection Data Set and Benchmark). With the rapid development of deep learning in recent years, many excellent face detection methods have emerged.
- Many excellent face detection methods have been submitted on the FDDB benchmark, such as cascaded CNN (Convolutional Neural Network) face detection (A Convolutional Neural Network Cascade), an improved Faster R-CNN for face detection (Face Detection using Deep Learning: An Improved Faster RCNN Approach), and Finding Tiny Faces, which is very successful at detecting small faces. In addition, libraries such as OpenCV, dlib, and libfacedetect also provide face detection interfaces.
- The size of a face in the image varies. To adapt to this variation, the best approach is to use an image pyramid and scale the image to be detected to different sizes for multi-scale face detection. Non-maximum suppression (NMS) is then applied to all face candidate boxes detected at the different scales to obtain the final face detection result.
- If the apparatus is applied on the Android system, the first calculation module 100 may use the FaceDetector class to determine whether the image captured by the camera contains a face image.
- Android has a built-in face recognition API: FaceDetector, which can perform face recognition with a small amount of code, but this recognition is the most basic recognition, that is, only the face in the image can be recognized.
- the face recognition technology in Android requires the underlying library: android/external/neven/, architecture layer: frameworks/base/media/java/android/media/FaceDetector.java.
- The main methods provided by the Neven library are: A. android.media.FaceDetector.FaceDetector(int width, int height, int maxFaces); B. int android.media.FaceDetector.findFaces(Bitmap bitmap, Face[] faces).
- the first calculation module 100 can acquire the two-eye position of the face image through the FaceDetector class, and determine the location of the face image on the desktop according to the two-eye position.
- the specific steps may be: acquiring the center point between the two eyes of the face image, acquiring the distance between the two eyes of the face image (the interpupillary distance), and drawing a rectangular area (rectangular frame) from the center point of the two-eye position and the distance between the two eyes, with this rectangular area taken as the location of the face image on the desktop.
- the center point of the two-eye position of the face image can be obtained by the following code:
- mFace[i].getMidPoint(eyeMidPoint);
- the two-eye spacing of the face image can be obtained by the following code:
- eyesDistance = mFace[i].eyesDistance();
- the rectangular area can be drawn by the following code:
- myEyesDistance = face.eyesDistance(); // get the center point between the two eyes and the eye spacing parameter, and draw a frame for each face
- In this embodiment, the face detection method is not limited.
- After a face is detected, the first calculation module 100 calculates the detected second user's height h and distance d relative to the robot, determines the height gain k_h according to the relationship between h and H, and determines the distance gain k_d according to the relationship between d and D.
- the distance d of the second user relative to the robot can be calculated and determined by the sensed values of the associated ranging sensors, such as infrared ranging by infrared sensors, laser ranging using laser sensors, and the like.
- In this embodiment, the first calculation module 100 determines d by an image analysis method.
- The calculation of d is based on two assumptions. First, for the vast majority of people, the interpupillary distance differs little from person to person (about ±2 cm). Second, for a user talking to the robot, the distance between the user and the robot varies only within a relatively small range.
- The principle is to estimate the distance by comparing the interpupillary distance in the captured picture with the interpupillary distance in a calibration picture: the closer the user's face is to the camera, the larger the face appears, and this relationship is approximately linear.
- In this embodiment, the second user image feature includes the interpupillary distance in the image. It is predefined that the interpupillary distance in the image is A1 when the first user is at a distance D1 from the robot, and A2 when the first user is at a distance D2 from the robot.
- The first calculation module 100 then calculates the distance d of the second user relative to the robot by the following formula:
- d = k(a - A1) + D1
- where k is the slope determined by the two calibration points, and a is the second user's interpupillary distance detected in the image when the face detection method is applied.
- The first calculation module 100 can also use other image analysis methods to calculate d.
- Other calculation formulas may be used as long as the above two assumptions and principles are followed; details are not described here.
- The distance gain k_d is determined from the relationship between d and D.
- The distance gain k_d has a positive relationship with d, for example a proportional relationship.
- In this embodiment, the distance gain k_d = d/D.
- In other embodiments, other calculations may be used, such as k_d = d/D + m (where m is a preset coefficient), which are not described in detail here.
- In this embodiment, it is predefined that when the first user's actual interpupillary distance is C, the corresponding interpupillary distance in the image is c.
- The first calculation module 100 then calculates the second user's height h from these values together with the camera height H1 and Δh, the pixel difference between the center of the face rectangle detected by the face detection method and the center of the image.
- The first calculation module 100 may also use other image analysis methods to calculate h.
- Other calculation formulas may be used as long as the above two assumptions and principles are followed; details are not described here.
- The second calculation module 200 collects the ambient volume through the ambient microphone to obtain the ambient noise value v_e, and determines the corresponding environment gain k_e according to v_e and the preset correspondence. Specifically, the ambient noise value v_e is obtained by collecting the ambient volume through the ambient microphone, and the environment gain k_e corresponding to the interval range in which v_e falls is determined.
- A plurality of interval ranges can be preset, each with a corresponding preset environment gain: the interval (v_1, v_2) corresponds to environment gain k_1, the interval (v_2, v_3) corresponds to k_2, and so on, up to the interval (v_{n-1}, v_n) corresponding to k_{n-1}.
- The second calculation module 200 can also determine the environment gain k_e from v_e and a preset calculation formula.
- In this embodiment, the ambient microphone includes at least a first microphone and a second microphone located on the two sides of the robot (on either side of the robot's front baseline), for example on both sides of the robot's head or on both sides of the robot's torso, see FIG. 3. The process by which the second calculation module 200 collects the ambient volume through the ambient microphones to obtain the ambient noise value v_e includes:
- collecting the ambient volume through the first microphone to obtain a first ambient noise value v_1;
- collecting the ambient volume through the second microphone to obtain a second ambient noise value v_2;
- the second calculation module 200 determining v_e from v_1 and v_2, querying the data table for the interval range in which v_e falls, and obtaining the environment gain k_e corresponding to that interval range.
- The volume calculation module 300 determines the speaker volume V_m based on k_h, k_d, k_e, and V.
- In this embodiment, k_h, k_d, and k_e all have a positive relationship (for example, a proportional relationship) with the speaker volume V_m, with V serving as the reference (source) volume and the combination of k_h, k_d, and k_e acting as the total gain. Any appropriate variation of the calculation formula for the speaker volume V_m that preserves this positive relationship can be considered reasonable and is not described further here.
- The above apparatus for a robot to automatically adjust volume determines the speaker volume V_m from the second user's height h, the second user's distance d relative to the robot, and the ambient noise value v_e measured by the ambient microphone, so that the robot can intelligently adjust the speaker volume according to the actual situation.
- This gives the user the optimum volume level regardless of the environment, improving interaction efficiency and user experience.
- The application also provides a computer device comprising a memory and a processor, the memory storing computer readable instructions that, when executed by the processor, cause the processor to perform the steps of the method for a robot to automatically adjust volume described in any of the above embodiments.
- The present application also provides a storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the method for a robot to automatically adjust volume described in any of the above embodiments.
- The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and the speaker volume corresponding to a first user with a predefined height H at a distance D from the robot is V.
- the method includes the following steps:
- acquiring an image by the camera and detecting a second user image feature in the image, calculating a height h of the second user and a distance d relative to the robot according to the second user image feature, determining the height gain k_h according to the relationship between h and H, and determining the distance gain k_d according to the relationship between d and D;
- collecting the ambient volume through the ambient microphone to obtain the ambient noise value v_e, and determining the corresponding environment gain k_e according to v_e and the preset correspondence;
- determining the speaker volume V_m according to k_h, k_d, k_e, and V.
- the storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Geometry (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Ophthalmology & Optometry (AREA)
- Manipulator (AREA)
Abstract
The present application provides a method and an apparatus for automatically adjusting sound volume by a robot. The robot has a camera, a speaker and an ambient microphone for acquiring ambient sound. It is predefined that the corresponding speaker volume is V when a first user with a height of H is at a distance D from the robot. The method comprises the following steps: acquiring an image by means of the camera and detecting a second user image feature in the image; calculating, according to the second user image feature, the height h of the second user and a distance d relative to the robot; determining a height gain k_h according to the relationship between h and H, and determining a distance gain k_d according to the relationship between d and D; acquiring the ambient volume by means of the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding ambient gain k_e according to v_e and a preset correspondence; and determining the speaker volume (I) according to k_h, k_d, k_e and V, so that the robot can intelligently adjust the speaker volume according to the actual situation, improving interaction efficiency and user experience. Further provided are a computer device and a storage medium.
Description
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on April 10, 2018 under Application No. 201810314093.3 and entitled "Method, Apparatus, Computer Device and Storage Medium for a Robot to Adjust Volume", the entire contents of which are incorporated herein by reference.
The present application relates to the field of robotics, and in particular, to a method and apparatus for a robot to automatically adjust volume, and to a computer device and a storage medium storing computer readable instructions.
The inventor realized that current service robots generally use a fixed volume for functions such as voice dialogue and video playback. Various factors, such as the sound of crowds or other audio equipment, may raise the decibel level of the ambient noise, making it difficult for users to hear the robot's voice, so that interaction efficiency and user experience are poor.
Summary of the Invention
The purpose of the present application is to solve at least one of the above technical drawbacks, in particular the technical defect of poor interaction efficiency.
The present application provides a method for a robot to automatically adjust volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D. The method includes the following steps: acquiring an image by the camera and detecting an image feature of a second user in the image; calculating a height h of the second user and a distance d of the second user relative to the robot according to the second user image feature; determining a height gain k_h according to the relationship between h and H, and a distance gain k_d according to the relationship between d and D; collecting the ambient volume through the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e according to v_e and a preset correspondence; and determining the speaker volume V_m according to k_h, k_d, k_e, and V.
The present application also provides an apparatus for a robot to automatically adjust volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D. The apparatus includes: a first calculation module, configured to acquire an image by the camera, detect a second user image feature in the image, calculate a height h of the second user and a distance d relative to the robot according to the second user image feature, and determine a height gain k_h according to the relationship between h and H and a distance gain k_d according to the relationship between d and D; a second calculation module, configured to collect the ambient volume through the ambient microphone to obtain an ambient noise value v_e and determine a corresponding environment gain k_e according to v_e and a preset correspondence; and a volume calculation module, configured to determine the speaker volume V_m according to k_h, k_d, k_e, and V.
The present application also provides a computer device comprising a memory and a processor, the memory storing computer readable instructions that, when executed by the processor, cause the processor to perform a method for a robot to automatically adjust volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D. The method includes the following steps: acquiring an image by the camera and detecting an image feature of a second user in the image; calculating a height h of the second user and a distance d relative to the robot according to the second user image feature; determining a height gain k_h according to the relationship between h and H, and a distance gain k_d according to the relationship between d and D; collecting the ambient volume through the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e according to v_e and a preset correspondence; and determining the speaker volume V_m according to k_h, k_d, k_e, and V.
The present application also provides a non-volatile storage medium storing computer readable instructions that, when executed by one or more processors, cause the one or more processors to perform a method for a robot to automatically adjust volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D. The method includes the following steps: acquiring an image by the camera and detecting an image feature of a second user in the image; calculating a height h of the second user and a distance d relative to the robot according to the second user image feature; determining a height gain k_h according to the relationship between h and H, and a distance gain k_d according to the relationship between d and D; collecting the ambient volume through the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e according to v_e and a preset correspondence; and determining the speaker volume V_m according to k_h, k_d, k_e, and V.
The above method, apparatus, computer device and storage medium for a robot to automatically adjust volume determine the speaker volume V_m from the user's height h, the user's distance d relative to the robot, and the ambient noise value v_e measured by the ambient microphone, so that the robot can intelligently adjust the speaker volume according to the actual situation, giving the user the most suitable volume level in any environment and improving interaction efficiency and user experience.
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic diagram of the internal structure of a computer device in an embodiment;
FIG. 2 is a schematic flow chart of a method for a robot to automatically adjust volume according to an embodiment;
FIG. 3 is a top view of the spatial position between the robot and the user according to an embodiment;
FIG. 4 is a schematic diagram of the modules of an apparatus for a robot to automatically adjust volume according to an embodiment.
FIG. 1 is a schematic diagram of the internal structure of a computer device in an embodiment. As shown in FIG. 1, the computer device includes a processor, a non-volatile storage medium, a memory, and a network interface connected by a system bus. The non-volatile storage medium of the computer device stores an operating system, a database, and computer readable instructions, and the database may store a sequence of control information. When the computer readable instructions are executed by the processor, the processor implements a method for a robot to automatically adjust volume. The processor of the computer device provides computing and control capabilities to support the operation of the entire computer device. Computer readable instructions may be stored in the memory of the computer device, and when executed by the processor, they cause the processor to perform a method for a robot to automatically adjust volume. The network interface of the computer device is used to communicate with a connected terminal. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only a block diagram of a part of the structure related to the solution of the present application and does not constitute a limitation of the computer device to which the solution of the present application is applied; a specific computer device may include more or fewer components than those shown in the figure, combine some components, or have a different component arrangement.
The method for a robot to automatically adjust volume described below can be applied to intelligent robots such as customer service robots, child education robots, and the like.
FIG. 2 is a schematic flow chart of a method for a robot to automatically adjust volume according to an embodiment.
The present application provides a method for a robot to automatically adjust volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound (and also a microphone for collecting the user's voice); a first user with a predefined height H corresponds to a speaker volume V when the distance from the robot is D. The method includes the following steps:
Step S100: acquiring an image by the camera and detecting an image feature of the second user in the image, calculating a height h of the second user and a distance d relative to the robot according to the second user image feature, determining a height gain k_h according to the relationship between h and H, and determining a distance gain k_d according to the relationship between d and D.
A face detection method can be used to perform face detection in order to detect the second user in the image.
Since the robot's camera may capture multiple faces, some of which belong to background persons who are not interacting with the robot (for example, not in a conversation with it), only the person facing the camera and talking to the robot needs to be considered. The camera is usually placed on the side of the robot facing the user; for example, if the robot has a head, it can be placed at the forehead or face of the head, and if the robot has a torso, it can also be placed on the front of the torso. The setting position of the camera is not limited here; it is only necessary to ensure that the camera can capture the second user when the second user talks to the robot.
For the camera, the captured image (picture or video frame) has a fixed size, and a preset rectangular region at the center of the picture can be defined as the face recognition area, with face detection performed only in this area. For example, for a 1920×1080 picture, a 1000×1000 rectangle at the center of the picture can be delineated as the face recognition area.
Face detection technology, that is, the technique of detecting the faces present in an image through image analysis and accurately framing the position of each face with a rectangular box, is the basis of facial feature point detection and face recognition. Commonly used face detection data sets include FDDB (Face Detection Data Set and Benchmark). With the rapid development of deep learning in recent years, many excellent face detection methods have emerged.
For example, many excellent face detection methods have been submitted on the FDDB benchmark, such as cascaded CNN (Convolutional Neural Network) face detection (A Convolutional Neural Network Cascade), an improved Faster R-CNN for face detection (Face Detection using Deep Learning: An Improved Faster RCNN Approach), and Finding Tiny Faces, which is very successful at detecting small faces. In addition, libraries such as OpenCV, dlib, and libfacedetect also provide face detection interfaces.
Commonly used face detection methods include the following:
1. Single CNN face detection
2. Cascaded CNN face detection
3. OpenCV face detection
4. Dlib face detection
5. libfacedetect face detection
6. Seetaface face detection
The single CNN face detection method is briefly introduced below.
First, a binary classifier for distinguishing faces from non-faces is trained. For example, the convolutional neural network CaffeNet can be used for binary classification: a model pre-trained on the ImageNet dataset can be fine-tuned with one's own face dataset. A custom convolutional network can also be trained; to detect smaller face targets, a smaller convolutional neural network is generally used as the binary classification model, which reduces the input image size and speeds up prediction.
The fully connected layers of the trained face classification network are then converted into convolutional layers, so that the network becomes a fully convolutional network that can accept input images of arbitrary size. Passing an image through the fully convolutional network yields a feature map in which each "point" corresponds to the probability that the receptive field mapped to that position in the original image contains a face, and positions whose face probability exceeds a set threshold are taken as face candidate boxes.
The size of a face in the image varies; to adapt to this variation, the best approach is to use an image pyramid and scale the image to be detected to different sizes for multi-scale face detection. Non-maximum suppression (NMS) is applied to all face candidate boxes detected at the different scales to obtain the final face detection result.
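A compact sketch of the greedy non-maximum suppression step described above; the box representation and the IoU threshold are illustrative assumptions rather than values from the patent:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hedged sketch of greedy non-maximum suppression over face candidate boxes.
public class NmsSketch {
    public static class Box {
        public final float x1, y1, x2, y2, score;
        public Box(float x1, float y1, float x2, float y2, float score) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2; this.score = score;
        }
    }

    public static List<Box> nms(List<Box> candidates, float iouThreshold) {
        List<Box> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Box b) -> b.score).reversed());
        List<Box> kept = new ArrayList<>();
        for (Box candidate : sorted) {
            boolean suppressed = false;
            for (Box keptBox : kept) {
                if (iou(candidate, keptBox) > iouThreshold) {
                    suppressed = true; // overlaps a higher-scoring box too much
                    break;
                }
            }
            if (!suppressed) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    private static float iou(Box a, Box b) {
        float ix1 = Math.max(a.x1, b.x1), iy1 = Math.max(a.y1, b.y1);
        float ix2 = Math.min(a.x2, b.x2), iy2 = Math.min(a.y2, b.y2);
        float inter = Math.max(0, ix2 - ix1) * Math.max(0, iy2 - iy1);
        float areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
        float areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
        return inter / (areaA + areaB - inter);
    }
}
```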
If the method for a robot to automatically adjust volume in this embodiment is applied on the Android system, the FaceDetector class may be used to determine whether the image captured by the camera contains a face image. Android has a built-in face recognition API, FaceDetector, which can perform face recognition with a small amount of code, but this is the most basic form of recognition: it can only recognize the faces present in an image.
The face recognition technology in Android relies on the underlying library android/external/neven/ and the framework layer frameworks/base/media/java/android/media/FaceDetector.java. The limitations of the Java layer are: (1) only data in Bitmap format is accepted; (2) only faces whose eye distance is greater than 20 pixels can be recognized (this can be modified in the framework layer); (3) only the position of a face (the center point between the eyes and the eye distance) can be detected, and faces cannot be matched (a specified face cannot be looked up).
The main methods provided by the Neven library are:
A. android.media.FaceDetector.FaceDetector(int width, int height, int maxFaces); B. int android.media.FaceDetector.findFaces(Bitmap bitmap, Face[] faces).
In the Android system, the two-eye position of a face image can be obtained through the FaceDetector class, and the location of the face image on the desktop is determined from the position of the two eyes. The specific steps may be: obtaining the center point between the two eyes of the face image, obtaining the distance between the two eyes (the interpupillary distance), and drawing a rectangular area (rectangular frame) from the center point of the two-eye position and the distance between the two eyes, with this rectangular area taken as the location of the face image on the desktop. The center point between the two eyes of the face image can be obtained with the following code:
mFace[i].getMidPoint(eyeMidPoint);
The distance between the two eyes of the face image can be obtained with the following code:
eyesDistance = mFace[i].eyesDistance();
The rectangular area can be drawn with the following code:
myEyesDistance = face.eyesDistance(); // get the center point between the two eyes and the eye spacing parameter, and draw a frame for each face
关于人脸检测方法在此不再赘述,在本实施例中,并不对人脸检测方法进行限定。检测出人脸后,计算所检测到的第二用户的高度h和相对机器人的距离d,根据h与H的关系确定高度增益k
h,以及d与D的关系确定距离增益k
d。
Regarding the face detection method, it will not be described here. In the present embodiment, the face detection method is not limited. After detecting the face, the detected height h of the second user and the distance d relative to the robot are calculated, the height gain k h is determined according to the relationship between h and H, and the relationship between d and D determines the distance gain k d .
The distance d between the second user and the robot can be calculated from the readings of a ranging sensor, for example infrared ranging with an infrared sensor or laser ranging with a laser sensor. In this embodiment, d is determined by image analysis.

The calculation of d rests on two assumptions. First, for the vast majority of people, the interpupillary distance varies little (by roughly ±2 cm). Second, a user talking to the robot stays within a fairly small range of distances from it. The principle is to estimate the distance by comparing the interpupillary distance of the face in the captured picture with that in a calibration picture: the closer the user's face is to the camera, the larger the face appears in the image, and this relationship is approximately linear.
In this embodiment, the second user image feature includes the interpupillary distance in the image. It is predefined that when the first user is at a distance D1 from the robot the interpupillary distance in the image is A1, and when the first user is at a distance D2 the interpupillary distance in the image is A2. The distance d between the second user and the robot is then calculated by the following formula:

d = k(a - A1) + D1

where k is a calibration coefficient determined from the two calibration points (D1, A1) and (D2, A2), which for the linear relationship assumed here equals (D2 - D1)/(A2 - A1), and a is the interpupillary distance of the second user detected in the image by the face detection method.
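A minimal numerical sketch of this calibration is given below. It assumes the slope k = (D2 - D1)/(A2 - A1) discussed above; the class name and all calibration values are made-up examples, not values from the present application.

public class DistanceEstimator {
    private final double d1, a1, k;

    // (d1, a1) and (d2, a2) are the two calibration points (distance, image interpupillary distance).
    public DistanceEstimator(double d1, double a1, double d2, double a2) {
        this.d1 = d1;
        this.a1 = a1;
        this.k = (d2 - d1) / (a2 - a1); // slope of the line through the two calibration points
    }

    // a = interpupillary distance (in pixels) detected in the current image.
    public double estimateDistance(double a) {
        return k * (a - a1) + d1;       // d = k(a - A1) + D1
    }

    public static void main(String[] args) {
        // Example calibration: 80 px at 1.0 m and 40 px at 2.0 m (assumed numbers).
        DistanceEstimator est = new DistanceEstimator(1.0, 80, 2.0, 40);
        System.out.println(est.estimateDistance(60)); // prints 1.5
    }
}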
Of course, other image analysis methods can also be used to calculate d, for example other calculation formulas, as long as the two assumptions and the principle above are followed; they are not described here.
After d is determined, the distance gain k_d is determined from the relationship between d and D. The distance gain k_d has a positive relationship with d, for example a proportional one. In this embodiment, k_d = d/D. Of course, other calculations can be used in other embodiments, for example k_d = d/D + m (where m is a preset coefficient), and so on; they are not described here.
In this embodiment, it is predefined that a real interpupillary distance C of the first user corresponds to an interpupillary distance c in the image. The second user's height h is then calculated from this calibration together with the camera height H1 and the pixel difference Δh between the center of the face rectangle detected by the face detection method and the center of the image.
Similarly, other image analysis methods can also be used to calculate h, for example other calculation formulas, as long as the two assumptions and the principle above are followed; they are not described here.
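The exact height formula is rendered as an image in the original publication, so the sketch below shows only one plausible instantiation consistent with the quantities named above: a horizontal camera, pinhole scaling, and the metres-per-pixel scale at the face taken as an assumed real interpupillary distance divided by the detected interpupillary distance in pixels. It is an illustration under those assumptions, not the formula of the present application.

public class HeightEstimator {
    private final double cameraHeightM;   // H1, camera height above the ground, in metres
    private final double realIpdM;        // C, assumed real interpupillary distance, in metres

    public HeightEstimator(double cameraHeightM, double realIpdM) {
        this.cameraHeightM = cameraHeightM;
        this.realIpdM = realIpdM;
    }

    // deltaHPx = Δh, pixel offset of the face-box centre above the image centre;
    // ipdPx    = interpupillary distance of the detected face, in pixels.
    public double estimateHeight(double deltaHPx, double ipdPx) {
        double metresPerPixel = realIpdM / ipdPx;          // scale at the user's face
        return cameraHeightM + deltaHPx * metresPerPixel;  // eye height ≈ H1 + Δh * scale
    }

    public static void main(String[] args) {
        HeightEstimator est = new HeightEstimator(1.2, 0.063); // illustrative values
        System.out.println(est.estimateHeight(150, 60));       // ≈ 1.36
    }
}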
After h is determined, the height gain k_h is determined from the relationship between h and H. Likewise, h and H have a positive relationship, for example a proportional one. In this embodiment, k_h = (h - Δ)/(H - Δ), where Δ is the speaker height. Of course, in some embodiments the speaker height can be ignored, i.e. Δ = 0.
Step S200: collect the ambient volume with the ambient microphone to obtain the ambient noise value v_e, and determine the corresponding environment gain k_e from v_e and a preset correspondence.

Specifically, the ambient volume is collected with the ambient microphone to obtain the ambient noise value v_e, and the environment gain k_e corresponding to the interval in which v_e falls is determined.
Several interval ranges can be preset, each with a corresponding preset environment gain: the range (v_1, v_2) corresponds to environment gain k_1, the range (v_2, v_3) corresponds to k_2, ..., and the range (v_{n-1}, v_n) corresponds to k_{n-1}.

For example, the noise reference can be set to 70 dB.
In a quiet environment (v_e < 40 dB), k_e = 0.8;

in an ordinary environment (40 dB < v_e < 70 dB), k_e = 1;

in a noisy environment (70 dB < v_e < 90 dB), k_e = 1 + (v_e - 70)/100;

in an extremely noisy environment (v_e > 90 dB), k_e = ∞.

Of course, in some embodiments the environment gain k_e can be determined from v_e by a preset calculation formula.
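A minimal sketch of the interval lookup is given below, using the example thresholds listed above. Representing the extremely noisy band as Double.POSITIVE_INFINITY, to be clamped later by the preset maximum volume, is an implementation assumption, as is the class name.

public final class EnvironmentGain {
    private EnvironmentGain() {}

    // Map an ambient noise level in dB to the environment gain k_e,
    // following the example interval table given above.
    public static double gainFor(double ambientNoiseDb) {
        if (ambientNoiseDb < 40) {            // quiet environment
            return 0.8;
        } else if (ambientNoiseDb < 70) {     // ordinary environment
            return 1.0;
        } else if (ambientNoiseDb < 90) {     // noisy environment
            return 1.0 + (ambientNoiseDb - 70) / 100.0;
        } else {                              // extremely noisy environment
            return Double.POSITIVE_INFINITY;
        }
    }
}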
In one embodiment, the ambient microphones include at least a first microphone and a second microphone located on the two sides of the robot (taking the robot's facing direction as the baseline), for example on the two sides of the robot's head or of its torso, as shown in FIG. 3. Collecting the ambient volume with the ambient microphones to obtain the ambient noise value v_e then includes:

collecting the ambient volume with the first microphone to obtain a first ambient noise value v_1, collecting the ambient volume with the second microphone to obtain a second ambient noise value v_2, and taking the larger of v_1 and v_2 as the ambient noise value v_e, i.e. v_e = max(v_1, v_2).

Once v_e is determined, the interval containing v_e is looked up in the data table and the environment gain k_e corresponding to that interval is obtained.
Step S300: determine the speaker volume V_m from k_h, k_d, k_e, and V.

k_h, k_d, and k_e all have a positive relationship (for example a proportional one) with the speaker volume V_m: the predefined volume V provides the source level, and the combination of k_h, k_d, and k_e provides the total gain applied to it. Any suitable variation of the formula for the speaker volume V_m that preserves this positive relationship can therefore be considered reasonable and is not described further here.
Of course, a maximum volume V_max and a minimum volume V_min can also be preset: if V_m < V_min, then V_m = V_min; if V_m > V_max, then V_m = V_max.
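The combination formula is rendered as an image in the original publication; V_m = k_h * k_d * k_e * V is one formula consistent with the positive relationships described above and is used as an assumption in the sketch below, together with the V_min/V_max clamp.

public final class VolumeCalculator {
    private VolumeCalculator() {}

    // Assumed combination rule V_m = k_h * k_d * k_e * V, then clamped to [vMin, vMax].
    public static double speakerVolume(double kh, double kd, double ke,
                                       double referenceVolume,
                                       double vMin, double vMax) {
        double vm = kh * kd * ke * referenceVolume;
        if (vm < vMin) vm = vMin;   // never quieter than the preset minimum volume
        if (vm > vMax) vm = vMax;   // never louder than the preset maximum volume
        return vm;
    }

    public static void main(String[] args) {
        // Example: kh = 1.1, kd = 1.5 (user farther than calibration), ke = 1.2 (noisy room),
        // reference volume 50, clamped to [20, 100]; all numbers are illustrative.
        System.out.println(speakerVolume(1.1, 1.5, 1.2, 50, 20, 100)); // ≈ 99
    }
}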
With the above method for automatically adjusting the volume, the speaker volume V_m is determined from the second user's height h, the second user's distance d relative to the robot, and the ambient noise value v_e measured by the ambient microphone, so that the robot can adjust the speaker volume intelligently according to the actual situation, give the user the most suitable volume in any environment, and improve interaction efficiency and user experience.

FIG. 4 is a schematic diagram of the modules of an apparatus by which a robot automatically adjusts its volume according to one embodiment. Corresponding to the method, the present application also provides an apparatus by which a robot automatically adjusts its volume. The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound (as well as a microphone for collecting the user's voice), and a first user with a predefined height H corresponds to a speaker volume V at a distance D from the robot. The apparatus includes a first calculation module 100, a second calculation module 200, and a volume calculation module 300.
The first calculation module 100 is configured to acquire an image with the camera, detect the image features of a second user in the image, calculate the second user's height h and distance d relative to the robot from those features, determine the height gain k_h from the relationship between h and H, and determine the distance gain k_d from the relationship between d and D. The second calculation module 200 is configured to collect the ambient volume with the ambient microphone to obtain the ambient noise value v_e and to determine the corresponding environment gain k_e from v_e and a preset correspondence. The volume calculation module 300 is configured to determine the speaker volume V_m from k_h, k_d, k_e, and V.

The first calculation module 100 acquires an image with the camera and detects the second user image features in it, calculates the second user's height h and distance d relative to the robot from those features, determines the height gain k_h from the relationship between h and H, and determines the distance gain k_d from the relationship between d and D.
The first calculation module 100 may use a face detection method to detect the second user in the image.

Since the robot's camera may capture several faces, some of which belong to people in the background who are not interacting with the robot (for example, not talking to it), only the person facing the camera and talking to the robot needs to be considered. The camera is usually mounted on the side of the robot that faces the user: if the robot has a head, it can be placed on the forehead or the face of the head; if the robot has a torso, it can also be placed on the front of the torso. The mounting position of the camera is not limited here; it only needs to be able to capture the second user while the second user is talking to the robot.

For the camera, the captured image (a picture or a video frame) has a fixed size, so a preset rectangular region at the center of the picture can be designated as the face recognition region and face detection is performed only inside that region. For example, for a 1920×1080 picture, a 1000×1000 rectangle at the center of the picture can be designated as the face recognition region.
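A minimal sketch of cropping such a centered region from a frame before running detection is given below; the helper class name and the use of Bitmap.createBitmap for the crop are assumptions made for this example.

import android.graphics.Bitmap;

public final class FaceRegion {
    private FaceRegion() {}

    // Crop a centered regionW x regionH face-recognition region from the frame,
    // e.g. 1000 x 1000 from a 1920 x 1080 picture as in the example above.
    public static Bitmap centerCrop(Bitmap frame, int regionW, int regionH) {
        int w = Math.min(regionW, frame.getWidth());
        int h = Math.min(regionH, frame.getHeight());
        int x = (frame.getWidth() - w) / 2;   // left edge of the centered region
        int y = (frame.getHeight() - h) / 2;  // top edge of the centered region
        return Bitmap.createBitmap(frame, x, y, w, h);
    }
}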
Face detection, i.e. detecting the faces present in an image by image analysis and accurately marking each face's position with a rectangular box, is the basis of facial landmark detection and face recognition. Commonly used face detection datasets include FDDB (Face Detection Data Set and Benchmark). With the rapid development of deep learning in recent years, many excellent face detection methods have emerged.

For example, many strong face detection methods have been submitted to the FDDB benchmark, such as the cascaded CNN (Convolutional Neural Network) detector A Convolutional Neural Network Cascade, the improved Faster R-CNN detector Face Detection using Deep Learning: An Improved Faster RCNN Approach, and Finding Tiny Faces, which is very successful at detecting small faces. In addition, libraries such as OpenCV, dlib, and libfacedetect also provide face detection interfaces.
Commonly used face detection methods include the following:

1. Single-CNN face detection

2. Cascaded-CNN face detection

3. OpenCV face detection

4. Dlib face detection

5. libfacedetect face detection

6. SeetaFace face detection

The single-CNN face detection method is briefly introduced below.
First, a binary classifier that distinguishes faces from non-faces is trained. For example, the convolutional neural network CaffeNet can be used for the binary classification: a model pre-trained on the ImageNet dataset can be fine-tuned with one's own face dataset. A custom convolutional network can also be trained; to detect smaller faces, a smaller convolutional neural network is generally used as the binary classification model, which reduces the input image size and speeds up prediction.

The fully connected layers of the trained face classification network are then converted into convolutional layers, turning the network into a fully convolutional network that accepts input images of any size. Passing an image through the fully convolutional network yields a feature map in which each "point" gives the probability that the receptive field it maps back to in the original image contains a face; locations whose face probability exceeds a set threshold are taken as face candidate boxes.

The size of faces in an image varies. To cope with this, the best approach is to use an image pyramid, scaling the image to be detected to several sizes and performing multi-scale face detection. Non-maximum suppression (NMS) is applied to all the face candidate boxes detected across the scales to obtain the final face detection result.
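A minimal sketch of the non-maximum suppression step is given below; the box representation, the greedy keep-the-highest-score strategy, and the IoU threshold passed by the caller are illustrative assumptions rather than details taken from the present application.

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public final class Nms {
    // A candidate box: corners (x1, y1)-(x2, y2) and a face-probability score.
    public static class Box {
        public final float x1, y1, x2, y2, score;
        public Box(float x1, float y1, float x2, float y2, float score) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2; this.score = score;
        }
        float area() { return Math.max(0, x2 - x1) * Math.max(0, y2 - y1); }
    }

    // Keep the highest-scoring boxes, discarding any box whose IoU with an
    // already-kept box exceeds the threshold.
    public static List<Box> suppress(List<Box> candidates, float iouThreshold) {
        List<Box> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble(b -> -b.score)); // highest score first
        List<Box> kept = new ArrayList<>();
        for (Box b : sorted) {
            boolean overlapsKept = false;
            for (Box k : kept) {
                if (iou(b, k) > iouThreshold) { overlapsKept = true; break; }
            }
            if (!overlapsKept) kept.add(b);
        }
        return kept;
    }

    private static float iou(Box a, Box b) {
        float ix = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
        float iy = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
        float inter = ix * iy;
        return inter / (a.area() + b.area() - inter);
    }
}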
If the apparatus for automatically adjusting the volume described in this embodiment runs on an Android system, the first calculation module 100 can use the FaceDetector class to determine whether the image captured by the camera contains a face. Android ships with a built-in face detection API, FaceDetector, which can detect faces with very little code; this detection is basic, meaning it can only locate faces in the image.

Face detection in Android relies on the underlying library android/external/neven/ and the framework layer frameworks/base/media/java/android/media/FaceDetector.java. The Java layer has three limitations: (1) it only accepts data in Bitmap format; (2) it only recognizes faces whose eye distance is greater than 20 pixels (this can be modified in the framework layer); (3) it only detects the position of a face (the midpoint between the eyes and the eye distance) and cannot match a face against a specified template.

The main methods provided by the Neven library are:

A. android.media.FaceDetector.FaceDetector(int width, int height, int maxFaces); B. int android.media.FaceDetector.findFaces(Bitmap bitmap, Face[] faces).

On Android, the first calculation module 100 can obtain the eye positions of a detected face through the FaceDetector class and determine the position of the face in the captured frame from those eye positions. The specific steps may be: obtain the midpoint between the two eyes of the face, obtain the eye distance (interpupillary distance) in the image, and draw a rectangle (bounding box) from the eye midpoint and the eye distance; this rectangle is taken as the position of the face. The eye midpoint can be obtained with the following code:

mFace[i].getMidPoint(eyeMidPoint);

The eye distance of the face can be obtained with the following code:

eyesDistance=mFace[i].eyesDistance();

The rectangle can be drawn with the following code:

myEyesDistance=face.eyesDistance(); // get the eye midpoint and eye distance, then draw a box for each face
The face detection method itself is not described further here, and this embodiment does not limit which face detection method is used. After the first calculation module 100 detects a face, it calculates the detected second user's height h and distance d relative to the robot, determines the height gain k_h from the relationship between h and H, and determines the distance gain k_d from the relationship between d and D.

The distance d between the second user and the robot can be calculated from the readings of a ranging sensor, for example infrared ranging with an infrared sensor or laser ranging with a laser sensor. In this embodiment, the first calculation module 100 determines d by image analysis.

The calculation of d rests on two assumptions. First, for the vast majority of people, the interpupillary distance varies little (by roughly ±2 cm). Second, a user talking to the robot stays within a fairly small range of distances from it. The principle is to estimate the distance by comparing the interpupillary distance of the face in the captured picture with that in a calibration picture: the closer the user's face is to the camera, the larger the face appears in the image, and this relationship is approximately linear.
In this embodiment, the second user image feature includes the interpupillary distance in the image. It is predefined that when the first user is at a distance D1 from the robot the interpupillary distance in the image is A1, and when the first user is at a distance D2 the interpupillary distance in the image is A2. The first calculation module 100 then calculates the distance d between the second user and the robot by the following formula:

d = k(a - A1) + D1

where k is a calibration coefficient determined from the two calibration points (D1, A1) and (D2, A2), which for the linear relationship assumed here equals (D2 - D1)/(A2 - A1), and a is the interpupillary distance of the second user detected in the image by the face detection method.
Of course, the first calculation module 100 can also use other image analysis methods to calculate d, for example other calculation formulas, as long as the two assumptions and the principle above are followed; they are not described here.

After the first calculation module 100 determines d, it determines the distance gain k_d from the relationship between d and D. The distance gain k_d has a positive relationship with d, for example a proportional one. In this embodiment, k_d = d/D. Of course, other calculations can be used in other embodiments, for example k_d = d/D + m (where m is a preset coefficient), and so on; they are not described here.
In this embodiment, it is predefined that a real interpupillary distance C of the first user corresponds to an interpupillary distance c in the image. The first calculation module 100 then calculates the second user's height h from this calibration together with the camera height H1 and the pixel difference Δh between the center of the face rectangle detected by the face detection method and the center of the image.

Similarly, the first calculation module 100 can also use other image analysis methods to calculate h, for example other calculation formulas, as long as the two assumptions and the principle above are followed; they are not described here.
After the first calculation module 100 determines h, it determines the height gain k_h from the relationship between h and H. Likewise, h and H have a positive relationship, for example a proportional one. In this embodiment, k_h = (h - Δ)/(H - Δ), where Δ is the speaker height. Of course, in some embodiments the speaker height can be ignored, i.e. Δ = 0.

The second calculation module 200 collects the ambient volume with the ambient microphone to obtain the ambient noise value v_e and determines the corresponding environment gain k_e from v_e and a preset correspondence. Specifically, the ambient volume is collected with the ambient microphone to obtain the ambient noise value v_e, and the environment gain k_e corresponding to the interval in which v_e falls is determined.
Several interval ranges can be preset, each with a corresponding preset environment gain: the range (v_1, v_2) corresponds to environment gain k_1, the range (v_2, v_3) corresponds to k_2, ..., and the range (v_{n-1}, v_n) corresponds to k_{n-1}.

For example, the noise reference can be set to 70 dB.
In a quiet environment (v_e < 40 dB), k_e = 0.8;

in an ordinary environment (40 dB < v_e < 70 dB), k_e = 1;

in a noisy environment (70 dB < v_e < 90 dB), k_e = 1 + (v_e - 70)/100;

in an extremely noisy environment (v_e > 90 dB), k_e = ∞.

Of course, in some embodiments the second calculation module 200 can determine the environment gain k_e from v_e by a preset calculation formula.
In one embodiment, the ambient microphones include at least a first microphone and a second microphone located on the two sides of the robot (taking the robot's facing direction as the baseline), for example on the two sides of the robot's head or of its torso, as shown in FIG. 3. The process by which the second calculation module 200 collects the ambient volume with the ambient microphones to obtain the ambient noise value v_e includes:

collecting the ambient volume with the first microphone to obtain a first ambient noise value v_1, collecting the ambient volume with the second microphone to obtain a second ambient noise value v_2, and taking the larger of v_1 and v_2 as the ambient noise value v_e, i.e. v_e = max(v_1, v_2).

Once the second calculation module 200 has determined v_e, the interval containing v_e is looked up in the data table and the environment gain k_e corresponding to that interval is obtained.
The volume calculation module 300 determines the speaker volume V_m from k_h, k_d, k_e, and V.

k_h, k_d, and k_e all have a positive relationship (for example a proportional one) with the speaker volume V_m: the predefined volume V provides the source level, and the combination of k_h, k_d, and k_e provides the total gain applied to it. Any suitable variation of the formula for the speaker volume V_m that preserves this positive relationship can therefore be considered reasonable and is not described further here.
Of course, a maximum volume V_max and a minimum volume V_min can also be preset: if V_m < V_min, then V_m = V_min; if V_m > V_max, then V_m = V_max.

With the above apparatus for automatically adjusting the volume, the speaker volume V_m is determined from the second user's height h, the second user's distance d relative to the robot, and the ambient noise value v_e measured by the ambient microphone, so that the robot can adjust the speaker volume intelligently according to the actual situation, give the user the most suitable volume in any environment, and improve interaction efficiency and user experience.
The present application also provides a computer device including a memory and a processor, the memory storing computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the method for automatically adjusting the volume of a robot described in any of the above embodiments.

The present application also provides a storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the method for automatically adjusting the volume of a robot described in any of the above embodiments.
The robot has a camera, a speaker, and an ambient microphone for collecting ambient sound, and a first user with a predefined height H corresponds to a speaker volume V at a distance D from the robot. The method includes the following steps: acquiring an image with the camera and detecting the second user image features in the image; calculating the second user's height h and distance d relative to the robot from those features; determining the height gain k_h from the relationship between h and H and the distance gain k_d from the relationship between d and D; collecting the ambient volume with the ambient microphone to obtain the ambient noise value v_e and determining the corresponding environment gain k_e from v_e and a preset correspondence; and determining the speaker volume V_m from k_h, k_d, k_e, and V. By determining the user's height h and the user's distance d relative to the robot and combining them with the ambient noise value v_e measured by the ambient microphone to determine the speaker volume V_m, the robot can adjust the speaker volume intelligently according to the actual situation, give the user the most suitable volume in any environment, and improve interaction efficiency and user experience.

A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a computer readable storage medium, and when executed it can include the processes of the embodiments of the methods described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random access memory (RAM).
Claims (20)
- A method by which a robot automatically adjusts its volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound, a first user with a predefined height H corresponding to a speaker volume V at a distance D from the robot, the method comprising the following steps: acquiring an image with the camera and detecting image features of a second user in the image, calculating a height h of the second user and a distance d of the second user relative to the robot according to the second user image features, determining a height gain k_h from the relationship between h and H, and determining a distance gain k_d from the relationship between d and D; collecting the ambient volume with the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e from v_e and a preset correspondence; and determining the speaker volume from k_h, k_d, k_e, and V.
- The method according to claim 1, wherein the second user image features include an interpupillary distance in the image; it is predefined that when the first user is at a distance D1 from the robot the interpupillary distance in the image is A1 and when the first user is at a distance D2 from the robot the interpupillary distance in the image is A2; and the distance d of the second user relative to the robot is calculated by the formula d = k(a - A1) + D1, where k is a calibration coefficient determined from (D1, A1) and (D2, A2) and a is the interpupillary distance of the second user detected in the image.
- The method according to claim 1, wherein the second user image features include an interpupillary distance in the image; it is predefined that a real interpupillary distance C of the first user corresponds to an interpupillary distance c in the image; and the second user's height h is calculated from the camera height H1, the pixel difference Δh between the center of the detected face rectangle and the center of the image, and the calibrated ratio between C and c.
- The method according to claim 1, wherein the ambient microphone includes at least a first microphone and a second microphone located on the two sides of the robot, and collecting the ambient volume with the ambient microphone to obtain the ambient noise value v_e includes: collecting the ambient volume with the first microphone to obtain a first ambient noise value v_1, collecting the ambient volume with the second microphone to obtain a second ambient noise value v_2, and determining the larger of v_1 and v_2 as the ambient noise value v_e.
- The method according to claim 1, wherein the height gain k_h = (h - Δ)/(H - Δ), where Δ is the speaker height.
- The method according to claim 1, wherein the distance gain k_d = d/D.
- An apparatus by which a robot automatically adjusts its volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound, a first user with a predefined height H corresponding to a speaker volume V at a distance D from the robot, the apparatus comprising: a first calculation module configured to acquire an image with the camera, detect second user image features in the image, calculate a height h of the second user and a distance d of the second user relative to the robot according to the second user image features, determine a height gain k_h from the relationship between h and H, and determine a distance gain k_d from the relationship between d and D; a second calculation module configured to collect the ambient volume with the ambient microphone to obtain an ambient noise value v_e and to determine a corresponding environment gain k_e from v_e and a preset correspondence; and a volume calculation module configured to determine the speaker volume from k_h, k_d, k_e, and V.
- The apparatus according to claim 7, wherein the second user image features include an interpupillary distance in the image; it is predefined that when the first user is at a distance D1 from the robot the interpupillary distance in the image is A1 and when the first user is at a distance D2 from the robot the interpupillary distance in the image is A2; and the first calculation module calculates the distance d of the second user relative to the robot by the formula d = k(a - A1) + D1, where k is a calibration coefficient determined from (D1, A1) and (D2, A2) and a is the interpupillary distance of the second user detected in the image.
- A computer device comprising a memory and a processor, the memory storing computer readable instructions which, when executed by the processor, cause the processor to perform a method by which a robot automatically adjusts its volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound, a first user with a predefined height H corresponding to a speaker volume V at a distance D from the robot, the method comprising the following steps: acquiring an image with the camera and detecting image features of a second user in the image, calculating a height h of the second user and a distance d of the second user relative to the robot according to the second user image features, determining a height gain k_h from the relationship between h and H, and determining a distance gain k_d from the relationship between d and D; collecting the ambient volume with the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e from v_e and a preset correspondence; and determining the speaker volume from k_h, k_d, k_e, and V.
- The computer device according to claim 9, wherein the second user image features include an interpupillary distance in the image; it is predefined that when the first user is at a distance D1 from the robot the interpupillary distance in the image is A1 and when the first user is at a distance D2 from the robot the interpupillary distance in the image is A2; and the distance d of the second user relative to the robot is calculated by the formula d = k(a - A1) + D1, where k is a calibration coefficient determined from (D1, A1) and (D2, A2) and a is the interpupillary distance of the second user detected in the image.
- The computer device according to claim 9, wherein the second user image features include an interpupillary distance in the image; it is predefined that a real interpupillary distance C of the first user corresponds to an interpupillary distance c in the image; and the second user's height h is calculated from the camera height H1, the pixel difference Δh between the center of the detected face rectangle and the center of the image, and the calibrated ratio between C and c.
- The computer device according to claim 9, wherein the ambient microphone includes at least a first microphone and a second microphone located on the two sides of the robot, and collecting the ambient volume with the ambient microphone to obtain the ambient noise value v_e includes: collecting the ambient volume with the first microphone to obtain a first ambient noise value v_1, collecting the ambient volume with the second microphone to obtain a second ambient noise value v_2, and determining the larger of v_1 and v_2 as the ambient noise value v_e.
- The computer device according to claim 9, wherein the height gain k_h = (h - Δ)/(H - Δ), where Δ is the speaker height.
- The computer device according to claim 9, wherein the distance gain k_d = d/D.
- A non-volatile storage medium storing computer readable instructions which, when executed by one or more processors, cause the one or more processors to perform a method by which a robot automatically adjusts its volume, the robot having a camera, a speaker, and an ambient microphone for collecting ambient sound, a first user with a predefined height H corresponding to a speaker volume V at a distance D from the robot, the method comprising the following steps: acquiring an image with the camera and detecting image features of a second user in the image, calculating a height h of the second user and a distance d of the second user relative to the robot according to the second user image features, determining a height gain k_h from the relationship between h and H, and determining a distance gain k_d from the relationship between d and D; collecting the ambient volume with the ambient microphone to obtain an ambient noise value v_e, and determining a corresponding environment gain k_e from v_e and a preset correspondence; and determining the speaker volume from k_h, k_d, k_e, and V.
- The non-volatile storage medium according to claim 15, wherein the second user image features include an interpupillary distance in the image; it is predefined that when the first user is at a distance D1 from the robot the interpupillary distance in the image is A1 and when the first user is at a distance D2 from the robot the interpupillary distance in the image is A2; and the distance d of the second user relative to the robot is calculated by the formula d = k(a - A1) + D1, where k is a calibration coefficient determined from (D1, A1) and (D2, A2) and a is the interpupillary distance of the second user detected in the image.
- The non-volatile storage medium according to claim 15, wherein the second user image features include an interpupillary distance in the image; it is predefined that a real interpupillary distance C of the first user corresponds to an interpupillary distance c in the image; and the second user's height h is calculated from the camera height H1, the pixel difference Δh between the center of the detected face rectangle and the center of the image, and the calibrated ratio between C and c.
- The non-volatile storage medium according to claim 15, wherein the ambient microphone includes at least a first microphone and a second microphone located on the two sides of the robot, and collecting the ambient volume with the ambient microphone to obtain the ambient noise value v_e includes: collecting the ambient volume with the first microphone to obtain a first ambient noise value v_1, collecting the ambient volume with the second microphone to obtain a second ambient noise value v_2, and determining the larger of v_1 and v_2 as the ambient noise value v_e.
- The non-volatile storage medium according to claim 15, wherein the height gain k_h = (h - Δ)/(H - Δ), where Δ is the speaker height.
- The non-volatile storage medium according to claim 15, wherein the distance gain k_d = d/D.
Applications Claiming Priority (2)
- CN201810314093.3A (published as CN108628572B): priority date 2018-04-10, filing date 2018-04-10, title "Method and device for adjusting volume by robot, computer equipment and storage medium"
- CN201810314093.3: priority date 2018-04-10

Publications (1)
- WO2019196312A1: publication date 2019-10-17

Family ID: 63704910

Family Applications (1)
- PCT/CN2018/102853: "Method and apparatus for adjusting sound volume by robot, computer device and storage medium", priority date 2018-04-10, filing date 2018-08-29

Country Status (2)
- CN: CN108628572B
- WO: WO2019196312A1
Also Published As
- CN108628572A: 2018-10-09
- CN108628572B: 2020-03-31
Legal Events
- 121 (EP): the EPO has been informed by WIPO that EP was designated in this application; ref document number 18914380, kind code A1
- NENP: non-entry into the national phase; ref country code DE
- 32PN (EP): public notification in the EP bulletin as the address of the addressee cannot be established; free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 25.01.2021)
- 122 (EP): PCT application non-entry in European phase; ref document number 18914380, kind code A1