CN112669866B - Speech noise reduction method, system and computer storage medium based on loudness level - Google Patents

Info

Publication number
CN112669866B
Authority
CN
China
Prior art keywords
signal
loudness
sound
level
frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910945852.0A
Other languages
Chinese (zh)
Other versions
CN112669866A (en
Inventor
袁智华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Huiruisitong Technology Co Ltd
Original Assignee
Guangzhou Huiruisitong Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Huiruisitong Technology Co Ltd
Priority to CN201910945852.0A
Publication of CN112669866A
Application granted
Publication of CN112669866B
Legal status: Active
Anticipated expiration

Landscapes

  • Telephone Function (AREA)

Abstract

The present invention relates to a loudness-level-based speech noise reduction method, system, and computer storage medium. The method comprises: identifying a speech signal and an environmental noise signal from a sound signal acquired in real time; acquiring the sound pressure level and frequency of the speech signal and of the environmental noise signal, respectively; obtaining, from those sound pressure levels and frequencies, the speech loudness level of the speech signal and the environmental loudness level of the environmental noise signal; and performing noise reduction processing on the sound signal based on the loudness level difference between the environmental noise signal and the speech signal to obtain a call signal. The technical solution determines the loudness level difference from the sound pressure levels and frequencies of the speech signal and the environmental noise signal contained in the sound signal, and reduces noise according to that difference. Because the loudness level matches human auditory perception more closely, the noise reduction effect is improved.

Description

Loudness-level-based voice noise reduction method, system and computer storage medium
Technical Field
The present invention relates to the field of speech noise reduction, and in particular, to a loudness level-based speech noise reduction method, system, and computer storage medium.
Background
At present, mobile phones are generally provided with a noise reduction function to improve call clarity during voice communication.
In the prior art, noise reduction is generally performed according to the amplitude difference between the speech signal and the ambient noise signal. Amplitude-based noise reduction achieves a certain effect, but because the human ear perceives sounds of different frequencies to different degrees, noise reduction based on the amplitude of the sound signal alone does not perform well.
Accordingly, there is a need to provide a loudness-level-based speech noise reduction method, system, and computer storage medium that address the deficiencies of the prior art.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a loudness-level-based voice noise reduction method, system, and computer storage medium.
The application provides a voice noise reduction method based on loudness level, comprising the following steps:
identifying a voice signal and an ambient noise signal from a sound signal acquired in real time;
respectively acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal;
respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal;
and carrying out noise reduction processing on the sound signal based on the loudness level difference value of the environment noise signal and the voice signal to obtain a call signal.
Further, the identifying a voice signal and an environmental noise signal from the sound signal acquired in real time includes:
extracting voice characteristic information from the sound signal;
and identifying the voice signal and the environmental noise signal according to the voice characteristic information.
Further, the obtaining the speech loudness level of the speech signal and the environmental loudness level of the environmental noise signal according to the sound pressure level and the frequency of the speech signal and the sound pressure level and the frequency of the environmental noise signal respectively includes:
Searching the voice loudness level of the voice signal in an equal-loudness curve model according to the sound pressure level and the frequency of the voice signal;
and searching the environment loudness level of the environment noise signal in the equal loudness curve model according to the sound pressure level and the frequency of the environment noise signal.
Further, the construction method of the equal-loudness curve model comprises the following steps:
obtaining a user's loudness judgment results for pure tones with different sound pressure levels and different frequencies;
and inputting the loudness judgment results into a neural network model trained on equal-loudness curves to obtain the equal-loudness curve model.
Further, the noise reduction intensity of the noise reduction process is proportional to the loudness level difference.
Further, noise reduction processing is performed on the sound signal based on the loudness level difference value between the environmental noise signal and the voice signal, so as to obtain a call signal, including:
Calculating a loudness level difference of an ambient loudness level of the ambient noise signal and a speech loudness level of the speech signal;
screening out the maximum loudness level difference value in the loudness level difference values;
and carrying out noise reduction processing on the sound signal based on the maximum loudness level difference value to obtain a call signal.
Further, after the acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal, respectively, the method further includes:
judging whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal;
when the sound signal meets the loudness level noise reduction condition, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal, and carrying out noise reduction on the sound signal based on the difference value of the environment loudness level and the voice loudness level to obtain a call signal;
and when the sound signal does not meet the loudness level noise reduction condition, carrying out noise reduction on the sound signal based on the sound pressure level difference value of the environment noise signal and the voice signal to obtain a call signal.
Further, the determining whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environmental noise signal includes:
When the sound pressure level difference value of the environment noise signal and the voice signal is larger than a sound pressure level threshold value, or the environment noise signal with the frequency in a preset frequency range exists in the environment noise signal, judging that the sound signal meets the loudness level noise reduction condition;
And when the sound pressure level difference value of the environment noise signal and the voice signal is smaller than or equal to a sound pressure level threshold value and the environment noise signal with the frequency within a preset frequency range does not exist in the environment noise signal, judging that the sound signal does not meet the loudness level noise reduction condition.
The invention also provides a voice noise reduction system based on the loudness level, which comprises:
the first acquisition module is used for identifying a voice signal and an environment noise signal from the sound signal acquired in real time;
the second acquisition module is used for respectively acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal;
The determining module is used for respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal;
And the noise reduction module is used for carrying out noise reduction processing on the sound signal based on the loudness level difference value of the environment noise signal and the voice signal to obtain a call signal.
The invention also provides a computer storage medium, wherein the computer storage medium is stored with a loudness-level-based voice noise reduction method program, and the loudness-level-based voice noise reduction method program realizes the steps of any one of the loudness-level-based voice noise reduction methods when being executed by a processor.
Compared with the closest prior art, the technical scheme of the invention has the following advantages:
In the technical scheme provided by the invention, a voice signal and an environment noise signal are first identified from a sound signal acquired in real time. The sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal are then acquired, and from them the voice loudness level of the voice signal and the environment loudness level of the environment noise signal are obtained. Finally, noise reduction processing is carried out on the sound signal based on the loudness level difference value of the environment noise signal and the voice signal to obtain a call signal. In other words, the loudness level difference value between the environment noise signal and the voice signal is determined from the sound pressure levels and frequencies of the voice signal and the environment noise signal contained in the sound signal, and noise reduction is carried out on the sound signal according to that difference value.
Drawings
FIG. 1 is a flow chart of a loudness level-based speech noise reduction method in an embodiment of the present invention;
FIG. 2 is a schematic illustration of an equal loudness curve provided in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a loudness-level-based speech noise reduction system in an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the present application provides a loudness-level-based voice noise reduction method, which can implement real-time noise reduction processing on a sound signal in a call process. The loudness level-based speech noise reduction method may include the steps of:
S100, identifying a voice signal and an environmental noise signal from a sound signal acquired in real time;
S200, respectively acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal;
S300, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal;
S400, carrying out noise reduction processing on the sound signal based on the loudness level difference value of the environment noise signal and the voice signal to obtain a call signal.
In the embodiment of the application, a voice signal and an environmental noise signal are first identified from a sound signal acquired in real time. The sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environmental noise signal are then acquired, and from them the voice loudness level of the voice signal and the environmental loudness level of the environmental noise signal are obtained. Finally, noise reduction processing is carried out on the sound signal based on the loudness level difference value of the environmental noise signal and the voice signal to obtain a call signal. In other words, the loudness level difference value between the environmental noise signal and the voice signal is determined from the sound pressure levels and frequencies of the voice signal and the environmental noise signal contained in the sound signal, and noise reduction is carried out on the sound signal according to that difference value.
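As an illustration of step S200 only, the following is a minimal sketch of how a sound pressure level and a dominant frequency might be estimated from one frame of samples. It is not taken from the patent: the function names are hypothetical, the samples are assumed to be calibrated sound pressure values in pascals, and the frequency estimate uses a plain discrete Fourier transform chosen for brevity.

```python
import cmath
import math

P_REF = 20e-6  # standard reference sound pressure in pascals (20 µPa)


def sound_pressure_level(samples):
    """Return the SPL in dB of a frame of pressure samples given in pascals."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0.0:
        return float("-inf")  # a silent frame has no finite level
    return 20.0 * math.log10(rms / P_REF)


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest DFT bin, ignoring DC."""
    n = len(samples)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n


# Hypothetical test frame: a 1 kHz tone with 1 Pa RMS, 80 samples at 8 kHz.
tone = [math.sqrt(2) * math.sin(2 * math.pi * 1000 * i / 8000) for i in range(80)]
spl = sound_pressure_level(tone)
freq = dominant_frequency(tone, 8000)
```

A practical implementation would apply this separately to the voice component and to each environmental noise component, and would likely use a fast Fourier transform rather than the quadratic-time loop shown here.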
In one possible embodiment of the present application, step S100 specifically includes:
extracting voice characteristic information from the sound signal;
and identifying the voice signal and the environmental noise signal according to the voice characteristic information.
Specifically, the sound signal must first be recognized based on the signal characteristics of the voice signal and the ambient noise signal. The voice signal and the ambient noise signal may be identified by means of a human voice model and a noise model stored in advance. Each model contains the characteristics of the corresponding sound, such as intensity, loudness, frequency, signal-to-noise ratio, and short-time average zero-crossing rate. For example, after the sound signal is sampled, it is matched against the models. If the sound signal contains all the features in the human voice model, a voice signal is present, which indicates that a person is currently speaking. If the sound signal cannot be matched with the human voice model, the user's voice may be too quiet or the background noise too loud, which indicates that no voice can be recognized from the currently acquired sound. Similarly, the acquired sound signal may be matched against a pre-stored tire noise model, air-conditioning noise model, or wind noise model to determine the type of noise contained in the current sound signal.
Further, a voice signal and an ambient noise signal in the sound signal are recognized by the voice characteristic information.
In one possible embodiment of the present application, the step S300 specifically includes:
Searching the voice loudness level of the voice signal in an equal-loudness curve model according to the sound pressure level and the frequency of the voice signal;
and searching the environment loudness level of the environment noise signal in the equal loudness curve model according to the sound pressure level and the frequency of the environment noise signal.
In the embodiment of the application, the speech loudness level of the speech signal and the environment loudness level of the environment noise signal can be searched and determined from the equal loudness curve model.
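The lookup can be pictured as a function from sound pressure level and frequency to loudness level. The sketch below does not use the equal-loudness curve model of the embodiment. As a rough stand-in it applies the standard A-weighting curve, which approximately follows the inverse of the 40-phon equal-loudness contour and is 0 dB at 1 kHz, where the loudness level in phons equals the sound pressure level in dB by definition. The function names are hypothetical.

```python
import math


def a_weighting_db(freq_hz):
    """Standard A-weighting gain in dB at the given frequency."""
    f2 = freq_hz ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(num / den) + 2.0


def approx_loudness_level(spl_db, freq_hz):
    """Very rough loudness-level estimate: SPL plus the A-weighting gain.

    Assumption: a real implementation would look the value up in
    equal-loudness contour data or in the trained model instead.
    """
    return spl_db + a_weighting_db(freq_hz)
```

Unlike true equal-loudness contours, this approximation does not change shape with level, so it is suitable only for illustrating the interface of the lookup.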
In one possible embodiment of the present application, the method for constructing the equal-loudness curve model specifically includes:
obtaining a user's loudness judgment results for pure tones with different sound pressure levels and different frequencies;
and inputting the loudness judgment results into a neural network model trained on equal-loudness curves to obtain the equal-loudness curve model.
In the embodiment of the application, a preset program may be started to compare, in pairs, pure tones with the same sound pressure level and different frequencies and pure tones with different sound pressure levels and the same frequency. The loudness judgment results input by the user are obtained and then input into a neural network model to obtain the equal-loudness curve model. The neural network model is trained on equal-loudness curves.
In one possible embodiment of the application, the noise reduction strength of the noise reduction process is proportional to the loudness level difference.
In one possible embodiment of the present application, the step S400 specifically includes:
Calculating a loudness level difference of an ambient loudness level of the ambient noise signal and a speech loudness level of the speech signal;
screening out the maximum loudness level difference value in the loudness level difference values;
and carrying out noise reduction processing on the sound signal based on the maximum loudness level difference value to obtain a call signal.
In the embodiment of the application, because the environmental noise signal may contain noise components with various frequencies and sound pressure levels, calculating the difference between the environmental loudness level and the speech loudness level may yield a plurality of loudness level difference values. The maximum of these values is selected, and noise reduction processing is performed on the sound signal according to that maximum loudness level difference to obtain the call signal. Specifically, the noise reduction intensity is proportional to the maximum loudness level difference.
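The selection of the maximum difference and the proportional noise reduction intensity can be sketched as follows. The function names, the proportionality constant `gain`, and the clamping of negative differences to zero are assumptions made for illustration; the embodiment states only that the intensity is proportional to the maximum loudness level difference.

```python
def max_loudness_level_difference(speech_level, noise_levels):
    """Largest (noise loudness level - speech loudness level) over all noise components."""
    return max(level - speech_level for level in noise_levels)


def noise_reduction_strength(difference, gain=1.0):
    """Strength proportional to the difference.

    Assumption: a negative difference (noise already quieter than speech)
    is treated as requiring no reduction.
    """
    return gain * max(difference, 0.0)


# Hypothetical values: speech at 60 phon, three noise components.
diff = max_loudness_level_difference(60.0, [55.0, 72.0, 66.0])
strength = noise_reduction_strength(diff, gain=0.5)
```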
In a possible embodiment of the present application, after the step S200, the method further includes:
judging whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal;
when the sound signal meets the loudness level noise reduction condition, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal, and carrying out noise reduction on the sound signal based on the difference value of the environment loudness level and the voice loudness level to obtain a call signal;
And when the sound signal does not meet the loudness level noise reduction condition, noise reduction is carried out on the sound signal based on the sound pressure level difference value of the environment noise signal and the voice signal, and a call signal is obtained.
That is, the noise reduction processing of the sound signal falls into two cases. In the first, the sound signal satisfies the loudness level noise reduction condition, and noise reduction is performed according to the loudness level difference value, as described in the above embodiments. In the second, the sound signal does not satisfy the condition, and noise reduction is performed directly according to the sound pressure level difference value between the environmental noise signal and the speech signal. Determining the noise reduction mode by first judging whether the sound signal satisfies the loudness level noise reduction condition reduces the amount of computation and improves noise reduction efficiency.
In one possible embodiment of the present application, the determining whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the speech signal and the sound pressure level and frequency of the ambient noise signal specifically includes:
When the sound pressure level difference value of the environment noise signal and the voice signal is larger than a sound pressure level threshold value, or the environment noise signal with the frequency in a preset frequency range exists in the environment noise signal, judging that the sound signal meets the loudness level noise reduction condition;
And when the sound pressure level difference value of the environment noise signal and the voice signal is smaller than or equal to a sound pressure level threshold value and the environment noise signal with the frequency within a preset frequency range does not exist in the environment noise signal, judging that the sound signal does not meet the loudness level noise reduction condition.
That is, the sound signal satisfies the loudness level noise reduction condition that a sound pressure level difference between the ambient noise signal and the speech signal is greater than a sound pressure level threshold, or that an ambient noise signal having a frequency within a preset frequency range exists in the ambient noise signal.
Specifically, the preset frequency range may be set to 2 kHz to 5 kHz. According to the equal-loudness curves shown in FIG. 2, the human ear is most sensitive to sound between 2 kHz and 5 kHz; that is, at the same sound pressure level, sound in this range is perceived as louder than sound at other frequencies. Noise in this range is therefore given particular attention so that it does not seriously degrade call quality.
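The condition can be expressed as a simple predicate. In the sketch below the function name and the representation of noise components as (sound pressure level, frequency) pairs are assumptions; the default frequency range is the 2 kHz to 5 kHz range given above, while the threshold value must be supplied by the caller because the embodiment does not specify one.

```python
def meets_loudness_condition(speech_spl, noise_components, spl_threshold,
                             freq_range=(2000.0, 5000.0)):
    """Return True if loudness-level-based noise reduction should be used.

    noise_components: iterable of (spl_db, freq_hz) pairs, one per noise component.
    The condition holds if any component exceeds the speech SPL by more than
    spl_threshold, or if any component lies inside freq_range.
    """
    low, high = freq_range
    for spl, freq in noise_components:
        if spl - speech_spl > spl_threshold or low <= freq <= high:
            return True
    return False
```

For example, with a hypothetical threshold of 10 dB, a 60 dB noise component at 300 Hz against 65 dB speech does not meet the condition, whereas the same component at 3 kHz does.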
Further, the case where the sound signal does not satisfy the loudness level noise reduction condition is described in detail in the following embodiments.
In one possible embodiment of the present application, determining whether the sound signal satisfies the loudness level noise reduction condition specifically includes:
calculating a sound pressure level difference value between the environmental noise signal and the voice signal;
Judging whether the sound pressure level difference value is larger than a sound pressure level threshold value or not;
and if the sound pressure level difference value is smaller than or equal to the sound pressure level threshold value, determining that the sound signal does not satisfy the loudness level noise reduction condition, and performing noise reduction according to the sound pressure level difference value.
In the embodiment of the application, it is determined whether the environmental noise signal contains any component whose sound pressure level difference from the voice signal is larger than the sound pressure level threshold value. If no such component exists, noise reduction is performed according to the sound pressure level difference value, with the noise reduction intensity proportional to that difference. If such a component exists, noise reduction is performed according to the sound pressure level and frequency of the environmental noise signal, that is, by the loudness-level-based method described in detail in the above embodiments, which is not repeated here.
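The two branches of this embodiment can be sketched as a single selection function. The names, the `gain` constant, and the use of a caller-supplied `loudness_level` function are assumptions for illustration; the sketch covers only the sound pressure level threshold test of this embodiment and returns the chosen mode together with a strength proportional to the relevant difference.

```python
def plan_noise_reduction(speech, noises, spl_threshold, loudness_level, gain=1.0):
    """Choose the noise reduction mode and strength.

    speech: (spl_db, freq_hz) of the voice signal.
    noises: list of (spl_db, freq_hz) pairs for the noise components.
    loudness_level: callable mapping (spl_db, freq_hz) to a loudness level.
    """
    speech_spl, speech_freq = speech
    max_spl_diff = max(spl - speech_spl for spl, _ in noises)
    if max_spl_diff <= spl_threshold:
        # Condition not met: use the sound pressure level difference directly.
        return "spl", gain * max_spl_diff
    # Condition met: use the maximum loudness level difference instead.
    speech_level = loudness_level(speech_spl, speech_freq)
    max_level_diff = max(loudness_level(spl, f) - speech_level for spl, f in noises)
    return "loudness", gain * max_level_diff
```

Skipping the loudness level lookup in the first branch is what yields the reduction in computation mentioned for this two-case design.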
In one possible embodiment of the present application, the loudness-level-based speech noise reduction method may further include:
Calculating a loudness level difference value of the noise-reduced environment loudness level and the speech loudness level;
Judging whether the loudness level difference value is larger than a loudness threshold value or not;
and if the loudness level difference value is larger than the loudness threshold value, determining the noise-reduced sound signal as a call signal.
In the embodiment of the application, after the noise reduction process is completed, the loudness level difference value between the noise-reduced environment loudness level and the speech loudness level is further calculated and compared with the loudness threshold value. If the loudness level difference value is larger than the loudness threshold value, the noise-reduced sound signal is determined to be the final call signal. That is, the sound signal at this point already matches human auditory perception well, and the noise reduction effect is good.
In one possible embodiment of the present application, the loudness-level-based speech noise reduction method may further include:
And transmitting the call signal to another terminal through a 5G network.
In the embodiment of the application, the call signal is transmitted through the 5G network, so that the voice is clearer, the transmission speed is faster, and the user experience is enhanced.
In another embodiment of the present application, a loudness-level-based speech noise reduction system is also provided, as shown in fig. 3, where the loudness-level-based speech noise reduction system may include a first acquisition module 1, a second acquisition module 2, a determination module 3, and a noise reduction module 4.
The first acquisition module 1 is configured to recognize a speech signal and an ambient noise signal from a sound signal acquired in real time.
The second acquisition module 2 is configured to acquire the sound pressure level and frequency of the speech signal and the sound pressure level and frequency of the ambient noise signal, respectively.
The determining module 3 is configured to derive a speech loudness level of the speech signal and an ambient loudness level of the ambient noise signal from the sound pressure level and the frequency of the speech signal and the sound pressure level and the frequency of the ambient noise signal, respectively.
The noise reduction module 4 is configured to perform noise reduction processing on the sound signal based on the loudness level difference value between the ambient noise signal and the voice signal, so as to obtain a call signal.
In the embodiment of the application, the first acquisition module identifies a voice signal and an environmental noise signal from a sound signal of a first terminal acquired in real time. The second acquisition module then acquires the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environmental noise signal. The determining module obtains the voice loudness level of the voice signal and the environmental loudness level of the environmental noise signal from those sound pressure levels and frequencies. Finally, the noise reduction module carries out noise reduction processing on the sound signal based on the loudness level difference value of the environmental noise signal and the voice signal to obtain a call signal. In other words, the loudness level difference value between the environmental noise signal and the voice signal is determined from the sound pressure levels and frequencies of the voice signal and the environmental noise signal contained in the sound signal, and noise reduction is carried out on the sound signal according to that difference value. Because the loudness level matches human auditory perception more closely, the noise reduction effect is better.
Further, the first acquisition module 1 is specifically configured to:
extracting voice characteristic information from the sound signal;
and identifying the voice signal and the environmental noise signal according to the voice characteristic information.
Further, the determining module 3 is specifically configured to:
Searching the voice loudness level of the voice signal in an equal-loudness curve model according to the sound pressure level and the frequency of the voice signal;
and searching the environment loudness level of the environment noise signal in the equal loudness curve model according to the sound pressure level and the frequency of the environment noise signal.
Further, the determining module 3 may be further configured to:
obtaining a user's loudness judgment results for pure tones with different sound pressure levels and different frequencies;
and inputting the loudness judgment results into a neural network model trained on equal-loudness curves to obtain the equal-loudness curve model.
Further, the noise reduction module 4 is specifically configured to:
Calculating a loudness level difference of an ambient loudness level of the ambient noise signal and a speech loudness level of the speech signal;
screening out the maximum loudness level difference value in the loudness level difference values;
and carrying out noise reduction processing on the sound signal based on the maximum loudness level difference value to obtain a call signal.
Further, the loudness-level-based speech noise reduction system may further include a judging module.
The judging module is configured to judge whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal;
When the sound signal meets the loudness level noise reduction condition, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal, and carrying out noise reduction on the sound signal based on the difference value of the environment loudness level and the voice loudness level to obtain a call signal;
And when the sound signal does not meet the loudness level noise reduction condition, noise reduction is carried out on the sound signal based on the sound pressure level difference value of the environment noise signal and the voice signal, and a call signal is obtained.
Further, the judging module is specifically configured to:
When the sound pressure level difference value of the environment noise signal and the voice signal is larger than a sound pressure level threshold value, or the environment noise signal with the frequency in a preset frequency range exists in the environment noise signal, judging that the sound signal meets the loudness level noise reduction condition;
And when the sound pressure level difference value of the environment noise signal and the voice signal is smaller than or equal to a sound pressure level threshold value and the environment noise signal with the frequency within a preset frequency range does not exist in the environment noise signal, judging that the sound signal does not meet the loudness level noise reduction condition.
In another embodiment of the present disclosure, there is also provided a computer storage medium having stored thereon a loudness-based speech noise reduction method program that when executed by a processor performs the steps of:
identifying a voice signal and an ambient noise signal from a sound signal acquired in real time;
respectively acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal;
respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal;
And denoising the sound signal based on the loudness level difference value of the environment noise signal and the voice signal to obtain a call signal.
In this embodiment of the application, a voice signal and an environmental noise signal are first identified from a sound signal acquired in real time. The sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environmental noise signal are then acquired, and the voice loudness level of the voice signal and the environmental loudness level of the environmental noise signal are obtained from them. Finally, noise reduction is performed on the sound signal based on the loudness level difference between the environmental noise signal and the voice signal, so that a call signal is obtained. In the technical scheme provided by the application, the loudness level difference is determined from the sound pressure levels and frequencies of the voice signal and the environmental noise signal contained in the sound signal, and the sound signal is denoised according to that difference. Because the loudness level corresponds more closely to what the human ear perceives, the noise reduction effect is better.
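The description does not say how the sound pressure level is measured. As a minimal sketch, assuming the frame of samples is already calibrated in pascals, the conventional definition (20·log10 of the RMS pressure over the 20 µPa reference) could be computed as follows. The function name `sound_pressure_level` is hypothetical and is not taken from the patent.

```python
import math

# Conventional reference sound pressure in air: 20 micropascals.
P_REF = 20e-6


def sound_pressure_level(samples_pa):
    """Return the level in dB SPL of one frame of pressure samples.

    Assumption: samples are calibrated in pascals; the patent does not
    describe the calibration step.
    """
    if not samples_pa:
        raise ValueError("frame is empty")
    # Root-mean-square pressure of the frame.
    rms = math.sqrt(sum(p * p for p in samples_pa) / len(samples_pa))
    if rms == 0.0:
        return float("-inf")  # digital silence has no finite level
    return 20.0 * math.log10(rms / P_REF)
```

In practice the same computation would be applied separately to the portions of the sound signal identified as voice and as environmental noise.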
In one possible embodiment of the present application, the method for identifying a speech signal and an ambient noise signal from a sound signal acquired in real time specifically includes:
Extracting voice characteristic information from the sound signal;
And identifying a voice signal and an environmental noise signal according to the voice characteristic information.
Specifically, the sound signal must first be recognized based on the signal characteristics of the voice signal and the environmental noise signal. The voice signal and the environmental noise signal may be identified by means of a human voice model and a noise model stored in advance. Each model contains the characteristics of the corresponding sound, such as intensity, loudness, frequency, signal-to-noise ratio, and short-time average zero-crossing rate. For example, after the sound signal is sampled, it is matched against the models. If the sound signal contains all the features of the human voice model, it contains a voice signal, which indicates that a person is currently speaking. If the sound signal cannot be matched with the human voice model, possibly because the user's voice is too quiet or the background noise is too loud, the voice cannot be recognized from the currently acquired sound. Similarly, the acquired sound signal may be matched against a pre-stored tire noise model, air-conditioning noise model, or wind noise model, so that the type of noise contained in the current sound signal can be determined.
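Two of the features named above, short-time energy (a proxy for intensity) and the short-time average zero-crossing rate, can be sketched in a few lines. The threshold rule in `looks_like_speech` is a hypothetical stand-in for the stored human voice model, which the description does not specify; its threshold values are illustrative only.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    if not frame:
        raise ValueError("frame is empty")
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("need at least two samples")
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def looks_like_speech(frame, energy_min=0.01, zcr_max=0.3):
    """Hypothetical rule: enough energy and a low zero-crossing rate.

    Both thresholds are assumptions made for illustration; they stand in
    for matching against a pre-stored human voice model.
    """
    return (
        short_time_energy(frame) >= energy_min
        and zero_crossing_rate(frame) <= zcr_max
    )
```

A real matcher would compare a full feature vector against each stored model (human voice, tire noise, air-conditioning noise, wind noise) rather than apply two fixed thresholds.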
Further, a voice signal and an ambient noise signal in the sound signal are recognized by the voice characteristic information.
In one possible embodiment of the present application, the obtaining the speech loudness level of the speech signal and the environmental loudness level of the environmental noise signal according to the sound pressure level and the frequency of the speech signal and the sound pressure level and the frequency of the environmental noise signal respectively specifically includes:
Searching the voice loudness level of the voice signal in an equal-loudness curve model according to the sound pressure level and the frequency of the voice signal;
and searching the environment loudness level of the environment noise signal in the equal loudness curve model according to the sound pressure level and the frequency of the environment noise signal.
In the embodiment of the application, the speech loudness level of the speech signal and the environment loudness level of the environment noise signal can be searched and determined from the equal loudness curve model.
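A lookup of this kind can be sketched as follows. The only fact relied on is that, by definition, the loudness level in phon equals the sound pressure level in dB at 1 kHz. The offsets in `OFFSETS` are made-up illustrative numbers, not measured equal-loudness data, and the single frequency-dependent offset is a simplifying assumption: real equal-loudness contours also change shape with level, so a real model would be indexed by both frequency and sound pressure level.

```python
import bisect
import math

# Hypothetical (frequency in Hz, offset in phon) pairs. These are NOT
# measured equal-loudness data; only the 0 offset at 1 kHz follows from
# the definition of the phon.
OFFSETS = [
    (125.0, -15.0),
    (500.0, -3.0),
    (1000.0, 0.0),
    (3000.0, 5.0),
    (8000.0, -8.0),
]


def loudness_level(spl_db, freq_hz):
    """Look up an approximate loudness level (phon) for a pure tone.

    Interpolates the toy offset linearly in log-frequency and clamps
    outside the tabulated range.
    """
    freqs = [f for f, _ in OFFSETS]
    if freq_hz <= freqs[0]:
        return spl_db + OFFSETS[0][1]
    if freq_hz >= freqs[-1]:
        return spl_db + OFFSETS[-1][1]
    i = bisect.bisect_right(freqs, freq_hz)
    (f0, o0), (f1, o1) = OFFSETS[i - 1], OFFSETS[i]
    t = (math.log10(freq_hz) - math.log10(f0)) / (
        math.log10(f1) - math.log10(f0)
    )
    return spl_db + o0 + t * (o1 - o0)
```

The same function would be called once with the sound pressure level and frequency of the voice signal and once for each component of the environmental noise signal.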
In one possible embodiment of the present application, the method for constructing the equal-loudness curve model specifically includes:
Obtaining the user's loudness judgment results (which tone sounds softer and which louder) for pure tones with different sound pressure levels and different frequencies;
and inputting the loudness judgment results into a neural network model trained on equal-loudness curves to obtain the equal-loudness curve model.
In this embodiment of the application, a preset program may be started to present pure tones for pairwise comparison: tones with the same sound pressure level but different frequencies, and tones with different sound pressure levels but the same frequency. The loudness judgment results entered by the user are obtained and then input into a neural network model to obtain the equal-loudness curve model. The neural network model is trained on equal-loudness curves.
In one possible embodiment of the application, the noise reduction strength of the noise reduction process is proportional to the loudness level difference.
In one possible embodiment of the present application, the noise reduction processing is performed on the sound signal based on the loudness level difference between the ambient noise signal and the speech signal, so as to obtain a call signal, which specifically includes:
Calculating a loudness level difference of an ambient loudness level of the ambient noise signal and a speech loudness level of the speech signal;
screening out the maximum loudness level difference value in the loudness level difference values;
and carrying out noise reduction processing on the sound signal based on the maximum loudness level difference value to obtain a call signal.
In this embodiment of the application, the environmental noise signal may contain noise components with various frequencies and sound pressure levels, so calculating the loudness level difference between the environmental loudness level of the environmental noise signal and the voice loudness level of the voice signal may yield several loudness level differences. The largest of these is selected, and noise reduction is performed on the sound signal according to this maximum loudness level difference to obtain the call signal. Specifically, the noise reduction intensity is proportional to the maximum loudness level difference.
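The selection of the maximum difference and the proportional noise reduction intensity can be sketched as below. The proportionality constant `gain_per_phon` and the clamping to the range from 0 to `max_strength_db` are assumptions added for illustration; the description states only that the intensity is proportional to the maximum loudness level difference.

```python
def max_loudness_difference(speech_phon, noise_phons):
    """Largest (environmental loudness level - voice loudness level).

    noise_phons holds one loudness level per noise component.
    """
    if not noise_phons:
        raise ValueError("no noise components")
    return max(n - speech_phon for n in noise_phons)


def noise_reduction_strength(max_diff, gain_per_phon=0.5,
                             max_strength_db=30.0):
    """Strength proportional to the maximum difference.

    Assumptions: the constant of proportionality and the clamp to
    [0, max_strength_db] are hypothetical, not taken from the patent.
    """
    return min(max(gain_per_phon * max_diff, 0.0), max_strength_db)
```

For example, with a voice loudness level of 60 phon and noise components at 55, 70 and 62 phon, the maximum difference is 10 phon, which this sketch maps to a strength of 5.0 under the assumed constant.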
In one possible embodiment of the present application, after the acquiring the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the ambient noise signal, respectively, the method further includes:
Judging whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal;
When the sound signal meets the loudness level noise reduction condition, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal, and carrying out noise reduction on the sound signal based on the difference value of the environment loudness level and the voice loudness level to obtain a call signal;
And when the sound signal does not meet the loudness level noise reduction condition, noise reduction is carried out on the sound signal based on the sound pressure level difference value of the environment noise signal and the voice signal, and a call signal is obtained.
That is, the noise reduction processing of the sound signal falls into two cases. In the first, the sound signal satisfies the loudness level noise reduction condition and noise reduction is performed according to the loudness level difference, as described in the embodiments above. In the second, the sound signal does not satisfy the loudness level noise reduction condition and noise reduction is performed directly according to the sound pressure level difference between the environmental noise signal and the voice signal. Choosing the noise reduction mode by first judging whether the sound signal satisfies the loudness level noise reduction condition reduces the amount of computation and improves noise reduction efficiency.
In one possible embodiment of the present application, the determining whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the speech signal and the sound pressure level and frequency of the ambient noise signal specifically includes:
When the sound pressure level difference value of the environment noise signal and the voice signal is larger than a sound pressure level threshold value, or the environment noise signal with the frequency in a preset frequency range exists in the environment noise signal, judging that the sound signal meets the loudness level noise reduction condition;
And when the sound pressure level difference value of the environment noise signal and the voice signal is smaller than or equal to a sound pressure level threshold value and the environment noise signal with the frequency within a preset frequency range does not exist in the environment noise signal, judging that the sound signal does not meet the loudness level noise reduction condition.
That is, the loudness level noise reduction condition is satisfied when the sound pressure level difference between the ambient noise signal and the speech signal is greater than a sound pressure level threshold, or when the ambient noise signal contains a component whose frequency lies within a preset frequency range.
Specifically, the preset frequency range may be set to 2 kHz to 5 kHz. According to the equal-loudness curves shown in fig. 2, the human ear is most sensitive to sound between 2 kHz and 5 kHz; that is, at the same sound pressure level, sound in this range is perceived as louder than sound at other frequencies. Noise in this range is therefore given particular attention so that it does not seriously degrade call quality.
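The condition check and the resulting choice of noise reduction mode can be sketched as follows. The 2 kHz to 5 kHz band comes from the description; the 10 dB threshold, the representation of the noise as a list of `(spl_db, freq_hz)` components, and the rule that any single component may trigger the condition are assumptions made for illustration.

```python
def meets_loudness_condition(speech_spl, noise_components,
                             spl_threshold=10.0,
                             band=(2000.0, 5000.0)):
    """Return True if loudness-level-based noise reduction should be used.

    noise_components: list of (spl_db, freq_hz) pairs, one per noise
    component. The threshold value is hypothetical; the band follows
    the 2 kHz to 5 kHz example in the description.
    """
    low, high = band
    for spl, freq in noise_components:
        if spl - speech_spl > spl_threshold:
            return True   # noise much stronger than the voice
        if low <= freq <= high:
            return True   # noise in the ear's most sensitive range
    return False


def choose_mode(speech_spl, noise_components):
    """Pick which difference drives the noise reduction."""
    if meets_loudness_condition(speech_spl, noise_components):
        return "loudness_level_difference"
    return "sound_pressure_level_difference"
```

Only when the check returns True does the equal-loudness lookup need to run, which is where the saving in computation described above comes from.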
In one possible embodiment of the present application, the loudness-level-based speech noise reduction method further comprises:
And transmitting the call signal to another terminal through a 5G network.
In the embodiment of the application, the voice signal is transmitted through the 5G network, so that the voice is clearer, the transmission speed is faster, and the user experience is enhanced.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
It is to be understood that the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing units may be implemented within one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general purpose processors, controllers, micro-controllers, microprocessors, other electronic units for performing the functions described herein, or a combination thereof.
For a software implementation, the techniques described herein may be implemented by means of units that perform the functions described herein. The software codes may be stored in a memory and executed by a processor. The memory may be implemented within the processor or external to the processor.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the embodiments of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present invention. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.
It should be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
It should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention, and not for limiting the same, and although the present invention has been described in detail with reference to the above-mentioned embodiments, it should be understood by those skilled in the art that the technical solution described in the above-mentioned embodiments may be modified or some technical features may be equivalently replaced, and these modifications or substitutions do not make the essence of the corresponding technical solution deviate from the spirit and scope of the technical solution of the embodiments of the present invention.

Claims (9)

1. A method of loudness-level-based speech noise reduction, comprising:
identifying a voice signal and an ambient noise signal from a sound signal acquired in real time;
respectively acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal;
respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal;
noise reduction processing is carried out on the sound signal based on the loudness level difference value of the environment noise signal and the voice signal, so as to obtain a call signal;
Wherein after the acquiring of the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environmental noise signal, respectively, further comprises:
Judging whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal;
When the sound signal meets the loudness level noise reduction condition, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal, and carrying out noise reduction on the sound signal based on the difference value of the environment loudness level and the voice loudness level to obtain a call signal;
And when the sound signal does not meet the loudness level noise reduction condition, noise reduction is carried out on the sound signal based on the sound pressure level difference value of the environment noise signal and the voice signal, and a call signal is obtained.
2. A method of loudness-based speech noise reduction according to claim 1, wherein the identifying speech signals and ambient noise signals from the sound signals acquired in real time comprises:
Extracting voice characteristic information from the sound signal;
And identifying a voice signal and an environmental noise signal according to the voice characteristic information.
3. The method of claim 1, wherein the obtaining the speech loudness level of the speech signal and the environmental loudness level of the environmental noise signal according to the sound pressure level and the frequency of the speech signal and the sound pressure level and the frequency of the environmental noise signal respectively comprises:
Searching the voice loudness level of the voice signal in an equal-loudness curve model according to the sound pressure level and the frequency of the voice signal;
and searching the environment loudness level of the environment noise signal in the equal loudness curve model according to the sound pressure level and the frequency of the environment noise signal.
4. A method of loudness-level-based speech noise reduction according to claim 3 wherein the method of constructing the equal loudness curve model comprises:
Obtaining the user's loudness judgment results (which tone sounds softer and which louder) for pure tones with different sound pressure levels and different frequencies;
and inputting the loudness judgment results into a neural network model trained on equal-loudness curves to obtain the equal-loudness curve model.
5. A method of loudness-based speech noise reduction according to claim 1, wherein the noise reduction strength of the noise reduction process is proportional to the loudness level difference.
6. A method of loudness-level-based speech noise reduction according to claim 1 or 5, wherein the noise reduction processing of the sound signal based on the loudness level difference between the ambient noise signal and the speech signal to obtain a call signal comprises:
Calculating a loudness level difference of an ambient loudness level of the ambient noise signal and a speech loudness level of the speech signal;
screening out the maximum loudness level difference value in the loudness level difference values;
and carrying out noise reduction processing on the sound signal based on the maximum loudness level difference value to obtain a call signal.
7. A method of loudness-level-based speech noise reduction according to claim 1 wherein said determining whether the sound signal meets loudness level noise reduction conditions based on the sound pressure level and frequency of the speech signal and the sound pressure level and frequency of the ambient noise signal comprises:
When the sound pressure level difference value of the environment noise signal and the voice signal is larger than a sound pressure level threshold value, or the environment noise signal with the frequency in a preset frequency range exists in the environment noise signal, judging that the sound signal meets the loudness level noise reduction condition;
And when the sound pressure level difference value of the environment noise signal and the voice signal is smaller than or equal to a sound pressure level threshold value and the environment noise signal with the frequency within a preset frequency range does not exist in the environment noise signal, judging that the sound signal does not meet the loudness level noise reduction condition.
8. A loudness-level-based speech noise reduction system, comprising:
the first acquisition module is used for identifying a voice signal and an environment noise signal from the sound signal acquired in real time;
the second acquisition module is used for respectively acquiring the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environmental noise signal;
The determining module is used for respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal;
the noise reduction module is used for carrying out noise reduction processing on the sound signal based on the loudness level difference value of the environment noise signal and the voice signal to obtain a call signal;
the loudness-level-based voice noise reduction system further comprises a judging module;
The judging module is configured to judge whether the sound signal meets the loudness level noise reduction condition according to the sound pressure level and frequency of the voice signal and the sound pressure level and frequency of the environment noise signal;
When the sound signal meets the loudness level noise reduction condition, respectively obtaining the voice loudness level of the voice signal and the environment loudness level of the environment noise signal according to the sound pressure level and the frequency of the voice signal and the sound pressure level and the frequency of the environment noise signal, and carrying out noise reduction on the sound signal based on the difference value of the environment loudness level and the voice loudness level to obtain a call signal;
And when the sound signal does not meet the loudness level noise reduction condition, noise reduction is carried out on the sound signal based on the sound pressure level difference value of the environment noise signal and the voice signal, and a call signal is obtained.
9. A computer storage medium having stored thereon a loudness-based speech noise reduction method program which, when executed by a processor, implements the steps of the loudness-based speech noise reduction method of any of claims 1-7.
CN201910945852.0A 2019-09-30 2019-09-30 Speech noise reduction method, system and computer storage medium based on loudness level Active CN112669866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910945852.0A CN112669866B (en) 2019-09-30 2019-09-30 Speech noise reduction method, system and computer storage medium based on loudness level

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910945852.0A CN112669866B (en) 2019-09-30 2019-09-30 Speech noise reduction method, system and computer storage medium based on loudness level

Publications (2)

Publication Number Publication Date
CN112669866A CN112669866A (en) 2021-04-16
CN112669866B true CN112669866B (en) 2025-01-28

Family

ID=75399808

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910945852.0A Active CN112669866B (en) 2019-09-30 2019-09-30 Speech noise reduction method, system and computer storage medium based on loudness level

Country Status (1)

Country Link
CN (1) CN112669866B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117795981A (en) * 2021-08-10 2024-03-29 三星电子株式会社 Electronic device for correcting sound signal and method for controlling electronic device
CN115019836A (en) * 2022-04-29 2022-09-06 东风汽车有限公司东风日产乘用车公司 Vehicle prompt sound playback control method, storage medium and electronic device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107910013A (en) * 2017-11-10 2018-04-13 广东欧珀移动通信有限公司 The output processing method and device of a kind of voice signal
CN109147814A (en) * 2018-09-07 2019-01-04 青岛黄海学院 Based on the communication control method in multi-person speech communication

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
USRE37864E1 (en) * 1990-07-13 2002-10-01 Sony Corporation Quantizing error reducer for audio signal
EP2980789A1 (en) * 2014-07-30 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus and method for enhancing an audio signal, sound enhancing system
CN109688498B (en) * 2018-11-23 2020-10-09 潍坊歌尔电子有限公司 Volume adjusting method, earphone and storage medium

Also Published As

Publication number Publication date
CN112669866A (en) 2021-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant