
CN111081233B - Audio processing method and electronic equipment - Google Patents


Info

Publication number
CN111081233B
Authority
CN
China
Prior art keywords
audio
group
target
target logical
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911413261.5A
Other languages
Chinese (zh)
Other versions
CN111081233A (en)
Inventor
刘金
马岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201911413261.5A
Publication of CN111081233A
Application granted
Publication of CN111081233B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0272 Voice signal separating
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The application discloses an audio processing method and an electronic device. The method includes the following steps: obtaining sound signals collected by at least one audio collector and processing the sound signals respectively to generate at least one audio signal corresponding to the sound signals; inputting the audio signals into corresponding target logical groups according to grouping rules, wherein one audio signal corresponds to one or more target logical groups and the target logical groups correspond to respective application scenes; and processing the audio signals in the corresponding target logical group based on an audio processing model corresponding to the application scene of the target logical group, and generating logical data corresponding to the target logical group for a target program corresponding to the application scene of the target logical group.

Description

Audio processing method and electronic equipment
Technical Field
The present disclosure relates to the field of audio processing, and in particular, to an audio processing method and an electronic device.
Background
In scenarios where sound is processed, sound signals in the use environment often need to be collected and then processed as required, but conflicts can arise during processing. For example, a single set of audio processing hardware, a DSP (Digital Signal Processor) plus microphones (MIC), is currently used to process sound in a spatial environment. This approach cannot simultaneously support different usage needs or application scenarios well; for instance, it cannot satisfy speech recognition and voice calls at the same time. Speech recognition needs to focus sound pickup on a specific direction, eliminate interference from other speech, and apply further preprocessing so that a speech recognition engine can transcribe speech to text more accurately, but the resulting speech is not suitable for a user to listen to. A voice call, by contrast, needs to pick up surrounding speech as clearly as possible and keep the human voice as undistorted as possible. The two application scenarios therefore place conflicting requirements on the hardware and software of the audio device: optimizing for one scenario necessarily degrades the other, and the overall audio processing effect suffers. If the voice call effect is optimized, speech recognition accuracy drops and the false recognition rate rises; conversely, optimizing the voice interaction effect distorts the voice call and cannot support 360-degree omnidirectional pickup well. There is currently no good solution to these problems.
Disclosure of Invention
An embodiment of the present application aims to provide an audio processing method and an electronic device, and the following technical solutions are adopted in the embodiment of the present application: an audio processing method, comprising:
obtaining sound signals collected by at least one audio collector, and respectively processing the sound signals to generate at least one audio signal corresponding to the sound signals;
according to grouping rules, the audio signals are input into corresponding target logical groups, wherein one audio signal corresponds to one or more target logical groups, and the target logical groups correspond to respective application scenes;
processing the audio signals in the corresponding target logic group based on an audio processing model corresponding to the application scene of the target logic group, and generating logic data corresponding to the target logic group for a target program corresponding to the application scene of the target logic group.
Optionally, the target logical group and the audio collector have an association relationship, and the grouping rule includes setting audio signals corresponding to the audio collectors associated with the same target logical group as a group, where the association relationship can be adjusted;
correspondingly, the inputting the audio signals into the corresponding target logical group according to the grouping rules comprises:
inputting audio signals of the same group into the target logical group corresponding to the group.
Optionally, the processing the audio signal in the corresponding target logical group based on the audio processing model corresponding to the application scenario of the target logical group includes:
selecting the audio processing model based on the association relationship between the application scene of the target logical group and the audio processing model, wherein the audio processing model comprises at least one audio processing algorithm;
and processing the audio signals in the corresponding target logic group by using the selected audio processing algorithm in the audio processing model.
Optionally, the audio processing algorithm includes at least one of: echo cancellation algorithms, noise suppression algorithms, and automatic gain control algorithms.
Optionally, the processing the audio signals in the corresponding target logical group by using the selected audio processing algorithm in the audio processing model includes:
and loading the audio processing algorithm through an audio driver or audio firmware so as to process the audio signals in the corresponding target logic group.
Optionally, if there is currently no application scenario corresponding to a specified target logical group, sound signals collected by at least one audio collector corresponding to the specified target logical group are disabled, or the audio processing model corresponding to the application scenario of the specified target logical group is prevented from processing the audio signals in the specified target logical group.
Optionally, the method further comprises:
obtaining at least one reference signal, wherein the reference signal represents an echo in a sound signal collected by the audio collector;
inputting said reference signal to at least one of said target logical groups in accordance with said grouping rules;
correspondingly, the processing the audio signal in the corresponding target logical group based on the audio processing model corresponding to the application scenario of the target logical group comprises:
processing the audio signal and the reference signal in the corresponding target logical group based on an audio processing model corresponding to an application scenario of the target logical group.
Optionally, the method further comprises:
processing the corresponding logic data through the target program to generate a corresponding feedback signal;
and performing first processing on the feedback signal to generate corresponding feedback sound, wherein the first processing comprises removing echo in the feedback signal according to the reference signal.
Optionally, said inputting the audio signals into the corresponding target logical group according to the grouping rule includes:
inputting a first audio signal group into a first target logic group corresponding to the first audio signal group, wherein the first target logic group corresponds to an application scene of voice interaction;
inputting a second audio signal group into a second target logical group corresponding to the second audio signal group, wherein the second target logical group corresponds to an application scene of a voice call, and the first audio signal group is the same as at least part of audio signals in the second audio signal group, or the first audio signal group is completely different from the audio signals in the second audio signal group.
An embodiment of the present application further provides an electronic device, including:
an audio digital signal processor configured to: acquire at least one sound signal collected by at least one audio collector connected thereto, and process the sound signals respectively to generate at least one audio signal corresponding to the sound signal;
according to grouping rules, the audio signals are input into corresponding target logical groups, wherein one audio signal corresponds to one or more target logical groups, and the target logical groups correspond to respective application scenes;
a driver configured to: processing the audio signals in the corresponding target logical group based on an audio processing model corresponding to the application scene of the target logical group, and generating logical data corresponding to the target logical group for a target program corresponding to the application scene of the target logical group.
The embodiment of the application provides an audio processing method, which can meet different types of use requirements through the logic grouping processing of audio signals in the same audio use environment, so that the accuracy of sound processing can be improved without increasing the cost.
The beneficial effects of the embodiment of the application are that: the audio processing method can enable each logic group to be used for one application scene through logic grouping processing of the collected audio signals in the same audio using environment, so that the requirements of different application scenes are met, and the accuracy of sound processing can be improved without increasing the cost.
Drawings
FIG. 1 is a flowchart of an audio processing method according to an embodiment of the present application;
FIG. 2 is a flowchart of one embodiment of step S3 of FIG. 1 according to an embodiment of the present application;
FIG. 3 is a flowchart of an embodiment of an audio processing method according to an embodiment of the present application;
FIG. 4 is a flow chart of another embodiment of an audio processing method according to an embodiment of the present application;
FIG. 5 is a flowchart of one embodiment of step S2 of FIG. 1 according to an embodiment of the present application;
fig. 6 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Various aspects and features of the present application are described herein with reference to the drawings.
It should be understood that various modifications may be made to the embodiments of the present application. Accordingly, the foregoing description should not be considered as limiting, but merely as exemplifications of embodiments. Those skilled in the art will envision other modifications within the scope and spirit of the application.
The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the application and, together with a general description of the application given above, and the detailed description of the embodiments given below, serve to explain the principles of the application.
These and other characteristics of the present application will become apparent from the following description of alternative forms of embodiment, given as non-limiting examples, with reference to the attached drawings.
It should also be understood that, although the present application has been described with reference to some specific examples, a person of skill in the art shall certainly be able to achieve many other equivalent forms of application, having the characteristics as set forth in the claims and hence all coming within the field of protection defined thereby.
The above and other aspects, features and advantages of the present application will become more apparent in view of the following detailed description when taken in conjunction with the accompanying drawings.
Specific embodiments of the present application are described hereinafter with reference to the drawings; however, it is to be understood that the disclosed embodiments are merely examples of the application, which can be embodied in various forms. Well-known and/or repeated functions and constructions are not described in detail, so as not to obscure the application with unnecessary detail. Therefore, specific structural and functional details disclosed herein are not to be interpreted as limiting, but merely as a basis for the claims and as a representative basis for teaching one skilled in the art to variously employ the present application in virtually any appropriately detailed structure.
The specification may use the phrases "in one embodiment," "in another embodiment," "in yet another embodiment," or "in other embodiments," which may each refer to one or more of the same or different embodiments in accordance with the application.
Fig. 1 is a flowchart of an audio processing method according to an embodiment of the present application, where the audio processing method according to the embodiment of the present application enables an electronic device to process audio in a targeted manner for different audio application scenarios, and as shown in fig. 1, the method includes the following steps:
s1, sound signals collected by at least one audio collector are obtained, and the sound signals are processed respectively to generate at least one audio signal corresponding to the sound signals.
Optionally, the audio collector may be a microphone.
One or more audio collectors can be used to collect sound signals in the application scene; when a plurality of audio collectors are used, their placement positions can be set or adjusted according to actual needs so as to better receive the sound signals. In this embodiment, the sound signals collected by each audio collector are processed, for example by the audio digital signal processor, to generate corresponding audio signals; for instance, the audio collectors may send the collected sound signals to the audio digital signal processor through respective audio channels, and the audio digital signal processor may process the received sound signals into corresponding audio signals.
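For illustration only, the per-collector capture and conversion of step S1 can be sketched in Python roughly as follows; the patent publishes no code, and all names (SoundSignal, AudioSignal, dsp_front_end, collect_and_convert) are hypothetical placeholders rather than the actual DSP implementation.

    from typing import List, Tuple

    SoundSignal = Tuple[int, List[float]]   # (collector id, raw microphone frame)
    AudioSignal = Tuple[int, List[float]]   # (collector id, frame after DSP front end)

    def dsp_front_end(frame: List[float]) -> List[float]:
        # Placeholder for the audio digital signal processor's per-channel
        # conditioning (filtering, resampling, etc.); here it simply passes through.
        return list(frame)

    def collect_and_convert(sounds: List[SoundSignal]) -> List[AudioSignal]:
        # Step S1: one audio signal is generated for each collected sound signal.
        return [(collector_id, dsp_front_end(frame)) for collector_id, frame in sounds]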
And S2, inputting the audio signals into corresponding target logic groups according to grouping rules, wherein one audio signal corresponds to one or more target logic groups, and the target logic groups correspond to respective application scenes.
Specifically, the present embodiment may have a plurality of target logical groups, and each target logical group may correspond to a respective application scene, for example a call scene, a chat scene, a conference scene, a video scene, a singing scene, and so on. A target logical group may be a collection of software and/or hardware capable of processing the received audio signals based on the corresponding application scene. The audio signals sent to different target logical groups may be the same or different; that is, one audio signal may correspond to one target logical group or to several target logical groups. In this embodiment the audio signals are grouped according to a grouping rule, which may be set according to the application scenario, the situation of each hardware device, and/or user preference, and which may also be adjusted as needed. For example, audio signals corresponding to the same target logical group may be set as one group, and the correspondence between audio signals and target logical groups may also be set in the grouping rule: the first audio signal and the second audio signal corresponding to the first target logical group may be divided into a first group, and the third audio signal and the fourth audio signal corresponding to the second target logical group into a second group; alternatively, the first, second, third, and fourth audio signals corresponding to the first target logical group may form the first group while the first and fifth audio signals corresponding to the second target logical group form the second group, and so on.
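As a minimal sketch of such a grouping rule (assuming, hypothetically, that signals are identified by the collector that produced them), the routing of step S2 might look like the following Python; the group names and collector numbers are illustrative only.

    from typing import Dict, List, Tuple

    AudioSignal = Tuple[int, List[float]]   # (collector id, processed frame)

    # Hypothetical grouping rule: target logical group -> associated collector ids.
    # One audio signal may be routed to several target logical groups.
    GROUPING_RULE: Dict[str, List[int]] = {
        "voice_interaction_group": [1, 2, 3, 4, 5],
        "voice_call_group": [1, 3, 5],
    }

    def group_signals(signals: List[AudioSignal],
                      rule: Dict[str, List[int]] = GROUPING_RULE
                      ) -> Dict[str, List[AudioSignal]]:
        # Step S2: put each audio signal into every target logical group whose
        # rule lists the collector that produced it.
        groups: Dict[str, List[AudioSignal]] = {name: [] for name in rule}
        for collector_id, frame in signals:
            for name, collectors in rule.items():
                if collector_id in collectors:
                    groups[name].append((collector_id, frame))
        return groups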
And S3, processing the audio signals in the corresponding target logic group based on an audio processing model corresponding to the application scene of the target logic group, and generating logic data corresponding to the target logic group for a target program corresponding to the application scene of the target logic group.
Different application scenes correspond to respective audio processing models. Each audio processing model can use specific algorithms chosen according to the characteristics of its application scene, and the target logical group can then process the received audio signals with the corresponding audio processing model for that specific application scene. Because the target logical groups are established as logical relationships, different application scenes can be distinguished logically and the grouping can be adjusted flexibly. In this embodiment, a target logical group processes the audio signals of the same group input to it and generates corresponding logical data adapted to the application scene; each application scene has its own target program, for example a chat scene may have a social program and a call scene may have a call program. The generated logical data can then be used by the corresponding target program, so that the requirements of different types of application scenes are met.
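A correspondingly simple sketch of step S3, under the same assumptions as above, associates each application scene with a processing model and collects the resulting logical data for that scene's target program; the lambda "models" are stand-ins, not the algorithms described by the patent.

    from typing import Callable, Dict, List, Tuple

    AudioSignal = Tuple[int, List[float]]

    # Hypothetical association between application scenes and processing models.
    SCENE_MODELS: Dict[str, Callable[[List[AudioSignal]], bytes]] = {
        "voice_interaction_group": lambda sigs: b"recognition-oriented logical data",
        "voice_call_group": lambda sigs: b"call-oriented logical data",
    }

    def process_groups(groups: Dict[str, List[AudioSignal]]) -> Dict[str, bytes]:
        # Step S3: run each target logical group through the model of its
        # application scene; the result is the logical data for its target program.
        return {scene: SCENE_MODELS[scene](signals)
                for scene, signals in groups.items()}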
The audio processing method of this embodiment, by logically grouping the audio signals collected in the same audio use environment (which may share the same audio collectors and other hardware), allows each logical group to serve one application scene, so that the requirements of different application scenes are met and the accuracy of sound processing is improved without increasing cost.
In an embodiment of the present application, the target logical group and the audio collector have an association relationship therebetween, and the grouping rule includes setting audio signals corresponding to the audio collectors associated with the same target logical group as a group, where the association relationship can be adjusted;
correspondingly, the inputting the audio signals into the corresponding target logical group according to the grouping rules comprises:
inputting audio signals of the same group into the target logical group corresponding to the group.
Specifically, the target logical group and the audio collector have an association relationship, so that after an audio collector collects a sound signal and the sound signal is processed into a corresponding audio signal, the audio signal is sent to the target logical group associated with that audio collector. For example, a first target logical group may be associated with a first, second, third, fourth, and fifth audio collector, while a second target logical group is associated with the first, third, and fifth audio collectors. In that case, the audio signals derived from the sound signals collected by the first through fifth audio collectors are all sent to the first target logical group, and the audio signals derived from the sound signals collected by the first, third, and fifth audio collectors are also sent to the second target logical group. Furthermore, the association relationship between the target logical groups and the audio collectors can be adjusted (that is, the grouping rule is adjusted), for example in response to a change of application scene, so that the audio signals processed by each target logical group are adapted to the currently changed application scene.
In an embodiment of the present application, as shown in fig. 2, the processing the audio signal in the corresponding target logical group based on the audio processing model corresponding to the application scenario of the target logical group includes the following steps:
s31, selecting the audio processing model based on the incidence relation between the application scene of the target logic group and the audio processing model, wherein the audio processing model comprises at least one audio processing algorithm.
The audio processing model is provided with at least one audio processing algorithm, the audio processing algorithm can process audio signals, different application scenes correspond to the respective audio processing models, and the corresponding audio processing models can be applied to the application scenes in a targeted mode. In this embodiment, the audio processing model corresponding to the application scene may be selected based on the association relationship between the application scene of the target logical group and the audio processing model, so that the corresponding application scene is processed by using the audio processing model, and the accuracy of sound processing may be effectively improved.
And S32, processing the audio signals in the corresponding target logic group by using the selected audio processing algorithm in the audio processing model.
The audio processing model has one or more audio processing algorithms, such as echo cancellation algorithm, noise suppression algorithm, automatic gain control algorithm, etc., and each algorithm or a combination of algorithms can be applied to a corresponding application scenario. Therefore, in this embodiment, the audio signal in the corresponding target logical group is processed by using the audio processing algorithm in the selected audio processing model, so that the processed audio signal is adapted to the application scene corresponding to the target logical group, thereby effectively improving the accuracy of sound processing in the specific application scene.
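One way to picture such a model is as an ordered chain of algorithms. The sketch below uses deliberately trivial placeholder implementations (not the patent's algorithms) and only illustrates how noise-suppression and gain-control stages might be composed for one scene.

    from typing import Callable, List

    Frame = List[float]
    Algorithm = Callable[[Frame], Frame]

    def noise_suppression(frame: Frame) -> Frame:
        # Placeholder: a real implementation would estimate and remove noise.
        return frame

    def automatic_gain_control(frame: Frame, target_peak: float = 0.5) -> Frame:
        # Placeholder AGC: scale the frame towards a target peak level.
        peak = max((abs(x) for x in frame), default=0.0)
        return frame if peak == 0.0 else [x * target_peak / peak for x in frame]

    class AudioProcessingModel:
        # An audio processing model as an ordered chain of algorithms
        # selected for one application scene.
        def __init__(self, algorithms: List[Algorithm]):
            self.algorithms = algorithms

        def run(self, frame: Frame) -> Frame:
            for algorithm in self.algorithms:
                frame = algorithm(frame)
            return frame

    # Hypothetical model for a voice-call scene: keep speech natural sounding.
    call_model = AudioProcessingModel([noise_suppression, automatic_gain_control])
    processed_frame = call_model.run([0.1, -0.2, 0.3])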
In an embodiment of the present application, said processing the audio signals in the corresponding target logical group by using the selected audio processing algorithm in the audio processing model comprises the following steps: and loading the audio processing algorithm through an audio driver or audio firmware so as to process the audio signals in the corresponding target logic group.
Specifically, the audio driver is a program required for processing audio, and the audio firmware is audio-related program code fixed in the device, which can be regarded as a device "driver" stored inside the device. Both the audio driver and the audio firmware can run when the electronic device runs, and the target logical group can be set in the audio driver or the audio firmware, so that the audio processing algorithm can be loaded when the audio driver or the audio firmware runs in order to process the audio signals in the target logical group.
In one embodiment of the present application, the target logical group has program information associated with the audio signal, the method further comprising: and loading an audio driver or audio firmware through an operating system to run the program information arranged in the audio driver or audio firmware, thereby forming the target logic group. Specifically, the target logic group may be formed by program information (program segments), and the target logic group is set in the audio driver or the audio firmware, so that the audio driver or the audio firmware can be automatically loaded after the electronic device enters the operating system, and the program information set in the audio driver or the audio firmware can be conveniently and quickly run to form the target logic group.
In one embodiment of the present application, the audio processing method further includes: if no application scene corresponding to a specified target logical group currently exists, disabling the sound signals collected by the at least one audio collector corresponding to the specified target logical group, or preventing the audio processing model corresponding to the application scene of the specified target logical group from processing the audio signals in the specified target logical group.
Specifically, application scenarios vary, and if there is currently no application scenario corresponding to the specified target logical group, that is, the group is not used in the current application scenario, the sound signals collected by the at least one audio collector corresponding to the specified target logical group are disabled. For example, if the current application scenario is a first application scenario, the first application scenario corresponds to a first target logical group, and a second application scenario corresponds to a second target logical group, then the sound signals collected by the audio collectors corresponding to the second target logical group may be disabled, or the audio processing model corresponding to the second application scenario of the second target logical group may be prevented from processing the audio signals in the second target logical group. In this way, only the first target logical group corresponding to the first application scenario works normally and receives the corresponding audio signals (which may be collected by the audio collectors corresponding to the first logical group and processed by the audio digital signal processor), while the software and hardware related to the second target logical group (the specified target logical group) do not need to work, which saves system resources and avoids waste.
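A sketch of this resource-saving behaviour, again with hypothetical group and collector names, keeps enabled only the collectors that feed a group whose scenario is currently in use.

    from typing import Dict, List

    # Hypothetical association between target logical groups and collector ids.
    GROUP_COLLECTORS: Dict[str, List[int]] = {
        "voice_interaction_group": [1, 2, 3, 4, 5],
        "voice_call_group": [1, 3, 5],
    }

    def collectors_to_keep_enabled(active_groups: List[str]) -> List[int]:
        # Only collectors feeding a group whose application scenario is in use
        # stay enabled; the rest can be disabled to save system resources.
        enabled = set()
        for group in active_groups:
            enabled.update(GROUP_COLLECTORS.get(group, []))
        return sorted(enabled)

    # Example: only the voice-call scenario is active, so collectors 2 and 4
    # (used exclusively by the voice-interaction group) can be disabled.
    print(collectors_to_keep_enabled(["voice_call_group"]))  # -> [1, 3, 5]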
In one embodiment of the present application, as shown in fig. 3, the audio processing method further includes the steps of:
s4, obtaining at least one reference signal, wherein the reference signal represents an echo in the sound signal collected by the audio collector;
s5, inputting the reference signal into at least one target logic group according to the grouping regulation;
correspondingly, the processing the audio signal in the corresponding target logical group based on the audio processing model corresponding to the application scenario of the target logical group comprises:
processing the audio signal and the reference signal in the respective target logical group based on an audio processing model corresponding to an application scenario of the target logical group.
Specifically, in an actual application scenario an echo inevitably occurs in the collection space of the audio collector, and the echo affects audio processing, making it inaccurate or degrading the processing effect. In the grouping rule, a reference signal may be assigned to each group, that is, one reference signal is placed into each audio signal group, and the reference signals are then input into the target logical groups so that each target logical group can handle the echo generated in its application scenario. Correspondingly, the audio signal and the reference signal in the corresponding target logical group are processed based on the audio processing model corresponding to the application scene of the target logical group; that is, each target logical group can use its audio processing model to jointly process the audio signals and the reference signal of the same group sent to it, so as to obtain a more accurate processing result (logical data) adapted to the application scene corresponding to the target logical group.
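The routing of the reference signal can be sketched in the same illustrative style: the reference joins every audio signal group, and a deliberately naive echo-removal helper shows where it would be consumed. A real echo canceller would use an adaptive filter rather than direct subtraction; all names here are hypothetical.

    from typing import Dict, List, Tuple

    LabeledSignal = Tuple[str, List[float]]   # (label, frame); "ref" marks the reference

    def add_reference_to_groups(groups: Dict[str, List[LabeledSignal]],
                                reference: List[float]) -> Dict[str, List[LabeledSignal]]:
        # Steps S4/S5: the reference signal (representing the echo) is placed
        # into every audio signal group, so each target logical group can
        # handle the echo arising in its application scene.
        return {name: signals + [("ref", reference)]
                for name, signals in groups.items()}

    def naive_echo_removal(frame: List[float], reference: List[float]) -> List[float]:
        # Naive placeholder: subtract the reference sample by sample.
        return [x - r for x, r in zip(frame, reference)]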
In one embodiment of the present application, as shown in fig. 4 in combination with fig. 6, the method further comprises the steps of:
and S6, processing the corresponding logic data through the target program to generate a corresponding feedback signal.
And S7, performing first processing on the feedback signal to generate corresponding feedback sound, wherein the first processing comprises removing echo in the feedback signal according to the reference signal.
Specifically, the feedback signal may be a response to the reference signal, and the logical data generated by the target logical group includes content related to the reference information. After processing the corresponding logical data, the target program may send the feedback signal to an independent playback unit, which may be arranged in the audio driver or the audio firmware. In one arrangement, after receiving the feedback signal, the playback unit sends it to the audio digital signal processor, which performs the first processing on the feedback signal, that is, removes the echo in the feedback signal according to the reference signal and thereby generates the feedback sound. In another arrangement, after receiving the feedback signal, the playback unit itself performs the first processing, removing the echo according to the reference signal and generating the feedback sound, and then sends the feedback sound to the audio digital signal processor, which performs further processing and sends the result to the speaker arranged in the application scene (optionally through an amplifier, AMP). Since the echo has been removed from the feedback sound, no echo occurs when the speaker plays it.
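A compact sketch of the playback side, under the same placeholder assumptions, shows the "first processing": the feedback signal from the target program has the echo removed according to the reference signal before the feedback sound is forwarded towards the speaker.

    from typing import List

    def first_processing(feedback_signal: List[float],
                         reference: List[float]) -> List[float]:
        # Remove the echo component from the feedback signal according to the
        # reference signal (placeholder subtraction), yielding the feedback sound.
        return [f - r for f, r in zip(feedback_signal, reference)]

    # Feedback signal produced by the target program, plus the stored reference;
    # the resulting feedback sound is then sent towards the speaker (optionally
    # through an amplifier), so no echo is heard when it is played back.
    feedback_sound = first_processing([0.5, 0.4, 0.3], [0.2, 0.2, 0.2])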
In one embodiment of the present application, as shown in fig. 5 and combined with fig. 6, the step of inputting the audio signals into the corresponding target logical group according to the grouping rules includes the following steps:
s21, inputting a first audio signal group into a first target logic group corresponding to the first audio signal group, wherein the first target logic group corresponds to an application scene of voice interaction;
and S22, inputting a second audio signal group into a second target logical group corresponding to the second audio signal group, wherein the second target logical group corresponds to an application scene of the voice call, and the first audio signal group is the same as at least part of audio signals in the second audio signal group, or the first audio signal group is completely different from the audio signals in the second audio signal group.
Specifically, two audio signal groups may be set, including a first audio signal group and a second audio signal group, where the first audio signal group corresponds to a first target logic group, the second audio signal group corresponds to a second target logic group, and the first target logic group and the second target logic group correspond to respective application scenarios, for example, the first target logic group corresponds to an application scenario of voice interaction (corresponds to a first target program), and the second target logic group corresponds to an application scenario of voice call (corresponds to a second target program). Wherein at least part of the audio signals in the first audio signal group and the second audio signal group are the same, or the audio signals in the first audio signal group and the second audio signal group are completely different, i.e. the first audio signal group and the second audio signal group may be completely the same, partially the same or completely different. For example, the first audio signal group includes a first audio signal, a second audio signal, a third audio signal, a fourth audio signal, a fifth audio signal, and a reference signal, and the second audio signal group includes a first audio signal, a third audio signal, and a reference signal. The first set of audio signals is sent to a first target logical group such that the first target logical group processes the first set of audio signals, and the second set of audio signals is sent to a second target logical group such that the second target logical group processes the second set of audio signals.
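The worked example above can be written down as a small, purely illustrative configuration; the signal labels and group names are hypothetical.

    from typing import Dict, List

    # Overlapping audio signal groups: signals 1 and 3 and the reference signal
    # appear in both, matching the example in the description.
    FIRST_AUDIO_SIGNAL_GROUP: List[str] = ["sig1", "sig2", "sig3", "sig4", "sig5", "ref"]
    SECOND_AUDIO_SIGNAL_GROUP: List[str] = ["sig1", "sig3", "ref"]

    GROUP_TO_SCENE: Dict[str, str] = {
        "first_target_logical_group": "voice_interaction",   # step S21
        "second_target_logical_group": "voice_call",         # step S22
    }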
An embodiment of the present application further provides an electronic device, as shown in fig. 6 and combined with fig. 1, where the electronic device may process a sound signal, and the electronic device includes:
an audio digital signal processor configured to: the method comprises the steps of obtaining at least one sound signal collected by at least one audio collector connected with the device, and processing the sound signal respectively to generate at least one audio signal corresponding to the sound signal.
And inputting the audio signals into corresponding target logic groups according to grouping rules, wherein one audio signal corresponds to one or more target logic groups, and the target logic groups correspond to respective application scenes.
Particularly, one or more audio collectors can be used for collecting sound signals in application scenes, and when the plurality of audio collectors are used, the placement positions of the audio collectors can be set or adjusted according to actual needs, so that the sound signals can be received better. In this embodiment, the Audio digital signal processor (Audio DSP) processes the sound signals collected by each Audio collector to generate corresponding Audio signals, for example, the Audio collectors may send the collected sound signals to the Audio digital signal processor through respective Audio channels, and the Audio digital signal processor may process the received sound signals into corresponding Audio signals.
In this embodiment, a plurality of target logical groups may be provided, and each target logical group may correspond to a respective application scene, for example, a call scene, a chat scene, a conference scene, a video scene, a singing scene, and the like. The target logical group may be a set of software and/or hardware capable of processing the received audio signal based on the corresponding application scenario. The audio signals sent to the target logical groups may be the same or different, that is, one audio signal may correspond to one target logical group or a plurality of target logical groups. In the embodiment, the drivers group the audio signals according to a grouping rule, and the grouping rule may be set according to an application scenario, a situation of each hardware device, and/or a user's intention, but the grouping rule may also be adjusted as needed. For example, audio signals corresponding to the same target logical group may be set as one group, and the correspondence relationship between the audio signals and the target logical group may also be set according to the grouping specification. Such as dividing the first audio signal and the second audio signal corresponding to the first target logical group into a first group; and dividing the third audio signal and the fourth audio signal corresponding to the second target logic group into a second group. The first audio signal, the second audio signal, the third audio signal and the fourth audio signal corresponding to the first target logical group may also be divided into a first group, the first audio signal and the fifth audio signal corresponding to the second target logical group may be divided into a second group, and so on.
A driver configured to: processing the audio signals in the corresponding target logical group based on an audio processing model corresponding to the application scene of the target logical group, and generating logical data corresponding to the target logical group for a target program corresponding to the application scene of the target logical group.
Different application scenes correspond to respective audio processing models, the audio processing models can set specific algorithms according to the characteristics of the corresponding application scenes, and then the driver can process received audio signals through the corresponding audio processing models through the target logic group according to the specific application scenes. Because the target logic group is established based on the logical relationship, different application scenes can be logically distinguished, and self-adjustment can be flexibly carried out. In this embodiment, the driver may process the audio signals of the same group input thereto by using the target logic group to generate corresponding logic data, where the logic data is adapted to the application scenario, and the application scenario has respective target programs, for example, a chat scenario may have a social program, a call scenario may have a call program, and the like. The generated logic data can be used by corresponding target programs, so that the requirements of different types of application scenes are met.
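How the work might be divided between the two components named above can be sketched as follows; this is an illustrative decomposition under assumed names (AudioDsp, Driver), not the device's actual firmware or driver code.

    from typing import Callable, Dict, List, Tuple

    AudioSignal = Tuple[int, List[float]]   # (collector id, frame)

    class AudioDsp:
        # Front end: takes one converted frame per collector and routes it into
        # the target logical groups according to the grouping rule.
        def __init__(self, grouping_rule: Dict[str, List[int]]):
            self.grouping_rule = grouping_rule

        def ingest(self, signals: List[AudioSignal]) -> Dict[str, List[AudioSignal]]:
            groups: Dict[str, List[AudioSignal]] = {g: [] for g in self.grouping_rule}
            for collector_id, frame in signals:
                for group, collectors in self.grouping_rule.items():
                    if collector_id in collectors:
                        groups[group].append((collector_id, frame))
            return groups

    class Driver:
        # Back end: applies each scene's processing model to its group and hands
        # the logical data to the corresponding target program.
        def __init__(self, scene_models: Dict[str, Callable[[List[AudioSignal]], bytes]]):
            self.scene_models = scene_models

        def process(self, groups: Dict[str, List[AudioSignal]]) -> Dict[str, bytes]:
            return {scene: self.scene_models[scene](signals)
                    for scene, signals in groups.items()}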
In the electronic device of this embodiment, the audio signals collected in the same audio use environment (which may share the same audio collectors and other hardware) are processed in logical groups, so that each logical group can serve one application scene; the requirements of different application scenes are thus met, and the accuracy of sound processing is improved without increasing cost.
In an embodiment of the present application, the target logical group and the audio collector have an association relationship therebetween, and the grouping specification includes setting audio signals corresponding to the audio collectors associated with the same target logical group as one group, where the association relationship can be adjusted;
accordingly, the audio digital signal processor is further configured to: inputting audio signals of the same group into the target logical group corresponding to the group.
Specifically, the target logical group has an association relationship with the audio collector, so that after an audio collector collects a sound signal and the sound signal is processed into a corresponding audio signal, the audio digital signal processor sends the audio signal to the target logical group associated with that audio collector. For example, a first target logical group may be associated with a first, second, third, fourth, and fifth audio collector, while a second target logical group is associated with the first, third, and fifth audio collectors. In that case, the audio signals obtained from the sound signals collected by the first through fifth audio collectors and processed by the audio digital signal processor are all sent to the first target logical group, and the audio signals obtained from the sound signals collected by the first, third, and fifth audio collectors and processed by the audio digital signal processor are all sent to the second target logical group. Furthermore, the association relationship between the target logical groups and the audio collectors can be adjusted (that is, the grouping rule is adjusted), for example in response to a change of application scene, so that the audio signals processed by each target logical group are adapted to the currently changed application scene.
In one embodiment of the present application, the driver is further configured to: select the audio processing model based on the association relationship between the application scene of the target logical group and the audio processing model, wherein the audio processing model comprises at least one audio processing algorithm.
The audio processing model is provided with at least one audio processing algorithm, the audio processing algorithm can process audio signals, different application scenes correspond to the respective audio processing models, and the corresponding audio processing models can be applied to the application scenes in a targeted mode. In this embodiment, the driver may select the audio processing model corresponding to the application scene based on the association relationship between the application scene of the target logical group and the audio processing model, so that the corresponding application scene is processed by using the audio processing model, and the accuracy of sound processing can be effectively improved.
And processing the audio signals in the corresponding target logic group by using the selected audio processing algorithm in the audio processing model.
The audio processing model has one or more audio processing algorithms, such as echo cancellation algorithm, noise suppression algorithm, automatic gain control algorithm, etc., and each algorithm or a combination of algorithms can be applied to a corresponding application scenario. Therefore, in this embodiment, the driver processes the audio signal in the corresponding target logical group by using the audio processing algorithm in the selected audio processing model, so that the processed audio signal is adapted to the application scenario corresponding to the target logical group, thereby effectively improving the accuracy of sound processing in the specific application scenario.
In one embodiment of the present application, the driver is further configured to: and loading the audio processing algorithm through an audio driver or audio firmware so as to process the audio signals in the corresponding target logic group.
Specifically, the audio driver is a program required for processing audio, and the audio firmware is audio-related program code fixed in the device, which can be regarded as a device "driver" stored inside the device. Both the audio driver and the audio firmware can run when the electronic device runs, and the target logical group can be set in the audio driver or the audio firmware, so that the driver can have the audio driver or the audio firmware load the audio processing algorithm at run time in order to process the audio signals in the target logical group. In addition, in one embodiment, the driver may include the audio driver or the audio firmware, or may itself be the audio driver or the audio firmware.
In one embodiment of the present application, the driver is further configured to: if no application scene corresponding to a specified target logical group currently exists, disable the sound signals collected by the at least one audio collector corresponding to the specified target logical group, or prevent the audio processing model corresponding to the application scene of the specified target logical group from processing the audio signals in the specified target logical group.
Specifically, application scenarios vary, and if there is currently no application scenario corresponding to the specified target logical group, that is, the group is not used in the current application scenario, the driver disables the sound signals collected by the at least one audio collector corresponding to the specified target logical group. For example, if the current application scenario is a first application scenario, the first application scenario corresponds to the first target logical group, and a second application scenario corresponds to the second target logical group, then the driver may disable the sound signals collected by the audio collectors corresponding to the second target logical group, or prevent the audio processing model corresponding to the second application scenario of the second target logical group from processing the audio signals in the second target logical group. In this way, only the first target logical group corresponding to the first application scenario works normally and receives the corresponding audio signals (which may be collected by the audio collectors corresponding to the first logical group and processed by the audio digital signal processor), while the software and hardware related to the second target logical group (the specified target logical group) do not need to work, which saves system resources and avoids waste.
In one embodiment of the present application, the audio digital signal processor is further configured to:
obtaining at least one reference signal, wherein the reference signal represents an echo in a sound signal collected by the audio collector;
inputting said reference signal to at least one of said target logical groups according to said grouping provision;
accordingly, the driver is further configured to: processing the audio signal and the reference signal in the respective target logical group based on an audio processing model corresponding to an application scenario of the target logical group.
Specifically, in a specific application scenario, for example, an echo phenomenon inevitably occurs in a collection space of the audio collector, and the echo phenomenon affects audio processing, so that the audio processing is inaccurate or the processing effect is poor. The reference signals can be divided into groups in the grouping specification, that is, one reference signal is divided into each audio signal group, and then the reference signals can be input into the target logic groups, so that the driver can process echo signals generated in an application scene through each target logic group. And correspondingly, processing the audio signal and the reference signal in the corresponding target logical group based on the audio processing model corresponding to the application scene of the target logical group. That is, each target logical group can use the audio processing model to uniformly process the audio signal and the reference signal of the same group sent to the target logical group, so as to obtain a more accurate processing result (logical data), and the processing result (logical data) is adapted to the application scenario corresponding to the target logical group.
In one embodiment of the present application, the driver includes a playback unit configured to: and receiving the target program to process the corresponding logic data to generate a corresponding feedback signal, and performing first processing on the feedback signal to generate a corresponding feedback sound, wherein the first processing comprises removing an echo in the feedback signal according to the reference signal.
Specifically, referring to fig. 6, the feedback signal may be a response to the reference signal, and the logical data generated by the target logical group includes content related to the reference information. After processing the corresponding logical data, the target program can send the feedback signal to an independent playback unit, which may be arranged in the audio driver or the audio firmware. In one arrangement, after receiving the feedback signal, the playback unit sends it to the audio digital signal processor, which performs the first processing on the feedback signal, that is, removes the echo in the feedback signal according to the reference signal and thereby generates the feedback sound. In another arrangement, after receiving the feedback signal, the playback unit itself performs the first processing, removing the echo according to the reference signal and generating the feedback sound, and then sends the feedback sound to the audio digital signal processor, which performs further processing and sends the result to the speaker arranged in the application scene (optionally through an amplifier, AMP). Since the echo has been removed from the feedback sound, no echo occurs when the speaker plays it.
In one embodiment of the present application, the audio digital signal processor is further configured to:
inputting a first audio signal group into a first target logic group corresponding to the first audio signal group, wherein the first target logic group corresponds to an application scene of voice interaction;
inputting a second audio signal group into a second target logical group corresponding to the second audio signal group, wherein the second target logical group corresponds to an application scene of a voice call, and the first audio signal group is the same as at least part of audio signals in the second audio signal group, or the first audio signal group is completely different from the audio signals in the second audio signal group.
Specifically, with reference to fig. 6, two audio signal groups may be set, including a first audio signal group and a second audio signal group, where the first audio signal group corresponds to a first target logic group, the second audio signal group corresponds to a second target logic group, and the first target logic group and the second target logic group respectively correspond to respective application scenarios, for example, the first target logic group corresponds to an application scenario of voice interaction (corresponds to a first target program), and the second target logic group corresponds to an application scenario of voice call (corresponds to a second target program). Wherein at least part of the audio signals in the first audio signal group and the second audio signal group are the same, or the audio signals in the first audio signal group and the second audio signal group are completely different, i.e. the first audio signal group and the second audio signal group may be completely the same, partially the same or completely different. For example, the first audio signal group includes a first audio signal, a second audio signal, a third audio signal, a fourth audio signal, a fifth audio signal, and a reference signal, and the second audio signal group includes the first audio signal, the third audio signal, and the reference signal. The audio digital signal processor sends a first set of audio signals to a first target logical group such that the first target logical group processes the first set of audio signals, and the audio digital signal processor sends a second set of audio signals to a second target logical group such that the second target logical group processes the second set of audio signals.
The above embodiments are only exemplary embodiments of the present application, and are not intended to limit the present application, and the protection scope of the present application is defined by the claims. Various modifications and equivalents may be made by those skilled in the art within the spirit and scope of the present application and such modifications and equivalents should also be considered to be within the scope of the present application.

Claims (9)

1. An audio processing method, comprising:
obtaining sound signals collected by at least one audio collector, and respectively processing the sound signals to generate at least one audio signal corresponding to the sound signals;
according to grouping rules, the audio signals are input into corresponding target logical groups, wherein one audio signal corresponds to one or more target logical groups, the target logical groups are a set of software and/or hardware, the target logical groups correspond to respective application scenes, and the target logical groups and the audio collectors have association relations;
processing the audio signals in the corresponding target logical group based on an audio processing model corresponding to the application scene of the target logical group, and generating logical data corresponding to the target logical group for a target program corresponding to the application scene of the target logical group; wherein,
the grouping rule comprises setting audio signals corresponding to the audio collectors associated with a same target logical group into one group, wherein the association relation can be adjusted;
correspondingly, the inputting the audio signals into the corresponding target logical groups according to the grouping rule comprises:
inputting audio signals of the same group into the target logical group corresponding to the group.
2. The method of claim 1, wherein processing the audio signals in the corresponding target logical group based on an audio processing model corresponding to an application scenario of the target logical group comprises:
selecting the audio processing model based on an association relation between the application scenario of the target logical group and the audio processing model, wherein the audio processing model comprises at least one audio processing algorithm;
and processing the audio signals in the corresponding target logical group by using the selected audio processing algorithm in the audio processing model.
3. The method of claim 2, wherein the audio processing algorithm comprises at least one of: an echo cancellation algorithm, a noise suppression algorithm, and an automatic gain control algorithm.
4. The method of claim 2, wherein said processing the audio signals in the corresponding target logical group using the selected audio processing algorithm in the audio processing model comprises:
and loading the audio processing algorithm through an audio driver or audio firmware so as to process the audio signals in the corresponding target logical group.
5. The method of claim 1, wherein if no application scenario corresponding to a specific target logical group exists, obtaining the sound signals collected by the at least one audio collector corresponding to the specific target logical group is prohibited, or processing the audio signals in the specific target logical group based on the audio processing model corresponding to the application scenario of the specific target logical group is prohibited.
6. The method of claim 1, further comprising:
obtaining at least one reference signal, wherein the reference signal represents an echo in a sound signal collected by the audio collector;
inputting said reference signal into at least one of said target logical groups in accordance with said grouping rule;
correspondingly, the processing the audio signal in the corresponding target logical group based on the audio processing model corresponding to the application scenario of the target logical group comprises:
processing the audio signal and the reference signal in the respective target logical group based on an audio processing model corresponding to an application scenario of the target logical group.
7. The method of claim 6, further comprising:
processing the corresponding logic data through the target program to generate a corresponding feedback signal;
and performing first processing on the feedback signal to generate corresponding feedback sound, wherein the first processing comprises removing echo in the feedback signal according to the reference signal.
8. The method of claim 1, wherein said inputting said audio signals into corresponding target logical groups comprises:
inputting a first audio signal group into a first target logical group corresponding to the first audio signal group, wherein the first target logical group corresponds to an application scenario of voice interaction;
inputting a second audio signal group into a second target logical group corresponding to the second audio signal group, wherein the second target logical group corresponds to an application scenario of voice call, and at least some audio signals in the first audio signal group are the same as audio signals in the second audio signal group, or the audio signals in the first audio signal group are completely different from those in the second audio signal group.
9. An electronic device, comprising:
an audio digital signal processor configured to: obtain sound signals collected by at least one audio collector connected to the audio digital signal processor, and respectively process the sound signals to generate at least one audio signal corresponding to the sound signals;
input the audio signals into corresponding target logical groups according to a grouping rule, wherein one audio signal corresponds to one or more target logical groups, the target logical group is a set of software and/or hardware, the target logical groups correspond to respective application scenarios, and the target logical groups and the audio collectors have an association relationship;
a driver configured to: process the audio signals in the corresponding target logical group based on an audio processing model corresponding to the application scenario of the target logical group, and generate logic data corresponding to the target logical group for a target program corresponding to the application scenario of the target logical group; wherein,
the grouping rule comprises setting audio signals corresponding to the audio collectors associated with a same target logical group into one group, wherein the association relationship can be adjusted;
accordingly, the audio digital signal processor is further configured to: input the audio signals of the same group into the target logical group corresponding to the group.
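The following is a minimal sketch, under the assumption that an "audio processing model" is an ordered list of algorithms selected per application scenario (claims 2-3) and that the echo cancellation step also receives the reference signal (claim 6). The function names and the placeholder algorithm bodies below are illustrative assumptions, not the patent's actual echo cancellation, noise suppression, or automatic gain control algorithms.

```python
from typing import Dict, List

def echo_cancellation(samples: List[float], reference: List[float]) -> List[float]:
    # Placeholder: subtract the reference (echo estimate) sample by sample.
    return [s - r for s, r in zip(samples, reference)]

def noise_suppression(samples: List[float]) -> List[float]:
    # Placeholder: crude gating of very small samples.
    return [s if abs(s) > 0.01 else 0.0 for s in samples]

def automatic_gain_control(samples: List[float]) -> List[float]:
    # Placeholder: normalize to unit peak.
    peak = max((abs(s) for s in samples), default=0.0)
    return [s / peak for s in samples] if peak else samples

# Hypothetical association between application scenarios and audio processing models.
MODELS: Dict[str, List[str]] = {
    "voice_interaction": ["aec", "ns", "agc"],
    "voice_call": ["aec", "agc"],
}

def process_in_group(scenario: str, samples: List[float], reference: List[float]) -> List[float]:
    """Process one audio signal with the model selected for the group's scenario."""
    for name in MODELS[scenario]:
        if name == "aec":
            samples = echo_cancellation(samples, reference)
        elif name == "ns":
            samples = noise_suppression(samples)
        elif name == "agc":
            samples = automatic_gain_control(samples)
    return samples

# Example: the same microphone signal processed under two different scenarios.
mic = [0.005, 0.30, -0.60, 0.20]
ref = [0.000, 0.05, -0.05, 0.00]
logic_data_call = process_in_group("voice_call", mic, ref)
logic_data_assist = process_in_group("voice_interaction", mic, ref)
```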
CN201911413261.5A 2019-12-31 2019-12-31 Audio processing method and electronic equipment Active CN111081233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911413261.5A CN111081233B (en) 2019-12-31 2019-12-31 Audio processing method and electronic equipment

Publications (2)

Publication Number Publication Date
CN111081233A CN111081233A (en) 2020-04-28
CN111081233B true CN111081233B (en) 2023-01-06

Family

ID=70320698

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911413261.5A Active CN111081233B (en) 2019-12-31 2019-12-31 Audio processing method and electronic equipment

Country Status (1)

Country Link
CN (1) CN111081233B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111768767B (en) * 2020-05-22 2023-08-15 深圳追一科技有限公司 User tag extraction method and device, server and computer readable storage medium
CN113611318A (en) * 2021-06-29 2021-11-05 华为技术有限公司 Audio data enhancement method and related equipment
CN114115016B (en) * 2021-11-22 2024-10-15 腾讯云计算(北京)有限责任公司 Data processing method, device, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104424953A (en) * 2013-09-11 2015-03-18 华为技术有限公司 Speech signal processing method and device
CN104464739A (en) * 2013-09-18 2015-03-25 华为技术有限公司 Audio signal processing method and device and difference beam forming method and device
JP2017173608A (en) * 2016-03-24 2017-09-28 ヤマハ株式会社 Acoustic processing method and acoustic processing device
CN108198565A (en) * 2017-12-28 2018-06-22 深圳市东微智能科技股份有限公司 Mixed audio processing method, device, computer equipment and storage medium
CN108764304A (en) * 2018-05-11 2018-11-06 Oppo广东移动通信有限公司 scene recognition method, device, storage medium and electronic equipment
CN109545242A (en) * 2018-12-07 2019-03-29 广州势必可赢网络科技有限公司 A kind of audio data processing method, system, device and readable storage medium storing program for executing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant