CN101373594A

CN101373594A - Method and apparatus for correcting audio signal

Info

Publication number: CN101373594A
Application number: CNA2007101452788A
Authority: CN
Inventors: 郭利斌
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2007-08-21
Filing date: 2007-08-21
Publication date: 2009-02-25

Abstract

The invention belongs to the technical field of communication, and discloses a method and a device for correcting audio signals. The method comprises the following steps: intercepting a segment of signal near the boundary of an audio signal data frame; subjecting the intercepted signal to linear processing to obtain a new signal; and calculating the evaluation index of the new signal and, if the evaluation index is smaller than a preset evaluation index, continuing the linear processing until the evaluation index of the signal after the linear processing is larger than or equal to the preset evaluation index. The method requires little computation; by correcting the audio signal it makes the time-domain waveform smooth and periodic, preserves the phase information of the signal, and reduces spectral divergence, thereby smoothing the spectrum and eliminating the boundary effect.

Description

Method and apparatus for correcting audio signal

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a method and an apparatus for correcting an audio signal.

Background

Transform domain coding is one of the compression techniques commonly adopted by current audio coding standards. It is a form of frequency domain coding: it compresses information by reducing the correlation among the components of the signal and by quantizing and coding the transformed coefficients, and it exploits the frequency-domain auditory properties of the human ear, such as the masking effect and the critical bands, to compress audio signals. In practical applications, an audio signal is usually divided into a plurality of independent data frames for FFT (Fast Fourier Transform) or DCT (Discrete Cosine Transform). However, it cannot be guaranteed that each frame is continuous at its edges or that it can be extended continuously into a periodic signal sequence. A jump of the signal at the edge of a data block disperses the energy spectrum of the signal instead of concentrating it, which generates a large number of high-frequency components. In addition, quantizing and coding the FFT or DCT coefficients inevitably introduces quantization errors, and these errors are amplified many times by the synthesis window when the audio signal is synthesized, so that the synthesized audio signal is severely distorted, i.e., a boundary effect is generated.

The boundary effect is caused by discontinuity between data frames of the audio signal. It seriously degrades the naturalness and intelligibility of the audio signal, impairs the performance of the encoder, and severely reduces the audio quality. The audio signal then carries a noticeable periodic "beep" noise, which appears on the spectrogram as clearly spaced "noise bars".

In the prior art, in order to eliminate the boundary effect, the MDCT (Modified Discrete Cosine Transform) is usually adopted as the time-frequency transform tool; it is a filter bank based on 50% sample overlap and time-domain aliasing cancellation, so the boundary effect of FFT and DCT processing is overcome without reducing transform coding performance. Compared with the DCT, the MDCT uses a 50% data overlap technique, namely: the first half of the current data block overlaps the second half of the previous adjacent data block, and the second half of the current data block overlaps the first half of the next adjacent data block.

The forward transform of the MDCT transform is defined as follows:

wherein,

n_{0} = \frac{N / 2 + 1}{2}

is the phase variable of MDCT. As can be seen from the MDCT definition, the transformed data block length N must be even, and the MDCT transforms N audio time domain samples to obtain N/2 frequency domain samples.

The inverse of the MDCT transform is defined as follows:

The inverse MDCT computes N time-domain audio samples from N/2 frequency-domain samples.
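A minimal Python sketch of the transform pair described above. It assumes the widely used form X(k) = \sum_{n=0}^{N-1} w_a(n) x(n) \cos[(2\pi/N)(n + n_0)(k + 1/2)] for k = 0, ..., N/2 - 1, with the same cosine kernel in the inverse. The sine window used for both w_a(n) and w_s(n), the 4/N scaling of the inverse, and all function names are assumptions chosen for illustration, not taken from the original.

```python
import math


def sine_window(N):
    # Sine window; it satisfies w(n)^2 + w(n + N/2)^2 = 1, which the
    # overlap-add below relies on (assumed window, not from the source).
    return [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]


def mdct(frame, window):
    # N windowed time-domain samples -> N/2 frequency-domain coefficients.
    N = len(frame)
    if N % 2:
        raise ValueError("frame length N must be even")
    n0 = (N / 2 + 1) / 2  # phase term given in the text
    return [
        sum(window[n] * frame[n]
            * math.cos(2 * math.pi / N * (n + n0) * (k + 0.5))
            for n in range(N))
        for k in range(N // 2)
    ]


def imdct(coeffs, window):
    # N/2 coefficients -> N windowed samples that still contain aliasing.
    N = 2 * len(coeffs)
    n0 = (N / 2 + 1) / 2
    return [
        window[n] * (4.0 / N)
        * sum(coeffs[k] * math.cos(2 * math.pi / N * (n + n0) * (k + 0.5))
              for k in range(N // 2))
        for n in range(N)
    ]


def analyze_synthesize(signal, N):
    # 50% overlap: a new frame starts every N/2 samples, and adding the
    # overlapping halves cancels the time-domain aliasing.
    hop = N // 2
    w = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - N + 1, hop):
        y = imdct(mdct(signal[start:start + N], w), w)
        for n in range(N):
            out[start + n] += y[n]
    return out
```

Without quantization, every sample covered by two overlapping frames is recovered; the first and last half frame are covered only once and therefore keep their aliasing.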

When signal samples are divided into relatively independent data frames and then subjected to time-frequency transformation, the edges of the data blocks are distorted; an effective remedy is to overlap the data of adjacent data frames. As can be seen from the above, the MDCT uses a 50% data overlap and uses the analysis and synthesis windows w_a(n) and w_s(n) to further reduce discontinuities between data frames. Therefore, the MDCT reduces the boundary effect to a certain extent, improves the intelligibility of the coded audio, and improves the coding quality.

However, quantization errors inevitably occur in the MDCT coefficients, and such errors affect the continuity between frames, so the MDCT cannot completely eliminate the boundary effect. The boundary effect is particularly obvious when the signal changes sharply or when an audio signal with strong energy has been processed. In multi-channel coding, energy is redistributed at the decoding end, which makes each channel discontinuous and produces an even more serious boundary effect.

Disclosure of Invention

The embodiment of the invention provides a method and a device for correcting an audio signal, which can effectively eliminate the boundary effect.

A method of correcting an audio signal, comprising:

intercepting a segment of signal near a data frame boundary of the audio signal;

carrying out linear processing on the intercepted signals to obtain new signals;

and calculating the evaluation index of the new signal, and when the evaluation index is smaller than a preset evaluation index, continuing linear processing until the evaluation index of the signal after linear processing is larger than or equal to the preset evaluation index.

An apparatus for correcting an audio signal, comprising:

the intercepting signal unit is used for intercepting a section of signal near the boundary of the data frame of the audio signal;

the linear processing unit is used for carrying out linear processing on the received signals to obtain new signals;

the calculating unit is used for calculating the evaluation index of the new signal;

and the comparison unit is used for receiving the evaluation index from the calculation unit, comparing the evaluation index with the preset evaluation index, and sending the new signal to the linear processing unit when the evaluation index is smaller than the preset evaluation index until the received evaluation index is larger than or equal to the preset evaluation index.

According to the technical scheme, since the discontinuity of the audio signal occurs near the boundary of adjacent data frames, a segment of signal is intercepted near the data frame boundary and subjected to linear processing, and the linearly processed signal replaces the original signal that jumps or is discontinuous. The evaluation index of the new signal is then calculated. When it is smaller than the preset evaluation index, the linear processing is continued, which further reduces the discontinuity near the boundary, until the evaluation index of the linearly processed signal is larger than or equal to the preset evaluation index. The signal near the boundary of adjacent data frames thus becomes continuous, and the boundary effect is eliminated.

Drawings

FIG. 1 is a flow chart of a method provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of an apparatus according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an apparatus according to a first embodiment of the present invention;

FIG. 4 is a schematic diagram of an apparatus according to a second embodiment of the present invention;

FIG. 5 is a schematic view of an apparatus according to a third embodiment of the present invention;

FIG. 6 is a schematic view of an apparatus according to a fourth embodiment of the present invention;

FIG. 7 is a schematic view of an apparatus according to a fifth embodiment of the present invention;

FIG. 8 is a schematic view of an apparatus according to a sixth embodiment of the present invention;

FIG. 9 is a schematic view of an apparatus according to a seventh embodiment of the present invention;

FIG. 10 is a schematic view of an apparatus according to an eighth embodiment of the present invention.

Detailed Description

Embodiments of the present invention provide a method and an apparatus for correcting an audio signal, which are used to correct discontinuity between adjacent data frames of the audio signal, so as to smooth a waveform of the audio signal after correction, and further achieve an object of eliminating a boundary effect.

First, a general description will be given of a method provided by an embodiment of the present invention.

Referring to fig. 1, a flowchart of a method provided by an embodiment of the present invention is shown:

11): intercepting a segment of signal near the data frame boundary of the audio signal; for example, if point X is the boundary point of signal 1, then, since a spectrogram generally uses a frame length of 256 points, 128 points are intercepted forwards from point X and 128 points backwards, and the intercepted 256 points form one frame of signal;

12): carrying out linear processing on the intercepted signal to obtain a new audio signal;

13): calculating the evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion and mean opinion score (MOS); when the evaluation indexes are smaller than the preset evaluation indexes, returning to step 12) and continuing the linear processing until the evaluation indexes of the linearly processed signal are larger than or equal to the preset evaluation indexes; otherwise, ending the linear processing.

The preset evaluation indexes are the signal-to-noise ratio, the spectral distortion and the mean opinion score that the audio signal has once the boundary effect is eliminated.
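The control flow of steps 11) to 13) can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not say how an evaluation index is computed from a segment, so `process` and `evaluate` are caller-supplied callables, and `smooth3`, `negative_max_jump` and the `max_rounds` guard are hypothetical stand-ins added so that the sketch runs.

```python
def intercept(signal, boundary, half=128):
    """Step 11: return (start, segment) with `half` samples on each side of point X."""
    start, stop = boundary - half, boundary + half
    if start < 0 or stop > len(signal):
        raise ValueError("not enough samples around the boundary")
    return start, list(signal[start:stop])


def correct_boundary(signal, boundary, process, evaluate, threshold,
                     half=128, max_rounds=16):
    """Steps 12 and 13: repeat `process` until `evaluate` reaches `threshold`."""
    start, segment = intercept(signal, boundary, half)
    for _ in range(max_rounds):              # added safety guard (assumption)
        segment = process(segment)           # step 12: linear processing
        if evaluate(segment) >= threshold:   # step 13: compare with preset index
            break
    corrected = list(signal)
    corrected[start:start + len(segment)] = segment
    return corrected


# Hypothetical stand-ins, present only to make the sketch runnable.
def smooth3(segment):
    """Three-point moving average that keeps both end samples."""
    inner = [(segment[i - 1] + segment[i] + segment[i + 1]) / 3.0
             for i in range(1, len(segment) - 1)]
    return [segment[0]] + inner + [segment[-1]]


def negative_max_jump(segment):
    """Toy index: values closer to zero mean a smoother segment."""
    return -max(abs(b - a) for a, b in zip(segment, segment[1:]))


step = [0.0] * 10 + [1.0] * 10
fixed = correct_boundary(step, boundary=10, process=smooth3,
                         evaluate=negative_max_jump, threshold=-0.3, half=4)
```

In a real system `evaluate` would return one of the indexes named above (signal-to-noise ratio, spectral distortion or mean opinion score) and `threshold` would be the corresponding preset value.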

The following detailed description will be made with respect to the methods provided in the embodiments of the present invention, by respectively taking examples:

Example one:

101: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are intercepted forward from point X and 128 points are intercepted backward from point X, and the intercepted 256 points form a frame signal;

102: performing LP (Linear Prediction) analysis on the intercepted signal to obtain the prediction coefficients, then using the formula

s'(n) = \sum_{i=1}^{p} a_{i} s(n - i)

to carry out linear prediction, and replacing the jump values near the data frame boundary with the obtained predicted values to obtain a new audio signal;

where s'(n) represents the predicted value, p represents the prediction order, and a_i represents the prediction coefficients;

LP analysis is one of the most effective speech analysis techniques: a sample of a speech signal can be approximated by a linear combination of several past samples. The linear prediction coefficients are commonly solved with the Durbin recursion (the Levinson-Durbin algorithm).

103: and calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to the step 102 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing to perform linear prediction until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

In this embodiment, LP is applied to the audio signal near the data frame boundary, so that strongly correlated predicted data replace the original discontinuous or jumping data, which have weak correlation, and the audio signal becomes continuous.
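Step 102 can be illustrated with the following Python sketch, which solves for the prediction coefficients with the Levinson-Durbin recursion and then overwrites samples with s'(n) = sum of a_i * s(n - i). Several details are assumptions made for the example and are not stated in the text: the coefficients are fitted on the samples before the boundary, the "jump values" are taken to be `count` samples starting at the boundary, and the function names are hypothetical.

```python
import math


def autocorrelation(x, max_lag):
    # r[k] = sum over n of x[n] * x[n - k]
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, p):
    # Returns [a_1, ..., a_p] for the predictor s'(n) = sum_i a_i * s(n - i).
    a = [0.0] * (p + 1)            # a[0] is unused
    err = r[0]
    for i in range(1, p + 1):
        if err <= 0:               # degenerate input, e.g. an all-zero signal
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err              # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def predict_and_replace(segment, boundary, count, order):
    # Fit on the samples before the boundary (assumption), then replace
    # `count` samples starting at the boundary with predicted values.
    history = segment[:boundary]
    a = levinson_durbin(autocorrelation(history, order), order)
    out = list(segment)
    for n in range(boundary, boundary + count):
        out[n] = sum(a[i] * out[n - 1 - i] for i in range(order))
    return out


# Toy input: a sinusoid whose four samples after the boundary are offset by 0.5.
clean = [math.sin(1.0 * n) for n in range(128)]
damaged = clean[:64] + [v + 0.5 for v in clean[64:68]] + clean[68:]
repaired = predict_and_replace(damaged, boundary=64, count=4, order=2)
```

Because the coefficients come from a finite analysis window, the predicted samples only approximate the undamaged waveform, and the prediction error grows with the number of replaced samples.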

Example two:

201: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

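202: performing LP (Linear Prediction) analysis on the intercepted signal to obtain the prediction coefficients, then carrying out linear prediction with the formula s'(n) = \sum_{i=1}^{p} a_{i} s(n - i) of step 102, and replacing the jump values near the data frame boundary with the obtained predicted values to obtain a new audio signal;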

203: averaging at least two data points before and after the data frame jump of the new audio signal, and using the data points and the average value to make a linear curve; taking the average value as a reference point, or taking any point in front of the average value as a reference point, or taking any point behind the average value as a reference point, performing linear interpolation, and replacing the data of the original corresponding position with the interpolated data to further obtain a new audio signal;

204: and calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to the step 202 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing to perform linear prediction until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

Compared with the first embodiment, the second embodiment first performs linear prediction on the audio signal to eliminate distortion and then applies linear interpolation, replacing the original points at the corresponding positions with the linearly interpolated values, so that the continuity of the audio signal is further ensured.
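The linear interpolation of step 203 might look like the Python sketch below. The text leaves the exact construction open, so this is one possible reading: one anchor sample is taken on each side of the jump, their average serves as the reference point at the midpoint, and every sample between the anchors is replaced by the value on the straight line through them. The function name and the `span` parameter are hypothetical.

```python
def interpolate_across_jump(segment, boundary, span=2):
    # The jump is assumed to lie between index boundary - 1 and boundary.
    lo, hi = boundary - 1 - span, boundary + span   # anchor samples
    if lo < 0 or hi >= len(segment):
        raise ValueError("span reaches outside the segment")
    average = (segment[lo] + segment[hi]) / 2.0     # reference point of the line
    mid = (lo + hi) / 2.0                           # position of the reference point
    slope = (segment[hi] - segment[lo]) / (hi - lo)
    out = list(segment)
    for n in range(lo + 1, hi):                     # samples between the anchors
        out[n] = average + slope * (n - mid)
    return out


step = [0, 0, 0, 0, 1, 1, 1, 1]
ramp = interpolate_across_jump(step, boundary=4, span=1)
```

A larger `span` spreads the transition over more samples and therefore gives a gentler slope at the cost of altering more of the original data.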

Example three:

301: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

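302: performing LP (Linear Prediction) analysis on the intercepted signal to obtain the prediction coefficients, then carrying out linear prediction with the formula s'(n) = \sum_{i=1}^{p} a_{i} s(n - i) of step 102, and replacing the jump values near the data frame boundary with the obtained predicted values to obtain a new audio signal;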

303: carrying out fast Fourier transform on the new audio signal to change a time domain into a frequency domain; intercepting the high-frequency part of the frequency domain, carrying out forward or backward dislocation addition to calculate an average value, and replacing the high-frequency part with the average value; carrying out inverse fast Fourier transform on the high-frequency part substituted by the average value to obtain a new audio signal, and substituting the new signal for the signal before the fast Fourier transform;

304: and calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to the step 302 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing to perform linear prediction until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

Compared with the second embodiment, the third embodiment replaces the linear interpolation method with a frequency domain smoothing method to correct the spectral divergence, so that the spectrum of the audio signal is smoothed and the boundary effect is eliminated.
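The frequency domain smoothing of step 303 can be sketched in Python as follows. The phrase "forward or backward dislocation addition" is read here as adding the spectrum to a copy of itself shifted by one bin and halving the sum, applied only to bins at or above a cutoff; this reading, the `cutoff` parameter and the function names are assumptions. The sketch uses a plain radix-2 FFT so that it needs only the standard library.

```python
import cmath
import math


def fft(x):
    # Recursive radix-2 FFT; len(x) must be a power of two.
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(X):
    # Inverse FFT via conjugation of the forward transform.
    n = len(X)
    return [v.conjugate() / n for v in fft([c.conjugate() for c in X])]


def smooth_high_band(segment, cutoff):
    n = len(segment)
    if n < 2 or n & (n - 1):
        raise ValueError("segment length must be a power of two")
    half = n // 2
    X = fft(segment)
    Y = list(X)
    for k in range(cutoff, half):
        Y[k] = (X[k] + X[k + 1]) / 2     # forward-shifted add, then average
    for k in range(1, half):
        Y[n - k] = Y[k].conjugate()      # keep the output real-valued
    return [v.real for v in ifft(Y)]     # replaces the signal before the FFT


n = 32
tone = [math.cos(2 * math.pi * 2 * i / n) for i in range(n)]
smoothed = smooth_high_band(tone, cutoff=8)
```

Bins below `cutoff` are left untouched, so a component that lies entirely in the low band passes through unchanged.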

Example four:

401: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

402: performing LP (Linear Prediction) analysis on the intercepted signal to obtain the prediction coefficients, then carrying out linear prediction with the formula s'(n) = \sum_{i=1}^{p} a_{i} s(n - i) of step 102, and replacing the jump values near the data frame boundary with the obtained predicted values to obtain a new audio signal;

403: averaging at least two data points before and after the data frame jump of the new audio signal, and using the data points and the average value to make a linear curve; taking the average value as a reference point, or taking any point in front of the average value as the reference point, or taking any point behind the average value as the reference point to perform linear interpolation, and replacing the data of the original corresponding position with the interpolated data to further obtain a new audio signal;

404: performing fast Fourier transform on the audio signal obtained after linear interpolation to change a time domain into a frequency domain; intercepting the high-frequency part of the frequency domain, carrying out forward or backward dislocation addition to calculate an average value, and replacing the high-frequency part with the average value; carrying out inverse fast Fourier transform on the high-frequency part substituted by the average value to obtain a new audio signal, and substituting the new signal for the signal before the fast Fourier transform;

405: and calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and average opinion score, returning to the step 402 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing to perform linear prediction until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

The fourth embodiment combines the second and third embodiments: after linear interpolation of the audio signal, the spectral divergence is further corrected by frequency domain smoothing, so that the spectrum of the audio signal is smoothed and the boundary effect is eliminated.

Example five:

501: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

502: intercepting, within the intercepted signal, one section from front to back and marking it as odd, and one section from back to front and marking it as even; carrying out linear prediction on the odd-marked signal and on the even-marked signal to obtain a predicted value for each; reversing the predicted value of the even-marked signal front to back; averaging the predicted value of the odd-marked signal with the reversed predicted value of the even-marked signal; and replacing the jump value near the data frame boundary with the average value;

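503: performing LP (Linear Prediction) analysis on the intercepted signal to obtain the prediction coefficients, then carrying out linear prediction with the formula s'(n) = \sum_{i=1}^{p} a_{i} s(n - i) of step 102, and replacing the jump values near the data frame boundary with the obtained predicted values to obtain a new audio signal;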

504: and calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to the step 502 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing to perform linear prediction until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

Compared with the first embodiment, the fifth embodiment adds odd and even marks to the linear prediction of the audio signal near the data frame boundary, which improves the accuracy of the linear prediction and therefore corrects the audio signal better and ensures its continuity.
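Step 502 can be read as a forward and a backward prediction that are averaged, as in the Python sketch below. The sketch assumes that the odd-marked section is the part before the jump, the even-marked section is the part after it taken in reverse order, and the replaced region is `gap` samples on each side of the boundary. To stay short it uses a fixed second-order predictor with coefficients (2, -1), which simply continues a straight line; fitted LP coefficients could be supplied through `coeffs` instead. All names are hypothetical.

```python
def extrapolate(history, count, coeffs=(2.0, -1.0)):
    # s'(n) = sum_i a_i * s(n - i); the default a = (2, -1) is a fixed
    # stand-in predictor that continues a straight line.
    buf = list(history)
    for _ in range(count):
        buf.append(sum(a * buf[-1 - i] for i, a in enumerate(coeffs)))
    return buf[len(history):]


def bidirectional_replace(segment, boundary, gap, predict=extrapolate):
    lo, hi = boundary - gap, boundary + gap
    # Odd-marked pass: predict forward from the samples before the gap.
    forward = predict(segment[:lo], hi - lo)
    # Even-marked pass: predict on the reversed tail, then reverse the
    # predicted values back into time order.
    backward = predict(segment[hi:][::-1], hi - lo)[::-1]
    out = list(segment)
    for i in range(hi - lo):
        out[lo + i] = (forward[i] + backward[i]) / 2.0   # average both passes
    return out


ramp = list(range(16))
damaged = ramp[:6] + [v + 5 for v in ramp[6:10]] + ramp[10:]
repaired = bidirectional_replace(damaged, boundary=8, gap=2)
```

Averaging the two directions lets the samples on both sides of the boundary constrain the replaced values, whereas a forward-only prediction uses only the samples before the boundary.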

Step 502 of this embodiment may also be placed before step 202 of the second embodiment, before step 302 of the third embodiment, or before step 402 of the fourth embodiment, each combination constituting a further embodiment; the processing procedure is the same as that of the fifth embodiment.

Example six:

601: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

602: averaging at least two data points before and after data frame jump of the audio signal, and using the data points and the average value to make a linear curve; interpolating data on the linear curve by taking the average value as a reference point, or taking any point in front of the average value as a reference point, or taking any point behind the average value as a reference point, and replacing the data of the original corresponding position with the interpolated data to obtain a new audio signal;

603: calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to step 602 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing the linear interpolation until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

Compared with the first embodiment, the sixth embodiment replaces linear prediction with linear interpolation: the original points at the corresponding positions are replaced by the linearly interpolated values, so that the audio signal becomes continuous and the sawtooth artifacts that the data discontinuity produces in the frequency domain are corrected.

Example seven:

701: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

702: averaging at least two data points before and after data frame jump of the audio signal, and using the data points and the average value to make a linear curve; interpolating data on the linear curve by using the average value as a reference point, or using any point in front of the average value as a reference point, or using any point behind the average value as a reference point, and replacing the data of the original corresponding position with the interpolated data to obtain a linearly interpolated audio signal;

703: performing fast Fourier transform on the audio signal subjected to linear interpolation to change a time domain into a frequency domain; intercepting the high-frequency part of the frequency domain, carrying out forward or backward dislocation addition to calculate an average value, and replacing the high-frequency part with the average value; carrying out inverse fast Fourier transform on the high-frequency part substituted by the average value to obtain a new audio signal, and substituting the new signal for the signal before the fast Fourier transform;

704: calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to step 702 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing the linear processing until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

Compared with the sixth embodiment, on the basis of performing linear interpolation on the audio signal, the frequency spectrum divergence is further corrected by adopting a frequency domain smoothing method, so that the frequency spectrum of the audio signal is smoothed, and the purpose of eliminating the boundary effect is achieved.

Example eight:

801: intercepting a segment of signal near a data frame boundary of the audio signal; for example, point X is a boundary point of signal 1, and since the frame length of the general spectrogram is 256 points, 128 points are respectively intercepted from point X forward and 128 points are intercepted from point X backward, and the intercepted 256 points form a frame signal;

802: carrying out fast Fourier transform on the received audio signal to change a time domain into a frequency domain; intercepting the high-frequency part of the frequency domain, carrying out forward or backward dislocation addition to calculate an average value, and replacing the high-frequency part with the average value; carrying out inverse fast Fourier transform on the high-frequency part substituted by the average value to obtain a new audio signal, and substituting the new signal for the signal before the fast Fourier transform;

803: calculating evaluation indexes of the new audio signal, such as signal-to-noise ratio, spectral distortion degree and mean opinion score, returning to step 802 when the evaluation indexes are smaller than the preset evaluation indexes, and continuing the frequency domain smoothing until the evaluation indexes of the signal subjected to linear processing are larger than or equal to the preset evaluation indexes.

Compared with the first embodiment, the method for eliminating distortion through linear prediction is replaced by a frequency domain smoothing method to correct spectrum divergence, so that the spectrum of the audio signal is smoothed, and the aim of eliminating the boundary effect is fulfilled.

In the above embodiments, there is no strict timing relationship between the steps, and each reference numeral merely represents a process for implementing the embodiment of the present invention.

The apparatus provided by the embodiments of the present invention will be described in detail below with reference to the accompanying drawings:

Referring to FIG. 2, a schematic diagram of an apparatus according to an embodiment of the present invention includes:

an intercept signal unit 201 for intercepting a segment of the signal in the vicinity of a data frame boundary of the audio signal;

a linear processing unit 202, configured to perform linear processing on the received signal to obtain a new signal;

a calculating unit 203, configured to calculate evaluation indicators of the new signal, such as a signal-to-noise ratio, a spectrum distortion degree, and a mean opinion score;

a comparing unit 204, configured to receive the evaluation index from the calculating unit 203, compare the evaluation index with a preset evaluation index, and send the new signal to the linear processing unit 202 when the evaluation index is smaller than the preset evaluation index until the received evaluation index is greater than or equal to the preset evaluation index.

The above is a general description of a schematic diagram of an apparatus provided by an embodiment of the present invention, and the following is a detailed description by respectively exemplifying the embodiments:

referring to fig. 3, a schematic diagram of an apparatus provided in the first embodiment of the present invention includes:

a linear prediction unit 301, configured to perform linear prediction on a received signal to obtain a prediction value;

a replacing unit 302, configured to replace the jump value near a data frame boundary with the received prediction value to obtain a new signal;

a comparing unit 204, configured to receive the evaluation index from the calculating unit 203, compare the evaluation index with a preset evaluation index, and send the new signal to the linear predicting unit 301 when the evaluation index is smaller than the preset evaluation index until the received evaluation index is greater than or equal to the preset evaluation index.

The linear prediction unit 301 and the replacement unit 302 are disposed in the linear processing unit 202.

Referring to fig. 4, a schematic diagram of an apparatus provided in the second embodiment of the present invention includes:

a replacing unit 302, configured to replace the jump value near a data frame boundary with the received prediction value to obtain a new signal;

A drawing unit 401, configured to average at least two data points before and after data frame jump, and use the data points and the average to make a linear curve;

a linear interpolation unit 402, configured to interpolate data on the linear curve by using the average value as a reference point, or using any point in front of the average value as a reference point, or using any point behind the average value as a reference point, and replace the data at the original corresponding position with the interpolated data to obtain a new signal;

a comparing unit 204, configured to receive the evaluation index from the calculating unit 203, compare the evaluation index with a preset evaluation index, and send the new signal to the linear prediction unit 301 when the evaluation index is smaller than the preset evaluation index until the received evaluation index is greater than or equal to the preset evaluation index.

The linear prediction unit 301, the replacement unit 302, the drawing unit 401, and the linear interpolation unit 402 are disposed in the linear processing unit 202.

Compared with the schematic diagram provided in the first embodiment, the schematic diagram provided in the second embodiment adds the drawing unit 401 and the linear interpolation unit 402, so as to further correct the discontinuity of the audio signal on the basis of performing linear prediction on the audio signal to eliminate distortion.

Referring to fig. 5, a schematic diagram of an apparatus provided in the third embodiment of the present invention includes:

a fourier transform unit 501, configured to perform fast fourier transform on the new signal, so as to change a time domain into a frequency domain;

a frequency domain smoothing unit 502, configured to intercept a high frequency portion of the frequency domain, perform forward or backward staggered addition to obtain an average, and replace the high frequency portion with the average;

an inverse fourier transform unit 503, configured to perform inverse fast fourier transform on the high-frequency part substituted by the average value to obtain a new signal, and replace the signal before the fast fourier transform with the new signal after the fourier transform;

The linear prediction unit 301, the replacement unit 302, the fourier transform unit 501, the frequency domain smoothing unit 502, and the inverse fourier transform unit 503 are disposed in the linear processing unit 202.

Compared with the apparatus provided in the second embodiment, the apparatus provided in this embodiment replaces the drawing unit 401 and the linear interpolation unit 402 with a Fourier transform unit 501, a frequency domain smoothing unit 502, and an inverse Fourier transform unit 503. These units correct spectrum divergence and smooth the spectrum of the audio signal, thereby eliminating the boundary effect.
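The chain formed by units 501 to 503 can be sketched as follows. This is only one possible reading of "forward staggered addition": each high-frequency bin is replaced by the average of itself and the next bin. A direct O(n²) discrete Fourier transform stands in for the fast Fourier transform so that the sketch needs only the standard library. The names `dft`, `idft`, `smooth_high_band`, and the parameter `cutoff` are hypothetical.

```python
import cmath


def dft(x):
    """Direct discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Direct inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def smooth_high_band(x, cutoff):
    """Average each bin from `cutoff` upward with its next-higher neighbour.

    Bins below `cutoff` (including DC) are left untouched. The mirrored
    negative-frequency bins are rebuilt as complex conjugates so that the
    inverse transform of a real input stays real.
    """
    n = len(x)
    half = n // 2
    if not 1 <= cutoff <= half:
        raise ValueError("cutoff must lie between 1 and len(x) // 2")

    spectrum = dft(x)
    smoothed = list(spectrum)
    for k in range(cutoff, half):
        # Forward staggered addition: add the spectrum shifted by one bin,
        # then halve to obtain the average.
        smoothed[k] = (spectrum[k] + spectrum[k + 1]) / 2.0
        smoothed[n - k] = smoothed[k].conjugate()

    # Back to the time domain; the imaginary parts are rounding noise only.
    return [value.real for value in idft(smoothed)]
```

Because the DC bin is never modified, the sum of the samples is preserved. A backward variant would average each bin with the next-lower bin instead.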

Referring to fig. 6, a schematic diagram of an apparatus provided in the fourth embodiment of the present invention includes:

Among them, the linear prediction unit 301, the replacement unit 302, the drawing unit 401, the linear interpolation unit 402, the Fourier transform unit 501, the frequency domain smoothing unit 502, and the inverse Fourier transform unit 503 are disposed in the linear processing unit 202.

Compared with the apparatus provided in the first embodiment, the apparatus provided in this embodiment additionally includes a drawing unit 401, a linear interpolation unit 402, a Fourier transform unit 501, a frequency domain smoothing unit 502, and an inverse Fourier transform unit 503. On the basis of performing linear interpolation on the audio signal, these units further apply frequency domain smoothing to correct spectrum divergence, so that the spectrum of the audio signal is smoothed and the boundary effect is eliminated.

Referring to fig. 7, a schematic diagram of an apparatus provided in the fifth embodiment of the present invention includes:

an odd-even flag unit 701, configured to receive a signal from the signal truncation unit 201, intercept a segment of the signal from front to back and set it as an odd-flag signal, and send the odd-flag signal to the linear prediction unit 301 so as to obtain a predicted value of the odd-flag signal; and to intercept a segment of the signal from back to front and set it as an even-flag signal;

a front-back inversion unit 702, configured to perform front-back inversion on the predicted value of the even-flag signal when that predicted value is received from the linear prediction unit 301;

an average value unit 703, configured to receive the predicted value of the odd-flag signal and the front-back inverted predicted value of the even-flag signal, and to average the two to obtain the predicted value of the intercepted signal;

a comparing unit 204, configured to receive the evaluation index from the calculating unit 203, compare the evaluation index with a preset evaluation index, and send the new signal to the linear prediction unit 301 when the evaluation index is smaller than the preset evaluation index until the received evaluation index is greater than or equal to the preset evaluation index.

Among them, the linear prediction unit 301, the replacement unit 302, the drawing unit 401, the linear interpolation unit 402, the odd-even flag unit 701, the front-back inversion unit 702, and the average value unit 703 are disposed in the linear processing unit 202.

The odd-even flag unit 701, the front-back inversion unit 702, and the average value unit 703 provided in this embodiment may also be combined with the apparatus provided in the second embodiment, the third embodiment, or the fourth embodiment, in each case forming a new apparatus.

Compared with the apparatus provided in the first embodiment, the apparatus provided in this embodiment additionally includes an odd-even flag unit 701, a front-back inversion unit 702, and an average value unit 703. These units improve the accuracy of the linear prediction and therefore correct the audio signal better, so that the audio signal is continuous.
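The cooperation of units 701 to 703 can be sketched as follows, assuming that the jump values occupy a gap between a segment before the boundary and a segment after it. The embodiment does not fix the linear predictor, so a fixed two-tap predictor, x[n] = 2x[n-1] - x[n-2], is swapped in here purely for illustration. The names `extrapolate` and `bidirectional_predict` are hypothetical.

```python
def extrapolate(history, count):
    """Predict `count` further samples with the fixed two-tap linear
    predictor x[n] = 2*x[n-1] - x[n-2] (an illustrative stand-in)."""
    if len(history) < 2:
        raise ValueError("need at least two samples of history")
    window = list(history[-2:])
    predicted = []
    for _ in range(count):
        nxt = 2.0 * window[-1] - window[-2]
        predicted.append(nxt)
        window.append(nxt)
    return predicted


def bidirectional_predict(before, after, count):
    """Average a front-to-back and a back-to-front prediction of the gap."""
    # Odd-flag segment: read front to back, predicted forward into the gap.
    forward = extrapolate(before, count)

    # Even-flag segment: read back to front, so the predictor runs
    # backward in time from the samples that follow the gap.
    backward = extrapolate(after[::-1], count)

    # Front-back inversion: put the backward prediction into time order.
    backward.reverse()

    # Average value step: the mean of both predictions replaces the gap.
    return [(f + b) / 2.0 for f, b in zip(forward, backward)]
```

With `before=[0, 1, 2, 3]`, `after=[6, 7, 8, 9]`, and `count=2`, both directions predict the missing ramp values, so the result is `[4.0, 5.0]`. When the two directions disagree, the average splits the difference.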

Referring to fig. 8, a schematic diagram of an apparatus provided in the sixth embodiment of the present invention includes:

a comparing unit 204, configured to receive the evaluation index from the calculating unit 203, compare the evaluation index with a preset evaluation index, and send the new signal to the drawing unit 401 when the evaluation index is smaller than the preset evaluation index until the received evaluation index is greater than or equal to the preset evaluation index.

Among them, the drawing unit 401 and the linear interpolation unit 402 are disposed in the linear processing unit 202.

Compared with the apparatus provided in the first embodiment, the apparatus provided in this embodiment replaces the linear prediction unit 301 and the replacement unit 302 with the drawing unit 401 and the linear interpolation unit 402, which correct the discontinuity of the audio signal.

Referring to fig. 9, a schematic diagram of an apparatus provided in the seventh embodiment of the present invention includes:

Among them, the drawing unit 401, the linear interpolation unit 402, the Fourier transform unit 501, the frequency domain smoothing unit 502, and the inverse Fourier transform unit 503 are disposed in the linear processing unit 202.

Compared with the apparatus provided in the first embodiment, the apparatus provided in this embodiment replaces the linear prediction unit 301 and the replacement unit 302 with the drawing unit 401, the linear interpolation unit 402, the Fourier transform unit 501, the frequency domain smoothing unit 502, and the inverse Fourier transform unit 503. On the basis of performing linear interpolation on the audio signal, these units further apply frequency domain smoothing to correct spectrum divergence, so that the spectrum of the audio signal is smoothed and the boundary effect is eliminated.

Referring to fig. 10, a schematic diagram of an apparatus provided in an eighth embodiment of the present invention includes:

a comparing unit 204, configured to receive the evaluation index from the calculating unit 203, compare the evaluation index with a preset evaluation index, and send the new signal to the Fourier transform unit 501 when the evaluation index is smaller than the preset evaluation index until the received evaluation index is greater than or equal to the preset evaluation index.

Among them, the Fourier transform unit 501, the frequency domain smoothing unit 502, and the inverse Fourier transform unit 503 are disposed in the linear processing unit 202.

Compared with the apparatus provided in the first embodiment, the apparatus provided in this embodiment replaces the linear prediction unit 301 and the replacement unit 302 with the Fourier transform unit 501, the frequency domain smoothing unit 502, and the inverse Fourier transform unit 503. These units correct spectrum divergence and smooth the spectrum of the audio signal, thereby eliminating the boundary effect.

It can be seen from the above embodiments that, because discontinuities in the audio signal occur near the boundaries of adjacent data frames, a segment of the signal is intercepted near the data frame boundary, the intercepted signal is subjected to linear processing, the original signal is replaced with the linearly processed signal, and the evaluation index of the new signal is calculated. If the evaluation index is smaller than the preset evaluation index, the linear processing continues until the evaluation index of the processed signal is greater than or equal to the preset evaluation index.
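The interplay between the linear processing unit 202, the calculating unit 203, and the comparing unit 204 can be sketched as a loop. The sketch assumes that the linear processing and the evaluation index are supplied as functions. The three-point average and the "negative largest jump" index below are hypothetical stand-ins chosen only to keep the example short; they are not the linear prediction or the signal-to-noise ratio, spectral distortion, or mean opinion score named in the embodiments. The `max_rounds` guard is likewise an added assumption.

```python
def correct_until_acceptable(segment, process, evaluate, threshold, max_rounds=50):
    """Apply `process` repeatedly until `evaluate` reaches `threshold`.

    Returns the processed segment, its evaluation index, and the number of
    processing rounds performed. `max_rounds` (an assumption, not part of
    the embodiments) prevents an endless loop on unreachable thresholds.
    """
    current = process(list(segment))      # linear processing, first pass
    score = evaluate(current)             # calculating unit
    rounds = 1
    while score < threshold and rounds < max_rounds:   # comparing unit
        current = process(current)        # feed the new signal back in
        score = evaluate(current)
        rounds += 1
    return current, score, rounds


def three_point_average(segment):
    """Stand-in linear processing: average each interior sample with its
    two neighbours, keeping the end samples fixed."""
    out = list(segment)
    for i in range(1, len(segment) - 1):
        out[i] = (segment[i - 1] + segment[i] + segment[i + 1]) / 3.0
    return out


def smoothness_index(segment):
    """Stand-in evaluation index: the negated largest jump between
    neighbouring samples, so that a larger value means a smoother segment."""
    return -max(abs(b - a) for a, b in zip(segment, segment[1:]))
```

A call such as `correct_until_acceptable([0, 0, 0, 8, 8, 8], three_point_average, smoothness_index, threshold=-2.0)` keeps smoothing the segment until no neighbouring samples differ by more than 2, or until `max_rounds` is reached.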

Those skilled in the art will appreciate that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.

The method and apparatus for correcting an audio signal provided by the present invention have been described in detail above. Those skilled in the art will appreciate that various modifications, additions, and substitutions are possible without departing from the scope and spirit of the invention as disclosed in the accompanying claims.

Claims

1. A method for correcting an audio signal, comprising:

2. The method of claim 1, wherein said linearly processing the intercepted signal comprises:

performing linear prediction on the intercepted signal, and replacing the jump values near the data frame boundary with the obtained predicted values.

3. The method of claim 1, wherein said linearly processing the intercepted signal comprises:

averaging at least two data points before and after the jump of the data frame, and drawing a linear curve by using the data points and the average value;

and interpolating data on the linear curve by taking the average value as a reference point, or taking any point in front of the average value as a reference point, or taking any point behind the average value as a reference point, and replacing the data at the original corresponding position with the interpolated data.

4. The method of claim 1, wherein said linearly processing the intercepted signal comprises:

performing fast Fourier transform on the signal to change a time domain into a frequency domain;

intercepting the high-frequency part of the frequency domain, performing forward or backward staggered addition to obtain an average value, and replacing the high-frequency part with the average value;

and performing inverse fast Fourier transform on the high-frequency part substituted by the average value to obtain a new signal, and substituting the new signal for the signal before the fast Fourier transform.

5. The method of claim 2, wherein said linearly processing the intercepted signal comprises:

intercepting a segment of the signal from front to back and setting it as an odd-flag signal, and intercepting a segment of the signal from back to front and setting it as an even-flag signal;

performing linear prediction on the odd-flag signal and the even-flag signal to obtain a predicted value of each, and performing front-back inversion on the predicted value of the even-flag signal;

and averaging the predicted value of the odd-flag signal and the front-back inverted predicted value of the even-flag signal, and replacing the jump values near the data frame boundary with the average value.

6. The method of claim 1, wherein the evaluation index comprises:

a signal-to-noise ratio, a spectral distortion, or a mean opinion score of the audio signal.

7. An apparatus for correcting an audio signal, comprising:

8. The apparatus of claim 7, wherein the linear processing unit comprises:

a linear prediction unit, configured to perform linear prediction on the received signal to obtain a predicted value;

and a replacement unit, configured to replace the jump value near the data frame boundary with the received predicted value to obtain a new signal.

9. The apparatus of claim 8, further comprising:

an odd-even flag unit, configured to receive the signal from the signal intercepting unit, intercept a segment of the signal from front to back and set it as an odd-flag signal, and send the odd-flag signal to the linear prediction unit to obtain a predicted value of the odd-flag signal; and to intercept a segment of the signal from back to front and set it as an even-flag signal;

a front-back inversion unit, configured to perform front-back inversion on the predicted value of the even-flag signal when that predicted value is received from the linear prediction unit;

and an average value unit, configured to receive the predicted value of the odd-flag signal and the front-back inverted predicted value of the even-flag signal, and to average the two to obtain the predicted value of the intercepted signal.

10. The apparatus of claim 8, further comprising:

a drawing unit, configured to average at least two data points before and after the jump of the data frame, and to draw a linear curve by using the data points and the average value;

and a linear interpolation unit, configured to interpolate data on the linear curve by taking the average value as a reference point, or taking any point in front of the average value as a reference point, or taking any point behind the average value as a reference point, and to replace the data at the original corresponding position with the interpolated data.

11. The apparatus of claim 8, further comprising:

a Fourier transform unit, configured to perform fast Fourier transform on the signal to change a time domain into a frequency domain;

a frequency domain smoothing unit, configured to intercept the high-frequency part of the frequency domain, perform forward or backward staggered addition to obtain an average value, and replace the high-frequency part with the average value;

and an inverse Fourier transform unit, configured to perform inverse fast Fourier transform on the high-frequency part substituted with the average value to obtain a new signal, and to replace the signal before the fast Fourier transform with the new signal.

12. The apparatus of claim 7, wherein the linear processing unit comprises:

13. The apparatus of claim 12, further comprising:

14. The apparatus of claim 7, wherein the linear processing unit comprises: