KR20080051042A

KR20080051042A - Apparatus and method for decoding multi-channel audio signal using cross-correlation

Info

Publication number: KR20080051042A
Application number: KR1020070107406A
Authority: KR
Inventors: 장대영; 강경옥; 홍진우; 하마다하레오; 사이토토시오
Original assignee: 한국전자통신연구원
Priority date: 2006-12-04
Filing date: 2007-10-24
Publication date: 2008-06-10
Also published as: KR100917845B1

Abstract

An apparatus and method for decoding a multi-channel audio signal using cross-correlation are provided, which generate a multi-channel audio signal from a downmixed audio signal using the cross-correlation between the left and right channels and adjust the generated audio signal using encoding information. A multi-channel signal generator (121) generates a plurality of per-channel audio signals from the downmixed stereo audio signal using the cross-correlation value between the left and right channels. A multi-channel signal adjuster (122) adjusts the cross-correlation values and per-subband power values of the generated per-channel audio signals using the inter-channel cross-correlation information and virtual sound source direction information of the original signal. The multi-channel signal generator includes a first surround channel signal generator, which generates the downmixed surround left channel signal, and a second surround channel signal generator, which generates the downmixed surround right channel signal.

Description

Multi-channel audio signal decoding apparatus using cross-correlation and its method {APPARATUS AND METHOD FOR DECODING MULTI-CHANNEL AUDIO SIGNAL USING CROSS-CORRELATION}

The present invention relates to an apparatus and method for decoding a multi-channel audio signal using cross-correlation. More particularly, it relates to an apparatus and method that generate a multi-channel audio signal from a downmixed stereo audio signal using the cross-correlation value between the left and right channels, and then adjust the generated multi-channel audio signal using encoding information (cross-correlation information and virtual sound source direction information), so that the center channel and surround channel signals of the multi-channel audio signal are accurately reconstructed.

The present invention is derived from research conducted as part of the IT Next-Generation Core Technology Development Program of the Ministry of Information and Communication and the Institute for Information Technology Advancement [Project No.: 2005-S-403-02, Project title: Development of Intelligent Integrated Information Broadcasting (SmarTV) Technology].

With home theater systems becoming commonplace, the 5.1-channel audio format is establishing itself as the mainstream format for home audio. In portable audio equipment as well, a three-dimensional audio effect function that reproduces virtual surround through headphones or small built-in speakers is becoming an essential feature. Given this trend, the 5.1-channel audio format can be expected to become the basic audio playback format for both home and portable audio equipment.

However, conventional 5.1-channel audio technology has the problem that the amount of data grows with the number of channels. A multi-channel coding scheme that can compress the data effectively therefore plays an important role. For example, MPEG (Moving Picture Experts Group)-2 and MPEG-4 standardize multi-channel coding schemes based on perceptual coding. By their nature, however, these schemes have a bit rate that increases in proportion to the number of channels.

Recently, Binaural Cue Coding (BCC) was developed, in which the bit rate hardly increases even as the number of channels grows. BCC has a relatively simple structure. It downmixes the multi-channel audio to stereo or mono and computes the parameters needed to reconstruct the multi-channel audio signal from the downmix. These parameters may include the Inter-Channel Level Difference (ICLD), the Inter-Channel Time Difference (ICTD), and the Inter-Channel Cross-correlation (ICC).

Dolby Pro Logic is a representative technique for reconstructing a multi-channel audio signal from a stereo audio signal. With Dolby Pro Logic, however, some signal components may be unnecessarily removed or amplified in the spectrum, depending on the cross-correlation between the stereo signals. In particular, when a multi-channel audio signal is reconstructed from a stereo audio signal, simple addition and subtraction of the signals do not reconstruct the surround components accurately.

Accordingly, when the prior art described above reconstructs the original multi-channel audio signal from a downmixed stereo audio signal, the center channel, surround left channel, and surround right channel components are unnecessarily removed or amplified in the spectrum, so that the center channel and surround channel components are not faithfully reconstructed. Solving this problem is the object of the present invention.

The object of the present invention is therefore to provide an apparatus and method for decoding a multi-channel audio signal using cross-correlation, which generate a multi-channel audio signal from a downmixed stereo audio signal using the cross-correlation value between the left and right channels and adjust the generated multi-channel audio signal using encoding information (cross-correlation information and virtual sound source direction information), so that the center channel and surround channel signals of the multi-channel audio signal are accurately reconstructed.

The objects of the present invention are not limited to the one mentioned above. Other objects and advantages of the present invention that are not mentioned here can be understood from the following description and will become more apparent from the embodiments of the present invention. It will also be readily appreciated that the objects and advantages of the present invention can be realized by the means set out in the claims and by combinations thereof.

The present invention has been proposed to solve the above problem. It is characterized in that a multi-channel audio signal is generated from a downmixed stereo audio signal using the cross-correlation value between the left and right channels, and the generated multi-channel audio signal is adjusted using encoding information (cross-correlation information and virtual sound source direction information).

More specifically, the present invention provides an apparatus for decoding a multi-channel audio signal using cross-correlation, comprising: multi-channel signal generating means for generating a plurality of per-channel audio signals from a downmixed stereo audio signal using the cross-correlation value between the left and right channels; and multi-channel signal adjusting means for adjusting the cross-correlation values and the per-subband power values of the generated per-channel audio signals using the inter-channel cross-correlation information and the virtual sound source direction information of the original signal, so that the original signal of the downmixed stereo audio signal can be reconstructed.

The present invention also provides a method for decoding a multi-channel audio signal using cross-correlation, comprising: a multi-channel signal generating step of generating a plurality of per-channel audio signals from a downmixed stereo audio signal using the cross-correlation value between the left and right channels; and a multi-channel signal adjusting step of adjusting the cross-correlation values and the per-subband power values of the generated per-channel audio signals using the inter-channel cross-correlation information and the virtual sound source direction information of the original signal, so that the original signal of the downmixed stereo audio signal can be reconstructed.

The present invention as described above generates a multi-channel audio signal from a downmixed stereo audio signal according to the cross-correlation value between the left and right channels, and adjusts the multi-channel audio signal using spatial audio perceptual cues consisting of inter-channel cross-correlation and virtual sound source direction information. As a result, the center channel and surround channel signals of the multi-channel audio signal can be accurately reconstructed.

In addition, by adjusting the multi-channel audio signal using spatial audio perceptual cues consisting of inter-channel cross-correlation and virtual sound source direction information, the present invention can mitigate spectral distortion.

The above objects, features, and advantages will become clearer from the detailed description that follows with reference to the accompanying drawings, so that a person of ordinary skill in the art to which the present invention pertains can readily practice its technical idea. In describing the present invention, detailed descriptions of related known techniques are omitted where they would unnecessarily obscure the gist of the invention. Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram of an embodiment of an apparatus for decoding a multi-channel audio signal using cross-correlation according to the present invention, shown together with a multi-channel audio signal encoding apparatus.

As shown in FIG. 1, the multi-channel audio signal decoding apparatus (120) using cross-correlation according to the present invention includes a multi-channel signal generator (121) and a multi-channel signal adjuster (122). For reference, the encoding apparatus (110) includes a downmixer (112) and a spatial audio perceptual cue analyzer (111).

Each component of the encoding apparatus (110) and the decoding apparatus (120) is described in detail below.

The spatial audio perceptual cue analyzer (111) receives the multi-channel audio signal and subband-filters it into the subbands associated with each channel. It then analyzes the level differences and cross-correlation between adjacent channels in the subband-filtered audio signal of each channel to extract spatial audio perceptual cues. Here, the spatial audio perceptual cues comprise inter-channel cross-correlation values and virtual sound source direction information.

The downmixer (112) compresses the multi-channel audio signal received from the spatial audio perceptual cue analyzer (111) into a stereo audio signal. That is, the downmixer (112) mixes the multi-channel audio spectrum that was subband-filtered by the spatial audio perceptual cue analyzer (111) into a downmixed stereo audio signal, and converts the downmixed stereo audio signal into a time-domain signal.

The general matrix equation used for downmixing in the downmixer (112) is given by [Equation 1] below.

$$L_{dm} = L + SL + \frac{\sqrt{2}}{2}\,C$$

$$R_{dm} = R + SR + \frac{\sqrt{2}}{2}\,C$$

Here, L_dm and R_dm are the downmixed left and right stereo channel signals, L and R are the left and right channel signals of the multi-channel audio signal, SL and SR are the surround left and surround right channel signals, and C is the center channel signal. The commonly used low-frequency effects (LFE) channel signal can be handled in the same way as C, by splitting it and adding it to both downmix channels.
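The downmix of Equation 1 can be sketched in a few lines of Python. This is a minimal illustration that works on time-domain sample lists, whereas the description mixes subband spectra; the function name `downmix_to_stereo` and the list-based representation are assumptions made for the example, not part of the patent.

```python
import math


def downmix_to_stereo(L, R, SL, SR, C):
    """Mix five equal-length channel sample lists into a stereo pair (Equation 1)."""
    g = math.sqrt(2) / 2  # center-channel gain, about 0.7071
    # Each downmix channel = front channel + surround channel + scaled center.
    L_dm = [l + sl + g * c for l, sl, c in zip(L, SL, C)]
    R_dm = [r + sr + g * c for r, sr, c in zip(R, SR, C)]
    return L_dm, R_dm
```

An LFE channel, if present, could be added to both outputs with its own gain in the same way as C, as the paragraph above notes.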

Each component of the multi-channel audio signal decoding apparatus (120) using cross-correlation according to the present invention is described in detail below.

The multi-channel signal generator (121) generates a downmixed multi-channel audio signal by separating the downmixed stereo audio signal, which the encoding apparatus (110) produced from the multi-channel audio signal (original signal), according to the inter-channel cross-correlation value. Using adaptive filters, the generator adaptively extracts the signal common to the two channels and the signals independent of each other from the received downmixed stereo audio signal, thereby partially separating the surround channel signals from the center channel signal. In other words, the multi-channel signal generator (121) generates a downmixed center channel signal and downmixed surround channel signals from the downmixed stereo (left and right channel) audio signal of the encoding apparatus (110), using adaptive filters together with the sum and difference signals of the two channels.

The multi-channel signal adjuster (122) adjusts the inter-channel cross-correlation values of the downmixed multi-channel audio signal generated by the multi-channel signal generator (121) to match the inter-channel cross-correlation information of the original signal received from the encoding apparatus (110), and then adjusts the per-subband power values of the adjusted multi-channel audio signal to match the virtual sound source direction information of the original signal received from the encoding apparatus (110). Using the inter-channel cross-correlation and virtual sound source direction information extracted by the encoding apparatus (110), the adjuster (122) adjusts the cross-correlation and the shape of the per-subband spectrum of the downmixed multi-channel audio signal generated by the generator (121). That is, the multi-channel signal adjuster (122) reconstructs the original multi-channel signal by adjusting the cross-correlation and subband power of the multi-channel audio signal output by the multi-channel signal generator (121), and outputs the multi-channel audio signal whose cross-correlation and spectral shape have been adjusted.

FIG. 2 is a detailed block diagram of an embodiment of the spatial audio perceptual cue analyzer of FIG. 1 used in the present invention.

As shown in FIG. 2, the spatial audio perceptual cue analyzer (111) includes first to fifth subband filters (201 to 205), one for each channel, and a spatial audio perceptual cue extractor (206).

The first to fifth subband filters (201 to 205) subband-filter the externally supplied multi-channel audio signal, dividing each channel into subbands based on the characteristics of human hearing. They then pass the subband-filtered first to fifth channel audio signals to the spatial audio perceptual cue extractor (206).

The spatial audio perceptual cue extractor (206) analyzes the first to fifth channel audio signals subband-filtered by the first to fifth subband filters (201 to 205) and extracts spatial audio perceptual cues comprising cross-correlation information between adjacent channels and virtual sound source direction information. That is, the extractor (206) generates inter-channel cross-correlation information and virtual sound source direction information for each subband. It then passes the first to fifth channel audio signals to the downmixer (112) and transmits the generated inter-channel cross-correlation and virtual sound source direction information to the decoding apparatus (120).

Here, the inter-channel cross-correlation information may be computed in the frequency domain for each subband signal. The virtual sound source direction information may be computed, from the subband power ratio of adjacent channel signals, as an angle lying between the placement angles of the adjacent channel loudspeakers.
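The description does not give formulas for these two cues, so the following Python sketch is only one plausible reading. It assumes a zero-lag normalized cross-correlation on real-valued subband samples (the patent computes the correlation in the frequency domain) and a simple linear interpolation between the two loudspeaker angles by power fraction. Both function names and the interpolation rule are hypothetical.

```python
import math


def normalized_cross_correlation(x, y):
    """Zero-lag normalized cross-correlation of two equal-length subband signals."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0


def virtual_source_angle(power_a, power_b, angle_a, angle_b):
    """Place a virtual source between two adjacent loudspeakers.

    Assumption: the angle moves linearly from angle_a to angle_b as the
    share of subband power in channel b grows from 0 to 1.
    """
    total = power_a + power_b
    if total <= 0:
        return (angle_a + angle_b) / 2  # no energy: fall back to the midpoint
    return angle_a + (angle_b - angle_a) * (power_b / total)
```

Other panning laws, such as a tangent law, would map the same power ratio to a different angle; the linear rule is chosen here only because it is the simplest mapping that stays between the two loudspeaker angles.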

FIG. 3 is a detailed block diagram of an embodiment of the multi-channel signal generator of FIG. 1 according to the present invention.

As shown in FIG. 3, the multi-channel signal generator (121) includes a first surround channel signal generator (310), a second surround channel signal generator (320), and a center channel signal generator (330). The first surround channel signal generator (310) includes a first adaptive filter (311) and first and second subtractors (312, 313). The second surround channel signal generator (320) includes a second adaptive filter (321) and third and fourth subtractors (322, 323). The center channel signal generator (330) includes an adder (331) and a divider (332).

The multi-channel signal generator (121) generates a downmixed multi-channel audio signal by separating the downmixed stereo audio signal, which the encoding apparatus (110) produced from the multi-channel audio signal (original signal), according to the inter-channel cross-correlation value. The surround channel components of the downmixed multi-channel audio signal are obtained by using the difference between the downmixed left and right channel signals to update the coefficients of the first and second adaptive filters (311, 321). Through the smoothing effect of the adaptive filters, the generator (121) can eliminate the distortion of specific spectral components caused by phase differences.

Each component of the multi-channel signal generator (121) is described in detail below.

The first surround channel signal generator (310) generates the downmixed surround left channel signal by using the cross-correlation value to remove the center channel component and the surround right channel component from the downmixed left channel audio signal of the downmixed stereo audio signal. That is, the first surround channel signal generator (310) receives the downmixed right channel signal and the downmixed left channel signal, and generates the downmixed surround left channel signal from them using the first adaptive filter (311) and the first and second subtractors (312, 313).

Here, the first adaptive filter (311) suppresses the center channel component, which is the component common to both channels, and passes the surround signal, which is the independent component. The first subtractor (312) subtracts the downmixed left channel signal that has passed through the first adaptive filter (311) from the downmixed right channel signal and outputs the result as an error signal. This error signal is used to update the coefficients of the first adaptive filter (311). The second subtractor (313) then subtracts the output signal of the first subtractor (312) from the downmixed left channel signal to generate the downmixed surround left channel signal. This subtraction of the output of the first subtractor (312) is performed to maximize the front-rear cross-correlation.
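The signal flow just described can be sketched as follows. The patent does not name the adaptation rule, so this example assumes a normalized LMS (NLMS) update. The function name, the tap count, the step size `mu`, and the regularizer `eps` are all hypothetical choices for illustration, and the sketch follows the wording of the description rather than a verified reference implementation.

```python
def first_surround_sketch(l_dm, r_dm, taps=8, mu=0.5, eps=1e-8):
    """Follow the described flow of generator (310) with an assumed NLMS filter.

    Returns (surround_left, errors), one value per input sample.
    """
    w = [0.0] * taps    # adaptive filter coefficients (filter 311)
    buf = [0.0] * taps  # most recent left-channel samples, newest first
    surround_left, errors = [], []
    for l, r in zip(l_dm, r_dm):
        buf = [l] + buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, buf))  # filtered left channel
        e = r - y                                   # first subtractor (312): error signal
        norm = eps + sum(xi * xi for xi in buf)
        # NLMS coefficient update driven by the error signal (assumed rule).
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        errors.append(e)
        surround_left.append(l - e)                 # second subtractor (313)
    return surround_left, errors
```

When both inputs carry the same signal, the filter learns to predict the right channel from the left and the error signal decays toward zero; the error therefore tracks the part of the right channel that is not correlated with the left.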

The second surround channel signal generator (320) generates the downmixed surround right channel signal by using the cross-correlation value to remove the center channel component and the surround left channel component from the downmixed right channel audio signal of the downmixed stereo audio signal. That is, the second surround channel signal generator (320) receives the downmixed left channel signal and the downmixed right channel signal, and generates the downmixed surround right channel signal from them using the second adaptive filter (321) and the third and fourth subtractors (322, 323).

Here, the second adaptive filter (321) suppresses the center channel component, which is the component common to both channels, and passes the surround signal, which is the independent component. The third subtractor (322) subtracts the downmixed right channel signal from the downmixed left channel signal and outputs the result as an error signal. This error signal is used to update the coefficients of the second adaptive filter (321). The fourth subtractor (323) then subtracts the output signal of the third subtractor (322) from the downmixed right channel signal to output the downmixed surround right channel signal. This subtraction of the output of the third subtractor (322) is performed to maximize the front-rear cross-correlation.

The center channel signal generator (330) generates the downmixed center channel signal by combining the left channel audio signal and the right channel audio signal of the downmixed stereo audio signal. That is, the center channel signal generator (330) receives the downmixed left channel signal and the downmixed right channel signal, adds the two signals, and halves the sum to generate the downmixed center channel signal.
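The adder (331) and divider (332) amount to a sample-wise average, which can be sketched as below. The function name `center_channel` is an assumption for illustration.

```python
def center_channel(l_dm, r_dm):
    """Average the downmixed left and right channels sample by sample."""
    # Adder (331) forms l + r; divider (332) halves the sum.
    return [(l + r) / 2 for l, r in zip(l_dm, r_dm)]
```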

FIG. 4 is a detailed block diagram of an embodiment of the multi-channel signal adjuster of FIG. 1 according to the present invention.

As shown in FIG. 4, the multi-channel signal adjuster (122) includes sixth to tenth subband filters (401 to 405), first and second cross-correlation adjusters (406, 407), a multi-channel power ratio calculator (408), and a signal converter (409).

The multi-channel signal adjuster (122) adjusts the inter-channel cross-correlation values of the downmixed multi-channel audio signal generated by the multi-channel signal generator (121) to match the inter-channel cross-correlation information of the original signal, and then adjusts the per-subband power values of the adjusted multi-channel audio signal to match the virtual sound source direction information of the original signal. That is, the adjuster (122) reconstructs the original multi-channel audio signal by adjusting the cross-correlation and subband power of the multi-channel audio signal generated by the generator (121).

Each component of the multi-channel signal adjuster (122) is described in detail below.

The sixth to tenth subband filters (401 to 405) each subband-filter one channel of the downmixed multi-channel audio signal generated by the multi-channel signal generator (121).

The multi-channel power ratio calculator (408) calculates the per-subband power ratio of the multi-channel signal from the virtual sound source direction information received from the encoding apparatus (110).

The first and second cross-correlation adjusters (406, 407) adjust the inter-channel cross-correlation values of the downmixed multi-channel audio signal, subband-filtered by the sixth to tenth subband filters (401 to 405), to match the inter-channel cross-correlation information of the original signal.

The signal converter (409) adjusts the per-subband power values of the multi-channel audio signal, whose cross-correlation has been adjusted by the first and second cross-correlation adjusters (406, 407), to match the multi-channel power ratio calculated by the multi-channel power ratio calculator (408), and converts the result to the time domain. That is, the signal converter (409) adjusts the power values of the multi-channel audio signal, for each subband corresponding to the power of the surround signals output by the first and second cross-correlation adjusters (406, 407), to match the power ratio calculated by the calculator (408), and converts the power-adjusted signal to the time domain.
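The power adjustment within one subband can be sketched as a per-channel gain. This is a minimal sketch under stated assumptions: the total subband power across channels is kept unchanged and each channel is scaled so that its share equals the target ratio. The function name `match_power_ratio` and the power-preserving convention are hypothetical, and the conversion to the time domain is left out.

```python
import math


def match_power_ratio(subbands, target_ratio):
    """Scale the channels of ONE subband so their powers follow target_ratio.

    subbands: list of per-channel sample lists for the subband.
    target_ratio: one non-negative weight per channel; the weights are
        normalized here and are assumed to have a positive sum.
    """
    powers = [sum(s * s for s in ch) for ch in subbands]
    total_power = sum(powers)
    weight_sum = sum(target_ratio)
    scaled = []
    for ch, p, wgt in zip(subbands, powers, target_ratio):
        target = total_power * wgt / weight_sum           # desired power of this channel
        gain = math.sqrt(target / p) if p > 0 else 0.0    # silent channels stay silent
        scaled.append([gain * s for s in ch])
    return scaled
```

For example, two channels with subband powers 2 and 8 and equal target weights would each end up with power 5, leaving the total of 10 unchanged.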

FIG. 5 is a flowchart of an embodiment of a multi-channel audio signal decoding method using cross-correlation according to the present invention.

First, regarding the encoding method, the spatial acoustic perceptual cue analysis unit 111 subband-filters the multi-channel audio signal and extracts, from each filtered channel audio signal, spatial acoustic perceptual cues that include the cross-correlation between adjacent channels and the virtual sound source direction information.

The downmixer 112 then downmixes the filtered channel audio signals into a stereo audio signal and encodes it.

The multi-channel audio signal decoding method according to the present invention is described below.

The multi-channel signal generator 121 takes the downmixed stereo audio signal received from the encoding apparatus 110, that is, the stereo signal into which the multi-channel audio signal (original signal) was downmixed, and separates it according to the cross-correlation values between the channels to generate downmixed multi-channel audio signals (502).
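
The reference numerals and claims associate this separation with adaptive filters (311, 321) and subtractors. As a rough sketch only, the following uses a normalized LMS (NLMS) filter to subtract from one downmixed channel the part that is predictable from the other. NLMS is an assumption, since the adaptation algorithm is not named in the text, and the function name and the parameters `taps` and `mu` are hypothetical.

```python
import numpy as np

def nlms_decorrelate(primary, reference, taps=8, mu=0.1, eps=1e-8):
    """Remove from `primary` the component correlated with `reference`.

    An adaptive FIR filter predicts `primary` from `reference`; the
    subtractor output (prediction error) is returned as the residual.
    NLMS is an assumed stand-in for the unspecified adaptation rule.
    """
    primary = np.asarray(primary, dtype=float)
    reference = np.asarray(reference, dtype=float)
    w = np.zeros(taps)            # adaptive filter coefficients
    buf = np.zeros(taps)          # most recent reference samples
    residual = np.zeros_like(primary)
    for n in range(len(primary)):
        buf = np.roll(buf, 1)
        buf[0] = reference[n]
        e = primary[n] - w @ buf  # subtractor output
        residual[n] = e
        w += mu * e * buf / (buf @ buf + eps)  # normalized LMS update
    return residual
```

With the downmixed left channel as `primary` and the downmixed right channel as `reference`, the residual approximates the left-only content; swapping the roles gives the right-side counterpart.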

The multi-channel signal adjusting unit 122 then adjusts the inter-channel cross-correlation values of the generated downmixed multi-channel audio signals to match the inter-channel cross-correlation information of the original signal (504).

Thereafter, the multi-channel signal adjusting unit 122 adjusts the per-subband power values of the adjusted multi-channel audio signals to match the virtual sound source direction information of the original signal, and decodes them (506).

Meanwhile, the method of the present invention described above can be written as a computer program. The code and code segments constituting the program can be readily inferred by a computer programmer skilled in the art. The program is stored on a computer-readable recording medium (information storage medium) and is read and executed by a computer to implement the method of the present invention. The recording medium includes all forms of computer-readable recording media.

The present invention described above is not limited by the foregoing embodiments and the accompanying drawings, because a person of ordinary skill in the art to which the invention pertains can make various substitutions, modifications, and changes without departing from the technical spirit of the invention.

FIG. 1 is a block diagram of an embodiment of a multi-channel audio signal decoding apparatus using cross-correlation according to the present invention;

FIG. 2 is a detailed block diagram of an embodiment of the spatial acoustic perceptual cue analysis unit of FIG. 1 used in the present invention;

FIG. 3 is a detailed block diagram of an embodiment of the multi-channel signal generator of FIG. 1 according to the present invention;

FIG. 4 is a detailed block diagram of an embodiment of the multi-channel signal adjusting unit of FIG. 1 according to the present invention.

* Description of reference numerals for the main parts of the drawings

120: decoding apparatus 121: multi-channel signal generator

122: multi-channel signal adjusting unit 310: first surround channel signal generator

320: second surround channel signal generator 311: first adaptive filter

321: second adaptive filter 330: center channel signal generator

401 to 405: sixth to tenth subband filtering units

406: first cross-correlation adjustment unit 407: second cross-correlation adjustment unit

408: multi-channel power ratio calculator 409: signal converter

Claims

A multi-channel audio signal decoding apparatus using cross-correlation, comprising:

Multi-channel signal generating means for generating a plurality of channel-specific audio signals from a downmixed stereo audio signal using cross-correlation values between the left and right channels; and

Multi-channel signal adjusting means for adjusting the cross-correlation values and per-subband power values of the generated plurality of channel-specific audio signals, using the inter-channel cross-correlation information and the virtual sound source direction information of the original signal, so as to restore the original signal from the downmixed stereo audio signal.

The apparatus of claim 1,

Wherein the multi-channel signal generating means comprises:

First surround channel signal generating means for generating a downmixed surround left channel signal by removing a center channel signal component and a surround right channel signal component, using a cross-correlation value, from the downmixed left channel audio signal of the downmixed stereo audio signal;

Second surround channel signal generating means for generating a downmixed surround right channel signal by removing a center channel signal component and a surround left channel signal component, using a cross-correlation value, from the downmixed right channel audio signal of the downmixed stereo audio signal; and

Center channel signal generating means for generating a downmixed center channel signal by combining the left channel audio signal and the right channel audio signal of the downmixed stereo audio signal.

The apparatus of claim 2,

Wherein the first surround channel signal generating means

Comprises an adaptive filter and a subtractor for removing the center channel signal component and the surround right channel signal component, using the cross-correlation value, from the downmixed left channel audio signal.

The apparatus of claim 2,

Wherein the second surround channel signal generating means

Comprises an adaptive filter and a subtractor for removing the center channel signal component and the surround left channel signal component, using the cross-correlation value, from the downmixed right channel audio signal.

The apparatus according to any one of claims 1 to 4,

Wherein the multi-channel signal adjusting means comprises:

Multi-channel power ratio calculating means for calculating a multi-channel power ratio from the virtual sound source direction information received from the encoding apparatus;

A plurality of subband filtering means for subband-filtering each of the downmixed multi-channel audio signals generated by the multi-channel signal generating means;

A plurality of cross-correlation adjustment means for adjusting the cross-correlation values of the subband-filtered downmixed multi-channel audio signals to match the cross-correlation information of the original signal; and

Signal conversion means for adjusting the per-subband power values of the cross-correlation-adjusted multi-channel audio signals to match the calculated multi-channel power ratio and converting the result to the time domain.

A multi-channel audio signal decoding method using cross-correlation, comprising:

A multi-channel signal generation step of generating a plurality of channel-specific audio signals from a downmixed stereo audio signal using cross-correlation values between the left and right channels; and

A multi-channel signal adjustment step of adjusting the cross-correlation values and per-subband power values of the generated plurality of channel-specific audio signals, using the inter-channel cross-correlation information and the virtual sound source direction information of the original signal, so as to restore the original signal from the downmixed stereo audio signal.

The method of claim 6,

Wherein the multi-channel signal generation step comprises:

Generating a downmixed surround left channel signal by removing a center channel signal component and a surround right channel signal component, using a cross-correlation value, from the downmixed left channel audio signal of the downmixed stereo audio signal;

Generating a downmixed surround right channel signal by removing a center channel signal component and a surround left channel signal component, using a cross-correlation value, from the downmixed right channel audio signal of the downmixed stereo audio signal; and

Generating a downmixed center channel signal by combining the left channel audio signal and the right channel audio signal of the downmixed stereo audio signal.

The method according to claim 6 or 7,

Wherein the multi-channel signal adjustment step comprises:

A multi-channel power ratio calculating step of calculating a multi-channel power ratio from the virtual sound source direction information received from the encoding apparatus;

A subband filtering step of subband-filtering each of the downmixed multi-channel audio signals generated in the multi-channel signal generation step;

A cross-correlation adjustment step of adjusting the cross-correlation values of the subband-filtered downmixed multi-channel audio signals to match the cross-correlation information of the original signal; and

A signal conversion step of adjusting the per-subband power values of the cross-correlation-adjusted multi-channel audio signals to match the calculated multi-channel power ratio and converting the result to the time domain.