JP2010034760A

JP2010034760A - Video/audio signal processing apparatus and video/audio signal processing method

Info

Publication number: JP2010034760A
Application number: JP2008193428A
Authority: JP
Inventors: Shunta Echigoya; 俊太越後谷; Masahito Noguchi; 雅人野口
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2008-07-28
Filing date: 2008-07-28
Publication date: 2010-02-12

Abstract

PROBLEM TO BE SOLVED: To enable highly flexible audio signal processing when an asynchronous video signal on which an audio signal is superimposed is output as a video signal synchronized with a reference signal.

SOLUTION: A video signal DSa on which an audio signal is superimposed is written to a memory 16, and the video signal written to the memory 16 is read out in synchronization with a reference signal. An audio signal DAa (DAb) separated from the video signal, or an input audio signal DAc, is written to a memory 26, and the audio signal written to the memory 26 is read out in synchronization with the reference signal. A multiplexer 33 superimposes the audio signal DAe (DAf) read from the memory 26 in place of the audio signal superimposed on the video signal DSb (DSc) read from the memory 16. Because the audio signal is stored separately from the video signal, highly flexible audio signal processing can be performed using the separately stored audio signal.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a video/audio signal processing apparatus and a video/audio signal processing method. More specifically, when an asynchronous video signal on which an audio signal is superimposed is output as a video signal synchronized with a reference signal, a memory for storing the audio signal is provided so that the audio signal can be processed with a high degree of freedom.

Conventionally, in a video/audio signal apparatus such as a frame synchronizer, the continuity of an auxiliary signal superimposed on a video signal is maintained even when parts of the video signal are dropped or repeated as a result of converting the synchronization of the video signal to that of another video signal.

For example, in the invention of Patent Document 1, a video/auxiliary data separation unit separates video data from auxiliary data. A video data storage unit stores the separated video data, and an auxiliary data storage unit stores the separated auxiliary data. When the video data is not read out in input order, the read order of the auxiliary data is controlled so that the auxiliary data is still read out in input order. The auxiliary data is thus output continuously without being dropped or repeated.

A frame synchronizer also switches audio signals, not only video signals and auxiliary signals. FIG. 12 shows the configuration of a conventional frame synchronizer 60.

The frame synchronizer 60 is supplied with an asynchronous video signal on which an audio signal is superimposed, as a stream in a predetermined format such as the SDI format. The stream DSin is converted into a parallel signal by a serial/parallel converter (S/P) 61 and supplied to a video signal input processing unit 62.

The video signal input processing unit 62 processes the video signal supplied from the serial/parallel converter 61, for example by adjusting brightness or inserting video, and supplies the processed video signal DSh to a memory controller 65.

The memory controller 65 writes the supplied asynchronous video signal DSh to a memory 66 and reads the video signal written to the memory 66 in synchronization with a reference signal, thereby supplying a demultiplexer 67 with a video signal DSj whose synchronization phase is independent of the input video signal DSh. The memory controller 65 also performs reposition processing by changing the read start position of the video signal written to the memory 66, for example moving the display position of the video vertically so that the lower area of the video is displayed in the upper area of the screen.

The demultiplexer 67 separates the video signal DSj, on which the audio signal is superimposed, into a video signal DVj and an audio signal DAj. It supplies the video signal DVj to a video signal output processing unit 71 and the audio signal DAj to an audio signal output processing unit 72.

The video signal output processing unit 71 replaces the blanking period when the memory controller 65 has performed reposition processing. That is, when the read start position of the video signal written to the memory 66 is changed to move the display position of the video vertically, the blanking period moves as well. The video signal output processing unit 71 therefore replaces the blanking period so that the vertically moved video can be displayed. It also masks unnecessary display caused by the reposition processing, for example the display of the upper area of the video and the display corresponding to the blanking period before replacement. The video signal output processing unit 71 supplies the processed video signal DVk to a multiplexer 73.

When the reposition processing changes the order of the audio signal recorded in the auxiliary data area of the video signal, the audio signal output processing unit 72 mutes the audio so that the reordered audio signal is not output. The audio signal output processing unit 72 also performs audio fade processing and the like, and supplies the processed audio signal DAk to the multiplexer 73.

The multiplexer 73 inserts the audio signal DAk supplied from the audio signal output processing unit 72 into the auxiliary data area of the video signal DVk supplied from the video signal output processing unit 71, and supplies the result to a parallel/serial converter (P/S) 74.

The parallel/serial converter 74 converts the video signal supplied from the multiplexer 73 into a serial signal and outputs it as an SDI-format stream DSout'.

A local CPU 81 generates a control signal CTb in accordance with a control signal CTm supplied from a main CPU (not shown). The local CPU 81 supplies the generated control signal CTb to the memory controller 65, the video signal output processing unit 71, the audio signal output processing unit 72, and other units, and controls each unit so that the frame synchronizer 60 operates as instructed by the control signal CTm.

Patent Document 1: JP 2004-221841 A

In the frame synchronizer 60 shown in FIG. 12, however, video signal output processing is applied to the video signal DSj read from the memory 66, and audio signal output processing is applied to the audio signal superimposed on that same video signal DSj. Highly flexible audio signal output processing therefore cannot be achieved with a simple configuration. For example, because reposition processing changes the order of the audio signal, the audio is muted, so no audio can be output while reposition processing is performed. Likewise, audio fade-out and fade-in are possible, but audio switching such as an audio cross-fade is not.

The present invention therefore provides a video/audio signal processing apparatus and a video/audio signal processing method that enable highly flexible audio signal processing when an asynchronous video signal on which an audio signal is superimposed is output in synchronization with a reference signal.

A first aspect of the present invention is a video/audio signal processing apparatus including: a first memory that stores a video signal on which an audio signal is superimposed; a first memory controller that reads the video signal written to the first memory in synchronization with a reference signal; a demultiplexer that separates the audio signal superimposed on the video signal; a second memory that stores the separated audio signal or an input audio signal; a second memory controller that reads the audio signal written to the second memory in synchronization with the reference signal; and a multiplexer that superimposes the audio signal read from the second memory in place of the audio signal superimposed on the video signal read from the first memory.

A second aspect of the present invention is a video/audio signal processing method including the steps of: storing a video signal on which an audio signal is superimposed in a first memory; reading the video signal written to the first memory in synchronization with a reference signal; separating the audio signal superimposed on the video signal; storing the separated audio signal or an input audio signal in a second memory; reading the audio signal written to the second memory in synchronization with the reference signal; and superimposing the audio signal read from the second memory in place of the audio signal superimposed on the video signal read from the first memory.

In the present invention, an input video signal on which an audio signal is superimposed, for example in a blanking period, is written to the first memory by the first memory controller. The video signal written to the first memory is then read out by the first memory controller in synchronization with the reference signal.

The audio signal superimposed on the video signal is separated by the demultiplexer, and this separated audio signal or an input audio signal is written to the second memory by the second memory controller. The audio signal written to the second memory is then read out by the second memory controller in synchronization with the reference signal.

The audio signal output processing unit determines, for example, the sequence of the number of samples per video frame of the audio signal superimposed on the video signal read from the first memory. Based on this determination, it adjusts the number of samples of the audio signal read from the second memory so that it equals the number of samples per video frame of the audio signal superimposed on the video signal read from the first memory. This adjustment uses information indicating the number of samples per video frame of the audio signal written to the second memory.
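
The idea of a per-frame sample-count sequence can be illustrated with a short calculation. The sketch below is not taken from the patent: it assumes 48 kHz audio and a 30000/1001 Hz frame rate (values chosen because they yield the 1602-sample frames mentioned in the embodiment), and the function name `samples_per_frame_sequence` and the rounding rule that fixes the phase of the sequence are hypothetical.

```python
from fractions import Fraction
from typing import List


def samples_per_frame_sequence(sample_rate: int, frame_rate: Fraction, frames: int) -> List[int]:
    """Split a fractional samples-per-frame ratio into whole per-frame counts.

    The running total is rounded half up, so the long-term average stays exact.
    """
    per_frame = Fraction(sample_rate) / frame_rate  # e.g. 8008/5 = 1601.6
    counts = []
    emitted = 0
    for n in range(1, frames + 1):
        target = int(per_frame * n + Fraction(1, 2))  # rounded cumulative sample count
        counts.append(target - emitted)
        emitted = target
    return counts


# Assumed values: 48 kHz audio, 30000/1001 frames per second.
seq = samples_per_frame_sequence(48000, Fraction(30000, 1001), 5)
print(seq, sum(seq))  # [1602, 1601, 1602, 1601, 1602] 8008
```

Under these assumptions the counts repeat every five frames, which is why the number of samples read from the second memory for a given frame has to follow the position in the sequence detected on the video side.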

The multiplexer superimposes the audio signal processed by the audio signal output processing unit in place of the audio signal superimposed on the video signal read from the first memory.

The audio signal output processing unit also switches from one of the audio signal read from the second memory and the audio signal separated by the demultiplexer to the other, and supplies the result to the multiplexer.

A video signal output processing unit that performs signal processing on the video signal read from the first memory is also provided. When the first memory controller performs reposition processing, which moves the display position of the video by changing the read start position of the video signal written to the first memory, the video signal output processing unit replaces the blanking period of the video signal read from the first memory. In addition, the audio signal is separated from the video signal to be written to the first memory and is written to the second memory, and the audio signal written to the second memory is read out and superimposed on the video signal after the reposition processing.

According to the present invention, a video signal on which an audio signal is superimposed is stored in the first memory, and the video signal written to the first memory is read out in synchronization with a reference signal. The audio signal superimposed on the video signal is separated, and this separated audio signal or an input audio signal is stored in the second memory. The audio signal written to the second memory is read out in synchronization with the reference signal and is superimposed in place of the audio signal superimposed on the video signal read from the first memory. Storing the audio signal separately from the video signal in this way enables highly flexible audio signal processing when an asynchronous video signal on which an audio signal is superimposed is output as a video signal synchronized with the reference signal.

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows the configuration of a video/audio signal processing apparatus 10 that processes video signals and audio signals. The video/audio signal processing apparatus 10 has not only the function of a frame synchronizer, that is, the function of outputting an asynchronous video signal on which an audio signal is superimposed as a video signal synchronized with a reference signal, but also editing functions such as audio signal switching.

The video/audio signal processing apparatus 10 is supplied with multiple asynchronous video signals on which audio signals are superimposed, as streams in a predetermined format such as the SDI format. The stream DSin-1 is converted into a parallel signal by a serial/parallel converter (S/P) 11-1 and supplied to a video signal input processing unit 12-1. Similarly, the streams DSin-2 to DSin-m are converted into parallel signals by serial/parallel converters (S/P) 11-2 to 11-m and supplied to video signal input processing units 12-2 to 12-m.

The video signal input processing unit 12-1 processes the video signal supplied from the serial/parallel converter 11-1, for example by adjusting brightness or inserting video, and supplies the processed video signal to a selector 13. Similarly, the video signal input processing units 12-2 to 12-m process their input video signals and supply the processed video signals to the selector 13.

The selector 13 selects a desired video signal from the multiple video signals supplied from the video signal input processing units 12-1 to 12-m, and supplies the selected video signal DSa to a first memory controller 15 and a demultiplexer 17.

The memory controller 15 writes the supplied asynchronous video signal DSa to a first memory 16 and reads the video signal written to the memory 16 in synchronization with a reference signal, thereby supplying a video signal output processing unit 31 with a video signal DSb whose synchronization phase is independent of the input video signal DSa. The memory controller 15 also performs reposition processing by changing the read start position of the video signal written to the memory 16. For example, it changes the read start position to move the display position of the video vertically so that the lower area of the video is displayed in the upper area of the screen, or to move the video horizontally or diagonally. The following description assumes that the display position of the video is moved vertically so that the lower area of the video is displayed in the upper area of the screen.
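
The asynchronous-write, synchronous-read behaviour described above can be illustrated with a small timing model. This is only a sketch under assumptions: the function name `frames_read_at_reference_ticks`, the timestamps, and the rule "read the most recently completed frame" are illustrative and are not taken from the patent.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence


def frames_read_at_reference_ticks(
    write_done: Sequence[float], ticks: Sequence[float]
) -> List[Optional[int]]:
    """For each reference tick, pick the index of the latest fully written frame.

    write_done must be sorted: write_done[i] is the time at which input frame i
    finished being written to the video memory. None means no frame is ready yet.
    """
    result: List[Optional[int]] = []
    for t in ticks:
        idx = bisect_right(write_done, t) - 1
        result.append(idx if idx >= 0 else None)
    return result


# Hypothetical timings: the input clock and the reference clock drift apart.
write_done = [0.0, 1.0, 2.0, 3.0]
ticks = [0.5, 1.2, 1.9, 3.5]
print(frames_read_at_reference_ticks(write_done, ticks))  # [0, 1, 1, 3]
```

In this toy run frame 1 is read twice and frame 2 is never read, which is the kind of repetition and loss a frame synchronizer has to tolerate on the video side while keeping the audio continuous.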

The demultiplexer 17 separates an audio signal DAa from the video signal DSa on which the audio signal is superimposed, or separates an audio signal DAb from the video signal DSb on which the audio signal is superimposed, and supplies the separated audio signal DAa or DAb to a selector 24 and an audio signal output processing unit 32.

Digital interface receivers (hereinafter "DIR") 21-1 to 21-n are connected to a selector 23. The DIR 21-1 restores an externally supplied audio stream DAin-1 to an audio signal and supplies it to the selector 23. Similarly, the DIRs 21-2 to 21-n restore externally supplied audio streams DAin-2 to DAin-n to audio signals and supply them to the selector 23. The selector 23 selects a desired audio signal from the multiple audio signals supplied from the DIRs 21-1 to 21-n and supplies the selected audio signal DAc to the selector 24.

The selector 24 selects either the audio signal DAa (DAb) separated from the video signal DSa (DSb) on which the audio signal is superimposed, or the audio signal DAc supplied from the selector 23, and supplies the selected signal to a memory controller 25.

The second memory controller 25 writes the audio signal DAd supplied from the selector 24 to a second memory 26. The memory controller 25 also reads the audio signal written to the memory 26 in synchronization with the reference signal, thereby supplying the audio signal output processing unit 32 with an audio signal DAe synchronized with the reference signal. Because the memory 26 stores an audio signal, it can have a smaller capacity than the memory 16, which stores a video signal. For example, when the video signal DSa has 1125 total scanning lines and 2200 total pixels per line and consists of 10-bit luminance and color difference signals, the data amount per video frame is about 6.1 Mbyte (≈ 10 bits × 2 × 1125 lines × 2200 pixels). When the audio signal DAd has 16 channels, 1602 samples per video frame, and 32-bit sample data, the data amount per video frame is about 0.1 Mbyte (≈ 32 bits × 16 channels × 1602 samples). The memory 26 can therefore have a smaller capacity than the memory 16.
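
The two per-frame data amounts quoted above can be checked with a few lines of arithmetic. The sketch below only restates the figures given in the text; the function names are hypothetical, and the conversion from bits to bytes (8 bits per byte) is the only added assumption.

```python
def video_frame_bytes(lines: int = 1125, pixels_per_line: int = 2200,
                      bits_per_component: int = 10, components: int = 2) -> int:
    """Bytes per video frame: luminance plus color difference, total raster."""
    bits = bits_per_component * components * lines * pixels_per_line
    return bits // 8


def audio_frame_bytes(channels: int = 16, samples_per_frame: int = 1602,
                      bits_per_sample: int = 32) -> int:
    """Bytes of audio accompanying one video frame."""
    bits = bits_per_sample * channels * samples_per_frame
    return bits // 8


video = video_frame_bytes()
audio = audio_frame_bytes()
print(video)  # 6187500 bytes, in line with the roughly 6.1 Mbyte quoted
print(audio)  # 102528 bytes, about 0.1 Mbyte
```

With these figures one frame of video needs about 60 times the storage of the audio that accompanies it, which is the basis for the statement that the memory 26 can be much smaller than the memory 16.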

The video signal output processing unit 31 replaces the blanking period when the memory controller 15 has performed reposition processing. That is, when the read start position of the video signal written to the memory 16 is changed to move the display position of the video vertically, the blanking period moves as well. The video signal output processing unit 31 therefore replaces the blanking period so that the vertically moved video can be displayed. It also masks unnecessary display caused by the reposition processing, for example the display of the upper area of the video and the display corresponding to the blanking period before replacement. The video signal output processing unit 31 supplies the processed video signal DSc to a multiplexer 33.

When the reposition processing changes the order of the audio signal recorded in the auxiliary data area of the video signal DSc, the audio signal output processing unit 32 supplies the audio signal DAe, read from the memory 26 by the memory controller 25, as an audio signal DAf to the multiplexer 33 and to a digital interface transmitter (hereinafter "DIT") 35. The audio signal output processing unit 32 also performs audio fade processing and audio switching processing, which switches from one of the audio signal DAe read from the memory 26 and the audio signal DAb separated by the demultiplexer 17 to the other, and supplies the processed audio signal as the audio signal DAf to the multiplexer 33 and the DIT 35. Furthermore, the audio signal output processing unit 32 determines the sequence of the number of samples per video frame in the audio signal DAb superimposed on the video signal DSb, and adjusts the number of samples of the audio signal DAe read from the memory 26 to match the determined sequence.
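
Because two audio sources (DAe from the memory 26 and DAb from the demultiplexer) are available at the same time, switching between them can be done as a cross-fade. The following is a minimal sketch of one way to do that, not the patent's implementation: the linear gain ramp, the function name `crossfade`, and the use of floating-point samples are all assumptions.

```python
from typing import List, Sequence


def crossfade(outgoing: Sequence[float], incoming: Sequence[float]) -> List[float]:
    """Blend two equally long sample blocks with a linear gain ramp.

    The first output sample is entirely `outgoing`, the last entirely `incoming`.
    """
    n = len(outgoing)
    if n != len(incoming):
        raise ValueError("both blocks must contain the same number of samples")
    if n == 1:
        return [incoming[0]]
    mixed = []
    for i in range(n):
        w = i / (n - 1)  # weight of the incoming signal, 0.0 to 1.0
        mixed.append((1.0 - w) * outgoing[i] + w * incoming[i])
    return mixed


# Hypothetical sample blocks standing in for DAb (outgoing) and DAe (incoming).
dab = [1.0, 1.0, 1.0, 1.0, 1.0]
dae = [0.0, 0.0, 0.0, 0.0, 0.0]
print(crossfade(dab, dae))  # [1.0, 0.75, 0.5, 0.25, 0.0]
```

A plain fade-out or fade-in is the special case in which one of the two blocks is silence, which matches the point that the conventional single-source configuration can fade but cannot cross-fade.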

The multiplexer 33 superimposes the audio signal DAf supplied from the audio signal output processing unit 32 on the auxiliary data area of the video signal DSc supplied from the video signal output processing unit 31, and supplies the result to a parallel/serial converter (P/S) 34.

The parallel/serial converter 34 converts the video signal DSd supplied from the multiplexer 33 into a serial signal and outputs it as an SDI-format stream DSout.

The DIT 35 performs packetization and other processing on the audio signal DAf supplied from the audio signal output processing unit 32 and outputs it as an audio stream DAout in a predetermined format.

A local CPU 41 generates a control signal CTa in accordance with a control signal CTm supplied from a main CPU (not shown). The local CPU 41 supplies the generated control signal CTa to the memory controllers 15 and 25, the video signal output processing unit 31, the audio signal output processing unit 32, and other units, and controls each unit so that the video/audio signal processing apparatus 10 operates as instructed by the control signal CTm.

In this way, the video/audio signal processing apparatus 10 has the memory 16, which stores the video signal DSa, and the memory 26, which stores an audio signal. By using the audio signal written to the memory 26, it can process the audio signal with a high degree of freedom when processing the video signal DSa on which an audio signal is superimposed.

Next, a specific operation of the video/audio signal processing apparatus 10 is described, taking reposition processing as an example. FIG. 2 shows the signal paths used when reposition processing is applied to video based on the video signal DSa. In FIG. 2, video and audio signal paths that are used for video/audio output are drawn as solid lines, and paths that are not used are drawn as broken lines.

The local CPU 41 controls the selector 13 so that it selects the video signal supplied from the video signal input processing unit 12-1. The local CPU 41 also controls the memory controllers 15 and 25 so that the video signal is written to and read from the memory 16 and the audio signal is written to and read from the memory 26.

The memory controller 15 writes the video signal DSa, on which the audio signal is superimposed, to the memory 16. It performs reposition processing by changing the read position of the video signal written to the memory 16 and reading the video signal in synchronization with the reference signal.

FIG. 3 illustrates the operation during reposition processing. For example, as shown in (A) of FIG. 3, the memory controller 15 writes the video signal DSa sequentially starting from address "0x0000". The write end position of one frame (field) of the video signal is assumed to be address "0xFFFF".

When performing reposition processing, the memory controller 15 changes the read start position of the video signal. For example, with address "0x8000" as the read start position, it reads the video signal sequentially from the memory 16 and, after the video signal at address "0xFFFF", continues reading sequentially from address "0x0000". When the read start position is changed in this way, the video read from the memory 16 is as shown in (B) of FIG. 3.
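
The wrap-around read described above amounts to modular address arithmetic. The sketch below illustrates it with the addresses from the text (a frame occupying 0x0000 to 0xFFFF, read start 0x8000); the function names and the eight-entry toy frame are hypothetical and are only there to keep the example readable.

```python
from typing import List, Sequence


def reposition_read_order(frame_size: int, start: int) -> List[int]:
    """Addresses in read order: begin at `start`, wrap to 0 after the last address."""
    return [(start + i) % frame_size for i in range(frame_size)]


def read_repositioned(frame: Sequence[str], start: int) -> List[str]:
    """Read a stored frame beginning at `start` instead of address 0."""
    return [frame[a] for a in reposition_read_order(len(frame), start)]


# Addresses from the text: 0x10000 locations, read start 0x8000.
order = reposition_read_order(0x10000, 0x8000)
print(hex(order[0]), hex(order[0x7FFF]), hex(order[0x8000]), hex(order[-1]))
# 0x8000 0xffff 0x0 0x7fff

# Toy frame with 8 entries standing in for the 0x10000 addresses.
frame = ["blanking", "line1", "line2", "line3", "line4", "line5", "line6", "line7"]
print(read_repositioned(frame, 4))
# ['line4', 'line5', 'line6', 'line7', 'blanking', 'line1', 'line2', 'line3']
```

The toy output shows why the blanking period ends up in the middle of the repositioned picture, the situation drawn in (B) of FIG. 3.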

When reposition processing is performed, the blanking period BKin is moved, as shown in (B) of FIG. 3, and the order of the audio signal DAa superimposed on the blanking period BKin is changed. The video signal output processing unit 31 therefore replaces the blanking period, setting a new blanking period BKout for the video signal DSb read from the memory 16, as shown in (C) of FIG. 3. The video signal output processing unit 31 also masks unnecessary video portions. For example, the blanking period BKin before replacement and the upper part of the video are treated as a masking area (the cross-hatched area) as shown in (D) of FIG. 3, and masking processing such as filling this area with a predetermined color is performed. The video signal DSc that has undergone the blanking period replacement and the masking processing is then supplied to the multiplexer 33.

As noted above, repositioning changes the order of the audio signal. The local CPU 41 therefore controls the selector 24 and the memory controller 25 so that the audio signal DAa separated from the video signal DSa by the demultiplexer 17 is written to the memory 26, as shown in FIG. 3(E). This mirrors the audio signal superimposed on the video signal DSa, that is, the audio signal before its order is changed.

The memory controller 25 reads the audio signal DAa written in the memory 26 in time order, in synchronization with the reference signal, as shown in FIG. 3(F), and supplies it to the audio signal output processing unit 32 as the audio signal DAe. The audio signal output processing unit 32 supplies the audio signal DAe read from the memory 26 to the multiplexer 33 as the audio signal DAf.

The multiplexer 33 superimposes the audio signal DAf on the blanking period of the video signal DSc supplied from the video signal output processing unit 31. With this processing, the stream DSout output from the multiplexer 33 via the parallel/serial conversion unit 34 is, as shown in FIG. 3(G), a video signal in which the video has been repositioned and the audio signal is superimposed in the correct time order.

Audio can therefore be output correctly regardless of whether the video has been repositioned.

Next, another specific operation of the video/audio signal processing apparatus 10 is described, taking audio signal switching as an example. FIG. 4 shows the signal paths used when, for example, the audio signal superimposed on the video signal DSa is switched to an audio signal restored from the audio stream DAin-1. In FIG. 4, among the video and audio signal paths, those used for video/audio output are drawn as solid lines and those not used for video/audio output as broken lines.

The local CPU 41 controls the selectors 13 and 23 so that they select the video signal processed by the video signal input processing unit 12-1 and the audio signal obtained by the DIR 21-1. The local CPU 41 also controls the selector 24 so that it selects the audio signal DAc. In addition, the local CPU 41 controls the memory controllers 15 and 25 so that the video signal is written to and read from the memory 16 and the audio signal is written to and read from the memory 26.

The memory controller 15 writes the video signal DSa, on which the audio signal is superimposed, to the memory 16, sequentially reads the video signal written in the memory 16 in synchronization with the reference signal, and supplies it as the video signal DSb to the demultiplexer 17 and the video signal output processing unit 31.

The video signal output processing unit 31 supplies the video signal DSb supplied from the memory controller 15 to the multiplexer 33 as the video signal DSc.

The demultiplexer 17 separates the audio signal DAb from the video signal DSb and supplies it to the audio signal output processing unit 32.

The memory controller 25 writes the audio signal DAd supplied from the selector 24 to the memory 26. It also reads the audio signal written in the memory 26 in the order in which it was written and supplies it to the audio signal output processing unit 32 as the audio signal DAe.

The audio signal output processing unit 32 performs audio signal switching using the audio signal DAb supplied from the demultiplexer 17 and the audio signal DAe supplied from the memory controller 25.

FIG. 5 shows a case where crossfade processing is performed as the audio signal switching carried out by the audio signal output processing unit 32. The audio signal output processing unit 32 attenuates the signal level (solid line) of the audio signal DAb supplied from the demultiplexer 17 to the “0” level as time passes. It also raises the signal level (broken line) of the audio signal DAe supplied from the memory controller 25 from the “0” level as time passes. In this way, the audio signal output processing unit 32 gradually replaces the audio signal DAb with the audio signal DAe over time, thereby performing crossfade processing. The audio signal output processing unit 32 supplies the crossfaded audio signal DAf to the multiplexer 33.
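A minimal sketch of such a crossfade is given below. The specification only states that one level falls to “0” while the other rises from “0”; the linear gain ramp, the function name `crossfade`, and the use of plain Python lists of sample values are assumptions made for illustration.

```python
def crossfade(outgoing, incoming):
    """Mix two equally long sample sequences with complementary gains.

    `outgoing` plays the role of DAb (fading out) and `incoming` the role
    of DAe (fading in). A linear ramp is assumed; the text does not
    specify the shape of the curves in FIG. 5.
    """
    n = min(len(outgoing), len(incoming))
    if n < 2:
        raise ValueError("need at least two samples to form a ramp")
    mixed = []
    for i in range(n):
        gain_in = i / (n - 1)      # rises from 0.0 to 1.0
        gain_out = 1.0 - gain_in   # falls from 1.0 to 0.0
        mixed.append(gain_out * outgoing[i] + gain_in * incoming[i])
    return mixed


faded = crossfade([1.0] * 5, [0.0] * 5)
```

The first output sample is taken entirely from the outgoing signal and the last entirely from the incoming one, which corresponds to DAb being replaced by DAe over the fade interval.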

The multiplexer 33 superimposes the audio signal DAf supplied from the audio signal output processing unit 32 on the video signal DSc supplied from the video signal output processing unit 31. In other words, the multiplexer 33 replaces the audio signal DAb superimposed on the video signal DSb with the audio signal DAf. With this processing, the stream DSout output from the multiplexer 33 via the parallel/serial conversion unit 34 is a video signal on which the crossfaded audio signal is superimposed.

The number of samples of the audio signal superimposed on the video signal may differ from one video frame to the next. For example, when the audio sampling frequency is 48.0 kHz and the video frame frequency is (30/1.001) frames/second, the number of audio samples per video frame is not an integer, so the number of samples in each frame is set such that five consecutive frames carry 8008 samples in total.
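The figures quoted above can be checked with exact rational arithmetic. The short sketch below uses only the two values given in the text (48.0 kHz and 30/1.001 frames/second); the variable names are illustrative.

```python
from fractions import Fraction

sample_rate = Fraction(48000)        # audio samples per second
frame_rate = Fraction(30000, 1001)   # 30/1.001 video frames per second

# Samples per video frame: 48000 * 1001 / 30000 = 8008/5 = 1601.6
samples_per_frame = sample_rate / frame_rate

# Five frames are the shortest period that holds a whole number of samples.
samples_per_five_frames = samples_per_frame * 5
```

Since 1601.6 is not an integer, individual frames must carry either 1601 or 1602 samples, and five frames together carry exactly 8008.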

FIG. 6 shows frames and their audio sample counts. Within each period of five consecutive frames, odd-numbered audio frames have 1602 samples and even-numbered audio frames have 1601 samples, so that at the boundary between one five-frame period and the next, two 1602-sample frames occur in succession. Consequently, if the number of samples of the audio signal DAe used in, for example, crossfade processing were fixed at 1601 samples/frame, or fixed at 1602 samples/frame, a phase difference would arise between video and audio and their timing would no longer match. This is the problem known as lip-sync error.

The audio signal output processing unit 32 therefore prevents lip-sync error by adjusting the number of samples per video frame before the audio signal is superimposed on the video signal.

The audio signal output processing unit 32 determines the frame sequence from the audio signal DAb supplied from the demultiplexer 17 and adjusts the number of audio samples per video frame according to the result. For example, when five consecutive frames carry 8008 audio samples as shown in FIG. 6, two consecutive frames with “1602” samples occur once every five-frame period. The audio signal output processing unit 32 therefore counts the audio samples in each frame of the audio signal DAb and, taking the point at which frames with “1602” samples occur consecutively as a reference, repeatedly counts five frames. The count value then indicates the position within the five-frame sequence, so it can be used to determine which frame of the five-frame sequence the frame being output corresponds to.
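The counting scheme described above might be sketched as follows. The function name `sequence_positions`, the zero-based position numbering, and the choice to treat the second frame of a “1602, 1602” pair as position 0 are assumptions; the text only says that the consecutive 1602-sample frames serve as the reference for a repeating count of five.

```python
def sequence_positions(sample_counts):
    """Return the position (0..4) of each frame in the five-frame sequence.

    Frames seen before the first pair of consecutive 1602-sample frames
    cannot be placed yet and are reported as None.
    """
    positions = []
    position = None
    previous = None
    for count in sample_counts:
        if previous == 1602 and count == 1602:
            position = 0                    # reference point: restart the count
        elif position is not None:
            position = (position + 1) % 5   # free-running count of five frames
        positions.append(position)
        previous = count
    return positions


observed = [1601, 1602, 1601, 1602, 1602, 1601, 1602, 1601, 1602, 1602, 1601]
positions = sequence_positions(observed)
```

Once the reference has been found, the count keeps running on its own, so the position of an output frame is known even while a different audio source is being superimposed.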

When the audio stream DAin-1 is in, for example, the AES format, the format carries information indicating the sampling frequency. One frame of the AES format consists of two subframes, and a subframe has the structure shown in FIG. 7.

A subframe consists of a preamble, an AUX/audio area, audio data, an audio sample validity bit V, a user bit U, an audio channel status bit C, and a subframe parity bit P.

The preamble is 4-bit data that defines subframe and frame synchronization and identifies the channel status block. The AUX/audio area is a 4-bit area serving as an auxiliary data portion. The audio data is a 20-bit data area. When the audio data is 20 bits, the AUX/audio area can be used as an auxiliary data portion; when the audio data is 24 bits, the AUX/audio area is used as part of the audio data area. When the audio data is shorter than 20 bits, the unused low-order bits on the LSB side are set to “0”.

The audio sample validity bit V is 1-bit data indicating the validity of the audio data. It is set to “0” when the audio data is valid and to “1” when it is invalid. The user bit U is 1-bit data defined by the user. The audio channel status bit C is 1-bit data that forms part of a data block and carries transmission information about various parameters. The subframe parity bit P is 1-bit data giving even parity over all data except the preamble.
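The subframe fields listed above add up to 32 bits (4 + 4 + 20 + 1 + 1 + 1 + 1). The sketch below packs and unpacks such a subframe and checks the even parity. The function names, the representation of a subframe as a list of 32 bit values, the field order taken from the listing above, and the LSB-first ordering within each field are assumptions for illustration; the preamble is treated as four opaque positions rather than modeled as a real synchronization pattern.

```python
def build_subframe(audio, aux=0, invalid=0, user=0, status=0):
    """Return a 32-element bit list: preamble, AUX, audio, V, U, C, P."""
    bits = [0, 0, 0, 0]                                  # preamble placeholder
    bits += [(aux >> i) & 1 for i in range(4)]           # 4-bit AUX/audio area
    bits += [(audio >> i) & 1 for i in range(20)]        # 20-bit audio data
    bits += [invalid, user, status]                      # V, U, C
    bits.append(sum(bits[4:]) % 2)                       # P makes the parity even
    return bits


def parse_subframe(bits):
    """Split a 32-element bit list back into its fields."""
    if len(bits) != 32:
        raise ValueError("a subframe has exactly 32 bit positions")

    def field(lo, hi):
        return sum(b << i for i, b in enumerate(bits[lo:hi]))

    return {
        "aux": field(4, 8),
        "audio": field(8, 28),
        "valid": bits[28] == 0,          # V = 0 means the audio data is valid
        "user": bits[29],
        "channel_status": bits[30],
        "parity_ok": sum(bits[4:32]) % 2 == 0,   # preamble is excluded
    }


subframe = build_subframe(0x12345, aux=0x9, status=1)
fields = parse_subframe(subframe)
```

Flipping any single bit outside the preamble makes `parity_ok` false, which is the property the parity bit P is meant to provide.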

For each channel, one block is formed from the audio channel status bits C contained in 192 consecutive frames. The start position of this block is indicated by the preamble, and the block of audio channel status bits C carries information such as the sampling frequency. The video/audio signal processing apparatus 10 can therefore use this sampling frequency information to determine, for example, the number of samples per video frame of the input audio signal.

Accordingly, based on the result of determining which frame of the five-frame sequence the frame being output corresponds to, and on the number of samples per frame of the audio signal written in the memory 26, the audio signal output processing unit 32 deletes or repeats sample data of the audio signal DAe to obtain an audio signal whose number of samples matches the video signal DSc. The audio signal DAf supplied from the audio signal output processing unit 32 to the multiplexer thus has the same number of samples per video frame as the audio signal superimposed on the video signal DSc. This prevents lip-sync error when the audio signal superimposed on the video signal is changed by crossfade processing, audio switching, or the like.

The way of determining which frame of the five-frame sequence the frame being output corresponds to is not limited to counting five frames with reference to the point at which frames with “1602” samples occur consecutively, as described above; other methods may be used.

FIG. 8 shows the structure of packets superimposed on the video signal. FIG. 8(A) shows an audio signal packet. The ancillary data flag ADF is a flag marking the start of an ancillary data packet. The data identification word DID indicates the type of ancillary data. The data block number DBN is the sequence number of a series of packets with the same DID, used for example when data is divided for transmission. The data count DC indicates the number of user data words. The user data words UDW carry the ancillary data, for example an audio signal in the AES format. The checksum word CS is a checksum over the data from the data identification word DID to the last word of the UDW.
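As an illustration of a checksum computed over DID through the last UDW, the sketch below sums the low nine bits of each 10-bit word and sets bit 9 to the inverse of bit 8. That particular rule is the convention used for ancillary data packets in SMPTE 291 as far as this editor is aware; the specification here states only which words the checksum covers, so the rule, the function name `anc_checksum`, and the 10-bit word representation should all be read as assumptions.

```python
def anc_checksum(words):
    """Checksum over DID, DBN, DC and all UDW, given as 10-bit integers.

    Assumed rule: 9-bit sum of bits 0..8 of every covered word, with
    bit 9 set to the inverse of bit 8 of the result.
    """
    total = sum(word & 0x1FF for word in words) & 0x1FF
    bit8 = (total >> 8) & 1
    return total | ((bit8 ^ 1) << 9)


# Hypothetical word values, chosen only to exercise the arithmetic.
checksum = anc_checksum([0x001, 0x002, 0x003])
```

The ADF itself is excluded from the sum, in line with the statement that the checksum runs from DID to the last UDW.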

The packets superimposed on the video signal include not only audio signal packets but also packets of control signals relating to the audio signal. FIG. 8(B) shows an audio control packet, whose user data words UDW can carry information such as the audio frame number AF and the sampling frequency RATE.

The audio frame number AF is information that makes it possible to identify the number of samples superimposed in a video frame when the number of samples per video frame is not an integer. For example, when five consecutive frames carry 8008 samples as described above, the number of samples per video frame is not an integer. The audio frame number AF therefore indicates the position of the frame within the five-frame sequence, and the number of samples in a frame can be determined from the audio frame number AF.
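Combining this with the description of FIG. 6 (odd-numbered audio frames carry 1602 samples, even-numbered ones 1601), the lookup from audio frame number to sample count can be sketched as below. The numbering of AF from 1 to 5 and the names used are assumptions for illustration.

```python
# Sample count for each audio frame number in the five-frame sequence.
SAMPLES_BY_AUDIO_FRAME = {1: 1602, 2: 1601, 3: 1602, 4: 1601, 5: 1602}


def samples_for_audio_frame(af):
    """Return the number of audio samples carried by audio frame `af` (1..5)."""
    return SAMPLES_BY_AUDIO_FRAME[af]


total = sum(samples_for_audio_frame(af) for af in range(1, 6))
```

Frame 5 of one sequence and frame 1 of the next both carry 1602 samples, which is the consecutive pair used as a reference in the counting method described earlier.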

The sampling frequency RATE is information indicating the sampling frequency of the audio signal, the active channel ACT is information indicating which channels are active, and the delay DEL is data indicating the delay of the audio relative to the video.

Because the audio frame number AF identifies the position of a frame within the five-frame sequence, the audio signal output processing unit 32 can also use the audio frame number AF to adjust the number of samples of the audio signal.

FIG. 9 is a diagram for explaining the operation of the audio signal output processing unit 32. FIG. 9(A) shows the number of samples per video frame of the audio signal superimposed on the video signal DSb. FIG. 9(B) shows the number of samples per video frame of the audio signal DAe read from the memory 26 by the memory controller 25 and supplied to the audio signal output processing unit 32. FIG. 9(C) shows the number of samples per video frame of the audio signal DAf output from the audio signal output processing unit 32 and supplied to the multiplexer. FIG. 9(D) shows the number of samples per video frame of the audio signal superimposed on the video signal DSd output from the multiplexer 33.

Suppose, for example, that at time tc1 the audio signal superimposed on the video signal is switched to the audio signal written in the memory 26. The audio signal output processing unit 32 adjusts the number of samples so that the number of samples per video frame of the audio signal DAf output to the multiplexer 33 equals the number of samples per video frame of the video signal DSd supplied to the multiplexer 33. For example, when the frame of the video signal DSd starting at time tc1 has “1602” samples and the audio signal DAe has “1600” samples per video frame, the audio signal output processing unit 32 repeats sampling data of the audio signal DAe to add two samples and supplies the multiplexer 33 with an audio signal DAf having “1602” samples per video frame. In the next frame, the video signal DSd has “1601” samples per video frame, so the audio signal output processing unit 32 repeats a sample of the audio signal DAe to add one sample and supplies the multiplexer 33 with an audio signal DAf having “1601” samples per video frame.
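The repeat-or-delete adjustment can be sketched as follows. The text does not say which samples are repeated or deleted, so repeating the final sample and dropping trailing samples is an assumption, as is the function name `adjust_sample_count`.

```python
def adjust_sample_count(samples, target):
    """Return a copy of `samples` with exactly `target` entries.

    Missing samples are filled by repeating the last sample; surplus
    samples are removed from the end. Both choices are assumptions.
    """
    adjusted = list(samples)
    if not adjusted:
        raise ValueError("cannot adjust an empty frame")
    if len(adjusted) > target:
        return adjusted[:target]
    while len(adjusted) < target:
        adjusted.append(adjusted[-1])
    return adjusted


frame_from_memory = list(range(1600))                           # stands in for one frame of DAe
first_output = adjust_sample_count(frame_from_memory, 1602)     # two samples repeated
second_output = adjust_sample_count(frame_from_memory, 1601)    # one sample repeated
```

This mirrors the example at time tc1: a 1600-sample frame is extended to 1602 samples for the first output frame and to 1601 samples for the next one.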

With this processing, the number of samples per video frame of the audio signal remains the same as before the switch even when the audio signal is switched, so no lip-sync error occurs.

When the number of samples per video frame of the audio signal written to the memory 26 varies, as with the five-frame-sequence audio signal described above, the memory controller 25 writes information indicating the number of samples per video frame when it writes the audio signal to the memory 26. Then, even if the number of samples per video frame of the audio signal written to the memory 26 varies, the audio signal output processing unit 32 can repeat or delete sampling data according to the number of samples per video frame of the audio signal DAe, and supply the multiplexer 33 with an audio signal DAf whose number of samples per video frame equals that of the audio signal superimposed on the video signal DSc.

FIG. 10 illustrates a case where the audio signal is written to the memory 26 in such a way that the number of samples per video frame can be identified.

The memory controller 25 writes the audio signal selected by the selector 24 sequentially from the first address, as shown in FIG. 10. Here, one sample of the audio signal is assumed to be data in the 32-bit format described above. In an SDI-format signal, four groups of audio signals in 2CH pairs, that is, 16 channels of audio, can be superimposed per signal. The address is therefore made up of address bits g0 and g1 for storing the audio signal separately for each of the four groups, address bits c0 and c1 for storing the audio signal separately for each 2CH pair, and address bits w0 to w10 needed to store, for example, 1602 samples of the audio signal. Setting the address width “g1, g0, c1, c0, w10 to w0” in this way allows the audio signal to be stored per group and per 2CH pair.

Because the address width “w10 to w0” can express 2048 samples, it is never used up entirely by the audio samples. Accordingly, as shown in FIG. 10, the range of addresses “11111111100” to “11111111111” within “w10 to w0” is used as an area for storing the number of samples. For example, if preset unique data is stored at address “11111111100”, the number of samples per video frame is 1600. When the number of samples per video frame is 1601, the unique data is stored at address “11111111101”. When the number of samples per video frame is 1602, the unique data is stored at address “11111111110”. The number of samples per video frame can then be determined from the address at which the preset unique data is stored. Alternatively, although not shown, the number of samples per video frame itself may be written to a preset address, for example “11111111111”.
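The address layout and the marker-based sample count might be modeled as in the sketch below. The memory is represented by a plain Python dictionary, and the marker value `MARKER`, the function names, and the clearing of stale markers before each write are assumptions; only the bit layout “g1, g0, c1, c0, w10 to w0” and the three marker addresses come from the text.

```python
MARKER = 0xA5A5A5A5  # hypothetical "unique data" value

# Word addresses (w10..w0) reserved for marking the sample count.
COUNT_TO_WORD = {1600: 0b11111111100, 1601: 0b11111111101, 1602: 0b11111111110}


def address(group, c, word):
    """Compose an address from g1 g0 (group), c1 c0, and w10..w0 (word)."""
    if not (0 <= group < 4 and 0 <= c < 4 and 0 <= word < 2048):
        raise ValueError("field out of range")
    return (group << 13) | (c << 11) | word


def write_frame(memory, group, c, samples):
    """Store one frame of samples and mark how many samples it holds."""
    marker_word = COUNT_TO_WORD[len(samples)]
    for word in COUNT_TO_WORD.values():
        memory.pop(address(group, c, word), None)   # clear any stale marker
    for word, sample in enumerate(samples):
        memory[address(group, c, word)] = sample
    memory[address(group, c, marker_word)] = MARKER


def read_sample_count(memory, group, c):
    """Return the marked sample count, or None if no marker is present."""
    for count, word in COUNT_TO_WORD.items():
        if memory.get(address(group, c, word)) == MARKER:
            return count
    return None


memory = {}
write_frame(memory, group=1, c=2, samples=[0] * 1601)
count = read_sample_count(memory, group=1, c=2)
```

Since at most 1602 of the 2048 word addresses hold samples, the marker addresses near the top of the range never collide with audio data.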

By using the sample count information written in the memory 26 in this way, the audio signal output processing unit 32 can prevent lip-sync error even when the number of samples per video frame of the audio signal written to the memory 26 is not constant.

FIG. 11 shows the operation when the sample count information stored in the memory 26 is used. FIG. 11(A) shows the number of samples per video frame of the audio signal superimposed on the video signal DSb. FIG. 11(B) shows the number of samples per video frame of the audio signal DAe read from the memory 26 by the memory controller 25 and supplied to the audio signal output processing unit 32. FIG. 11(C) shows the number of samples per video frame of the audio signal DAf output from the audio signal output processing unit 32 and supplied to the multiplexer. FIG. 11(D) shows the number of samples per video frame of the audio signal superimposed on the video signal DSd output from the multiplexer 33.

Suppose, for example, that at time tc2 the audio signal superimposed on the video signal is switched to the audio signal stored in the memory 26. The audio signal output processing unit 32 adjusts the number of samples so that the number of samples per video frame of the audio signal DAf output to the multiplexer 33 equals the number of samples per video frame of the audio signal superimposed on the video signal DSd supplied to the multiplexer 33. For example, when the frame of the video signal DSd starting at time tc2 has “1602” samples and the audio signal DAe has “1601” samples per video frame, the audio signal output processing unit 32 repeats sampling data of the audio signal DAe to add one sample and supplies the multiplexer 33 with an audio signal DAf having “1602” samples per video frame. In the next frame, when the video signal DSd has “1601” samples per video frame and the audio signal DAe has “1602” samples per video frame, the audio signal output processing unit 32 removes one sample from the sampling data of the audio signal DAe and supplies the multiplexer 33 with an audio signal DAf having “1601” samples per video frame.

With this processing, the number of samples per video frame of the audio signal superimposed on the video signal remains the same as before the switch even when the audio signal is switched, so lip-sync error can be prevented.

The embodiment above illustrates the case where an SDI-format stream or an AES-format stream is input, but the format of the video signal on which the audio signal is superimposed and the format of the audio signal are not limited to these. Likewise, although the audio signal output processing unit 32 is described as adjusting the number of samples per video frame, the number of samples may instead be adjusted per field period when the video signal is an interlaced signal.

By providing the memory 16 for storing the video signal DSa and the memory 26 for storing the audio signal separately, and by using the audio signal written in the memory 26, the audio signal can be processed with a high degree of freedom when the video signal DSa on which the audio signal is superimposed is processed. Because the memory 26 stores only the audio signal, its storage capacity can be small. Moreover, because the audio signal can be stored in the memory 26 in its natural form, the audio signal output processing unit that performs sample count adjustment, mixing, and the like can have a simple configuration.

The processing described above can be executed by hardware, by software, or by a combination of the two. When the processing is executed by software, the program that performs the processing can be installed in memory in a computer built into dedicated hardware and executed there, or it can be installed and executed on a general-purpose computer capable of executing various kinds of processing.

The program can be recorded in advance on a hard disk or ROM (Read Only Memory) serving as a recording medium. The program can also be stored (recorded) temporarily or permanently on a removable recording medium such as a magnetic disk, an optical disk, or a semiconductor memory. In addition to being installed on a computer from such a removable recording medium, the program may be obtained from a download site over a wireless or wired network.

The present invention has been described in detail above with reference to specific embodiments. It is obvious, however, that those skilled in the art can modify or substitute the embodiments without departing from the gist of the present invention. In other words, the present invention has been disclosed by way of example and should not be interpreted restrictively. The claims should be taken into consideration in determining the gist of the present invention.

In the video/audio signal processing apparatus of the present invention, a video signal on which an audio signal is superimposed is stored in a first memory, and the video signal written in the first memory is read out in synchronization with a reference signal. The audio signal superimposed on the video signal is separated, and the separated audio signal or an input audio signal is stored in a second memory. The audio signal written in the second memory is read out in synchronization with the reference signal and superimposed in place of the audio signal superimposed on the video signal read from the first memory. The apparatus is therefore suitable as a frame synchronizer or an editing apparatus.

A diagram showing the configuration of the video/audio signal processing apparatus.
A diagram showing the signal path when reposition processing is performed.
A diagram for explaining the operation when reposition processing is performed.
A diagram showing the signal path when audio signal switching processing is performed.
A diagram for explaining crossfade processing.
A diagram showing frames and the number of audio signal samples.
A diagram showing the configuration of a subframe.
A diagram showing the configuration of a packet.
A diagram for explaining the operation of the audio signal output processing unit.
A diagram for explaining the write operation of the audio signal.
A diagram for explaining the operation when the sample-count information stored in the memory is used.
A diagram showing the configuration of a conventional frame synchronizer.

Explanation of symbols

10: video/audio signal processing apparatus
11-1 to 11-m, 61: serial/parallel conversion unit (S/P)
12-1 to 12-m, 62: video signal input processing unit
13, 23, 24: selector
15, 25, 65: memory controller
16, 26, 66: memory
17, 67: demultiplexer
31, 71: video signal output processing unit
32, 72: audio signal output processing unit
33, 73: multiplexer
34, 74: parallel/serial conversion unit (P/S)
41, 81: local CPU
60: frame synchronizer

Claims

A video/audio signal processing apparatus comprising:
a first memory for storing a video signal on which an audio signal is superimposed;
a first memory controller that reads the video signal written in the first memory in synchronization with a reference signal;
a demultiplexer that separates the audio signal superimposed on the video signal;
a second memory for storing the separated audio signal or an input audio signal;
a second memory controller that reads the audio signal written in the second memory in synchronization with the reference signal; and
a multiplexer that superimposes the audio signal read from the second memory in place of the audio signal superimposed on the video signal read from the first memory.

The video/audio signal processing apparatus according to claim 1, further comprising an audio signal output processing unit that adjusts the number of samples of the audio signal read from the second memory in accordance with the number of samples per video frame of the audio signal superimposed on the video signal read from the first memory, and supplies the adjusted audio signal to the multiplexer.

The video/audio signal processing apparatus according to claim 2, wherein the audio signal output processing unit determines the sequence of the number of samples per video frame of the audio signal superimposed on the video signal read from the first memory, and adjusts the number of samples of the audio signal read from the second memory to match the determined sequence.

The video/audio signal processing apparatus according to claim 3, wherein the second memory controller writes information indicating the number of samples per video frame of the audio signal to the second memory, and
the audio signal output processing unit uses the information indicating the number of samples, read from the second memory, to adjust the number of samples of the audio signal to match the determined sequence.

The video/audio signal processing apparatus according to claim 2, wherein the audio signal output processing unit performs processing for switching the audio signal supplied to the multiplexer from one of the audio signal read from the second memory and the audio signal separated by the demultiplexer to the other.

The video/audio signal processing apparatus according to claim 1, further comprising a video signal output processing unit that performs signal processing on the video signal read from the first memory, wherein
the first memory controller moves the video display position by changing the read start position of the video signal written in the first memory,
the video signal output processing unit replaces the blanking period of the video signal read from the first memory and supplies the resulting video signal to the multiplexer, and
the second memory stores the audio signal separated from the video signal written to the first memory.

The video/audio signal processing apparatus according to claim 1, wherein the audio signal is superimposed on a blanking period of the video signal.

A video/audio signal processing method comprising the steps of:
storing a video signal on which an audio signal is superimposed in a first memory;
reading the video signal written in the first memory in synchronization with a reference signal;
separating the audio signal superimposed on the video signal;
storing the separated audio signal or an input audio signal in a second memory;
reading the audio signal written in the second memory in synchronization with the reference signal; and
superimposing the audio signal read from the second memory in place of the audio signal superimposed on the video signal read from the first memory.
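
The sample-count adjustment recited in the claims above can be illustrated with a short Python sketch. It is an illustration only and assumes 48 kHz audio and a 30000/1001 Hz video frame rate; neither value is stated in this section. Under those assumptions one frame holds 1601.6 samples on average, so the per-frame counts must alternate between 1601 and 1602 and repeat every five frames (8008 samples). The function names `samples_per_frame` and `split_by_sequence` are hypothetical, and the ordering produced here is just one valid ordering, not necessarily the one used by any particular equipment.

```python
from fractions import Fraction
from math import floor


def samples_per_frame(sample_rate, frame_rate, n_frames):
    """Return integer sample counts for n_frames consecutive video frames.

    Exact rational arithmetic avoids drift: frame k receives the samples
    whose indices fall in [floor(k * r), floor((k + 1) * r)), where r is
    the average number of samples per frame.
    """
    r = Fraction(sample_rate) / Fraction(frame_rate)
    return [floor((k + 1) * r) - floor(k * r) for k in range(n_frames)]


def split_by_sequence(samples, counts):
    """Cut a stored audio buffer into per-frame blocks following counts."""
    blocks, pos = [], 0
    for n in counts:
        blocks.append(samples[pos:pos + n])
        pos += n
    return blocks


# Assumed rates: 48 kHz audio, 30000/1001 Hz video.
counts = samples_per_frame(48000, Fraction(30000, 1001), 5)
blocks = split_by_sequence(list(range(sum(counts))), counts)
```

Reading blocks of these sizes from the second memory would keep the superimposed audio aligned with a video stream whose own audio follows the same sequence; matching the phase of the sequence to the video read from the first memory is the part this sketch leaves out.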