JP2003219385A

JP2003219385A - Video conference system

Info

Publication number: JP2003219385A
Application number: JP2002000960A
Authority: JP
Inventors: Moken Ryu; 孟賢劉
Original assignee: Leadtek Research Inc
Current assignee: Leadtek Research Inc
Priority date: 2002-01-07
Filing date: 2002-01-07
Publication date: 2003-07-31

Abstract

PROBLEM TO BE SOLVED: To provide a highly efficient video conference system whose equipment cost can be reduced.

SOLUTION: The video conference system, suitable for a conference held by a plurality of participant units (100a to 100f) each outputting an individual data stream, includes: a central control unit (110) for receiving the individual data streams and selectively outputting some or all of them; a decoding unit (122) for separating each received individual data stream into an individual audio signal and an individual video signal and mixing the separated individual audio signals and individual video signals to generate a mixed audio data stream and a mixed video data stream; and an encoding unit (124) for encoding the mixed audio data stream and the mixed video data stream to generate a mixed data stream.

COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

[0001]

[Technical field of the invention] The present invention relates generally to video conference systems, and more particularly to a system and a method for controlling the video data streams of a video conference system.

[0002]

[Prior art] Advances in communication technology have greatly shortened spatial distances. As a result, the exchange of messages between people is no longer limited by the distance between regions. People customarily live in cooperation with one another, so the mutual exchange of information is an important part of life. A meeting is one method used to resolve problems that are common to a group.

[0003] Before communication technology became widespread, a meeting could be held only if all of its participants traveled to a designated place. Now that communication technology and multimedia information technology have become common, the audio and video of conference participants can be converted by electronic devices into digital signals and then into data streams, and the data streams can be sent and received over network technologies such as a local area network (LAN). The data streams, each containing audio and video, are displayed simultaneously on each participant's display system, for example a personal computer system. Data transmission and control are handled by a unified protocol. Modern video conferencing is therefore not limited by the distance between regions.

[0004] The basic operating principle of a video conference is shown in FIG. 1. A broadcasting system, for example a computer system (100a), is installed near user (a). The computer system (100a) is provided with an acquisition device (104) for capturing the video and audio of user (a). Similarly, a computer system (100d) provided with an acquisition device (104) is installed near user (d). Each user thus has a corresponding broadcasting system (100a, 100b, ..., 100f). Using the same transmission protocol, each user sends a data stream to the control unit (102) over a local area network (LAN). The individual data streams are mixed to form a mixed data stream, which is then transmitted to each user's broadcasting system. In general, the control unit (102) is controlled by the conference chair, who controls the number of inputs to the broadcasting systems. For example, the data of users (a, b, c, d) are selected and mixed, and the mixed data are sent to all users (a, b, ..., f). The computer (100a) of user (a) displays the video of the four users. Similarly, the computer (100d) of user (d) displays the video of the same four users. The display may or may not include an individual video. All other users also see the video of users (a, b, c, d).

[0005] A video conference of the kind shown in FIG. 1 is held using a prior-art system such as that shown in FIG. 2. A plurality of users who are conference participants, for example (100a, 100b, ..., 100f), input their individual data streams to the data stream control center (110). The data stream control center (110) is controlled by the conference chair. Processing video data requires a large amount of computing power from the central processing unit (CPU) and a large amount of processing capacity from the other related components, so the load that can be handled is limited. In general, only four data streams can be selected for real-time display.

[0006] The four selected data streams are input to a multipoint control unit (MCU) (112). The multipoint control unit (112) has an audio/video (A/V) signal decoding/encoding processing unit (114). Each incoming individual data stream is decoded into a video signal and an audio signal. The individual audio signals and the individual video signals are then mixed and encoded into a new mixed data stream. The mixed data stream is then sent to each unit (100a, 100b, ..., 100f) for broadcasting.

[0007]

[Problems to be solved by the invention] In the video conference system described above, decoding and encoding of the data streams are handled by a single processing unit (114). Decoding and encoding the audio data and video data require a large amount of CPU computation, so the processing unit (114) must handle a very heavy load. Because its computing power is limited, the efficiency of the conventional video conference system is less than ideal, and the equipment cost of the system cannot be reduced effectively.

[0008]

[Means for solving the problems] To solve the problems described above, the present invention provides a video conference system suitable for a conference held by a plurality of participant units. The video conference system includes a central control unit, a decoding unit, and an encoding unit. The central control unit receives a plurality of individual data streams from the plurality of participant units and selectively outputs all or some of the data streams. The decoding unit receives the selected individual data streams, decomposes each individual data stream into an individual audio signal and an individual video signal, and mixes the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream. The encoding unit receives the mixed audio data stream and the mixed video data stream, encodes them to form a mixed data stream, and transmits the mixed data stream to the participant units.

[0009] The decoding process and the encoding process are handled by two separate processing units. In the video conference system according to the present invention, the computation can therefore be carried out separately in a decoding chip and an encoding chip. Neither of the two chips needs high computing power, so the circuit configuration of each chip is much simpler than in the conventional system, and the cost is reduced accordingly. Thus not only is the efficiency of the video conference system improved, but the equipment cost of the system is also reduced.

[0010] In addition, because the encoding process and the decoding process are performed separately in the present invention, a broadcast weight can be dynamically incorporated into the computation for each data stream. When the data streams are broadcast, the relevant individual data streams can therefore be broadcast with weighting, which markedly enhances the effect of the video conference.

[0011] The present invention further provides a multi-unit data stream control system suitable for a conference held by a plurality of participant units, each of which outputs an individual data stream. The data stream control system includes a decoding unit and an encoding unit. The decoding unit receives the plurality of individual data streams, decomposes each individual data stream into an individual audio signal and an individual video signal, and mixes the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream. The encoding unit receives the mixed audio data stream and the mixed video data stream, encodes them to form a mixed data stream, and transmits the mixed data stream to the participant units for broadcasting.

[0012] The decoding unit and the encoding unit described above can be implemented by two processing chip units (or two video processing chip units). When one of the two processing chip units is dynamically selected as the decoding unit, the other serves as the encoding unit.

[0013] The present invention further provides a method of controlling a data stream having a plurality of inputs. In this method, a decoding unit and an encoding unit are provided; a plurality of data streams are input to the decoding unit; the decoding unit decomposes the data streams into individual audio signals and individual video signals and mixes the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream; and the encoding unit combines the mixed audio data stream and the mixed video data stream to form a mixed data stream for broadcasting.
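
The steps of this method can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation described in the specification: the `Stream` type, the function names, the use of plain sample averaging for audio mixing, and the representation of the mixed video as a list of tiles are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Stream:
    """Hypothetical individual data stream from one participant unit."""
    participant: str
    audio: List[float]  # audio samples (assumed representation)
    video: str          # stand-in for one decoded picture


def decode_and_mix(streams: List[Stream]) -> Tuple[List[float], List[str]]:
    """Decoding unit: split each stream into audio and video, then mix each kind."""
    length = min(len(s.audio) for s in streams)
    # Assumption: audio is mixed by averaging the samples of all streams.
    mixed_audio = [sum(s.audio[i] for s in streams) / len(streams)
                   for i in range(length)]
    # Assumption: the mixed video is a composite made of one tile per stream.
    mixed_video = [s.video for s in streams]
    return mixed_audio, mixed_video


def encode(mixed_audio: List[float], mixed_video: List[str]) -> Dict[str, list]:
    """Encoding unit: combine the two mixed streams into one mixed data stream."""
    return {"audio": mixed_audio, "video": mixed_video}


def control_method(streams: List[Stream]) -> Dict[str, list]:
    """Run the two stages in order: decode and mix, then encode."""
    mixed_audio, mixed_video = decode_and_mix(streams)
    return encode(mixed_audio, mixed_video)
```

Keeping `decode_and_mix` and `encode` as separate functions mirrors the split between the two units; in hardware each stage would run on its own chip.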

[0014]

[Embodiments of the invention] The accompanying drawings are included to provide a better understanding of the present invention and are incorporated in and form part of this specification. The drawings illustrate embodiments of the present invention and, together with the following description, serve to explain the principles of the invention.

[0015] FIG. 1 schematically shows the operating principle of a video conference according to the prior art.

[0016] FIG. 2 is a block diagram schematically showing a video conference system according to the prior art.

[0017] FIG. 3 is a block diagram schematically showing a video conference system according to the present invention.

[0018] One of the main features of the present invention is the use of a decoding unit and an encoding unit that are separate from and independent of each other. This reduces the complexity of the processing chips. As a result, the manufacturing cost is reduced, and different broadcast weights can be applied to the individual broadcast windows.

[0019] An embodiment is described below to illustrate the features of the present invention. The present invention employs a data transmission protocol to transmit signals over a network for video conferencing. Data transmission protocols that conform to the communication standards include, for example, H.320, H.321, H.323, and H.324. The data transmission protocol also serves to accommodate the network type, video standards, audio standards, multitask control, security protocols, control protocols, modems, and so on, so that all functional requirements of the video conference protocol are satisfied. The present invention provides the further function of improving the efficiency of the transmission protocol by changing the hardware configuration under the transmission protocol.

[0020] FIG. 3 is a block diagram schematically showing a video conference system according to the present invention. Software protocols and interface standards are not shown. In FIG. 3, each conference participant has a broadcast unit, for example (100a, 100b, ..., 100f), and an acquisition unit (104) matched to the broadcast unit (as shown in FIG. 1). The broadcast units are generally called participant units (100a to 100f) and are used by the conference participants. The individual data streams generated by the participant units (100a to 100f) are all input to the data stream control center (110) via an input path such as a local area network.

[0021] The data stream control center (110) is normally controlled by the conference chair. Because the display capacity of the system is limited, the individual data streams of some of the participants are dynamically selected for broadcast to every participant. The broadcast can be sent to all participants simultaneously. The data stream control center (110) may also have a voice recognition capability that automatically recognizes the participant who is speaking, so that the broadcast can be selected automatically. In the example of FIG. 3, four participants are selected and broadcast simultaneously. The individual data streams of the selected participants are input to the multipoint control unit (120), where the decoding process and the encoding process are performed. Mixing the individual data streams produces a mixed data stream that conforms to the communication protocol. The mixed data stream is output and sent over the network to each participant unit (100a to 100f) for broadcasting.
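
As a rough illustration of automatic selection, the sketch below picks the participants to broadcast by ranking them on recent audio energy. This is an assumption made for the example only: the specification says the control center may recognize the speaking participant but does not say how, and the RMS criterion, the function names, and the default of four streams are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def select_streams(audio_by_participant: Dict[str, Sequence[float]],
                   max_streams: int = 4) -> List[str]:
    """Return the participants with the loudest recent audio.

    Assumption: loudness stands in for "currently speaking".
    The result is sorted by name so the screen layout stays stable.
    """
    ranked = sorted(audio_by_participant,
                    key=lambda p: rms(audio_by_participant[p]),
                    reverse=True)
    return sorted(ranked[:max_streams])
```

A conference chair could still override the result manually; the function only proposes a default selection.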

[0022] In the prior art, the load on the processing unit was too heavy, which resulted in poor processing efficiency and high equipment cost. The present invention provides a method in which the decoding process and the encoding process are performed by two processing chip units in the multipoint control unit (120).

[0023] In the multipoint control unit (120), the selected individual data streams are first input to an audio/video (A/V) decoding unit (122), referred to below as the decoding unit (122). The decoding unit (122) decodes each individual data stream into an individual audio signal and an individual video signal. The individual audio signals are mixed to form a mixed audio data stream, and the individual video signals are mixed to form a mixed video data stream. During mixing, a different individual broadcast index, that is, a different individual broadcast weight, can be applied to each individual audio signal or individual video signal according to practical requirements. For example, giving the current speaker a large broadcast weight enlarges the current speaker's share of the broadcast screen and makes the current speaker stand out. Because the decoding unit (122) does not perform the encoding process, the design and manufacture of the processing chip for the decoding unit can be greatly simplified. Because the computational load on the decoding unit (122) is not very heavy, an expensive high-speed chip is not needed as its processing chip. The broadcast weights do not necessarily have to be adjusted in the decoding unit (122); they can be handled by a further processing unit.
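
One way to picture broadcast weighting during mixing is the sketch below. It is only an assumed interpretation: the specification does not give a formula, so normalizing the weights to shares, using the shares as audio gains, and dividing the screen area in proportion to the shares are all illustrative choices, and every name here is hypothetical.

```python
from typing import Dict, List


def normalize(weights: Dict[str, float]) -> Dict[str, float]:
    """Scale the broadcast weights so that they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {p: w / total for p, w in weights.items()}


def mix_audio(signals: Dict[str, List[float]],
              weights: Dict[str, float]) -> List[float]:
    """Weighted sum of the individual audio signals (assumed mixing rule)."""
    share = normalize({p: weights[p] for p in signals})
    length = min(len(s) for s in signals.values())
    return [sum(share[p] * signals[p][i] for p in signals)
            for i in range(length)]


def screen_shares(weights: Dict[str, float], screen_area: int) -> Dict[str, int]:
    """Pixels of the broadcast screen given to each participant's window."""
    share = normalize(weights)
    return {p: int(screen_area * share[p]) for p in share}
```

With weights of 2 for the current speaker and 1 for each of two listeners, the speaker receives half of the audio gain and half of the screen area.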

[0024] The mixed audio data stream and the mixed video data stream output from the decoding unit (122) after the decoding and mixing processes are sent to the encoding unit (124). The encoding unit (124) combines the mixed audio data stream and the mixed video data stream to form a mixed data stream that conforms to the protocol. The mixed data stream is then output to each participant unit for broadcasting.

[0025] Because the encoding unit (124) performs only the encoding process, its workload is not very heavy, so an expensive high-speed chip is not needed. In addition, if the broadcast weights are not set in the decoding unit (122), they should be set in the encoding unit (124). Which unit sets the broadcast weights depends on the workload levels of the decoding unit (122) and the encoding unit (124).

[0026] In addition, the decoding unit and the encoding unit described above are implemented by two processing chip units. When one of the two processing chip units is dynamically selected as the decoding unit, the other processing chip unit is selected as the encoding unit. In other words, depending on the computational load at the time, either of the two processing chip units can be dynamically selected to perform the decoding process while the other is used for the encoding process, and the two roles can be swapped.
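
The dynamic role selection can be illustrated with a small Python sketch. The selection rule shown here is an assumption: the specification only says that the choice depends on the computational load at the time, so giving the heavier task to the less loaded chip, as well as the `Chip` type and the cost parameters, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Chip:
    """Hypothetical processing chip unit with a measured utilization."""
    name: str
    load: float  # current utilization between 0.0 and 1.0 (assumed metric)


def assign_roles(chip_a: Chip, chip_b: Chip,
                 decode_cost: float, encode_cost: float) -> Dict[str, str]:
    """Give the heavier task to the chip that is currently less loaded.

    decode_cost and encode_cost are assumed estimates of the work
    each task needs for the next interval.
    """
    idle, busy = sorted((chip_a, chip_b), key=lambda c: c.load)
    if decode_cost >= encode_cost:
        return {"decoder": idle.name, "encoder": busy.name}
    return {"decoder": busy.name, "encoder": idle.name}
```

Calling `assign_roles` again after the loads change may return the opposite assignment, which corresponds to the two chips swapping roles.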

[0027] As a feature of the present invention, for video in the NTSC 60 Hz standard, for example, the single full-screen output is 30 Hz, and the maximum delay of the system (not counting decoding, encoding, and network delays) is about 1/30 second. Likewise, when PAL 50 Hz is used, for example, the delay is only about 1/25 second because two processing chip units are used for the decoding process and the encoding process. Using two chips to handle the computational load and obtain an optimal price/performance ratio is one of the main features of the present invention.
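
The delay figures quoted above follow from simple arithmetic, sketched below. The sketch assumes that the quoted delay equals one full-frame period and that each full frame consists of two interlaced fields; the function name is hypothetical.

```python
from fractions import Fraction


def frame_period(field_rate_hz: int) -> Fraction:
    """Duration of one full frame, in seconds, for an interlaced field rate."""
    frame_rate = Fraction(field_rate_hz, 2)  # two fields make one full frame
    return 1 / frame_rate


ntsc_delay = frame_period(60)  # 60 Hz fields give 30 Hz full frames
pal_delay = frame_period(50)   # 50 Hz fields give 25 Hz full frames
```

Using `Fraction` keeps the results exact, so the two values compare equal to 1/30 and 1/25 second without rounding error.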

[0028] The present invention therefore has at least the following features and advantages.

1. The decoding process and the encoding process are performed by different processing chips, which reduces the workload of each chip and reduces the equipment cost.
2. Broadcast weights are assigned to the participant units selected for broadcast, so that the speaker's video can be made to stand out among the windows.
3. In the hardware configuration according to the present invention, the decoding process and the encoding process are performed by different processing chips, which makes it possible to provide functions that cannot be obtained with the conventional H.323 protocol.
4. By using local area network transmission, the conference system according to the present invention can handle a plurality of video conference participants without spatial constraints.
5. The present invention has a function of automatically recognizing the current speaker, so that the broadcast can be selected automatically.
6. The conference system according to the present invention can select a plurality of participant units for broadcast without the conventional constraint that only four can be selected at a time. Because the decoding process and the encoding process are handled by different chips, the computing capacity is increased.

[0029] Although the present invention has been described with reference to a specific embodiment, it will be apparent to those skilled in the art that modifications can be made to the embodiment without departing from the spirit of the invention. The scope of the invention is therefore defined by the claims rather than by the detailed description above.

[Brief description of drawings]

FIG. 1 is a diagram schematically showing the operating principle of a video conference according to the prior art.

FIG. 2 is a block diagram schematically showing a video conference system according to the prior art.

FIG. 3 is a block diagram schematically showing a video conference system according to the present invention.

[Explanation of symbols]

100a participant unit
100b participant unit
100c participant unit
100d participant unit
100e participant unit
100f participant unit
110 data stream control center (central control unit)
120 multipoint control unit (multi-unit data stream control system)
122 decoding unit
124 encoding unit

Claims

[Claims]

1. A video conference system suitable for a conference held by a plurality of participant units, each of which outputs an individual data stream, the system comprising: a central control unit that receives the plurality of individual data streams and is capable of selectively outputting all or some of the data streams; a decoding unit that receives the selected plurality of individual data streams, decomposes the individual data streams into individual audio signals and individual video signals, and mixes the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream; and an encoding unit that receives the mixed audio data stream and the mixed video data stream, encodes the mixed audio data stream and the mixed video data stream to form a mixed data stream, and transmits the mixed data stream to the participant units for broadcasting.

2. A multi-unit data stream control system suitable for a conference held by a plurality of participant units, each of which outputs an individual data stream, the system comprising a first processing chip unit and a second processing chip unit, one of which is dynamically selected as a decoding unit and the other of which is selected as an encoding unit, wherein: the decoding unit receives the plurality of individual data streams, decomposes the individual data streams into individual audio signals and individual video signals, and mixes the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream; and the encoding unit receives the mixed audio data stream and the mixed video data stream, encodes the mixed audio data stream and the mixed video data stream to form a mixed data stream, and transmits the mixed data stream to the participant units for broadcasting.

3. A multi-unit data stream control system suitable for a conference held by a plurality of participant units, each of which outputs an individual data stream, the system comprising: a decoding unit that receives the plurality of individual data streams, decomposes the individual data streams into individual audio signals and individual video signals, and mixes the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream; and an encoding unit that receives the mixed audio data stream and the mixed video data stream, encodes the mixed audio data stream and the mixed video data stream to form a mixed data stream, and transmits the mixed data stream to the participant units for broadcasting.

4. A method of controlling a data stream having a plurality of inputs, comprising: providing a decoding unit and an encoding unit; inputting a plurality of data streams to the decoding unit; decomposing, by the decoding unit, the data streams into individual audio signals and individual video signals and mixing the decomposed individual audio signals and the decomposed individual video signals to form a mixed audio data stream and a mixed video data stream; and combining, by the encoding unit, the mixed audio data stream and the mixed video data stream to form a mixed data stream for broadcasting.