JP6710818B1

JP6710818B1 - Translation device, translation method, program

Info

Publication number: JP6710818B1
Application number: JP2020010101A
Authority: JP
Inventors: 聡黒川; 由敬石橋; 藤井　智; 智藤井; 喜敏須田; 信公明賀; 吉田　昌司; 昌司吉田; 英樹嶋田
Original assignee: NEC Corp; Sumitomo Mitsui Construction Co Ltd
Current assignee: NEC Corp; Sumitomo Mitsui Construction Co Ltd
Priority date: 2020-01-24
Filing date: 2020-01-24
Publication date: 2020-06-17
Anticipated expiration: 2040-01-24
Also published as: WO2021149267A1; JP2021117676A

Abstract

PROBLEM TO BE SOLVED: To provide a translation device that can transmit voice to destination devices without delay when converting input voice in a first language into output voice in one or more translation destination languages other than the first language and transmitting it to the destination devices. SOLUTION: When input of input voice in a first language is detected, the input voice is immediately converted into output voice in one or more translation destination languages different from the first language, and the output voice in the one or more translation destination languages is transmitted to the destination devices used by users who speak the corresponding languages. [Selected drawing] FIG. 1

Description

The present invention relates to a translation device, a translation method, and a program.

When speakers of multiple languages work together, the content of speech uttered in a first language by a first speaker, such as a person giving instructions, must be conveyed accurately to speakers of other languages who have not learned the first language. As a related technique, Patent Document 1 discloses a technique for generating translated text data obtained by translating voice in a first language.

Japanese Unexamined Patent Application Publication No. 2019-004392 (JP 2019-004392 A)

In translation as described above, there is a need for a mechanism by which the content of speech uttered in the first language by the first speaker is conveyed in a shorter time to speakers of other languages who have not learned the first language.

Accordingly, an object of the present invention is to provide a translation device, a translation method, and a program that solve the above problem.

According to a first aspect of the invention, a translation device comprises: a storage unit that stores, in association with identification information of a transmission source device, identification information of each of a plurality of transmission destination devices and the translation destination language corresponding to that identification information; translation means that, when input of input voice in a first language from the transmission source device is detected, immediately converts the input voice into output voice in one or more of the translation destination languages that differ from the first language and correspond to the identification information of the transmission destination devices, and generates text data obtained by converting the output voice in each of the plurality of translation destination languages into text; voice transmission means that transmits the output voice in the one or more translation destination languages to the transmission destination devices corresponding to the identification information; and text data transmission means that, when a text data transmission request is received from a transmission destination device that has received the output voice in the translation destination language, transmits to that transmission destination device the text data in the translation destination language obtained by converting into text the output voice in the translation destination language corresponding to the identification information of that transmission destination device included in the text data transmission request.

According to a second aspect of the invention, a translation method comprises: storing, in association with identification information of a transmission source device, identification information of each of a plurality of transmission destination devices and the translation destination language corresponding to that identification information; when input of input voice in a first language from the transmission source device is detected, immediately converting the input voice into output voice in one or more of the translation destination languages that differ from the first language and correspond to the identification information of the transmission destination devices, and generating text data obtained by converting the output voice in each of the plurality of translation destination languages into text; transmitting the output voice in the one or more translation destination languages to the transmission destination devices corresponding to the identification information; and, when a text data transmission request is received from a transmission destination device that has received the output voice in the translation destination language, transmitting to that transmission destination device the text data in the translation destination language obtained by converting into text the output voice in the translation destination language corresponding to the identification information of that transmission destination device included in the text data transmission request.

According to a third aspect of the invention, a program causes a computer of a translation device to function as: storage means that stores, in association with identification information of a transmission source device, identification information of each of a plurality of transmission destination devices and the translation destination language corresponding to that identification information; translation means that, when input of input voice in a first language from the transmission source device is detected, immediately converts the input voice into output voice in one or more of the translation destination languages that differ from the first language and correspond to the identification information of the transmission destination devices, and generates text data obtained by converting the output voice in each of the plurality of translation destination languages into text; voice transmission means that transmits the output voice in the one or more translation destination languages to the transmission destination devices corresponding to the identification information; and text data transmission means that, when a text data transmission request is received from a transmission destination device that has received the output voice in the translation destination language, transmits to that transmission destination device the text data in the translation destination language obtained by converting into text the output voice in the translation destination language corresponding to the identification information of that transmission destination device included in the text data transmission request.

According to the present invention, it is possible to provide a translation device by which the content of speech uttered in a first language by a first speaker is conveyed in a shorter time to speakers of other languages who have not learned the first language.

FIG. 1 is a block diagram showing the configuration of a translation system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the hardware configuration of a translation server, a transceiver server, and a mediation server according to an embodiment of the present invention.
FIG. 3 is a functional block diagram of the translation server according to an embodiment of the present invention.
FIG. 4 is a diagram showing the detailed functional configuration of a translation unit according to an embodiment of the present invention.
FIG. 5 is a diagram showing a user management table stored in the transceiver server according to an embodiment of the present invention.
FIG. 6 is a diagram showing the processing flow of the translation system according to an embodiment of the present invention.
FIG. 7 is a diagram showing the minimum configuration of the translation server according to an embodiment of the present invention.
FIG. 8 is a diagram showing the processing flow of the translation server in the minimum configuration according to an embodiment of the present invention.

Hereinafter, a translation device according to an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of a translation system that includes the translation device according to this embodiment.
The translation system 100 includes at least a translation server 1. The translation system 100 according to this embodiment further includes a transceiver server 2 and a mediation server 3.
The translation server 1 has a function of converting input voice into voice in a predetermined translation destination language and outputting it.
The transceiver server 2 is a device that communicates directly with the user terminals 5, and performs management of each user terminal 5 in a group, voice communication processing, user management, and the like.
The mediation server 3 mediates the communication of voice data and text data between the transceiver server 2 and the translation server 1.

FIG. 1 also shows user terminals 51, 52, 53, and 54 communicatively connected to the transceiver server 2. The user terminals 51, 52, 53, and 54 are collectively referred to as the user terminals 5. In this embodiment, the user terminal 51 is used by the manager of a construction site, and the user terminals 52, 53, and 54 are used by workers at the construction site. The workers who use the user terminals 52, 53, and 54 are assumed to converse in different native languages. Each user terminal 5 has the function of a transceiver terminal. The manager gives voice instructions to each worker via the user terminal 51. The transceiver server 2 receives the manager's voice and transmits it to the translation server 1 via the mediation server 3. When the manager's voice is input, the translation server 1 immediately converts it into the language used by each worker who uses the user terminals 52, 53, and 54, and transmits it to the corresponding user terminal 5. In this way, the translation system 100 including the translation server 1 provides a mechanism by which the content of speech uttered in the first language by the first speaker, the manager, is conveyed in a shorter time to the workers, who are speakers of other languages and have not learned the first language.

FIG. 2 is a diagram showing the hardware configuration of the translation server, the transceiver server, and the mediation server.
As shown in FIG. 2, the translation server 1, the transceiver server 2, and the mediation server 3 are each a computer provided with hardware such as a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, a database 104, and a communication module 105. The user terminals 5 are also computers provided with similar hardware.

FIG. 3 is a functional block diagram of the translation server.
The translation server 1 executes a translation management program. The translation server 1 thereby provides a control unit 11, a translation unit 12, a voice transmission unit 13, and a text data transmission unit 14.

The control unit 11 controls the other functional units.
When the translation unit 12 detects input of input voice in the first language, it immediately converts the input voice into output voice in one or more translation destination languages different from the first language. The translation unit 12 generates first text data obtained by converting the input voice into text and text data in the translation destination language corresponding to the first text data, and converts the text data in the translation destination language into voice to generate the output voice in the translation destination language.
The voice transmission unit 13 transmits the output voice in the one or more translation destination languages to the user terminals 5 (transmission destination devices) used by users who speak the corresponding languages.
When a text data transmission request is received from a user terminal 5, the text data transmission unit 14 transmits to that user terminal 5 the text data in the translation destination language obtained by converting into text the output voice in the translation destination language that corresponds to that user terminal 5.

FIG. 4 is a diagram showing the detailed functional configuration of the translation unit.
As shown in FIG. 4, the translation unit 12 includes a voice translation API (Application Programming Interface) 121, a machine translation API 122, a voice synthesis API 123, a voice recognition function 124, a translation function 125, and a voice synthesis function 126. The voice translation API 121 passes input voice received from outside to the voice recognition function 124. The machine translation API 122 is an API for machine translation. The voice synthesis API 123 outputs the output voice in the translation destination language produced by the translation. The voice recognition function 124 converts the input voice into text in the language of that voice. The translation function 125 converts the text of the input voice into text in another, designated translation destination language. The voice synthesis function 126 converts the text in the translation destination language into voice in that language and outputs it to the voice synthesis API 123.
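The chain of the voice recognition function 124, the translation function 125, and the voice synthesis function 126 can be illustrated with a minimal Python sketch. The engine signatures, the `TranslatedUtterance` type, and the `fake_*` stand-in engines are assumptions made for illustration; the publication only says that known techniques may be used for each stage.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical engine signatures; real recognition, translation, and
# synthesis engines would be plugged in here.
Recognizer = Callable[[bytes, str], str]      # (audio, source_lang) -> text
Translator = Callable[[str, str, str], str]   # (text, source_lang, target_lang) -> text
Synthesizer = Callable[[str, str], bytes]     # (text, target_lang) -> audio


@dataclass
class TranslatedUtterance:
    language_id: str
    text: str     # translated text, retained for later text requests
    audio: bytes  # synthesized output voice


def translate_voice(audio: bytes, source_lang: str, target_langs: List[str],
                    recognize: Recognizer, translate: Translator,
                    synthesize: Synthesizer) -> List[TranslatedUtterance]:
    """Recognize once, then translate and synthesize per target language."""
    first_text = recognize(audio, source_lang)  # the "first text data"
    results = []
    for lang in target_langs:
        if lang == source_lang:
            continue  # only languages different from the source are produced
        text = translate(first_text, source_lang, lang)
        results.append(TranslatedUtterance(lang, text, synthesize(text, lang)))
    return results


# Toy engines so the sketch runs without external services.
def fake_recognize(audio: bytes, lang: str) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, src: str, dst: str) -> str:
    return f"[{src}->{dst}] {text}"


def fake_synthesize(text: str, lang: str) -> bytes:
    return text.encode("utf-8")
```

Keeping the translated text next to the synthesized audio mirrors the description, in which the text data is retained so that it can be sent to a terminal on request.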

FIG. 5 is a diagram showing the user management table stored in the transceiver server.
The transceiver server stores the terminals used by the manager and workers of a construction site as terminals used by users belonging to one group. For example, it stores a construction site ID, a company ID, and a user ID in association with each terminal ID. It may also store a manager ID or a worker ID in association with the user ID. The user of the user terminal 51 is referred to as User A, the user of the user terminal 52 as User B, the user of the user terminal 53 as User C, and the user of the user terminal 54 as User D. User A is the manager of the construction site, and Users B, C, and D are workers at the construction site.

In this case, the transceiver server 2 stores, in association with the terminal ID of the user terminal 51, the construction site ID of the work site managed by User A, the company ID of the company to which User A belongs, the user ID identifying User A, and a manager ID indicating that User A is a manager. Likewise, for each of the user terminals 52, 53, and 54, the transceiver server 2 stores, in association with the terminal ID, the construction site ID of the corresponding work site, the company ID of the company that manages the user (User B, C, or D, respectively), the user ID identifying that user, and a worker ID indicating that the user is a worker. The transceiver server 2 also stores, in association with the worker ID or manager ID, a language ID indicating the user's native language. This information is recorded in the user management table of the transceiver server 2.
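As an illustration of the user management table, the following Python sketch models one row per terminal and a lookup of the other terminals in the caller's group. The field names, the sample IDs and language codes, and the choice to define a group by a shared construction site ID are all assumptions; the publication does not specify the table layout.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class UserRecord:
    terminal_id: str
    site_id: str      # construction site ID
    company_id: str
    user_id: str
    role: str         # "manager" or "worker" (stands in for the manager/worker ID)
    language_id: str  # native language of the user


# Illustrative rows only; every ID and language code here is made up.
USER_TABLE: Dict[str, UserRecord] = {
    r.terminal_id: r
    for r in [
        UserRecord("T51", "SITE1", "CO1", "userA", "manager", "ja"),
        UserRecord("T52", "SITE1", "CO2", "userB", "worker", "en"),
        UserRecord("T53", "SITE1", "CO2", "userC", "worker", "vi"),
        UserRecord("T54", "SITE1", "CO3", "userD", "worker", "zh"),
        UserRecord("T99", "SITE2", "CO4", "userZ", "worker", "en"),
    ]
}


def group_members(table: Dict[str, UserRecord],
                  caller_terminal_id: str) -> List[UserRecord]:
    """Return the other terminals that share the caller's construction site."""
    caller = table[caller_terminal_id]
    return [
        r for r in table.values()
        if r.site_id == caller.site_id and r.terminal_id != caller_terminal_id
    ]
```

With these sample rows, `group_members(USER_TABLE, "T51")` yields the records for T52, T53, and T54 and leaves out T99, which belongs to a different site.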

FIG. 6 is a diagram showing the processing flow according to this embodiment.
The processing flow according to this embodiment is described step by step below with reference to FIG. 6.
Each user starts the user terminal 5 that he or she uses. Each of the user terminals 51 to 54 then establishes a communication connection with the transceiver server 2. In this state, each user can converse with the other users using the user terminal 5.

At this point, User A first presses the talk button displayed on the display of the user terminal 51. The user terminal 51 transmits a call start instruction to the transceiver server 2 (step S101). The call start instruction includes the terminal ID of the user terminal 51 and a call number. Based on the call start instruction, the transceiver server 2 starts voice communication processing among the user terminals in the group to which the user terminal 5 belongs (step S102). The transceiver server 2 acquires the terminal ID included in the call start instruction.

The transceiver server 2 acquires, from the user management table recorded in its database, the terminal IDs that are associated with that terminal ID and belong to the same group. These acquired terminal IDs are assumed to be those of the user terminals 52, 53, and 54. The transceiver server 2 acquires the language IDs that the user management table stores in association with each of the user terminals 51, 52, 53, and 54. The transceiver server 2 then transmits a translation start instruction to the translation server 1 via the mediation server 3 (step S103). The translation start instruction includes the terminal ID of the calling user terminal 51 and its corresponding language ID, the terminal IDs of the called user terminals 52, 53, and 54, the language ID corresponding to each of those terminal IDs, and the call number. The mediation server 3 transmits the translation start instruction to the translation server 1.
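The contents of the translation start instruction of step S103 can be sketched as a small message builder. The dictionary layout, the key names, and the `build_translation_start` helper are hypothetical; the publication lists which items the instruction carries but not how they are encoded.

```python
from typing import Dict, List, TypedDict


class Endpoint(TypedDict):
    terminal_id: str
    language_id: str


class TranslationStart(TypedDict):
    call_number: str
    source: Endpoint              # calling terminal and its language
    destinations: List[Endpoint]  # called terminals and their languages


def build_translation_start(call_number: str, caller_id: str,
                            languages: Dict[str, str]) -> TranslationStart:
    """Build the instruction from a terminal ID -> language ID mapping.

    `languages` must contain every terminal in the group, caller included.
    """
    return {
        "call_number": call_number,
        "source": {"terminal_id": caller_id,
                   "language_id": languages[caller_id]},
        "destinations": [
            {"terminal_id": tid, "language_id": lang}
            for tid, lang in languages.items()
            if tid != caller_id
        ],
    }
```

Carrying the call number in the same message is what later lets the translation server match this instruction to the utterance data that arrives separately.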

User A then speaks into the microphone of the user terminal 51. The speech may be, for example, an instruction to the workers. The user terminal 51 transmits utterance data including the terminal ID of the user terminal 51, the voice data, and the call number to the transceiver server 2 (step S104). The transceiver server 2 transmits the utterance data to the translation server 1 via the mediation server 3 (step S105). When User A finishes speaking, User A presses the end-call button displayed on the display of the user terminal 51. The user terminal 51 then transmits an end-of-call notification to the transceiver server 2 (step S106). The transceiver server 2 may also transmit User A's untranslated voice data to the called user terminals 5 immediately, before receiving the end-of-call notification.

The translation server 1 receives the translation start instruction. The translation server 1 also receives the terminal ID and the utterance data. Based on the call number, the translation server 1 determines which translation start instruction corresponds to which utterance data. The translation unit 12 of the translation server 1 acquires, from the translation start instruction identified by the call number, the terminal ID of the calling user terminal 51 and its corresponding language ID, and the terminal IDs of the called user terminals 52, 53, and 54 and the language ID corresponding to each. The translation unit 12 identifies the language ID corresponding to the terminal ID of the calling user terminal 51 as the language ID of the translation source language, and identifies the language IDs corresponding to the terminal IDs of the called user terminals 52, 53, and 54 as the language IDs of the translation destination languages.

If the translation start instruction has been received, the voice recognition function 124 of the translation unit 12 starts translation processing immediately, regardless of whether the end-of-call notification has been received. At this time, the voice recognition function 124 converts the voice data contained in the utterance data identified by the same call number into text data in the translation source language indicated by the source language ID (first text data) (step S107). The translation unit 12 may use any known speech-to-text technique for this conversion. The translation unit 12 may store the call number and the source-language text data in association with each other in a database or the like.
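The matching of utterance data to a translation start instruction by call number, and the decision to start without waiting for the end-of-call notification, might look like the following Python sketch. The `CallRegistry` class and its method names are hypothetical and only illustrate the control flow described above.

```python
from typing import Dict, List, Optional, Tuple


class CallRegistry:
    """Matches utterance data to translation start instructions by call number."""

    def __init__(self) -> None:
        # call number -> (source language ID, destination language IDs)
        self._calls: Dict[str, Tuple[str, List[str]]] = {}

    def on_translation_start(self, call_number: str, source_lang: str,
                             target_langs: List[str]) -> None:
        self._calls[call_number] = (source_lang, list(target_langs))

    def on_utterance(self, call_number: str, audio: bytes
                     ) -> Optional[Tuple[str, List[str], bytes]]:
        """Return a translation job as soon as audio arrives for a known call.

        No end-of-call flag is consulted: the job is ready whether or not
        the caller has finished speaking.
        """
        entry = self._calls.get(call_number)
        if entry is None:
            return None  # no start instruction received for this call number
        source_lang, target_langs = entry
        return source_lang, target_langs, audio
```

A job returned by `on_utterance` can be handed straight to the recognition stage, which is the point of the design: the end-of-call notification never gates the start of translation.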

The voice recognition function 124 of the translation unit 12 then translates the source-language text data into text data in each translation destination language indicated by the identified destination language IDs (step S108). The translation unit 12 may use any known translation technique when converting the source-language text data into destination-language text data. The translation unit 12 records the call number and the destination-language text data in association with each other in the database (step S109). If User B, User C, and User D are workers whose native languages are language B, language C, and language D, respectively, text data is generated in three languages: text data with language B as the translation destination language, text data with language C as the translation destination language, and text data with language D as the translation destination language.

The voice synthesis function 126 of the translation unit 12 then converts the text data in each translation destination language into voice, generating voice data corresponding to the text data for each translation destination language (step S110). For each generated item of destination-language voice data, the translation unit 12 may generate output voice data that associates the destination-language voice data, the language ID, the terminal ID linked to that language ID, and the call number, and record it in the database. In this embodiment, the voice synthesis function 126 generates three items of output voice data, containing voice data in translation destination languages B, C, and D, respectively, and records them in the database. The voice synthesis API 123 acquires the three items of output voice data generated by the voice synthesis function 126 and outputs each of them to the voice transmission unit 13.

The voice transmission unit 13 transmits the three items of output voice data to the transceiver server 2 via the mediation server 3 (step S111). The transceiver server 2 receives the three items of output voice data and identifies the terminal ID contained in each as the terminal ID of the destination user terminal. Regardless of whether the end-of-call notification has been received, the transceiver server 2 broadcasts the output voice data simultaneously to the user terminals 52, 53, and 54 (step S112). That is, the transceiver server 2 transmits the output voice data containing the terminal ID of the user terminal 52 and the voice data in translation destination language B to the user terminal 52. The transceiver server 2 likewise transmits the output voice data containing the terminal ID of the user terminal 53 and the voice data in translation destination language C to the user terminal 53, and the output voice data containing the terminal ID of the user terminal 54 and the voice data in translation destination language D to the user terminal 54.
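The routing in step S112, where each item of output voice data is delivered to the terminal whose ID it carries, can be sketched as follows. The `OutputVoiceData` fields follow the items named in the description, while the `dispatch` function and the `send` callback are hypothetical placeholders for the actual transport.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class OutputVoiceData:
    call_number: str
    terminal_id: str  # destination terminal
    language_id: str
    audio: bytes      # voice data in the translation destination language


def dispatch(outputs: Iterable[OutputVoiceData],
             send: Callable[[str, bytes], None]) -> List[str]:
    """Send each item to the terminal named inside it; return the IDs served."""
    delivered = []
    for item in outputs:
        send(item.terminal_id, item.audio)
        delivered.append(item.terminal_id)
    return delivered
```

Because the destination terminal ID travels with the audio, the routing step needs no extra lookup of which language belongs to which terminal.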

Through the above processing, the voice instruction given by User A using the user terminal 51 is translated by the translation server 1 into the native language of each called user, and the output voice data containing the translated voice data is broadcast simultaneously to the user terminals 5. Each user terminal 5 plays the voice through its speaker using the voice data contained in the output voice data. Users B, C, and D can carry out on-site work based on User A's instructions translated into their own languages.

According to the above processing, the translation server 1 starts translation as soon as voice from the user terminal 51 arrives. As a result, the voice that user A inputs to the user terminal 51 is translated immediately, and the translated output voice data reaches the called user terminals 5 by simultaneous broadcast. In transceiver technology, the calling user inputs his or her voice into the user terminal and then inputs an instruction to release the transmission right. In the present embodiment, even when no such instruction is given, the translation server starts translation and broadcasts the translated output voice data simultaneously to the called terminals. Releasing the transmission right means that a user relinquishes his or her right to speak to other users. This also simplifies the operation required of the calling user.

The users of the called user terminals 52, 53, and 54 can request transmission of text data corresponding to the voice data contained in the output voice data. To do so, a called user presses the text data transmission request button shown on the display of the user terminal 5. As an example, suppose that user B, who uses the user terminal 52, presses the text data transmission request button shown on the display of the user terminal 52. The user terminal 52 then transmits a text data transmission request containing its own terminal ID to the transceiver server 2 (step S113).

The transceiver server 2 receives the text data transmission request and forwards it to the translation server 1 via the mediation server 3 (step S114). The text data transmission unit 14 of the translation server 1 acquires the terminal ID contained in the text data transmission request. From the text data in the translation target language that is registered in the database of its own device in association with that terminal ID, the text data transmission unit 14 acquires the text data that has not yet been transmitted. It then transmits a text data response, containing the acquired text data in the translation target language and the terminal ID from the text data transmission request, to the transceiver server 2 via the mediation server 3 (step S115). The transceiver server 2 transmits the text data response to the user terminal 52 based on the terminal ID (step S116).
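The lookup of untransmitted text data in steps S114 and S115 could be sketched as follows in Python. This is an assumption-laden illustration rather than the disclosed implementation: the `TextDataStore` class, its in-memory dictionaries (standing in for the database), the response dictionary layout, and the sample sentences are all hypothetical.

```python
from collections import defaultdict


class TextDataStore:
    """Holds translated text per terminal ID and tracks what was already sent."""

    def __init__(self):
        self._texts = defaultdict(list)  # terminal_id -> translated texts, in order
        self._sent = defaultdict(int)    # terminal_id -> number of texts already sent

    def register(self, terminal_id, text):
        """Store target-language text produced for the given terminal."""
        self._texts[terminal_id].append(text)

    def handle_request(self, terminal_id):
        """Build a text data response containing only the unsent texts."""
        start = self._sent[terminal_id]
        unsent = self._texts[terminal_id][start:]
        self._sent[terminal_id] = len(self._texts[terminal_id])
        return {"terminal_id": terminal_id, "texts": unsent}


# Example: two utterances, each requested once, then a request with nothing new.
store = TextDataStore()
store.register("terminal-52", "Stop the crane.")
first = store.handle_request("terminal-52")
store.register("terminal-52", "Resume work.")
second = store.handle_request("terminal-52")
third = store.handle_request("terminal-52")
```

Keeping a per-terminal count of sent items is one simple way to return only untransmitted text; a real database would more likely use a sent flag or a timestamp.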

The user terminal 52 receives the text data response and outputs the text data in the translation target language contained in the response to its display (step S117). User B can thus check the text data displayed on the user terminal 52 and confirm, as a character string, the content of the voice, such as an instruction, given by user A.

According to the above processing, a text data response based on a text data transmission request is transmitted to the user terminal 5 of a user only when that user so instructs. The translation server 1 therefore does not need to transmit the text data of every translation target language to the called user terminals 5, which reduces the processing load.

In the above processing, the text data transmission unit 14 of the translation server 1 transmits a text data response, containing the text data in the translation target language and the terminal ID included in the text data transmission request, only when the translation server 1 receives a text data transmission request that a called user terminal 5 has sent in response to a user operation. However, upon receiving the output voice data transmitted from the transceiver server 2 in step S112, the called user terminal 5 may instead automatically poll the translation server 1 by sending the text data transmission request multiple times, and receive a text data response from the translation server 1 as a result. This allows the text data in the translation target language corresponding to the output voice data to be displayed on the called user terminal 5 in a short time and without any effort by the user.
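The polling variation can be illustrated with a short Python sketch. The function name `poll_for_text`, the `request_text` callable, the retry count, the interval, and the fake server are hypothetical; the specification states only that the request is sent multiple times.

```python
import time


def poll_for_text(request_text, terminal_id, attempts=5, interval=0.0):
    """Send the text data transmission request repeatedly until text arrives.

    `request_text(terminal_id)` is a hypothetical callable returning a
    response dict whose "texts" list is empty while nothing is ready.
    Returns the first non-empty response, or None if all attempts fail.
    """
    for _ in range(attempts):
        response = request_text(terminal_id)
        if response["texts"]:
            return response
        time.sleep(interval)
    return None


# Example: a fake server whose text becomes available on the third request.
calls = {"n": 0}


def fake_server(terminal_id):
    calls["n"] += 1
    texts = ["Stop the crane."] if calls["n"] >= 3 else []
    return {"terminal_id": terminal_id, "texts": texts}


result = poll_for_text(fake_server, "terminal-52")
```

In this sketch the terminal would start polling when the output voice data of step S112 arrives, so the text can appear without the user pressing the request button.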

In the above processing, the translation server 1 transmits a text data response only when it receives a text data transmission request. Alternatively, the translation server 1 may transmit text data containing the information corresponding to the text data response to each called user terminal 5 together with, or immediately after, the output voice data, without receiving any text data transmission request.

FIG. 7 is a diagram showing the minimum configuration of the translation server.
FIG. 8 is a diagram showing the processing flow of the translation server with the minimum configuration.
The translation server 1 need only include at least the translation unit 12 and the voice transmission unit 13.
When the translation unit 12 detects input of input voice in the first language, it immediately converts the input voice into output voice in one or more translation target languages different from the first language (step S201).
The voice transmission unit 13 transmits the output voice in the one or more translation target languages to the destination devices used by the users who speak the corresponding languages (step S202).
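The minimum configuration of steps S201 and S202 could be modeled with two small Python classes. This is a sketch under assumptions: the class names, the `translate` and `send` callables, and the device ID strings are hypothetical stand-ins, and no real speech translation is performed.

```python
class TranslationUnit:
    """Counterpart of the translation unit 12 (step S201)."""

    def __init__(self, translate):
        # `translate(voice, language)` is a hypothetical speech-to-speech callable.
        self._translate = translate

    def convert(self, input_voice, target_languages):
        # Produce one output voice per target language as soon as input arrives.
        return {lang: self._translate(input_voice, lang) for lang in target_languages}


class VoiceTransmissionUnit:
    """Counterpart of the voice transmission unit 13 (step S202)."""

    def __init__(self, send):
        # `send(device_id, voice)` is a hypothetical transport callable.
        self._send = send

    def transmit(self, outputs, device_languages):
        # Each destination device receives the voice in its user's language.
        for device_id, lang in device_languages.items():
            self._send(device_id, outputs[lang])


# Example with stand-in translation and transport functions.
received = {}
translator = TranslationUnit(lambda voice, lang: f"{voice}->{lang}")
transmitter = VoiceTransmissionUnit(lambda dev, voice: received.__setitem__(dev, voice))

device_languages = {"device-52": "B", "device-53": "C"}
outputs = translator.convert("voice-A", set(device_languages.values()))
transmitter.transmit(outputs, device_languages)
```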

Each of the devices described above contains a computer system. The steps of each process described above are stored in the form of a program on a computer-readable recording medium, and the processing is carried out when a computer reads and executes the program. Here, a computer-readable recording medium means a magnetic disk, a magneto-optical disk, a CD-ROM, a DVD-ROM, a semiconductor memory, or the like. The computer program may also be distributed to a computer over a communication line, and the computer that receives the distribution may execute the program.

The program may implement only some of the functions described above. It may also be a so-called difference file (difference program), which implements the functions described above in combination with a program already recorded in the computer system.

1... Translation server (translation device)
2... Transceiver server
3... Mediation server
5... User terminal
11... Control unit
12... Translation unit (translation means)
13... Voice transmission unit (voice transmission means)
14... Text data transmission unit (text data transmission means)

Claims

A translation device comprising:
a storage unit that stores identification information of a plurality of transmission destination devices, associated with identification information of a transmission source device, and a translation target language corresponding to each piece of the identification information;
translation means for, upon detecting input of input voice in a first language from the transmission source device, immediately converting the input voice into output voice in one or more translation target languages that differ from the first language and correspond to the identification information of the transmission destination devices, and generating text data in which the output voice in each of the plurality of translation target languages is rendered as text;
voice transmission means for transmitting the output voice in each of the one or more translation target languages to the transmission destination device corresponding to the identification information; and
text data transmission means for, when a text data transmission request is received from a transmission destination device that has received the output voice in the translation target language, transmitting to that transmission destination device the text data in the translation target language in which the output voice in the translation target language corresponding to the identification information of the transmission destination device included in the text data transmission request is rendered as text.

The translation device according to claim 1, wherein the voice transmission means transmits the output voices in the plurality of translation target languages to the respective transmission destination devices corresponding to the identification information after the translation means generates the output voices and before the text data transmission request is made.

The translation device according to claim 1 or 2, wherein the text data transmission means transmits the text data in the translation target language when the text data transmission request is made after the voice transmission means has transmitted the output voice.

The translation device according to any one of claims 1 to 3, wherein the translation means generates first text data in which the input voice is converted into text and text data in the translation target language corresponding to the first text data, and generates the output voice in the translation target language by performing voice conversion on the text data in the translation target language.

The translation device according to any one of claims 1 to 4, wherein the one or more translation target languages are at least two translation target languages.

The translation device according to any one of claims 1 to 5, further comprising acquisition means for acquiring from a database, upon receiving a call start instruction, a plurality of translation target languages indicating the languages of the respective users belonging to a group identified based on the call start instruction,
wherein the translation means, upon detecting input of input voice in the first language from the transmission source device that transmitted the call start instruction, immediately converts the input voice into a plurality of output voices in the translation target languages different from the first language.

The translation device according to any one of claims 1 to 6, wherein the text data transmission request is made by a transmission operation performed by the user of the transmission destination device.

A translation method comprising:
storing identification information of a plurality of transmission destination devices, associated with identification information of a transmission source device, and a translation target language corresponding to each piece of the identification information;
upon detecting input of input voice in a first language from the transmission source device, immediately converting the input voice into output voice in one or more translation target languages that differ from the first language and correspond to the identification information of the transmission destination devices, and generating text data in which the output voice in each of the plurality of translation target languages is rendered as text;
transmitting the output voice in each of the one or more translation target languages to the transmission destination device corresponding to the identification information; and
when a text data transmission request is received from a transmission destination device that has received the output voice in the translation target language, transmitting to that transmission destination device the text data in the translation target language in which the output voice in the translation target language corresponding to the identification information of the transmission destination device included in the text data transmission request is rendered as text.

The translation method according to claim 8, wherein the output voices in the plurality of translation target languages are transmitted to the respective transmission destination devices corresponding to the identification information after the output voices are generated and before the text data transmission request is made.

The translation method according to claim 8 or 9, wherein the text data in the translation target language is transmitted when the text data transmission request is made after the output voice has been transmitted.

The translation method according to any one of claims 8 to 10, wherein first text data in which the input voice is converted into text and text data in the translation target language corresponding to the first text data are generated, and the output voice in the translation target language is generated by performing voice conversion on the text data in the translation target language.

The translation method according to any one of claims 8 to 11, wherein the one or more translation target languages are at least two translation target languages.

The translation method according to any one of claims 8 to 12, wherein, upon receiving a call start instruction, a plurality of translation target languages indicating the languages of the respective users belonging to a group identified based on the call start instruction are acquired from a database, and
upon detecting input of input voice in the first language from the transmission source device that transmitted the call start instruction, the input voice is immediately converted into a plurality of output voices in the translation target languages different from the first language.

The translation method according to any one of claims 8 to 13, wherein the text data transmission request is made by a transmission operation performed by the user of the transmission destination device.

A program for causing a computer of a translation device to function as:
a storage unit that stores identification information of a plurality of transmission destination devices, associated with identification information of a transmission source device, and a translation target language corresponding to each piece of the identification information;
translation means for, upon detecting input of input voice in a first language from the transmission source device, immediately converting the input voice into output voice in one or more translation target languages that differ from the first language and correspond to the identification information of the transmission destination devices, and generating text data in which the output voice in each of the plurality of translation target languages is rendered as text;
voice transmission means for transmitting the output voice in each of the one or more translation target languages to the transmission destination device corresponding to the identification information; and
text data transmission means for, when a text data transmission request is received from a transmission destination device that has received the output voice in the translation target language, transmitting to that transmission destination device the text data in the translation target language in which the output voice in the translation target language corresponding to the identification information of the transmission destination device included in the text data transmission request is rendered as text.