KR20200003529A

KR20200003529A - Digital device for recognizing voice and method for controlling the same

Info

Publication number: KR20200003529A
Application number: KR1020180076422A
Authority: KR
Inventors: 황정환; 조택일; 민동옥
Original assignee: 엘지전자 주식회사
Priority date: 2018-07-02
Filing date: 2018-07-02
Publication date: 2020-01-10
Also published as: WO2020009261A1

Abstract

The present invention relates to a digital device capable of recognizing voice. According to one embodiment of the present invention, the digital device comprises: a memory; a microphone that receives an audio signal of a first language; and a controller that controls the memory and the microphone. More specifically, the controller translates the received audio signal of the first language into text of a predetermined second language by referring to the memory, retranslates the translated text of the second language into the first language by referring to the memory, compares the retranslated text of the first language with the received audio signal of the first language, and, according to the comparison result, modifies some or all of the translated text of the second language and then converts the modified text into a format that can be output as an audio signal.

Description

Digital device capable of speech recognition and its control method {DIGITAL DEVICE FOR RECOGNIZING VOICE AND METHOD FOR CONTROLLING THE SAME}

The present invention relates to a digital device capable of speech recognition and is applicable to, for example, portable audio devices such as headsets and neckbands, and mobile devices such as mobile phones.

Recently, mobile devices, TVs, and wearable devices capable of speech recognition have emerged. Advances in servers and speech recognition engines have made speech recognition rates considerably higher than they once were.

However, when a translation service is provided between different languages, speech-recognition-based translation still produces many errors because of the particular differences between languages.

Accordingly, one embodiment of the present invention proposes a technique for fundamentally eliminating speech recognition and translation errors when audio in a first language is translated into a second language and output through a speaker or the like.

Furthermore, another embodiment of the present invention proposes a technique for detecting, through a speech recognition engine, not only the text of the spoken language but also state information about the speaking user.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To solve the above problems, a method for controlling a digital device capable of speech recognition according to an embodiment of the present invention includes: receiving an audio signal of a first language; translating, with reference to a memory, the received audio signal of the first language into text of a predetermined second language; retranslating, with reference to the memory, the translated text of the second language into the first language; comparing the retranslated text of the first language with the received audio signal of the first language; and, according to the comparison result, modifying some or all of the translated text of the second language and then converting it into a format that can be output as an audio signal.
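The translate, retranslate, and compare steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the received audio is assumed to have already been recognized as text, `translate` and `back_translate` are hypothetical callables standing in for whatever engine the device uses, and the word-level similarity measure and the `threshold` value are choices made for this sketch, not taken from the specification.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict


def round_trip_check(
    source_text: str,
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
    threshold: float = 0.8,  # assumed cut-off, not from the specification
) -> Dict[str, object]:
    """Translate, retranslate, and compare with the recognized source text."""
    translated = translate(source_text)
    retranslated = back_translate(translated)
    # Word-level similarity in the range [0, 1].
    similarity = SequenceMatcher(
        None, source_text.lower().split(), retranslated.lower().split()
    ).ratio()
    return {
        "translated": translated,
        "retranslated": retranslated,
        "similarity": similarity,
        "needs_correction": similarity < threshold,
    }


# Toy stand-ins for a real translation engine (hypothetical data).
_FORWARD = {"where is the station": "역은 어디에 있습니까"}
_BACKWARD = {"역은 어디에 있습니까": "where is the station"}


def toy_translate(text: str) -> str:
    return _FORWARD.get(text.lower(), text)


def toy_back_translate(text: str) -> str:
    return _BACKWARD.get(text, text)


result = round_trip_check("Where is the station", toy_translate, toy_back_translate)
print(result["needs_correction"])
```

When the round trip reproduces the source text, no correction is flagged; a lossy round trip lowers the similarity and marks the translation for modification before it is converted to audio.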

The comparing step further includes: when the retranslated text of the first language differs from the received audio signal of the first language, displaying the matching portions and the differing portions differently; and providing an option for correcting the differing portions, either by user selection or automatically according to data stored in the memory of the digital device.
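One way to display matching and differing portions differently is a word-level diff. The sketch below is an assumption-laden illustration: the function name `mark_differences` and the bracket notation are hypothetical, and the received audio is again assumed to be available as recognized text.

```python
from difflib import SequenceMatcher


def mark_differences(original: str, retranslated: str) -> str:
    """Return the original text with differing spans wrapped in brackets."""
    a = original.split()
    b = retranslated.split()
    parts = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag == "equal":
            # Matching words are shown unchanged.
            parts.extend(a[i1:i2])
        else:
            # Differing words are shown as [original -> retranslated].
            parts.append("[" + " ".join(a[i1:i2]) + " -> " + " ".join(b[j1:j2]) + "]")
    return " ".join(parts)


print(mark_differences("where is the station", "where is the stop"))
```

A display module could render the bracketed spans in another color and attach the correction option to them.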

The receiving step further includes extracting feature points of the audio signal of the first language, wherein the feature points include at least one of stress, intonation, voice, and volume.

The converting step further includes outputting the translated text of the second language with a different stress, intonation, voice, or volume according to the feature points extracted from the audio signal of the first language.
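Of the feature points listed, volume is the simplest to illustrate. The following sketch assumes audio is available as floating-point samples in the range [-1.0, 1.0] and uses root-mean-square (RMS) level as the volume feature; both the representation and the function names are assumptions for illustration, and stress, intonation, and voice would need considerably more elaborate analysis.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_volume(source: Sequence[float], synthesized: Sequence[float]) -> List[float]:
    """Scale synthesized speech so its RMS level follows the source speech."""
    out_level = rms(synthesized)
    if out_level == 0.0:
        # Silent output cannot be scaled; return it unchanged.
        return list(synthesized)
    gain = rms(source) / out_level
    # Clamp to the valid sample range to avoid clipping artifacts.
    return [max(-1.0, min(1.0, s * gain)) for s in synthesized]


loud_source = [0.5, -0.5, 0.5, -0.5]
quiet_output = [0.1, -0.1, 0.1, -0.1]
adjusted = match_volume(loud_source, quiet_output)
print(round(rms(adjusted), 3))
```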

The first language and the second language are different from each other, and the digital device includes at least one of a wearable device and a mobile device.

A digital device capable of speech recognition according to an embodiment of the present invention includes a memory, a microphone that receives an audio signal of a first language, and a controller that controls the memory and the microphone.

The controller translates, with reference to the memory, the received audio signal of the first language into text of a predetermined second language; retranslates, with reference to the memory, the translated text of the second language into the first language; compares the retranslated text of the first language with the received audio signal of the first language; and, according to the comparison result, modifies some or all of the translated text of the second language and then converts it into a format that can be output as an audio signal.

The digital device further includes a display module that, when the retranslated text of the first language differs from the received audio signal of the first language, displays the matching portions and the differing portions differently, and displays an option for correcting the differing portions either by user selection or automatically according to data stored in the memory of the digital device.

The controller extracts feature points of the audio signal of the first language, wherein the feature points include at least one of stress, intonation, voice, and volume.

The controller controls a speaker to output the translated text of the second language with a different stress, intonation, voice, or volume according to the feature points extracted from the audio signal of the first language.

Specific details of other embodiments are included in the detailed description and the drawings.

According to embodiments of the present invention, one or more of the following effects are obtained.

First, according to one embodiment of the present invention, speech recognition and translation errors can be fundamentally eliminated when audio in a first language is translated into a second language and output through a speaker or the like.

Furthermore, according to another embodiment of the present invention, the speech recognition engine can detect not only the text of the spoken language but also state information about the speaking user.

The effects of the present invention are not limited to those mentioned above, and other effects not mentioned will be clearly understood by those skilled in the art from the claims.

FIG. 1 is a block diagram illustrating a portable audio device as an example of a digital device according to an embodiment of the present invention.
FIG. 2 is a perspective view of a portable audio device as an example of a digital device according to an embodiment of the present invention.
FIG. 3 is an exploded perspective view of a portable audio device as an example of a digital device according to an embodiment of the present invention.
FIG. 4 illustrates a process in which a digital device according to an embodiment of the present invention corrects errors through a first translation, a second translation, and so on.
FIG. 5 is a detailed flowchart in which a digital device according to an embodiment of the present invention performs correction after interpretation.
FIG. 6 illustrates a process in which a digital device according to an embodiment of the present invention performs N-to-N interpretation.
FIG. 7 illustrates a scenario in which a digital device according to another embodiment of the present invention masks the voice or emotion of a speaker.
FIG. 8 is a flowchart that elaborates the scenario shown in FIG. 7 in more detail.
FIG. 9 is a block diagram showing the main components of the digital device shown in FIGS. 1 to 8.

Hereinafter, the embodiments disclosed in this specification are described in detail with reference to the accompanying drawings. Identical or similar components are given the same reference numerals regardless of the figure, and redundant descriptions of them are omitted. The suffixes "module" and "unit" used for components in the following description are assigned or used interchangeably only for ease of drafting and do not in themselves have distinct meanings or roles. In describing the embodiments disclosed herein, detailed descriptions of related known technologies are omitted when they are judged likely to obscure the gist of the embodiments. The accompanying drawings are provided only to aid understanding of the embodiments disclosed herein; the technical idea disclosed in this specification is not limited by the drawings and should be understood to cover all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention.

Terms including ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

Singular expressions include plural expressions unless the context clearly indicates otherwise.

In this application, terms such as "comprise" or "have" are intended to indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

FIG. 1 is a block diagram illustrating a portable audio device as an example of a digital device according to an embodiment of the present invention. The portable audio device 100 may include a wireless communication unit 110, an input unit 120, a sensing unit 130, an output unit 140, an interface unit 150, a controller 160, a power supply unit 170, and the like.

The portable audio device is a device that receives audio signals from a terminal (for example, a mobile phone) and transmits audio information collected through a microphone to the terminal. Conventionally, portable audio devices used a wired connection, with a plug inserted into the terminal's ear jack to receive the audio signal, but portable audio devices using wireless communication have recently been commercialized.

More specifically, portable audio devices are being developed in a variety of designs that take portability into account so that they can be carried on the user's body, such as headphone types worn over the head in a band shape, ear-hook types, and in-ear types. In particular, neckband-shaped portable audio devices that hang around the user's neck have recently become more common. FIG. 1 illustrates this type.

The components shown in FIG. 1 are not essential to implementing a portable audio device, so the portable audio device described in this specification may have more or fewer components than those listed above.

More specifically, among these components, the wireless communication unit 110 may include one or more modules that enable wireless communication between the portable audio device 100 and a wireless communication system, between the portable audio device 100 and another mobile terminal, or between the portable audio device 100 and an external server. The wireless communication unit 110 may also include one or more modules that connect the portable audio device 100 to one or more networks.

The wireless communication unit 110 may include at least one of a short-range communication module 111 and a location information module 112. It may also include a mobile communication module, a wireless Internet module, or the like as needed.

The short-range communication module 111 is for short-range communication and may support it using at least one of Bluetooth™, RFID (Radio Frequency Identification), IrDA (Infrared Data Association), UWB (Ultra Wideband), ZigBee, NFC (Near Field Communication), Wi-Fi (Wireless-Fidelity), Wi-Fi Direct, and Wireless USB (Wireless Universal Serial Bus) technologies.

Through wireless area networks, the short-range communication module 111 may support wireless communication between the portable audio device 100 and a wireless communication system, between the portable audio device 100 and another mobile terminal, or between the portable audio device 100 and a network in which another mobile terminal (or an external server) is located. The wireless area networks may be wireless personal area networks.

The short-range communication module 111 may detect (or recognize) terminals and the like around the portable audio device 100 that are capable of communicating with it. Furthermore, when a detected terminal is a device authenticated to communicate with the portable audio device 100 according to the present invention, the controller 160 may receive at least part of the data processed by the mobile terminal through the short-range communication module 111. The user of the portable audio device 100 can therefore use data processed by the terminal through the wearable device.

For example, when a call arrives at the terminal, the user can take the call through the portable audio device 100.

The location information module 112 is a module for obtaining the location (or current location) of the portable audio device 100; representative examples are a GPS (Global Positioning System) module and a Wi-Fi (Wireless Fidelity) module. For example, when the mobile terminal uses a GPS module, it can obtain the location of the portable audio device 100 using signals transmitted from GPS satellites. As another example, when the portable audio device 100 uses a Wi-Fi module, it can obtain its location based on information about the wireless AP (Wireless Access Point) that transmits wireless signals to or receives them from the Wi-Fi module. If necessary, the location information module 112 may, alternatively or additionally, perform a function of another module of the wireless communication unit 110 to obtain data on the location of the portable audio device 100. The location information module 112 is a module used to obtain the location (or current location) of the portable audio device 100 and is not limited to a module that directly calculates or obtains that location.

The input unit 120 may include a microphone 121 (or audio input unit) for receiving audio signals and a user input unit 122 (for example, a touch key or a mechanical push key) for receiving information from the user. Voice data or image data collected by the input unit 120 may be analyzed and processed as a control command from the user.

The user input unit 122 is the component with which the user controls the portable audio device 100. Examples include a call button, volume control buttons, a power button, and a retract button that stows the audio cable inside the main body.

The user input unit 122 may include only a call button and a pair of volume control buttons, or may further include a play/stop button and a track change button.

Because the portable audio device 100 is limited in size and the user often operates the user input unit 122 without looking at it, a large number of buttons makes their functions hard to tell apart. The set of available control commands can therefore be expanded with a limited number of buttons by using how long and how many times a button is pressed and by combining multiple buttons.
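As a rough illustration of expanding the command set with few buttons, the sketch below maps a combination of pressed buttons, press length, and press count to a command. The button names, the command names, and the 0.8-second long-press threshold are all hypothetical values chosen for this example; the specification does not define them.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

LONG_PRESS_SECONDS = 0.8  # assumed threshold separating short and long presses

# (buttons pressed together, press kind, press count) -> command name.
# Every entry here is a made-up example mapping.
COMMANDS: Dict[Tuple[FrozenSet[str], str, int], str] = {
    (frozenset({"call"}), "short", 1): "answer_call",
    (frozenset({"call"}), "long", 1): "reject_call",
    (frozenset({"call"}), "short", 2): "redial",
    (frozenset({"vol_up", "vol_down"}), "long", 1): "start_pairing",
}


def resolve_command(
    buttons: Iterable[str], duration_s: float, count: int
) -> Optional[str]:
    """Look up the command for a button combination, or None if unmapped."""
    kind = "long" if duration_s >= LONG_PRESS_SECONDS else "short"
    return COMMANDS.get((frozenset(buttons), kind, count))


print(resolve_command(["call"], 0.2, 2))
```

With three physical buttons, two press lengths, and up to two presses, the lookup table can already distinguish far more commands than there are buttons.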

The microphone 121 converts external sound signals into electrical voice data. The processed voice data may be used according to the function being performed (or the application being run) on the portable audio device 100, or may be transmitted to an external terminal or external server through the wireless communication unit 110. Various noise removal algorithms may be implemented in the microphone 121 to remove noise generated while external sound signals are being received.

The sensing unit 130 may include one or more sensors for sensing at least one of information inside the portable audio device, information about the environment surrounding it, and user information. For example, the sensing unit 130 may include at least one of a proximity sensor 131, an illumination sensor 132, a touch sensor, an acceleration sensor, a magnetic sensor, a gravity sensor (G-sensor), a gyroscope sensor, a motion sensor, an RGB sensor, an infrared (IR) sensor, a fingerprint sensor (finger scan sensor), an ultrasonic sensor, an optical sensor, a microphone (see 121), a battery gauge, an environmental sensor (for example, a barometer, hygrometer, thermometer, radiation sensor, heat sensor, or gas sensor), and a chemical sensor (for example, an electronic nose, healthcare sensor, or biometric sensor). The portable audio device disclosed in this specification may combine and use information sensed by two or more of these sensors.

In particular, it may include a sensor that detects whether the earphone, described later, is seated in its holder; the magnetic sensor is a typical choice for this sensor.

The output unit 140 generates output related to sight, hearing, touch, and the like, and may include at least one of a sound output unit 141, a haptic module 142, and a light output unit 143.

The sound output unit 141 is a device that outputs sound according to an audio signal. Representative examples are an earphone, which is inserted into the user's ear to deliver sound, and a speaker, which outputs sound when the earphone is not being worn.

The interface unit 150 serves as a path to the various types of external devices connected to the portable audio device 100. The interface unit 150 may include at least one of an external charger port and a wired/wireless data port. When an external device is connected to the interface unit 150, the portable audio device 100 may perform appropriate control related to the connected external device.

In addition to operations related to application programs, the controller 160 typically controls the overall operation of the portable audio device 100. The controller 160 may process signals, data, information, and the like that are input or output through the components described above.

Under the control of the controller 160, the power supply unit 170 receives external or internal power and supplies it to each component of the portable audio device 100. The power supply unit 170 includes a battery, which may be a built-in battery or a replaceable battery.

At least some of these components may operate in cooperation with one another to implement the operation, control, or control method of the portable audio device according to the various embodiments described below.

FIG. 2 is a perspective view of a portable audio device as an example of a digital device according to an embodiment of the present invention. FIG. 3 is an exploded perspective view of the portable audio device 100 as an example of a digital device according to an embodiment of the present invention.

The portable audio device 100 broadly comprises a neckband wire 310 and a main body 200. The neckband wire 310 wraps around part of the user's neck so that the portable audio device 100 rests on the neck.

The main body 200 and the neckband wire 310 may be joined to form a "U" shape or, if necessary, both ends may be made attachable and detachable so that an "O" shape can optionally be formed.

The main body 200 may be coupled to each of the two ends of the neckband wire.

The main body 200 may be divided into a first body coupled to one end of the neckband wire 310 and a second body coupled to the other end of the neckband wire 310.

To avoid repetition, only the main body 200 on one side, either the first body or the second body, is described.

The main body 200 may be formed as a single piece, but for purposes such as mounting components it may be formed by joining an upper case 210 and a lower case 220. Either the upper case 210 or the lower case 220 may extend to cover the side region of the main body 200. Alternatively, a side case may be provided as a separate member and joined to the upper case 210 and the lower case 220. The present description assumes that the upper case 210 forms the side region of the main body 200.

Because the portable audio device 100 is easily exposed to moisture such as sweat, a waterproof function may be added. Water can be kept out by forming a rib that covers the gap between the upper case 210 and the lower case 220 or by interposing a waterproof member.

The main body 200 may be manufactured by injection molding a polymer material. For example, a rigid plastic such as polystyrene (PS) may be used. However, it may also be partly made of other materials such as metal, glass, or leather.

The surface of the main body 200 may be coated with polyurethane so that it protects the internal components while fitting closely against the user's body. A polyurethane coating gives the portable audio device 100 a unified appearance, and because the main body 200 stays in close contact with the user's skin, it does not shake as the user moves, which makes it comfortable to wear.

The main body 200 may house most of the components that carry out the functions of the portable audio device 100 of the present invention. For example, components such as the aforementioned main board, wireless communication unit, battery 260, and recovery module may be inserted into the main body 200.

The main board carries the wireless communication unit, the microphone 201, and so on, and is connected to the battery 260, the user input unit, the sound output unit, and the like. The components mounted in the main body 200 may be provided symmetrically in both bodies, that is, the first body and the second body, or in only one of them.

The neckband wire 310 is elastic: when force is applied it deforms elastically within a certain range, and when the force is removed it returns to its original shape. A shape memory alloy is a representative example of such an elastic material. In addition, a first board 240, a fixing bracket 250, a battery 260, and the like are provided.

Although FIGS. 1 to 3 above describe a portable audio device as an example, the present invention is not limited thereto and is applicable to any other type of digital device capable of speech recognition.

FIG. 4 illustrates a process in which a digital device according to an embodiment of the present invention corrects errors through a first translation and a second translation. The user's voice can first be received through the microphone described with reference to FIGS. 1 to 3, and the received voice is analyzed and translated by means of the controller, memory, network interface, and the like.

The main technical idea of the present invention is to translate (interpret) a speech-recognized language into a preset language (for example, Korean → English), interpret the result back (English → Korean), and correct the differences. Designing the device so that the differences (diffs) between corrected contents are learned as features, so that automatic correction through AI (artificial intelligence) becomes possible afterwards, also falls within the scope of the present invention.

For example, assume that the device according to an embodiment of the present invention is device A shown in FIG. 4. The utterance “내일 6시에 시계탑에서 봐” ("See you at the clock tower at 6 tomorrow") is input through the microphone of device A (S401).

When device A performs the translation, using either a translation program stored in its internal memory or communication with a server, the initial translation is “Look at the clock tower at 6 o’clock tomorrow” (S402).

When this initial translation, “Look at the clock tower at 6 o’clock tomorrow” (S403), is translated back into Korean, the result is typically “내일 6시에 시계탑을 보세요” ("Look at the clock tower at 6 tomorrow") (S404).

Device A according to an embodiment of the present invention then detects the difference between the Korean text received in step S401 and the retranslated Korean sentence (S404). It provides a list of options for correcting “보세요” ("look"), the part that differs (S405).

If the user selects “만나자” ("let's meet") (S406), the English translation is revised to “See you at the clock tower at 6 o’clock tomorrow” (S407).

A GUI (S409) is then displayed so that the user can check the revised “See you at the clock tower at 6 o’clock tomorrow” (S408) once more. When the user's confirmation (send) is received, the final English sentence “See you at the clock tower at 6 o’clock tomorrow” is transmitted to device B (S411), and the corresponding Korean sentence (S410) is stored with it in the memory so that the same error can be prevented from recurring.
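The comparison between the original sentence and its retranslation (S401 to S405) can be sketched as a word-level diff. The following is a minimal illustration, not the patent's implementation: the lookup tables `KO_TO_EN` and `EN_TO_KO` are hypothetical stand-ins for a real translation engine, and the function name `round_trip_diff` is an assumption.

```python
import difflib

# Hypothetical stand-ins for a translation engine: fixed lookup tables that
# reproduce the sentences from the FIG. 4 example.
KO_TO_EN = {"내일 6시에 시계탑에서 봐": "Look at the clock tower at 6 o'clock tomorrow"}
EN_TO_KO = {"Look at the clock tower at 6 o'clock tomorrow": "내일 6시에 시계탑을 보세요"}


def round_trip_diff(source: str) -> list[tuple[str, str]]:
    """Translate, retranslate, and return (original, retranslated) spans that differ."""
    translated = KO_TO_EN[source]   # first translation (S402)
    back = EN_TO_KO[translated]     # retranslation (S404)
    src_words, back_words = source.split(), back.split()
    matcher = difflib.SequenceMatcher(a=src_words, b=back_words)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":          # keep only the spans that changed
            diffs.append((" ".join(src_words[i1:i2]), " ".join(back_words[j1:j2])))
    return diffs


print(round_trip_diff("내일 6시에 시계탑에서 봐"))
```

A plain word-level diff also flags the changed particle on "시계탑", so a practical system would probably need morphological analysis to isolate the verb alone, as the example in the text does.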

FIG. 5 is a detailed flow chart of how a digital device according to an embodiment of the present invention performs correction after interpretation.

FIG. 4 assumed that speech in an arbitrary language is received in an audio format through speech recognition, but the original text may be provided by either voice or keyboard input (S501).

The digital device according to an embodiment of the present invention then translates the text into a preset language (S502). The preset language may be English, Korean, Japanese, Chinese, or another language, and is not limited to these.

The digital device then retranslates the text translated in step S502 back into the original language (S503).

Next, it determines whether the original input in step S501 and the text retranslated in step S503 are identical (S504).

If they are identical (S504), no correction process is needed, so the content translated in step S502 is output through the speaker or transmitted to another device (S514).

If they are not identical (S504), feedback informing the user of the difference is provided (S505). FIG. 4 illustrated output as a graphic image in the form of a list, but giving feedback on the difference through audio also falls within the scope of the present invention.

Whether an AI function is applied is then determined, depending on the capabilities of the digital device or on the user's selection (S506).

If AI is applied (S506), the device suggests, based on AI technology, which word should replace the differing word (S507).

If AI is not applied (S506), the user directly selects the word to be corrected (S508).

The digital device then determines whether there is user input (S509).

If there is no user input (S509), the translated content is output as it is, or transmitted to another device, without error correction (S514). This would be the case, for example, when the difference is small enough that the other party can understand the message without the translation error being corrected.

If, on the other hand, the user makes a selection (S509), a new sentence with the correction applied is generated (S510), and a message asking the user to reconfirm is displayed (S511). If the user reconfirms, the process returns to step S502; if not, whether AI is applied is determined again (S512).

If AI is not applied (S512), the process proceeds to step S514. If AI is applied (S512), the corrected word is learned and stored in the memory (S513), and the process then proceeds to step S514.
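The main branches of the FIG. 5 flow can be sketched as a single function. This is a simplified illustration under stated assumptions: the callables `translate`, `back_translate`, and `ask_user` are hypothetical interfaces, the reconfirmation loop of S511 is omitted, and "learning" is reduced to storing the correction in a dictionary.

```python
from typing import Callable, Optional


def interpret_with_correction(
    source: str,
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
    ask_user: Callable[[str, str], Optional[str]],
    memory: dict[str, str],
    ai_enabled: bool = False,
) -> str:
    """Return the text to be output or transmitted (S514)."""
    translated = translate(source)             # S502
    retranslated = back_translate(translated)  # S503
    if retranslated == source:                 # S504: identical, no correction
        return translated
    # S505 to S509: report the difference and wait for a corrected source sentence.
    corrected = ask_user(source, retranslated)
    if corrected is None:                      # no user input: output as-is
        return translated
    result = translate(corrected)              # S510: sentence with correction applied
    if ai_enabled:                             # S512 to S513: remember the correction
        memory[source] = corrected
    return result
```

Passing the translation and feedback steps in as callables keeps the control flow testable without a real engine or GUI.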

FIG. 6 illustrates a process in which a digital device according to an embodiment of the present invention performs N-to-N interpretation. One feature of the present invention is that, to speed up translation in a many-to-many model, it does not use N × N translation engines but only N translation engines that translate through a common language. The common language may be, for example, Korean or English.
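The saving from routing through a common language can be illustrated with a small routing sketch. The language codes, the choice of English as the pivot, and the function names are assumptions made for illustration; the sketch counts ordered language pairs, N(N-1), rather than the N × N figure used in the text.

```python
from itertools import permutations

LANGUAGES = ["ko", "en", "ja", "zh"]  # illustrative language codes
PIVOT = "en"                          # assumed common language


def direct_engine_pairs(languages: list[str]) -> list[tuple[str, str]]:
    """Every ordered (source, target) pair a direct many-to-many setup would need."""
    return list(permutations(languages, 2))


def pivot_route(src: str, dst: str, pivot: str = PIVOT) -> list[tuple[str, str]]:
    """Return the translation hops from src to dst when routing via the pivot."""
    if src == dst:
        return []
    if pivot in (src, dst):
        return [(src, dst)]           # one hop: already to or from the pivot
    return [(src, pivot), (pivot, dst)]


print(len(direct_engine_pairs(LANGUAGES)))  # ordered pairs for direct translation
print(pivot_route("ko", "ja"))              # two hops through the pivot
```

With a pivot, each language only needs an engine to and from the common language, at the cost of a second hop for pairs that do not include the pivot.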

Whereas the previous embodiments assumed one-to-one interpretation through a digital device (for example, a portable audio device or a headset), FIG. 6 proposes a one-to-N or many-to-many (N-to-N) interpretation service.

Several headsets are connected to a single terminal, and a language is set for each headset before the interpretation service is used. A translation network is formed by linking all the connected headsets, including through communication between terminals (mobile phones).

For example, when a speaker talks in his or her own language, everyone connected to the same terminal or to another terminal can receive interpretation in the language that he or she has set.

More specifically, assume that two headsets are connected to a first mobile phone and that two headsets are likewise connected to a second mobile phone.

Assume further that the language of the first headset connected to the first mobile phone is set to English, the language of the second headset connected to the first mobile phone is set to Chinese, the language of the third headset connected to the second mobile phone is set to Japanese, and the language of the fourth headset connected to the second mobile phone is set to Korean.

In this case, when “Do you have dinner?” is input through the first headset (S601), the corresponding Chinese is output through the second headset (S602), the corresponding Japanese through the third headset (S603), and the corresponding Korean through the fourth headset (S604). Combining this with the embodiments of FIGS. 4 and 5 described above to correct interpretation and translation errors also falls within the scope of the present invention.

FIG. 7 illustrates a scenario in which a digital device according to another embodiment of the present invention masks the speaker's voice and emotion.

Whereas the previous embodiments focused mainly on errors in the text itself, FIG. 7 proposes a technique for detecting the speaker's intent and the like more precisely.

For example, the interpreted content is played in the speaker's own voice, which helps the listener identify who is speaking. In addition, the listener is told the language used before translation, as a guide to which country the speaker is from.

Furthermore, the stress and similar cues in the speaker's voice are mapped to words so that the emotion can also be felt, through stress and the like, in the translated sentence. Expressing the stress or emotion as text and displaying it also falls within the scope of the present invention.

More specifically, as shown in FIG. 7, suppose woman A says “내일 6시에 시계탑에서 봐” without any particular change in tone (S701). If the preset language is English, the utterance is translated as “See you at the clock tower at 6 o’clock tomorrow” (S702), the output indicates that the source language was Korean or that the speaker is Korean (Korean), and “See you at the clock tower at 6 o’clock tomorrow” is transmitted to the other device in the same voice of woman A (S703).

Now suppose the same woman A says in Korean, stressing particular parts, “너 때문에 화났다고 말했자나!” ("I told you I'm upset because of you!") (S704). Here the two parts “너” ("you") and “말했자나” ("I told you") are stressed.

Through analysis of the stressed words and of the vocabulary, additional information is attached to the audio data before transmission so that, in the translation “I told you I’m upset because of you”, the parts “told you” and “you” are output with emphasis (S705).

FIG. 8 is a flow chart that sets out the scenario of FIG. 7 in more detail.

A digital device capable of speech recognition according to an embodiment of the present invention receives a user's voice input (S801). Unlike the prior art, it then extracts feature points from the input voice (S802). The feature points are, for example, stress, intonation, and voice, and other additional information may also be extracted.

The recognized voice is converted into text (S803) and translated into another preset language (S804).

Then the translated content and the extracted feature points are mapped to each other and transmitted to another device or to an internal controller (S805).

It is first determined whether the data received in step S805 contains feature points (S806).

If there are no feature points (S806), the translated text is output as it is (S809).

If there are feature points (S806), they are masked onto the text with reference to the data received in step S805 (S807).

The text with the feature points masked onto it is then passed to TTS (text to speech), and the process proceeds to step S809. The technical effect is that the parts stressed in the originally spoken first language are reflected identically or similarly in the interpreted second language.
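One way to carry stress information into a TTS step (S806 to S807) is to wrap the stressed words of the translated text in SSML `<emphasis>` elements. The sketch below is an assumption about how this could be done, not the method described in the text: the function name `mark_emphasis` is hypothetical, the set of stressed target-language words is assumed to be given by an earlier alignment step, and whether a particular TTS engine honours the markup is not guaranteed.

```python
from xml.sax.saxutils import escape


def mark_emphasis(translated: str, emphasized: set[str]) -> str:
    """Wrap stressed words in SSML <emphasis> tags; other words pass through."""
    words = []
    for word in translated.split():
        token = escape(word)  # escape &, <, > so the markup stays well-formed
        if word.strip(".,!?").lower() in emphasized:
            token = f'<emphasis level="strong">{token}</emphasis>'
        words.append(token)
    return "<speak>" + " ".join(words) + "</speak>"


print(mark_emphasis("I told you I'm upset because of you", {"told", "you"}))
```

With an empty set of stressed words the function returns the text wrapped only in `<speak>`, which corresponds to the branch where no feature points are present.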

FIG. 9 is a block diagram showing the main components of the digital device illustrated in FIGS. 1 to 8. FIG. 9 may therefore be read on its own, and reading it together with the earlier drawings also falls within the scope of the present invention. In particular, FIG. 9 illustrates the multi-party interpretation service of FIG. 6 performed on a server.

FIG. 9 shows device A 900, the server 910, and device group B 920 to which the present invention is applied as separate entities, but designing the system so that some or all of the functions of the server 910 are performed by device A 900 or device group B 920 also falls within the scope of the present invention.

A voice message in a first language received through device A 900 is delivered to the message receiving unit 911 of the server 910.

The message receiving unit 911 receives the voice message from device A 900 (S902) and translates the message into the common language (for example, English) (S903).

Meanwhile, the language in which each device of device group B 920 wishes to receive the interpretation is delivered to the server 910 (S900). The translation engine 912, referring to the translation management unit 913, translates the message into English, the common language (S904), and then retranslates it into the specific language received in step S900 (S905).

The message transmitting unit 914 then transmits the translated message to each receiving device in device group B 920 (S906).
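The server-side relay of FIG. 9 can be sketched as a fan-out through the common language. This is an illustrative outline only: the `Receiver` class, the `relay` function, and the `engine(text, src, dst)` callable are hypothetical names, and English is assumed as the common language.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Receiver:
    device_id: str
    language: str  # language requested by the receiving device (S900)


def relay(
    message: str,
    src_lang: str,
    receivers: list[Receiver],
    engine: Callable[[str, str, str], str],
    pivot: str = "en",
) -> dict[str, str]:
    """Translate once into the pivot, then once per requested language."""
    # S903/S904: bring the message into the common language.
    pivot_text = message if src_lang == pivot else engine(message, src_lang, pivot)
    outbox = {}
    for r in receivers:
        # S905: retranslate from the pivot into each receiver's language.
        outbox[r.device_id] = (
            pivot_text if r.language == pivot else engine(pivot_text, pivot, r.language)
        )
    return outbox  # S906: one message per receiving device
```

Translating into the common language once and reusing that text for every receiver avoids repeating the first hop for each device in the group.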

The present invention described above can be implemented as computer-readable code on a medium on which a program is recorded. The computer-readable medium includes every kind of recording device that stores data readable by a computer system.

Examples of computer-readable media include an HDD (hard disk drive), SSD (solid state disk), SDD (silicon disk drive), ROM, RAM, CD-ROM, magnetic tape, floppy disk, and optical data storage device, and also include implementations in the form of a carrier wave (for example, transmission over the Internet). The computer may also include the control unit of a terminal. The above detailed description should therefore not be construed as limiting in any respect but should be considered illustrative.

The scope of the present invention should be determined by a reasonable interpretation of the appended claims, and all changes within the equivalent scope of the present invention are included in its scope.

100: portable audio device
110: wireless communication unit
120: input unit
130: sensing unit
140: output unit
150: interface unit
160: control unit

Claims

A method for controlling a digital device capable of speech recognition, the method comprising:
receiving an audio signal of a first language;
translating the received audio signal of the first language into text of a predetermined second language by referring to a memory;
retranslating the translated text of the second language into the first language by referring to the memory;
comparing the retranslated text of the first language with the received audio signal of the first language; and
modifying, according to the comparison result, part or all of the translated text of the second language and converting the modified text into a format that can be output as an audio signal.

The method of claim 1,
wherein the comparing step further comprises:
if the retranslated text of the first language differs from the received audio signal of the first language, displaying the matching part and the differing part differently; and
providing an option to modify the differing part automatically according to a user selection or data stored in the memory of the digital device.

The method of claim 2,
wherein the receiving step further comprises extracting feature points of the audio signal of the first language, the feature points including at least one of stress, intonation, voice, and volume.

The method of claim 3,
wherein the converting step further comprises outputting the translated text of the second language with a different stress, intonation, voice, or volume according to the feature points extracted from the audio signal of the first language.

The method of claim 4,
wherein the first language and the second language are different from each other, and
the digital device is at least one of a wearable device or a mobile device.

A digital device capable of speech recognition, comprising:
a memory;
a microphone that receives an audio signal of a first language; and
a controller that controls the memory and the microphone,
wherein the controller:
translates the received audio signal of the first language into text of a predetermined second language by referring to the memory;
retranslates the translated text of the second language into the first language by referring to the memory;
compares the retranslated text of the first language with the received audio signal of the first language; and
modifies, according to the comparison result, part or all of the translated text of the second language and converts it into a format that can be output as an audio signal.

The digital device of claim 6,
further comprising a display module that, if the retranslated text of the first language differs from the received audio signal of the first language, displays the matching part and the differing part differently, and
displays an option for modifying the differing part automatically according to a user selection or data stored in the memory of the digital device.

The digital device of claim 7,
wherein the controller extracts feature points of the audio signal of the first language, the feature points including at least one of stress, intonation, voice, and volume.

The digital device of claim 8,
further comprising a speaker that, under control of the controller, outputs the translated text of the second language with a different stress, intonation, voice, or volume according to the feature points extracted from the audio signal of the first language.

The digital device of claim 9,
wherein the first language and the second language are different from each other, and
the digital device is at least one of a wearable device or a mobile device.