
TW200407710A - Dialog control for an electric apparatus - Google Patents

Dialog control for an electric apparatus

Info

Publication number
TW200407710A
TW200407710A
Authority
TW
Taiwan
Prior art keywords
user
anthropomorphic
component
camera
patent application
Prior art date
Application number
TW092112722A
Other languages
Chinese (zh)
Other versions
TWI280481B (en)
Inventor
Martin Oerder
Original Assignee
Koninkl Philips Electronics Nv
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from DE10249060A1
Application filed by Koninkl Philips Electronics Nv
Publication of TW200407710A
Application granted
Publication of TWI280481B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)
  • Selective Calling Equipment (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

A device comprising means for picking up and recognizing speech signals and a method of controlling an electric apparatus are proposed. The device comprises a personifying element 14 which can be moved mechanically. The position of a user is determined and the personifying element 14, which may comprise, for example, the representation of a human face, is moved in such a way that its front side 44 points in the direction of the user's position. Microphones 16, loudspeakers 18 and/or a camera 20 may be arranged on the personifying element 14. The user can conduct a speech dialog with the device, in which the apparatus is represented in the form of the personifying element 14. An electric apparatus can be controlled in accordance with the user's speech input. A dialog of the user with the personifying element for the purpose of instructing the user is also possible.

Description

TECHNICAL FIELD

The invention relates to a device comprising means for picking up and recognizing speech signals, and to a method by which a user communicates with an electric apparatus.

Known speech recognition means can assign a picked-up acoustic speech signal to a corresponding word or word sequence. Speech recognition systems are usually combined with speech synthesis to form dialog systems for controlling electric apparatuses. Such a dialog with the user may serve as the only interface for operating the apparatus, or speech input and output may be one of several available means of communication.

PRIOR ART

US-A-6,118,888 describes a control device and a method of controlling an electric apparatus, for example a computer or an apparatus used in the field of entertainment electronics. To control the apparatus, the user has a plurality of input devices at his disposal. These are mechanical input devices, such as a keyboard or a mouse, as well as speech recognition means. The control device further comprises a camera which picks up the user's gestures and mimicry and processes them as further input signals. Communication with the user is realized in the form of a dialog, in which the system has several modes at its disposal for conveying information to the user. These include speech synthesis and speech output, and notably anthropomorphic images, for example images of a person, a human face or an animal.
This image is in the form of computer graphics on the display Display to the user. Although the dialog system has been used for various special applications, such as telephone information systems, applications in other fields such as home control electronics, entertainment electronics, etc. have not been widely recognized. 85329 200407710 Summary of the Invention One of the items of the present invention is to provide a device for picking up a component for identifying a signal, and a method for making an electrical device. The electrical device allows a user to set up Songzhuang τ First, Tian Wei said that the device can be easily operated by control. The purpose can be achieved by methods such as the first item in the scope of patent application, and / or the second item in the scope of patent application, which defines the better of the present invention. DETAILED DESCRIPTION The device according to the present invention includes an anthropomorphic element that can be moved mechanically and mechanically. It is the H of the device, and the device is an anthropomorphic conversation partner of the user. The implementation of anthropomorphic components can vary widely. For example, they can be moved by a motor relative to a fixed housing of an electrical device. The guillotine of the shell. The key is that the anthropomorphic element has a user-identifiable disease-free side (A side. If the front side is facing the user, he will feel that the device is " attention to listening, that is, it can be received Voice instructions. According to the invention, the device includes means for determining the position of the user. This can be achieved via, for example, a sound or an optical sensor. The movement element of the anthropomorphic element is controlled such that the front side of the anthropomorphic element faces The position of the user. This makes the user always feel that the device is ready to "listen" to his speech. 
According to a further embodiment of the invention, the personifying element comprises an anthropomorphic image. This may be the image of a person or an animal, but also of a fantasy figure such as a robot. An image of a human face is most readily accepted. It may be a realistic or a symbolic image, for example one showing only the outlines of eyes, nose and mouth.

The device preferably also comprises means for supplying speech signals. Speech recognition is indeed particularly important for controlling an electric apparatus; however, answers, confirmations, queries and the like can also be realized with speech output means. The speech output may comprise the reproduction of pre-stored speech signals as well as true speech synthesis. A complete dialog control can be realized with the speech output means. A dialog with the user may also be conducted purely for his entertainment.

According to a further embodiment of the invention, the device comprises a plurality of microphones and/or at least one camera. A speech signal can be picked up by a single microphone. When a plurality of microphones is used, however, a directional pick-up pattern can be achieved on the one hand, and on the other hand the user's position can be ascertained from the speech signals received by the several microphones. A camera may observe the surroundings of the device; with corresponding image processing, the user's position can also be determined from the picked-up images. The microphones, the camera and/or a loudspeaker for supplying speech signals may be arranged on the mechanically movable personifying element. For a personifying element in the form of a human head, for example, two cameras may be placed in the region of the eyes, a loudspeaker at the position of the mouth, and two microphones near the ears.

Means for recognizing the user are preferably provided.
This can be achieved, for example, by evaluating the picked-up image signals (face recognition) or the picked-up sound signals (speaker recognition). The device can thus determine the current user from among several persons in its surroundings and turn the personifying element toward that user.

The moving means for mechanically moving the personifying element can be configured in many different ways, for example as electric motors or hydraulic adjusting means. The personifying element as a whole may be moved by the moving means; preferably, however, the personifying element is only rotatable relative to a stationary part, for example about a horizontal and/or a vertical axis.

The device according to the invention may form part of an electric apparatus, for example an apparatus used in entertainment electronics (such as a television set or an audio and/or video reproduction apparatus). In that case the device represents the user interface of the apparatus, which may also comprise further operating means (a keyboard, etc.). Alternatively, the device according to the invention may be a separate device serving as a control device for one or more separate electric apparatuses. In that case the apparatuses to be controlled have an electric control terminal (for example a wireless terminal or a suitable control bus), via which the device controls them in accordance with the received speech commands of the user.

The device according to the invention may in particular serve as the user's interface to a data storage and/or inquiry system. For this purpose the device comprises an internal data memory, or it is connected to an external data memory, for example via a computer network or the Internet. In a dialog, the user can store data (telephone numbers, memos, etc.) or retrieve information (the time, news, the current television program, etc.).

In addition, the dialog with the user can be used to adjust parameters of the device itself and to change its configuration.

When a loudspeaker for supplying sound signals and microphones for picking up sound signals are provided, the signal processing may include interference suppression, i.e. the picked-up sound signals are processed in such a way that the portion originating from the loudspeaker is suppressed. This is particularly advantageous when the loudspeaker and the microphones are arranged close together in space, for example on the personifying element.

Besides controlling an electric apparatus as described above, the dialog with the user may also serve other purposes, such as information, entertainment or instruction of the user. According to a further embodiment, dialog means are provided with which a dialog for instructing the user can be conducted. The dialog is preferably such that prompts can be given to the user and his answers can be picked up. The prompts may be questions put to the user, for example about foreign-language vocabulary, where both the prompt (for example the definition of a word) and the answer (for example the word itself) are relatively short. The dialog takes place between the user and the personifying element and may be conducted visually and/or acoustically.

The invention also proposes a method in which a set of learning objects (for example foreign-language vocabulary) is stored, where for each learning object at least one question (for example a definition), one answer (for example the word) and a measure of the time elapsed since the user was last asked, or last answered the question correctly, are stored.
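A learning object as specified above (a prompt, an answer, and a measure of the time since the last correct answer) might be represented as follows. The field names, the exact-match comparison and the reset of the level on a wrong answer are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
import time

@dataclass
class LearningObject:
    question: str   # e.g. a definition or a prompt in the user's native language
    answer: str     # e.g. the foreign-language word
    level: int = 0  # learning level: consecutive correct answers (assumed scheme)
    last_asked: float = field(default_factory=time.time)  # time measure of the last query

    def record_answer(self, given: str) -> bool:
        """Compare the user's answer with the stored one and update the record."""
        correct = given.strip().lower() == self.answer.strip().lower()
        if correct:
            self.level += 1
            self.last_asked = time.time()  # reset the elapsed-time measure to zero
        else:
            self.level = 0  # assumption: a wrong answer drops the object back to level 0
        return correct
```

In the patent's learning dialog, `record_answer` would be fed the text produced by the automatic speech recognition rather than typed input.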
In the dialog, learning objects are selected and asked one by one: the user is asked the stored question, and his answer is compared with the stored answer. The selection of the learning object to be asked next takes the stored time measure into account, i.e. the time elapsed since the object was last asked, for example by means of a suitable memory model with an assumed forgetting rate. In addition to the time measure, a degree of relevance may be taken into account when selecting and evaluating each learning object.

These and other aspects of the invention will be more clearly understood with reference to the embodiments described hereinafter.

Fig. 1 is a block diagram of a control device 10 and an apparatus 12 controlled by it. The core of the control device 10 is a personifying element 14 facing the user. Microphones 16, a loudspeaker 18 and a position sensor for the user's position (here in the form of a camera 20) are arranged on the personifying element 14. Together these parts form a mechanical unit 22. The personifying element 14 and the mechanical unit 22 are rotated about a vertical axis by a motor 24. A central control unit 26 controls the motor 24 via a driver circuit 28. The personifying element 14 is a separate mechanical unit with a front side that the user can recognize unambiguously. The microphones 16, the loudspeaker 18 and the camera 20 are arranged on the personifying element 14, directed toward this front side.

The microphones 16 supply sound signals. These signals are picked up by a pick-up system 30 and processed by a speech recognition unit 32. The speech recognition result, i.e. the word sequence assigned to the picked-up sound signal, is passed to the central control unit 26.
The central control unit 26 also controls a speech synthesis unit 34, which supplies synthesized speech signals via a sound generation unit 36 and the loudspeaker 18.

The images picked up by the camera 20 are processed by an image processing unit 38, which determines the position of the user from the image signal supplied by the camera 20. This position is passed to the central control unit 26.

The mechanical unit 22 serves as a user interface: via it, the central control unit 26 receives inputs from the user (microphones 16, speech recognition unit 32) and answers the user (speech synthesis unit 34, loudspeaker 18). In this example the control device 10 is used to control an electric apparatus 12, for example an apparatus used in the field of entertainment electronics.

Fig. 1 shows the functional units of the control device 10 only symbolically. The various units, for example the central control unit 26, the speech recognition unit 32 and the image processing unit 38, may be present as separate assemblies in a concrete implementation. They may equally be realized purely in software, in which case the functionality of several or all of these units is provided by a program executed on a central processing unit.

The units need not be spatially close to one another or to the mechanical unit 22. Only the parts arranged on the mechanical unit 22, i.e. the personifying element 14 with the microphones 16, the loudspeaker 18 and the sensor 20, need be located there; they may be separated from the rest of the control device 10 and connected to it merely by a wired or wireless signal link.

In operation, the control device 10 continuously checks whether a user is in its vicinity. After the user's position has been determined, the central control unit 26 controls the motor 24 so that the front side of the personifying element 14 faces the user.
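One turn of the Fig. 1 signal flow (pick-up 30, speech recognition 32, central control 26, speech synthesis 34/36, loudspeaker 18) reduces to a small pipeline. The stand-in functions below are toy placeholders for illustration, not the actual signal-processing components of the patent.

```python
def recognize(audio: str) -> str:
    # Stand-in for the speech recognition unit 32: the "audio" is already text here.
    return audio.lower().strip()

def synthesize(text: str) -> str:
    # Stand-in for the speech synthesis unit 34 and sound generation unit 36.
    return f"[spoken] {text}"

def dialog_turn(audio: str, central_control) -> str:
    """One dialog turn: the recognized word sequence goes to the central
    control unit, whose reply is rendered as speech for the loudspeaker."""
    words = recognize(audio)
    reply = central_control(words)
    return synthesize(reply)
```

Here `central_control` is any callable mapping a recognized word sequence to a reply, mirroring the role of unit 26 between the recognition and synthesis paths.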
The image processing unit 38 also performs face recognition. When the camera 20 supplies images of several persons, face recognition determines which of them is a user known to the system, and the personifying element 14 is then turned toward this user. When several microphones are provided, their signals can be processed in such a way that a directional pick-up pattern pointing toward the known user position is obtained.

The image processing unit 38 may furthermore be implemented so as to "understand" the scene picked up by the camera 20 in the vicinity of the mechanical unit 22. The scene at hand can then be assigned to one of a number of predefined states. In this way the central control unit 26 can, for example, know whether one person or several persons are in the room. The unit can also recognize and classify the user's behaviour, for example whether the user is looking at the mechanical unit 22 while speaking, or is talking to another person. By evaluating the recognized state, the recognition performance can be improved considerably; for example, this avoids parts of a conversation between two persons being misinterpreted as speech commands.

In a dialog with the user, the central control unit determines his inputs and controls the apparatus 12 accordingly. The volume of a sound reproduction apparatus 12 may, for example, be controlled by the following dialog:
- The user changes his position and faces the personifying element 14. The personifying element 14 is continuously steered by the motor 24 so that its front side faces the user; for this purpose, the central control unit 26 of the device 10 controls the driver circuit 28 in accordance with the determined user position.
- The user gives a speech command, for example "television volume".
- The microphones 16 pick up the command, which is recognized by the speech recognition unit 32.
- The central control unit 26 reacts by asking, via the speech synthesis unit 34 and the loudspeaker 18: "raise or lower?".
- The user gives the speech command "lower". After recognition of this speech signal, the central control unit 26 controls the apparatus 12 so that the volume is reduced.

Fig. 2 is a perspective view of an electric apparatus 40 with an integrated control device. It shows the personifying element 14 of the control device 10, which can be rotated about a vertical axis relative to the stationary housing 42 of the apparatus 40. In this example the personifying element has the shape of a flat rectangle. The camera 20 and the loudspeaker 18 are arranged on its front side 44; two microphones 16 are arranged at the sides. The mechanical unit 22 is rotated by a motor (not shown) so that the front side always points in the direction of the user.

In a further embodiment (not shown), the device 10 of Fig. 1 is used not for controlling an apparatus 12 but for conducting a dialog whose purpose is to instruct the user. The central control unit 26 runs a learning program with which the user can learn a foreign language. A number of learning objects are stored. These objects are individual data records, each comprising a prompt (a word in the user's native language), an answer (the corresponding word in the foreign language), a measure of the relevance of the word (its frequency of occurrence in the language), a learning level, and a time measure of the time elapsed since the object was last asked.

The learning dialog is conducted by selecting and querying the data records one by one. In each case the user is given the prompt stored in the data record, i.e. it is shown on an optical display or played back as speech. The user's answer is then picked up, for example as input via a keyboard or, preferably, via the microphones and the automatic speech recognition 32, and is compared with the stored answer (the foreign-language word). The user is told whether his answer was judged correct; if not, he is told the correct answer and is given a further opportunity in a new round. After a data record has been processed in this way, its stored time measure is updated, i.e. the time since the last query is reset to zero. Thereafter the next data record is selected and queried.

The data record to be queried is selected by means of a memory model. A simple memory model is expressed by the formula P(k) = exp(-t(k) * r(c(k))), where P(k) is the probability that the user knows learning object k, exp is the exponential function, t(k) is the time elapsed since the object was last asked, c(k) is the learning level of the object, and r is a learning-level-specific error rate. Here t may represent real time; alternatively, the time t may be counted in learning steps. The learning level can be defined in various ways; one practicable scheme assigns a corresponding level to each object that has been answered correctly N times. For the error rate, a suitable fixed value may be assumed, or a suitable initial value is chosen and adjusted by a gradient algorithm.

The aim of the instruction is to maximize a measure of knowledge. This knowledge measure may be defined as the fraction of the entire set of learning objects that the user knows, where object k is known with probability P(k), possibly weighted by the relevance measure. Accordingly, in each step the question asked is the learning object with the lowest knowledge probability P(k). The knowledge measure can be computed after each step and displayed to the user.
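The memory model and the selection rule above translate directly into code. The formula P(k) = exp(-t(k) * r(c(k))) gives the probability that the user still knows object k, and each step asks the object with the lowest P(k). The concrete error-rate function r used here (decreasing with the learning level) is only an assumed placeholder; the text requires merely that r depend on the learning level.

```python
import math

def knowledge_probability(t_elapsed, level, rate=lambda c: 1.0 / (c + 1)):
    """P(k) = exp(-t(k) * r(c(k))): probability that object k is still known.
    `rate` is an assumed level-specific error rate; higher levels decay slower."""
    return math.exp(-t_elapsed * rate(level))

def next_question(objects):
    """Select the learning object with the lowest knowledge probability,
    i.e. the one most likely to have been forgotten."""
    return min(objects, key=lambda o: knowledge_probability(o["t"], o["level"]))

def knowledge_measure(objects):
    """Average P(k) over the set: the quantity the instruction tries to maximize."""
    return sum(knowledge_probability(o["t"], o["level"]) for o in objects) / len(objects)
```

The relevance weighting mentioned in the text could be added by multiplying each term of `knowledge_measure` by the stored relevance value of the object.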
In this way the method is optimized so that the user acquires the current set of learning objects as broadly as possible. With a good memory model, an effective learning strategy is thus achieved.

Many modifications and refinements of the query dialog described above are possible. A prompt (definition) may, for example, have several correct answers (words). The stored relevance measure may be used to emphasize the more relevant (more frequent) words. The set of learning objects may comprise several thousand words and may, for example, consist of the specific vocabulary of a given field of use (literature, business, technology, etc.).

In summary, the invention relates to a device comprising means for picking up and recognizing speech signals, and to a method of communicating with an electric apparatus. The device comprises a personifying element which can be moved mechanically. The position of a user is determined, and the personifying element, which may comprise for example the image of a human face, is moved in such a way that its front side points in the direction of the user's position. Microphones, loudspeakers and/or a camera may be arranged on the personifying element. The user can conduct a speech dialog with the device, in which the apparatus is represented by the personifying element. An electric apparatus can be controlled in accordance with the user's speech input. A dialog of the user with the personifying element for the purpose of instructing the user is also possible.

In the drawings:
Fig. 1 is a block diagram of the components of a control device;
Fig. 2 is a perspective view of an electric apparatus comprising a control device.
Explanation of the reference symbols of the drawings:
10 control device
12 apparatus
14 personifying element
16 microphone
18 loudspeaker
20 camera
22 mechanical unit
24 motor
26 central control unit
28 driver circuit
30 pick-up system
32 speech recognition unit
34 speech synthesis unit
36 sound generation unit
38 image processing unit
40 apparatus
42 stationary housing
44 front side

Claims (12)

Scope of patent application:

1. A device, comprising:
- means (30, 32) for picking up and recognizing speech signals; and
- an anthropomorphic element (14) having a front side (44), and motion means (24) for mechanically moving the anthropomorphic element (14), wherein:
- means (38) are provided for determining the position of a user; and
- the motion means (24) are controlled such that the front side (44) of the anthropomorphic element (14) points in the direction of the user's position.

2. A device as claimed in claim 1, wherein means (34, 36, 18) are provided for outputting speech signals.

3. A device as claimed in any of the preceding claims, wherein the anthropomorphic element (14) comprises an anthropomorphic representation, in particular an image of a human face.

4. A device as claimed in any of the preceding claims, wherein: a plurality of microphones (16) and/or at least one camera (20) are provided; and the microphones (16) and/or the camera (20) are preferably arranged on the anthropomorphic element (14).

5. A device as claimed in any of the preceding claims, provided with means for identifying at least one user.

6. A device as claimed in any of the preceding claims, wherein the motion means (24) can rotate the anthropomorphic element (14) about at least one axis.

7. A device as claimed in any of the preceding claims, provided with at least one external electrical apparatus (12) which is controlled in dependence on the speech signals.

8. A device as claimed in any of the preceding claims, wherein:
- at least one loudspeaker (18) for outputting sound signals is provided;
- at least one microphone (16) for picking up sound signals is provided; and
- a processing unit (30) is provided for processing the picked-up sound signals, in which signal components originating from the sound output by the loudspeaker (18) are suppressed.

9. A device as claimed in any of the preceding claims, provided with dialog means for instructing the user, wherein instructions are given to the user visually and/or acoustically during a dialog, and the user's answers are picked up via a keyboard and/or a microphone.

10. A device as claimed in claim 9, wherein the dialog means comprise means for storing a set of learning objects, wherein:
- for each learning object, at least one instruction, one answer, and a measure of the time the user needed to process the instruction are stored;
- the dialog means are formed such that learning objects can be selected and queried by presenting the instruction to the user and comparing the user's answer with the stored answer; and
- the stored time measure is taken into account when a learning object is selected.

11. A method of communication between a user and an electrical apparatus (12), comprising:
- determining the position of a user;
- moving an anthropomorphic element (14) such that the front side (44) of the anthropomorphic element (14) points in the direction of the user; and
- picking up and processing speech signals of the user.

12. A method as claimed in claim 11, wherein the electrical apparatus (12) is controlled in dependence on the picked-up speech signals.
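The control step of claim 11 can be sketched in a few lines: a detected face position in the camera image is mapped to a bearing angle, and the motion means rotate the element's front side to that bearing before speech pickup begins. This is a minimal illustrative sketch, not the patent's implementation; the 60-degree field of view, the single-axis rotation model, and all function and class names are assumptions.

```python
import math

def user_bearing(face_x, frame_width, fov_deg=60.0):
    """Map a detected face's horizontal pixel position to a bearing angle
    (radians) relative to the camera axis, assuming a horizontal field of
    view of fov_deg degrees. Corresponds to the position-determining
    means (38) of claim 1."""
    return (face_x / frame_width - 0.5) * math.radians(fov_deg)

class AnthropomorphicElement:
    """Minimal stand-in for the rotatable element (14): it only tracks the
    angle that its front side (44) currently faces."""
    def __init__(self):
        self.angle = 0.0

    def rotate_to(self, bearing):
        # Single-axis rotation, as in claim 6; a real device would drive
        # the motor (24) here.
        self.angle = bearing

def face_user(element, face_x, frame_width):
    """One control step of claim 11: turn the front side toward the user's
    position; speech pickup and recognition (30, 32) would follow."""
    element.rotate_to(user_bearing(face_x, frame_width))
    return element.angle
```

With a 640-pixel-wide frame, a face detected at the image centre (x = 320) yields a bearing of zero, while a face at the right edge yields half the assumed field of view.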
TW092112722A 2002-05-14 2003-05-09 A device for dialog control and a method of communication between a user and an electric apparatus TWI280481B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE10221490 2002-05-14
DE10249060A DE10249060A1 (en) 2002-05-14 2002-10-22 Dialog control for electrical device

Publications (2)

Publication Number Publication Date
TW200407710A true TW200407710A (en) 2004-05-16
TWI280481B TWI280481B (en) 2007-05-01

Family

ID=29421506

Family Applications (1)

Application Number Title Priority Date Filing Date
TW092112722A TWI280481B (en) 2002-05-14 2003-05-09 A device for dialog control and a method of communication between a user and an electric apparatus

Country Status (10)

Country Link
US (1) US20050159955A1 (en)
EP (1) EP1506472A1 (en)
JP (1) JP2005525597A (en)
CN (1) CN100357863C (en)
AU (1) AU2003230067A1 (en)
BR (1) BR0304830A (en)
PL (1) PL372592A1 (en)
RU (1) RU2336560C2 (en)
TW (1) TWI280481B (en)
WO (1) WO2003096171A1 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007533236A (en) * 2004-04-13 2007-11-15 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Method and system for sending voice messages
EP1766499A2 (en) 2004-07-08 2007-03-28 Philips Intellectual Property & Standards GmbH A method and a system for communication between a user and a system
US8689135B2 (en) 2005-08-11 2014-04-01 Koninklijke Philips N.V. Method of driving an interactive system and user interface system
EP1915676A2 (en) 2005-08-11 2008-04-30 Philips Intellectual Property & Standards GmbH Method for introducing interaction pattern and application functionalities
US8467672B2 (en) * 2005-10-17 2013-06-18 Jeffrey C. Konicek Voice recognition and gaze-tracking for a camera
US7697827B2 (en) 2005-10-17 2010-04-13 Konicek Jeffrey C User-friendlier interfaces for a camera
WO2007063447A2 (en) * 2005-11-30 2007-06-07 Philips Intellectual Property & Standards Gmbh Method of driving an interactive system, and a user interface system
JP2010206451A (en) * 2009-03-03 2010-09-16 Panasonic Corp Speaker with camera, signal processing apparatus, and av system
JP5263092B2 (en) 2009-09-07 2013-08-14 ソニー株式会社 Display device and control method
US9197736B2 (en) * 2009-12-31 2015-11-24 Digimarc Corporation Intuitive computing methods and systems
US9143603B2 (en) 2009-12-31 2015-09-22 Digimarc Corporation Methods and arrangements employing sensor-equipped smart phones
CN102298443B (en) * 2011-06-24 2013-09-25 华南理工大学 Smart home voice control system combined with video channel and control method thereof
CN102572282A (en) * 2012-01-06 2012-07-11 鸿富锦精密工业(深圳)有限公司 Intelligent tracking device
EP2699022A1 (en) * 2012-08-16 2014-02-19 Alcatel Lucent Method for provisioning a person with information associated with an event
FR3011375B1 (en) * 2013-10-01 2017-01-27 Aldebaran Robotics METHOD FOR DIALOGUE BETWEEN A MACHINE, SUCH AS A HUMANOID ROBOT, AND A HUMAN INTERLOCUTOR, COMPUTER PROGRAM PRODUCT AND HUMANOID ROBOT FOR IMPLEMENTING SUCH A METHOD
US9311639B2 (en) 2014-02-11 2016-04-12 Digimarc Corporation Methods, apparatus and arrangements for device to device communication
CN104898581B (en) * 2014-03-05 2018-08-24 青岛海尔机器人有限公司 A kind of holographic intelligent central control system
EP2933070A1 (en) 2014-04-17 2015-10-21 Aldebaran Robotics Methods and systems of handling a dialog with a robot
JP6739907B2 (en) * 2015-06-18 2020-08-12 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America Device specifying method, device specifying device and program
JP6516585B2 (en) * 2015-06-24 2019-05-22 パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカPanasonic Intellectual Property Corporation of America Control device, method thereof and program
TW201707471A (en) * 2015-08-14 2017-02-16 Unity Opto Technology Co Ltd Automatically controlled directional speaker and lamp thereof enabling mobile users to stay in the best listening condition, preventing the sound from affecting others when broadcasting, and improving the convenience of use in life
TWI603626B (en) * 2016-04-26 2017-10-21 音律電子股份有限公司 Speaker apparatus, control method thereof, and playing control system
EP3611941A4 (en) * 2017-04-10 2020-12-30 Yamaha Corporation Voice providing device, voice providing method, and program
CN110412881B (en) * 2018-04-30 2022-10-14 仁宝电脑工业股份有限公司 Separated mobile intelligent system and operation method and base device thereof
EP3685718A1 (en) * 2019-01-24 2020-07-29 Millo Appliances, UAB Kitchen worktop-integrated food blending and mixing system
JP7026066B2 (en) * 2019-03-13 2022-02-25 株式会社日立ビルシステム Voice guidance system and voice guidance method
US11380094B2 (en) 2019-12-12 2022-07-05 At&T Intellectual Property I, L.P. Systems and methods for applied machine cognition

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2239691C (en) * 1995-12-04 2006-06-06 Jared C. Bernstein Method and apparatus for combined information from speech signals for adaptive interaction in teaching and testing
US6118888A (en) * 1997-02-28 2000-09-12 Kabushiki Kaisha Toshiba Multi-modal interface apparatus and method
IL120855A0 (en) * 1997-05-19 1997-09-30 Creator Ltd Apparatus and methods for controlling household appliances
US6077085A (en) * 1998-05-19 2000-06-20 Intellectual Reserve, Inc. Technology assisted learning
EP1122038A4 (en) * 1998-06-23 2009-06-17 Sony Corp Robot and information processing system
JP4036542B2 (en) * 1998-09-18 2008-01-23 富士通株式会社 Echo canceller
JP2001157976A (en) * 1999-11-30 2001-06-12 Sony Corp Robot control device, robot control method, and recording medium
AU4449801A (en) * 2000-03-24 2001-10-03 Creator Ltd. Interactive toy applications
JP4480843B2 (en) * 2000-04-03 2010-06-16 ソニー株式会社 Legged mobile robot, control method therefor, and relative movement measurement sensor for legged mobile robot
GB0010034D0 (en) * 2000-04-26 2000-06-14 20 20 Speech Limited Human-machine interface apparatus
JP4296714B2 (en) * 2000-10-11 2009-07-15 ソニー株式会社 Robot control apparatus, robot control method, recording medium, and program
US20020150869A1 (en) * 2000-12-18 2002-10-17 Zeev Shpiro Context-responsive spoken language instruction

Also Published As

Publication number Publication date
RU2336560C2 (en) 2008-10-20
PL372592A1 (en) 2005-07-25
AU2003230067A1 (en) 2003-11-11
JP2005525597A (en) 2005-08-25
US20050159955A1 (en) 2005-07-21
EP1506472A1 (en) 2005-02-16
WO2003096171A1 (en) 2003-11-20
TWI280481B (en) 2007-05-01
RU2004136294A (en) 2005-05-27
BR0304830A (en) 2004-08-17
CN100357863C (en) 2007-12-26
CN1653410A (en) 2005-08-10

Similar Documents

Publication Publication Date Title
TW200407710A (en) Dialog control for an electric apparatus
US8723984B2 (en) Selective sound source listening in conjunction with computer interactive processing
US11948241B2 (en) Robot and method for operating same
JP3670180B2 (en) hearing aid
US11803579B2 (en) Apparatus, systems and methods for providing conversational assistance
CN110035250A (en) Audio-frequency processing method, processing equipment, terminal and computer readable storage medium
JP2019220848A (en) Data processing apparatus, data processing method and program
JP2007050461A (en) Robot control system, robot device, and robot control method
EP3684076B1 (en) Accelerometer-based selection of an audio source for a hearing device
US20210409876A1 (en) Method for Adjusting a Hearing Aid Device and System for Carrying Out the Method
JP6798258B2 (en) Generation program, generation device, control program, control method, robot device and call system
JP7087804B2 (en) Communication support device, communication support system and communication method
JP2015192332A (en) Situation grasping unit
KR20030024904A (en) Device having speech-control means and having test-means for testing a function of the speech-control means
US20230351261A1 (en) Learning data generating device, learning data generating method, learning device, learning method, data structure, information processor, and acoustic treatment device
CN112820265B (en) Speech synthesis model training method and related device
JP7286303B2 (en) Conference support system and conference robot
KR20040107523A (en) Dialog control for an electric apparatus
Okuno et al. Realizing audio-visually triggered ELIZA-like non-verbal behaviors
Okuno et al. Realizing personality in audio-visually triggered non-verbal behaviors
CN114203148A (en) Analog voice playing method and device, electronic equipment and storage medium
JP2007030050A (en) Robot control device, robot control system, robot device and robot control method
JP2005123959A (en) High-presence communication conference apparatus
WO2024185334A1 (en) Information processing device, information processing method, and program
US20230306828A1 (en) Apparatus, method and computer program for identifying acoustic events, in particular acoustic information and/or warning signals

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees