DE102014108371B4

DE102014108371B4 - Method for voice control of entertainment electronic devices

Info

Publication number: DE102014108371B4
Application number: DE102014108371.7A
Authority: DE
Inventors: Roland Gaidzik
Original assignee: Loewe Technologies GmbH
Current assignee: Loewe Ip Holding Ltd Cy
Priority date: 2014-06-13
Filing date: 2014-06-13
Publication date: 2016-04-14
Anticipated expiration: 2034-06-14
Also published as: DE102014108371A1

Abstract

A method for controlling an entertainment electronic device, wherein the device has at least a control unit, a communication unit, a display unit, and means at least for receiving voice inputs, wherein television broadcasts can be received with the device, and wherein, after receiving a voice input from a user, the control unit analyzes this voice input and determines whether it consists of individual command words or of a complete sentence containing command words, and the control unit generates a control function from the evaluated command words if the user's voice input contains at least a sequence of two command words, and proposes to the user, visually and/or acoustically, a voice input that contains the command words received from the user and is formed as a complete sentence.

Description

The present invention relates to a method for voice control of entertainment electronic devices. The entertainment electronic device has at least a control unit, a communication unit, a display unit, and means at least for receiving voice inputs.

Entertainment electronic devices, such as television sets, can be controlled by a user through voice input. To do so, the voice input mode of the device is activated, and the user then enters commands for operating it. In most cases the device is controlled through individual commands. This is intended to ensure that the device understands the user's input correctly and that words not intended for control do not cause wrong commands to be executed.

Voice control systems for electronic devices are also known that allow a user to issue commands to the device in natural language. The commands are issued, for example, in a complete sentence; the entertainment electronic device recognizes the command words contained in the sentence and performs the corresponding control. An example can be found in DE 195 33 541 C1. That patent specifies a method for automatically controlling one or more devices by voice commands or by voice dialogue in real-time operation, together with apparatus for carrying out the method. The voice operating system is based on methods for speech output, speech signal preprocessing, speech recognition, and syntactic-grammatical post-processing, as well as on dialogue, sequence, and interface control. Syntax and command structure are fixed during real-time dialogue operation. Commands are entered as connected speech, and the number of words from which a voice command is formed is variable.

Devices are also known that filter out disturbing background noise during voice input and identify a user authorized to control the device by means of voice analysis. This, too, is known from the aforementioned patent DE 195 33 541 C1.

US 6 553 345 B1 discloses a remote control for an entertainment electronic device that has a microphone so that voice control can be performed through it. A remote control with a microphone for voice control is also described in US 2005/0114141 A1.

The object of the present invention is to improve a method for controlling an entertainment electronic device, simplifying the control and adapting it better to the user's input habits in order to achieve more reliable recognition and evaluation of control commands.

This object is achieved by a method having the method steps specified in claim 1.

Advantageous developments of the invention are specified in detail in the dependent claims.

In a method according to the invention for controlling an entertainment electronic device, the device has at least a control unit, a communication unit, a display unit, and means at least for receiving voice inputs, and is designed to receive television broadcasts. After receiving a voice input from a user, the control unit analyzes it and determines whether it consists of individual command words or of a complete sentence containing command words. The control unit generates a control function from the evaluated command words if the user's voice input contains at least a sequence of two command words. The control unit proposes to the user, visually and/or acoustically, a voice input that contains the command words received from the user and is formed as a complete sentence. If, for example, the user speaks a control command (voice input) that consists of two individual command words but is not formed as a complete sentence, the control unit indicates to the user, visually and/or acoustically, that an improved voice input is possible and how to make it. Systems for controlling entertainment electronic devices frequently misinterpret input, particularly when individual command words (command language) are used. If, however, a complete sentence is spoken as the voice input, evaluating the sentence together with the command words it contains allows more reliable control. The control unit therefore assists the user by indicating that natural-language voice input is possible and by offering a suggestion for such an input.

The method prevents a control function from being executed upon input of a single command word. If, for example, a user is having a conversation and uses a command word such as "off", no control function is carried out for the entertainment electronic device. If, however, the user speaks a sentence containing the command words "off" and "television" (for example, "switch television off"), the control unit recognizes a command for the device and generates a control function, e.g. switching off the television.

Various criteria that define the presence of a sequence of two command words can be specified for the analysis and subsequent checking of the voice input. For example, a certain maximum number of words may be permitted between two presumed command words; exceeding this number argues against the presence of a sequence of two command words. In addition, synonyms of the recognized command words are taken into account when analyzing the user's voice input in order to recognize a sequence of two command words. To determine whether two recognized command words belong to a sequence of at least two command words according to the invention, the control unit can detect pauses in the user's voice input and thereby delimit the sections used for recognizing a sequence.
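The word-gap and synonym criteria can be illustrated with a minimal Python sketch. It is not the patented implementation: the vocabulary `SYNONYMS`, the limit `MAX_GAP`, and the function name `find_command_sequence` are assumptions made for illustration, and pause detection is not modelled.

```python
# Hypothetical vocabulary: spoken words (including synonyms) mapped to
# canonical command words. The entries are illustrative assumptions.
SYNONYMS = {
    "off": "off",
    "out": "off",
    "tv": "tv",
    "television": "tv",
    "telly": "tv",
}

# Assumed maximum number of non-command words allowed between two command words.
MAX_GAP = 3


def find_command_sequence(words, max_gap=MAX_GAP):
    """Return the first pair of canonical command words separated by at most
    max_gap other words, or None if the input contains no such sequence."""
    previous = None  # (index, canonical word) of the last command word seen
    for index, word in enumerate(words):
        canonical = SYNONYMS.get(word.lower())
        if canonical is None:
            continue  # not a command word or known synonym
        if previous is not None and index - previous[0] - 1 <= max_gap:
            return (previous[1], canonical)
        previous = (index, canonical)
    return None
```

Under these assumptions, `find_command_sequence("switch television off".split())` yields a two-word sequence, while a lone "off" in conversation yields `None`, so no control function would be generated.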

The method according to the invention therefore reliably recognizes control commands for the entertainment electronic device that are actually present.

The control unit can generate a display on the display unit that shows, in text form, the voice input received from the user and/or a voice input proposed by the control unit, and/or the control unit can have a speech output that outputs at least one interpreted voice input. An interpreted voice input comprises a spoken rendering of what the control unit has recognized as a control command for the entertainment electronic device. The interpreted voice input can thus differ from the user's actual voice input. It can, for example, comprise synonyms of two recognized command words, so that the command words actually entered by the user do not appear in it. Additionally or alternatively, a display can be generated on the display unit that reproduces verbatim the voice input recognized from the user. Alternatively or additionally, a voice input proposed by the control unit can be displayed in text form; like the interpreted voice input, the proposed input need not have the same wording as the user's actual voice input.

The logical links between the individual command words received from the user in a voice input can be stored, and the control unit can suppress a speech output if, within a predefinable period, a voice input with an essentially similar link between the individual command words is received from the user more often than a definable number of times. For example, the user has repeatedly entered a sequence of two command words for controlling the entertainment electronic device as individual commands, as a result of which the control unit has output a hint about natural-language voice input. If the user nevertheless does not change this behavior and continues to use individual command words, the control unit can suppress the hint. In this case the control unit "learns" that the user does not want natural-language voice input, and it prevents frequently recurring, identical hints about using voice control from discouraging the user from using it. Frequently recurring hints can give a user the impression of operating the entertainment electronic device incorrectly. To avoid possible errors, the user would then forgo voice control altogether. Because the control unit suppresses frequently recurring, identical speech outputs, the user is not discouraged.

Here too, the control unit can match the command words received as commands against synonyms, so that the hint about natural-language voice input is also suppressed when the command words vary but have the same meaning. The logical links ensure that the context in which the at least two command words stand is the same or at least similar. For example, a hint should not be suppressed merely because a sequence of at least two command words contains the words "today" and "program". A logical link here also covers whether the user's voice inputs are intended for controlling the entertainment electronic device (volume, brightness, teletext, etc.) or for retrieving information (e.g. searching EPG data for particular films).
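The suppression rule described above can be sketched as a sliding-window counter. This is an illustrative assumption, not the patent's implementation: the class name `HintSuppressor`, its parameters, and the choice of keying on a category plus the set of canonical command words (after synonym matching) are all hypothetical.

```python
from collections import defaultdict, deque


class HintSuppressor:
    """Decides whether the natural-language hint should still be output,
    based on how often a similar command-style input arrived recently."""

    def __init__(self, max_count=3, window_seconds=3600.0):
        self.max_count = max_count            # assumed "definable number"
        self.window_seconds = window_seconds  # assumed "predefinable period"
        self._history = defaultdict(deque)    # key -> timestamps of inputs

    def should_show_hint(self, command_words, category, now):
        # The key stands in for the "logical link": the purpose of the input
        # (e.g. "control" or "search") plus the canonical command words.
        key = (category, frozenset(command_words))
        times = self._history[key]
        # Drop occurrences that have fallen out of the time window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        times.append(now)
        # Suppress once the input occurred more often than max_count.
        return len(times) <= self.max_count
```

Because the key includes the category, the same pair of words used for device control and for an EPG search would be counted separately, which mirrors the requirement that the context be the same or at least similar.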

The control unit can suppress a speech output if that particular speech output has been issued more often than a definable number of times within a predefinable period. If the control unit has issued a particular speech output several times, generating it again can likewise be dispensed with for the reasons given above. For example, the user uses different command words within a sequence, and the control unit has found no match on the basis of the criteria above. However, from its own output the control unit recognizes that the speech output has already been issued several times, and prevents it from being issued again.

The voice input can be made via a remote control for the entertainment electronic device. The remote control has a microphone and a communication unit, and the user's voice inputs received through the microphone are sent via the communication unit of the remote control to the communication unit of the device and passed to the control unit. Voice input via a remote control avoids the risk that background noise or other people corrupt the user's voice inputs. If no remote control is used for voice input, the entertainment electronic device itself has a microphone and corresponding means for processing the voice inputs.

To generate the interpreted speech output, a semantic analysis of the sequence of at least two command words can be carried out with regard to the meaning of the command words in their context, so that the interpreted voice input reproduces the content of the user's voice input. The interpreted voice input can therefore be worded entirely differently from the user's voice input. If, for example, a user says "that is much too quiet for me", the interpreted voice input can comprise "increase the volume of the television set".

The method can be used in an entertainment electronic device, which can be a television set, a laptop, a PC, a smartphone, a tablet computer, or a set-top box connected to a screen. The method can also be used in a combination of such devices with a remote control, or in a remote control, which can be a conventional remote control, a smartphone, a tablet computer, a PC, a laptop, a PDA, or a mobile phone.

Further advantages, features, applications, and embodiments emerge from the following description of the figure, which presents embodiments that are to be understood as non-limiting.

The drawing shows:

Fig. 1 a schematic flow diagram of a method for controlling an entertainment electronic device by means of voice input.

An entertainment electronic device, for example a television set, is put into a voice input mode by special software, either in response to a user input or automatically after the device is switched on. The device has a microphone and means for receiving and evaluating voice inputs. In addition, a remote control for the device has a microphone and means for sending voice inputs to the device. The entertainment electronic device can thus be controlled by a user's voice inputs. In a first voice input mode, the user enters the voice inputs via the remote control. In a second voice input mode, the user enters the voice inputs via the entertainment electronic device itself.

The entertainment electronic device has a control unit that carries out the functions for controlling the device. The device has means for receiving television broadcasts via cable, satellite, antenna, and/or the Internet. In addition, the control unit can send the user's voice inputs over an Internet connection to an external evaluation device for processing; that device evaluates the user's voice input and sends the evaluated voice input to the control unit of the entertainment electronic device.

Alternatively, the control unit of the entertainment electronic device can evaluate the user's voice input on its own. The device can also have a memory that stores the voice inputs entered by a user for controlling the device and for searching for information about receivable television broadcasts, together with the speech outputs and interpreted voice inputs that the device issued in response.

Once the entertainment electronic device is in a voice input mode and a user or viewer issues a voice input, the control unit of the device or an external evaluation device analyzes and evaluates the voice input.

Fig. 1 schematically shows a flow chart of the voice control. In step 10, the method for voice control of the entertainment electronic device is started. Subsequently, in step 12, a voice input is received from a user. The voice input can be received via recording and receiving means of the entertainment electronic device or via recording and receiving means of a remote control for the entertainment electronic device. In the latter case, the remote control transmits the voice input to the control unit of the entertainment electronic device via infrared or short-range network technologies, using the communication units of the two devices. The voice input is then analyzed in step 14, in which command words contained in the voice input are recognized as such. The analysis in step 14 also includes a semantic analysis of the meaning of the words and of the combination of the at least two command words with regard to a particular control action or control request. The analysis further includes matching against synonyms of particular command words, so that further command words that are to be regarded as equivalent can be recognized by way of certain predefined command words. The combination of at least two command words is also checked.
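The synonym matching and the check for a sequence of at least two command words in step 14 could be sketched roughly as follows. This is only an illustration: the vocabulary, the names `SYNONYMS`, `extract_command_words` and `has_command_sequence`, and the simple whitespace tokenization are assumptions and are not taken from the description, which does not specify how the analysis is implemented.

```python
# Hypothetical vocabulary: canonical command word -> spoken forms treated as equivalent.
SYNONYMS = {
    "record": {"record", "tape", "save"},
    "channel": {"channel", "station"},
    "volume": {"volume", "sound", "loudness"},
    "up": {"up", "louder", "higher"},
}

# Reverse lookup: every spoken form maps to its canonical command word.
CANONICAL = {form: word for word, forms in SYNONYMS.items() for form in forms}


def extract_command_words(utterance: str) -> list[str]:
    """Return the canonical command words found in the utterance, in spoken order."""
    cleaned = utterance.lower().replace(",", " ").replace(".", " ")
    return [CANONICAL[token] for token in cleaned.split() if token in CANONICAL]


def has_command_sequence(utterance: str) -> bool:
    """True if the utterance contains a sequence of at least two command words."""
    return len(extract_command_words(utterance)) >= 2
```

With this sketch, the terse input "sound louder" and the full sentence "Please make the sound louder." both reduce to the same pair of canonical command words, which is the property the synonym matching is meant to provide.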

In step 16, the user's voice input is checked to determine whether it was uttered as a complete sentence. If the control unit or an external evaluation unit determines that the user's voice input was uttered as a complete sentence in natural language, the content of the voice input is reproduced in step 20. This reproduction can be visual, acoustic, or both: the content of the voice input is reproduced visually via the display unit of the entertainment electronic device and acoustically via the loudspeakers. For this reproduction, the control unit of the entertainment electronic device generates an interpreted voice input. The interpreted voice input may therefore contain none of the words that occur in the user's natural voice input.

If the user's voice input was recognized in step 16 as not being a complete sentence, the control unit of the entertainment electronic device or the external evaluation device accesses a memory in which voice inputs of the user and voice outputs of the entertainment electronic device (e.g., interpreted voice inputs) are stored with time and date information, and compares the analyzed voice input, which is not formed as a complete sentence, with the data stored in the memory. If this comparison shows that the control commands currently issued by the user have been issued several times within a determinable period, and if the interpretation of the voice input shows that the associated command is the same as a command linked to the voice inputs stored in the memory, the content of the voice input is likewise reproduced in step 20.
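The comparison with the stored, time-stamped inputs could look roughly like the following sketch. The class name `InputHistory`, the seven-day window and the threshold of three repetitions are hypothetical values chosen for illustration; the description only speaks of "several times" within a "determinable period".

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class InputHistory:
    """Hypothetical store of interpreted commands with their timestamps."""
    window: timedelta = timedelta(days=7)  # assumed "determinable period"
    min_repeats: int = 3                   # assumed meaning of "several times"
    entries: list[tuple[datetime, str]] = field(default_factory=list)

    def record(self, command: str, when: datetime) -> None:
        """Store one interpreted command together with its time and date."""
        self.entries.append((when, command))

    def is_habitual(self, command: str, now: datetime) -> bool:
        """True if the same command was issued often enough within the window."""
        cutoff = now - self.window
        count = sum(
            1 for when, stored in self.entries
            if stored == command and cutoff <= when <= now
        )
        return count >= self.min_repeats
```

If `is_habitual` returns true for a terse input, the device would proceed to the plain reproduction of step 20 and skip the hint about natural voice input.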

If no similar voice input has been issued within a defined period for the voice input currently issued by a user, the control unit of the entertainment electronic device generates, in step 22, a visual and/or acoustic hint about natural voice input. The hint comprises an interpreted voice input as well as prompts that show or tell the user or viewer that he can communicate with the device in natural language. In step 22, the content of the voice input is therefore reproduced as in step 20, and the user or viewer is additionally informed that natural voice input is possible.

After the content of the voice input has been reproduced in step 20, or the hint about natural voice input has been given in step 22, the recognized command is executed in step 24. Executing the command can include switching the entertainment electronic device on or off, changing its volume, switching it to another channel, recording a program or marking it for later, calling up a control or information menu, and searching for television programs or media content of interest to a viewer or user.
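One way to picture the execution of a recognized command in step 24 is a small dispatch table that maps command names to handlers. Everything in this sketch is hypothetical: the `Television` state, the command names and the handler functions are illustrative assumptions, and only a few of the command types listed above are covered.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Television:
    """Hypothetical device state used only for this illustration."""
    powered: bool = True
    volume: int = 10
    channel: int = 1
    recordings: list[str] = field(default_factory=list)


def set_power(tv: Television, argument: str) -> None:
    tv.powered = argument == "on"


def change_volume(tv: Television, argument: str) -> None:
    # The argument is a signed step such as "+5" or "-3"; clamp to 0..100.
    tv.volume = max(0, min(100, tv.volume + int(argument)))


def switch_channel(tv: Television, argument: str) -> None:
    tv.channel = int(argument)


def record_program(tv: Television, argument: str) -> None:
    tv.recordings.append(argument)


# Dispatch table: recognized command name -> handler.
HANDLERS: dict[str, Callable[[Television, str], None]] = {
    "power": set_power,
    "volume": change_volume,
    "channel": switch_channel,
    "record": record_program,
}


def execute(tv: Television, command: str, argument: str) -> bool:
    """Run the handler for a recognized command; return False if it is unknown."""
    handler = HANDLERS.get(command)
    if handler is None:
        return False
    handler(tv, argument)
    return True
```

A table of this kind keeps the recognition stage separate from the device actions, so further commands such as menu calls or searches could be added without touching the analysis.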

After the command has been executed, the voice control method ends, for example when the voice input mode is terminated. Alternatively, after the command has been executed in step 24, the user can utter a further voice input, in which case the method is run through again from step 12.

Executing a command therefore does not necessarily mean the end of the method. Rather, a user or viewer will make a further voice input after the content of the voice input has been reproduced in step 20, so that the steps described above are carried out again. The dashed connecting lines in Fig. 1 indicate that a voice input by the user or viewer can follow the reproduction of the content of the voice input in step 20 or the execution of a command in step 24, so that a dialogue with the entertainment electronic device takes place.
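The overall flow, including the loop back to step 12, can be summarized in a short sketch that lists the step numbers visited for each voice input. The function names are hypothetical, and it is an assumption that step 18 ("comparison" in the list of reference numbers) denotes the comparison with the stored inputs, since the description does not number that step explicitly.

```python
def trace_steps(complete_sentence: bool, habitual: bool) -> list[int]:
    """Return the step numbers visited for one voice input."""
    steps = [12, 14, 16]  # receive, analyze, check for a complete sentence
    if not complete_sentence:
        steps.append(18)  # assumed: comparison with the stored inputs
    # Plain reproduction (20) for full sentences or habitual terse inputs,
    # otherwise reproduction plus the hint about natural voice input (22).
    steps.append(20 if (complete_sentence or habitual) else 22)
    steps.append(24)      # execute the recognized command
    return steps


def run_session(inputs: list[tuple[bool, bool]]) -> list[int]:
    """Trace a whole session; each item is (complete_sentence, habitual)."""
    trace = [10]          # start
    for complete_sentence, habitual in inputs:
        trace.extend(trace_steps(complete_sentence, habitual))
    trace.append(26)      # end, e.g. when the voice input mode is terminated
    return trace
```

A session with a full sentence followed by an unfamiliar terse input would thus pass through step 20 first and step 22 second before ending at step 26.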

A voice input uttered by the user or viewer in natural language as a complete sentence enables the entertainment electronic device to determine the desired control command or search query more reliably. The context and the linkage of the individual recognized command words make it possible to determine specifically which content a viewer or user wants to search for or which control action he wants to perform. The hints about natural voice input not only assist the viewer or user but also teach him how best to handle the entertainment electronic device and to use voice inputs. This noticeably increases the reliability of speech recognition. If, however, a viewer or user has reservations about this kind of voice input, the entertainment electronic device recognizes that voice inputs are repeatedly being entered as individual command words and refrains from generating and outputting hints about natural voice input.

The remote control can be a conventional remote control handset, a smartphone, a tablet, a PC, a laptop, a PDA, or a mobile phone. If the remote control is implemented by a smartphone, for example, the smartphone has means for recording voice inputs from a user. The smartphone also has facilities for short-range communication (e.g., Bluetooth) with the entertainment electronic device (e.g., a television set). Application software (an "app") for controlling the entertainment electronic device is installed on the smartphone; once activated by the user, it can communicate with the television set, receive control commands from the user, and send them to the television set. The application software of the smartphone can also perform the interpretation and analysis of the user's voice inputs. In that case, the smartphone (acting as the remote control) transmits an interpreted voice input to the control unit of the television set. Furthermore, the display of the smartphone can show the voice input actually uttered by the user, the interpreted voice input, a hint about natural voice input including a possible suggested voice input, and, for example, search results, the search results being sent from the television set to the smartphone. The loudspeakers of the smartphone can reproduce all voice inputs and voice outputs (e.g., hint about natural voice input, suggested voice input, search results, confirmation of changes, etc.).
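When the smartphone app performs the interpretation itself, the data it passes to the television set could be modeled as a small message such as the one sketched below. The message layout, the field names and the JSON encoding are purely hypothetical assumptions for illustration; the description does not define a message format or a transport encoding.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VoiceMessage:
    """Hypothetical payload sent from the smartphone app to the television set."""
    raw_input: str                    # voice input actually uttered by the user
    interpreted_input: str            # interpretation produced by the app
    suggestion: Optional[str] = None  # suggested full-sentence input, if any


def encode(message: VoiceMessage) -> bytes:
    """Serialize the message for transmission over the short-range link."""
    return json.dumps(asdict(message)).encode("utf-8")


def decode(data: bytes) -> VoiceMessage:
    """Rebuild the message on the receiving side."""
    return VoiceMessage(**json.loads(data.decode("utf-8")))
```

Keeping the raw input, the interpretation and the optional suggestion in one message would let either device display or speak all three, as described for the smartphone display and loudspeakers.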

List of reference numbers

10: Start
12: Voice input
14: Analysis
16: Check
18: Comparison
20: Reproduction
22: Hint
24: Execution
26: End

Claims

A method for controlling an entertainment electronic device, wherein the entertainment electronic device comprises at least a control unit, a communication unit, a display unit, and means at least for receiving voice inputs, wherein television programs are receivable with the entertainment electronic device, and wherein, after receiving a voice input from a user, the control unit analyzes this voice input and determines whether it consists of individual command words or of a complete sentence containing command words, and the control unit generates a control function on the basis of the evaluated command words if the user's voice input contains at least a sequence of two command words, and proposes to the user, visually and/or acoustically, a voice input that contains the command words received from the user and is formed as a complete sentence.

The method of claim 1, wherein the control unit generates a display on the display unit that shows, in text form, the voice input received from the user and/or a voice input proposed by the control unit, and/or the control unit has a voice output that outputs at least one interpreted voice input.

The method of claim 1 or 2, wherein the logical links between the individual command words contained in voice inputs received from the user are stored, and the control unit suppresses a voice output if a voice input with a similar linkage of the individual command words is received from the user more than a determinable number of times.

The method of claim 1 or 2, wherein the control unit suppresses a voice output when a particular voice output has been output more than a determinable number of times within a predeterminable time period.

Method according to one of claims 1 to 4, wherein the voice input is made via a remote control for the entertainment electronic device, the remote control comprises a microphone and a communication unit, and the voice input received from the user via the microphone is sent via the communication unit of the remote control to the communication unit of the entertainment electronic device and passed to the control unit.

Method according to one of the preceding claims, wherein, to generate the interpreted voice input, a semantic analysis of the sequence of at least two command words is performed with respect to the meaning of the at least two command words in their context, and the interpreted voice input reproduces the content of the user's voice input.

Method according to one of the preceding claims, characterized in that it is used in a television set, a laptop, a PC, a smartphone, a tablet or in a set-top box connected to a screen.