US20100267345A1 - Method and System for Preparing Speech Dialogue Applications - Google Patents
Method and System for Preparing Speech Dialogue Applications
- Publication number
- US20100267345A1 (application US12/223,916)
- Authority
- US
- United States
- Prior art keywords
- local
- speech dialog
- applications
- radio
- server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Definitions
- Described below are a method and a system for providing speech dialogue applications on mobile terminals, in particular on mobile terminals in vehicles.
- command and control speech dialogue applications serve the purpose of inputting speech commands, for example for voice operation of a navigation system inside a vehicle.
- Information dialogue systems, by contrast, are not used to input commands to control a device but enable the user to search for information on any desired subject, for example hotel information.
- information provided always has to be up to date.
- the corresponding speech dialogue applications can be constantly, for example hourly or daily, updated.
- a GSM link is not suitable for transmitting speech dialogue applications for the retrieval of information as such speech dialogue applications include very large volumes of data.
- a WLAN link is not suitable owing to its short range.
- One advantage of the method is that the speech dialogue applications can be made available at the same time to numerous mobile terminals.
- a further advantage of the method is that the speech dialogue applications can be updated easily and at very short intervals.
- the speech dialogue application exhibits background system data.
- each speech dialogue application exhibits a time stamp showing the point in time when it was produced.
- the description language is an XML description language.
- each speech dialogue application exhibits a name for its identification.
- the background system data are programmed in Java.
- the background system data are also transmitted with an associated speech dialogue application by digital radio to the mobile terminals.
- the background system data are retrieved via a bidirectional mobile radio interface of the mobile terminal.
- the speech dialogue applications are produced in a server which is connected to a data network.
- the data network is formed by the internet.
- the produced speech dialogue applications are stored in a first directory of the server and copied from the first directory of the server to a second directory of the server when the respective speech dialogue application is recognized as being complete.
- the speech dialogue applications copied to the second directory of the server are transmitted by a radio transmitter to a radio receiver of the mobile terminal at regular intervals.
- the speech dialogue applications received by the radio receiver of the mobile terminal are stored in a first directory of the mobile terminal and then copied from the first directory of the mobile terminal to a second directory of the mobile terminal when the respective time stamp of the speech dialogue application indicates that the speech dialogue application is younger than a corresponding speech dialogue application stored in the second directory of the mobile terminal.
- the speech dialogue applications are generated by the server automatically from local internet sites.
- the speech dialogue applications are transmitted by the radio transmitter to the radio receivers of the mobile terminals in its transmission area across a certain local radio reception frequency.
- each local radio reception frequency is assigned internet addresses of various local internet sites from which the server produces local speech dialogue applications automatically for transmission to the mobile terminals in the transmission area of the radio transmitter.
- the server is connected to a data network.
- the data network is formed by the internet.
- each mobile terminal exhibits a speech dialogue machine for processing the received speech dialogue applications and a background system with stored information dialogue data.
- the speech dialogue machine exhibits a speech interpretation unit for interpreting data which are output by an automatic speech recognition unit.
- the speech dialogue machine exhibits an output unit for outputting data to a speech synthesis unit.
- the mobile terminal exhibits a transmitter-receiver unit for a mobile telephone connection.
- the mobile terminal is installed in an associated vehicle.
- the mobile terminal is located in a vehicle.
- FIG. 1 is a block diagram of an arrangement in accordance with an embodiment of the system described below for providing speech dialogue applications for mobile terminals;
- FIG. 2 is a block diagram of an embodiment of a mobile terminal
- FIG. 3 is a data structure diagram of a speech dialogue application in accordance with an embodiment of the method described below;
- FIG. 4 is a block diagram of an alternative embodiment of the mobile terminal
- FIG. 5 is a flowchart to explain how the method described below functions on the transmission side
- FIG. 6 is a further flowchart to explain how the method described below functions on the reception side.
- the system 1 for providing speech dialogue applications includes a server 2 for the production of speech dialogue applications.
- the speech dialogue applications produced are transmitted via a data transmission line 3, for example by FTP, to a digital radio transmitter 4.
- the server 2 is for example connected to a database 5 and to a data network 6, in particular to the internet.
- the server 2 produces speech dialogue applications automatically on the basis of documents which are provided by the data network 6, in particular on the basis of internet sites.
- the speech dialogue applications are programmed manually.
- the radio transmitter 4 transmits the speech dialogue applications received from the server 2 to mobile terminals 7 which are inside its transmission area.
- the mobile terminals 7 are in each case connected to a reception antenna 8 which receives the digital radio signal from the radio transmitter 4.
- the mobile terminals 7 are preferably located in a vehicle 9, for example in a motor vehicle, in a train, on a ship or in an airplane.
- FIG. 2 shows a block diagram of an embodiment of the mobile terminal 7.
- the mobile terminal 7 contains a digital radio receiver 10, which is connected to a speech dialogue machine 12 via a line 11.
- the speech dialogue machine 12 serves the purpose of processing the speech dialogue applications received by the digital radio receiver 10.
- the speech dialogue machine 12 contains a dialogue management unit 12A which is connected to an interpretation unit 12B and an output unit 12C.
- the speech interpretation unit 12B receives data from an automatic speech recognition unit 13 (ASR: Automatic Speech Recognition), which is connected to a microphone 14.
- the output unit 12C of the speech dialogue machine 12 is connected to a speech synthesis unit 15 (TTS: Text to Speech), which transmits the analogue speech signal produced through a loudspeaker 16.
- a user 17 conducts a speech dialogue with the mobile terminal 7 through the microphone 14 and the loudspeaker 16.
- the dialogue machine 12 of the mobile terminal 7 is also connected with a background system 19 (BGS: Background System), in which information dialogue data are stored.
- the background system data of the background system 19 are for example programmed in Java or in C.
- FIG. 3 shows an embodiment of a data structure of a speech dialogue application.
- the speech dialogue application exhibits a speech dialogue flow description. This represents a formal description of a speech dialogue and is described in a certain description language, for example in an XML description language (Voice XML).
- the speech dialogue application contains language models LM (Language Model) for individual dialogue steps of the described speech dialogue.
- Voice XML exhibits data constructs which provide the user with certain freedoms in the dialogue procedure, so-called form filling.
- the dialogue manager determines the system reaction, preferably dynamically on the basis of the dialogue history.
- the speech recognition system 13 receives an analogue speech signal from the microphone 14.
- This speech signal is digitized for example by a soundcard and then converted into a frequency spectrum by Fourier transformation.
- the frequency spectrum is then compared with the content of a database and the symbol of the acoustically most similar reference vector is passed on.
- Recognition takes place for example by hidden Markov models (HMM).
- Using a language model the probability of certain word combinations is then determined in order to exclude incorrect hypotheses.
- a grammar model or a trigram statistic is used.
- a bigram or trigram statistic stores the occurrence probability of word combinations from two or three words.
- a speech dialogue application also includes background system data (BGS: Background System).
- BGS data are for example programmed in Java and contain up-to-date information data on certain subjects.
- each speech dialogue application additionally contains a time stamp which shows the point in time when it was produced.
- the background system data are transmitted together with the language model and the speech dialogue flow description within a speech dialogue application by digital radio from the radio transmitter 4 to the radio receiver 10 within the mobile terminal 7.
- the speech dialogue application transmitted by radio does not contain the background system data but instead an address, for example an IP address.
- After the mobile terminal 7 has received the speech dialogue application from the digital radio transmitter 4, it creates a bidirectional transmission channel to a base station of a data network by a separate data link, in particular a mobile telephone link.
- FIG. 4 shows a mobile terminal 7 with a further transmitter-receiver unit 20, for example a UMTS transmitter-receiver unit, to create a bidirectional mobile telephone link.
- After the dialogue machine 12 has received a speech dialogue application from the radio transmitter 4 on a unidirectional radio link, it extracts the IP address contained in it and sends an enquiry via the created mobile telephone link in order to obtain the corresponding background system data or information data which belong to the speech dialogue application. After the background information data have been received, these are placed in the background system 19 by the dialogue machine 12.
- With the system described herein it is possible for the user to conduct a speech dialogue with the mobile terminal 7 in order to obtain information on any desired subject. To do so, the user 17 conducts a dialogue with the mobile terminal 7.
- If a user 17 is traveling in a vehicle 9 in a transmission area of a local radio transmitter 4, he/she can for example obtain local information by conducting a speech dialogue with the mobile terminal 7. If for example the user 17 is traveling in an area of a local radio transmitter in the vicinity of Cologne and would like to find out what musical activities are on offer in the evening in Cologne, he/she can do so by conducting a dialogue with the mobile terminal 7.
- the speech dialogue can be initiated either by the user 17 or by the mobile terminal 7.
- the user 17 is asked by the mobile terminal 7 whether he/she would like to receive information about leisure-time activities in the transmission area. If the user 17 answers in the affirmative, he/she can for example enquire about musical activities available. The user can ask, for example, whether any jazz concerts are taking place in Cologne in the evening.
- the background system 19 conducts a search process and answers the search enquiry by sending an output data record to the dialogue machine 12.
- After speech synthesis the dialogue machine 12 gives the following answer to the user 17: “Jazz is being played in Cologne this evening starting 20.00 hrs in the Domizil Club at Luxemburgerstrasse 117”.
- the server 2 generates background information data automatically on the basis of internet documents from the internet 6 .
- the server 2 for example evaluates a given group of internet homepages relating to the city of Cologne.
- the local radio transmitter 4 transmits the speech dialogue application to those mobile terminals whose radio receivers are in its transmission area. The transmission takes place on a certain local radio reception frequency f.
- certain internet addresses of various local internet sites are assigned to each local radio reception frequency. From these local internet sites the server 2 produces local speech dialogue applications for transmission in the corresponding local transmission area.
- the speech dialogue applications provided are always up to date and for example can be updated daily or hourly.
- the speech dialogue application is updated as shown in FIGS. 5 and 6.
- the server 2 contains a first directory A and a second directory B.
- In a step S1 the server 2 checks whether there is a newly produced speech dialogue application in its directory A.
- the server 2 further checks in a step S2 by an index file whether the speech dialogue application is complete.
- the server 2 further checks in a step S3 whether there is a speech dialogue application with the same name in its directory B.
- If this is the case a check is made in a step S4 whether the two speech dialogue applications with the same name are identical.
- If this is the case the procedure reverts to step S1. If there are not two speech dialogue applications with the same name in the two directories A, B or the two speech applications are not identical, the server 2 copies the newly produced speech dialogue application from its directory A to its directory B in a step S5.
- the copied speech dialogue application is transmitted to the radio transmitter 4 for example by FTP and is transmitted from there in a step S6 to all the mobile terminals 7 in the transmission area.
- In a step S7 the speech dialogue applications received by the digital radio receiver 10 are initially stored in a directory C in the mobile terminal 7.
- In a step S8 it is checked whether there are any new speech dialogue applications in the directory C.
- If this is the case it is checked in a step S9 whether the received speech dialogue application is complete.
- In a step S10 it is checked whether in a further directory D of the mobile terminal 7 a speech dialogue application of the same name exists.
- In a step S11 it is checked whether the speech dialogue application in the directory C is younger than the speech dialogue application in the other directory D. This check is made using the time stamp provided in the speech dialogue application. If the speech dialogue application in the reception directory C is younger than the speech dialogue application in directory D, the updated speech dialogue application is copied from directory C to directory D in step S12 and the old speech dialogue application is preferably deleted.
- the procedure shown in FIGS. 5 and 6 ensures that the same speech dialogue data are not released twice for transmission and that always only updated versions of the speech dialogue applications are released for transmission.
- the speech dialogue data are transmitted continuously by the radio transmitter 4, so that the transmitted speech dialogue applications are available complete on the mobile terminal 7 at a given time.
- the method described above ensures that the user can conduct speech dialogues on up-to-date subjects with his/her mobile terminal 7 without a continuous WLAN link having to exist. Furthermore, the method ensures that a knowledge status exists which is updated daily or hourly.
- the mobile terminal 7 may be any mobile terminal, for example a vehicle unit or a PDA.
- the system also includes permanent or removable storage, such as magnetic and optical discs, RAM, ROM, etc. on which the process and data structures of the present invention can be stored and distributed.
- the processes can also be distributed via, for example, downloading over a network such as the Internet.
- the system can output the results to a display device, printer, readily accessible memory or another computer on a network.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mobile Radio Communication Systems (AREA)
- Telephonic Communication Services (AREA)
Abstract
Description
- This application is based on and hereby claims priority to German Application No. 10 2006 006 551.4 filed on Feb. 13, 2006, the contents of which are hereby incorporated by reference.
- Described below are a method and a system for providing speech dialogue applications on mobile terminals, in particular on mobile terminals in vehicles.
- In speech dialogue systems a distinction can be made between command and control speech dialogue applications and systems for information dialogues. Command and control systems serve the purpose of inputting speech commands, for example for voice operation of a navigation system inside a vehicle. Information dialogue systems, by contrast, are not used to input commands to control a device but enable the user to search for information on any desired subject, for example hotel information. In such information dialogue applications the information provided always has to be up to date. Depending on the application it is therefore necessary that the corresponding speech dialogue applications can be updated regularly, for example hourly or daily.
- In mobile terminals, as used in vehicles, the only speech dialogue applications provided up to now are those for inputting speech control commands, and these are updated by data carriers. For example, the voice operation for a navigation system inside a vehicle can be loaded or updated from a CD. Known mobile terminals establish a link to a data network, for example the internet, via a GSM network or a WLAN network. Owing to its low bandwidth, however, a GSM link is not suitable for transmitting speech dialogue applications for the retrieval of information, as such speech dialogue applications include very large volumes of data. A WLAN link is not suitable owing to its short range.
- Therefore, described below are a method and a system for providing speech dialogue applications for the retrieval of information on mobile terminals by performing the following operations:
- production of a speech dialogue application which exhibits a formal description of a speech dialogue in a certain description language and language models for individual dialogue steps of the speech dialogue; and
- transmission of the speech dialogue application produced to the mobile terminals by digital radio.
- One advantage of the method is that the speech dialogue applications can be made available at the same time to numerous mobile terminals.
- A further advantage of the method is that the speech dialogue applications can be updated easily and at very short intervals.
- In an embodiment of the method the speech dialogue application exhibits background system data.
- In an embodiment of the method each speech dialogue application exhibits a time stamp showing the point in time when it was produced.
- In an embodiment of the method the description language is an XML description language.
- In an embodiment of the method each speech dialogue application exhibits a name for its identification.
- In an embodiment of the method the background system data are programmed in Java.
- In a further embodiment of the method the background system data are also transmitted with an associated speech dialogue application by digital radio to the mobile terminals.
- In an alternative embodiment of the method, after receipt of a speech dialogue application by the mobile terminal the background system data are retrieved via a bidirectional mobile radio interface of the mobile terminal.
- In an embodiment of the method the speech dialogue applications are produced in a server which is connected to a data network.
- In an embodiment of the method the data network is formed by the internet.
- In an embodiment of the method the produced speech dialogue applications are stored in a first directory of the server and copied from the first directory of the server to a second directory of the server when the respective speech dialogue application is recognized as being complete.
- In an embodiment of the method the speech dialogue applications copied to the second directory of the server are transmitted by a radio transmitter to a radio receiver of the mobile terminal at regular intervals.
- In an embodiment of the method the speech dialogue applications received by the radio receiver of the mobile terminal are stored in a first directory of the mobile terminal and then copied from the first directory of the mobile terminal to a second directory of the mobile terminal when the respective time stamp of the speech dialogue application indicates that the speech dialogue application is younger than a corresponding speech dialogue application stored in the second directory of the mobile terminal.
- In an embodiment of the method the speech dialogue applications are generated by the server automatically from local internet sites.
- In an embodiment of the method the speech dialogue applications are transmitted by the radio transmitter to the radio receivers of the mobile terminals in its transmission area across a certain local radio reception frequency.
- In an embodiment of the method each local radio reception frequency is assigned internet addresses of various local internet sites from which the server produces local speech dialogue applications automatically for transmission to the mobile terminals in the transmission area of the radio transmitter.
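The two directory-based copy rules in the embodiments above (copy on the server once an application is complete; copy on the terminal once the received application is younger than the stored one) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: directories are modelled as in-memory dictionaries keyed by application name, `StoredApp`, `promote_on_server` and `promote_on_terminal` are hypothetical names, and copying an application for which no stored counterpart exists yet is an assumed behaviour, not something the embodiments state.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class StoredApp:
    """Hypothetical record for one speech dialogue application in a directory."""
    name: str          # identifying name of the application
    created: datetime  # time stamp showing when the application was produced
    complete: bool     # assumed flag: True once all parts are present


def promote_on_server(first_dir: Dict[str, StoredApp],
                      second_dir: Dict[str, StoredApp]) -> List[str]:
    """Copy every complete application from the first to the second directory."""
    promoted = []
    for name, app in first_dir.items():
        if app.complete:
            second_dir[name] = app
            promoted.append(name)
    return promoted


def promote_on_terminal(first_dir: Dict[str, StoredApp],
                        second_dir: Dict[str, StoredApp]) -> List[str]:
    """Copy a received application only if it is younger than the stored one."""
    promoted = []
    for name, app in first_dir.items():
        stored = second_dir.get(name)
        # Assumption: an application with no stored counterpart is copied too.
        if stored is None or app.created > stored.created:
            second_dir[name] = app
            promoted.append(name)
    return promoted
```

Keeping a separate reception directory means a partially received or outdated application never replaces the version the dialogue machine is currently using.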
- Also described below is a system for providing speech dialogue applications for mobile terminals with:
- a server for producing at least one speech dialogue application which exhibits a formal description, programmed in a description language, of the speech dialogue and language models for dialogue steps of the speech dialogue; and
- a radio transmitter which transmits the produced speech dialogue applications digitally to radio receivers of mobile terminals which are in its transmission area.
- In an embodiment of the system the server is connected to a data network.
- In an embodiment of the system the data network is formed by the internet.
- In an embodiment of the system each mobile terminal exhibits a speech dialogue machine for processing the received speech dialogue applications and a background system with stored information dialogue data.
- In an embodiment of the system the speech dialogue machine exhibits a speech interpretation unit for interpreting data which are output by an automatic speech recognition unit.
- In an embodiment of the system the speech dialogue machine exhibits an output unit for outputting data to a speech synthesis unit.
- In an embodiment of the system the mobile terminal exhibits a transmitter-receiver unit for a mobile telephone connection.
- In an embodiment of the system the mobile terminal is installed in an associated vehicle.
- Also described below is a mobile terminal with:
- a digital radio receiver for the reception of speech dialogue applications which exhibit a formal description, programmed in a description language, of a speech dialogue and language models for dialogue steps of the speech dialogue;
- a speech dialogue machine for processing the received speech dialogue applications, and
- a background system with stored information dialogue data.
- In an embodiment the mobile terminal is located in a vehicle.
- These and other aspects and advantages will become more apparent and more readily appreciated from the following description of the exemplary embodiments with reference to the accompanying drawings of which:
- FIG. 1 is a block diagram of an arrangement in accordance with an embodiment of the system described below for providing speech dialogue applications for mobile terminals;
- FIG. 2 is a block diagram of an embodiment of a mobile terminal;
- FIG. 3 is a data structure diagram of a speech dialogue application in accordance with an embodiment of the method described below;
- FIG. 4 is a block diagram of an alternative embodiment of the mobile terminal;
- FIG. 5 is a flowchart to explain how the method described below functions on the transmission side;
- FIG. 6 is a further flowchart to explain how the method described below functions on the reception side.
- Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout.
- As can be seen from FIG. 1, the system 1 for providing speech dialogue applications includes a server 2 for the production of speech dialogue applications. The speech dialogue applications produced are transmitted via a data transmission line 3, for example by FTP, to a digital radio transmitter 4. The server 2 is for example connected to a database 5 and to a data network 6, in particular to the internet. In an embodiment of the method the server 2 produces speech dialogue applications automatically on the basis of documents which are provided by the data network 6, in particular on the basis of internet sites. In an alternative embodiment the speech dialogue applications are programmed manually.
- The radio transmitter 4 transmits the speech dialogue applications received from the server 2 to mobile terminals 7 which are inside its transmission area. For this purpose the mobile terminals 7 are in each case connected to a reception antenna 8 which receives the digital radio signal from the radio transmitter 4. The mobile terminals 7 are preferably located in a vehicle 9, for example in a motor vehicle, in a train, on a ship or in an airplane.
- FIG. 2 shows a block diagram of an embodiment of the mobile terminal 7. The mobile terminal 7 contains a digital radio receiver 10, which is connected to a speech dialogue machine 12 via a line 11. The speech dialogue machine 12 serves the purpose of processing the speech dialogue applications received by the digital radio receiver 10. The speech dialogue machine 12 contains a dialogue management unit 12A which is connected to an interpretation unit 12B and an output unit 12C. The speech interpretation unit 12B receives data from an automatic speech recognition unit 13 (ASR: Automatic Speech Recognition), which is connected to a microphone 14. The output unit 12C of the speech dialogue machine 12 is connected to a speech synthesis unit 15 (TTS: Text to Speech), which outputs the analogue speech signal it produces through a loudspeaker 16. A user 17 conducts a speech dialogue with the mobile terminal 7 through the microphone 14 and the loudspeaker 16.
- Via lines 18 the dialogue machine 12 of the mobile terminal 7 is also connected with a background system 19 (BGS: Background System), in which information dialogue data are stored. The background system data of the background system 19 are for example programmed in Java or in C.
- FIG. 3 shows an embodiment of a data structure of a speech dialogue application. The speech dialogue application exhibits a speech dialogue flow description. This represents a formal description of a speech dialogue and is described in a certain description language, for example in an XML description language (Voice XML). In addition, the speech dialogue application contains language models LM (Language Model) for individual dialogue steps of the described speech dialogue. Voice XML (Voice Extensible Markup Language) serves the purpose of describing dialogue procedures in a speech dialogue system and represents a variant of the data description language XML. Voice XML exhibits data constructs which provide the user with certain freedoms in the dialogue procedure, so-called form filling. The dialogue manager determines the system reaction, preferably dynamically on the basis of the dialogue history.
- The speech recognition system 13 receives an analogue speech signal from the microphone 14. This speech signal is digitized for example by a soundcard and then converted into a frequency spectrum by Fourier transformation. The frequency spectrum is then compared with the content of a database and the symbol of the acoustically most similar reference vector is passed on. Recognition takes place for example by hidden Markov models (HMM). Using a language model, the probability of certain word combinations is then determined in order to exclude incorrect hypotheses. For this purpose either a grammar model or a trigram statistic is used. A bigram or trigram statistic stores the occurrence probability of word combinations from two or three words.
- In addition to the speech dialogue flow description and the language models, in an embodiment a speech dialogue application also includes background system data (BGS: Background System). These BGS data are for example programmed in Java and contain up-to-date information data on certain subjects.
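The bigram statistic mentioned in the recognition step above can be illustrated with a small sketch. It is one common reading of "occurrence probability of word combinations", namely the relative frequency of the second word given the first, and is not taken from the patent; the function names, the toy sentences and the `floor` value for unseen word pairs are all assumptions.

```python
from collections import Counter
from typing import Dict, List, Tuple

Bigram = Tuple[str, str]


def bigram_probabilities(sentences: List[List[str]]) -> Dict[Bigram, float]:
    """Estimate P(second word | first word) from tokenized training sentences."""
    pair_counts: Counter = Counter()
    first_counts: Counter = Counter()
    for words in sentences:
        for first, second in zip(words, words[1:]):
            pair_counts[(first, second)] += 1
            first_counts[first] += 1
    return {pair: count / first_counts[pair[0]]
            for pair, count in pair_counts.items()}


def hypothesis_score(words: List[str], probs: Dict[Bigram, float],
                     floor: float = 1e-6) -> float:
    """Multiply bigram probabilities; unseen pairs get a small assumed floor."""
    score = 1.0
    for pair in zip(words, words[1:]):
        score *= probs.get(pair, floor)
    return score
```

A recognizer could compare the scores of competing word sequences and discard the implausible ones, which is the role the language model plays in excluding incorrect hypotheses.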
- In an embodiment each speech dialogue application additionally contains a time stamp which shows the point in time when it was produced.
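The data structure of FIG. 3 can be pictured with a minimal sketch, assuming a speech dialogue application bundles a name, a time stamp, a dialogue flow description, language models per dialogue step and optional background system data. The class and field names are hypothetical, language models are reduced to plain strings, and the dialogue flow is assumed to be a simplified, namespace-free Voice XML document in which each `field` element stands for one dialogue step.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class SpeechDialogueApplication:
    """Hypothetical container mirroring the parts named in the description."""
    name: str                       # name used to identify the application
    created: datetime               # time stamp of production
    dialogue_flow: str              # formal dialogue description (Voice XML text)
    language_models: Dict[str, str] = field(default_factory=dict)  # per step
    background_data: Optional[bytes] = None  # BGS data, present in some embodiments

    def dialogue_steps(self) -> List[str]:
        """Return the names of all <field> elements in the dialogue flow."""
        root = ET.fromstring(self.dialogue_flow)
        return [f.get("name") for f in root.iter("field") if f.get("name")]

    def steps_without_model(self) -> List[str]:
        """List dialogue steps that have no language model attached."""
        return [s for s in self.dialogue_steps() if s not in self.language_models]
```

A check such as `steps_without_model()` is one way a producer might verify that every dialogue step has its language model before the application is released.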
In a first embodiment of the method the background system data are transmitted together with the language model and the speech dialogue flow description within a speech dialogue application by digital radio from the radio transmitter 4 to the radio receiver 10 within the mobile terminal 7.

In an alternative embodiment the speech dialogue application transmitted by radio does not contain the background system data but instead an address, for example an IP address. After the mobile terminal 7 has received the speech dialogue application from the digital radio transmitter 4, it creates a bidirectional transmission channel to a base station of a data network by a separate data link, in particular a mobile telephone link.
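The difference between the two embodiments can be sketched as follows. This is a minimal illustration under stated assumptions: the class `ReceivedApplication`, the function `load_background_data` and the `fetch` callback are hypothetical names, and `fetch` merely stands in for a request sent over the separate bidirectional data link.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ReceivedApplication:
    """Hypothetical view of an application as received by radio."""

    name: str
    # First embodiment: background data travel inside the application.
    background_data: Optional[Dict[str, str]] = None
    # Alternative embodiment: only an address (for example an IP address) is sent.
    background_address: Optional[str] = None


def load_background_data(
    app: ReceivedApplication,
    fetch: Callable[[str], Dict[str, str]],
) -> Dict[str, str]:
    """Return the background data, fetching them only when they are not embedded."""
    if app.background_data is not None:
        return app.background_data
    if app.background_address is None:
        raise ValueError("application carries neither background data nor an address")
    # Stand-in for the enquiry sent over the bidirectional data link.
    return fetch(app.background_address)
```

Passing the fetch step in as a callback keeps the sketch independent of any particular network library; in the first embodiment it is never called.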
FIG. 4 shows a mobile terminal 7 with a further transmitter-receiver unit 20, for example a UMTS transmitter-receiver unit, to create a bidirectional mobile telephone link. After the dialogue machine 12 has received a speech dialogue application from the radio transmitter 4 on a unidirectional radio link, it extracts the IP address contained in it and sends an enquiry via the created mobile telephone link in order to obtain the corresponding background system data or information data which belong to the speech dialogue application. After the background information data have been received, these are placed in the background system 19 by the dialogue machine 12. With the system described herein it is possible for the user to conduct a speech dialogue with the mobile terminal 7 in order to obtain information on any desired subject. To do so, the user 17 conducts a dialogue with the mobile terminal 7. If a user 17 is traveling in a vehicle 9 in a transmission area of a local radio transmitter 4, he/she can for example obtain local information by conducting a speech dialogue with the mobile terminal 7. If for example the user 17 is traveling in an area of a local radio transmitter in the vicinity of Cologne and would like to find out what musical activities are on offer in the evening in Cologne, he/she can do so by conducting a dialogue with the mobile terminal 7. The speech dialogue can be initiated either by the user 17 or by the mobile terminal 7.

For example, the user 17 is asked by the mobile terminal 7 whether he/she would like to receive information about leisure-time activities in the transmission area. If the user 17 answers in the affirmative, he/she can for example enquire about musical activities available. The user can ask, for example, whether any jazz concerts are taking place in Cologne in the evening.

The dialogue machine 12 extracts reference words from this, such as for example “jazz”, “concert”, “Cologne”, and generates a search enquiry to the background system 19, for example: Search (music=“jazz”; town/city=“Cologne”; time=“evening”).

The background system 19 conducts a search process and answers the search enquiry by sending an output data record to the dialogue machine 12.

After speech synthesis the dialogue machine 12 gives the following answer to the user 17: “Jazz is being played in Cologne this evening starting 20.00 hrs in the Domizil Club at Luxemburgerstrasse 117”.

In an embodiment of the system the server 2 generates background information data automatically on the basis of internet documents from the internet 6. To this end, the server 2 for example evaluates a given group of internet homepages relating to the city of Cologne. The local radio transmitter 4 transmits the speech dialogue application to those mobile terminals whose radio receivers are in its transmission area. The transmission takes place on a certain local radio reception frequency f. In an embodiment certain internet addresses of various local internet sites are assigned to each local radio reception frequency. From these local internet sites the server 2 produces local speech dialogue applications for transmission in the corresponding local transmission area.

In the method described herein it is particularly important that the speech dialogue applications provided are always up to date and for example can be updated daily or hourly. The speech dialogue application is updated as shown in FIGS. 5 and 6.

The server 2 contains a first directory A and a second directory B. In a step S1 the server 2 checks whether there is a newly produced speech dialogue application in its directory A.

If this is the case the server 2 further checks in a step S2 by an index file whether the speech dialogue application is complete.

If this is also the case the server 2 further checks in a step S3 whether there is a speech dialogue application with the same name in its directory B.

If this is the case a check is made in a step S4 whether the two speech dialogue applications with the same name are identical.
If this is the case the procedure reverts to step S1. If there are not two speech dialogue applications with the same name in the two directories A, B or the two speech applications are not identical, the server 2 copies the newly produced speech dialogue application from its directory A to its directory B in a step S5. The copied speech dialogue application is transmitted to the radio transmitter 4, for example by FTP, and is transmitted from there in a step S6 to all the mobile terminals 7 in the transmission area.

As can be seen from FIG. 6, in a step S7 the speech dialogue applications received by the digital radio receiver 10 are initially stored in a directory C in the mobile terminal 7.

In a step S8 it is checked whether there are any new speech dialogue applications in the directory C.
If this is the case it is checked in a step S9 whether the received speech dialogue application is complete.

If the speech dialogue application is complete, in a step S10 it is checked whether a speech dialogue application of the same name exists in a further directory D of the mobile terminal 7.
If this is the case, in a step S11 it is checked whether the speech dialogue application in the directory C is newer than the speech dialogue application in the other directory D. This check is made using the time stamp provided in the speech dialogue application. If the speech dialogue application in the reception directory C is newer than the speech dialogue application in directory D, the updated speech dialogue application is copied from directory C to directory D in step S12 and the old speech dialogue application is preferably deleted. The procedure shown in FIGS. 5 and 6 ensures that the same speech dialogue data are not released twice for transmission and that only updated versions of the speech dialogue applications are ever released for transmission.

The speech dialogue data are transmitted continuously by the radio transmitter 4, so that the transmitted speech dialogue applications are available complete on the mobile terminal 7 at a given time.

In addition, it is ensured that the relatively time-consuming process of producing the language models only takes place once per updated speech dialogue application.
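The two update procedures of FIGS. 5 and 6 described above can be sketched in Python as follows. This is a simplified illustration, not the patented implementation: directories A, B, C and D are modelled as in-memory dictionaries keyed by application name, the `complete` flag stands in for the index-file check, and all names are assumptions. The text does not state what happens in step S10 when no application of the same name exists in directory D; the sketch assumes the received application is then stored directly.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class App:
    """Hypothetical stand-in for a stored speech dialogue application."""

    name: str
    payload: bytes
    timestamp: int         # production time, for example seconds since the epoch
    complete: bool = True  # stands in for the completeness check by index file


def release_on_server(dir_a: Dict[str, App], dir_b: Dict[str, App]) -> List[App]:
    """Server side, steps S1 to S5: copy new or changed applications from A to B."""
    released = []
    for name, app in dir_a.items():      # S1: look for produced applications
        if not app.complete:             # S2: skip incomplete applications
            continue
        if dir_b.get(name) == app:       # S3 and S4: same name and identical
            continue
        dir_b[name] = app                # S5: copy from directory A to B
        released.append(app)             # to be handed to the transmitter (S6)
    return released


def update_on_terminal(dir_c: Dict[str, App], dir_d: Dict[str, App]) -> List[str]:
    """Terminal side, steps S8 to S12: keep only the newest version in D."""
    updated = []
    for name, app in dir_c.items():      # S8: look for received applications
        if not app.complete:             # S9: skip incomplete applications
            continue
        current = dir_d.get(name)        # S10: same name already in D?
        # S11: compare time stamps. Assumption: if D has no application of
        # this name, the received one is stored without further checks.
        if current is not None and app.timestamp <= current.timestamp:
            continue
        dir_d[name] = app                # S12: copy to D, replacing the old one
        updated.append(name)
    return updated
```

Calling `release_on_server` twice without a change in directory A returns an empty list the second time, which corresponds to the statement that the same speech dialogue data are not released twice for transmission.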
The method described above ensures that the user can conduct speech dialogues on up-to-date subjects with his/her mobile terminal 7 without a continuous WLAN link having to exist. Furthermore, the method ensures that a knowledge status exists which is updated daily or hourly. The mobile terminal 7 may be any mobile terminal, for example a vehicle unit or a PDA.

The system also includes permanent or removable storage, such as magnetic and optical discs, RAM, ROM, etc. on which the process and data structures of the present invention can be stored and distributed. The processes can also be distributed via, for example, downloading over a network such as the Internet. The system can output the results to a display device, printer, readily accessible memory or another computer on a network.
A description has been provided with particular reference to exemplary embodiments thereof and examples, but it will be understood that variations and modifications can be effected within the spirit and scope of the claims, which may include the phrase “at least one of A, B and C” as an alternative expression that means one or more of A, B and C may be used, contrary to the holding in Superguide v. DIRECTV, 358 F.3d 870, 69 USPQ2d 1865 (Fed. Cir. 2004).
Claims (21)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE102006006551 | 2006-02-13 | | |
DE102006006551.4 | 2006-02-13 | | |
DE102006006551A DE102006006551B4 (en) | 2006-02-13 | 2006-02-13 | Method and system for providing voice dialogue applications and mobile terminal |
PCT/EP2006/067997 WO2007093236A1 (en) | 2006-02-13 | 2006-10-31 | Method and system for preparing speech dialogue applications |
Publications (2)
Publication Number | Publication Date |
---|---|
US20100267345A1 (en) | 2010-10-21 |
US8583441B2 (en) | 2013-11-12 |
Family
ID=37603869
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/223,916 Active 2029-01-30 US8583441B2 (en) | 2006-02-13 | 2006-10-31 | Method and system for providing speech dialogue applications |
Country Status (4)
Country | Link |
---|---|
US (1) | US8583441B2 (en) |
EP (1) | EP1984910B1 (en) |
DE (1) | DE102006006551B4 (en) |
WO (1) | WO2007093236A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140047415A1 (en) * | 2009-07-23 | 2014-02-13 | Sandeep CHATTERJEE | Modification of Terminal and Service Provider Machines Using an Update Server Machine |
WO2014062851A1 (en) * | 2012-10-17 | 2014-04-24 | Nuance Communications, Inc. | Multiple device intelligent language model synchronization |
US20170255612A1 (en) * | 2013-06-21 | 2017-09-07 | Microsoft Technology Licensing, Llc | Building conversational understanding systems using a toolset |
US10304448B2 (en) | 2013-06-21 | 2019-05-28 | Microsoft Technology Licensing, Llc | Environmentally aware dialog policies and response generation |
US10387140B2 (en) | 2009-07-23 | 2019-08-20 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US10418032B1 (en) * | 2015-04-10 | 2019-09-17 | Soundhound, Inc. | System and methods for a virtual assistant to manage and use context in a natural language dialog |
US10497367B2 (en) | 2014-03-27 | 2019-12-03 | Microsoft Technology Licensing, Llc | Flexible schema for language model customization |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090271200A1 (en) | 2008-04-23 | 2009-10-29 | Volkswagen Group Of America, Inc. | Speech recognition assembly for acoustically controlling a function of a motor vehicle |
DE102011109932B4 (en) | 2011-08-10 | 2014-10-02 | Audi Ag | Method for controlling functional devices in a vehicle during voice command operation |
US9953646B2 (en) | 2014-09-02 | 2018-04-24 | Belleau Technologies | Method and system for dynamic speech recognition and tracking of prewritten script |
DE102017213235A1 (en) * | 2017-08-01 | 2019-02-07 | Audi Ag | A method for determining a user feedback when using a device by a user and control device for performing the method |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6246672B1 (en) * | 1998-04-28 | 2001-06-12 | International Business Machines Corp. | Singlecast interactive radio system |
US20020184373A1 (en) * | 2000-11-01 | 2002-12-05 | International Business Machines Corporation | Conversational networking via transport, coding and control conversational protocols |
US6721633B2 (en) * | 2001-09-28 | 2004-04-13 | Robert Bosch Gmbh | Method and device for interfacing a driver information system using a voice portal server |
US20050043067A1 (en) * | 2003-08-21 | 2005-02-24 | Odell Thomas W. | Voice recognition in a vehicle radio system |
US20060029109A1 (en) * | 2004-08-06 | 2006-02-09 | M-Systems Flash Disk Pioneers Ltd. | Playback of downloaded digital audio content on car radios |
US7010263B1 (en) * | 1999-12-14 | 2006-03-07 | Xm Satellite Radio, Inc. | System and method for distributing music and data |
US20070136069A1 (en) * | 2005-12-13 | 2007-06-14 | General Motors Corporation | Method and system for customizing speech recognition in a mobile vehicle communication system |
US7277696B2 (en) * | 2001-04-23 | 2007-10-02 | Soma Networks, Inc. | System and method for minimising bandwidth utilisation in a wireless interactive voice response system |
US20090019061A1 (en) * | 2004-02-20 | 2009-01-15 | Insignio Technologies, Inc. | Providing information to a user |
US8051369B2 (en) * | 1999-09-13 | 2011-11-01 | Microstrategy, Incorporated | System and method for the creation and automatic deployment of personalized, dynamic and interactive voice services, including deployment through personalized broadcasts |
US8195468B2 (en) * | 2005-08-29 | 2012-06-05 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
US8279844B1 (en) * | 2000-11-03 | 2012-10-02 | Intervoice Limited Partnership | Extensible interactive voice response |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2003289589A1 (en) * | 2003-07-11 | 2005-01-28 | Electronics And Telecommunications Research Institute | Apparatus and method for transmitting/receiving voice electrics program guide information |
US20060149553A1 (en) | 2005-01-05 | 2006-07-06 | At&T Corp. | System and method for using a library to interactively design natural language spoken dialog systems |
2006
- 2006-02-13: DE, application DE102006006551A, patent DE102006006551B4, status Active
- 2006-10-31: US, application US12/223,916, patent US8583441B2, status Active
- 2006-10-31: WO, application PCT/EP2006/067997, publication WO2007093236A1, status Application Filing
- 2006-10-31: EP, application EP06807702.3A, patent EP1984910B1, status Not-in-force
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6246672B1 (en) * | 1998-04-28 | 2001-06-12 | International Business Machines Corp. | Singlecast interactive radio system |
US8051369B2 (en) * | 1999-09-13 | 2011-11-01 | Microstrategy, Incorporated | System and method for the creation and automatic deployment of personalized, dynamic and interactive voice services, including deployment through personalized broadcasts |
US7010263B1 (en) * | 1999-12-14 | 2006-03-07 | Xm Satellite Radio, Inc. | System and method for distributing music and data |
US20020184373A1 (en) * | 2000-11-01 | 2002-12-05 | International Business Machines Corporation | Conversational networking via transport, coding and control conversational protocols |
US8279844B1 (en) * | 2000-11-03 | 2012-10-02 | Intervoice Limited Partnership | Extensible interactive voice response |
US7277696B2 (en) * | 2001-04-23 | 2007-10-02 | Soma Networks, Inc. | System and method for minimising bandwidth utilisation in a wireless interactive voice response system |
US6721633B2 (en) * | 2001-09-28 | 2004-04-13 | Robert Bosch Gmbh | Method and device for interfacing a driver information system using a voice portal server |
US20050043067A1 (en) * | 2003-08-21 | 2005-02-24 | Odell Thomas W. | Voice recognition in a vehicle radio system |
US20090019061A1 (en) * | 2004-02-20 | 2009-01-15 | Insignio Technologies, Inc. | Providing information to a user |
US20060029109A1 (en) * | 2004-08-06 | 2006-02-09 | M-Systems Flash Disk Pioneers Ltd. | Playback of downloaded digital audio content on car radios |
US8195468B2 (en) * | 2005-08-29 | 2012-06-05 | Voicebox Technologies, Inc. | Mobile systems and methods of supporting natural language human-machine interactions |
US20070136069A1 (en) * | 2005-12-13 | 2007-06-14 | General Motors Corporation | Method and system for customizing speech recognition in a mobile vehicle communication system |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10261774B2 (en) | 2009-07-23 | 2019-04-16 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US10831468B2 (en) | 2009-07-23 | 2020-11-10 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US20140047415A1 (en) * | 2009-07-23 | 2014-02-13 | Sandeep CHATTERJEE | Modification of Terminal and Service Provider Machines Using an Update Server Machine |
US10387140B2 (en) | 2009-07-23 | 2019-08-20 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US9081897B2 (en) * | 2009-07-23 | 2015-07-14 | Shuv Gray Llc | Modification of terminal and service provider machines using an update server machine |
US9304758B2 (en) | 2009-07-23 | 2016-04-05 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US11662995B2 (en) | 2009-07-23 | 2023-05-30 | S3G Technology Llc | Network efficient location-based dialogue sequence using virtual processor |
US9940124B2 (en) | 2009-07-23 | 2018-04-10 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US12099830B2 (en) | 2009-07-23 | 2024-09-24 | S3G Technology Llc | Network efficient and user experience optimized dialogue sequence between user devices |
US11210082B2 (en) | 2009-07-23 | 2021-12-28 | S3G Technology Llc | Modification of terminal and service provider machines using an update server machine |
US8983849B2 (en) | 2012-10-17 | 2015-03-17 | Nuance Communications, Inc. | Multiple device intelligent language model synchronization |
US9035884B2 (en) | 2012-10-17 | 2015-05-19 | Nuance Communications, Inc. | Subscription updates in multiple device language models |
WO2014062851A1 (en) * | 2012-10-17 | 2014-04-24 | Nuance Communications, Inc. | Multiple device intelligent language model synchronization |
US10572602B2 (en) * | 2013-06-21 | 2020-02-25 | Microsoft Technology Licensing, Llc | Building conversational understanding systems using a toolset |
US20170255612A1 (en) * | 2013-06-21 | 2017-09-07 | Microsoft Technology Licensing, Llc | Building conversational understanding systems using a toolset |
US10304448B2 (en) | 2013-06-21 | 2019-05-28 | Microsoft Technology Licensing, Llc | Environmentally aware dialog policies and response generation |
US10497367B2 (en) | 2014-03-27 | 2019-12-03 | Microsoft Technology Licensing, Llc | Flexible schema for language model customization |
US10418032B1 (en) * | 2015-04-10 | 2019-09-17 | Soundhound, Inc. | System and methods for a virtual assistant to manage and use context in a natural language dialog |
Also Published As
Publication number | Publication date |
---|---|
EP1984910B1 (en) | 2015-11-18 |
DE102006006551B4 (en) | 2008-09-11 |
EP1984910A1 (en) | 2008-10-29 |
US8583441B2 (en) | 2013-11-12 |
DE102006006551A1 (en) | 2007-08-16 |
WO2007093236A1 (en) | 2007-08-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8583441B2 (en) | Method and system for providing speech dialogue applications | |
US9495957B2 (en) | Mobile systems and methods of supporting natural language human-machine interactions | |
US9558745B2 (en) | Service oriented speech recognition for in-vehicle automated interaction and in-vehicle user interfaces requiring minimal cognitive driver processing for same | |
US8620659B2 (en) | System and method of supporting adaptive misrecognition in conversational speech | |
EP2226793B1 (en) | Speech recognition system and data updating method | |
US8326634B2 (en) | Systems and methods for responding to natural language speech utterance | |
US8694206B2 (en) | Systems and methods for off-board voice-automated web searching | |
CN102543077B (en) | Male acoustic model adaptation method based on language-independent female speech data | |
US9082414B2 (en) | Correcting unintelligible synthesized speech | |
CN112017642A (en) | Method, device and equipment for speech recognition and computer readable storage medium | |
US20040107097A1 (en) | Method and system for voice recognition through dialect identification | |
US20020072916A1 (en) | Distributed speech recognition for internet access | |
Agarwal et al. | Voice Browsing the Web for Information Access | |
JP2000181475A (en) | Voice answering device | |
Alessio Brutti et al. | USE OF MULTIPLE SPEECH RECOGNITION UNITS IN AN IN-CAR ASSISTANCE SYSTEM¹ | |
JP2017161815A (en) | Response system and response program | |
KR20090061917A (en) | Method and apparatus for providing voice database |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: SIEMENS AKTIENGESELLSCHAFT, GERMANY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BERTON, ANDRE;BLOCK, DR. HANS-ULRICH;GEHRKE, MANFRED;AND OTHERS;SIGNING DATES FROM 20080708 TO 20080929;REEL/FRAME:024593/0757 |
AS | Assignment | Owner name: SVOX AG, SWITZERLAND Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SIEMENS AKTIENGESELLSCHAFT;REEL/FRAME:024663/0952 Effective date: 20100223 |
AS | Assignment | Owner name: NUANCE COMMUNICATIONS, INC., MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SVOX AG;REEL/FRAME:031266/0764 Effective date: 20130710 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
AS | Assignment | Owner name: CERENCE INC., MASSACHUSETTS Free format text: INTELLECTUAL PROPERTY AGREEMENT;ASSIGNOR:NUANCE COMMUNICATIONS, INC.;REEL/FRAME:050836/0191 Effective date: 20190930 |
AS | Assignment | Owner name: CERENCE OPERATING COMPANY, MASSACHUSETTS Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE ASSIGNEE NAME PREVIOUSLY RECORDED AT REEL: 050836 FRAME: 0191. ASSIGNOR(S) HEREBY CONFIRMS THE INTELLECTUAL PROPERTY AGREEMENT;ASSIGNOR:NUANCE COMMUNICATIONS, INC.;REEL/FRAME:050871/0001 Effective date: 20190930 |
AS | Assignment | Owner name: BARCLAYS BANK PLC, NEW YORK Free format text: SECURITY AGREEMENT;ASSIGNOR:CERENCE OPERATING COMPANY;REEL/FRAME:050953/0133 Effective date: 20191001 |
AS | Assignment | Owner name: CERENCE OPERATING COMPANY, MASSACHUSETTS Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:BARCLAYS BANK PLC;REEL/FRAME:052927/0335 Effective date: 20200612 |
AS | Assignment | Owner name: WELLS FARGO BANK, N.A., NORTH CAROLINA Free format text: SECURITY AGREEMENT;ASSIGNOR:CERENCE OPERATING COMPANY;REEL/FRAME:052935/0584 Effective date: 20200612 |
MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |
AS | Assignment | Owner name: CERENCE OPERATING COMPANY, MASSACHUSETTS Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE REPLACE THE CONVEYANCE DOCUMENT WITH THE NEW ASSIGNMENT PREVIOUSLY RECORDED AT REEL: 050836 FRAME: 0191. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNOR:NUANCE COMMUNICATIONS, INC.;REEL/FRAME:059804/0186 Effective date: 20190930 |