CN1798210B

CN1798210B - Method and system for selecting speech or DTMF interfaces or a mixture of both

Info

Publication number: CN1798210B
Application number: CN2005101283704A
Authority: CN
Inventors: C. Agapi; F. Gomez; J. R. Lewis
Original assignee: International Business Machines Corp
Current assignee: International Business Machines Corp
Priority date: 2004-12-30
Filing date: 2005-11-14
Publication date: 2010-08-11
Anticipated expiration: 2025-11-14
Also published as: CN1798210A

Abstract

A wizard can create various audio interfaces from a fixed design. The generated interfaces can be speech only, DTMF only, or various mixed speech and DTMF UIs. When both speech and DTMF prompts are specified, a number of combinations of these interfaces can be generated automatically. Robust speech recognition systems can be built by automatically generating a "shadow" DTMF application. The DTMF application performs the same task as the primary speech application; however, the transfer to the DTMF application can be made explicitly by the user, or can occur automatically (as either a temporary or a permanent transition) at a point in the call flow where there is a problem with the speech recognition.

Description

Method and system for selecting speech or DTMF interfaces or a mixture of both

Technical field

The present invention relates to the field of computer system user interfaces that can respond to audio input, for example speech or touch-tone telephone input.

Background technology

Traditionally, audio interfaces have been constructed manually by programmers, and the "sound and feel" of the application was fixed at design time. With automatic code-generation wizards for audio user interfaces, code can be generated for two generic audio user interfaces (DTMF-only or speech-only applications). Although this greatly facilitates the rapid development of voice applications, there is at present no wizard that generates hybrid-interface (DTMF and speech) applications. A hybrid interface can be designed in various ways by using the two interfaces at the appropriate times. A design that mixes speech and DTMF (dual-tone multi-frequency) input, that is, the system used by the touch-tone telephone interface, can solve problems that are difficult to handle with a DTMF-only or speech-only user interface. For example, when there is a problem with the speech recognition response, it can be advantageous to allow the interface to switch to DTMF, or to design the application to react to the recognition problem automatically by switching to DTMF. There is therefore a need for a method and system that gives developers of interactive voice response systems the ability to simply enable speech, DTMF, or a mixture of both from a single high-level application call-flow design, as explained in detail below.

Summary of the invention

The present invention addresses the deficiencies of the art with respect to managing events in interactive voice applications, and provides a novel and non-obvious method, system and apparatus for pre-selecting a speech, DTMF, or mixed interface type for an audio interactive system. In particular, in accordance with the principles of the present invention, presenting an interface to a wizard user (for example, an application developer) allows the user to choose among multiple types, where the selected type responds to the needs determined by the user. Specifically, the user can select a particular type, modify the selected type, and/or select a different type to meet the user's needs for a particular interactive audio application environment.

The present invention provides a method of defining a standard speech/DTMF mixed user interface type, the mixed user interface type being used in generating voice application code for managing the presentation of the user interface (UI) in an application, the application supporting speech recognition and DTMF (touch-tone) telephone key entry. The method comprises the steps of: presenting a speech/DTMF type selection menu that allows selection of one or more UI types, each UI type corresponding to a system response to a code generation request; and, upon selection of a UI type, preparing the system response for the code generation request.

The method according to the present invention provides a wizard that can create mixed designs of various audio interfaces. The generated interfaces can be speech-only, DTMF-only, or various mixed speech and DTMF user interfaces. Also included are means for automatically generating these different interface types from the same source information, given a type selection made from the type selection wizard panel.
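The two steps of the method (presenting a menu of UI types, then preparing the response for the code generation request) can be pictured with a minimal Python sketch. The names `UIType`, `CallFlowDesign`, and `prepare_response` are hypothetical and do not come from the specification; the sketch assumes only that a single call-flow design carries both the spoken word and the key for each choice, and that the selected type decides which of that information the generator emits.

```python
from dataclasses import dataclass
from enum import Enum


class UIType(Enum):
    """Hypothetical names for the types offered by the selection menu."""
    SPEECH_ONLY = "speech only"
    DTMF_ONLY = "DTMF only"
    MIXED_TEMPORARY = "speech, temporary fallback to DTMF"
    MIXED_PERMANENT = "speech, permanent fallback to DTMF"


@dataclass(frozen=True)
class CallFlowDesign:
    """Single source of information shared by every generated interface."""
    choices: tuple  # (label, spoken word, DTMF key) triples


def prepare_response(design: CallFlowDesign, ui_type: UIType) -> dict:
    """Prepare what the code generator should emit for the selected type."""
    speech = [f"For {label}, say {word}." for label, word, _ in design.choices]
    dtmf = [f"For {label}, press {key}." for label, _, key in design.choices]
    if ui_type is UIType.SPEECH_ONLY:
        return {"speech": speech}
    if ui_type is UIType.DTMF_ONLY:
        return {"dtmf": dtmf}
    # Both mixed types need both prompt sets; they differ only in fallback policy.
    policy = "permanent" if ui_type is UIType.MIXED_PERMANENT else "temporary"
    return {"speech": speech, "dtmf": dtmf, "fallback": policy}


design = CallFlowDesign(choices=(
    ("weather information", "weather", "1"),
    ("news", "news", "2"),
    ("entertainment", "entertainment", "3"),
))
```

Calling `prepare_response(design, UIType.DTMF_ONLY)` yields only the key-press prompts, while either mixed type yields both prompt sets plus the fallback policy, all derived from the same `design` object.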

In another aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed, defines a type for interactive audio events. The type is defined by presenting a type selection menu that allows selection of one or more types. Each type corresponds to a system or user input. Upon selection of a type, a system response is prepared for a code generation request.

Additional aspects of the invention will be set forth in part in the description that follows, in part will be obvious from the description, or may be learned by practice of the invention. These aspects of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims. It is to be understood that both the foregoing summary and the following detailed description are exemplary and explanatory only, and are not restrictive of the invention as claimed.

Description of drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention. The embodiments illustrated herein are presently preferred; it should be understood, however, that the invention is not limited to the precise arrangements shown, wherein:

Fig. 1 is a timing diagram showing a DTMF-only selection according to the present invention;

Fig. 2 is a timing diagram showing a speech-only selection according to the present invention;

Fig. 3 is a timing diagram according to the present invention in which the system initially responds to speech, switches to DTMF when a problem occurs, and returns to speech when the problem is eliminated;

Fig. 4 is a timing diagram showing an application according to the present invention in which speech recognition is used initially and, in response to a speech recognition problem, the application switches permanently to DTMF;

Fig. 5 is a computer screen showing the type selection interface of the present invention.

Embodiment

The present invention is a system and method by which, when both speech and DTMF prompts are specified, multiple combinations of these interfaces can be generated automatically. A robust speech recognition system can be built by automatically generating a "shadow" DTMF application. This DTMF application performs the same task as the primary speech application; however, the transfer to the DTMF application can be made explicitly by the user, or can occur automatically at a point in the call flow where speech recognition has a problem.
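One way to picture the "shadow" DTMF application is as a key map derived mechanically from the speech menu, so that every key reaches the same action as the corresponding spoken word. The following Python sketch is illustrative only: the function names, the action names, and the rule that keys are assigned in menu order are assumptions of the sketch, not details given in the specification.

```python
def build_shadow_dtmf(speech_menu: dict) -> dict:
    """Map keys "1", "2", ... to the same actions as the spoken words.

    speech_menu maps a spoken word to an action name. Insertion order
    decides which key each action receives (an assumption of this sketch).
    """
    if len(speech_menu) > 9:
        raise ValueError("this sketch only assigns keys 1-9")
    return {str(i): action
            for i, action in enumerate(speech_menu.values(), start=1)}


def dtmf_prompt(speech_menu: dict) -> str:
    """Build the key-press prompt that shadows the speech prompt."""
    parts = [f"For {word}, press {i}."
             for i, word in enumerate(speech_menu, start=1)]
    return " ".join(parts)


# Hypothetical speech menu: spoken word -> action name.
speech_menu = {
    "weather": "play_weather",
    "news": "play_news",
    "entertainment": "play_entertainment",
}
shadow = build_shadow_dtmf(speech_menu)
```

Because `shadow` is computed from `speech_menu` rather than written by hand, the two interfaces cannot drift apart: adding a spoken choice adds the matching key.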

The following example is represented by the timing diagram shown in Fig. 1.

Example 1: An explicit switch is used to select a DTMF-only interface.

At the beginning of the application, an explicit "switch" can be offered to the user to switch explicitly to a DTMF interface.

System: Welcome to our automated <application name> system. To continue using the application in DTMF mode, press 1. Otherwise, wait for the next prompt.

User: (presses key "1")

System: For weather information, press 1. For news, press 2. For entertainment, press 3.

After this selection, all user interaction with the system continues in DTMF mode. When plotted on a timing diagram, this system behavior (user interaction) appears as shown in Fig. 1.

The following example is represented by the timing diagram shown in Fig. 2.

Example 2: An explicit switch is used to select a speech-only interface.

At the beginning of the application, an explicit "switch" can be offered to the user in order to continue with the speech interface.

User: (waits)

System: For weather information, say "weather". For news information, say "news". For movies and music, say "entertainment".

After this selection, all user interaction with the system continues in speech mode. When plotted on a timing diagram, this system behavior (user interaction) appears as shown in Fig. 2.

The following example is represented by the timing diagram shown in Fig. 3.

Example 3: An implicit hybrid interface that alternates between DTMF and speech depending on speech response performance. If the design-time assumption is that an interfering noise source may appear and then fade away, error recovery falls back to DTMF but then returns to speech. An implicit "switch" can be provided that automatically exposes the DTMF interface when there is a temporary problem with the speech interface (excessive no-input or no-match events). A rule is established for switching to the other interface, for example on the second no-match.

User: (waits)

User: Weather

System: I'm sorry, I could not hear what you said.

User: Weather

System: I did not hear that; please repeat what you said.

User: Weather

System: For weather information, press 1. For news, press 2. For entertainment, press 3.

User: (presses key "1")

System: The weather in Boca Raton is ...

User: News

System: The top news for Tuesday, September 2, is ...

All user interaction with the system starts in speech mode but falls back to DTMF mode when needed. When plotted on a timing diagram, this system behavior (user interaction) appears as shown in Fig. 3.
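The temporary fallback of Example 3 can be sketched as a small mode controller. This is a hedged illustration, not the generated code itself: the class name, the event names, and the choice to return to speech after a single handled key press are assumptions of the sketch. The default threshold of two failures follows the "second no-match" rule given as an example above; the number of re-prompts before the switch is a design parameter.

```python
class TemporaryFallback:
    """Speech first; switch to DTMF after `threshold` consecutive failures,
    then return to speech once a DTMF key press has been handled."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self.failures = 0
        self.mode = "speech"

    def on_event(self, event: str) -> str:
        """Feed one input event and return the mode for the next prompt.

        Events (hypothetical names): "match", "no_match", "no_input"
        while in speech mode, and "key" while in DTMF mode.
        """
        if self.mode == "speech":
            if event in ("no_match", "no_input"):
                self.failures += 1
                if self.failures >= self.threshold:
                    self.mode = "dtmf"
            elif event == "match":
                self.failures = 0
        elif event == "key":
            # Assumption: one handled key press means the problem has cleared.
            self.failures = 0
            self.mode = "speech"
        return self.mode


controller = TemporaryFallback()
modes = [controller.on_event(e) for e in ("no_match", "no_match", "key", "match")]
```

In this trace the second failure exposes the DTMF menu, the key press is served in DTMF mode, and the following prompt is again a speech prompt, which is the round trip drawn in Fig. 3.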

The following example is represented by the timing diagram shown in Fig. 4.

Example 4: If the design assumption is that speech is the desired interface, but that a noisy environment is likely to remain noisy, a "switch" is provided that automatically exposes the DTMF interface when the speech interface has a problem, and all subsequent prompts are presented in DTMF (in effect, a switch to the DTMF UI).

User: (waits)

User: Weather

System: I'm sorry, I could not hear what you said.

User: Weather

System: I did not hear that; please repeat what you said.

User: Weather

System: For weather information, press 1. For news, press 2. For entertainment, press 3.

User: (presses key "1")

System: The weather in Boca Raton is ...

System: For weather information, press 1. For news, press 2. For entertainment, press 3.

User: (presses key "2")

System: The top news for Tuesday, September 2, is ...

All user interaction with the system starts in speech mode but falls back permanently to the DTMF interface if the speech recognition rate is low. When plotted on a timing diagram, this system behavior (user interaction) appears as shown in Fig. 4.
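The permanent fallback of Example 4 differs from the temporary one only in that the mode latches. A minimal Python sketch follows; the function name, the event names, and the default threshold of two consecutive failures are assumptions made for illustration and are not specified in the text.

```python
def modes_with_permanent_fallback(events, threshold=2):
    """Return the mode used for the prompt that follows each event.

    Once `threshold` consecutive speech failures have occurred, the mode
    latches to "dtmf" and stays there for the rest of the call.
    """
    mode, failures, result = "speech", 0, []
    for event in events:
        if mode == "speech":
            if event in ("no_match", "no_input"):
                failures += 1
            else:
                failures = 0
            if failures >= threshold:
                mode = "dtmf"  # latched: no later event resets it
        result.append(mode)
    return result


trace = modes_with_permanent_fallback(["no_match", "no_match", "key", "key"])
```

Here `trace` stays at `"dtmf"` after the second failure even though the later key presses succeed, matching the dialogue above in which the menu is re-presented as a key-press prompt.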

Fig. 5 is a computer screen showing the type selection interface of the present invention. Note that, in the automatic code generation engine, the interaction type can be selected at design time.

The present invention can be realized in hardware, software, or a combination of hardware and software. An implementation of the method and system of the present invention can be realized in a centralized fashion in one computer system, or in a distributed fashion in which different elements are spread across several interconnected computer systems. Any kind of computer system, or other apparatus adapted for carrying out the methods described herein, is suited to perform the functions described herein.

A typical combination of hardware and software could be a general-purpose computer system having a central processing unit and a computer program stored on a storage medium that, when loaded and executed, controls the computer system such that it carries out the methods described herein. The present invention can also be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein and which, when loaded in a computer system, is able to carry out these methods. Storage medium refers to any volatile or non-volatile storage device.

Computer program or application in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form. Significantly, this invention can be embodied in other specific forms without departing from the spirit or essential attributes thereof; accordingly, reference should be made to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.

Claims

1. A method of defining a standard speech/DTMF mixed user interface type, the mixed user interface type being used in generating voice application code for managing the presentation of a user interface (UI) in an application, the application supporting speech recognition and DTMF telephone key entry, the method comprising the steps of:

presenting a type selection menu that allows selection of one or more UI types, each UI type corresponding to a system response to a code generation request; and

upon selection of a UI type, preparing said system response for the code generation request;

wherein said one or more UI types include a mixed speech and DTMF interface type, the mixed speech and DTMF interface type allowing an automatic switch from a speech interface mode to a DTMF interface mode when a problem is detected in speech recognition.
2. The method of claim 1, wherein said one or more UI types further include a DTMF-only interface type.
3. The method of claim 1, wherein said one or more UI types further include a speech-only interface type.
4. The method of claim 1, wherein, in said mixed speech and DTMF interface type, the DTMF interface mode temporarily replaces the speech interface mode, and the interface switches from the DTMF interface mode to the speech interface mode when the problem in speech recognition is eliminated.
5. The method of claim 1, wherein, in said mixed speech and DTMF interface type, the DTMF interface mode permanently replaces the speech interface mode.
6. A system for managing a standard speech/DTMF mixed user interface type, the system comprising a computer, the computer comprising a type selection menu that allows selection of one or more UI types, each UI type corresponding to a system response to a code generation request;

wherein said one or more UI types include a mixed speech and DTMF interface type, the mixed speech and DTMF interface type allowing an automatic switch from a speech interface mode to a DTMF interface mode when a problem is detected in speech recognition.
7. The system of claim 6, wherein said one or more UI types further include a DTMF-only interface type.
8. The system of claim 6, wherein said one or more UI types further include a speech-only interface type.
9. The system of claim 6, wherein, in said mixed speech and DTMF interface type, the DTMF interface mode temporarily replaces the speech interface mode, and the interface switches from the DTMF interface mode to the speech interface mode when the problem in speech recognition is eliminated.
10. The system of claim 6, wherein, in said mixed speech and DTMF interface type, the DTMF interface mode permanently replaces the speech interface mode.