CN109977218A - A kind of automatic answering system and method applied to session operational scenarios - Google Patents

A kind of automatic answering system and method applied to session operational scenarios

Info

Publication number
CN109977218A
Authority
CN
China
Prior art keywords
session
module
speech
voice gateway
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910324994.5A
Other languages
Chinese (zh)
Other versions
CN109977218B (en
Inventor
孟宪坤
田文
曹金龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Huakun Dove Data Technology Co Ltd
Original Assignee
Zhejiang Huakun Dove Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Huakun Dove Data Technology Co Ltd
Priority to CN201910324994.5A
Publication of CN109977218A
Application granted
Publication of CN109977218B
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/50 Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/527 Centralised call answering arrangements not requiring operator intervention

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

An automatic answering system and method for conversation scenarios, belonging to the field of telephone communication technology. The system comprises a client session processing module and an AI session engine module. The client session processing module serves as the front-end adaptation layer for interaction, performing content recognition and event recognition ahead of the scenario dialogue; it includes a client module and a voice gateway module. The AI session engine module records and controls the current session state and, in combination with the events passed in by the voice gateway module, issues different instructions to the client session processing module. The invention retrieves knowledge quickly; it supports multiple access forms (voice, text, or mixed voice and text) and can be extended to other service channels; and it is more flexible and more widely applicable than conventional intelligent voice access.

Description

A kind of automatic answering system and method applied to session operational scenarios
Technical field
The invention belongs to the field of telephone communication technology, and specifically relates to an automatic answering system and method for conversation scenarios.
Background
Day-to-day communication between enterprises and customers consumes a great deal of labor cost and staff effort. Sales and service staff differ in knowledge and experience, as well as in comprehension, expression, mood, and spoken style, so service quality is uneven, which affects service outcomes. Intelligent voice systems emerged to replace human agents in communicating with customers.
At present, intelligent voice systems use ASR (real-time speech recognition) and NLP (natural language understanding) to let a machine understand human speech in real time and conduct AI conversations in scenarios such as customer service and sales. By training on a large corpus of human speech, a recognition model of good quality can be obtained for a given scenario. The voice gateway module sends the human voice to the ASR in real time for recognition and obtains the result as text, which is used for keyword matching or semantic processing to retrieve a preset question-and-answer script; the script is played back as audio, completing the human-machine voice exchange.
Although existing schemes support spoken communication between the voice gateway module and a person, the exchange is essentially question-and-answer, and it is hard to achieve human-level handling of interjections, so the conversation feels rigid and unnatural. When a visitor suddenly interjects and the voice gateway module does not react, it seems rude and unfriendly: the user must listen to the entire preset script and cannot interrupt or raise a question while the script is playing, so communication is neither timely nor quick. Moreover, a visitor's interjection may carry a more urgent question; if the dialogue does not switch to the relevant node in time, the customer's time is wasted. In summary, existing schemes for voice exchange between an intelligent voice gateway module and a person still need improvement in interaction experience and communication efficiency.
Summary of the invention
The object of the invention is to overcome the defects and deficiencies mentioned above and to provide an automatic answering system for conversation scenarios.
Another object of the invention is to provide an automatic answering method for conversation scenarios.
An automatic answering system for conversation scenarios, characterized by comprising a client session processing module and an AI session engine module;
The client session processing module serves as the front-end adaptation layer for interaction, performing content recognition and event recognition ahead of the scenario dialogue, and includes a client module and a voice gateway module. The client module is a mobile phone or landline with voice communication capability, or a social tool capable of text communication. The voice gateway module actively initiates or answers calls according to the dial plan; once the line is connected, it produces the corresponding ESL event for each action generated by the client module, sends the event to the AI session engine module, and receives and executes the corresponding actions from the AI session engine module;
The AI session engine module records and controls the current session state and, in combination with the events passed in by the voice gateway module, issues different instructions to the client session processing module.
Further, the voice gateway module converts the communication protocols of the client module's various channels into a unified communication protocol and performs event recognition. The following core events can be generated during the life cycle of one connected call:
1. SPEECH CONNET: generated after a communication connection is established with the client module;
2. SPEECH CHANNEL_ANSWER: generated after the client module answers the call;
3. SPEECH CHANNEL_EXECUTE: generated when playback of a piece of voice to the client module starts;
4. SPEECH CHANNEL_EXECUTE_COMPLETE: generated after the voice has finished playing;
5. SPEECH ASR_START: generated when an incoming voice stream from the client module is detected;
6. SPEECH ASR_END: generated after transmission of the client's voice stream ends; this event carries the client's ASR result, that is, the speech-to-text result;
7. SPEECH HANGUP: generated after either party hangs up.
Further, the voice gateway module has event generation and ASR capabilities; it converts the voice stream passed in by the client module to text and sends it to the AI session engine module in the form of events.
Further, the AI session engine module contains a dialogue control module, a tree-shaped data storage structure, a general knowledge base, and a system knowledge base. The data storage structure stores the scripts; the general knowledge base stores general knowledge; the system knowledge base stores system knowledge.
Further, the session state is one of: voice gateway module not playing, voice gateway module playing, voice gateway module paused, or call ended. The instructions that the session state can produce in combination with events passed in by the voice gateway module are: initialization, actively initiate dialogue, other party's speech content, other party is silent, other party interrupts the voice gateway module, and other party is speaking;
The session state transition logic of the AI session engine module is as follows:
1. AI CONNET: when the event "SPEECH CONNET" arrives from the voice gateway module, an initialization instruction is generated and sent to the voice gateway module; the session state is not changed;
2. AI CHANNEL_ANSWER: when the event "SPEECH CHANNEL_ANSWER" arrives from the voice gateway module, an "other party's speech content" instruction is generated and sent to the voice gateway module, and the session state is transferred to "voice gateway module not playing"; after "SPEECH CHANNEL_ANSWER", the script is executed and an "actively initiate dialogue" instruction is generated;
3. AI CHANNEL_EXECUTE: when the event "SPEECH CHANNEL_EXECUTE" arrives from the voice gateway module, the current session state is checked; if it is "paused", the state is not changed; otherwise the session state is transferred to "voice gateway module playing" and an "other party is silent" instruction is generated;
4. AI CHANNE_EXECUTE_COMPLETE: when the event "SPEECH CHANNEL_EXECUTE_COMPLETE" arrives from the voice gateway module, the session state is transferred to "voice gateway module not playing";
5. AI ASR_START: when the event "SPEECH ASR_END" arrives from the voice gateway module, if the current state is "voice gateway module playing", an "other party interrupts the voice gateway module" instruction is generated; the AI session engine module decides, based on the interrupting words recognized by the speech recognition module, whether to pause the voice gateway module's playback: if playback is paused, the session state is transferred to "paused" and an "other party is speaking" instruction is generated and sent to the voice gateway module; if playback is not paused, the session state is not changed;
6. AI ASR_END: when the session state is "paused", whether to hang up is decided according to the event returned by the voice gateway module; an "actively initiate dialogue" instruction is generated, and the session state is transferred to "call ended" or "voice gateway module playing";
7. AI HANGUP: generated after either party hangs up;
An automatic answering method using the above system, comprising the following steps:
Step 1. The client module and the voice gateway module establish a two-way communication channel, generating the event "SPEECH CONNET";
Step 2. The AI session engine module generates the event "AI CONNET" and an initialization instruction, reads the scripts, reads the knowledge base, loads the global ambiguity dictionary, and returns the initialization instruction to the client module through the voice gateway module;
Step 3. The client module answers the call, generating the event "SPEECH CHANNEL_ANSWER"; the voice gateway module receives the speech information sent by the client module and sends it to the AI session engine module, which matches it against the scripts and knowledge; the AI session engine module generates the event "AI CHANNEL_ANSWER";
Step 3a: ambiguous words in the speech information are corrected; ambiguity correction covers both homophones and synonyms;
Step 3b: according to the session node reached by the current speech information, the script keywords in the speech information are obtained, and the content of the speech information is matched against intention branches using regular expressions;
If the speech information matches an intention branch, the next connected script node is obtained from the tree-shaped data storage structure; script nodes are divided into ordinary nodes and jump nodes;
Ordinary node: holds the script that answers the user; step 4 is executed, and an action flag for sending a short message may be added;
Jump node: covers the following actions: (1) jump to the next main-flow node and continue answering the user with its script, executing step 4; (2) jump to a specified main-flow node and answer the user with its script, executing step 4; (3) hang up, ending the session; (4) send a short message that the user has requested;
After the feedback actions of the script node have been collected, they are sent to the voice gateway module for execution;
If no intention branch is matched, the knowledge base is searched; the knowledge base is divided into two classes:
General knowledge base: answers the user's business questions; step 4 is executed;
System knowledge base, subdivided into three classes: (1) unanswerable handling: no answer can be retrieved for the user's question, so the AI session engine module executes a preset voice playback action and step 4 is executed; (2) interruption handling: the user interrupts while an action fed back by the AI session engine module is being executed, so the AI engine stops the action in progress and step 5 is executed; (3) repeat handling: the user did not hear or did not understand the feedback, so the AI session engine module repeats the previous action;
Step 4. The voice gateway module generates the event "SPEECH CHANNEL_EXECUTE" and the AI session engine module generates the event "AI CHANNEL_EXECUTE": the AI session engine module judges whether a piece of speech information has been completely received; when it has, the AI session engine module generates an initiate-dialogue instruction and sends the matched script to the voice gateway module, and the session state is transferred to "voice gateway module playing"; the voice gateway module starts playing the matched script voice to the client module; step 3 is then executed;
Step 5. While the script voice is playing, if the voice gateway module detects an incoming voice stream from the client module, it applies noise filtering to the stream and passes it to the AI session engine module for interruption-word filtering; that is, the voice gateway module generates the events "SPEECH ASR_START" and "SPEECH ASR_END". If the voice stream consists entirely of filter words, the AI session engine module generates no new action; if it does not, the AI session engine module performs the interruption: it sends an interrupt instruction to the voice gateway module, which stops playing the script voice, that is, the AI session engine module generates the events "AI ASR_START" and "AI ASR_END"; step 3 is then executed.
Further, in step 4 the AI session engine module judges whether a piece of speech information has been completely received using the following method:
The AI session engine module samples the session, setting N sampling points within a preset period T. N is the total number of sampling points, the sum of the number of fixed sampling points n1 and the number of random sampling points n2, i.e. N = n1 + n2. The sampling time of a fixed sampling point is x·t1 ± t2, where x is a positive integer not greater than n1, t1 = T/n1, and t2 is a time offset produced by a random function with 0 < t2 < t1. The n2 random sampling points are taken at random times within T, with 0 < n2 ≤ n1/2;
When the recorded data at a point is voiced, the point is a valid sampling point; when it is silent, the point is an invalid sampling point. When valid sampling points account for more than half of the total number N, the speech information is judged not yet completely received; otherwise it is judged completely received.
The object of the invention is to use AI-related technology to realize machine-to-human dialogue, with dialogue interfaces covering telephone, voice over IP, IM messaging, and similar scenarios. The invention helps enterprises build intelligent customer service systems in which the machine carries out the great majority of repetitive interaction work; while providing customers with product introduction and guidance services, the machine also collects important data from the interaction, providing a data basis for subsequent big data analysis.
Compared with similar products in the industry, the invention retrieves knowledge quickly; it supports multiple access forms (voice, text, or mixed voice and text) and can be extended to other service channels; and it is more flexible and more widely applicable than conventional intelligent voice access.
Brief description of the drawings
Fig. 1 is a structural block diagram of the system;
Fig. 2 is a flow chart of the invention;
Fig. 3 is a sequence diagram of script processing.
Specific embodiment
An automatic answering system for conversation scenarios comprises a client session processing module and an AI session engine module.
The client session processing module serves as the front-end adaptation layer for interaction, performing content recognition and event recognition ahead of the scenario dialogue, and includes a client module and a voice gateway module.
The client module is a mobile phone or landline with voice communication capability, or a social tool capable of text communication such as WeChat or a WeChat official account.
The voice gateway module actively initiates or answers calls according to the dial plan. Once the line is connected, it produces the corresponding ESL (Event Socket Library) event for each action generated by the client module, sends the event to the AI session engine module, and receives and executes the corresponding actions from the AI session engine module.
The voice gateway module converts the communication protocols of the client module's various channels into a unified communication protocol and performs event recognition. The following core events can be generated during the life cycle of one connected call:
1. SPEECH CONNET: generated after a communication connection is established with the client module;
2. SPEECH CHANNEL_ANSWER: generated after the client module answers the call;
3. SPEECH CHANNEL_EXECUTE: generated when playback of a piece of voice to the client module starts;
4. SPEECH CHANNEL_EXECUTE_COMPLETE: generated after the voice has finished playing;
5. SPEECH ASR_START: generated when an incoming voice stream from the client module is detected;
6. SPEECH ASR_END: generated after transmission of the client's voice stream ends; this event carries the client's ASR result, that is, the speech-to-text result;
7. SPEECH HANGUP: generated after either party hangs up.
Under normal circumstances these events occur in a fairly regular order: SPEECH CHANNEL_EXECUTE and SPEECH CHANNEL_EXECUTE_COMPLETE mostly occur as a pair, one after the other, as do SPEECH ASR_START and SPEECH ASR_END. The last event to occur is SPEECH HANGUP.
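The event vocabulary and its usual ordering can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `SpeechEvent` and the helper `is_well_ordered` are hypothetical, and the check (first event is SPEECH CONNET, last is SPEECH HANGUP, no closing event without its opener) is stricter than the text, which says only that the pairs "mostly" occur together.

```python
from enum import Enum


class SpeechEvent(Enum):
    """Core gateway events; the string values are the names used in the text."""
    CONNET = "SPEECH CONNET"
    CHANNEL_ANSWER = "SPEECH CHANNEL_ANSWER"
    CHANNEL_EXECUTE = "SPEECH CHANNEL_EXECUTE"
    CHANNEL_EXECUTE_COMPLETE = "SPEECH CHANNEL_EXECUTE_COMPLETE"
    ASR_START = "SPEECH ASR_START"
    ASR_END = "SPEECH ASR_END"
    HANGUP = "SPEECH HANGUP"


# Opening event -> the closing event that normally follows it.
PAIRS = {
    SpeechEvent.CHANNEL_EXECUTE: SpeechEvent.CHANNEL_EXECUTE_COMPLETE,
    SpeechEvent.ASR_START: SpeechEvent.ASR_END,
}


def is_well_ordered(events):
    """Illustrative check: starts with CONNET, ends with HANGUP,
    and never sees a closing event before its opening event."""
    if len(events) < 2:
        return False
    if events[0] is not SpeechEvent.CONNET or events[-1] is not SpeechEvent.HANGUP:
        return False
    closers = {closer: opener for opener, closer in PAIRS.items()}
    open_count = {opener: 0 for opener in PAIRS}
    for event in events[1:-1]:
        if event in PAIRS:
            open_count[event] += 1
        elif event in closers:
            opener = closers[event]
            if open_count[opener] == 0:
                return False
            open_count[opener] -= 1
    return True
```

A pair may still be open when the call ends (for example, a hang-up during playback), so the sketch does not require every opener to be closed.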
The voice gateway module has event generation and ASR (Automatic Speech Recognition) capabilities; it converts the voice stream passed in by the client module to text and sends it to the AI session engine module in the form of events. For example, the voice gateway module integrates a speech recognition module, which may use Iflytek's Aitalk 2.0, InterReco 2.0, and the like.
The AI session engine module records and controls the current session state and, in combination with the events passed in by the voice gateway module, issues different instructions to the client session processing module. It contains a dialogue control module, a tree-shaped data storage structure, a general knowledge base, and a system knowledge base. The data storage structure stores the scripts. The general knowledge base stores general knowledge. The system knowledge base stores system knowledge, so that the specialist knowledge of a field can be added or removed according to the domain requirements of the dialogue.
The session state is one of: voice gateway module not playing, voice gateway module playing, voice gateway module paused, or call ended. The instructions that the session state can produce in combination with events passed in by the voice gateway module are: initialization, actively initiate dialogue, other party's speech content, other party is silent, other party interrupts the voice gateway module, and other party is speaking.
The session state transition logic of the AI session engine module is as follows:
1. AI CONNET: when the event "SPEECH CONNET" arrives from the voice gateway module, an initialization instruction is generated and sent to the voice gateway module; the session state is not changed;
2. AI CHANNEL_ANSWER: when the event "SPEECH CHANNEL_ANSWER" arrives from the voice gateway module, an "other party's speech content" instruction is generated and sent to the voice gateway module, and the session state is transferred to "voice gateway module not playing"; after "SPEECH CHANNEL_ANSWER", the script is executed and an "actively initiate dialogue" instruction is generated;
3. AI CHANNEL_EXECUTE: when the event "SPEECH CHANNEL_EXECUTE" arrives from the voice gateway module, the current session state is checked; if it is "paused", the state is not changed; otherwise the session state is transferred to "voice gateway module playing" and an "other party is silent" instruction is generated;
4. AI CHANNE_EXECUTE_COMPLETE: when the event "SPEECH CHANNEL_EXECUTE_COMPLETE" arrives from the voice gateway module, the session state is transferred to "voice gateway module not playing";
5. AI ASR_START: when the event "SPEECH ASR_END" arrives from the voice gateway module, if the current state is "voice gateway module playing", an "other party interrupts the voice gateway module" instruction is generated; the AI session engine module decides, based on the interrupting words recognized by the speech recognition module, whether to pause the voice gateway module's playback: if playback is paused, the session state is transferred to "paused" and an "other party is speaking" instruction is generated and sent to the voice gateway module; if playback is not paused, the session state is not changed;
6. AI ASR_END: when the session state is "paused", whether to hang up is decided according to the event returned by the voice gateway module; an "actively initiate dialogue" instruction is generated, and the session state is transferred to "call ended" or "voice gateway module playing";
7. AI HANGUP: generated after either party hangs up.
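The transition rules above can be read as a small state machine. The following Python sketch is one possible reading, not the patented implementation: the names `State` and `step`, the instruction strings, and the `should_pause` flag (standing in for the engine's decision on the recognized interrupting words) are all hypothetical. Treating SPEECH HANGUP as a transition to the call-ended state is an assumption, and rule 6 (resume or hang up from the paused state) is left out for brevity.

```python
from enum import Enum, auto


class State(Enum):
    NOT_PLAYING = auto()   # voice gateway module not playing
    PLAYING = auto()       # voice gateway module playing
    PAUSED = auto()        # voice gateway module paused
    ENDED = auto()         # call ended


def step(state, event, should_pause=False):
    """Return (new_state, instructions) for one incoming gateway event.

    should_pause is a hypothetical input: the engine's decision, based on
    the recognized interrupting words, on whether playback must pause.
    """
    if event == "SPEECH CONNET":
        return state, ["initialize"]                      # rule 1: state unchanged
    if event == "SPEECH CHANNEL_ANSWER":
        return State.NOT_PLAYING, ["peer_speech_content", "initiate_dialogue"]
    if event == "SPEECH CHANNEL_EXECUTE":
        if state is State.PAUSED:
            return state, []                              # rule 3: paused stays paused
        return State.PLAYING, ["peer_silent"]
    if event == "SPEECH CHANNEL_EXECUTE_COMPLETE":
        return State.NOT_PLAYING, []                      # rule 4
    if event == "SPEECH ASR_END":
        if state is State.PLAYING:                        # rule 5
            if should_pause:
                return State.PAUSED, ["peer_interrupts", "peer_speaking"]
            return state, ["peer_interrupts"]
        return state, []
    if event == "SPEECH HANGUP":
        return State.ENDED, []                            # assumption, see lead-in
    return state, []
```

Keeping the function pure (state in, state and instructions out) makes each rule easy to exercise in isolation.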
The logic of the automatic answering system and method for conversation scenarios comprises the following steps:
Step 1. The client module and the voice gateway module establish a two-way communication channel, generating the event "SPEECH CONNET";
Step 2. The AI session engine module generates the event "AI CONNET" and an initialization instruction, reads the scripts, reads the knowledge base, loads the global ambiguity dictionary, and returns the initialization instruction to the client module through the voice gateway module.
Step 3. The client module answers the call, generating the event "SPEECH CHANNEL_ANSWER"; the voice gateway module receives the speech information sent by the client module and sends it to the AI session engine module, which matches it against the scripts and knowledge; the AI session engine module generates the event "AI CHANNEL_ANSWER".
Step 3a: ambiguous words in the speech information are corrected. Ambiguity correction covers both homophones and synonyms.
Step 3b: according to the session node reached by the current speech information, the script keywords in the speech information are obtained, and the content of the speech information is matched against intention branches using regular expressions.
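As a rough illustration of step 3a, the global ambiguity dictionary can be pictured as two lookup tables applied in turn. The table contents, the word-level tokenization, and the function name `correct_ambiguity` are assumptions made for this sketch; the source does not describe the dictionary format.

```python
# Hypothetical ambiguity dictionary: variant form -> canonical form.
AMBIGUITY_DICTIONARY = {
    "homophones": {"prize": "price"},
    "synonyms": {"cost": "price", "purchase": "buy"},
}


def correct_ambiguity(utterance, dictionary=AMBIGUITY_DICTIONARY):
    """Replace homophones first, then synonyms, word by word."""
    tokens = utterance.lower().split()
    for table_name in ("homophones", "synonyms"):
        table = dictionary[table_name]
        tokens = [table.get(token, token) for token in tokens]
    return " ".join(tokens)
```

Normalizing the ASR text this way means the later matching stages only need patterns for the canonical forms.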
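Step 3b can be sketched with Python's standard `re` module. The branch names and patterns below are invented for illustration, as is the rule that the first matching branch wins; the source says only that regular expressions are used to match the speech content to an intention branch.

```python
import re

# Hypothetical intention branches of one script node: name -> pattern.
BRANCHES = {
    "affirm": re.compile(r"\b(yes|sure|ok(ay)?)\b"),
    "ask_price": re.compile(r"\b(price|how much)\b"),
    "refuse": re.compile(r"\b(no|not interested)\b"),
}


def match_branch(utterance, branches=BRANCHES):
    """Return the name of the first branch whose pattern matches, else None."""
    text = utterance.lower()
    for name, pattern in branches.items():
        if pattern.search(text):
            return name
    return None
```

A return value of `None` corresponds to the case where no intention branch matches and the knowledge base is consulted instead.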
If the speech information matches an intention branch, the next connected script node is obtained from the tree-shaped data storage structure; script nodes are divided into ordinary nodes and jump nodes.
Ordinary node: holds the script that answers the user; step 4 is executed, and an action flag for sending a short message may be added.
Jump node: covers the following actions: (1) jump to the next main-flow node and continue answering the user with its script, executing step 4; (2) jump to a specified main-flow node and answer the user with its script, executing step 4; (3) hang up, ending the session; (4) send a short message that the user has requested.
After the feedback actions of the script node have been collected, they are sent to the voice gateway module for execution.
If no intention branch is matched, the knowledge base is searched; the knowledge base is divided into two classes:
General knowledge base: answers the user's business questions; step 4 is executed;
System knowledge base, subdivided into three classes: (1) unanswerable handling: no answer can be retrieved for the user's question, so the AI session engine module executes a preset voice playback action and step 4 is executed; (2) interruption handling: the user interrupts while an action fed back by the AI session engine module is being executed, so the AI engine stops the action in progress and step 5 is executed; (3) repeat handling: the user did not hear or did not understand the feedback, so the AI session engine module repeats the previous action.
Step 4. The voice gateway module generates the event "SPEECH CHANNEL_EXECUTE" and the AI session engine module generates the event "AI CHANNEL_EXECUTE": the AI session engine module judges whether a piece of speech information has been completely received; when it has, the AI session engine module generates an initiate-dialogue instruction and sends the matched script to the voice gateway module, and the session state is transferred to "voice gateway module playing"; the voice gateway module starts playing the matched script voice to the client module; step 3 is then executed.
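The node types and the collection of feedback actions might be modeled as below. This is a sketch under assumptions: the `ScriptNode` fields, the action tuples, and the flat dictionary used to hold the tree are all hypothetical, chosen only to mirror the ordinary-node and jump-node behavior described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScriptNode:
    name: str
    script: str = ""                    # text played to the user (ordinary node)
    kind: str = "ordinary"              # "ordinary" or "jump"
    jump_action: Optional[str] = None   # "next", "goto", "hangup" or "sms"
    jump_target: Optional[str] = None   # target main-flow node for "next"/"goto"
    send_sms: bool = False              # optional short-message flag
    branches: dict = field(default_factory=dict)  # intention -> child node name


def next_node(tree, current, intention):
    """Follow the matched intention branch to the connected child node."""
    child = tree[current].branches.get(intention)
    return tree[child] if child is not None else None


def collect_actions(node):
    """Collect the feedback actions to send to the voice gateway module."""
    actions = []
    if node.kind == "ordinary":
        actions.append(("play", node.script))
        if node.send_sms:
            actions.append(("sms", node.name))
    elif node.jump_action in ("next", "goto"):
        actions.append(("goto", node.jump_target))
    elif node.jump_action == "hangup":
        actions.append(("hangup", None))
    elif node.jump_action == "sms":
        actions.append(("sms", node.name))
    return actions
```

For example, a greeting node with branches `{"ask_price": "price", "refuse": "bye"}` would lead either to an ordinary node that plays a price script or to a jump node that hangs up.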
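The fallback order when no intention branch matches can be sketched as follows. Keyword containment, the repeat phrases, and the fallback sentence are assumptions made for illustration; the source does not specify how the knowledge bases are searched. Interruption handling is not shown here because it is triggered during playback (step 5), not by a lookup.

```python
REPEAT_PHRASES = ("say that again", "repeat", "pardon")   # hypothetical cues


def answer_from_knowledge(utterance, general_kb, last_action,
                          fallback="Sorry, I did not catch that."):
    """Return the action to execute when no intention branch matched.

    general_kb maps a keyword to an answer (an assumed, simplified format).
    last_action is the previously executed action, replayed on request.
    """
    text = utterance.lower()
    # General knowledge base: answer the user's business question.
    for keyword, answer in general_kb.items():
        if keyword in text:
            return ("play", answer)
    # System knowledge base, repeat handling: replay the previous action.
    if any(phrase in text for phrase in REPEAT_PHRASES):
        return last_action
    # System knowledge base, unanswerable handling: preset playback.
    return ("play", fallback)
```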
The AI session engine module judges whether a piece of speech information has been completely received using the following method:
The AI session engine module samples the session, setting N sampling points within a preset period T. N is the total number of sampling points, the sum of the number of fixed sampling points n1 and the number of random sampling points n2, i.e. N = n1 + n2. The sampling time of a fixed sampling point is x·t1 ± t2, where x is a positive integer not greater than n1, t1 = T/n1, and t2 is a time offset produced by a random function with 0 < t2 < t1. The n2 random sampling points are taken at random times within T, with 0 < n2 ≤ n1/2.
When the recorded data at a point is voiced, the point is a valid sampling point; when it is silent, the point is an invalid sampling point. When valid sampling points account for more than half of the total number N, the speech information is judged not yet completely received; otherwise it is judged completely received.
For example, 35 sampling points are set within a preset period of 30 seconds: the number of fixed sampling points n1 is 30 and the number of random sampling points n2 is 5. Each fixed sampling point is taken within t2 before or after its one-second mark, and the random sampling points are taken at random times within the 30 seconds.
This method ensures that the sampling points are both random and evenly spread. Some users speak with a regular rhythm; if the interval between fixed sampling points were a constant, the two rhythms could coincide, and several consecutive sampling points would easily all fall on voiced points or all on silent points. Traditional fixed-interval sampling is therefore one-sided. In this method, each fixed sampling point is taken within t2 before or after its fixed time mark; the fixed time marks, spaced t1 apart, cover the whole period T, and the interval between adjacent fixed sampling points is random.
This method also adds random sampling points, which avoids the drawback that the start or end of the period cannot be sampled. For example, since the first sampling point lies in (t1 ± t2) with 0 < t2 < t1, the start of the period cannot be sampled. If t2 = t1, three consecutive adjacent fixed sampling points could fall on the same instant, increasing the overlap rate; hence t2 ≠ t1. The random sampling points also increase the sampling density at random and improve the fidelity of the sampling.
Step 5. While the script voice is playing, if the voice gateway module detects an incoming voice stream from the client module, it applies noise filtering to the stream and passes it to the AI session engine module for interruption-word filtering; that is, the voice gateway module generates the events "SPEECH ASR_START" and "SPEECH ASR_END". If the voice stream consists entirely of filter words, the AI session engine module generates no new action; if it does not, the AI session engine module performs the interruption: it sends an interrupt instruction to the voice gateway module, which stops playing the script voice, that is, the AI session engine module generates the events "AI ASR_START" and "AI ASR_END"; step 3 is then executed. Filter words are words with no substantive meaning, such as "uh", "eh", and "good".
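The interruption-word filter in step 5 reduces to a check on whether every recognized word is a filter word. In this sketch the word list extends the examples in the text with a few assumed entries, and the whitespace tokenization and the name `should_interrupt` are likewise assumptions.

```python
# Filter words: "uh", "eh" and "good" come from the text; the rest are assumed.
FILTER_WORDS = {"uh", "eh", "good", "um", "ok", "right"}


def should_interrupt(asr_text, filter_words=FILTER_WORDS):
    """Return True when playback should stop for this recognized utterance."""
    words = [word.strip(".,!?").lower() for word in asr_text.split()]
    words = [word for word in words if word]
    if not words:
        return False                     # nothing recognized: keep playing
    return not all(word in filter_words for word in words)
```

An utterance such as "uh, good" leaves playback running, while "wait, how much is it?" triggers the interrupt instruction.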
The invention can connect to any communication tool such as telephone, WeChat, microblog, and web IM, and covers a wide range of communication tools. Only the tool-access layer needs to be extended; the underlying intelligent question-answering engine needs no modification. A single configuration serves multiple channels such as enterprise telephone customer service and WeChat customer service, which quickly expands user interaction channels and experience and reduces enterprise maintenance costs.
To improve the overall response speed of the system and handle potentially massive concurrent requests, the communication layer between the event processing module and the voice gateway module is built on Netty, a high-performance network framework.
In handling interruptions during voice communication, the invention is more human-like than the traditional approach: replies that are common in ordinary conversation and do not call for an interruption, such as "right" or "okay", are recognized and do not interrupt playback.
Compared with traditional training methods, the invention provides more training information, including the matching category of the knowledge point, keywords, ambiguous words, the intention branch matched, which other knowledge points were matched, and which knowledge point was preferred; this makes it more efficient for trainers to train scripts and troubleshoot problems.
The conversation mode is flexible: the system can actively initiate a dialogue or receive an incoming service session. Such simple modifications all fall within the framework of this technical solution.
It should be understood that those of ordinary skill in the art may make equivalent substitutions or changes according to the technical solution of the invention and its inventive concept, and all such changes or substitutions shall fall within the protection scope of the appended claims.

Claims (7)

1. An automatic answering system applied to session scenarios, characterized by comprising: a client session processing module and an AI session engine module;
The client session processing module serves as the front-facing adaptation layer for interaction and assists with front-end content recognition and event recognition for the scenario dialogue; it includes a client module and a voice gateway module; the client module is a mobile phone or landline phone with a voice communication function, or a social tool capable of text communication; the voice gateway module actively initiates or answers calls according to the dial plan, produces the corresponding ESL events for the series of actions generated by the client module after the line is connected and sends them to the AI session engine module, and receives instructions from the AI session engine module and executes the corresponding actions;
The AI session engine module records and controls the current session state and, in combination with the events passed in by the voice gateway module, issues different instructions to the client session processing module.
2. The automatic answering system applied to session scenarios according to claim 1, characterized in that the voice gateway module converts the communication protocols of the various channels of the client module into a unified communication protocol and performs event recognition; the following core events can be generated during the life cycle of one connected call:
1. SPEECH CONNET: generated after a communication connection is established with the client module;
2. SPEECH CHANNEL_ANSWER: generated after the client module answers the call;
3. SPEECH CHANNEL_EXECUTE: generated when a segment of audio starts playing to the client module;
4. SPEECH CHANNEL_EXECUTE_COMPLETE: generated after the audio finishes playing;
5. SPEECH ASR_START: generated after an incoming voice stream from the client module is detected;
6. SPEECH ASR_END: generated after the client's voice stream has finished transmitting; this event carries the client's ASR result, i.e. the speech-to-text result;
7. SPEECH HANGUP: generated after either party actively hangs up.
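The seven core events listed in claim 2 could be modelled as an enumeration plus a small message type. This is only a sketch: the event strings are taken from the claim as written, while `SpeechEvent`, `GatewayMessage`, `call_id` and `asr_text` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class SpeechEvent(Enum):
    """Core voice gateway events, spelled exactly as in the claim."""
    CONNET = "SPEECH CONNET"
    CHANNEL_ANSWER = "SPEECH CHANNEL_ANSWER"
    CHANNEL_EXECUTE = "SPEECH CHANNEL_EXECUTE"
    CHANNEL_EXECUTE_COMPLETE = "SPEECH CHANNEL_EXECUTE_COMPLETE"
    ASR_START = "SPEECH ASR_START"
    ASR_END = "SPEECH ASR_END"
    HANGUP = "SPEECH HANGUP"


@dataclass
class GatewayMessage:
    """Hypothetical envelope sent from the voice gateway to the AI session engine."""
    event: SpeechEvent
    call_id: str
    asr_text: Optional[str] = None  # speech-to-text result

    def __post_init__(self):
        # Per the claim, only SPEECH ASR_END carries the ASR result.
        if self.asr_text is not None and self.event is not SpeechEvent.ASR_END:
            raise ValueError("only SPEECH ASR_END carries an ASR result")
```

For example, `GatewayMessage(SpeechEvent.ASR_END, "call-1", "how much is it")` is accepted, while attaching text to any other event raises `ValueError`.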
3. The automatic answering system applied to session scenarios according to claim 1, characterized in that the voice gateway module has event generation capability and ASR comprehension capability, and converts the voice stream passed in by the client module into text, which is sent to the AI session engine module in the form of events.
4. The automatic answering system applied to session scenarios according to claim 2, characterized in that the AI session engine module contains a dialogue control module, a tree-shaped data storage structure, a general knowledge base and a system knowledge base; the data storage structure is the storage component for scripts; the general knowledge base is the storage component for general knowledge; the system knowledge base is the storage component for system knowledge.
5. The automatic answering system applied to session scenarios according to claim 4, characterized in that: the session state is divided into voice gateway module not playing, voice gateway module playing, voice gateway module paused, and call ended; the instructions that can be produced by combining the session state with the events passed in by the voice gateway module include initialization, actively initiating a dialogue, the other party's speech content, the other party is silent, and the other party interrupts the voice gateway module's speech;
The session state transition logic of the AI session engine module is as follows:
1. AI CONNET: when the event "SPEECH CONNET" passed in by the voice gateway module arrives, an initialization instruction is generated and sent to the voice gateway module; the session state is not changed at this point;
2. AI CHANNEL_ANSWER: when the event "SPEECH CHANNEL_ANSWER" passed in by the voice gateway module arrives, an instruction carrying the other party's speech content is generated and sent to the voice gateway module, and the session state is transferred to "voice gateway module not playing"; after "SPEECH CHANNEL_ANSWER", the script is executed and an instruction to actively initiate a dialogue is generated;
3. AI CHANNEL_EXECUTE: when the event "SPEECH CHANNEL_EXECUTE" passed in by the voice gateway module arrives, the current session state is checked: if it is paused, the state is not changed; otherwise the session state is transferred to "voice gateway module playing" and an instruction that the other party is silent is generated;
4. AI CHANNE_EXECUTE_COMPLETE: when the event "SPEECH CHANNEL_EXECUTE_COMPLETE" passed in by the voice gateway module arrives, the session state is transferred to "voice gateway module not playing";
5. AI ASR_START: when the event "SPEECH ASR_END" passed in by the voice gateway module arrives, if the current state is "voice gateway module playing", an instruction that the other party interrupts the voice gateway module's speech is generated; the AI session engine module identifies the interrupting utterance via the speech recognition module and decides whether to pause the voice gateway module's playback, transferring the session state to paused or leaving it unchanged: if playback is paused, the session state is transferred to paused, and an instruction that the other party is speaking is generated and sent to the voice gateway module; if playback is not paused, the session state is not transferred;
6. AI ASR_END: when the session state has been transferred to paused, whether to hang up or to generate an instruction to actively initiate a dialogue is decided according to the event returned by the voice gateway module, and the session state is transferred to "call ended" or "voice gateway module playing";
7. AI HANGUP: generated after either party actively hangs up.
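The transition rules of claim 5 can be read as a small state machine. The sketch below is one possible reading under stated assumptions: the instruction names, the `pause_playback` and `hang_up` flags (standing in for the engine's own decisions), and the move to "call ended" on hang-up are illustrative and are not specified by the claim.

```python
from enum import Enum


class SessionState(Enum):
    NOT_PLAYING = "voice gateway module not playing"
    PLAYING = "voice gateway module playing"
    PAUSED = "voice gateway module paused"
    ENDED = "call ended"


def on_gateway_event(state, event, pause_playback=False):
    """Return (new_state, instructions) for one voice gateway event (rules 1-5, 7)."""
    if event == "SPEECH CONNET":
        return state, ["initialize"]  # rule 1: state unchanged
    if event == "SPEECH CHANNEL_ANSWER":
        return SessionState.NOT_PLAYING, ["speech_content", "initiate_dialogue"]
    if event == "SPEECH CHANNEL_EXECUTE":
        if state is SessionState.PAUSED:
            return state, []  # rule 3: a paused session stays paused
        return SessionState.PLAYING, ["other_party_silent"]
    if event == "SPEECH CHANNEL_EXECUTE_COMPLETE":
        return SessionState.NOT_PLAYING, []
    if event == "SPEECH ASR_END" and state is SessionState.PLAYING:
        if pause_playback:  # hypothetical result of interruption-word filtering
            return SessionState.PAUSED, ["other_party_interrupts", "other_party_speaking"]
        return state, ["other_party_interrupts"]
    if event == "SPEECH HANGUP":
        return SessionState.ENDED, []  # assumption: hang-up ends the call
    return state, []


def resolve_pause(state, hang_up):
    """Rule 6: leave the paused state by hanging up or by resuming the dialogue."""
    if state is not SessionState.PAUSED:
        return state, []
    if hang_up:
        return SessionState.ENDED, []
    return SessionState.PLAYING, ["initiate_dialogue"]
```

Returning the new state together with the generated instructions keeps the function pure, so the rules can be checked without a live voice gateway.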
6. An automatic answering method using the system according to claim 5, comprising the following steps:
Step 1. The client module and the voice gateway module establish a two-way communication path, generating the event "SPEECH CONNET";
Step 2. The AI session engine module generates the event "AI CONNET", generates an initialization instruction, reads the script, reads the knowledge base and loads the global ambiguity dictionary; the initialization instruction is returned to the client module through the voice gateway module;
Step 3. The client module answers the call, generating the event "SPEECH CHANNEL_ANSWER"; the voice gateway module receives the speech information sent by the client module and sends it to the AI session engine module for script and knowledge matching; the AI session engine module generates the event "AI CHANNEL_ANSWER";
Step 3a: ambiguity correction is applied to the speech information; ambiguity correction includes the correction of homonyms and the correction of synonyms;
Step 3b: according to the session node that the current speech information has reached, the script keywords in the speech information are obtained, and the session content of the speech information is matched to an intent branch using regular expressions;
If the speech information matches an intent branch, the next connected script node is obtained according to the tree-shaped data storage structure; script nodes are divided into ordinary nodes and jump nodes;
Ordinary node: replies to the user with a script and executes step 4; an action flag for sending a text message may be added;
Jump node: divided into the following actions: (1) jump to the next main-flow node, continue replying to the user with the script, and execute step 4; (2) jump to a specified main-flow node, reply to the user with the script, and execute step 4; (3) hang-up action, ending this session; (4) send a text message, sending the text message the user requires;
After the feedback actions of the script node have been collected, the actions are sent to the voice gateway module for execution;
If no intent branch is matched, knowledge base matching is performed; the knowledge base is mainly divided into two classes:
General knowledge base: replies to the user's business questions and executes step 4;
System knowledge base, subdivided into three classes: (1) unanswerable handling: no answer can be retrieved for the user's question; in this case the AI session engine module executes a preset voice playback action and executes step 4; (2) interruption handling: while a feedback action of the AI session engine module is being executed, the user interrupts midway; the AI engine stops the action being executed and executes step 5; (3) repeat handling: the user did not hear or did not understand the feedback action, and the AI session engine module repeats the previous action once;
Step 4. The voice gateway module generates the event "SPEECH CHANNEL_EXECUTE" and the AI session engine module generates the event "AI CHANNEL_EXECUTE": the AI session engine module judges whether a segment of speech information has been fully received; when it has, the AI session engine module generates an initiate-dialogue instruction, sends the matched script to the voice gateway module, and transfers the session state to "voice gateway module playing"; the voice gateway module starts playing the matched script audio to the client module; step 3 is then executed;
Step 5. While the script audio is playing, if the voice gateway module detects an incoming voice stream from the client module, the voice gateway module applies noise filtering to the voice stream and forwards it to the AI session engine module for interruption-word filtering; that is, the voice gateway module generates the events "SPEECH ASR_START" and "SPEECH ASR_END"; if the voice stream consists entirely of filter words, the AI session engine module generates no new action; if the voice stream does not consist entirely of filter words, the AI session engine module performs the interrupt operation and sends an interrupt instruction to the voice gateway module, and the voice gateway module stops playing the script audio; that is, the AI session engine module generates the events "AI ASR_START" and "AI ASR_END", and step 3 is then executed.
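Steps 3a and 3b, together with the knowledge base fallback, can be sketched as follows. This is an illustrative reading only: `ScriptNode`, `correct_ambiguity`, `match_branch` and `respond` are hypothetical names, the ambiguity dictionary and knowledge base are modelled as plain dictionaries, and substring lookup stands in for whatever knowledge matching the system actually uses.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ScriptNode:
    """One node of the tree-shaped script store (hypothetical layout)."""
    name: str
    reply: str
    branches: dict = field(default_factory=dict)  # regex pattern -> child ScriptNode


def correct_ambiguity(utterance, corrections):
    """Step 3a: replace known homonyms/synonyms with their canonical form."""
    for wrong, canonical in corrections.items():
        utterance = utterance.replace(wrong, canonical)
    return utterance


def match_branch(node, utterance):
    """Step 3b: return the first child whose regex matches, else None."""
    for pattern, child in node.branches.items():
        if re.search(pattern, utterance):
            return child
    return None


def respond(node, utterance, corrections, knowledge_base,
            fallback="Sorry, I cannot answer that."):
    """Return (next_node, reply): intent branch first, then knowledge base, then fallback."""
    text = correct_ambiguity(utterance, corrections)
    child = match_branch(node, text)
    if child is not None:
        return child, child.reply
    for keyword, answer in knowledge_base.items():
        if keyword in text:
            return node, answer  # stay on the current node after a knowledge answer
    return node, fallback  # "unanswerable handling" with a preset reply
```

In this sketch a knowledge base answer leaves the session on the current script node, so the main flow resumes from where the user digressed; that choice is an assumption, not something the claim states.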
7. The automatic answering method applied to session scenarios according to claim 6, characterized in that: in step 4, the AI session engine module judges whether a segment of speech information has been fully received using the following method:
The AI session engine module performs session sampling, setting N sampling points within a preset period of time T; N is the total number of sampling points and is the sum of the number of fixed sampling points n1 and the number of random sampling points n2, i.e. N = n1 + n2; the sampling time of a fixed sampling point is x·t1 ± t2, where x is a positive integer not exceeding n1 and t1 = T/n1; t2 is a time interval generated by a random function, with 0 < t2 < t1; the n2 random sampling points are taken at random within the time T, with 0 < n2 ≤ n1/2;
When the data record is in a voiced state, the point is a valid sampling point; when the data record is in a silent state, the point is an invalid sampling point; when the valid sampling points account for more than half of the total number of sampling points N, the speech information is judged not to have been fully received; otherwise the speech information is judged to have been fully received.
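The sampling rule of claim 7 can be illustrated with a short sketch. Several details are assumptions made for the example: t2 is drawn independently for each fixed point, the sign of the offset is chosen at random, points that would fall past T are clamped to T, and `is_voiced` is a hypothetical callback standing in for the recorded voiced/silent state.

```python
import random


def sample_times(T, n1, n2, rng):
    """Return N = n1 + n2 sampling times within [0, T]."""
    if not (0 < n2 <= n1 / 2):
        raise ValueError("claim 7 requires 0 < n2 <= n1/2")
    t1 = T / n1
    times = []
    for x in range(1, n1 + 1):
        t2 = 0.0
        while t2 == 0.0:  # keep 0 < t2 < t1
            t2 = rng.random() * t1
        sign = rng.choice((-1, 1))
        # Fixed point x*t1 +/- t2; clamping to T is an assumption of this sketch.
        times.append(min(T, x * t1 + sign * t2))
    # Random points taken anywhere in the window.
    times.extend(rng.uniform(0, T) for _ in range(n2))
    return sorted(times)


def speech_finished(times, is_voiced):
    """Finished unless valid (voiced) points are more than half of all points."""
    valid = sum(1 for t in times if is_voiced(t))
    return valid <= len(times) / 2
```

For instance, with `T=2.0`, `n1=10` and `n2=3` the function returns 13 sampling times; if the caller is voiced at most of them, `speech_finished` reports that the utterance is still in progress.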
CN201910324994.5A 2019-04-22 2019-04-22 A kind of automatic answering system and method applied to session operational scenarios Expired - Fee Related CN109977218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910324994.5A CN109977218B (en) 2019-04-22 2019-04-22 A kind of automatic answering system and method applied to session operational scenarios

Publications (2)

Publication Number Publication Date
CN109977218A true CN109977218A (en) 2019-07-05
CN109977218B CN109977218B (en) 2019-10-25

Family

ID=67085716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910324994.5A Expired - Fee Related CN109977218B (en) 2019-04-22 2019-04-22 A kind of automatic answering system and method applied to session operational scenarios

Country Status (1)

Country Link
CN (1) CN109977218B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030182131A1 (en) * 2002-03-25 2003-09-25 Arnold James F. Method and apparatus for providing speech-driven routing between spoken language applications
CN101355602A (en) * 2008-09-04 2009-01-28 宇龙计算机通信科技(深圳)有限公司 Mobile terminal as well as method and system for automatically answering thereof
CN101404697A (en) * 2008-11-18 2009-04-08 中国电信股份有限公司 Calling center system and calling method for providing integrated information service
US10108702B2 (en) * 2015-08-24 2018-10-23 International Business Machines Corporation Topic shift detector
CN108846127A (en) * 2018-06-29 2018-11-20 北京百度网讯科技有限公司 A kind of voice interactive method, device, electronic equipment and storage medium
CN108989592A (en) * 2018-07-25 2018-12-11 南京瓦尔基里网络科技有限公司 A kind of intelligence words art interactive system and method for call center
CN109509471A (en) * 2018-12-28 2019-03-22 浙江百应科技有限公司 A method of the dialogue of intelligent sound robot is interrupted based on vad algorithm
CN109660405A (en) * 2019-01-10 2019-04-19 平安科技(深圳)有限公司 Disaster recovery method, device, equipment and the storage medium of call center

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110661927A (en) * 2019-09-18 2020-01-07 平安科技(深圳)有限公司 Voice interaction method and device, computer equipment and storage medium
CN110674278A (en) * 2019-10-09 2020-01-10 浙江百应科技有限公司 Text robot dialogue interaction method
CN110852799A (en) * 2019-11-07 2020-02-28 北京集奥聚合科技有限公司 User screening method and device based on intention label, electronic equipment and medium
CN111177310A (en) * 2019-12-06 2020-05-19 广西电网有限责任公司 Intelligent scene conversation method and device for power service robot
CN111177310B (en) * 2019-12-06 2023-08-18 广西电网有限责任公司 Intelligent scene conversation method and device for power service robot
CN111126076A (en) * 2019-12-28 2020-05-08 大唐网络有限公司 AI speech engine implementation method based on object
CN111209380A (en) * 2019-12-31 2020-05-29 深圳追一科技有限公司 Control method and device for conversation robot, computer device and storage medium
CN111338705A (en) * 2020-02-13 2020-06-26 贝壳技术有限公司 Data processing method, device and storage medium
CN111338705B (en) * 2020-02-13 2021-03-26 北京房江湖科技有限公司 Data processing method, device and storage medium
CN111402881B (en) * 2020-03-25 2023-02-10 广东叁友科技股份有限公司 Intelligent dialogue robot system and method for realizing intelligent dialogue
CN111402881A (en) * 2020-03-25 2020-07-10 广东叁友科技股份有限公司 Intelligent dialogue robot system and method for realizing intelligent dialogue
CN111917726A (en) * 2020-07-01 2020-11-10 中国建设银行股份有限公司 Adaptation layer, voice communication system and control method thereof
CN111917726B (en) * 2020-07-01 2022-03-15 中国建设银行股份有限公司 Adaptation layer, voice communication system and control method thereof
CN112037799B (en) * 2020-11-04 2021-04-06 深圳追一科技有限公司 Voice interrupt processing method and device, computer equipment and storage medium
CN112037799A (en) * 2020-11-04 2020-12-04 深圳追一科技有限公司 Voice interrupt processing method and device, computer equipment and storage medium
CN115659994A (en) * 2022-12-09 2023-01-31 深圳市人马互动科技有限公司 Data processing method and related device in human-computer interaction system

Also Published As

Publication number Publication date
CN109977218B (en) 2019-10-25

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information. Inventor after: Meng Xiankun, Tian Wen, Cao Quanlong. Inventor before: Meng Xiankun, Tian Wen, Cao Jinlong.
CP02 Change in the address of a patent holder. Address after: 310000 1-206, 206M, 5g Innovation Park, 1818-1 Wenyi West Road, Yuhang District, Hangzhou City, Zhejiang Province. Patentee after: ZHEJIANG HUAKUN DAOWEI DATA TECHNOLOGY Co.,Ltd. Address before: 310016 Room 2404, Building A, Hualian Times Building, Jianggan District, Hangzhou City, Zhejiang Province. Patentee before: ZHEJIANG HUAKUN DAOWEI DATA TECHNOLOGY Co.,Ltd.
CF01 Termination of patent right due to non-payment of annual fee. Granted publication date: 20191025