
CN115392264A - RASA-based task-type intelligent multi-turn dialogue method and related equipment - Google Patents


Info

Publication number
CN115392264A
Authority
CN
China
Prior art keywords
user
text information
rasa
intention
natural language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211342781.3A
Other languages
Chinese (zh)
Inventor
梁兴伟
王冰冰
严海强
杨波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Konka Group Co Ltd
Original Assignee
Konka Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Konka Group Co Ltd filed Critical Konka Group Co Ltd
Priority to CN202211342781.3A priority Critical patent/CN115392264A/en
Publication of CN115392264A publication Critical patent/CN115392264A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a RASA-based task-type intelligent multi-turn dialogue method and related equipment. The method comprises the following steps: constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user; controlling the natural language understanding module to perform intention detection and semantic slot filling on the text information to obtain the user intention and entity information, respectively; and controlling the multi-turn dialogue management module to match a response result for the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of the user dialogue. In the invention, the dialogue system is built on the RASA open-source framework using the pipeline method, so that the task of each module is clear and independent; the complex and tedious configuration process is presented graphically, which improves construction efficiency; and by adopting the Botfront open-source framework and front-end/back-end interaction, the model can be trained with one click once the relevant parameters are configured, realizing personalized service for the user.

Description

RASA-based task-type intelligent multi-turn dialogue method and related equipment
Technical Field
The invention relates to the technical field of man-machine interaction, in particular to a RASA-based task-type intelligent multi-turn dialogue method, system and terminal.
Background
The dialogue system is an important part of the field of human-computer interaction: humans use natural language to exchange information with the system, and the machine can provide personalized services for them.
A multi-turn dialogue system aims to understand a user's complex intention in the minimum number of turns and to provide targeted, personalized services. Research on multi-turn dialogue has made some progress, but a gap remains between research and practical application, and the following problems are faced: the traditional task-based multi-turn dialogue system has a complicated, highly repetitive process whose links are scattered, and no engineering workflow has been formed; most task-based multi-turn dialogue systems do not support visual analysis, so users or managers cannot intuitively evaluate the dialogue effect; and because the dialogue management corpus is not the original utterances input by the user but a structured dialogue story flow, containing intentions, word slots and historical dialogue content, that is labeled manually from the original input data, labeling the dialogue management corpus is difficult and personalized service for the user cannot be realized. In particular, when the dialogue scenes and processes are complex and the number of intentions and word slots is large, manual labeling is difficult and the quality of the corpus labeling is hard to guarantee. In summary, the prior art does not support visual analysis, so that users or managers cannot intuitively evaluate the dialogue effect; labeling of the dialogue management corpus is difficult; and user personalization cannot be realized.
Accordingly, there is a need for improvements and developments in the art.
Disclosure of Invention
The invention mainly aims to provide a RASA-based task-based intelligent multi-turn dialogue method and related equipment, and aims to solve the problems that visual analysis is not supported and user personalization cannot be realized in the prior art.
In order to achieve the above object, the present invention provides a RASA-based task-based intelligent multi-turn dialog method, which includes the following steps:
constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user;
controlling the natural language understanding module to carry out intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information;
and controlling the multi-turn dialog management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of user dialog.
Optionally, in the RASA-based task-based intelligent multi-turn dialog method, the constructing of a natural language understanding module and a multi-turn dialog management module based on RASA and the acquiring of text information input by a user further includes:
and constructing scene corpora based on semantic data and spoken habits, and classifying the scene corpora according to different intention categories to construct the data type of the text information.
Optionally, in the RASA-based task-based intelligent multi-turn dialog method, before the controlling of the natural language understanding module to perform intention detection and semantic slot filling on the text information to obtain user intention and entity information respectively, the method further includes:
extracting features of the text information based on a pre-trained language model trained on a large-scale Chinese corpus to obtain text features, and segmenting the text information to obtain target text information;
embedding the target text information into a vector space based on the text features, so that the natural language understanding module can process the text information.
Optionally, the RASA-based task-based intelligent multi-turn dialog method, wherein the controlling the natural language understanding module performs intention detection and semantic slot filling on the text information to obtain a user intention and entity information, respectively, specifically includes:
using a DIETClassifier in advance as the classifier for intention recognition, and inputting the text information into the classifier for intention classification;
performing intention detection on the spoken texts in the text information after intention classification to obtain user intentions of the text information;
and labeling words in the text information based on the semantic information, and controlling an extractor to perform semantic slot filling based on the labels to obtain entity information of the text information.
Optionally, in the RASA-based task-based intelligent multi-turn dialog method, the extractor includes a DIET extractor, a regular expression extractor, and a conditional random field extractor.
Optionally, in the RASA-based task-based intelligent multi-turn dialog method, after the constructing of a natural language understanding module and a multi-turn dialog management module based on RASA and the acquiring of text information input by a user, the method further includes:
when the components provided by RASA are insufficient, acquiring a component interface of RASA, and accessing different components based on the component interface.
Optionally, the RASA-based task-based intelligent multi-turn dialog method, wherein the controlling the multi-turn dialog management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of a user dialog specifically includes:
inputting the user intention and the entity information into a tracker via an interpreter to obtain the conversation state of the user, and sending the conversation state to a policy module;
controlling the policy module to select a response action based on the conversation state, and outputting a text reply based on the selected action;
and displaying a visual interface of the user conversation based on the Botfront framework, and constructing a multi-language conversation agent.
Optionally, in the RASA-based task-based intelligent multi-turn dialogue method, the data types include weather query, schedule query, and movie and television query.
The present invention further provides a RASA-based task-based intelligent multi-turn dialog system, which includes:
the data acquisition module is used for constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user;
the natural language understanding module is used for understanding the user intention of the text information, classifying the user intention into the correct intention category and extracting the semantic slot values of the text information;
the multi-round dialogue management module is used for training a dialogue management model and outputting an answer text of the text information;
the data analysis module is used for controlling the natural language understanding module to carry out intention detection and semantic slot filling on the text information so as to respectively obtain user intention and entity information;
and the result display module is used for controlling the multi-turn dialogue management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user and displaying a visual interface of the user dialogue.
In addition, to achieve the above object, the present invention further provides a terminal, wherein the terminal includes a processor configured to execute a RASA-based task-based intelligent multi-turn dialog program, and the steps of the RASA-based task-based intelligent multi-turn dialog method are implemented when the program is executed by the processor.
In the invention, a natural language understanding module and a multi-turn dialogue management module are constructed based on RASA, and text information input by a user is acquired; the natural language understanding module is controlled to perform intention detection and semantic slot filling on the text information to obtain the user intention and entity information, respectively; and the multi-turn dialogue management module is controlled to match a response result for the text information based on the user intention and the entity information, feed the response result back to the user, and display a visual interface of the user dialogue. The dialogue system is built on the RASA open-source framework using the pipeline method, so that the task of each module is clear and independent; the complex and tedious configuration process is presented graphically, which improves construction efficiency; and by adopting the Botfront open-source framework and front-end/back-end interaction, the model can be trained with one click once the relevant parameters are configured, realizing personalized service for the user.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the RASA-based task-based intelligent multi-turn dialog method of the present invention;
FIG. 2 is a schematic diagram of the overall framework of the RASA-based task-based intelligent multi-turn dialog system of the present invention;
FIG. 3 is a schematic diagram of Rasa natural language understanding and Rasa core of the RASA-based task-based intelligent multi-turn dialog method of the present invention;
FIG. 4 is a diagram of a pre-training language model based on large-scale Chinese corpus training according to the present invention;
FIG. 5 is a diagram of a multitasking architecture for intent classification and entity identification in accordance with the present invention;
FIG. 6 is a block diagram of a multi-turn dialog management module framework according to the present invention;
FIG. 7 is a schematic diagram of creating and training a story in an embodiment of the invention;
FIG. 8 is a schematic diagram of creating, training and evaluating NLU models in an embodiment of the present invention;
FIG. 9 is a diagram illustrating the creation and editing of corresponding responses in an embodiment of the present invention;
FIG. 10 is a schematic illustration of a monitoring session in an embodiment of the invention;
FIG. 11 is a schematic illustration of an NLU utterance for review and annotation input in an embodiment of the present invention;
FIG. 12 is a schematic diagram of a preferred embodiment of the RASA-based task-based intelligent multi-turn dialog system of the present invention;
FIG. 13 is a schematic diagram of the operating environment of a terminal according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
It should be noted that, if directional indications (such as up, down, left, right, front, back, etc.) are involved in an embodiment of the present invention, they are only used to explain the relative positions and motions of the components in a specific posture (as shown in the drawings); if the specific posture changes, the directional indications change accordingly.
In addition, if there is a description of "first", "second", etc. in an embodiment of the present invention, the description of "first", "second", etc. is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
The task-based multi-turn dialogue system is task-oriented: it gradually collects information related to the target task through multiple turns of natural language dialogue with the user, so as to help the user obtain a certain service. Prior-art research on task-based multi-turn dialogue systems is generally based on pipeline and end-to-end architectures. The most typical early task-based dialogue systems are the DARPA Air Travel Information System (ATIS) and the travel planning system Communicator, which provide dialogue services for airline reservation and travel planning, respectively; both are mainly based on a pipeline structure divided into three modules connected in sequence: natural language understanding, dialogue management (dialogue state tracking and dialogue policy selection) and natural language generation. In recent years, a number of excellent task-based multi-turn dialogue systems have emerged at home and abroad. The AIUI system developed by iFLYTEK provides only a natural language understanding (NLU) service. Baidu developed the intelligent dialogue customization and service platform UNIT, which mainly uses an intention recognition and word slot filling model supplemented by dialogue templates, and realizes dialogue management by triggering rule sets; UNIT defines multiple trigger rule groups for each intention, and once the conditions of a rule group are met, the platform triggers the execution action under that intention, but the platform does not allow a developer to define multiple execution actions and does not provide a visualization module.
There are also many dialogue systems and platforms abroad. Microsoft developed the LUIS (Language Understanding Intelligent Service) platform based on an NLU service; the platform adopts a pipeline structure and uses machine learning to train separate models for the intentions and word slots defined by the developer, but it does not provide a dialogue management function. Api.ai was acquired by Google in 2016 and renamed Dialogflow; the approach of its natural language processing module is similar to that of Microsoft's LUIS platform, and dialogue management is performed in the form of contexts, which reflect the current state of the user's request, so that the dialogue system can transmit an intention to the user, thereby controlling the dialogue path. In addition, unlike Microsoft's LUIS and Google's Dialogflow, Facebook's Wit.ai platform jointly recognizes intentions and word slots using an end-to-end structure, without requiring developers to configure them. However, task-based multi-turn dialogue is strongly domain-dependent, and defining intentions and word slots is helpful for natural language understanding; moreover, the end-to-end architecture is still at an early research stage, and such models do not yet perform well in task-based multi-turn dialogue scenarios.
As shown in fig. 1, the RASA-based task-based intelligent multi-turn dialog method according to a preferred embodiment of the present invention includes the following steps:
Step S10, constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user.
Specifically, the RASA-based task-based intelligent multi-turn dialog system implements its algorithms on the RASA framework and its visual interface on Botfront. The RASA framework is an open-source chatbot framework that implements multi-turn dialogue based on machine learning. It adopts the pipeline method to split the key technical problems of multi-turn dialogue into modules, cascades the modules, and defines an interaction interface for each module so as to fix its input and output. In the general pipeline method, the multi-turn dialog system mainly includes five parts: ASR (automatic speech recognition), NLU (natural language understanding), DM (dialog management), NLG (natural language generation) and TTS (text-to-speech synthesis), as shown in FIG. 2. The user sends a speech signal to the ASR; the ASR recognizes the text information in the speech signal (for example, "look up the legal representative of Alibaba") and sends it to the NLU; the NLU identifies the intention and the semantic slots in the text information (for example, intention: look up a legal representative; semantic slots: company name: Alibaba, attribute: legal representative) and sends them to the DM; the DM performs DST (dialog state tracking) and DPO (dialog policy optimization) based on the intention and the semantic slots, retrieves the legal representative from a knowledge base and APIs, and sends the result to the NLG; the NLG sends a text reply (for example, "the legal representative of Alibaba is xxx") to the TTS; and the TTS synthesizes the text reply into speech and plays it to the user, completing the dialogue with the user.
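The cascaded structure described above can be illustrated with a minimal Python sketch. It is not the RASA implementation: the function names (`toy_nlu`, `toy_dm`, `toy_nlg`, `run_pipeline`), the intent name `query_company_attribute`, the keyword rules and the knowledge base are all hypothetical stand-ins, and the ASR and speech-synthesis stages are omitted so that text goes in and text comes out. The point is only that each stage has a fixed input and output, so the stages can be chained.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NLUResult:
    """Structured NLU output: one intent plus the extracted slot values."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def toy_nlu(text: str) -> NLUResult:
    # Hypothetical keyword rules standing in for a trained NLU model.
    slots: Dict[str, str] = {}
    if "Alibaba" in text:
        slots["company"] = "Alibaba"
    if "legal representative" in text:
        slots["attribute"] = "legal representative"
    if "company" in slots and "attribute" in slots:
        return NLUResult("query_company_attribute", slots)
    return NLUResult("fallback", slots)


def toy_dm(nlu: NLUResult, kb: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    # Dialogue management: choose an action and look up the answer in a knowledge base.
    if nlu.intent != "query_company_attribute":
        return {"action": "utter_default"}
    company = nlu.slots["company"]
    attribute = nlu.slots["attribute"]
    value = kb.get(company, {}).get(attribute)
    if value is None:
        return {"action": "utter_not_found"}
    return {"action": "utter_answer", "company": company,
            "attribute": attribute, "value": value}


def toy_nlg(decision: Dict[str, str]) -> str:
    # Natural language generation: turn the chosen action into a text reply.
    if decision["action"] == "utter_answer":
        return "The {attribute} of {company} is {value}.".format(**decision)
    if decision["action"] == "utter_not_found":
        return "Sorry, I could not find that information."
    return "Sorry, I did not understand that."


def run_pipeline(text: str, kb: Dict[str, Dict[str, str]]) -> str:
    # Cascade the stages: the output of each stage is the input of the next.
    return toy_nlg(toy_dm(toy_nlu(text), kb))
```

Because every stage only depends on the data structure handed to it, any single stage (for example the keyword rules in `toy_nlu`) could be swapped for a trained model without touching the others.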
Natural language understanding and dialogue management are the two most closely connected modules logically; they are the core of task-based dialogue and the problems that every dialogue system needs to address. As shown in FIG. 3, RASA provides Rasa NLU (the Rasa natural language understanding module) and Rasa Core (also called the multi-turn dialogue management module). Rasa NLU performs intention recognition and entity recognition and converts the user's input into structured data; its files are nlu.md and nlu_config.yml. Rasa Core performs dialogue management and decides what content to return to the user next; it mainly works with stories and the domain. The stories (stories.md) contain the scenario flows of the dialogue, including story creation, titles, intentions and actions. The domain (domain.yml) is the knowledge base of the bot and includes intentions, actions, response templates, entities and word slots. By providing Rasa NLU and Rasa Core, RASA completes user message understanding and multi-turn dialogue management respectively, solving these two core problems and realizing the main functions of a dialogue system. The main files used and their functions are shown in the following table.
[Table provided as an image in the original publication; not reproduced here]
The primary purpose of the natural language understanding module is to understand the user's intention. It generally comprises two tasks, intention detection and semantic slot filling: by analyzing the semantics of the text input by the user and extracting the key information related to the task, it understands and formalizes the user's intention and supports the subsequent modules of the multi-turn dialogue. Intention detection is generally treated as a sentence classification problem: an algorithm predicts the category of the user's purpose from a predefined category set, and each category corresponds to one intention. Unlike other classification tasks, the data for intention detection is spoken text, so the sentence must be considered together with its context to capture the real semantic information. Semantic slot filling understands a piece of text by marking the meaningful words or symbols in a sentence: each word (character) in the text is given a label according to its semantic information. It is essentially a sequence labeling task, and the labels can be used to extract clearly defined attributes (namely slots) from the text, so that the user's intention is converted into an explicit instruction. In contrast, intention detection focuses on the overall meaning of the user input, whereas semantic slot filling focuses on understanding and capturing the fine-grained details of the text, as shown in the following table.
[Table provided as an image in the original publication; not reproduced here]
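To make the sequence-labeling view of semantic slot filling concrete, the following Python sketch converts per-token labels into slot values. It assumes the common BIO tagging scheme (`B-` begins a slot, `I-` continues it, `O` is outside any slot); the source only says that each word is labeled, so the scheme, the function name `bio_to_slots` and the example sentence are illustrative assumptions.

```python
from typing import List, Optional, Tuple


def bio_to_slots(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (slot_name, slot_value) pairs."""
    slots: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Close the slot that is currently open, if any.
        nonlocal current_type, current_tokens
        if current_type is not None:
            slots.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags and inconsistent "I-" tags end the open slot.
            flush()
    flush()
    return slots


tokens = ["what", "is", "the", "weather", "in", "New", "York", "tomorrow"]
tags = ["O", "O", "O", "O", "O", "B-city", "I-city", "B-date"]
print(bio_to_slots(tokens, tags))
```

Intention detection would assign a single label (for example a weather-query intention) to the whole sentence, whereas the tags above operate on individual tokens, which is the difference in granularity the paragraph describes.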
Furthermore, data is the foundation of an artificial intelligence system. Whether the system is based on rules, on traditional machine learning or on the now-common neural network methods, tens of thousands of high-quality samples are often required for training in order to obtain accurate parameters that fit the problem. The application scenario of task-based dialogue needs a large amount of training data that targets a specific domain and conforms to everyday dialogue logic and spoken-language habits; it can even be said that the quality of the data largely determines the performance of the dialogue system. Limited by the current state of research on task-based dialogue systems and by the special application scenario addressed by the invention, no sufficient existing data set is available for the task, so the invention constructs its own data set to support the model in realizing the functions of a task-based dialogue system. First, common scenario corpora (such as greetings, billings and chatting) are built with reference to other multi-turn dialogue tasks and everyday spoken-language habits; then data for the practical application scenarios (such as weather query, schedule query, and movie and television query) is built according to the different intention categories, and semantic slots are labeled so that the key information in the user's input can be extracted. Next, to improve the generalization ability of the model, data augmentation is applied to the basic data: on the one hand, semantic slot values are replaced with synonyms, near-synonyms and associated words, so that the data set covers more scenarios; on the other hand, sentence patterns are transformed, so that the data set adapts to more kinds of spoken expressions. The constructed and augmented data set proved effective in the subsequent process and supports the system in completing multi-turn dialogues in most contexts, although some special cases cannot be completed smoothly. The data for these failed cases is recorded, fed back to the multi-turn dialogue system and added to the data set for retraining, so that the accuracy and generalization ability of the system are continuously improved in a self-supervised manner.
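The two augmentation ideas in this paragraph, replacing slot values and varying the sentence pattern, can be sketched as a simple template expansion in Python. This is only an illustration of the idea, not the procedure actually used to build the data set: the function name `augment`, the `{slot}` placeholder syntax and the example templates and values are assumptions.

```python
import itertools
from typing import Dict, List


def augment(templates: List[str], slot_values: Dict[str, List[str]]) -> List[str]:
    """Expand each sentence template with every combination of slot values."""
    results: List[str] = []
    for template in templates:
        # Only substitute the slots that actually occur in this template.
        names = [name for name in slot_values if "{" + name + "}" in template]
        for combo in itertools.product(*(slot_values[name] for name in names)):
            sentence = template
            for name, value in zip(names, combo):
                sentence = sentence.replace("{" + name + "}", value)
            results.append(sentence)
    return results


# Two sentence patterns for the same intention (sentence-pattern variation) ...
templates = [
    "what is the weather in {city} {date}",
    "{date} weather for {city}",
]
# ... and alternative values for each semantic slot (slot-value replacement).
slot_values = {
    "city": ["Shenzhen", "Beijing"],
    "date": ["today", "tomorrow"],
}

for sentence in augment(templates, slot_values):
    print(sentence)
```

Each generated sentence keeps the intention of its template, and the slot annotations can be carried over because the position of every substituted value is known.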
Step S20, controlling the natural language understanding module to perform intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information.
The step S20 includes:
Step S21, using DIETClassifier in advance as the classifier for intention recognition, and inputting the text information into the classifier for intention classification;
Step S22, performing intention detection on the spoken texts in the text information after intention classification to obtain the user intention of the text information;
Step S23, labeling words in the text information based on the semantic information, and controlling an extractor to perform semantic slot filling based on the labels to obtain entity information of the text information.
Specifically, following the general processing flow of a natural language understanding module, Rasa NLU provides a classifier module for the intention detection task, and a tokenizer, featurizers, extractors and other components for the semantic slot filling task, which together analyze the input text and extract key information. The RASA framework is rich in content and powerful, and integrates many components: the classifier module includes keyword-based methods, methods based on the MITIE language model, and so on; for semantic slot analysis, efficient language models such as MITIE and spaCy are provided, and tools such as jieba word segmentation are provided for Chinese, so that the whole open-source framework is compatible with Chinese. When the components provided by the RASA framework cannot meet the requirements of the dialogue system, developers can build custom components through the interfaces provided by the framework, so that hand-written rules, traditional machine learning and statistical learning methods, and state-of-the-art deep learning results can be integrated into the multi-turn dialogue system as required; the RASA framework can therefore be applied freely to various practical scenarios and is highly flexible and extensible. The natural language understanding module is one of the core modules of a task-based dialogue system and the first module to receive user input; its main task is to understand the user's intention in the text information, classify the intention correctly and extract the appropriate semantic slot values. For this part, the invention is mainly based on the Rasa open-source framework together with rules set up for the task.
Further, before the model can process text information, text features must first be extracted from it, and the text must be segmented and then embedded into a vector space for representation. The Rasa framework provides the JiebaTokenizer word segmentation component, which supports Chinese, and the MitieNLP Chinese word vector tool to handle Chinese tasks. However, both have limitations. First, with processing based on Chinese word segmentation, segmentation errors may cascade and degrade the subsequent intention classification and semantic slot value extraction. Second, the MITIE natural language processing toolkit is mainly based on machine learning algorithms such as SVM; it extracts keywords well but has gradually been surpassed by large-scale pre-trained language models, and it trains slowly. Therefore, the invention adopts bert-base-chinese, a pre-trained language model trained on a large-scale Chinese corpus, to extract features from Chinese text. Thanks to the rich pre-trained knowledge of the BERT model (whose main input is the original vector of each character or word in the text), it can be widely applied to various downstream tasks and different contexts, and its cloze-style pre-training task also allows context information to be used to improve semantic slot extraction, as shown in FIG. 4.
The Rasa framework, for its part, proposes the DIET (Dual Intent and Entity Transformer) architecture for intention classification and entity extraction. DIET is a multi-task architecture for intention classification and entity recognition, as shown in FIG. 5. It can incorporate pre-trained embeddings from language models in a plug-and-play fashion and combine them with word-level and character-level n-gram sparse features. Experiments show that DIET achieves better results than other models on complex natural language understanding data sets, even without pre-trained embeddings and using only word-level and character-level n-gram sparse features, and that it trains much faster than a fine-tuned BERT model. The DIET classifier inherits from Rasa's Transformer model class; based on the Transformer architecture, it encodes the whole sentence using a 12-layer Transformer with relative position attention. The CLS token output by the Transformer represents the user input as a whole for intention classification; its similarity to the intention labels is computed and a loss function is calculated, the purpose of the loss function being to measure the quality of the model's predictions. The invention uses DIETClassifier as the classifier for intention recognition and can thereby classify intentions more accurately.
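The similarity comparison and loss described above can be sketched in plain Python. This is a deliberately simplified illustration, not the DIET code in Rasa: it assumes the sentence (CLS) vector and one embedding per intention label are already available, scores them with a dot product, and uses a softmax cross-entropy loss; the exact loss used by Rasa may differ in its details, and all names and vectors below are made up.

```python
import math
from typing import Dict, List


def dot(a: List[float], b: List[float]) -> float:
    """Dot-product similarity between two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def intent_probabilities(sentence_vec: List[float],
                         intent_vecs: Dict[str, List[float]]) -> Dict[str, float]:
    """Softmax over the similarities between the sentence and each intent label."""
    scores = {name: dot(sentence_vec, vec) for name, vec in intent_vecs.items()}
    max_score = max(scores.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(score - max_score) for name, score in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def cross_entropy_loss(probs: Dict[str, float], target: str) -> float:
    """Small when the target intent gets high probability, large otherwise."""
    return -math.log(probs[target])


# Made-up 2-dimensional embeddings, for illustration only.
sentence_vec = [1.0, 0.0]
intent_vecs = {"ask_weather": [2.0, 0.0], "greet": [0.0, 2.0]}

probs = intent_probabilities(sentence_vec, intent_vecs)
predicted = max(probs, key=probs.get)
print(predicted, cross_entropy_loss(probs, "ask_weather"))
```

During training, lowering this loss pulls the sentence vector toward the embedding of the correct intention and away from the others, which is how the loss "measures the quality" of a prediction.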
Another major task of the natural language understanding module is semantic slot filling, which is essentially an entity recognition and entity extraction problem. Because entities are diverse and linguistic expressions are complex, the problem is solved by combining several extractors, chiefly a DIET extractor, a regular expression extractor, and a conditional random field extractor. The DIET extractor handles the named entity recognition task on top of the Transformer output: a conditional random field layer models the relation between the context labels and the input sequence labels to produce the entity prediction. The regular expression extractor extracts entities using lookup tables and/or regular expressions defined in the training data; the component checks whether the user message contains an entry from one of the lookup tables or matches one of the regular expressions, and if a match is found, the value is extracted as an entity. The regular expression extractor can thus set rules for special words or expression patterns and filter out errors caused by special cases. The conditional random field extractor implements a conditional random field for named entity recognition. A conditional random field can be regarded as an undirected Markov chain in which the time steps are words and the states are entity classes; the features of a word (e.g., capitalization, POS tags, etc.) give probabilities for the entity classes, as do the transitions between adjacent entity tags, and the most probable set of labels is then calculated and returned. The conditional random field extractor can better learn the relationships between contexts in the text.
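The lookup-table and regular-expression matching described above might look roughly like this. It is a standalone sketch using only Python's re module, not the Rasa component itself; the table contents, the date pattern, and the output field names are assumptions chosen for illustration.

```python
import re

# Hypothetical lookup table and pattern; real ones would come from the training data.
LOOKUP_TABLES = {"city": ["北京", "上海", "深圳"]}
REGEXES = {"date": r"\d{4}-\d{2}-\d{2}"}


def extract_entities(message):
    """Return every lookup-table or regex match in `message` as an entity dict."""
    entities = []
    for entity, entries in LOOKUP_TABLES.items():
        for entry in entries:
            # Escape the entry so it is matched literally.
            for m in re.finditer(re.escape(entry), message):
                entities.append(
                    {"entity": entity, "value": m.group(), "start": m.start(), "end": m.end()}
                )
    for entity, pattern in REGEXES.items():
        for m in re.finditer(pattern, message):
            entities.append(
                {"entity": entity, "value": m.group(), "start": m.start(), "end": m.end()}
            )
    return sorted(entities, key=lambda e: e["start"])


found = extract_entities("查询深圳2022-10-31的天气")
for e in found:
    print(e["entity"], e["value"], e["start"], e["end"])
```

Because the rules are explicit, this kind of extractor is a natural place to handle special words or formats that a statistical extractor tends to get wrong.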
And S30, controlling the multi-turn dialogue management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of user dialogue.
The step S30 includes:
S31, inputting the user intention and the entity information recognized by the interpreter into a tracker to obtain the conversation state of the user, and sending the conversation state to a policy;
S32, controlling the policy to select an action in response to the conversation state, and outputting a text reply based on the selected action;
and S33, displaying a visual interface of the user conversation based on the Botfront framework, and constructing a multi-language conversation agent.
Specifically, Rasa_core (the multi-turn dialogue management module) is responsible for managing the multi-turn dialogue. Because the Rasa_NLU (natural language understanding) module only supports English and German, and the system is a Chinese one, the jieba word segmenter is used as the Chinese tokenizer in the pipeline. When the Rasa_NLU module is used for the intention recognition task, a MITIE model trained by an unsupervised method is needed; this model is similar to the word embeddings of word2vec. The MITIE model is trained on a database created by the user, and the whole corpus is first segmented with jieba. Specifically, under the Ubuntu operating system, jieba is first installed with the command sudo pip install jieba, and the executable wordrep tool is then built with cmake; finally, training generates the binary file total_word_feature_extractor_chi.dat, which holds the 300-dimensional word vectors used throughout the dialogue system. Once the word vectors are available, training of the NLU model can begin, but the pipeline must be configured first. It is implemented as ["nlp_mitie", "tokenizer_jieba", "ner_mitie", "ner_synonyms", "intent_featurizer_mitie", "intent_classifier_sklearn"], where "nlp_mitie" initializes MITIE, "tokenizer_jieba" segments words with jieba, "ner_mitie" performs entity recognition, and "intent_featurizer_mitie" extracts the features for "intent_classifier_sklearn". When sklearn is used for intention recognition, the core algorithm is a Support Vector Machine (SVM), whose input feature is the average of the 300-dimensional word vectors of the words in the sentence. To improve the performance of the NLU model, the invention improves on Rasa NLU by adopting the pre-trained language model bert-base-chinese, trained on a large-scale Chinese corpus, to extract features from Chinese text.
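The sentence feature fed to the SVM in the legacy pipeline, the average of the word vectors, can be sketched as below. Toy 3-dimensional vectors stand in for the 300-dimensional MITIE vectors, the handling of out-of-vocabulary words (skipping them) is an assumption, and the SVM training step itself is not shown.

```python
def sentence_vector(words, embeddings, dim):
    """Average the vectors of the known words; return a zero vector if none are known."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical word vectors; real ones would be loaded from the trained MITIE file.
toy_embeddings = {
    "今天": [1.0, 0.0, 0.0],
    "天气": [0.0, 1.0, 0.0],
}

# "怎么样" has no vector here and is simply skipped.
feature = sentence_vector(["今天", "天气", "怎么样"], toy_embeddings, dim=3)
print(feature)
```

Averaging discards word order, which is one reason a contextual encoder such as bert-base-chinese can give the classifier a richer input.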
Rasa_core does not implement complex dialogue logic through if/else conditions; instead, it trains a dialogue management model by machine learning, which gives good portability and maintainability. As shown in fig. 6, the system first receives a user message and sends it to the Interpreter module, which recognizes it and generates a dictionary containing the message text and the intention; the Interpreter module's intention recognition is implemented by a PaddleNLP deep learning model. The conversation state is then tracked by the Tracker, whose main role is to receive and record each new message recognized by the Interpreter module. The current conversation state is then sent to the Policy, which selects the Action to respond with; the selected Action is recorded in the Tracker, and its result is returned to the user.
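The message flow of fig. 6 can be sketched as a small loop. Everything here is a stand-in: the interpreter uses keyword rules instead of a trained model, the policy is a hand-written function instead of a learned one (the very thing Rasa_core replaces with machine learning), and the class, intention, and action names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """Records user and bot events and the slots filled so far."""
    events: list = field(default_factory=list)
    slots: dict = field(default_factory=dict)

    def update(self, parsed):
        self.events.append(("user", parsed["intent"]))
        self.slots.update(parsed["entities"])
        if parsed["intent"] == "query_weather":
            self.slots["task"] = "weather"

    def record_action(self, action):
        self.events.append(("bot", action))


def interpret(text):
    """Stand-in interpreter: returns a dict with the text, intention, and entities."""
    entities = {"city": "深圳"} if "深圳" in text else {}
    if "天气" in text:
        intent = "query_weather"
    elif entities:
        intent = "inform"
    else:
        intent = "out_of_scope"
    return {"text": text, "intent": intent, "entities": entities}


def policy(tracker):
    """Stand-in policy: choose the next action from the tracked conversation state."""
    if tracker.slots.get("task") == "weather":
        return "action_report_weather" if "city" in tracker.slots else "utter_ask_city"
    return "utter_default"


def handle_message(text, tracker):
    tracker.update(interpret(text))   # Interpreter -> Tracker
    action = policy(tracker)          # Tracker state -> Policy
    tracker.record_action(action)     # chosen Action is recorded
    return action


demo = Tracker()
print(handle_message("天气怎么样", demo))  # city slot is still missing
print(handle_message("深圳", demo))        # second turn fills the slot
```

The second turn only makes sense because the tracker remembers the first, which is the essence of multi-turn dialogue management.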
Further, the interactive, visual interface is presented by running RASA together with the Botfront framework. Botfront can build advanced multilingual conversation agents: for example, it can create and train stories (interface shown in fig. 7), create, train, and evaluate NLU models (interface shown in fig. 8), create and edit the corresponding responses (interface shown in fig. 9), monitor dialogues (interface shown in fig. 10), and review and annotate incoming NLU utterances (interface shown in fig. 11). The invention quickly constructs a task-based multi-turn dialogue system on the RASA open-source framework with a pipeline structure; the system has a complete engineering workflow, clearly separated modules, and strong interpretability, supports multiple languages, and is easy to maintain and extend. Graphical configuration is implemented on the Botfront open-source framework: the system presents the otherwise complicated and tedious configuration process graphically, so a user or administrator can quickly configure a dialogue with simple operations such as selecting, filling in, and dragging, which lowers the learning cost and improves construction efficiency. With the Botfront open-source framework and front-end/back-end interaction, the user can train the model with one click after configuring the relevant parameters; in addition, the user can update, add to, or modify the dialogue corpus at any time and retrain the model on the new corpus, enabling personalized service.
Further, as shown in fig. 12, based on the above RASA-based task-based intelligent multi-turn dialog method, the present invention also provides a RASA-based task-based intelligent multi-turn dialog system, where the RASA-based task-based intelligent multi-turn dialog system includes:
a data acquisition module 51, configured to construct a natural language understanding module and a multi-turn dialogue management module based on the RASA, and acquire text information input by a user;
the natural language understanding module 52 is configured to understand the user intention of the text information, classify the user intention into the correct intention category, and extract the semantic slot values of the text information;
the multi-round dialogue management module 53 is configured to train a dialogue management model and output an answer text of the text information;
a data analysis module 54, configured to control the natural language understanding module to perform intent detection and semantic slot filling on the text information, so as to obtain a user intent and entity information respectively;
and the result display module 55 is configured to control the multi-turn dialog management module to match a response result of the text information based on the user intention and the entity information, feed the response result back to the user, and display a visual interface of a user dialog.
Further, as shown in fig. 13, based on the above RASA-based task-based intelligent multi-turn dialog method, the present invention also provides a terminal, where the terminal includes a processor 10, a memory 20, and a display 30; fig. 13 shows only some of the components of the terminal, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 20 may in some embodiments be an internal storage unit of the terminal, such as a hard disk or a memory of the terminal. The memory 20 may also be an external storage device of the terminal in other embodiments, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal. Further, the memory 20 may also include both an internal storage unit and an external storage device of the terminal. The memory 20 is used for storing application software installed in the terminal and various types of data, such as the program code installed on the terminal. The memory 20 may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 stores a program 40 of RASA-based task-based intelligent multi-turn dialog, and the program 40 of RASA-based task-based intelligent multi-turn dialog is executable by the processor 10 to implement the RASA-based task-based intelligent multi-turn dialog method of the present application.
The processor 10 may be, in some embodiments, a Central Processing Unit (CPU), microprocessor or other data Processing chip, which is used to run program codes stored in the memory 20 or process data, such as executing the RASA-based task-based intelligent multi-turn dialog method.
The display 30 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like in some embodiments. The display 30 is used for displaying information at the terminal and for displaying a visual user interface. The components 10-30 of the terminal communicate with each other via a system bus.
In one embodiment, when the processor 10 executes the program 40 of RASA-based task-based intelligent multi-turn dialog in the memory 20, the following steps are implemented:
constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user;
controlling the natural language understanding module to carry out intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information;
and controlling the multi-turn dialogue management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of the user dialogue.
Wherein, the establishing of the natural language understanding module and the multi-turn dialogue management module based on the RASA and the obtaining of the text information input by the user also comprise:
and constructing scene corpora based on semantic data and spoken habits, and classifying the scene corpora according to different intention categories to construct the data type of the text information.
Wherein, the controlling the natural language understanding module to perform intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information, and the method also comprises the following steps:
extracting features from the text information based on a pre-trained language model trained on a large-scale Chinese corpus to obtain text features, and segmenting the text information to obtain target text information;
embedding the target text information into a vector space based on the text features to cause the natural language understanding module to process the text information.
The controlling the natural language understanding module to perform intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information specifically comprises:
using a DIETClassifier in advance as the classifier for intention recognition, and inputting the text information into the classifier for intention classification;
performing intention detection on the spoken texts in the text information after intention classification to obtain user intentions of the text information;
and labeling words in the text information based on the semantic information, and controlling an extractor to perform semantic slot filling based on the labels to obtain entity information of the text information.
Wherein the extractors include a DIET extractor, a regular expression extractor, and a conditional random field extractor.
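For the conditional random field component named above, the step of "calculating and returning the most probable label set" is typically done with Viterbi decoding over per-token label scores and label-to-label transition scores. The sketch below shows that decoding step only; the scores are invented rather than learned, and the label names follow the common BIO convention as an assumption.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions: one dict per token mapping label -> score.
    transitions: dict mapping (previous_label, label) -> score; missing pairs score 0.
    """
    best = [{lab: emissions[0][lab] for lab in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


labels = ["O", "B-city", "I-city"]
# Hypothetical scores for the three characters of "去深圳".
emissions = [
    {"O": 2.0, "B-city": 0.0, "I-city": 0.0},
    {"O": 0.5, "B-city": 1.5, "I-city": 0.4},
    {"O": 0.6, "B-city": 0.2, "I-city": 0.5},
]
transitions = {("O", "I-city"): -5.0, ("B-city", "I-city"): 1.0}

print(viterbi(emissions, transitions, labels))
print(viterbi(emissions, {}, labels))  # without transition scores
```

Comparing the two calls shows what the transitions contribute: on its own the last character scores highest as "O", but the reward for continuing an entity after "B-city" changes the decoded sequence.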
Wherein, the controlling the natural language understanding module to perform intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information, and then further comprises:
when the RASA provides insufficient components, acquiring a component interface of the RASA, and accessing different components based on the component interface.
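Accessing additional components through a common interface can be pictured as a registry of interchangeable processing steps. The sketch below is a generic plug-in pattern written for illustration; it is not Rasa's component API, and every name in it (REGISTRY, register, run_pipeline, the two components) is hypothetical.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]
Component = Callable[[Message], Message]

# Maps a component name to the function that implements it.
REGISTRY: Dict[str, Component] = {}


def register(name: str):
    """Decorator that makes a component available under `name`."""
    def decorator(fn: Component) -> Component:
        REGISTRY[name] = fn
        return fn
    return decorator


@register("char_tokenizer")
def char_tokenizer(message: Message) -> Message:
    """Add character-level tokens to the message."""
    message["tokens"] = [ch for ch in str(message["text"]) if not ch.isspace()]
    return message


@register("token_counter")
def token_counter(message: Message) -> Message:
    """Example of a custom component that builds on an earlier one."""
    message["n_tokens"] = len(message.get("tokens", []))
    return message


def run_pipeline(names: List[str], text: str) -> Message:
    """Run the named components in order over a fresh message."""
    message: Message = {"text": text}
    for name in names:
        message = REGISTRY[name](message)
    return message


result = run_pipeline(["char_tokenizer", "token_counter"], "明天 天气")
print(result["tokens"], result["n_tokens"])
```

Because every component takes and returns the same message structure, a new one can be added or swapped in without touching the others.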
Wherein, the controlling the multi-turn dialog management module to match out a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of the user dialog specifically comprises:
inputting the user intention and the entity information recognized by the interpreter into a tracker to obtain the conversation state of the user, and sending the conversation state to a policy;
controlling the policy to select an action in response to the conversation state, and outputting a text reply based on the selected action;
and displaying a visual interface of the user conversation based on the Botfront framework, and constructing a multi-language conversation agent.
Wherein the data types include weather queries, schedule queries, and movie queries.
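A scene corpus grouped by intention category could be organized as in the sketch below before being converted into training examples. The intention names and the sample utterances are invented for illustration and do not come from the patent's corpus.

```python
# Hypothetical scene corpus: intention category -> example user utterances.
SCENE_CORPUS = {
    "query_weather": ["今天深圳天气怎么样", "明天会下雨吗"],
    "query_schedule": ["我明天有什么安排", "帮我查一下周五的日程"],
    "query_movie": ["最近有什么好看的电影", "今晚有哪些电影上映"],
}


def to_training_examples(corpus):
    """Flatten the grouped corpus into (text, intent) records."""
    return [
        {"text": text, "intent": intent}
        for intent, texts in corpus.items()
        for text in texts
    ]


def intent_counts(examples):
    """Count examples per intention, e.g. to spot under-represented categories."""
    counts = {}
    for example in examples:
        counts[example["intent"]] = counts.get(example["intent"], 0) + 1
    return counts


examples = to_training_examples(SCENE_CORPUS)
print(len(examples), intent_counts(examples))
```

Keeping the corpus grouped by intention makes it straightforward to add a new query type or rebalance the examples before retraining.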
In summary, the present invention provides a RASA-based task-based intelligent multi-turn dialog method and related devices, the method including: constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user; controlling the natural language understanding module to carry out intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information; and controlling the multi-turn dialog management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of user dialog. According to the invention, the dialogue system is constructed on the RASA open-source framework with a pipeline method, so the modules have clear, independent tasks; the complicated and tedious configuration process is presented graphically, which improves construction efficiency; and with the Botfront open-source framework and front-end/back-end interaction, the model can be trained with one click after configuring the relevant parameters, enabling personalized service for the user.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by instructing relevant hardware (such as a processor, a controller, etc.) through a computer program, and the program can be stored in a computer readable storage medium, and when executed, the program can include the processes of the embodiments of the methods described above. The computer readable storage medium may be a memory, a magnetic disk, an optical disk, etc.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (10)

1. A RASA-based task-based intelligent multi-turn dialog method is characterized by comprising the following steps:
constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user;
controlling the natural language understanding module to perform intention detection and semantic slot filling on the text information to respectively obtain user intention and entity information;
and controlling the multi-turn dialog management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of user dialog.
2. The RASA-based task-based intelligent multi-turn dialog method according to claim 1, wherein the RASA-based natural language understanding module and multi-turn dialog management module are constructed and obtain text information input by a user, and the method further comprises:
and constructing scene corpora based on semantic data and spoken habits, and classifying the scene corpora according to different intention categories to construct the data type of the text information.
3. The RASA-based task-based intelligent multi-turn dialog method of claim 1, wherein the controlling the natural language understanding module to perform intent detection and semantic slot filling on the text information to obtain user intent and entity information respectively further comprises:
extracting features from the text information based on a pre-trained language model trained on a large-scale Chinese corpus to obtain text features, and segmenting the text information to obtain target text information;
embedding the target text information into a vector space based on the text features to cause the natural language understanding module to process the text information.
4. The RASA-based task-based intelligent multi-turn dialog method of claim 1, wherein the controlling the natural language understanding module to perform intent detection and semantic slot filling on the text information to obtain user intent and entity information, respectively, specifically comprises:
using a DIETClassifier in advance as the classifier for intention recognition, and inputting the text information into the classifier for intention classification;
intention detection is carried out on the spoken language texts in the text information after intention classification, and user intentions of the text information are obtained;
and labeling words in the text information based on the semantic information, and controlling an extractor to perform semantic slot filling based on the labels to obtain entity information of the text information.
5. The RASA-based task-based intelligent multi-turn dialog method of claim 4, wherein the extractors comprise a DIET extractor, a regular expression extractor, and a conditional random field extractor.
6. The RASA-based task-based intelligent multi-turn dialog method of claim 1, wherein the controlling the natural language understanding module performs intent detection and semantic slot filling on the text information to obtain user intent and entity information, respectively, and then further comprising:
when the RASA provides insufficient components, acquiring a component interface of the RASA, and accessing different components based on the component interface.
7. The RASA-based task-based intelligent multi-turn dialog method of claim 1, wherein the controlling the multi-turn dialog management module to match out a response result of the text information based on the user intent and the entity information, feeding the response result back to the user, and displaying a visual interface of a user dialog comprises:
inputting the user intention and the entity information recognized by the interpreter into a tracker to obtain the conversation state of the user, and sending the conversation state to a policy;
controlling the policy to select an action in response to the conversation state, and outputting a text reply based on the selected action;
and displaying a visual interface of the user conversation based on the Botfront framework, and constructing a multi-language conversation agent.
8. The RASA-based task-based intelligent multi-turn dialog method of claim 2, wherein the data types include weather queries, scheduling queries, and movie queries.
9. An RASA-based task-based intelligent multi-turn dialog system, comprising:
the data acquisition module is used for constructing a natural language understanding module and a multi-turn dialogue management module based on RASA, and acquiring text information input by a user;
the natural language understanding module is used for understanding the user intention of the text information, classifying the user intention into the correct intention category, and extracting the semantic slot values of the text information;
the multi-round dialogue management module is used for training a dialogue management model and outputting an answer text of the text information;
the data analysis module is used for controlling the natural language understanding module to carry out intention detection and semantic slot filling on the text information so as to respectively obtain user intention and entity information;
and the result display module is used for controlling the multi-turn conversation management module to match a response result of the text information based on the user intention and the entity information, feeding the response result back to the user, and displaying a visual interface of user conversation.
10. A terminal, characterized in that the terminal comprises: memory, a processor and a program stored on the memory and executable on the processor, the program when executed by the processor implementing the steps of the RASA-based task-based intelligent multi-turn dialog method according to any of the claims 1-8.
CN202211342781.3A 2022-10-31 2022-10-31 RASA-based task-type intelligent multi-turn dialogue method and related equipment Pending CN115392264A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211342781.3A CN115392264A (en) 2022-10-31 2022-10-31 RASA-based task-type intelligent multi-turn dialogue method and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211342781.3A CN115392264A (en) 2022-10-31 2022-10-31 RASA-based task-type intelligent multi-turn dialogue method and related equipment

Publications (1)

Publication Number Publication Date
CN115392264A true CN115392264A (en) 2022-11-25

Family

ID=84114900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211342781.3A Pending CN115392264A (en) 2022-10-31 2022-10-31 RASA-based task-type intelligent multi-turn dialogue method and related equipment

Country Status (1)

Country Link
CN (1) CN115392264A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114220425A (en) * 2021-11-04 2022-03-22 福建亿榕信息技术有限公司 Chat robot system and conversation method based on voice recognition and Rasa framework

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
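MARTIS: "开源对话机器人平台botfront初体验" [A first look at the open-source chatbot platform Botfront], 《HTTPS://ZHUANLAN.ZHIHU.COM/P/78363022》 *
刘宇杰 等 [Liu Yujie et al.]: "基于Rasa的任务型对话系统设计与实现" [Design and implementation of a Rasa-based task-oriented dialogue system], 《现代计算机》 [Modern Computer] *
王雅君 [Wang Yajun]: "基于RASA的智能语音对话系统" [RASA-based intelligent voice dialogue system], 《中国优秀硕士学位论文全文数据库信息科技辑(月刊)》 [China Master's Theses Full-text Database, Information Science and Technology (monthly)] *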

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115617972B (en) * 2022-12-14 2023-04-07 成都明途科技有限公司 Robot dialogue method, device, electronic equipment and storage medium
CN115617972A (en) * 2022-12-14 2023-01-17 成都明途科技有限公司 Robot dialogue method, device, electronic equipment and storage medium
CN117112261A (en) * 2023-08-28 2023-11-24 上海澜码科技有限公司 Method and system for realizing natural language dialogue type API
CN117400278A (en) * 2023-11-28 2024-01-16 南京信息工程大学 Chemical food safety conversation robot
CN117556025A (en) * 2024-01-10 2024-02-13 川投信息产业集团有限公司 AI and visualization-based platform project service information optimization method and system
CN117556025B (en) * 2024-01-10 2024-04-02 川投信息产业集团有限公司 AI and visualization-based platform project service information optimization method and system
CN118095216A (en) * 2024-04-09 2024-05-28 华南师范大学 Processing method and device for prompt template applied to language model
CN118069820A (en) * 2024-04-22 2024-05-24 来未来科技(浙江)有限公司 Conversational data analysis method, device, equipment and storage medium
CN118193854A (en) * 2024-05-16 2024-06-14 浪潮软件股份有限公司 Data integration task construction device based on artificial intelligence interaction
CN118193854B (en) * 2024-05-16 2024-08-30 浪潮软件股份有限公司 Data integration task construction device based on artificial intelligence interaction
CN118445776A (en) * 2024-07-08 2024-08-06 北京大学 Psychological consultation dialogue data set generation method, device, equipment and storage medium
CN118445776B (en) * 2024-07-08 2024-10-29 北京大学 Psychological consultation dialogue data set generation method, device, equipment and storage medium
CN118535714A (en) * 2024-07-25 2024-08-23 北京致远互联软件股份有限公司 Intelligent proxy system and method for human-computer collaboration process

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20221125
