
CN118246910A - Conversational online payment method, system, medium, equipment and program product - Google Patents

Conversational online payment method, system, medium, equipment and program product

Info

Publication number
CN118246910A
CN118246910A (application CN202410668542.XA)
Authority
CN
China
Prior art keywords
user
conversational
audio signal
model
online payment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410668542.XA
Other languages
Chinese (zh)
Inventor
韩娟
仝天
于丽梅
姜远孟
温馨
田梦雨
李爱青
王新新
孙源
杜丽洁
高琼
吕会会
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Marketing Service Center of State Grid Shandong Electric Power Co Ltd
Original Assignee
Marketing Service Center of State Grid Shandong Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Marketing Service Center of State Grid Shandong Electric Power Co Ltd filed Critical Marketing Service Center of State Grid Shandong Electric Power Co Ltd
Priority to CN202410668542.XA priority Critical patent/CN118246910A/en
Publication of CN118246910A publication Critical patent/CN118246910A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/08Payment architectures
    • G06Q20/10Payment architectures specially adapted for electronic funds transfer [EFT] systems; specially adapted for home banking systems
    • G06Q20/102Bill distribution or payments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0475Generative networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/094Adversarial learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/08Payment architectures
    • G06Q20/14Payment architectures specially adapted for billing systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification
    • G06Q20/4014Identity check for transactions
    • G06Q20/40145Biometric identity checks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/16Speech classification or search using artificial neural networks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • G10L25/30Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Signal Processing (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Computer Security & Cryptography (AREA)
  • Psychiatry (AREA)
  • Hospice & Palliative Care (AREA)
  • Child & Adolescent Psychology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of natural language processing, and discloses a conversational online payment method, system, medium, equipment and program product, wherein the method comprises the following steps: acquiring an audio signal of a user; obtaining a noise-reduced audio signal from the audio signal through a sound noise reduction model; in response to a selected language category, invoking the speech recognition model corresponding to that language category and converting the noise-reduced audio signal into text; analyzing the text with a large language model to obtain the user intention; obtaining the user emotion type from the text through an emotion analysis model; if the user intention is fee inquiry, performing a bill inquiry, selecting an intervention measure based on the user emotion type, and pushing the bill; if the user intention is fee payment, selecting an intervention measure based on the user emotion type and guiding the user to complete the payment operation. Customer experience in the online payment process is effectively improved.

Description

Conversational online payment method, system, medium, equipment and program product
Technical Field
The invention relates to the technical field of natural language processing, in particular to a conversational online payment method, system, medium, equipment and program product.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In the electric charge payment field, users usually pay electric charge through websites of electric power companies, mobile phone applications or channels such as off-line business halls, traditional online payment, automatic deduction or off-line payment and other modes become user habits, but most of the modes require users or business halls to perform multi-step operation, have certain complexity, lack personalized services and interactions, and also have potential safety hazards.
In terms of user experience: the operation is cumbersome, and users expect a simpler interaction mode that no longer depends on keyboard and mouse input.
In terms of security: traditional payment methods carry risks of information leakage and account theft, and require stronger security measures.
In terms of personalized services: traditional modes cannot provide services matched to individual user requirements, and the interaction lacks a human touch.
Currently, with the wide application of artificial intelligence technology, users' requirements for intelligent and efficient electric charge payment keep rising, and users expect their personalized requirements to be better met. Therefore, how to integrate advanced technology to construct a conversational online electric charge payment method that brings a more intelligent, convenient and safe payment experience to users is a problem to be solved.
With technical progress and rapidly growing demands for user experience optimization, conventional electric charge payment technology cannot meet the business processing demands of intelligent electric charge payment. At present, some progress has been made in conversational electric charge payment: voice recognition and natural language processing (NLP) technology enable voice interaction and provide a more convenient payment mode for users; face recognition technology strengthens user identity authentication and enhances the safety of payment operations; and big data analysis enables more personalized service and recommendation through user behavior data. Despite the introduction of new security technologies, as network threats continue to evolve, protection means need to be continually upgraded. Implementing some advanced techniques may also increase costs, especially for small electric utility companies, which may face financial pressure. In addition, when technologies such as biometric identification are used, user privacy protection requires greater attention to avoid potential disputes. Moreover, existing solutions cannot support dialects or take user emotion into account, so user experience is poor.
Disclosure of Invention
In order to solve the above problems, the invention provides a conversational online payment method, system, medium, equipment and program product, which not only realize diversified language support in the speech recognition process, but also analyze the emotion in the customer's dialogue text and match it to a formulated intervention strategy, effectively improving customer experience in the online payment process.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a first aspect of the present invention provides a conversational online payment method, comprising:
Acquiring an audio signal of a user;
For the audio signal, obtaining a noise-reduced audio signal through a sound noise reduction model;
Responding to the selected language category, calling a voice recognition model corresponding to the selected language category, and converting the noise-reduced audio signal into a text; analyzing to obtain user intention through a large language model based on the text; based on the text, obtaining the emotion type of the user through an emotion analysis model;
If the user intention is the fee inquiry, carrying out bill inquiry, selecting intervention measures based on the emotion type of the user, and carrying out bill pushing; if the user intends to pay the fee, selecting intervention measures based on the emotion type of the user, and guiding the user to complete payment operation.
Further, the training process of the sound noise reduction model is as follows:
Inputting the audio signal with noise into a deep learning model, and recovering the audio signal after noise reduction; the noisy audio signal comprises a clean audio signal and a noise signal;
Based on the pure audio signal and the noise-reduced audio signal, the parameters of the deep learning model are continuously adjusted through a back propagation algorithm with the aim of minimizing the mean square error, and the sound noise reduction model is obtained.
Further, synthesized speech generated by a generative adversarial network (GAN) is used in bill pushing and in the process of guiding the user to complete the payment operation.
Further, the method further comprises the following steps: optimizing the voice recognition model and the intervention measures by adopting a total effect; the total effect is the product of emotion recognition rate and intervention effect.
Further, in the process of guiding the user to finish the payment operation, a voiceprint recognition authentication technology is introduced.
Further, the speech recognition model adopts a recurrent neural network and a long short-term memory network connected in sequence.
A second aspect of the present invention provides a conversational online payment system, comprising:
A signal acquisition module configured to: acquiring an audio signal of a user;
A noise reduction module configured to: for the audio signal, obtaining a noise-reduced audio signal through a sound noise reduction model;
An analysis module configured to: responding to the selected language category, calling a voice recognition model corresponding to the selected language category, and converting the noise-reduced audio signal into a text; analyzing to obtain user intention through a large language model based on the text; based on the text, obtaining the emotion type of the user through an emotion analysis model;
An interaction module configured to: if the user intention is the fee inquiry, carrying out bill inquiry, selecting intervention measures based on the emotion type of the user, and carrying out bill pushing; if the user intends to pay the fee, selecting intervention measures based on the emotion type of the user, and guiding the user to complete payment operation.
A third aspect of the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps in a conversational online payment method as described above.
A fourth aspect of the invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps in a conversational online payment method as described above when executing the program.
A fifth aspect of the invention provides a computer program product or computer program comprising computer instructions stored on a computer readable storage medium. The processor of a computer device reads the computer instructions from the computer readable storage medium and executes them, causing the computer device to perform the steps of a conversational online payment method as described above.
Compared with the prior art, the invention has the beneficial effects that:
According to the method, the emotion in the customer's dialogue text is analyzed and matched to a formulated intervention strategy, supporting the intelligent customer service in guiding and managing customer emotion according to the strategy flow's guiding scripts, improving customer experience in the online payment process.
The invention can identify various dialects and minority languages beyond standard Mandarin and supports English recognition, so its diversified language support exceeds the limitations of many existing methods restricted to one or a few languages.
The invention trains the deep learning model with a large amount of noisy and noiseless audio data, which improves the generalization capability and adaptability of the sound noise reduction model, so that noise reduction problems across different scenes are handled better; deep learning is capable of automatic learning and self-adaptation, can be continuously optimized and updated as data changes, and has a better long-term effect than traditional noise reduction methods.
Through voiceprint recognition authentication, the invention ensures that only authorized users can perform payment operations, avoiding the potential safety hazards of traditional passwords or other authentication modes.
The speech recognition model of the invention uses a recurrent neural network and a long short-term memory network to improve the accuracy of user speech recognition, capturing the fine features of speech signals more precisely and accurately converting the user's varied speech into text.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flow chart of a conversational online payment method according to a first embodiment of the invention;
Fig. 2 is a flow chart of electric bill payment based on a large language model according to the first embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The embodiments of the present invention and features of the embodiments may be combined with each other without conflict, and the present invention will be further described with reference to the drawings and embodiments.
Example 1
An objective of the first embodiment is to provide a conversational online payment method.
In order to further improve the technology and service of the electric charge payment business, user privacy security and interaction convenience must be fully considered. Through the combination of advanced technologies such as voice interaction, large-model algorithms, face recognition authentication and voiceprint recognition, a conversational online payment method based on a large model is innovatively constructed, providing users with a more intelligent, more convenient and safer electric charge payment experience.
The conversational online payment method provided by this embodiment is suitable for online payment of electric charges by electric power users, realizing a more intelligent, personalized and safe user experience.
The conversational online payment method provided by this embodiment comprises the following steps: acquiring an audio signal of a user; obtaining a noise-reduced audio signal from the audio signal through a sound noise reduction model; in response to a selected language category, invoking the speech recognition model corresponding to that language category and converting the noise-reduced audio signal into text; analyzing the text with a large language model to obtain the user intention; obtaining the user emotion type from the text through an emotion analysis model, and selecting an intervention measure based on the user emotion type; if the user intention is fee inquiry, querying the bill and pushing the bill in combination with the intervention measure; if the user intention is fee payment, guiding the user to complete the payment operation in combination with the intervention measure. For example, if the user intention is fee inquiry: if the user emotion type is urgent, the intervention measure pushes the total amount of the bill to the user; if the user emotion type is confused, the intervention measure pushes the fee details in the bill; if the user emotion type is angry, the intervention measure pushes the fee details in the bill and describes the peak electricity-consumption period with a calming script and tone. If the user intention is fee payment: if the user emotion type is urgent, the intervention measure jumps directly to the payment interface; if the user emotion type is confused, the intervention measure walks the user through the payment flow; if the user emotion type is angry, the intervention measure jumps to the payment interface and explains the payment flow with a calming script and tone. A minimal mapping of this logic is sketched below.
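The snippet below is a minimal Python sketch of the intervention-selection logic just described; it is illustrative only, and every name in it (the intention and emotion labels, the measure strings) is an assumption rather than terminology from the patent.

```python
# Hypothetical lookup table: (user intention, user emotion type) -> intervention
# measure, mirroring the examples in the paragraph above.
INTERVENTIONS = {
    ("fee_inquiry", "urgent"): "push_bill_total",
    ("fee_inquiry", "confused"): "push_fee_details",
    ("fee_inquiry", "angry"): "push_fee_details_with_calming_script",
    ("fee_payment", "urgent"): "jump_to_payment_interface",
    ("fee_payment", "confused"): "walk_through_payment_flow",
    ("fee_payment", "angry"): "jump_to_payment_interface_with_calming_script",
}

def select_intervention(intention: str, emotion: str) -> str:
    """Return the intervention measure for a recognized intention/emotion pair."""
    return INTERVENTIONS.get((intention, emotion), "default_response")

print(select_intervention("fee_inquiry", "angry"))
# -> push_fee_details_with_calming_script
```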
The conversational online payment method provided by this embodiment specifically comprises the following steps:
Step 1: and (5) constructing a voice recognition and interaction parameter management model.
And 101, constructing a voice recognition model based on a deep learning algorithm.
According to the embodiment, the voice instruction of the user is converted into the text through the voice recognition model, so that the intention of the user is understood, the accuracy of voice recognition of the user is improved by using the neural network model combination such as a Convolutional Neural Network (CNN), a cyclic neural network (RNN) and a long-short-term memory network (LSTM), fine features of voice signals can be more accurately captured by the voice recognition model, adaptability to different accents and pronunciations is improved through the training and feature extraction of the voice recognition model, and various voices of the user are accurately converted into the text.
The speech recognition model comprises 4 units, namely a feature extraction unit, an RNN unit, an LSTM unit and a decoder. First, the feature extraction unit performs feature extraction through CNN, reduces the time dimension, and captures local context information. Then, more complex timing patterns are captured using RNN unit and LSTM unit processing features. Finally, the decoder generates text predictions through a fully connected layer connected to the LSTM unit output.
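The following is a minimal PyTorch sketch of this four-unit architecture (CNN feature extraction, then RNN, then LSTM, then a fully connected decoder). The layer sizes and vocabulary size are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class SpeechRecognizer(nn.Module):
    """Feature-extraction CNN -> RNN -> LSTM -> fully connected decoder."""
    def __init__(self, n_mels=80, hidden=256, vocab_size=5000):
        super().__init__()
        # CNN: reduces the time dimension and captures local context
        self.cnn = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.rnn = nn.RNN(hidden, hidden, batch_first=True)    # simpler timing patterns
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)  # longer-range dependencies
        self.decoder = nn.Linear(hidden, vocab_size)           # per-frame token logits

    def forward(self, x):                     # x: (batch, n_mels, time)
        feats = self.cnn(x).transpose(1, 2)   # -> (batch, time', hidden)
        out, _ = self.rnn(feats)
        out, _ = self.lstm(out)
        return self.decoder(out)              # (batch, time', vocab_size)

logits = SpeechRecognizer()(torch.randn(2, 80, 100))
print(logits.shape)  # torch.Size([2, 50, 5000])
```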
The RNN unit update rules in the speech recognition model are as follows: $h_t = \sigma(W_{xh} x_t + W_{hh} h_{t-1} + b_h)$, $y_t = W_{hy} h_t + b_y$; wherein $h_t$ is the hidden state at time step $t$, $x_t$ is the input feature at time step $t$, $W_{xh}$, $W_{hh}$, $W_{hy}$ are weight matrices, $b_h$ and $b_y$ are bias terms, and $\sigma$ is an activation function.
The LSTM cells in the speech recognition model contain a more complex structure including forget gates, input gates and output gates, whose equations are as follows: $f_t = \sigma(W_f [h_{t-1}, x_t] + b_f)$, $i_t = \sigma(W_i [h_{t-1}, x_t] + b_i)$, $\tilde{c}_t = \tanh(W_c [h_{t-1}, x_t] + b_c)$, $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$, $o_t = \sigma(W_o [h_{t-1}, x_t] + b_o)$; wherein $f_t$, $i_t$, $\tilde{c}_t$, $c_t$, $o_t$ are the values of the forget gate, input gate, candidate memory cell, memory cell and output gate, respectively; $W_f$, $W_i$, $W_c$, $W_o$ are the corresponding weight matrices; $b_f$, $b_i$, $b_c$, $b_o$ are bias terms.
In training the speech recognition model, a cross-entropy loss function measures the difference between the model's predictions and the actual labels: $L(\theta) = -\sum_i y_i \log p(y_i \mid x_i; \theta)$; wherein $L$ is the loss function, $y_i$ is the true label of the $i$-th sample, $x_i$ is the input feature, $\theta$ denotes the model parameters, and $p$ is the probability distribution predicted by the speech recognition model. The optimization process involves gradient descent or variants thereof, such as stochastic gradient descent (SGD) or adaptive moment estimation (Adam), to update the speech recognition model parameters $\theta$ and minimize the loss function $L$; a single training step is sketched below.
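As a hedged illustration of this training loop, the snippet below performs one cross-entropy/Adam optimization step on dummy data; the linear `model` is a stand-in so the snippet runs on its own, not the recognizer architecture itself.

```python
import torch
import torch.nn as nn

vocab_size = 5000
model = nn.Linear(256, vocab_size)            # placeholder for the recognizer
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()             # L(theta) = -sum_i y_i log p(y_i|x_i; theta)

feats = torch.randn(32, 256)                  # dummy per-frame input features x_i
labels = torch.randint(0, vocab_size, (32,))  # dummy true labels y_i

loss = criterion(model(feats), labels)        # predicted distribution vs. true labels
optimizer.zero_grad()
loss.backward()                               # backpropagate gradients
optimizer.step()                              # update parameters theta
print(float(loss))
```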
Step 102: constructing a sound noise reduction model.
To eliminate or reduce environmental noise, provide clearer and purer audio signals, and improve the accuracy of speech recognition, this embodiment constructs a sound noise reduction model.
First, the sound noise reduction model is trained and optimized through methods such as signal enhancement and feature extraction, based on noisy and noiseless audio data. In the training stage, the preprocessed noisy audio is input into a deep learning model (the sound noise reduction model), and the parameters of the deep learning model are continuously adjusted through a back propagation algorithm so that the output noiseless audio signal is as close as possible to the original noiseless audio. During training, an integrated acceleration framework is innovatively designed: different compression and acceleration methods, including optimization techniques such as batch normalization and learning rate decay, are combined to accelerate the convergence of the sound noise reduction model and improve its performance. Finally, the trained sound noise reduction model is applied to the noisy audio signal to be processed, and inference yields the noise-reduced audio signal. The method is as follows:
Suppose there is a noisy audio signal $y(t)$ containing a clean audio signal $x(t)$ and a noise signal $n(t)$, i.e., $y(t) = x(t) + n(t)$. The objective of the sound noise reduction model is to recover the clean audio signal $x(t)$ from the noisy audio signal. This may be achieved by minimizing some loss function, such as the mean square error (MSE): $L(\theta) = \frac{1}{T} \sum_{t=1}^{T} \big(x(t) - \hat{x}(t;\theta)\big)^2$; wherein $\theta$ denotes the sound noise reduction model parameters, $\hat{x}(t;\theta)$ is the output of the sound noise reduction model, and $T$ is the total number of time steps. A minimal training sketch of this objective follows.
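In the sketch below, a noisy signal is synthesized as y = x + n, a small stand-in network predicts the clean signal from it, and the MSE between the two is minimized by backpropagation. The tiny convolutional denoiser is an assumption for illustration, not the patent's architecture.

```python
import torch
import torch.nn as nn

denoiser = nn.Sequential(                      # stand-in denoising network
    nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
    nn.Conv1d(16, 1, kernel_size=9, padding=4),
)
opt = torch.optim.SGD(denoiser.parameters(), lr=1e-3)

clean = torch.randn(8, 1, 1024)                # x(t): a batch of clean audio
noise = 0.1 * torch.randn_like(clean)          # n(t)
noisy = clean + noise                          # y(t) = x(t) + n(t)

denoised = denoiser(noisy)                     # x_hat(t; theta)
loss = torch.mean((clean - denoised) ** 2)     # MSE over all time steps
opt.zero_grad(); loss.backward(); opt.step()   # back propagation parameter update
print(float(loss))
```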
The sound noise reduction model first employs a convolutional neural network (CNN) for local feature extraction; the convolution operation can be expressed as: $y[n] = (x * w)[n] = \sum_m x[m]\, w[n-m]$; wherein $y$ is the output feature map, $x$ is the input signal, $w$ is the convolution kernel, $*$ denotes the convolution operation, $n$ indexes the $n$-th region of the output feature map, and $m$ indexes the regions of the input signal of convolution-kernel size.
The sound noise reduction model then takes the local features extracted by the CNN as input to a recurrent neural network (RNN), capturing the timing dependencies in the sequence. The hidden state update formula is: $h_t = \sigma(W_h h_{t-1} + W_x x_t + b)$; wherein $h_t$ is the hidden state of the current time step, $h_{t-1}$ is the hidden state of the previous time step, $x_t$ is the input of the current time step, $W_h$ and $W_x$ are weight matrices, $b$ is a bias term, and $\sigma$ is an activation function.
The integrated acceleration framework incorporates batch normalization and learning rate decay techniques. Each layer's inputs are normalized: for an input value set $B$, the normalized output is $y_i = \gamma\,\mathrm{Norm}(x_i) + \beta$, wherein $\mathrm{Norm}(\cdot)$ is the normalization function and $\gamma$, $\beta$ are restoration parameters that preserve the distribution of the original data to a certain extent. Meanwhile, during training of the sound noise reduction model, a relatively good solution is obtained quickly by using a larger learning rate, which is then gradually reduced over the iterations through exponential decay: $\alpha = \alpha_0 \cdot k^{\,s/d}$; wherein $\alpha$ is the learning rate used in each optimization round, $\alpha_0$ is the preset initial learning rate, $k$ is the attenuation coefficient, $s$ is the global step used in the attenuation calculation, and $d$ is the decay rate. A small sketch of this schedule follows.
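The snippet below implements the exponential decay formula directly; the numbers are illustrative assumptions.

```python
def exp_decay_lr(alpha0: float, k: float, global_step: int, decay_steps: int) -> float:
    """alpha = alpha0 * k ** (global_step / decay_steps)."""
    return alpha0 * k ** (global_step / decay_steps)

for step in (0, 1000, 2000, 4000):
    print(step, round(exp_decay_lr(0.01, 0.96, step, 1000), 6))
# A larger rate early on finds a good solution quickly; the rate then
# decays gradually with the iterations, as described above.
```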
A self-encoder (autoencoder) is an unsupervised learning model that attempts to learn a compressed representation of the input data, trained by minimizing the reconstruction error: $L = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - \hat{x}_i \rVert^2$; wherein $x_i$ is an input sample, $\hat{x}_i$ is the reconstructed sample, and $N$ is the number of samples.
Various optimization algorithms, such as stochastic gradient descent (SGD) and Adam, are used in training the deep learning model, while regularization techniques such as L1/L2 regularization and dropout are used to prevent overfitting. For example, the update rule of SGD is: $\theta \leftarrow \theta - \eta \nabla_\theta L(\theta)$; wherein $\eta$ is the learning rate and $\nabla_\theta L(\theta)$ is the gradient of the loss function with respect to the sound noise reduction model parameters.
Compared with the existing noise reduction technology, the sound noise reduction model provided by the embodiment has the advantages that: the combined CNN and RNN deep learning technology can better process complex sound signals and nonlinear noise characteristics, and has higher noise reduction performance and tone quality compared with the traditional noise reduction method. By training the deep learning model with a large amount of noisy and noiseless audio data, the generalization capability and adaptability of the sound noise reduction model can be improved, so that the noise reduction problem of various different scenes can be better processed. The deep learning technology has the capabilities of automatic learning and self-adaption, can be continuously optimized and updated along with the change of data, and has better long-term effect compared with the traditional noise reduction method. The deep learning technique can be combined with other audio processing techniques to form a more complete audio processing system, providing richer, more advanced audio processing functions.
Step 103: constructing a voiceprint database.
Based on a standard voiceprint information acquisition flow, the voiceprints of enrolled personnel are collected and stored; the acquisition flow includes standard voiceprint acquisition (voice collected by standard voiceprint acquisition equipment) and non-standard voiceprint acquisition (e.g., WeChat voice, telecom operator call recordings), and a voiceprint database is established.
(1) Data collection and annotation.
(101) Multi-source data acquisition: a large amount of voice data is collected from different sources (e.g., microphone array, phone call, social media platform, etc.).
(102) Data cleaning: remove noise and silent segments, and handle differing sampling rates and bit depths.
(103) Manual labeling: by collecting a large amount of audio data and manually labeling, the audio data are classified into different categories or labels, including different speakers, different emotions, different speech speeds and the like.
(104) Feature extraction: extract high-value acoustic features such as MFCCs (mel-frequency cepstral coefficients), linear predictive coding (LPC) coefficients, cepstral coefficients, fundamental frequency (F0), speech rate and pitch; a sketch follows.
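As an illustration of this step, the snippet below extracts MFCCs and the fundamental frequency with librosa (assumed to be available); the file name is hypothetical.

```python
import librosa
import numpy as np

y, sr = librosa.load("sample_call.wav", sr=16000)   # hypothetical recording
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # shape (13, frames)
f0, voiced_flag, voiced_prob = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)

# One compact per-utterance feature vector: mean MFCCs plus mean F0
features = np.concatenate([mfcc.mean(axis=1), [np.nanmean(f0)]])
print(features.shape)  # (14,)
```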
(2) Model design and training.
A classification model is trained on the collected audio data; classification labels include personnel type, emotion type, speech rate and the like, and model performance is enhanced through parameter tuning so that new audio data is correctly classified.
(201) Model architecture selection: deep learning models such as RNNs, CNNs and Transformers are selected for training.
(202) Building training and validation sets: the annotated data is divided into a training set for model learning and a validation set for model evaluation.
(203) Model training: the model is trained using a large amount of annotation data, and weight parameters are adjusted by back propagation and gradient descent methods.
(204) Hyperparameter tuning: optimal hyperparameter settings such as learning rate, batch size and number of layers are found through techniques such as cross-validation and grid search.
(205) Regularization strategy: the techniques of Dropout, batch normalization and the like are adopted to reduce the overfitting.
(206) Performance evaluation: test the model on the validation set and evaluate its performance using indicators such as the confusion matrix, accuracy and recall (see the sketch below).
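The sketch below illustrates step (206) with scikit-learn (assumed available) on dummy validation labels.

```python
from sklearn.metrics import confusion_matrix, accuracy_score, recall_score

y_true = ["urgent", "confused", "angry", "urgent", "confused"]   # dummy labels
y_pred = ["urgent", "confused", "urgent", "urgent", "confused"]  # dummy predictions

print(confusion_matrix(y_true, y_pred, labels=["urgent", "confused", "angry"]))
print("accuracy:", accuracy_score(y_true, y_pred))
print("macro recall:", recall_score(y_true, y_pred, average="macro"))
```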
Based on emotion recognition and sentiment analysis capabilities over dialogue text, the emotion in the customer's dialogue text is analyzed and matched to a formulated intervention strategy, supporting the intelligent customer service in guiding and managing the customer's emotion according to the strategy flow's guiding scripts and improving the customer experience of online customer service.
Step 2: constructing a conversational electric charge payment management model.
Step 201, identifying and intervening in customer emotion based on a multimodal architecture.
As shown in fig. 1, the steps of identifying and intervening in customer emotion based on the multimodal architecture include:
(1) Data collection and preprocessing.
Customer dialogue data is collected from different channels (such as telephone, chat and mail); irrelevant information such as noise, punctuation marks and unstructured data is removed; and the text is converted to lowercase, segmented into words, stripped of stop words, and so on, to achieve text normalization.
(2) Training an emotion analysis model.
The invention provides personal emotion analysis that fuses and classifies multimodal information such as text, voice and images, and experimental results show higher accuracy. In processing dialogue text, the decoder mechanism of a Transformer is used to generate words in the language model as the final output. The emotion analysis model is given memory capacity, ensuring that communication with the user stays consistent with the context information, which improves the model's anthropomorphic quality and the realism of the interaction. Based on the preprocessed input images and the associated dialogue text, a convolutional neural network (CNN) extracts emotion-related features from the images and natural language processing extracts semantic features from the text; each branch outputs an emotion classification result, and a gradient boosting machine (GBM) model receives the mapping of each branch's emotion classification and performs the final emotion classification, as sketched below.
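The following is a minimal sketch of the fusion step just described: per-modality features (stand-ins here for CNN image features and text semantic features) are concatenated and a gradient boosting machine performs the final emotion classification. All data in the sketch is random placeholder data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
image_feats = rng.normal(size=(200, 64))   # stand-in for CNN image features
text_feats = rng.normal(size=(200, 32))    # stand-in for text semantic features
X = np.hstack([image_feats, text_feats])   # fused multimodal representation
y = rng.integers(0, 3, size=200)           # emotion labels, e.g. 0/1/2

gbm = GradientBoostingClassifier().fit(X, y)  # final emotion classifier
print(gbm.predict(X[:5]))
```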
(3) Real-time emotion recognition.
(301) Model deployment: deploy the trained emotion analysis model to a production environment.
(302) Real-time analysis: when a customer interacts with the intelligent customer service, the emotion analysis model analyzes the customer's dialogue text in real time.
(303) Emotion classification: the emotion analysis model classifies the customer's emotion into predefined emotion categories.
(4) Establishing an intervention strategy.
The invention adopts corresponding intervention measures based on recognition of the customer's emotion. Intervention strategies and script templates are formulated for the different emotion categories, and the most appropriate response strategy is selected according to the customer's emotional state. Two main variables are defined: the emotion recognition rate (E) represents the accuracy with which the emotion analysis model recognizes customer emotion (the higher the rate, the more accurate the recognition); the intervention effect (I) represents the influence of the intervention measures on customer emotion (the higher the value, the more effective the intervention).
Emotion recognition rate (E): the emotion recognition rate is an important index for measuring the performance of an emotion analysis system, reflecting the system's ability to correctly recognize emotional tendencies in text. It is directly related to the accuracy and reliability of the emotion analysis model, and the invention relies on the fact that it depends on the accuracy of the emotion analysis model.
Intervention effect (I): the quality of an intervention depends on its influence on the customer's emotion. Therefore, when performing emotion analysis or emotion recognition, interventions need to be carefully selected and adequately verified and tested, to ensure that the intervention measures genuinely help, so that the customer's emotion is better understood and responded to.
Total effect (T): the total effect is the product of the emotion recognition rate and the intervention effect, representing the overall effect of the emotion analysis model and the intervention strategy, namely: T = E × I.
The mathematical model above helps in understanding how to improve the overall effect of customer service: the emotion recognition rate (E) can be raised by optimizing the emotion recognition algorithm, and the total effect (T) can be raised by optimizing the intervention measures. Meanwhile, the service effect can be evaluated from the value of T and adjusted as needed; a tiny computation sketch follows.
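The computation below makes the T = E × I comparison concrete; the strategy names and numbers are illustrative assumptions.

```python
def total_effect(recognition_rate: float, intervention_effect: float) -> float:
    """T = E * I."""
    return recognition_rate * intervention_effect

strategies = {
    "baseline": (0.82, 0.60),
    "better recognizer": (0.90, 0.60),   # optimize E
    "better scripts": (0.82, 0.75),      # optimize I
}
for name, (E, I) in strategies.items():
    print(name, round(total_effect(E, I), 3))
# Because T is a product, improving either factor raises the overall effect.
```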
(5) Customer emotion management.
(501) Automated intervention: reply content and tone (the intervention measure) are automatically adjusted according to the customer's emotion.
(502) Emotion tracking: changes in the customer's emotion are monitored to ensure the effectiveness of the intervention measures.
(503) Feedback loop: customer feedback is collected, the total effect is calculated, and the emotion recognition model and intervention strategies are continuously optimized.
(6) Continuous optimization.
(601) Data analysis: customer interaction data is analyzed to learn which intervention measures are effective and which need improvement.
(602) Model updating: the emotion analysis model is updated periodically to accommodate new data and behavior patterns.
(603) User experience research: user experience research is conducted regularly to understand customers' needs and expectations.
Step 202, electric bill payment based on a large language model.
The invention utilizes large language models (LLMs), assisted by natural language processing (NLP) technology, in the bill payment process, addressing the problem that existing robots push knowledge whose answers are beside the question and cannot accurately match the final knowledge point, thereby further improving the accuracy and comprehensiveness of knowledge pushing. Specifically, the large language model can understand and process the user's query requirements, provide relevant information, and guide the user through the payment process.
As shown in fig. 2, the electric bill payment process based on the large language model includes:
(1) Understanding user intent: the large language model analyzes the user's voice or text input and understands which bill information the user wants to query, or which payment operation or problem consultation to carry out. For example, the user may ask "What was my electricity bill last month?" or say "I want to pay my electricity bill now."
(2) Retrieving and providing information: according to the user's query, the large language model can retrieve bill information from a database, including the bill amount, payment period and the like; it can also provide payment guidance, telling the user how to pay online or through other means.
(3) Multi-round dialogue management: across multiple rounds of dialogue, the large language model remembers previous conversational content (the history), which helps it provide more accurate information and services in subsequent communication. For example, if the user queried billing information in the first round and wants to pay directly in the second round, the large language model can use the earlier information to simplify the payment process; a minimal sketch of this follows.
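The sketch below illustrates this history-carrying behavior; `call_llm` is a hypothetical stand-in for whatever model endpoint is actually deployed, and the reply text is canned for illustration.

```python
def call_llm(messages):
    # Hypothetical LLM call; a real system would send `messages` to a model API.
    return "Your bill for last month is 128.50 CNY. Shall I start the payment?"

history = []  # the multi-round dialogue memory

def chat_turn(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    reply = call_llm(history)          # earlier turns give the model context
    history.append({"role": "assistant", "content": reply})
    return reply

print(chat_turn("What was my electricity bill last month?"))
print(chat_turn("OK, pay it now."))    # the model can reuse the queried bill info
```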
(4) Payment system integration: the large language model may be integrated with existing payment systems, such as bank payment gateways and third-party payment platforms, so that the user can complete the payment operation directly through the model. The large language model can generate payment confirmation information to ensure that the user knows the payment status and result.
(5) User education and guidance: the large language model can provide educational content for bill payment and help users understand payment flow and notes. For users unfamiliar with electronic payments, the large language model may provide step-by-step guidance to help them complete the payment.
(6) Security and privacy protection: when processing bill payment, the large language model needs to ensure compliance with data protection regulations, and protect personal information of users from leakage. The large language model should have the ability to verify the identity of the user to prevent unauthorized payment.
Step 3: conversational electric charge payment.
The invention performs conversational electric charge payment based on the voice recognition and interaction parameter management model and the conversational electric charge payment management model. Relying on an electric marketing knowledge system, it constructs an intelligent electric charge bill retrieval engine based on the large model's capabilities of semantic understanding and knowledge correlation reasoning, and extracts recommended knowledge content according to service classification and knowledge matching degree. A voice assistant function is invoked or woken up on the front page of the electric charge payment software, and dialogue with the voice assistant in natural language supports functions such as querying the user's bill and paying the electric charge. The integrated functions are as follows:
(1) Bill inquiry. The user is supported in querying by voice the electricity consumption, electricity bills and account balance within the last three years, as well as daily electricity consumption, and the query results are pushed in the form of voice, text, pictures and the like.
(2) Fee accounting management. Functions such as electric charge payment, payment confirmation, printing of the user settlement confirmation bill, and billing confirmation by the customer are supported by voice. For non-account-holders, invocation by face recognition is supported.
(3) Client type judgment. In response to a customer login instruction, the customer type (low-voltage residential customer, electric vehicle user, high-voltage customer, photovoltaic customer, etc.) is automatically judged. The consultant's age and the service type being consulted are identified from the voiceprint and keywords to judge the client type, and the relevant special service solution is pushed by voice. Based on the constructed voiceprint database, newly input audio data is analyzed, the consultant's characteristics are identified from the voiceprint, and this assists in judging the consultant's age range (child, middle-aged, elderly) and sex (male, female). Keywords extracted from the speech recognition output are matched against known service types to complete automatic classification of the service type. The client type and service type are then combined, and the relevant characteristic service solution is pushed by voice.
(4) Wake word management. Waking up by a specific spoken command is supported, with the command operation executed upon waking. Multiple custom wake words can be set to meet personalized requirements. Wake word evaluation is supported, assessing which vocabulary is suitable as a wake word.
(5) Intelligent session distribution management. When the customer chooses to switch to a human agent, customers consulting different problems are allocated to the corresponding customer service agents according to principles such as satisfaction, conversion rate, capability value, workload and acquaintance priority. Functions such as quick reply, an internal knowledge base, message reminders and one-key switching are supported.
(6) One-key screen reading. Status-bar operations such as play/pause, refresh, and fast forward or rewind are supported; all text and information on the screen can be read aloud with one key, and selected text can be read by double-click selection.
(7) Voice library customization and management. Input text is sent to the server over the network and timbre synthesis is performed using deep learning technology, producing high-quality, full timbres that sound closer to a human voice, while multiple timbre and language choices meet different user needs.
(8) Multi-dialect and multi-language support. Various dialects and minority languages beyond standard Mandarin can be identified, and English recognition is supported, which has remarkable value in countries with vast territories and diverse languages. Such diversified language support exceeds the limitations of many existing approaches restricted to one or a few languages. In addition, a personalized voice recognition service allows specific vocabulary and phrases to be customized, flexibly meeting users' specific requirements.
According to the conversational online payment method provided by the embodiment, a series of problems existing in a traditional electric charge payment mode are solved by integrating a plurality of technologies such as voice input, voice interaction, large model algorithm technology and face recognition authentication, and the ever-increasing demands of users on efficient and personalized services are met.
(1) Realizing personalized voice interaction.
The primary purpose of the conversational online payment method provided by this embodiment is to wake the robot assistant through voice input, realizing personalized voice interaction. Through the user's repeated voice input, the user's voice characteristics can be accurately identified and a personal voice model built, realizing personalized interaction when waking the robot assistant. This function not only improves the user experience but also makes interaction more relevant and personalized, letting the user feel cared for.
(2) Improving voice interaction intelligence.
The conversational online payment method provided by this embodiment aims to deliver a more intelligent and natural communication experience through its voice interaction function. Through the application of large-model algorithm technology, the intelligent voice interaction system attains a higher level of intelligence: it continuously learns the user's wording habits and voice characteristics, better understands the user's voice instructions, adapts to context understanding in different scenes, meets the user's personalized requirements, and raises the intelligence level of voice interaction. Conversation between the user and the robot assistant becomes smoother and more accurate; meanwhile, through continuous iteration and optimization based on user feedback and historical data, instructions can be judged and answered more precisely, the voice interaction effect improves, and a deeper interaction relationship is established with the user.
(3) Improving the security of the payment process.
The conversational online payment method provided by this embodiment introduces voiceprint recognition authentication technology to improve the security of the payment process. Voiceprint recognition authentication ensures that only authorized users can perform payment operations, avoiding the potential safety hazards of traditional passwords or other authentication modes. To further strengthen security, face recognition authentication is also introduced, and advanced techniques such as liveness detection prevent common fraud, further improving the security of user data and accounts.
(4) Simplifying payment operations and optimizing user experience.
The conversational online payment method provided by this embodiment aims to make payment operations more convenient and intuitive. With simple voice instructions, the user can query electricity bills, balances and consumption records and pay the electricity fee on the system in a conversational manner, without manually performing a series of complex operations as in traditional methods. This innovative payment mode not only improves the user's operating efficiency but also humanizes the payment system, optimizing the user experience with natural, fluent responses and establishing a more natural communication relationship with the user.
Example two
An object of the second embodiment is to provide a conversational online payment system, including:
A signal acquisition module configured to: acquiring an audio signal of a user;
A noise reduction module configured to: for the audio signal, obtaining a noise-reduced audio signal through a sound noise reduction model;
An analysis module configured to: responding to the selected language category, calling a voice recognition model corresponding to the selected language category, and converting the noise-reduced audio signal into a text; analyzing to obtain user intention through a large language model based on the text; based on the text, obtaining the emotion type of the user through an emotion analysis model;
An interaction module configured to: if the user intention is the fee inquiry, carrying out bill inquiry, selecting intervention measures based on the emotion type of the user, and carrying out bill pushing; if the user intends to pay the fee, selecting intervention measures based on the emotion type of the user, and guiding the user to complete payment operation.
It should be noted that each module in this embodiment corresponds one-to-one to a step in the first embodiment, with the same implementation process, which is not repeated here.
Example III
The present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a conversational online payment method as described in the above embodiment.
Example IV
The present embodiment provides a computer device, including a memory, a processor, and a computer program stored in the memory and running on the processor, where the processor executes the program to implement the steps in a conversational online payment method according to the above embodiment.
Example five
The present embodiments provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the steps of a conversational online payment method as described in the above embodiment one.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (10)

1. A conversational online payment method, comprising:
Acquiring an audio signal of a user;
For the audio signal, obtaining a noise-reduced audio signal through a sound noise reduction model;
Responding to the selected language category, calling a voice recognition model corresponding to the selected language category, and converting the noise-reduced audio signal into a text; analyzing to obtain user intention through a large language model based on the text; based on the text, obtaining the emotion type of the user through an emotion analysis model;
If the user intention is the fee inquiry, carrying out bill inquiry, selecting intervention measures based on the emotion type of the user, and carrying out bill pushing; if the user intends to pay the fee, selecting intervention measures based on the emotion type of the user, and guiding the user to complete payment operation.
2. A conversational online payment method according to claim 1, wherein the training process of the acoustic noise reduction model is:
Inputting the audio signal with noise into a deep learning model, and recovering the audio signal after noise reduction; the noisy audio signal comprises a clean audio signal and a noise signal;
Based on the pure audio signal and the noise-reduced audio signal, the parameters of the deep learning model are continuously adjusted through a back propagation algorithm with the aim of minimizing the mean square error, and the sound noise reduction model is obtained.
3. A conversational online payment method according to claim 1, wherein synthesized speech generated by a generative adversarial network is used in bill pushing and in the process of guiding the user to complete the payment operation.
4. The conversational online payment method according to claim 1, further comprising: optimizing the speech recognition model and the intervention measures using a total effect, the total effect being the product of the emotion recognition rate and the intervention effect.
5. The conversational online payment method according to claim 1, wherein voiceprint recognition authentication is introduced in the process of guiding the user to complete the payment operation.
6. The conversational online payment method according to claim 1, wherein the speech recognition model employs a recurrent neural network and a long short-term memory network connected in sequence.
7. A conversational online payment system, comprising:
A signal acquisition module configured to acquire an audio signal of a user;
A noise reduction module configured to obtain a noise-reduced audio signal from the audio signal through a sound noise reduction model;
An analysis module configured to: in response to a selected language category, invoke the speech recognition model corresponding to the selected language category and convert the noise-reduced audio signal into text; analyze the text through a large language model to obtain a user intention; and obtain the user's emotion type from the text through an emotion analysis model;
An interaction module configured to: if the user intention is a fee inquiry, perform a bill query, select an intervention measure based on the user's emotion type, and push the bill; if the user intention is fee payment, select an intervention measure based on the user's emotion type and guide the user to complete the payment operation.
8. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the steps of the conversational online payment method according to any one of claims 1-6.
9. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, performs the steps of the conversational online payment method according to any one of claims 1-6.
10. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the steps of the conversational online payment method according to any one of claims 1-6.
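
To make the claimed flow concrete, the following Python sketches illustrate claims 1, 2, 4, and 6. They are minimal illustrations under stated assumptions, not the patented implementation; the claims fix the steps and topologies, not the code. In this first sketch of the method of claim 1, the denoiser, per-language recognizers, intent and emotion classifiers, the intervention phrases, and the lookup_bill helper are all hypothetical stand-ins.

```python
# Minimal control-flow sketch of claim 1. Every model object here is a
# hypothetical stand-in supplied by the caller, not the patented model.

INTERVENTIONS = {
    "calm": "Here you go.",
    "anxious": "No need to worry, this only takes a moment.",
    "angry": "Sorry for the trouble; let me handle this right away.",
}

def handle_utterance(audio, language, models, lookup_bill):
    """Process one user utterance and return the system's spoken response."""
    clean_audio = models["denoiser"](audio)        # sound noise reduction model
    text = models["asr"][language](clean_audio)    # recognizer for the selected language

    intent = models["intent"](text)                # large language model: "query" / "pay"
    emotion = models["emotion"](text)              # emotion analysis model, e.g. "anxious"
    opener = INTERVENTIONS.get(emotion, "")        # intervention measure per emotion type

    if intent == "query":                          # bill query and bill push
        return f"{opener} Your current bill is {lookup_bill(text)} yuan."
    if intent == "pay":                            # guide the user to complete payment
        return f"{opener} Please say 'confirm' to complete the payment."
    return "Sorry, could you say that again?"
```

A deployment would substitute trained models for each callable; only the branch structure and the conditioning of the intervention on intent and emotion come from the claim.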
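Claim 2 fixes only the training objective and data construction: the noisy input is a clean signal plus noise, and the model parameters are adjusted by backpropagation to minimize the mean squared error between the recovered and clean signals. A minimal PyTorch sketch of that loop, with an arbitrary small network standing in for the unspecified deep learning model:

```python
import torch
from torch import nn

class Denoiser(nn.Module):
    """Arbitrary small network standing in for the unspecified deep learning model."""
    def __init__(self, n_bins):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_bins, 512), nn.ReLU(), nn.Linear(512, n_bins))

    def forward(self, noisy):
        return self.net(noisy)

def train_denoiser(clean, noise, epochs=50, lr=1e-3):
    """clean, noise: (batch, frames, bins) tensors; the noisy input mixes the two."""
    model = Denoiser(clean.shape[-1])
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    mse = nn.MSELoss()
    noisy = clean + noise                 # noisy signal = clean audio + noise signal
    for _ in range(epochs):
        optimizer.zero_grad()
        recovered = model(noisy)          # recover the noise-reduced signal
        loss = mse(recovered, clean)      # objective: minimize mean squared error
        loss.backward()                   # backpropagation adjusts the parameters
        optimizer.step()
    return model                          # the trained sound noise reduction model
```

Calling train_denoiser(torch.randn(8, 100, 257), 0.1 * torch.randn(8, 100, 257)) runs the loop on synthetic spectra; real training would use paired recordings.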
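Claim 4's joint metric is a single product, so tuning trades the emotion recognition rate against the intervention effect rather than maximizing either score alone. A sketch with invented scores:

```python
def total_effect(emotion_recognition_rate: float, intervention_effect: float) -> float:
    """Claim 4's joint optimization target: the product of the two scores."""
    return emotion_recognition_rate * intervention_effect

# Hypothetical tuning example: configuration "b" wins despite its lower
# recognition rate, because 0.88 * 0.81 exceeds 0.92 * 0.70.
configs = {"a": (0.92, 0.70), "b": (0.88, 0.81)}
best = max(configs, key=lambda name: total_effect(*configs[name]))
```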
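Claim 6 fixes only the recognizer topology: a recurrent neural network followed in sequence by a long short-term memory network. A PyTorch sketch of that stack; the feature dimension, hidden width, vocabulary size, output head, and any decoding applied to the logits are assumptions, not part of the claim:

```python
import torch
from torch import nn

class SpeechRecognizer(nn.Module):
    """RNN layer followed in sequence by an LSTM layer, as named in claim 6."""
    def __init__(self, n_features=80, hidden=256, vocab_size=5000):
        super().__init__()
        self.rnn = nn.RNN(n_features, hidden, batch_first=True)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, vocab_size)

    def forward(self, features):          # features: (batch, frames, n_features)
        x, _ = self.rnn(features)         # plain recurrent layer first
        x, _ = self.lstm(x)               # then the long short-term memory layer
        return self.proj(x)               # per-frame token logits for a decoder

logits = SpeechRecognizer()(torch.randn(1, 100, 80))   # shape: (1, 100, 5000)
```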
CN202410668542.XA 2024-05-28 2024-05-28 Conversational online payment method, system, medium, equipment and program product Pending CN118246910A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410668542.XA CN118246910A (en) 2024-05-28 2024-05-28 Conversational online payment method, system, medium, equipment and program product

Publications (1)

Publication Number Publication Date
CN118246910A 2024-06-25

Family

ID=91559361

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410668542.XA Pending CN118246910A (en) 2024-05-28 2024-05-28 Conversational online payment method, system, medium, equipment and program product

Country Status (1)

Country Link
CN (1) CN118246910A (en)

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019683A (en) * 2017-12-29 2019-07-16 同方威视技术股份有限公司 Intelligent sound interaction robot and its voice interactive method
WO2019153522A1 (en) * 2018-02-09 2019-08-15 卫盈联信息技术(深圳)有限公司 Intelligent interaction method, electronic device, and storage medium
CN110444200A (en) * 2018-05-04 2019-11-12 北京京东尚科信息技术有限公司 Information processing method, electronic equipment, server, computer system and medium
CN108921562A (en) * 2018-05-24 2018-11-30 佛山市竣智文化传播股份有限公司 A kind of on-line payment and its device based on Application on Voiceprint Recognition
CN110349575A (en) * 2019-05-22 2019-10-18 深圳壹账通智能科技有限公司 Method, apparatus, electronic equipment and the storage medium of speech recognition
CN110379445A (en) * 2019-06-20 2019-10-25 深圳壹账通智能科技有限公司 Method for processing business, device, equipment and storage medium based on mood analysis
US20210127003A1 (en) * 2019-10-28 2021-04-29 Baidu Online Network Technology (Beijing) Co., Ltd. Interactive voice-control method and apparatus, device and medium
CN113077790A (en) * 2019-12-17 2021-07-06 阿里巴巴集团控股有限公司 Multi-language configuration method, multi-language interaction method and device and electronic equipment
CN111883091A (en) * 2020-07-09 2020-11-03 腾讯音乐娱乐科技(深圳)有限公司 Audio noise reduction method and training method of audio noise reduction model
CN114333882A (en) * 2022-03-09 2022-04-12 深圳市友杰智新科技有限公司 Voice noise reduction method, device and equipment based on amplitude spectrum and storage medium
WO2023207149A1 (en) * 2022-04-29 2023-11-02 荣耀终端有限公司 Speech recognition method and electronic device
CN115599894A (en) * 2022-09-22 2023-01-13 号百信息服务有限公司(Cn) Emotion recognition method and device, electronic equipment and storage medium
CN115762514A (en) * 2022-11-07 2023-03-07 国网江苏省电力有限公司营销服务中心 Intelligent voice quality inspection technology-based electric power business hall abnormal event discovery and handling method and system
CN115934918A (en) * 2023-01-10 2023-04-07 江苏电力信息技术有限公司 Multi-turn conversation method of electric charge payment prompting robot based on intelligent voice technology
CN116450943A (en) * 2023-04-12 2023-07-18 中国平安财产保险股份有限公司 Artificial intelligence-based speaking recommendation method, device, equipment and storage medium
CN116469405A (en) * 2023-04-23 2023-07-21 富韵声学科技(深圳)有限公司 Noise reduction conversation method, medium and electronic equipment
CN116844558A (en) * 2023-08-01 2023-10-03 深圳百瑞互联技术有限公司 Audio noise reduction method, system, encoder and medium based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王兴梅 (Wang Xingmei), Research on Underwater Information Processing Methods Based on Deep Learning, Beihang University Press, 30 April 2021, pages 4-7 *
郭业才 (Guo Yecai), Deep Learning and Signal Processing: Principles and Practice, China Machine Press, 30 June 2022, pages 264-247 *

Similar Documents

Publication Publication Date Title
Kabir et al. A survey of speaker recognition: Fundamental theories, recognition methods and opportunities
CN111276131B (en) Multi-class acoustic feature integration method and system based on deep neural network
CN111312245B (en) Voice response method, device and storage medium
US10771627B2 (en) Personalized support routing based on paralinguistic information
Kelly et al. Deep neural network based forensic automatic speaker recognition in VOCALISE using x-vectors
Chamishka et al. A voice-based real-time emotion detection technique using recurrent neural network empowered feature modelling
US11615787B2 (en) Dialogue system and method of controlling the same
CN117765981A (en) Emotion recognition method and system based on cross-modal fusion of voice text
CN110704618B (en) Method and device for determining standard problem corresponding to dialogue data
Chakroun et al. New approach for short utterance speaker identification
CN117149977A (en) Intelligent collecting robot based on robot flow automation
Chen et al. Integrated design of financial self-service terminal based on artificial intelligence voice interaction
Gilbert et al. Intelligent virtual agents for contact center automation
CN117150338A (en) Task processing, automatic question and answer and multimedia data identification model training method
Thakur et al. NLP & AI speech recognition: an analytical review
Tailor et al. Deep learning approach for spoken digit recognition in Gujarati language
CN112150103B (en) Schedule setting method, schedule setting device and storage medium
CN118246910A (en) Conversational online payment method, system, medium, equipment and program product
CN116561284A (en) Intelligent response method, device, electronic equipment and medium
Sartiukova et al. Remote Voice Control of Computer Based on Convolutional Neural Network
CN115691500A (en) Power customer service voice recognition method and device based on time delay neural network
Devnath et al. Emotion recognition from isolated Bengali speech
Yadava et al. Improvements in spoken query system to access the agricultural commodity prices and weather information in Kannada language/dialects
OUKAS et al. ArabAlg: A new Dataset for Arabic Speech Commands Recognition for Machine Learning Purposes
Avikal et al. Estimation of age from speech using excitation source features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination