JP7216627B2

JP7216627B2 - INPUT SUPPORT METHOD, INPUT SUPPORT SYSTEM, AND PROGRAM

Info

Publication number: JP7216627B2
Application number: JP2019163108A
Authority: JP
Inventors: 雅之川村; 信輝河野; 淳太郎松本; 佳子上屋
Original assignee: Tokio Marine and Nichido Fire Insurance Co Ltd
Current assignee: Tokio Marine and Nichido Fire Insurance Co Ltd
Priority date: 2019-09-06
Filing date: 2019-09-06
Publication date: 2023-02-01
Anticipated expiration: 2039-09-06
Also published as: JP2021043530A

Description

An embodiment of the present invention relates to an input support system, an input support method, and a program.

Conventionally, in a call center, employees who handle customers (hereinafter referred to as operators) respond to inquiries, queries, and requests from customers. An operator who has finished handling a customer by telephone enters the progress of that customer interaction into a system that manages cases, so that the information is shared with other operators.

Patent Literature 1 discloses a call center system that, in response to an inquiry from a customer, displays the type of incident associated with the inquiry and the actions that can be taken for that type of incident. Patent Literature 1 also describes an operator entering into the system how he or she responded to the customer's inquiry by selecting from a pull-down menu. Because the operator enters the progress of the case into the system, another operator who takes over the case can carry out the remaining processing.

Patent Literature 1: JP 2019-8722 A

The actions that can be taken differ depending on the type of case handled by the call center, and also on the stage of handling the case. As a result, the fixed phrases describing progress that are to be entered into an application for managing cases (hereinafter referred to as a case management application) may number in the dozens or even the hundreds. When an operator selects the target fixed phrase from a pull-down menu, finding it among hundreds of items is difficult. If the operator instead types the text rather than searching the pull-down menu, input to the case management application takes a long time, and it takes even longer when the operator is inexperienced.

In view of the above problems, one object of an embodiment of the present invention is to provide an input support method for when an operator enters a fixed phrase to be registered in a case management application.

An input support method according to an embodiment of the present invention extracts the words and characters contained in call text data; generates a feature vector in which each single word and each sequence of consecutive words, as well as each single character and each sequence of consecutive characters, is weighted by appearance frequency and rarity; and, when the feature vector is input, outputs, by means of a machine learning model, candidate fixed phrases that correspond to the content of the call text data and are to be registered in an application that manages cases.

In the above method, the machine learning model includes logistic regression and neural networks.

In the above method, the machine learning model is updated based on the feature vector and the fixed phrase selected for the call text data.

In the above method, the feature vector is generated by extracting the text data uttered by the operator from the call text data, extracting the words and characters contained in that operator text data, and weighting each single word and each sequence of consecutive words, as well as each single character and each sequence of consecutive characters, by appearance frequency and rarity.

In the above method, the call text data includes a tag indicating the start of the call and a tag indicating the end of the call.

In the above method, the call text data includes a tag indicating the start of the call and a delimiter tag indicating a point partway through the call. Each time call text data is acquired, it is converted into a feature vector, and the candidate fixed phrases corresponding to the call text data that are output to the application are updated by running the machine learning model on that feature vector.

A program according to an embodiment of the present invention causes a computer to execute the input support method according to an embodiment of the present invention.

An input support system according to an embodiment of the present invention includes: a text data conversion unit that converts call voice data into text; a text data acquisition unit that acquires call text data; a preprocessing unit that extracts the words and characters contained in the call text data and generates a feature vector in which each single word and each sequence of consecutive words, as well as each single character and each sequence of consecutive characters, is weighted by appearance frequency and rarity; and a classifier that, when the feature vector is input, outputs, by means of a machine learning model, candidate fixed phrases that correspond to the content of the call text data and are to be registered in an application that manages cases.

In the above system, the text data conversion unit is included in a file server, and the text data acquisition unit, the preprocessing unit, and the classifier are included in an analysis server.

According to an embodiment of the present invention, when an operator enters a fixed phrase to be registered in a case management application, the time needed to find the target fixed phrase can be shortened significantly, which improves work efficiency.

FIG. 1 is a block diagram of an input support system according to an embodiment of the present invention.
FIG. 2 is an example of call text data.
FIG. 3 is a diagram showing the hardware configuration of the analysis server.
FIG. 4 is a block diagram of the analysis server.
FIG. 5 is a flowchart illustrating the learning stage.
FIG. 6 is a flowchart illustrating a method of generating a word vector.
FIG. 7 is a flowchart illustrating a method of generating a character vector.
FIG. 8 is a block diagram of the analysis server.
FIG. 9 is a flowchart illustrating the estimation stage.
FIG. 10 is an example of fixed phrases displayed on the screen of the terminal device.
FIG. 11 is an example of call text data.
FIG. 12 is a flowchart illustrating the estimation stage.
FIG. 13 is a block diagram of an input support system according to an embodiment of the present invention.

An embodiment of the present invention will be described below with reference to the drawings. The embodiments shown below are examples of embodiments of the present invention, and the present invention is not limited to them. In the drawings referred to in this embodiment, identical parts or parts having similar functions are denoted by the same or similar reference numerals (numerals followed only by A, B, and so on), and repeated description of them may be omitted.

(First Embodiment)
In this embodiment, an input support system 1 according to an embodiment of the present invention will be described with reference to FIGS. 1 to 10.

[Overview of the Input Support System]
First, an overview of the input support system 1 according to an embodiment of the present invention will be described. FIG. 1 is a block diagram of the input support system 1 according to an embodiment of the present invention. In call center operations, the input support system 1 estimates, from the content of the conversation between a customer and an operator, a fixed phrase that indicates the appropriate progress status of the case. It thereby supports the input of fixed phrases to be registered in the case management application.

In call center operations, when a call comes in to the call center from a customer call terminal 41, it is connected to an operator call terminal 30. The operator call terminal 30 records the conversation between the customer and the operator as call voice data. The call voice data is recorded in the file server 10. When the connection between the customer call terminal 41 and the operator call terminal 30 is disconnected, the call voice data acquired by the file server 10 is converted into call text data. The file server 10 transmits the call text data to the analysis server 20. On acquiring the call text data, the analysis server 20 generates from it a feature vector in which a word vector and a character vector are integrated. In this specification and the like, a feature vector is a vector that represents the content of each call text data item in terms of the words it contains and their importance. When the feature vector is input to the classifier 24, the trained machine learning model outputs to the Web server 25 candidate fixed phrases, corresponding to the call text data, for registration in the case management application. The candidate fixed phrases output to the Web server 25 are displayed on the terminal device 32 via the thin client server 33. When the fixed phrase corresponding to the call text data is selected from the candidates on the terminal device 32, the selected fixed phrase can be entered into the case management application.

The machine learning model of the classifier 24 is generated in advance by machine learning based on a data set of feature vectors generated from call text data and a plurality of fixed phrases. Therefore, when the analysis server 20 acquires call text data, the machine learning model can estimate candidate fixed phrases corresponding to it. The operator then only has to select, from the candidates displayed on the terminal device 32, the fixed phrase corresponding to the content of the call with the customer, so the search time is shorter than when searching among dozens or hundreds of fixed phrases. Regardless of the operator's skill level, the time needed to find the target fixed phrase can thus be shortened significantly, which improves operational efficiency.

The input support system 1 described above can be applied to various businesses that handle call voice data, for example insurance, banking, and sales operations in which customers and operators talk by telephone. Here, a call means the period from when the customer call terminal 41 and the operator call terminal 30 are connected until they are disconnected.

The configuration of the input support system 1 according to an embodiment of the present invention is described in detail below, taking insurance-related call center operations as an example of a business that handles call voice data.

[Configuration of the Input Support System]
The input support system 1 shown in FIG. 1 has at least a file server 10 and an analysis server 20. The input support system 1 may further have an exchange 31, an operator call terminal 30, a terminal device 32, and a thin client server 33. In FIG. 1, the file server 10 is connected to the analysis server 20 via a communication network 52.

In the call center, the exchange 31 is communicably connected to the customer call terminal 41 via a communication network 51. The communication network 51 is a public network such as the Internet or the PSTN (Public Switched Telephone Network), a wireless network, or the like. The exchange 31 is also connected to a plurality of operator call terminals 30. On receiving a call from the customer call terminal 41, the exchange 31 connects it to one of the operator call terminals 30.

In the data center, the file server 10 has a call data acquisition unit 11, a text data conversion unit 12, and a storage unit 13. When the operator call terminal 30 and the customer call terminal 41 are connected, the call data acquisition unit 11 records the conversation between the customer and the operator in the storage unit 13 as call voice data. The call voice data may be given a call start time, a call end time, a tag indicating the start of the call, a tag indicating the end of the call, and identification information for distinguishing the call voice data. The operator call terminal 30 may record the customer's call voice data and the operator's call voice data separately.

On acquiring call voice data, the text data conversion unit 12 converts it into call text data by speech recognition processing. FIG. 2 is an example of call text data 400. As shown in FIG. 2, in the call text data 400 the content uttered by the customer is given a customer tag, and the content uttered by the operator is given an operator tag. A call start time, a call end time, a tag indicating the start of the call, a tag indicating the end of the call, and identification information for distinguishing the call text data may also be added. The speech recognition processing may be executed in real time while the customer call terminal 41 and the operator call terminal 30 are connected, or after the connection between them has been disconnected. In this embodiment, any well-known technique may be used for the speech recognition processing; the processing itself and the speech recognition parameters it uses are not particularly limited. The call text data is stored in the storage unit 13.
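The tagged structure of the call text data can be pictured with a small Python sketch. The line format used here (`[CALL_START]`, `[CALL_END]`, and `speaker: text`) and the names `Utterance`, `parse_call_text`, and `operator_text` are assumptions made purely for illustration; the actual layout of the call text data 400 in FIG. 2 may differ.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    speaker: str  # hypothetical tag values: "customer" or "operator"
    text: str


def parse_call_text(lines):
    """Collect utterances found between the assumed start and end tags."""
    utterances = []
    in_call = False
    for line in lines:
        line = line.strip()
        if line == "[CALL_START]":
            in_call = True
        elif line == "[CALL_END]":
            break
        elif in_call and ":" in line:
            speaker, text = line.split(":", 1)
            utterances.append(Utterance(speaker.strip(), text.strip()))
    return utterances


def operator_text(utterances):
    """Keep only what the operator said, joined into one string."""
    return " ".join(u.text for u in utterances if u.speaker == "operator")


sample = [
    "[CALL_START]",
    "customer: I would like to report an accident.",
    "operator: Certainly. May I have your policy number?",
    "customer: It is 12345.",
    "operator: Thank you. I will register the report.",
    "[CALL_END]",
]
call = parse_call_text(sample)
print(operator_text(call))
```

Keeping the speaker tag on every utterance is what makes it possible to build features from the operator's side of the call only, which is one of the variations mentioned in the summary of the method.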

The analysis server 20 has a text data acquisition unit 21, a preprocessing unit 22, a storage unit 23, a classifier 24, and a Web server 25.

The text data acquisition unit 21 acquires call text data from the text data conversion unit 12 via the communication network 52.

The preprocessing unit 22 preprocesses the call text data to generate a feature vector.

The storage unit 23 stores, as learning data, the feature vectors generated from call text data. The storage unit 23 also stores every type of fixed phrase to be registered in the case management application. For example, several tens of types or more of fixed phrases are defined for each of a first stage through a third stage. A fixed phrase selected in the first stage, one selected in the second stage, and one selected in the third stage are combined into one set. A candidate fixed phrase set need only include at least the fixed phrase selected in the first stage; it may also be a combination of the fixed phrases selected in the first and second stages. Here, the number of types of fixed phrase sets is assumed to be 300. The storage unit 23 also stores fixed phrase data, which is annotated metadata (labels giving the "correct answer") corresponding to the learning data. In addition, the storage unit 23 stores the acquired call text data, the machine learning model, a program for executing the input support method according to an embodiment of the present invention, and the like.
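One way to picture a fixed phrase set is as a record with a mandatory first-stage phrase and optional second- and third-stage phrases, each distinct set serving as one class label for the classifier. The following Python sketch assumes this reading; the type `PhraseSet`, the mapping `LABEL_OF`, and the phrase texts are all hypothetical and do not come from the specification.

```python
from typing import NamedTuple, Optional


class PhraseSet(NamedTuple):
    stage1: str                    # always present
    stage2: Optional[str] = None   # may be absent
    stage3: Optional[str] = None   # may be absent


# Invented example entries; the embodiment assumes about 300 such sets.
PHRASE_SETS = [
    PhraseSet("Accident report", "Received by phone", "Adjuster assigned"),
    PhraseSet("Accident report", "Received by phone"),
    PhraseSet("Contract inquiry"),
]

# Each distinct set gets an integer label usable as a classification target.
LABEL_OF = {phrase_set: i for i, phrase_set in enumerate(PHRASE_SETS)}


def describe(phrase_set):
    """Render a set as 'stage1 > stage2 > stage3', skipping missing stages."""
    return " > ".join(part for part in phrase_set if part is not None)


print(describe(PHRASE_SETS[1]))
```

Treating the whole combination as a single label keeps the classification problem flat, at the cost of one class per combination.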

When a feature vector is input, the classifier 24 uses the trained machine learning model to output the occurrence probability of each of the fixed phrases (sets) to be registered in the case management application. Examples of machine learning algorithms include logistic regression, neural networks, support vector machines, random forests, and naive Bayes. In this embodiment, logistic regression is used. Logistic regression uses a sigmoid function for the model output. Because the sigmoid function maps any value into the range from 0 to 1, it gives the probability that the given data is a positive example (+1) or a negative example (0), and positive and negative examples are separated by a threshold. From the occurrence probabilities that the sigmoid function outputs for each fixed phrase, the fixed phrases whose occurrence probability is at or above a predetermined threshold (for example, 0.8) can be estimated as candidates. Alternatively, without setting a threshold, the several fixed phrases with the highest occurrence probabilities (for example, the top five) can be estimated as candidates.
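The two candidate selection rules (a probability threshold, or the top few by probability) can be sketched in Python as follows. This is a minimal illustration that assumes a one-vs-rest arrangement with one weight vector per fixed phrase set; the function names, the labels "A" to "C", and the weight values are invented for the example and are not taken from the specification.

```python
import math


def sigmoid(z):
    """Map any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def phrase_probabilities(features, weights, biases):
    """Assumed one-vs-rest logistic regression: one score per phrase set."""
    probs = {}
    for label, w in weights.items():
        z = biases[label] + sum(w.get(name, 0.0) * value
                                for name, value in features.items())
        probs[label] = sigmoid(z)
    return probs


def candidates_by_threshold(probs, threshold=0.8):
    """Phrase sets whose probability is at or above the threshold."""
    hits = [label for label, p in probs.items() if p >= threshold]
    return sorted(hits, key=lambda label: -probs[label])


def candidates_top_k(probs, k=5):
    """The k phrase sets with the highest probability."""
    return sorted(probs, key=probs.get, reverse=True)[:k]


features = {"accident": 1.0, "report": 0.5}
weights = {
    "A": {"accident": 3.0, "report": 2.0},
    "B": {"accident": -1.0},
    "C": {"report": 1.0},
}
biases = {"A": -1.0, "B": 0.0, "C": 0.0}

probs = phrase_probabilities(features, weights, biases)
print(candidates_by_threshold(probs))
print(candidates_top_k(probs, k=2))
```

The threshold rule can return no candidates at all when the model is unsure, whereas the top-k rule always returns a list of fixed length; which behavior suits the operator's screen better is a design decision.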

The Web server 25 outputs the candidate fixed phrases. The Web server 25 is connected to the thin client server 33 via the communication network 52. The thin client server 33 holds the various software (applications) used by the terminal device 32. When the thin client server 33 runs a virtual desktop on the terminal device 32, it holds the applications that run on that virtual desktop. One of these applications is the case management application. The terminal device 32 displays the results output from the Web server 25 via the thin client server 33.

When a fixed phrase is selected from the candidates displayed on the terminal device 32, the fixed phrase corresponding to the content of the call with the customer can be registered in the case management application held in the thin client server 33.

[Hardware Configuration of the Analysis Server]
FIG. 3 is a block diagram illustrating the hardware configuration of the analysis server 20. As shown in FIG. 3, the analysis server 20 has a control unit 201, a memory 202, a storage unit 203, and a communication unit 204.

The control unit 201 is configured by combining, for example, a CPU (for example, a multiprocessor with a plurality of processor cores), GPUs (Graphics Processing Units), DSPs (Digital Signal Processors), FPGAs (Field-Programmable Gate Arrays), and the like. To enable faster arithmetic processing, it is preferable to use GPGPU (General-Purpose computing on Graphics Processing Units) for the control unit 201.

The memory 202 uses memories such as a ROM (Read Only Memory) and a RAM (Random Access Memory). The ROM is a non-volatile storage unit in which control programs for causing the control unit 201 to execute various processes are stored in advance. The RAM is a volatile or non-volatile memory that stores various kinds of information and is used as temporary storage (a work area) for the processing executed by the control unit 201.

The storage unit 203 is a rewritable non-volatile recording medium such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The storage unit 203 stores the call text data, learning data, fixed phrase data, and machine learning model held in the storage unit 23, as well as the program for causing a computer to execute the input support method according to an embodiment of the present invention. These data may be stored on separate storage media.

The communication unit 204 can be a communication device such as a network card for wired communication, a wireless communication device that connects to the file server 10, or a wireless communication device that supports connection to an access point. Through the communication unit 204, the control unit 201 can establish a communication connection and exchange information with the file server 10 or the thin client server 33 via the communication network 52.

The analysis server 20 need only have at least the hardware configuration described above, and may be built using a cloud service provided by a company, for example AWS (registered trademark) (Amazon Web Services (registered trademark)), Azure (registered trademark) (Microsoft (registered trademark)), or GCP (registered trademark) (Google Cloud Platform (registered trademark)).

[Flowchart of the Input Support Method]
Next, an input support method according to an embodiment of the present invention will be described with reference to FIGS. 4 to 10. The analysis server 20 has the control unit 201 execute each process.

[Learning Stage]
First, the learning stage for fixed phrases will be described. FIG. 4 shows the functional blocks of the analysis server 20 that are used in the learning stage of the machine learning model: the text data acquisition unit 21, the preprocessing unit 22, the storage unit 23, and the classifier 24. FIG. 5 is a flowchart illustrating the learning stage.

In the learning stage, the text data acquisition unit 21 first acquires call text data for learning (step S301). The call text data for learning is call text data accumulated in the past and is acquired, for example, from the storage unit 13 of the file server 10. In this embodiment, 20,000 items of call text data for learning are used. When the call text data contains distinctive numerical values, it is preferable to convert all numerical values to "0" so that numerical information is treated as a single feature.
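The numeric normalization can be sketched in a few lines of Python. The function name `normalize_digits` is hypothetical, and the choice to collapse each run of digits into a single "0" (rather than replacing digit by digit) is an assumption, since the text does not specify it.

```python
import re

# \d matches both ASCII digits and full-width digits such as "１２３" in Python 3.
_DIGIT_RUN = re.compile(r"\d+")


def normalize_digits(text):
    """Replace every run of digits with a single "0" (assumed behavior)."""
    return _DIGIT_RUN.sub("0", text)


print(normalize_digits("policy 12345 on 2019-09-06"))
print(normalize_digits("証券番号１２３４５"))
```

With this normalization, policy numbers, dates, and amounts no longer produce thousands of distinct tokens that each occur only once; they all contribute to the same feature.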

Next, the preprocessing unit 22 preprocesses the call text data for learning and converts it into a word vector (step S302). FIG. 6 shows the method of generating a word vector from the call text data for learning.

First, as shown in FIG. 6, morphological analysis is performed on the acquired call text data to break it down into parts of speech (step S311). Morphological analysis can be performed using programs and libraries such as MeCab or JUMAN++ as appropriate. An example of a sentence contained in the call text data before morphological analysis is shown below.

An example of a sentence contained in the call text data after morphological analysis is shown below. Morphological analysis of the call text data divides each sentence into morphemes and identifies the part of speech and inflection of each.

Next, word data is extracted by cleaning the morphologically analyzed call text data (step S312). Cleaning is the process of removing preset characters or words according to predetermined selection criteria. Here, nouns, verbs, and adjectives are extracted by removing the particles and symbols contained in the call text data. The cleaning process may also remove, as stop words, words that are too common to be useful for classifying natural language. Stop words are, for example, words such as "する" (suru, "to do") and "ます" (masu, a polite verb ending). Stop words may be registered in a dictionary according to the results of machine learning, and the stop words registered in the dictionary may be removed during cleaning. Noise and the like may also be removed by the cleaning process. The data obtained by removing particles and symbols from the call text data is shown below. The "ます" and "する" shown below are stop words.

The data obtained by removing the stop words from the data above is shown below. In this way, the necessary word data is extracted.
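As a separate illustration of the cleaning step (not the example data from the specification), the following Python sketch starts from tokens that are assumed to have already been produced by a morphological analyzer as (surface form, part of speech) pairs. The part-of-speech labels, the sample sentence, and the function names are hypothetical.

```python
DROP_POS = {"particle", "symbol"}   # parts of speech removed first
STOP_WORDS = {"する", "ます"}        # stop words named in the description


def remove_particles_and_symbols(tokens):
    """First pass: drop particles and symbols, keep everything else."""
    return [(surface, pos) for surface, pos in tokens if pos not in DROP_POS]


def remove_stop_words(tokens):
    """Second pass: drop stop words and return the remaining surface forms."""
    return [surface for surface, _pos in tokens if surface not in STOP_WORDS]


# Assumed analyzer output for a sentence like "事故の報告をします。"
# with verbs already reduced to their dictionary form.
tokens = [
    ("事故", "noun"),
    ("の", "particle"),
    ("報告", "noun"),
    ("を", "particle"),
    ("する", "verb"),
    ("ます", "auxiliary"),
    ("。", "symbol"),
]

after_pos_filter = remove_particles_and_symbols(tokens)
words = remove_stop_words(after_pos_filter)
print(words)
```

Running the two passes separately mirrors the order in the description: the stop words are still present after the part-of-speech filter and disappear only in the second pass.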

Next, the word data is converted into a word vector (step S313). First, the word data is split, for example by N-grams. N-gram splitting is a text segmentation method that divides an arbitrary character string or document into sequences of n consecutive characters or words. The case n = 1 is called a uni-gram, n = 2 a bi-gram, and n = 3 a tri-gram. An example of splitting word data by tri-grams is shown below.

The result of vectorization, obtained by counting the appearance frequency of each single word or run of consecutive words produced by the tri-gram split, is shown below.
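As an illustration only (this is not the result referred to above), step S313 can be sketched as follows. The sketch reads "one or a plurality of consecutive words" as every run of 1 to 3 words, which is an assumption; the function names and sample words are hypothetical.

```python
from collections import Counter


def ngrams(items, n_max=3):
    """Return every run of 1..n_max consecutive items as a tuple."""
    return [
        tuple(items[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(items) - n + 1)
    ]


def count_vector(items, n_max=3):
    """Map each n-gram to the number of times it appears."""
    return Counter(ngrams(items, n_max))


words = ["accident", "report", "received", "report"]
vec = count_vector(words)
print(vec[("report",)])             # single word counted twice
print(vec[("accident", "report")])  # two consecutive words counted once
```

A sparse mapping is used because most n-grams in a large vocabulary never occur in a given call.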

Next, a weighting process is applied to each single word or run of consecutive words to generate weighted word vectors (step S314). The weighting is performed using, for example, tf-idf. tf-idf is an index, used in fields such as information retrieval and text mining, that indicates how characteristic a particular word appearing in a document is. Here, tf (term frequency) is the number of times the word appears in the document (appearance frequency), and idf (inverse document frequency) is the natural logarithm of the ratio of the total number of documents in the corpus to the number of documents that contain the word (rarity). The product tf × idf is the tf-idf value of that word in that document.

In this embodiment, the corpus is created from, for example, 200,000 items of call text data. tf-idf suppresses the influence of words that are not specific to a call, such as "I" and "yesterday". Analyzing all 200,000 items of call text data also makes it possible to select, within each item, the words that matter for classifying fixed phrases. A word that appears in every item of call text data is assigned a low rarity even if its appearance frequency is high. An example of weighted word vectors is shown below.
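As an illustration only (this is not the example referred to above), the weighting of step S314 can be sketched with the textbook formula tf × ln(N / df), where N is the number of documents and df is the number of documents containing the term. The patent does not specify smoothing or normalization, so none is applied here; the three-call corpus is made up.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weighted


docs = [
    ["I", "accident", "report"],
    ["I", "repair", "estimate"],
    ["I", "accident", "payment"],
]
weights = tf_idf(docs)
print(weights[0])
```

In this toy corpus "I" appears in every call, so its idf is ln(3/3) = 0 and it contributes nothing, which is the suppression effect the text describes.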

Next, the word vectors with the highest importance are extracted from the weighted word vectors (step S315). In this embodiment, for example, the top 20,000 word vectors by importance are extracted. The call text data is thereby converted into word vectors.
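The text does not say how "importance" is measured in step S315. A minimal sketch, assuming importance means the total tf-idf weight of a term across the corpus, could look like this; the function name and sample weights are hypothetical.

```python
from collections import defaultdict


def top_features(weighted_docs, k):
    """Rank terms by summed weight over all documents and keep the top k."""
    totals = defaultdict(float)
    for doc in weighted_docs:
        for term, weight in doc.items():
            totals[term] += weight
    # Sort by descending total; break ties alphabetically for a stable result.
    ranked = sorted(totals, key=lambda term: (-totals[term], term))
    return ranked[:k]


weighted_docs = [
    {"accident": 0.4, "report": 1.1},
    {"accident": 0.4, "payment": 0.7},
    {"repair": 0.9},
]
print(top_features(weighted_docs, 2))
```

In the embodiment k would be 20,000 for word features; the returned list fixes the column order of the word part of the feature vector.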

Next, the preprocessing unit 22 preprocesses the training call text data and converts it into character vectors (step S303). FIG. 7 shows how character vectors are generated from the training call text data. If the call text data contains numerical values that would otherwise act as distinct features, it is preferable to convert every numerical value to "0" at this stage so that numerical information is treated as a single feature.

First, character data is extracted from the training call text data by data cleaning (step S321). Data cleaning for character data is performed in the same way as the cleaning of word data.

Next, the character data is converted into character vectors (step S322). As with words, the characters are split using N-grams. In this embodiment, the character data is split into tri-grams. The appearance frequency of each single character or run of consecutive characters is then counted to produce the character vectors.
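The character-level path can be sketched as below. Two points are assumptions: the conversion of numerical values to "0" is read as replacing each run of digits with a single "0", and "one or a plurality of consecutive characters" is read as every run of 1 to 3 characters. The sample string is made up.

```python
import re
from collections import Counter


def normalize_numbers(text):
    """Replace each run of digits with a single '0' (one reading of the text)."""
    return re.sub(r"\d+", "0", text)


def char_ngram_counts(text, n_max=3):
    """Count every run of 1..n_max consecutive characters."""
    return Counter(
        text[i:i + n]
        for n in range(1, n_max + 1)
        for i in range(len(text) - n + 1)
    )


text = normalize_numbers("証券番号12345")
counts = char_ngram_counts(text)
print(text)
print(counts["番号0"])
```

After normalization every policy number, date, or amount contributes the same "0" feature, which is what treating numerical information as a single feature amounts to.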

Next, a weighting process is applied to the character vectors to generate weighted character vectors (step S323). As with the word vectors, the weighting is performed using tf-idf.

Next, the character vectors with the highest importance are extracted from the weighted character vectors (step S324). In this embodiment, for example, the top 60,000 character vectors by importance are extracted. Character vectors are thereby extracted from the call text data. The conversion into word vectors in step S302 and the conversion into character vectors in step S303 may be performed in the reverse order.

Next, a feature vector is generated by combining the word vectors and the character vectors (step S304). In this embodiment, every feature vector has a fixed length of 80,000 columns. The processing described above generates a feature vector from each item of training call text data. Converting all of the training call text data into feature vectors yields a matrix of 20,000 rows and 80,000 columns, which is used as the learning data 27.
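A minimal sketch of step S304, assuming the word and character parts are simply concatenated in a fixed column order. The vocabularies here have 3 and 4 entries in place of the embodiment's 20,000 and 60,000; all names and values are hypothetical.

```python
def to_columns(weights, vocabulary):
    """Lay a {term: weight} dict out in a fixed column order; absent terms get 0.0."""
    return [weights.get(term, 0.0) for term in vocabulary]


def feature_vector(word_weights, char_weights, word_vocab, char_vocab):
    """Concatenate word columns and character columns into one fixed-length row."""
    return to_columns(word_weights, word_vocab) + to_columns(char_weights, char_vocab)


word_vocab = ["accident", "report", "repair"]
char_vocab = ["事", "故", "事故", "連絡"]
row = feature_vector({"report": 1.1}, {"事故": 0.4, "連絡": 0.7}, word_vocab, char_vocab)
print(len(row), row)
```

Because the vocabularies are fixed after training, every call maps to a row of the same length regardless of how long the call was.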

Next, the feature vectors obtained in step S304 are stored in the storage unit 23 as the learning data 27 (step S305).

Next, the machine learning model is trained on the learning data 27 and the fixed phrase data 28 (step S306). The fixed phrase data 28 is stored in the storage unit 23 and is annotated metadata, that is, labels that give the "correct answer". The learning data 27 together with the fixed phrase data 28 is also called the data set 25. The storage unit 23 holds 20,000 data sets 25, one for each item of training call text data. Machine learning that uses the data set 25, made up of the learning data 27 and the fixed phrase data 28, as teacher data produces a trained machine learning model.
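The description does not fix the model architecture at this point. As a minimal sketch, assuming one independent logistic-regression output per fixed phrase trained by plain stochastic gradient descent, supervised training on (feature vector, label) pairs could look like this. The hyperparameters and the two-sample data set are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(W, b, x):
    """Return one probability per fixed phrase for feature vector x."""
    return [sigmoid(bj + sum(w * v for w, v in zip(Wj, x))) for Wj, bj in zip(W, b)]


def train(X, Y, epochs=500, lr=0.5):
    """X: feature vectors. Y: 0/1 label rows, one column per fixed phrase."""
    n_features, n_labels = len(X[0]), len(Y[0])
    W = [[0.0] * n_features for _ in range(n_labels)]
    b = [0.0] * n_labels
    for _ in range(epochs):
        for x, y in zip(X, Y):
            p = predict(W, b, x)
            for j in range(n_labels):
                err = p[j] - y[j]  # gradient of the log loss w.r.t. the logit
                b[j] -= lr * err
                W[j] = [w - lr * err * v for w, v in zip(W[j], x)]
    return W, b


X = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learning data 27
Y = [[1, 0], [0, 1]]          # stand-in for fixed phrase data 28
W, b = train(X, Y)
print(predict(W, b, [1.0, 0.0]))
```

Each output is trained independently, so the probabilities for different fixed phrases need not sum to one.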

Finally, the trained machine learning model is stored in the storage unit 23 (step S307).

[Estimation stage]
Next, the stage of estimating fixed phrases is described. FIG. 8 shows the functional blocks of the analysis server 20 that are used in the estimation stage: the text data acquisition unit 21, the preprocessing unit 22, and the classifier 24. FIG. 9 is a flowchart of the estimation stage.

In the estimation stage, the text data acquisition unit 21 first acquires call text data from the file server 10 (step S331). Here, the call text data is newly generated from a call between a customer and an operator. In this embodiment, the call text data is analyzed after the call between the customer and the operator has ended.

Next, the preprocessing unit 22 preprocesses the call text data and converts it into a feature vector (step S332). The conversion follows the flowcharts described with reference to FIGS. 6 and 7.

Next, the classifier 24 inputs the feature vector to the machine learning model 29 as an input vector (step S333). The machine learning model 29 is the trained machine learning model described for the learning stage. The input vector is a feature vector of 80,000 columns in total: 20,000 word-vector columns and 60,000 character-vector columns. In this embodiment, a sigmoid function is used at the output of the machine learning model, so the output vector gives an occurrence probability for every fixed phrase. For example, the output may be 3% for fixed phrase 1, 32% for fixed phrase 2, 98% for fixed phrase 3, and so on. Fixed phrases whose occurrence probability is at or above a predetermined threshold (for example, 0.8) may be taken as the candidates. In this embodiment, however, no threshold is applied to the sigmoid output; instead, the several fixed phrases with the highest occurrence probabilities (for example, the top five) are taken as the candidates.
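The two candidate-selection rules described above, a fixed threshold and a top-k cut, can be compared in a short sketch. The first three probabilities are the ones quoted in the text; the remaining values and the function names are made up. Indices are 0-based, so index 0 stands for fixed phrase 1.

```python
def by_threshold(probs, threshold=0.8):
    """Indices of fixed phrases with probability >= threshold, most likely first."""
    hits = [i for i, p in enumerate(probs) if p >= threshold]
    return sorted(hits, key=lambda i: -probs[i])


def top_k(probs, k=5):
    """Indices of the k most probable fixed phrases, most likely first."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])[:k]


probs = [0.03, 0.32, 0.98, 0.81, 0.10, 0.55, 0.07]
print(by_threshold(probs))  # may return any number of candidates, including none
print(top_k(probs))         # always returns k candidates when enough phrases exist
```

The top-k rule guarantees the operator always sees a full list, whereas a threshold can leave the list empty when the model is unsure.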

Finally, the classifier 24 outputs the candidate fixed phrases to the Web server 25 (step S334). The candidates are output to the Web server 25, for example, in descending order of occurrence probability.

When the candidate fixed phrases have been output to the Web server 25, at least one of the fixed phrases with a high occurrence probability is displayed on the screen of the terminal device 32 via the thin client server 33.

FIG. 10 shows an example of the fixed phrases displayed on the screen of the terminal device 32. As shown in FIG. 10, five candidate fixed phrases are displayed in a window 300. The candidates displayed in the window 300 are examples of fixed phrases that may correspond to the call text data shown in FIG. 3.

The window 300 shown in FIG. 10 displays a Web page address 301, attached information 302 of the call text data, a case number 303, a registration/update button 303, a scroll bar 304, and a page-forward button 306. The window 300 also displays candidate fixed phrases 311 to 315 together with their occurrence probabilities. The candidate fixed phrases 311 to 315 are the five candidates with the highest occurrence probabilities. Each candidate is displayed as one set that combines a fixed phrase selected in a first stage, a fixed phrase selected in a second stage, and a fixed phrase selected in a third stage. A candidate need only include at least the fixed phrase selected in the first stage; it may also be a combination of the fixed phrases selected in the first and second stages.

The operator selects a candidate fixed phrase with the cursor 305. When the operator selects one of the candidates displayed in the window 300 and then selects the registration/update button 303, the selected fixed phrase is registered in the case management application. If none of the presented candidates is appropriate, selecting the page-forward button 306 may display the next group of candidates. In this way, the terminal device 32 can display candidate fixed phrases suited to the content of the call text data.

[Retraining]
The classifier 24 may be retrained at predetermined intervals to update the machine learning model 29. The storage unit 23 accumulates the acquired call text data, and also accumulates the feature vectors converted from that call text data as learning data 27. The fixed phrase selected for each item of call text data is given a label marking it as the "correct answer" and is accumulated in the storage unit 23 as fixed phrase data 28.

The classifier 24 updates the machine learning model 29 by retraining on the accumulated learning data 27 and fixed phrase data 28. The updated machine learning model 29 can then estimate more accurately the candidate fixed phrases that the previous machine learning model 29 could not estimate correctly. The update need not be tied to a predetermined interval; it may instead be performed when a predetermined number of data sets 25, each made up of learning data 27 and fixed phrase data 28, have accumulated in the storage unit 23.
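The two retraining triggers, elapsed time and accumulated sample count, can be expressed as a single check. The 30-day period and the 1,000-sample count are placeholder values, not figures from the patent, and combining the two triggers with "or" is one possible design.

```python
from datetime import datetime, timedelta


def should_retrain(last_trained, now, new_samples,
                   period=timedelta(days=30), min_samples=1000):
    """True when the period has elapsed or enough new data sets have accumulated."""
    return (now - last_trained) >= period or new_samples >= min_samples


last = datetime(2024, 1, 1)
print(should_retrain(last, datetime(2024, 1, 10), new_samples=10))
print(should_retrain(last, datetime(2024, 1, 10), new_samples=1000))
print(should_retrain(last, datetime(2024, 2, 1), new_samples=10))
```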

(Second embodiment)
The first embodiment described a method in which the call text data is sent to the analysis server 20 after the call between the customer and the operator has ended, and a fixed phrase indicating the progress status is then estimated. An embodiment of the present invention is not limited to this. The call voice data may be converted into call text data in real time while the call is still in progress, and the call text data may be analyzed successively.

This embodiment describes, with reference to FIGS. 1, 11, and 12, a method in which call voice data is converted into call text data during a call between a customer and an operator, and the call text data is analyzed so that the candidate fixed phrases are updated as time passes.

When the customer call terminal 41 and the operator call terminal 30 are connected, the call data acquisition unit 11 starts acquiring call voice data. When acquisition starts, a tag indicating the start of the call (hereinafter, a start tag) is attached to the call voice data. During the call, tags indicating breaks in the call (hereinafter, delimiter tags) are attached to the call voice data. A delimiter tag may be attached, for example, at the point where the speaker switches between the customer and the operator. Alternatively, a delimiter tag may be attached each time a predetermined time has elapsed since the start tag was attached, or when a predetermined time has elapsed after the voice has paused. When the connection between the customer call terminal 41 and the operator call terminal 30 ends, a tag indicating the end of the call (hereinafter, an end tag) is attached to the call voice data.

The text data conversion unit 12 performs speech recognition on the call voice data as it arrives and converts it into call text data. The start tag and the delimiter tags are carried over to the converted call text data. The call text data is accumulated in the storage unit 13 until the end tag is attached. FIG. 11 shows call text data 400A to which a start tag, delimiter tags, and an end tag have been attached. In FIG. 11, the text from the start tag to the delimiter tag at 30 seconds is call text data 401, the text from the start tag to the delimiter tag at 1 minute is call text data 402, and the text from the start tag to the end tag is call text data 403.

Next, a method in which the analysis server 20 successively analyzes the acquired call text data and keeps outputting candidate fixed phrases until the call ends is described with reference to FIG. 12. FIG. 12 is a flowchart of the method by which the analysis server 20 estimates fixed phrases.

First, the text data acquisition unit 21 acquires the call text data 401 from the file server 10 (step S341). The acquisition takes place during the conversation between the customer and the operator, that is, while call text data is still being accumulated in the file server 10. The acquired call text data 401 carries the start tag and a delimiter tag.

Next, the preprocessing unit 22 preprocesses the call text data 401 and converts it into a feature vector (step S342). Even if the acquired call text data extends past the delimiter tag, preprocessing is applied only to the text from the start tag to the delimiter tag. When the preprocessing unit 22 detects the delimiter tag, it converts the call text data 401, from the start tag to that delimiter tag, into a feature vector. The classifier 24 then inputs the feature vector to the machine learning model 29 as an input vector (step S343) and outputs the candidate fixed phrases to the Web server 25 (step S344). This processing makes it possible to output candidate fixed phrases while the call between the customer and the operator is still in progress.

Next, the text data acquisition unit 21 acquires the call text data 402 from the file server 10 (step S341). The text data acquisition unit 21 acquires call text data each time a predetermined time elapses. The acquired call text data 402 carries the start tag, the earlier delimiter tag, and a newly added delimiter tag.

Next, the preprocessing unit 22 preprocesses the call text data 402 and converts it into a feature vector (step S342). When the preprocessing unit 22 detects the added delimiter tag, it converts the call text data 402, from the start tag to the added delimiter tag, into a feature vector. The classifier 24 then inputs the feature vector to the machine learning model 29 as an input vector (step S343) and outputs the candidate fixed phrases to the Web server 25 (step S344). This processing makes it possible to update the candidate fixed phrases while the call between the customer and the operator is still in progress.

The analysis server 20 repeats the estimation process shown in FIG. 12 until the conversation between the customer and the operator ends, that is, until the preprocessing unit 22 detects the end tag in the call text data. The amount of call text data acquired by the text data acquisition unit 21 grows as time passes, so the accuracy of the output candidate fixed phrases can improve as more call text data becomes available. Furthermore, the final candidate fixed phrases can be output at the moment the call ends. The operator can therefore enter the fixed phrase corresponding to the call into the progress management application immediately after the call ends, which shortens the time taken to register the fixed phrase.
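The repeated estimation during a call can be sketched as a loop over successive snapshots of the growing transcript. The literal tag strings, the snapshot list, and the `estimate` callback (a stand-in for preprocessing plus the classifier) are all assumptions; the patent does not specify how tags are encoded.

```python
START, BREAK, END = "<start>", "<break>", "<end>"  # hypothetical tag encoding


def analyzable_text(transcript):
    """Text from the start tag up to the latest delimiter or end tag, else None."""
    begin = transcript.find(START)
    cut = max(transcript.rfind(BREAK), transcript.rfind(END))
    if begin < 0 or cut < begin:
        return None  # no boundary tag has arrived yet
    body = transcript[begin + len(START):cut]
    # Drop interior delimiter tags and collapse whitespace.
    return " ".join(body.replace(BREAK, " ").split())


def estimate_during_call(snapshots, estimate):
    """Yield a new estimate whenever a snapshot exposes new text; stop at the end tag."""
    last = None
    for transcript in snapshots:
        text = analyzable_text(transcript)
        if text is not None and text != last:
            last = text
            yield estimate(text)
        if END in transcript:
            break


def count_words(text):
    """Stand-in for the real estimator: just reports how much text was analyzed."""
    return len(text.split())


snapshots = [
    "<start>hello",
    "<start>hello <break>my car",
    "<start>hello <break>my car was hit <break>",
    "<start>hello <break>my car was hit <break>thank you <end>",
]
print(list(estimate_during_call(snapshots, count_words)))
```

Text after the latest delimiter tag is ignored until the next tag arrives, matching the rule that preprocessing covers only the span from the start tag to a delimiter tag.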

(Modification 1)
The embodiments described above output candidate fixed phrases using the call text data of both the customer and the operator, but an embodiment of the present invention is not limited to this. The operator's utterance text data may be extracted from the call voice data of the customer and the operator and used for machine learning. In this case, the preprocessing unit 22 extracts the operator's utterance text data and then preprocesses it to generate the feature vector. In a conversation between a customer and an operator, it is more often the operator than the customer who states the current status of the case. Extracting and analyzing only the text spoken by the operator therefore shortens the analysis processing time.
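Filtering the transcript down to the operator's side can be sketched as below, assuming each utterance is already labeled with its speaker. The labels and the sample dialogue are hypothetical.

```python
def operator_text(turns, speaker="operator"):
    """Join the utterances of one speaker, preserving their order."""
    return " ".join(text for who, text in turns if who == speaker)


turns = [
    ("operator", "Thank you for calling the claims desk."),
    ("customer", "My car was hit yesterday."),
    ("operator", "I will send the claim form today."),
]
print(operator_text(turns))
```

The filtered string would then go through the same preprocessing as a full transcript, with less text to analyze.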

(Modification 2)
The embodiments described above convert the call voice data into call text data and then send it to the text data acquisition unit 21 of the analysis server 20, but an embodiment of the present invention is not limited to this. The call voice data acquired by the file server 10 may be sent to the analysis server 20 and converted there into call text data by speech recognition. The analysis server 20 then converts the call text data into a feature vector and inputs it to the machine learning model 29.

(Modification 3)
The embodiments described above treat a combination of the fixed phrases selected in the first to third stages as one set, but an embodiment of the present invention is not limited to this. A fixed phrase need not be a combination of first-stage to third-stage fixed phrases and may be a single fixed phrase. The fixed phrase set is also not limited to three stages and may combine fixed phrases of four or more stages.

(Modification 4)
The embodiments described above use an input support system 1 in which the file server 10 and the analysis server 20 are separate, but an embodiment of the present invention is not limited to this. As shown in FIG. 13, an input support system 1A may itself include the text data conversion unit 12, the text data acquisition unit 21, the preprocessing unit 22, and the classifier 24.

1: input support system, 10: file server, 11: call data acquisition unit, 12: text data conversion unit, 20: analysis server, 21: text data acquisition unit, 22: preprocessing unit, 23: storage unit, 24: classifier, 25: Web server, 27: learning data, 28: fixed phrase data, 29: machine learning model, 30: operator call terminal, 31: exchange, 32: terminal device, 33: thin client server, 41: customer call terminal, 51: communication network, 52: communication network, 201: control unit, 202: memory, 203: storage unit, 204: communication unit

Claims

1. An input support method comprising:
extracting words and characters included in call text data, and generating a feature vector in which each single word, each run of consecutive words, each single character, and each run of consecutive characters is weighted by appearance frequency and rarity; and
outputting, by computation of a machine learning model that takes the feature vector as input, a plurality of selectable candidate fixed phrases corresponding to the content of the call text data to an application for managing a case, for registration in the application.

2. The input support method according to claim 1, wherein the machine learning model includes logistic regression and a neural network.

3. The input support method according to claim 1, wherein the machine learning model is updated based on the feature vector and the fixed phrase selected for the call text data.

4. The input support method according to any one of claims 1 to 3, wherein the feature vector is generated by extracting text data uttered by an operator from the call text data, extracting words and characters included in the text data uttered by the operator, and weighting each single word, each run of consecutive words, each single character, and each run of consecutive characters by appearance frequency and rarity.

5. The input support method according to any one of claims 1 to 4, wherein the call text data includes a tag indicating the start of a call and a tag indicating the end of the call.

6. The input support method according to any one of claims 1 to 5, wherein the call text data includes a tag indicating the start of the call and a delimiter tag indicating a point partway through the call, and
each time the call text data is acquired, the call text data is converted into the feature vector, and the plurality of selectable candidate fixed phrases corresponding to the call text data and output to the application are updated by computation of the machine learning model that takes the feature vector as input.

7. A program that causes a computer to execute the input support method according to any one of claims 1 to 6.

8. An input support system comprising:
a text data conversion unit that converts call voice data into text;
a text data acquisition unit that acquires call text data;
a preprocessing unit that extracts words and characters included in the call text data and generates a feature vector in which each single word, each run of consecutive words, each single character, and each run of consecutive characters is weighted by appearance frequency and rarity; and
a classifier that outputs, by computation of a machine learning model that takes the feature vector as input, a plurality of selectable candidate fixed phrases corresponding to the content of the call text data to an application for managing a case, for registration in the application.

9. The input support system according to claim 8, wherein the text data conversion unit is included in a file server, and the text data acquisition unit, the preprocessing unit, and the classifier are included in an analysis server.