JP6979909B2

JP6979909B2 - Information processing equipment, information processing methods, and programs

Info

Publication number: JP6979909B2
Application number: JP2018052879A
Authority: JP
Inventors: カウステューブクルカルニ
Original assignee: Yahoo Japan Corp
Current assignee: Yahoo Japan Corp
Priority date: 2018-03-20
Filing date: 2018-03-20
Publication date: 2021-12-15
Anticipated expiration: 2038-03-20
Also published as: JP2019164669A

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

Conventionally, mechanisms for automatically answering questions entered in natural language have been researched and put into practical use. In this connection, a computer-implemented method for providing an answer to a query is known (see Patent Document 1). The method includes: receiving a query having one or more words, where the one or more words include a subject chunk describing the topic subject of the query; querying a database to find at least one candidate subject whose name or alias has the same surface form as the subject chunk; querying a database to find one or more relation vectors representing one or more relations associated with the at least one candidate subject; determining, for each of the one or more relations, a ranking score indicating the semantic similarity between the query and the corresponding relation; selecting the relation with the highest ranking score as the predicted relation and the at least one candidate subject as the predicted topic subject; and querying a database with the predicted relation and the predicted topic subject to find the answer to the query.

Japanese Unexamined Patent Publication No. 2017-76403

With the conventional technology, processing could not always proceed efficiently, and accuracy was sometimes insufficient.
The present invention has been made in view of these circumstances, and one of its objects is to provide an information processing apparatus, an information processing method, and a program that make it possible to derive the answer to a question more efficiently and with higher accuracy.

One aspect of the present invention is an information processing apparatus including an acquisition unit that acquires a question, and a deriver learning unit that learns a deriver for deriving a feature value of the question by referring to a second database. The second database contains a group of feature values learned, on the basis of a first database in which a plurality of entities and relations indicating relationships between entities are registered, such that the feature value obtained by adding the feature value of a first relation, which indicates the relationship between a first entity and a second entity selected from the plurality of entities, to the feature value of the first entity approaches the feature value of the second entity. The deriver learning unit learns the deriver so as to reduce the difference between the feature value of the question and the feature value of the second entity, and to increase the difference between the feature value of the question and the feature value of a sampled entity other than the second entity.

According to one aspect of the present invention, the answer to a question can be derived more efficiently and with higher accuracy.

FIG. 1 is a diagram showing an example of the configuration of the information processing apparatus 100.
FIG. 2 is a diagram illustrating the relationship between the two entities and the relation that constitute a triplet.
FIG. 3 is a conceptual diagram schematically showing the contents of the data registered in the knowledge base 50.
FIG. 4 is a diagram schematically showing the geometric relationship among the head entity vector Vhr, the relation vector Vr, and the tail entity vector Vtr.
FIG. 5 is a conceptual diagram schematically showing the contents of the data registered in the entity-relation vector DB 60.
FIG. 6 is a conceptual diagram showing the function of the deriver.
FIG. 7 is a conceptual diagram showing the processing performed by the second learning unit 30.
FIG. 8 is a conceptual diagram schematically showing the processing of a comparative example.

[Overview]
Hereinafter, embodiments of the information processing apparatus, information processing method, and program of the present invention will be described with reference to the drawings. The information processing apparatus is realized by one or more processors. It may be an apparatus that provides a cloud service, or an apparatus on which a program such as a tool or firmware is installed and which can execute processing on its own. It may or may not be connected to a network such as the Internet or a WAN. In other words, there are no particular restrictions on the computer that realizes the information processing apparatus; any computer capable of executing the processing described below may be used.

The information processing apparatus is used, for example, in an apparatus or system that automatically responds to a question input by a person or a computer in the form of text or voice. The automatically responding apparatus may have a conversational interface, or it may be part of a search apparatus that interprets a query as a question and returns the answer, together with the search results, to the terminal device of the person who entered the query.

The information processing apparatus derives a feature value of the question and selects, as the answer to the question, an entity whose feature value is close to the feature value obtained from the question. In the following description, feature values are assumed to be vectors in Euclidean space, but a feature value need not be a vector as long as a distance can be defined on it and addition and subtraction are possible. The vectors appearing below are assumed to be normalized so that their L2 norm is between 0 and 1.

[Configuration]
FIG. 1 is a diagram showing an example of the configuration of the information processing apparatus 100. The information processing apparatus 100 includes, for example, a question acquisition unit 10, a first learning unit 20, a second learning unit 30 (an example of the "deriver learning unit"), and an answer output unit 40. These components are realized, for example, by a hardware processor such as a CPU (Central Processing Unit) executing a program (software). Some or all of these components may be realized by hardware (circuitry) such as an LSI (Large Scale Integration), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array), or a GPU (Graphics Processing Unit), or by software and hardware working together. The program may be stored in advance in a storage device such as an HDD (Hard Disk Drive) or flash memory, or it may be stored on a removable storage medium such as a DVD or CD-ROM and installed when the medium is loaded into a drive device.

The information processing apparatus 100 also stores information such as a knowledge base 50, an entity-relation vector DB (database) 60, and deriver information 70 in a storage device such as an HDD or flash memory. This information may instead be held by a database server or the like separate from the information processing apparatus 100. In other words, the information processing apparatus 100 need not include a storage device for holding this information.

The question acquisition unit 10 acquires a question from another device via, for example, a network NW. The network NW includes, for example, the Internet, a WAN (Wide Area Network), a LAN (Local Area Network), and a cellular network. The question may be acquired as text or as voice. In the latter case, the information processing apparatus 100 may include a voice recognition unit. In either case, the question is assumed to be written in natural language rather than as an abstracted code.

The first learning unit 20 learns vectors for entities and relations (hereinafter referred to as entity vectors and relation vectors, respectively) on the basis of the knowledge base 50, in which a plurality of entities and relations indicating relationships between entities are registered, and registers them in the entity-relation vector DB 60.

[Knowledge Base]
The knowledge base 50 is described here. The knowledge base 50 is a database that describes, as a graph, information about things and information about the semantic relationships between things. Things in the knowledge base 50 include abstract concepts such as "human", "machine", "building", "organization", "beauty", "scholarship", and "travel", as well as individuals of those concepts (hereinafter, "instances") such as a specific person, a specific building, or a specific organization. In the present embodiment, a thing whose information is described in the knowledge base 50 is specifically referred to as an "entity".

An entity may represent, for example, the substance of an instance of a thing (for example, an object that exists in the real world), or the concept of a thing (for example, a concept defined in the real world or in a virtual world). Some entities represent a concept, such as "building", while others represent the substance of an instance of the concept "building", such as "XX Tower".

To enable semantic processing by a computer, the knowledge base 50 is described using classes and relations defined by a vocabulary system called an ontology. An ontology defines the classes and relations of things and collects the constraints that hold between classes and relations.

In an ontology, a class is a group of things that share the same properties. The properties of classes and of things are described by relations, which are explained below.

For example, a thing that has a beak, is an oviparous vertebrate, and has forelimbs that are wings is classified into the class "bird" or one of its subclasses. Within the class "bird", things that cannot fly are classified into lower classes such as "penguin" or "ostrich". In this way, the class system forms a hierarchy of higher and lower classes, and the properties of a higher class are inherited by its lower classes. In the example above, the property of the class "bird", "has a beak, is an oviparous vertebrate, and has forelimbs that are wings", is also included in the properties of the lower classes "penguin" and "ostrich". A class name used to identify a class does not necessarily have to express the meaning of the class, but for simplicity the following description assumes that each class is given a name that expresses its meaning.

A relation is an attribute that describes the properties and characteristics of things and the relationships between classes. As a result, a relation is information that expresses a relationship between entities. For example, a relation may be an attribute indicating a property such as "has ... as a body part" or "lives in ...", or it may be an attribute indicating a hierarchical relationship between classes, such as "one class is the superclass and another is the subclass". A relation name used to identify a relation does not necessarily have to express the meaning of the relation, but for simplicity the following description assumes that each relation is given a name that expresses its meaning.

The basic unit of the knowledge base 50 is a set of three pieces of information in which two entities are connected by a relation (hereinafter referred to as a triplet). FIG. 2 is a diagram illustrating the relationship between the two entities and the relation that constitute a triplet. The illustrated example is the triplet [entity "Japan", relation "capital", entity "Tokyo"]. From such a triplet, the semantic information "the capital of Japan is Tokyo" can be obtained. Using the knowledge base 50, information about entities and the relationships between them is expressed explicitly, which enables various kinds of machine processing. Hereinafter, the entity on the source side with respect to the semantic direction of the relation in a triplet ("Japan" in FIG. 2) is referred to as the head entity, and the entity on the destination side ("Tokyo" in FIG. 2) is referred to as the tail entity. The knowledge base 50 contains a plurality of triplets. A head entity in one triplet can be the tail entity in another triplet, and vice versa. FIG. 3 is a conceptual diagram schematically showing the contents of the data registered in the knowledge base 50. In the figure, the arrows indicate relations and the dashed ellipses indicate triplets. Entity 1 in the figure is the head entity with respect to entity 2 and the tail entity with respect to entity 3.
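The triplet structure described above can be sketched as a small in-memory store. This is only an illustration: the class names `Triplet` and `KnowledgeBase`, their methods, and the second triplet with the relation name `located_in` are hypothetical and do not come from the patent.

```python
from typing import NamedTuple


class Triplet(NamedTuple):
    head: str      # entity on the source side of the relation
    relation: str  # relation name
    tail: str      # entity on the destination side of the relation


class KnowledgeBase:
    """Toy in-memory triplet store (hypothetical, for illustration only)."""

    def __init__(self):
        self.triplets = []
        self._by_head_relation = {}

    def add(self, head, relation, tail):
        self.triplets.append(Triplet(head, relation, tail))
        self._by_head_relation.setdefault((head, relation), []).append(tail)

    def tails(self, head, relation):
        """Return every tail entity registered for (head, relation)."""
        return list(self._by_head_relation.get((head, relation), []))

    def entities(self):
        """Return all names that occur as a head or as a tail."""
        return {t.head for t in self.triplets} | {t.tail for t in self.triplets}


kb = KnowledgeBase()
kb.add("Japan", "capital", "Tokyo")      # the triplet of FIG. 2
kb.add("Tokyo", "located_in", "Japan")   # assumed example: a tail can also be a head

print(kb.tails("Japan", "capital"))
```

Indexing by (head, relation) mirrors how a question such as "What is the capital of Japan?" maps to a tail entity, although the apparatus described here answers through learned vectors rather than by direct lookup.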

[First learning unit]
For each of a plurality of triplets (known information) sampled from the knowledge base 50, the first learning unit 20 learns the vectors of the entities and relations contained in the knowledge base 50 so that the vector obtained by adding the relation vector to the head entity vector approaches the tail entity vector.

Here, the vector of the head entity is denoted Vh, the vector of the relation Vr, and the vector of the tail entity Vt. Because the entity space carries more information than the relation space, the entity vectors are generated with, for example, a higher number of dimensions (k) than the number of dimensions (d) of the relation vector Vr. In the following description, the vectors Vhr and Vtr, obtained by multiplying the head entity vector Vh and the tail entity vector Vt by a k × d matrix M^{k×d} and thereby mapping them into the space to which the relation vector Vr belongs, are referred to as the vectors of the head entity and the tail entity (see equations (1) and (2); M is a matrix). The head entity vector Vhr is an example of the "feature value of the first entity", the tail entity vector Vtr is an example of the "feature value of the second entity", and the vector Vr of the relation connecting them is an example of the "vector of the first relation". In the figure, O is the origin of the vector space.

Vhr = Vh × M^{k×d} … (1)
Vtr = Vt × M^{k×d} … (2)
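Equations (1) and (2) amount to a row-vector-by-matrix product that maps a k-dimensional entity vector into the d-dimensional relation space. A minimal sketch in plain Python follows; the function name `project` and all numeric values are made up for illustration.

```python
def project(entity_vector, matrix):
    """Map a k-dimensional row vector into d dimensions: v x M, where M is k x d."""
    k = len(matrix)
    if len(entity_vector) != k:
        raise ValueError("entity vector length must equal the number of matrix rows")
    d = len(matrix[0])
    return [sum(entity_vector[i] * matrix[i][j] for i in range(k)) for j in range(d)]


# Hypothetical toy values: k = 3 entity dimensions, d = 2 relation dimensions.
M = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
Vh = [0.2, 0.4, 0.2]   # head entity vector in the k-dimensional entity space
Vhr = project(Vh, M)   # the same entity expressed in the d-dimensional relation space
print(Vhr)
```

The same function would be applied to Vt to obtain Vtr, since both equations use the same matrix.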

Ideally, the vector sum of the head entity vector Vhr and the relation vector Vr coincides with the tail entity vector Vtr. FIG. 4 is a diagram schematically showing the geometric relationship among the head entity vector Vhr, the relation vector Vr, and the tail entity vector Vtr.

The first learning unit 20 learns the head entity vector Vhr, the relation vector Vr, and the tail entity vector Vtr so that the score C, which is based on the objective function fr given by equation (3), becomes sufficiently small and converges. In the equation, ||A||_2 denotes the L2 norm of a vector A. The objective function fr is zero when the geometric relationship of FIG. 4 holds.

fr(h, t) = {||Vhr + Vr - Vtr||_2}^2 … (3)
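Equation (3) can be checked numerically with a few lines of Python. The sketch below uses made-up two-dimensional vectors; the function names are illustrative and not taken from the patent.

```python
def squared_l2_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def f_r(v_hr, v_r, v_tr):
    """Equation (3): squared L2 norm of (Vhr + Vr - Vtr)."""
    translated = [h + r for h, r in zip(v_hr, v_r)]
    return squared_l2_distance(translated, v_tr)


# Hypothetical toy vectors in the relation space.
v_hr = [0.1, 0.2]
v_r = [0.3, 0.1]
v_tr_correct = [0.4, 0.3]   # equals Vhr + Vr, so f_r is (numerically) zero
v_tr_wrong = [0.0, 0.9]     # a tail that does not fit the relation

print(f_r(v_hr, v_r, v_tr_correct))
print(f_r(v_hr, v_r, v_tr_wrong))
```

A correct triplet gives a value near zero, while a mismatched tail gives a clearly larger value, which is what the learning in the first learning unit 20 exploits.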

The score C is expressed by equation (4). In the equation, S denotes combinations of a head entity, a relation, and a tail entity that form a correct triplet, and S* denotes combinations of a head entity, a relation, and a tail entity that do not form a correct triplet. γ is an arbitrarily chosen margin parameter.
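The body of equation (4) does not appear in this text. For orientation only, a margin-based ranking score that is commonly used with translation-style embeddings and that matches the roles described for S, S*, and γ would look like the following. This form is an assumption, not the confirmed formula of the patent; the starred symbols denote the elements of an incorrect combination.

```latex
C = \sum_{(h,\, r,\, t) \in S} \;\; \sum_{(h^{*},\, r^{*},\, t^{*}) \in S^{*}}
    \max\bigl(0,\; f_{r}(h, t) + \gamma - f_{r^{*}}(h^{*}, t^{*})\bigr)
```

Under this assumed form, a term contributes nothing once the incorrect combination scores at least γ worse than the correct one, so learning concentrates on the incorrect combinations that are still too close.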

The entity and relation vectors learned by the first learning unit 20 are registered in the entity-relation vector DB 60 and used by the second learning unit 30. FIG. 5 is a conceptual diagram schematically showing the contents of the data registered in the entity-relation vector DB 60. In the triplet containing entity 1 and entity 2, the vector V1r of entity 1 corresponds to Vhr, the vector V2r of entity 2 corresponds to Vtr, and the vector V12r of the relation connecting them corresponds to Vr. In the triplet containing entity 1 and entity 3, the vector V1r of entity 1 corresponds to Vtr, the vector V3r of entity 3 corresponds to Vhr, and the vector V31r of the relation connecting them corresponds to Vr.

[Second learning unit]
The second learning unit 30 learns the parameters and the like of the deriver represented by the deriver information 70. The deriver generates a vector of the text constituting a question (hereinafter, the question vector Vq). Broken down semantically, the processing of the second learning unit 30 includes learning the question vector Vq itself according to its relationship with the head entity, the relation, and the tail entity, and learning the rule that converts the text of a question into the question vector Vq. The question vector Vq lies in the same vector space as the vectors Vhr, Vr, and Vtr; that is, it has the same number of dimensions as they do.

In the following description, the second learning unit 30 is given, as training data, information that associates a question with a triplet containing the head entity and relation considered to express the meaning of the question and the tail entity that is the correct answer to the question. For example, [question "Where is the capital of Japan?", head entity "Japan", relation "capital", tail entity "Tokyo"] forms one record, and data containing a plurality of such records is given to the second learning unit 30 as training data.

On the basis of this training data and the entity-relation vector DB 60, the second learning unit 30 learns the question vector Vq so as to reduce the difference (distance) between the question vector Vq and the tail entity vector Vtr, and to increase the difference (distance) between the question vector Vq and the vector Vtr* of a sampled entity other than the tail entity. That is, the second learning unit 30 learns the question vector Vq so as to reduce the objective function g1 of equation (5).

g1 = {||Vtr - Vq||_2}^2 … (5)

In addition, the second learning unit 30 learns the question vector Vq so as to reduce the difference between the question vector Vq and the sum of the head entity vector Vhr and the relation vector Vr, and to increase the difference between the question vector Vq and the sum of the vector Vhr* of a sampled entity other than the head entity and the vector Vr* of a sampled relation other than the above relation. That is, the second learning unit 30 learns the question vector Vq so as to reduce the objective function g2 of equation (6).

g2 = {||Vhr + Vr - Vq||_2}^2 … (6)

More specifically, the second learning unit 30 learns the question vector Vq so that the score L, which is based on the sum of the objective functions g1 and g2, becomes sufficiently small and converges. The score L is expressed by equation (7). In the equation, S denotes combinations of a head entity, a relation, and a tail entity that form a correct triplet together with the corresponding question, and S* denotes combinations of a head entity, a relation, a tail entity, and a question that do not form a correct triplet or in which the question does not correspond to the other information. γ is an arbitrarily chosen margin parameter.
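The two objective functions and a margin-based score built from their sum can be sketched as follows. The functions `g1` and `g2` follow equations (5) and (6). The body of equation (7) does not appear in this text, so the hinge form in `margin_score` is an assumption chosen to match the described roles of S, S*, and γ, and all numeric values are made up.

```python
def squared_l2_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def g1(v_q, v_tr):
    """Equation (5): squared distance between the question and tail vectors."""
    return squared_l2_distance(v_tr, v_q)


def g2(v_q, v_hr, v_r):
    """Equation (6): squared distance between the question vector and Vhr + Vr."""
    return squared_l2_distance([h + r for h, r in zip(v_hr, v_r)], v_q)


def margin_score(v_q, positive, negatives, gamma=1.0):
    """ASSUMED hinge form of the score L (equation (7) is not given in the text).

    `positive` and every item of `negatives` are (v_hr, v_r, v_tr) tuples.
    """
    pos_hr, pos_r, pos_tr = positive
    pos = g1(v_q, pos_tr) + g2(v_q, pos_hr, pos_r)
    total = 0.0
    for neg_hr, neg_r, neg_tr in negatives:
        neg = g1(v_q, neg_tr) + g2(v_q, neg_hr, neg_r)
        total += max(0.0, pos + gamma - neg)  # zero once the negative is far enough
    return total


# Hypothetical toy vectors.
v_q = [0.4, 0.3]
positive = ([0.1, 0.2], [0.3, 0.1], [0.4, 0.3])
far_negative = ([-0.6, -0.6], [0.0, 0.0], [-0.5, -0.6])
near_negative = ([0.1, 0.2], [0.3, 0.1], [0.4, 0.2])

print(margin_score(v_q, positive, [far_negative]))
print(margin_score(v_q, positive, [near_negative]))
```

Under this assumed form, a negative sample that is already far from the question contributes nothing, while one that is nearly as close as the correct triplet contributes almost the full margin.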

The second learning unit 30 learns the parameters and the like of the deriver that derives the question vector Vq from natural-language text. FIG. 6 is a conceptual diagram showing the function of the deriver. As illustrated, the deriver is built on, for example, an RNN. Codes or the like of the tokens obtained by splitting the question character by character are input to the deriver sequentially. The RNN outputs a provisional question vector Vq(1), Vq(2), … at each time step and propagates at least part of its computation result to the next time step. The output of the RNN at the point when the codes of all tokens of the question have been input is the question vector Vq. For a given question, the second learning unit 30 learns the parameters of the RNN so that the derived question vector Vq is one learned to reduce the score L based on the vectors Vhr, Vr, and Vtr of the corresponding known head entity, relation, and tail entity. These parameters are held as the deriver information 70.
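The forward pass of such a deriver can be sketched with a minimal Elman-style recurrent cell in plain Python. The class `TinyRNNDeriver`, the tanh activation, the embedding table, and the random untrained weights are all assumptions for illustration; the patent only states that the deriver is based on an RNN, and training is not shown.

```python
import math
import random


class TinyRNNDeriver:
    """Minimal Elman-style RNN; weights are random and untrained (illustration only)."""

    def __init__(self, vocab, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.index = {token: i for i, token in enumerate(vocab)}

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.embedding = rand_matrix(len(vocab), dim)  # one input vector per token
        self.w_in = rand_matrix(dim, dim)              # input-to-state weights
        self.w_rec = rand_matrix(dim, dim)             # state-to-state weights

    def step(self, state, token):
        """Combine the current token with the previous state into a new state."""
        x = self.embedding[self.index[token]]
        return [
            math.tanh(
                sum(self.w_in[i][j] * x[j] for j in range(self.dim))
                + sum(self.w_rec[i][j] * state[j] for j in range(self.dim))
            )
            for i in range(self.dim)
        ]

    def derive(self, tokens):
        """Return the final vector Vq and the provisional vectors Vq(1), Vq(2), ..."""
        state = [0.0] * self.dim
        provisional = []
        for token in tokens:
            state = self.step(state, token)
            provisional.append(state)
        return state, provisional


question = "where is the capital of japan"
tokens = list(question)  # character-by-character input
deriver = TinyRNNDeriver(vocab=sorted(set(tokens)), dim=4)
v_q, provisional = deriver.derive(tokens)
print(len(provisional), v_q)
```

Replacing `list(question)` with `question.split()` and building the vocabulary from those words would give word-level input instead of character-level input.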

If the question is in English, words separated by spaces may be input to the RNN sequentially, one per time step; if it is in Japanese or another language, words segmented by morphological analysis may be input to the RNN sequentially, one per time step. The second learning unit 30 performs learning suited to whichever form is used.

FIG. 7 is a conceptual diagram showing the processing performed by the second learning unit 30. As illustrated, the question vector Vq is obtained by inputting the question into the deriver. The second learning unit 30 learns the parameters of the deriver so that both the distance d1 between the question vector Vq and the tail entity vector Vtr, and the distance d2 between the question vector Vq and the sum of the head entity vector Vhr and the relation vector Vr, become sufficiently small.

As described above, the knowledge base 50 also carries information about the properties of entities (for example, their classes). In a first period, from the start of learning until a certain amount of time has elapsed, the second learning unit 30 samples the entities used for negative sampling regardless of class. In a second period following the first period, it samples the entities used for negative sampling from among the entities that have the same class as the correct entity. This makes learning more efficient and speeds up convergence.
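The two-period sampling schedule can be sketched as below. Several details are assumptions for illustration: the switch between periods is expressed as a training step count (`switch_step`) rather than elapsed time, the fallback to all classes when no same-class entity exists is not stated in the patent, and the entity names and classes are made up.

```python
import random


def sample_negatives(correct, entity_classes, step, switch_step, count, rng):
    """Pick negative entities; after switch_step, restrict to the correct entity's class."""
    candidates = [e for e in entity_classes if e != correct]
    if step >= switch_step:  # second period: harder, same-class negatives
        same_class = [e for e in candidates if entity_classes[e] == entity_classes[correct]]
        if same_class:       # assumed fallback: keep all candidates if none share the class
            candidates = same_class
    return rng.sample(candidates, min(count, len(candidates)))


# Hypothetical entities and their classes.
entity_classes = {
    "Tokyo": "city",
    "Osaka": "city",
    "Kyoto": "city",
    "Japan": "country",
    "Mount Fuji": "mountain",
}

rng = random.Random(0)
early = sample_negatives("Tokyo", entity_classes, step=0, switch_step=50, count=4, rng=rng)
late = sample_negatives("Tokyo", entity_classes, step=100, switch_step=50, count=2, rng=rng)
print(sorted(early))
print(sorted(late))
```

Early negatives are easy to separate from the correct answer, which helps the vectors settle quickly; same-class negatives in the later period force the deriver to distinguish answers that are semantically close.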

The answer output unit 40 inputs a question, acquired from another device via the network NW for example, into the deriver defined by the deriver information 70, and extracts from the entity-relation vector DB 60 the entity vector closest to the output question vector Vq. In doing so, the answer output unit 40 extracts the entity vector closest to the question vector Vq by, for example, the 1-NN (nearest neighbor) method. The answer output unit 40 returns the content of the entity corresponding to the closest entity vector as the answer to the question, for example to the other device via the network NW. When the information processing apparatus operates on its own, it processes a question entered through an input unit (not shown) and causes an output unit to output the answer.
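The 1-NN lookup performed by the answer output unit 40 can be sketched as a linear scan over the stored entity vectors. The function name and the toy vectors are hypothetical; a real system with many entities would likely use an index structure rather than a full scan.

```python
def nearest_entity(v_q, entity_vectors):
    """Return the name of the entity whose vector is closest to v_q (1-NN)."""

    def squared_distance(name):
        return sum((a - b) ** 2 for a, b in zip(entity_vectors[name], v_q))

    return min(entity_vectors, key=squared_distance)


# Hypothetical entity vectors as they might be stored in the entity-relation vector DB.
entity_vectors = {
    "Tokyo": [0.40, 0.30],
    "Osaka": [0.10, 0.70],
    "Japan": [0.10, 0.20],
}
v_q = [0.38, 0.33]  # question vector produced by the deriver (made-up values)

print(nearest_entity(v_q, entity_vectors))  # prints: Tokyo
```

Comparing squared distances gives the same ranking as comparing L2 distances, so the square root can be skipped.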

以上説明した情報処理装置によれば、より効率的かつ高精度に質問の答えを導出できるようにすることができる。 According to the information processing apparatus described above, it is possible to derive the answer to the question more efficiently and with high accuracy.

ここで、比較対象として、質問のベクトルＶｑを、単にベクトルＶｈｒ＋Ｖｒに近づけるように学習する場合（以下、比較例）を考える。図８は、比較例による処理の内容を模式的に示すイメージ図である。比較例では、ＶｑをＶｈｒ＋Ｖｒに近づける（距離ｄ２を小さくする）学習と、Ｖｈｒ＋ＶｒをＶｔｒに近づける（距離ｄ３を小さくする）学習との二段階の学習（推論）を経て回答を導出することになるが、実施形態の情報処理装置では、ＶｑをＶｔｒに直接近づける学習を処理内容に含んでいるため、回答を導出する精度を向上させることができる。 Here, as a point of comparison, consider a case in which the question vector Vq is learned simply so as to approach the vector Vhr + Vr (hereinafter, the comparative example). FIG. 8 is a conceptual diagram schematically showing the processing of the comparative example. In the comparative example, the answer is derived through two stages of learning (inference): learning that brings Vq closer to Vhr + Vr (reducing the distance d2), and learning that brings Vhr + Vr closer to Vtr (reducing the distance d3). In the information processing apparatus of the embodiment, by contrast, the processing includes learning that brings Vq directly closer to Vtr, so the accuracy with which the answer is derived can be improved.

情報量の圧縮の観点からも同じことが言える。ＶｑをＶｈｒ＋Ｖｒに近づける学習においては、一種の次元圧縮が行われる。なぜなら、質問の自然文には、ヘッドエンティティやリレーションで表される抽象化された意味情報以上の情報（以下、付加情報）が含まれているからである。比較例では、付加情報が削除された状態でＶｈｒ＋Ｖｒに近づける学習が行われるのに対し、実施形態の情報処理装置では、付加情報を含んだ状態でＶｔｒに近づける学習が行われるため、そもそもＶｈｒ＋ＶｒをＶｔｒに近づける学習よりも高精度が結果を得ることができる。 The same can be said from the viewpoint of information compression. Learning that brings Vq closer to Vhr + Vr performs a kind of dimensionality reduction, because the natural-language text of a question contains more information (hereinafter, additional information) than the abstracted semantic information represented by a head entity and a relation. In the comparative example, the learning toward Vhr + Vr is performed with the additional information removed, whereas in the information processing apparatus of the embodiment, the learning toward Vtr is performed with the additional information retained. The embodiment can therefore obtain more accurate results than learning that brings Vhr + Vr closer to Vtr.

また、実施形態の情報処理装置によれば、式（７）に示すように、ＶｑをＶｔｒに近づけるという制約と、ＶｑをＶｈｒ＋Ｖｒに近づけるという制約の二つの制約の上でＶｑを学習する。このため、学習における自由度を下げることで、処理の収束を早めることができる。この結果、比較例よりも高速に処理を行うことができる。 Further, according to the information processing apparatus of the embodiment, as shown in equation (7), Vq is learned under two constraints: that Vq be brought closer to Vtr, and that Vq be brought closer to Vhr + Vr. Reducing the degrees of freedom in learning in this way speeds up convergence of the processing. As a result, processing can be performed faster than in the comparative example.

また、実施形態の情報処理装置によれば、式（７）に示すように、ポジティブサンプルに対する距離を最小化すると共に、ネガティブサンプルに対する距離を最大化する問題を解く際に、最大化する項の符号をマイナスにすることで、全体を最小化問題にしている。ディープラーニング等の処理は、目的関数を最小化する問題を解くのに適しているため、実施形態の処理は、コンピュータを用いて好適に実現することができる。 Further, according to the information processing apparatus of the embodiment, as shown in equation (7), when solving the problem of minimizing the distance to the positive samples while maximizing the distance to the negative samples, the terms to be maximized are given a negative sign, so that the problem as a whole becomes a minimization problem. Because processing such as deep learning is well suited to problems that minimize an objective function, the processing of the embodiment can be suitably implemented on a computer.
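The shape of such an objective can be sketched as follows. This sketch is not a reproduction of equation (7): it only illustrates an objective in which the two positive distances (Vq to Vtr, and Vq to Vhr + Vr) enter with a plus sign and the corresponding negative-sample distances enter with a minus sign, so that a single minimization covers both goals. The function name, the unweighted sum, the Euclidean distance, and the argument names for the negative samples (`vtr_neg`, `vhr_neg`, `vr_neg`) are assumptions.

```python
import math


def add(a, b):
    """Element-wise sum of two vectors given as lists."""
    return [x + y for x, y in zip(a, b)]


def objective(vq, vtr, vhr, vr, vtr_neg, vhr_neg, vr_neg):
    """Value to be minimized for one training example (illustrative only).

    Positive terms: distances that should shrink.
    Negative terms: distances that should grow, hence the minus sign.
    """
    positive = math.dist(vq, vtr) + math.dist(vq, add(vhr, vr))
    negative = math.dist(vq, vtr_neg) + math.dist(vq, add(vhr_neg, vr_neg))
    return positive - negative
```

With this sign convention, lowering the objective either pulls Vq toward the positive samples or pushes it away from the negative samples, which is what allows a standard minimizer to be applied unchanged.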

なお、上記の説明において、情報処理装置は、質問を区切った語のコード等がＲＮＮに順次入力されるものとしたが、類義語辞書等を使用して、ある程度正規表現にしてからＲＮＮに入力してもよい。 In the above description, the codes or the like of the words into which the question is segmented are sequentially input to the RNN. However, the question may first be converted to some extent into a normalized (canonical) expression, using a synonym dictionary or the like, before being input to the RNN.
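The normalization step mentioned above can be sketched as a simple dictionary lookup applied to the segmented words before they are passed to the RNN. The function name `normalize_tokens` and the example dictionary are hypothetical, and real synonym dictionaries may require more than a one-to-one mapping.

```python
def normalize_tokens(tokens, synonyms):
    """Replace each token by its canonical form when the dictionary has one.

    tokens:   list of words obtained by segmenting the question
    synonyms: dict mapping a variant word -> its canonical word
    Tokens without an entry are passed through unchanged.
    """
    return [synonyms.get(token, token) for token in tokens]


# Toy usage with a made-up synonym dictionary.
synonyms = {"movie": "film", "picture": "film"}
normalized = normalize_tokens(["who", "directed", "this", "movie"], synonyms)
```

Mapping variants onto one canonical word reduces the vocabulary the RNN has to cover, so differently worded questions with the same meaning reach the deriver as the same input sequence.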

以上、本発明を実施するための形態について実施形態を用いて説明したが、本発明はこうした実施形態に何等限定されるものではなく、本発明の要旨を逸脱しない範囲内において種々の変形及び置換を加えることができる。 Although modes for carrying out the present invention have been described above using embodiments, the present invention is in no way limited to these embodiments, and various modifications and substitutions can be made without departing from the gist of the present invention.

１ 情報処理装置
１０ 質問取得部
２０ 第１学習部
３０ 第２学習部
４０ 回答出力部
５０ ナレッジベース
６０ エンティティ・リレーションベクトルＤＢ
７０ 導出器情報
1 Information processing apparatus
10 Question acquisition unit
20 First learning unit
30 Second learning unit
40 Answer output unit
50 Knowledge base
60 Entity-relation vector DB
70 Deriver information

Claims

An information processing apparatus comprising a derivator learning unit that learns a derivator for deriving a feature amount of text constituting a question, by referring to a second database that contains a group of feature amounts learned, based on a first database in which a plurality of entities and relations indicating relationships between the entities are registered, so that a feature amount obtained by adding, to a feature amount of a first entity selected from the plurality of entities, a feature amount of a first relation indicating the relationship between the first entity and a second entity approaches a feature amount of the second entity,
wherein the derivator learning unit learns the derivator so as to
reduce the difference between the feature amount of the question and the feature amount of the second entity, and
increase the difference between the feature amount of the question and the feature amount of a sampled entity other than the second entity.

The information processing apparatus according to claim 1,
wherein the derivator learning unit further learns the derivator so as to
reduce the difference between the feature amount of the question and the sum of the feature amount of the first entity and the feature amount of the first relation, and
increase the difference between the feature amount of the question and the feature amount of a sampled entity other than the first entity and the feature amount of a sampled relation other than the first relation.

The information processing apparatus according to claim 1 or 2,
wherein the feature amount of the question is a feature amount in the same space as the feature amount of the first entity, the feature amount of the first relation, and the feature amount of the second entity.

The information processing apparatus according to any one of claims 1 to 3,
wherein information on the properties of the entities is attached to the first database, and
the derivator learning unit samples the entity other than the second entity regardless of the property in a first period, and, in a second period after the first period, samples the entity other than the second entity from among entities that have the property in common with the second entity.

The information processing apparatus according to any one of claims 1 to 4,
wherein the derivator is configured based on an RNN (recurrent neural network) to which the words obtained by segmenting the question are sequentially input, and
the derivator learning unit learns the parameters of the RNN.

The information processing apparatus according to any one of claims 1 to 5,
further comprising a feature amount learning unit that learns the feature amounts, based on the first database in which the plurality of entities and the relations indicating relationships between the entities are registered, so that the feature amount obtained by adding, to the feature amount of the first entity selected from the plurality of entities, the feature amount of the first relation indicating the relationship between the first entity and the second entity approaches the feature amount of the second entity.

The information processing apparatus according to any one of claims 1 to 6, further comprising:
an acquisition unit that acquires a question; and
an answer output unit that extracts, from the first database, an entity close to the feature amount of the question obtained by inputting the question acquired by the acquisition unit into the derivator, and outputs the extracted entity as an answer to the question.

An information processing method in which a computer
learns a derivator for deriving a feature amount of text constituting a question, by referring to a second database that contains a group of feature amounts learned, based on a first database in which a plurality of entities and relations indicating relationships between the entities are registered, so that a feature amount obtained by adding, to a feature amount of a first entity selected from the plurality of entities, a feature amount of a first relation indicating the relationship between the first entity and a second entity approaches a feature amount of the second entity, and,
in the learning, learns the derivator so as to
reduce the difference between the feature amount of the question and the feature amount of the second entity, and
increase the difference between the feature amount of the question and the feature amount of a sampled entity other than the second entity.

A program that causes a computer to
learn a derivator for deriving a feature amount of text constituting a question, by referring to a second database that contains a group of feature amounts learned, based on a first database in which a plurality of entities and relations indicating relationships between the entities are registered, so that a feature amount obtained by adding, to a feature amount of a first entity selected from the plurality of entities, a feature amount of a first relation indicating the relationship between the first entity and a second entity approaches a feature amount of the second entity, and,
in the learning, learn the derivator so as to
reduce the difference between the feature amount of the question and the feature amount of the second entity, and
increase the difference between the feature amount of the question and the feature amount of a sampled entity other than the second entity.