JP7634499B2 - Information processing system, method and program

Info

Publication number: JP7634499B2
Application number: JP2022056450A
Authority: JP
Inventors: 智彦山下; 大樹町田; 垠呉; スブラタオシュ; 麻里子河崎; アシュリージェーン; 卓志梅田; ▲琢▼磨蛭子; サティアンアブロール
Original assignee: Rakuten Group Inc
Current assignee: Rakuten Group Inc
Filing date: 2022-03-30
Publication date: 2025-02-21
Anticipated expiration: 2042-03-30

Description

The present disclosure relates to technology for supporting evaluations of users, such as the calculation of a score for a user.

Conventionally, a judgment device has been proposed that includes a user information acquisition unit that acquires behavioral information indicating a user's behavior, and a creditworthiness judgment unit that judges, based on the behavioral information, the user's creditworthiness with respect to the future ability to repay a loan (see Patent Document 1). A system has also been proposed in which whether or not a user score is displayed is determined according to the degree of intimacy between users (see, for example, Patent Document 2).

Patent Document 1: JP 2021-174039 A
Patent Document 2: JP 2020-129228 A

Technology has conventionally been proposed for calculating a user score that represents, for example, a user's creditworthiness based on the user's behavioral history. However, when information about the target user is missing or unreliable, the user score either cannot be calculated or is calculated with insufficient accuracy.

In view of the above problem, an object of the present disclosure is to enable evaluation, such as calculation of a user score, or to improve the accuracy of the evaluation, even when information about the target user is missing or unreliable.

An example of the present disclosure is an information processing system including: reference user identification means for identifying a reference user who has a relationship with a target user; attribute generation means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user; attribute completion means for completing a corresponding attribute data group of the target user based on at least a portion of the generated attribute data of the target user; and user score estimation means for estimating a user score to be set for the target user based on the completed attribute data group of the target user.

The present disclosure can be understood as an information processing device, a system, a method executed by a computer, or a program that causes a computer to execute the method. The present disclosure can also be understood as a recording medium, readable by a computer or other device or machine, on which such a program is recorded. Here, a recording medium readable by a computer or the like is a recording medium that stores information such as data and programs by electrical, magnetic, optical, mechanical, or chemical action, and from which a computer or the like can read that information.

According to the present disclosure, evaluation such as calculation of a user score can be performed, or the accuracy of the evaluation can be improved, even when information about the target user is missing or unreliable.

FIG. 1 is a schematic diagram showing the configuration of an information processing system according to an embodiment.
FIG. 2 is a diagram showing an outline of the functional configuration of an information processing device according to the embodiment.
FIG. 3 is a diagram schematically showing an example in which IP address data values are shared in the embodiment.
FIG. 4 is a diagram showing an example of graph data according to the embodiment.
FIG. 5 is a diagram schematically showing an example in which address data values are shared in the embodiment.
FIG. 6 is a diagram showing an example of graph data according to the embodiment.
FIG. 7 is a diagram schematically showing an example in which credit card number data values are shared in the embodiment.
FIG. 8 is a diagram showing an example of graph data according to the embodiment.
FIG. 9 is a diagram showing an example of graph data according to the embodiment.
FIG. 10 is a diagram showing an example of clusters according to the embodiment.
FIG. 11 is a diagram showing an example of a visualization of classification according to the embodiment.
FIG. 12 is a diagram showing an example of determining relationship strength (closeness score) using a machine learning model according to the embodiment.
FIG. 13 is a diagram showing, in simplified form, the concept of a decision tree of the machine learning model employed in the embodiment.
FIG. 14 is a flowchart showing the flow of machine learning processing according to the embodiment.
FIG. 15 is a flowchart showing the flow of user score estimation processing according to the embodiment.

Embodiments of the information processing device, method, and program according to the present disclosure are described below with reference to the drawings. The embodiments described below are examples only, and do not limit the information processing device, method, and program according to the present disclosure to the specific configurations described. In practice, a specific configuration suited to the mode of implementation may be adopted as appropriate, and various improvements and modifications may be made.

This embodiment describes a case in which the technology according to the present disclosure is implemented for a user score management system that manages user scores, each indicating some measure related to a user (for example, creditworthiness). However, the technology according to the present disclosure can be applied broadly to techniques for estimating user scores, and its application is not limited to the examples shown in the embodiment.

<System Configuration>
FIG. 1 is a schematic diagram showing the configuration of the information processing system according to this embodiment. In this system, an information processing device 1 and one or more service providing systems 5 are communicably connected to each other. A user is a person who uses a service provided by a service providing system 5, and receives the service by accessing the service providing system 5 from a user terminal.

The information processing device 1 is a computer including a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, a storage device 14 such as an EEPROM (Electrically Erasable and Programmable Read Only Memory) or an HDD (Hard Disk Drive), a communication unit 15 such as a NIC (Network Interface Card), and the like. Components of the specific hardware configuration of the information processing device 1 may be omitted, replaced, or added as appropriate depending on the mode of implementation. Furthermore, the information processing device 1 is not limited to a device housed in a single enclosure; it may be realized by multiple devices using so-called cloud or distributed computing technology.

The information processing device 1 manages a user score for each user and provides the user scores to the service providing systems 5. A service providing system 5 can customize its service for a target user according to the user score provided by the information processing device 1.

Each service providing system 5 is a computer including a CPU, a ROM, a RAM, a storage device, a communication unit, an input device, an output device, and the like (not shown). None of these systems and terminals is limited to a device housed in a single enclosure; each may be realized by multiple devices using so-called cloud or distributed computing technology.

In the system according to this embodiment, an electronic commerce system 40, a golf course reservation system 42, a travel reservation system 44, and a card management system 46 are communicably connected to each other as the service providing systems 5. However, the services provided by the service providing systems 5 are not limited to these examples. They may include, for example, a map information service, a credit card or deferred payment service, an electronic money payment service, an online shopping service, an online reservation service, and an operation center service. Note that "deferred payment" is not limited to services referred to as Buy Now, Pay Later (BNPL) and the like, and includes any purchase of goods or services paid for after the fact.

A service providing system 5 notifies the information processing device 1 of the attribute data group of a user that was acquired from that user when the service was provided. The information processing device 1 can also access a service providing system 5, acquire the user attribute data registered in that system for multiple users including the target user, and include it in the attribute data groups. Here, a user's attribute data includes account data, which is information about the user of the system, and usage history data for the services used by that user. The content of the usage history data varies with the service and may include, for example, location history data, payment history data for credit card or deferred payment charges, electronic money usage history data, transaction history data, reservation history data, history data of operations performed for the user by an operation center, and frequently visited places identified from the location history data. The account data includes, for example, a user ID, name data, address data, age data, gender data, telephone number data, mobile phone number data, credit card number data, IP address data, school data, and workplace data.

The user ID is, for example, identification information for the user in the computer system concerned. The name data indicates, for example, the user's name (surname and given name). The address data indicates, for example, the user's address. When the computer system is the electronic commerce system 40, the address data may indicate the delivery address for goods purchased by the user. The age data indicates, for example, the user's age. The gender data indicates, for example, the user's gender. The telephone number data indicates, for example, the user's telephone number. The mobile phone number data indicates, for example, the user's mobile phone number. The credit card number data indicates, for example, the number of the credit card that the user uses for payment in the computer system. The IP address data indicates, for example, the IP address of the computer used by the user (for example, the source IP address). The school data indicates, for example, the school that the user attends (such as the name or address of the educational institution) when the user is a student. The workplace data indicates, for example, the user's place of work (such as the company name or address) when the user is employed.

FIG. 2 is a diagram showing an outline of the functional configuration of the information processing device 1 according to this embodiment. A program recorded in the storage device 14 is read into the RAM 13 and executed by the CPU 11, and each piece of hardware in the information processing device 1 is controlled accordingly, so that the information processing device 1 functions as a device including a graph data generation unit 21, a reference user identification unit 22, a relationship identification unit 23, a relationship strength determination unit 24, an attribute selection unit 25, an attribute generation unit 26, an attribute completion unit 27, a user score estimation unit 28, and a machine learning unit 29. In this embodiment and the other embodiments described later, each function of the information processing device 1 is executed by the CPU 11, which is a general-purpose processor, but some or all of these functions may be executed by one or more dedicated processors.

The graph data generation unit 21 generates graph data (a social graph network) representing the relationships between users by identifying pairs of related users based on the attribute data group of each of multiple users. More specifically, the graph data generation unit 21 generates graph data that includes, for example, node data 50 associated with each of multiple users including the target user, and link data 52 associated with each pair of related users (see FIGS. 4, 6, 8, and 9). The graph data generation unit 21 also predicts and creates implicit links between users by learning (representation learning, relational learning, embedding learning, knowledge graph embedding) a user relationship graph composed of nodes (users) connected by explicit links. For this learning, the graph data generation unit 21 may use a known embedding model or an extension of one, as appropriate.

For example, as shown in FIG. 3, suppose that the attribute data group of user A is registered in the electronic commerce system 40, the attribute data group of user B is registered in the golf course reservation system 42, and the attribute data group of user C is registered in the travel reservation system 44. Suppose further that the IP address data value of user A registered in the electronic commerce system 40, that of user B registered in the golf course reservation system 42, and that of user C registered in the travel reservation system 44 are all the same.

In this case, as shown in FIG. 4, the graph data generation unit 21 generates graph data including node data 50a associated with user A, node data 50b associated with user B, node data 50c associated with user C, link data 52a indicating that user A is related to user B, link data 52b indicating that user A is related to user C, and link data 52c indicating that user B is related to user C. Users with the same IP address are presumed either to use the same computer or to share a global address at the same residence or workplace. For this reason, such users are associated with each other in this embodiment.

As another example, as shown in FIG. 5, suppose that the attribute data groups of user D, user E, and user F are registered in the electronic commerce system 40, and that the address data values of user D, user E, and user F registered there are all the same.

In this case, as shown in FIG. 6, the graph data generation unit 21 generates graph data including node data 50d associated with user D, node data 50e associated with user E, node data 50f associated with user F, link data 52d indicating that user D is related to user E, link data 52e indicating that user D is related to user F, and link data 52f indicating that user E is related to user F. Users with the same address are presumed to live together. For this reason, such users are associated with each other in this embodiment.

As a further example, as shown in FIG. 7, suppose that the attribute data group of user G is registered in the electronic commerce system 40, the attribute data group of user H is registered in the golf course reservation system 42, and the attribute data group of user I is registered in the travel reservation system 44. Suppose further that the credit card number data value of user G registered in the electronic commerce system 40, that of user H registered in the golf course reservation system 42, and that of user I registered in the travel reservation system 44 are all the same.

In this case, as shown in FIG. 8, the graph data generation unit 21 generates graph data including node data 50g associated with user G, node data 50h associated with user H, node data 50i associated with user I, link data 52g indicating that user G is related to user H, link data 52h indicating that user G is related to user I, and link data 52i indicating that user H is related to user I. Users with the same credit card number are presumed to be family members, such as a parent and child. For this reason, such users are associated with each other in this embodiment.
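The three examples above (shared IP address, shared address, shared credit card number) follow one pattern: users whose records hold the same value for a given attribute are linked pairwise. The following is a minimal sketch of that pattern, not the patented implementation. The function name `build_explicit_links`, the record layout, and the field names (`user_id`, `ip_address`, `address`, `credit_card_number`) are assumptions made for illustration, and missing values are simply skipped.

```python
from collections import defaultdict
from itertools import combinations


def build_explicit_links(records, keys=("ip_address", "address", "credit_card_number")):
    """Return (nodes, links) for a user relationship graph.

    records: iterable of dicts, each with a "user_id" and optional attribute values
             (hypothetical layout; records may come from different service systems).
    links:   set of frozenset({u, v}) for users sharing a value of any attribute in keys.
    """
    nodes = set()
    users_by_value = defaultdict(set)  # (attribute name, value) -> users holding it
    for rec in records:
        user = rec["user_id"]
        nodes.add(user)
        for key in keys:
            value = rec.get(key)
            if value is not None:  # assumption: a missing value never creates a link
                users_by_value[(key, value)].add(user)

    links = set()
    for users in users_by_value.values():
        # Every pair of users sharing the same value gets one undirected link.
        for u, v in combinations(sorted(users), 2):
            links.add(frozenset((u, v)))
    return nodes, links


# Users A, B, C share an IP address across three systems; D and E share an address.
records = [
    {"user_id": "A", "system": "ecommerce", "ip_address": "203.0.113.7"},
    {"user_id": "B", "system": "golf", "ip_address": "203.0.113.7"},
    {"user_id": "C", "system": "travel", "ip_address": "203.0.113.7"},
    {"user_id": "D", "system": "ecommerce", "address": "1-2-3 Example-cho"},
    {"user_id": "E", "system": "ecommerce", "address": "1-2-3 Example-cho"},
]
nodes, links = build_explicit_links(records)
```

Here `links` contains the three pairs among A, B, and C plus the pair D and E, which mirrors the fully connected triangles of FIGS. 4, 6, and 8. Grouping by value first avoids comparing every pair of users directly.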

The criteria for determining whether two users form a related pair are not limited to those described above. Related pairs can be determined based on various criteria, such as location history and behavioral history.

A link indicated by link data 52 that associates users identified as related in the manner described above is referred to as an explicit link. Now suppose, for example, that the users connected to a first user by explicit links and the users connected to a second user by explicit links include at least a predetermined number of users in common (for example, three or more). In this case, in this embodiment, the graph data generation unit 21 generates, for example, link data 52 indicating that the first user is related to the second user. A link indicated by link data 52 generated in this way is referred to as an implicit link.

For example, as shown in FIG. 9, suppose that node data 50j associated with user J and node data 50k associated with user K are connected by link data 52j indicating an explicit link, that node data 50j and node data 50l associated with user L are connected by link data 52k indicating an explicit link, and that node data 50j and node data 50m associated with user M are connected by link data 52l indicating an explicit link.

Suppose further that node data 50k associated with user K and node data 50n associated with user N are connected by link data 52m indicating an explicit link, that node data 50l associated with user L and node data 50n are connected by link data 52n indicating an explicit link, and that node data 50m associated with user M and node data 50n are connected by link data 52o indicating an explicit link.

In this case, the graph data generation unit 21 generates link data 52p indicating that user J is related to user N (that is, link data 52p indicating an implicit link). In this way, user N is identified as a user related to user J.
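The common-neighbor rule described above can be sketched as follows, using the graph of FIG. 9 as input. This is an illustrative sketch only: the function name `infer_implicit_links` and the tuple-based edge representation are assumptions, and it covers only the threshold rule on explicit links, not the embedding-based link prediction also mentioned in this description.

```python
from collections import defaultdict
from itertools import combinations


def infer_implicit_links(explicit_links, min_common=3):
    """Return implicit links as a set of (u, v) tuples with u < v.

    Two users who are not already linked get an implicit link when they share
    at least min_common neighbors over explicit links (three in the example).
    """
    neighbors = defaultdict(set)
    for u, v in explicit_links:
        neighbors[u].add(v)
        neighbors[v].add(u)

    implicit = set()
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already connected by an explicit link
        if len(neighbors[u] & neighbors[v]) >= min_common:
            implicit.add((u, v))
    return implicit


# Explicit links of FIG. 9: J is linked to K, L, M, and each of those to N.
explicit = [("J", "K"), ("J", "L"), ("J", "M"),
            ("K", "N"), ("L", "N"), ("M", "N")]
implicit = infer_implicit_links(explicit)
```

With these inputs, J and N share the three neighbors K, L, and M, so the only implicit link produced is the one between J and N. K, L, and M share only two neighbors with each other (J and N) and stay unlinked at the threshold of three.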

As another example, suppose that the users connected to a first user by explicit or implicit links and the users connected to a second user by explicit or implicit links include at least a predetermined number of users in common (for example, three or more). In this case, the graph data generation unit 21 may generate link data 52 indicating that the first user is related to the second user (link data 52 indicating an implicit link).

The reference user identification unit 22 refers to the graph data generated by the graph data generation unit 21 and identifies, from among the users included in the graph data, other users who are related to the target user as reference users for that target user. Here, the reference user identification unit 22 may identify as reference users both the users identified as related to the target user and the users who share with the target user at least a predetermined number of users identified as related. The reference user identification unit 22 may also identify reference users from among multiple users based on the attributes of the target user and the attributes of those users.

For example, the reference user identification unit 22 may identify, as a reference user for the target user, each user associated with node data 50 that is connected, by link data 52 indicating an explicit or implicit link, to the node data 50 associated with the target user.
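As a small illustration of this lookup, the sketch below collects every user directly connected to the target user by an explicit or implicit link. The function name `find_reference_users` and the representation of links as `(u, v)` tuples are assumptions made for this example.

```python
def find_reference_users(target, explicit_links, implicit_links=()):
    """Return the set of users directly linked to target by any link type."""
    reference_users = set()
    for u, v in list(explicit_links) + list(implicit_links):
        if u == target:
            reference_users.add(v)
        elif v == target:
            reference_users.add(u)
    return reference_users


# Graph of FIG. 9, including the implicit link between J and N.
explicit = [("J", "K"), ("J", "L"), ("J", "M"),
            ("K", "N"), ("L", "N"), ("M", "N")]
implicit = [("J", "N")]
refs_for_j = find_reference_users("J", explicit, implicit)
```

For target user J this yields K, L, M, and N. Without the implicit link, N would not be a reference user for J.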

The relationship identification unit 23 identifies the type of relationship between users. The relationships identified here are, for example, (1) a parent-child or spousal relationship between people living in the same household, (2) a friendship, and (3) a relationship between people who work at the same workplace. However, the relationships identified are not limited to the examples in this disclosure. In this embodiment, the relationship identification unit 23 identifies the relationship between users based on the result of clustering on values associated with the relationship between the users. The types of values that can be used for this purpose are not limited, but may include, for example, at least one of the users' names, IP addresses, addresses, credit card numbers, ages, genders, schools, workplaces, and frequently visited places.

The relationship identification unit 23 identifies the relationship between the target user and a reference user. Here, the relationship identification unit 23 may identify this relationship based on the attribute data group of the target user and the attribute data group of the reference user. The computer system in which the target user's attribute data group is registered may differ from the computer system in which the reference user's attribute data group is registered. For example, the relationship may be identified based on the target user's attribute data group registered in the electronic commerce system 40 and the reference user's attribute data group registered in the golf course reservation system 42.

For example, the relationship identification unit 23 identifies a pair of node data 50 connected by link data 52. It then generates pair attribute data for that pair based on the user attribute data groups of the two users associated with the pair. The pair attribute data includes, for example, an IP common flag, an address common flag, a credit card number common flag, a same surname flag, age difference data, pair gender data, a school common flag, a workplace common flag, and a frequently visited place common flag.

The IP common flag indicates, for example, whether the IP address data value in the attribute data of one user of the pair is the same as that in the attribute data of the other. For example, the IP common flag may be set to 1 when the IP address data values are the same and to 0 when they differ.

The address common flag, school common flag, workplace common flag, and frequently visited place common flag indicate, for example, whether the value of the address data, school data, workplace data, or frequently visited place data, respectively, in the attribute data group of one user of the pair is the same as the corresponding value in the attribute data group of the other. For example, the address common flag may be set to 1 when the address data values are the same and to 0 when they differ.

The credit card number common flag indicates, for example, whether the credit card number data in the attribute data group of one user of the pair has the same value as the credit card number data in the attribute data group of the other user. For example, the flag may be set to 1 when the credit card number values are the same and to 0 when they differ.

The same-surname flag indicates, for example, whether the surname given by the name data in the attribute data group of one user of the pair is the same as the surname given by the name data in the attribute data group of the other user. For example, the flag may be set to 1 when the surnames are the same and to 0 when they differ.

The age difference data indicates, for example, the difference between the age data value in the attribute data group of one user of the pair and the age data value in the attribute data group of the other user.

The pair gender data indicates, for example, the combination of the gender data value in the attribute data group of one user of the pair and the gender data value in the attribute data group of the other user.
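The pair attribute data described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `UserAttributes` record, its field names, and the choice to treat a missing value as "not common" are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserAttributes:
    """Hypothetical per-user attribute record; field names are illustrative."""
    ip_address: Optional[str] = None
    address: Optional[str] = None
    credit_card_number: Optional[str] = None
    surname: Optional[str] = None
    age: Optional[int] = None
    gender: Optional[str] = None


def common_flag(a, b):
    """Return 1 when both values are present and equal, otherwise 0.

    Treating a missing value as "not common" is an assumption of this sketch.
    """
    return 1 if a is not None and a == b else 0


def pair_attributes(u, v):
    """Build the pair attribute data for two users connected by a link."""
    both_ages = u.age is not None and v.age is not None
    both_genders = u.gender is not None and v.gender is not None
    return {
        "ip_common": common_flag(u.ip_address, v.ip_address),
        "address_common": common_flag(u.address, v.address),
        "card_common": common_flag(u.credit_card_number, v.credit_card_number),
        "same_surname": common_flag(u.surname, v.surname),
        "age_diff": abs(u.age - v.age) if both_ages else None,
        # Sorted so that the combination does not depend on argument order.
        "pair_gender": tuple(sorted((u.gender, v.gender))) if both_genders else None,
    }
```

The school, workplace, and place-of-stay flags would follow the same `common_flag` pattern and are omitted only to keep the sketch short.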

The relationship identification unit 23 then applies a general-purpose clustering method to the values of the pair attribute data associated with each of the pairs, thereby classifying the pairs into a plurality of clusters 54 as shown in FIG. 10.

FIG. 10 schematically shows an example in which a plurality of pairs are classified into five clusters 54 (54a, 54b, 54c, 54d, and 54e). Each cross in FIG. 10 corresponds to one pair and is placed at the position corresponding to the values of that pair's pair attribute data. Although the pairs are classified into five clusters 54 in the example of FIG. 10, the number of clusters 54 is not limited to five; for example, the pairs may be classified into four clusters 54.
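The text only says that a general clustering method is used, so the following sketch picks k-means purely as an example. The function name `kmeans`, the deterministic initialisation from the first `k` points, and the fixed iteration count are assumptions made for reproducibility, not details taken from the disclosure.

```python
def kmeans(points, k, iterations=20):
    """Cluster numeric feature vectors with plain k-means.

    points: list of equal-length numeric tuples, one per pair.
    Returns (labels, centroids).
    """
    # Deterministic initialisation: use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In practice the pair attribute data would first be encoded as numbers (flags as 0 or 1, age difference rescaled), and the number of clusters would be chosen to suit the data.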

FIG. 11 shows an example of a visualization of the classification when the pairs are classified into four clusters 54. As shown in FIG. 11, pairs with the same address, the same gender, an age difference greater than X years, and the same surname may be classified into a first cluster. Pairs with the same address, the same gender, an age difference of X years or less, and the same surname may be classified into a second cluster. Pairs with the same address, different genders, an age difference greater than Y years, and the same surname may be classified into a third cluster. Pairs with the same address, different genders, an age difference of Y years or less, and the same surname may be classified into a fourth cluster.

In this case, the first cluster is presumed to be the cluster 54 corresponding to, for example, a same-sex parent and child. The second cluster is presumed to correspond to same-sex siblings, the third cluster to an opposite-sex parent and child, and the fourth cluster to a married couple.
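The four-cluster reading of FIG. 11 can be restated as a small decision function. Note that the disclosure obtains these groups by clustering, not by hard-coded rules; the function below only mirrors the visualization. The names `describe_cluster` and `PRESUMED_RELATIONSHIP` are hypothetical, and the thresholds X and Y are left as parameters because the text does not give their values.

```python
# Presumed relationship for each cluster number in the FIG. 11 example.
PRESUMED_RELATIONSHIP = {
    1: "same-sex parent and child",
    2: "same-sex siblings",
    3: "opposite-sex parent and child",
    4: "married couple",
}


def describe_cluster(address_common, same_surname, same_gender, age_diff,
                     x_years, y_years):
    """Return the FIG. 11 cluster number (1 to 4), or None if not covered."""
    # FIG. 11 only describes pairs that share both address and surname.
    if not (address_common and same_surname):
        return None
    if same_gender:
        return 1 if age_diff > x_years else 2
    return 3 if age_diff > y_years else 4
```

For instance, with illustrative thresholds of 15 years for both X and Y, a same-address, same-surname, different-gender pair with a 3-year age difference falls into cluster 4.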

As described above, the relationship identification unit 23 may identify the relationship between the target user and a reference user based on the result of clustering on values associated with relationships between users. Clusters for friendships or for people working at the same workplace can be created by clustering on the school common flag, workplace common flag, and place-of-stay common flag in roughly the same way as in the example above, so a detailed description is omitted. The relationship identification unit 23 may also identify the relationship between the target user and the reference user based on the result of clustering on at least one of surname, IP address, address, credit card number, age difference, gender, school, workplace, and place of stay.

The relationship strength determination unit 24 determines a relationship strength (hereinafter also called a "closeness score") indicating how close the target user and the reference user are. It does so from indices indicating the strength of the relationship between the two users, applying a judgment criterion that corresponds to the type of relationship between them. In this embodiment, the relationship strength determination unit 24 determines the closeness score based on the output obtained when data representing those indices is input to a trained machine learning model corresponding to the relationship between the target user and the reference user.

Here, the relationship strength determination unit 24 may include one trained machine learning model for each of the clusters 54 described above. For example, when the pairs are classified into five clusters 54, the relationship strength determination unit 24 may include five machine learning models. The relationship strength determination unit 24 may then determine the closeness score between the target user and the reference user based on the output obtained when data representing the indices of the strength of their relationship is input to the trained machine learning model corresponding to their relationship. In this case, the input/output relationship implemented in the trained machine learning model corresponds to the judgment criterion mentioned above.

As shown in FIG. 12, the relationship strength determination unit 24 may input, to the nth machine learning model, the input data for a pair classified into the cluster 54 associated with that model. For example, when the relationship strength determination unit 24 includes five machine learning models, n is an integer from 1 to 5. The relationship strength determination unit 24 may then adopt the value of the output data produced by the nth machine learning model in response to that input as the closeness score for the pair.

The input data for a pair may include, for example, some or all of the pair attribute data associated with the pair. The input data may also include data that is not part of the pair attribute data. For example, it may include data indicating the usage history of the electronic commerce system 40, or data that the relationship strength determination unit 24 acquires from other information sources such as an SNS. More specifically, the input data may include data indicating the number of calls or messages exchanged between the pair per unit period, the number of gifts one user has sent to the other, the number of friends the pair has in common, and so on.

The types of data included in the input data for a pair may be the same for every cluster 54 or may differ depending on the cluster 54 to which the pair belongs. For example, the types of data input to the first machine learning model may differ from the types of data input to the second machine learning model.
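The per-cluster dispatch described above, including the possibility that each model takes different input types, might look as follows. This is a sketch under stated assumptions: the `registry` layout, the feature names, and the two stand-in scoring functions are hypothetical and merely take the place of trained models.

```python
def closeness_score(cluster_id, pair_data, registry):
    """Score a pair with the model registered for its cluster.

    registry maps a cluster number n to (feature_names, model), where model
    is any callable that takes a feature list and returns a score.
    """
    feature_names, model = registry[cluster_id]
    # Each cluster may use a different subset of the available pair data.
    features = [pair_data[name] for name in feature_names]
    return model(features)


# Stand-in "models": simple threshold functions used only for illustration.
registry = {
    2: (("address_common", "gifts_sent", "call_count"),
        lambda f: 1 if f[0] == 1 and f[1] >= 10 and f[2] >= 100 else 0),
    4: (("address_common", "messages_per_month"),
        lambda f: 1 if f[0] == 1 and f[1] >= 30 else 0),
}

sibling_pair = {"address_common": 1, "gifts_sent": 50, "call_count": 1200,
                "messages_per_month": 5}
score = closeness_score(2, sibling_pair, registry)
```

Keeping the feature list next to each model makes it explicit that the nth model only ever sees the data types it was trained on.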

In this embodiment, for example, before the relationship strength determination unit 24 determines any closeness score, the nth machine learning model is trained in advance using a given set of training data associated with that model. This training data is prepared in advance so that, for example, closeness scores are determined appropriately for the cluster 54 associated with the nth machine learning model. The closeness score set in the training data may be a closeness score assigned (annotated) on a rule basis. It may also be a closeness score that a machine learning model output in the past and that an administrator or the like subsequently corrected.

Here, the nth machine learning model may be trained by weakly supervised learning. For example, each item of training data may include learning input data, which contains the same types of data as the input data supplied to the nth machine learning model, and teacher data (a target value), which is compared with the output data that the nth machine learning model produces in response to that learning input data.

Suppose, for example, that the closeness score takes a value of either 0 or 1: 1 when the pair is in a close relationship and 0 otherwise. In this case, the teacher data may include an appropriate closeness score value for the corresponding learning input data together with data indicating the probability that this value is appropriate. Weakly supervised learning may then be performed in which the parameter values of the nth machine learning model are updated based on the value of the output data that the model produces in response to the learning input data and the value of the teacher data paired with that learning input data.
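One way to use a label together with the probability that the label is appropriate is to convert the two into a soft target and fit a logistic model to it. The sketch below does exactly that with plain stochastic gradient descent. The choice of logistic regression, the soft-target conversion, and the names `train_weak` and `predict` are assumptions of this illustration; the disclosure does not specify the model or the loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability that the pair is close, under the logistic model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_weak(examples, epochs=500, lr=0.1):
    """Fit a logistic model to noisy labels.

    examples: list of (features, label, prob), where label is 0 or 1 and
    prob is the probability that the label is appropriate.
    """
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for x, label, prob in examples:
            # Soft target: the probability that the true closeness score is 1.
            target = prob if label == 1 else 1.0 - prob
            error = predict(weights, bias, x) - target
            # Gradient step for cross-entropy loss with a soft target.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Features (illustrative, scaled to 0..1): address_common, gifts, calls.
examples = [
    ((1.0, 1.0, 1.0), 1, 0.9),
    ((1.0, 0.8, 0.7), 1, 0.8),
    ((0.0, 0.04, 0.03), 0, 0.8),
    ((0.0, 0.1, 0.1), 0, 0.7),
]
weights, bias = train_weak(examples)
```

A label with low confidence pulls the model less strongly toward 0 or 1 than a confident one, which is the practical effect of supplying the probability alongside the label.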

The closeness score need not be binary data that takes a value of either 0 or 1. For example, it may be a real value that increases as the pair becomes closer (for example, a real value from 0 to 10) or a multi-level integer value (for example, an integer from 1 to 10).

The training method for the machine learning models is not limited to weakly supervised learning. As a concrete example, consider a pair in a sibling relationship. The input data for this pair is input to the trained machine learning model corresponding to the sibling relationship. The model may be trained so that, for example, it outputs a value of 1 when the pair has the same address data value, one of the pair has sent 50 gifts to the other, and the pair has made 1,200 calls to date. Likewise, it may be trained to output a value of 0 when the pair has different address data values, one of the pair has sent 2 gifts to the other, and the pair has made 30 calls to date. The criterion (for example, a threshold) that decides whether the output value corresponding to the closeness score is 1 or 0 may differ from one machine learning model to another.

The attribute selection unit 25 selects the types of attribute data to be generated by the attribute generation unit 26 (that is, the types of attribute data to be complemented) according to the type of relationship between the target user and the reference user. Examples of relationship types, and of the attribute data types selected for each, are given below.

(1) Parent-child or marital relationship within the same household
When the users are a parent and child or a married couple living in the same household, it can be assumed that mainly monetary variables and variables describing behavior as a household are the same for both. Therefore, when this relationship is identified between users, the attribute selection unit 25 selects, as the types of attribute data to be generated by the attribute generation unit 26, for example, household income, annual household income, place of residence, whether the household has insurance, amount of deposits and savings, financial assets, and whether the household subscribes to a newspaper.

(2) Friendship
When the users are friends, it can be assumed that people of the same gender, age, and hobbies tend to become friends. Therefore, when this relationship is identified between users, the attribute selection unit 25 selects, as the types of attribute data to be generated by the attribute generation unit 26, for example, hobbies, frequently visited places and areas, age, and gender.

(3) Working at the same workplace
When the users work at the same workplace, it can be assumed that people with the same educational level and field of expertise often work together. Therefore, when this relationship is identified between users, the attribute selection unit 25 selects, as the types of attribute data to be generated by the attribute generation unit 26, for example, the genre of specialized books purchased and educational level.

This embodiment describes a method in which the attribute selection unit 25 selects the types of attribute data to be complemented (generated) on a rule basis, but the selection method is not limited to this example. For example, the types of attribute data to be complemented may be selected using a machine learning model that has learned, for each type of relationship between users, which types of attribute data tend to be similar and how strongly they correlate.
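The rule-based selection for the three relationship types listed above amounts to a lookup table. The sketch below is one hypothetical encoding; the relationship keys, the attribute identifiers, and the function name `select_attribute_types` are naming assumptions, while the attribute lists themselves follow the text.

```python
# Attribute types to complement for each relationship type (rule base).
ATTRIBUTES_BY_RELATIONSHIP = {
    "same_household_family": [
        "household_income", "annual_household_income", "place_of_residence",
        "household_insurance", "deposits_and_savings", "financial_assets",
        "newspaper_subscription",
    ],
    "friends": [
        "hobbies", "frequently_visited_places", "age", "gender",
    ],
    "same_workplace": [
        "specialized_book_genre", "educational_level",
    ],
}


def select_attribute_types(relationship):
    """Return the attribute types to generate for a relationship type.

    Unknown relationship types yield an empty list, so nothing is
    complemented for them (an assumption of this sketch).
    """
    return list(ATTRIBUTES_BY_RELATIONSHIP.get(relationship, []))
```

Returning a copy keeps callers from accidentally modifying the shared rule table.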

The attribute generation unit 26 generates attribute data to complement attribute data that is missing or unreliable in the target user's attribute data group, based on information about at least one reference user identified for the target user. As the information about the reference user, the attribute generation unit 26 refers to the attribute data of the types selected by the attribute selection unit 25 in the reference user's attribute data group, and generates the target user's attribute data corresponding to the referenced attribute data.

Specifically, when the relationship between the target user and the reference user is "(1) parent-child or marital relationship within the same household", the attribute generation unit 26 refers to the reference user's attribute data for household income, annual household income, place of residence, whether the household has insurance, amount of deposits and savings, financial assets, whether the household subscribes to a newspaper, and so on, and generates the corresponding attribute data of the target user from it. When the relationship is "(2) friendship", the attribute generation unit 26 refers to the reference user's attribute data for hobbies, frequently visited places and areas, age, gender, and so on, and generates the corresponding attribute data of the target user from it. When the relationship is "(3) working at the same workplace", the attribute generation unit 26 refers to the reference user's attribute data for the genre of specialized books purchased, educational level, and so on, and generates the corresponding attribute data of the target user from it.

The attribute generation unit 26 may generate the target user's attribute data by copying the parameters of the reference user's attribute data unchanged into the corresponding attribute data of the target user. Alternatively, it may generate the target user's attribute data by applying some processing to the parameters of the reference user's attribute data. For example, the attribute generation unit 26 may refer to the closeness score determined for the reference user and generate the target user's attribute data based on both the parameters of the reference user's attribute data and the closeness score.

For example, the attribute generation unit 26 may generate the target user's attribute data by weighting the parameters of the reference user's attribute data with a weight determined from the closeness score. In this case, the attribute generation unit 26 sets a larger weighting coefficient the stronger the relationship indicated by the closeness score between the target user and the reference user. By then processing the parameters of the reference user's attribute data with this weighting coefficient (for example, by simply multiplying each parameter by the coefficient), the parameters of the attribute data complemented for the target user can be brought closer to the parameters of the referenced attribute data of the reference user.

When multiple reference users have been identified, the target user's attribute data may be generated based on all of them. For example, the attribute generation unit 26 may acquire, for each of the reference users, the closeness score and the parameter of the attribute data to be complemented, weight each acquired parameter based on the corresponding closeness score, and use the average of the resulting weighted parameters (or another statistic such as the median) as the parameter of the target user's corresponding attribute data.
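Combining several reference users can be sketched as a closeness-weighted average. The text only says that the weighted parameters are averaged; dividing by the sum of the weights, as done here, is one possible reading chosen so that the result stays on the same scale as the reference values. The function name `impute_from_references` is hypothetical.

```python
def impute_from_references(references):
    """Estimate a missing numeric attribute of the target user.

    references: list of (closeness_score, parameter) pairs, one per
    reference user. Returns None when no usable weight is available.
    """
    total_weight = sum(score for score, _ in references)
    if total_weight <= 0:
        return None
    # Reference users with higher closeness scores contribute more.
    weighted_sum = sum(score * value for score, value in references)
    return weighted_sum / total_weight
```

With closeness scores of 0.9 and 0.1 and reference values of 600 and 400, the estimate lands at 580, much nearer to the closer reference user's value.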

As another example, the attribute generation unit 26 may generate the target user's attribute data using an attribute generation model. The inputs of this model are at least some parameters of the target user's attribute data group before complementation, at least some parameters of the reference user's attribute data group, and the closeness score between the target user and the reference user; its output is the attribute data of the target user to be complemented. As with weighting, the attribute generation model is generated and/or updated so that the higher the closeness score between the target user and the reference user, the closer the parameters of the complemented attribute data are to the parameters of the referenced attribute data of the reference user. Also as with weighting, the closeness scores and attribute data of multiple reference users may be input to the attribute generation model so that it outputs the parameters of the target user's attribute data to be complemented.

The attribute complementing unit 27 complements the attribute data group of a user based on at least part of the generated attribute data. The attribute data group of a user includes attribute data, including account data and usage history data, acquired from the service providing system 5. The attribute complementing unit 27 complements this group by adopting at least part of the attribute data generated by the attribute generation unit 26 as at least part of the target user's attribute data group.

The attribute data complemented by the attribute complementing unit 27 may include demographic attributes, behavioral attributes, or psychographic attributes. Demographic attributes are, for example, the user's gender, family structure, and age. Behavioral attributes are, for example, whether the user has used cash advances, whether the user has used revolving payments, the deposit and withdrawal history of a given account, and the transaction history for any products, including gambling or lotteries (which may include online transaction history in an online marketplace or the like). Psychographic attributes are, for example, preferences regarding gambling or lotteries. The usable user attributes are not limited to these examples. For example, "time required for an operation (such as a phone call)" and "credit card usage amount / deferred payment usage amount" obtained from an operation center service or the like may also be used as attribute data.

The user score estimation unit 28 estimates the user score to be set for a user based on the complemented attribute data group. In this embodiment, it does so by inputting the user's attribute data group into a user score estimation model. The output of the user score estimation model is a user score normalized so that its minimum value is 0 and its maximum value is 1. The target user's attribute data group input to the model includes the attribute data generated by the attribute generation unit 26. As described above, this generated attribute data may include, for example, household income, annual household income, place of residence, whether the household has insurance, amount of deposits and savings, financial assets, whether the household subscribes to a newspaper, hobbies, frequently visited places and areas, age, gender, genre of specialized books purchased, and educational level.

The machine learning unit 29 generates and/or updates the user score estimation model used by the user score estimation unit 28. The user score estimation model may be a machine learning model that, given one or more items of attribute data (an attribute data group) of the target user, outputs a user score indicating some measure related to the user (for example, creditworthiness), or it may be any function or statistical model capable of outputting a user score.

To generate and/or update the user score estimation model, the machine learning unit 29 creates training data for each user based on data acquired from the service providing system 5. Each item of training data defines the user's attribute data group, including the user's demographic attributes, as the input value and the user's user score as the output value. The machine learning unit 29 then creates the user score estimation model from this training data. As described above, the attribute data group input to the user score estimation model includes attribute data generated by the attribute generation unit 26; this group is combined with the corresponding user's user score and supplied to the machine learning unit 29 as training data. The user score set in the training data may be a user score assigned (annotated) on a rule basis. It may also be a user score that the user score estimation model output in the past and that an administrator or the like subsequently corrected.

A machine learning model generation framework that can be adopted when implementing the technology according to the present disclosure is, for example, based on an ensemble learning algorithm. For example, a machine learning framework (e.g., LightGBM) based on gradient boosting decision trees (GBDT) may be adopted. In other words, the framework may be one based on a decision tree model in which the error between the correct answer and the predicted value is passed on from each weak learner (weak classifier) to the next. The predicted value here refers, for example, to the predicted value of the user score. Besides LightGBM, the framework may adopt other boosting implementations such as XGBoost or CatBoost. Compared with a framework that uses a neural network, a framework that uses decision trees can generate a machine learning model with relatively high performance and with less parameter-tuning effort. However, the machine learning model generation frameworks that can be adopted when implementing the technology according to the present disclosure are not limited to the examples in this embodiment. For example, another learner such as a random forest may be used instead of gradient boosting decision trees, or a learner that is not usually called a weak learner, such as a neural network, may be used. In particular, when a learner of the latter kind is used, ensemble learning need not be adopted.
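The idea of passing the error between the correct answer and the prediction from one weak learner to the next can be illustrated with a toy gradient boosting loop. The sketch below is written in plain Python with one-split decision stumps on a single numeric attribute and a squared-error loss; it is not LightGBM, XGBoost, or CatBoost, and the data, function names, and hyperparameters are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split tree to the residuals; return (threshold, left_value, right_value)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # every candidate threshold except the maximum
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    if best is None:  # all xs identical: fall back to a constant correction
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]


def predict(model, x):
    base, stumps, lr = model
    return base + sum(lr * (lv if x <= t else rv) for t, lv, rv in stumps)


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the errors left over by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]  # error handed to the next learner
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv) for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def sse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: one attribute (age) and a user score per user.
XS = [20, 25, 40, 60]
YS = [500.0, 520.0, 700.0, 710.0]
```

In practice a library implementation would handle many attributes, regularization, and missing values; the loop above only shows how residuals flow between successive weak learners.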

FIG. 13 is a simplified diagram showing the concept of a decision tree in the machine learning model adopted in this embodiment. When a gradient boosting machine learning framework based on a decision tree algorithm is adopted, the branching condition of each node of the decision tree is optimized. Specifically, for the two child nodes branching from one parent node, a user score is calculated for the group of users having the attributes indicated by each child node, and the branching condition of the parent node is optimized so that the difference between these user scores becomes large (for example, so that the difference is maximized or is equal to or greater than a predetermined threshold), that is, so that the two child nodes separate cleanly. For example, if the attribute used as the branching condition of a node is age, the age set as the branching threshold may be changed, or the branching condition may be changed to an attribute other than age. By recursively optimizing the branching conditions of all nodes of the decision tree in this way, the accuracy of estimating the user score from the attribute data group can be improved.
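As a rough illustration of the branching optimization described for FIG. 13, the sketch below searches for the age threshold that maximizes the difference between the mean user scores of the two child groups. It follows the wording of this description rather than the split criterion of any particular library (real GBDT frameworks typically optimize a gain computed from loss gradients), and the function name and data are hypothetical.

```python
def best_age_threshold(ages, scores):
    """Return (threshold, gap) where gap is the largest difference between
    the mean scores of users with age <= threshold and age > threshold."""
    best_t, best_gap = None, -1.0
    for t in sorted(set(ages))[:-1]:  # skip the maximum so both children are non-empty
        left = [s for a, s in zip(ages, scores) if a <= t]
        right = [s for a, s in zip(ages, scores) if a > t]
        gap = abs(sum(left) / len(left) - sum(right) / len(right))
        if gap > best_gap:
            best_t, best_gap = t, gap
    return best_t, best_gap


# Hypothetical users: age and previously assigned user score.
AGES = [22, 28, 35, 50, 61]
SCORES = [480.0, 500.0, 690.0, 700.0, 720.0]
```

With this data, splitting at age 28 separates the two low-scoring users from the three high-scoring ones. Repeating the same search over other attributes, and then recursively inside each child, corresponds to the recursive optimization mentioned above.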

In addition, when the attribute generation unit 26 uses an attribute generation model to generate the attribute data to be complemented, the machine learning unit 29 also generates and/or updates that attribute generation model. The attribute generation model is a machine learning model that, when one or more attribute data and closeness scores relating to one or more reference users are input, outputs the attribute data to be complemented for the target user.

When generating and/or updating the attribute generation model, the machine learning unit 29 creates training data from the data acquired from the service providing system 5, in which the attribute data and closeness scores of one or more reference users are defined as input values and one attribute data (the attribute data to be complemented for the target user) is defined as the output value. Here, the output value set in the training data (the parameter of the target user's attribute data to be complemented) may be a rule-based (annotated) output value, for example one obtained by the weighting calculation method described above. It may also be an output value that was previously output by the attribute generation model and then corrected by an administrator or the like.

The machine learning unit 29 then generates or updates the attribute generation model based on this training data. The one or more attribute data and closeness scores of the reference users are combined with the corresponding attribute data to be complemented and input to the machine learning unit 29 as training data. As with the user score estimation model described above, the machine learning model generation framework that can be adopted for generating or updating the attribute generation model is not limited, but a gradient boosting machine learning framework based on a decision tree algorithm may be adopted.

<Processing flow>
Next, the flow of processing executed by the information processing system according to this embodiment will be described. The specific contents and order of the processing described below are one example for implementing the present disclosure. The specific processing contents and order may be selected as appropriate according to the embodiment of the present disclosure.

FIG. 14 is a flowchart showing the flow of the machine learning process according to this embodiment. The process shown in this flowchart is executed at a timing specified by the administrator.

In this embodiment, the machine learning process generates and/or updates the user score estimation model. The machine learning unit 29 creates training data that includes combinations of the attribute data group accumulated in the past for each user in the service providing system 5 and the user score determined in advance for the corresponding user (step S101). The machine learning unit 29 then inputs the created training data into the user score estimation model, thereby generating or updating the user score estimation model used by the user score estimation unit 28 for user score estimation (step S102). The process shown in this flowchart then ends. When the attribute generation unit 26 uses an attribute generation model for attribute complementation, the attribute generation model may be generated and/or updated through a similar process flow.

FIG. 15 is a flowchart showing the flow of the user score estimation process according to this embodiment. The process shown in this flowchart is executed for each target user at a timing specified by the administrator. Here, a target user is a user whose attribute data is partly missing or of low reliability. Examples of low-reliability attribute data include attribute data generated from history data that has not yet accumulated in sufficient quantity, and attribute data that clearly contradicts the contents of other attribute data. It is assumed here that graph data has already been generated for multiple users including the target user, and that each machine learning model has already been trained.

In steps S201 and S202, reference users are identified, and the relationship between the target user and each reference user is identified. The reference user identification unit 22 refers to the graph data and identifies, as reference users, one or more other users corresponding to node data 50 that is connected by an explicit link or an implicit link to the node data 50 corresponding to the target user (step S201). The relationship identification unit 23 then identifies, for each pair of the target user and a reference user identified in step S201, the type of relationship between the users (specifically, a parent-child relationship within the same household, a marital relationship, a friendship, a relationship of working at the same workplace, and so on) (step S202). The process then proceeds to step S203.
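Steps S201 and S202 can be pictured as a neighbor lookup on the graph data followed by a lookup of the relationship type on each link. The sketch below assumes a very simple representation, a list of `(user_a, user_b, link_kind, relationship_type)` tuples; this layout, the user IDs, and the function names are hypothetical and are not the structure of the node data 50 or link data 52 themselves.

```python
from collections import defaultdict

# Hypothetical link records derived from the graph data.
LINKS = [
    ("u1", "u2", "explicit", "spouse"),
    ("u1", "u3", "implicit", "coworker"),
    ("u4", "u5", "explicit", "friend"),
]


def build_adjacency(links):
    """Index links by user so that neighbors can be looked up in both directions."""
    adjacency = defaultdict(dict)
    for a, b, kind, relationship in links:
        adjacency[a][b] = {"link": kind, "relationship": relationship}
        adjacency[b][a] = {"link": kind, "relationship": relationship}
    return adjacency


def identify_reference_users(adjacency, target):
    """S201: users connected to the target by an explicit or implicit link."""
    return sorted(adjacency.get(target, {}))


def identify_relationships(adjacency, target, reference_users):
    """S202: relationship type for each (target, reference user) pair."""
    return {ref: adjacency[target][ref]["relationship"] for ref in reference_users}
```

A target with no links simply yields an empty list of reference users, in which case no complementation from reference users would be possible under this sketch.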

In steps S203 and S204, the type of attribute data to be complemented is selected, and the closeness score between users is determined. The attribute selection unit 25 selects the type of attribute data to be complemented for the target user according to the type of relationship identified in step S202 (step S203). The relationship strength determination unit 24 determines, for each pair of the target user and a reference user, the closeness score value associated with that pair (step S204). The process then proceeds to step S205.

In step S205, the attribute data to be complemented for the target user is generated. The attribute generation unit 26 generates this attribute data based on the parameter of the reference user's attribute data that corresponds to the attribute data to be complemented and on the closeness score determined for that reference user in step S204. The process then proceeds to step S206.
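One plausible reading of step S205 is a closeness-weighted average of the reference users' values for the attribute being complemented. The description refers to a weighting based on the closeness score but the exact formula below is an assumption, as are the function name and the example values.

```python
def generate_attribute(reference_values, closeness):
    """Closeness-weighted average of one numeric attribute over reference users.

    reference_values: {reference_user_id: value or None}
    closeness:        {reference_user_id: closeness score (assumed non-negative)}
    Returns None when no reference user contributes a usable value.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for ref_id, value in reference_values.items():
        weight = closeness.get(ref_id, 0.0)
        if value is None or weight <= 0:
            continue  # skip reference users with no value or no closeness
        weighted_sum += weight * value
        total_weight += weight
    return weighted_sum / total_weight if total_weight > 0 else None
```

For instance, with reference values of 600.0 and 400.0 and closeness scores of 0.75 and 0.25, the generated value is 550.0, so the closer reference user dominates the result.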

In steps S206 and S207, the user score is estimated and output. The attribute complementing unit 27 adds the attribute data generated in step S205 to the attribute data group held in advance for the target user (for example, acquired from the service providing system 5), and uses the result as the attribute data group of that user (step S206). The user score estimation unit 28 then inputs the attribute data group, including the attribute data complemented in step S206, into the user score estimation model and acquires the output value as the user score to be set for the user (step S207). However, the method of estimating the user score is not limited to the example in this embodiment. For example, the user score may include a value calculated by inputting the attribute data group into a predetermined function that is not a machine learning model. The process shown in this flowchart then ends.
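Steps S206 and S207 amount to merging the generated attribute data into the stored attribute data group and passing the result to a scoring model. In the sketch below the model is any callable; `toy_score_function` stands in for it as a predetermined non-ML function (which the description permits), and its coefficients and the attribute names are invented purely for illustration.

```python
def complement_attribute_group(existing, generated):
    """S206: add the generated attribute data to the stored attribute data group."""
    merged = dict(existing)   # copy so the stored group is left untouched
    merged.update(generated)
    return merged


def estimate_user_score(attribute_group, score_model):
    """S207: feed the complemented attribute data group to the scoring model."""
    return score_model(attribute_group)


def toy_score_function(attrs):
    """Hypothetical stand-in for the user score estimation model."""
    return 300.0 + 2.0 * attrs["age"] + 0.5 * attrs["household_income_index"]
```

Usage: `estimate_user_score(complement_attribute_group({"age": 30.0}, {"household_income_index": 550.0}), toy_score_function)` evaluates the stand-in function on the merged group. A trained model's prediction method could be passed in place of the toy function.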

The user score set for each user is provided to other systems, such as the service providing system 5, and is used by those systems, for example, to customize the services provided to the target user.

This embodiment can also be used to estimate a user score for a new target user whose corresponding node data 50 is not yet included in the graph data. For example, based on the user attribute data of the new target user, node data 50 corresponding to the target user and at least one link data 52 connected to that node data 50 may be generated. A user connected by the link data 52 to the node data 50 corresponding to the target user may then be identified as a reference user of the target user.

<Effects>
According to this embodiment, a user's missing attributes are complemented from a social graph network that covers the relationships between users, and the user score is estimated or determined using the complemented attribute group. As a result, even when information about the target user is missing or of low reliability, a user score can be calculated, or the accuracy of the calculated user score can be improved. In addition, by using a variety of user attribute data, a highly accurate user score can be calculated even when a certain range of attribute data (for example, that of a credit card division) cannot be used because of terms of service, laws, or the like, or when some attribute data does not exist for the target user.

<Variations>
The embodiment described above is an example of an information processing device including a graph data generation unit 21, a reference user identification unit 22, a relationship identification unit 23, a relationship strength determination unit 24, an attribute selection unit 25, an attribute generation unit 26, an attribute complementing unit 27, a user score estimation unit 28, and a machine learning unit 29. Some of these functional units may be omitted to the extent that the invention according to the present disclosure can still be implemented.

For example, in the embodiment described above, the relationship strength (closeness score) between the target user and the reference user is generated and referenced when generating the attribute data to be complemented, but the generation and referencing of the closeness score may be omitted. In this case, of the functional units of the information processing device 1 described with reference to FIG. 2, the relationship strength determination unit 24 may be omitted. The attribute generation unit 26 may then generate the attribute data to be complemented for the target user based on the attribute data of the reference user, without performing weighting or the like with reference to the closeness score.

Also, for example, the attribute generation unit 26 may generate the attribute data of the target user using an attribute generation model whose input values are at least some parameters of the reference user's attribute data group and the closeness score between the target user and the reference user, and whose output value is the attribute data of the target user to be complemented. In this case, the attribute generation model is trained in advance as appropriate for the form of its input and output values.

Also, for example, the attribute generation unit 26 may generate the attribute data of the target user using an attribute generation model whose input values are at least some parameters of the target user's attribute data group and/or at least some parameters of the reference user's attribute data group, and whose output value is the attribute data of the target user to be complemented. In this case, the attribute generation model is trained in advance as appropriate for the form of its input and output values. The attribute generation unit 26 may also select, from among multiple attribute generation models that differ by relationship and/or closeness score between the target user and the reference user, a model according to the type of relationship and/or the closeness score between the target user being processed and that user's reference user, and use it to generate the attribute data of the target user to be complemented. Here, each of the multiple attribute generation models may, for example, be trained in advance on training data whose relationship type and/or closeness score is common or similar (within a predetermined range).
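Choosing among multiple attribute generation models by relationship type and closeness score can be sketched as a keyed lookup. Everything concrete below is an assumption: the two closeness bands, the 0.5 boundary, the relationship names, and the lambdas standing in for trained models.

```python
def closeness_band(closeness, boundary=0.5):
    """Map a closeness score to a coarse band (hypothetical two-band scheme)."""
    return "high" if closeness >= boundary else "low"


def select_attribute_model(models, relationship, closeness):
    """Pick the attribute generation model registered for this relationship type and band."""
    key = (relationship, closeness_band(closeness))
    if key not in models:
        raise KeyError(f"no attribute generation model registered for {key}")
    return models[key]


# Stand-ins for models trained on data with a common relationship type and closeness range.
MODELS = {
    ("spouse", "high"): lambda reference_value: reference_value,
    ("spouse", "low"): lambda reference_value: 0.5 * reference_value,
    ("coworker", "high"): lambda reference_value: 0.25 * reference_value,
}
```

Raising an error for an unregistered combination is one design choice; falling back to a general-purpose model would be another.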

Also, for example, the attribute generation unit 26 may generate the attribute data of the target user using an attribute generation model whose input values include, as at least some parameters of the attribute data group of a user (target user or reference user), the embedding (vector representation, feature representation) of that user on the graph data, and whose output value is the attribute data of the target user to be complemented. The input values of the attribute generation model may also include the distance, inner product, or the like between the target user and the reference user on the graph data (that is, in a vector space based on the graph data). In this case, the attribute generation model is trained in advance as appropriate for the form of its input and output values.
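To make the embedding-based inputs concrete, the sketch below computes the Euclidean distance and inner product between two user embeddings and concatenates them with the embeddings themselves into one input vector. How the embeddings are obtained from the graph data is not specified here, so the vectors are given as plain lists, and the feature layout is an assumption.

```python
import math


def inner_product(u, v):
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u, v):
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def build_model_input(target_embedding, reference_embedding):
    """Concatenate both embeddings with their distance and inner product."""
    return (
        list(target_embedding)
        + list(reference_embedding)
        + [
            euclidean_distance(target_embedding, reference_embedding),
            inner_product(target_embedding, reference_embedding),
        ]
    )
```

A small distance or a large inner product would indicate that the two users sit close together in the embedding space, which a trained attribute generation model could exploit in a way similar to the closeness score.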

Also, for example, when attribute data output by the attribute generation model corresponds to a missing value (missing attribute data) or an invalid value (low-reliability attribute data) in the target user's attribute data group before complementation, the attribute complementing unit 27 may adopt the output attribute data as part of the target user's attribute data group.
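The condition above can be sketched as a guarded merge: a generated value is adopted only where the stored value is missing or has been flagged as unreliable. Representing missing values as `None` and passing the unreliable attribute names as a set are assumptions of this sketch; how unreliability is detected is outside its scope.

```python
def complement_missing_or_unreliable(existing, generated, unreliable=frozenset()):
    """Adopt generated values only for missing or low-reliability attributes."""
    result = dict(existing)
    for name, value in generated.items():
        is_missing = existing.get(name) is None
        if is_missing or name in unreliable:
            result[name] = value  # replace the gap or the untrusted value
    return result
```

Attributes that are present and trusted are left untouched even if the model also produced a value for them.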

Also, for example, the attribute selection unit 25 or the attribute complementing unit 27 may treat, as attribute data to be complemented, attribute data that has a high weight in an ensemble learning model such as the gradient boosting decision trees adopted as the user score estimation model. Here, attribute data with a high weight may be, for example, attribute data corresponding to a tree whose weight exceeds a predetermined weight in the user score estimation model, or attribute data corresponding to a tree whose weight ranks high (at or above a predetermined rank) in the user score estimation model.
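Both selection rules mentioned above (a weight above a predetermined value, or a rank at or above a predetermined position) can be sketched with one helper. The sketch assumes that a per-attribute weight has already been derived from the user score estimation model; how that weight is extracted from the trees is not shown, and the attribute names and numbers are hypothetical.

```python
def select_high_weight_attributes(weights, min_weight=None, top_k=None):
    """Return attribute names ordered by descending weight, filtered by
    a minimum weight and/or limited to the top_k highest-ranked entries."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    if min_weight is not None:
        ranked = [(name, w) for name, w in ranked if w > min_weight]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]


# Hypothetical per-attribute weights derived from a trained model.
WEIGHTS = {"age": 0.4, "income_index": 0.35, "postal_code": 0.05, "household_size": 0.2}
```

Restricting complementation to high-weight attributes concentrates the effort on the attributes that most affect the estimated user score.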

1 Information processing device

Claims

1. An information processing system comprising:
a reference user identification means for identifying a reference user having a relationship with a target user based on graph data indicating relationships between users;
an attribute generating means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementing means for complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimation means for estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

2. The information processing system according to claim 1, further comprising a graph data generating means for generating the graph data by identifying pairs of users who are related to each other based on a group of attribute data of each of the users.

3. An information processing system comprising:
a relationship identification means for identifying a relationship between users based on a result of clustering that uses, as a value associated with the relationship between users, at least one of the users' names, IP addresses, addresses, credit card numbers, ages, sexes, schools, places of employment, and places of stay;
a reference user identification means for identifying a reference user having a relationship with a target user;
an attribute generating means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementing means for complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimation means for estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

4. The information processing system according to claim 3, further comprising a relationship strength determination means for determining a relationship strength indicating the closeness between the target user and the reference user, based on an index indicating the strength of the relationship between the target user and the reference user, in accordance with a judgment criterion corresponding to the relationship between the target user and the reference user,
wherein the attribute generating means generates, for at least one of the reference users, corresponding attribute data of the target user based on information about the reference user and the relationship strength determined for the reference user.

5. The information processing system according to claim 4, wherein the relationship strength determination means determines the relationship strength indicating the closeness between the target user and the reference user based on an output obtained when data representing the index is input to a trained machine learning model corresponding to the relationship between the target user and the reference user.

6. An information processing system comprising:
a reference user identification means for identifying a reference user having a relationship with a target user;
an attribute selection means for selecting a type of attribute data to be complemented for the target user according to a type of relationship between the target user and the reference user;
an attribute generating means for generating attribute data corresponding to the target user based on attribute data of the type selected by the attribute selection means from among the attribute data group of the reference user identified for the target user;
an attribute complementing means for complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimation means for estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

7. The information processing system according to claim 1, wherein the user score estimation means estimates the user score to be set for the target user by inputting the group of attribute data of the target user into a machine learning model.

8. The information processing system according to claim 7, wherein the user score estimation means estimates the user score using a machine learning model generated using a machine learning framework based on gradient boosting decision trees.

9. The information processing system according to claim 7 or 8, wherein the user score estimation means estimates the user score to be set for the target user by using the machine learning model generated using training data in which a group of attribute data including demographic attributes of a user is an input value and the user score related to the user is an output value.

10. The information processing system according to any one of claims 1 to 9, wherein the attribute complementing means generates, based on the attribute data of the reference user, attribute data for complementing missing attribute data or low-reliability attribute data in the attribute data group of the target user.

11. A method in which a computer executes:
a reference user identifying step of identifying a reference user having a relationship with a target user based on graph data indicating relationships between users;
an attribute generating step of generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementation step of complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimating step of estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

12. A method in which a computer executes:
a relationship identification step of identifying a relationship between users based on a result of clustering that uses, as a value associated with the relationship between users, at least one of the users' names, IP addresses, addresses, credit card numbers, ages, sexes, schools, places of employment, and places of stay;
a reference user identification step of identifying a reference user having a relationship with a target user;
an attribute generating step of generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementation step of complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimating step of estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

13. A method in which a computer executes:
a reference user identification step of identifying a reference user having a relationship with a target user;
an attribute selection step of selecting a type of attribute data to be complemented for the target user according to a type of relationship between the target user and the reference user;
an attribute generating step of generating corresponding attribute data of the target user based on attribute data of the type selected in the attribute selection step from among the attribute data group of the reference user identified for the target user;
an attribute complementation step of complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimating step of estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

14. A program for causing a computer to function as:
a reference user identification means for identifying a reference user having a relationship with a target user based on graph data indicating relationships between users;
an attribute generating means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementing means for complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimation means for estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

15. A program for causing a computer to function as:
a relationship identification means for identifying a relationship between users based on a result of clustering that uses, as a value associated with the relationship between users, at least one of the users' names, IP addresses, addresses, credit card numbers, ages, sexes, schools, places of employment, and places of stay;
a reference user identification means for identifying a reference user having a relationship with a target user;
an attribute generating means for generating corresponding attribute data of the target user based on attribute data of the reference user identified for the target user;
an attribute complementing means for complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimation means for estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.

16. A program for causing a computer to function as:
a reference user identification means for identifying a reference user having a relationship with a target user;
an attribute selection means for selecting a type of attribute data to be complemented for the target user according to a type of relationship between the target user and the reference user;
an attribute generating means for generating attribute data corresponding to the target user based on attribute data of the type selected by the attribute selection means from among the attribute data group of the reference user identified for the target user;
an attribute complementing means for complementing a group of attribute data corresponding to the target user based on at least a part of the generated attribute data corresponding to the target user; and
a user score estimation means for estimating a user score to be set for the target user based on the complemented attribute data group corresponding to the target user.