JP2012256239A

JP2012256239A - Destination prediction system and program

Info

Publication number: JP2012256239A
Application number: JP2011129468A
Authority: JP
Inventors: Tatsuya Iwase; 竜也岩瀬; Noriyoshi Suzuki; 徳祥鈴木
Original assignee: Toyota Central R&D Labs Inc
Current assignee: Toyota Central R&D Labs Inc
Priority date: 2011-06-09
Filing date: 2011-06-09
Publication date: 2012-12-27

Abstract

PROBLEM TO BE SOLVED: To achieve accurate prediction of a destination even in the case of non-regular movement.

SOLUTION: A data collection part 50 collects access data and movement data for each user in chronological order. A content extraction part 60 extracts each place name from a text on the website, with respect to each of the collected access data, and extracts a place corresponding to the place name. A staying place extraction part 56 extracts staying place data for each user on the basis of the movement data collected in chronological order. A correlation calculation part 62 calculates a correlation for each user upon associating access data for each user with each staying place data extracted from a movement history. A prior probability learning part 64 learns a prior probability of going to a place shown by each staying place data. A destination prediction part 70 calculates a probability of going to the place shown by each staying place data and predicts a destination on the basis of the calculated probability, with respect to a user subject to the prediction.

Description

The present invention relates to a destination prediction apparatus and program, and more particularly to a destination prediction apparatus and program that predict a user's destination from the user's document data browsing history and position information history.

Systems that estimate a user's interests from Internet browsing history are known (for example, Patent Document 1). Systems that generate and predict behavior patterns from browsing history and movement history and provide information accordingly are also known (Patent Document 2).

Patent Document 1: JP 2009-232415 A; Patent Document 2: JP 2008-311212 A

However, the system described in Patent Document 1 merely enumerates and recommends places likely to interest the user, and cannot predict a destination. In addition, because it uses only Internet history and no movement history data, its estimation accuracy is low.

The technique described in Patent Document 2 predicts point-to-point movement based on movement distance, and can therefore predict destinations only for regular movement.

The present invention has been made to solve the above problems, and its object is to provide a destination prediction apparatus and program capable of accurately predicting a destination even when the movement is not regular.

To achieve the above object, a destination prediction apparatus according to the present invention includes: history collection means for collecting a browsing history of document data and a history of position information for a plurality of users; classification means for classifying each browsed document data item in the collected browsing history according to the text in that document data; extraction means for extracting stay location data for each user based on the collected history of position information; correlation calculation means for associating, for each user, each document data item in the user's browsing history with each stay location data item extracted from the history of position information, and calculating a correlation for each combination of document data and stay location data; and destination prediction means for calculating, for a prediction target user and based on the calculated correlations, the probability of going to the place indicated by each stay location data item, and predicting a destination based on the calculated probabilities.

A program according to the present invention causes a computer to function as: history collection means for collecting a browsing history of document data and a history of position information for a plurality of users; classification means for classifying each browsed document data item in the collected browsing history according to the text in that document data; extraction means for extracting stay location data for each user based on the collected history of position information; correlation calculation means for associating, for each user, each document data item in the user's browsing history with each stay location data item extracted from the history of position information, and calculating a correlation for each combination of document data and stay location data; and destination prediction means for calculating, for a prediction target user and based on the calculated correlations, the probability of going to the place indicated by each stay location data item, and predicting a destination based on the calculated probabilities.

According to the present invention, the history collection means collects a browsing history of document data and a history of position information for a plurality of users. The classification means classifies each browsed document data item in the collected browsing history according to the text in that document data. The extraction means extracts stay location data for each user based on the collected history of position information.

The correlation calculation means then associates, for each user, each document data item in the user's browsing history with each stay location data item extracted from the history of position information, and calculates a correlation for each combination of document data and stay location data. The destination prediction means calculates, for the prediction target user and based on the calculated correlations, the probability of going to the place indicated by each stay location data item, and predicts the destination based on the calculated probabilities.

In this way, a correlation is calculated for each user by associating each document data item in the browsing history with each stay location data item extracted from the history of position information, and the probability that the prediction target user will go to the place indicated by each stay location data item is calculated from those correlations. As a result, the destination can be predicted accurately even when the movement is not regular.

The destination prediction apparatus according to the present invention may further include probability calculation means for calculating, based on the correlations calculated by the correlation calculation means and for each combination of a document data classification and a stay location data item, a prior probability of going to the place indicated by the stay location data when document data belonging to that classification has been browsed. In that case, the destination prediction means may calculate, for the prediction target user, the probability that each stay location is the destination, based on the correlations calculated for the prediction target user and the prior probabilities calculated by the probability calculation means, and predict the destination accordingly.

The correlation calculation means according to the present invention may associate each document data item in the user's browsing history with each stay location data item extracted from the history of position information, and calculate, for each combination of document data and stay location data, a correlation that depends on the time difference between the browsing time of the document data and the stay time of the stay location data.

The correlation calculation means may calculate the correlation for each combination of document data and stay location data based on the time difference between the browsing time of the document data and the stay time of the stay location data, a browsing probability that depends on how long the document data was browsed, and a stay probability that depends on the length of the stay in the stay location data.

The classification means may, for each document data item in the collected browsing history, extract a place name or a product name from the text in the document data, and assign, as the classification of the document data, location data indicating the extracted place name or location data of a store that sells the product with the extracted product name.

Alternatively, the classification means may, for each document data item in the collected browsing history, extract words from the text in the document data, cluster the sets of extracted words, and assign, as the classification of the document data, the cluster to which the set of words extracted from that document data belongs.

The program of the present invention can also be provided stored in a storage medium.

As described above, the destination prediction apparatus and program of the present invention calculate a correlation for each user by associating each document data item in the browsing history with each stay location data item extracted from the history of position information, and calculate from those correlations the probability that the prediction target user will go to the place indicated by each stay location data item. This provides the effect that the destination can be predicted accurately even when the movement is not regular.

FIG. 1 is a block diagram showing the destination prediction system according to the first embodiment of the present invention.
FIG. 2 is a block diagram showing the portable terminal according to the first embodiment of the present invention.
FIG. 3 is a block diagram showing the processing server according to the first embodiment of the present invention.
FIG. 4 is a diagram for explaining the process of calculating a correlation.
FIG. 5 is a flowchart showing the correlation learning processing routine in the processing server according to the first embodiment of the present invention.
FIG. 6 is a flowchart showing the destination prediction processing routine in the processing server according to the first embodiment of the present invention.

Preferred embodiments of the present invention are described below with reference to the drawings.

As shown in FIG. 1, the destination prediction system 10 according to the first embodiment includes a client PC 12 with which a user browses websites; a portable terminal 14 with which the user browses websites and which measures position information; a mobile-body terminal 16 that is mounted on a mobile body (for example, a vehicle) and measures position information; and a processing server 18 that performs processing to predict the user's destination. The client PC 12, the portable terminal 14, and the mobile-body terminal 16 are each connected to the processing server 18 via the Internet 20.

As shown in FIG. 2, the portable terminal 14 includes a position measurement unit 22 that measures the position of the terminal; an operation unit 24 with which the user operates the terminal; a computer 26 that controls access to websites and generates browsing data and movement data; a display unit 28 that displays the accessed website to the user; and a communication unit 30 that accesses websites and transmits the browsing data and movement data.

The position measurement unit 22 is configured using, for example, a GPS sensor or an inertial sensor, and measures the current position of the terminal.

The communication unit 30 accesses websites via the Internet 20 by wireless communication, and transmits the browsing data and movement data generated by the computer 26 to the processing server 18. The communication unit 30 also transmits a destination prediction request, described later, to the processing server 18 by wireless communication.

The computer 26 includes a CPU, a RAM, and a ROM, and is functionally configured as follows. The computer 26 includes a browsing control unit 32 that accesses a website in response to the user's operation of the operation unit 24 and causes the display unit 28 to display the website screen; a browsing data generation unit 34 that generates browsing data about the website displayed by the browsing control unit 32 and outputs it to the communication unit 30; and a movement data generation unit 36 that generates movement data indicating the terminal position measured by the position measurement unit 22 and outputs it to the communication unit 30.

When the user operates the operation unit 24 to enter a user ID and password into the computer 26, the entered user ID and password are transmitted to the processing server 18. The portable terminal 14 is thereby logged in to the processing server 18.

The browsing data generation unit 34 generates browsing data consisting of the following items: the URL of the website displayed by the browsing control unit 32, the HTML text of that website, the time at which the website was accessed (browsing time), the site display time, the user ID, the terminal ID preset in the portable terminal 14, and the terminal type.

The movement data generation unit 36 generates movement data consisting of the following items: the position (latitude, longitude, and altitude) measured by the position measurement unit 22, the time at which the position was measured (position time), the user ID, the terminal ID preset in the portable terminal 14, and the terminal type.
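As an illustration, the two record types could be represented as below. This is a minimal sketch: the class names, field names, and example values are hypothetical and were chosen only to mirror the items listed above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BrowsingRecord:
    """One browsing data item (field names are illustrative assumptions)."""
    url: str
    html_text: str
    viewed_at: datetime       # browsing time
    display_seconds: float    # site display time
    user_id: str
    terminal_id: str
    terminal_type: str


@dataclass(frozen=True)
class MovementRecord:
    """One movement data item (field names are illustrative assumptions)."""
    latitude: float
    longitude: float
    altitude: float
    measured_at: datetime     # position time
    user_id: str
    terminal_id: str
    terminal_type: str


# Example values are made up for illustration.
browse = BrowsingRecord(
    url="http://example.com/spot",
    html_text="<html>...</html>",
    viewed_at=datetime(2011, 6, 9, 10, 0),
    display_seconds=42.0,
    user_id="user-1",
    terminal_id="term-1",
    terminal_type="portable",
)
move = MovementRecord(35.0, 137.0, 10.0, datetime(2011, 6, 9, 12, 0),
                      "user-1", "term-1", "portable")
```

Keeping the user ID on both record types is what later allows browsing data and movement data from different terminals to be joined per user.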

The client PC 12 has components equivalent to the operation unit 24, the display unit 28, the communication unit 30, the browsing control unit 32, and the browsing data generation unit 34. It accesses a website in response to the user's operation, displays the website screen to the user, generates browsing data about the displayed website, and transmits the browsing data to the processing server 18.

The mobile-body terminal 16 is, for example, a car navigation system, and has components equivalent to the position measurement unit 22, the operation unit 24, the communication unit 30, and the movement data generation unit 36. The mobile-body terminal 16 measures its own position, generates movement data indicating the measured position, and transmits the movement data to the processing server 18.

As shown in FIG. 3, the processing server 18 includes a communication unit 40 that receives browsing data, movement data, and destination prediction requests, and a computer 42 that collects and stores the browsing data and movement data and predicts the destination of the prediction target user.

The communication unit 40 receives browsing data, movement data, and destination prediction requests from the client PC 12, the portable terminal 14, and the mobile-body terminal 16 via the Internet 20.

The computer 42 includes a CPU, a RAM, and a ROM storing the correlation learning processing routine and the destination prediction processing routine described later, and is functionally configured as follows. The computer 42 includes a data collection unit 50 that collects the received browsing data and movement data; a log database 52 that stores the collected browsing data and movement data for each user as a browsing history and a movement history; a movement data acquisition unit 54 that acquires the movement history of each user from the log database 52; a stay location extraction unit 56 that extracts stay locations based on the acquired movement history; a browsing data acquisition unit 58 that acquires the browsing history of each user from the log database 52; a content extraction unit 60 that extracts a location based on the HTML text of each browsing data item in the acquired browsing history; a correlation calculation unit 62 that calculates, for each user, a correlation for each combination of browsing data and movement data; a prior probability learning unit 64 that learns, based on the calculated correlations, a prior probability indicating the probability of going to a stay location when a website from which a given location is extracted has been browsed; and a learning result database 66 that stores the calculated correlations and the learned prior probabilities. The content extraction unit 60 is an example of the classification means, and the prior probability learning unit 64 is an example of the probability calculation means.

The data collection unit 50 collects the browsing data and movement data received by the communication unit 40 and, for each user ID, stores the time series of browsing data in the log database 52 as the browsing history and the time series of movement data in the log database 52 as the movement history.
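The per-user, chronological storage described above can be sketched with an in-memory structure. This is only an illustrative stand-in for the log database 52; the class and method names are hypothetical, and a real system would presumably use persistent storage.

```python
from collections import defaultdict


class LogDatabase:
    """In-memory stand-in for log database 52 (hypothetical structure)."""

    def __init__(self):
        self._browsing = defaultdict(list)   # user_id -> [(timestamp, record)]
        self._movement = defaultdict(list)   # user_id -> [(timestamp, record)]

    def add_browsing(self, user_id, timestamp, record):
        self._browsing[user_id].append((timestamp, record))

    def add_movement(self, user_id, timestamp, record):
        self._movement[user_id].append((timestamp, record))

    def browsing_history(self, user_id):
        # Return the user's browsing data as a time series (oldest first).
        return sorted(self._browsing.get(user_id, []), key=lambda item: item[0])

    def movement_history(self, user_id):
        # Return the user's movement data as a time series (oldest first).
        return sorted(self._movement.get(user_id, []), key=lambda item: item[0])


db = LogDatabase()
db.add_movement("user-1", 20, {"lat": 35.1, "lon": 137.0})
db.add_movement("user-1", 10, {"lat": 35.0, "lon": 137.0})
db.add_browsing("user-1", 5, {"url": "http://example.com/spot"})
```

Sorting on read rather than on insert keeps the sketch short and tolerates data arriving out of order from several terminals.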

The stay location extraction unit 56 extracts stay locations for each user ID from the time series of movement data acquired by the movement data acquisition unit 54, as described below.

First, the time series of movement data is searched for portions in which the positions indicated by the movement data are densely clustered. For example, portions in which the speed derived from the positions indicated by the movement data is at or below a fixed value are extracted as dense portions. Then, for each dense portion, a time window T (start time t1 to end time t2) containing the dense portion is set as an initial value.

Next, for each time window T, the start time t1 is shifted so as to maximize the stay probability P(j) expressed by equation (1).

Here, α1 is a predetermined parameter.

Likewise, for each time window T, the end time t2 is shifted so as to maximize the stay probability P(j) given by equation (1).

If time windows T overlap, the overlapping time windows are merged into one. This processing determines the time windows T (t1 to t2) that are candidates for stay location extraction.
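The dense-portion search and the merging of overlapping windows can be sketched as follows. The sketch assumes planar coordinates in meters and timestamps in seconds, uses a simple speed threshold between consecutive samples, and omits the step that shifts t1 and t2 to maximize P(j), because equation (1) is not reproduced in this text. The function names and the threshold value are hypothetical.

```python
import math


def dense_windows(samples, max_speed):
    """Return (t1, t2) windows where consecutive-sample speed <= max_speed.

    samples: list of (t, x, y) tuples sorted by time t.
    """
    windows = []
    start = end = None
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed = math.hypot(x1 - x0, y1 - y0) / dt
        if speed <= max_speed:
            if start is None:
                start = t0
            end = t1
        elif start is not None:
            windows.append((start, end))
            start = None
    if start is not None:
        windows.append((start, end))
    return windows


def merge_windows(windows):
    """Merge time windows that overlap into single windows."""
    merged = []
    for t1, t2 in sorted(windows):
        if merged and t1 <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], t2))
        else:
            merged.append((t1, t2))
    return merged


samples = [(0, 0, 0), (10, 1, 0), (20, 1, 1), (30, 500, 0), (40, 501, 0)]
candidates = dense_windows(samples, max_speed=0.5)
```

Here the jump between t = 20 and t = 30 splits the trace into two candidate windows, and `merge_windows` would fuse any candidates whose intervals overlap after adjustment.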

Then, for each time window T, the stay probability P(j) is calculated from the movement data within the window according to equation (1). In equation (1), the smaller the position variance V, the larger the stay probability P(j), and the longer the width of the time window T (t2 − t1), the larger the stay probability P(j).

Next, each time window T whose stay probability P(j) exceeds a threshold β1 is selected as a target for stay location extraction. The mean coordinates of the movement data in the selected time window T are calculated to identify the place, which is then classified into the nearest of the location data g prepared in advance. The location data g is common to all users.

In this way, location data g is extracted as a stay location from the time series of movement data for each user ID.
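A minimal sketch of the thresholding and nearest-place classification is given below. Equation (1) is not reproduced in this text, so the form of `stay_probability` is purely an assumption that only reproduces the stated properties (larger for smaller position variance V, larger for longer windows). Planar coordinates are assumed, and all names and parameter values are hypothetical.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def position_variance(points):
    cx, cy = centroid(points)
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points)


def stay_probability(points, t1, t2, alpha1):
    # ASSUMED form, not the patent's equation (1): grows with window
    # length (t2 - t1) and shrinks as the position variance V grows.
    v = position_variance(points)
    return (1.0 - math.exp(-alpha1 * (t2 - t1))) / (1.0 + v)


def nearest_place(point, places):
    """places: dict mapping place ID -> (x, y), shared by all users."""
    return min(places, key=lambda pid: math.hypot(places[pid][0] - point[0],
                                                  places[pid][1] - point[1]))


def classify_stay(points, t1, t2, alpha1, beta1, places):
    """Return the nearest place ID, or None if P(j) does not exceed beta1."""
    if stay_probability(points, t1, t2, alpha1) <= beta1:
        return None
    return nearest_place(centroid(points), places)


places = {"home": (0.0, 0.5), "office": (10.0, 10.0)}
tight = [(0.0, 0.0), (0.1, 0.0)]
loose = [(0.0, 0.0), (10.0, 0.0)]
```

With latitude and longitude instead of planar coordinates, the Euclidean distance would need to be replaced by a geodesic distance.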

The content extraction unit 60 extracts, for each user ID and from the time series of browsing data acquired by the browsing data acquisition unit 58, the place names contained in the browsed websites as described below, and assigns the location ID corresponding to each place name as the content f of the website. A website is an example of document data, and a location ID is an example of a document data classification.

First, words matching the place names and product names contained in dictionary data prepared in advance are extracted from the time series of browsing data. When a product name is extracted, nearby stores are searched for using the product name, and the place names in the search results are extracted. If multiple place names are extracted and the result is ambiguous, all of the place names are listed.

Then, using a place name dictionary prepared in advance, which holds pairs of place names and latitude/longitude values, location data (latitude/longitude and location ID) is obtained for each place name extracted from the browsing data, and the browsing data is linked to the location data. In addition, the browsing probability P(i) is calculated according to equation (2).

Here, α0 is a predetermined parameter, and T is the display time of the website. According to equation (2), the shorter the site display time T, the smaller the browsing probability.

The place name dictionary can also be used in the reverse direction, to obtain the place name nearest to given location data (latitude/longitude).
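The dictionary lookup and the browsing probability can be sketched as follows. The gazetteer contents and function names are hypothetical, matching is done by plain substring search for brevity, and the form of `browsing_probability` is an assumption that only reproduces the stated behavior of equation (2) (shorter display time gives a smaller probability), since the equation itself is not reproduced in this text.

```python
import math


def extract_places(text, gazetteer):
    """Return location data for every dictionary place name found in text.

    gazetteer: dict mapping place name -> (location_id, latitude, longitude).
    All matches are returned, so ambiguous pages list every candidate.
    """
    return [entry for name, entry in gazetteer.items() if name in text]


def browsing_probability(display_seconds, alpha0):
    # ASSUMED form, not the patent's equation (2): rises from 0 toward 1
    # as the site display time T increases.
    return 1.0 - math.exp(-alpha0 * max(display_seconds, 0.0))


# Illustrative dictionary entries; IDs and coordinates are made up.
gazetteer = {
    "Nagoya": ("p1", 35.18, 136.91),
    "Kyoto": ("p2", 35.01, 135.77),
}
found = extract_places("Weekend events in Nagoya", gazetteer)
```

Japanese text would normally need morphological analysis rather than substring search, which is outside the scope of this sketch.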

As shown in FIG. 4, the correlation calculation unit 62 associates, for each user ID, each browsing data item i in the browsing history with each stay location data item j extracted from the movement history, and calculates the correlation w^k_{i,j} between browsing data i and stay location data j for user k according to equation (3).

Here, α2 is a predetermined parameter, d is the distance between the location data linked to browsing data i and the stay location data (location data), and T is the time interval between the time t_j of the stay location data and the browsing time t_i of the browsing data (T = t_j − t_i).

According to equation (3), the shorter the distance d between the places, the larger the correlation w^k_{i,j}, and the shorter the time interval T, the larger the correlation w^k_{i,j}. When the time of the stay location data is earlier than the browsing time of the browsing data, the correlation is 0.

Instead of equation (3), equation (4) may be used to calculate a correlation that takes the browsing probability P(i) and the stay probability P(j) into account.

Equation (3) is only one example. In the simplest case, w^k_{i,j} = 1 may be used whenever T ≥ 0.
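The correlation can be sketched as below. Equations (3) and (4) are not reproduced in this text, so the exponential form and the product with P(i) and P(j) are assumptions that only reproduce the stated properties: the value grows as d and T shrink and is 0 when the stay precedes the browsing. The function names are hypothetical, and d and T are assumed to be pre-scaled to comparable units.

```python
import math


def correlation(d, t_view, t_stay, alpha2):
    # ASSUMED form, not the patent's equation (3).
    interval = t_stay - t_view          # T = t_j - t_i
    if interval < 0:
        return 0.0                      # stay happened before the browsing
    return math.exp(-alpha2 * (d + interval))


def weighted_correlation(d, t_view, t_stay, alpha2, p_view, p_stay):
    # ASSUMED variant in the spirit of equation (4): scale by the browsing
    # probability P(i) and the stay probability P(j).
    return correlation(d, t_view, t_stay, alpha2) * p_view * p_stay


def simplest_correlation(t_view, t_stay):
    # Simplest case described in the text: 1 whenever T >= 0.
    # Returning 0 for T < 0 follows the rule stated for equation (3).
    return 1.0 if t_stay - t_view >= 0 else 0.0
```

For example, `correlation(0.0, 100.0, 100.0, 0.1)` gives 1.0 (same place, no delay), while any call with `t_stay < t_view` gives 0.0.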

Based on the correlations between browsing data i and stay location data j calculated for each user, the prior probability learning unit 64 obtains, as described below, the prior probability of going to a stay location at place g when a website whose browsing data yields place f has been browsed. This prior probability is assumed to be common to all users.

First, based on the correlations calculated for each user, the aggregate result W^k_f, which totals the correlations relating to browsing data place f for user k, and the aggregate result W_f, which totals the correlations relating to browsing data place f, are calculated according to equations (4) and (5).

Here, w^k_{i=f,j} is the correlation, calculated for user k, between browsing data i from which place f was extracted and stay location data j.

Similarly, based on the correlations calculated for each user, the aggregate result W^k_g, which totals the correlations relating to stay location data place g for user k, and the aggregate result W_g, which totals the correlations relating to stay location data place g, are calculated according to equations (6) and (7).

Here, w^k_{i,j=g} is the correlation, calculated for user k, between browsing data i and stay location data j from which place g was extracted.

Further, based on the correlations calculated for each user, the aggregate result W^k_{f,g}, which totals the correlations relating to the combination of browsing data place f and stay location data place g for user k, and the aggregate result W_{f,g}, which totals the correlations relating to that combination, are calculated according to equations (8) and (9).

Here, w^k_{i=f,j=g} is the correlation, calculated for user k, between browsing data i from which place f was extracted and stay location data j from which place g was extracted.

In addition, the aggregate result W, which totals all of the correlations, is calculated according to equation (10).

Then, based on the aggregate results calculated as above, the prior probability P(g) of stay location data place g, the prior probability P(f) of browsing data place f, and the prior probability P(f, g) of stay location data place g together with browsing data place f are calculated according to the following equations.

Then, for each combination of a location f of the browsing data and a location g of the stay place data, the prior probability P(f, g) of going to the stay place at location g when a Web site of browsing data from which location f is extracted has been browsed is calculated according to the following formula (11).
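As a minimal sketch of how the prior probabilities might be derived from the aggregates: the formulas, including formula (11), are not reproduced in this text, so the sketch assumes normalized sums, P(f) = W_f / W, P(g) = W_g / W and a joint probability W_{f,g} / W. For "going to g when f has been browsed" it assumes the conditional ratio W_{f,g} / W_f. The function name `prior_probabilities` and the dictionary layout are hypothetical.

```python
def prior_probabilities(W_fg):
    """Derive prior probabilities from W_{f,g}, given as {(f, g): value}.

    Assumed forms (not taken from the source formulas):
      P(f)   = W_f / W        P(g)   = W_g / W
      P(f,g) = W_{f,g} / W    P(g|f) = W_{f,g} / W_f
    """
    W = sum(W_fg.values())
    if W <= 0:
        raise ValueError("the aggregated correlations must sum to a positive value")

    # Marginal aggregates W_f and W_g obtained by summing out the other index.
    W_f, W_g = {}, {}
    for (f, g), w in W_fg.items():
        W_f[f] = W_f.get(f, 0.0) + w
        W_g[g] = W_g.get(g, 0.0) + w

    P_f = {f: w / W for f, w in W_f.items()}
    P_g = {g: w / W for g, w in W_g.items()}
    P_joint = {fg: w / W for fg, w in W_fg.items()}
    P_g_given_f = {
        (f, g): w / W_f[f] for (f, g), w in W_fg.items() if W_f[f] > 0
    }
    return P_f, P_g, P_joint, P_g_given_f
```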

The aggregated correlations and the calculated prior probabilities P(f, g) are stored in the learning result database 66.

The computer 42 further includes a prediction request reception unit 68 that accepts a destination prediction request from a user received by the communication unit 40, and a destination prediction unit 70 that predicts the destination of the prediction target user based on the aggregated correlations and the calculated prior probabilities P(f, g) stored in the learning result database 66.

The communication unit 40 receives a destination prediction request including a user ID from the client PC 12, the portable terminal 14, or the mobile terminal 16.

The prediction request reception unit 68 accepts the destination prediction request received by the communication unit 40.

For the prediction target user k, the destination prediction unit 70 uses the aggregate W^k_f of the correlations relating to location f of the browsing data for user k to calculate, for each location g of the stay places prepared in advance, the probability of going to location g according to the following formula (12).

The destination prediction unit 70 identifies each location g for which the probability calculated by formula (12) above is equal to or greater than a threshold, and outputs the identified location g as the predicted destination. The communication unit 40 transmits the destination prediction result to the client PC 12, the portable terminal 14, or the mobile terminal 16 that transmitted the destination prediction request.
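The prediction and thresholding step can be sketched as follows. Formula (12) is not reproduced in this text; the sketch assumes one plausible form, in which the user's aggregates W^k_f are normalized into weights and combined with the learned probability of going to g given f. The function name `predict_destinations` and its parameters are hypothetical.

```python
def predict_destinations(W_k_f, P_g_given_f, places, threshold):
    """Return (g, probability) pairs whose probability reaches the threshold.

    Assumed form of the score (not taken from the source formula):
      P(g|k) = sum over f of P(g|f) * W^k_f / (sum over f' of W^k_f')

    W_k_f        -- {f: W^k_f} for the prediction target user k
    P_g_given_f  -- {(f, g): probability} learned beforehand
    places       -- stay place locations g prepared in advance
    """
    total = sum(W_k_f.values())
    if total <= 0:
        return []  # no usable browsing history for this user

    predicted = []
    for g in places:
        p = sum(
            P_g_given_f.get((f, g), 0.0) * w / total for f, w in W_k_f.items()
        )
        if p >= threshold:
            predicted.append((g, p))
    # Most probable destination first.
    return sorted(predicted, key=lambda item: item[1], reverse=True)
```

Because every location at or above the threshold is returned, a request may yield several predicted destinations or none, which matches the text's description of outputting each location g that meets the threshold.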

Next, the operation of the destination prediction system 10 according to the first embodiment will be described. First, when a user ID and a password are entered on the client PC 12 by a user operation, they are transmitted to the processing server 18 and the user is logged in. When a Web site is browsed on the client PC 12 by a user operation, browsing data is transmitted to the processing server 18, where it is collected.

Similarly, when a user ID and a password are entered on the portable terminal 14 by a user operation, they are transmitted to the processing server 18 and the user is logged in. When a Web site is browsed on the portable terminal 14 by a user operation, browsing data is transmitted to the processing server 18, where it is collected. The portable terminal 14 also measures its own position as needed, and each time the position is measured, movement data indicating the measured position is transmitted to the processing server 18, where it is collected.

Likewise, when a user ID and a password are entered on the mobile terminal 16 by a user operation, they are transmitted to the processing server 18 and the user is logged in. The mobile terminal 16 measures its own position as needed, and each time the position is measured, movement data indicating the measured position is transmitted to the processing server 18, where it is collected.

Then, in the processing server 18, the computer 42 executes the correlation learning processing routine shown in FIG. 5 at predetermined intervals.

In step 100, a user k to be processed is selected from among a plurality of users registered in advance. In the next step 102, the movement history (time series of movement data) of user k is read from the log database 52 based on the user ID of user k.

Then, in step 104, the stay place data (location data g) of user k is extracted based on the time series of movement data read in step 102. In the next step 106, the browsing history (time series of browsing data) of user k is read from the log database 52 based on the user ID of user k.

Then, in step 108, for each item in the time series of browsing data read in step 106, a place name or a product name is extracted as the content from the HTML text of the Web site of the browsing data, and location data f is extracted using a place name dictionary.

In the next step 110, the correlation is calculated according to formula (3) above for every combination of the browsing data of user k and the stay place data extracted in step 104. Then, in step 112, it is determined whether the processing of steps 100 to 110 has been executed for all users registered in advance. If there is a user for whom the processing of steps 100 to 110 has not been executed, the routine returns to step 100 and that user is set as the user to be processed.

On the other hand, when the processing of steps 100 to 110 has been executed for all users, in step 114 the aggregated correlations are calculated according to formulas (4) to (10) above, based on the correlations calculated for all users in step 110, and are stored in the learning result database 66.

In the next step 116, based on the aggregated correlations calculated in step 114, the prior probability P(f, g) of going to the stay place from which location data g was extracted when browsing data from which location data f was extracted has been browsed is calculated according to formula (11) above for every combination of location data f and location data g. The result is stored in the learning result database 66, and the correlation learning processing routine ends.
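The control flow of steps 100 to 116 can be outlined as below. This is only a sketch: every callable passed in (`load_moves`, `load_views`, `extract_stays`, `extract_location`, `correlate`) is a hypothetical stand-in for a unit described in the text, stay places are assumed to be dictionaries with a `"g"` key, and the final ratio W_{f,g} / W_f is an assumed form of formula (11), which is not reproduced here.

```python
def correlation_learning_routine(users, load_moves, load_views,
                                 extract_stays, extract_location, correlate):
    """Outline of steps 100-116; all callables are hypothetical stand-ins."""
    records = []  # (k, f, g, w^k_{i,j}) for every browsing/stay combination
    for k in users:                                # steps 100 and 112
        stays = extract_stays(load_moves(k))       # steps 102 and 104
        for view in load_views(k):                 # step 106
            f = extract_location(view)             # step 108
            for stay in stays:                     # step 110
                records.append((k, f, stay["g"], correlate(view, stay)))

    # Step 114: aggregate the correlations (assumed to be plain sums).
    W_fg, W_f = {}, {}
    for _, f, g, w in records:
        W_fg[(f, g)] = W_fg.get((f, g), 0.0) + w
        W_f[f] = W_f.get(f, 0.0) + w

    # Step 116: prior probability of going to g after browsing f (assumed ratio).
    prior = {(f, g): w / W_f[f] for (f, g), w in W_fg.items() if W_f[f] > 0}
    return records, W_fg, prior
```

Passing the units in as callables keeps the loop structure visible without committing to any particular stay place extraction or correlation formula.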

Because the contents of the log database 52 are updated as needed, the correlation learning processing routine is executed repeatedly at regular intervals, and the aggregated correlations and the prior probabilities stored in the learning result database 66 are updated.

Next, when a destination prediction request including a user ID is transmitted to the processing server 18 from the client PC 12, the portable terminal 14, or the mobile terminal 16 in the logged-in state, the computer 42 of the processing server 18 executes the destination prediction processing routine shown in FIG. 6.

First, in step 130, the received destination prediction request is accepted and the user to be predicted is identified. In step 132, the aggregated correlations and the prior probabilities are read from the learning result database 66.

Then, in step 134, for each location g prepared in advance as stay place data, the probability that the prediction target user will go to location g is calculated according to formula (12) above. In the next step 136, each location g for which the probability calculated in step 134 is equal to or greater than the threshold is identified, the identified location g is transmitted, as the predicted destination of the prediction target user, to the client PC 12, the portable terminal 14, or the mobile terminal 16 that transmitted the destination prediction request, and the destination prediction processing routine ends.

As described above, the destination prediction system according to the first embodiment associates, for each user, each browsing data item of the browsing history with each stay place data item extracted from the movement history, calculates a correlation for each combination, and calculates prior probabilities in advance based on the calculated correlations. By then calculating, for the prediction target user, the probability of going to the place indicated by each stay place data item, the destination can be predicted accurately even when the movement is not regular.

Further, by predicting the place a user wants to go from the Internet browsing history, the future destinations and positions of vehicles and pedestrians on the road can be predicted.

Next, a second embodiment will be described. Since the destination prediction system according to the second embodiment has the same configuration as that of the first embodiment, the same reference numerals are used and description of the configuration is omitted.

The second embodiment differs from the first embodiment in that the browsing data is clustered according to the similarity of word sets extracted from the browsed Web sites, and a cluster is assigned as the content of the browsing data.

The content extraction unit 60 of the processing server 18 according to the second embodiment first extracts the word set of the Web site from each browsing data item of all users stored in the log database 52, and clusters the browsing data based on the similarity of the word sets. Since a conventionally known method may be used for clustering based on word set similarity, its description is omitted. The clustering result is stored in memory and shared across all users.

For each user ID, the content extraction unit 60 determines, for each item in the time series of browsing data acquired by the browsing data acquisition unit 58, which cluster the word set extracted from the browsing data is classified into, and assigns the cluster ID as the content f of the browsing data.
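As an illustration of assigning cluster IDs from word set similarity, the sketch below uses Jaccard similarity with a simple greedy "leader" clustering. The text leaves the clustering method open ("a conventionally known method"), so this particular technique, the function names `jaccard` and `assign_clusters`, and the `min_similarity` parameter are all assumptions chosen for brevity.

```python
def jaccard(a, b):
    """Jaccard similarity of two word sets: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def assign_clusters(word_sets, min_similarity=0.5):
    """Assign a cluster ID to each word set (greedy leader clustering).

    A word set joins the most similar existing cluster whose leader it
    resembles with at least min_similarity; otherwise it starts a new cluster.
    """
    leaders = []      # the first word set seen for each cluster
    cluster_ids = []  # content f (cluster ID) for each browsing data item
    for words in word_sets:
        best_id, best_sim = None, min_similarity
        for cid, leader in enumerate(leaders):
            sim = jaccard(words, leader)
            if sim >= best_sim:
                best_id, best_sim = cid, sim
        if best_id is None:
            leaders.append(set(words))
            best_id = len(leaders) - 1
        cluster_ids.append(best_id)
    return cluster_ids
```

Because the leaders are kept after the pass, the same list can be reused to classify browsing data of any user, in line with the clustering result being shared across all users.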

For each user ID, the correlation calculation unit 62 associates each browsing data item i with each stay place data item j extracted from the movement history, and calculates the correlation w^k_{i,j} between browsing data i and stay place data j for user k according to the following formula (13).

Here, α2 is a predetermined parameter, and T is the time interval between the time t_j of the stay place data and the browsing time t_i of the browsing data (T = t_j − t_i).

According to formula (13) above, the shorter the time interval T, the larger the correlation w^k_{i,j}.
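Formula (13) itself is not reproduced in this text. One form that is consistent with the stated behavior (a correlation that grows as the time interval T shrinks, controlled by the parameter α2) is an exponential decay. The expression below is an assumption offered for illustration only, not the formula of the source:

```latex
w^{k}_{i,j} = \exp\left(-\alpha_2 \, T\right), \qquad T = t_j - t_i \ge 0, \quad \alpha_2 > 0
```

Under this assumed form, w^k_{i,j} equals 1 when the stay immediately follows the browsing (T = 0) and decreases monotonically toward 0 as T increases.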

Based on the correlations between browsing data i and stay place data j calculated for each user, the prior probability learning unit 64 calculates the aggregated correlations according to formulas similar to formulas (4) to (10) above, and obtains the prior probability P(f, g) of going to the stay place at location g when a Web site of browsing data to which content f (a cluster ID) is assigned has been browsed.

For the prediction target user k, the destination prediction unit 70 uses the aggregate W^k_f of the correlations relating to cluster f of the browsing data for user k to calculate, for each location g of the stay places prepared in advance, the probability of going to location g according to a formula similar to formula (12) above. The destination prediction unit 70 identifies each location g for which the calculated probability is equal to or greater than the threshold, and outputs the identified location g as the predicted destination.

The remaining configuration and operation of the destination prediction system according to the second embodiment are the same as those of the first embodiment, and their description is omitted.

In this way, the browsing data of the browsing history is classified by clustering based on its word sets, and, for the prediction target user, the probability of going to the place indicated by each stay place data item can be calculated using the correlations relating to the cluster to which each browsing data item of the browsing history belongs.

The second embodiment has been described using an example in which word sets are extracted from Web sites and the browsing data is classified based on the similarity of the word sets, but the invention is not limited to this. Feature quantities representing the content or meaning of the browsed Web sites may be extracted, and the browsing data may be classified based on the similarity of, or distance between, the feature quantities. For example, URLs may be extracted from the Web sites, and the browsing data may be classified based on the similarity of, or distance between, the URLs.
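The URL-based variant could, for instance, use a similarity like the one sketched below. The text does not define a URL similarity, so this measure (same host, then the fraction of shared leading path segments) and the function name `url_similarity` are assumptions made for illustration; `urlsplit` is from the Python standard library.

```python
from urllib.parse import urlsplit


def url_similarity(a, b):
    """Similarity in [0, 1] between two URLs (hypothetical measure).

    URLs on different hosts score 0. URLs on the same host score the
    fraction of leading path segments they share.
    """
    ua, ub = urlsplit(a), urlsplit(b)
    if ua.hostname != ub.hostname:
        return 0.0
    pa = [s for s in ua.path.split("/") if s]
    pb = [s for s in ub.path.split("/") if s]
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    longest = max(len(pa), len(pb))
    return 1.0 if longest == 0 else shared / longest
```

A corresponding distance can be taken as `1 - url_similarity(a, b)` and fed to whichever clustering method is used for the word sets.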

Further, the correlation may be calculated taking the age of the history into account. For example, before the destination prediction unit calculates P(g|k), the correlation w^k_{i,j} of the prediction target user may be updated according to the following formula.

Here, α3 is a predetermined parameter, and T is the age of the browsing (T = current time − browsing time). According to the above formula, the correlation is updated so that the older the browsing time, the smaller the correlation.
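The age-based update can be sketched as follows. The update formula is not reproduced in this text; the sketch assumes a multiplicative exponential decay, which satisfies the stated property that older browsing yields a smaller correlation. The function name `decay_correlation` and its parameters are hypothetical.

```python
import math


def decay_correlation(w, browse_time, now, alpha3):
    """Scale correlation w by exp(-alpha3 * T), with T = now - browse_time.

    The exponential form is an assumption; the text states only that the
    correlation becomes smaller as the browsing time becomes older.
    """
    T = now - browse_time
    if T < 0:
        raise ValueError("browse_time lies in the future")
    return w * math.exp(-alpha3 * T)
```

With times in seconds and, say, `alpha3 = 1e-6`, a correlation is roughly halved after about eight days, so the parameter directly sets how quickly old browsing stops influencing the prediction.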

10 Destination prediction system
12 Client PC
14 Portable terminal
16 Mobile terminal
18 Processing server
22 Position measurement unit
26, 42 Computer
28 Display unit
30, 40 Communication unit
32 Browsing control unit
34 Browsing data generation unit
36 Movement data generation unit
50 Data collection unit
52 Log database
56 Stay place extraction unit
60 Content extraction unit
62 Correlation calculation unit
64 Prior probability learning unit
66 Learning result database
70 Destination prediction unit

Claims

1. A destination prediction apparatus comprising:
history collection means for collecting, for a plurality of users, a browsing history of document data and a history of position information;
classification means for classifying, for each document data item of the collected browsing history, the browsed document data according to the text in the document data;
extraction means for extracting stay place data for each user based on the collected history of position information;
correlation calculation means for associating, for each user, each document data item of the browsing history of the user with each stay place data item extracted from the history of position information, and calculating a correlation for each combination of the document data and the stay place data; and
destination prediction means for calculating, for a prediction target user, a probability of going to the place indicated by each stay place data item based on the calculated correlations, and predicting a destination based on the calculated probabilities.

2. The destination prediction apparatus according to claim 1, further comprising probability calculation means for calculating, based on the correlations calculated by the correlation calculation means, for each combination of a classification of document data and a stay place data item, a prior probability of going to the place indicated by the stay place data when document data belonging to the classification has been browsed,
wherein the destination prediction means predicts the destination by calculating, for the prediction target user, the probability of going to the place indicated by each stay place data item based on the correlations calculated for the prediction target user and the prior probabilities calculated by the probability calculation means.

3. The destination prediction apparatus according to claim 1, wherein the correlation calculation means associates each document data item of the browsing history of the user with each stay place data item extracted from the history of position information, and calculates, for each combination of the document data and the stay place data, a correlation that depends on the time difference between the browsing time of the document data and the stay time of the stay place data.

4. The destination prediction apparatus according to claim 3, wherein the correlation calculation means calculates, for each combination of the document data and the stay place data, the correlation based on the time difference between the browsing time of the document data and the stay time of the stay place data, a browsing probability that depends on the length of the browsing time of the document data, and a stay probability that depends on the length of the stay time of the stay place data.

5. The destination prediction apparatus according to any one of claims 1 to 4, wherein, for each document data item of the collected browsing history, the classification means extracts a place name or a product name from the text in the document data, and assigns, as the classification of the document data, location data indicating the extracted place name or location data of a store that sells the extracted product.

6. The destination prediction apparatus according to any one of claims 1 to 4, wherein, for each document data item of the collected browsing history, the classification means extracts words from the text in the document data, performs clustering on the extracted word sets, and assigns, as the classification of the document data, the cluster to which the word set extracted from the document data belongs.

7. A program for causing a computer to function as:
history collection means for collecting, for a plurality of users, a browsing history of document data and a history of position information;
classification means for classifying, for each document data item of the collected browsing history, the browsed document data according to the text in the document data;
extraction means for extracting stay place data for each user based on the collected history of position information;
correlation calculation means for associating, for each user, each document data item of the browsing history of the user with each stay place data item extracted from the history of position information, and calculating a correlation for each combination of the document data and the stay place data; and
destination prediction means for calculating, for a prediction target user, a probability of going to the place indicated by each stay place data item, and predicting a destination based on the calculated probabilities.