JP2014235654A

JP2014235654A - Risk evaluation device

Info

Publication number: JP2014235654A
Application number: JP2013118091A
Authority: JP
Inventors: Toshiki Mori; Shingo Kakui; Shurei Tamura
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2013-06-04
Filing date: 2013-06-04
Publication date: 2014-12-15

Abstract

PROBLEM TO BE SOLVED: To quantify the risk of project failure in an ongoing project and track how it changes over time, thereby detecting signs of project failure early. SOLUTION: Encoded ongoing project information is generated by replacing the value of each influencing factor in the ongoing project information with the category value of the category to which that value belongs, based on the category information of the influencing factor information. A category information lift value, which indicates the degree of influence of an influencing factor on the success or failure of a project, is calculated for each timing information value and each category value of each influencing factor, based on the encoded past project information and the ongoing project information. A risk score is then calculated for each project and each timing information value, for the encoded past project information or the ongoing project information, using the timing information of each influencing factor and the category information lift value of each category.

Description

Embodiments described herein relate to a risk evaluation apparatus.

In development projects of growing scale and complexity, it is important to use various development management data (progress, cost, quality information, and so on) to detect signs of problems early and prevent project failure. In addition, with the spread of development support tools such as configuration management tools and defect management tools, it has become possible to collect and accumulate project-related development management data without placing an extra burden on developers.

The potential risks of a project are hard to see, and responses tend to come too late, so predictive management that anticipates risks is necessary. In fast-changing, highly uncertain situations, there has been a demand to quantitatively evaluate risks related to project failure, such as delivery delays, cost overruns, and insufficient quality, and to continuously monitor changes in the predicted risk.

The following methods have been proposed for risk prediction.

(1) Prediction model based on multiple regression analysis
Multiple regression analysis is one of the most commonly used methods for building prediction models in software development management. It derives the relational expression between explanatory variables (inputs) and an objective variable (output) directly by multiple regression analysis, and predicts the value of the objective variable itself from given values of the explanatory variables.

(2) Project confusion prediction system based on a questionnaire survey
This method trains a Bayes classifier on the results of a questionnaire survey conducted on past projects and, based on the results of the same questionnaire for a new project, determines whether that project will eventually fall into a confused state.

JP 2011-076411 A; JP 2010-108404 A; JP 2007-265141 A; JP 2009-187288 A; JP 2011-141674 A; JP 2008-204313 A

The conventional risk prediction methods described above have the following problems.
Risk prediction based on multiple regression analysis is easily affected by outliers and abnormal values. Building an accurate prediction model requires a large amount of accurate data, but missing values reduce the number of usable samples. In general, software development management data is less accurate than, for example, measurement data from hardware manufacturing processes, so building a prediction model by multiple regression analysis is often difficult.

In the project confusion prediction system based on a questionnaire survey, prediction is performed only once, after the questionnaire has been tallied. Updating the predicted risk is not considered, and only a success/failure determination is made. The system does not calculate a failure probability and therefore provides no quantitative evaluation. Moreover, it accepts only discrete variables as input and does not address continuous variables or how to determine their thresholds.

An object of one embodiment of the present invention is to provide a technique that quantifies the risk of project failure in an ongoing project, obtains its change over time, and detects signs of project failure early.

An object of another embodiment of the present invention is to provide a technique for calculating the probability of project failure at any point in time for an ongoing project.

An object of yet another embodiment of the present invention is to provide a technique that, by selecting optimal influencing factors and setting their category information, greatly reduces the effort of building a prediction model and improves the accuracy of project failure prediction.

A first embodiment of the present invention is proposed as a risk evaluation apparatus. This risk evaluation apparatus has a first storage means, a first processing means, a second storage means, and a second processing means.
The first storage means stores: past project information, which describes the values of a plurality of influencing factors for projects carried out in the past; project failure definition information, which describes the criterion for distinguishing successful projects from failed projects among the projects carried out in the past; and influencing factor information, which comprises category information dividing the range of values an influencing factor can take into a plurality of categories, and timing information indicating when the value of that influencing factor becomes available.

The first processing means generates encoded past project information by replacing the value of each influencing factor in the past project information with the category value of the category to which that value belongs, based on the project failure definition information and the category information of the influencing factor information. Based on the encoded past project information, it then calculates, for each category value of each influencing factor, a category information lift value, which indicates the degree of influence of the influencing factor on the success or failure of a project.

The second storage means stores ongoing project information, which describes the values of a plurality of influencing factors for an ongoing project.
The second processing means generates encoded ongoing project information by replacing the value of each influencing factor in the ongoing project information with the category value of the category to which that value belongs, based on the category information of the influencing factor information. It calculates the category information lift value, which indicates the degree of influence of an influencing factor on the success or failure of a project, for each timing information value and each category value of each influencing factor, based on the encoded past project information and the ongoing project information. It then calculates, for the encoded past project information or the ongoing project information, a risk score for each project and each timing information value, using the timing information of each influencing factor and the category information lift value of each category.

A second embodiment of the present invention is proposed as a risk evaluation apparatus. This risk evaluation apparatus further has a third processing unit. The third processing unit obtains a regression equation for the project failure probability based on the success/failure determination value, which indicates the success or failure of each project in the encoded past project information, and on the risk score of each past project for each timing information value. It then uses the obtained project failure probability regression equation to calculate the project failure probability of an ongoing project.

A third embodiment of the present invention is proposed as a risk evaluation apparatus. This risk evaluation apparatus further has a third storage means and a fourth processing means. The third storage means stores influencing factor candidate information, which is a set of candidate influencing factors and their category information. The fourth processing means generates encoded past project information from the past project information using the influencing factor candidate information, and calculates an expected category information lift value, which is the sum obtained by taking the expected value of the category information lift value (or of its reciprocal when the category information lift value is 1 or less). Based on the expected category information lift value, it selects the influencing factors and category information to include in the influencing factor information, and generates influencing factor information containing the selected influencing factors and their category information.

Functional block diagram showing a configuration example of the risk evaluation apparatus according to the first embodiment
Diagram showing a data structure example of the past project information
Diagram showing a data structure example of the influencing factor information
Diagram showing a data structure example of the encoded past project information
Diagram showing the relationship among the elements used to calculate the category information lift value
Diagram showing the values of c1, c2, c3, and c4 in the encoded past project information
Another diagram showing the values of c1, c2, c3, and c4 in the encoded past project information
Diagram showing a data example of the ongoing project information
Diagram showing a data structure example of the encoded ongoing project information
Line graph visualizing the change in risk score over time across the timing information values (T = 1, 2, 3, 4)
Graph showing the change in risk score over time for the ongoing projects (New1 to New5)
Flowchart showing an example of the risk score calculation process
Functional block diagram showing a configuration example of the risk evaluation apparatus according to the second embodiment
Diagram showing the source data for deriving the project failure probability regression equation by logistic regression analysis
Flowchart showing an example of the project failure probability calculation process
Functional block diagram showing a configuration example of the risk evaluation apparatus according to the third embodiment
Flowchart showing an example of the main operation of the risk evaluation apparatus according to the third embodiment
Functional block diagram showing a configuration example of a risk evaluation apparatus that combines the first, second, and third embodiments

A risk evaluation apparatus according to embodiments of the present invention will be described below with reference to the drawings.
[0. Definition of terms]
Terms used in this specification are defined as follows.
(1) Project information
"Project information" refers to development management data collected and recorded during or after the execution of a project. Project information is expressed as a set of variable name and value pairs.

(2) Project failure
"Project failure" means that the outcome of a project, judged by some evaluation criterion, is undesirable. Typical examples of project failure are poor quality, cost overruns, and delivery delays.

(3) Influencing factor
An "influencing factor" is development management data that affects project failure, together with its attribute information. The attribute information consists of category information and timing information.

(4) Category information
"Category information" is one piece of attribute information of an influencing factor. It is defined as a division of the value range of the variable corresponding to the influencing factor into several regions.

(5) Timing information
"Timing information" is one piece of attribute information of an influencing factor. It indicates when (process, phase, milestone, and so on) the value of the variable corresponding to the influencing factor becomes available, and is labeled T = 1, 2, ..., m in order from the earliest.

(6) Category information lift value
The "category information lift value" is the probability that the value of an influencing factor belongs to a given category when the project fails, divided by the unconditional probability that the value belongs to that category. It indicates how much the information provided by the influencing factor contributes to predicting project failure. A category information lift value of 1 means that the influencing factor does not contribute to predicting project failure.
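The definition above can be sketched in code. The following minimal example assumes that each past project has already been reduced to a category value for one influencing factor and a failure flag (1 = failure, 0 = success). The function name `category_lift` and the input layout are illustrative assumptions, not taken from the specification.

```python
from collections import Counter


def category_lift(category_values, failed_flags):
    """Return {category value: lift} for one influencing factor.

    lift = P(category | project failed) / P(category)
    Hypothetical helper; the input layout is an assumption.
    """
    if len(category_values) != len(failed_flags):
        raise ValueError("inputs must have the same length")
    n_projects = len(category_values)
    n_failed = sum(failed_flags)
    if n_projects == 0 or n_failed == 0:
        raise ValueError("need at least one project and one failed project")

    # How often each category occurs overall and among failed projects.
    overall = Counter(category_values)
    among_failed = Counter(
        c for c, failed in zip(category_values, failed_flags) if failed == 1
    )
    return {
        c: (among_failed[c] / n_failed) / (overall[c] / n_projects)
        for c in overall
    }


# Illustrative data only: category 1 occurs solely in failed projects.
lifts = category_lift([0, 0, 1, 1], [0, 0, 1, 1])
```

A lift above 1 means the category is over-represented among failed projects, and a lift below 1 means it is under-represented.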

(7) Risk score
The "risk score" is the value obtained by calculating the posterior probability of project failure, based on Bayesian probability and assuming independence of the events, and taking its logarithm. A larger risk score is considered to indicate a higher probability of project failure.
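One way to read this definition, under the naive independence assumption, is the derivation below. The symbols are assumptions introduced here for illustration: $F$ is the event of project failure, $x_i$ is the observed category of influencing factor $i$, and $L_i$ is the corresponding category information lift value. Whether the score used in the specification includes the prior term $\log P(F)$ is not stated in this section.

```latex
\begin{align}
P(F \mid x_1, \dots, x_n)
  &= \frac{P(F)\, P(x_1, \dots, x_n \mid F)}{P(x_1, \dots, x_n)}
  \approx P(F) \prod_{i=1}^{n} \frac{P(x_i \mid F)}{P(x_i)}
  = P(F) \prod_{i=1}^{n} L_i, \\
\log P(F \mid x_1, \dots, x_n)
  &\approx \log P(F) + \sum_{i=1}^{n} \log L_i .
\end{align}
```

Under this reading, each observed factor with a lift above 1 raises the score and each with a lift below 1 lowers it, so the score can be updated additively as new factor values become available.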

(8) Expected category information lift value
The "expected category information lift value" is the sum obtained by taking the expected value of the category information lift value, or of its reciprocal when the category information lift value is 1 or less. It is used, for example, to determine the category information of influencing factors.
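One possible reading of this definition, for a single influencing factor with categories $k = 1, \dots, K$, is sketched below. The weighting by $P(k)$ (the probability that the factor falls in category $k$) and the symbols $L_k$, $\tilde{L}_k$, and $E$ are assumptions made here for illustration; this section does not spell out the formula.

```latex
\begin{equation}
E = \sum_{k=1}^{K} P(k)\, \tilde{L}_k,
\qquad
\tilde{L}_k =
\begin{cases}
L_k & \text{if } L_k > 1, \\
1 / L_k & \text{if } 0 < L_k \le 1 .
\end{cases}
\end{equation}
```

Taking the reciprocal for $L_k \le 1$ treats a category that strongly indicates success as being as informative as one that strongly indicates failure. The case $L_k = 0$ is left undefined in this sketch.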

(9) Prediction model
A "prediction model" is a mathematical model that takes data obtained before or during a project as input and predicts future outcomes (for example, delivery date, cost, or quality). In this specification, it refers to a prediction model of project failure risk.

(10) Bayesian statistics
"Bayesian statistics" is a system of probability and statistics that is based on Bayes' theorem and actively incorporates subjective probability. It is contrasted with systems of probability and statistics that do not admit subjective probability (called frequentism), such as inferential statistics.

(11) Naive Bayes classifier
A "naive Bayes classifier" is one application of Bayesian statistics. It performs probabilistic classification by assuming independence of events when calculating posterior probabilities. It is applied, for example, to spam filtering.

(12) Logistic regression analysis
"Logistic regression analysis" is a type of statistical regression model. Whereas linear regression analysis predicts a quantitative variable, logistic regression analysis takes a binary qualitative variable as the dependent variable and predicts its probability of occurrence using explanatory variables.
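As a brief illustration of the logistic model described above, the sketch below maps a single explanatory variable to a probability. The function name and the coefficient values are hypothetical; in practice the coefficients would be estimated from data.

```python
import math


def logistic_probability(x, intercept, slope):
    """Probability that the binary outcome occurs, given explanatory variable x."""
    z = intercept + slope * x
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, for illustration only.
p = logistic_probability(0.0, intercept=0.0, slope=1.0)
```

With a positive slope, a larger value of the explanatory variable yields a larger predicted probability, and the result always lies strictly between 0 and 1.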

[1. First embodiment]
A first embodiment of the present invention will be described. The first embodiment is proposed as a risk evaluation apparatus.

The risk evaluation apparatus according to the first embodiment quantifies the risk of project failure in an ongoing project as a "risk score", based on past project information composed of diverse measurement data, and provides the change of that score over time. Applying Bayesian statistics, it obtains from past project data the influencing factors of project failure and their degree of influence (category information lift values), and calculates a risk score indicating the risk of failure of an ongoing project from data newly obtained in that project. The apparatus can update the risk score as the ongoing project progresses.

The risk evaluation apparatus is an information processing apparatus such as a computer or workstation, equipped with a central processing unit (CPU), main memory (RAM), read-only memory (ROM), input/output devices (I/O), and, where necessary, an external storage device such as a hard disk drive.

[1.1. Configuration example of the risk evaluation apparatus according to the first embodiment]
FIG. 1 is a functional block diagram showing a configuration example of the risk evaluation apparatus according to the first embodiment. The components shown in the functional block diagram represent the functions of the risk evaluation apparatus grouped into blocks. This does not mean that the apparatus must include physical components such as boards, devices, circuits, or parts corresponding to each block. Likewise, "connected" means that data, information, commands, and the like can be sent, received, and passed between components; it is not limited to physical connections such as wiring. The same applies to the other functional block diagrams in this specification.

The risk evaluation apparatus 1 has an input information storage unit 30, a prediction model construction unit 10 connected to the input information storage unit 30, a risk score calculation unit 20 connected to the prediction model construction unit 10, and an ongoing project information storage unit 40 connected to the risk score calculation unit 20. The input information storage unit 30 corresponds to the first storage means, the prediction model construction unit 10 corresponds to the first processing means, and the ongoing project information storage unit 40 corresponds to the second storage means.

[1.1.1. Input information storage unit]
The input information storage unit 30 stores the data needed to build the prediction model: past project information 200, project failure definition information 300, and influencing factor information 400.

[1.1.1.1. Past project information]
The past project information 200 consists of multiple pieces of individual project information for projects carried out in the past. Each piece of individual project information is the development management data collected and recorded during or after the execution of that project, represented as a set of variable name and value pairs. Variables may be continuous or discrete, and some variable values may be missing.

FIG. 2 shows a data structure example of the past project information 200. The past project information 200 has multiple records 201, each corresponding to one piece of individual project information. The example in FIG. 2 has 10 records 201, corresponding to the 10 past projects PJ1 through PJ10.

Each record 201 has fields that store variable values. In the example in FIG. 2, the variables are "specification review finding density", "design review finding density", "design review efficiency", "unit test bug density", "integration test bug density", and "post-shipment bug density". Accordingly, each record 201 has a "specification review finding density" storage field 211, a "design review finding density" storage field 212, a "design review efficiency" storage field 213, a "unit test bug density" storage field 214, an "integration test bug density" storage field 215, and a "post-shipment bug density" storage field 216.

Storage fields 211 to 216 each store the value of the corresponding variable for the project in question. If no value has been obtained for a variable, information indicating a "missing value" is stored.

[1.1.1.2. Project failure definition information]
The project failure definition information 300 defines the criterion for classifying the set of past projects into a group of "successful" projects and a group of "failed" projects.

Typically, the project failure definition information 300 sets a quantitative criterion (threshold) on an outcome metric related to poor quality, cost overrun, delivery delay, or the like (post-shipment bug rate, ratio of actual to planned cost, days of delay, and so on).

In the project failure definition information 300 corresponding to the example of past project information 200 in FIG. 2, a project is a "failure" if its post-shipment bug density is 4.5 or more, and a "success" otherwise. "Failure" and "success" are assigned the category values (described later) "1" and "0", respectively.
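The failure definition in this example amounts to a single threshold test. A minimal sketch follows; the constant and function names are hypothetical, while the threshold 4.5 and the category values 1 and 0 come from the example above.

```python
FAILURE_THRESHOLD = 4.5  # post-shipment bug density at or above this is a failure


def outcome_category_value(post_shipment_bug_density):
    """Return 1 ("failure") or 0 ("success") per the example failure definition."""
    return 1 if post_shipment_bug_density >= FAILURE_THRESHOLD else 0
```

Because the definition says "4.5 or more", a project with a post-shipment bug density of exactly 4.5 is classified as a failure.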

[1.1.1.3. Influencing factor information]
The influencing factor information 400 describes a set of influencing factors, that is, variables thought to affect the success or failure of a project. For each influencing factor it holds "category information", which divides the range of values the factor can take into several regions, and "timing information", which indicates when (process, phase, milestone, and so on) the value of the factor becomes available. Timing information may be labeled T = 1, 2, ..., m in order from the earliest (duplicate values are allowed).

FIG. 3 shows a data structure example of the influencing factor information 400. The influencing factor information 400 has one record 401 per influencing factor. Each record 401 has an influencing factor identification information storage field 411 that stores information identifying the influencing factor (variable name, variable number, variable ID, or the like), a category information storage field 412 that stores the category information of the influencing factor, and a timing information storage field 413 that stores its timing information.
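A record 401 could be represented along the following lines. This is a sketch only: the class and field names are hypothetical, category boundaries are stored as a sorted list of cut points, and the helper that lists the factors available by a given timing assumes that a factor with timing T is available at every timing t >= T.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InfluencingFactor:
    name: str                # identifying information (field 411)
    boundaries: List[float]  # cut points defining the categories (field 412)
    timing: int              # timing information T (field 413)


def factors_available_at(factors, t):
    """Names of the factors whose values can be obtained by timing t (assumes T <= t)."""
    return [f.name for f in factors if f.timing <= t]


# The cut point 2.5 of the first entry follows the example for FIG. 3;
# its timing and the whole second entry are illustrative values only.
factors = [
    InfluencingFactor("specification review finding density", [2.5], 1),
    InfluencingFactor("unit test bug density", [3.0], 3),
]
```

Calling `factors_available_at(factors, 1)` returns only the first factor, while a later timing such as 3 returns both.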

The "category information" indicates the categories (value ranges) obtained by dividing the values that the influencing factor of the record can take. In the example in FIG. 3, the category information of the influencing factor "specification review finding density" records that the first category (value range) is "[MIN, 2.5)", the second category (value range) is "[2.5, MAX]", and the third category is "missing value". Here "[MIN, 2.5)" denotes values from the minimum up to but not including 2.5, and "[2.5, MAX]" denotes values from 2.5 up to and including the maximum.

Each category also has a "category value". For example, the first category has category value 0, the second has category value 1, and the third has category value NA. These category values are used to encode the past project information, as described later.
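The mapping from a raw value to its category value can be sketched as an interval lookup. The function below is a hypothetical helper that assumes the category boundaries are given as a sorted list of cut points, that categories are numbered 0, 1, ... from the lowest range upward, and that a missing value is represented by `None`.

```python
import bisect

NA = "NA"  # category value for a missing value


def category_value(value, boundaries):
    """Map a raw value to its category value.

    With boundaries [2.5]: [MIN, 2.5) -> 0 and [2.5, MAX] -> 1.
    """
    if value is None:
        return NA
    # bisect_right counts the cut points <= value, so a value equal to a
    # cut point falls in the upper category, matching the half-open ranges.
    return bisect.bisect_right(boundaries, value)
```

For instance, `category_value(0.5, [2.5])` gives 0 and `category_value(2.5, [2.5])` gives 1.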

In the example in FIG. 3, the timing information is written as "T=1 (requirements specification process)" for ease of explanation, but in practice it is sufficient to store only the value in the timing information storage field 413.

[1.1.2. Prediction model construction unit]
The prediction model construction unit 10 encodes the past project information 200 using the information stored in the input information storage unit 30 to generate encoded past project information, and calculates category information lift values based on that encoded past project information.

[1.1.2.1. Encoded past project information]
Generation of the encoded past project information is described next. The prediction model construction unit 10 encodes the past project information 200 based on the project failure definition information 300 and the category information of the influencing factor information 400. "Encoding" means replacing the value stored in each storage field of each record with a category value. In this embodiment, the values in storage fields 211 to 216 are replaced with the corresponding category values: the values in storage fields 211 to 215 are replaced based on the category information of the influencing factor information 400, and the value in storage field 216 is replaced based on the project failure definition information 300.

図４に、図３に示した過去プロジェクト情報２００を符号化した結果得られる、符号化過去プロジェクト情報のデータ構成例を示す。符号化過去プロジェクト情報５００は、過去プロジェクト情報２００と同じく、10件の過去プロジェクトPJ１〜PJ10に対応する１０件のレコード５０１を有している。各レコード５０１は、「仕様レビュー指摘密度」格納フィールド５１１と、「設計レビュー指摘密度」格納フィールド５１２と、「設計レビュー効率」格納フィールド５１３と、「単体テストバグ密度」格納フィールド５１４と、「結合テストバグ密度」格納フィールド５１５と、「出荷後バグ密度」格納フィールド５１６とを有している。 FIG. 4 shows a data configuration example of the encoded past project information obtained by encoding the past project information 200 shown in FIG. 3. Like the past project information 200, the encoded past project information 500 has ten records 501 corresponding to the ten past projects PJ1 to PJ10. Each record 501 has a “specification review indication density” storage field 511, a “design review indication density” storage field 512, a “design review efficiency” storage field 513, a “unit test bug density” storage field 514, a “joint test bug density” storage field 515, and a “post-shipment bug density” storage field 516.

予測モデル構築部１０は、格納フィールド５１１から５１５のそれぞれに、過去プロジェクト情報２００の対応する格納フィールド２１１から２１５に格納されている値に対応する区分値を格納する。例えば、過去プロジェクト情報２００の過去プロジェクトPJ1に対応するレコード２０１において、仕様レビュー指摘密度格納フィールド２１１に格納されている値は「０．５」である。この値「０．５」は、影響因子情報４００における仕様レビュー指摘密度に対応するレコード４０１の区分情報４１２に記述された第1の区分（値域）「[MIN, ２．５)」に含まれる。この第１の区分（値域）「[MIN, ２．５)」の区分値は「０」であるので、予測モデル構築部１０は符号化過去プロジェクト情報５００の過去プロジェクトPJ1に対応するレコード５０１において、仕様レビュー指摘密度格納フィールド５１１に区分値「０」を格納する。同様に他の格納フィールド５１２から５１５にも対応する区分値を格納する。 The prediction model construction unit 10 stores the segment values corresponding to the values stored in the corresponding storage fields 211 to 215 of the past project information 200 in the storage fields 511 to 515, respectively. For example, in the record 201 corresponding to the past project PJ1 of the past project information 200, the value stored in the specification review indication density storage field 211 is “0.5”. This value “0.5” is included in the first category (value range) “[MIN, 2.5)” described in the category information 412 of the record 401 corresponding to the specification review indication density in the influence factor information 400. . Since the classification value of the first classification (value range) “[MIN, 2.5)” is “0”, the prediction model construction unit 10 includes the record 501 corresponding to the past project PJ1 of the encoded past project information 500. The segment value “0” is stored in the specification review indication density storage field 511. Similarly, the segment values corresponding to the other storage fields 512 to 515 are stored.

予測モデル構築部１０は、出荷後バグ密度格納フィールド５１６に、対応する区分値を格納する。例えば、過去プロジェクト情報２００の過去プロジェクトPJ1に対応するレコード２０１において、出荷後バグ密度格納フィールド２１６に格納されている値は「１．２」である。この値「１．２」は、出荷後バグ密度の値が４．５以上であるというプロジェクト失敗定義情報３００に記述された失敗の条件に該当しないので「成功」であると判定し、符号化過去プロジェクト情報５００の過去プロジェクトPJ1に対応するレコード５０１において、出荷後バグ密度格納フィールド５１６にこの成功に対応する区分値「０」を格納する。 The prediction model construction unit 10 stores the corresponding segment value in the post-shipment bug density storage field 516. For example, in the record 201 corresponding to the past project PJ1 of the past project information 200, the value stored in the post-shipment bug density storage field 216 is “1.2”. This value does not meet the failure condition described in the project failure definition information 300, namely that the post-shipment bug density is 4.5 or more, so the project is judged a “success”, and the segment value “0” corresponding to success is stored in the post-shipment bug density storage field 516 of the record 501 corresponding to the past project PJ1 in the encoded past project information 500.

予測モデル構築部１０は、全てのレコード５０１の全ての格納フィールド５１１〜５１６について対応する区分値を決定し、その区分値を格納フィールド５１１〜５１６に格納する。 The prediction model construction unit 10 determines corresponding segment values for all the storage fields 511 to 516 of all records 501 and stores the segment values in the storage fields 511 to 516.
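
The encoding step described above can be sketched in Python as follows. This is a minimal illustration, not the apparatus itself: the function names `encode_value` and `encode_outcome`, the tuple layout of a segment, and the second range `[2.5, MAX)` with segment value 1 are assumptions made for the example. Only the first range `[MIN, 2.5)` with segment value 0 and the failure threshold 4.5 come from the text.

```python
import math
from typing import Optional, Sequence, Tuple, Union

NA = "NA"  # segment value used for a missing value, as in FIG. 4

# A segment is (lower bound inclusive, upper bound exclusive, segment value).
Segment = Tuple[float, float, int]


def encode_value(value: Optional[float], segments: Sequence[Segment]) -> Union[int, str]:
    """Replace a raw value with the segment value of the half-open range containing it."""
    if value is None:
        return NA
    for lower, upper, segment_value in segments:
        if lower <= value < upper:
            return segment_value
    raise ValueError(f"{value} is not covered by any segment")


def encode_outcome(post_shipment_bug_density: float, threshold: float = 4.5) -> int:
    """Return 1 (failure) when the density is at or above the threshold, else 0 (success)."""
    return 1 if post_shipment_bug_density >= threshold else 0


# First range taken from the text; the second range is an assumption for illustration.
SPEC_REVIEW_SEGMENTS = [(-math.inf, 2.5, 0), (2.5, math.inf, 1)]

print(encode_value(0.5, SPEC_REVIEW_SEGMENTS))  # PJ1: 0.5 lies in [MIN, 2.5)
print(encode_outcome(1.2))                      # PJ1: 1.2 is below 4.5, so success
```

Both calls reproduce the PJ1 example in the text: the specification review indication density 0.5 and the post-shipment bug density 1.2 are each encoded as 0.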

図４に示す区分値の格納が完了した状態の符号化過去プロジェクト情報５００である。なお、図４に示す例では、各格納フィールド５１１〜５１６のラベルは、対応する影響因子（変数）名「仕様レビュー指摘密度」、「設計レビュー指摘密度」、「設計レビュー効率」、「単体テストバグ密度」、「結合テストバグ密度」、「出荷後バグ密度」に代えて、X１、X2a、X2b、X3、X4、Yとした。Xの右側の数字は当該影響因子のタイミング情報の値である。また、「設計レビュー指摘密度」、「設計レビュー効率」はともにタイミング情報T＝２であるので、区別のためにさらに添え字a,bを付した。プロジェクトの成功／失敗を示す「出荷後バグ密度」はYと表記した。なお、これらの表記は後述する計算過程を説明するためであって、符号化過去プロジェクト情報５００が係るデータ構成に限定される趣旨ではない。また、図４に示す例において「NA」は欠損値に対応する区分値である。 FIG. 4 shows the encoded past project information 500 after all segment values have been stored. In the example shown in FIG. 4, the storage fields 511 to 516 are labeled X1, X2a, X2b, X3, X4, and Y instead of the corresponding influence factor (variable) names “specification review indication density”, “design review indication density”, “design review efficiency”, “unit test bug density”, “joint test bug density”, and “post-shipment bug density”. The number to the right of X is the timing information value of that influence factor. Because “design review indication density” and “design review efficiency” both have timing information T = 2, the subscripts a and b are added to distinguish them. “Post-shipment bug density”, which indicates the success or failure of the project, is written as Y. These notations serve only to explain the calculation process described later and are not intended to limit the encoded past project information 500 to this data structure. In the example shown in FIG. 4, “NA” is the segment value corresponding to a missing value.

［１．１．２．２．区分情報リフト値］ [1.1.2.2. Division information lift value]
予測モデル構築部１０は、前述の符号化過去プロジェクト情報５００に基づいて、プロジェクトの成功／失敗に対する、影響因子の影響の度合いを示す情報である区分情報リフト値を各影響因子の区分値ごとに算出する。 Based on the encoded past project information 500 described above, the prediction model construction unit 10 calculates, for each division value of each influence factor, a division information lift value, which is information indicating the degree of influence of the influence factor on the success or failure of the project.

区分情報リフト値を算出するために、予測モデル構築部１０は影響因子Xi ( i はタイミング情報の値）の区分値k (0 ≦ k ≦ n）に対して、符号化過去プロジェクト情報５００を参照して、以下のｃ１, ｃ２, ｃ３, ｃ４を求める。 To calculate the division information lift value, the prediction model construction unit 10 refers to the encoded past project information 500 and obtains the following c1, c2, c3, and c4 for each division value k (0 ≦ k ≦ n) of the influence factor Xi (i is the value of the timing information).

「ｃ１」は、影響因子Xiについて、該当格納フィールドに格納されている値（区分値）が「ｋ」であって、且つプロジェクトの成功／失敗を示す格納フィールド５１６（Y）の値が０であるレコード５０１の個数である。 “c1” is the number of records 501 in which the value (division value) stored in the storage field for the influence factor Xi is “k” and the value of the storage field 516 (Y), which indicates the success or failure of the project, is “0” (success).

「ｃ２」は、影響因子Xiについて、該当格納フィールドに格納されている値が「ｋ」以外且つ「NA」以外であって、且つプロジェクトの成功／失敗を示す格納フィールド５１６（Y）の値が成功を示す値「０」であるレコード５０１の個数である。 “c2” is the number of records 501 in which the value stored in the storage field for the influence factor Xi is neither “k” nor “NA” and the value of the storage field 516 (Y) is “0” (success).

「ｃ３」は、影響因子Xiについて、該当格納フィールドに格納されている値が「ｋ」であって、且つプロジェクトの成功／失敗を示す格納フィールド５１６（Y）の値が失敗を示す値「１」であるレコード５０１の個数である。 “c3” is the number of records 501 in which the value stored in the storage field for the influence factor Xi is “k” and the value of the storage field 516 (Y) is “1” (failure).

「ｃ４」は、影響因子Xiについて、該当格納フィールドに格納されている値が「ｋ」以外且つ「NA」以外であって、且つプロジェクトの成功／失敗を示す格納フィールド５１６（Y）の値が１であるレコード５０１の個数である。 “c4” is the number of records 501 in which the value stored in the storage field for the influence factor Xi is neither “k” nor “NA” and the value of the storage field 516 (Y) is “1” (failure).

図５にXi、ｋ、ｃ１, ｃ２, ｃ３, ｃ４との関係を示す表を掲げる。 FIG. 5 shows a table showing the relationship between Xi, k, c1, c2, c3, and c4.

図４に示した符号化過去プロジェクト情報５００の例では、予測モデル構築部１０は影響因子X１、X2a、X2b、X3、X4のそれぞれの区分値「０」と「１」のそれぞれについてｃ１, ｃ２, ｃ３, ｃ４を求めることになる。 In the example of the encoded past project information 500 shown in FIG. 4, the prediction model construction unit 10 obtains c1, c2, c3, and c4 for each of the division values “0” and “1” of each of the influence factors X1, X2a, X2b, X3, and X4.
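
The counting of c1 to c4 can be sketched as below. The function name `count_c1_to_c4` and the dictionary layout of a record are assumptions for illustration. The ten records are hypothetical: the rows of FIG. 4 are not reproduced in the text, so they were chosen only to match the counts the description quotes for X1 (c1 = 4, c2 = 2, c3 = 1, c4 = 2 for division value 0, and c1 = 2, c2 = 4, c3 = 2, c4 = 1 for division value 1).

```python
from typing import Dict, List, Tuple, Union

Cell = Union[int, str]  # a division value, or "NA" for a missing value


def count_c1_to_c4(
    records: List[Dict[str, Cell]], factor: str, k: int, outcome: str = "Y"
) -> Tuple[int, int, int, int]:
    """Count records by (factor == k or not) and (outcome == 0 or 1), skipping NA."""
    c1 = c2 = c3 = c4 = 0
    for record in records:
        x, y = record[factor], record[outcome]
        if x == "NA":
            continue  # missing values are counted in none of c1..c4
        if y == 0:    # success
            if x == k:
                c1 += 1
            else:
                c2 += 1
        elif y == 1:  # failure
            if x == k:
                c3 += 1
            else:
                c4 += 1
    return c1, c2, c3, c4


# Hypothetical encoded records (not the actual FIG. 4 rows).
records = (
    [{"X1": 0, "Y": 0}] * 4
    + [{"X1": 1, "Y": 0}] * 2
    + [{"X1": 0, "Y": 1}] * 1
    + [{"X1": 1, "Y": 1}] * 2
    + [{"X1": "NA", "Y": 0}]
)

print(count_c1_to_c4(records, "X1", 0))
print(count_c1_to_c4(records, "X1", 1))
```

Because a record whose value is NA is skipped, the four counts sum to nine rather than ten in this example.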

次に予測モデル構築部１０は以下で定義される区分情報リフト値「L_Xi_k」を計算する。 Next, the prediction model construction unit 10 calculates the division information lift value “L_Xi_k” defined below.
区分情報リフト値「L_Xi_k」の算出式を以下に示す。 The calculation formula for the division information lift value “L_Xi_k” is shown below.
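
The formula itself was an image and is not present in this text. The following is a reconstruction from the surrounding definitions, offered as a likely form rather than a verbatim copy: the numerator is the smoothed probability that Xi takes division value k given project failure, and the denominator is the smoothed unconditional probability that Xi takes division value k. With a = 1 and b = 2 it reproduces both worked values given in the description (0.7333 and 1.32).

```latex
\mathrm{L\_Xi\_k}
  = \frac{\dfrac{c_3 + a}{c_3 + c_4 + b}}
         {\dfrac{c_1 + c_3 + a}{c_1 + c_2 + c_3 + c_4 + b}}
  \qquad \text{(Equation 1)}
```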

上記式１において、分子は、プロジェクト失敗のとき影響因子 Xi の取る値が区分値k に属する確率であり、分母は何も条件なしで、影響因子 Xi の取る値が区分値 k に属する確率である。 In the above equation 1, the numerator is the probability that the value taken by the influencing factor Xi belongs to the category value k when the project fails, the denominator is the probability that the value taken by the influencing factor Xi belongs to the category value k without any condition. is there.

上記式１におけるa, b は、データ数が少ないときに極端な値が出力されることを避けるための補正パラメータであり、一般に、a=1, b=2 （ラプラス補正）などが用いられる。また、データ数が十分にあれば、a=b=0 (補正なし)でもよい。 A and b in the above equation 1 are correction parameters for avoiding extreme values being output when the number of data is small. Generally, a = 1, b = 2 (Laplace correction) and the like are used. If the number of data is sufficient, a = b = 0 (no correction) may be used.

各影響因子の独立性を仮定したとき、プロジェクト失敗の事後確率は、ベイズの定理により、事前確率と、a=b=0 (補正なし) のときの区分情報リフト値との積に一致する。 Assuming the independence of each influencing factor, the posterior probability of project failure matches the product of the prior probability and the classification information lift value when a = b = 0 (no correction) according to Bayes' theorem.
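
The relation to Bayes' theorem can be written out as follows. This sketch assumes that Equation 1 has the ratio form (c3 + a)/(c3 + c4 + b) divided by (c1 + c3 + a)/(c1 + c2 + c3 + c4 + b), which is inferred from the definitions of its numerator and denominator, and it uses empirical frequencies as probability estimates.

```latex
\begin{aligned}
\mathrm{L\_Xi\_k}\,\big|_{a=b=0}
  &= \frac{c_3 / (c_3 + c_4)}{(c_1 + c_3) / (c_1 + c_2 + c_3 + c_4)}
   = \frac{P(X_i = k \mid Y = 1)}{P(X_i = k)}, \\[4pt]
P(Y = 1 \mid X_i = k)
  &= \frac{P(X_i = k \mid Y = 1)\, P(Y = 1)}{P(X_i = k)}
   = P(Y = 1) \cdot \mathrm{L\_Xi\_k}\,\big|_{a=b=0}.
\end{aligned}
```

Under the stated independence assumption, the same argument applied factor by factor gives the posterior as the prior multiplied by the product of the lift values of all observed factors, which is why summing logarithms of lift values is a natural way to accumulate evidence.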

区分情報リフト値の計算例を示す。図４に示した符号化過去プロジェクト情報５００における影響因子＝X1且つ区分値＝０についての区分情報リフト値L_X1_0は以下のとおりである。 A calculation example of the division information lift value is shown. The division information lift value L_X1_0 for the influence factor = X1 and the division value = 0 in the encoded past project information 500 shown in FIG. 4 is as follows.

図６に影響因子＝X1且つ区分値＝０での、符号化過去プロジェクト情報５００におけるｃ１, ｃ２, ｃ３, ｃ４それぞれの値を示す。すなわち、ｃ１＝４, ｃ２＝２, ｃ３＝１, ｃ４＝２となる。これを上記式１にあてはめる。 FIG. 6 shows values of c1, c2, c3, and c4 in the encoded past project information 500 when the influence factor = X1 and the segment value = 0. That is, c1 = 4, c2 = 2, c3 = 1, and c4 = 2. This is applied to Equation 1 above.

（但し、ラプラス補正あり: a = 1, b = 2） (With Laplace correction: a = 1, b = 2)
すなわち、影響因子＝X1且つ区分値＝０についての区分情報リフト値L_X1_0＝0.7333を得る。 That is, the division information lift value L_X1_0 = 0.7333 is obtained for the influence factor = X1 and the division value = 0.

区分情報リフト値の別の計算例を示す。図４に示した符号化過去プロジェクト情報５００における影響因子＝X1且つ区分値＝１についての区分情報リフト値L_X1_１は以下のとおりである。 Another calculation example of the division information lift value is shown. The division information lift value L_X1_1 for the influence factor = X1 and the division value = 1 in the encoded past project information 500 shown in FIG. 4 is as follows.

図７に影響因子＝X1且つ区分値＝１での、符号化過去プロジェクト情報５００におけるｃ１, ｃ２, ｃ３, ｃ４それぞれの値を示す。すなわち、ｃ１＝２, ｃ２＝４, ｃ３＝２, ｃ４＝１となる。これを上記式１にあてはめる。 FIG. 7 shows the values of c1, c2, c3, and c4 in the encoded past project information 500 for the influence factor = X1 and the division value = 1. That is, c1 = 2, c2 = 4, c3 = 2, and c4 = 1. These values are substituted into Equation 1 above.

（但し、ラプラス補正あり: a = 1, b = 2） (With Laplace correction: a = 1, b = 2)
すなわち、影響因子＝X1且つ区分値＝１についての区分情報リフト値L_X1_1＝1.32を得る。 That is, the division information lift value L_X1_1 = 1.32 is obtained for the influence factor = X1 and the division value = 1.
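
The two worked examples can be checked with a short Python sketch. The function name `division_lift` is hypothetical, and the formula inside it is a reconstruction: it assumes Equation 1 is the smoothed probability of division value k given failure, (c3 + a)/(c3 + c4 + b), divided by the smoothed unconditional probability of division value k, (c1 + c3 + a)/(c1 + c2 + c3 + c4 + b). That form follows from the description of the numerator and denominator and yields the two values stated in the text.

```python
def division_lift(c1: int, c2: int, c3: int, c4: int, a: float = 1.0, b: float = 2.0) -> float:
    """Lift of a division value k: P(Xi = k | failure) / P(Xi = k), with smoothing a, b."""
    p_k_given_failure = (c3 + a) / (c3 + c4 + b)
    p_k = (c1 + c3 + a) / (c1 + c2 + c3 + c4 + b)
    return p_k_given_failure / p_k


# Counts quoted for FIG. 6 (X1, division value 0) and FIG. 7 (X1, division value 1).
print(round(division_lift(4, 2, 1, 2), 4))  # 0.7333
print(round(division_lift(2, 4, 2, 1), 4))  # 1.32
```

A lift below 1 (here for division value 0) means the value is seen less often among failed projects than overall, and a lift above 1 means the opposite. With a = b = 0 the function divides by zero whenever a count total is zero, which is the small-sample problem the correction parameters are meant to avoid.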

予測モデル構築部１０は、他の影響因子X2a、X2b、X3、X4のそれぞれの区分値「０」と「１」についても同様に区分情報リフト値を算出し、各区分情報リフト値を出力する。 The prediction model construction unit 10 similarly calculates the division information lift value for each of the division values “0” and “1” of the other influencing factors X2a, X2b, X3, and X4, and outputs each division information lift value. .

以上で予測モデル構築部１０の説明を終了する。 This concludes the description of the prediction model construction unit 10.

［１．１．３．進行中プロジェクト情報記憶部］ [1.1.3. Ongoing project information storage unit]
進行中プロジェクト情報記憶部４０は、進行中プロジェクト情報８００を記憶する機能を有する。進行中プロジェクト情報８００は、１つ以上の“進行中”（成功 / 失敗が未確定）のプロジェクト情報の集まりである。進行中プロジェクト情報８００は、前述の過去プロジェクト情報２００と同様のデータ構成を有している。 The ongoing project information storage unit 40 has a function of storing the ongoing project information 800. The ongoing project information 800 is a collection of information on one or more “ongoing” projects, that is, projects whose success or failure is not yet determined. The ongoing project information 800 has the same data structure as the past project information 200 described above.

図８に進行中プロジェクト情報のデータ例を示す。進行中プロジェクト情報８００は、個別の進行中プロジェクト情報に相当するレコード８０１を複数有している。図８に示す例では、進行中プロジェクトNEW1から進行中プロジェクトNEW5までの５件の進行中プロジェクトに対応する５件のレコード８０１を有している。 FIG. 8 shows an example of ongoing project information data. The ongoing project information 800 includes a plurality of records 801 corresponding to individual ongoing project information. In the example shown in FIG. 8, there are five records 801 corresponding to five ongoing projects from the ongoing project NEW1 to the ongoing project NEW5.

各レコード８０１は、過去プロジェクト情報２００と同様に、変数の値を格納するフィールドを有している。図８に示す例では、変数として「仕様レビュー指摘密度」と、「設計レビュー指摘密度」と、「設計レビュー効率」と、「単体テストバグ密度」と、「結合テストバグ密度」と、「出荷後バグ密度」とが採用されている。各レコード８０１は、「仕様レビュー指摘密度」格納フィールド８１１と、「設計レビュー指摘密度」格納フィールド８１２と、「設計レビュー効率」格納フィールド８１３と、「単体テストバグ密度」格納フィールド８１４と、「結合テストバグ密度」格納フィールド８１５と、「出荷後バグ密度」格納フィールド８１６とを有している。 Like the past project information 200, each record 801 has fields for storing variable values. In the example shown in FIG. 8, the variables adopted are “specification review indication density”, “design review indication density”, “design review efficiency”, “unit test bug density”, “joint test bug density”, and “post-shipment bug density”. Each record 801 has a “specification review indication density” storage field 811, a “design review indication density” storage field 812, a “design review efficiency” storage field 813, a “unit test bug density” storage field 814, a “joint test bug density” storage field 815, and a “post-shipment bug density” storage field 816.

各格納フィールド８１１〜８１６は、該当するプロジェクトにおける対応する変数の値を格納する。変数の値が得られていない場合には、「欠損値」であることを示す情報が格納される。また、該当する変数についての検証・検査等が未実施である場合には「未実施」であることを示す情報が格納される。 Each of the storage fields 811 to 816 stores the value of the corresponding variable in the project concerned. When the value of a variable has not been obtained, information indicating a “missing value” is stored. When the verification, inspection, or the like for the variable has not yet been performed, information indicating “not performed” is stored.

なお、進行中プロジェクトの“プロジェクト完了後”（成功 / 失敗が確定した後）のプロジェクト情報は、過去プロジェクト情報２００の中に含めて、将来の進行中プロジェクトの予測に活用することも可能である。 The project information “after project completion” (after success / failure is confirmed) of the ongoing project can be included in the past project information 200 and used for prediction of the future ongoing project. .

［１．１．４．リスクスコア計算部］ [1.1.4. Risk score calculation unit]
リスクスコア計算部２０は、過去及び進行中プロジェクトに対するリスクスコアを計算し、過去及び進行中プロジェクトに対するリスクスコアの時系列データを出力する機能を有する。リスクスコア計算部２０は、影響因子情報４００と符号化過去プロジェクト情報５００と、各影響因子の区分情報毎の区分情報リフト値と、進行中プロジェクト情報とを入力情報として用いる。 The risk score calculation unit 20 has a function of calculating risk scores for past and ongoing projects and outputting time-series data of those risk scores. As input information, the risk score calculation unit 20 uses the influence factor information 400, the encoded past project information 500, the division information lift value for each division of each influence factor, and the ongoing project information.

［１．１．４．１．進行中プロジェクト情報の符号化］ [1.1.4.1. Encoding of ongoing project information]
リスクスコア計算部２０は、進行中プロジェクト情報８００を符号化して、符号化進行中プロジェクト情報９００を生成する。符号化処理の内容は、予測モデル構築部１０による過去プロジェクト情報の符号化と同様であって、影響因子情報４００の区分情報に基づいて、進行中プロジェクト情報８００の各格納フィールド８１１〜８１５に格納されている値を、対応する区分値に置換する。ただし、進行中のプロジェクトにおいては、プロジェクトの成功／失敗は、未確定なので、プロジェクト失敗定義情報３００はここでは必要ない。 The risk score calculation unit 20 encodes the ongoing project information 800 to generate the encoded ongoing project information 900. The encoding process is the same as the encoding of the past project information by the prediction model construction unit 10: based on the segment information of the influence factor information 400, the values stored in the storage fields 811 to 815 of the ongoing project information 800 are replaced with the corresponding segment values. However, because the success or failure of an ongoing project is not yet determined, the project failure definition information 300 is not needed here.

図９に符号化進行中プロジェクト情報９００のデータ構成例を示す。図９に示す例は、図８に示した進行中プロジェクト情報８００から生成された符号化進行中プロジェクト情報９００の例である。 FIG. 9 shows a data configuration example of the encoded ongoing project information 900. The example shown in FIG. 9 is the encoded ongoing project information 900 generated from the ongoing project information 800 shown in FIG. 8.

符号化進行中プロジェクト情報９００は、進行中プロジェクト情報８００と同じく、５件の進行中プロジェクトNEW1〜NEW5に対応する５件のレコード９０１を有している。各レコード９０１は、「仕様レビュー指摘密度」格納フィールド９１１と、「設計レビュー指摘密度」格納フィールド９１２と、「設計レビュー効率」格納フィールド９１３と、「単体テストバグ密度」格納フィールド９１４と、「結合テストバグ密度」格納フィールド９１５と、「出荷後バグ密度」格納フィールド９１６とを有している。 Like the ongoing project information 800, the encoded ongoing project information 900 has five records 901 corresponding to the five ongoing projects NEW1 to NEW5. Each record 901 has a “specification review indication density” storage field 911, a “design review indication density” storage field 912, a “design review efficiency” storage field 913, a “unit test bug density” storage field 914, a “joint test bug density” storage field 915, and a “post-shipment bug density” storage field 916.

リスクスコア計算部２０は、格納フィールド９１１から９１５のそれぞれに、進行中プロジェクト情報８００の対応する格納フィールド８１１から８１５に格納されている値に対応する区分値を格納する。但し、進行中プロジェクト情報８００で格納されている値が「未実施」を示す値である場合は、符号化進行中プロジェクト情報９００の対応する格納フィールドにも「未実施」を示す値を格納する。 The risk score calculation unit 20 stores, in each of the storage fields 911 to 915, the segment value corresponding to the value stored in the corresponding storage field 811 to 815 of the ongoing project information 800. However, when the value stored in the ongoing project information 800 indicates “not performed”, a value indicating “not performed” is also stored in the corresponding storage field of the encoded ongoing project information 900.

なお、図９に示す例では、各格納フィールド９１１〜９１６のラベルは、対応する影響因子（変数）名「仕様レビュー指摘密度」、「設計レビュー指摘密度」、「設計レビュー効率」、「単体テストバグ密度」、「結合テストバグ密度」、「出荷後バグ密度」に代えて、X1、X2a、X2b、X3、X4、Yとしている。 In the example shown in FIG. 9, the storage fields 911 to 916 are labeled X1, X2a, X2b, X3, X4, and Y instead of the corresponding influence factor (variable) names “specification review indication density”, “design review indication density”, “design review efficiency”, “unit test bug density”, “joint test bug density”, and “post-shipment bug density”.

［１．１．４．２．リスクスコアの計算］ [1.1.4.2. Risk score calculation]
リスクスコア計算部２０は、符号化過去プロジェクト情報５００と符号化進行中プロジェクト情報９００に基づいて、各影響因子のタイミング情報、及び区分値毎の区分情報リフト値を用いてリスクスコアを計算する。 Based on the encoded past project information 500 and the encoded ongoing project information 900, the risk score calculation unit 20 calculates risk scores using the timing information of each influence factor and the division information lift value for each division value.

リスクスコア計算部２０は、プロジェクトｐ（過去、進行中を問わない）のリスクスコアを以下のように計算する。 The risk score calculation unit 20 calculates the risk score of a project p (whether past or ongoing) as follows.
（１）プロジェクト p の初期リスクスコア S_p_0 を以下の式２より算出する。 (1) The initial risk score S_p_0 of the project p is calculated by the following Equation 2.

一般的には、当該プロジェクトに関して特に初期情報がないと仮定して、上記のように初期リスクスコア S_p_0 を設定するが、もし、プロジェクト毎に初期リスクスコアを変えたい場合には、個別に設定してもよい。 In general, the initial risk score S_p_0 is set as described above on the assumption that no particular initial information is available for the project; however, it may be set individually when the initial risk score should differ from project to project.
（２）プロジェクトｐのタイミング情報T=j についてのリスクスコアを算出する。 (2) The risk score of the project p for the timing information T = j is calculated.

プロジェクト p の影響因子 Xi の値を kとする区分情報リフト値をL_Xi_k、影響因子Xiのタイミング情報の値をT(Xi)とすると、リスクスコア計算部２０は、プロジェクトpのT = jにおけるリスクスコアS_p_jを以下の式３により計算する。 Let L_Xi_k be the division information lift value for the case where the influence factor Xi of the project p takes the division value k, and let T(Xi) be the timing information value of the influence factor Xi. The risk score calculation unit 20 then calculates the risk score S_p_j of the project p at T = j by the following Equation 3.

但し、ｋ＝NAならば、L_Xi_k＝1とする。式３の右辺第２項は、タイミング情報T=j である全ての影響因子についての区分情報リフト値の対数の合計を意味する。また、式３におけるｋの値は、符号化過去プロジェクト情報５００又は符号化進行中プロジェクト情報９００のプロジェクトｐに対応するレコード５０１又はレコード９０１の影響変数Xiに対応する格納フィールドに格納された区分値である。 However, if k = NA, L_Xi_k = 1. The second term on the right side of Equation 3 means the sum of the logarithms of the division information lift values for all influencing factors with timing information T = j. In addition, the value of k in Expression 3 is the partition value stored in the storage field corresponding to the influence variable Xi of the record 501 or the record 901 corresponding to the project p of the encoded past project information 500 or the encoded project information 900. It is.
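
Equation 3 was an image and is not present in this text. One form consistent with the description is given below as a hedged reconstruction: the second term is the sum stated in the text, while the first term being the score at the previous timing, S_p_(j-1), and the base of the logarithm are assumptions. Here S_{p,j} stands for S_p_j, L_{X_i,k_i} for L_Xi_k, and k_i for the division value stored for Xi in the record of project p.

```latex
S_{p,j} \;=\; S_{p,j-1} \;+\; \sum_{i \,:\, T(X_i) = j} \log L_{X_i,\,k_i},
\qquad L_{X_i,\,\mathrm{NA}} = 1
\qquad \text{(Equation 3)}
```

Because log 1 = 0, a missing value leaves the score unchanged at that timing.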

進行中プロジェクトが過去プロジェクトと同様の傾向を示すと仮定すれば、一般に、プロジェクトのリスクスコアが高いほど、プロジェクト失敗の確率が高いと判断できる。 Assuming that ongoing projects show similar trends to past projects, it can be generally determined that the higher the project risk score, the higher the probability of project failure.
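
The accumulation of the risk score over timings can be sketched as follows. Several points are assumptions for illustration only: the function name `risk_score_series`, the use of the natural logarithm, the recursion from the previous timing's score, and the initial score of 0.0 (Equation 2 is not reproduced in this text). The division values of project New1 and the lift 11/15 (about 0.7333) for L_X1_0 follow the description; the other lift values are hypothetical placeholders. Handling of “not performed” values is omitted.

```python
import math
from typing import Dict, List, Tuple, Union

Cell = Union[int, str]  # a division value, or "NA" for a missing value


def risk_score_series(
    encoded_record: Dict[str, Cell],
    timing: Dict[str, int],
    lifts: Dict[Tuple[str, int], float],
    initial_score: float,
    max_timing: int,
) -> List[float]:
    """Return [S_p_0, S_p_1, ..., S_p_m], adding log-lifts of the factors at each timing."""
    scores = [initial_score]
    for j in range(1, max_timing + 1):
        score = scores[-1]  # assumed: start from the score of the previous timing
        for factor, t in timing.items():
            if t != j:
                continue
            k = encoded_record.get(factor, "NA")
            lift = 1.0 if k == "NA" else lifts[(factor, k)]  # NA contributes log(1) = 0
            score += math.log(lift)
        scores.append(score)
    return scores


# Timing information of each influence factor (the number after X in the labels).
timing = {"X1": 1, "X2a": 2, "X2b": 2, "X3": 3, "X4": 4}
# L_X1_0 is from the worked example; the remaining lifts are hypothetical.
lifts = {("X1", 0): 11 / 15, ("X2a", 1): 1.2, ("X3", 1): 1.5, ("X4", 0): 0.9}
new1 = {"X1": 0, "X2a": 1, "X2b": "NA", "X3": 1, "X4": 0}

scores = risk_score_series(new1, timing, lifts, initial_score=0.0, max_timing=4)
print([round(s, 4) for s in scores])
```

Working in logarithms turns the product of lift values into a running sum, so the series can be plotted directly as a time-series of accumulated evidence for or against failure.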

リスクスコア計算の例を挙げる。図８及び図９に示した進行中プロジェクト情報８００及び符号化進行中プロジェクト情報９００に含まれるプロジェクトNew1のリスクスコアの計算例を示す。 An example of risk score calculation is given below for the project New1 included in the ongoing project information 800 and the encoded ongoing project information 900 shown in FIGS. 8 and 9.

タイミング情報T = 1についてのプロジェクトNew1のリスクスコアS_New1_1は以下のように算出される。 The risk score S_New1_1 of the project New1 for the timing information T = 1 is calculated as follows.

なお、上記式中の区分情報リフト値L_X1_0の値は、前述の予測モデル構築部１０によって算出された値を使用する。 The value calculated by the prediction model construction unit 10 described above is used as the division information lift value L_X1_0 in the above formula.
また、タイミング情報T = 2についてのプロジェクトNew1のリスクスコアS_New1_2は以下のように算出される。 Further, the risk score S_New1_2 of the project New1 for the timing information T = 2 is calculated as follows.

なお、本例においてタイミング情報T=2である影響因子はX2a,X2bの２つであるため、この2つの影響因子の区分情報リフト値の対数を計算に含めている。上記式中の区分情報リフト値L_X2a_1の値は、前述の予測モデル構築部１０によって算出された値を使用する。区分情報リフト値L_X2b_NAはk=NAの場合の値＝１を用いる。 In this example, two influence factors, X2a and X2b, have timing information T = 2, so the logarithms of the division information lift values of both are included in the calculation. The value calculated by the prediction model construction unit 10 is used as the division information lift value L_X2a_1 in the above formula. For the division information lift value L_X2b_NA, the value 1 defined for k = NA is used.

タイミング情報T = 3についてのプロジェクトNew1のリスクスコアS_New1_3は以下のように算出される。 The risk score S_New1_3 of the project New1 for the timing information T = 3 is calculated as follows.

なお、上記式中の区分情報リフト値L_X3_1の値は、前述の予測モデル構築部１０によって算出された値を使用する。 The value calculated by the prediction model construction unit 10 described above is used as the division information lift value L_X3_1 in the above formula.

タイミング情報T = 4についてのプロジェクトNew1のリスクスコアS_New1_4は以下のように算出される。 The risk score S_New1_4 of the project New1 for the timing information T = 4 is calculated as follows.

なお、上記式中の区分情報リフト値L_X4_0の値は、前述の予測モデル構築部１０によって算出された値を使用する。 In addition, the value calculated by the prediction model construction unit 10 described above is used as the value of the segment information lift value L_X4_0 in the above formula.

以上で進行中プロジェクトNew1について、タイミング情報T=1、T=2、T=3、T=4までのそれぞれについてリスクスコアを求めることができた。これで各タイミング情報（T=1,…,4）におけるリスクスコアの時系列変化を示すデータを生成、出力することが可能となる。 The risk scores of the ongoing project New1 have now been obtained for each of the timing information values T = 1, T = 2, T = 3, and T = 4. This makes it possible to generate and output data showing the time-series change of the risk score across the timing information values (T = 1, ..., 4).

図１０に各タイミング情報（T=1、2，3，4）におけるリスクスコアの時系列変化を可視化した折れ線グラフを示す。このグラフより、T=2（設計工程）からT=3（単体テスト工程）にかけて、プロジェクト失敗リスクの予測値が少しずつ上昇していることが分かる。 FIG. 10 shows a line graph that visualizes the time-series change of the risk score in each timing information (T = 1, 2, 3, 4). From this graph, it can be seen that the predicted value of project failure risk gradually increases from T = 2 (design process) to T = 3 (unit test process).

同様にして、全ての進行中プロジェクト（New1〜New5）についてリスクスコアを算出することができる。各進行中プロジェクトのリスクスコアをタイミング情報（T=1、2，3，4）毎に示すことで、各進行中プロジェクトのリスクスコアの時系列変化をグラフで可視化することができる。図１１に進行中プロジェクト（New1〜New5）のリスクスコアの時系列変化を示すグラフを掲げる。このグラフから、例えば、以下のような情報が分かる。プロジェクト New3、及び、New4 は、プロジェクト失敗リスクの予測値であるリスクスコアが上昇傾向にある。これらプロジェクトについては監視強化が望ましく、必要に応じて、これらプロジェクトについてリスク軽減策を実施すべき、などの判断が行える。 Similarly, risk scores can be calculated for all ongoing projects (New1-New5). By showing the risk score of each ongoing project for each timing information (T = 1, 2, 3, 4), the time series change of the risk score of each ongoing project can be visualized in a graph. FIG. 11 shows a graph showing the time series change of the risk score of the ongoing project (New1 to New5). From this graph, for example, the following information can be understood. Projects New3 and New4 have an increasing risk score, which is a predictive value of project failure risk. For these projects, it is desirable to strengthen monitoring. If necessary, it can be judged that risk reduction measures should be implemented for these projects.

また、プロジェクト New5 は、T=2 （設計工程）時点では、比較的リスクが低いとみなせる。 Project New5 can be considered to be relatively low risk at T = 2 (design process).

［１．２．リスク評価装置の動作例］ [1.2. Operation example of risk evaluation apparatus]
図１に示したリスク評価装置１の動作例を説明する。図１２にリスク評価装置１の主たる動作であるリスクスコア算出処理の例を示したフローチャートを示す。 An operation example of the risk evaluation apparatus 1 shown in FIG. 1 will be described. FIG. 12 is a flowchart showing an example of the risk score calculation process, which is the main operation of the risk evaluation apparatus 1.

リスクスコア算出処理において、リスク評価装置１、より詳しくは予測モデル構築部１０は過去プロジェクト情報２００、プロジェクト失敗定義情報３００、影響因子情報４００から符号化過去プロジェクト情報５００を生成する（S10)。 In the risk score calculation process, the risk evaluation apparatus 1, more specifically, the prediction model construction unit 10, generates encoded past project information 500 from the past project information 200, project failure definition information 300, and influence factor information 400 (S10).

次にリスク評価装置１、より詳しくは予測モデル構築部１０は符号化過去プロジェクト情報５００及び影響因子情報４００に基づいて、各影響因子について区分値ごとに区分情報リフト値を算出する（S20）。 Next, the risk evaluation apparatus 1, more specifically, the prediction model construction unit 10 calculates a division information lift value for each division value for each influence factor based on the encoded past project information 500 and the influence factor information 400 (S 20).

次にリスク評価装置１、より詳しくはリスクスコア計算部２０は、あらかじめ記憶されている進行中プロジェクト情報８００、影響因子情報４００に基づいて、符号化進行中プロジェクト情報９００を生成する（S30）。 Next, the risk evaluation apparatus 1, more specifically the risk score calculation unit 20, generates the encoded ongoing project information 900 based on the ongoing project information 800 and the influence factor information 400 stored in advance (S30).

次にリスク評価装置１、より詳しくはリスクスコア計算部２０は、符号化過去プロジェクト情報５００若しくは符号化進行中プロジェクト情報９００に対して、各影響因子のタイミング情報、及び区分情報毎の区分情報リフト値を用いて、それぞれのプロジェクトについてタイミング情報の値ごとのリスクスコアを計算する（S40）。 Next, the risk evaluation apparatus 1, more specifically the risk score calculation unit 20, calculates, for the encoded past project information 500 or the encoded ongoing project information 900, the risk score of each project for each timing information value, using the timing information of each influence factor and the division information lift value for each division (S40).

図２に示した過去プロジェクト情報２００に基づいてリスク評価装置１が出力するリスクスコアをS_p_jで表すと、プロジェクトPJ1について：S_PJ1_0、S_PJ1_1、S_PJ1_2、S_PJ1_3、S_PJ1_4という値の組となる。同様に、他のプロジェクトのそれぞれについてもリスク評価装置１はタイミング情報の値ごとのリスクスコアS_p_0、S_p_1、S_p_2、... 、S_p_m（mはタイミング情報Tがとる最大の値）を出力する。 When the risk scores that the risk evaluation apparatus 1 outputs based on the past project information 200 shown in FIG. 2 are denoted S_p_j, the output for the project PJ1 is the set of values S_PJ1_0, S_PJ1_1, S_PJ1_2, S_PJ1_3, and S_PJ1_4. Similarly, for each of the other projects, the risk evaluation apparatus 1 outputs the risk scores S_p_0, S_p_1, S_p_2, ..., S_p_m for each timing information value (m is the maximum value taken by the timing information T).

リスク評価装置１は計算の結果得られたリスクスコアを数値の列挙の形式で出力しても良いし、グラフとして出力するようにしてもかまわない。出力形式に限定はない。 The risk evaluation apparatus 1 may output the calculated risk scores as a list of numerical values or as a graph. The output format is not limited.
以上でリスク評価装置１の動作例の説明を終了する。 This concludes the description of the operation example of the risk evaluation apparatus 1.

［１．３．第１の実施の形態の利点］ [1.3. Advantages of the first embodiment]
（１）多様の測定データから構成される過去プロジェクト情報に基づいて、進行中プロジェクトにおけるプロジェクト失敗の危険度を、複数の測定データを総合した「リスクスコア」として定量化し、その時系列変化を求めることができる。 (1) Based on past project information composed of various measurement data, the risk of project failure in an ongoing project can be quantified as a “risk score” that combines multiple measurement data, and its time-series change can be obtained.

（２）プロジェクト失敗の危険度を表す「リスクスコア」の時系列変化を監視することにより、プロジェクト失敗の兆候を早期に検知し、適切なタイミングで対応策を実施することができる。 (2) By monitoring the time series change of the “risk score” representing the risk of project failure, it is possible to detect signs of project failure at an early stage and implement countermeasures at an appropriate timing.

[２．第２の実施の形態] [2. Second Embodiment]
本発明の第２の実施の形態を説明する。第２の実施の形態はリスク評価装置として提案される。 A second embodiment of the present invention will be described. The second embodiment is proposed as a risk evaluation apparatus.

第２の実施の形態にかかるリスク評価装置は、第１の実施の形態の特徴に加えて、リスクスコアを入力とするロジスティック回帰分析により、任意の時点でのプロジェクト失敗の確率を計算することを特徴とする。 In addition to the features of the first embodiment, the risk evaluation apparatus according to the second embodiment calculates the probability of project failure at an arbitrary point of time by logistic regression analysis using a risk score as an input. Features.

第２の実施の形態にかかるリスク評価装置は、コンピュータ、ワークステーションなどの情報処理装置であって、この情報処理装置は、演算処理装置（ＣＰＵ）、主メモリ（ＲＡＭ）、読み出し専用メモリ（ROM）、入出力装置（Ｉ／Ｏ）、及び必要な場合にはハードディスク装置等の外部記憶装置を具備している装置である。 The risk evaluation apparatus according to the second embodiment is an information processing apparatus such as a computer or a workstation. The information processing apparatus includes an arithmetic processing unit (CPU), a main memory (RAM), and a read-only memory (ROM). ), An input / output device (I / O), and, if necessary, an external storage device such as a hard disk device.

［２．１．リスク評価装置の構成例］ [2.1. Configuration example of risk evaluation apparatus]
図１３に第２の実施の形態に係るリスク評価装置の構成例を示す機能ブロック図を掲げる。 FIG. 13 is a functional block diagram showing a configuration example of the risk evaluation apparatus according to the second embodiment.

リスク評価装置２は、入力情報記憶部３０と、この入力情報記憶部３０に接続された予測モデル構築部１０と、この予測モデル構築部１０に接続されたリスクスコア計算部２０と、このリスクスコア計算部２０へ接続された進行中プロジェクト情報記憶部４０と、リスクスコア計算部２０に接続されたプロジェクト失敗確率計算部５０を有する。プロジェクト失敗確率計算部５０は第３の処理手段に相当する。 The risk evaluation device 2 includes an input information storage unit 30, a prediction model construction unit 10 connected to the input information storage unit 30, a risk score calculation unit 20 connected to the prediction model construction unit 10, and the risk score An ongoing project information storage unit 40 connected to the calculation unit 20 and a project failure probability calculation unit 50 connected to the risk score calculation unit 20 are included. The project failure probability calculation unit 50 corresponds to a third processing unit.

リスク評価装置２は、第１の実施の形態に係るリスク評価装置１と基本的同様の構成要素を有しており、プロジェクト失敗確率計算部５０をさらに有している点が異なっている。 The risk evaluation device 2 has basically the same components as the risk evaluation device 1 according to the first embodiment, and differs in that it further includes a project failure probability calculation unit 50.

リスク評価装置２の構成要素のうち、第１の実施の形態に係るリスク評価装置１の構成要素と同一のものは、同一の参照符号を付してそれら構成要素の詳細な説明は省略する。 Among the constituent elements of the risk evaluation apparatus 2, the same constituent elements as those of the risk evaluation apparatus 1 according to the first embodiment are given the same reference numerals, and detailed descriptions thereof are omitted.

［２．１．１．プロジェクト失敗確率計算部］ [2.1.1. Project failure probability calculation unit]
プロジェクト失敗確率計算部５０は、リスクスコア計算部２０が出力するリスクスコアの時系列データと、符号化過去プロジェクト情報５００とに基づいて、過去プロジェクトに基づく失敗確率回帰式の導出を行い、この失敗確率回帰式を用いて進行中プロジェクトに対する失敗確率の計算を行い、タイミング情報毎（T = 1, 2, …, m）のプロジェクト失敗確率を出力する機能を有する。 The project failure probability calculation unit 50 derives a failure probability regression equation from past projects, based on the time-series risk score data output by the risk score calculation unit 20 and the encoded past project information 500. It then uses this failure probability regression equation to calculate the failure probability of each ongoing project, and outputs the project failure probability for each timing information value (T = 1, 2, ..., m).

なお、リスクスコアの時系列データについては、過去プロジェクト分、及び、進行中プロジェクト分の両方を用いる。また、符号化過去プロジェクト情報５００で必要な情報は、Y （プロジェクトの成功 / 失敗）の値のみである。 As for the time series data of risk scores, both past projects and ongoing projects are used. Further, the only information necessary for the encoded past project information 500 is a value of Y (project success / failure).

［２．１．１．１．過去プロジェクトに基づく失敗確率回帰式の導出］ [2.1.1.1. Derivation of the failure probability regression equation based on past projects]
プロジェクト失敗確率計算部５０による、過去プロジェクトに基づく失敗確率回帰式の導出について説明する。 The derivation of the failure probability regression equation based on past projects by the project failure probability calculation unit 50 will now be described.

プロジェクト失敗確率計算部５０は、符号化過去プロジェクト情報５００のプロジェクトの成功／失敗を示す値を格納する格納フィールド（図5に示す例では格納フィールド５１６）の値（以下「成否判定値」と呼ぶ）、及び、タイミング情報毎（T = 1, 2, …, m）の過去プロジェクトのリスクスコアに基づいて、プロジェクト失敗確率の回帰式を求める。 The project failure probability calculation unit 50 obtains a regression equation for the project failure probability based on (i) the value of the storage field that holds the project success/failure indicator in the encoded past project information 500 (storage field 516 in the example shown in FIG. 5; hereinafter the "success/failure determination value") and (ii) the risk scores of the past projects for each timing information value (T = 1, 2, ..., m).

具体的には、プロジェクト失敗確率計算部５０は、符号化された過去プロジェクト情報５００の成否判定値、及び、タイミング情報 T = j における過去プロジェクトのリスクスコアを入力としてロジスティック回帰分析を実施し、下記式４のa, b を求める。なお、ロジスティック回帰分析は、MINITABやRなど、標準的な統計ツールに実装されているものを用いてよい。 Specifically, the project failure probability calculation unit 50 performs logistic regression analysis using, as input, the success/failure determination values of the encoded past project information 500 and the risk scores of the past projects at timing information T = j, and obtains a and b in Equation 4 below. The logistic regression analysis may use an implementation provided by a standard statistical tool such as MINITAB or R.

上記式４において、「Pj」はタイミング情報 T = j における、成否判定値（Y） = 1（失敗）の確率を意味し、「X」は過去プロジェクトZのリスクスコア S_Z_jを意味する。 In the above equation 4, “Pj” means the probability of success / failure determination value (Y) = 1 (failure) in the timing information T = j, and “X” means the risk score S_Z_j of the past project Z.

元データによっては、ロジスティック回帰分析による式の導出が行えない場合もある。そのときは、該当するタイミング情報での失敗確率は求められない。例えば、元データが完全分離可能な場合（ある閾値で成功/失敗が完全に識別できる）、などである。 Depending on the original data, logistic regression analysis may fail to yield an equation, for example when the data are completely separable (success and failure can be perfectly distinguished by some threshold). In that case, the failure probability for the corresponding timing information cannot be obtained.

他のタイミング情報（T = j 以外）についても、同様の計算を実施する。 Similar calculations are performed for other timing information (other than T = j).
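
The derivation described above can be sketched in Python. This is only an illustrative sketch, not the apparatus's actual implementation: Equation 4 is not reproduced in this text, so the code assumes the usual single-variable logistic form log(Pj / (1 - Pj)) = a + b·X, with a as the intercept and b as the slope (the roles of a and b are an assumption). The function names and the sample data are hypothetical. When the data are completely separable, the fit does not converge and the sketch raises an error, mirroring the case in which no failure probability can be obtained for that timing.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_failure_regression(scores, outcomes, max_iter=100, tol=1e-10):
    """Fit logit(P) = a + b * X by Newton-Raphson and return (a, b).

    scores   -- risk scores of past projects at one timing T = j
    outcomes -- success/failure determination values (1 = failure, 0 = success)
    Raises ValueError when no regression equation can be derived,
    e.g. for completely separable data.
    """
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for x, y in zip(scores, outcomes):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            g_a += y - p            # gradient of the log-likelihood
            g_b += (y - p) * x
            h_aa += w               # entries of the (negated) Hessian
            h_ab += w * x
            h_bb += w * x * x
        det = h_aa * h_bb - h_ab * h_ab
        if det < 1e-12:
            raise ValueError("regression equation could not be derived")
        da = (h_bb * g_a - h_ab * g_b) / det
        db = (h_aa * g_b - h_ab * g_a) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            return a, b
    raise ValueError("logistic regression did not converge")


# Hypothetical past-project data for one timing value (not taken from FIG. 14).
demo_scores = [-1.0, -0.5, 0.0, 0.5, 1.0, -0.2, 0.3]
demo_outcomes = [0, 0, 1, 0, 1, 0, 1]
demo_a, demo_b = fit_failure_regression(demo_scores, demo_outcomes)
print(f"a = {demo_a:.3f}, b = {demo_b:.3f}")
```

In practice the same fit would be repeated once per timing value, and a timing whose fit raises an error would simply have no regression equation.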

失敗確率回帰式の導出の例を示す。図１４に、タイミング情報T=4の場合の、ロジスティック回帰分析によるプロジェクト失敗確率回帰式を導出するための元データを示す。この元データ１４００は、各過去プロジェクトの成否判定値（格納フィールド５１６に格納されている値）と、対応する過去プロジェクトpのリスクスコアS_p_4とで構成される。 An example of derivation of a failure probability regression equation is shown. FIG. 14 shows original data for deriving a project failure probability regression equation by logistic regression analysis in the case of timing information T = 4. The original data 1400 includes success / failure determination values (values stored in the storage field 516) of each past project and a risk score S_p_4 of the corresponding past project p.

プロジェクト失敗確率計算部５０は、元データ１４００より、a=3.544, b=7.342という結果を得て、以下の失敗確率回帰式を得る。 From the original data 1400, the project failure probability calculation unit 50 obtains a = 3.544 and b = 7.342, which gives the following failure probability regression equation.

上記式５によって得られるプロジェクト失敗確率は、タイミング情報T=4の場合に適用されるものとなる。プロジェクト失敗確率計算部５０は、他のタイミング情報についてもそれぞれ失敗確率回帰式を得る。 The project failure probability obtained by Equation 5 is applied when the timing information T = 4. The project failure probability calculation unit 50 obtains a failure probability regression formula for each of other timing information.

［２．１．１．２．失敗確率回帰式によるプロジェクト失敗確率の計算］ [2.1.1.2. Calculation of the project failure probability using the failure probability regression equation]
プロジェクト失敗確率計算部５０は、求めたプロジェクト失敗確率回帰式を用いて、進行中プロジェクトに対するプロジェクト失敗確率を計算する。求めたいタイミング情報の失敗確率回帰式を用いて、対象プロジェクトのリスクスコアを入力すれば、そのプロジェクトにおけるプロジェクト失敗確率が得られる。 The project failure probability calculation unit 50 calculates the project failure probability of each ongoing project using the project failure probability regression equation obtained above. Entering the risk score of the target project into the regression equation for the timing information of interest yields the project failure probability of that project.

プロジェクト失敗確率の算出例を示す。図８に示した例における進行中プロジェクトNew1について、タイミング情報T=4の場合のプロジェクト失敗確率P_New1_4を算出するためには、上記式５を用いた以下の計算を行う。 An example of calculating the project failure probability is shown. In order to calculate the project failure probability P_New1_4 when the timing information T = 4 for the ongoing project New1 in the example shown in FIG. 8, the following calculation using the above equation 5 is performed.

なお、上記式においてX=-0.4121（進行中プロジェクトNew1のタイミング情報T=4のリスクスコア；リスクスコア計算部２０によって算出されている）を用いている。 In the above formula, X = −0.4121 (risk score of timing information T = 4 of ongoing project New1; calculated by the risk score calculation unit 20) is used.
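
As a rough illustration of this calculation, the Python sketch below evaluates a logistic regression equation for New1 at T = 4 with the coefficients and risk score quoted in the text. Equation 5 itself is not reproduced here, so the form P = 1 / (1 + exp(-(a + b·X))), with a as the intercept and b as the slope, is an assumption. The value it prints should therefore not be read as the figure stated in the patent. The function name `failure_probability` is hypothetical.

```python
import math


def failure_probability(a, b, x):
    """Evaluate the assumed regression form P = 1 / (1 + exp(-(a + b * x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))


# Coefficients for T = 4 and the risk score of New1 at T = 4, as quoted in the text.
a_t4, b_t4 = 3.544, 7.342
x_new1_t4 = -0.4121

# Result under the assumed form only; a and b may play different roles in Equation 5.
p_new1_4 = failure_probability(a_t4, b_t4, x_new1_t4)
print(f"P_New1_4 (assumed form) = {p_new1_4:.4f}")
```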

同様にして、プロジェクト失敗確率計算部５０は、いずれの進行中プロジェクトNew1, New2, New3, New4, New5についても、任意のタイミング情報（T=1,…,4）におけるプロジェクト失敗確率を計算可能である。 Similarly, the project failure probability calculation unit 50 can calculate the project failure probability at any timing information value (T = 1, ..., 4) for each of the ongoing projects New1, New2, New3, New4, and New5.

[２．２．リスク評価装置２の動作例] [2.2. Operation example of the risk evaluation apparatus 2]
リスク評価装置２の動作例について説明する。図１５に第２の実施の形態に係るリスク評価装置２の主たる動作であるプロジェクト失敗確率算出処理の例を示したフローチャートを示す。 An operation example of the risk evaluation apparatus 2 will be described. FIG. 15 is a flowchart showing an example of the project failure probability calculation process, which is the main operation of the risk evaluation apparatus 2 according to the second embodiment.

プロジェクト失敗確率算出処理において、リスク評価装置２は、リスクスコア算出処理を実行する（S110）。リスクスコア算出処理は、第1の実施の形態におけるリスクスコア算出処理（図１２、ステップS10〜S40参照）と同一の処理であるので、処理内容の詳述は省略する。 In the project failure probability calculation process, the risk evaluation device 2 executes a risk score calculation process (S110). Since the risk score calculation process is the same process as the risk score calculation process (see FIG. 12, steps S10 to S40) in the first embodiment, the details of the process will be omitted.

次にリスク評価装置２、より詳しくはプロジェクト失敗確率計算部５０は、過去プロジェクト情報２００に基づいて失敗確率回帰式の導出を実行する（S120）。 Next, the risk evaluation device 2, more specifically, the project failure probability calculation unit 50, derives a failure probability regression equation based on the past project information 200 (S120).

次にリスク評価装置２、より詳しくはプロジェクト失敗確率計算部５０は、ステップS120において導出された失敗確率回帰式を用いて、進行中プロジェクトのプロジェクト失敗確率を算出する（S130）。 Next, the risk evaluation device 2, more specifically, the project failure probability calculation unit 50 calculates the project failure probability of the ongoing project using the failure probability regression equation derived in step S120 (S130).

最後に、リスク評価装置２、より詳しくはプロジェクト失敗確率計算部５０は、ステップS130において算出したプロジェクト失敗確率を出力する。 Finally, the risk evaluation device 2, more specifically the project failure probability calculation unit 50, outputs the project failure probability calculated in step S130.
以上でリスク評価装置２はプロジェクト失敗確率算出処理を終了する。 The risk evaluation device 2 then ends the project failure probability calculation process.

[２．３．第２の実施の形態の利点] [2.3. Advantages of the second embodiment]
（１）進行中プロジェクトが過去プロジェクトと同様の傾向を示すと仮定することにより、ロジスティック回帰分析を用いて、進行中プロジェクトに対する、任意の時点でのプロジェクト失敗の確率を計算することができる。 (1) By assuming that ongoing projects follow the same trends as past projects, logistic regression analysis can be used to calculate the probability of project failure at any point in time for an ongoing project.

（２）プロジェクト失敗の確率は、直観的に理解しやすく、プロジェクト制御のための意思決定等に有効に役立てることができる。 (2) The probability of project failure is intuitively easy to understand and can be used effectively in decision making for project control.

[３．第３の実施の形態] [3. Third Embodiment]
本発明の第３の実施の形態を説明する。第３の実施の形態はリスク評価装置として提案される。 A third embodiment of the present invention will be described. The third embodiment is proposed as a risk evaluation apparatus.

第３の実施の形態にかかるリスク評価装置は、第１の実施の形態の特徴に加えて、期待区分情報リフト値を用いて、影響因子の選定や区分情報の設定などを行い、影響因子情報を生成することを特徴とする。 In addition to the features of the first embodiment, the risk evaluation apparatus according to the third embodiment is characterized in that it uses the expected category information lift value to select influence factors, set category information, and so on, and thereby generates the influence factor information.

第３の実施の形態にかかるリスク評価装置は、コンピュータ、ワークステーションなどの情報処理装置であって、この情報処理装置は、演算処理装置（ＣＰＵ）、主メモリ（ＲＡＭ）、読み出し専用メモリ（ROM）、入出力装置（Ｉ／Ｏ）、及び必要な場合にはハードディスク装置等の外部記憶装置を具備している装置である。 The risk evaluation apparatus according to the third embodiment is an information processing apparatus such as a computer or a workstation. The information processing apparatus includes an arithmetic processing unit (CPU), a main memory (RAM), a read-only memory (ROM), an input/output device (I/O), and, if necessary, an external storage device such as a hard disk device.

［３．１．リスク評価装置の構成例］ [3.1. Configuration example of the risk evaluation apparatus]
図１６に第３の実施の形態に係るリスク評価装置の構成例を示す機能ブロック図を掲げる。リスク評価装置３は、影響因子候補情報記憶部６０と、影響因子候補情報記憶部６０に接続された影響因子情報決定部７０と、影響因子情報決定部７０に接続された入力情報記憶部３０と、この入力情報記憶部３０に接続された予測モデル構築部１０と、この予測モデル構築部１０に接続されたリスクスコア計算部２０と、このリスクスコア計算部２０へ接続された進行中プロジェクト情報記憶部４０とを有する。影響因子候補情報記憶部６０は第３の記憶手段に相当し、影響因子情報決定部７０は第４の処理手段に相当する。 FIG. 16 is a functional block diagram showing a configuration example of the risk evaluation apparatus according to the third embodiment. The risk evaluation device 3 includes an influence factor candidate information storage unit 60, an influence factor information determination unit 70 connected to the influence factor candidate information storage unit 60, an input information storage unit 30 connected to the influence factor information determination unit 70, a prediction model construction unit 10 connected to the input information storage unit 30, a risk score calculation unit 20 connected to the prediction model construction unit 10, and an ongoing project information storage unit 40 connected to the risk score calculation unit 20. The influence factor candidate information storage unit 60 corresponds to the third storage means, and the influence factor information determination unit 70 corresponds to the fourth processing means.

リスク評価装置３は、第1の実施の形態に係るリスク評価装置１と基本的に同様の構成要素を有しており、影響因子候補情報記憶部６０と影響因子情報決定部７０とをさらに有している点が異なっている。 The risk evaluation device 3 has basically the same components as the risk evaluation device 1 according to the first embodiment, and differs in that it further includes the influence factor candidate information storage unit 60 and the influence factor information determination unit 70.

第３の実施の形態に係るリスク評価装置３の構成要素のうち、第１の実施の形態に係るリスク評価装置１の構成要素と同一のものは、同一の参照符号を付してそれら構成要素の詳細な説明は省略する。 Among the components of the risk evaluation apparatus 3 according to the third embodiment, those identical to components of the risk evaluation apparatus 1 according to the first embodiment are given the same reference numerals, and their detailed description is omitted.

［３．１．１．影響因子候補情報記憶部］ [3.1.1. Influence factor candidate information storage unit]
影響因子候補情報記憶部６０は、影響因子候補情報１２００を記憶する機能を有する。影響因子候補情報１２００は、影響因子及びその区分情報の候補の集合である。リスク評価装置３のオペレーター、ユーザ、管理者などが影響因子及びその区分情報の候補を複数入力した情報である影響因子候補情報１２００に基づいて、影響因子候補情報記憶部６０は、各影響因子と区分情報について期待区分情報リフト値を算出し、この期待区分情報リフト値に基づいて影響因子及びその区分情報の候補を比較し、影響因子情報４００に含める影響因子及びその区分情報を決定する。なお、タイミング情報は、影響因子及びその区分情報の決定では必要としないが、予め影響因子候補情報１２００に含められていてもよい。 The influence factor candidate information storage unit 60 has a function of storing the influence factor candidate information 1200. The influence factor candidate information 1200 is a set of candidates for influence factors and their category information. Based on the influence factor candidate information 1200, which is information in which an operator, user, administrator, or the like of the risk evaluation device 3 has entered a plurality of candidates for influence factors and their category information, the influence factor candidate information storage unit 60 calculates an expected category information lift value for each influence factor and its category information, compares the candidates on the basis of this expected category information lift value, and determines the influence factors and category information to be included in the influence factor information 400. The timing information is not required for determining the influence factors and their category information, but may be included in the influence factor candidate information 1200 in advance.

［３．１．２．影響因子情報決定部］ [3.1.2. Influence factor information determination unit]
影響因子情報決定部７０は、過去プロジェクト情報の符号化を行い、期待区分情報リフト値を計算し、期待区分情報リフト値に基づいて影響因子情報４００に含める影響因子及びその区分情報を選択し、選択した影響因子及びその区分情報からなる影響因子情報４００を生成し、入力情報記憶部３０に記憶させる機能を有する。 The influence factor information determination unit 70 has a function of encoding the past project information, calculating the expected category information lift value, selecting the influence factors and their category information to be included in the influence factor information 400 based on the expected category information lift value, generating influence factor information 400 consisting of the selected influence factors and their category information, and storing it in the input information storage unit 30.

［３．１．２．１．過去プロジェクト情報の符号化］ [3.1.2.1. Encoding of past project information]
影響因子情報決定部７０は、過去プロジェクト情報２００の符号化を行い、符号化過去プロジェクト情報を生成する。過去プロジェクト情報２００を符号化する方法は、第１の実施の形態と同様である。ただし、符号化で使用する区分情報は、影響因子候補情報１２００のものを用いる。 The influence factor information determination unit 70 encodes the past project information 200 and generates encoded past project information. The method of encoding the past project information 200 is the same as in the first embodiment, except that the category information used for encoding is that of the influence factor candidate information 1200.

［３．１．２．２．期待区分情報リフト値の算出］ [3.1.2.2. Calculation of the expected category information lift value]
影響因子情報決定部７０は、影響因子候補情報１２００に含まれる影響因子情報の候補（影響因子 Xi、区分情報 k = {0, 1, …, n}）のそれぞれに対して、下記式６で定義される期待区分情報リフト値 EL_Xi を計算する。 For each influence factor information candidate (influence factor Xi, category information k = {0, 1, ..., n}) included in the influence factor candidate information 1200, the influence factor information determination unit 70 calculates the expected category information lift value EL_Xi defined by Equation 6 below.

但し式6において Where in Equation 6

影響因子情報決定部７０は、算出した期待区分情報リフト値 EL_Xiを参照して、影響因子候補情報１２００に含まれる影響因子の内、どの影響因子を影響因子情報４００に含めるかを決定し、決定した影響因子を含む影響因子情報４００を生成し、これを入力情報記憶部３０に記憶させる。 The influence factor information determination unit 70 refers to the calculated expected category information lift values EL_Xi, determines which of the influence factors included in the influence factor candidate information 1200 to include in the influence factor information 400, generates influence factor information 400 containing the determined influence factors, and stores it in the input information storage unit 30.

なお、複数の影響因子情報の候補がある場合には、期待区分情報リフト値EL_Xiが高いものを選ぶ方が、プロジェクト成功／失敗の予測精度が高まる可能性が大きいと考えられるが、必ずしも期待区分情報リフト値EL_Xiが高いものから順に選ぶ方式に本実施の形態は限定されるものではない。 If there are multiple candidates for influencing factor information, it is considered that selecting a higher expected category information lift value EL_Xi is likely to increase the accuracy of project success / failure prediction. The present embodiment is not limited to the method of selecting the information lift value EL_Xi in descending order.
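
The idea behind the expected category information lift value can be sketched in Python. Equation 6 is not reproduced in this text, so the sketch relies on the wording of the claims, which describe the value as an expectation of the category information lift value, taking its reciprocal when the lift value is 1 or less. The choice of weights (the share of records falling in each category value) is an assumption, and the function name and the sample numbers are hypothetical.

```python
def expected_lift(weights, lifts):
    """Weighted sum over category values of the lift, or of its reciprocal when lift <= 1.

    weights -- {category value: assumed probability of that category value}
    lifts   -- {category value: category information lift value}
    """
    total = 0.0
    for k, weight in weights.items():
        lift = lifts[k]
        if lift <= 0:
            raise ValueError("lift values must be positive")
        # A lift far from 1 in either direction counts as informative.
        total += weight * (lift if lift > 1.0 else 1.0 / lift)
    return total


# Hypothetical candidate: two category values that are equally likely.
demo_weights = {0: 0.5, 1: 0.5}
demo_lifts = {0: 0.5, 1: 2.0}
print(expected_lift(demo_weights, demo_lifts))
```

Taking the reciprocal for lift values at or below 1 means that a category value strongly associated with success contributes as much as one strongly associated with failure, so the score reflects how well the candidate separates the two outcomes.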

期待区分情報リフト値の算出の具体例を示す。同一の影響因子Xiについて異なる区分情報を有する、２つの影響因子の候補について期待区分情報リフト値EL_Xiを算出し、比較する。 A specific example of calculating the expected category information lift value is shown. Expected classification information lift value EL_Xi is calculated and compared for two candidate influence factors having different classification information for the same influence factor Xi.

（１）第１の候補 (1) First candidate
第１の候補は、影響因子 X1、区分情報｛0:[MIN, 2.5）, 1:[2.5, MAX], NA:欠損値｝という内容を有する。 The first candidate consists of influence factor X1 with category information {0: [MIN, 2.5), 1: [2.5, MAX], NA: missing value}.

この第１の候補の期待区分情報リフト値EL_Xiを算出する。但し、過去プロジェクト情報２００は、図２に示すデータを使用したものとする。 The expected classification information lift value EL_Xi of the first candidate is calculated. However, it is assumed that the past project information 200 uses the data shown in FIG.

影響因子情報決定部７０は、過去プロジェクト情報２００及び第１の候補の区分情報から符号化過去プロジェクト情報及び区分情報リフト値を生成し、この符号化過去プロジェクト情報及び区分情報リフト値から以下の値を得る。 The influence factor information determination unit 70 generates the encoded past project information and the segment information lift value from the past project information 200 and the first candidate segment information, and the following values are calculated from the encoded past project information and the segment information lift value. Get.

影響因子情報決定部７０は、これらの値を上記式６にあてはめ、第１の候補の期待区分情報リフト値EL_Xiを以下のように得る。 The influencing factor information determination unit 70 applies these values to the above equation 6 to obtain the first candidate expected category information lift value EL_Xi as follows.

（２）第２の候補 (2) Second candidate
第２の候補は、影響因子 X1、区分情報｛0:[MIN, 2.0）, 1:[2.0, MAX], NA:欠損値｝という内容を有する。第１の候補とは同一の影響因子であるが区分情報の内容は異なっている。 The second candidate consists of influence factor X1 with category information {0: [MIN, 2.0), 1: [2.0, MAX], NA: missing value}. It uses the same influence factor as the first candidate, but its category information is different.
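
The difference between the two candidates can be illustrated with a minimal Python sketch of the encoding step. It maps a value of X1 to category value 0 for the half-open interval [MIN, threshold), to 1 for [threshold, MAX], and to "NA" for a missing value. The function name `encode` and the sample values are hypothetical and are not taken from FIG. 2.

```python
def encode(value, threshold):
    """Return the category value of one X1 value for a two-category split at threshold."""
    if value is None:            # missing value
        return "NA"
    return 0 if value < threshold else 1


# Hypothetical X1 values, encoded under both candidates.
sample_values = [1.8, 2.2, 2.5, None]
first_candidate = [encode(v, 2.5) for v in sample_values]
second_candidate = [encode(v, 2.0) for v in sample_values]
print(first_candidate, second_candidate)
```

A value between 2.0 and 2.5 falls in category 0 under the first candidate and in category 1 under the second, which is why the two candidates can yield different category information lift values from the same past project information.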

この第２の候補の期待区分情報リフト値EL_Xiを算出する。但し、第１の候補の場合と同様に過去プロジェクト情報２００は、図２に示すデータを使用したものとする。 The expected classification information lift value EL_Xi of the second candidate is calculated. However, as in the case of the first candidate, the past project information 200 is assumed to use the data shown in FIG.

影響因子情報決定部７０は、過去プロジェクト情報２００及び第２の候補の区分情報から符号化過去プロジェクト情報及び区分情報リフト値を生成し、この符号化過去プロジェクト情報及び区分情報リフト値から以下の値を得る。 The influence factor information determination unit 70 generates the encoded past project information and the segment information lift value from the past project information 200 and the second candidate segment information, and the following values are calculated from the encoded past project information and the segment information lift value. Get.

影響因子情報決定部７０は、これらの値を上記式６にあてはめ、第２の候補の期待区分情報リフト値EL_Xiを以下のように得る。 The influencing factor information determination unit 70 applies these values to the above equation 6 to obtain the second candidate expected category information lift value EL_Xi as follows.

上記第１及び第２の影響因子情報の候補の期待区分情報リフト値EL_Xiを比較すると、第１の候補の期待区分情報リフト値EL_Xi（=1.3443）よりも、第２の候補の期待区分情報リフト値EL_Xi（=1.4441）の方が若干高い。よって、影響因子情報４００の決定において、影響因子情報決定部７０は、第２の候補を採用し、影響因子情報４００に第２の候補のデータを含める。これによりプロジェクト成功 / 失敗の予測精度が高まる可能性がある。 Comparing the expected category information lift values EL_Xi of the first and second influence factor information candidates, the value for the second candidate (= 1.4441) is slightly higher than the value for the first candidate (= 1.3443). Therefore, in determining the influence factor information 400, the influence factor information determination unit 70 adopts the second candidate and includes its data in the influence factor information 400. This may increase the accuracy of project success/failure prediction.

[３．２．第３の実施の形態の動作例] [3.2. Operation example of the third embodiment]
図１６に示したリスク評価装置３の動作例を説明する。図１７に第３の実施の形態に係るリスク評価装置３の主たる動作の例を示したフローチャートを示す。 An operation example of the risk evaluation apparatus 3 shown in FIG. 16 will be described. FIG. 17 is a flowchart showing an example of the main operation of the risk evaluation apparatus 3 according to the third embodiment.

まずリスク評価装置３、より詳しくは影響因子情報決定部７０は、過去プロジェクト情報２００、プロジェクト失敗定義情報３００、影響因子候補情報１２００に基づいて符号化過去プロジェクト情報を生成する（S210）。 First, the risk evaluation apparatus 3, more specifically, the influence factor information determination unit 70 generates encoded past project information based on the past project information 200, the project failure definition information 300, and the influence factor candidate information 1200 (S210).

次にリスク評価装置３、より詳しくは影響因子情報決定部７０は、ステップS210で生成した符号化過去プロジェクト情報から、影響因子候補情報１２００に含まれる影響因子の候補のそれぞれについて期待区分情報リフト値EL_Xiを算出する（S220)。 Next, the risk evaluation device 3, more specifically the influence factor information determination unit 70, calculates the expected category information lift value EL_Xi for each of the influence factor candidates included in the influence factor candidate information 1200 from the encoded past project information generated in step S210 (S220).

次にリスク評価装置３、より詳しくは影響因子情報決定部７０は、期待区分情報リフト値EL_Xiに基づいて影響因子情報４００に含める影響因子及びその区分情報を決定し、決定した影響因子及びその区分情報を含む新たな影響因子情報４００を生成し、これを入力情報記憶部３０に記憶させる（S230）。 Next, the risk evaluation device 3, more specifically, the influence factor information determination unit 70 determines the influence factor and its classification information to be included in the influence factor information 400 based on the expected classification information lift value EL_Xi, and the determined influence factor and its classification New influence factor information 400 including the information is generated and stored in the input information storage unit 30 (S230).

次にリスク評価装置３はリスクスコア算出処理を実行する（S240）。リスクスコア算出処理は、第１の実施の形態におけるリスクスコア算出処理（図１２、ステップS10〜S40参照）と同一の処理であるので、処理内容の詳述は省略する。なお、リスクスコア算出処理（S240）で使用される影響因子情報４００は、ステップS230で生成、記憶された影響因子情報４００である。 Next, the risk evaluation device 3 executes the risk score calculation process (S240). Since this process is the same as the risk score calculation process in the first embodiment (see FIG. 12, steps S10 to S40), a detailed description is omitted. The influence factor information 400 used in the risk score calculation process (S240) is the influence factor information 400 generated and stored in step S230.
以上で第３の実施の形態に係るリスク評価装置の動作の説明を終了する。 This concludes the description of the operation of the risk evaluation apparatus according to the third embodiment.

[３．３．第３の実施の形態の利点] [3.3. Advantages of the third embodiment]
第１の実施の形態の予測モデル構築において、ユーザやオペレータ等によって行われる影響因子の選定や区分情報の設定などの作業に最も手間がかかり、その良し悪しが最終的なプロジェクト失敗の予測精度にも大きく影響する。 In constructing the prediction model of the first embodiment, the selection of influence factors and the setting of category information, which are carried out by a user, operator, or the like, are the most labor-intensive tasks, and how well they are done strongly affects the final accuracy of project failure prediction.

第３の実施の形態では、期待区分情報リフト値を評価関数として、最適な影響因子の選定、及び、区分情報の設定を行うことが可能となることにより、予測モデル構築作業の手間を大幅に低減すると共に、プロジェクト失敗の予測精度向上も期待できる。 In the third embodiment, it is possible to select an optimal influence factor and set the category information using the expected category information lift value as an evaluation function, thereby greatly reducing the labor of the prediction model construction work. In addition to reducing this, it can be expected to improve the accuracy of project failure prediction.

[４．第４の実施の形態] [4. Fourth Embodiment]
第１、第２、第３の実施の形態を組みあわせてもリスク評価装置は成立する。第１、第２、第３の実施の形態を組みあわせたリスク評価装置４の構成例を示す機能ブロック図を図１８に掲げる。なお、第１、第２、第３の実施の形態と同一の構成要素については同一の参照符号を付す。 A risk evaluation apparatus can also be formed by combining the first, second, and third embodiments. FIG. 18 is a functional block diagram showing a configuration example of a risk evaluation apparatus 4 that combines the first, second, and third embodiments. Components identical to those of the first, second, and third embodiments are given the same reference numerals.

[５．まとめ、その他] [5. Summary and other remarks]
以上、本発明の実施の形態を説明したが、本発明はこれらに限定されるものではなく、発明の趣旨を逸脱しない範囲内において、種々の変更、追加、組み合わせ等が可能である。 Embodiments of the present invention have been described above, but the present invention is not limited to them, and various changes, additions, combinations, and the like are possible without departing from the spirit of the invention.

１、２、３、４・・・リスク評価装置；１０・・・予測モデル構築部；２０・・・リスクスコア計算部；３０・・・入力情報記憶部；４０・・・進行中プロジェクト情報記憶部；５０・・・プロジェクト失敗確率計算部；６０・・・影響因子候補情報記憶部；７０・・・影響因子情報決定部
1, 2, 3, 4: risk evaluation device; 10: prediction model construction unit; 20: risk score calculation unit; 30: input information storage unit; 40: ongoing project information storage unit; 50: project failure probability calculation unit; 60: influence factor candidate information storage unit; 70: influence factor information determination unit

Claims

1. A risk evaluation apparatus comprising:
first storage means for storing past project information, which is information describing the values of a plurality of influence factors for projects implemented in the past; project failure definition information, which describes the criteria for distinguishing successful projects from failed projects among the projects implemented in the past; and influence factor information, which is information including category information obtained by dividing the range that the value of an influence factor can take into a plurality of categories and timing information indicating when the value of the influence factor becomes available;
first processing means for generating encoded past project information by replacing, based on the project failure definition information and the category information of the influence factor information, the value of each influence factor in the past project information with the category value corresponding to the category to which that value belongs, and for calculating, based on the encoded past project information, a category information lift value, which is information indicating the degree of influence of the influence factor on the success or failure of the project, for each category value of each influence factor;
second storage means for storing ongoing project information, which is information describing the values of a plurality of influence factors for projects in progress; and
second processing means for generating encoded ongoing project information by replacing, based on the category information of the influence factor information, the value of each influence factor in the ongoing project information with the category value corresponding to the category to which that value belongs, and for calculating, for the encoded past project information or the encoded ongoing project information, a risk score for each timing information value of each project using the timing information of each influence factor and the category information lift value for each category value.

2. The second processing means calculates the category information lift value L_Xi_k, for influence factor Xi and category value k, by the following equation:

(where c1 is the number of project records in which the category value for the influence factor Xi is k and the value of the storage field indicating success or failure of the project is a value indicating success,
c2 is the number of project records in which the category value for the influence factor Xi is neither k nor "NA" and the value of the storage field indicating success or failure of the project is a value indicating success,
c3 is the number of project records in which the category value for the influence factor Xi is k and the value of the storage field indicating success or failure of the project is a value indicating failure,
c4 is the number of project records in which the category value for the influence factor Xi is neither k nor "NA" and the value of the storage field indicating success or failure of the project is a value indicating failure, and
a and b are arbitrary numbers)
The second processing means calculates the risk score using the timing information of each influencing factor and the category information lift value for each category value according to the following formula:
The risk evaluation device according to claim 1

(where S_p_0 is the risk score of project p when the timing information value is 0, S_p_j is the risk score of project p when the timing information value is j, T(Xi) is the timing information value of influence factor Xi, and L_Xi_k is the category information lift value for influence factor Xi and category value k).

3. The risk evaluation apparatus according to claim 1, further comprising third processing means for obtaining a regression equation for the project failure probability based on the success/failure determination value, which is the value indicating success or failure of the project in the encoded past project information, and on the risk scores of the past projects for each timing information value, and for calculating the project failure probability of an ongoing project using the obtained project failure probability regression equation.

4. The risk evaluation apparatus according to claim 3, wherein the third processing means performs logistic regression analysis using, as input, the success/failure determination values of the encoded past project information and the risk scores of the past projects at timing information T = j, and obtains the project failure probability regression equation by determining a and b in the following equation:

(However, Pj is the probability of project failure in the timing information T = j).

5. The risk evaluation apparatus according to claim 1, further comprising:
third storage means for storing influence factor candidate information, which is a set of candidates for influence factors and their category information; and
fourth processing means for generating encoded past project information from the past project information using the influence factor candidate information, calculating an expected category information lift value, which is the sum of the expected values of the category information lift value or, when the category information lift value is 1 or less, of its reciprocal, selecting the influence factors and their category information to be included in the influence factor information based on the expected category information lift value, and generating influence factor information including the selected influence factors and their category information.

6. The risk evaluation apparatus according to claim 5, wherein the fourth processing means calculates an expected category information lift value by the following equation.