CN112734569A - Stock risk prediction method and system based on user portrait and knowledge graph
- Publication number
- CN112734569A (application CN202011641247.3A)
- Authority
- CN
- China
- Prior art keywords
- stock
- user
- risk
- financial
- event
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
Abstract
The application discloses a stock risk prediction method and system based on a user portrait and a knowledge graph, comprising the following steps: constructing financial event characteristics by using related text data in the stock field and a financial knowledge graph; constructing user investment characteristics by using the user portrait and the affected stock codes; calibrating the stock risk value by using financial events, the financial knowledge graph and the rise and fall of stock data; associating the obtained financial event characteristics with the user investment characteristics and the stock risk values to form a training set; and training on the training set with an LSTM algorithm to form a stock risk prediction model based on the user portrait and the knowledge graph, which then predicts stock risk information from the relevant characteristics of the investment user. By combining financial event characteristics with user investment characteristics to train the LSTM model, the invention predicts the risk of the stocks held by an investor and can provide personalized stock risk prompts for users.
Description
Technical Field
The application relates to the technical field of computers, in particular to a stock risk prediction method and system based on user portrait and knowledge graph.
Background
Many application scenarios require studying events and predicting their risk, for example determining the influence and risk that a user information leak at an internet company poses to network security. There are two main approaches to predicting the influence of events on stock risk: the traditional model method and the time series prediction method. The traditional model method comes in two types: quantitative and qualitative.
1. The traditional model method: the quantitative method usually mines public sentiment factors and constructs an algorithm-based quantitative public sentiment factor; that is, an event is first factorized, and its influence and risk degree are then measured by quantitative indicators, such as the level of historical investment return within a preset time after the event. The qualitative method usually defines the event and analyzes its risk degree manually, through manual labeling.
2. Time series prediction: on the one hand, time series prediction is a regression prediction method whose basic principle is to assume continuity in the development of the object, statistically analyze past time series data, and estimate the object's development trend; on the other hand, it takes into account the randomness caused by accidental factors and, to eliminate the influence of random fluctuation, performs statistical analysis on historical data and processes the data appropriately before predicting the trend.
However, the above prediction methods have the following drawbacks:
1. The traditional model method:
Quantitative analysis schemes often lack a detailed division of event types, lose the logical context of events, and have poor interpretability.
Qualitative analysis requires strong professional expertise and must analyze events one by one; it cannot be systematized or automated, so analysis efficiency is low. Whether the result is correct depends on whether the analyst's subjective experience covers the key attributes of the event. In addition, qualitative analysis can only judge the direction of influence (positive or negative) and cannot quantify its degree, so the method is highly subjective.
2. Time series prediction:
The models are simplistic: prediction research targets stock price changes or their fundamental factors, does not fully consider the influence of financial events on stock risk, and cannot predict the risk of invested stocks according to the user's risk preference.
Disclosure of Invention
In order to solve the problems of the prior art proposed in the background art, the application provides a stock risk prediction method and system based on a user portrait and a knowledge graph.
The embodiment of the application provides a stock risk prediction method based on user portrait and knowledge graph, which comprises the following steps:
constructing financial event characteristics by using related text data in the stock field and a financial knowledge graph;
constructing user investment characteristics by using the user portrait and the affected stock codes;
calibrating the stock risk value by using financial events, the financial knowledge graph and the rise and fall of stock data;
correlating the obtained financial event characteristics and the user investment characteristics according to stock codes to form a characteristic part of a training set; then, associating the characteristic part of the training set with the stock risk value according to the stock code to finally form a training set;
training the training set by using an LSTM algorithm to form a stock risk prediction model based on the user portrait and the knowledge graph;
inputting the investment characteristics of a certain investment user to be predicted and the financial event characteristics at the next moment into a stock risk prediction model based on a user portrait and a knowledge graph, predicting the risk value at the next moment for the stocks invested by the investment user, and outputting the stock risk value invested by the investment user;
and constructing a stock risk prompt information rule by using the investment characteristics of the user and the predicted stock risk value, and prompting the stock risk information for the user through the stock risk prompt information rule.
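The association by stock code described in the steps above amounts to a join on the stock code. A minimal Python sketch follows, assuming the features are held as plain dictionaries; the function name `build_training_set`, the field layout and all sample values are illustrative assumptions, not taken from the application.

```python
from typing import Any, Dict, List

def build_training_set(
    event_features: List[Dict[str, Any]],
    user_features: List[Dict[str, Any]],
    risk_values: Dict[str, float],
) -> List[Dict[str, Any]]:
    """Join event and user features on stock_id, then attach the risk value y."""
    rows = []
    for event in event_features:
        for user in user_features:
            if user["stock_id"] != event["stock_id"]:
                continue  # features are associated only when the stock codes match
            y = risk_values.get(event["stock_id"])
            if y is None:
                continue  # no calibrated risk value for this stock
            rows.append({**user, **event, "y": y})
    return rows

# Made-up sample data for illustration only.
event_features = [{"stock_id": "000001", "time": "2020-12-30",
                   "x1": [1, 0, 0], "x2": [0, 1, 0], "x3": [0, 0, 1], "x4": [1, 0, 0]}]
user_features = [
    {"user_id": "user_A", "stock_id": "000001", "x5": [0, 1, 0], "x6": [0, 1, 0]},
    {"user_id": "user_B", "stock_id": "000002", "x5": [1, 0, 0], "x6": [1, 0, 0]},
]
risk_values = {"000001": 0.42}

rows = build_training_set(event_features, user_features, risk_values)
```

Only user_A holds a stock touched by the sample event, so only that pairing yields a training row.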
Further, the user portrait construction method comprises the following steps:
acquiring user behavior data and user survey data in financial software;
performing data processing on the user behavior data and the user survey data through data extraction, data conversion and data fusion;
clustering the processed user behavior data and user survey data along each label dimension by using the k-means and GMM (Gaussian mixture model) algorithms of the model layer;
finally forming a user portrait, namely a tagged representation of the user together with the grade information corresponding to each tag.
Further, the construction method of the financial knowledge graph comprises the following steps:
extracting text from unstructured data, performing at least Chinese word segmentation, keyword extraction and feature extraction by using a natural language processing technology;
learning the extraction rules of semi-structured data through a wrapper, and extracting the content of the semi-structured data;
obtaining structured data through an ETL technology, and directly obtaining entities and the relationships among the entities;
establishing and managing a knowledge base, namely establishing, through a data mapping technology, a mapping relation between terms in the knowledge base and words in the knowledge extracted from different data sources; fusing data about the same object from different data sources by using entity matching; and finally storing and managing the fused knowledge base;
adopting different storage architectures for the knowledge base according to the user's query scenarios, finally forming the financial knowledge graph.
Further, the construction of the financial event features by using the relevant text data and the financial knowledge graph in the stock field specifically comprises the following steps:
acquiring related text data in the stock field through a content text library, analyzing the text data by using a natural language processing technology, extracting event entities from the content in the text data, extracting events influencing the event entities, and constructing an event library by using an expert annotation method;
associating the extracted event entities with entities in the financial knowledge graph to obtain the stock codes influenced by the events, and mining related elements from the financial knowledge graph; and combining these with the event and the time at which it occurred to form the financial event characteristics.
Further, the user investment characteristics are constructed by using the user portrait and the affected stock codes, and are specifically expressed as:
E(u) = [user_id, stock_id, x5, x6], wherein user_id represents the user ID, stock_id represents the stock code, x5 represents risk preference, and x6 represents risk tolerance; the originally categorical risk preference x5 and risk tolerance x6 data are converted into numerical form by the one-hot method.
Further, calibrating the stock risk value by using the financial event, the financial knowledge graph and the rise and fall of the stock data specifically comprises:
wherein x_a is the stock risk value attributable to financial events, ω_1 is the weight of x_a, x_b is the stock risk value attributable to the rise and fall of the stock data, and ω_2 is the weight of x_b; the data of the stock at n moments are accumulated, and their average is taken as the stock risk value of the stock.
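The expression these symbols belong to is not reproduced in the text. One reading consistent with the definitions, assumed here rather than taken from the application, is a weighted sum of the two risk components averaged over the n moments, with the superscript (t) indexing the moment and y denoting the resulting stock risk value:

```latex
y = \frac{1}{n}\sum_{t=1}^{n}\left(\omega_1\, x_a^{(t)} + \omega_2\, x_b^{(t)}\right)
```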
Further, the financial event characteristics are defined as:
E(e) = [stock_id, time, x1, x2, x3, x4], wherein stock_id represents the stock code, time represents the time at which the event occurred, x1 represents industry impact, x2 represents the company's internal operating condition, x3 represents the favorable event level, and x4 represents the unfavorable event level; categorical data are converted into numerical form by the one-hot method;
wherein the favorable event level x3 and the unfavorable event level x4 are defined and obtained in the following way:
for events, event entities and event occurrence times, labeling the favorable event level and the unfavorable event level as high, medium or low, and accumulating a labeled training set;
training a classification model on the accumulated training set by using a random forest algorithm;
for newly occurring events, classifying the favorable event level and the unfavorable event level from the event, the event entity and the event occurrence time by using the trained classification model, namely classifying the event as high, medium or low.
The embodiment of the present application further provides a stock risk prediction system based on a user portrait and a knowledge graph, including:
the financial event characteristic construction module is used for constructing financial event characteristics by utilizing related text data in the stock field and the financial knowledge graph;
the investment characteristic construction module is used for constructing the investment characteristics of the user by utilizing the user portrait and the affected stock codes;
the stock risk value calibration module is used for calibrating the stock risk value by using financial events, the financial knowledge graph and the rise and fall of stock data;
the training set building module is used for correlating the obtained financial event characteristics and the user investment characteristics according to stock codes to form a characteristic part of a training set; then, associating the characteristic part of the training set with the stock risk value according to the stock code to finally form a training set;
the stock risk prediction model construction module is used for training the training set by using an LSTM algorithm to form a stock risk prediction model based on the user portrait and the knowledge graph;
the stock risk value prediction module is used for inputting the investment characteristics of a certain investment user to be predicted and the financial event characteristics at the next moment into a stock risk prediction model based on a user portrait and a knowledge graph, predicting the risk value at the next moment for the stocks invested by the investment user and outputting the stock risk value invested by the investment user;
and the stock risk information prompting module is used for constructing a stock risk prompting information rule by utilizing the investment characteristics of the user and the predicted stock risk value, and prompting the stock risk information for the user through the stock risk prompting information rule.
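The application does not spell out the stock risk prompt information rule here. As a hedged illustration only, one simple rule compares the predicted risk value with a threshold chosen by the user's risk tolerance; the thresholds, the function name `risk_prompt` and the message wording below are all hypothetical.

```python
from typing import Optional

# Hypothetical thresholds: the lower the tolerance, the earlier the prompt fires.
TOLERANCE_THRESHOLD = {"low": 0.3, "medium": 0.5, "high": 0.7}

def risk_prompt(user_id: str, stock_id: str,
                risk_tolerance: str, predicted_risk: float) -> Optional[str]:
    """Return a prompt message when the predicted risk exceeds the user's threshold."""
    threshold = TOLERANCE_THRESHOLD[risk_tolerance]
    if predicted_risk > threshold:
        return (f"{user_id}: predicted risk {predicted_risk:.2f} for stock {stock_id} "
                f"exceeds your {risk_tolerance} tolerance threshold {threshold:.2f}")
    return None  # risk is within the user's tolerance, so no prompt is issued

cautious = risk_prompt("user_A", "000001", "low", 0.45)
relaxed = risk_prompt("user_B", "000001", "high", 0.45)
```

The same predicted value triggers a prompt for the low-tolerance user and none for the high-tolerance user, which is the personalization the modules above aim for.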
The embodiment of the application also provides a terminal device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor executes the computer program to realize the steps of the stock risk prediction method based on the user portrait and the knowledge graph.
Embodiments of the present application also provide a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the stock risk prediction method based on user portrait and knowledge graph as described above.
The embodiment of the application adopts at least one technical scheme which can achieve the following beneficial effects:
Risk prediction models are widely applied in the financial securities industry, but most are simplistic and issue only broad, undifferentiated reminders; they cannot predict and flag the risk of the specific products a user has invested in according to the user's risk preference and the influence of financial events on stock risk. The invention provides a stock risk prediction method and system based on a user portrait and a financial knowledge graph: on the one hand, it constructs financial event characteristics by expanding the elements of a financial event; on the other hand, it constructs user investment characteristics from the user portrait. By combining the two and training an LSTM model, it predicts the risk of the stocks held by an investor and provides personalized stock risk prompts for users.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of a stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
FIG. 2 is a user portrait construction flowchart of the stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
FIG. 3 is a flowchart of the construction of the financial knowledge graph in the stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
FIG. 4 is a schematic diagram illustrating an implementation process of an embodiment of a stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
FIG. 5 is a schematic diagram illustrating the association between events and the financial knowledge graph in the stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
FIG. 6 is a schematic diagram illustrating the association between the financial knowledge graph and the user portrait in the stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
FIG. 7 is a diagram of the relationship among events, the affected stocks and user portraits in the stock risk prediction method based on a user portrait and a knowledge graph according to the present application.
Fig. 8 is a schematic diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Example 1
The embodiment of the application provides a stock risk prediction method based on user portrait and knowledge graph, which comprises the following steps:
step 1, constructing financial event characteristics by using related text data in the stock field and a financial knowledge graph; the financial knowledge graph mainly stores knowledge from the financial field, such as stocks, futures, bonds, listed companies, financial figures and stock fundamentals. As shown in FIG. 3, the construction method of the financial knowledge graph comprises the following steps:
(1) extracting text (stock codes, stock prices, etc.) from unstructured data (financial news, newspaper data, etc.), performing at least Chinese word segmentation, keyword extraction and feature extraction by using a natural language processing technology;
(2) learning the extraction rules of semi-structured data (news data in a hypertext markup format) through a wrapper, and extracting its content (entities and the relationships among them can be obtained directly);
(3) obtaining structured data (a stock information database and a stock information registration table) through an ETL technology, and directly obtaining entities and the relationships among them;
(4) establishing and managing a knowledge base, namely establishing, through a data mapping technology, a mapping relation between terms in the knowledge base and words in the knowledge extracted from different data sources; fusing data about the same object from different data sources by using entity matching (also called entity alignment); and finally storing and managing the fused knowledge base;
(5) adopting different storage architectures for the knowledge base according to different user query scenarios, finally forming the financial knowledge graph, which contains stock codes, company names, senior executives, upstream and downstream enterprises, subsidiaries, the company's internal operating condition (good, medium or bad) and industry impact (getting better, unchanged or getting worse). Knowledge storage and management can use different storage architectures, such as NoSQL or a relational database, depending on the query scenario. Because a large-scale knowledge base also has the characteristics of big data, a big data platform such as Spark or Hadoop is needed to provide high-performance computing and support fast operations.
Step 2, constructing user investment characteristics by using the user portrait and the affected stock codes; as shown in FIG. 2, the user portrait construction method comprises:
(1) acquiring user behavior data (including browsing, clicking, purchasing and other data) and user survey data (user basic information, financial level and other data) in financial management software;
(2) performing data processing on the user behavior data and the user survey data through data extraction, data conversion and data fusion;
(3) clustering the processed user behavior data and user survey data along each label dimension by using the k-means and GMM (Gaussian mixture model) algorithms of the model layer (for example, clustering risk preference into 3 categories: low risk, medium risk and high risk);
(4) finally, a user portrait is formed, namely the user's tagged display and the corresponding grade information of the tag, for example: user A, age 30-40; the risk preference is medium risk; the risk tolerance is moderate; the financial level is medium.
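Step (3) above can be illustrated for a single label dimension. The sketch below is a plain one-dimensional k-means in Python that groups a hypothetical per-user risk score into three tiers; the score values, the function name `kmeans_1d` and the deterministic initialization are assumptions, and the GMM variant named in the text is not shown.

```python
from typing import List

def kmeans_1d(values: List[float], k: int = 3, iterations: int = 20) -> List[int]:
    """Cluster scalar scores into k groups; label 0 is the lowest-valued cluster."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    # Relabel so that cluster ids follow ascending centroid value.
    order = sorted(range(k), key=lambda c: centroids[c])
    rank = {c: r for r, c in enumerate(order)}
    return [rank[lab] for lab in labels]

TIERS = ["low risk", "medium risk", "high risk"]
risk_scores = [0.05, 0.10, 0.12, 0.48, 0.50, 0.55, 0.90, 0.95]  # made-up scores
labels = kmeans_1d(risk_scores, k=3)
risk_preference = [TIERS[label] for label in labels]
```

Each user's tier then becomes one tag in the portrait, alongside tags clustered the same way for other dimensions.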
Step 3, calibrating the stock risk value by using the financial event, the financial knowledge graph and the rise and fall of the stock data;
step 4, correlating the obtained financial event characteristics and the user investment characteristics according to stock codes to form a characteristic part of a training set; then, associating the characteristic part of the training set with the stock risk value according to the stock code to finally form a training set;
step 5, training the training set by using an LSTM algorithm to form a stock risk prediction model based on the user portrait and the knowledge graph;
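step 6, inputting the investment characteristics of the investment user to be predicted and the financial event characteristics at the next moment into the stock risk prediction model based on the user portrait and the knowledge graph, predicting the risk value at the next moment for the stocks invested in by the user, and outputting that stock risk value;
step 7, constructing a stock risk prompt information rule by using the investment characteristics of the user and the predicted stock risk value, and prompting the user with stock risk information through the stock risk prompt information rule.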
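To make the LSTM step concrete, the sketch below runs the forward pass of a single LSTM cell over a short sequence of feature vectors and maps the final hidden state to a risk value in (0, 1). It is a pure-Python illustration with random, untrained weights and no bias terms; the class name `LSTMCell`, the sigmoid output head and all sizes are assumptions, and the application does not specify the network architecture.

```python
import math
import random
from typing import List, Tuple

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

class LSTMCell:
    """Single LSTM cell with forget, input, candidate and output gates (biases omitted)."""

    def __init__(self, input_size: int, hidden_size: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        width = input_size + hidden_size

        def init() -> List[List[float]]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                    for _ in range(hidden_size)]

        self.hidden_size = hidden_size
        self.w_forget, self.w_input = init(), init()
        self.w_cell, self.w_output = init(), init()

    def step(self, x: List[float], h: List[float],
             c: List[float]) -> Tuple[List[float], List[float]]:
        z = x + h  # concatenate the current input with the previous hidden state
        f = [sigmoid(a) for a in matvec(self.w_forget, z)]
        i = [sigmoid(a) for a in matvec(self.w_input, z)]
        g = [math.tanh(a) for a in matvec(self.w_cell, z)]
        o = [sigmoid(a) for a in matvec(self.w_output, z)]
        c_next = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_next = [oj * math.tanh(cj) for oj, cj in zip(o, c_next)]
        return h_next, c_next

def predict_risk(sequence: List[List[float]], cell: LSTMCell,
                 head: List[float]) -> float:
    """Feed the feature sequence through the cell and squash the last state to (0, 1)."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in sequence:
        h, c = cell.step(x, h, c)
    return sigmoid(sum(w * v for w, v in zip(head, h)))

# Two time steps of a made-up, already one-hot-encoded feature vector.
sequence = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
cell = LSTMCell(input_size=3, hidden_size=4, seed=0)
risk = predict_risk(sequence, cell, head=[0.5, 0.5, 0.5, 0.5])
```

In practice the weights would be learned from the training set with a deep learning framework; the point here is only how a sequence of characteristics is folded into one predicted value for the next moment.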
The implementation of the above steps is described in detail as follows:
in the step 1, constructing the financial event characteristics by using the related text data in the stock field and the financial knowledge graph specifically comprises the following steps:
step 11, obtaining related text data in the stock field from a content text library, analyzing the text data by using a natural language processing technology, first extracting event entities (for example, a certain company) from the text, then extracting the events influencing those entities (for example, a sanction), and constructing an event library by expert annotation (stock industry experts label the events that influence the stock industry);
step 12, associating the extracted event entities with entities in the financial knowledge graph to obtain the stock codes influenced by the events, and mining related elements (such as stock fundamentals) from the financial knowledge graph; and combining these with the event (namely the event influencing the stock) and the time at which it occurred to form the financial event characteristics, which are defined as:
E(e) = [stock_id, time, x1, x2, x3, x4], wherein stock_id represents the stock code, time represents the time at which the event occurred, x1 represents industry impact, x2 represents the company's internal operating condition, x3 represents the favorable event level, and x4 represents the unfavorable event level; categorical data are converted into numerical form by the one-hot method; wherein the favorable event level x3 and the unfavorable event level x4 are defined and obtained in the following way:
firstly, for events, event entities and event occurrence times, labeling the favorable event level and the unfavorable event level as high, medium or low, and accumulating a labeled training set;
secondly, training a classification model on the accumulated training set by using a random forest algorithm;
and thirdly, for newly occurring events, classifying the favorable event level and the unfavorable event level from the event, the event entity and the event occurrence time by using the trained classification model, namely classifying the event as high, medium or low.
As shown in FIG. 5, the entity of the financial event is associated with the affected stocks in the financial knowledge graph to obtain their basic information.
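The association of an event entity with affected stocks can be sketched as a lookup in a small in-memory graph. In the Python sketch below the graph contents, the field names and the choice to follow only direct downstream links are assumptions made for illustration; a production knowledge graph would sit in a graph or relational store.

```python
from typing import Any, Dict, List

# Toy knowledge graph keyed by entity name; every value here is made up.
KNOWLEDGE_GRAPH: Dict[str, Dict[str, Any]] = {
    "Company A": {"stock_id": "000001", "industry_impact": "unchanged",
                  "operation": "good", "downstream": ["Company B"]},
    "Company B": {"stock_id": "000002", "industry_impact": "getting worse",
                  "operation": "medium", "downstream": []},
}

def affected_stocks(event_entity: str,
                    graph: Dict[str, Dict[str, Any]]) -> List[Dict[str, str]]:
    """Return the event entity's own stock plus those of its direct downstream companies."""
    if event_entity not in graph:
        return []  # the entity could not be linked to the graph
    names = [event_entity] + list(graph[event_entity]["downstream"])
    return [
        {"stock_id": graph[name]["stock_id"],
         "x1": graph[name]["industry_impact"],   # industry impact
         "x2": graph[name]["operation"]}         # internal operating condition
        for name in names if name in graph
    ]

hits = affected_stocks("Company A", KNOWLEDGE_GRAPH)
misses = affected_stocks("Unknown Co", KNOWLEDGE_GRAPH)
```

The x1 and x2 values returned here are still text categories; they would be one-hot encoded before joining the financial event characteristics.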
In the step 2, the user investment characteristics are constructed by using the user portrait and the affected stock codes, and are specifically expressed as follows:
E(u) = [user_id, stock_id, x5, x6], wherein user_id represents the user ID, stock_id represents the stock code, x5 represents risk preference, and x6 represents risk tolerance; the originally categorical risk preference x5 and risk tolerance x6 data are converted into numerical form by the one-hot method. FIG. 6 illustrates how the affected stocks are associated with the user portrait to form the user investment characteristics.
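A minimal Python sketch of the one-hot conversion for E(u) follows. The three-level vocabularies and the function names are assumptions; the text only states that x5 and x6 are categorical and are converted by the one-hot method.

```python
from typing import Dict, List

# Assumed category vocabularies, ordered low to high.
RISK_PREFERENCE = ["low", "medium", "high"]
RISK_TOLERANCE = ["low", "medium", "high"]

def one_hot(value: str, vocabulary: List[str]) -> List[int]:
    """Encode a category as a 0/1 vector with a single 1 at its vocabulary position."""
    if value not in vocabulary:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == item else 0 for item in vocabulary]

def user_investment_feature(user_id: str, stock_id: str,
                            risk_preference: str,
                            risk_tolerance: str) -> Dict[str, object]:
    """Build E(u) = [user_id, stock_id, x5, x6] with x5 and x6 one-hot encoded."""
    return {
        "user_id": user_id,
        "stock_id": stock_id,
        "x5": one_hot(risk_preference, RISK_PREFERENCE),
        "x6": one_hot(risk_tolerance, RISK_TOLERANCE),
    }

feature = user_investment_feature("user_A", "000001", "medium", "high")
```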
In step 3, the stock risk value is calibrated from financial events, the financial knowledge graph, and the fluctuation of stock data, specifically as follows:
where x_a is the stock risk value attributable to financial events, ω1 is the weight of that value, x_b is the stock risk value attributable to fluctuations in stock data, and ω2 is the weight of that value. The stock's data at n moments are accumulated, and the average is taken as the stock risk value of that stock. The result has the form: stock code (stock_id) and stock risk value (y).
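The calibration formula itself does not survive in this text, so the following is only one plausible reading of the definitions above: a weighted sum of the two risk components at each moment, averaged over n moments, that is, y = (1/n) * sum over t of (ω1 * x_a(t) + ω2 * x_b(t)). The function name, the default weights, and the sample numbers are assumptions for illustration.

```python
def calibrate_risk_value(event_risks, fluctuation_risks, w1=0.5, w2=0.5):
    """Average the weighted sum w1 * x_a + w2 * x_b over n moments.

    event_risks holds x_a at each moment and fluctuation_risks holds x_b.
    The weighted-sum form and the default weights are assumptions; the
    source only names the four quantities and the averaging over n moments.
    """
    if len(event_risks) != len(fluctuation_risks):
        raise ValueError("both series must cover the same n moments")
    if not event_risks:
        raise ValueError("at least one moment is required")
    total = sum(
        w1 * x_a + w2 * x_b
        for x_a, x_b in zip(event_risks, fluctuation_risks)
    )
    return total / len(event_risks)


# Two moments with made-up component values.
print(calibrate_risk_value([80, 90], [70, 100]))
```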
In step 4, the financial event characteristics and the user investment characteristics are associated by stock code to form the characteristic part of the training set. Specifically, the two are joined on stock_id to give:
E(s,u,v) = [user_id, stock_id, time, x1, x2, x3, x4, x5, x6]
so that the two kinds of characteristics together better predict the stock risk faced by the investing user. Fig. 7 shows how the financial event characteristics are joined with the user investment characteristics. Finally, the characteristic part is associated with the stock risk value y calculated in step 3, again by the stock code stock_id, to form the training set. Each record has the form: user ID (user_id), stock code (stock_id), time, industry impact (x1), company internal operating condition (x2), favorable event level (x3), unfavorable event level (x4), user risk preference (x5), user risk tolerance (x6), and stock risk value (y). These parameters are described as follows:
(1) industry impact x1 (getting better, unchanged, getting worse) [text content obtainable from the constructed knowledge graph], converted to numerical form by the one-hot method;
(2) company internal operating condition x2 (good, medium, bad) [text content obtainable from the constructed knowledge graph], converted to numerical form by the one-hot method;
(3) favorable event level x3 (high, medium, low) [labeled by experts at first, later graded by the classification model], converted to numerical form by the one-hot method;
(4) unfavorable event level x4 (high, medium, low) [labeled by experts at first, later graded by the classification model], converted to numerical form by the one-hot method;
(5) risk preference level x5 (high, medium, low) [graded by constructing the user portrait and clustering], converted to numerical form by the one-hot method;
(6) risk tolerance x6 (high, medium, low) [graded by constructing the user portrait and clustering], converted to numerical form by the one-hot method.
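The two joins described in step 4 can be sketched as follows, assuming the characteristics are held as plain dictionaries. The function name `build_training_set`, the dictionary layout, and the sample values are hypothetical; the source specifies only that stock_id is the join key and that y is looked up by stock code.

```python
def build_training_set(event_rows, user_rows, risk_by_stock):
    """Join E(e) and E(u) on stock_id, then attach the risk value y.

    event_rows: dicts with stock_id, time, x1..x4
    user_rows: dicts with user_id, stock_id, x5, x6
    risk_by_stock: mapping from stock_id to the calibrated risk value y
    """
    events_by_stock = {}
    for event in event_rows:
        events_by_stock.setdefault(event["stock_id"], []).append(event)

    training_set = []
    for user in user_rows:
        stock_id = user["stock_id"]
        if stock_id not in risk_by_stock:
            continue  # no label for this stock, so no training record
        for event in events_by_stock.get(stock_id, []):
            training_set.append({
                "user_id": user["user_id"],
                **event,  # stock_id, time, x1..x4
                "x5": user["x5"],
                "x6": user["x6"],
                "y": risk_by_stock[stock_id],
            })
    return training_set


# Made-up sample data: only the first event matches the user's stock.
events = [
    {"stock_id": "600000", "time": "t1", "x1": "worse", "x2": "medium",
     "x3": "low", "x4": "high"},
    {"stock_id": "000001", "time": "t1", "x1": "better", "x2": "good",
     "x3": "high", "x4": "low"},
]
users = [{"user_id": "u1", "stock_id": "600000", "x5": "low", "x6": "medium"}]
print(build_training_set(events, users, {"600000": 85.0}))
```

The categorical values are left as text here; in practice they would be one-hot encoded before training.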
After the training set is formed, it is used to train an LSTM to obtain the stock risk prediction model based on the user portrait and the knowledge graph. Events in the stock market are correlated over time. By learning from past stock market and event data, an LSTM neural network can capture the influences and relationships among the data, and its selective memory mechanism allows it to uncover the temporal regularities of stock risk and thereby predict stock risk.
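An LSTM is trained on sequences rather than on isolated records, so the time-ordered training records for one user and stock have to be cut into windows first. The source does not say how this is done; the sketch below assumes a fixed-length sliding window in which each run of `window` consecutive feature vectors is paired with the risk value of the following moment. The function name and the window length are illustrative assumptions.

```python
def make_windows(rows, window):
    """Turn time-ordered (features, y) pairs into LSTM-style samples.

    Each sample is `window` consecutive feature vectors, and its target is
    the risk value y at the moment right after the window.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    sequences, targets = [], []
    for end in range(window, len(rows)):
        sequences.append([features for features, _ in rows[end - window:end]])
        targets.append(rows[end][1])
    return sequences, targets


# Five moments of made-up two-dimensional features with risk values.
history = [([0, 1], 72.0), ([1, 0], 75.0), ([1, 0], 81.0),
           ([0, 1], 88.0), ([1, 0], 93.0)]
sequences, targets = make_windows(history, window=3)
print(len(sequences), targets)
```

The resulting sequences have the (samples, time steps, features) layout that common LSTM implementations expect once converted to arrays.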
The investment characteristics of an investing user to be predicted and the financial event characteristics at the next moment can then be input into the stock risk prediction model based on the user portrait and the knowledge graph, which predicts the risk value at the next moment for the stocks the user has invested in and outputs that risk value.
A stock risk prompt information rule is constructed from the user's investment characteristics (such as age level, risk preference, risk tolerance, and financial level) and the predicted stock risk value, and stock risk information is presented to the user through this rule. To construct the rule, the risk level of the predicted risk value of the stock the user has invested in is determined first, for example: 70-80 is low risk, 80-90 is medium risk, and 90-100 is high risk. Then, according to the investor's risk preference in the user portrait, stock risk information is presented to the user based on the risk preference and the level of the predicted stock risk value.
In a specific application of the present application:
For example, for a stock held by a user, the stock risk prediction model based on the user portrait and the knowledge graph predicts a risk value of 89, which is medium risk. This is combined with the investor's risk preference from the user portrait. If the risk preference is low, the user is prompted according to the prompt information rule: "Your investment risk preference is low risk, and the model predicts medium risk for the stock you currently hold. Please act promptly to stop losses in time." If the investor's risk preference is high, the user is prompted according to the prompt information rule: "Your investment risk preference is high risk, and the model predicts medium risk for the stock you currently hold. Please continue to monitor changes in risk to avoid losses."
As another example, an investing user holds a new energy vehicle stock, and the user's characteristics indicate a low-risk investment preference. Financial events then change as follows. First, the state adjusts its new energy vehicle policy and introduces a new energy electric vehicle manufacturer; according to the relations in the knowledge graph, this affects the R&D and operating strategies of domestic manufacturers of the same type and bears on the R&D costs of upstream and downstream industries. The model predicts that the stock held by the user is low risk, and the prompt is: "Your investment risk preference is low risk, and the model predicts low risk for the stock you currently hold. Please continue to monitor changes in risk to avoid investment losses." Second, as major shareholders of the related company substantially reduce their holdings, the risk level predicted by the model rises to medium risk, and the investing user is prompted: "Your investment risk preference is low risk, and the model predicts medium risk for the stock you currently hold. Please act promptly to avoid investment losses." Third, as the situation escalates and the securities regulator (the China Securities Regulatory Commission) opens an inquiry and investigation, the risk level predicted by the model rises to high risk, and the investing user is prompted: "Your investment risk preference is low risk, and the model predicts high risk for the stock you currently hold. Please act promptly to stop losses in time."
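The prompt rule illustrated by the two examples can be sketched as follows. The risk bands come from the description; treating each band as half-open at its upper bound, returning nothing below 70, and choosing the advice by comparing the predicted level with the user's preference are assumptions inferred from the example prompts, and the function names are hypothetical.

```python
LEVEL_ORDER = {"low": 0, "medium": 1, "high": 2}


def risk_level(value):
    """Map a predicted risk value to a level using the bands in the text.

    Boundary handling (70 <= v < 80, and so on) is an assumption, and the
    source defines no level below 70, so None is returned there.
    """
    if 70 <= value < 80:
        return "low"
    if 80 <= value < 90:
        return "medium"
    if 90 <= value <= 100:
        return "high"
    return None


def risk_prompt(preference, value):
    """Build the prompt text for a user's risk preference and a prediction."""
    level = risk_level(value)
    if level is None:
        return None
    # Assumed rule: urge action only when the predicted level exceeds
    # the level the user prefers; otherwise advise continued monitoring.
    if LEVEL_ORDER[level] > LEVEL_ORDER[preference]:
        advice = "Please act promptly to stop losses in time."
    else:
        advice = "Please continue to monitor changes in risk to avoid losses."
    return (
        f"Your investment risk preference is {preference} risk, and the model "
        f"predicts {level} risk for the stock you currently hold. {advice}"
    )


print(risk_prompt("low", 89))
print(risk_prompt("high", 89))
```

With a predicted value of 89, a low-preference user is urged to act while a high-preference user is advised to keep monitoring, which matches the first example above.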
Example 2
The embodiment of the present application further provides a stock risk prediction system based on a user portrait and a knowledge graph, including:
a financial event characteristic construction module, configured to construct financial event characteristics from text data related to the stock field and the financial knowledge graph;
an investment characteristic construction module, configured to construct user investment characteristics from the user portrait and the affected stock codes;
a stock risk value calibration module, configured to calibrate the stock risk value from financial events, the financial knowledge graph, and the fluctuation of stock data;
a training set construction module, configured to associate the obtained financial event characteristics and user investment characteristics by stock code to form the characteristic part of a training set, and then to associate the characteristic part with the stock risk value by stock code to form the training set;
a stock risk prediction model construction module, configured to train on the training set using an LSTM algorithm to form the stock risk prediction model based on the user portrait and the knowledge graph;
a stock risk value prediction module, configured to input the investment characteristics of an investing user to be predicted and the financial event characteristics at the next moment into the stock risk prediction model based on the user portrait and the knowledge graph, to predict the risk value at the next moment for the stocks the user has invested in, and to output that risk value;
and a stock risk information prompting module, configured to construct a stock risk prompt information rule from the user's investment characteristics and the predicted stock risk value, and to present stock risk information to the user through that rule.
Fig. 8 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 8, the terminal device 6 of this embodiment includes: a processor 60, a memory 61, and a computer program 62 stored in the memory 61 and executable on the processor 60, such as a stock risk prediction program based on a user profile and a knowledge graph. The processor 60, when executing the computer program 62, implements the steps in the above-described embodiments of a method for predicting stock risk based on a user representation and a knowledge graph, such as the steps shown in FIG. 1. Alternatively, the processor 60 executes the computer program 62 to implement the functions of the modules/units in the device embodiments, such as the functions of the financial event characteristic constructing module, the investment characteristic constructing module, the stock risk value calibrating module, the training set constructing module, the stock risk prediction model constructing module, the stock risk value predicting module, and the stock risk information prompting module.
Illustratively, the computer program 62 may be partitioned into one or more modules/units that are stored in the memory 61 and executed by the processor 60 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution of the computer program 62 in the terminal device 6. For example, the computer program 62 may be partitioned into a financial event feature construction module, an investment feature construction module, a stock risk value calibration module, a training set construction module, a stock risk prediction model construction module, a stock risk value prediction module, and a stock risk information prompting module.
The terminal device 6 may be a desktop computer, a notebook, a palmtop computer, a cloud server, or another computing device. The terminal device may include, but is not limited to, a processor 60 and a memory 61. Those skilled in the art will appreciate that fig. 8 is merely an example of the terminal device 6 and does not limit it; the terminal device may include more or fewer components than shown, combine certain components, or use different components. For example, the terminal device may also include input/output devices, network access devices, buses, and the like.
The Processor 60 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 61 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6. The memory 61 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 6. Further, the memory 61 may also include both an internal storage unit and an external storage device of the terminal device 6. The memory 61 is used for storing the computer programs and other programs and data required by the terminal device. The memory 61 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.
Claims (10)
1. A stock risk prediction method based on user portrait and knowledge graph is characterized by comprising the following steps:
constructing financial event characteristics from text data related to the stock field and a financial knowledge graph;
constructing user investment characteristics from the user portrait and the affected stock codes;
calibrating the stock risk value from financial events, the financial knowledge graph, and the fluctuation of stock data;
associating the obtained financial event characteristics and user investment characteristics by stock code to form the characteristic part of a training set, and then associating the characteristic part of the training set with the stock risk value by stock code to form the training set;
training on the training set using an LSTM algorithm to form a stock risk prediction model based on the user portrait and the knowledge graph;
inputting the investment characteristics of an investing user to be predicted and the financial event characteristics at the next moment into the stock risk prediction model based on the user portrait and the knowledge graph, predicting the risk value at the next moment for the stocks the user has invested in, and outputting that risk value;
and constructing a stock risk prompt information rule from the user's investment characteristics and the predicted stock risk value, and presenting stock risk information to the user through the stock risk prompt information rule.
2. The stock risk prediction method based on the user portrait and the knowledge graph as claimed in claim 1, wherein the user portrait is constructed by:
acquiring user behavior data and user survey data in financial software;
performing data processing on the user behavior data and the user survey data through data extraction, data conversion and data fusion;
clustering the processed user behavior data and user survey data along each label dimension using the k-means and GMM (Gaussian mixture model) algorithms of a model layer;
finally, a user portrait, namely a labeling display of the user and grade information corresponding to the label is formed.
3. The stock risk prediction method based on the user portrait and the knowledge graph as claimed in claim 1, wherein the construction method of the financial knowledge graph is as follows:
extracting the text of the unstructured data; performing at least Chinese word segmentation, keyword extraction and feature extraction by using a natural language processing technology;
the extraction rule of the semi-structured data is learned through a wrapper, and the content of the semi-structured data is extracted;
obtaining the structured data through an ETL technology, and directly obtaining the entities and the relationships among the entities;
establishing and managing a knowledge base, namely: establishing, through a data mapping technology, a mapping between the terms in the knowledge base and the words in the knowledge extracted from different data sources; fusing the data that different data sources hold on the same object by using entity matching; and finally storing and managing the fused knowledge base;
the knowledge base adopts different storage architectures according to the user's different query scenarios, finally forming the financial knowledge graph.
4. The stock risk prediction method based on user portrait and knowledge graph as claimed in claim 1, wherein the financial event feature is constructed by using text data related to the stock field and the financial knowledge graph, specifically:
acquiring related text data in the stock field through a content text library, analyzing the text data by using a natural language processing technology, extracting event entities from the content in the text data, extracting events influencing the event entities, and constructing an event library by using an expert annotation method;
associating the extracted event entities with entities in a financial knowledge graph to obtain stock codes influenced by the events, and mining related elements from the financial knowledge graph by combining the financial knowledge graph; and combining the event and the time of the event to form the financial event characteristic.
5. The method for predicting the stock risk based on the user portrait and the knowledge graph as claimed in claim 1, wherein the user portrait and the affected stock codes are used to construct a user investment characteristic, and the user investment characteristic is specifically represented as follows:
E(u) = [user_id, stock_id, x5, x6]
where user_id is the user ID, stock_id is the stock code, x5 is the risk preference, and x6 is the risk tolerance, and the original categorical risk preference x5 and risk tolerance x6 are converted to numerical form by the one-hot method.
6. The stock risk prediction method based on user portrait and knowledge-graph as claimed in claim 1, wherein the stock risk value is calibrated by using the financial event, financial knowledge-graph and stock data fluctuation, specifically:
where x_a is the stock risk value attributable to financial events, ω1 is the weight of that value, x_b is the stock risk value attributable to fluctuations in stock data, and ω2 is the weight of that value; the stock's data at n moments are accumulated, and the average is taken as the stock risk value of that stock.
7. The method of claim 4, wherein the financial event characteristics are defined as:
E(e) = [stock_id, time, x1, x2, x3, x4]
where stock_id is the stock code, time is the time of the event, x1 is the industry impact, x2 is the company's internal operating condition, x3 is the favorable event level, and x4 is the unfavorable event level; categorical data are converted to numerical form by the one-hot method; and the favorable event level x3 and the unfavorable event level x4 are defined and obtained as follows:
labeling each event as high, medium, or low for the favorable event level and for the unfavorable event level, based on the event, the event entity, and the event occurrence time, and accumulating a training set;
training a classification model on the accumulated training set using a random forest algorithm;
for newly occurring events, using the trained classification model to classify the favorable event level and the unfavorable event level from the event, the event entity, and the event occurrence time, that is, assigning each event a high, medium, or low grade.
8. A stock risk prediction system based on a user profile and a knowledge graph, comprising:
a financial event characteristic construction module, configured to construct financial event characteristics from text data related to the stock field and the financial knowledge graph;
an investment characteristic construction module, configured to construct user investment characteristics from the user portrait and the affected stock codes;
a stock risk value calibration module, configured to calibrate the stock risk value from financial events, the financial knowledge graph, and the fluctuation of stock data;
a training set construction module, configured to associate the obtained financial event characteristics and user investment characteristics by stock code to form the characteristic part of a training set, and then to associate the characteristic part with the stock risk value by stock code to form the training set;
a stock risk prediction model construction module, configured to train on the training set using an LSTM algorithm to form the stock risk prediction model based on the user portrait and the knowledge graph;
a stock risk value prediction module, configured to input the investment characteristics of an investing user to be predicted and the financial event characteristics at the next moment into the stock risk prediction model based on the user portrait and the knowledge graph, to predict the risk value at the next moment for the stocks the user has invested in, and to output that risk value;
and a stock risk information prompting module, configured to construct a stock risk prompt information rule from the user's investment characteristics and the predicted stock risk value, and to present stock risk information to the user through that rule.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the method for predicting a risk of a stock based on a user profile and a knowledge graph according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for predicting a risk of a stock based on a user profile and a knowledge graph according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011641247.3A CN112734569A (en) | 2020-12-31 | 2020-12-31 | Stock risk prediction method and system based on user portrait and knowledge graph |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011641247.3A CN112734569A (en) | 2020-12-31 | 2020-12-31 | Stock risk prediction method and system based on user portrait and knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112734569A true CN112734569A (en) | 2021-04-30 |
Family
ID=75609092
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011641247.3A Pending CN112734569A (en) | 2020-12-31 | 2020-12-31 | Stock risk prediction method and system based on user portrait and knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112734569A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113902569A (en) * | 2021-10-30 | 2022-01-07 | 平安科技(深圳)有限公司 | Method for identifying the proportion of green assets in digital assets and related products |
CN114357197A (en) * | 2022-03-08 | 2022-04-15 | 支付宝(杭州)信息技术有限公司 | Event reasoning method and device |
CN114820191A (en) * | 2022-05-10 | 2022-07-29 | 中科柏诚科技(北京)股份有限公司 | Artificial intelligent stock system application method |
KR20220167909A (en) * | 2021-06-15 | 2022-12-22 | 김상율 | Apparatus and method for providing stock market prediction service by using of machine learning |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20220167909A (en) * | 2021-06-15 | 2022-12-22 | 김상율 | Apparatus and method for providing stock market prediction service by using of machine learning |
KR102597042B1 (en) | 2021-06-15 | 2023-10-31 | 김상율 | Apparatus and method for providing stock market prediction service by using of machine learning |
CN113902569A (en) * | 2021-10-30 | 2022-01-07 | 平安科技(深圳)有限公司 | Method for identifying the proportion of green assets in digital assets and related products |
CN114357197A (en) * | 2022-03-08 | 2022-04-15 | 支付宝(杭州)信息技术有限公司 | Event reasoning method and device |
CN114357197B (en) * | 2022-03-08 | 2022-07-26 | 支付宝(杭州)信息技术有限公司 | Event reasoning method and device |
CN114820191A (en) * | 2022-05-10 | 2022-07-29 | 中科柏诚科技(北京)股份有限公司 | Artificial intelligent stock system application method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020249125A1 (en) | Method and system for automatically training machine learning model | |
CN111401777B (en) | Enterprise risk assessment method, enterprise risk assessment device, terminal equipment and storage medium | |
CN112734569A (en) | Stock risk prediction method and system based on user portrait and knowledge graph | |
US9483544B2 (en) | Systems and methods for calculating category proportions | |
CN106447066A (en) | Big data feature extraction method and device | |
CN106844407B (en) | Tag network generation method and system based on data set correlation | |
CN110704572A (en) | Suspected illegal fundraising risk early warning method, device, equipment and storage medium | |
CN112561598A (en) | Customer loss prediction and retrieval method and system based on customer portrait | |
US11367116B1 (en) | System and method for automatic product matching | |
CN112182246A (en) | Method, system, medium, and application for creating an enterprise representation through big data analysis | |
KR20200039852A (en) | Method for analysis of business management system providing machine learning algorithm for predictive modeling | |
CN111179055B (en) | Credit line adjusting method and device and electronic equipment | |
Yao et al. | Using social media information to predict the credit risk of listed enterprises in the supply chain | |
Chen et al. | [Retracted] Analysis of E‐Commerce Marketing Strategy Based on Xgboost Algorithm | |
CN114372835B (en) | Comprehensive energy service potential customer identification method, system and computer equipment | |
CN116739649A (en) | User response potential evaluation method and device | |
Elena | News sentiment in bankruptcy prediction models: Evidence from Russian retail companies | |
CN116127189A (en) | User operation method, device, equipment and computer storage medium | |
CN115034762A (en) | Post recommendation method and device, storage medium, electronic equipment and product | |
CN114240553A (en) | Recommendation method, device and equipment for vehicle insurance products and storage medium | |
CN115099680A (en) | Risk management method, device, equipment and storage medium | |
CN114996579A (en) | Information pushing method and device, electronic equipment and computer readable medium | |
CN114513578A (en) | Outbound method, device, computer equipment and storage medium | |
CN112818215A (en) | Product data processing method, device, equipment and storage medium | |
CN111008874B (en) | Technical trend prediction method, system and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210430 |