CN114117903A - Rail transit short-time passenger flow prediction method based on bp neural network - Google Patents
- Publication number
- CN114117903A (application CN202111387081.1A)
- Authority
- CN
- China
- Prior art keywords
- passenger flow
- time
- neural network
- data
- station
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
- G06F30/27—Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2119/00—Details relating to the type or aim of the analysis or the optimisation
- G06F2119/02—Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Artificial Intelligence (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Computing Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Computer Hardware Design (AREA)
- Geometry (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a short-time passenger flow prediction method for rail transit based on a BP neural network, which specifically comprises the following steps: preprocessing the original data; calculating the dissimilarity of the preprocessed data by calculating the distance between different date types; clustering according to the calculated dissimilarity by merging the two partitions with the minimum distance until all partitions are merged into one; drawing a clustering pedigree graph and carrying out cluster analysis; performing periodic analysis of the temporal characteristics of passenger flow; performing periodic analysis of the spatial characteristics of passenger flow; performing periodic analysis of the spatio-temporal characteristics of passenger flow; extracting characteristic factors from the results of the cluster analysis and the passenger flow periodicity analysis; and selecting the parameters and related functions of the BP neural network and determining its initial weights and thresholds.
Description
Technical Field
The invention relates to the technical field of data statistics, and in particular to a short-time passenger flow prediction method for rail transit based on a BP neural network.
Background
The relevant background is the BP neural network improved by a genetic algorithm. The BP neural network model is a multilayer feedforward neural network model based on the error back-propagation algorithm. The input signal passes in sequence through the input layer, the hidden layer and the output layer of the network; it is processed in the hidden layer and finally emitted by the output layer. This is the forward propagation of the signal. When the actual output of the output layer does not match the expected output, the error is propagated backward. During back propagation, the total error between the actual and expected outputs is apportioned to the neurons of each layer, and each neuron corrects its weights and threshold according to the error it receives, so that the error of the next forward pass is reduced. After many iterations the error converges and training ends; the resulting weights and thresholds can then be used for data prediction.
When a BP network model is used, training often becomes trapped in a local optimum, which greatly reduces the accuracy of the model's predictions.
Disclosure of Invention
The invention aims to remedy the defects of the prior art and provides a short-time passenger flow prediction method for rail transit based on a BP neural network.
The invention is realized by the following technical scheme:
A short-time passenger flow prediction method for rail transit based on a BP neural network specifically comprises the following steps:
(1) cluster analysis
(1.1) preprocessing the original data;
(1.2) calculating the dissimilarity of the preprocessed data by calculating the distance between different date types;
(1.3) clustering according to the calculated dissimilarity, and merging the two types of partitions with the minimum distance until all the partitions are merged into one partition;
(1.4) drawing a clustering pedigree graph, and carrying out clustering analysis;
(2) periodic analysis of passenger flow
(2.1) periodically analyzing the passenger flow time characteristic;
(2.2) periodically analyzing the space characteristics of passenger flow;
(2.3) periodically analyzing the space-time characteristics of passenger flow;
(3) extracting characteristic factors according to results of clustering analysis and passenger flow periodicity analysis;
(4) selecting parameters and related functions of the BP neural network, and determining an initial weight and a threshold of the BP neural network.
The cluster analysis is defined as follows. In a data space $A$, a data set $X$ consists of a number of data points $x_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in A$, where each feature (or dimension) $x_{ij}$ of $x_i$ may be enumerated or numerical. In this project, $x_{ij}$ represents the short-time passenger flow in every 5 min of the whole-day operating period for different date types. If the data set $X$ contains $N$ objects $x_i$, the data set corresponds to an $N \times d$ matrix. The purpose of clustering is to divide the data set $X$ into $k$ partitions $C_m$ ($m = 1, \ldots, k$); some objects may not belong to any partition, and these constitute the noise partition $C_{k+1}$.
$X = C_1 \cup C_2 \cup \cdots \cup C_{k+1}$
These partitions constitute the result of the cluster analysis.
The preprocessing of the original data in step (1.1) standardizes the passenger flow volumes of the original data for different working days:
$x'_{ij} = \dfrac{x_{ij} - \bar{x}_j}{S_j}$
where $\bar{x}_j$ is the sample mean of the $j$-th feature, $S_j$ is its sample standard deviation, and $x_{ij}$ is a feature point of the data set.
The dissimilarity of the preprocessed data in step (1.2) is calculated as follows.
The Euclidean distance is adopted as the quantitative index of dissimilarity:
$d(x_i, x_j) = \sqrt{\sum_{l=1}^{d} (x_{il} - x_{jl})^2}$
The dissimilarity between objects is usually expressed as the distance between vectors, i.e. $d(x_i, x_j)$, where $x_i$ and $x_j$ are feature points of the data set.
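Steps (1.1) and (1.2) can be illustrated with a minimal sketch, assuming z-score standardization per feature followed by pairwise Euclidean distances. The function names and the three sample rows are hypothetical and serve only as illustration; they are not data from the invention.

```python
import math

def standardize(rows):
    """Z-score each column: (x - column mean) / sample standard deviation."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    # A constant column carries no information; map it to 0 instead of dividing by 0.
    return [[(r[j] - means[j]) / stds[j] if stds[j] > 0 else 0.0 for j in range(d)]
            for r in rows]

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def dissimilarity_matrix(rows):
    """Symmetric matrix of pairwise Euclidean distances."""
    return [[euclidean(a, b) for b in rows] for a in rows]

# Hypothetical 5-minute flows for three date types (values invented for illustration).
flows = [[120.0, 340.0, 90.0], [130.0, 360.0, 95.0], [60.0, 110.0, 200.0]]
z = standardize(flows)
dist = dissimilarity_matrix(z)
```

In this toy input the first two rows have similar profiles, so their distance is much smaller than the distance from either of them to the third row.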
The periodic passenger flow analysis in the step (2) specifically comprises the following steps:
(2.1) periodic analysis of passenger flow time characteristics
The inbound/outbound passenger flow of a station is described by a one-dimensional time series, and the distribution of each station's inbound/outbound passenger flow within one day is divided into single-peak, double-peak, full-peak, convex-peak and no-peak types;
the single-peak distribution means that a station has an inbound peak and an outbound peak during the operating hours of one day;
the double-peak distribution means that urban rail transit passenger flow forms a morning peak between 7:00 and 9:00 because of travel to work and school, forms an evening peak between 17:00 and 19:00 because of travel home from work and school, and changes little in other periods, which count as off-peak (flat) periods;
the full-peak distribution has no clear passenger flow valley, and the passenger flow is large in every period of the day;
the no-peak distribution has no obvious peak, and the inbound and outbound passenger flow is small all day because the surroundings attract few passengers;
the convex-peak distribution occurs at stations near large-scale events;
(2.2) periodic analysis of spatial characteristics of passenger flow
Urban rail transit passenger flow is analyzed in the spatial dimension by examining the passenger flow of different stations or different sections in the same time period; the station passenger flow is the sum of the boarding, alighting and transfer passenger flow of a station per unit time; an n-dimensional sequence is used to describe the passenger flow of n stations in a given time period;
the section passenger flow is the on-train passenger flow passing through each section of an urban rail transit line per unit time, and is divided into up-direction and down-direction section passenger flow according to the running direction of the trains;
(2.3) periodic analysis of passenger flow spatio-temporal characteristics
Passenger flow data change simultaneously in the two dimensions of time and space and show a certain regularity; the spatio-temporal characteristics of passenger flow are illustrated by constructing an independent urban rail transit line containing n stations. Short-time passenger flow prediction analyzes the change patterns and relationships in the collected passenger flow data and calculates the inbound and outbound passenger flow of each station in the next time period or next several time periods, i.e. it predicts the passenger flow data of column $(m + \Delta t)$ from the data of the first $m$ columns of the matrix $F$, where $1 \le m \le h$ and $h$ is the number of columns of the matrix.
Because urban rail transit is a fully closed system, the AFC system can acquire complete travel data, and each passenger has a pair of travel origin and destination points, commonly called an OD (Origin-Destination) pair. Normally, a passenger leaving a station must have entered the rail transit system at another station beforehand. That is, the outbound passenger flow from $S_n$ during the $t$-th period must consist of part of the inbound passenger flow of other stations during an earlier period, and that period is mainly determined by the running time of trains between the two stations. This conclusion can be expressed simply: the outbound passenger flow from $S_n$ in the $t$-th period is related to the inbound passenger flow of $S_j$ in the $T_j$-th period, where $1 \le j \le n-1$. The expression for $T_j$ is shown in the following formula.
Clearly, passenger flow entering $S_j$ later than the $T_j$-th period cannot leave the rail transit system from $S_n$ in the $t$-th period. Combining the historical inbound/outbound passenger flow data of each station gives a multidimensional passenger flow matrix $F$ of size $[S_j \times (h+1)]$, shown as follows.
Each row of matrix $F$ reflects the passenger flow characteristics of a station in the time dimension. Each column of matrix $F$ reflects the passenger flow characteristics of different stations at the same time in the spatial dimension. The matrix therefore captures the spatio-temporal characteristics of urban rail transit passenger flow.
Short-time passenger flow prediction for urban rail transit studies the change patterns and relationships in the collected passenger flow data and calculates the passenger flow of each station in the next time period or next several time periods. That is, the passenger flow data of column $(m + \Delta t)$ are predicted from part of the data in the first $m$ columns of matrix $F$, where $1 \le m \le h$ and $\Delta t$ is a time interval, usually 1, 2 or 3.
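The relation between the first $m$ columns of $F$ and the predicted column $(m + \Delta t)$ can be made concrete with a small sketch. It assumes $F$ is stored as a list of rows (one row per station, one column per time period). The helper `make_samples` and the numbers in `F` are hypothetical.

```python
def make_samples(F, m, delta_t=1):
    """Pair the first m columns of F with column m + delta_t (1-based, as in the text)."""
    n_cols = len(F[0])
    if not (1 <= m and m + delta_t <= n_cols):
        raise ValueError("m + delta_t must not exceed the number of columns")
    inputs = [row[:m] for row in F]                 # history of each station
    targets = [row[m + delta_t - 1] for row in F]   # column (m + delta_t), 1-based
    return inputs, targets

# Hypothetical flow matrix: 3 stations (rows) x 5 time periods (columns).
F = [
    [10, 12, 15, 14, 13],
    [20, 22, 21, 25, 27],
    [5, 6, 8, 7, 9],
]
inputs, targets = make_samples(F, m=3, delta_t=1)
```

With `m=3` and `delta_t=1`, each station's first three periods form the input and its fourth period is the prediction target; `delta_t=2` would target the fifth period instead.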
Extracting characteristic factors according to the results of the clustering analysis and the passenger flow periodicity analysis in the step (3), wherein the characteristic factors are as follows:
the stations are clustered into 5 types; from each type, the passenger flow of each station in each time period is selected as a sample set for the BP neural network, and the data sets of the 5 types are used as input data of the BP network model;
first, based on temporal proximity, the passenger flows of the three periods preceding the current time are each selected as influence factors; the periodic analysis of passenger flow shows day-to-day and week-to-week periodicity, so the passenger flow of the same period one day earlier and of the same period one week earlier are also selected as influence factors; because the passenger flow in the same period of the previous day is itself influenced by the period preceding it, that preceding period's passenger flow is included in the influence factor set, and the period preceding the same period of the previous week is included in the same way; in addition, holidays often have more passenger flow than non-holidays, and the temporal distribution of passenger flow differs between weekdays and weekends, so whether the current day is a weekday, a weekend day or a holiday is added as an influence factor;
after the influence factors are selected, a random forest algorithm ranks them by their importance to the result; factors of low importance are removed, and factors of high importance are kept as the final influence factor set of the BP network model.
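The candidate influence factors described above can be assembled as in the following minimal sketch, assuming the passenger flow of one station is stored as a flat list with a fixed number of periods per day. The function name, the argument names and the integer coding of the day type are assumptions made for illustration. The random forest ranking step is not shown.

```python
def lag_features(series, t, periods_per_day, day_type):
    """Build one candidate factor vector for time index t from a flat flow series."""
    day, week = periods_per_day, 7 * periods_per_day
    if t < week + 1:
        raise ValueError("need at least one week plus one period of history")
    return [
        series[t - 1], series[t - 2], series[t - 3],  # three preceding periods
        series[t - day],        # same period, previous day
        series[t - day - 1],    # period before that one
        series[t - week],       # same period, previous week
        series[t - week - 1],   # period before that one
        day_type,               # 0 = weekday, 1 = weekend, 2 = holiday (assumed coding)
    ]

# Hypothetical series: 4 periods per day over 8 days; each flow value equals its index.
series = list(range(32))
x = lag_features(series, t=30, periods_per_day=4, day_type=0)
```

Because each value in the toy series equals its own index, the returned vector shows directly which time indices feed the model.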
Selecting the parameters and related functions of the BP neural network specifically comprises: determining the number of hidden layers, the number of hidden nodes, the transfer function, the learning rate and the number of iterations, and selecting the gradient descent algorithm.
The number of hidden layers is one; the number of hidden nodes is 24; the transfer function is the Sigmoid function; the learning rate is 0.01; the number of iterations is 60000; the gradient descent algorithm is the Adam algorithm.
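The stated settings can be collected in a small configuration together with a sketch of the Sigmoid function and one Adam update step. The constants `BETA1`, `BETA2` and `EPS` are the commonly used Adam defaults and are an assumption, since the text gives only the learning rate. The quadratic toy objective is likewise only an illustration, not the network of the invention.

```python
import math

# Hyperparameters as stated in the text; the Adam constants below are the
# commonly used defaults and are an assumption, not values from the source.
CONFIG = {"hidden_layers": 1, "hidden_nodes": 24,
          "learning_rate": 0.01, "iterations": 60000}
BETA1, BETA2, EPS = 0.9, 0.999, 1e-8

def sigmoid(z):
    """Sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-z))

def adam_step(params, grads, m, v, step, lr=CONFIG["learning_rate"]):
    """Apply one Adam update in place to a flat list of parameters."""
    for i, g in enumerate(grads):
        m[i] = BETA1 * m[i] + (1 - BETA1) * g        # first-moment estimate
        v[i] = BETA2 * v[i] + (1 - BETA2) * g * g    # second-moment estimate
        m_hat = m[i] / (1 - BETA1 ** step)           # bias correction
        v_hat = v[i] / (1 - BETA2 ** step)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + EPS)

# Toy check: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w, m, v = [0.0], [0.0], [0.0]
for step in range(1, 2001):
    adam_step(w, [2 * (w[0] - 3)], m, v, step)
```

In a full model the same `adam_step` would be applied to the flattened weights and thresholds of the network, with gradients supplied by back propagation.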
The invention has the advantages that:
the method achieves short-time prediction of rail transit passenger flow data by means of a BP neural network model and an OD matrix; building on the traditional BP neural network, the invention improves the network's transfer function and training algorithm. The initial weights and thresholds of the neural network are optimized by a genetic algorithm, which prevents the model from becoming trapped in a local optimum and thus avoids the large loss of prediction accuracy that would result. The invention improves the passenger flow monitoring capability of rail transit, provides advance evaluation and prediction, and supports accurate control of passenger flow.
Drawings
FIG. 1 is the clustering pedigree graph of the present invention.
FIG. 2 is an independent urban rail transit line with n stations according to the invention.
Detailed Description
The basic steps of the BP neural network algorithm are as follows:
(1) assign each weight $W_{ij}$ in the BP neural network a small random non-zero real number, and set the initial learning rate $\mu$ and the momentum factor $\alpha$;
(2) input the first of the N input-output sample pairs;
(3) calculate the actual output value $O$ of each element in every layer of the BP neural network according to formula (1), where $k$ and $m$ index layers of the network and $i$ and $j$ index elements within a layer;
where the quantities in formula (1) are the output $O_i^k$ of the $i$-th element of layer $k$, the input sum of the $i$-th element of layer $k$, and the connection weight $W_{ij}$ from the $i$-th element of layer $k-1$ to the $j$-th element of layer $k$;
(4) adjust the connection weights of each layer according to formula (2);
in formula (2), the result of output layer $k$ is used, $y_i$ is the output of the $i$-th neuron, and $t$ denotes time $t$;
(5) return to step (2), input the second sample pair, and cycle through the N sample pairs until $W_{ij}$ stabilizes or the target prediction error is reached.
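Steps (1) to (5) can be sketched as follows. This is a minimal illustration, assuming one hidden layer, Sigmoid units, a squared-error loss and the standard delta rule with momentum. Formulas (1) and (2) are not reproduced in the text, so the update used here is the textbook form and may differ in detail from the invention's. The function `train_bp`, its defaults and the sample pairs are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_bp(samples, n_hidden=3, mu=0.5, alpha=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network online; return (predict, error per epoch)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step (1): small random weights; the last entry of each row acts as the threshold.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw2 = [0.0] * (n_hidden + 1)

    def forward(x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * u for w, u in zip(row, xb))) for row in w1]
        hb = h + [1.0]
        return xb, hb, sigmoid(sum(w * u for w, u in zip(w2, hb)))

    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:               # steps (2) and (5): loop over sample pairs
            xb, hb, out = forward(x)            # step (3): forward pass
            total += 0.5 * (target - out) ** 2
            delta_out = (target - out) * out * (1 - out)
            delta_h = [delta_out * w2[j] * hb[j] * (1 - hb[j]) for j in range(n_hidden)]
            for j in range(n_hidden + 1):       # step (4): weight update with momentum
                dw2[j] = mu * delta_out * hb[j] + alpha * dw2[j]
                w2[j] += dw2[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw1[j][i] = mu * delta_h[j] * xb[i] + alpha * dw1[j][i]
                    w1[j][i] += dw1[j][i]
        errors.append(total)
    return (lambda x: forward(x)[2]), errors

# Hypothetical sample pairs: one normalized input, target rising linearly with it.
samples = [([0.0], 0.2), ([1 / 3], 0.4), ([2 / 3], 0.6), ([1.0], 0.8)]
predict, errors = train_bp(samples)
```

The per-epoch error list gives a simple way to check the stopping condition of step (5): training can stop once the error falls below a chosen target or stops changing.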
A short-time passenger flow prediction method for rail transit based on a bp neural network specifically comprises the following steps:
1. cluster analysis
The existing training data are rail transit records for 2019 and 2020, covering 8 lines and their stations, with inbound and outbound passenger flow aggregated for each time period of every day. Before short-time passenger flow prediction is carried out, the date types must be divided; prediction is then performed separately for each date type.
Clustering divides data samples into different clusters or classes according to a specific criterion (generally a distance criterion), such that the similarity among samples within a class is as large as possible and the difference between samples of different classes is as large as possible. In other words, after cluster analysis, data of different classes are separated as far as possible and data of the same class are grouped together.
The cluster analysis is defined as follows. In a data space $A$, a data set $X$ consists of a number of data points $x_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in A$, where each feature (or dimension) $x_{ij}$ of $x_i$ may be enumerated or numerical. In the invention, $x_{ij}$ represents the short-time passenger flow in every 5 min of the whole-day operating period for different date types. If the data set $X$ contains $N$ objects $x_i$, the data set corresponds to an $N \times d$ matrix. The purpose of clustering is to divide the data set $X$ into $k$ partitions $C_m$ ($m = 1, \ldots, k$); some objects may not belong to any partition, and these constitute the noise partition $C_{k+1}$. The cluster analysis satisfies the following formula:
$X = C_1 \cup C_2 \cup \cdots \cup C_{k+1}$ (5)
These partitions constitute the result of the cluster analysis.
There are many clustering methods; the one selected by the invention is based on the defined partitions and the distances between partitions. Specifically, the data of each working day start as separate partitions, the two partitions with the minimum distance are merged, and the process is repeated until all partitions are merged into one. This process yields a clustering pedigree graph. The specific steps are as follows:
(1) clustered data preprocessing
Before cluster analysis, the existing raw data are first preprocessed. The main reason is that the variables measured in a sample may have different dimensions and different orders of magnitude, so the raw data must be preprocessed before they can be compared and partitioned together. The standardizing transformation is a commonly used preprocessing method. The passenger flow volumes of the raw data for different working days are standardized:
$x'_{ij} = \dfrac{x_{ij} - \bar{x}_j}{S_j}$ (6)
where $\bar{x}_j$ and $S_j$ are the sample mean and the sample standard deviation of the $j$-th feature.
(2) calculating the degree of dissimilarity
To measure the degree of similarity or difference, i.e. the dissimilarity, between the passenger flow volumes of different date types, a clustering statistic is defined as the quantitative index of the cluster analysis, and the quantitative cluster analysis is then carried out on that index. The dissimilarity is typically obtained by calculating the distance between different date types. The Euclidean distance is used here as the quantitative index of dissimilarity:
$d(x_i, x_j) = \sqrt{\sum_{l=1}^{d} (x_{il} - x_{jl})^2}$
(3) clustering
The two partitions with the minimum distance are merged, and the process is repeated after each merge until all partitions are merged into one.
(4) Drawing clustering pedigree graph
Following the steps of the clustering process, a passenger flow distribution is taken as an example for cluster analysis; the resulting clustering pedigree graph is shown in FIG. 1.
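The merging procedure of steps (3) and (4) can be sketched as follows. The text does not state how the distance between two partitions that each contain several objects is measured, so average linkage is assumed here. The function names and the four sample profiles are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def partition_distance(c1, c2, points):
    """Average-linkage distance between two partitions (an assumed linkage rule)."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def agglomerate(points):
    """Merge the two closest partitions repeatedly; return the merge history."""
    partitions = [(i,) for i in range(len(points))]   # every object starts alone
    history = []
    while len(partitions) > 1:
        best = None
        for a in range(len(partitions)):
            for b in range(a + 1, len(partitions)):
                dist = partition_distance(partitions[a], partitions[b], points)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merged = tuple(sorted(partitions[a] + partitions[b]))
        history.append((partitions[a], partitions[b], dist))
        partitions = [p for k, p in enumerate(partitions) if k not in (a, b)] + [merged]
    return history

# Hypothetical standardized profiles: two weekday-like rows and two weekend-like rows.
profiles = [[1.0, 1.1], [0.9, 1.0], [-1.0, -0.9], [-1.1, -1.0]]
merges = agglomerate(profiles)
```

Each entry of `merges` records which two partitions were joined and at what distance. A clustering pedigree graph (dendrogram) is a drawing of exactly this history, with the merge distance on one axis.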
2. Periodic analysis of passenger flow
The passenger flow of each station changes dynamically. Analyzing the spatio-temporal characteristics and periodic patterns of actual urban rail transit passenger flow helps the management department adjust departure intervals and routing schemes and deploy station staff to organize passenger flow, so that the urban rail transit network can operate at its maximum passenger capacity.
(1) Periodic analysis of passenger flow time characteristics
With the wide application of the AFC system, the urban rail transit operator can collect passenger flow data from each independent AFC equipment terminal once every 15 minutes. In the time dimension, the invention uses a one-dimensional time series to describe the inbound/outbound passenger flow of a station.
Depending on the geographical position and type of the station, the distribution of each station's inbound and outbound passenger flow within a day can mainly be divided into single-peak, double-peak, full-peak, convex-peak and no-peak types. The single-peak distribution means that a station has an inbound peak and an outbound peak during the operating hours of a day. The double-peak type means that urban rail transit passenger flow forms a morning peak between 7:00 and 9:00 because of travel to work and school, forms an evening peak between 17:00 and 19:00 because of travel home from work and school, and changes little in other periods, which count as off-peak (flat) periods.
The full-peak distribution has no clear passenger flow valley, and the passenger flow is large in every period of the day; such stations are generally located in highly developed areas or in areas where public facilities are highly concentrated. The no-peak distribution has no obvious peak; in most cases the inbound and outbound passenger flow is small all day because the surroundings attract few passengers, although the passenger flow of such stations may gradually increase as the city continues to develop. The convex-peak type usually occurs when a large event such as a sporting event or concert is held, or when the weather changes abruptly. When a large-scale event ends, stations close to the venue see a sharp increase in inbound passenger flow, and after a certain time the outbound passenger flow of other hub stations may also show a sharp peak.
(2) Periodic analysis of passenger flow spatial characteristics
Urban rail transit passenger flow is analyzed in the spatial dimension by studying the passenger flow of different stations or different sections in the same time period. The station passenger flow is the sum of the boarding, alighting and transfer passenger flow of a station per unit time (usually a peak hour or a day). The invention uses an n-dimensional sequence to describe the passenger flow of n stations in a given time period.
The section passenger flow is the on-train passenger flow passing through each section of an urban rail transit line per unit time (usually the peak hour is taken as the study period), and can be divided into up-direction and down-direction section passenger flow according to the running direction of the trains.
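As a simple illustration of section passenger flow, the on-train flow on each section in one direction can be accumulated from boarding and alighting counts along the line. This sketch assumes that all counts refer to the same unit of time and ignores train running time between stations; the function name and the counts are hypothetical.

```python
def section_flows(boarding, alighting):
    """On-train flow on each section between consecutive stations, one direction."""
    if len(boarding) != len(alighting):
        raise ValueError("need one boarding and one alighting count per station")
    flows, on_board = [], 0
    for b, a in zip(boarding[:-1], alighting[:-1]):
        on_board += b - a          # passengers carried into the next section
        flows.append(on_board)
    return flows

# Hypothetical up-direction counts for a 4-station line in one unit of time.
boarding = [100, 40, 30, 0]
alighting = [0, 20, 50, 100]
up_flows = section_flows(boarding, alighting)
```

The same function applied to the down-direction counts gives the down-direction section flows; the section with the largest value is the most heavily loaded section of the line.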
(3) Periodic analysis of passenger flow time-space characteristics
Urban rail transit trip data are typical spatio-temporal data: passenger flow data change simultaneously in the two dimensions of time and space and show a certain regularity. The outbound passenger flow of a station is closely related to the inbound and outbound passenger flow of each related station, so the pattern is difficult to describe accurately from the historical passenger flow data of a single station alone. To show the spatio-temporal characteristics of passenger flow more clearly, the invention constructs an independent urban rail transit line containing n stations, as shown in FIG. 2.
Because urban rail transit is a fully closed system, the AFC system can acquire complete travel data, and each passenger has a pair of travel origin and destination points, commonly called an OD (Origin-Destination) pair. Normally, a passenger leaving a station must have entered the rail transit system at another station beforehand. That is, the outbound passenger flow from $S_n$ during the $t$-th period must consist of part of the inbound passenger flow of other stations during an earlier period, and that period is mainly determined by the running time of trains between the two stations. This conclusion can be expressed simply: the outbound passenger flow from $S_n$ in the $t$-th period is related to the inbound passenger flow of $S_j$ in the $T_j$-th period, where $1 \le j \le n-1$. The expression for $T_j$ is shown in the following formula.
Clearly, passenger flow entering $S_j$ later than the $T_j$-th period cannot leave the rail transit system from $S_n$ in the $t$-th period. Combining the historical inbound/outbound passenger flow data of each station gives a multidimensional passenger flow matrix $F$ of size $[S_j \times (h+1)]$, shown as follows.
Each row of matrix $F$ reflects the passenger flow characteristics of a station in the time dimension. Each column of matrix $F$ reflects the passenger flow characteristics of different stations at the same time in the spatial dimension. The matrix therefore captures the spatio-temporal characteristics of urban rail transit passenger flow.
Short-time passenger flow prediction for urban rail transit studies the change patterns and relationships in the collected passenger flow data and calculates the passenger flow of each station in the next time period or next several time periods. That is, the passenger flow data of column $(m + \Delta t)$ are predicted from part of the data in the first $m$ columns of matrix $F$, where $1 \le m \le h$ and $\Delta t$ is a time interval, usually 1, 2 or 3.
5. Feature factor extraction
Cluster analysis divides the stations into 5 types. From each type, the passenger flow of each station in each time period is selected as the sample set of the BP neural network, and the data sets of the 5 types serve as the input data of the BP network model. This step makes the trained model more general, so that it gives more accurate predictions for different types of data.
The prediction model mainly considers the influence of time characteristic factors. First, according to the time proximity characteristic, that is, the inbound passenger flow to be predicted is strongly related to the inbound passenger flow in adjacent time periods, the passenger flows of the three time periods preceding the current one are each selected as influence factors. Analysis of passenger flow periodicity shows that periodicity exists from day to day and from week to week, so the passenger flow of the same time period on the previous day and the passenger flow of the same time period one week earlier are also selected as influence factors. Because the passenger flow in the same time period of the previous day is itself influenced by the passenger flow in the period before it, that earlier period is also included in the influence factor set, and the period preceding the same time period of the previous week is treated in the same way. Data analysis also shows that holidays often have more passenger flow than non-holidays, and that working days and weekends differ both in passenger flow volume and in its time distribution, so the day type (working day, weekend, or holiday) is added as an influence factor.
TABLE 1 influencing factor selection
After the factors are selected through experience and data analysis, an RF (random forest) algorithm ranks them by their importance to the result. Factors of low importance are removed, and factors of high importance form the final influence factor set of the BP network model. The final set of influencing factors is {X_{i-1}, X_{i-2}, X_{i-3}, X_{i-1}, X_{i-7}, X_}.
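The candidate factors described in this section (before random forest screening) could be assembled as in the sketch below. It is an illustration under assumptions, not the patent's implementation: the function name `build_features`, the parameter `periods_per_day`, and the numeric day-type coding are all hypothetical.

```python
def build_features(flow, i, periods_per_day, day_type):
    """Collect candidate influence factors for predicting flow[i].

    flow: inbound counts of one station, one value per time period, in
          chronological order (hypothetical layout).
    day_type: hypothetical coding, 0 = working day, 1 = weekend, 2 = holiday.
    """
    day = periods_per_day
    week = 7 * periods_per_day
    if i < week + 1:
        raise ValueError("need at least one week plus one period of history")
    return [
        flow[i - 1], flow[i - 2], flow[i - 3],  # three preceding periods
        flow[i - day], flow[i - day - 1],       # same period yesterday, and the one before it
        flow[i - week], flow[i - week - 1],     # same period last week, and the one before it
        day_type,                               # working day / weekend / holiday
    ]


# Made-up series in which the count equals the period index, 4 periods per day.
flow = list(range(100))
features = build_features(flow, i=50, periods_per_day=4, day_type=0)
print(features)  # [49, 48, 47, 46, 45, 22, 21, 0]
```

A feature-importance ranking, such as the random forest step described above, would then decide which of these candidates are kept.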
6. Neural network parameter and correlation function selection
The accuracy of the prediction model is influenced by the parameters of the BP neural network model, and the parameters need to be carefully selected.
1) Determination of the number of hidden layers: a three-layer BP neural network can already complete the data prediction. Prediction accuracy can improve as the number of hidden layers increases, but the training time of the BP neural network increases greatly as well. Weighing both, the number of hidden layers of the prediction model is set to one.
2) Determination of the number of hidden nodes: in conventional studies, the number of hidden nodes is generally considered to be related to the dimensions of the input and output vectors, and different experts have proposed different methods for determining it. The prediction model of the invention first uses the 2n + 1 rule to determine the number of hidden-layer nodes approximately and then varies the number over multiple experiments, finally arriving at 24 hidden nodes, which effectively improves the accuracy of the model.
3) Determination of the transfer function: the prediction model of the invention uses a Sigmoid function as the transfer function of the BP network model. The Sigmoid function is smooth and easy to differentiate, which matters because the error is back-propagated and the share of the total error attributed to each weight and threshold is obtained by differentiation.
4) Determination of the learning rate: the learning rate determines how much the weights and thresholds change during training. A learning rate that is too large makes training oscillate and degrades the prediction result, while one that is too small slows convergence. Previous studies generally choose a value between 0.01 and 0.8. After repeatedly adjusting the learning rate and comparing the results, 0.01 is selected as the learning rate of the BP network model.
5) Determination of the number of iterations: based on experience from earlier experiments and on comparison with other well-performing prediction models similar to the prediction model of the invention, the range of the number of iterations is first roughly set to 10000 to 100000. The number is then adjusted over multiple experiments, and 60000 is finally obtained as the optimal number of iterations for the prediction model.
6) Selection of the gradient descent algorithm: after consulting multiple sources and comparing the advantages and disadvantages of the available gradient descent methods, the prediction model selects the Adam algorithm as its gradient descent algorithm. Adam is computationally efficient, requires little memory, is suited to optimization problems with large-scale data and parameters, is suited to non-stationary objectives, and generally needs very little parameter tuning. Because subway passenger flow prediction involves a large volume of data, an algorithm suited to large-scale data fits the prediction model, which makes Adam the optimal choice.
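The six choices above can be put together in a small sketch: one hidden layer of 24 Sigmoid nodes, a learning rate of 0.01, and Adam updates. Everything else is an assumption made for illustration and is not stated in the patent: the class name `BPNet`, the linear output node, the half squared error loss, the weight initialization range, Adam's default constants (0.9, 0.999, 1e-8), and the synthetic training data. The loop also runs far fewer than 60000 iterations to keep the example fast.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BPNet:
    """One hidden layer with sigmoid units, a linear output, and Adam updates."""

    def __init__(self, n_in, n_hidden=24, lr=0.01, seed=0):
        rng = random.Random(seed)
        self.n_in, self.n_hidden, self.lr = n_in, n_hidden, lr
        # Flat parameter vector: hidden weights, hidden thresholds,
        # output weights, output threshold.
        self.off_b1 = n_hidden * n_in
        self.off_w2 = self.off_b1 + n_hidden
        self.off_b2 = self.off_w2 + n_hidden
        size = self.off_b2 + 1
        self.p = [rng.uniform(-0.5, 0.5) for _ in range(size)]
        self.m = [0.0] * size  # Adam first-moment estimates
        self.v = [0.0] * size  # Adam second-moment estimates
        self.t = 0

    def forward(self, x):
        p, n = self.p, self.n_in
        hidden = [
            sigmoid(sum(p[j * n + i] * x[i] for i in range(n)) + p[self.off_b1 + j])
            for j in range(self.n_hidden)
        ]
        out = sum(p[self.off_w2 + j] * hidden[j] for j in range(self.n_hidden))
        return hidden, out + p[self.off_b2]

    def loss(self, xs, ys):
        total = sum(0.5 * (self.forward(x)[1] - y) ** 2 for x, y in zip(xs, ys))
        return total / len(xs)

    def train_step(self, xs, ys, beta1=0.9, beta2=0.999, eps=1e-8):
        p, n = self.p, self.n_in
        g = [0.0] * len(p)
        for x, y in zip(xs, ys):
            hidden, out = self.forward(x)
            err = (out - y) / len(xs)  # gradient of the mean half squared error
            g[self.off_b2] += err
            for j, h in enumerate(hidden):
                g[self.off_w2 + j] += err * h
                # Back-propagate through the sigmoid: s'(z) = s(z) * (1 - s(z)).
                delta = err * p[self.off_w2 + j] * h * (1.0 - h)
                g[self.off_b1 + j] += delta
                for i in range(n):
                    g[j * n + i] += delta * x[i]
        # Adam update with bias-corrected moment estimates.
        self.t += 1
        for k, gk in enumerate(g):
            self.m[k] = beta1 * self.m[k] + (1 - beta1) * gk
            self.v[k] = beta2 * self.v[k] + (1 - beta2) * gk * gk
            m_hat = self.m[k] / (1 - beta1 ** self.t)
            v_hat = self.v[k] / (1 - beta2 ** self.t)
            p[k] -= self.lr * m_hat / (math.sqrt(v_hat) + eps)


# Synthetic data (not passenger flow data): the target is the sum of 3 inputs.
data_rng = random.Random(1)
xs = [[data_rng.random() for _ in range(3)] for _ in range(40)]
ys = [sum(x) for x in xs]

net = BPNet(n_in=3)      # 24 hidden nodes and learning rate 0.01, as in the text
loss_before = net.loss(xs, ys)
for _ in range(300):     # far fewer than the 60000 iterations named in the text
    net.train_step(xs, ys)
loss_after = net.loss(xs, ys)
print(loss_before, loss_after)
```

The backward pass uses the Sigmoid derivative s(z)(1 - s(z)), which is the property item 3) refers to when it calls the function easy to differentiate.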
Claims (8)
1. A short-time passenger flow prediction method for rail transit based on a bp neural network, characterized by comprising the following steps:
(1) cluster analysis
(1.1) preprocessing the original data;
(1.2) calculating the dissimilarity of the preprocessed data by calculating the distance between different date types;
(1.3) clustering according to the calculated dissimilarity, and repeatedly merging the two partitions with the minimum distance until all the partitions are merged into one partition;
(1.4) drawing a clustering dendrogram, and carrying out cluster analysis;
(2) periodic analysis of passenger flow
(2.1) analyzing the periodicity of the time characteristics of passenger flow;
(2.2) analyzing the periodicity of the space characteristics of passenger flow;
(2.3) analyzing the periodicity of the space-time characteristics of passenger flow;
(3) extracting characteristic factors according to results of clustering analysis and passenger flow periodicity analysis;
(4) selecting parameters and related functions of the BP neural network, and determining an initial weight and a threshold of the BP neural network.
2. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 1, wherein:
the preprocessing of the original data in step (1.1) specifically comprises standardizing the passenger flow volumes of the original data on different working days:
in the formula, S_j is the sample standard deviation and x is a feature point of the data set.
3. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 2, wherein: calculating the dissimilarity degree of the preprocessed data in the step (1.2), which is specifically as follows:
the Euclidean distance is adopted as the quantity index of the dissimilarity degree:
in the formula, d(x_i, x_j) is the degree of dissimilarity between objects, usually taken as the distance between vectors.
4. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 3, wherein: the periodic passenger flow analysis in the step (2) specifically comprises the following steps:
(2.1) periodic analysis of passenger flow time characteristics
Describing the inbound/outbound passenger flow of a station by a one-dimensional time sequence, and dividing the daily distribution of the inbound/outbound passenger flow of each station into a unimodal type, a bimodal type, a full peak type, a convex peak type and a peak-free type;
the unimodal distribution means that a station has an inbound peak and an outbound peak in the operating time of one day;
the bimodal distribution means that the urban rail transit passenger flow forms a morning peak between 7 and 9 in the morning due to travel to work and school, forms an evening peak between 17 and 19 in the evening due to leaving work and school, and changes little in the other time periods, which belong to the off-peak period;
the full peak type distribution has no clear passenger flow valley, and the passenger flow in every time period of the day is large;
the peak-free distribution has no obvious peak, and the inbound and outbound passenger flow is small all day because the surrounding environment attracts little passenger flow;
the convex peak type distribution occurs near large-scale events;
(2.2) periodic analysis of spatial characteristics of passenger flow
Analyzing the passenger flow of urban rail transit from the spatial dimension, and analyzing the passenger flow conditions of different stations or different sections in the same time period; the station passenger flow is the sum of the boarding and alighting passenger flow and the transfer passenger flow of the station in unit time; describing the passenger flow of n stations in a certain time period by an n-dimensional sequence;
the section passenger flow refers to the on-line passenger flow passing through each section of the urban rail transit line in unit time, and is divided into an up section passenger flow and a down section passenger flow according to the difference of the running directions of trains;
(2.3) periodic analysis of passenger flow spatio-temporal characteristics
The passenger flow data change simultaneously in the two dimensions of time and space and show certain regularity, and the space-time characteristics of the passenger flow are displayed by constructing an independent urban rail transit line containing n stations; the short-time passenger flow prediction is to analyze the patterns and relationships in the collected passenger flow data and calculate the inbound and outbound passenger flow of each station in the next time period or the next several time periods, namely to predict the passenger flow data of column (m + Δt) according to the data of the first m columns of a matrix F, wherein 1 ≤ m ≤ h, Δt is the time-period interval, the matrix F is the multidimensional passenger flow matrix obtained by combining the historical inbound/outbound passenger flow data, and h is the number of columns of the matrix.
5. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 4, wherein: each passenger has a pair of travel origin and destination points, namely an OD pair; the outbound passenger flow from S_n during the t-th time period must consist of part of the inbound passenger flow at other stations during earlier periods, and that period is determined by the train running time between the two stations, i.e., the outbound passenger flow from S_n during the t-th time period is related to the inbound passenger flow at S_j in period T_j, wherein 1 ≤ j ≤ n-1, and the expression of T_j is shown in the following formula,
passengers entering at S_j later than period T_j cannot exit the rail transit system from S_n in the t-th time period, and a multidimensional passenger flow matrix of size [S_j × (h+1)] is obtained by combining the historical inbound/outbound passenger flow data of each station, shown as follows:
each row of the matrix F reflects the passenger flow characteristics of each station in the time dimension, and each column of the matrix F represents the passenger flow characteristics of different stations at the same time in the space dimension, so that the above formula reflects the time-space characteristics of urban rail transit passenger flow at the same time.
6. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 5, wherein: extracting characteristic factors according to the results of the clustering analysis and the passenger flow periodicity analysis in the step (3), wherein the characteristic factors are as follows:
clustering analysis is carried out on the stations, the stations are divided into 5 types, passenger flow of each station in each time period is selected from each type to be used as a sample set of the BP neural network, and a data set of 5 types is used as input data of a BP network model;
firstly, according to the time proximity characteristic, the passenger flows of the three time periods before the current time are each selected as influence factors; the periodic analysis of the passenger flow shows periodicity from day to day and from week to week, so the passenger flow of the same time period on the day before the current period and the passenger flow of the same time period one week before the current period are selected as influence factors; considering that the passenger flow in the same time period of the previous day is also influenced by the passenger flow in the period before it, that period is also included in the influence factor set, and the period before the same time period of the previous week is included in the same way; in addition, holidays often have more passenger flow than non-holidays, and working days and weekends differ in passenger flow volume and in its time distribution, so the day type (working day, weekend or holiday) is added as an influence factor;
after the influence factors are selected, a random forest algorithm ranks them by their importance to the result, the factors of low importance are removed, and the factors of high importance are selected as the final influence factor set of the BP network model.
7. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 6, wherein: selecting the parameters and related functions of the BP neural network specifically comprises: determining the number of hidden layers, determining the number of hidden nodes, determining the transfer function, determining the learning rate, determining the number of iterations, and selecting the gradient descent algorithm.
8. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 7, wherein: the number of hidden layers is one; the number of hidden nodes is 24; the transfer function is a Sigmoid function; the learning rate is 0.01; the number of iterations is 60000; the gradient descent algorithm is the Adam algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111387081.1A CN114117903B (en) | 2021-11-22 | 2021-11-22 | Short-time passenger flow prediction method for rail transit based on bp neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114117903A true CN114117903A (en) | 2022-03-01 |
CN114117903B CN114117903B (en) | 2024-10-01 |
Family
ID=80439180
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111387081.1A Active CN114117903B (en) | 2021-11-22 | 2021-11-22 | Short-time passenger flow prediction method for rail transit based on bp neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114117903B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114707726A (en) * | 2022-04-02 | 2022-07-05 | 武汉氢客科技有限公司 | Passenger flow volume prediction method, system and storage medium based on Elman neural network |
CN116778739A (en) * | 2023-06-20 | 2023-09-19 | 深圳市中车智联科技有限公司 | Public transportation scheduling method and system based on demand response |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016045195A1 (en) * | 2014-09-22 | 2016-03-31 | 北京交通大学 | Passenger flow estimation method for urban rail network |
CN109086926A (en) * | 2018-07-25 | 2018-12-25 | 南京理工大学 | A kind of track traffic for passenger flow prediction technique in short-term based on combination neural net structure |
CN109558985A (en) * | 2018-12-10 | 2019-04-02 | 南通科技职业学院 | A kind of bus passenger flow amount prediction technique based on BP neural network |
CN110276474A (en) * | 2019-05-22 | 2019-09-24 | 南京理工大学 | A kind of track traffic station passenger flow forecasting in short-term |
CN110516866A (en) * | 2019-08-21 | 2019-11-29 | 上海工程技术大学 | A kind of real-time estimation method for handing over subway crowding for city rail |
WO2020125716A1 (en) * | 2018-12-21 | 2020-06-25 | 中兴通讯股份有限公司 | Method for realizing network optimization and related device |
CN111931978A (en) * | 2020-06-29 | 2020-11-13 | 南京熊猫电子股份有限公司 | Urban rail transit passenger flow state prediction method based on space-time characteristics |
CN112149902A (en) * | 2020-09-23 | 2020-12-29 | 吉林大学 | Subway short-time arrival passenger flow prediction method based on passenger flow characteristic analysis |
WO2021098619A1 (en) * | 2019-11-19 | 2021-05-27 | 中国科学院深圳先进技术研究院 | Short-term subway passenger flow prediction method, system and electronic device |
CN113298314A (en) * | 2021-06-10 | 2021-08-24 | 重庆大学 | Rail transit passenger flow prediction method considering dynamic space-time correlation |
Non-Patent Citations (2)
Title |
---|
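袁坚;王鹏;王钺;杨欣;: "Urban rail transit passenger flow prediction method based on spatio-temporal features", Journal of Beijing Jiaotong University (北京交通大学学报), no. 06, 15 December 2017 (2017-12-15) * |
黎旭成;彭逸洲;吴宗翔;陈振武;: "Short-term passenger flow prediction for metro stations based on a deep spatio-temporal network", Traffic & Transportation (交通与运输), no. 2, 31 August 2020 (2020-08-31) * |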
Also Published As
Publication number | Publication date |
---|---|
CN114117903B (en) | 2024-10-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||