CN114117903A

CN114117903A - Rail transit short-time passenger flow prediction method based on bp neural network

Info

Publication number: CN114117903A
Application number: CN202111387081.1A
Authority: CN
Inventors: 张明明; 黄家琛; 张士雷; 廖舟; 彭思静; 龙瑾潇; 李思洋; 王成宝
Original assignee: Hefei University of Technology
Current assignee: Hefei University of Technology
Priority date: 2021-11-22
Filing date: 2021-11-22
Publication date: 2022-03-01
Anticipated expiration: 2041-11-22
Also published as: CN114117903B

Abstract

The invention discloses a short-time passenger flow prediction method for rail transit based on a BP neural network, which specifically comprises the following steps: preprocessing the original data; calculating the dissimilarity of the preprocessed data by calculating the distance between different date types; clustering according to the calculated dissimilarity by merging the two partitions with the minimum distance until all partitions are merged into one; drawing a clustering pedigree graph and carrying out cluster analysis; analyzing the periodicity of the temporal characteristics of passenger flow; analyzing the periodicity of the spatial characteristics of passenger flow; analyzing the periodicity of the spatio-temporal characteristics of passenger flow; extracting characteristic factors according to the results of the cluster analysis and the passenger flow periodicity analysis; and selecting the parameters and related functions of the BP neural network and determining its initial weights and thresholds.

Description

Rail transit short-time passenger flow prediction method based on bp neural network

Technical Field

The invention relates to the technical field of data statistics, in particular to a short-time passenger flow prediction method for rail transit based on a bp neural network.

Background

The method builds on a BP neural network improved by a genetic algorithm. The BP neural network model is a multilayer feedforward neural network model based on the error back-propagation algorithm. The input signal passes in sequence through the input layer, the hidden layer and the output layer of the network; it is processed in the hidden layer and finally emitted by the output layer. This process is the forward propagation of the signal. When the actual output of the output layer does not match the expected output, the error is propagated backward. During error back-propagation, the total error between the actual and expected output is apportioned to the neurons of each layer; each neuron corrects its weights and threshold according to the error it receives, so that the error of the next forward pass is reduced. After multiple iterations the error converges, model training is finished, and the resulting weights and thresholds can be used for data prediction.

In the process of using the BP network model, the model is often trapped in a locally optimal condition, and the accuracy of the model prediction result is greatly reduced.

Disclosure of Invention

The invention aims to make up for the defects of the prior art and provides a short-time passenger flow prediction method for rail transit based on a bp neural network.

The invention is realized by the following technical scheme:

a short-time passenger flow prediction method for rail transit based on a bp neural network specifically comprises the following steps:

(1) cluster analysis

(1.1) preprocessing the original data;

(1.2) calculating the dissimilarity of the preprocessed data by calculating the distance between different date types;

(1.3) clustering according to the calculated dissimilarity, and merging the two types of partitions with the minimum distance until all the partitions are merged into one partition;

(1.4) drawing a clustering pedigree graph, and carrying out clustering analysis;

(2) periodic analysis of passenger flow

(2.1) periodically analyzing the passenger flow time characteristic;

(2.2) periodically analyzing the space characteristics of passenger flow;

(2.3) periodically analyzing the space-time characteristics of passenger flow;

(3) extracting characteristic factors according to results of clustering analysis and passenger flow periodicity analysis;

(4) selecting parameters and related functions of the BP neural network, and determining an initial weight and a threshold of the BP neural network.

The clustering analysis is defined as follows: in data space A, a data set X consists of a number of data points x_i = (x_i1, x_i2, ···, x_id) ∈ A, where each feature (or dimension) x_ij of x_i may be enumerated or numerical. In the invention, x_ij represents the short-time passenger flow of every 5 min over the whole-day operating period for different date types. If there are N objects x_i in the data set X, the data set corresponds to an N × d matrix. The purpose of clustering is to partition the data set X into k partitions C_m (m = 1, ···, k); some objects may not belong to any partition, and these constitute the noise partition C_{k+1}.

X = C_1 ∪ C_2 ∪ ··· ∪ C_{k+1}

These segmentations constitute the results of the cluster analysis.

Preprocessing the original data in the step (1.1), specifically standardizing passenger flow volumes of the original data on different working days:

in the formula, S_j is the sample standard deviation and x is a feature point of the data set.
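The formula referred to here is not reproduced in this text. Assuming the usual standardized (z-score) transformation, which is consistent with S_j being defined as the sample standard deviation, it would take roughly the following form. The symbols x'_ij (standardized value) and x̄_j (sample mean of feature j) are notation introduced for this sketch and do not appear in the source.

```latex
x'_{ij} = \frac{x_{ij} - \bar{x}_j}{S_j}, \qquad
\bar{x}_j = \frac{1}{N}\sum_{i=1}^{N} x_{ij}, \qquad
S_j = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_{ij} - \bar{x}_j\right)^{2}}
```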

Calculating the dissimilarity degree of the preprocessed data in the step (1.2), which is specifically as follows:

the Euclidean distance is adopted as the quantity index of the dissimilarity degree:

the degree of dissimilarity between objects is usually expressed in terms of the distance between vectors, i.e., d(x_i, x_j), where x is a feature point of the data set.
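The distance formula itself is not reproduced in this text. For two d-dimensional objects x_i and x_j as defined in the clustering definition, the standard Euclidean distance is the following; the summation index l is notation introduced for this sketch.

```latex
d(x_i, x_j) = \sqrt{\sum_{l=1}^{d} \left(x_{il} - x_{jl}\right)^{2}}
```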

The periodic passenger flow analysis in the step (2) specifically comprises the following steps:

(2.1) periodic analysis of passenger flow time characteristics

Describing the in/out passenger flow of a station by adopting a one-dimensional time sequence, and dividing the distribution condition of the in/out passenger flow of each station in one day into a single peak type, a double peak type, a full peak type, a convex peak type and a non-peak type;

the unimodal distribution means that a station has an inbound peak and an outbound peak in the operating time of one day;

the bimodal distribution means that urban rail transit passenger flow forms a morning peak between 7:00 and 9:00 due to commuting to work and school, forms an evening peak between 17:00 and 19:00 due to leaving work and school, and changes little in other time periods, which belong to the off-peak (flat) period;

the full peak type distribution is no clear passenger flow valley, and the passenger flow volume in each time period all day is large;

the peak-free passenger flow distribution has no obvious peak, and the passenger flow in and out of the station all day is small due to the insufficient attraction of the surrounding environment to the passenger flow;

the convex peak type distribution occurs at stations near large-scale events;

(2.2) periodic analysis of spatial characteristics of passenger flow

Analyzing the passenger flow of urban rail transit from the spatial dimension, that is, analyzing the passenger flow conditions of different stations or different sections in the same time period; the station passenger flow is the sum of the boarding, alighting and transfer passenger flow of a station in unit time; describing the passenger flow of n stations in a certain time period by adopting an n-dimensional sequence;

the section passenger flow refers to the on-line passenger flow passing through each section of the urban rail transit line in unit time, and is divided into an up section passenger flow and a down section passenger flow according to the difference of the running directions of trains;

(2.3) periodic analysis of passenger flow spatio-temporal characteristics

The passenger flow data simultaneously change in two dimensions of time and space and show certain regularity, and the space-time characteristics of the passenger flow are displayed by constructing an independent urban rail transit line containing n stations; the short-time passenger flow prediction is to analyze the change rule and the relation existing in the passenger flow data according to the collected passenger flow data, and calculate the inbound and outbound passenger flow of each station in the next time period or several time periods, namely to predict the passenger flow data of the (m + Δt)-th column according to the data of the first m columns of the matrix F, wherein 1 ≤ m ≤ h, and h is the number of columns of the matrix.

Because urban rail transit is a fully closed system, the AFC system can acquire complete travel data, and each passenger has a pair of travel origin and destination points, commonly called an OD (Origin-Destination) pair. Normally, passengers leaving a station must have entered the rail transit system from another station earlier. That is, the outbound passenger flow from station S_n during the t-th period must consist of part of the inbound passenger flow of other stations in earlier periods, and the lag is mainly determined by the train running time between the two stations. This conclusion can be expressed simply: the outbound passenger flow from S_n in the t-th period is related to the inbound passenger flow of station S_j in the T_j-th period, where 1 ≤ j ≤ n-1. The expression of T_j is shown in the following formula.

Obviously, passengers who enter station S_j later than the T_j-th period cannot exit the rail transit system from S_n in the t-th period. Combining the historical inbound/outbound passenger flow data of each station yields a multidimensional passenger flow matrix F of size [S_j × (h+1)], shown as follows.

Each row of matrix F reflects the traffic characteristics of the stations in the time dimension. Each column of the matrix F represents the traffic characteristics of different stations at the same time in the spatial dimension. Therefore, the space-time characteristics of the urban rail transit passenger flow can be reflected by the formula.

The short-term passenger flow prediction of urban rail transit means studying the change rule and the relations existing in the collected passenger flow data and calculating the passenger flow of each station in the next time period or several time periods. That is, the passenger flow data of the (m + Δt)-th column is predicted according to part of the data of the first m columns of the matrix F, wherein 1 ≤ m ≤ h, and Δt is a time interval that is usually 1, 2 or 3.

Extracting characteristic factors according to the results of the clustering analysis and the passenger flow periodicity analysis in the step (3), wherein the characteristic factors are as follows:

clustering analysis is carried out on the stations, the stations are divided into 5 types, passenger flow of each station in each time period is selected from each type to be used as a sample set of the BP neural network, and a data set of 5 types is used as input data of a BP network model;

firstly, according to the time proximity characteristic, the passenger flows of the three periods immediately before the current time are each selected as influence factors; the periodicity analysis of passenger flow shows periodicity between days and between weeks, so the passenger flow of the same period one day before the current period and the passenger flow of the same period one week before the current period are selected as influence factors; considering that the passenger flow in the same period of the previous day is itself influenced by the period preceding it, that preceding period is also included in the influence factor set, and in the same way the period preceding the same period of the previous week is also included; in addition, holidays often have more passenger flow than non-holidays, and the temporal distribution of passenger flow differs between weekdays and weekends, so the date type of the current day (weekday, weekend or holiday) is added as an influence factor;

after the influence factors are selected, they are ranked by their importance to the result through a random forest algorithm, the factors of low importance are removed, and the factors of high importance are selected as the final influence factor set of the BP network model.

Selecting the parameters and related functions of the BP neural network specifically comprises: determining the number of hidden layers, determining the number of hidden nodes, determining the transfer function, determining the learning rate, determining the number of iterations, and selecting the gradient descent algorithm.

The number of hidden layers is one; the number of hidden nodes is 24; the transfer function is the Sigmoid function; the learning rate is 0.01; the number of iterations is 60000; the gradient descent algorithm is the Adam algorithm.

The invention has the advantages that:

the method realizes the short-time prediction of rail transit passenger flow data by means of a BP neural network model and an OD matrix; the invention improves the transfer function and the training algorithm of the network on the basis of the traditional BP neural network. The initial weights and thresholds of the neural network are optimized by means of a genetic algorithm, which prevents the model from being trapped in a local optimum that would otherwise greatly reduce the accuracy of the prediction result. The invention improves the passenger flow monitoring capability of rail transit, provides advance evaluation and prediction, and supports accurate control of passenger flow.

Drawings

FIG. 1 is a graph of the clustering lineage of the present invention.

Fig. 2 is an independent urban rail transit line with n stations according to the invention.

Detailed Description

basic steps of the bp neural network algorithm are as follows:

(1) assigning each weight W_ij in the BP neural network a small random non-zero real number, and setting an initial learning rate μ and a momentum factor α;

(2) inputting a first set of sample pairs of the N sets of input-output sample pairs;

(3) calculating an actual output value O of each layer of elements in the BP neural network according to the formula (1), wherein k is a k-th layer neural network, m is an m-th layer neural network, i is an ith element of a certain layer, and j is a jth element of the certain layer;

wherein,

is the output of the ith element of the kth layer;

is the input sum of the ith element of the kth layer;

is the connection weight from the i element of the k-1 layer to the j element of the k layer;

(4) adjusting the connection weight of each layer according to the formula (2)

wherein, when k = m,

when k < m,

in the formula,

is the result of output layer k, y_i is the output of the i-th neuron, and t is time t.

(5) returning to step (2), inputting the second sample pair, and cycling through the N sample pairs until W_ij stabilizes or the target prediction error is reached.
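Steps (1) to (5) can be illustrated with a minimal sketch in plain Python. The formulas (1) and (2) are not reproduced in this text, so the sketch assumes the textbook delta rule with a momentum term for a network with one hidden layer and Sigmoid units. The function name `train_bp`, the toy OR data, the values of `mu` and `alpha`, and the fixed epoch count are illustrative assumptions, not values taken from the invention.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_bp(samples, n_hidden=3, mu=0.1, alpha=0.9, epochs=2000, seed=0):
    """Online back-propagation with momentum; returns the summed error per epoch."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step (1): small random weights; the last entry of each row acts as the threshold.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    dw_h = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw_o = [0.0] * (n_hidden + 1)
    errors = []
    for _ in range(epochs):
        total = 0.0
        for x, target in samples:  # steps (2) and (5): cycle through the sample pairs
            xb = list(x) + [1.0]
            # Step (3): forward pass through the hidden layer and the output layer.
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            out = sigmoid(sum(w * v for w, v in zip(w_o, hb)))
            total += 0.5 * (target - out) ** 2
            # Step (4): error terms, output layer first, then hidden layer.
            delta_o = (target - out) * out * (1.0 - out)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                dw_o[j] = mu * delta_o * hb[j] + alpha * dw_o[j]
                w_o[j] += dw_o[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw_h[j][i] = mu * delta_h[j] * xb[i] + alpha * dw_h[j][i]
                    w_h[j][i] += dw_h[j][i]
        errors.append(total)
    return errors


# Toy data (logical OR), used only to show that the error shrinks during training.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
errors = train_bp(samples)
```

The sketch stops after a fixed number of epochs for simplicity; the stopping rule in step (5), which halts once the weights stabilize or the target error is reached, could replace the fixed loop.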

1. cluster analysis

The existing training data are rail transit records for 2019 and 2020 covering 8 lines and their stations, with the inbound and outbound passenger flow aggregated for each time period of each day. Before short-time passenger flow prediction is carried out, the date types must be divided, and the prediction is then carried out according to the divided date types.

Clustering divides data samples into different clusters or classes according to a specific standard (generally a distance criterion), such that the similarity among samples within a class is as large as possible and the difference between samples of different classes is as large as possible. That is, after cluster analysis the data of different classes are separated as far as possible, and the data of the same class are gathered together.

The clustering analysis is defined as follows: in data space A, a data set X consists of a number of data points x_i = (x_i1, x_i2, ···, x_id) ∈ A, where each feature (or dimension) x_ij of x_i may be enumerated or numerical. In the invention, x_ij represents the short-time passenger flow of every 5 min over the whole-day operating period for different date types. If there are N objects x_i in the data set X, the data set corresponds to an N × d matrix. The purpose of clustering is to partition the data set X into k partitions C_m (m = 1, ···, k); some objects may not belong to any partition, and these constitute the noise partition C_{k+1}. The cluster analysis satisfies the following formula:

X = C_1 ∪ C_2 ∪ ··· ∪ C_{k+1} (5)

these segmentations constitute the results of the cluster analysis.

There are many clustering methods; the method selected by the invention is based on the defined partitions and the distance between partitions. Specifically, the data of each working day initially form their own partition, the two partitions with the minimum distance are merged, and the process is repeated until all partitions are merged into one. The process produces a clustering pedigree graph. The specific steps are as follows:

(1) clustered data preprocessing

Prior to cluster analysis, the existing raw data are first preprocessed. The main reason is that, when samples are taken to measure the data, different variables have different dimensions and possibly different orders of magnitude; to compare the data together and then partition them, the original data must be preprocessed. Standardization is a commonly used method of data preprocessing. The passenger flow of the raw data is standardized across different working days according to formula (6):
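A minimal sketch of this standardization, assuming formula (6) is the usual z-score with the sample standard deviation (the formula itself is not reproduced in this text). The function name `standardize_columns` and the toy passenger counts are hypothetical.

```python
from statistics import mean, stdev


def standardize_columns(rows):
    """Z-score every column (feature) of a list of equal-length rows."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]  # sample standard deviation (N - 1 denominator)
    return [
        # A constant column has zero deviation; map it to 0.0 instead of dividing by zero.
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical data: 3 working days (rows) x 4 five-minute intervals (columns).
flows = [
    [120, 340, 560, 200],
    [100, 300, 600, 220],
    [140, 320, 580, 180],
]
z = standardize_columns(flows)
```

After the transformation each column has mean 0 and sample standard deviation 1, so intervals with very different passenger volumes contribute comparably to the distance computed in the next step.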

(2) calculating the degree of dissimilarity

In order to measure the similarity or difference degree, i.e. dissimilarity degree, between the passenger flow volumes of different types of dates, a cluster statistic is defined as a quantity index of cluster analysis, and then quantitative cluster analysis is completed according to the quantity index. The method of solving for dissimilarity is typically to calculate the distance between different date types. Euclidean distance (Euclid) is used herein as a quantitative indicator of dissimilarity:
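A minimal sketch of the dissimilarity calculation, using the standard Euclidean distance between every pair of standardized daily profiles. The function name `dissimilarity_matrix` and the toy profiles are hypothetical; `math.dist` requires Python 3.8 or later.

```python
import math


def dissimilarity_matrix(rows):
    """Pairwise Euclidean distance between rows (one row per date)."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Hypothetical two-dimensional profiles chosen so the distances are easy to check by hand.
profiles = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
d = dissimilarity_matrix(profiles)
```

The matrix is symmetric with a zero diagonal, so in practice only the upper triangle needs to be computed or stored.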

(3) clustering

And merging the two types of partitions with the minimum distance, and continuously repeating the process after merging until all the partitions are merged into one partition.
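A minimal sketch of this merging loop. The source does not state how the distance between two partitions that each contain several dates is measured, so the sketch assumes single linkage (the distance of the closest pair of members). The function name `agglomerate` and the toy points are hypothetical.

```python
import math


def agglomerate(points):
    """Repeatedly merge the two closest partitions; return the merge history."""
    clusters = {i: [i] for i in range(len(points))}  # partition id -> member indices
    history = []
    next_id = len(points)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Single linkage (an assumption): closest pair of members.
                dist = min(
                    math.dist(points[p], points[q])
                    for p in clusters[a]
                    for q in clusters[b]
                )
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        # Record (partition a, partition b, merge distance, size of the merged partition).
        history.append((a, b, dist, len(clusters[next_id])))
        next_id += 1
    return history


# Hypothetical one-dimensional points standing in for standardized daily profiles.
points = [[0.0], [1.0], [5.0], [5.5]]
merges = agglomerate(points)
```

The merge history holds the information needed to draw the clustering pedigree graph: each entry gives the two partitions joined and the distance at which they were joined.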

(4) Drawing clustering pedigree graph

According to the steps of the clustering process, a passenger flow distribution amount is selected as an example for clustering analysis, as shown in fig. 1.

4. Periodic analysis of passenger flow

The passenger flow of each station changes dynamically. Analyzing the spatio-temporal characteristics and periodic variation of the actual passenger flow of urban rail transit helps the management department adjust departure intervals and routing schemes and deploy station staff to organize passenger flow, so that the urban rail transit network reaches its maximum passenger capacity.

(1) Periodic analysis of passenger flow time characteristics

With the wide application of the AFC system, the urban rail transit operation department can collect passenger flow data from each independent AFC equipment terminal once every 15 minutes. Viewed in the time dimension, the invention adopts a one-dimensional time sequence to describe the in/out passenger flow of a station.

Under the influence of the geographical position and type of the station, the daily distribution of inbound and outbound passenger flow at each station can be mainly divided into a single peak type, a double peak type, a full peak type, a convex peak type and a non-peak type. The single peak type means that a station has an inbound peak and an outbound peak in the operating time of a day. The double peak type means that urban rail transit passenger flow forms a morning peak between 7:00 and 9:00 due to commuting to work and school, forms an evening peak between 17:00 and 19:00 due to leaving work and school, and changes little in other time periods, which belong to the off-peak (flat) period.

The full peak type is characterized by the absence of a clear passenger flow valley: the passenger flow is large in every time period of the day, and the station is generally located in a highly developed area or an area where public facilities are highly concentrated. The non-peak type has no obvious peak; in most cases the inbound and outbound passenger flow is small all day because the surrounding environment attracts few passengers, although the passenger flow of such stations may gradually increase as the city develops. The convex peak type usually occurs when a large event, such as a sporting event or a concert, is held, or when the weather changes abruptly. When a large-scale event ends, a station near the venue sees a sharp surge in inbound passenger flow, and after a certain time the outbound passenger flow of other hub stations may show a sharp peak.

(2) Periodic analysis of passenger flow spatial characteristics

The passenger flow of urban rail transit is analyzed from the spatial dimension by studying the passenger flow conditions of different stations or different sections in the same time period. The station passenger flow is the sum of the boarding, alighting and transfer passenger flow of a station in unit time (usually a peak hour or a day is taken as the unit time). The invention adopts an n-dimensional sequence to describe the passenger flow of n stations in a certain time period.

The section passenger flow refers to the on-train passenger flow passing through each section of the urban rail transit line in unit time (usually taking peak hours as a research period), and can be divided into an up-section passenger flow and a down-section passenger flow according to the different running directions of trains.

(3) Periodic analysis of passenger flow time-space characteristics

The urban rail transit trip data is typical space-time data, and the passenger flow data simultaneously changes in two dimensions of time and space and presents certain regularity. The outbound passenger flow volume of a certain station has a great relationship with the inbound and outbound passenger flow volume of each relevant station, so that the law is difficult to accurately describe by only using historical passenger flow data of one station. In order to show the space-time characteristics of passenger flow more clearly, the invention constructs an independent urban rail transit line comprising n stations, as shown in figure 2.

Because urban rail transit is a fully closed system, the AFC system can acquire complete travel data, and each passenger has a pair of travel origin and destination points, commonly called an OD (Origin-Destination) pair. Normally, passengers leaving a station must have entered the rail transit system from another station earlier. That is, the outbound passenger flow from station S_n during the t-th period must consist of part of the inbound passenger flow of other stations in earlier periods, and the lag is mainly determined by the train running time between the two stations. This conclusion can be expressed simply: the outbound passenger flow from S_n in the t-th period is related to the inbound passenger flow of station S_j in the T_j-th period, where 1 ≤ j ≤ n-1. The expression of T_j is shown in the following formula.

The short-term passenger flow prediction of urban rail transit means studying the change rule and the relations existing in the collected passenger flow data and calculating the passenger flow of each station in the next time period or several time periods. That is, the passenger flow data of the (m + Δt)-th column is predicted according to part of the data of the first m columns of the matrix F, wherein 1 ≤ m ≤ h, and Δt is a time interval that is usually 1, 2 or 3.
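The indexing of this prediction task can be sketched as follows: the first m columns of F form the inputs, and column m + Δt is the target. The function name `make_samples`, the 1-based column convention, and the toy matrix are assumptions made for illustration.

```python
def make_samples(F, m, delta_t=1):
    """Pair the first m columns of F with column m + delta_t (1-based) as the target."""
    target_col = m + delta_t
    if m < 1 or target_col > len(F[0]):
        raise ValueError("m + delta_t must stay within the columns of F")
    inputs = [row[:m] for row in F]                # one input vector per station (row)
    targets = [row[target_col - 1] for row in F]   # flow of each station at the target period
    return inputs, targets


# Hypothetical matrix: 2 stations (rows) x 5 time periods (columns).
F = [
    [10, 12, 15, 20, 18],
    [30, 28, 35, 40, 42],
]
X, y = make_samples(F, m=3, delta_t=2)
```

Here the first three periods of each station are used to predict the fifth, which corresponds to m = 3 and Δt = 2.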

5. Feature factor extraction

By carrying out cluster analysis, the stations are divided into 5 types; the passenger flow of each station in each time period is selected from each type as the sample set of the BP neural network, and the data sets of the 5 types are used as input data of the BP network model. This step makes the trained model more general, so that it predicts more accurately across different data types.

The prediction model mainly considers the influence of time characteristic factors. Firstly, according to the time proximity characteristic, namely that the inbound passenger flow to be predicted is strongly related to the inbound passenger flow of adjacent time periods, the passenger flows of the three periods immediately before the current time are each selected as influence factors. The analysis of passenger flow periodicity shows periodicity between days and between weeks, so the passenger flow of the same period one day before the current period and the passenger flow of the same period one week before the current period are selected as influence factors. Considering that the passenger flow in the same period of the previous day is itself influenced by the period preceding it, that preceding period is also included in the influence factor set, and the period preceding the same period of the previous week is treated in the same way. During data analysis it was found that holidays often have more passenger flow than non-holidays, and that the temporal distribution of passenger flow differs between weekdays and weekends, so the date type of the current day (weekday, weekend or holiday) is added as an influence factor.

TABLE 1 influencing factor selection

After the factors are selected through experience and data analysis, they are ranked by their importance to the result through an RF (random forest) algorithm, the factors of low importance are removed, and the factors of high importance are selected as the final influence factor set of the BP network model. The final set of influence factors is {X_i-1, X_i-2, X_i-3, X_i-1, X_i-7, X_}
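The candidate factors described in this section (before the random forest ranking) can be assembled as in the sketch below. It assumes the passenger flow of one station is stored as a flat list with a fixed number of periods per day. The function name `build_features`, the numeric coding of the date type, and the toy series are hypothetical; the random forest ranking itself is not shown.

```python
def build_features(series, i, periods_per_day, day_type):
    """Candidate influence factors for predicting series[i]."""
    day = periods_per_day
    week = 7 * periods_per_day
    if i - week - 1 < 0:
        raise ValueError("need at least one week plus one period of history")
    return [
        series[i - 1], series[i - 2], series[i - 3],  # three preceding periods
        series[i - day], series[i - day - 1],         # same period yesterday, and the one before it
        series[i - week], series[i - week - 1],       # same period last week, and the one before it
        day_type,                                     # hypothetical coding: 0 weekday, 1 weekend, 2 holiday
    ]


# Toy series: 10 days of 4 periods each, with value 100 + index so lags are easy to verify.
periods_per_day = 4
series = list(range(100, 140))
features = build_features(series, 30, periods_per_day, day_type=0)
```

With real data the number of periods per day would follow from the sampling interval and the operating hours, and the feature vectors would then be passed to the importance ranking step.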

6. Neural network parameter and correlation function selection

The accuracy of the prediction model is influenced by the parameters of the BP neural network model, and the parameters need to be carefully selected.

1) Determination of the number of hidden layers: a three-layer BP neural network can already complete the prediction of data. Prediction precision can improve as the number of hidden layers increases, but the training time of the network increases greatly. On balance, the prediction model uses one hidden layer.

2) Determination of the number of hidden nodes: in previous studies, the number of hidden nodes is generally considered to be related to the dimensions of the input vector and the output vector, and different experts have proposed different methods for determining it. The prediction model of the invention first estimates the number of hidden-layer nodes with the 2n + 1 rule and then varies the number over repeated experiments; 24 hidden nodes were finally chosen, which effectively improves the precision of the model.

3) Determination of the transfer function: the prediction model of the invention uses a Sigmoid function as the transfer function of the BP network model. The Sigmoid function is smooth and easy to differentiate, which matters because, during error back-propagation, the share of the total error attributed to each weight and threshold is obtained by differentiation.

4) Determination of the learning rate: the learning rate determines how much the weights and thresholds change during training. A learning rate that is too large makes training oscillate and harms the prediction result; one that is too small slows convergence. Previous studies generally choose a value between 0.01 and 0.8; by repeatedly adjusting the learning rate and comparing the results, 0.01 was selected as the learning rate of the BP network model.

5) Determination of the number of iterations: based on experience from previous experiments and on comparison with other well-performing prediction models similar to that of the invention, the range of the number of iterations was first estimated as 10000 to 100000. The number was then adjusted over multiple experiments, and 60000 was finally obtained as the optimal number of iterations of the prediction model.

6) Selection of the gradient descent algorithm: after consulting multiple sources and comparing the advantages and disadvantages of the gradient descent algorithms, the prediction model selects the Adam algorithm. Adam is computationally efficient, requires little memory, suits optimization problems with large-scale data and parameters, suits non-stationary targets, and generally needs very little parameter tuning. Because the invention predicts subway passenger flow and the data volume is large, an algorithm suited to large-scale data fits the prediction model, and Adam is therefore the choice for the prediction model.
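Two of the choices above, the Sigmoid transfer function and the Adam update, can be sketched in plain Python. The learning rate 0.01 is the value stated above; the values of `beta1`, `beta2` and `eps` are the commonly used Adam defaults and are an assumption, since the source does not state them. The function names and the one-dimensional toy objective are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative expressed through the function value: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


def adam_minimize(grad, x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=5000):
    """Minimize a one-dimensional function given its gradient, using Adam updates."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of the gradient
        v = beta2 * v + (1 - beta2) * g * g    # running mean of the squared gradient
        m_hat = m / (1 - beta1 ** t)           # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3); its minimum is at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
```

In a full network the same update would be applied to every weight and threshold, each with its own running averages `m` and `v`.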

Claims

1. A short-time passenger flow prediction method for rail transit based on a bp neural network is characterized by comprising the following steps: the method specifically comprises the following steps:

(1) cluster analysis

(1.1) preprocessing the original data;

(2) periodic analysis of passenger flow

(2.1) periodically analyzing the passenger flow time characteristic;

(2.2) periodically analyzing the space characteristics of passenger flow;

(2.3) periodically analyzing the space-time characteristics of passenger flow;

2. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 1, wherein:

in the formula, S_j is the sample standard deviation and x is a feature point of the data set.

3. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 2, wherein: calculating the dissimilarity degree of the preprocessed data in the step (1.2), which is specifically as follows:

in the formula, d(x_i, x_j) is the degree of dissimilarity between objects, usually the distance between vectors.
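The standardization, dissimilarity, and merging steps can be sketched as follows. The formulas themselves are not reproduced in this text, so the sketch assumes z-score standardization with the sample standard deviation and Euclidean distance as d(x_i, x_j). The merge rule (single linkage, i.e. the distance between two partitions is the smallest pairwise distance), the function names, and the sample values are likewise assumptions for illustration only.

```python
import math
from statistics import mean, stdev


def standardize(rows):
    """Z-score each column using the sample standard deviation (assumed form).

    A constant column would give a zero standard deviation and is not handled.
    """
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return [[(x - mu) / sd for x, mu, sd in zip(row, mus, sds)] for row in rows]


def dissimilarity(a, b):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.dist(a, b)


def agglomerate(points, labels):
    """Repeatedly merge the two closest partitions until one remains.

    Returns the merge history as (labels_a, labels_b, distance) tuples,
    which is the information a clustering pedigree graph is drawn from.
    """
    clusters = [([lab], [p]) for lab, p in zip(labels, points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: smallest distance between any two members.
                d = min(dissimilarity(p, q)
                        for p in clusters[i][1] for q in clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i][0], clusters[j][0], d))
        merged = (clusters[i][0] + clusters[j][0],
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical daily features: [morning-peak flow, evening-peak flow].
days = ["Mon", "Tue", "Sat", "Sun"]
raw = [[100, 90], [102, 92], [40, 50], [38, 47]]
merges = agglomerate(standardize(raw), days)
```

With these sample values the two weekdays are merged first and the two weekend days second, which mirrors the idea of grouping date types by passenger flow similarity.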

4. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 3, wherein: the periodic passenger flow analysis in the step (2) specifically comprises the following steps:

(2.1) periodic analysis of passenger flow time characteristics

the convex-peak distribution occurs near large-scale events;

(2.2) periodic analysis of spatial characteristics of passenger flow

(2.3) periodic analysis of passenger flow spatio-temporal characteristics

the passenger flow data change simultaneously in the two dimensions of time and space and show a certain regularity, and the space-time characteristics of the passenger flow are displayed by constructing an independent urban rail transit line containing n stations; short-time passenger flow prediction analyzes the change rules and relationships in the collected passenger flow data and calculates the inbound and outbound passenger flow of each station in the next time period or in several following time periods, that is, it predicts the passenger flow data of the (m+Δt)-th row from the data of the first m rows of a matrix F, where 1 ≤ m ≤ h, Δt is the time period interval, the matrix F is a multi-dimensional passenger flow matrix obtained by combining the historical inbound/outbound passenger flow data, and h is the number of rows of the matrix.

5. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 4, wherein: each passenger has a pair of travel origin-destination points, namely an OD pair; the outbound passenger flow from station S_n during the t-th time period must consist of part of the inbound passenger flow at other stations during an earlier period, and this period is determined by the train's travel time between the two stations; that is, the outbound passenger flow from S_n during the t-th time period is related to the inbound passenger flow at S_j during the T_j-th time period, where 1 ≤ j ≤ n-1, and the expression of T_j is shown in the following formula,

inbound passenger flow at S_j later than the T_j-th time period cannot exit the rail transit system from station S_n during the t-th time period; combining the historical inbound/outbound passenger flow data of each station yields a multi-dimensional passenger flow matrix of size [S_j×(h+1)], shown as follows:

each row of the matrix F reflects the passenger flow characteristics of each station in the time dimension, and each column of the matrix F represents the passenger flow characteristics of different stations at the same time in the space dimension, so the matrix above captures the temporal and spatial characteristics of urban rail transit passenger flow simultaneously.
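The row-based prediction task stated in claim 4 (predict row m+Δt of F from the preceding m rows) can be sketched as a sample-construction step. The sketch assumes that rows of F are indexed by time period and columns by station, and it slides the m-row window along the matrix to obtain several training pairs; the sliding window, the function name `make_samples`, and the sample values are assumptions for illustration and are not stated in the claims.

```python
def make_samples(F, m, delta_t=1):
    """Pair each window of m consecutive rows with the row delta_t periods later.

    F is a list of rows; each row holds the flow of every station in one
    time period (assumed layout). The window is flattened into one feature
    vector so it can be fed to a network with a fixed number of inputs.
    """
    samples = []
    for start in range(len(F) - m - delta_t + 1):
        window = F[start:start + m]
        target = F[start + m + delta_t - 1]  # the (m + delta_t)-th row of the window
        features = [value for row in window for value in row]
        samples.append((features, target))
    return samples


# Hypothetical flows for 2 stations over 4 time periods.
F = [[1, 2], [3, 4], [5, 6], [7, 8]]
pairs = make_samples(F, m=2, delta_t=1)
```

Each pair holds a flattened input window and the row to be predicted, so the first element of `pairs` maps the first two time periods to the third.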

6. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 5, wherein: extracting characteristic factors according to the results of the clustering analysis and the passenger flow periodicity analysis in the step (3), wherein the characteristic factors are as follows:

7. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 6, wherein: selecting the parameters and related functions of the BP neural network specifically comprises: determining the number of hidden layers, determining the number of hidden nodes, determining the transfer function, determining the learning rate, determining the number of iterations, and selecting the gradient descent algorithm.

8. The method for predicting short-time passenger flow based on the bp neural network as claimed in claim 7, wherein: the number of hidden layers is one; the number of hidden nodes is 24; the transfer function is a Sigmoid function; the learning rate is 0.01; the number of iterations is 60000; the gradient descent algorithm is the Adam algorithm.