CN114117903A - Rail transit short-time passenger flow prediction method based on bp neural network - Google Patents

Info

Publication number
CN114117903A
Authority
CN
China
Prior art keywords
passenger flow
time
neural network
data
station
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111387081.1A
Other languages
Chinese (zh)
Other versions
CN114117903B (en)
Inventor
张明明
黄家琛
张士雷
廖舟
彭思静
龙瑾潇
李思洋
王成宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202111387081.1A
Publication of CN114117903A
Application granted
Publication of CN114117903B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/02 Reliability analysis or reliability optimisation; Failure analysis, e.g. worst case scenario performance, failure mode and effects analysis [FMEA]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Artificial Intelligence (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a short-time passenger flow prediction method for rail transit based on a BP neural network, which specifically comprises the following steps: preprocessing the original data; calculating the dissimilarity of the preprocessed data by calculating the distance between different date types; clustering according to the calculated dissimilarity by merging the two partitions with the minimum distance until all partitions are merged into one; drawing a clustering pedigree graph and carrying out cluster analysis; analyzing the periodicity of the temporal characteristics of passenger flow; analyzing the periodicity of the spatial characteristics of passenger flow; analyzing the periodicity of the spatio-temporal characteristics of passenger flow; extracting characteristic factors according to the results of the cluster analysis and the passenger flow periodicity analysis; and selecting the parameters and related functions of the BP neural network and determining its initial weights and thresholds.

Description

Rail transit short-time passenger flow prediction method based on bp neural network
Technical Field
The invention relates to the technical field of data statistics, and in particular to a short-time passenger flow prediction method for rail transit based on a BP neural network.
Background
The invention builds on a BP neural network improved by a genetic algorithm. The BP neural network model is a multilayer feedforward neural network model based on the error back-propagation algorithm. The input signal passes in sequence through the input layer, the hidden layer, and the output layer of the BP neural network; it is processed in the hidden layer and finally emitted by the output layer. This process is the forward propagation of the signal. When the actual output of the output layer does not match the expected output, the error is propagated backward. During back-propagation, the total error between the actual output and the expected output is distributed to the neurons of each layer; each neuron receives its share of the error and corrects its weights and threshold accordingly, so that the error of the next forward pass is reduced. After many iterations the error converges and model training is finished; the resulting weights and thresholds can then be used for data prediction.
In practice, a BP network model is often trapped in a local optimum, which greatly reduces the accuracy of its predictions.
Disclosure of Invention
The invention aims to make up for the defects of the prior art and provides a short-time passenger flow prediction method for rail transit based on a bp neural network.
The invention is realized by the following technical scheme:
A short-time passenger flow prediction method for rail transit based on a BP neural network specifically comprises the following steps:
(1) cluster analysis
(1.1) preprocessing the original data;
(1.2) calculating the dissimilarity of the preprocessed data by calculating the distance between different date types;
(1.3) clustering according to the calculated dissimilarity by merging the two partitions with the minimum distance, and repeating until all partitions are merged into one;
(1.4) drawing a clustering pedigree graph (dendrogram) and carrying out cluster analysis;
(2) periodicity analysis of passenger flow
(2.1) analyzing the periodicity of the temporal characteristics of passenger flow;
(2.2) analyzing the periodicity of the spatial characteristics of passenger flow;
(2.3) analyzing the periodicity of the spatio-temporal characteristics of passenger flow;
(3) extracting characteristic factors according to the results of the cluster analysis and the passenger flow periodicity analysis;
(4) selecting the parameters and related functions of the BP neural network, and determining the initial weights and thresholds of the BP neural network.
Cluster analysis is defined as follows. In a data space $A$, a data set $X$ consists of a number of data points $x_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in A$, where each feature (or dimension) $x_{ij}$ of $x_i$ may be enumerated or numerical. In this invention, $x_{ij}$ represents the short-time passenger flow in each 5-minute interval of the full-day operating period for different date types. If the data set $X$ contains $N$ objects $x_i$, it corresponds to an $N \times d$ matrix. The purpose of clustering is to divide the data set $X$ into $k$ partitions $C_m$ ($m = 1, \ldots, k$); some objects may not belong to any partition, and these form the noise partition $C_{k+1}$.
$X = C_1 \cup C_2 \cup \cdots \cup C_{k+1}$
Figure BDA0003367480630000024
These partitions constitute the result of the cluster analysis.
In step (1.1), the original data is preprocessed by standardizing its passenger flow across different working days:
Figure BDA0003367480630000021
Figure BDA0003367480630000022
Figure BDA0003367480630000023
where $S_j$ is the sample standard deviation and $x$ is a feature point of the data set.
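As an illustration only, the standardization step can be sketched as a column-wise z-score. The exact formulas are in the figures above and are not reproduced in the text, so this sketch assumes the common form $(x_{ij} - \bar{x}_j) / S_j$ with $S_j$ the sample standard deviation of column $j$; the function name `standardize_columns` and the data are hypothetical.

```python
import statistics

def standardize_columns(rows):
    """Z-score each column: (x_ij - mean_j) / S_j (assumed form of the standardization)."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # statistics.stdev uses the n-1 (sample) denominator.
    stdevs = [statistics.stdev(col) for col in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stdevs)]
        for row in rows
    ]

# Hypothetical data: 3 days (rows) x 4 five-minute intervals (columns).
flows = [
    [120, 300, 280, 90],
    [100, 320, 260, 110],
    [140, 310, 300, 100],
]
z = standardize_columns(flows)
```

After this transform each column has mean 0 and sample standard deviation 1, so intervals with very different volumes contribute comparably to the distance calculation that follows.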
In step (1.2), the dissimilarity of the preprocessed data is calculated as follows:
the Euclidean distance is adopted as the quantitative index of dissimilarity:
Figure BDA0003367480630000031
The dissimilarity between objects is usually expressed as the distance between their vectors, i.e., $d(x_i, x_j)$, where $x$ is a feature point of the data set.
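A minimal sketch of the dissimilarity calculation, assuming each date type is represented by one standardized flow vector. The helper name `dissimilarity_matrix` and the sample vectors are hypothetical; `math.dist` computes the Euclidean distance.

```python
import math

def dissimilarity_matrix(profiles):
    """Return the symmetric matrix of pairwise Euclidean distances d(x_i, x_j)."""
    n = len(profiles)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = math.dist(profiles[i], profiles[j])
    return d

# Hypothetical two-feature profiles for three date types.
profiles = [[0, 0], [3, 4], [6, 8]]
d = dissimilarity_matrix(profiles)
```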
The periodic passenger flow analysis in the step (2) specifically comprises the following steps:
(2.1) periodic analysis of passenger flow time characteristics
The inbound/outbound passenger flow of a station is described by a one-dimensional time series, and the daily distribution of each station's inbound/outbound passenger flow is classified as single-peak, double-peak, full-peak, convex-peak, or no-peak;
the single-peak distribution means that a station has an inbound peak and an outbound peak during the day's operating hours;
the double-peak distribution means that urban rail transit passenger flow forms a morning peak between 7:00 and 9:00 because of travel to work and school, forms an evening peak between 17:00 and 19:00 because of travel home from work and school, and changes little in the other periods, which are off-peak (flat) periods;
the full-peak distribution has no clear passenger flow trough, and the passenger flow is large in every period of the day;
the no-peak distribution has no obvious peak, and the station's inbound and outbound passenger flow is small all day because the surrounding area attracts few passengers;
the convex-peak distribution occurs at stations near large-scale events;
(2.2) periodic analysis of spatial characteristics of passenger flow
The passenger flow of urban rail transit is analyzed in the spatial dimension, that is, the passenger flow of different stations or different sections in the same time period is analyzed; station passenger flow is the sum of a station's boarding, alighting, and transfer passenger flow per unit time; an n-dimensional sequence is used to describe the passenger flow of n stations in a given time period;
section passenger flow is the on-train passenger flow passing through each section of an urban rail transit line per unit time, and is divided into up-direction and down-direction section passenger flow according to the running direction of the trains;
(2.3) periodic analysis of passenger flow spatio-temporal characteristics
Passenger flow data changes simultaneously in the two dimensions of time and space and shows a certain regularity; the spatio-temporal characteristics of passenger flow are illustrated by constructing an independent urban rail transit line containing n stations. Short-time passenger flow prediction analyzes the patterns and relationships in the collected passenger flow data and calculates the inbound and outbound passenger flow of each station in the next time period or next several time periods; that is, it predicts the passenger flow data of column $(m + \Delta t)$ from the data of the first $m$ columns of the matrix $F$, where $1 \le m \le h$ and $h$ is the number of columns of the matrix.
Because urban rail transit is a completely closed system, the AFC (automatic fare collection) system can capture complete travel data, and each passenger has a pair of travel origin and destination points, commonly called an OD (Origin-Destination) pair. Normally, a passenger leaving through a station must previously have entered the rail transit system at another station. That is, the outbound passenger flow from $S_n$ during period $t$ must consist of part of the inbound passenger flow of the other stations during an earlier period, and that period is mainly determined by the train running time between the two stations. This conclusion can be stated simply: the outbound passenger flow from $S_n$ in period $t$ is related to the inbound passenger flow at $S_j$ in period $T_j$, where $1 \le j \le n-1$. The expression for $T_j$ is given by the following formula.
Figure BDA0003367480630000041
Obviously, passenger flow entering at $S_j$ later than period $T_j$ cannot exit the rail transit system from $S_n$ in period $t$. Combining the historical inbound/outbound passenger flow data of each station yields a multidimensional passenger flow matrix of size [$S_j$ × (h+1)], shown below.
Figure BDA0003367480630000042
Each row of the matrix $F$ reflects the passenger flow characteristics of one station in the time dimension. Each column of $F$ reflects the passenger flow characteristics of the different stations at the same time, in the spatial dimension. The matrix therefore reflects the spatio-temporal characteristics of urban rail transit passenger flow.
Short-time passenger flow prediction for urban rail transit means studying the patterns and relationships in the collected passenger flow data and calculating the passenger flow of each station in the next time period or next several time periods. That is, the passenger flow data of column $(m + \Delta t)$ is predicted from part of the data in the first $m$ columns of the matrix $F$, where $1 \le m \le h$ and $\Delta t$ is the prediction interval, usually 1, 2, or 3.
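The column-based prediction setup can be sketched as follows. This is only an illustration of how inputs and targets might be cut from a historical matrix $F$ (rows are stations, columns are time periods); the function name `make_samples` and the numbers are hypothetical, and the sketch assumes the target column is already present in the historical data.

```python
def make_samples(F, m, delta_t=1):
    """Split F into model inputs (first m columns) and the target column m + delta_t.

    Column numbers are 1-based, as in the text; Python indices are 0-based.
    """
    n_cols = len(F[0])
    if m < 1 or m + delta_t > n_cols:
        raise ValueError("need m >= 1 and m + delta_t <= number of columns")
    inputs = [row[:m] for row in F]
    target = [row[m + delta_t - 1] for row in F]
    return inputs, target

# Hypothetical matrix: 2 stations x 4 time periods.
F = [
    [10, 12, 15, 20],
    [5, 6, 9, 11],
]
inputs, target = make_samples(F, m=3, delta_t=1)
```

Here `inputs` holds the first three periods of each station and `target` holds the fourth, which is the column a trained model would be asked to predict.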
In step (3), the characteristic factors are extracted from the results of the cluster analysis and the passenger flow periodicity analysis as follows:
cluster analysis is carried out on the stations, dividing them into 5 types; the passenger flow of each station in each time period is selected from each type as the sample set of the BP neural network, and the data sets of the 5 types are used as the input data of the BP network model;
first, based on temporal proximity, the passenger flow of the three periods immediately before the current period is selected as influence factors; the periodicity analysis of passenger flow shows day-to-day and week-to-week periodicity, so the passenger flow of the same period one day earlier and of the same period one week earlier is also selected as influence factors; because the passenger flow of the same period on the previous day is itself influenced by the period preceding it, that preceding period is also included in the influence factor set, and likewise the period preceding the same period one week earlier; in addition, holidays often have more passenger flow than non-holidays, and weekdays and weekends differ in both passenger flow volume and its distribution over time, so the day type (weekday, weekend, or holiday) is added as an influence factor;
after the influence factors are selected, they are ranked by their importance to the result using a random forest algorithm; factors of low importance are removed, and factors of high importance form the final influence factor set of the BP network model.
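The candidate influence factors described above can be sketched as lagged lookups into one station's flow series. This is an illustration under assumptions: the series is a flat list with a fixed number of periods per day, and the function name, dictionary keys, and day-type encoding are hypothetical. The random forest ranking step is not shown.

```python
def build_features(series, t, periods_per_day, day_type):
    """Return the candidate influence factors for predicting series[t]."""
    day, week = periods_per_day, 7 * periods_per_day
    if t - week - 1 < 0:
        raise ValueError("need at least one week plus one period of history")
    return {
        "lag_1": series[t - 1],                     # three preceding periods
        "lag_2": series[t - 2],
        "lag_3": series[t - 3],
        "prev_day_same": series[t - day],           # same period, previous day
        "prev_day_before": series[t - day - 1],     # period before that
        "prev_week_same": series[t - week],         # same period, previous week
        "prev_week_before": series[t - week - 1],   # period before that
        "day_type": day_type,  # assumed encoding: 0 weekday, 1 weekend, 2 holiday
    }

# Toy series with 4 periods per day so the offsets are easy to follow.
feats = build_features(list(range(100)), t=40, periods_per_day=4, day_type=0)
```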
Selecting the parameters and related functions of the BP neural network specifically comprises: determining the number of hidden layers, determining the number of hidden nodes, determining the transfer function, determining the learning rate, determining the number of iterations, and selecting the gradient descent algorithm.
There is one hidden layer; the number of hidden nodes is 24; the transfer function is the Sigmoid function; the learning rate is 0.01; the number of iterations is 60000; and the gradient descent algorithm is the Adam algorithm.
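For reference, one update of the Adam algorithm in its standard published form can be sketched as below, using the learning rate 0.01 stated above. The decay rates `beta1`, `beta2` and the constant `eps` are the commonly used defaults and are assumptions here, since the text does not specify them; the toy objective is also hypothetical.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of parameters; returns (theta, m, v)."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)             # bias correction, t starts at 1
        v_hat = v_i / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3).
w, m, v = [0.0], [0.0], [0.0]
for t in range(1, 1001):
    grad = [2 * (w[0] - 3.0)]
    w, m, v = adam_step(w, grad, m, v, t)
```

In a BP network the same update would be applied to every weight and threshold, with the gradients supplied by back-propagation.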
The invention has the advantages that:
The method achieves short-time prediction of rail transit passenger flow data by means of a BP neural network model and an OD matrix. Building on the traditional BP neural network, the invention improves the network's transfer function and training algorithm. The initial weights and thresholds of the neural network are optimized by a genetic algorithm, which keeps the model from becoming trapped in a local optimum that would greatly reduce the accuracy of its predictions. The invention improves the passenger flow monitoring capability of rail transit, provides advance evaluation and prediction, and supports accurate passenger flow control.
Drawings
FIG. 1 is the clustering pedigree graph of the present invention.
FIG. 2 shows an independent urban rail transit line with n stations according to the invention.
Detailed Description
The basic steps of the BP neural network algorithm are as follows:
(1) assign each weight $W_{ij}$ in the BP neural network a small random non-zero real number, and set the initial learning rate $\mu$ and the momentum factor $\alpha$;
(2) input the first of the N input-output sample pairs;
(3) calculate the actual output value O of each layer of elements in the BP neural network according to formula (1), where k denotes the k-th layer of the network, m denotes the m-th layer, i denotes the i-th element of a layer, and j denotes the j-th element of a layer;
Figure BDA0003367480630000061
wherein,
Figure BDA0003367480630000062
is the output of the ith element of the kth layer;
Figure BDA0003367480630000063
is the input sum of the ith element of the kth layer;
Figure BDA0003367480630000064
is the connection weight from the i element of the k-1 layer to the j element of the k layer;
(4) adjusting the connection weight of each layer according to the formula (2)
Figure BDA0003367480630000065
where, when $k = m$,
Figure BDA0003367480630000066
and when $k < m$,
Figure BDA0003367480630000067
in these formulas,
Figure BDA0003367480630000068
is the result of the output layer k, $y_i$ is the output of the i-th neuron, and t is the time t.
(5) Return to step (2), input the second sample pair, and cycle through the N sample pairs until $W_{ij}$ is stable or the target prediction error is reached.
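Steps (1) to (5) can be sketched with a small pure-Python network. Formulas (1) and (2) appear only as figures, so this sketch assumes the textbook form: sigmoid units, squared error, and the momentum update ΔW(t) = -μ·δ·o + α·ΔW(t-1). The function name `train_bp`, the layer sizes, and the toy AND data are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bp(samples, n_hidden=4, mu=0.1, alpha=0.9, target_error=1e-3,
             max_epochs=20000, seed=0):
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Step (1): small random weights; the extra column acts as the threshold.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    dw1 = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
    dw2 = [[0.0] * (n_hidden + 1) for _ in range(n_out)]

    def forward(x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * a for w, a in zip(row, xb))) for row in w1] + [1.0]
        out = [sigmoid(sum(w * a for w, a in zip(row, hb))) for row in w2]
        return xb, hb, out

    total_error = float("inf")
    for _ in range(max_epochs):
        total_error = 0.0
        for x, y in samples:                  # steps (2) and (5): cycle the pairs
            xb, hb, out = forward(x)          # step (3): layer outputs
            total_error += 0.5 * sum((o - d) ** 2 for o, d in zip(out, y))
            # Step (4): output-layer deltas, then hidden-layer deltas.
            d_out = [(o - d) * o * (1 - o) for o, d in zip(out, y)]
            d_hid = [hb[j] * (1 - hb[j]) * sum(d_out[k] * w2[k][j] for k in range(n_out))
                     for j in range(n_hidden)]
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    dw2[k][j] = -mu * d_out[k] * hb[j] + alpha * dw2[k][j]
                    w2[k][j] += dw2[k][j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    dw1[j][i] = -mu * d_hid[j] * xb[i] + alpha * dw1[j][i]
                    w1[j][i] += dw1[j][i]
        if total_error < target_error:        # stop once the target error is reached
            break
    return (lambda x: forward(x)[2]), total_error

# Toy data: the logical AND function.
and_samples = [([0, 0], [0]), ([0, 1], [0]), ([1, 0], [0]), ([1, 1], [1])]
predict, final_error = train_bp(and_samples, max_epochs=2000)
```

The returned `predict` function runs the forward pass with the trained weights, and `final_error` is the summed squared error of the last training epoch.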
A short-time passenger flow prediction method for rail transit based on a bp neural network specifically comprises the following steps:
1. cluster analysis
The available training data is rail transit data for 2019 and 2020, covering 8 lines and their stations, in which the inbound and outbound passenger flow is aggregated for each time period of each day. Before short-time passenger flow prediction is carried out, the dates must be divided into types, and prediction is then carried out for each date type.
Clustering divides data samples into different clusters or classes according to a specific criterion (generally a distance criterion), so that the similarity among samples within a class is as large as possible and the difference among samples in different classes is as large as possible; that is, after cluster analysis, data of different classes are separated as far as possible and data of the same class are grouped together.
Cluster analysis is defined as follows. In a data space $A$, a data set $X$ consists of a number of data points $x_i = (x_{i1}, x_{i2}, \ldots, x_{id}) \in A$, where each feature (or dimension) $x_{ij}$ of $x_i$ may be enumerated or numerical. In the invention, $x_{ij}$ represents the short-time passenger flow in each 5-minute interval of the full-day operating period for different date types. If the data set $X$ contains $N$ objects $x_i$, it corresponds to an $N \times d$ matrix. The purpose of clustering is to divide the data set $X$ into $k$ partitions $C_m$ ($m = 1, \ldots, k$); some objects may not belong to any partition, and these form the noise partition $C_{k+1}$. The cluster analysis satisfies the following formulas:
$X = C_1 \cup C_2 \cup \cdots \cup C_{k+1}$ (5)
Figure BDA0003367480630000072
These partitions constitute the result of the cluster analysis.
There are many clustering methods; the method selected by the invention is based on the partitions defined above and the distances between them. Specifically, starting from the data of each working day, the two partitions with the minimum distance are identified and merged, and the process is repeated until all partitions are merged into one. This process yields a clustering pedigree graph (dendrogram). The specific steps are as follows:
(1) clustered data preprocessing
Before cluster analysis, the existing raw data is first preprocessed. The main reason is that when samples are measured, different variables have different dimensions and may differ by orders of magnitude; for the data to be compared together and then partitioned, the raw data must be preprocessed. Standardization is a commonly used preprocessing method. The passenger flow of the raw data is standardized across different working days:
Figure BDA0003367480630000081
Figure BDA0003367480630000082
Figure BDA0003367480630000083
(2) calculating the degree of dissimilarity
To measure the degree of similarity or difference (the dissimilarity) between the passenger flow of different date types, a clustering statistic is defined as the quantitative index of the cluster analysis, and the quantitative cluster analysis is then completed according to this index. Dissimilarity is typically obtained by calculating the distance between different date types. The Euclidean distance is used here as the quantitative index of dissimilarity:
Figure BDA0003367480630000084
(3) clustering
The two partitions with the minimum distance are merged, and the process is repeated after each merge until all partitions are merged into one.
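The merge loop can be sketched as a small agglomerative procedure. The text does not state how the distance between two multi-member partitions is measured, so this sketch assumes single linkage (the smallest member-to-member Euclidean distance); the function name `agglomerate` and the sample points are hypothetical.

```python
import math

def agglomerate(points):
    """Repeatedly merge the two closest partitions; return the merge history.

    Each history entry is (id_a, id_b, distance). Original points have ids
    0..n-1 and every merged partition receives the next free id.
    """
    clusters = {i: [i] for i in range(len(points))}
    history = []
    next_id = len(points)

    def linkage(a, b):
        # Single linkage (an assumption): smallest member-to-member distance.
        return min(math.dist(points[i], points[j])
                   for i in clusters[a] for j in clusters[b])

    while len(clusters) > 1:
        ids = sorted(clusters)
        a, b = min(
            ((p, q) for idx, p in enumerate(ids) for q in ids[idx + 1:]),
            key=lambda pair: linkage(*pair),
        )
        dist = linkage(a, b)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        history.append((a, b, dist))
        next_id += 1
    return history

# Hypothetical standardized profiles for five date types.
points = [[0, 0], [0, 1], [5, 5], [5, 6], [20, 20]]
history = agglomerate(points)
```

The merge history, read in order with its distances, is the information a clustering pedigree graph displays.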
(4) Drawing clustering pedigree graph
Following the steps of the clustering process above, one passenger flow distribution is selected as an example for cluster analysis, as shown in FIG. 1.
4. Periodic analysis of passenger flow
The passenger flow of each station changes dynamically. Analyzing the spatio-temporal characteristics and periodic patterns of actual urban rail transit passenger flow helps the management department adjust departure intervals and routing schemes and deploy station staff to organize passenger flow, so that the urban rail transit network is used to its maximum capacity.
(1) Periodic analysis of passenger flow time characteristics
With the wide application of AFC systems, the urban rail transit operating department can collect passenger flow data from each independent AFC terminal once every 15 minutes. In the time dimension, the invention uses a one-dimensional time series to describe the inbound/outbound passenger flow of a station.
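As a sketch of how such a one-dimensional series might be formed, the snippet below counts gate entries in fixed 15-minute bins. The input format (entry times as minutes after midnight), the function name `bin_counts`, and the sample times are assumptions made for illustration.

```python
from collections import Counter

def bin_counts(tap_minutes, bin_size=15, day_minutes=24 * 60):
    """Count taps per fixed-width bin; returns one value per bin of the day."""
    counts = Counter(minute // bin_size for minute in tap_minutes)
    return [counts.get(b, 0) for b in range(day_minutes // bin_size)]

# Hypothetical entry times at one station, in minutes after midnight.
entries = [421, 425, 433, 436, 449, 450, 1025]
series = bin_counts(entries)
```

With 15-minute bins a full day yields 96 values; index 28 covers 07:00 to 07:14, index 29 covers 07:15 to 07:29, and so on.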
Influenced by a station's geographical location and type, the daily distribution of its inbound and outbound passenger flow can be classified mainly as single-peak, double-peak, full-peak, convex-peak, or no-peak. The single-peak distribution means that a station has an inbound peak and an outbound peak during the day's operating hours. The double-peak distribution means that urban rail transit passenger flow forms a morning peak between 7:00 and 9:00 because of travel to work and school, forms an evening peak between 17:00 and 19:00 because of travel home from work and school, and changes little in the other periods, which are off-peak (flat) periods.
In the full-peak distribution there is no clear passenger flow trough and the passenger flow is large in every period of the day; such stations are generally located in highly developed areas or areas with a high concentration of public facilities. The no-peak distribution has no obvious peak; in most cases the station's inbound and outbound passenger flow is small all day because the surrounding area attracts few passengers, but the passenger flow of such stations may gradually increase as the city develops. The convex-peak distribution usually occurs when a large event, such as a sporting event or concert, is held, or when the weather changes abruptly. When a large event ends, the stations near the venue see a sharp increase in inbound passenger flow, and after some time the outbound passenger flow of other hub stations may also peak sharply.
(2) Periodic analysis of passenger flow spatial characteristics
Analyzing urban rail transit passenger flow in the spatial dimension means studying the passenger flow of different stations or different sections in the same time period. Station passenger flow is the sum of a station's boarding, alighting, and transfer passenger flow per unit time (usually a peak hour or a day). The invention uses an n-dimensional sequence to describe the passenger flow of n stations in a given time period.
Section passenger flow is the on-train passenger flow passing through each section of an urban rail transit line per unit time (usually a peak hour), and can be divided into up-direction and down-direction section passenger flow according to the running direction of the trains.
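On a single line, section passenger flow can be derived from an OD matrix by summing every OD pair whose trip crosses the section. The sketch below assumes stations are numbered along the line and that "up" means increasing station index; both the convention and the numbers are hypothetical.

```python
def up_section_flows(od):
    """od[i][j] = passengers entering at station i and exiting at station j.

    Returns the up-direction flow on each section k -> k+1, i.e. the sum of
    od[i][j] over all i <= k < j.
    """
    n = len(od)
    return [
        sum(od[i][j] for i in range(k + 1) for j in range(k + 1, n))
        for k in range(n - 1)
    ]

def down_section_flows(od):
    """Down-direction flow on each section k+1 -> k: sum of od[i][j] with j <= k < i."""
    n = len(od)
    return [
        sum(od[i][j] for i in range(k + 1, n) for j in range(k + 1))
        for k in range(n - 1)
    ]

# Hypothetical OD matrix for a 3-station line in one time period.
od = [
    [0, 10, 5],
    [4, 0, 7],
    [2, 3, 0],
]
up = up_section_flows(od)
down = down_section_flows(od)
```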
(3) Periodic analysis of passenger flow time-space characteristics
Urban rail transit trip data is typical spatio-temporal data: passenger flow changes simultaneously in the two dimensions of time and space and shows a certain regularity. The outbound passenger flow of a station is closely related to the inbound and outbound passenger flow of every related station, so this pattern is difficult to describe accurately using the historical passenger flow data of a single station alone. To show the spatio-temporal characteristics of passenger flow more clearly, the invention constructs an independent urban rail transit line containing n stations, as shown in FIG. 2.
Because urban rail transit is a completely closed system, the AFC system can capture complete travel data, and each passenger has a pair of travel origin and destination points, commonly called an OD (Origin-Destination) pair. Normally, a passenger leaving through a station must previously have entered the rail transit system at another station. That is, the outbound passenger flow from $S_n$ during period $t$ must consist of part of the inbound passenger flow of the other stations during an earlier period, and that period is mainly determined by the train running time between the two stations. This conclusion can be stated simply: the outbound passenger flow from $S_n$ in period $t$ is related to the inbound passenger flow at $S_j$ in period $T_j$, where $1 \le j \le n-1$. The expression for $T_j$ is given by the following formula.
Figure BDA0003367480630000101
Obviously, passenger flow entering at $S_j$ later than period $T_j$ cannot exit the rail transit system from $S_n$ in period $t$. Combining the historical inbound/outbound passenger flow data of each station yields a multidimensional passenger flow matrix of size [$S_j$ × (h+1)], shown below.
Figure BDA0003367480630000102
Each row of the matrix $F$ reflects the passenger flow characteristics of one station in the time dimension. Each column of $F$ reflects the passenger flow characteristics of the different stations at the same time, in the spatial dimension. The matrix therefore reflects the spatio-temporal characteristics of urban rail transit passenger flow.
Short-time passenger flow prediction for urban rail transit means studying the patterns and relationships in the collected passenger flow data and calculating the passenger flow of each station in the next time period or next several time periods. That is, the passenger flow data of column $(m + \Delta t)$ is predicted from part of the data in the first $m$ columns of the matrix $F$, where $1 \le m \le h$ and $\Delta t$ is the prediction interval, usually 1, 2, or 3.
5. Feature factor extraction
Cluster analysis is carried out on the stations, dividing them into 5 types; the passenger flow of each station in each time period is selected from each type as the sample set of the BP neural network, and the data sets of the 5 types are used as the input data of the BP network model. This step makes the trained model more general, so that it gives more accurate predictions for different types of data.
The prediction model mainly considers the influence of temporal factors. First, based on temporal proximity (the inbound passenger flow to be predicted is strongly related to the inbound passenger flow of adjacent periods), the passenger flow of the three periods before the current period is selected as influence factors. The analysis of passenger flow periodicity shows day-to-day and week-to-week periodicity, so the passenger flow of the same period one day earlier and of the same period one week earlier is also selected as influence factors. Because the passenger flow of the same period on the previous day is itself influenced by the period preceding it, that preceding period is also included in the influence factor set, and the period preceding the same period one week earlier is treated in the same way. The data analysis also shows that holidays often have more passenger flow than non-holidays, and that weekdays and weekends differ in both passenger flow volume and its distribution over time, so the day type (weekday, weekend, or holiday) is added as an influence factor.
TABLE 1 influencing factor selection
Figure BDA0003367480630000111
Figure BDA0003367480630000121
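The candidate factors described above (the three preceding periods, the same period one day and one week earlier together with the period before each, and the day type) can be assembled as in the following minimal sketch. The function name `build_features`, the flat per-station series layout, and the numeric day-type encoding are assumptions made for illustration only.

```python
def build_features(flow, i, periods_per_day, day_type):
    """Collect the candidate influence factors for predicting flow[i].

    flow            -- passenger counts of one station, one value per time period
    periods_per_day -- number of time periods in an operating day (assumed constant)
    day_type        -- assumed encoding: 0 = weekday, 1 = weekend, 2 = holiday
    """
    week = 7 * periods_per_day
    if i - week - 1 < 0:
        raise ValueError("need at least one week plus one period of history")
    return [
        flow[i - 1], flow[i - 2], flow[i - 3],  # three preceding periods
        flow[i - periods_per_day],              # same period, previous day
        flow[i - periods_per_day - 1],          # period before that
        flow[i - week],                         # same period, previous week
        flow[i - week - 1],                     # period before that
        day_type,
    ]


# Toy series: 10 days of 4 periods each, where the value equals its own index.
toy_flow = list(range(40))
print(build_features(toy_flow, i=30, periods_per_day=4, day_type=0))
# [29, 28, 27, 26, 25, 2, 1, 0]
```

Using a series whose values equal their indices makes it easy to check by eye which lags each factor refers to.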
After the factors are selected through experience and data analysis, they are ranked by their importance to the result with the RF (random forest) algorithm; the factors of low importance are removed and those of high importance are kept as the final influence factor set of the BP network model. The final set of influencing factors is {X_{i-1}, X_{i-2}, X_{i-3}, X_{i-1}, X_{i-7}, X_}.
6. Neural network parameter and correlation function selection
The accuracy of the prediction model is influenced by the parameters of the BP neural network model, and the parameters need to be carefully selected.
1) Determination of the number of hidden layers: a three-layer BP neural network can complete the prediction. Prediction accuracy can improve as the number of hidden layers increases, but the training time of the BP neural network increases greatly. On balance, the prediction model uses one hidden layer.
2) Determination of the number of hidden nodes: in conventional studies, the number of hidden nodes is generally considered to be related to the dimension of the input vector and the dimension of the output vector, and different experts have proposed different methods for determining it. The prediction model of the invention first roughly determines the number of hidden-layer nodes with the 2n+1 method, then varies the number of nodes over repeated experiments, and finally settles on 24 hidden nodes, which effectively improves the accuracy of the model.
3) Determination of the transfer function: the prediction model of the invention uses a Sigmoid function as the transfer function of the BP network model. The Sigmoid function is smooth and easy to differentiate. Because the error is propagated backward, the contribution of each weight and threshold to the total error is obtained by differentiation.
4) Determination of the learning rate: the learning rate determines how much the weights and thresholds change during training. A learning rate that is too large makes training oscillate and harms the prediction result; one that is too small slows convergence. Previous studies generally choose a value between 0.01 and 0.8; after repeatedly adjusting the learning rate and comparing results, 0.01 is selected as the learning rate of the BP network model.
5) Determination of the number of iterations: based on experience from previous experiments and comparison with other well-performing prediction models similar to the prediction model of the invention, the range of the number of iterations is first roughly set to 10000 to 100000. The number of iterations is then adjusted over multiple experiments, and 60000 is finally obtained as the optimal number of iterations for the prediction model.
6) Selection of the gradient descent algorithm: after consulting multiple sources and comparing the advantages and disadvantages of the available gradient descent algorithms, the prediction model selects the Adam algorithm. The Adam algorithm is computationally efficient, requires little memory, is suited to optimization problems with large-scale data and parameters, is suited to non-stationary objectives, and generally needs very little parameter tuning. Subway passenger flow prediction involves a large data volume, so an algorithm suited to large-scale data fits the prediction model, and Adam is the optimal choice for it.
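The parameter choices above (one hidden layer, 24 hidden nodes, Sigmoid transfer function, learning rate 0.01, Adam) can be put together in a minimal pure-Python sketch. Several details are assumptions that the text does not state: the linear output node, the squared-error loss, the uniform weight initialization, the standard Adam constants (0.9, 0.999, 1e-8), full-batch updates, and the synthetic training data. The class name `BPNetwork` is hypothetical, and only 300 iterations are run here instead of 60000 to keep the example fast.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BPNetwork:
    """One sigmoid hidden layer and a linear output, trained with Adam (illustrative)."""

    def __init__(self, n_in, n_hidden=24, lr=0.01, seed=0):
        rng = random.Random(seed)
        self.n_in, self.n_hidden, self.lr = n_in, n_hidden, lr
        # Flat parameter vector: hidden weights, hidden thresholds,
        # output weights, output threshold.
        self.o_b1 = n_hidden * n_in
        self.o_w2 = self.o_b1 + n_hidden
        self.o_b2 = self.o_w2 + n_hidden
        self.p = [rng.uniform(-0.5, 0.5) for _ in range(self.o_b2 + 1)]
        self.m = [0.0] * len(self.p)  # Adam first-moment estimates
        self.v = [0.0] * len(self.p)  # Adam second-moment estimates
        self.t = 0

    def _hidden(self, x):
        p, n = self.p, self.n_in
        return [sigmoid(sum(p[j * n + k] * x[k] for k in range(n)) + p[self.o_b1 + j])
                for j in range(self.n_hidden)]

    def _output(self, h):
        return sum(self.p[self.o_w2 + j] * h[j] for j in range(self.n_hidden)) + self.p[self.o_b2]

    def predict(self, x):
        return self._output(self._hidden(x))

    def train_step(self, X, Y):
        """One full-batch Adam update; returns the mean loss before the update."""
        grad = [0.0] * len(self.p)
        loss = 0.0
        for x, y in zip(X, Y):
            h = self._hidden(x)
            err = self._output(h) - y
            loss += 0.5 * err * err
            grad[self.o_b2] += err
            for j in range(self.n_hidden):
                grad[self.o_w2 + j] += err * h[j]
                # Back-propagate through the sigmoid: derivative is h * (1 - h).
                delta = err * self.p[self.o_w2 + j] * h[j] * (1.0 - h[j])
                grad[self.o_b1 + j] += delta
                for k in range(self.n_in):
                    grad[j * self.n_in + k] += delta * x[k]
        self.t += 1
        beta1, beta2, eps = 0.9, 0.999, 1e-8
        for i, g in enumerate(grad):
            g /= len(X)
            self.m[i] = beta1 * self.m[i] + (1 - beta1) * g
            self.v[i] = beta2 * self.v[i] + (1 - beta2) * g * g
            m_hat = self.m[i] / (1 - beta1 ** self.t)
            v_hat = self.v[i] / (1 - beta2 ** self.t)
            self.p[i] -= self.lr * m_hat / (math.sqrt(v_hat) + eps)
        return loss / len(X)


# Synthetic data (not passenger flow): the target is the sum of three inputs.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(40)]
Y = [sum(x) for x in X]
net = BPNetwork(n_in=3)
losses = [net.train_step(X, Y) for _ in range(300)]
print(losses[0], losses[-1])
```

In practice the inputs would be the selected influence factors after normalization, and a numerical library would replace the explicit loops; the sketch only shows how the listed choices fit together.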

Claims (8)

1. A short-time passenger flow prediction method for rail transit based on a bp neural network, characterized in that the method specifically comprises the following steps:
(1) cluster analysis
(1.1) preprocessing the original data;
(1.2) calculating the dissimilarity of the preprocessed data by calculating the distance between different date types;
(1.3) clustering according to the calculated dissimilarity, and repeatedly merging the two partitions with the minimum distance until all the partitions are merged into one partition;
(1.4) drawing a clustering pedigree graph, and carrying out clustering analysis;
(2) periodic analysis of passenger flow
(2.1) periodically analyzing the passenger flow time characteristic;
(2.2) periodically analyzing the space characteristics of passenger flow;
(2.3) periodically analyzing the space-time characteristics of passenger flow;
(3) extracting characteristic factors according to results of clustering analysis and passenger flow periodicity analysis;
(4) selecting parameters and related functions of the BP neural network, and determining an initial weight and a threshold of the BP neural network.
2. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 1, wherein:
preprocessing the original data in the step (1.1), specifically standardizing passenger flow volumes of the original data on different working days:
Figure FDA0003367480620000011
Figure FDA0003367480620000012
Figure FDA0003367480620000013
where S_j is the sample standard deviation and x is a feature point of the data set.
3. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 2, wherein: calculating the dissimilarity degree of the preprocessed data in the step (1.2), which is specifically as follows:
the Euclidean distance is adopted as the quantity index of the dissimilarity degree:
Figure FDA0003367480620000021
where d(x_i, x_j) is the degree of dissimilarity between objects, usually the distance between vectors.
4. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 3, wherein: the periodic passenger flow analysis in the step (2) specifically comprises the following steps:
(2.1) periodic analysis of passenger flow time characteristics
The inbound/outbound passenger flow of a station is described by a one-dimensional time series, and the daily distribution of inbound/outbound passenger flow at each station is classified as unimodal, bimodal, full-peak, convex-peak, or no-peak;
the unimodal distribution means that a station has one inbound peak and one outbound peak within the operating time of one day;
the bimodal distribution means that the urban rail transit passenger flow forms a morning peak between 7 and 9 in the morning due to travel to work and school, forms an evening peak between 17 and 19 in the evening due to travel home from work and school, and changes little in other time periods, which are off-peak;
the full-peak distribution has no clear passenger flow valley, and the passenger flow volume is large in every time period of the day;
the no-peak distribution has no obvious peak, and the inbound and outbound passenger flow is small all day because the surrounding environment attracts little passenger flow;
the convex-peak distribution occurs at stations near large-scale events;
(2.2) periodic analysis of spatial characteristics of passenger flow
The passenger flow of urban rail transit is analyzed in the spatial dimension, that is, the passenger flow conditions of different stations or different sections in the same time period are analyzed; the station passenger flow is the sum of the boarding and alighting passenger flow and the transfer passenger flow of the station in unit time; the passenger flow of n stations in a given time period is described by an n-dimensional sequence;
the section passenger flow refers to the on-line passenger flow passing through each section of the urban rail transit line in unit time, and is divided into an up section passenger flow and a down section passenger flow according to the difference of the running directions of trains;
(2.3) periodic analysis of passenger flow spatio-temporal characteristics
The passenger flow data change simultaneously in the two dimensions of time and space and show a certain regularity, and the space-time characteristics of the passenger flow are displayed by constructing an independent urban rail transit line containing n stations; short-time passenger flow prediction analyzes the patterns and relationships in the collected passenger flow data and calculates the inbound and outbound passenger flow of each station in the next time period or the next several time periods, that is, predicts the passenger flow data of the (m + Δt)-th column from the data of the first m columns of a matrix F, where 1 ≤ m ≤ h, Δt is the time period interval, the matrix F is a multidimensional passenger flow matrix obtained by combining the historical inbound/outbound passenger flow data, and h is the number of columns of the matrix.
5. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 4, wherein: each passenger has a pair of travel origin and destination points, namely an OD pair; the outbound passenger flow from S_n during the t-th time period must be composed of part of the inbound passenger flow at other stations during an earlier time period, and this earlier period is determined by the running time of the train between the two stations, that is, the outbound passenger flow from S_n during the t-th time period is related to the inbound passenger flow at S_j in the T_j-th time period, where 1 ≤ j ≤ n-1, and the expression for T_j is shown in the following formula,
Figure FDA0003367480620000031
passenger flow entering at S_j later than the T_j-th time period cannot exit the rail transit system from S_n in the t-th time period, and a multidimensional passenger flow matrix of size [S_j × (h+1)] is obtained by combining the historical inbound/outbound passenger flow data of each station, as shown in the following formula:
Figure FDA0003367480620000032
each row of the matrix F reflects the passenger flow characteristics of each station in the time dimension, and each column of the matrix F represents the passenger flow characteristics of different stations at the same time in the space dimension, so that the above formula reflects the time-space characteristics of urban rail transit passenger flow at the same time.
6. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 5, wherein: extracting characteristic factors according to the results of the clustering analysis and the passenger flow periodicity analysis in the step (3), wherein the characteristic factors are as follows:
clustering analysis is carried out on the stations, the stations are divided into 5 types, passenger flow of each station in each time period is selected from each type to be used as a sample set of the BP neural network, and a data set of 5 types is used as input data of a BP network model;
firstly, according to the time-proximity characteristic, the passenger flows of the three time periods preceding the current time are each selected as influence factors; the periodic analysis of passenger flow shows periodicity from day to day and from week to week, so the passenger flow of the same time period one day earlier and the passenger flow of the same time period one week earlier are selected as influence factors; because the passenger flow in the same time period of the previous day is itself influenced by the passenger flow in the period just before it, that preceding period is also included in the influence factor set, and the period preceding the same time period of the previous week is included in the same way; in addition, holidays often have more passenger flow than non-holidays, and both the volume and the time distribution of passenger flow differ between weekdays and weekends, so the type of the current day (weekday, weekend, or holiday) is added as an influence factor;
after the influence factors are selected, they are ranked by their importance to the result with a random forest algorithm, the factors of low importance are removed, and the factors of high importance are selected as the final influence factor set of the BP network model.
7. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 6, wherein: selecting the parameters and related functions of the BP neural network specifically comprises: determination of the number of hidden layers, determination of the number of hidden nodes, determination of the transfer function, determination of the learning rate, determination of the number of iterations, and selection of the gradient descent algorithm.
8. The method for predicting short-time passenger flow of rail transit based on the bp neural network as claimed in claim 7, wherein: the number of hidden layers is one; the number of hidden nodes is 24; the transfer function is a Sigmoid function; the learning rate is 0.01; the number of iterations is 60000; the gradient descent algorithm is the Adam algorithm.
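As an illustration of the clustering steps (1.1) to (1.3) recited in the claims, the following minimal sketch standardizes the data with the sample standard deviation, uses the Euclidean distance as the dissimilarity, and repeatedly merges the two closest partitions until one remains. The single-linkage rule for the distance between partitions, the function names, and the toy data are assumptions; the claims do not specify them.

```python
import math
import statistics


def standardize(rows):
    """Z-score each feature (column) using the sample standard deviation.

    Assumes no column is constant, otherwise the deviation would be zero.
    """
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.stdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points):
    """Merge the two closest partitions until one remains; return the merge history."""
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage (an assumption): distance between the closest members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return history


# Hypothetical objects described by two passenger-flow features each.
raw = [[100, 20], [110, 22], [300, 80], [310, 85]]
history = agglomerate(standardize(raw))
for left, right, dist in history:
    print(left, right, round(dist, 3))
```

The merge history is the information a clustering pedigree graph (dendrogram) would draw; cutting it at a chosen distance yields the final classes.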
CN202111387081.1A 2021-11-22 2021-11-22 Short-time passenger flow prediction method for rail transit based on bp neural network Active CN114117903B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111387081.1A CN114117903B (en) 2021-11-22 2021-11-22 Short-time passenger flow prediction method for rail transit based on bp neural network

Publications (2)

Publication Number Publication Date
CN114117903A true CN114117903A (en) 2022-03-01
CN114117903B CN114117903B (en) 2024-10-01

Family

ID=80439180

Country Status (1)

Country Link
CN (1) CN114117903B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114707726A (en) * 2022-04-02 2022-07-05 武汉氢客科技有限公司 Passenger flow volume prediction method, system and storage medium based on Elman neural network
CN116778739A (en) * 2023-06-20 2023-09-19 深圳市中车智联科技有限公司 Public transportation scheduling method and system based on demand response

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016045195A1 (en) * 2014-09-22 2016-03-31 北京交通大学 Passenger flow estimation method for urban rail network
CN109086926A (en) * 2018-07-25 2018-12-25 南京理工大学 A kind of track traffic for passenger flow prediction technique in short-term based on combination neural net structure
CN109558985A (en) * 2018-12-10 2019-04-02 南通科技职业学院 A kind of bus passenger flow amount prediction technique based on BP neural network
CN110276474A (en) * 2019-05-22 2019-09-24 南京理工大学 A kind of track traffic station passenger flow forecasting in short-term
CN110516866A (en) * 2019-08-21 2019-11-29 上海工程技术大学 A kind of real-time estimation method for handing over subway crowding for city rail
WO2020125716A1 (en) * 2018-12-21 2020-06-25 中兴通讯股份有限公司 Method for realizing network optimization and related device
CN111931978A (en) * 2020-06-29 2020-11-13 南京熊猫电子股份有限公司 Urban rail transit passenger flow state prediction method based on space-time characteristics
CN112149902A (en) * 2020-09-23 2020-12-29 吉林大学 Subway short-time arrival passenger flow prediction method based on passenger flow characteristic analysis
WO2021098619A1 (en) * 2019-11-19 2021-05-27 中国科学院深圳先进技术研究院 Short-term subway passenger flow prediction method, system and electronic device
CN113298314A (en) * 2021-06-10 2021-08-24 重庆大学 Rail transit passenger flow prediction method considering dynamic space-time correlation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
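袁坚; 王鹏; 王钺; 杨欣: "Urban rail transit passenger flow prediction method based on spatio-temporal features", Journal of Beijing Jiaotong University (北京交通大学学报), no. 06, 15 December 2017 (2017-12-15) *
黎旭成; 彭逸洲; 吴宗翔; 陈振武: "Short-term passenger flow prediction for metro stations based on a deep spatio-temporal network", Traffic & Transportation (交通与运输), no. 2, 31 August 2020 (2020-08-31) *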

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant