CN109740957B - Urban traffic network node classification method

Info

Publication number: CN109740957B
Application number: CN201910025828.5A
Authority: CN
Inventors: 韦胜; 高湛
Original assignee: JIANGSU INSTITUTE OF URBAN PLANNING AND DESIGN
Current assignee: Jiangsu Urban Planning And Design Institute Co ltd
Priority date: 2019-01-11
Filing date: 2019-01-11
Publication date: 2020-12-15
Anticipated expiration: 2039-01-11
Also published as: CN109740957A

Abstract

The invention discloses a method for classifying urban traffic network nodes and relates to the technical field of urban planning and urban traffic. First, the traffic network nodes in the research area are preprocessed. Second, the passenger flow index data between each traffic network node and the other traffic network nodes are sorted. Third, the traffic network nodes are classified with a time series clustering algorithm, and the classification result is associated in geographic space with the traffic network node data of the research area, yielding a data set P1. Finally, the data set P1 is displayed on a map according to the classification result. The invention can quickly classify urban traffic nodes according to the passenger flow relation between each traffic node and the other traffic nodes; the classification result reflects both the passenger flow of each traffic node and the distribution of passenger flow between that node and the other nodes, thereby providing a decision basis for urban planning and traffic management.

Description

Urban traffic network node classification method

Technical Field

The invention relates to the technical field of urban planning and urban traffic, in particular to an urban traffic network node classification method based on time series clustering and OD passenger flow data.

Background

With the development of big data technology, urban traffic data is growing explosively. This is particularly true of passenger flow data between traffic nodes (generally called travel OD, i.e. origin-destination, data), such as the traffic volume between urban bus stops, between subway stations, or between public bicycle stations. These large data sources provide a good basis for understanding how a city operates and have important practical value. With a suitable model or algorithm, the most valuable information can be extracted from them quickly and conveniently. For travel OD data, two pieces of information are common and very important: (1) the magnitude of the passenger flow between each traffic node and the other traffic nodes; and (2) the distribution of the passenger flow between each traffic node and the other traffic nodes. Based on these two pieces of information, researchers hope to mine valuable content that reveals the inherent characteristics of traffic operations. For example, the total travel volume between each traffic node and all other traffic nodes can be computed, and the nodes with larger traffic volume identified by ranking these totals. If one particular traffic node is of interest, the travel volumes between it and each of the other traffic nodes can be ranked to find the nodes with which it is most closely connected. Such methods mainly study the operating characteristics of local traffic nodes and lack a study of the overall characteristics.

In fact, how to study the characteristics of the traffic relations among all stations has long been a practical and important question. Although theories and techniques such as complex networks provide a good way to study this problem, they have certain defects: (1) It is difficult to take both overall and local characteristics into account. For example, the degree distribution of a complex network is an important tool for studying the overall network structure, but the connection characteristics between each traffic node and the other traffic nodes cannot be read directly from the degree distribution. (2) The similarity between traffic nodes is not explored deeply enough. For example, community detection in a complex network can only group closely connected traffic nodes together; it is difficult for a user to see directly what the connection characteristics between each class of traffic nodes and their related nodes are. That is, nodes cannot be grouped by similarity according to the OD flow characteristics occurring at each station.

Meanwhile, some characteristics of traffic OD data have not been deeply mined. For example, if the passenger flow data between a traffic node and each of the other traffic nodes is sorted by passenger flow volume, it can be regarded as an ordered data set (denoted L); displayed as a histogram, this data set has a profile shape similar to that of a time series. In addition, the data can be sorted according to spatial proximity before time series clustering analysis is performed, which emphasizes the spatial connection characteristics that traffic nodes form under spatial distance constraints.

Disclosure of Invention

The invention aims to solve the technical problem of providing a method for classifying urban traffic network nodes based on time series clustering and OD passenger flow data, which can quickly classify the urban traffic nodes according to the passenger flow relation between each traffic node and other traffic nodes, and the classification result can reflect the passenger flow of each type of traffic node and the passenger flow distribution characteristics between the traffic nodes and other traffic nodes, thereby providing decision basis for urban planning and traffic management.

The invention adopts the following technical scheme for solving the technical problems:

the invention provides an urban traffic network node classification method based on time series clustering and OD passenger flow data, which comprises the following steps:

step 1, preprocessing a traffic network node in a research area; the method comprises the following specific steps:

step 1.1, setting unique identification numbers of all traffic network nodes to form a data set P;

step 1.2, counting passenger flow index data generated by each traffic network node and other traffic network nodes in the network;

step 1.3, the passenger flow volume index data obtained after the statistics in step 1.2 are completed form a passenger flow contact data set OD of all traffic network nodes;

step 2, sorting the passenger flow index data of each traffic network node to form a new passenger flow contact data set ODN of all traffic network nodes;

step 2.1, traversing each traffic network node in the data set OD, and sorting the passenger flow index data between the node being traversed and the other traffic network nodes in the network according to either the magnitude of the passenger flow index data or the spatial proximity relation;

step 2.2, after traversing is completed, forming a new passenger flow contact data set ODN of all traffic network nodes;

step 3, classifying the traffic network nodes by using a time series clustering algorithm, and associating the classification results to the traffic network nodes in the research area in a geographic space;

the specific process of classifying the traffic network nodes by using the time series clustering algorithm is as follows: the data set ODN is used as input data, and the traffic network nodes in the research area are classified by a time series clustering algorithm.

As a further optimization scheme of the urban traffic network node classification method based on time series clustering and OD passenger flow data, step 3, a data set ODN is used as input data, and a time series clustering algorithm is used for classifying the traffic network nodes in a research area; and then, realizing the association between the data set P and the classification result by utilizing the corresponding relation between the unique identification numbers of all the traffic network nodes stored in the classification result and the unique identification numbers of all the traffic network nodes in the data set P, and storing the associated result as a new data set P1.

As a further optimization scheme of the urban traffic network node classification method based on time series clustering and OD passenger flow data, the method further comprises a step 4 after the step 3, and the step 4 comprises the following steps: and carrying out map visualization display on the data set P1 according to the classification result.

As a further optimization scheme of the urban traffic network node classification method based on time series clustering and OD passenger flow data, the data format of P in the step 1.1 is a data format suitable for ArcGIS software.

As a further optimization scheme of the urban traffic network node classification method based on time series clustering and OD passenger flow data, the passenger flow index data in step 1.2 is the number of train shifts or the number of travelers.

As a further optimization scheme of the urban traffic network node classification method based on time series clustering and OD passenger flow data, passenger flow index data is marked as 0 when no traffic passenger flow connection occurs between two traffic network nodes in step 1.2.

As a further optimization scheme of the urban traffic network node classification method based on time series clustering and OD passenger flow data, in step 2.1, sorting is carried out according to the ascending order or the descending order of the passenger flow index data, and the spatial proximity relation refers to sorting according to the spatial distance from large to small.
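The spatial-proximity ordering of step 2.1 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the patented implementation: the node names, coordinates, flow values, and the function name `sort_by_distance_desc` are hypothetical, and straight-line (Euclidean) distance is assumed as the spatial distance.

```python
import math

def sort_by_distance_desc(origin, coords, flows):
    """Return the flows from `origin` to every other node, ordered by
    spatial distance from the origin, farthest first (an assumed reading
    of sorting by 'spatial distance from large to small')."""
    ox, oy = coords[origin]
    others = [n for n in coords if n != origin]
    # Order the destinations by Euclidean distance, largest first.
    others.sort(
        key=lambda n: math.hypot(coords[n][0] - ox, coords[n][1] - oy),
        reverse=True,
    )
    # Pairs without any recorded flow count as 0, as in step 1.2.
    return [flows.get((origin, n), 0) for n in others]

# Hypothetical coordinates and flows, for illustration only.
coords = {"A": (0, 0), "B": (1, 0), "C": (5, 0), "D": (3, 0)}
flows = {("A", "B"): 12, ("A", "C"): 4}

sequence = sort_by_distance_desc("A", coords, flows)
print(sequence)
```

With this ordering, position in the sequence encodes distance rank rather than flow rank, so two nodes receive similar sequences only if they exchange similar flows with similarly distant neighbors.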

Compared with the prior art, the invention adopting the technical scheme has the following technical effects:

(1) the invention provides an urban traffic network node classification method based on time series clustering and OD passenger flow data, which provides a new understanding and analysis view for passenger flow relation among stations, gives consideration to integral and local characteristic contents, and has important reference significance for planning and analyzing urban traffic nodes;

(2) the final calculation result of the method is site classification, and if the method is combined with the facility distribution characteristics of the peripheral areas of each type of sites, designers can be helped to better understand how to configure urban public service facilities in different types of site areas; therefore, the invention can provide scientific guidance basis for planning new traffic node surrounding areas by combining the spatial facility distribution characteristics of the traffic node areas.

Drawings

FIG. 1 is a schematic overall flow diagram of the present invention.

Fig. 2 is a schematic diagram of a spatial distribution of traffic nodes.

FIG. 3 is a schematic illustration of traffic statistics between traffic nodes.

FIG. 4 shows histograms of the passenger flow data between a single traffic node and other traffic nodes, wherein (a) is the histogram of the passenger flow data between P1 and other traffic nodes, (b) is the histogram of the passenger flow data between P2 and other traffic nodes, (c) is the histogram of the passenger flow data between P3 and other traffic nodes, (d) is the histogram of the passenger flow data between P4 and other traffic nodes, (e) is the histogram of the passenger flow data between P5 and other traffic nodes, and (f) is the histogram of the passenger flow data between P6 and other traffic nodes.

Fig. 5 is a schematic diagram of a traffic node classification result space visualization.

Detailed Description

The technical scheme of the invention is further explained in detail by combining the attached drawings:

The invention classifies data that represent the spatial relationship among traffic nodes, such as OD passenger flows, by a time series clustering method. In other words, the invention extends time series clustering beyond its usual use, which is analyzing the temporal operating pattern of a single traffic node, so that it can also classify sequences describing relationships in physical space. This helps decision makers in traffic management, city planning, and related fields to better observe how traffic flows interact in space.

It should be particularly noted that the relational data between each node and the other nodes (such as the number of travelers) must be sorted in some way. Otherwise a time series clustering algorithm cannot be applied directly, and the result is difficult to interpret in practice. The reason is that time series clustering generally expects data recorded in chronological order, so that the clustering result can be analyzed point by point along the time axis; for example, subway stations with similar ridership in the morning and evening peaks are grouped into the same type. Once the passenger flows between a traffic node and the other traffic nodes are sorted, it becomes apparent which traffic nodes each node is most closely connected with (including the number of closely connected nodes, the size of the passenger flow to them, and so on). The result of the time series clustering therefore shows directly which traffic nodes are connected with other traffic nodes and how strong the connection is (for example, in passenger flow), so that the distribution of spatial relationships among the traffic nodes can be analyzed as a whole.
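The effect of sorting can be illustrated with a small sketch. The two flow vectors below are hypothetical, and Euclidean distance is assumed as the similarity measure purely for illustration; the source does not prescribe one.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

# Hypothetical flows from two nodes to the same four destinations.
# Each node has one strong link and three weak ones, but the strong
# link points to a different destination in each case.
node_x = [50, 2, 1, 0]
node_y = [0, 1, 2, 50]

# Compared position by position, the two nodes look very different.
raw_distance = euclidean(node_x, node_y)

# After sorting in descending order, their profiles coincide.
sorted_distance = euclidean(
    sorted(node_x, reverse=True), sorted(node_y, reverse=True)
)

print(raw_distance, sorted_distance)
```

Without sorting, the comparison depends on the arbitrary order of the destination columns; after sorting, it depends only on the shape of each node's flow distribution, which is the property the classification is meant to capture.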

Step 1) referring to fig. 1, it is first necessary to pre-process the traffic network nodes in the study area.

Step 1.1) referring to fig. 2, there are 6 traffic network nodes in the area to be studied, whose geographical coordinates are (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), (x6, y6), respectively. A point data set in shapefile (shape) format is generated from these 6 coordinate pairs in the ArcGIS software;

step 1.2) setting unique identification numbers for all traffic network nodes of the data set, for example marking the 6 points as P1, P2, P3, P4, P5 and P6 in sequence, and recording the result as a data set P;

step 1.3) counting the passenger flow index data generated between each traffic network node and the other traffic network nodes in the network; for example, the passenger flow from P1 to P2 is 20 and the passenger flow from P1 to P3 is 20, and the flows of the remaining node pairs are given in the statistics of fig. 3. When no traffic passenger flow contact occurs between two traffic network nodes, the passenger flow index data is recorded as 0.

Step 1.4) after the statistics is completed, a passenger flow contact data set OD of all the traffic network nodes is formed, and the statistics result in this example is shown in fig. 3.
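Steps 1.3 and 1.4 can be sketched as follows. This is a minimal sketch under assumptions: the function name `build_od`, the nested-dictionary layout, and the record format `(origin, destination, flow)` are hypothetical, and only three nodes are used to keep the example short.

```python
def build_od(node_ids, records):
    """Build the OD data set as {origin: {destination: flow}}.

    Every ordered pair of distinct nodes gets an entry; pairs with no
    recorded passenger flow keep the value 0.
    """
    od = {o: {d: 0 for d in node_ids if d != o} for o in node_ids}
    for origin, dest, flow in records:
        od[origin][dest] += flow
    return od

node_ids = ["P1", "P2", "P3"]
# The first two records follow the example in step 1.3;
# the third record is hypothetical.
records = [("P1", "P2", 20), ("P1", "P3", 20), ("P2", "P3", 7)]

od = build_od(node_ids, records)
print(od)
```

Filling missing pairs with 0 gives every node a sequence of the same length, which the later sorting and clustering steps rely on.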

Step 2) sorting the passenger flow index data of each traffic network node.

Step 2.1) traversing each traffic network node of the data set OD, and sorting the passenger flow index data between the node being traversed and the other traffic network nodes in the network in descending order. For example, the sorting result of traffic node P1 is {23, 22, 20, 20, 20}; that of P2 is {30, 8, 7, 0, 0}; that of P3 is {5, 0, 0, 0, 0}; that of P4 is {30, 0, 0, 0, 0}; that of P5 is {5, 0, 0, 0, 0}; and that of P6 is {40, 38, 38, 10, 10}.

Step 2.2) after the traversal is completed, a new passenger flow contact data set ODN of all the traffic network nodes is formed.
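The descending sort of step 2 can be sketched as below. The flow values for P1 and P2 reproduce the sorted results given in step 2.1, but which destination each value belongs to is an assumption made for illustration (apart from P1 to P2 and P1 to P3, which are stated in step 1.3); the function name `sort_rows_desc` is hypothetical.

```python
def sort_rows_desc(od):
    """Replace each node's flows by the same values in descending order.

    The destination labels are dropped: after sorting, only the shape
    of each node's flow distribution is kept.
    """
    return {node: sorted(row.values(), reverse=True) for node, row in od.items()}

# Two rows of the OD data set; the destination assignment is assumed.
od = {
    "P1": {"P2": 20, "P3": 20, "P4": 22, "P5": 23, "P6": 20},
    "P2": {"P1": 7, "P3": 30, "P4": 8, "P5": 0, "P6": 0},
}

odn = sort_rows_desc(od)
print(odn)
```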

Step 3) a time series clustering algorithm can classify a set of research objects according to the characteristics of their temporal operating patterns. Likewise, when the data sets of all traffic nodes in the research area are classified with a time series clustering algorithm, the relation between each class of traffic nodes and the other traffic nodes becomes visible. Therefore, the traffic network nodes are classified by using a time series clustering algorithm, and the result is associated in geographic space with the traffic network node data in the research area;

step 3.1) using the data set ODN as input data, and classifying the traffic network nodes in the research area by using a time series clustering algorithm;

step 3.2) after a reasonable classification result is determined, the classification result in this example is as follows: P1 forms one class, denoted class a; P2 forms one class, denoted class b; P3 and P5 form one class, denoted class c; P4 forms one class, denoted class d; and P6 forms one class, denoted class e.

Referring to FIG. 4, the passenger flows between each of P1, P2, P3, P4, P5 and P6 and the other traffic nodes are displayed as histograms, wherein (a) to (f) in fig. 4 are the histograms of the passenger flow data between P1, P2, P3, P4, P5 and P6, respectively, and the other traffic nodes. As can be seen from the classification results, the histogram distribution of P3 is similar to that of P5, whereas the histograms of P1, P2, P4 and P6 each have their own distinct characteristics and are therefore placed in different classes.
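The source does not name a specific time series clustering algorithm. As a stand-in, the sketch below applies single-linkage agglomerative clustering with Euclidean distance to the sorted sequences of step 2.1. The choice of algorithm, the distance measure, the function names, and the target of 5 clusters (taken from the example's outcome rather than derived) are all assumptions.

```python
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def agglomerate(sequences, k):
    """Single-linkage agglomerative clustering: repeatedly merge the two
    closest clusters until only k clusters remain."""
    clusters = [[name] for name in sequences]

    def linkage(c1, c2):
        # Distance between clusters = distance of their closest members.
        return min(euclidean(sequences[a], sequences[b]) for a in c1 for b in c2)

    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # i < j, so index i is unaffected
    return clusters

# Data set ODN: the sorted sequences listed in step 2.1.
odn = {
    "P1": [23, 22, 20, 20, 20],
    "P2": [30, 8, 7, 0, 0],
    "P3": [5, 0, 0, 0, 0],
    "P4": [30, 0, 0, 0, 0],
    "P5": [5, 0, 0, 0, 0],
    "P6": [40, 38, 38, 10, 10],
}

clusters = agglomerate(odn, k=5)
print(clusters)
```

With these inputs the only merge joins P3 and P5, whose sorted sequences are identical, which agrees with the grouping reported in step 3.2. A dedicated time series method (for example one based on dynamic time warping) could be substituted for the distance function without changing the rest of the sketch.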

Then, the data set P is associated with the classification result by using the correspondence between the unique identification numbers of the traffic network nodes stored in the classification result and those in the data set P, and the result is stored as a new data set P1 (that is, a new field added to the data set P records the classification result of the time series clustering).
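The association by unique identification number amounts to a keyed join, which might be sketched as follows. The record layout, the field name `cluster`, and the function name `attach_classes` are hypothetical; coordinates are omitted, and in practice the same join would be performed on the attribute table of the point data set.

```python
def attach_classes(points, labels):
    """Return copies of the point records with a new 'cluster' field,
    matched on the unique identification number."""
    return [{**p, "cluster": labels[p["id"]]} for p in points]

# Data set P: one record per traffic node (coordinates omitted).
points = [{"id": f"P{i}"} for i in range(1, 7)]

# Classification result of step 3.2, keyed by identification number.
labels = {"P1": "a", "P2": "b", "P3": "c", "P4": "d", "P5": "c", "P6": "e"}

# New data set P1: data set P plus the classification field.
dataset_p1 = attach_classes(points, labels)
print(dataset_p1)
```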

Step 4) performing map visualization display on the data set P1 according to the classification result, that is, displaying each class of traffic node with a different graphic symbol in the ArcGIS map according to the classification result, as shown in fig. 5.

The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all should be considered as belonging to the protection scope of the invention.

Claims

1. A city traffic network node classification method based on time series clustering and OD passenger flow data is characterized by comprising the following steps:

2. The urban traffic network node classification method based on time series clustering and OD passenger flow data according to claim 1, characterized in that in step 3, the data set ODN is used as input data, and a time series clustering algorithm is used for classifying the traffic network nodes in the research area; and then, realizing the association between the data set P and the classification result by utilizing the corresponding relation between the unique identification numbers of all the traffic network nodes stored in the classification result and the unique identification numbers of all the traffic network nodes in the data set P, and storing the associated result as a new data set P1.

3. The urban traffic network node classification method based on time series clustering and OD passenger flow data according to claim 2, characterized by further comprising step 4 after step 3, step 4: and carrying out map visualization display on the data set P1 according to the classification result.

4. The method for classifying nodes in urban traffic network based on time series clustering and OD passenger flow data according to claim 1, wherein the data format of P in step 1.1 is a data format suitable for ArcGIS software.

5. The urban transportation network node classification method based on time series clustering and OD passenger flow data according to claim 1 is characterized in that in step 1.2, the passenger flow index data is the number of train shifts or the number of travelers.

6. The method for classifying urban traffic network nodes based on time series clustering and OD passenger flow data according to claim 1, wherein in step 1.2, when no traffic passenger flow connection occurs between two traffic network nodes, the passenger flow index data is recorded as 0.

7. The method for classifying urban traffic network nodes based on time series clustering and OD passenger flow data according to claim 1, wherein in step 2.1, the passenger flow index data are sorted in ascending order or descending order, and the spatial proximity relation refers to sorting according to spatial distance from large to small.