
CN111651502A - An urban functional area identification method based on multi-subspace model - Google Patents

An urban functional area identification method based on a multi-subspace model

Info

Publication number
CN111651502A
Authority
CN
China
Prior art keywords
matrix
functional area
functional areas
functional
urban
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010484901.8A
Other languages
Chinese (zh)
Other versions
CN111651502B (en)
Inventor
朱佳玮
陶超
李海峰
肖俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202010484901.8A
Publication of CN111651502A
Application granted
Publication of CN111651502B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for identifying urban functional areas based on a multi-subspace model, comprising the following steps: acquiring taxi trajectory data and check-in data within a study area; constructing a partition-oriented, visit-purpose-based time series feature matrix C; inputting the time series feature matrix C into a sparse subspace clustering algorithm to compute the correspondence between geographic units and urban functional areas; and obtaining the salient feature locations of each functional area, from which the main function of each functional area is identified. The method uses the human activity information provided by geographic big data and, through the multi-subspace model, overcomes the defects of the prior art and identifies urban functional areas more accurately. It also analyzes the uniqueness and abundance of each functional area from the geometric properties of the subspaces, providing fine-grained quantitative indicators for the management and development of urban functional areas.

Figure 202010484901

Description

An urban functional area identification method based on a multi-subspace model

Technical field

The invention belongs to the technical field of geospatial information identification, relates to methods for identifying urban geographic information, and in particular relates to a method for identifying urban functional areas based on a multi-subspace model.

Background

Urban spatial structure is a core research topic of urban geoinformatics and a concentrated reflection of the human-land relationship: urban space is shaped by human activities and in turn affects human production and activity, at scales ranging from urban planning and site selection down to travel and place recommendation. In the analysis of urban spatial structure, the distribution of urban functional areas is the result, presented in geographic space, of many interacting factors.

There are many ways to analyze urban functional areas, such as social surveys, but collecting the data is time-consuming and laborious, and the analysis may be strongly affected by subjective factors. Their biggest drawback is that they cannot directly reflect the key factor in urban development: human activity. With the rapid development of mobile communication, the Internet, and satellite positioning technology, mobile devices with positioning functions generate a series of electronic footprints. These footprints are real records of urban residents' activities and make it possible to explore urban functional areas from the perspective of human activity. Existing methods use social media check-in data, mobile phone data, and taxi trajectory data to detect urban functional areas.

The models used to analyze such data are, however, still imperfect. The usual procedure is as follows. First, when processing geospatial big data, the time series features of human activity are mapped onto manually delineated geographic units, so that each geographic unit is expressed as a vector and the information is stored in a high-dimensional vector space. Then, a feature representation of these geographic units is derived with methods such as singular value decomposition, latent semantic analysis, or latent Dirichlet allocation. Finally, the geographic units are clustered by the similarity of their feature representations; each cluster represents one functional area, which yields the distribution of urban functional areas. These models have the following shortcomings.

First, during feature representation, some algorithms make strict assumptions about the features, for example that the samples share a single set of features or follow the same distribution. Because feature representation reduces the samples from a high-dimensional space to one low-dimensional subspace, these algorithms can all be called single-subspace algorithms. Their strict assumptions make it easy to obtain feature patterns, and clustering on the relationship between samples and features yields the distribution of functional areas. However, if the information carried by some samples has a small weight, it is marginalized after feature representation, which makes the clustering result inaccurate. Moreover, functional areas differ in their features, so one shared set of features cannot describe every functional area concisely and accurately. When the data are very large and the feature patterns very complex, the assumptions that a single-subspace model places on the feature patterns limit feature mining, so such models cannot handle more complex data.

Second, these models ignore the geometric meaning of the vector space. The geometric properties of the subspaces are related to the characteristics of urban functional areas, a relationship the prior art has neither explored nor taken into account.

Summary of the invention

In view of this, the purpose of the invention is to provide a method for identifying urban functional areas based on a multi-subspace model. The invention uses the human activity information provided by geographic big data and, through the multi-subspace model, overcomes the defects of the prior art and identifies urban functional areas more accurately.

The purpose of the invention is achieved by a method for identifying urban functional areas based on a multi-subspace model, comprising the following steps:

Step 1: acquire taxi trajectory data and check-in data within the study area;

Step 2: construct the partition-oriented, visit-purpose-based time series feature matrix C;

Step 3: input the time series feature matrix C into the sparse subspace clustering algorithm and compute the correspondence between geographic units and urban functional areas;

Step 4: obtain the salient feature locations of each functional area, and from them identify the main function of each functional area.

Specifically, the construction of the time series feature matrix C in step 2 comprises the following steps:

Step 201: divide the study area to obtain N geographic units;

Step 202: preprocess the taxi trajectory data by removing outliers, extract the end point and arrival time of each trip, and map the end points to geographic units to obtain the visit records of the geographic units;

Step 203: match the check-in records with the visit records of the geographic units and classify the purpose of each visit;

Step 204: construct the time series feature matrix C with M rows and N columns, which represents the dynamics of human activity carried by the geographic units over a period of time, where M = T × D, T is the number of time periods, and D is the number of visit-purpose categories. Each column of C gives, for the corresponding geographic unit, the number of visits in each time period for each purpose.
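Steps 201 to 204 can be illustrated with a minimal sketch. It assumes the visit records have already been reduced to hypothetical `(unit_index, period_index, purpose_index)` tuples, and it assumes a purpose-major row order (row `d * T + t`); the text fixes only M = T × D, not the order of the rows.

```python
import numpy as np

def build_feature_matrix(visits, n_units, n_periods, n_purposes):
    """Count visits per (purpose, time period) for every geographic unit.

    visits: iterable of (unit_index, period_index, purpose_index) tuples.
    Returns C with shape (n_periods * n_purposes, n_units); column j
    describes unit j. The row order d * n_periods + t is an assumption.
    """
    C = np.zeros((n_periods * n_purposes, n_units))
    for unit, period, purpose in visits:
        C[purpose * n_periods + period, unit] += 1
    return C

# Hypothetical toy records: 3 units, 24 hourly periods, 6 purposes.
visits = [(0, 8, 2), (0, 8, 2), (1, 19, 3), (2, 23, 0)]
C = build_feature_matrix(visits, n_units=3, n_periods=24, n_purposes=6)
print(C.shape)  # (144, 3)
```

With a purpose-major order, each column can later be reshaped into a D × T array without reordering its entries.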

Specifically, the sparse subspace clustering algorithm in step 3 comprises the following steps:

Step 301: solve for the coefficient matrix Z of size N×N, which minimizes the ℓ1 norm subject to the following constraints:

$$\min_{Z} \|Z\|_1 \quad \text{s.t.} \quad CZ = C,\ Z_{ii} = 0$$

where $\|Z\|_1$ denotes the ℓ1 norm. Minimizing the ℓ1 norm makes the coefficient matrix Z sparse, so that the time series feature of each geographic unit is expressed only as a linear combination of the time series features of other geographic units in the same subspace;

Step 302: use the coefficient matrix to build the similarity matrix of the data, $W = |Z| + |Z|^{T}$, of size N×N. Each entry of W is the similarity, in terms of time series features, between the geographic units at the corresponding indices;

The similarity matrix W is block diagonal: only the main diagonal carries nonzero submatrices, and all other blocks are zero. Each nonzero submatrix corresponds to a subspace. A subspace contains multiple geographic units whose time series features are very similar, while units in different subspaces differ greatly in their time series features, so the subspaces are the urban functional areas to be detected;

Step 303: use the normalized Laplacian matrix L of the similarity matrix W to compute the number of subspaces: $L = I - D^{-1/2} W D^{-1/2}$, where I is the identity matrix and D is the diagonal matrix with entries $D_{jj} = \sum_i W_{ij}$. Sort the eigenvalues of L in ascending order and compute the difference between each pair of adjacent eigenvalues, $\lambda_{k+1} - \lambda_k$; the k with the largest difference is the number of subspaces, that is, the number of urban functional areas to be detected;

Step 304: apply K-means clustering to the similarity matrix W with the number of clusters set to the k obtained in step 303. This gives the correspondence between the geographic units and k classes, that is, k urban functional areas, and completes the detection of urban functional areas.
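Steps 301 to 304 can be sketched end to end on toy data. This is an illustration under stated assumptions, not the patented implementation:

- The equality-constrained ℓ1 problem is replaced by a Lasso-style relaxation, min ½‖C − CZ‖² + λ‖Z‖₁ with a zero diagonal, solved by iterative soft thresholding (ISTA).
- The solver, the value of `lam`, and the farthest-point K-means initialisation are choices made for this sketch.
- The toy matrix has six units lying in two orthogonal planes.

```python
import numpy as np

def sparse_self_expression(C, lam=0.01, n_iter=500):
    """ISTA for min 0.5*||C - CZ||_F^2 + lam*||Z||_1 with diag(Z) = 0."""
    n = C.shape[1]
    Z = np.zeros((n, n))
    step = 1.0 / np.linalg.norm(C, 2) ** 2          # 1 / Lipschitz constant
    for _ in range(n_iter):
        Z = Z - step * (C.T @ (C @ Z - C))          # gradient step
        Z = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)  # soft threshold
        np.fill_diagonal(Z, 0.0)                    # enforce Z_ii = 0
    return Z

def similarity_and_count(Z):
    """Return W = |Z| + |Z|^T and the eigengap estimate of k."""
    W = np.abs(Z) + np.abs(Z).T
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=0), 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(L))
    k = int(np.argmax(np.diff(eigvals))) + 1        # largest lambda_{k+1} - lambda_k
    return W, k

def kmeans_rows(X, k, n_iter=100):
    """Plain Lloyd iterations on the rows of X, farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: units 0-2 lie in one plane of R^4, units 3-5 in an orthogonal plane.
angles = np.deg2rad([0.0, 120.0, 240.0])
plane = np.vstack([np.cos(angles), np.sin(angles)])
C = np.block([[plane, np.zeros((2, 3))], [np.zeros((2, 3)), plane]])

Z = sparse_self_expression(C)
W, k = similarity_and_count(Z)
labels = kmeans_rows(W, k)
print(k, labels.tolist())
```

On this toy input the similarity matrix has two diagonal blocks, so the eigengap should select two clusters. Real trajectory data is noisy, which is why the sketch uses a relaxed objective in place of the exact constraint CZ = C.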

Specifically, the salient feature locations of each functional area in step 4 are obtained as follows: using the correspondence from step 304, extract from the similarity matrix W generated in step 302 the subspace matrix of each urban functional area, $S_1, \dots, S_i, \dots, S_k$, and perform principal component analysis on it. The resulting eigenvectors $[e_1, e_2, \dots, e_p, \dots, e_M]_i$ are called the feature locations of $S_i$; the first r eigenvectors $[e_1, e_2, \dots, e_r]_i$, whose cumulative share of the eigenvalues exceeds 90%, are the salient feature locations of $S_i$.
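The 90% selection rule can be sketched as follows. One interpretation is assumed here: principal component analysis is run on the M-dimensional columns of C that belong to one functional area, so that each eigenvector has length M = T × D. The function name and the toy matrix are hypothetical.

```python
import numpy as np

def salient_feature_locations(C_zone, threshold=0.90):
    """Return the leading eigenvectors whose cumulative eigenvalue share
    first exceeds `threshold`, together with their eigenvalues.

    C_zone: array of shape (M, n_units_in_zone), one column per unit.
    """
    X = C_zone - C_zone.mean(axis=1, keepdims=True)     # centre each feature
    cov = X @ X.T / max(C_zone.shape[1] - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)              # ascending order
    order = np.argsort(eigvals)[::-1]                   # switch to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    share = np.cumsum(eigvals) / eigvals.sum()
    r = int(np.argmax(share > threshold)) + 1           # smallest r above threshold
    return eigvecs[:, :r], eigvals[:r]

# Toy zone: five units whose profiles all vary along a single direction.
C_zone = np.outer([1.0, 1.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0, 5.0])
vecs, vals = salient_feature_locations(C_zone)
print(vecs.shape)  # (4, 1)
```

The sign of each eigenvector returned by `np.linalg.eigh` is arbitrary, so downstream steps should not rely on it.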

Specifically, the main function of each functional area is identified by reshaping each salient feature location of the area into a matrix with D rows and T columns, where each row gives the change in activity level for one of the D purposes over the T time periods. This yields the main activity patterns of the functional area, and the area is labeled with the most active function in those patterns, completing the identification of urban functional areas.
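A minimal sketch of this labeling step follows. Several details are assumptions made for illustration: the feature vector is taken to be stored purpose-major so a plain reshape gives D rows and T columns, absolute values are used because eigenvector signs are arbitrary, and "most active" is read as the largest total activity over all periods. The order of the purpose names is also hypothetical.

```python
import numpy as np

# Purpose names as listed in the embodiment; their order here is an assumption.
PURPOSES = ["home", "transportation", "work", "dining", "entertainment", "other"]

def dominant_purpose(feature_location, n_purposes, n_periods):
    """Reshape a length D*T feature location to (D, T) and pick the purpose
    with the largest summed activity."""
    pattern = np.abs(feature_location).reshape(n_purposes, n_periods)
    return pattern, int(np.argmax(pattern.sum(axis=1)))

# Toy feature location: activity only in the "work" row, from 08:00 to 18:00.
vec = np.zeros(6 * 24)
vec[2 * 24 + 8 : 2 * 24 + 18] = -0.3
pattern, idx = dominant_purpose(vec, n_purposes=6, n_periods=24)
print(pattern.shape, PURPOSES[idx])  # (6, 24) work
```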

Further, the method for identifying urban functional areas also comprises the following steps:

Step 5: compute the similarity between functional areas;

Step 6: compute the uniqueness of each functional area and rank the functional areas by uniqueness.

Specifically, the similarity of functional areas is computed from the principal angles between the corresponding subspaces. For any two functional areas, the similarity $\mathrm{aff}(S_k, S_l)$ of their subspaces $S_k$ and $S_l$ is

$$\mathrm{aff}(S_k, S_l) = \sqrt{\cos^2\theta^{(1)} + \cdots + \cos^2\theta^{(d_k \wedge d_l)}}$$

where $\cos\theta^{(i)}$ is the i-th largest singular value of $U_k^{T} U_l$, $U_k$ and $U_l$ are orthonormal bases of $S_k$ and $S_l$ respectively, $\theta^{(i)}$ are the principal angles between the subspaces, and $d_k \wedge d_l$ denotes the smaller of the dimensions $d_k$ and $d_l$ of $S_k$ and $S_l$;
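The principal-angle computation can be sketched with a singular value decomposition. The sketch assumes the affinity is the square root of the summed squared cosines of the principal angles, without further normalization, and that the columns of `Uk` and `Ul` are already orthonormal (for example, obtained with `np.linalg.qr`).

```python
import numpy as np

def subspace_affinity(Uk, Ul):
    """Affinity of two subspaces given orthonormal bases as columns.

    The singular values of Uk^T Ul are the cosines of the principal angles;
    there are min(d_k, d_l) of them.
    """
    cosines = np.linalg.svd(Uk.T @ Ul, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)    # guard against rounding above 1
    return float(np.sqrt(np.sum(cosines ** 2)))

e = np.eye(3)
plane_xy = e[:, :2]      # 2-D subspace spanned by e1 and e2
line_x = e[:, [0]]       # 1-D subspace inside that plane
line_z = e[:, [2]]       # 1-D subspace orthogonal to the plane

print(subspace_affinity(plane_xy, line_x))  # 1.0
print(subspace_affinity(plane_xy, line_z))  # 0.0
```

Under this form the affinity ranges from 0 for orthogonal subspaces to the square root of the smaller dimension for nested ones.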

Specifically, the uniqueness of a functional area is inversely related to similarity: if the similarity between subspaces is high, the functions of the corresponding functional areas are very similar and the uniqueness of those areas is low. The uniqueness of each functional area $S_i$ is computed as follows:

Figure BDA0002518778500000055

where k is the total number of functional areas and $S_{-i}$ denotes the functional areas other than $S_i$.

Further, the method for identifying urban functional areas also comprises the following step:

Step 7: compute the abundance of each functional area and rank the functional areas by abundance.

Specifically, the abundance of a functional area is related to the reconstruction error of its salient feature locations and is computed as follows:

Figure BDA0002518778500000061

where $C(S_i)$ is the matrix formed by the original vectors belonging to subspace $S_i$, the second matrix in the formula (Figure BDA0002518778500000062) is the matrix reconstructed from the salient feature locations of $S_i$, and $\|\cdot\|_F$ denotes the Frobenius norm of a matrix.
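The reconstruction error can be illustrated with a short sketch. Since the formula image is not reproduced in this text, the sketch assumes the error is the unnormalized Frobenius norm of the difference between $C(S_i)$ and its projection onto the salient feature locations after centering. The function name and toy values are hypothetical.

```python
import numpy as np

def reconstruction_error(C_zone, salient_vecs):
    """Frobenius norm of C_zone minus its reconstruction from the given
    orthonormal salient eigenvectors (columns of salient_vecs)."""
    mean = C_zone.mean(axis=1, keepdims=True)
    coeffs = salient_vecs.T @ (C_zone - mean)       # project centred data
    C_hat = mean + salient_vecs @ coeffs            # reconstructed matrix
    return float(np.linalg.norm(C_zone - C_hat, "fro"))

# Toy zone: most variation lies along (1, 1, 0); the third row is left over.
C_zone = np.array([[1.0, 2.0, 3.0],
                   [1.0, 2.0, 3.0],
                   [1.0, -1.0, 0.0]])
salient = np.array([[1.0], [1.0], [0.0]]) / np.sqrt(2.0)
print(reconstruction_error(C_zone, salient))
```

Here the variation in the third row is not captured by the single salient direction, so the error is nonzero; a zone fully described by its salient feature locations would give zero.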

The method of the invention proposes a model based on multiple subspaces, which holds that urban functional areas have multiple sets of features. When the spatiotemporal activity information of the geographic units is expressed as vectors, the vector samples lie in a high-dimensional space formed by a union of subspaces. Geographic units in the same subspace carry similar dynamics of human activity and can be clustered into one functional area, so urban functional areas are identified by finding the subspaces. The uniqueness and abundance of each functional area are then analyzed from the geometric properties of the subspaces, providing fine-grained quantitative indicators for the management and development of urban functional areas.

Brief description of the drawings

Fig. 1 is a schematic flowchart of the method of the invention;

Fig. 2 is a schematic flowchart of an embodiment of the method of the invention;

Fig. 3 shows the similarity matrix obtained with the sparse subspace clustering method in the embodiment of the invention;

Fig. 4 shows the urban functional areas detected in the embodiment of the invention;

Fig. 5 shows the functional activity levels of the salient feature locations of each functional area in the embodiment of the invention;

Fig. 6 shows the similarity between functional areas computed in the embodiment of the invention;

Fig. 7 shows the uniqueness and abundance of the functional areas computed in the embodiment of the invention.

Detailed description

The invention is further described below with reference to the embodiments and the drawings, without limiting the invention in any way. Any transformation or replacement based on the teaching of the invention falls within its scope of protection.

As shown in Fig. 1, a method for identifying urban functional areas based on a multi-subspace model comprises the following steps:

Step 1: acquire taxi trajectory data and check-in data within the study area;

Step 2: construct the partition-oriented, visit-purpose-based time series feature matrix C;

Step 3: input the time series feature matrix C into the sparse subspace clustering algorithm and compute the correspondence between geographic units and urban functional areas;

Step 4: obtain the salient feature locations of each functional area, and from them identify the main function of each functional area.

Specifically, the construction of the time series feature matrix C in step 2 comprises the following steps:

Step 201: divide the study area to obtain N geographic units;

Step 202: preprocess the taxi trajectory data by removing outliers, extract the end point and arrival time of each trip, and map the end points to geographic units to obtain the visit records of the geographic units;

Step 203: match the check-in records with the visit records of the geographic units and classify the purpose of each visit;

Step 204: construct the time series feature matrix C with M rows and N columns, which represents the dynamics of human activity carried by the geographic units over a period of time, where M = T × D, T is the number of time periods, and D is the number of visit-purpose categories. Each column of C gives, for the corresponding geographic unit, the number of visits in each time period for each purpose.

Specifically, the sparse subspace clustering algorithm in step 3 comprises the following steps:

Step 301: solve for the coefficient matrix Z of size N×N, which minimizes the ℓ1 norm subject to the following constraints:

$$\min_{Z} \|Z\|_1 \quad \text{s.t.} \quad CZ = C,\ Z_{ii} = 0$$

where $\|Z\|_1$ denotes the ℓ1 norm. Minimizing the ℓ1 norm makes the coefficient matrix Z sparse, so that the time series feature of each geographic unit is expressed only as a linear combination of the time series features of other geographic units in the same subspace;

Step 302: use the coefficient matrix to build the similarity matrix of the data, $W = |Z| + |Z|^{T}$, of size N×N. Each entry of W is the similarity, in terms of time series features, between the geographic units at the corresponding indices;

The similarity matrix W is block diagonal: only the main diagonal carries nonzero submatrices, and all other blocks are zero. Each nonzero submatrix corresponds to a subspace. A subspace contains multiple geographic units whose time series features are very similar, while units in different subspaces differ greatly in their time series features, so the subspaces are the urban functional areas to be detected;

Step 303: use the normalized Laplacian matrix L of the similarity matrix W to compute the number of subspaces: $L = I - D^{-1/2} W D^{-1/2}$, where I is the identity matrix and D is the diagonal matrix with entries $D_{jj} = \sum_i W_{ij}$. Sort the eigenvalues of L in ascending order and compute the difference between each pair of adjacent eigenvalues, $\lambda_{k+1} - \lambda_k$; the k with the largest difference is the number of subspaces, that is, the number of urban functional areas to be detected;

Step 304: apply K-means clustering to the similarity matrix W with the number of clusters set to the k obtained in step 303. This gives the correspondence between the geographic units and k classes, that is, k urban functional areas, and completes the detection of urban functional areas.

Specifically, the salient feature locations of each functional area in step 4 are obtained as follows: using the correspondence from step 304, extract from the similarity matrix W generated in step 302 the subspace matrix of each urban functional area, $S_1, \dots, S_i, \dots, S_k$, and perform principal component analysis on it. The resulting eigenvectors $[e_1, e_2, \dots, e_p, \dots, e_M]_i$ are called the feature locations of $S_i$; the first r eigenvectors $[e_1, e_2, \dots, e_r]_i$, whose cumulative share of the eigenvalues exceeds 90%, are the salient feature locations of $S_i$.

Specifically, the main function of each functional area is identified by reshaping each salient feature location of the area into a matrix with D rows and T columns, where each row gives the change in activity level for one of the D purposes over the T time periods. This yields the main activity patterns of the functional area, and the most active function in those patterns is regarded as the main function of the area.

Further, the method for identifying urban functional areas also comprises the following steps:

Step 5: compute the similarity between functional areas;

Step 6: compute the uniqueness of each functional area and rank the functional areas by uniqueness.

Specifically, the similarity of functional areas is computed from the principal angles between the subspaces. For any two functional areas $S_k$ and $S_l$, the similarity is

$$\mathrm{aff}(S_k, S_l) = \sqrt{\cos^2\theta^{(1)} + \cdots + \cos^2\theta^{(d_k \wedge d_l)}}$$

where $\cos\theta^{(i)}$ is the i-th largest singular value of $U_k^{T} U_l$, $U_k$ and $U_l$ are orthonormal bases of $S_k$ and $S_l$ respectively, $\theta^{(i)}$ are the principal angles between the subspaces, and $d_k \wedge d_l$ denotes the smaller of the dimensions $d_k$ and $d_l$ of $S_k$ and $S_l$.

Specifically, the uniqueness of a functional area is inversely related to similarity: if the similarity between subspaces is high, the functions of the corresponding functional areas are very similar and the uniqueness of those areas is low. The uniqueness of each functional area $S_i$ is computed as follows:

Figure BDA0002518778500000101

where k is the total number of functional areas and $S_{-i}$ denotes the functional areas other than $S_i$.

Further, the method for identifying urban functional areas also comprises the following step:

Step 7: compute the abundance of each functional area and rank the functional areas by abundance.

Specifically, the abundance of a functional area is related to the reconstruction error of its salient feature locations and is computed as follows:

Figure BDA0002518778500000102

where $C(S_i)$ is the matrix formed by the original vectors belonging to subspace $S_i$, and the second matrix in the formula (Figure BDA0002518778500000103) is the matrix reconstructed from the salient feature locations of $S_i$.

The reconstruction error describes the difference between the original subspace matrix and the matrix restored from the salient feature locations. The larger the reconstruction error, the more feature locations are needed, beyond the dominant salient ones, to describe the dynamics within the functional area. Abundance therefore reflects the richness of people's activity patterns in an area and the functional development that supports those patterns.

Following the flow chart in FIG. 2, the experiment of the present invention comprises the following steps.

(1) Data processing

Step 1.1: Select the main urban area of Shanghai as the study area and divide it into a grid of 500 m × 500 m cells. After removing water-body cells, 3166 geographic units remain.

Step 1.2: Preprocess the GPS trajectory data from 6,600 taxis in Shanghai: remove outliers, extract the destination and arrival time of each trip, and map each destination to a geographic unit, yielding 7,852,724 visit records.

Step 1.3: Match the check-in data records against the visit records of the geographic units and classify the purpose of each visit. There are six visit purposes: home, transportation, work, dining, entertainment, and other (visits to parks, museums, libraries, and similar places).

Step 1.4: Divide the day by hour into 24 time periods and count the number of visits to each geographic unit for each of the 6 purposes in each of the 24 periods, giving a time-series feature matrix C with 144 rows and 3166 columns.
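The counting in Step 1.4 can be sketched in a few lines of plain Python. This is only an illustration: the record layout `(unit_index, hour, purpose)`, the purpose labels, and the function name `build_feature_matrix` are assumptions made for the example. The row order (purpose-major, 24 hours per purpose) is also an assumption, chosen so that a 144-entry column reshapes directly into the 6 × 24 layout used in Step 2.5.

```python
from collections import Counter

# Hypothetical purpose labels; the patent lists six visit purposes.
PURPOSES = ["home", "transport", "work", "dining", "entertainment", "other"]
HOURS = 24


def build_feature_matrix(visits, n_units):
    """Count visits per (purpose, hour) row and geographic-unit column.

    visits: iterable of (unit_index, hour, purpose) tuples (assumed layout).
    Returns a list of lists with len(PURPOSES) * HOURS rows and n_units columns.
    """
    n_rows = len(PURPOSES) * HOURS
    counts = Counter()
    for unit, hour, purpose in visits:
        # Assumed row layout: all 24 hours of purpose 0, then purpose 1, ...
        row = PURPOSES.index(purpose) * HOURS + hour
        counts[(row, unit)] += 1
    return [[counts[(r, u)] for u in range(n_units)] for r in range(n_rows)]


# Toy input: two work visits to unit 0 at 08:00, one dining visit to unit 2 at 19:00.
visits = [(0, 8, "work"), (0, 8, "work"), (2, 19, "dining")]
C = build_feature_matrix(visits, n_units=3)
```

With the experiment's figures (6 purposes, 24 hours, 3166 units) the same function would produce the 144 × 3166 matrix described in Step 1.4.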

(2) Identification of urban functional areas

Step 2.1: Input the time-series feature matrix C into the sparse subspace clustering algorithm to obtain the similarity matrix W. Figure 3 visualizes the similarity matrix, which reveals the similarity between geographic units; entries with non-zero similarity are drawn in black. Five diagonal blocks are visible, indicating that there are 5 urban functional areas.
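Claim 1 defines the similarity matrix as W = |Z| + |Z|^T, where Z is the sparse coefficient matrix. A minimal NumPy sketch of that single step follows. It assumes Z has already been obtained from some l1 solver, which is not shown here, and the function name and toy values are illustrative only.

```python
import numpy as np


def similarity_from_coefficients(Z):
    """Build W = |Z| + |Z|^T from a coefficient matrix Z (N x N)."""
    A = np.abs(np.asarray(Z, dtype=float))
    return A + A.T


# Toy coefficient matrix with zero diagonal (Z_ii = 0), values chosen arbitrarily.
Z = np.array([[0.0, 0.7, 0.0],
              [-0.5, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
W = similarity_from_coefficients(Z)
```

Taking absolute values and adding the transpose makes W symmetric and non-negative, which is what the Laplacian-based steps that follow require.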

Step 2.2: Compute the number of subspaces from the normalized Laplacian matrix L of W, $L = I - D^{-1/2} W D^{-1/2}$, where I is the identity matrix and D is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$. Sort the eigenvalues of L in ascending order and compute the gap $\lambda_{k+1} - \lambda_k$ between each pair of adjacent eigenvalues. The largest gap occurs at k = 5, so the number of subspaces (urban functional areas) is 5, consistent with the reading of Figure 3 in Step 2.1.
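The eigengap rule of Step 2.2 can be sketched with NumPy as below. The function names and the toy block-diagonal similarity matrix are assumptions for illustration. The guard for zero-degree rows is an addition that the patent does not discuss.

```python
import numpy as np


def estimate_num_subspaces(W):
    """Return (k, eigenvalues) where k maximizes lambda_{k+1} - lambda_k
    over the ascending eigenvalues of L = I - D^{-1/2} W D^{-1/2}."""
    W = np.asarray(W, dtype=float)
    degree = W.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    positive = degree > 0                      # avoid dividing by zero for isolated units
    inv_sqrt[positive] = 1.0 / np.sqrt(degree[positive])
    L = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(L)            # symmetric matrix, ascending eigenvalues
    gaps = np.diff(eigvals)                    # gaps[j] = eigvals[j + 1] - eigvals[j]
    k = int(np.argmax(gaps)) + 1               # convert 0-based gap index to 1-based k
    return k, eigvals


def block_diagonal_similarity(sizes):
    """Toy W: fully connected blocks of the given sizes, zero diagonal."""
    n = sum(sizes)
    W = np.zeros((n, n))
    start = 0
    for s in sizes:
        W[start:start + s, start:start + s] = 1.0
        start += s
    np.fill_diagonal(W, 0.0)
    return W


W_toy = block_diagonal_similarity([3, 4, 5])
k_est, eigvals = estimate_num_subspaces(W_toy)
```

For an ideal block-diagonal W the eigenvalue 0 appears once per block, so the first large gap sits right after the zero eigenvalues and k_est equals the number of blocks.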

Step 2.3: Accordingly, apply K-means clustering to W with the number of clusters set to 5 to complete the detection of urban functional areas, obtaining functional areas 1, 2, 3, 4, and 5. The clustering result is visualized on the map in Figure 4, which shows that the central area is mainly covered by functional area 5.
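Step 2.3 applies K-means to the rows of W. The patent does not name an implementation, so the sketch below is a plain Lloyd iteration in NumPy with a deterministic farthest-point initialization. Both the initialization and the function name `kmeans_rows` are assumptions made so that the example is reproducible.

```python
import numpy as np


def kmeans_rows(X, k, n_iter=100):
    """Cluster the rows of X into k groups with Lloyd's algorithm."""
    X = np.asarray(X, dtype=float)
    # Assumed initialization: start at row 0, then repeatedly take the row
    # farthest from the centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)

    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Assign every row to its nearest center (squared Euclidean distance).
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster becomes empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy similarity matrix with two fully connected blocks (sizes 3 and 4).
W_toy = np.zeros((7, 7))
W_toy[:3, :3] = 1.0
W_toy[3:, 3:] = 1.0
np.fill_diagonal(W_toy, 0.0)
labels = kmeans_rows(W_toy, k=2)
```

Each geographic unit's label is its functional area index; in the experiment k would be 5 and W would be 3166 × 3166.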

Step 2.4: Because the main function of a functional area is determined by its salient activity features, principal component analysis is performed on the subspace matrix of each functional area to determine the actual function of the detected areas, yielding the feature locations of each area. In functional areas 1, 2, 3, and 4 the first 5 eigenvalues account for more than 90% of the total, whereas in functional area 5 they account for less than 90%. We therefore take the basis vectors corresponding to the first 5 eigenvalues as the salient feature locations of functional areas 1 to 4, and the basis vectors corresponding to the first 10 eigenvalues as those of functional area 5.
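The selection rule in Step 2.4 (keep the leading eigenvectors whose cumulative eigenvalue share exceeds 90%) can be sketched as follows. The sketch assumes that the subspace matrix of a functional area holds the feature vectors of its geographic units as columns, and that the data are mean-centered before the decomposition. Neither detail is spelled out in the text, and the function name is hypothetical.

```python
import numpy as np


def salient_feature_locations(S, threshold=0.90):
    """Return (U_r, ratio): the first r principal directions of S whose
    cumulative eigenvalue share first exceeds `threshold`, and all shares.

    S: M x n matrix, one column per geographic unit (assumed layout).
    """
    S = np.asarray(S, dtype=float)
    centered = S - S.mean(axis=1, keepdims=True)      # assumed centering step
    # Left singular vectors are the covariance eigenvectors; squared singular
    # values are proportional to the covariance eigenvalues.
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    eigvals = s ** 2
    ratio = np.cumsum(eigvals) / eigvals.sum()
    # Index of the first cumulative share strictly above the threshold.
    r = int(np.searchsorted(ratio, threshold, side="right")) + 1
    r = min(r, len(ratio))
    return U[:, :r], ratio


# Toy data: one dominant direction (eigenvalue shares roughly 0.99 and 0.01).
S_dominant = np.array([[10.0, -10.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, -1.0]])
U_one, ratio_one = salient_feature_locations(S_dominant)

# Toy data: shares 0.8 and 0.2, so both directions are needed to pass 90%.
S_spread = np.array([[2.0, -2.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, -1.0]])
U_two, ratio_two = salient_feature_locations(S_spread)
```

In the experiment this rule yields r = 5 for functional areas 1 to 4, while functional area 5 needs more directions because its first 5 eigenvalues fall short of 90%.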

Step 2.5: Reshape each salient feature location of each functional area into a matrix with 6 rows and 24 columns, so that each row shows how the activity level at that feature location varies over the 24 hours for one purpose: home (H), transportation (Tr), work (W), dining (D), entertainment (E), and other (O: visits to parks, museums, libraries, and similar places). The salient feature locations of all functional areas are shown in Figure 5. The figure shows that home activity (H) is the most active in the salient feature locations of functional area 1, dining (D) ranks second, and entertainment (E) is also prominent, so functional area 1 can be regarded as a residential area with supporting dining and entertainment facilities. Likewise, functional area 2, where transportation activity (Tr) stands out, is a transportation hub; functional area 3, whose main activity is work (W), is a work area; for functional area 5, the first 10 salient feature locations were examined and dining (D) and entertainment (E) were found to be active, so it is regarded as a commercial area; functional area 4 corresponds to other functional areas such as parks, museums, and gas stations.
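The reshaping in Step 2.5 can be illustrated with NumPy. The sketch assumes the 144 entries are ordered purpose by purpose (24 hours each), so that a row-major reshape gives one row per purpose. Ranking purposes by the sum of absolute values over the day is an illustrative choice, not the patent's stated criterion, which relies on visual inspection of Figure 5.

```python
import numpy as np

PURPOSES = ["H", "Tr", "W", "D", "E", "O"]   # labels as used in Step 2.5


def activity_profile(feature_location, n_purposes=6, n_hours=24):
    """Reshape a 144-entry feature location into a 6 x 24 purpose-by-hour matrix."""
    return np.asarray(feature_location, dtype=float).reshape(n_purposes, n_hours)


def rank_purposes(profile):
    """Order purposes from most to least active (assumed measure: sum of |values|)."""
    strength = np.abs(profile).sum(axis=1)
    order = np.argsort(-strength, kind="stable")
    return [PURPOSES[i] for i in order]


# Toy feature location: strong home activity, weaker dining, weakest entertainment.
v = np.zeros(144)
v[0:24] = 1.0            # H
v[3 * 24:4 * 24] = 0.5   # D
v[4 * 24:5 * 24] = 0.25  # E
profile = activity_profile(v)
ranking = rank_purposes(profile)
```

Absolute values are used because the sign of a principal direction is arbitrary. A toy profile like this one mirrors the pattern reported for functional area 1, where H leads, followed by D and E.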

(3) Analysis of urban functional areas

Step 3.1: Compute the proximity of the subspaces, that is, the similarity of the functional areas, from the principal angles between subspaces (see Figure 6). The similarity of a functional area with itself is not computed and is set to 0. In Figure 6 the residential and commercial areas have the highest similarity, because residential areas are more likely to contain dining and entertainment facilities, and the commercial area in Figure 3 is itself interspersed with a large number of residential units.
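Principal angles between two subspaces can be computed from the singular values of U_k^T U_l, where U_k and U_l are orthonormal bases, as claim 3 describes. The sketch below shows that computation. The affinity formula itself appears only as an image in the source, so the normalized form used here, the square root of the mean squared cosine, is an assumption borrowed from the subspace clustering literature and may differ from the patent's exact expression.

```python
import numpy as np


def principal_cosines(Uk, Ul):
    """Cosines of the principal angles between span(Uk) and span(Ul).

    Uk, Ul: matrices with orthonormal columns. The number of values returned
    is min(d_k, d_l), the smaller of the two subspace dimensions.
    """
    s = np.linalg.svd(Uk.T @ Ul, compute_uv=False)
    return np.clip(s, 0.0, 1.0)   # remove tiny numerical overshoot above 1


def affinity(Uk, Ul):
    """Assumed normalized affinity: sqrt(mean of squared principal cosines)."""
    cos = principal_cosines(Uk, Ul)
    return float(np.sqrt(np.sum(cos ** 2) / len(cos)))


# Toy subspaces of R^3.
plane_xy = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 0.0]])                       # span(e1, e2)
axis_z = np.array([[0.0], [0.0], [1.0]])                # span(e3)
diagonal = np.array([[1.0], [0.0], [1.0]]) / np.sqrt(2) # span(e1 + e3)

aff_same = affinity(plane_xy, plane_xy)
aff_orthogonal = affinity(plane_xy, axis_z)
aff_diagonal = affinity(plane_xy, diagonal)
```

Under this normalization the affinity is 1 for identical subspaces and 0 for orthogonal ones, so values closer to 1 indicate functional areas with more similar activity dynamics.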

Step 3.2: Compute the uniqueness of each functional area from the functional-area similarities; the results are shown in Figure 7. The uniqueness values are high overall, indicating that the functional areas of the study area differ markedly from one another. The residential and commercial areas have lower uniqueness, which is consistent with the result of Step 3.1.

Step 3.3: Compute the abundance of each functional area; the results are shown in Figure 7. The "other" functional area (areas providing other services) has the largest reconstruction error, meaning that its activity patterns are the most complex: it contains many kinds of facilities whose dynamic activity patterns differ widely. The residential and commercial areas have the smallest reconstruction errors because they concentrate on living and on dining and entertainment respectively, so their dynamic activity patterns are relatively uniform.
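The reconstruction error behind the abundance measure can be sketched as a Frobenius-norm residual after projecting a functional area's data onto its first r principal directions. The exact abundance formula is an image in the source, so this sketch is an assumption: it uses the plain, unnormalized residual and mean-centers the data, and the function name is hypothetical.

```python
import numpy as np


def reconstruction_error(C_i, r):
    """Frobenius norm of C_i minus its rank-r PCA reconstruction.

    C_i: M x n matrix of the original feature vectors of one functional area.
    r:   number of salient feature locations (principal directions) kept.
    """
    C_i = np.asarray(C_i, dtype=float)
    mean = C_i.mean(axis=1, keepdims=True)          # assumed centering step
    centered = C_i - mean
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    U_r = U[:, :r]
    C_hat = mean + U_r @ (U_r.T @ centered)         # projection onto the kept directions
    return float(np.linalg.norm(C_i - C_hat, ord="fro"))


# Toy area whose variance lies along two directions (eigenvalues 8 and 2).
C_toy = np.array([[2.0, -2.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, -1.0]])
err_one = reconstruction_error(C_toy, r=1)   # the weaker direction is lost
err_two = reconstruction_error(C_toy, r=2)   # everything is reconstructed
```

A larger residual means the salient feature locations leave more of the area's activity unexplained, which Step 3.3 interprets as a more complex mix of activity patterns.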

As the summary of the invention and the embodiment show, to solve the problems in the prior art the present invention proposes a model based on multiple subspaces. It assumes that urban functional areas have multiple sets of features: when the spatio-temporal activity of each geographic unit is expressed as a vector, the vector samples lie in a high-dimensional space formed by a union of subspaces. Geographic units lying in the same subspace carry similar human activity dynamics and can be clustered into one functional area. Urban functional areas are identified by finding these subspaces, and the uniqueness and abundance of each functional area are analyzed from the geometric properties of the subspaces, providing fine-grained quantitative indicators for the management and development of urban functional areas.

Claims (4)

1. An urban functional area identification method based on a multi-subspace model, characterized by comprising the following steps:
Step 1, obtaining taxi trajectory data and check-in data within the study area;
Step 2, constructing a zone-oriented time-series feature matrix C based on visit purpose;
Step 3, inputting the time-series feature matrix C into a sparse subspace clustering algorithm and computing the correspondence between geographic units and urban functional areas;
Step 4, obtaining the salient feature locations of each functional area and thereby identifying the main function of each functional area;
wherein the construction of the time-series feature matrix C in Step 2 comprises the following steps:
Step 201, dividing the study area to obtain N geographic units;
Step 202, preprocessing the taxi trajectory data by removing outliers, extracting the destination and arrival time of each trip, and mapping each destination to a geographic unit to obtain the visit records of the geographic units;
Step 203, matching the check-in data records against the visit records of the geographic units and classifying the purpose of each visit;
Step 204, constructing a time-series feature matrix C with M rows and N columns, representing the human activity dynamics carried by the geographic units over a period of time, where M = T × D, T is the number of time periods, D is the number of visit-purpose categories, and each column of C gives the number of people visiting the corresponding geographic unit for the different purposes in the different time periods;
the sparse subspace clustering algorithm in Step 3 comprises the following steps:
Step 301, solving for a coefficient matrix Z of size N × N that satisfies the following minimization under the l1 constraint:
$$\min_{Z} \|Z\|_1 \quad \text{s.t.} \quad CZ = C,\ Z_{ii} = 0$$
where $\|\cdot\|_1$ denotes the l1 norm; minimizing the l1 norm makes the coefficient matrix Z sparse, forcing the time-series feature of each geographic unit to be represented only by a linear combination of the time-series features of other geographic units in the same subspace;
Step 302, using the coefficient matrix to build the similarity matrix $W = |Z| + |Z|^{T}$ of size N × N, whose entries are the time-series similarities between the geographic units with the corresponding indices;
Step 303, computing the number of subspaces from the normalized Laplacian matrix L of the similarity matrix W, $L = I - D^{-1/2} W D^{-1/2}$, where I is the identity matrix and D is the diagonal degree matrix with $D_{ii} = \sum_j W_{ij}$; sorting the eigenvalues of L in ascending order and computing the gap $\lambda_{k+1} - \lambda_k$ between each pair of adjacent eigenvalues, the k corresponding to the largest gap being the number of subspaces, that is, the number of urban functional areas to be detected;
Step 304, applying K-means clustering to the similarity matrix W with the number of clusters set to the k obtained in Step 303, yielding the correspondence between geographic units and the k clusters, that is, the k urban functional areas, and completing the detection of urban functional areas.
2. The urban functional area identification method according to claim 1, characterized in that obtaining the salient feature locations of each functional area in Step 4 comprises: using the correspondence from Step 304, extracting from the similarity matrix W generated in Step 302 the subspace matrices S_1, ..., S_i, ..., S_k corresponding to the urban functional areas, and performing principal component analysis on them; the resulting eigenvectors [e_1, e_2, ..., e_p, ..., e_M]_i are called the feature locations of S_i, and the first r eigenvectors [e_1, e_2, ..., e_r]_i, whose cumulative eigenvalue share exceeds 90%, are the salient feature locations of S_i;
identifying the main function of each functional area comprises: reshaping each salient feature location of each functional area into a matrix with D rows and T columns, where each row represents the change in activity level over the T time periods for one of the D purposes, thereby obtaining the main activity pattern of the functional area, and labelling the functional area with the most active function in that pattern, completing the identification of urban functional areas.
3. The urban functional area identification method according to claim 1 or 2, characterized in that the method further comprises the following steps:
Step 5, calculating the similarity of each functional area;
Step 6, calculating the uniqueness of each functional area and ranking the functional areas by uniqueness;
the similarity of the functional areas is calculated from the principal angles between the corresponding subspaces; the similarity aff(S_k, S_l) of the subspaces S_k and S_l corresponding to any two functional areas is calculated as follows:
Figure FDA0002518778490000031
where the quantity in Figure FDA0002518778490000032 is the i-th largest singular value of the matrix in Figure FDA0002518778490000033, U_k and U_l are orthonormal bases of S_k and S_l respectively, the quantity in Figure FDA0002518778490000034 is the principal angle between the subspaces, and d_k ∧ d_l denotes the smaller of the dimensions d_k and d_l of S_k and S_l;
the uniqueness of a functional area is inversely related to its similarity to the other functional areas: if the similarity between subspaces is high, the corresponding functional areas serve very similar functions and their uniqueness is low; the uniqueness of each functional area S_i is calculated as follows:
Figure FDA0002518778490000035
where k is the total number of functional areas and S_{-i} denotes the functional areas other than S_i.
4. The urban functional area identification method according to claim 3, characterized in that the method further comprises the following step:
Step 7, calculating the abundance of each functional area and ranking the functional areas by abundance;
the abundance of a functional area is determined by the reconstruction error of its salient feature locations, calculated as follows:
Figure FDA0002518778490000036
where C(S_i) is the matrix formed by the original vectors belonging to subspace S_i, the matrix in Figure FDA0002518778490000037 is the one reconstructed from the salient feature locations of S_i, and || ||_F denotes the Frobenius norm of a matrix.
CN202010484901.8A 2020-06-01 2020-06-01 City functional area identification method based on multi-subspace model Active CN111651502B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010484901.8A CN111651502B (en) 2020-06-01 2020-06-01 City functional area identification method based on multi-subspace model

Publications (2)

Publication Number Publication Date
CN111651502A true CN111651502A (en) 2020-09-11
CN111651502B CN111651502B (en) 2021-09-14

Family

ID=72344015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010484901.8A Active CN111651502B (en) 2020-06-01 2020-06-01 City functional area identification method based on multi-subspace model

Country Status (1)

Country Link
CN (1) CN111651502B (en)

Also Published As

Publication number Publication date
CN111651502B (en) 2021-09-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant