Commuter Mobility Patterns in Social Media: Correlating Twitter and LODES Data
Figure 1. Study area overview.
Figure 2. Tweets per Month and County for the entire observation period.
Figure 3. Schematic workflow illustrating input data, analysis steps and outputs.
Figure 4. Frequency distribution of $r^2$ values for the comparison of LODES to all Twitter data.
Figure 5. Comparison of road segment use for different temporal subsets of the Twitter data. Map (a) shows the location of the detailed maps within the study area. Maps (b,c) show the road segment usage for different temporal subsets.
Figure 6. Correlation coefficients between Twitter and LODES flows between regions. The mean of correlation coefficients for the spatial scale levels is indicated with a dashed line.
Figure 7. Chord diagrams for County-level connections of (a) Twitter flows during rush hours, (b) Twitter flows outside of rush hours and (c) LODES data (magnitude × 1000).
Figure 8. Sankey diagrams of land-use pairs from (a) Twitter flows during rush hours, (b) Twitter flows outside of rush hours and (c) LODES data (magnitude × 1000).
Figure 9. Areas with a (mildly, up to $-0.13$) negative $r^2$ for Twitter data predicting LODES street segment use. The areas in red are the only areas, where there is a significant difference for all the tweets that were filtered for the purpose of simulating LODES data. The total sum of these areas is small and not spatially autocorrelated.
Abstract
1. Introduction
- To what degree do commuter flow patterns identified in GSND correlate with official LODES commuting data?
- Which traffic flow information beyond commuting is contained in GSND?
- How strong is the influence of spatial scale on correlations between flows extracted from GSND and LODES commuting flows?
2. Related Work
3. Materials
3.1. Study Area Description
3.2. Data Description and Preprocessing
4. Methods
4.1. Computing Trajectories and Flows in Twitter Data
4.1.1. Detecting Clusters of User Locations
4.1.2. Identifying Individual User Trajectories
4.1.3. Identifying Directed Flows at Different Scales
4.2. Flows between Spatial Regions and Land-Use Classes
4.3. Mapping Flows to Street Segments
4.3.1. Correlating Twitter Flows and LODES Flows
4.3.2. Determining Non-Work-Related Twitter-Derived Street Segment Usage
5. Results
5.1. Flows between Spatial Regions and Land-Use Classes
5.1.1. Comparing LODES with Twitter-Derived Street Segment Usage
5.1.2. Determining Rush Hour Twitter-Derived Street Segment Usage
5.1.3. Determining Non-Work-Related Twitter-Derived Street Segment Usage
6. Discussion and Limitations
6.1. Discussion of Research Questions
6.2. Discussion of Methods
6.3. Discussion of Results and Relevance for Urban Planning
Potential Use in Urban Planning
7. Conclusions and Outlook
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Statistic | County | Census Tract | Census Block |
---|---|---|---|
Number of Regions | 9 | 1584 | 109,228 |
Mean Area [km²] | 2357.7 | 13.4 | 0.2 |
Median Area [km²] | 2126.5 | 1.6 | 0.02 |
Standard Deviation Area [km²] | 1028.9 | 69.0 | 2.1 |
CV Area | 0.44 | 5.15 | 10.50 |
Mean Number of Tweets | 3,750,657 | 21,310 | 352 |
Median Number of Tweets | 3,013,390 | 15,503 | 54 |
Standard Deviation Number of Tweets | 2,794,267 | 27,339 | 1863 |
CV Number of Tweets | 0.75 | 1.28 | 5.29 |
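
The CV rows above appear to be the coefficient of variation, that is, the standard deviation divided by the mean. As a minimal sketch under that assumption, the following recomputes the CV rows from the mean and standard deviation rows of the same table. The names `STATS`, `coefficient_of_variation` and `cv_table` are illustrative and not taken from the article.

```python
# (mean, standard deviation) pairs copied from the table above.
# Area values are in square kilometres; tweet values are counts per region.
STATS = {
    "County": {"area_km2": (2357.7, 1028.9), "tweets": (3_750_657, 2_794_267)},
    "Census Tract": {"area_km2": (13.4, 69.0), "tweets": (21_310, 27_339)},
    "Census Block": {"area_km2": (0.2, 2.1), "tweets": (352, 1863)},
}


def coefficient_of_variation(mean: float, std: float) -> float:
    """Return the unitless ratio of standard deviation to mean."""
    return std / mean


# Assumption: the table reports CV rounded to two decimals.
cv_table = {
    scale: {
        name: round(coefficient_of_variation(mean, std), 2)
        for name, (mean, std) in variables.items()
    }
    for scale, variables in STATS.items()
}

for scale, row in cv_table.items():
    print(f"{scale:<13} CV area = {row['area_km2']:.2f}  CV tweets = {row['tweets']:.2f}")
```

A CV above 1 means the spread exceeds the mean, which is why the finer spatial scales are far more heterogeneous in both area and tweet volume than the county level.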
Layer | Feature Count |
---|---|
Total OSM network | 541,898 |
LODES | 136,492 |
All Twitter | 114,002 |
Twitter—Outside of rush hour | 112,769 |
Twitter—Weekends | 97,997 |
Twitter—Rush hour | 96,896 |
Twitter—Outside of rush hour 2018/19 only | 38,330 |
Twitter—Weekends 2018/19 only | 31,766 |
Twitter—Rush hour 2018/19 only | 31,441 |
LODES Prediction | Spatial Lag | Spatial Error |
---|---|---|
All Twitter | 0.76117 | 0.766143 |
Twitter—Outside of rush hour | 0.764129 | 0.770077 |
Twitter—Rush hour | 0.728379 | 0.736184 |
Twitter—Weekends | 0.763587 | 0.772134 |
Twitter—Outside of rush hour 2018/19 only | 0.352977 | 0.425247 |
Twitter—Rush hour 2018/19 only | 0.266606 | 0.320049 |
Twitter—Weekends 2018/19 only | 0.349353 | 0.41174 |
All-Twitter Predictions 2018/19 | Spatial Lag | Spatial Error |
---|---|---|
Twitter—Outside of rush hour | 0.505801 | 0.602277 |
Twitter—Rush hour | 0.418713 | 0.492426 |
Twitter—Weekends | 0.515357 | 0.609862 |
Spatial Scale | Within Region: Twitter | Within Region: LODES | Across Regions: Twitter | Across Regions: LODES |
---|---|---|---|---|
Census Block | 0 (0%) | 4313 (0.1%) | 946,907 (100%) | 3,248,031 (99.9%) |
Census Tract | 404,121 (42.7%) | 110,630 (3.4%) | 542,786 (57.3%) | 3,141,714 (96.6%) |
County | 833,318 (88.0%) | 1,897,982 (58.4%) | 113,589 (12.0%) | 1,354,362 (41.6%) |
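
The percentages in the table above can be checked from the raw counts. The sketch below assumes that, within each pair of columns, the first count column belongs to one data source and the second to the other. It deliberately does not assume which source is which, so the keys `first` and `second` (like all other names here) are illustrative only. It recomputes the within-region share for each scale and checks that each source has the same total number of flows at every scale.

```python
# (within region, across regions) flow counts copied from the table above.
# "first" / "second" denote the first and second count column of each pair.
COUNTS = {
    "Census Block": {"first": (0, 946_907), "second": (4_313, 3_248_031)},
    "Census Tract": {"first": (404_121, 542_786), "second": (110_630, 3_141_714)},
    "County": {"first": (833_318, 113_589), "second": (1_897_982, 1_354_362)},
}


def within_share(within: int, across: int) -> float:
    """Percentage of flows that start and end in the same region."""
    return 100.0 * within / (within + across)


# Within-region share per scale and column, rounded to one decimal as in the table.
shares = {
    scale: {col: round(within_share(*pair), 1) for col, pair in cols.items()}
    for scale, cols in COUNTS.items()
}

# Total flows per column; each set should collapse to a single value
# if the scales only re-partition the same set of flows.
totals = {
    col: {sum(COUNTS[scale][col]) for scale in COUNTS}
    for col in ("first", "second")
}

for scale, row in shares.items():
    print(f"{scale:<13} first = {row['first']:.1f}%  second = {row['second']:.1f}%")
print(totals)
```

Because the totals stay constant, the shift from across-region to within-region flows at coarser scales reflects only the change in region size, not a change in the underlying flows.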
Statistic | LODES | Twitter |
---|---|---|
Minimum | 0.001 km/0 min | 0.008 km/0 min |
Median | 12.660 km/14 min | 2.800 km/3 min |
Mean | 22.538 km/19 min | 7.561 km/6 min |
Maximum | 259.761 km/178 min | 366.315 km/264 min |
Temporal Subset | $r^2$ |
---|---|
Twitter—Weekends | 0.6098 |
Twitter—Outside of rush hour | 0.6023 |
Twitter—Rush hour | 0.4924 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Petutschnig, A.; Albrecht, J.; Resch, B.; Ramasubramanian, L.; Wright, A. Commuter Mobility Patterns in Social Media: Correlating Twitter and LODES Data. ISPRS Int. J. Geo-Inf. 2022, 11, 15. https://doi.org/10.3390/ijgi11010015