A Geocoding Framework Powered by Delivery Data

Published: 03 November 2020

Abstract

Over the last decade, India has witnessed an explosion in the e-commerce industry. Adoption of e-commerce is increasing in smaller towns and cities, beyond the densely populated urban centers. In this paper, we discuss the practical challenges involved in developing high-precision geocoding engines for these geographical regions in India. These challenges motivate the next iteration of our geocoding framework. In particular, we focus on three core areas of improvement: 1) leveraging customer delivery data for geocoding, 2) understanding and handling the diversity and variation in addresses for these new regions, and 3) overcoming the limited coverage of our reference corpus. To this end, we present GeoCloud. The key contributions of GeoCloud are 1) a training algorithm for learning reference-representations from delivery coordinates and 2) a retrieval algorithm for geocoding new addresses. We perform extensive testing of GeoCloud across India to capture the regional, socioeconomic and linguistic diversity of our country. Our evaluation data is sampled from the delivery addresses of a large e-commerce platform in India and covers 72 cities and 21 states. The results show a significant improvement in precision and recall over the state-of-the-art geocoding system for India, and demonstrate the effectiveness of our intuitive, robust and generic approach. While we have shown the effectiveness of the framework for Indian addresses, we believe it can be applied to other countries as well, particularly where addresses are unstructured. To the best of our knowledge, this is the first instance of geocoding by learning reference-representations from large-scale delivery data.
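The abstract does not describe the training or retrieval algorithms in detail, so the following is only a toy sketch of the general idea of learning reference points from delivery coordinates and using them to geocode a new address. It is not GeoCloud's method. The function names (`tokenize`, `train`, `geocode`), the token-level median heuristic, the `min_count` threshold, and the sample coordinates in the usage note are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from statistics import median


def tokenize(address):
    """Lowercase an address and split it into alphanumeric tokens."""
    return [t for t in re.split(r"[^a-z0-9]+", address.lower()) if t]


def train(deliveries):
    """Learn one reference point per address token (illustrative heuristic).

    deliveries: iterable of (address, lat, lon) tuples from past deliveries.
    Returns a dict: token -> (median_lat, median_lon, spread, count), where
    spread is the largest L1 distance (in degrees) from the median point.
    """
    points = defaultdict(list)
    for address, lat, lon in deliveries:
        for tok in set(tokenize(address)):
            points[tok].append((lat, lon))

    refs = {}
    for tok, pts in points.items():
        lat = median(p[0] for p in pts)
        lon = median(p[1] for p in pts)
        spread = max(abs(p[0] - lat) + abs(p[1] - lon) for p in pts)
        refs[tok] = (lat, lon, spread, len(pts))
    return refs


def geocode(address, refs, min_count=2):
    """Return (lat, lon) for a new address, or None if no token matches.

    Among tokens seen in at least min_count deliveries, pick the one whose
    delivery points are most tightly clustered (smallest spread).
    """
    candidates = [
        refs[t] for t in tokenize(address)
        if t in refs and refs[t][3] >= min_count
    ]
    if not candidates:
        return None
    lat, lon, _, _ = min(candidates, key=lambda r: r[2])
    return (lat, lon)
```

For example, after training on a few deliveries to "Indiranagar" and "Koramangala" addresses in Bengaluru (with made-up coordinates), `geocode("Flat 3, Indiranagar, Bengaluru", refs)` would prefer the tightly clustered "indiranagar" token over the city-wide "bengaluru" token. A real system would also need spelling normalization and noise handling, which this sketch omits.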

Published In

SIGSPATIAL '20: Proceedings of the 28th International Conference on Advances in Geographic Information Systems
November 2020
687 pages
ISBN:9781450380195
DOI:10.1145/3397536
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Address Processing
  2. Data Preprocessing
  3. Geocoding
  4. Geographic Information System
  5. Spatial Data Mining and Knowledge Discovery

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SIGSPATIAL '20

Acceptance Rates

Overall Acceptance Rate 257 of 1,238 submissions, 21%

Cited By

  • (2024) Nationwide Behavior-Aware Coordinates Mining From Uncertain Delivery Events. IEEE Transactions on Knowledge and Data Engineering 36:11 (6681-6698). DOI: 10.1109/TKDE.2024.3411562. Online publication date: Nov-2024
  • (2023) Urban-scale POI Updating with Crowd Intelligence. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (4631-4638). DOI: 10.1145/3583780.3614724. Online publication date: 21-Oct-2023
  • (2023) AutoBuild: Automatic Community Building Labeling for Last-mile Delivery. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (4623-4630). DOI: 10.1145/3583780.3614658. Online publication date: 21-Oct-2023
  • (2023) C-AOI: Contour-based Instance Segmentation for High-Quality Areas-of-Interest in Online Food Delivery Platform. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (5750-5759). DOI: 10.1145/3580305.3599786. Online publication date: 6-Aug-2023
  • (2023) TDCM: Transport Destination Calibrating Based on Multi-task Learning. Machine Learning and Knowledge Discovery in Databases: Applied Data Science and Demo Track (276-292). DOI: 10.1007/978-3-031-43430-3_17. Online publication date: 17-Sep-2023
  • (2022) Address Location Correction System for Q-commerce. Proceedings of the Second International Conference on AI-ML Systems (1-7). DOI: 10.1145/3564121.3564800. Online publication date: 12-Oct-2022
  • (2022) Automatic generation of areas of interest using multimodal geospatial data from an on-demand food delivery platform (industrial paper). Proceedings of the 30th International Conference on Advances in Geographic Information Systems (1-10). DOI: 10.1145/3557915.3561023. Online publication date: 1-Nov-2022
  • (2022) Simultaneous detection of multiple areas-of-interest using geospatial data from an online food delivery platform (industrial paper). Proceedings of the 30th International Conference on Advances in Geographic Information Systems (1-10). DOI: 10.1145/3557915.3561014. Online publication date: 1-Nov-2022
  • (2022) FastAddr. Proceedings of the 30th International Conference on Advances in Geographic Information Systems (1-10). DOI: 10.1145/3557915.3560999. Online publication date: 1-Nov-2022
  • (2022) CoMiner. Proceedings of the 30th International Conference on Advances in Geographic Information Systems (1-10). DOI: 10.1145/3557915.3560944. Online publication date: 1-Nov-2022
