DOI: 10.1145/3397536.3422201
poster

Noise Prediction for Geocoding Queries using Word Geospatial Embedding and Bidirectional LSTM

Published: 13 November 2020

Abstract

User geocoding queries in map applications often contain noisy tokens, such as typos in street or city names, wrong postal codes, and redundant words introduced by copy-paste. The issue has become worse with the rapid growth of mobile devices, where errors in user input are inevitable. If such noisy tokens are passed as-is to the downstream query processing components, the search may fail: the user may receive no results or irrelevant ones. Noisy tokens in geocoding queries should therefore be recognized and handled properly before the search. This paper proposes a deep learning based noise prediction model for geocoding queries. It combines a novel Word Geospatial Embedding (WGE) with a Bidirectional LSTM based sequence tagging model. The proposed WGE is the first language model that allows geospatial semantics to be encoded into the vector representations, which allows geo-related machine learning and deep learning models to make spatially aware predictions.
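The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: each query token is mapped to a vector that carries both lexical and geospatial information, a bidirectional LSTM reads the token sequence in both directions, and a linear layer assigns each token a noise label. Everything here is an assumption made for illustration: the lookup tables `WORD_VECTORS` and `GEO_FEATURES`, the simple concatenation in `embed` (a stand-in, not the authors' WGE), the `OK`/`NOISE` label set, and the layer sizes. The weights are random and untrained, and biases are omitted, so the printed tags are not meaningful; only the data flow is.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical toy lookup tables (illustrative values, not taken from the paper).
WORD_VECTORS = {"main": [0.2, -0.1], "st": [0.0, 0.3], "seattle": [0.5, 0.1]}
GEO_FEATURES = {"seattle": [0.47, -1.22]}  # e.g. scaled latitude/longitude

def embed(token):
    """Assumed stand-in for a geospatial embedding: word vector + geo features."""
    word = WORD_VECTORS.get(token, [0.0, 0.0])
    geo = GEO_FEATURES.get(token, [0.0, 0.0])
    return word + geo  # list concatenation, 4 dimensions in total

class LSTMCell:
    """Single-direction LSTM: gates i, f, o and candidate g (biases omitted)."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        self.weights = {gate: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
                        for gate in "ifog"}

    def run(self, inputs):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        outputs = []
        for x in inputs:
            z = x + h  # concatenate the input with the previous hidden state
            pre = {gate: matvec(w, z) for gate, w in self.weights.items()}
            i = [sigmoid(v) for v in pre["i"]]
            f = [sigmoid(v) for v in pre["f"]]
            o = [sigmoid(v) for v in pre["o"]]
            g = [math.tanh(v) for v in pre["g"]]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            outputs.append(h)
        return outputs

def tag_query(tokens, forward, backward, output_weights):
    """Label each token using left-to-right and right-to-left context."""
    inputs = [embed(t.lower()) for t in tokens]
    left = forward.run(inputs)
    right = backward.run(inputs[::-1])[::-1]  # re-align with token order
    tags = []
    for hf, hb in zip(left, right):
        ok_score, noise_score = matvec(output_weights, hf + hb)
        tags.append("NOISE" if noise_score > ok_score else "OK")
    return tags

rng = random.Random(0)
forward = LSTMCell(input_dim=4, hidden_dim=3, rng=rng)
backward = LSTMCell(input_dim=4, hidden_dim=3, rng=rng)
output_weights = rand_matrix(2, 6, rng)  # 2 labels, 3 + 3 hidden units

tokens = ["123", "Main", "St", "Seattle", "asdf"]
tags = tag_query(tokens, forward, backward, output_weights)
print(list(zip(tokens, tags)))
```

In a trained system of this shape, tokens tagged as noise could be dropped or corrected before the query reaches the search components, which is the handling step the abstract argues for.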

Cited By

  • (2024) Nationwide Behavior-Aware Coordinates Mining From Uncertain Delivery Events. IEEE Transactions on Knowledge and Data Engineering 36:11 (6681-6698). DOI: 10.1109/TKDE.2024.3411562. Online publication date: Nov-2024
  • (2023) Feasibility Analysis of Machine Learning-Based Autoscoring Feature for Korean Language Spoken by Foreigners. 2023 IEEE International Conference on Internet of Things and Intelligence Systems (IoTaIS) (126-132). DOI: 10.1109/IoTaIS60147.2023.10346058. Online publication date: 28-Nov-2023
  • (2022) CoMiner. Proceedings of the 30th International Conference on Advances in Geographic Information Systems (1-10). DOI: 10.1145/3557915.3560944. Online publication date: 1-Nov-2022

      Published In

      SIGSPATIAL '20: Proceedings of the 28th International Conference on Advances in Geographic Information Systems
      November 2020
      687 pages
      ISBN: 9781450380195
      DOI: 10.1145/3397536
      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. deep learning
      2. geocoding
      3. word embeddings

      Qualifiers

      • Poster
      • Research
      • Refereed limited

      Conference

      SIGSPATIAL '20

      Acceptance Rates

      Overall Acceptance Rate 257 of 1,238 submissions, 21%

      Article Metrics

      • Downloads (Last 12 months): 11
      • Downloads (Last 6 weeks): 0
      Reflects downloads up to 08 Dec 2024
