DOI: 10.1145/1463434.1463460

Qualitative geocoding of persistent web pages

Published: 05 November 2008

Abstract

Information, and Web pages in particular, may be organized, indexed, searched, and navigated using various metadata aspects, such as keywords, categories (themes), and space. While categories and keywords are open to interpretation, space is an unambiguous aspect by which to structure information. The basic problem of providing spatial references to content is solved by geocoding, a task that relates identifiers in texts to geographic coordinates. This work presents a methodology for the semiautomatic geocoding of persistent Web pages, in which collaborative human intervention improves on automatic geocoding results. Although the work focuses on the Greek language and related Web pages, the developed techniques are universally applicable. The specific contributions are (i) automatic geocoding algorithms for phone numbers, addresses, and place name identifiers and (ii) a Web browser extension providing a map-based interface for manual geocoding and for updating the automatically generated results. Because the geocoding of a Web page is stored as annotations in a central repository, the overall mechanism is especially suited to persistent Web pages such as Wikipedia. To illustrate the applicability and usefulness of the approach, specific geocoding examples of Greek Web pages are presented.
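
As a rough illustration of what relating identifiers in text to geographic coordinates can look like, the following minimal sketch matches place names against a toy gazetteer and maps Greek landline numbers to a city via their area-code prefix. It is not the paper's algorithm: the `GAZETTEER` and `AREA_CODES` tables, the `geocode` function, and the exact-match strategy are all assumptions made for this example, and the coordinates are approximate city centres.

```python
import re

# Hypothetical toy gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "Αθήνα": (37.9838, 23.7275),
    "Θεσσαλονίκη": (40.6401, 22.9444),
}

# Hypothetical table of landline area-code prefixes -> gazetteer entry.
AREA_CODES = {
    "2310": "Θεσσαλονίκη",
    "210": "Αθήνα",
}

# Ten-digit numbers starting with 2 (assumed shape of a Greek landline number).
PHONE_RE = re.compile(r"\b2\d{9}\b")


def geocode(text):
    """Return a list of (identifier, kind, (lat, lon)) tuples found in text."""
    hits = []
    # Place names: exact substring match only (no stemming or fuzzy matching).
    for name, coords in GAZETTEER.items():
        if name in text:
            hits.append((name, "place", coords))
    # Phone numbers: resolve through the area-code prefix.
    for match in PHONE_RE.finditer(text):
        number = match.group()
        # Try longer prefixes first so a more specific code is preferred.
        for prefix in sorted(AREA_CODES, key=len, reverse=True):
            if number.startswith(prefix):
                hits.append((number, "phone", GAZETTEER[AREA_CODES[prefix]]))
                break
    return hits


if __name__ == "__main__":
    for hit in geocode("Γραφείο στην Αθήνα, τηλ. 2310123456"):
        print(hit)
```

In a semiautomatic setting such as the one the abstract describes, results like these would only be candidates that a human could then correct on a map; inflected place names, addresses, and ambiguous names are deliberately left out of this sketch.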

    Published In

    GIS '08: Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems
    November 2008
    559 pages
    ISBN:9781605583235
    DOI:10.1145/1463434

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. digital libraries
    2. indexing
    3. multilanguage content
    4. multilingual metadata
    5. spatiotemporal databases

    Qualifiers

    • Research-article

    Acceptance Rates

    Overall Acceptance Rate 257 of 1,238 submissions, 21%