Improving IP Geolocation using Query Logs

Published: 08 February 2016

Abstract

IP geolocation databases map IP addresses to their geographical locations. These databases are important for several applications such as local search engine relevance, credit card fraud protection, geotargeted advertising, and online content delivery. While they are the most popular method of geolocation, they can have low accuracy at the city level. In this paper, we evaluate and improve IP geolocation databases using data collected from search engine logs. We generate a large ground-truth dataset using real-time global positioning data extracted from search engine logs. We show that incorrect geolocation information can have a negative impact on implicit user metrics. Using the dataset, we measure the accuracy of three state-of-the-art commercial IP geolocation databases. We then introduce a technique to improve existing geolocation databases by mining explicit locations from query logs. We show significant accuracy gains in 44 to 49 out of the top 50 countries, depending on the IP geolocation database. Finally, we validate the approach with a large-scale A/B experiment that shows improvements in several user metrics.
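
To make the accuracy measurement described in the abstract concrete, here is a minimal sketch of one common way to compare a database's location for an IP address against a GPS ground-truth point: compute the great-circle (haversine) distance between the two and report the share of pairs that fall within a chosen radius. The abstract does not specify the paper's exact metric, so the function names, the input layout, and the 50 km radius below are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_within(pairs, radius_km):
    """Fraction of (database_location, ground_truth_location) pairs within radius_km.

    `pairs` is a hypothetical input layout: an iterable of
    ((db_lat, db_lon), (gps_lat, gps_lon)) tuples.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(1 for db, gps in pairs if haversine_km(*db, *gps) <= radius_km)
    return hits / len(pairs)


if __name__ == "__main__":
    # Illustrative coordinates only, not data from the paper.
    sample = [
        ((47.61, -122.33), (47.62, -122.35)),  # database point close to the GPS fix
        ((47.61, -122.33), (45.52, -122.68)),  # database point a few hundred km away
    ]
    # The 50 km radius is an assumed threshold, not a value taken from the paper.
    print(accuracy_within(sample, radius_km=50.0))
```

A radius-based score like this is only one option; the median or a percentile of the error distances would be an equally reasonable summary under the same assumptions.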

      Published In

      WSDM '16: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining
      February 2016
      746 pages
      ISBN: 9781450337168
      DOI: 10.1145/2835776
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. contextual relevance
      2. geographic personalization
      3. geographic targeting
      4. geotargeting
      5. ip geolocation
      6. local search

      Qualifiers

      • Research-article

      Conference

      WSDM 2016
      WSDM 2016: Ninth ACM International Conference on Web Search and Data Mining
      February 22 - 25, 2016
      San Francisco, California, USA

      Acceptance Rates

      WSDM '16 Paper Acceptance Rate 67 of 368 submissions, 18%;
      Overall Acceptance Rate 498 of 2,863 submissions, 17%

      Cited By

      • (2024) Mobile IP Geolocation Based on District Anchor Without Cooperation of Users or Internet Service Providers. IEEE/ACM Transactions on Networking 32:6 (5507-5523). DOI: 10.1109/TNET.2024.3471335. Online publication date: Dec-2024
      • (2023) IP2vec: an IP node representation model for IP geolocation. Frontiers of Computer Science 18:6. DOI: 10.1007/s11704-023-2616-9. Online publication date: 28-Dec-2023
      • (2022) IP Geolocation through Geographic Clicks. ACM Transactions on Spatial Algorithms and Systems 8:1 (1-22). DOI: 10.1145/3476774. Online publication date: 4-Mar-2022
      • (2021) The cosmos big data platform at Microsoft. Proceedings of the VLDB Endowment 14:12 (3148-3161). DOI: 10.14778/3476311.3476390. Online publication date: 28-Oct-2021
      • (2021) IP Geolocation through Reverse DNS. ACM Transactions on Internet Technology 22:1 (1-29). DOI: 10.1145/3457611. Online publication date: 15-Oct-2021
      • (2021) IP Geolocation Using Traceroute Location Propagation and IP Range Location Interpolation. Companion Proceedings of the Web Conference 2021 (332-338). DOI: 10.1145/3442442.3451888. Online publication date: 19-Apr-2021
      • (2020) Network Entity Landmark Mining Technology. 2020 6th International Symposium on System and Software Reliability (ISSSR) (75-82). DOI: 10.1109/ISSSR51244.2020.00020. Online publication date: Oct-2020
      • (2020) RNBG: A Ranking Nodes Based IP Geolocation Method. IEEE INFOCOM 2020 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) (80-84). DOI: 10.1109/INFOCOMWKSHPS50562.2020.9162976. Online publication date: Jul-2020
      • (2019) Street-Level Geolocation Based on Router Multilevel Partitioning. IEEE Access 7 (59237-59248). DOI: 10.1109/ACCESS.2019.2914972. Online publication date: 2019
      • (2019) GeoCET: Accurate IP Geolocation via Constraint-Based Elliptical Trajectories. Collaborative Computing: Networking, Applications and Worksharing (603-622). DOI: 10.1007/978-3-030-30146-0_41. Online publication date: 18-Aug-2019
