
Online Tracking: A 1-million-site Measurement and Analysis

Published: 24 October 2016

Abstract

We present the largest and most detailed measurement of online tracking conducted to date, based on a crawl of the top 1 million websites. We make 15 types of measurements on each site, including stateful (cookie-based) and stateless (fingerprinting-based) tracking, the effect of browser privacy tools, and the exchange of tracking data between different sites ("cookie syncing"). Our findings include multiple sophisticated fingerprinting techniques never before measured in the wild. This measurement is made possible by our open-source web privacy measurement tool, OpenWPM, which uses an automated version of a full-fledged consumer browser. It supports parallelism for speed and scale, automatic recovery from failures of the underlying browser, and comprehensive browser instrumentation. We demonstrate our platform's strength in enabling researchers to rapidly detect, quantify, and characterize emerging online tracking behaviors.
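To make the idea of cookie syncing concrete, here is a minimal sketch of one simple heuristic: flag any request in which an identifier-like cookie value set by one host appears in a URL sent to a different host. This is an illustration only, not the detection method used in the paper and not part of OpenWPM's API. The function names, the input shapes, and the 8-character minimum length are all assumptions made for this example.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, or '' if there is none."""
    return (urlsplit(url).hostname or "").lower()


def find_possible_syncs(cookies, request_urls, min_len=8):
    """Flag cookie values that show up in requests to a different host.

    cookies:      iterable of (owner_host, cookie_name, cookie_value)
    request_urls: iterable of URL strings observed during a page visit
    min_len:      hypothetical cutoff; short values such as "en" are
                  too common to be treated as identifiers

    Returns a sorted list of (owner_host, cookie_name, receiver_host).
    Simplification: hosts are compared as plain hostnames (plus a
    subdomain check), not as registrable domains.
    """
    urls = list(request_urls)
    hits = set()
    for owner, name, value in cookies:
        if len(value) < min_len:
            continue
        for url in urls:
            receiver = host_of(url)
            if not receiver:
                continue
            # Skip requests back to the cookie's owner or its subdomains.
            if receiver == owner or receiver.endswith("." + owner):
                continue
            # Plain substring match; real IDs may be encoded or hashed.
            if value in url:
                hits.add((owner, name, receiver))
    return sorted(hits)


if __name__ == "__main__":
    # Made-up example data using reserved .example hostnames.
    cookies = [
        ("tracker-a.example", "uid", "a1b2c3d4e5f6"),
        ("tracker-a.example", "lang", "en"),
    ]
    request_urls = [
        "https://tracker-b.example/sync?partner_uid=a1b2c3d4e5f6",
        "https://tracker-a.example/pixel?uid=a1b2c3d4e5f6",
        "https://cdn.tracker-a.example/x?u=a1b2c3d4e5f6",
    ]
    for owner, name, receiver in find_possible_syncs(cookies, request_urls):
        print(f"{owner} cookie '{name}' seen in request to {receiver}")
```

A plain substring match like this would miss identifiers that are hashed or encoded before being shared, and it can produce false positives on values that are not really identifiers, so treat it as a starting point rather than a measurement method.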

Published In

CCS '16: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security
October 2016
1924 pages
ISBN: 9781450341394
DOI: 10.1145/2976749
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. browser fingerprinting
  2. browser privacy
  3. browser security
  4. device fingerprinting
  5. measurement
  6. online advertising
  7. web measurement
  8. web privacy
  9. web tracking

Qualifiers

  • Research-article

Conference

CCS'16
Acceptance Rates

CCS '16 Paper Acceptance Rate 137 of 831 submissions, 16%;
Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
