research-article

Public Access

Sunlight: Fine-grained Targeting Detection at Scale with Statistical Confidence

Authors:

Daniel HsuAuthors Info & Claims

CCS '15: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security

Pages 554 - 566

https://doi.org/10.1145/2810103.2813614

Published: 12 October 2015 Publication History

PDF eReader

Abstract

We present Sunlight, a system that detects the causes of targeting phenomena on the web -- such as personalized advertisements, recommendations, or content -- at large scale and with solid statistical confidence. Today's web is growing increasingly complex and impenetrable as myriad of services collect, analyze, use, and exchange users' personal information. No one can tell who has what data, for what purposes they are using it, and how those uses affect the users. The few studies that exist reveal problematic effects -- such as discriminatory pricing and advertising -- but they are either too small-scale to generalize or lack formal assessments of confidence in the results, making them difficult to trust or interpret. Sunlight brings a principled and scalable methodology to personal data measurements by adapting well-established methods from statistics for the specific problem of targeting detection. Our methodology formally separates different operations into four key phases: scalable hypothesis generation, interpretable hypothesis formation, statistical significance testing, and multiple testing correction. Each phase bears instantiations from multiple mechanisms from statistics, each making different assumptions and tradeoffs. Sunlight offers a modular design that allows exploration of this vast design space. We explore a portion of this space, thoroughly evaluating the tradeoffs both analytically and experimentally. Our exploration reveals subtle tensions between scalability and confidence. Sunlight's default functioning strikes a balance to provide the first system that can diagnose targeting at fine granularity, at scale, and with solid statistical justification of its results.

We showcase our system by running two measurement studies of targeting on the web, both the largest of their kind. Our studies -- about ad targeting in Gmail and on the web -- reveal statistically justifiable evidence that contradicts two Google statements regarding the lack of targeting on sensitive and prohibited topics.

References

[1]

AdBlockPlus.small https://adblockplus.org/, 2015.

Abstract

References

Cited By

Recommendations

I always feel like somebody's watching me: measuring online behavioural advertising

Should You Use the App for That?: Comparing the Privacy Implications of App- and Web-based Online Services

MyAdChoices: Bringing Transparency and Control to Online Advertising

Comments

Information

Published In

Sponsors

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Funding Sources

Conference

Acceptance Rates

Upcoming Conference

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Login options

Full Access

Share

Share this Publication link

Share on social media

Affiliations