DOI: 10.1145/2566486.2568038
research-article

Monitoring web browsing behavior with differential privacy

Published: 07 April 2014

Abstract

Monitoring web browsing behavior has benefited many data mining applications, such as top-K discovery and anomaly detection. However, releasing private user data to the public raises privacy concerns among web users, especially after the AOL search log release, in which the data was not correctly anonymized. In this paper, we adopt differential privacy, a strong, provable privacy definition, and show that differentially private aggregates of web browsing activities can be released in real time while preserving the utility of the shared data. Our proposed algorithms exploit the rich correlation in the time series of aggregated data and adopt a state-space approach to estimate the underlying true aggregates from the values perturbed by the differential privacy mechanism. We evaluate our algorithms on real-world web browsing data. Utility evaluations with three metrics demonstrate that the quality of the private data released by our solutions closely resembles that of the original, unperturbed aggregates.
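The abstract does not spell out the algorithms, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's method. It assumes that each user changes a per-step count by at most 1 (sensitivity 1), that the privacy budget epsilon is split evenly across all time steps, and that the state-space estimator is a scalar Kalman filter with a random-walk model in which the Laplace noise is approximated by a Gaussian of the same variance. The names `laplace_perturb`, `kalman_estimate`, and `process_var` are hypothetical.

```python
import random


def laplace_perturb(counts, epsilon, rng):
    """Perturb each count with Laplace noise (assumed sensitivity 1 per step).

    The total budget epsilon is split evenly over len(counts) steps, so each
    step uses epsilon / T and the Laplace scale is T / epsilon.
    """
    scale = len(counts) / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noisy = [
        c + rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
        for c in counts
    ]
    return noisy, scale


def kalman_estimate(noisy, scale, process_var):
    """Scalar Kalman filter over the noisy series (assumed random-walk model).

    process_var is an assumed tuning parameter: the variance of the change in
    the true aggregate between consecutive time steps.
    """
    meas_var = 2.0 * scale ** 2  # variance of Laplace noise with this scale
    x, p = noisy[0], meas_var    # initialise from the first noisy value
    estimates = [x]
    for z in noisy[1:]:
        p_prior = p + process_var              # predict step
        gain = p_prior / (p_prior + meas_var)  # Kalman gain, between 0 and 1
        x = x + gain * (z - x)                 # correct with the noisy value
        p = (1.0 - gain) * p_prior
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    rng = random.Random(0)
    true_counts = [1000 + 5 * t for t in range(50)]  # made-up toy series
    noisy, scale = laplace_perturb(true_counts, epsilon=1.0, rng=rng)
    released = kalman_estimate(noisy, scale, process_var=25.0)
    print(released[:5])
```

Because the filter only reads the already perturbed values, it is post-processing and consumes no additional privacy budget; the choice of `process_var` trades smoothness against responsiveness to real changes in the series.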

Published In

WWW '14: Proceedings of the 23rd international conference on World wide web
April 2014
926 pages
ISBN: 9781450327442
DOI: 10.1145/2566486

Sponsors

  • IW3C2: International World Wide Web Conference Committee

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. differential privacy
  2. web mining
  3. web monitoring

Acceptance Rates

WWW '14 paper acceptance rate: 84 of 645 submissions (13%)
Overall acceptance rate: 1,899 of 8,196 submissions (23%)

Cited By

  • (2024) GEES: Enabling Location Privacy-Preserving Energy Saving in Multi-Access Edge Computing. Proceedings of the ACM Web Conference 2024, pages 2735-2746. DOI: 10.1145/3589334.3645329. Online publication date: 13-May-2024
  • (2024) Learning Markov Chain Models from Sequential Data Under Local Differential Privacy. Computer Security – ESORICS 2023, pages 359-379. DOI: 10.1007/978-3-031-51476-0_18. Online publication date: 11-Jan-2024
  • (2023) Private Web Search Using Proxy-Query Based Query Obfuscation Scheme. IEEE Access, 11, pages 3607-3625. DOI: 10.1109/ACCESS.2023.3235000. Online publication date: 2023
  • (2022) Detecting Anomalous LAN Activities under Differential Privacy. Security and Communication Networks, 2022. DOI: 10.1155/2022/1403200. Online publication date: 1-Jan-2022
  • (2022) Privacy at a Glance: A Process to Learn Modular Privacy Icons During Web Browsing. Proceedings of the 2022 Conference on Human Information Interaction and Retrieval, pages 102-112. DOI: 10.1145/3498366.3505813. Online publication date: 14-Mar-2022
  • (2022) Monitoring Smart Home Traffic under Differential Privacy. NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, pages 1-10. DOI: 10.1109/NOMS54207.2022.9789817. Online publication date: 25-Apr-2022
  • (2022) Hide and Seek: Revisiting DNS-based User Tracking. 2022 IEEE 7th European Symposium on Security and Privacy (EuroS&P), pages 188-205. DOI: 10.1109/EuroSP53844.2022.00020. Online publication date: Jun-2022
  • (2021) Differentially Private Web Browsing Trajectory over Infinite Streams. Security and Communication Networks, 2021. DOI: 10.1155/2021/9968905. Online publication date: 1-Jan-2021
  • (2021) DPCrowd: Privacy-Preserving and Communication-Efficient Decentralized Statistical Estimation for Real-Time Crowdsourced Data. IEEE Internet of Things Journal, 8:4, pages 2775-2791. DOI: 10.1109/JIOT.2020.3020089. Online publication date: 15-Feb-2021
  • (2021) Releasing ARP Data with Differential Privacy Guarantees For LAN Anomaly Detection. 2021 18th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON), pages 404-408. DOI: 10.1109/ECTI-CON51831.2021.9454785. Online publication date: 19-May-2021
