research-article

Privacy-Preserving Tensor Factorization for Collaborative Health Data Analysis

Authors:

Xiaoqian Jiang

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

Pages 1291 - 1300

https://doi.org/10.1145/3357384.3357878

Published: 03 November 2019

Abstract

Tensor factorization has been demonstrated as an efficient approach for computational phenotyping, where massive electronic health records (EHRs) are converted to concise and meaningful clinical concepts. While distributing the tensor factorization tasks to local sites can avoid direct data sharing, it still requires the exchange of intermediary results, which could reveal sensitive patient information. Therefore, the challenge is how to jointly decompose the tensor under rigorous and principled privacy constraints while still supporting the model's interpretability. We propose DPFact, a privacy-preserving collaborative tensor factorization method for computational phenotyping using EHRs. It combines advanced privacy-preserving mechanisms with collaborative learning. Hospitals can keep their EHR databases private while still collaboratively learning meaningful clinical concepts by sharing differentially private intermediary results. Moreover, DPFact handles heterogeneous patient populations using a structured sparsity term. In our framework, each hospital decomposes its local tensors and, every several iterations, sends the updated intermediary results with output perturbation to a semi-trusted server, which generates the phenotypes. The evaluation on both real-world and synthetic datasets demonstrated that, under strict privacy constraints, our method is more accurate and communication-efficient than state-of-the-art baseline methods.
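
To make the communication pattern in the abstract concrete, the following is a minimal sketch of output perturbation followed by server-side aggregation. It is not the DPFact algorithm itself: the function names, the column-wise clipping step, the use of Gaussian noise, the plain averaging at the server, and the values of `max_norm` and `sigma` are all illustrative assumptions, and calibrating `sigma` to a formal privacy budget is not shown.

```python
import numpy as np

def clip_columns(factor, max_norm):
    """Scale each column so its Euclidean norm is at most max_norm."""
    norms = np.linalg.norm(factor, axis=0, keepdims=True)
    scale = np.minimum(1.0, max_norm / np.maximum(norms, 1e-12))
    return factor * scale

def perturb_output(factor, max_norm, sigma, rng):
    """Clip the factor, then add i.i.d. Gaussian noise (output perturbation).

    Assumption: sigma is chosen elsewhere to match a privacy budget.
    """
    clipped = clip_columns(factor, max_norm)
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)

def server_aggregate(noisy_factors):
    """Average the perturbed factors received from all sites (assumed rule)."""
    return np.mean(np.stack(noisy_factors), axis=0)

rng = np.random.default_rng(0)
# Three hypothetical hospitals, each holding a 5 x 2 feature-mode factor matrix.
local_factors = [np.abs(rng.normal(size=(5, 2))) for _ in range(3)]
# Each site shares only a perturbed copy of its factor, never the raw one.
shared = [perturb_output(f, max_norm=1.0, sigma=0.1, rng=rng) for f in local_factors]
global_factor = server_aggregate(shared)
print(global_factor.shape)  # (5, 2)
```

In this sketch the server only ever sees the clipped, noisy matrices, which mirrors the idea that sites exchange differentially private intermediary results rather than patient-level data.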

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management

November 2019, 3373 pages

ISBN: 9781450369763

DOI: 10.1145/3357384

General Chairs: Wenwu Zhu (Tsinghua University, China), Dacheng Tao (University of Massachusetts, USA), Xueqi Cheng (Institute of Computing Technology, CAS, China)

Program Chairs: Peng Cui (Tsinghua University, China), Elke Rundensteiner (Worcester Polytechnic Institute, USA), David Carmel (Amazon Research, USA), Qi He (LinkedIn, USA), Jeffrey Xu Yu (Chinese University of Hong Kong, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

CIKM '19: The 28th ACM International Conference on Information and Knowledge Management

November 3 - 7, 2019

Beijing, China

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
