
FL-Enhance: A federated learning framework for balancing non-IID data with augmented and shared compressed samples

Published: 01 October 2023

Abstract

Federated Learning (FL), which enables multiple clients to cooperatively train global models without revealing private data, has gained significant attention from researchers in recent years. However, the data samples on each participating device in FL are often not independent and identically distributed (IID), leading to significant statistical heterogeneity challenges. In this paper, we propose FL-Enhance, a novel framework that addresses the non-IID data issue in FL by leveraging established solutions such as data selection, data compression, and data augmentation. Specifically, FL-Enhance utilizes conditional GANs (cGANs) trained locally at the server level, a relatively novel approach within the FL framework, and applies data compression techniques to preserve privacy when data are shared between clients and the server. We compare our framework with the commonly used SMOTE data augmentation technique and test it with different FL algorithms, including FedAvg, FedNova, and FedOpt. We conducted experiments on both image and tabular data to evaluate the effectiveness of the proposed framework. The experimental findings show that FL-Enhance can substantially improve the performance of the trained models in the presence of severely pathological clients while still preserving privacy, the fundamental requirement in the FL context.
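The framework is described here only at a high level, but the round structure the abstract implies — balance each client's skewed shard with synthetic samples from a conditional generator, compress whatever is shared, then aggregate local models with FedAvg — can be sketched compactly. The following Python sketch is illustrative only: sample_cgan is a hypothetical generator callable, uniform quantization merely stands in for whatever compression scheme the paper uses, and none of the names or shapes come from the authors' implementation.

import numpy as np

def fedavg_aggregate(client_params, client_sizes):
    # FedAvg: dataset-size-weighted average of per-client parameter lists.
    total = float(sum(client_sizes))
    return [
        sum(p[k] * (n / total) for p, n in zip(client_params, client_sizes))
        for k in range(len(client_params[0]))
    ]

def quantize(x, bits=8):
    # Toy lossy compression: uniform quantization of samples before sharing.
    lo, hi = float(x.min()), float(x.max())
    levels = (2 ** bits) - 1
    q = np.round((x - lo) / (hi - lo + 1e-12) * levels)
    return q / levels * (hi - lo) + lo

def balance_client(X, y, n_classes, sample_cgan):
    # Top up under-represented classes with compressed synthetic samples
    # drawn from an assumed conditional generator (sample_cgan is hypothetical).
    counts = np.bincount(y, minlength=n_classes)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c in range(n_classes):
        missing = int(target - counts[c])
        if missing > 0:
            synth = sample_cgan(label=c, n=missing)
            X_parts.append(quantize(synth))
            y_parts.append(np.full(missing, c))
    return np.concatenate(X_parts), np.concatenate(y_parts)

In one round, each client would call balance_client on its local shard, train on the balanced data, and send its updated parameters to the server, which combines them with fedavg_aggregate.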

Highlights

We introduce FL-Enhance, a new data-selection-based method to handle pathological non-IID data in FL (a toy sketch of such a pathological split follows this list).
We experiment on various datasets and compare the FedAvg, FedNova, and FedOpt algorithms.
We are among the first to integrate cGANs into FL, particularly for tabular datasets.
FL-Enhance provides better privacy protection than baseline FL methods.
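To make "pathological non-IID" and the SMOTE baseline concrete, the short Python sketch below constructs a single badly skewed client (all samples of one class, a small slice of another) from scikit-learn's digits dataset and rebalances it locally with imbalanced-learn's SMOTE. The dataset, skew ratio, and helper name are illustrative assumptions, not the experimental setup used in the paper.

import numpy as np
from sklearn.datasets import load_digits
from imblearn.over_sampling import SMOTE

def skewed_client(X, y, major, minor, minor_frac=0.1, seed=0):
    # One pathological client: every sample of class `major`,
    # but only a small fraction of class `minor` (severe label skew).
    rng = np.random.default_rng(seed)
    maj_idx = np.where(y == major)[0]
    min_idx = np.where(y == minor)[0]
    n_keep = max(6, int(minor_frac * len(min_idx)))  # SMOTE needs a few neighbours
    keep = rng.choice(min_idx, n_keep, replace=False)
    idx = np.concatenate([maj_idx, keep])
    return X[idx], y[idx]

X, y = load_digits(return_X_y=True)
X_c, y_c = skewed_client(X, y, major=0, minor=1)

# Baseline augmentation: oversample the minority class locally with SMOTE.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_c, y_c)
print(dict(zip(*np.unique(y_c, return_counts=True))),
      "->", dict(zip(*np.unique(y_bal, return_counts=True))))

Note that SMOTE can only interpolate within classes the client already holds, which is why purely local oversampling is a natural baseline to compare against augmentation with shared samples.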


Cited By

  • (2024) Fault diagnosis based on federated learning driven by dynamic expansion for model layers of imbalanced client. Expert Systems with Applications 238:PD. https://doi.org/10.1016/j.eswa.2023.121982. Online publication date: 15-Mar-2024.
  • (2024) A hierarchical federated learning framework for collaborative quality defect inspection in construction. Engineering Applications of Artificial Intelligence 133:PC. https://doi.org/10.1016/j.engappai.2024.108218. Online publication date: 1-Jul-2024.
  • (2024) Differentially private federated learning with non-IID data. Computing 106(7), 2459–2488. https://doi.org/10.1007/s00607-024-01257-2. Online publication date: 1-Jul-2024.


Information

Published In

Information Fusion, Volume 98, Issue C
Oct 2023
286 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 01 October 2023

Author Tags

  1. Federated learning
  2. Non-IID data
  3. Data fusion
  4. Model fusion

Qualifiers

  • Research-article
