
PiPar: Pipeline parallelism for collaborative machine learning

Published: 18 November 2024

Abstract

Collaborative machine learning (CML) techniques, such as federated learning, have been proposed to train deep learning models across multiple mobile devices and a server. CML techniques are privacy preserving because each device shares a locally trained model with the server rather than its raw data. However, CML training is inefficient due to low resource utilization. We identify resources idling on the server and devices, caused by sequential computation and communication, as the principal reason for low resource utilization. A novel framework, PiPar, that leverages pipeline parallelism for CML techniques is developed to substantially improve resource utilization. A new training pipeline is designed to parallelize computation on different hardware resources and communication over different bandwidth resources, thereby accelerating the training process in CML. A low-overhead automated parameter selection method is proposed to optimize the pipeline and maximize the utilization of available resources. The experimental results confirm the validity of the underlying approach of PiPar and highlight that, compared to federated learning: (i) the idle time of the server can be reduced by up to 64.1×, and (ii) the overall training time can be accelerated by up to 34.6× under varying network conditions for a collection of six small and large popular deep neural networks and four datasets, without sacrificing accuracy. It is also experimentally demonstrated that PiPar achieves performance benefits when incorporating differential privacy methods and when operating in environments with heterogeneous devices and changing bandwidths.
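The pipelining idea described in the abstract can be pictured with a minimal sketch. This is not the PiPar implementation; it only illustrates, with hypothetical timings, how handing each finished micro-batch to a background sender lets device computation overlap with communication instead of alternating with it.

```python
# Illustrative sketch only, not the PiPar implementation: overlap
# device-side computation with communication by handing each finished
# micro-batch to a background sender thread. Timings are hypothetical.
import queue
import threading
import time

NUM_MICROBATCHES = 4   # hypothetical number of micro-batches per round
COMPUTE_TIME = 0.05    # stand-in for a forward pass (seconds)
TRANSFER_TIME = 0.05   # stand-in for uploading activations (seconds)

send_queue = queue.Queue()

def device_compute(mb):
    time.sleep(COMPUTE_TIME)        # placeholder for computing micro-batch mb
    return mb

def sender():
    while True:                     # drains the queue concurrently with compute
        mb = send_queue.get()
        if mb is None:              # sentinel: no more micro-batches
            break
        time.sleep(TRANSFER_TIME)   # placeholder for sending activations

start = time.time()
worker = threading.Thread(target=sender)
worker.start()
for mb in range(NUM_MICROBATCHES):
    send_queue.put(device_compute(mb))  # next compute starts while mb is in flight
send_queue.put(None)
worker.join()

pipelined = time.time() - start
sequential = NUM_MICROBATCHES * (COMPUTE_TIME + TRANSFER_TIME)
print(f"pipelined ~{pipelined:.2f}s vs. sequential ~{sequential:.2f}s")
```

With equal compute and transfer times, the overlapped schedule approaches roughly half the sequential time, which is the kind of idle-time reduction the paper targets.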

Highlights

Compute resources are underutilized in collaborative machine learning.
Underutilization leads to idle time and increases overall training time.
Our framework PiPar uses pipeline parallelism to reduce idle time and accelerate training.
PiPar overlaps computation and communication.
PiPar reduces idle time by up to 64.1× and accelerates training by up to 34.6× (illustrated below).
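The idle-time reduction highlighted above depends on keeping both the compute and communication stages busy, and the abstract mentions a low-overhead automated parameter selection method for tuning the pipeline. The paper's method is not reproduced here; the sketch below only illustrates the general idea under assumed profiling numbers and a hypothetical per-micro-batch overhead: estimate the makespan of a two-stage (compute/transfer) pipeline for each candidate micro-batch count and pick the minimum.

```python
# Minimal sketch of pipeline parameter selection (assumptions only, not the
# method proposed in the paper): choose the micro-batch count k that
# minimizes an estimated two-stage pipeline makespan.

PER_MICROBATCH_OVERHEAD = 0.01  # hypothetical fixed cost per micro-batch (s)

def estimate_makespan(k, batch_compute, batch_transfer):
    """Makespan of a two-stage pipeline processing k micro-batches."""
    c = batch_compute / k + PER_MICROBATCH_OVERHEAD   # per-micro-batch compute
    s = batch_transfer / k + PER_MICROBATCH_OVERHEAD  # per-micro-batch transfer
    # The first micro-batch fills the pipeline; afterwards the slower stage
    # paces the remaining k - 1 micro-batches.
    return c + s + (k - 1) * max(c, s)

def select_num_microbatches(batch_compute, batch_transfer, candidates=range(1, 17)):
    return min(candidates,
               key=lambda k: estimate_makespan(k, batch_compute, batch_transfer))

if __name__ == "__main__":
    # e.g. 0.8 s of device compute and 0.6 s of uplink transfer per full batch
    k = select_num_microbatches(0.8, 0.6)
    print(f"selected micro-batch count: {k}")
```

Splitting the batch more finely increases overlap but adds per-micro-batch overhead, so the estimate has a genuine minimum rather than always favouring the largest count.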



Information

Published In

Journal of Parallel and Distributed Computing, Volume 193, Issue C
November 2024
239 pages

Publisher

Academic Press, Inc.

United States

Publication History

Published: 18 November 2024

Author Tags

  1. Collaborative machine learning
  2. Resource utilization
  3. Pipeline parallelism
  4. Edge computing

Qualifiers

  • Research-article
