Cited By
- Lin Z, Sun N, Bhattacharya P, Feng X, Feng L, Owens J (2025). Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms. IEEE Transactions on Parallel and Distributed Systems 36:2 (226-238). DOI: 10.1109/TPDS.2024.3507814. Online publication date: Feb-2025
- Wang Z, Wang Y, Feng B, Huang G, Mudigere D, Muthiah B, Li A, Ding Y, Bagchi S, Zhang Y (2024). OPER. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference (667-682). DOI: 10.5555/3691992.3692033. Online publication date: 10-Jul-2024
- Wang Q, Lan T, Tang Y, Sang B, Huang Z, Du Y, Zhang H, Sha J, Lu H, Zhou Y, Zhang K, Tang M (2024). DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud. Proceedings of the VLDB Endowment 17:12 (4130-4144). DOI: 10.14778/3685800.3685832. Online publication date: 8-Nov-2024