Cited By
View all- Lin CXu TYang ZWang RHuang RLi MDe V(2024)FastQuery: Communication-efficient Embedding Table Query for Private LLMs inferenceProceedings of the 61st ACM/IEEE Design Automation Conference10.1145/3649329.3657374(1-6)Online publication date: 23-Jun-2024
- Pang QZhu JMöllering HZheng WSchneider T(2024)BOLT: Privacy-Preserving, Accurate and Efficient Inference for Transformers2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00130(4753-4771)Online publication date: 19-May-2024
- Hu PSun LHu CDai LGuo SYu M(2024)DReP: Deep ReLU pruning for fast private inferenceJournal of Systems Architecture10.1016/j.sysarc.2024.103156152(103156)Online publication date: Jul-2024
- Show More Cited By