Author: Huang, Randy : Search



4 Results for: Author: Huang, Randy

Searched The ACM Guide to Computing Literature (3,801,559 records) | Limit your search to The ACM Full-Text Collection (771,255 records)

Showing 1 - 4 of 4 Results


research-article
Open Access
November 2024
Distributed Training of Large Language Models on AWS Trainium
SoCC '24: Proceedings of the 2024 ACM Symposium on Cloud Computing, Pages 961–976, https://doi.org/10.1145/3698038.3698535

Large language models (LLMs) are ubiquitously powerful but prohibitively expensive to train, often requiring thousands of compute devices, typically GPUs. To reduce the cost of training LLMs for customers, Amazon Web Services (AWS) launched the Amazon ...
Metrics
Total Citations: 0
Total Downloads: 149
Last 12 Months: 149
Last 6 weeks: 149
research-article
November 2024
Artifacts Available / v1.1
Counterfactual Explanation Analytics: Empowering Lay Users to Take Action Against Consequential Automated Decisions
Proceedings of the VLDB Endowment (PVLDB), Volume 17, Issue 12, Pages 4349–4352, https://doi.org/10.14778/3685800.3685872

Machine learning is routinely used to automate consequential decisions about users in domains such as finance and healthcare, raising concerns of transparency and recourse for negative outcomes. Existing Explainable AI techniques generate a static ...
Metrics
Total Citations: 0
Total Downloads: 11
Last 12 Months: 11
Last 6 weeks: 11
research-article
February 2017
Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?
FPGA '17: Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Pages 5–14, https://doi.org/10.1145/3020078.3021740

Current-generation Deep Neural Networks (DNNs), such as AlexNet and VGG, rely heavily on dense floating-point matrix multiplication (GEMM), which maps well to GPUs (regular parallelism, high TFLOP/s). Because of this, GPUs are widely used for ...
Metrics
Total Citations: 318
Total Downloads: 6,232
Last 12 Months: 243
Last 6 weeks: 24
invited-talk
August 2016
Dissecting Xeon + FPGA: Why the integration of CPUs and FPGAs makes a power difference for the datacenter: Invited Paper
- Herman Schmit,
- Randy Huang
ISLPED '16: Proceedings of the 2016 International Symposium on Low Power Electronics and Design, Pages 152–153, https://doi.org/10.1145/2934583.2953983

Intel's Xeon roadmap includes package-integrated FPGAs in every new generation. In this talk, we will dissect why this is such a powerful combination at this time of great change in datacenter workloads. We will show how power savings within the CPU ...
Metrics
Total Citations: 5
Total Downloads: 273
Last 12 Months: 5
Last 6 weeks: 1