27th HPCA 2021: Seoul, South Korea
- IEEE International Symposium on High-Performance Computer Architecture, HPCA 2021, Seoul, South Korea, February 27 - March 3, 2021. IEEE 2021, ISBN 978-1-6654-2235-2
Security Architectures
- Seonjin Na, Sunho Lee, Yeonjae Kim, Jongse Park, Jaehyuk Huh:
Common Counters: Compressed Encryption Counters for Secure GPU Memory. 1-13
- Dingyuan Cao, Mingzhe Zhang, Hang Lu, Xiaochun Ye, Dongrui Fan, Yuezhi Che, Rujia Wang:
Streamline Ring ORAM Accesses through Spatial and Temporal Optimization. 14-25
- Brandon Reagen, Wooseok Choi, Yeongil Ko, Vincent T. Lee, Hsien-Hsin S. Lee, Gu-Yeon Wei, David Brooks:
Cheetah: Optimizing and Accelerating Homomorphic Encryption for Private Inference. 26-39
- Zecheng He, Guangyuan Hu, Ruby B. Lee:
New Models for Understanding and Reasoning about Speculative Execution Attacks. 40-53
Accelerators for Machine Learning 1
- Sean Kinzer, Joon Kyung Kim, Soroush Ghodrati, Brahmendra Reddy Yatham, Alric Althoff, Divya Mahajan, Sorin Lerner, Hadi Esmaeilzadeh:
A Computational Stack for Cross-Domain Acceleration. 54-70
- Hyoukjun Kwon, Liangzhen Lai, Michael Pellauer, Tushar Krishna, Yu-Hsin Chen, Vikas Chandra:
Heterogeneous Dataflow Accelerators for Multi-DNN Workloads. 71-83
- Reza Hojabr, Ali Sedaghati, Amirali Sharifian, Ahmad Khonsari, Arrvindh Shriraman:
SPAGHETTI: Streaming Accelerators for Highly Sparse GEMM on FPGAs. 84-96
- Hanrui Wang, Zhekai Zhang, Song Han:
SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning. 97-110
Storage Systems
- Mohammad A. Alshboul, Prakash Ramrakhyani, William Wang, James Tuck, Yan Solihin:
BBB: Simplifying Persistent Programming using Battery-Backed Buffers. 111-124
- Per Ekemark, Yuan Yao, Alberto Ros, Konstantinos Sagonas, Stefanos Kaxiras:
TSOPER: Efficient Coherence-Based Strict Persistency. 125-138
- Mazen Al-Wadi, Vamsee Reddy Kommareddy, Clayton Hughes, Simon David Hammond, Amro Awad:
Stealth-Persist: Architectural Support for Persistent Applications in Hybrid Memory Systems. 139-152
Quantum Computing
- Xin-Chuan Wu, Dripto M. Debroy, Yongshan Ding, Jonathan M. Baker, Yuri Alexeev, Kenneth R. Brown, Frederic T. Chong:
TILT: Achieving Higher Fidelity on a Trapped-Ion Linear-Tape Quantum Computing Architecture. 153-166
- Lei Liu, Xinglei Dou:
QuCloud: A New Qubit Mapping Mechanism for Multi-programming Quantum Computing in Cloud Environment. 167-178
- Ji Liu, Huiyang Zhou:
Systematic Approaches for Precise and Approximate Quantum State Runtime Assertion. 179-193
- Aneeqa Fatima, Igor L. Markov:
Faster Schrödinger-style simulation of quantum circuits. 194-207
Systems for Machine Learning 1
- Sung-En Chang, Yanyu Li, Mengshu Sun, Runbin Shi, Hayden K. H. So, Xuehai Qian, Yanzhi Wang, Xue Lin:
Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework. 208-220
- Mohsen Imani, Zhuowen Zou, Samuel Bosch, Sanjay Anantha Rao, Sahand Salamat, Venkatesh Kumar, Yeseong Kim, Tajana Rosing:
Revisiting HyperDimensional Learning for FPGA and Low-Power Architectures. 221-234
- Youngeun Kwon, Yunjae Lee, Minsoo Rhu:
Tensor Casting: Co-Designing Algorithm-Architecture for Personalized Recommendation Training. 235-248
- Heesu Kim, Hanmin Park, Taehyun Kim, Kwanheum Cho, Eojin Lee, Soojung Ryu, Hyuk-Jae Lee, Kiyoung Choi, Jinho Lee:
GradPIM: A Practical Processing-in-DRAM Architecture for Gradient Descent. 249-262
Cache Design
- Christina Giannoula, Nandita Vijaykumar, Nikela Papadopoulou, Vasileios Karakostas, Ivan Fernandez, Juan Gómez-Luna, Lois Orosa, Nectarios Koziris, Georgios I. Goumas, Onur Mutlu:
SynCron: Efficient Synchronization Support for Near-Data-Processing Architectures. 263-276
- Mainak Chaudhuri:
Zero Directory Eviction Victim: Unbounded Coherence Directory and Core Cache Isolation. 277-290
- Subhash Sethumurugan, Jieming Yin, John Sartori:
Designing a Cost-Effective Cache Replacement Policy using Machine Learning. 291-303
- Antonio Franques, Apostolos Kokolis, Sergi Abadal, Vimuth Fernando, Sasa Misailovic, Josep Torrellas:
WiDir: A Wireless-Enabled Directory Cache Coherence Protocol. 304-317
Security Attacks
- Zhihui Shao, Mohammad A. Islam, Shaolei Ren:
Heat Behind the Meter: A Hidden Threat of Thermal Attacks in Edge Colocation Data Centers. 318-331
- Jaeguk Ahn, Cheolgyu Jin, Jiho Kim, Minsoo Rhu, Yunsi Fei, David R. Kaeli, John Kim:
Trident: A Hybrid Correlation-Collision GPU Cache Timing Attack for AES Key Recovery. 332-344
- Abdullah Giray Yaglikçi, Minesh Patel, Jeremie S. Kim, Roknoddin Azizi, Ataberk Olgun, Lois Orosa, Hasan Hassan, Jisung Park, Konstantinos Kanellopoulos, Taha Shahroodi, Saugata Ghose, Onur Mutlu:
BlockHammer: Preventing RowHammer at Low Cost by Blacklisting Rapidly-Accessed DRAM Rows. 345-358
- Jianming Huang, Yu Hua:
A Write-Friendly and Fast-Recovery Scheme for Security Metadata in Non-Volatile Memories. 359-370
Hardware Accelerators Beyond Machine Learning
- Yu Zhang, Xiaofei Liao, Hai Jin, Ligang He, Bingsheng He, Haikun Liu, Lin Gu:
DepGraph: A Dependency-Driven Accelerator for Efficient Iterative Graph Processing. 371-384
- Yifan Yuan, Yipeng Wang, Ren Wang, Rangeen Basu Roy Chowdhury, Charlie Tai, Nam Sung Kim:
QEI: Query Acceleration Can be Generic and Efficient in the Cloud. 385-398
- Lei Jiang, Farzaneh Zokaee:
EXMA: A Genomics Accelerator for Exact-Matching. 399-411
- Christopher Torng, Peitian Pan, Yanghui Ou, Cheng Tan, Christopher Batten:
Ultra-Elastic CGRAs for Irregular Loop Specialization. 412-425
Memory and Storage Architectures
- Chun-Yi Liu, Yunju Lee, Wonil Choi, Myoungsoo Jung, Mahmut Taylan Kandemir, Chita R. Das:
GSSA: A Resource Allocation Scheme Customized for 3D NAND SSDs. 426-439
- Ananth Krishna Prasad, Morteza Rezaalipour, Masoud Dehyadegari, Mahdi Nazm Bojnordi:
Memristive Data Ranking. 440-452
- Vamsee Reddy Kommareddy, Clayton Hughes, Simon David Hammond, Amro Awad:
DeACT: Architecture-Aware Virtual Memory Support for Fabric Attached Memory Systems. 453-466
High Throughput Architectures
- Mohamed Assem Ibrahim, Onur Kayiran, Yasuko Eckert, Gabriel H. Loh, Adwait Jog:
Analyzing and Leveraging Decoupled L1 Caches in GPUs. 467-478
- Tsung Tai Yeh, Matthew D. Sinclair, Bradford M. Beckmann, Timothy G. Rogers:
Deadline-Aware Offloading for High-Throughput Accelerators. 479-492
- Yujeong Choi, Yunseong Kim, Minsoo Rhu:
Lazy Batching: An SLA-aware Batching System for Cloud Machine Learning Inference. 493-506
- Chandrashis Mazumdar, Prachatos Mitra, Arkaprava Basu:
Dead Page and Dead Block Predictors: Cleaning TLBs and Caches Together. 507-519
Power Efficiency and Resiliency
- Sam Ainsworth, Lionel Zoubritzky, Alan Mycroft, Timothy M. Jones:
ParaDox: Eliminating Voltage Margins via Heterogeneous Fault Tolerance. 520-532
- Jian Chen, Xiaowei Jiang, Ying Zhang, Liyin Liu, Huifeng Xu, Qiang Liu:
CARE: Coordinated Augmentation for Elastic Resilience on DRAM Errors in Data Centers. 533-544
- Erick Carvajal Barboza, Sara Jacob, Mahesh Ketkar, Michael Kishinevsky, Paul Gratz, Jiang Hu:
Automatic Microprocessor Performance Bug Detection. 545-556
- Helena Caminal, Kailin Yang, Srivatsa Srinivasa, Akshay Krishna Ramanathan, Khalid Al-Hawaj, Tianshu Wu, Vijaykrishnan Narayanan, Christopher Batten, José F. Martínez:
CAPE: A Content-Addressable Processing Engine. 557-569
Systems for Machine Learning 2
- Xinfeng Xie, Zheng Liang, Peng Gu, Abanti Basak, Lei Deng, Ling Liang, Xing Hu, Yuan Xie:
SpaceA: Sparse Matrix Vector Multiplication on Processing-in-Memory Accelerator. 570-583
- Young H. Oh, Seonghak Kim, Yunho Jin, Sam Son, Jonghyun Bae, Jongsung Lee, Yeonhong Park, Dong Uk Kim, Tae Jun Ham, Jae W. Lee:
Layerweaver: Maximizing Resource Utilization of Neural Processing Units via Layer-Wise Scheduling. 584-597
- Jie Ren, Jiaolin Luo, Kai Wu, Minjia Zhang, Hyeran Jeon, Dong Li:
Sentinel: Efficient Tensor Migration and Allocation on Heterogeneous Memory Systems for Deep Learning. 598-611
- Jiajun Li, Ahmed Louri, Avinash Karanth, Razvan C. Bunescu:
CSCNN: Algorithm-hardware Co-design for CNN Accelerators using Centrosymmetric Filters. 612-625
Best Paper Nominees
- B. Pratheek, Neha Jawalkar, Arkaprava Basu:
Improving GPU Multi-tenancy with Page Walk Stealing. 626-639
- Zhengrong Wang, Jian Weng, Jason Lowe-Power, Jayesh Gaur, Tony Nowatzki:
Stream Floating: Enabling Proactive and Decentralized Cache Optimizations. 640-653
- Nishil Talati, Kyle May, Armand Behroozi, Yichen Yang, Kuba Kaszyk, Christos Vasiladiotis, Tarunesh Verma, Lu Li, Brandon Nguyen, Jiawen Sun, John Magnus Morton, Agreen Ahmadi, Todd M. Austin, Michael F. P. O'Boyle, Scott A. Mahlke, Trevor N. Mudge, Ronald G. Dreslinski:
Prodigy: Improving the Memory Latency of Data-Indirect Irregular Workloads Using Hardware-Software Co-Design. 654-667
- Vignesh Balaji, Neal Clayton Crago, Aamer Jaleel, Brandon Lucia:
P-OPT: Practical Optimal Cache Replacement for Graph Analytics. 668-681
Network on Chip
- Hossein Farrokhbakht, Henry Kao, Kamran Hasan, Paul V. Gratz, Tushar Krishna, Joshua San Miguel, Natalie D. Enright Jerger:
Pitstop: Enabling a Virtual Network Free Network-on-Chip. 682-695
- Gyuyoung Kwauk, Seungkwan Kang, Hans Kasan, Hyojun Son, John Kim:
BoomGate: Deadlock Avoidance in Non-Minimal Routing for High-Radix Networks. 696-708
- Xiaowei Ren, Mieszko Lis:
CHOPIN: Scalable Graphics Rendering in Multi-GPU Systems via Parallel Image Composition. 709-722
- Hao Zheng, Ke Wang, Ahmed Louri:
Adapt-NoC: A Flexible Network-on-Chip Design for Heterogeneous Manycore Architectures. 723-735
Emerging Technologies and Applications
- Chencheng Ye, Yuanchao Xu, Xipeng Shen, Xiaofei Liao, Hai Jin, Yan Solihin:
Hardware-Based Address-Centric Acceleration of Key-Value Store. 736-748
- Richard Afoakwa, Yiqiao Zhang, Uday Kumar Reddy Vengalam, Zeljko Ignjatovic, Michael C. Huang:
BRIM: Bistable Resistively-Coupled Ising Machine. 749-760
- Ben Feinberg, Ryan Wong, T. Patrick Xiao, Christopher H. Bennett, Jacob N. Rohan, Erik G. Boman, Matthew J. Marinella, Sapan Agarwal, Engin Ipek:
An Analog Preconditioner for Solving Linear Systems. 761-774
- Jiajun Li, Ahmed Louri, Avinash Karanth, Razvan C. Bunescu:
GCNAX: A Flexible and Energy-efficient Accelerator for Graph Convolutional Neural Networks. 775-788
Industry Track 1
- Heng Liao, Jiajin Tu, Jing Xia, Hu Liu, Xiping Zhou, Honghui Yuan, Yuxing Hu:
Ascend: a Scalable and Unified Architecture for Ubiquitous Deep Neural Network Computing: Industry Track Paper. 789-801
- Bilge Acun, Matthew Murphy, Xiaodong Wang, Jade Nie, Carole-Jean Wu, Kim M. Hazelwood:
Understanding Training Efficiency of Deep Learning Recommendation Models at Scale. 802-814
- Ying Zhang, Jian Chen, Xiaowei Jiang, Qiang Liu, Ian M. Steiner, Andrew J. Herdrich, Kevin Shu, Ripan Das, Long Cui, Litrin Jiang:
LIBRA: Clearing the Cloud Through Dynamic Memory Bandwidth Management. 815-826
- Yiming Gan, Bo Yu, Boyuan Tian, Leimeng Xu, Wei Hu, Shaoshan Liu, Qiang Liu, Yanjun Zhang, Jie Tang, Yuhao Zhu:
Eudoxus: Characterizing and Accelerating Localization in Autonomous Machines Industry Track Paper. 827-840
Industry Track 2
- Tianqi Tang, Sheng Li, Lifeng Nai, Norman P. Jouppi, Yuan Xie:
NeuroMeter: An Integrated Power, Area, and Timing Modeling Framework for Machine Learning Accelerators Industry Track Paper. 841-853
- Udit Gupta, Young Geun Kim, Sylvia Lee, Jordan Tse, Hsien-Hsin S. Lee, Gu-Yeon Wei, David Brooks, Carole-Jean Wu:
Chasing Carbon: The Elusive Environmental Footprint of Computing. 854-867
- Oreste Villa, Daniel Lustig, Zi Yan, Evgeny Bolotin, Yaosheng Fu, Niladrish Chatterjee, Nan Jiang, David W. Nellans:
Need for Speed: Experiences Building a Trustworthy System-Level GPU Simulator. 868-880
- Rohan Basu Roy, Tirthak Patel, Raj Kettimuthu, William E. Allcock, Paul Rich, Adam Scovel, Devesh Tiwari:
Operating Liquid-Cooled Large-Scale Systems: Long-Term Monitoring, Reliability Analysis, and Efficiency Measures. 881-893
Best of CAL
Accelerators for Machine Learning 2
- Jianxun Yang, Zhao Zhang, Zhuangzhi Liu, Jing Zhou, Leibo Liu, Shaojun Wei, Shouyi Yin:
FuseKNA: Fused Kernel Convolution based Accelerator for Deep Neural Networks. 894-907
- Bahar Asgari, Ramyad Hadidi, Jiashen Cao, Da Eun Shim, Sung Kyu Lim, Hyesoon Kim:
FAFNIR: Accelerating Sparse Gathering by Using Efficient Near-Memory Intelligent Reduction. 908-920
- Julian Pavon, Iván Vargas Valdivieso, Adrián Barredo, Joan Marimon, Miquel Moretó, Francesc Moll, Osman S. Unsal, Mateo Valero, Adrián Cristal:
VIA: A Smart Scratchpad for Vector Units with Application to Sparse Matrix Computations. 921-934