Volume 21, Issue 3, September 2024
research-article
Open Access
Cross-core Data Sharing for Energy-efficient GPUs
Article No.: 42, Pages 1–32, https://doi.org/10.1145/3653019

Graphics Processing Units (GPUs) are the accelerator of choice in a variety of application domains, because they can accelerate massively parallel workloads and can be easily programmed using general-purpose programming frameworks such as CUDA and OpenCL. ...

research-article
Open Access
ReSA: Reconfigurable Systolic Array for Multiple Tiny DNN Tensors
Article No.: 43, Pages 1–24, https://doi.org/10.1145/3653363

Systolic array architecture has significantly accelerated deep neural networks (DNNs). A systolic array comprises multiple processing elements (PEs) that can perform multiply-accumulate (MAC). Traditionally, the systolic array can execute a certain amount ...

research-article
Open Access
An Example of Parallel Merkle Tree Traversal: Post-Quantum Leighton-Micali Signature on the GPU
Article No.: 44, Pages 1–25, https://doi.org/10.1145/3659209

The hash-based signature (HBS) is the most conservative and time-consuming among many post-quantum cryptography (PQC) algorithms. Two HBSs, LMS and XMSS, are the only PQC algorithms standardised by the National Institute of Standards and Technology (NIST) ...

research-article
Open Access
Knowledge-Augmented Mutation-Based Bug Localization for Hardware Design Code
Article No.: 45, Pages 1–26, https://doi.org/10.1145/3660526

Verification of hardware design code is crucial for the quality assurance of hardware products. Being an indispensable part of verification, localizing bugs in the hardware design code is significant for hardware development but is often regarded as a ...

research-article
Open Access
D2Comp: Efficient Offload of LSM-tree Compaction with Data Processing Units on Disaggregated Storage
Article No.: 46, Pages 1–22, https://doi.org/10.1145/3656584

LSM-based key-value stores suffer from sub-optimal performance due to their slow and heavy background compactions. The compaction brings severe CPU and network overhead on high-speed disaggregated storage. This article further reveals that data-intensive ...

research-article
Open Access
iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments
Article No.: 47, Pages 1–24, https://doi.org/10.1145/3653302

This article proposes iSwap, a new memory page swap mechanism that reduces the ineffective I/O swap operations and improves the QoS for applications with a high priority in cloud environments. iSwap works in the OS kernel. iSwap accurately learns the ...

research-article
Open Access
GraphSER: Distance-Aware Stream-Based Edge Repartition for Many-Core Systems
Article No.: 48, Pages 1–25, https://doi.org/10.1145/3661998

With the explosive growth of graph data, distributed graph processing has become popular, and many graph hardware accelerators use distributed frameworks. Graph partitioning is the foundation of distributed graph processing. However, dynamic changes in graph ...

research-article
Open Access
COER: A Network Interface Offloading Architecture for RDMA and Congestion Control Protocol Codesign
Article No.: 49, Pages 1–26, https://doi.org/10.1145/3660525

RDMA (Remote Direct Memory Access) networks require efficient congestion control to maintain their high throughput and low latency characteristics. However, congestion control protocols deployed at the software layer suffer from slow response times due to ...

research-article
Open Access
Intermediate Address Space: virtual memory optimization of heterogeneous architectures for cache-resident workloads
Article No.: 50, Pages 1–23, https://doi.org/10.1145/3659207

The increasing demand for computing power and the emergence of heterogeneous computing architectures have driven the exploration of innovative techniques to address current limitations in both the compute and memory subsystems. One such solution is the ...

research-article
Open Access
CoolDC: A Cost-Effective Immersion-Cooled Datacenter with Workload-Aware Temperature Scaling
Article No.: 51, Pages 1–27, https://doi.org/10.1145/3664925

For datacenter architects, it is the most important goal to minimize the datacenter’s total cost of ownership for the target performance (i.e., TCO/performance). As the major component of a datacenter is a server farm, the most effective way of reducing ...

research-article
Open Access
Stripe-schedule Aware Repair in Erasure-coded Clusters with Heterogeneous Star Networks
Article No.: 52, Pages 1–24, https://doi.org/10.1145/3664926

More and more storage systems use erasure code to tolerate faults. It takes pieces of data blocks as input and encodes a small number of parity blocks as output, where these blocks form a stripe. When reconsidering the recovery problem in the multi-stripe ...

research-article
Open Access
Fixed-point Encoding and Architecture Exploration for Residue Number Systems
Article No.: 53, Pages 1–27, https://doi.org/10.1145/3664923

Residue Number Systems (RNS) demonstrate the fascinating potential to serve integer addition/multiplication-intensive applications. The complexity of Artificial Intelligence (AI) models has grown enormously in recent years. From a computer system’s ...

research-article
Open Access
Optimization of Sparse Matrix Computation for Algebraic Multigrid on GPUs
Article No.: 54, Pages 1–27, https://doi.org/10.1145/3664924

AMG is one of the most efficient and widely used methods for solving sparse linear systems. The computational process of AMG mainly consists of a series of iterative calculations of generalized sparse matrix-matrix multiplication (SpGEMM) and sparse ...

research-article
Open Access
Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access
Article No.: 55, Pages 1–28, https://doi.org/10.1145/3663479

The growing memory demands of modern applications have driven the adoption of far memory technologies in data centers to provide cost-effective, high-capacity memory solutions. However, far memory presents new performance challenges because its access ...

research-article
Open Access
SAL: Optimizing the Dataflow of Spin-based Architectures for Lightweight Neural Networks
Article No.: 56, Pages 1–27, https://doi.org/10.1145/3673654

As the Convolutional Neural Network (CNN) goes deeper and more complex, the network becomes memory-intensive and computation-intensive. To address this issue, the lightweight neural network reduces parameters and Multiplication-and-Accumulation (MAC) ...

research-article
Open Access
Scythe: A Low-latency RDMA-enabled Distributed Transaction System for Disaggregated Memory
Article No.: 57, Pages 1–26, https://doi.org/10.1145/3666004

Disaggregated memory separates compute and memory resources into independent pools connected by RDMA (Remote Direct Memory Access) networks, which can improve memory utilization, reduce cost, and enable elastic scaling of compute and memory resources. ...

research-article
Open Access
Lavender: An Efficient Resource Partitioning Framework for Large-Scale Job Colocation
Article No.: 58, Pages 1–23, https://doi.org/10.1145/3674736

Workload consolidation is a widely used approach to enhance resource utilization in modern data centers. However, the concurrent execution of multiple jobs on a shared server introduces contention for essential shared resources such as CPU cores, Last ...

research-article
Open Access
Achieving Tunable Erasure Coding with Cluster-Aware Redundancy Transitioning
Article No.: 59, Pages 1–24, https://doi.org/10.1145/3672077

Erasure coding has been demonstrated as a storage-efficient means against failures, yet its tunability remains a challenging issue in data centers, which is prone to induce substantial cross-cluster traffic. In this article, we present ClusterRT, a ...

research-article
Open Access
Sectored DRAM: A Practical Energy-Efficient and High-Performance Fine-Grained DRAM Architecture
Article No.: 60, Pages 1–29, https://doi.org/10.1145/3673653

Modern computing systems access data in main memory at coarse granularity (e.g., at 512-bit cache block granularity). Coarse-grained access leads to wasted energy because the system does not use all individually accessed small portions (e.g., words, each ...

research-article
Open Access
ReIPE: Recycling Idle PEs in CNN Accelerator for Vulnerable Filters Soft-Error Detection
Article No.: 61, Pages 1–26, https://doi.org/10.1145/3674909

To satisfy prohibitively massive computational requirements of current deep Convolutional Neural Networks (CNNs), CNN-specific accelerators are widely deployed in large-scale systems. Caused by high-energy neutrons and α-particle strikes, soft error may ...

research-article
Open Access
Characterizing and Optimizing LDPC Performance on 3D NAND Flash Memories
Article No.: 62, Pages 1–26, https://doi.org/10.1145/3663478

With the development of NAND flash memories’ bit density and stacking technologies, while storage capacity keeps increasing, the issue of reliability becomes increasingly prominent. Low-density parity check (LDPC) code, as a robust error-correcting code, ...

research-article
Open Access
ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators
Article No.: 63, Pages 1–26, https://doi.org/10.1145/3659208

ReRAM-based Processing-In-Memory (PIM) architectures have been increasingly explored to accelerate various Deep Neural Network (DNN) applications because they can achieve extremely high performance and energy-efficiency for in-situ analog Matrix-Vector ...

research-article
Open Access
Time-Aware Spectrum-Based Bug Localization for Hardware Design Code with Data Purification
Article No.: 64, Pages 1–25, https://doi.org/10.1145/3678009

The verification of hardware design code is a critical aspect in ensuring the quality and reliability of hardware products. Finding bugs in hardware design code is important for hardware development and is frequently considered as a notoriously ...
