research-article
Open Access
Critical Data Backup with Hybrid Flash-Based Consumer Devices
Article No.: 1, Pages 1–23, https://doi.org/10.1145/3631529

Hybrid flash-based storage constructed with high-density and low-cost flash memory has become increasingly popular in consumer devices in the last decade due to its low cost. However, its poor reliability is one of the major concerns. To protect critical ...

research-article
Open Access
DAG-Order: An Order-Based Dynamic DAG Scheduling for Real-Time Networks-on-Chip
Article No.: 2, Pages 1–24, https://doi.org/10.1145/3631527

With the high-performance requirement of safety-critical real-time tasks, the platforms of many-core processors with high parallelism are widely utilized, where network-on-chip (NoC) is generally employed for inter-core communication due to its ...

research-article
Open Access
JiuJITsu: Removing Gadgets with Safe Register Allocation for JIT Code Generation
Article No.: 3, Pages 1–26, https://doi.org/10.1145/3631526

Code-reuse attacks have the capability to craft malicious instructions from small code fragments, commonly referred to as “gadgets.” These gadgets are generated by JIT (Just-In-Time) engines as integral components of native instructions, with the ...

research-article
Open Access
Autovesk: Automatic Vectorized Code Generation from Unstructured Static Kernels Using Graph Transformations
Article No.: 4, Pages 1–25, https://doi.org/10.1145/3631709

Leveraging the SIMD capability of modern CPU architectures is mandatory to take full advantage of their increased performance. To exploit this capability, binary executables must be vectorized, either manually by developers or automatically by a tool. For ...

research-article
Open Access
Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs
Article No.: 5, Pages 1–26, https://doi.org/10.1145/3632956

Low-precision computation has emerged as one of the most effective techniques for accelerating convolutional neural networks and has garnered widespread support on modern hardware. Despite its effectiveness in accelerating convolutional neural networks, ...

research-article
Open Access
QoS-pro: A QoS-enhanced Transaction Processing Framework for Shared SSDs
Article No.: 6, Pages 1–25, https://doi.org/10.1145/3632955

Solid State Drives (SSDs) are widely used in data-intensive scenarios due to their high performance and decreasing cost. However, in shared environments, concurrent workloads can interfere with each other, leading to a violation of Quality of Service (QoS)...

research-article
Open Access
SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs
Article No.: 7, Pages 1–26, https://doi.org/10.1145/3632957

Deep Neural Networks (DNNs) have achieved great progress in academia and industry. However, they have become computationally and memory intensive as network depth increases. Previous designs seek breakthroughs in software and hardware levels to mitigate ...

research-article
Open Access
Efficient Cross-platform Multiplexing of Hardware Performance Counters via Adaptive Grouping
Article No.: 8, Pages 1–26, https://doi.org/10.1145/3629525

Collecting sufficient microarchitecture performance data is essential for performance evaluation and workload characterization. There are many events to be monitored in a modern processor while only a few hardware performance monitoring counters (PMCs) ...

research-article
Open Access
QuCloud+: A Holistic Qubit Mapping Scheme for Single/Multi-programming on 2D/3D NISQ Quantum Computers
Article No.: 9, Pages 1–27, https://doi.org/10.1145/3631525

Qubit mapping for NISQ superconducting quantum computers is essential to fidelity and resource utilization. The existing qubit mapping schemes meet challenges, e.g., crosstalk, SWAP overheads, diverse device topologies, etc., leading to qubit resource ...

research-article
Open Access
Abakus: Accelerating k-mer Counting with Storage Technology
Article No.: 10, Pages 1–26, https://doi.org/10.1145/3632952

This work seeks to leverage Processing-with-storage-technology (PWST) to accelerate a key bioinformatics kernel called k-mer counting, which involves processing large files of sequence data on the disk to build a histogram of fixed-size genome sequence ...

research-article
Open Access
ISP Agent: A Generalized In-storage-processing Workload Offloading Framework by Providing Multiple Optimization Opportunities
Article No.: 11, Pages 1–24, https://doi.org/10.1145/3632951

As solid-state drives (SSDs) with sufficient computing power have recently become the dominant devices in modern computer systems, in-storage processing (ISP), which processes data within the storage without transferring it to the host memory, is being ...

research-article
Open Access
COWS for High Performance: Cost Aware Work Stealing for Irregular Parallel Loop
Article No.: 12, Pages 1–26, https://doi.org/10.1145/3633331

Parallel libraries such as OpenMP distribute the iterations of parallel-for-loops among the threads, using a programmer-specified scheduling policy. While the existing scheduling policies perform reasonably well in the context of balanced workloads, in ...

research-article
Open Access
Hardware-hardened Sandbox Enclaves for Trusted Serverless Computing
Article No.: 13, Pages 1–25, https://doi.org/10.1145/3632954

In cloud-based serverless computing, an application consists of multiple functions provided by mutually distrusting parties. For secure serverless computing, the hardware-based trusted execution environment (TEE) can provide strong isolation among ...

research-article
Open Access
Fine-grain Quantitative Analysis of Demand Paging in Unified Virtual Memory
Article No.: 14, Pages 1–24, https://doi.org/10.1145/3632953

The abstraction of a shared memory space over separate CPU and GPU memory domains has eased the burden of portability for many HPC codebases. However, users pay for ease of use provided by system-managed memory with a moderate-to-high performance ...

research-article
Open Access
Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL
Article No.: 15, Pages 1–26, https://doi.org/10.1145/3634916

Memory disaggregation is a promising architecture for modern datacenters that separates compute and memory resources into independent pools connected by ultra-fast networks, which can improve memory utilization, reduce cost, and enable elastic scaling of ...

research-article
Open Access
WA-Zone: Wear-Aware Zone Management Optimization for LSM-Tree on ZNS SSDs
Article No.: 16, Pages 1–23, https://doi.org/10.1145/3637488

ZNS SSDs divide the storage space into sequential-write zones, reducing costs of DRAM utilization, garbage collection, and over-provisioning. The sequential-write feature of zones is well-suited for LSM-based databases, where random writes are organized ...

research-article
Open Access
Improving Utilization of Dataflow Unit for Multi-Batch Processing
Article No.: 17, Pages 1–26, https://doi.org/10.1145/3637906

Dataflow architectures can achieve much better performance and higher efficiency than general-purpose cores, approaching the performance of a specialized design while retaining programmability. However, advanced application scenarios place higher demands ...

research-article
Open Access
Extension VM: Interleaved Data Layout in Vector Memory
Article No.: 18, Pages 1–23, https://doi.org/10.1145/3631528

Vector architecture is widely employed in processors for neural networks, signal processing, and high-performance computing; however, its performance is limited by inefficient column-major memory access. The column-major access limitation originates ...

research-article
Open Access
ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis
Article No.: 19, Pages 1–29, https://doi.org/10.1145/3632950

Profile hidden Markov models (pHMMs) are widely employed in various bioinformatics applications to identify similarities between biological sequences, such as DNA or protein sequences. In pHMMs, sequences are represented as graph structures, where states ...

research-article
Open Access
Exploring Data Layout for Sparse Tensor Times Dense Matrix on GPUs
Article No.: 20, Pages 1–20, https://doi.org/10.1145/3633462

An important sparse tensor computation is sparse-tensor-dense-matrix multiplication (SpTM), which is used in tensor decomposition and applications. SpTM is a multi-dimensional analog to sparse-matrix-dense-matrix multiplication (SpMM). In this article, we ...
