research-article
Open Access
Cooperative Multi-Agent Reinforcement Learning-Based Co-optimization of Cores, Caches, and On-chip Network
Article No.: 32, Pages 1–25, https://doi.org/10.1145/3132170

Modern multi-core systems provide huge computational capabilities, which can be used to run multiple processes concurrently. To achieve the best possible performance within limited power budgets, the various system resources need to be allocated ...

research-article
Open Access
Bringing Parallel Patterns Out of the Corner: The P³ARSEC Benchmark Suite
Article No.: 33, Pages 1–26, https://doi.org/10.1145/3132710

High-level parallel programming is an active research topic aimed at promoting parallel programming methodologies that provide the programmer with high-level abstractions to develop complex parallel software with reduced time to solution. Pattern-based ...

research-article
Open Access
Cache Exclusivity and Sharing: Theory and Optimization
Article No.: 34, Pages 1–26, https://doi.org/10.1145/3134437

A problem on multicore systems is cache sharing, where the cache occupancy of a program depends on the cache usage of peer programs. Exclusive cache hierarchy as used on AMD processors is an effective solution to allow processor cores to have a large ...

research-article
Open Access
Energy-Efficient Compilation of Irregular Task-Parallel Loops
Article No.: 35, Pages 1–29, https://doi.org/10.1145/3136063

Energy-efficient compilation is an important problem for multi-core systems. In this context, irregular programs with task-parallel loops present interesting challenges: the threads with lesser work-loads (non-critical-threads) wait at the join-points ...

research-article
Open Access
Compiler-Assisted Loop Hardening Against Fault Attacks
Article No.: 36, Pages 1–25, https://doi.org/10.1145/3141234

Secure elements widely used in smartphones, digital consumer electronics, and payment systems are subject to fault attacks. To thwart such attacks, software protections are manually inserted requiring experts and time. The explosion of the Internet of ...

research-article
Open Access
A Transactional Correctness Tool for Abstract Data Types
Article No.: 37, Pages 1–24, https://doi.org/10.1145/3148964

Transactional memory simplifies multiprocessor programming by providing the guarantee that a sequential block of code in the form of a transaction will exhibit atomicity and isolation. Transactional data structures offer the same guarantee to concurrent ...

research-article
Open Access
Power Consumption Models for Multi-Tenant Server Infrastructures
Article No.: 38, Pages 1–22, https://doi.org/10.1145/3148965

Multi-tenant virtualized infrastructures allow cloud providers to minimize costs through workload consolidation. One of the largest costs is power consumption, which is challenging to understand in heterogeneous environments. We propose a power modeling ...

research-article
Open Access
CG-OoO: Energy-Efficient Coarse-Grain Out-of-Order Execution Near In-Order Energy with Near Out-of-Order Performance
Article No.: 39, Pages 1–26, https://doi.org/10.1145/3151034

We introduce the Coarse-Grain Out-of-Order (CG-OoO) general-purpose processor designed to achieve close to In-Order (InO) processor energy while maintaining Out-of-Order (OoO) performance. CG-OoO is an energy-performance-proportional architecture. Block-...

research-article
Open Access
ECS: Error-Correcting Strings for Lifetime Improvements in Nonvolatile Memories
Article No.: 40, Pages 1–29, https://doi.org/10.1145/3151083

Emerging nonvolatile memories (NVMs) suffer from low write endurance, resulting in early cell failures (hard errors), which reduce memory lifetime. It was recognized early on that conventional error-correcting codes (ECCs), which are designed for soft ...

research-article
Open Access
SLOOP: QoS-Supervised Loop Execution to Reduce Energy on Heterogeneous Architectures
Article No.: 41, Pages 1–25, https://doi.org/10.1145/3148053

Most systems allocate computational resources to each executing task without any actual knowledge of the application’s Quality-of-Service (QoS) requirements. Such best-effort policies lead to overprovisioning of the resources and increase energy loss. ...

research-article
Open Access
MBZip: Multiblock Data Compression
Article No.: 42, Pages 1–29, https://doi.org/10.1145/3151033

Compression techniques at the last-level cache and the DRAM play an important role in improving system performance by increasing their effective capacities. A compressed block in DRAM also reduces the transfer time over the memory bus to the caches, ...

research-article
Open Access
Fuse: Accurate Multiplexing of Hardware Performance Counters Across Executions
Article No.: 43, Pages 1–26, https://doi.org/10.1145/3148054

Collecting hardware event counts is essential to understanding program execution behavior. Contemporary systems offer few Performance Monitoring Counters (PMCs), thus only a small fraction of hardware events can be monitored simultaneously. We present ...

research-article
Open Access
Could Compression Be of General Use? Evaluating Memory Compression across Domains
Article No.: 44, Pages 1–24, https://doi.org/10.1145/3138805

Recent proposals present compression as a cost-effective technique to increase cache and memory capacity and bandwidth. While these proposals show potentials of compression, there are several open questions to adopt these proposals in real systems ...

research-article
Open Access
Improving the Efficiency of GPGPU Work-Queue Through Data Awareness
Article No.: 45, Pages 1–22, https://doi.org/10.1145/3151035

The architecture and programming model of current GPGPUs are best suited for applications that are dominated by structured control and data flows across large regular datasets. Parallel workloads with irregular control and data structures cannot easily ...

research-article
Open Access
A Framework for Automated and Controlled Floating-Point Accuracy Reduction in Graphics Applications on GPUs
Article No.: 46, Pages 1–25, https://doi.org/10.1145/3151032

Reducing the precision of floating-point values can improve performance and/or reduce energy expenditure in computer graphics, among other applications. However, reducing the precision level of floating-point values in a controlled fashion needs ...

research-article
Open Access
Generating Fine-Grain Multithreaded Applications Using a Multigrain Approach
Article No.: 47, Pages 1–26, https://doi.org/10.1145/3155288

The recent evolution in hardware landscape, aimed at producing high-performance computing systems capable of reaching extreme-scale performance, has reignited the interest in fine-grain multithreading, particularly at the intranode level. Indeed, ...

research-article
Open Access
CAIRO: A Compiler-Assisted Technique for Enabling Instruction-Level Offloading of Processing-In-Memory
Article No.: 48, Pages 1–25, https://doi.org/10.1145/3155287

Three-dimensional (3D)-stacking technology and the memory-wall problem have popularized processing-in-memory (PIM) concepts again, which offers the benefits of bandwidth and energy savings by offloading computations to functional units inside the ...

research-article
Open Access
Triple Engine Processor (TEP): A Heterogeneous Near-Memory Processor for Diverse Kernel Operations
Article No.: 49, Pages 1–25, https://doi.org/10.1145/3155920

The advent of 3D memory stacking technology, which integrates a logic layer and stacked memories, is expected to be one of the most promising memory technologies to mitigate the memory wall problem by leveraging the concept of near-memory processing (...

research-article
Open Access
ReDirect: Reconfigurable Directories for Multicore Architectures
Article No.: 50, Pages 1–23, https://doi.org/10.1145/3162015

As we enter the dark silicon era, architects should not envision designs in which every transistor remains turned on permanently but rather ones in which portions of the chip are judiciously turned on/off depending on the characteristics of a workload. ...

research-article
Open Access
HAShCache: Heterogeneity-Aware Shared DRAMCache for Integrated Heterogeneous Systems
Article No.: 51, Pages 1–26, https://doi.org/10.1145/3158641

Integrated Heterogeneous System (IHS) processors pack throughput-oriented General-Purpose Graphics Processing Units (GPGPUs) alongside latency-oriented Central Processing Units (CPUs) on the same die sharing certain resources, e.g., shared last-level ...

research-article
Open Access
Optimizing Affine Control With Semantic Factorizations
Article No.: 52, Pages 1–22, https://doi.org/10.1145/3162017

Hardware accelerators generated by polyhedral synthesis techniques make extensive use of affine expressions (affine functions and convex polyhedra) in control and steering logic. Since the control is pipelined, these affine objects must be evaluated at ...

research-article
Open Access
Data-Driven Concurrency for High Performance Computing
Article No.: 53, Pages 1–26, https://doi.org/10.1145/3162014

In this work, we utilize dynamic dataflow/data-driven techniques to improve the performance of high performance computing (HPC) systems. The proposed techniques are implemented and evaluated through an efficient, portable, and robust programming ...

research-article
Open Access
SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded Workloads
Article No.: 54, Pages 1–25, https://doi.org/10.1145/3158643

Shared memory machines continue to increase in scale by adding more parallelism through additional cores and complex memory hierarchies. Often, executing multiple applications concurrently, dividing among them hardware threads, provides greater ...

research-article
Open Access
Optimization of Triangular and Banded Matrix Operations Using 2d-Packed Layouts
Article No.: 55, Pages 1–19, https://doi.org/10.1145/3162016

Over the past few years, multicore systems have become increasingly powerful and thereby very useful in high-performance computing. However, many applications, such as some linear algebra algorithms, still cannot take full advantage of these systems. ...
