DOI:
10.5555/2707681
SBAC-PAD '14: Proceedings of the 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing
2014 Proceeding
Publisher:
  • IEEE Computer Society
  • 1730 Massachusetts Ave., NW Washington, DC
  • United States
Conference:
October 22 - 24, 2014
ISBN:
978-1-4799-6905-0
Published:
22 October 2014

Abstract

No abstract available.

Keynotes
Article
A Case Study of Hybrid Dataflow and Shared-Memory Programming Models: Dependency-Based Parallel Game Engine

Recently proposed hybrid dataflow and shared memory programming models combine these two underlying models in order to support a wider range of problems naturally. The effectiveness of such hybrid models for parallel implementations of dense and sparse ...

Article
Wait-Free Global Virtual Time Computation in Shared Memory TimeWarp Systems

Global Virtual Time (GVT) is a powerful abstraction used to discriminate what events belong (and what do not belong) to the past history of a parallel/distributed computation. For high performance simulation systems based on the Time Warp ...

Article
Compact Hash Tables for High-Performance Traffic Classification on Multi-core Processors

Traffic classification is one of the kernel applications in network management. Many Machine Learning (ML) traffic classification algorithms are based on decision-trees. While most of the existing implementations of decision-trees are hardware-based, a ...

Article
Flying Memcache: Lessons Learned from Different Acceleration Strategies

Distributed key-value and always-in-memory stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application, therefore further research ...

Article
Improving an MPI Application-Level Migration Approach through Checkpoint File Splitting

Traditionally used for load balancing, process migration has been gaining popularity in the fault tolerance context. Recently, checkpoint-based migration has been proposed to implement failure avoidance in MPI applications through the proactive ...

Article
HPCG: Preliminary Evaluation and Optimization on Tianhe-2 CPU-only Nodes

HPCG has become a new metric for the design and ranking of HPC. By incorporating a local symmetric Gauss-Seidel preconditioner, HPCG implements the Conjugate Gradient method to solve a sparse linear system. HPCG performs poorly with irregular memory ...

Article
Leveraging Optimization Methods for Dynamically Assisted Control-Flow Integrity Mechanisms

Dynamic Binary Modification (DBM) tools are useful for cross-platform execution of binaries and are powerful run time environments that allow execution optimizations, instrumentation and profiling. These tools have also been used as enablers for control-...

Article
Energy Efficient Seismic Wave Propagation Simulation on a Low-Power Manycore Processor

Large-scale simulation of seismic wave propagation is an active research topic. Its high demand for processing power makes it a good match for High Performance Computing (HPC). Although we have observed a steady increase in the processing capabilities ...

Article
Performance-Aware Task Management and Frequency Scaling in Embedded Systems

Due to the dissemination of smartphones and tablets, a constant complexity growth can be observed for both embedded systems and mobile applications. However, this results in an increase in energy consumption. To guarantee longer battery life cycles, it ...

Article
Analyzing Performance Improvements and Energy Savings in Infiniband Architecture using Network Compression

One of the greatest challenges in HPC is total system power and energy consumption. Whereas HPC interconnects have traditionally been designed with a focus on bandwidth and latency, there is an increasing interest in minimising the interconnect's energy ...

Article
Bit-Parallel Approximate Pattern Matching on the Xeon Phi Coprocessor

Bit-parallel pattern matching encodes calculated values in bit arrays. This approach gains its efficiency by performing multiple updates within a machine word. An important parameter is therefore the machine word size (e.g. 32 or 64 bits). With the ...

Article
Efficient Execution of Microscopy Image Analysis on CPU, GPU, and MIC Equipped Cluster Systems

High performance computing is experiencing a major paradigm shift with the introduction of accelerators, such as graphics processing units (GPUs) and Intel Xeon Phi (MIC). These processors have made available a tremendous computing power at low cost, ...

Article
High-Performance Traffic Classification on GPU

Traffic classification is an essential task in network management. Recently, there has been a new trend in exploring Graphics Processing Units (GPUs) for network applications. These applications typically do not perform floating point operations and ...

Article
Accelerating Curvature Estimate in 3D Seismic Data Using GPGPU

Seismic interpretation is a vital step in the oil and gas industry. Choosing proper drilling locations is a major challenge to the interpreters, since an ultra-deep water oil well located below 2500 meters of water can cost dozens of millions of dollars. ...

Article
Leveraging OmpSs to Exploit Hardware Accelerators

CUDA and OpenCL are the most widely used programming models to exploit hardware accelerators. Both programming models provide a C-based programming language to write accelerator kernels and a host API used to glue the host and kernel parts. Although ...

Article
Improving Signature Behavior by Irrevocability in Transactional Memory Systems

Signatures have been proposed in Hardware Transactional Memory (HTM) to represent read and write sets of transactions and decouple transaction conflict detection from private caches. Generally, signatures are implemented as Bloom filters that allow ...

Article
Scalability Analysis of Signatures in Transactional Memory Systems

Signatures have been proposed in transactional memory systems to represent read and write sets and to decouple transaction conflict detection from private caches or to accelerate it. Generally, signatures are implemented as Bloom filters that allow ...

Article
Profiling Patterns of Bit Flipping for Software Transactional Memories

Software Transactional Memory (STM) is a synchronization method proposed as an alternative to lock-based synchronization. It provides a higher-level abstraction that is easier to program, and that enables software composition. Transactions are defined by ...

Article
Multi-dimensional Evaluation of Haswell's Transactional Memory Performance

This paper presents an extensive performance study of the implementation of Hardware Transactional Memory (HTM) in the Haswell generation of Intel x86 core processors. This study evaluates the strengths and weaknesses of this new architecture exploring ...

Article
DeTrans: Deterministic and Parallel execution of Transactions

Deterministic execution of a multithreaded application guarantees the same output as long as the application runs with the same input parameters. Determinism helps a programmer to test and debug an application and to provide fault-tolerance in the ...

Article
Design Space Exploration of Memory Model for Heterogeneous Computing

Heterogeneous computing that combines a traditional CPU architecture with an accelerator has become a popular architecture. Memory modelling design decisions affect not only architecture designs but also programming models. Hence, comparing them is very ...

Article
Runtime Support for Adaptive Spatial Partitioning and Inter-Kernel Communication on GPUs

GPUs have gained tremendous popularity in a broad range of application domains. These applications possess varying grains of parallelism and place high demands on compute resources--many times imposing real-time constraints, requiring flexible work ...

Article
Automatic Generation of Custom Parallel Processors for Morphological Image Processing

Image processing applications are well established in modern society, presenting continuous advances and challenges. One of its fundamental techniques is morphological image processing, a nonlinear branch in image processing, which has high performance ...
