Proceedings of the 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing

Article

Cover Art

Page C1 https://doi.org/10.1109/SBAC-PAD.2014.54

Article

Title Page i

Page i https://doi.org/10.1109/SBAC-PAD.2014.1

Article

Title Page iii

Page iii https://doi.org/10.1109/SBAC-PAD.2014.2

Article

Copyright Page

Page iv https://doi.org/10.1109/SBAC-PAD.2014.3

Article

Message from the General Chairs

Page x https://doi.org/10.1109/SBAC-PAD.2014.4

Article

Message from the Program and Track Chairs

Pages xi–xiii https://doi.org/10.1109/SBAC-PAD.2014.5

Article

Keynotes

Pages xviii–xxii https://doi.org/10.1109/SBAC-PAD.2014.7

Article

A Case Study of Hybrid Dataflow and Shared-Memory Programming Models: Dependency-Based Parallel Game Engine

Pages 1–8 https://doi.org/10.1109/SBAC-PAD.2014.21

Recently proposed hybrid dataflow and shared memory programming models combine these two underlying models in order to support a wider range of problems naturally. The effectiveness of such hybrid models for parallel implementations of dense and sparse ...

Article

Wait-Free Global Virtual Time Computation in Shared Memory TimeWarp Systems

Pages 9–16 https://doi.org/10.1109/SBAC-PAD.2014.38

Global Virtual Time (GVT) is a powerful abstraction used to discriminate which events belong (and which do not belong) to the past history of a parallel/distributed computation. For high performance simulation systems based on the Time Warp ...

Article

Compact Hash Tables for High-Performance Traffic Classification on Multi-core Processors

Pages 17–24 https://doi.org/10.1109/SBAC-PAD.2014.32

Traffic classification is one of the kernel applications in network management. Many Machine Learning (ML) traffic classification algorithms are based on decision trees. While most of the existing implementations of decision trees are hardware-based, a ...

Article

Flying Memcache: Lessons Learned from Different Acceleration Strategies

Pages 25–32 https://doi.org/10.1109/SBAC-PAD.2014.17

Distributed, always-in-memory key-value stores are employed by large and demanding services, such as Facebook and Amazon. It is apparent that generic implementations of such caches cannot meet the needs of every application; therefore further research ...

Article

Improving an MPI Application-Level Migration Approach through Checkpoint File Splitting

Pages 33–40 https://doi.org/10.1109/SBAC-PAD.2014.25

Traditionally used for load balancing, process migration has been gaining popularity in the fault tolerance context. Recently, checkpoint-based migration has been proposed to implement failure avoidance in MPI applications through the proactive ...

Article

HPCG: Preliminary Evaluation and Optimization on Tianhe-2 CPU-only Nodes

Pages 41–48 https://doi.org/10.1109/SBAC-PAD.2014.10

HPCG has become a new metric for the design and ranking of HPC systems. By incorporating a local symmetric Gauss-Seidel preconditioner, HPCG implements the Conjugate Gradient method to solve a sparse linear system. HPCG performs poorly with irregular memory ...

Article

Leveraging Optimization Methods for Dynamically Assisted Control-Flow Integrity Mechanisms

Pages 49–56 https://doi.org/10.1109/SBAC-PAD.2014.35

Dynamic Binary Modification (DBM) tools are useful for cross-platform execution of binaries and are powerful runtime environments that allow execution optimizations, instrumentation and profiling. These tools have also been used as enablers for control-...

Article

Energy Efficient Seismic Wave Propagation Simulation on a Low-Power Manycore Processor

Pages 57–64 https://doi.org/10.1109/SBAC-PAD.2014.28

Large-scale simulation of seismic wave propagation is an active research topic. Its high demand for processing power makes it a good match for High Performance Computing (HPC). Although we have observed a steady increase in the processing capabilities ...

Article

Performance-Aware Task Management and Frequency Scaling in Embedded Systems

Pages 65–72 https://doi.org/10.1109/SBAC-PAD.2014.14

Due to the dissemination of smartphones and tablets, a constant complexity growth can be observed for both embedded systems and mobile applications. However, this results in an increase in energy consumption. To guarantee longer battery life cycles, it ...

Article

Analyzing Performance Improvements and Energy Savings in Infiniband Architecture using Network Compression

Pages 73–80 https://doi.org/10.1109/SBAC-PAD.2014.27

One of the greatest challenges in HPC is total system power and energy consumption. Whereas HPC interconnects have traditionally been designed with a focus on bandwidth and latency, there is an increasing interest in minimising the interconnect's energy ...

Article

Bit-Parallel Approximate Pattern Matching on the Xeon Phi Coprocessor

Pages 81–88 https://doi.org/10.1109/SBAC-PAD.2014.37

Bit-parallel pattern matching encodes calculated values in bit arrays. This approach gains its efficiency by performing multiple updates within a machine word. An important parameter is therefore the machine word size (e.g. 32 or 64 bits). With the ...

Article

Efficient Execution of Microscopy Image Analysis on CPU, GPU, and MIC Equipped Cluster Systems

Pages 89–96 https://doi.org/10.1109/SBAC-PAD.2014.15

High performance computing is experiencing a major paradigm shift with the introduction of accelerators, such as graphics processing units (GPUs) and Intel Xeon Phi (MIC). These processors have made tremendous computing power available at low cost, ...

Article

High-Performance Traffic Classification on GPU

Pages 97–104 https://doi.org/10.1109/SBAC-PAD.2014.48

Traffic classification is an essential task in network management. Recently, there has been a new trend in exploring Graphics Processing Units (GPUs) for network applications. These applications typically do not perform floating point operations and ...

Article

Accelerating Curvature Estimate in 3D Seismic Data Using GPGPU

Pages 105–111 https://doi.org/10.1109/SBAC-PAD.2014.11

Seismic interpretation is a vital step in the oil and gas industry. Choosing proper drilling locations is a major challenge for interpreters, since an ultra-deep water oil well located below 2500 meters of water can cost tens of millions of dollars. ...

Article

Leveraging OmpSs to Exploit Hardware Accelerators

Pages 112–119 https://doi.org/10.1109/SBAC-PAD.2014.26

CUDA and OpenCL are the most widely used programming models to exploit hardware accelerators. Both programming models provide a C-based programming language to write accelerator kernels and a host API used to glue the host and kernel parts. Although ...

Article

Improving Signature Behavior by Irrevocability in Transactional Memory Systems

Pages 120–127 https://doi.org/10.1109/SBAC-PAD.2014.41

Signatures have been proposed in Hardware Transactional Memory (HTM) to represent read and write sets of transactions and decouple transaction conflict detection from private caches. Generally, signatures are implemented as Bloom filters that allow ...

Article

Scalability Analysis of Signatures in Transactional Memory Systems

Pages 128–135 https://doi.org/10.1109/SBAC-PAD.2014.40

Signatures have been proposed in transactional memory systems to represent read and write sets and to decouple transaction conflict detection from private caches or to accelerate it. Generally, signatures are implemented as Bloom filters that allow ...

Article

Profiling Patterns of Bit Flipping for Software Transactional Memories

Pages 136–143 https://doi.org/10.1109/SBAC-PAD.2014.51

Software Transactional Memory (STM) is a synchronization method proposed as an alternative to lock-based synchronization. It provides a higher-level abstraction that is easier to program, and that enables software composition. Transactions are defined by ...

Article

Multi-dimensional Evaluation of Haswell's Transactional Memory Performance

Pages 144–151 https://doi.org/10.1109/SBAC-PAD.2014.33

This paper presents an extensive performance study of the implementation of Hardware Transactional Memory (HTM) in the Haswell generation of Intel x86 core processors. This study evaluates the strengths and weaknesses of this new architecture exploring ...

Article

DeTrans: Deterministic and Parallel execution of Transactions

Pages 152–159 https://doi.org/10.1109/SBAC-PAD.2014.20

Deterministic execution of a multithreaded application guarantees the same output as long as the application runs with the same input parameters. Determinism helps a programmer to test and debug an application and to provide fault-tolerance in the ...

Article

Design Space Exploration of Memory Model for Heterogeneous Computing

Pages 160–167 https://doi.org/10.1109/SBAC-PAD.2014.9

Heterogeneous computing that combines a traditional CPU architecture with an accelerator has become a popular architecture. Memory modelling design decisions affect not only architecture designs but also programming models. Hence, comparing them is very ...

Article

Runtime Support for Adaptive Spatial Partitioning and Inter-Kernel Communication on GPUs

Pages 168–175 https://doi.org/10.1109/SBAC-PAD.2014.43

GPUs have gained tremendous popularity in a broad range of application domains. These applications possess varying grains of parallelism and place high demands on compute resources--often imposing real-time constraints, requiring flexible work ...

Article

Automatic Generation of Custom Parallel Processors for Morphological Image Processing

Pages 176–181 https://doi.org/10.1109/SBAC-PAD.2014.47

Image processing applications are well established in modern society, presenting continuous advances and challenges. One of their fundamental techniques is morphological image processing, a nonlinear branch of image processing, which has high performance ...
