- article, June 2024
Performance Study of Object Tracking with Multiple Kalman Filters in Autonomous Driving Systems
ACM SIGAda Ada Letters (SIGADA), Volume 43, Issue 2, Pages 89–93, https://doi.org/10.1145/3672359.3672374
Object tracking is an important and central aspect of autonomous driving, as it underlies the obstacle detection and avoidance systems of any type of autonomous vehicles. A widely used method for tracking is based on Kalman filters, both for linear and ...
- research-article, March 2024
Integration of RISC-V Page Table Walk in gem5 SE Mode
RAPIDO '24: Proceedings of the 16th Workshop on Rapid Simulation and Performance Evaluation for Design, Pages 22–28, https://doi.org/10.1145/3642921.3642926
gem5 is a popular architectural simulator, for both academic and industrial researchers. It can be used in two configurations: Full System mode and Syscall Emulation mode. The former requires running a real kernel to achieve realistic results, at the ...
- research-article, August 2023
Energy and Performance Improvements for Convolutional Accelerators Using Lightweight Address Translation Support
CF '23: Proceedings of the 20th ACM International Conference on Computing Frontiers, Pages 84–90, https://doi.org/10.1145/3587135.3592208
The growing demand for deep learning applications has led to the design and development of several hardware accelerators to increase performance and energy efficiency. In particular, convolutional accelerators are among those receiving the most attention ...
- review-article, August 2022
A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives
Journal of Systems Architecture: the EUROMICRO Journal (JOSA), Volume 129, Issue C, https://doi.org/10.1016/j.sysarc.2022.102561
In recent years, the limits of the multicore approach emerged in the so-called “dark silicon” issue and diminishing returns of an ever-increasing core count. Hardware manufacturers, out of necessity, switched their focus to ...
- research-article, May 2022
Performance portability in a real world application: PHAST applied to Caffe
International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 36, Issue 3, Pages 419–439, https://doi.org/10.1177/10943420221077107
This work covers the PHAST Library’s employment, a hardware-agnostic programming library, to a real-world application like the Caffe framework. The original implementation of Caffe consists of two different versions of the source code: one to run on CPU ...
- research-article, February 2019
Task-DAG Support in Single-Source PHAST Library: Enabling Flexible Assignment of Tasks to CPUs and GPUs in Heterogeneous Architectures
PMAM'19: Proceedings of the 10th International Workshop on Programming Models and Applications for Multicores and Manycores, Pages 91–100, https://doi.org/10.1145/3303084.3309496
Nowadays, the majority of desktop, mobile, and embedded devices in the consumer and industrial markets are heterogeneous, as they contain at least multi-core CPU and GPU resources in the same system. However, exploiting the performance and energy-...
- short-paper, January 2019
Single-source Library for Enabling Seamless Assignment of Data-parallel Task-DAGs to CPUs and GPUs in Heterogeneous Architectures
PARMA-DITAM 2019: Proceedings of the 10th and 8th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, Article No.: 3, Pages 1–4, https://doi.org/10.1145/3310411.3310416
Currently, the majority of devices is heterogeneous and comprises at least a multi-core CPU and a GPU. Exploiting these modules requires programmers to a) assign parallel activities to the different hardware resources, and b) code each activity through ...
- research-article, January 2019
PHAST - A Portable High-Level Modern C++ Programming Library for GPUs and Multi-Cores
IEEE Transactions on Parallel and Distributed Systems (TPDS), Volume 30, Issue 1, Pages 174–189, https://doi.org/10.1109/TPDS.2018.2855182
A decade after the beginning of the many-core era, multi-core CPU and GPU architectures are everywhere, from mobile devices up to high-performance workstations and servers. To this day, programmers willing to harness their power need to express their code ...
- research-article, March 2018
Scalable Path-Setup Scheme for All-Optical Dynamic Circuit Switched NoCs in Cache Coherent CMPs
ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 14, Issue 1, Article No.: 12, Pages 1–27, https://doi.org/10.1145/3154840
Nanophotonics is a promising solution for on-chip interconnection due to its intrinsic low-latency and low-power features, which can be useful for performance and energy in future Chip Multi-Processors (CMPs). This article proposes a novel arbitrated ...
- Article, August 2014
Simultaneous Optical Path-Setup for Reconfigurable Photonic Networks in Tiled CMPs
HPCC '14: Proceedings of the 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), Pages 482–485, https://doi.org/10.1109/HPCC.2014.80
This paper proposes a tiled chip multiprocessor (CMP) architecture built around an all-optical reconfigurable network, thought to significantly reduce path-setup latency and energy consumption. We propose a novel optical path-setup procedure that is ...
- research-article, June 2014
Design Options for Optical Ring Interconnect in Future Client Devices
ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 10, Issue 4, Article No.: 30, Pages 1–25, https://doi.org/10.1145/2602155
Nanophotonic is a promising solution for on-chip interconnection due to its intrinsic low-latency and low-power features. Future tiled chip multiprocessors (CMPs) for rich client devices can receive energy benefits from this technology but we show that ...
- research-article, March 2014
Assessing the energy break-even point between an optical NoC architecture and an aggressive electronic baseline
DATE '14: Proceedings of the conference on Design, Automation & Test in Europe, Article No.: 308, Pages 1–6
Many crossbenchmarking results reported in the open literature raise optimistic expectations on the use of optical networks-on-chip (ONoCs) for high-performance and low-power on-chip communication. However, most of those previous works ultimately fail to ...
- Article, September 2013
Olympic: A Hierarchical All-Optical Photonic Network for Low-Power Chip Multiprocessors
DSD '13: Proceedings of the 2013 Euromicro Conference on Digital System Design, Pages 56–59, https://doi.org/10.1109/DSD.2013.142
The continuous increase of the number of cores in tiled chip-multi-processors (CMP) will prevent traditional electronic networks on chip (NoC) to maintain an acceptable tradeoff between performance and power consumption. Recent advances in silicon-...
- research-article, June 2013
Co-tuning of a hybrid electronic-optical network for reducing energy consumption in embedded CMPs
MES '13: Proceedings of the First International Workshop on Many-core Embedded Systems, Pages 9–16, https://doi.org/10.1145/2489068.2489070
Nanophotonic is a promising solution for on-chip interconnection due to its intrinsic low-latency and especially low-power features, desirable especially in future chip multiprocessors (CMPs) for rich client devices. In this paper we address the co-...
- research-article, March 2013
Contrasting wavelength-routed optical NoC topologies for power-efficient 3D-stacked multicore processors using physical-layer analysis
Optical networks-on-chip (ONoCs) are currently still in the concept stage, and would benefit from explorative studies capable of bridging the gap between abstract analysis frameworks and the constraints and challenges posed by the physical layer. This ...
- Article, September 2012
A Simple On-Chip Optical Interconnection for Improving Performance of Coherency Traffic in CMPs
DSD '12: Proceedings of the 2012 15th Euromicro Conference on Digital System Design, Pages 312–318, https://doi.org/10.1109/DSD.2012.13
Nanophotonic interconnection is a promising solution for inter-core communication in future chip multiprocessors (CMPs). Main benefits derive from its intrinsic low-latency and high-bandwidth, especially when employing wavelength division multiplexing (...
- Article, April 2011
Link-time optimization for power efficiency in a tagless instruction cache
The instruction cache is a critical component in any microprocessor. It must have high performance to enable fetching of instructions on every cycle. However, current designs waste a large amount of energy on each access as tags and data banks from all ...
- chapter, January 2011
- Article, January 2011
Eighth MEDEA Workshop
Proceedings of the 2011 conference on Transactions on High-Performance Embedded Architectures and Compilers III - Volume 6590, Pages 91–92, https://doi.org/10.1007/978-3-642-19448-1_5
It is our pleasure to welcome you to this special section of Transactions on High-Performance Embedded Architectures and Compilers HiPEAC, presenting selected papers from the 2007 edition of Medea Workshop. This workshop, held in conjunction with the ...
- Article, October 2010
Feedback-Driven Restructuring of Multi-threaded Applications for NUCA Cache Performance in CMPs
SBAC-PAD '10: Proceedings of the 2010 22nd International Symposium on Computer Architecture and High Performance Computing, Pages 87–94, https://doi.org/10.1109/SBAC-PAD.2010.20
This paper addresses feedback-directed restructuring techniques tuned to Non Uniform Cache Architectures (NUCA) in CMPs running multi-threaded applications. Access time to NUCA caches depends on the location of the referred block, so the locality and ...