Extensible Embedded Hardware Description Languages with Compilation, Simulation and Verification
Typical hardware description languages, such as Verilog and VHDL, are low-level declarative languages with little room for flexibility. Extending, verifying, or reinterpreting programs in these languages is typically done with external tools and at ...
Breaking Boundaries: Optimizing FPGA CAD with Flexible and Multi-threaded Re-Clustering
Packing and Placement are critical stages in the FPGA backend computer-aided design (CAD) flow that significantly impact the Quality-of-Results (QoR) of design implementation. Most existing approaches either compromise on generality by focusing on ...
base2: An IR for Binary Numeral Types
Custom data types and arbitrary-precision arithmetic are often key for efficient hardware designs on Field Programmable Gate Array (FPGA) platforms. Current end-to-end flows incorporating quantization are not only domain-specific, but also tightly ...
Mutation Tree Reconstruction of Tumor Cells on FPGAs Using a Bit-Level Matrix Representation
Reconstructing the mutation history of a cancer cell in the form of a phylogenetic tree from noisy genome sequencing data requires a likelihood model and a search or optimization strategy. The “Single Cell Inference of Tumor Evolution” (SCITE) software ...
FPGA-based detector with SiC sensing for real-time monitoring of muon beams: A preliminary report of the SCIBER-1 system in COMET Phase-α
- Yoon Jongkwan
- Yoshiki Yamaguchi
- Yowichi Fujita
- Yoshinori Fukao
- Eitaro Hamada
- Tetsuichi Kishishita
- Youichi Igarashi
- Masayoshi Shoji
- Kazuki Ueno
This article reports the development of a muon monitoring system that utilizes SiC p+n junction diodes as the primary sensing element, integrated with FPGAs for signal processing and control. The system will be designed to detect muons, which are ...
Efficient FPGA Implementation of Amoeba-inspired SAT Solver with Feedback and Bounceback Control: Harnessing Variable-Level Parallelism for Large-Scale Problem Solving in Edge Computing
The Boolean satisfiability problem (SAT), an NP-complete problem, poses significant challenges for conventional general-purpose computers due to its inherent “combinatorial explosion” nature. Fast SAT solvers, however, offer immense potential for smart ...
Quantitative study of floating-point precision on modern FPGAs
Modern computer systems perform computations on real numbers through approximate representation that is often based on a specific floating-point format. Compared to fixed-point numbers, floating-point numbers provide an extended dynamic range and ...
Exploration of Compute vs. Interconnect Tradeoffs in CGRAs for HPC
We consider the balance between compute density and interconnect in Coarse-Grained Reconfigurable Architectures (CGRAs) intended for acceleration of HPC applications. We model a baseline CGRA architecture [2] in the open-source CGRA-ME framework [11] ...
Customisable Processing of Neural Networks for FPGAs
When implementing neural networks on FPGAs, existing methods for resource optimisation are closely tied to the design and performance of the neural network itself. We wish to independently control the individual Processing Elements (PEs) responsible ...
Resource-efficient RISC-V Vector Extension Architecture for FPGA-based Accelerators
To meet the increasing demands of embedded computation, hardware accelerators are widely used alongside processors. FPGAs offer flexibility for designing such accelerators because they are programmable devices. But developing a custom accelerator for each ...
Efficient Implementation of 2-D Convolution on DRRA and DiMArch Architectures
Convolution has been widely employed in image processing and computer vision applications such as picture augmentation, smoothing, and structure extraction. In addition, convolution operations are the most prevalent computing patterns in machine ...
CSA Based Radix-4 Gemmini Systolic Array for Machine Learning Applications
Systolic arrays are becoming the backbone of machine learning accelerators due to high computational parallelism and data re-usability. This paper presents a novel fully factored systolic array architecture: it extracts the Booth encoding logic ...
ZyPy: Intercepting NumPy operations for acceleration on FPGAs
Python's most popular numerical library, NumPy, is accelerated using GPUs, multi-core CPUs, and clusters, yet a general approach on how to achieve this goal using Field Programmable Gate Arrays (FPGAs) is lacking. A methodology is presented to ...
cuSCNN: An Efficient CUDA Implementation of Sparse CNNs
Deep Neural Network models are becoming much larger, which greatly increases their computation and memory requirements. Sparsity offers great opportunities to reduce unnecessary data transfers and computations. However, exploiting sparsity in CNN ...
Noise Resilience of Reduced Precision Neural Networks
Reduced Precision Neural Networks, where computations are performed with as low as one or two bits of precision, are starting to find relevance in a wide range of applications, including vision, speech, and natural language processing. Such networks ...
Index Terms
- Proceedings of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies