The International Conference on Field-Programmable Logic and Applications (FPL) is widely regarded as the premier venue in Europe for presenting current research on reconfigurable technology. As the original conference in this domain, FPL remains one of the most comprehensive gatherings for experts, researchers, and enthusiasts in this dynamic and evolving area. In 2022, the 32nd edition returned to Belfast, where the conference had previously been held in 2001.
Despite this long history, FPL's key topics remain as relevant as ever. Field-programmable devices, notably FPGAs (Field-Programmable Gate Arrays), combine the benefits of dedicated hardware, such as higher performance and power efficiency, with the versatility and ease of use of software. This duality makes them an attractive option in scenarios where traditional computing platforms such as CPUs or GPUs fall short in performance or flexibility, and where deploying fully application-specific integrated circuits (ASICs) is impractical due to the prohibitive non-recurring costs and the intensive design effort required for contemporary silicon fabrication technologies.
The scope of reconfigurable technology is vast, encompassing a broad spectrum of research areas crucial for its advancement. These areas include the development of innovative tools and design methodologies, the architecture of field-programmable systems, and the exploration of device technology for field-programmable chips. Equally important is the practical application of this technology—understanding how it can be effectively utilized in various domains to translate its potential into tangible benefits for end users. This exploration is fundamental in bridging the gap between theoretical advancements and real-world impacts.
Across this wide range of topics, FPL 2022 received 178 initial abstract submissions and 129 full submissions. After a reviewing process that included a rebuttal phase and at least three reviews per research paper, 33 contributions were accepted as full papers and 28 as short papers. In addition, the newly introduced FPL journal track, which targeted more mature work benefiting from a longer-form presentation, attracted eight submissions. Four of these contributions were accepted for publication in ACM Transactions on Reconfigurable Technology and Systems (TRETS) and were also presented as regular talks at the event.
After the conference, we invited the authors of the 10 best papers from the FPL conference to submit extended versions of their work to ACM TRETS for a special issue. Six author groups accepted this invitation and provided new manuscripts that underwent the full TRETS review-and-revision process. Of these, five were accepted in time for inclusion in this special issue.
The first paper introduces XVDPU, an AI-Engine (AIE)-based CNN accelerator for AMD/Xilinx Versal devices, designed to address the challenges posed by larger networks resulting from the trend toward higher accuracy and resolution. XVDPU mitigates I/O bottlenecks through improved data reuse and I/O reduction techniques, alongside an innovative Arithmetic Logic Unit (ALU) that improves resource utilization, feature support, and overall system efficiency. The accelerator has been deployed for more than 100 CNN models, achieving significant performance improvements. For high-definition CNNs (HD-CNNs), a tiling strategy enables feature-map-stationary (FMS) processing, yielding substantial frame-rate improvements for advanced networks such as RCAN and SESR.
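To make the feature-map-stationary idea more concrete for readers less familiar with CNN accelerator design, the following C++ sketch shows a simple software analogue of FMS tiling for a 1x1 convolution: each feature-map tile is loaded once and reused across all output channels before the next tile is fetched. The loop order, buffer sizes, and layer shape are our own illustrative assumptions and do not reproduce the XVDPU design.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative feature-map-stationary (FMS) tiling for a 1x1 convolution.
// The input tile is loaded once and reused for every output channel, so
// off-chip traffic scales with the feature-map size rather than with
// (feature-map size x output channels). All dimensions are assumptions.
void conv1x1_fms(const std::vector<float>& in,   // [C_in][H][W], flattened
                 const std::vector<float>& w,    // [C_out][C_in], flattened
                 std::vector<float>& out,        // [C_out][H][W], flattened
                 std::size_t C_in, std::size_t C_out,
                 std::size_t H, std::size_t W,
                 std::size_t tile_h, std::size_t tile_w) {
    std::vector<float> tile(C_in * tile_h * tile_w);  // stand-in for an on-chip buffer
    for (std::size_t th = 0; th < H; th += tile_h) {
        for (std::size_t tw = 0; tw < W; tw += tile_w) {
            const std::size_t h_end = std::min(th + tile_h, H);
            const std::size_t w_end = std::min(tw + tile_w, W);
            // 1) Load the feature-map tile once (the "stationary" data).
            for (std::size_t c = 0; c < C_in; ++c)
                for (std::size_t y = th; y < h_end; ++y)
                    for (std::size_t x = tw; x < w_end; ++x)
                        tile[(c * tile_h + (y - th)) * tile_w + (x - tw)] =
                            in[(c * H + y) * W + x];
            // 2) Reuse the resident tile for every output channel.
            for (std::size_t co = 0; co < C_out; ++co)
                for (std::size_t y = th; y < h_end; ++y)
                    for (std::size_t x = tw; x < w_end; ++x) {
                        float acc = 0.0f;
                        for (std::size_t c = 0; c < C_in; ++c)
                            acc += w[co * C_in + c] *
                                   tile[(c * tile_h + (y - th)) * tile_w + (x - tw)];
                        out[(co * H + y) * W + x] = acc;
                    }
        }
    }
}
```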
Partial Reconfiguration (PR) is a crucial technique in FPGA application design, but current tools require developers to manually handle PR module definition, floorplanning, and flow control at a low level, without supporting the High-Level Synthesis (HLS) languages favored by software developers. To address this, the authors introduce HiPR, an open-source framework designed to integrate HLS with PR. HiPR enables developers to define partially reconfigurable functions in C/C++ instead of Verilog, streamlining the flow from C/C++ to bitstreams and significantly speeding up FPGA incremental compilation. The framework incorporates a lightweight simulated-annealing floorplanner, which generates high-quality PR floorplans much faster than traditional analytic methods.
Recent advances in graph processing on FPGAs show great potential for addressing the performance issues caused by irregular memory access patterns, which are common in areas such as machine learning and data analytics. However, existing graph processing accelerators on FPGAs either utilize off-chip memory bandwidth inefficiently or struggle to scale across memory channels. To address these challenges, this work introduces GraphScale, a scalable graph processing framework specifically designed for FPGAs. GraphScale leverages multi-channel memory and asynchronous graph processing for rapid convergence, and employs a compressed graph representation to use memory bandwidth efficiently and minimize the memory footprint. It handles standard graph problems such as breadth-first search, PageRank, and weakly connected components by means of user-defined functions, an innovative two-dimensional partitioning scheme, and a high-performance two-level crossbar design. Additionally, GraphScale is extended to support modern high-bandwidth memory (HBM) and incorporates binary packing to reduce the partitioning overhead for large graphs.
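As a rough illustration of how user-defined functions typically drive such a framework, the sketch below expresses a PageRank update in a scatter/gather/apply style and runs one synchronous iteration over a CSR graph. The interface and driver shown are hypothetical and are not GraphScale's actual API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical user-defined update rule in a scatter/gather/apply style.
struct PageRankUpdate {
    float damping = 0.85f;
    // Value pushed along each outgoing edge of a source vertex.
    float scatter(float src_value, std::uint32_t src_out_degree) const {
        return src_out_degree ? src_value / src_out_degree : 0.0f;
    }
    // Combine incoming contributions at the destination vertex.
    float gather(float acc, float incoming) const { return acc + incoming; }
    // Produce the new vertex value after all contributions are gathered.
    float apply(float acc, std::size_t num_vertices) const {
        return (1.0f - damping) / num_vertices + damping * acc;
    }
};

// One synchronous iteration over a graph in CSR form (row_ptr/col_idx),
// shown only to make the scatter/gather/apply roles concrete.
std::vector<float> pagerank_step(const std::vector<std::size_t>& row_ptr,
                                 const std::vector<std::uint32_t>& col_idx,
                                 const std::vector<float>& rank,
                                 const PageRankUpdate& f) {
    const std::size_t n = rank.size();
    std::vector<float> acc(n, 0.0f);
    for (std::size_t u = 0; u < n; ++u) {
        const auto deg = static_cast<std::uint32_t>(row_ptr[u + 1] - row_ptr[u]);
        const float contrib = f.scatter(rank[u], deg);
        for (std::size_t e = row_ptr[u]; e < row_ptr[u + 1]; ++e)
            acc[col_idx[e]] = f.gather(acc[col_idx[e]], contrib);
    }
    std::vector<float> next(n);
    for (std::size_t v = 0; v < n; ++v) next[v] = f.apply(acc[v], n);
    return next;
}
```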
As “big data” applications continue to grow in size, the effectiveness of specialized compute accelerators is increasingly hampered by rising I/O overheads. Addressing this gap, the DeLiBA framework was developed to enable high-productivity development of software components for the I/O stack in Linux user space, utilizing an FPGA SoC framework for rapid deployment of FPGA-based I/O accelerators. With DeLiBA2, the framework was further refined to enhance support for distributed storage systems such as Ceph, integrating block I/O accelerators with a hardware-accelerated network stack and accelerating additional storage functions, leading to significant performance gains for both synthetic and real-world workloads.
The final paper introduces a parallel waveform-matching architecture designed for high-end FPGA-based digitizers, addressing the challenge of efficiently processing the large volumes of data generated when digitizing side-channel signals at high sampling rates. Traditional methods for detecting segments containing Cryptographic Operations (COs) in these signals rely on waveform-matching techniques, which compare the signal with a template of the CO's characteristic pattern. Existing designs, which process samples sequentially, are limited by the clock speed of FPGAs and can therefore only handle low sampling rates. The proposed architecture not only achieves high-speed waveform matching but also incorporates reconfigurability to adapt to specific CO patterns. Additionally, the paper presents a workflow for calibrating the system to the CO's pattern within the hardware constraints of the FPGA. The paper also demonstrates the application of this technique in attacking the XTS-AES algorithm, showing its effectiveness in recovering the encrypted tweak even amidst systemic noise.
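For readers unfamiliar with the underlying operation, the following C++ sketch shows plain sequential waveform matching via normalized correlation of a template against a sampled trace. It only illustrates the baseline computation that such systems perform; it does not model the paper's parallel, reconfigurable architecture, and the threshold-based detection rule is an assumption for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sequential waveform matching: slide a template over the sampled trace and
// report offsets whose normalized correlation exceeds a threshold. Each
// reported offset marks a candidate start of a CO segment.
std::vector<std::size_t> match_template(const std::vector<float>& trace,
                                        const std::vector<float>& tmpl,
                                        float threshold) {
    std::vector<std::size_t> hits;
    if (tmpl.empty() || trace.size() < tmpl.size()) return hits;

    float t_norm = 0.0f;                       // template energy, computed once
    for (float t : tmpl) t_norm += t * t;
    t_norm = std::sqrt(t_norm);

    for (std::size_t off = 0; off + tmpl.size() <= trace.size(); ++off) {
        float dot = 0.0f, s_norm = 0.0f;
        for (std::size_t i = 0; i < tmpl.size(); ++i) {
            dot += trace[off + i] * tmpl[i];
            s_norm += trace[off + i] * trace[off + i];
        }
        const float denom = std::sqrt(s_norm) * t_norm;
        if (denom > 0.0f && dot / denom >= threshold)
            hits.push_back(off);
    }
    return hits;
}
```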
We acknowledge the efforts of all authors who have extended their FPL papers for inclusion as TRETS articles in this volume. We also thank the reviewers for their comments, which helped the authors refine their manuscripts into the final versions presented here. The continued support of TRETS Editor-in-Chief Deming Chen and Associate Managing Editor Clarissa Nemeth has been essential in this process. Finally, we appreciate the roles of John McAllister and Roger Woods, the general chairs of FPL 2022, in organizing the event from which these contributions originated.
We hope you find the selected papers interesting and helpful for your own research, and we look forward to seeing you at one of the next FPL conferences.
Andreas Koch
Kentaro Sano
Co-Guest Editors