The security of embedded systems has drawn wide attention because of their widespread use and open application environments. A hardware-assisted monitoring architecture based on a lightweight hash function is proposed to verify run-time program integrity on an embedded processor. Fine-grained property information is extracted as the integrity verification object and hashed by the lightweight hash function to form the monitoring model. The hardware architecture is implemented on an SoPC platform. Using five standard benchmarks as test programs, the experiments show that the proposed monitor accounts for less than 8.37% area overhead of the processor, and the average CPI of our secure processor with pipelined lightweight hash functions increases by no more than 6.36%.
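The idea of hash-based run-time integrity checking can be illustrated in software. The sketch below is not the hardware monitor described above: it assumes the verification object is a basic block of 32-bit instruction words keyed by its start address, and it uses BLAKE2s from the Python standard library as a stand-in for the lightweight hash. The names `block_digest`, `build_model`, and `check_block` are hypothetical.

```python
import hashlib

def block_digest(instructions):
    """Hash a basic block given as a list of 32-bit instruction words."""
    # BLAKE2s is only a stand-in here; the abstract does not name its hash.
    h = hashlib.blake2s(digest_size=8)
    for word in instructions:
        h.update(word.to_bytes(4, "little"))
    return h.digest()

def build_model(blocks):
    """Offline step: map each block start address to its reference digest."""
    return {addr: block_digest(instrs) for addr, instrs in blocks.items()}

def check_block(model, addr, fetched):
    """Run-time step: recompute the digest of the fetched block and compare."""
    return model.get(addr) == block_digest(fetched)

# Hypothetical two-instruction block at address 0x100.
model = build_model({0x100: [0x00000013, 0x00100093]})
print(check_block(model, 0x100, [0x00000013, 0x00100093]))  # unmodified block
print(check_block(model, 0x100, [0x00000013, 0x00200093]))  # tampered block
```

An unknown start address has no reference digest, so the check fails for it as well.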
From the perspective of user experience and safety when using mobile devices, thermal management that is aware of the skin (outer surface) temperature, along with the application processor die-junction temperature, is crucial. Traditional thermal management techniques have ignored the combined effect of the junction and skin temperatures, resulting in unnecessary performance degradation due to excessive thermal throttling. We propose a novel thermal management method for mobile devices that incorporates an adaptive thermal property control (ATPC) technique. The ATPC technique is designed to adapt the thermal properties between the junction and skin according to their thermal margins. Extensive simulation results show that the ATPC technique prolongs the maximum performance duration of the mobile device up to 34 min in the nominal application processor power consumption range of 2.9–4.6 W. In other words, the technique provides a performance gain of approximately 7% by preventing false early thermal throttling.
This paper presents a broadband millimeter-wave power amplifier that combines two paths, each of which consists of a distributed amplifier and cascaded single-ended stages, for high gain and output power. To the best of our knowledge, this is the first time that two amplifiers based on a distributed stage and cascaded single-ended stages have been combined for high power. As a result, the saturated power is improved to more than 20.5 dBm in the frequency band of 33–66 GHz. Meanwhile, by combining the distributed-amplifier and cascaded single-ended-stage approaches, the power amplifier has the inherent advantages of high gain and wide bandwidth. Moreover, to improve the gain flatness, small resistor-capacitor elements are introduced in the bias circuits of the cascaded single-ended stages, so the measured S21 is improved to 21.8 ± 0.6 dB in the 38–67 GHz band. These results show that high gain with good flatness and power can be achieved using the proposed method.
Memory-based physical unclonable functions (PUFs), a top priority in hardware security applications, are constrained by a limited number of challenge-response pairs (CRPs). To address this, we propose a differential reconfigurable PUF (rPUF) scheme based on phase change memory (PCM). By exploiting the spontaneous cycle-to-cycle resistance randomness, a simple but practical method for reconfiguration is realized. Thirty PCM chips fabricated in a 40-nm process are measured for their electrical properties, and the PUF system is simulated. According to the experimental results, the diffuseness, uniqueness and stability of our PUF system are very close to the industry standards. Meanwhile, the risk to the system from exhausted CRPs is remarkably reduced by the effective entropy of cycling programming when the quantifiable level is increased.
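As an illustration of one of the metrics named above, uniqueness is commonly computed as the average pairwise fractional Hamming distance between the responses of different chips to the same challenge, with 50% as the ideal value. The sketch below assumes that common definition, which the abstract does not spell out, and the response strings are made up for the example.

```python
from itertools import combinations

def hamming_fraction(a, b):
    """Fraction of bit positions in which two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def uniqueness(responses):
    """Average pairwise fractional Hamming distance across chips."""
    pairs = list(combinations(responses, 2))
    return sum(hamming_fraction(a, b) for a, b in pairs) / len(pairs)

# Made-up 8-bit responses from three hypothetical chips.
print(uniqueness(["01101001", "11000101", "00111100"]))
```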
A broadband single-stage power amplifier (PA) is presented in this paper. The proposed PA is designed and implemented in a 2-µm GaAs HBT process and targets a wide range of handset devices operating at around 5 GHz. In this PA, mixed matching networks are designed with transmission lines (TLs) and lumped capacitors for bandwidth enhancement. A feedback technique combined with a diode-based bias circuit allows us to achieve high efficiency and comparable linearity at a low supply voltage. The measured small-signal gain is flat and better than 10 dB, and the maximum average output power is 22.5 ± 0.5 dBm, over 4.2–5.8 GHz (32% fractional bandwidth). The prototype achieves a peak power-added efficiency (PAE) of 47.2% at 5 GHz and third-order intermodulation distortion (IMD3) below −38 dBc up to the saturation power of 23.2 dBm. This work has potential for wideband high-efficiency Doherty PAs (DPAs) in future mobile communication systems.
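The 32% figure quoted for the 4.2–5.8 GHz band is consistent with the usual definition of fractional bandwidth, the band width divided by the center frequency. A minimal check, assuming that definition (the function name is illustrative):

```python
def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Bandwidth divided by the arithmetic center frequency."""
    center = (f_low_ghz + f_high_ghz) / 2
    return (f_high_ghz - f_low_ghz) / center

# 1.6 GHz of bandwidth around a 5.0 GHz center.
print(f"{fractional_bandwidth(4.2, 5.8):.0%}")
```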
New-generation space-borne SAR (synthetic aperture radar) systems require high real-time processing performance and have size, weight and power constraints. This paper presents a multi-channel stripmap SAR imaging system implemented on an FPGA platform. To reduce the FPGA design cost, the high-level synthesis tool Xilinx Vivado HLS is used to design and implement the SAR imaging system. The FFTs in the imaging algorithm use FFT IP cores in the FPGA, and the rest is custom-designed in HLS. The modules designed in HLS are optimized and packaged as IP blocks for FPGA implementation of the imaging system. The performance and resource utilization of the whole system are evaluated by processing two-channel SAR raw data with a granularity of 16384 × 4096. The system can complete imaging in about 4.7 s at an operating frequency of 100 MHz.
Large-scale floating-point matrix multiplication is widely used in many scientific and engineering applications. Most existing works focus on designing a linear array architecture for accelerating matrix multiplication on FPGAs. This paper extends this architecture by proposing a scalable and highly configurable multi-array architecture. In addition, we present a work-stealing scheme to ensure a balanced workload partition among multiple linear arrays. Furthermore, an analytical model is developed to determine the optimal parameters for matrix multiplication acceleration. Experiments on real-life convolutional neural networks (CNNs) show that we can obtain the optimal extension of the linear array architecture.
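The abstract does not detail its work-stealing scheme, so the following is only a generic software sketch of the idea: each linear array owns a queue of tile tasks, and an array whose queue is empty takes a task from the tail of the longest remaining queue. The task costs and function names are hypothetical.

```python
from collections import deque

def makespan_static(task_queues):
    """Finish time when every array only processes its own queue."""
    return max(sum(q) for q in task_queues)

def makespan_work_stealing(task_queues):
    """Finish time when idle arrays steal from the longest queue."""
    queues = [deque(q) for q in task_queues]
    finish = [0] * len(queues)
    while any(queues):
        # The array that becomes free earliest takes the next task.
        i = min(range(len(queues)), key=lambda k: finish[k])
        if queues[i]:
            cost = queues[i].popleft()      # own work, from the head
        else:
            victim = max(range(len(queues)), key=lambda k: len(queues[k]))
            cost = queues[victim].pop()     # stolen work, from the tail
        finish[i] += cost
    return max(finish)

# Two arrays with an unbalanced initial partition (costs in arbitrary cycles).
unbalanced = [[3, 3, 3, 3, 3, 3], [3, 3]]
print(makespan_static(unbalanced), makespan_work_stealing(unbalanced))
```

In this toy case the static partition finishes in 18 cycles, while stealing evens the load out to 12 cycles, which equals the total work divided by the number of arrays.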
In this paper, a minimum adder-delay Discrete Cosine Transform (DCT) architecture is proposed using the Adaptive CORDIC (ACor) algorithm with fixed-rotation implementations. The proposed method has six versions that differ in the number of DCT points, i.e., 8-point (8p), 16-point (16p), and 32-point (32p), and in the number of ACor stages, i.e., 2-Stage (2S) and 3-Stage (3S). Altera Stratix IV and Stratix II FPGAs were used to build and verify the implementations. The 2S designs of the 8p, 16p, and 32p DCT achieved timing performances of four, five, and six adder delays, respectively. The proposed method was shown to have the best timing performance, good accuracy, and adequate resource cost in comparison with other recent works.
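For readers unfamiliar with CORDIC, the sketch below shows the conventional rotation-mode algorithm, not the Adaptive CORDIC variant of the paper. It rotates a vector using updates that reduce to shifts and adds in hardware; floating point is used here for readability, and the final gain correction is applied as a single multiplication. Fast DCT factorizations rely on plane rotations by fixed angles of this kind.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians with conventional rotation-mode CORDIC."""
    z = angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0        # direction that drives z toward 0
        step = 2.0 ** -i                   # a right shift by i in hardware
        x, y = x - d * y * step, y + d * x * step
        z -= d * math.atan(step)           # table lookup in hardware
    # Each micro-rotation stretches the vector; undo the accumulated gain.
    gain = math.prod(1.0 / math.sqrt(1.0 + 4.0 ** -i) for i in range(iterations))
    return x * gain, y * gain

# Rotating (1, 0) by pi/8 should approximate (cos(pi/8), sin(pi/8)).
c, s = cordic_rotate(1.0, 0.0, math.pi / 8)
print(abs(c - math.cos(math.pi / 8)) < 1e-3, abs(s - math.sin(math.pi / 8)) < 1e-3)
```

With a fixed, known angle the direction sequence `d` can be precomputed, which is the general idea behind fixed-rotation implementations.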
In this paper, we present a broadband Ka-band LNA implemented in a 0.15-µm GaAs pseudomorphic high electron mobility transistor (pHEMT) process. By using bandwidth enhancement techniques and a deep negative feedback technique, the LNA achieves relatively broadband performance. The LNA attains a 20 dB small-signal gain and a measured noise figure of 2.8 dB from 25 to 40 GHz with 230-mW dc power consumption. The input and output return loss of the LNA is less than 8 dB, which is competitive with other published Ka-band LNAs. The size of the chip is 2.5 mm × 1.2 mm.
This paper presents a tunable dual-mode filtering power divider (TDFPD) with harmonic suppression. Two tunable dual-mode resonators are embedded into a power divider with an equal power ratio to realize the proposed TDFPD. The odd mode of the dual-mode resonator is rigorously designed to approach its even mode so that the proposed TDFPD exhibits harmonic suppression. A tuning-element sharing technique is also used to minimize the number of varactors. To demonstrate the proposed design, a prototype is designed and fabricated. The measurements show that the TDFPD can be tuned from 2.10 to 2.31 GHz with a return loss better than 15 dB and harmonic suppression better than 20 dB over the frequency range of 2.5 to 9 GHz. Good agreement is observed between the measured and simulated results.
As many emerging applications (e.g., deep learning and data mining) use FPGAs for acceleration, the design of highly optimized application-specific soft processors on FPGAs has attracted much attention. The cache, an important component of a soft processor, is built from Block RAMs (BRAMs) in FPGAs. SRAM-based BRAMs suffer from high static power consumption and an area penalty, which prevents the implementation of large caches with high associativity. STT-RAM-based BRAMs may be a good solution to these issues. However, existing cache designs with SRAM-based BRAMs for soft processors, and hybrid SRAM and STT-RAM cache designs in conventional processors, are not suitable for caches built from STT-RAM-based BRAMs. In this paper, we propose a BRAM allocation method that can effectively implement highly set-associative caches while reducing the impact of the long delays and power consumption of write operations in STT-RAM. Using our framework, we show that the optimal size of an STT-RAM-based BRAM for a soft-processor cache is 1 KB with a 64-bit IO width, and that the proposed cache structure reduces power and area on average by 55.3% and 76.9%, respectively, and reduces runtime by up to 15.6%. Supporting diverse sizes and associativities enables application-specific optimization of a cache. In addition, we show that a hybrid cache with SRAM and STT-RAM is not recommended for the soft processor.
An offset-voltage-suppressed sense amplifier (SA) with a self-adaptive distribution transformation technique is proposed. By means of peripheral assist circuits, the offset voltage of the proposed SA is automatically assessed, and the most appropriate offset amount is selected to narrow the distribution of the offset voltage in two stages. Moreover, the calibration results can be locked in the peripheral circuits by the initialization operation, so the calibration process need not be repeated for each read operation. The simulation results show that, compared with the conventional voltage latch SA (VLSA) and the robust latch-type SA (RLSA), the offset voltage of the proposed SA is reduced by 57.1% and 45.4%, respectively, at a 1.2 V supply voltage at the TT corner in TSMC 65-nm CMOS technology. Additionally, under extreme conditions, it is reduced by 49.9–58.3% and 35.8–47.9% compared with that of the VLSA and RLSA, respectively.
We observed the near-field patterns (NFPs) of light output from few-mode fibers (FMFs) in which an LP mode was selectively excited. The variation of the intensity profile with wavelength confirmed that the true eigenmodes of a circular-core fiber are guided in single-core step-index and graded-index FMFs. On the other hand, we discovered a new phenomenon in which LP11 modes propagate as eigenmodes oriented along a specific axis in a 4-LP-mode 12-core FMF. In addition, by observing the NFP and calculating the eigenmodes of an elliptical core using elliptic cylindrical coordinates and Mathieu functions, we confirmed that this phenomenon is not due to elliptical deformation of the core.
We characterize the bifurcation structure of a recently proposed hard-type oscillator using tunnel-effect devices. Such an oscillator suppresses the spurious oscillation that occurs in the bias line and exhibits a large oscillation amplitude through device cascading, so it has significant advantages for stable high-frequency signal generation. For experimental demonstration, we fabricate a test oscillator using tunnel diodes. Through bifurcation analyses, the test oscillator is shown to exhibit hard-type oscillation. We then carry out several time-domain measurements and confirm the advantages of the oscillator.
In this letter, a method for widening the impedance matching space of power amplifiers is proposed by expanding the voltage and current equations of class-F power amplifiers. The proposed method improves the design flexibility and convenience of broadband power amplifiers while maintaining high output power and drain efficiency. To verify this theory, a power amplifier operating from 1 GHz to 3.5 GHz is designed. The measured results show that the drain efficiency is 50.6%–63.4%, the output power is greater than 40 dBm, and the gain is greater than 10 dB.
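To put the reported figures in absolute terms, 40 dBm corresponds to 10 W, and drain efficiency is the RF output power divided by the DC supply power. The sketch below applies these two standard definitions. Pairing the 40 dBm lower bound with each end of the efficiency range is an assumption made only for illustration, since the abstract does not say at which frequencies those values occur together.

```python
def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (p_dbm / 10) / 1000

def dc_power_watts(p_out_dbm, drain_efficiency):
    """DC supply power implied by an output power and a drain efficiency."""
    return dbm_to_watts(p_out_dbm) / drain_efficiency

print(dbm_to_watts(40))                    # 40 dBm expressed in watts
print(round(dc_power_watts(40, 0.506), 2))  # DC power at 50.6% efficiency
print(round(dc_power_watts(40, 0.634), 2))  # DC power at 63.4% efficiency
```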
In this work, we present an optically powered drone cell using optical fibers for airborne base stations. We investigate the conversion performance of photovoltaic power converters and evaluate the power consumption required to fly an entry-type drone. Based on these specifications, we successfully demonstrate, for the first time, the flight of a drone powered by a 20-W power-over-fiber feed.
Metastability of RS latches can be a source of entropy for true random number generators (TRNGs). This study presents a new composition of an RS latch using the latch functionality of the storage elements of Xilinx FPGAs. Our TRNG is implemented as a soft macro, i.e., an RTL description with directives, which is easily integrated with other logic components. According to our evaluation with an Artix-7 FPGA (XC7A35T), our TRNG with 320 latches (716 LUTs and 974 registers) passed the NIST SP 800-22 test suite without post-processing. Our new TRNG also achieved a 2.3x better area-delay product than the existing design to pass the diehard test.
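As a small illustration of what the NIST SP 800-22 suite checks, the sketch below implements its first test, the frequency (monobit) test, which asks whether the numbers of ones and zeros are close enough to equal. The suite contains many further tests, and the 0.01 significance level used here is the suite's usual default rather than a value taken from the abstract.

```python
import math

def monobit_p_value(bits):
    """P-value of the SP 800-22 frequency (monobit) test for a 0/1 sequence."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # +1 per one, -1 per zero
    return math.erfc(abs(s) / math.sqrt(2 * n))

def passes_monobit(bits, alpha=0.01):
    """The sequence passes when the p-value is at least the significance level."""
    return monobit_p_value(bits) >= alpha

# Ten-bit sequence used only to show the calculation; real runs use long streams.
sample = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(round(monobit_p_value(sample), 6), passes_monobit(sample))
```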
The design of a broadband transmitter faces the challenge of canceling harmonic distortion in the communication frequency band. Here we improve digital predistortion (DPD) and propose a digital harmonic canceling algorithm based on a direct learning structure: a reverse-function predistorter built on IIR and FIR filter structures, which has lower computational complexity and achieves better results than the nonlinear adaptive filter. Experimental results verify the effectiveness of this canceling algorithm.
This paper presents the design and characterization of a micro-electromechanical system (MEMS) oscillator. The oscillator is composed of a low-noise sustaining circuit and a vacuum-encapsulated MEMS disk resonator with excellent frequency response. The sustaining circuit is based on two-port matching networks, a feedback circuit, and a phase shift network. The oscillator exhibits a measured phase noise of −96 dBc/Hz at 1 kHz offset and −120 dBc/Hz at far-from-carrier offset. Furthermore, the short-term and medium-term frequency stabilities are ±0.5 ppm and ±5 ppm, respectively. The oscillator shows linear frequency-temperature (f-T) characteristics and small hysteresis. With this promising performance, the proposed MEMS oscillator has potential applications in high-end timing systems.
Quasi-cyclic (QC) low-density parity-check (LDPC) codes are well known for their excellent error correction performance and hardware-friendly structure in NAND flash memory applications. The array LDPC code is a type of highly structured QC-LDPC code that provides a good balance between performance and complexity. In this paper, a method is proposed for constructing a (18900, 17010) LDPC code based on the Latin square and an improved array dispersion strategy to achieve multi-column alignment of the structure. Compared with the traditional design, the parallel hardware architecture reduces the number of barrel shifters by 32%. The corresponding ASIC implementation results show that the throughput of the proposed QC-LDPC code reaches up to 3.49 Gb/s and that the throughput-to-area ratio (TAR) of the proposed code is significantly improved.
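Two basic facts about such a code can be checked with a few lines. Reading (18900, 17010) in the usual (n, k) notation gives a code rate of 0.9, and in a QC-LDPC code each nonzero block of the parity-check matrix is a circulant permutation, so applying it to a sub-block of data is a cyclic shift, which is the job of a barrel shifter in hardware. The sub-block size and the left-rotation convention below are assumptions for illustration only.

```python
def code_rate(n, k):
    """Ratio of information bits k to codeword bits n."""
    return k / n

def circulant_shift(block, shift):
    """Apply a circulant permutation to one sub-block as a left rotation."""
    shift %= len(block)
    return block[shift:] + block[:shift]

print(code_rate(18900, 17010))              # information bits per codeword bit
print(18900 - 17010)                        # number of parity bits
print(circulant_shift([1, 2, 3, 4, 5], 2))  # toy 5-element sub-block
```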