Volume 73, Issue 1, Jan. 2024
Publisher:
  • IEEE Computer Society
  • 1730 Massachusetts Ave., NW Washington, DC
  • United States
ISSN: 0018-9340
research-article
Toward Heterogeneous Federated Learning via Global Knowledge Distillation

Federated learning, as one enabling technology of edge intelligence, has gained substantial attention due to its efficacy in training deep learning models without data privacy and network bandwidth concerns. However, due to the heterogeneity of the edge ...

research-article
Revisit and Benchmarking of Automated Quantization Toward Fair Comparison

Automated quantization has emerged as an entirely new design paradigm to automate the optimal configuration of bitwidth for deep neural networks (DNNs), making the DNN more memory-efficient and faster to execute on hardware with limited resources. ...

research-article
ASHL: An Adaptive Multi-Stage Distributed Deep Learning Training Scheme for Heterogeneous Environments

With the growth of dataset and model sizes, distributed deep learning has been proposed to accelerate training and improve the accuracy of DNN models. The parameter server framework is a popular collaborative architecture for data-parallel training, ...

research-article
Toward an SGX-Friendly Java Runtime

Hardware enclaves assist in constructing a trusted execution environment (TEE) to store private code and data and thus become an appealing solution to enhance applications’ security. Nevertheless, state-of-the-art enclave implementations like Intel ...

research-article
A Secure and Robust Knowledge Transfer Framework via Stratified-Causality Distribution Adjustment in Intelligent Collaborative Services

The rapid development of device-edge-cloud collaborative computing techniques has actively contributed to the popularization and application of intelligent service models. The intensity of knowledge transfer plays a vital role in enhancing the performance ...

research-article
Applying Delta Compression to Packed Datasets for Efficient Data Reduction

Backup systems often adopt deduplication techniques for data reduction. Real-world backup products often group files into larger units (called packed files) before deduplicating them. The grouping entails inserting metadata immediately before the contents ...

research-article
General Bootstrapping Approach for RLWE-Based Homomorphic Encryption

Homomorphic Encryption (HE) makes it possible to compute on encrypted data without decryption. In lattice-based HE, a ciphertext contains noise, which accumulates along with homomorphic computations. Bootstrapping refreshes the noise and it is possible to ...

research-article
Split-Radix Based Compact Hardware Architecture for CRYSTALS-Kyber

Facing the threat of large-scale quantum computers to traditional public-key cryptography, the National Institute of Standards and Technology has conducted Post-Quantum Cryptography algorithms evaluation for a long time, and CRYSTALS-Kyber has been ...

research-article
Computation Off-Loading in Resource-Constrained Edge Computing Systems Based on Deep Reinforcement Learning

Edge computing is a computational paradigm that brings resources closer to the network edge, such as base stations or gateways, in order to provide quick and efficient computing services for mobile devices while relieving pressure on the core network. ...

research-article
A Reliability-Critical Path Identifying Method With Local and Global Adjacency Probability Matrix in Combinational Circuits

Accurate and efficient identification of reliability-critical paths (RCPs) not only facilitates fault localization and troubleshooting but also allows circuit designers to improve circuit reliability at a low cost. This article proposes a local and global ...

research-article
Open Access
Enabling HW-Based Task Scheduling in Large Multicore Architectures

Dynamic Task Scheduling is an enticing programming model aiming to ease the development of parallel programs with intrinsically irregular or data-dependent parallelism. The performance of such solutions relies on the ability of the Task Scheduling HW/SW ...

research-article
Bit-Balance: Model-Hardware Codesign for Accelerating NNs by Exploiting Bit-Level Sparsity

Bit-serial architectures can handle Neural Networks (NNs) with different weight precision, achieving higher resource efficiency compared with bit-parallel architectures. Besides, the weights contain abundant zero bits owing to the fault tolerance of NNs, ...

research-article
An Efficient Deep Reinforcement Learning-Based Automatic Cache Replacement Policy in Cloud Block Storage Systems

With the popularity of cloud services, cloud block storage (CBS) systems have been widely deployed by cloud providers. Cloud cache plays a vital role in maintaining high and stable performance in cloud block storage systems. In the past few decades, much ...

research-article
An Efficient and Secure Data Sharing Scheme for Edge-Enabled IoT

Sharing the big data generated by IoT via cloud is slow and expensive. Besides, transmitting and sharing data among IoT devices via cloud may be insecure. To address these issues, a novel efficient and secure data sharing scheme termed EB-SDSS (Edge ...

research-article
Hybrid Edge-Cloud Collaborator Resource Scheduling Approach Based on Deep Reinforcement Learning and Multiobjective Optimization

Collaborative resource scheduling between edge terminals and cloud centers is regarded as a promising means of effectively completing computing tasks and enhancing quality of service. In this paper, to further improve the achievable performance, the edge ...

research-article
SplitDB: Closing the Performance Gap for LSM-Tree-Based Key-Value Stores

Log Structured Merge Tree (LSM tree) serves as the core data storage engine in modern key-value stores. Its adoption is rapidly accelerated with cloud computing and data center development. Acknowledging its widespread use, the LSM tree still faces severe ...

research-article
Open Access
Cyclebite: Extracting Task Graphs From Unstructured Compute-Programs

Extracting portable performance in an application requires structuring that program into a data-flow graph of coarse-grained tasks (CGTs). Structuring applications that interconnect multiple external libraries and custom code (i.e., “Code From The ...

research-article
SafeDRL: Dynamic Microservice Provisioning With Reliability and Latency Guarantees in Edge Environments

As a key technology of 5G, network function virtualization enables each monolithic service to be divided into microservices, facilitating their deployment and management in edge environments. One of the most critical issues in 5G is how to support ...

research-article
Zero and Narrow-Width Value-Aware Compression for Quantized Convolutional Neural Networks

Convolutional neural networks are normally used in systems with dedicated neural processing units for CNN-related computations. For high performance and low hardware overheads, CNN datatype quantization is applied. As an additional optimization, to ...

research-article
A High-Performance, Energy-Efficient Modular DMA Engine Architecture

Data transfers are essential in today's computing systems as latency and complex memory access patterns are increasingly challenging to manage. Direct memory access engines (DMAEs) are critically needed to transfer data independently of the ...

research-article
Correct-by-Construction Design of Custom Accelerator Microarchitectures

Modern application-specific System-on-Chip designs include a variety of accelerator blocks that customize microcontrollers with domain-specific instruction sets and optimized microarchitectures. Unfortunately, accelerator implementations can be highly ...

research-article
Unified Digit Selection for Radix-4 Recurrence Division and Square Root

Division and square root are fundamental operations required by most computer systems. They are commonly implemented in hardware using radix-4 recurrence, which produces a 2-bit result digit on each step. Unified digit selection logic chooses the next ...
