Reflects downloads up to 31 Jan 2025
research-article
Acceleration of Multi-Body Molecular Dynamics With Customized Parallel Dataflow

FPGAs are drawing increasing attention in resolving molecular dynamics (MD) problems, and have already been applied in problems such as two-body potentials, force fields composed of these potentials, etc. Competitive performance is obtained compared with ...

research-article
Optimizing I/O Performance Through Effective vCPU Scheduling Interference Management

Virtual machines (VMs) heavily rely on virtual CPU (vCPU) scheduling to achieve efficient I/O performance. vCPU scheduling interference can cause inconsistent scheduling latency and degraded I/O performance, potentially compromising ...

research-article
Two-Timescale Joint Optimization of Task Scheduling and Resource Scaling in Multi-Data Center System Based on Multi-Agent Deep Reinforcement Learning

As a new computing paradigm, multi-data center computing enables service providers to deploy their applications close to the users. However, due to the spatio-temporal changes in workloads, it is challenging to coordinate multiple distributed data centers ...

research-article
Fair Coflow Scheduling via Controlled Slowdown

The average coflow completion time (CCT) is the standard performance metric in coflow scheduling. However, standard CCT minimization may introduce unfairness between the data transfer phase of different computing jobs. Thus, while progress guarantees have ...

research-article
GeoDeploy: Geo-Distributed Application Deployment Using Benchmarking

Geo-distributed web-applications (GWA) can be deployed across multiple geographically separated datacenters to reduce the latency of access for users. Finding a suitable deployment for a GWA is challenging due to the requirement to consider a number of ...

research-article
Efficient Schedule Construction for Distributed Execution of Large DNN Models

Increasingly complex and diverse deep neural network (DNN) models necessitate distributing the execution across multiple devices for training and inference tasks, and also require carefully planned schedules for performance. However, existing practices ...

research-article
Distributed Task Processing Platform for Infrastructure-Less IoT Networks: A Multi-Dimensional Optimization Approach

With the rapid development of artificial intelligence (AI) and the Internet of Things (IoT), intelligent information services have showcased unprecedented capabilities in acquiring and analysing information. The conventional task processing platforms rely ...

research-article
VisionAGILE: A Versatile Domain-Specific Accelerator for Computer Vision Tasks

The emergence of diverse machine learning (ML) models has led to groundbreaking revolutions in computer vision (CV). These ML models include convolutional neural networks (CNNs), graph neural networks (GNNs), and vision transformers (ViTs). However, ...

research-article
FastLoad: Speeding Up Data Loading of Both Sparse Matrix and Vector for SpMV on GPUs

Sparse Matrix-Vector Multiplication (SpMV) on GPUs has gained significant attention because of SpMV's importance in modern applications and the increasing computing power of GPUs in the last decade. Previous studies have emphasized the importance ...

research-article
BCB-SpTC: An Efficient Sparse High-Dimensional Tensor Contraction Employing Tensor Core Acceleration

Sparse tensor contraction (SpTC) is an important operator in tensor networks, which tends to generate a large amount of sparse high-dimensional data, placing higher demands on the computational performance and storage bandwidth of the processor. Using ...

research-article
Competitive Analysis of Online Elastic Caching of Transient Data in Multi-Tiered Content Delivery Network

As the demand for faster and more reliable content delivery escalates, Content Delivery Networks (CDNs) face significant challenges in managing content placement across their increasingly complex, multi-tiered structures to balance performance, complexity, ...

research-article
Open Access
A Survey on Performance Modeling and Prediction for Distributed DNN Training

Recent breakthroughs in large-scale DNNs have attracted significant attention from both academia and industry toward distributed DNN training techniques. Due to the time-consuming and expensive execution process of large-scale distributed DNN training, it is ...

research-article
TrieKV: A High-Performance Key-Value Store Design With Memory as Its First-Class Citizen

Key-value (KV) stores based on the log-structured merge tree (LSM-tree) have been extensively studied and deployed in major information technology infrastructures. Because this type of system caters to KV stores accessing disks, a limited disk bandwidth ...

research-article
Open Access
Mitosis: A Scalable Sharding System Featuring Multiple Dynamic Relay Chains

Sharding is a prevalent approach for addressing performance issues in blockchain. To reduce governance complexities and ensure system security, a common practice involves a relay chain to coordinate cross-shard transactions. However, with a growing number ...

research-article
Breaking the Memory Wall for Heterogeneous Federated Learning via Model Splitting

Federated Learning (FL) enables multiple devices to collaboratively train a shared model while preserving data privacy. Ever-increasing model complexity coupled with limited memory resources on the participating devices severely bottlenecks the deployment ...

research-article
TARIS: Scalable Incremental Processing of Time-Respecting Algorithms on Streaming Graphs

Temporal graphs change with time and have a lifespan associated with each vertex and edge. These graphs are suitable to process time-respecting algorithms where the traversed edges must have monotonic timestamps. Interval-centric Computing Model (ICM) is ...

research-article
MoltDB: Accelerating Blockchain via Ancient State Segregation

Blockchains store states in Log-Structured Merge (LSM) tree-based databases. Due to blockchain traceability, the growing ancient states are inevitably stored in the databases. Unfortunately, by default, this process mixes current and ...

research-article
Efficient Distributed Edge Computing for Dependent Delay-Sensitive Tasks in Multi-Operator Multi-Access Networks

We study the problem of distributed computing in the multi-operator multi-access edge computing (MEC) network for dependent tasks. Every task comprises several sub-tasks which are executed based on ...

research-article
PeakFS: An Ultra-High Performance Parallel File System via Computing-Network-Storage Co-Optimization for HPC Applications

Emerging high-performance computing (HPC) applications with diverse workload characteristics impose greater demands on parallel file systems (PFSs). PFSs also require more efficient software designs to fully utilize the performance of modern hardware, ...

research-article
Design and Performance Evaluation of Linearly Extensible Cube-Triangle Network for Multicore Systems

High-performance interconnection networks are currently being used to design Massively Parallel Computers. Selecting the set of nodes on which parallel tasks execute plays a vital role in the performance of such systems. These networks when deployed to ...

research-article
HybRAID: A High-Performance Hybrid RAID Storage Architecture for Write-Intensive Applications in All-Flash Storage Systems

With the ever-increasing demand for higher I/O performance and reliability in data-intensive applications, solid-state drives (SSDs) typically configured as redundant array of independent disks (RAID) are broadly used in ...

research-article
DyLaClass: Dynamic Labeling Based Classification for Optimal Sparse Matrix Format Selection in Accelerating SpMV

Sparse matrix-vector multiplication (SpMV) is crucial in many scientific and engineering applications. Regarding the effectiveness of different sparse matrix storage formats for various architectures, no single format excels across all ...
