Sun et al., 2024

FPGA-based acceleration architecture for Apache Spark operators

Document ID: 9958732504838835754
Author: Sun Y; Liu H; Liao X; Jin H; Zhang Y
Publication year: 2024
Publication venue: CCF Transactions on High Performance Computing

Snippet

Apache Spark has been the most popular in-memory processing framework for big data applications deployed in data centers. As a CPU-only parallel programming framework, Spark can satisfy the requirement of computing resource by scaling up the nodes of clusters …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogramme communication; Intertask communication
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30575—Replication, distribution or synchronisation of data between databases or within a distributed database; Distributed database system architectures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/44—Arrangements for executing specific programmes
- G06F9/455—Emulation; Software simulation, i.e. virtualisation or emulation of application or operating system execution engines
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G06F15/78—Architectures of general purpose stored programme computers comprising a single central processing unit
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network-specific arrangements or communication protocols supporting networked applications
- H04L67/10—Network-specific arrangements or communication protocols supporting networked applications in which an application is distributed across nodes in the network

Similar Documents

Publication	Publication Date	Title
Ma et al.	2017	Garaph: Efficient GPU-accelerated graph processing on a single machine with balanced replication
Li et al.	2016	MapReduce parallel programming model: a state-of-the-art survey
Zaharia et al.	2010	Spark: Cluster computing with working sets
Siddique et al.	2016	Apache Hama: An emerging bulk synchronous parallel computing framework for big data applications
US20170091668A1 (en)	2017-03-30	System and method for network bandwidth aware distributed learning
Zhao et al.	2014	Kylix: A sparse allreduce for commodity clusters
Chu et al.	2016	CUDA kernel based collective reduction operations on large-scale GPU clusters
Hashmi et al.	2020	FALCON-X: Zero-copy MPI derived datatype processing on modern CPU and GPU architectures
Slagter et al.	2014	SmartJoin: a network-aware multiway join for MapReduce
Qiu et al.	2013	Mammoth data in the cloud: clustering social images
Nowicki et al.	2021	PCJ Java library as a solution to integrate HPC, Big Data and Artificial Intelligence workloads
Hou et al.	2019	Design and implementation of reconfigurable acceleration for in-memory distributed big data computing
Piñeiro et al.	2022	A unified framework to improve the interoperability between HPC and Big Data languages and programming models
Kalnis et al.	2012	Mizan: Optimizing graph mining in large parallel systems
Sun et al.	2023	FPGA-based acceleration architecture for Apache Spark operators
Chavarria-Miranda et al.	2008	Early experience with out-of-core applications on the Cray XMT
Cheng et al.	2018	A highly cost-effective task scheduling strategy for very large graph computation
Li et al.	2019	Dual buffer rotation four-stage pipeline for CPU–GPU cooperative computing
Potluri et al.	2018	Efficient breadth first search on multi-GPU systems using GPU-centric OpenSHMEM
Wickramasinghe et al.	2022	High‐performance iterative dataflow abstractions in Twister2: TSet
Lehner et al.	2018	Diversity of Processing Units: An Attempt to Classify the Plethora of Modern Processing Units
US11194625B2 (en)	2021-12-07	Systems and methods for accelerating data operations by utilizing native memory management
Popov et al.	2022	Teragraph heterogeneous system for ultra-large graph processing
Döschl et al.	2021	Performance evaluation of GPU- and cluster-computing for parallelization of compute-intensive tasks
Grant et al.	2015	Networks and MPI for cluster computing