Zhu et al., 2007 - Google Patents

Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures

Zhu et al., 2007

Document ID: 14376284014824086320
Author: Zhu W; Sreedhar V; Hu Z; Gao G
Publication year: 2007
Publication venue: Proceedings of the 34th annual international symposium on Computer architecture

External Links

Cited by

Snippet

Efficient fine-grain synchronization is extremely important to effectively harness the computational power of many-core architectures. However, designing and implementing finegrain synchronization in such architectures presents several challenges, including …

Continue reading at www.capsl.udel.edu (PDF) (other versions)

230000015654 memory 0 abstract description 109

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/30076—Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
- G06F9/30087—Synchronisation or serialisation instructions
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Programme synchronisation; Mutual exclusion, e.g. by means of semaphores; Contention for resources among tasks
- G06F9/524—Deadlock detection or avoidance
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3885—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
- G06F9/3889—Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled by multiple instructions, e.g. MIMD, decoupled access or execute
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling, out of order instruction execution
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G06F15/163—Interprocessor communication
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored programme computers
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring

Similar Documents

Publication	Publication Date	Title
Zhu et al.	2007	Synchronization state buffer: supporting efficient fine-grain synchronization on many-core architectures
Fung et al.	2011	Hardware transactional memory for GPU architectures
Treichler et al.	2014	Realm: An event-based low-level runtime for distributed memory architectures
Tullsen et al.	1999	Supporting fine-grained synchronization on a simultaneous multithreading processor
Lê et al.	2013	Correct and efficient work-stealing for weak memory models
Zhang et al.	2015	Exploiting commutativity to reduce the cost of updates to shared data in cache-coherent systems
Blake et al.	2009	Proactive transaction scheduling for contention management
Mukherjee et al.	2016	A comprehensive performance analysis of HSA and OpenCL 2.0
Xu et al.	2016	Lock-based synchronization for GPU architectures
Abdolrashidi et al.	2017	Wireframe: Supporting data-dependent parallelism through dependency graph execution in gpus
Bhattacharjee et al.	2011	Parallelization libraries: Characterizing and reducing overheads
Yiapanis et al.	2013	Optimizing software runtime systems for speculative parallelization
Tang et al.	2015	Cache-oblivious wavefront: improving parallelism of recursive dynamic programming algorithms without losing cache-efficiency
Kumar et al.	2008	Atomic vector operations on chip multiprocessors
Mühlig et al.	2021	MxTasks: How to Make Efficient Synchronization and Prefetching Easy
del Cuvillo et al.	2006	Landing openmp on cyclops-64: An efficient mapping of openmp to a many-core system-on-a-chip
Bousias et al.	2006	Instruction level parallelism through microthreading—a scalable approach to chip multiprocessors
Xiang et al.	2015	Conflict reduction in hardware transactions using advisory locks
Kranz et al.	1994	Low-cost support for fine-grain synchronization in multiprocessors
Nelson et al.	2019	Don't forget about synchronization! a case study of k-means on gpu
Dallou et al.	2012	Hardware-based task dependency resolution for the starss programming model
Holey	2015	Enhancing GPU programmability and correctness through transactional execution
US20180373560A1 (en)	2018-12-27	Snapshot isolation in graphical processing unit hardware transactional memory
Jesshope et al.	2009	The implementation of an svp many-core processor and the evaluation of its memory architecture
Nelson et al.	2022	Don't forget about synchronization! Guidelines for using locks on graphics processing units