Low-complexity vector microprocessor extension

January 2008

Author:
Joseph James Gebis
University of California, Berkeley
Adviser:
David A. Patterson
University of California, Berkeley

Publisher:

University of California at Berkeley
Computer Science Division 571 Evans Hall Berkeley, CA
United States

ISBN: 978-0-549-87452-2

Order Number: AAI3334297

Pages: 181


Abstract

For the last few years, single-thread performance has been improving at a snail’s pace. Power limitations, increasing relative memory latency, and the exhaustion of improvement in instruction-level parallelism are forcing microprocessor architects to examine new processor design strategies. In this dissertation, I examine a technology that can improve the efficiency of modern microprocessors: vectors. Vectors are a simple, power-efficient way to take advantage of common data-level parallelism in an extensible, easily programmable manner. My work focuses on the process of transitioning from traditional scalar microprocessors to computers that can take advantage of vectors.

First, I describe a process for extending existing single-instruction, multiple-data instruction sets to support full vector processing in a way that remains binary compatible with existing applications. Initial implementations can be low cost and can later be transparently extended to higher performance.

I also describe ViVA, the Virtual Vector Architecture. ViVA adds vector-style memory operations to existing microprocessors but does not include arithmetic datapaths; instead, memory instructions work with a new buffer placed between the core and second-level cache. ViVA serves as a low-cost way to obtain much of the performance of full vector memory hierarchies while avoiding the complexity of adding a full vector system.
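As a rough software analogy only, the split described above (vector-style memory transfers into a buffer, with arithmetic left to the scalar core) can be sketched in C. Everything here is a hypothetical illustration: the names `model_buffer`, `buffer_load_strided`, and `sum_strided` and the capacity `BUF_ELEMS` are assumptions, not ViVA's actual instructions, buffer size, or programming interface.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_ELEMS 64  /* hypothetical buffer capacity, in elements */

/* Stand-in for the buffer placed between the core and the second-level cache. */
typedef struct {
    double data[BUF_ELEMS];
} model_buffer;

/* Models a vector-style strided load: it only moves data from memory into
 * the buffer and performs no arithmetic. */
static void buffer_load_strided(model_buffer *buf, const double *base,
                                size_t stride, size_t count)
{
    assert(count <= BUF_ELEMS);
    for (size_t i = 0; i < count; i++)
        buf->data[i] = base[i * stride];
}

/* Sums n elements spaced `stride` apart. Each chunk is first transferred
 * into the buffer, then consumed one element at a time by ordinary scalar
 * code, mirroring the "memory operations only" division of labor. */
static double sum_strided(const double *base, size_t stride, size_t n)
{
    model_buffer buf;
    double total = 0.0;
    for (size_t done = 0; done < n; done += BUF_ELEMS) {
        size_t chunk = (n - done < BUF_ELEMS) ? (n - done) : BUF_ELEMS;
        buffer_load_strided(&buf, base + done * stride, stride, chunk);
        for (size_t i = 0; i < chunk; i++)  /* scalar arithmetic */
            total += buf.data[i];
    }
    return total;
}
```

In this model, calling `sum_strided(m, 4, 3)` on a row-major matrix with four columns sums the first three entries of column 0. The point of the sketch is the separation of data movement from computation, not performance.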

Finally, I test the performance of ViVA by modifying a cycle-accurate full-system simulator to support ViVA’s operation. After extensive calibration, I test the basic performance of ViVA using a series of microbenchmarks. I compare the performance of a variety of ViVA configurations for corner turn, used in processing multidimensional data, and sparse matrix-vector multiplication, used in many scientific applications. Results show that ViVA can give significant benefit for a variety of memory access patterns, without relying on a costly hardware prefetcher.
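For readers unfamiliar with the two kernels named above, the following C sketch shows plain scalar reference versions and the memory access patterns they produce. It is not the benchmark code from the dissertation. The compressed sparse row (CSR) layout and the function names `corner_turn` and `spmv_csr` are assumptions chosen for illustration; the abstract does not say which sparse format was used.

```c
#include <assert.h>
#include <stddef.h>

/* Corner turn: writes the transpose of a row-major rows x cols matrix `in`
 * into `out` (cols x rows). Reads are unit-stride; writes are strided by
 * `rows` elements. */
static void corner_turn(const double *in, double *out,
                        size_t rows, size_t cols)
{
    assert(in != out);  /* this simple version is out-of-place only */
    for (size_t r = 0; r < rows; r++)
        for (size_t c = 0; c < cols; c++)
            out[c * rows + r] = in[r * cols + c];
}

/* Sparse matrix-vector multiply, y = A * x, with A stored in CSR form:
 * row_ptr has nrows + 1 entries; col_idx and val hold one entry per
 * nonzero. Reads of x are indexed (gather-like), driven by col_idx. */
static void spmv_csr(size_t nrows, const size_t *row_ptr,
                     const size_t *col_idx, const double *val,
                     const double *x, double *y)
{
    for (size_t r = 0; r < nrows; r++) {
        double acc = 0.0;
        for (size_t k = row_ptr[r]; k < row_ptr[r + 1]; k++)
            acc += val[k] * x[col_idx[k]];
        y[r] = acc;
    }
}
```

The strided writes in `corner_turn` and the indexed reads of `x` in `spmv_csr` are the kinds of non-unit-stride access that vector-style memory operations are meant to handle.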

Recommendations

A low-complexity microprocessor design with speculative pre-execution

Current superscalar architectures strongly depend on an instruction issue queue to achieve multiple instruction issue and out-of-order execution. However, the issue queue requires a centralized structure and mainly causes globally broadcasting ...

Design, implementation, and evaluation of a low-complexity vector-core for executing scalar/vector instructions

This paper proposes a low-complexity vector-core called LcVc for executing both scalar and vector instructions on the same execution datapath. A unified register file in the decode stage is used for storing both scalar operands and vector elements. The ...

The PowerPC 620 microprocessor: a high performance superscalar RISC microprocessor
COMPCON '95: Proceedings of the 40th IEEE Computer Society International Conference

The PowerPC 620 RISC microprocessor is the first chip for the application server and technical workstation product line within the PowerPC family. It utilizes a high performance microarchitecture with many advanced superscalar features to exploit ...
