Santana et al., 2003 - Google Patents
Latency tolerant branch predictorsSantana et al., 2003
View PDF- Document ID
- 11284262451982427825
- Author
- Santana O
- Ramirez A
- Valero M
- Publication year
- Publication venue
- Innovative Architecture for Future Generation High-Performance Processors and Systems, 2003
External Links
Snippet
The access latency of branch predictors is a well known problem of fetch engine design. Prediction overriding techniques are commonly accepted to overcome this problem. However, prediction overriding requires a complex recovery mechanism to discard the …
- 238000000034 method 0 abstract description 10
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3824—Operand accessing
- G06F9/383—Operand prefetching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3802—Instruction prefetching
- G06F9/3808—Instruction prefetching for instruction reuse, e.g. trace cache, branch target cache
- G06F9/381—Loop buffering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3836—Instruction issuing, e.g. dynamic instruction scheduling, out of order instruction execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/38—Concurrent instruction execution, e.g. pipeline, look ahead
- G06F9/3867—Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
- G06F9/3873—Variable length pipelines, e.g. elastic pipeline
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/30—Arrangements for executing machine-instructions, e.g. instruction decode
- G06F9/32—Address formation of the next instruction, e.g. incrementing the instruction counter, jump
- G06F9/322—Address formation of the next instruction, e.g. incrementing the instruction counter, jump for non-sequential address
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
- G06F8/44—Encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G06F1/26—Power supply means, e.g. regulation thereof
- G06F1/32—Means for saving power
- G06F1/3203—Power Management, i.e. event-based initiation of power-saving mode
- G06F1/3234—Action, measure or step performed to reduce power consumption
- G06F1/3237—Power saving by disabling clock generation or distribution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/50—Computer-aided design
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G06F2201/885—Monitoring specific for caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02B—INDEXING SCHEME RELATING TO CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO BUILDINGS, e.g. INCLUDING HOUSING AND APPLIANCES OR RELATED END-USER APPLICATIONS
- Y02B60/00—Information and communication technologies [ICT] aiming at the reduction of own energy use
- Y02B60/10—Energy efficient computing
- Y02B60/12—Reducing energy-consumption at the single machine level, e.g. processors, personal computers, peripherals, power supply
- Y02B60/1207—Reducing energy-consumption at the single machine level, e.g. processors, personal computers, peripherals, power supply acting upon the main processing unit
- Y02B60/1217—Frequency modification
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jiménez et al. | The impact of delay on the design of branch predictors | |
Venkat et al. | Harnessing ISA diversity: Design of a heterogeneous-ISA chip multiprocessor | |
Patt et al. | One billion transistors, one uniprocessor, one chip | |
Ramirez et al. | Fetching instruction streams | |
Ramirez et al. | Software trace cache | |
Chou et al. | Piperench implementation of the instruction path coprocessor | |
Ramirez et al. | Trace cache redundancy: Red and blue traces | |
Woo et al. | COMPASS: a programmable data prefetcher using idle GPU shaders | |
Santana et al. | Latency tolerant branch predictors | |
Oberoi et al. | Parallelism in the front-end | |
Desmet et al. | Enlarging instruction streams | |
Marculescu | Power efficient processors using multiple supply voltages | |
Burtscher et al. | Hybridizing and coalescing load value predictors | |
Santana et al. | A low-complexity fetch architecture for high-performance superscalar processors | |
Oberoi et al. | Out-of-order instruction fetch using multiple sequencers | |
Falcón et al. | Tolerating branch predictor latency on SMT | |
Santana et al. | Reducing fetch architecture complexity using procedure inlining | |
Santana Jaria et al. | Techniques for enlarging instruction streams | |
Santana et al. | Multiple stream prediction | |
Eskesen et al. | Performance analysis of simultaneous multithreading in a PowerPC-based processor | |
Ali et al. | Predictive line buffer: a fast, energy efficient cache architecture | |
Bhargava et al. | Value prediction design for high-frequency microprocessors | |
Charles et al. | Improving ILP with instruction-reuse cache hierarchy | |
Ismail | Evaluation of dynamic branch predictors for modern ILP processors | |
Bournoutian et al. | Reducing impact of cache miss stalls in embedded systems by extracting guaranteed independent instructions |