DOI: 10.1145/3410463.3414629
research-article

The Forward Slice Core Microarchitecture

Published: 30 September 2020

Abstract

Superscalar out-of-order cores deliver high performance at the cost of increased complexity and power budget. In-order cores, in contrast, are less complex and have a smaller power budget, but offer low performance. A processor architecture should ideally provide high performance in a power- and cost-efficient manner. Recently proposed slice-out-of-order (sOoO) cores identify backward slices of memory operations which they execute out-of-order with respect to the rest of the dynamic instruction stream for increased instruction-level and memory-hierarchy parallelism. Unfortunately, constructing backward slices is imprecise and hardware-inefficient, leaving performance on the table.
In this paper, we propose Forward Slice Core (FSC), a novel core microarchitecture that builds on a stall-on-use in-order core and extracts more instruction-level and memory-hierarchy parallelism than slice-out-of-order cores. FSC does so by identifying and steering forward slices (rather than backward slices) to dedicated in-order FIFO queues. Moreover, FSC puts load-consumers that depend on L1 D-cache misses on the side to enable younger independent load-consumers to execute faster. Finally, FSC eliminates the need for dynamic memory disambiguation by replicating store-address instructions across queues. FSC improves performance by 9.7% on average compared to Freeway, the state-of-the-art sOoO core, across the SPEC CPU2017 benchmarks, while incurring reduced hardware complexity and a similar power budget.
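The steering idea in the abstract can be illustrated with a minimal sketch. It assumes that a forward slice is the set of instructions that transitively consume a load's result, and that slice membership can be tracked with one "load-derived" mark per register. The `Instr` type, the `steer` function, the register names, and the two-queue split are hypothetical and chosen for illustration only. The sketch does not model the paper's actual hardware, the replication of store-address instructions, or the special handling of consumers of L1 D-cache misses.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical decoded instruction: opcode, destination, sources."""
    op: str
    dst: Optional[str]
    srcs: Tuple[str, ...] = ()


def steer(program):
    """Split an instruction stream into two in-order FIFO queues.

    An instruction joins the forward slice if it reads a register whose
    value was (transitively) produced by a load.
    """
    tainted = set()  # registers currently holding load-derived values
    main_q, slice_q = deque(), deque()
    for ins in program:
        in_slice = any(src in tainted for src in ins.srcs)
        (slice_q if in_slice else main_q).append(ins)
        if ins.dst is not None:
            if ins.op == "load" or in_slice:
                tainted.add(ins.dst)      # result depends on a load
            else:
                tainted.discard(ins.dst)  # overwritten by an independent value
    return main_q, slice_q


program = [
    Instr("load", "r1", ("r0",)),      # load with an independent address
    Instr("add", "r2", ("r1",)),       # consumes the load result: slice
    Instr("add", "r3", ("r0",)),       # independent
    Instr("load", "r4", ("r3",)),      # second load, independent of the first
    Instr("mul", "r5", ("r2", "r3")),  # reads r2, which is load-derived: slice
    Instr("add", "r2", ("r0",)),       # overwrites r2 with an independent value
    Instr("sub", "r6", ("r2",)),       # r2 is no longer load-derived
]
main_q, slice_q = steer(program)
print([i.op for i in main_q])
print([i.op for i in slice_q])
```

In this toy stream the second load lands in the main queue behind only independent instructions, so it is not held up by the consumers of the first load, which wait in the slice queue. This mirrors, in a very simplified form, the memory-hierarchy parallelism the abstract refers to.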

Published In

PACT '20: Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques
September 2020
505 pages
ISBN: 9781450380751
DOI: 10.1145/3410463
General Chair: Vivek Sarkar
Program Chair: Hyesoon Kim

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. complexity effective architecture
  2. mobile computing
  3. slice out of order cores
  4. superscalar architecture

Qualifiers

  • Research-article

Conference

PACT '20

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Article Metrics

  • Downloads (Last 12 months): 54
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 20 Jan 2025

Cited By

  • (2024) FOCAL: A First-Order Carbon Model to Assess Processor Sustainability. Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2, pages 401-415. DOI: 10.1145/3620665.3640415. Online publication date: 27-Apr-2024
  • (2024) Early Execution for Soft Error Detection. 2024 37th International Conference on VLSI Design and 2024 23rd International Conference on Embedded Systems (VLSID), pages 366-371. DOI: 10.1109/VLSID60093.2024.00067. Online publication date: 6-Jan-2024
  • (2023) Orinoco: Ordered Issue and Unordered Commit with Non-Collapsible Queues. Proceedings of the 50th Annual International Symposium on Computer Architecture, pages 1-14. DOI: 10.1145/3579371.3589046. Online publication date: 17-Jun-2023
  • (2023) Performance Analysis of Criticality-Aware Out-of-Order Cores for Exploiting MLP. 2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), pages 1-4. DOI: 10.1109/ITC-CSCC58803.2023.10212794. Online publication date: 25-Jun-2023
  • (2023) ERrOR: Improving Performance and Fault Tolerance Using Early Execution. 2023 IEEE 29th International Symposium on On-Line Testing and Robust System Design (IOLTS), pages 1-3. DOI: 10.1109/IOLTS59296.2023.10224863. Online publication date: 3-Jul-2023
  • (2022) Dependence-aware Slice Execution to Boost MLP in Slice-out-of-order Cores. ACM Transactions on Architecture and Code Optimization, 19:2, pages 1-28. DOI: 10.1145/3506704. Online publication date: 7-Mar-2022
  • (2022) CRISP: critical slice prefetching. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pages 300-313. DOI: 10.1145/3503222.3507745. Online publication date: 28-Feb-2022
  • (2022) The Forward Slice Core: A High-Performance, Yet Low-Complexity Microarchitecture. ACM Transactions on Architecture and Code Optimization, 19:2, pages 1-25. DOI: 10.1145/3499424. Online publication date: 31-Jan-2022
  • (2022) A First-Order Model to Assess Computer Architecture Sustainability. IEEE Computer Architecture Letters, 21:2, pages 137-140. DOI: 10.1109/LCA.2022.3217366. Online publication date: 1-Jul-2022
  • (2021) Vector runahead. Proceedings of the 48th Annual International Symposium on Computer Architecture, pages 195-208. DOI: 10.1109/ISCA52012.2021.00024. Online publication date: 14-Jun-2021
