Performance and Thermal Tradeoffs for Energy-Efficient Monolithic 3D Network-on-Chip

Published: 22 August 2018

Abstract

Three-dimensional (3D) integration enables the design of high-performance and energy-efficient network-on-chip (NoC) architectures as communication backbones for manycore chips. To exploit the benefits of the vertical dimension of 3D integration, through-silicon vias (TSVs) have been predominantly used in state-of-the-art manycore chip design. However, for TSV-based systems, high power density and the resulting thermal hotspots remain major concerns for chip functionality and overall reliability. The power consumption and thermal profiles of 3D NoCs can be improved by incorporating a Voltage-Frequency-Island (VFI)-based power management strategy. However, the inherent thermal constraints of a TSV-based 3D system prevent the benefits of this power management methodology from being fully exploited. In this context, the emergence of monolithic 3D (M3D) integration has opened up new possibilities for designing ultra-low-power and high-performance circuits and systems. The smaller dimensions of the inter-layer dielectric (ILD) and monolithic inter-tier vias (MIVs) offer high-density integration, flexibility in partitioning logic blocks across multiple tiers, and a significant reduction in total wire length. In this work, we present the first study of the performance-thermal tradeoffs for energy-efficient monolithic 3D manycore chips. In particular, we present a comparative performance evaluation of M3D NoCs with respect to their conventional TSV-based counterparts. We demonstrate that the proposed M3D-based NoC architecture incorporating VFI-based power management achieves up to 29.4% lower energy-delay product (EDP) than the TSV-based designs across a large set of benchmarks. We also demonstrate that the M3D-based NoC shows up to 29.1% lower maximum temperature than its TSV-based counterpart for these benchmarks.
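
The energy-delay product named in the abstract is conventionally the product of the energy consumed and the execution time, and a "percent lower" figure is a relative reduction against a baseline. The following is a minimal sketch of that arithmetic only. The function names and all numeric inputs are hypothetical placeholders chosen for illustration; they are not measurements or code from the paper.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Return EDP = energy x delay; a lower value is better."""
    if energy_joules < 0 or delay_seconds < 0:
        raise ValueError("energy and delay must be non-negative")
    return energy_joules * delay_seconds


def percent_reduction(baseline: float, candidate: float) -> float:
    """Return how much lower `candidate` is than `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - candidate) / baseline


# Hypothetical placeholder values, NOT results from the paper.
tsv_edp = energy_delay_product(energy_joules=2.0, delay_seconds=0.5)
m3d_edp = energy_delay_product(energy_joules=1.6, delay_seconds=0.5)
print(f"M3D EDP is {percent_reduction(tsv_edp, m3d_edp):.1f}% lower than TSV")
```

The same `percent_reduction` helper applies to the maximum-temperature comparison, with peak temperatures in place of EDP values.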

Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 23, Issue 5
September 2018
310 pages
ISSN: 1084-4309
EISSN: 1557-7309
DOI: 10.1145/3268934
Editor: Naehyuck Chang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 August 2018
Accepted: 01 May 2018
Revised: 01 April 2018
Received: 01 January 2018
Published in TODAES Volume 23, Issue 5

Author Tags

  1. 3D NoC
  2. VFI power management
  3. monolithic 3D
  4. thermal hotspots

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 171
  • Downloads (last 6 weeks): 21
Reflects downloads up to 01 Jan 2025

Cited By

  • (2024) NeuroTAP: Thermal and Memory Access Pattern-Aware Data Mapping on 3D DRAM for Maximizing DNN Performance. ACM Transactions on Embedded Computing Systems 23:6 (1-30). DOI: 10.1145/3677178. Online publication date: 11-Sep-2024.
  • (2024) Thermal-Aware Scheduling for Deep Learning on Mobile Devices With NPU. IEEE Transactions on Mobile Computing 23:12 (10706-10719). DOI: 10.1109/TMC.2024.3379501. Online publication date: Dec-2024.
  • (2024) 3D-TemPo: Optimizing 3-D DRAM Performance Under Temperature and Power Constraints. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:8 (2263-2276). DOI: 10.1109/TCAD.2024.3367235. Online publication date: 1-Aug-2024.
  • (2024) A survey on mapping and scheduling techniques for 3D Network-on-chip. Journal of Systems Architecture: the EUROMICRO Journal 147:C. DOI: 10.1016/j.sysarc.2024.103064. Online publication date: 17-Apr-2024.
  • (2023) NeuroCool: Dynamic Thermal Management of 3D DRAM for Deep Neural Networks through Customized Prefetching. ACM Transactions on Design Automation of Electronic Systems 29:1 (1-35). DOI: 10.1145/3630012. Online publication date: 18-Dec-2023.
  • (2023) Dynamic Thermal Management of 3D Memory through Rotating Low Power States and Partial Channel Closure. ACM Transactions on Embedded Computing Systems 22:6 (1-27). DOI: 10.1145/3624581. Online publication date: 9-Nov-2023.
  • (2023) The Learnable Model-Based Genetic Algorithm for the IP Mapping Problem. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:7 (2350-2363). DOI: 10.1109/TCAD.2022.3215023. Online publication date: Jul-2023.
  • (2023) 3D-TTP: Efficient Transient Temperature-Aware Power Budgeting for 3D-Stacked Processor-Memory Systems. 2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (1-6). DOI: 10.1109/ISVLSI59464.2023.10238664. Online publication date: 20-Jun-2023.
  • (2023) Parallel Software-Based Self-Testing with Bounded Model Checking for Kilo-Core Networks-on-Chip. Journal of Computer Science and Technology 38:2 (405-421). DOI: 10.1007/s11390-022-2553-3. Online publication date: 30-Mar-2023.
  • (2023) Aggressive GPU cache bypassing with monolithic 3D-based NoC. The Journal of Supercomputing 79:5 (5421-5442). DOI: 10.1007/s11227-022-04878-6. Online publication date: 1-Mar-2023.
