Preemptive Zone Reset Design within Zoned Namespace SSD Firmware
Figure 1. Logical-to-Physical Zone Mapping Method.
Figure 2. Average block erase execution time by zone size.
Figure 3. Physical layout of ZNS prototype. Zone size is configured as two erase blocks.
Figure 4. Partial Zone Erase scheme.
Figure 5. State diagram of Preemptive Zone Reset.
Figure 6. fio latency for sequential writes.
Figure 7. fio latency for random writes.
Figure 8. RocksDB write latency of fillrandom with 464B values.
Figure 9. Total number of block erases performed during fillrandom with 464B values.
Figure 10. RocksDB read latency of mixgraph with 464B values.
Abstract
1. Introduction
- As yet, ZNS SSDs are not available for public use. Furthermore, detailed design decisions for the NVMe controller, firmware, flash controller, and physical layout of actual ZNS SSD prototypes have not been released by manufacturers. Thus, in order to conduct our firmware research, we implemented our own ZNS SSD prototype by modifying the Cosmos+ OpenSSD platform;
- Zone reset handling in ZNS SSD firmware remains unexplored in academia and undisclosed by manufacturers. We show that the block erases required to serve a zone reset request introduce significant overhead as zone sizes increase;
- We first implement an intuitive zone reset design that manages a zone mapping table and point out its limitations. Then, we present our proposed Preemptive Zone Reset scheme, which gives foreground I/O higher priority and performs Partial Zone Erases accordingly. We evaluate and compare both designs on our ZNS SSD prototype.
2. Background & Problem Definition
2.1. NVMe Zoned Namespace Standard
- The Zone Append command appends data to the zone matching the designated ZSLBA and returns the lowest LBA of the set of logical blocks written;
- The Zone Management Receive command returns to the host a data buffer containing information about zones, such as the zone state, zone capacity, WP, etc.;
- The Zone Management Send command can be used to request an action on one or more zones, including close zone, open zone, reset zone, etc.; a sketch of issuing one of these actions from the host is given below.
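As a concrete illustration, the following is a minimal sketch of sending a Reset Zone action through the Linux NVMe passthrough ioctl. The opcode (0x79) and Zone Send Action value (0x04) follow the NVMe Zoned Namespace Command Set specification; the device path, namespace ID, and ZSLBA are illustrative assumptions and are not taken from the paper.

```c
/* Hedged sketch: issue an NVMe ZNS Reset Zone via the Linux passthrough ioctl.
 * The device path, namespace ID, and ZSLBA below are illustrative assumptions. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/nvme_ioctl.h>

#define ZNS_OPC_ZONE_MGMT_SEND 0x79  /* Zone Management Send opcode */
#define ZNS_ZSA_RESET          0x04  /* Zone Send Action: Reset Zone */

static int reset_zone(int fd, uint32_t nsid, uint64_t zslba)
{
    struct nvme_passthru_cmd cmd;

    memset(&cmd, 0, sizeof(cmd));
    cmd.opcode = ZNS_OPC_ZONE_MGMT_SEND;
    cmd.nsid   = nsid;
    cmd.cdw10  = (uint32_t)(zslba & 0xffffffffu);  /* ZSLBA, low 32 bits  */
    cmd.cdw11  = (uint32_t)(zslba >> 32);          /* ZSLBA, high 32 bits */
    cmd.cdw13  = ZNS_ZSA_RESET;                    /* bit 8 (Select All) left clear */

    /* Returns the NVMe status code (0 on success) or -1 on ioctl failure. */
    return ioctl(fd, NVME_IOCTL_IO_CMD, &cmd);
}

int main(void)
{
    int fd = open("/dev/nvme0n1", O_RDWR);         /* assumed ZNS namespace */
    if (fd < 0) { perror("open"); return 1; }
    if (reset_zone(fd, 1, 0) != 0)                 /* reset the zone starting at LBA 0 */
        perror("reset_zone");
    return 0;
}
```

In practice a host would more commonly use the nvme-cli or libnvme tooling for this; the raw passthrough form is shown only to make the command fields visible.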
2.2. Problem Definition
2.2.1. Synchronous Method
2.2.2. Logical-to-Physical Zone Mapping Method
2.2.3. Disadvantages and Issues
3. ZNS SSD Prototype and Preemptive Zone Reset Design
3.1. ZNS SSD Prototype
3.1.1. Cosmos+ OpenSSD
3.1.2. ZNS Prototype Implementation
3.2. Preemptive Zone Reset
3.2.1. Partial Zone Erase
3.2.2. States and Thresholds
- State 0: services only foreground I/O requests and does not perform Partial Zone Erases;
- State 1: performs Partial Zone Erases only if there are no pending foreground I/Os to service;
- State 2: blocks all foreground I/O requests and performs all block erase operations necessary to provide a free zone, namely a Full Zone Erase on an invalid zone. In this state, securing a free zone that can be used for allocation is the highest priority. A conceptual sketch of this state-driven scheduling is given below.
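To make the policy concrete, the following is a conceptual sketch of the state selection and per-state behavior, assuming the state is chosen by comparing the number of free zones against two thresholds. All identifiers and threshold values (free_zone_count, T_RELAXED, T_URGENT, etc.) are illustrative assumptions, not code from the prototype firmware.

```c
/* Hedged sketch of the three-state Preemptive Zone Reset policy described above. */
#include <stdbool.h>

enum pzr_state { PZR_STATE0, PZR_STATE1, PZR_STATE2 };

#define T_RELAXED 8   /* assumed threshold: plenty of free zones, no erasing needed */
#define T_URGENT  1   /* assumed threshold: free zones nearly exhausted             */

/* Assumed firmware interfaces. */
extern unsigned free_zone_count(void);
extern bool     has_pending_host_io(void);
extern void     service_host_io(void);
extern void     partial_zone_erase(void);  /* erase one block of an invalid zone        */
extern void     full_zone_erase(void);     /* erase all remaining blocks of such a zone */

static enum pzr_state pick_state(void)
{
    unsigned free = free_zone_count();
    if (free > T_RELAXED) return PZR_STATE0;
    if (free > T_URGENT)  return PZR_STATE1;
    return PZR_STATE2;
}

/* One iteration of the firmware scheduling loop. */
void pzr_schedule(void)
{
    switch (pick_state()) {
    case PZR_STATE0:                  /* foreground I/O only, no Partial Zone Erases */
        if (has_pending_host_io())
            service_host_io();
        break;
    case PZR_STATE1:                  /* erase incrementally only when the host is idle */
        if (has_pending_host_io())
            service_host_io();
        else
            partial_zone_erase();
        break;
    case PZR_STATE2:                  /* securing a free zone takes absolute priority */
        full_zone_erase();
        break;
    }
}
```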
3.2.3. Implementation
4. Evaluation
4.1. fio Benchmark
4.2. RocksDB Benchmark
5. Related Works
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Western Digital Corporation. Zoned Storage. Available online: https://zonedstorage.io/docs/introduction/zoned-storage (accessed on 12 June 2022).
- Gupta, A.; Kim, Y.; Urgaonkar, B. DFTL: A Flash Translation Layer Employing Demand-Based Selective Caching of Page-Level Address Mappings. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XIV), Washington, DC, USA, 7–11 March 2009; Association for Computing Machinery: New York, NY, USA, 2009; pp. 229–240. [Google Scholar] [CrossRef]
- Kim, Y.; Gupta, A.; Urgaonkar, B. A Temporal Locality-Aware Page-Mapped Flash Translation Layer. J. Comput. Sci. Technol. 2013, 28, 1025. [Google Scholar] [CrossRef]
- Bjørling, M.; Aghayev, A.; Holmberg, H.; Ramesh, A.; Le Moal, D.; Ganger, G.R.; Amvrosiadis, G. ZNS: Avoiding the Block Interface Tax for Flash-based SSDs. In Proceedings of the USENIX Annual Technical Conference (ATC ’21), Santa Clara, CA, USA, 14–16 July 2021; pp. 689–703. [Google Scholar]
- Agrawal, N.; Prabhakaran, V.; Wobber, T.; Davis, J.D.; Manasse, M.; Panigrahy, R. Design Tradeoffs for SSD Performance. In Proceedings of the USENIX Annual Technical Conference (ATC ’08), Boston, MA, USA, 22–27 June 2008; USENIX Association: Berkeley, CA, USA, 2008; pp. 57–70. [Google Scholar]
- Hu, X.Y.; Eleftheriou, E.; Haas, R.; Iliadis, I.; Pletka, R. Write Amplification Analysis in Flash-Based Solid State Drives. In Proceedings of the ACM International Systems and Storage Conference (SYSTOR ’09), Haifa, Israel, 4–6 May 2009; Association for Computing Machinery: New York, NY, USA, 2009. [Google Scholar] [CrossRef]
- NVM Express. NVM Express Workgroup, NVM Express® Zoned Namespace Command Set Specification Revision 1.1a. Available online: https://www.nvmexpress.org/specifications (accessed on 12 June 2022).
- Bjørling, M. From Open-channel SSDs to Zoned Namespaces. Vault 2019. Available online: https://www.usenix.org/conference/vault19/presentation/bjorling (accessed on 12 June 2022).
- Choi, G.; Lee, K.; Oh, M.; Choi, J.; Jhin, J.; Oh, Y. A New LSM-Style Garbage Collection Scheme for ZNS SSDs. In Proceedings of the 12th USENIX Conference on Hot Topics in Storage and File Systems (HotStorage ’20), Virtual, 13–14 July 2020; USENIX Association: Berkeley, CA, USA, 2020. [Google Scholar]
- Stavrinos, T.; Berger, D.S.; Katz-Bassett, E.; Lloyd, W. Don’t Be a Blockhead: Zoned Namespaces Make Work on Conventional SSDs Obsolete. In Proceedings of the Workshop on Hot Topics in Operating Systems (HotOS ’21), Ann Arbor, MI, USA, 1–3 June 2021; Association for Computing Machinery: New York, NY, USA, 2021; pp. 144–151. [Google Scholar] [CrossRef]
- Han, K.; Gwak, H.; Shin, D.; Hwang, J. ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction. In Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2021, Virtual, 14–16 July 2021; Brown, A.D., Lorch, J.R., Eds.; USENIX Association: Berkeley, CA, USA, 2021; pp. 147–162. [Google Scholar]
- Lee, H.R.; Lee, C.G.; Lee, S.; Kim, Y. Compaction-Aware Zone Allocation for LSM Based Key-Value Store on ZNS SSDs. In Proceedings of the 14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage ’22), Virtual, 27–28 June 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 93–99. [Google Scholar] [CrossRef]
- Jung, J.; Shin, D. Lifetime-Leveling LSM-Tree Compaction for ZNS SSD. In Proceedings of the 14th ACM Workshop on Hot Topics in Storage and File Systems (HotStorage ’22), Virtual, 27–28 June 2022; Association for Computing Machinery: New York, NY, USA, 2022; pp. 100–105. [Google Scholar] [CrossRef]
- Wu, G.; He, X. Reducing SSD Read Latency via NAND Flash Program and Erase Suspension. In Proceedings of the 10th USENIX Conference on File and Storage Technologies (FAST ’12), San Jose, CA, USA, 14–17 February 2012; USENIX Association: Berkeley, CA, USA, 2012; p. 10. [Google Scholar] [CrossRef]
- Kwak, J.; Lee, S.; Park, K.; Jeong, J.; Song, Y.H. Cosmos+ OpenSSD: Rapid Prototype for Flash Storage Systems. ACM Trans. Storage 2020, 16, 15. [Google Scholar] [CrossRef]
- Axboe, J. fio Benchmark Tool. Available online: https://git.kernel.dk/cgit/fio/ (accessed on 12 June 2022).
- Facebook. RocksDB Database Benchmark Tool. Available online: https://github.com/facebook/rocksdb/wiki/Benchmarking-tools (accessed on 12 June 2022).
- Facebook. RocksDB. Available online: https://github.com/facebook/rocksdb (accessed on 12 June 2022).
- O’Neil, P.; Cheng, E.; Gawlick, D.; O’Neil, E. The Log-Structured Merge-Tree (LSM-tree). Acta Inform. 1996, 33, 351–385. [Google Scholar] [CrossRef]
- Western Digital Corporation. ZenFS. Available online: https://github.com/westerndigitalcorporation/zenfs (accessed on 12 June 2022).
- Holmberg, H. ZenFS, Zones and RocksDB—Who Likes to Take out the Garbage Anyway? SNIA 2020. Available online: https://www.snia.org/educational-library/zenfs-zones-and-rocksdb-who-likes-take-out-garbage-anyway-2020 (accessed on 12 June 2022).
- Cao, Z.; Dong, S.; Vemuri, S.; Du, D.H.C. Characterizing, Modeling, and Benchmarking RocksDB Key-Value Workloads at Facebook. In Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST’20), Santa Clara, CA, USA, 24–27 February 2020; USENIX Association: Berkeley, CA, USA, 2020; pp. 209–224. [Google Scholar]
- Salkhordeh, R.; Kremer, K.; Nagel, L.; Maisenbacher, D.; Holmberg, H.; Bjørling, M.; Brinkmann, A. Constant Time Garbage Collection in SSDs. In Proceedings of the 2021 IEEE International Conference on Networking, Architecture and Storage (NAS), Riverside, CA, USA, 24–26 October 2021; pp. 1–9. [Google Scholar] [CrossRef]
- Lin, M.; Yao, Z. Dynamic garbage collection scheme based on past update times for NAND flash-based consumer electronics. IEEE Trans. Consum. Electron. 2015, 61, 478–483. [Google Scholar] [CrossRef]
- Pan, Y.; Lin, M.; Wu, Z.; Zhang, H.; Xu, Z. Caching-Aware Garbage Collection to Improve Performance and Lifetime for NAND Flash SSDs. IEEE Trans. Consum. Electron. 2021, 67, 141–148. [Google Scholar] [CrossRef]
- Lee, C.; Sim, D.; Hwang, J.Y.; Cho, S. F2FS: A New File System for Flash Storage. In Proceedings of the 13th USENIX Conference on File and Storage Technologies (FAST ’15), Santa Clara, CA, USA, 16–19 February 2015; USENIX Association: Berkeley, CA, USA, 2015; pp. 273–286. [Google Scholar]
- Lee, J.; Kim, Y.; Shipman, G.M.; Oral, S.; Wang, F.; Kim, J. A Semi-Preemptive Garbage Collector for Solid State Drives. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS ’11), Austin, TX, USA, 10–12 April 2011; pp. 12–21. [Google Scholar] [CrossRef]
- Lee, J.; Kim, Y.; Shipman, G.M.; Oral, S.; Kim, J. Preemptible I/O Scheduling of Garbage Collection for Solid State Drives. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2013, 32, 247–260. [Google Scholar] [CrossRef]
- Kim, Y.; Lee, J.; Oral, S.; Dillow, D.A.; Wang, F.; Shipman, G.M. Coordinating Garbage Collection for Arrays of Solid-State Drives. IEEE Trans. Comput. 2014, 63, 888–901. [Google Scholar] [CrossRef]
Hardware | Specification
---|---
FPGA | Xilinx Zynq-7000
CPU | Dual-Core ARM Cortex-A9
DRAM | DDR3 1 GB
Storage Capacity | 256 GB
Host Interface | PCIe Gen2 8-lane
Zone Size |  |
---|---|---
512 MB | 1 | 479
1 GB | 1 | 239
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jung, S.; Lee, S.; Han, J.; Kim, Y. Preemptive Zone Reset Design within Zoned Namespace SSD Firmware. Electronics 2023, 12, 798. https://doi.org/10.3390/electronics12040798