
CN113823346A - Region-aware memory management in a memory subsystem - Google Patents


Info

Publication number
CN113823346A
Authority
CN
China
Prior art keywords
region
physical memory
memory
blocks
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110676084.0A
Other languages
Chinese (zh)
Other versions
CN113823346B (en)
Inventor
A·巴德瓦杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Micron Technology Inc
Original Assignee
Micron Technology Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Micron Technology Inc filed Critical Micron Technology Inc
Publication of CN113823346A
Application granted
Publication of CN113823346B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06F ELECTRIC DIGITAL DATA PROCESSING
          • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
            • G06F12/02 Addressing or allocation; Relocation
              • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
                • G06F12/10 Address translation
              • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
                • G06F12/023 Free address space management
                  • G06F12/0238 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
                    • G06F12/0246 Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory in block erasable memory, e.g. flash memory
                  • G06F12/0253 Garbage collection, i.e. reclamation of unreferenced memory
          • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
            • G06F2212/10 Providing a specific technical effect
              • G06F2212/1041 Resource optimization
                • G06F2212/1044 Space efficiency improvement
            • G06F2212/65 Details of virtual memory and virtual address translation
              • G06F2212/657 Virtual address space management
            • G06F2212/72 Details relating to flash memory management
              • G06F2212/7211 Wear leveling
      • G11 INFORMATION STORAGE
        • G11C STATIC STORES
          • G11C16/00 Erasable programmable read-only memories
            • G11C16/02 Erasable programmable read-only memories electrically programmable
              • G11C16/06 Auxiliary circuits, e.g. for writing into memory
                • G11C16/08 Address circuits; Decoders; Word-line control circuits
                • G11C16/10 Programming or data input circuits
                • G11C16/14 Circuits for erasing electrically, e.g. erase voltage switching circuits
                  • G11C16/16 Circuits for erasing electrically, e.g. erase voltage switching circuits for erasing blocks, e.g. arrays, words, groups

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

The application relates to region-aware memory management in a memory subsystem. A system is disclosed, comprising: a memory device having a plurality of physical memory blocks and associated with a logical address space comprising a plurality of regions, wherein each region comprises a plurality of logical block addresses (LBAs); and a processing device operatively coupled with the memory device to perform the following operations: receiving a request to store data referenced by an LBA associated with a first region of the plurality of regions; obtaining a version identifier of the first region; obtaining erase values for a plurality of available physical memory blocks of the memory device; selecting a first physical memory block of the plurality of available physical memory blocks in view of the version identifier of the first region and the erase values; mapping a next available LBA within the first region to the first physical memory block; and storing the data in the first physical memory block.

Description

Region-aware memory management in a memory subsystem
Technical Field
Embodiments of the present disclosure relate generally to memory subsystems and, more particularly, to region aware memory management in memory subsystems.
Background
The memory subsystem may include one or more memory devices that store data. The memory devices may be, for example, non-volatile memory devices and volatile memory devices. In general, a host system may utilize a memory subsystem to store data at the memory devices and to retrieve data from the memory devices.
Disclosure of Invention
One aspect of the present application relates to a system, comprising: a memory device comprising a plurality of physical memory blocks and associated with a logical address space comprising a plurality of regions, wherein each region comprises a plurality of Logical Block Addresses (LBAs); and a processing device operatively coupled with the memory device to perform operations comprising: receiving a request to store data referenced by an LBA associated with a first region of the plurality of regions; obtaining a version identifier of the first region; obtaining erase values for a plurality of available physical memory blocks of the memory device, wherein each erase value represents a number of completed erase operations for a respective one of the plurality of memory blocks; selecting a first physical memory block of the plurality of available physical memory blocks in view of the version identifier of the first region and the erase values; mapping a next available LBA associated with the first region to the first physical memory block; and storing the data in the first physical memory block.
Another aspect of the application relates to a system comprising: a memory device comprising a plurality of physical memory blocks and associated with a logical address space comprising a plurality of regions, wherein each region comprises a plurality of Logical Block Addresses (LBAs), and a processing device operatively coupled with the memory device to perform operations comprising: initiating a scan of the memory device; identifying a first region of the plurality of regions as a region having a lowest version identifier, wherein the version identifier of a region reflects a number of invalidations to the region; performing error correction analysis to detect risky physical memory blocks mapped to consecutive LBAs of the first region, starting from a first LBA of the first region; and determining whether to stop the scan of the memory device at the first region in view of the detected risky physical memory blocks mapped to LBAs of the first region.
Yet another aspect of the present application relates to a method, comprising: receiving, by a processing device operatively coupled with a memory device, a request to store data referenced by a Logical Block Address (LBA) associated with a first region of a plurality of regions, wherein each region includes a plurality of LBAs of a logical address space of the memory device, the memory device having a plurality of physical memory blocks; obtaining, by the processing device, a version identifier of the first region; obtaining, by the processing device, erase values for a plurality of available physical memory blocks of the memory device, wherein each erase value represents a number of completed erase operations for a respective one of the plurality of available memory blocks; selecting, by the processing device, a first physical memory block of the plurality of available physical memory blocks in view of the version identifier of the first region and the erase values; mapping, by the processing device, a next available LBA associated with the first region to the first physical memory block; and storing, by the processing device, the data in the first physical memory block.
Drawings
The present disclosure will be understood more fully from the detailed description provided below and from the accompanying drawings of various embodiments of the disclosure.
FIG. 1 illustrates an example computing system including a memory subsystem in accordance with some embodiments of the present disclosure.
FIGS. 2A-B schematically illustrate high-level descriptions of region-aware memory management in a memory subsystem.
FIG. 3 schematically illustrates region-block matching in region-aware memory management, according to some embodiments of the present disclosure.
FIG. 4 schematically illustrates distribution-based region-block matching in region-aware memory management, according to some embodiments of the present disclosure.
FIG. 5 illustrates a flow diagram of an example method of region-aware memory management of a memory device, according to some embodiments of the present disclosure.
FIG. 6A illustrates a flow diagram of an example method of performing a region-aware background scan in a memory device, according to some embodiments of the present disclosure.
FIG. 6B is a flow diagram of one possible embodiment of the method shown in FIG. 6A, according to some embodiments of the present disclosure.
FIG. 6C schematically illustrates an example region-aware background scan performed according to some embodiments of the present disclosure.
FIG. 7 is a block diagram of an example computer system in which embodiments of the present disclosure may operate.
Detailed Description
Aspects of the present disclosure relate to region-aware memory management in a memory subsystem using version numbers of memory regions. The memory subsystem may be a memory device, a memory module, or a mixture of memory devices and memory modules. Examples of memory devices and memory modules are described below in conjunction with FIG. 1. In general, a host system may utilize a memory subsystem that includes one or more memory components (e.g., memory devices) that store data. The host system may provide data to be stored at the memory subsystem and may request data to be retrieved from the memory subsystem.
The memory device may be a non-volatile memory device that can store data from the host system. One example of a non-volatile memory device is a negative-and (NAND) memory device. Other examples of non-volatile memory devices are described below in connection with FIG. 1. Each of the memory devices may include one or more arrays of memory cells. A memory cell ("cell") is an electronic circuit that stores information. Depending on the cell type, a cell may store one or more bits of binary information and have various logic states related to the number of bits stored. A logic state may be represented by binary values (e.g., "0" and "1") or a combination of such values.
Various access operations may be performed on the memory cells. Data can be written to, read from, and erased from the memory cells. The memory cells may be grouped into write units, such as pages. For some types of memory devices, a page is the smallest unit of write. A page is a group of cells that span the same word line. The page size represents a particular number of cells in the page. For some types of memory devices (e.g., NAND), the memory cells may be grouped into erase units, such as physical blocks, which are a set of pages. A physical block is a 2-dimensional memory array of pages (rows of cells) and strings (columns of cells). Data may be written to the block page by page. Data may be erased at the block level. However, portions of the block cannot be erased.
The pages in a block may contain valid data, invalid data, or no data. Invalid data is data that is marked as obsolete because a new version of the data is stored on the memory device. Invalid data contains data previously written but no longer associated with a valid logical address (e.g., a logical address referenced by the host system in the physical-to-logical (P2L) mapping table). Valid data is the most recent version of such data stored on the memory device. The memory subsystem may mark the data as invalid based on information received from, for example, an operating system. Pages that do not contain data include pages that have been previously erased and have not been written to.
The memory subsystem controller may perform operations for media management algorithms, such as wear leveling, refresh, garbage collection, scrubbing, etc., to help manage data on the memory subsystem. A block may have some pages that contain valid data and some pages that contain invalid data. To avoid waiting for all pages in a block to contain invalid data before erasing and reusing the block, an algorithm, hereinafter referred to as "garbage collection," may be invoked, enabling the block to be erased and freed up as a free block for a subsequent write operation. Garbage collection is a set of media management operations including, for example, selecting a block containing valid and invalid data, selecting pages in the block containing valid data, copying the valid data to a new location (e.g., free pages in another block), marking the data in the previously selected pages as invalid, and erasing the selected block.
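The sequence above can be illustrated with a short sketch. The following Python model is not part of the patent; the Page/Block structures, the four-page block size, and the selection heuristic are simplifying assumptions used only to show the copy-invalidate-erase flow.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Page:
    data: Optional[bytes] = None   # None: the page has never been written (or was erased)
    valid: bool = False            # True only for the most recent copy of the data

@dataclass
class Block:
    pages: List[Page] = field(default_factory=lambda: [Page() for _ in range(4)])
    erase_count: int = 0

    def erase(self) -> None:
        self.pages = [Page() for _ in range(len(self.pages))]
        self.erase_count += 1

def garbage_collect(blocks: List[Block]) -> None:
    """Fold valid pages out of blocks that mix valid and invalid data, then erase them."""
    # 1. Select source blocks containing both valid and invalid pages.
    sources = [b for b in blocks
               if any(p.valid for p in b.pages)
               and any(p.data is not None and not p.valid for p in b.pages)]
    for src in sources:
        # 2. Copy each valid page to a free page in another block.
        for page in src.pages:
            if not page.valid:
                continue
            dst = next((p for b in blocks if b is not src
                        for p in b.pages if p.data is None), None)
            if dst is None:
                return                 # out of free pages; a real FTL keeps spare blocks in reserve
            dst.data, dst.valid = page.data, True
            page.valid = False         # 3. Mark the old copy invalid.
        # 4. Erase the selected block, freeing it for subsequent writes.
        src.erase()
```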
Hereinafter, "garbage collection" refers to selecting a block, rewriting valid data from the selected block to another block, and erasing all invalid data and valid data stored at the selected block. Valid data from multiple selected blocks may be copied to a lesser number of other blocks, and the selected blocks may then be erased. Thus, the number of blocks that have been erased may be increased so that more blocks are available for storing subsequent data from the host system. However, effective memory management may be complicated by the following intrinsic memory subsystem design constraints: typical transistor-based memory cells wear out after a certain number of erase operations (from thousands in a tri-level cell to tens of thousands or more in a single level cell) when the ability of the floating gate to sustain charge is significantly reduced.
Some memory management schemes may deploy a partitioned namespace (ZNS), whose partitions are referred to herein as "regions." A region is a partition of the logical address space of a memory device configured in such a way that the Logical Block Addresses (LBAs) of each region are used sequentially in write operations, starting from the low-end LBA. The LBA range belonging to a particular region may be dedicated to a particular host application, and each application may only access the regions dedicated to it. Because the LBAs of each region are written sequentially, rewriting data already stored in a region requires invalidating the entire region (by, for example, moving the write pointer to the starting LBA of the region). The use of partitioned namespaces allows for more efficient (faster) read, write, and erase operations, and reduces the need to over-provision memory blocks to various applications (by reducing the number of blocks that would otherwise store invalid data and that must be taken into account when allocating memory to an application).
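As a rough illustration of the sequential-write constraint described above, the sketch below models a single region with a write pointer; the class, its fields, and the version-number counter (a concept used later in this description) are hypothetical and only meant to show why an in-place rewrite forces invalidation of the whole region.

```python
class Region:
    """Minimal model of a ZNS region: LBAs are consumed strictly in order via a
    write pointer, and rewriting any stored data requires resetting the whole region."""

    def __init__(self, start_lba: int, num_lbas: int):
        self.start_lba = start_lba
        self.num_lbas = num_lbas
        self.write_pointer = start_lba   # next LBA to be written
        self.version_number = 0          # number of times the region has been invalidated

    def next_lba(self) -> int:
        """Return the LBA to use for the next sequential write."""
        if self.write_pointer >= self.start_lba + self.num_lbas:
            raise RuntimeError("region full: reset (invalidate) it before writing again")
        lba = self.write_pointer
        self.write_pointer += 1
        return lba

    def reset(self) -> None:
        """Invalidate the entire region: move the write pointer back to the starting LBA."""
        self.write_pointer = self.start_lba
        self.version_number += 1
```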
However, the ZNS architecture does not completely eliminate the need for efficient memory management. For example, different applications may store (and update) data at different rates. Thus, some regions may be overwritten only a few times, while other regions are overwritten much more frequently. If the LBA-to-physical-memory-block mapping remains static, significant differences in erase counts between the physical memory blocks used by different applications may result. For example, regions used to store relatively static databases may be rarely overwritten, while regions used to store temporary computing data or short-term caches may be overwritten at a much higher rate.
Aspects of the present disclosure address the above and other deficiencies by describing region-aware memory management in a memory subsystem. In some embodiments, a memory controller of the memory subsystem may track the number of times a region is overwritten, referred to as the region's Version Number (VN), and track the number of times each physical block is erased. In some embodiments, when an application (e.g., running on a host system) requests that data be stored in association with a particular region, the application may specify an identifier of the region, which may be the starting (lowest) LBA of the region or any other pointer to the specified region. In response to receiving the identifier, the memory controller may determine the VN of the region and further obtain the Erase Counts (ECs) of the available physical memory blocks, which may be identified by their respective Physical Block Addresses (PBAs). The memory controller may select a block based on the region's VN such that "cold" regions with a low VN (indicating a low probability of the region being invalidated in the future) are associated with high-EC "hot" physical memory blocks (i.e., blocks with fewer erase operations left in their life cycle than the average block of the memory subsystem). Conversely, the memory controller may associate "cold" physical memory blocks (blocks with a low EC) with "hot" regions having a high VN (indicating a high probability that the region will often be invalidated in the future). After selecting the appropriate physical block, the memory controller may match the next available LBA of the region with the PBA of the selected block, write the data to the selected block, and store the LBA-PBA association (e.g., in a memory lookup table) for subsequent read and erase operations that may be directed to the LBA by the application.
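A minimal sketch of the matching step described above follows. It assumes the controller already knows every region's VN and every available block's EC; the rank-mirroring heuristic is one possible way to implement the "cold region gets hot block" pairing, not the patent's exact algorithm.

```python
from typing import Dict, List

def select_block_for_region(region_vn: int,
                            all_region_vns: List[int],
                            available_blocks: Dict[int, int]) -> int:
    """Return the PBA to use for a write into a region with version number `region_vn`.

    `available_blocks` maps PBA -> erase count (EC). Rarely invalidated ("cold")
    regions are paired with high-EC ("hot") blocks and vice versa, so wear evens out.
    """
    # Rank the region's VN among all regions: 0.0 is the coldest, 1.0 the hottest region.
    rank = sorted(all_region_vns).index(region_vn) / max(len(all_region_vns) - 1, 1)
    # Order candidate blocks from highest to lowest EC and pick the one whose position
    # mirrors the region's rank (coldest region -> highest-EC block).
    by_ec_desc = sorted(available_blocks, key=available_blocks.get, reverse=True)
    return by_ec_desc[round(rank * (len(by_ec_desc) - 1))]
```

After the block is selected, the controller would record the resulting LBA-to-PBA association in the lookup table for later reads and erases, as the paragraph above notes.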
In some embodiments, a memory controller (or local media controller) may perform (e.g., at idle time or periodically) a background scan of the physical memory blocks of the memory subsystem to determine whether data in some blocks is at risk of corruption. For example, data that has been stored in a particular block for a long time may be compromised by physical (electrostatic) interference from other (e.g., physically adjacent) blocks that are subject to various read, write, and/or erase operations. The background scan may perform Error Correction Code (ECC) analysis of at least some blocks and schedule blocks whose Bit Error Rate (BER) exceeds a certain threshold for folding, which refers to refreshing the data stored therein by moving the data to other blocks. In region-aware memory management, the background scan may only consume a portion of the time required to examine all blocks of the memory subsystem. More specifically, because a low VN indicates a long storage time (without erasure) for data stored in association with a region (and the LBAs of the region), the background scan may begin with the lowest-VN regions, since the physical memory blocks of such regions (overwritten the least number of times) are the most likely to be at risk of data loss.
Furthermore, because data is stored in (i.e., in association with) the low-VN regions using high-EC blocks, as described above, such high-EC blocks may be closer to their resource limits and therefore at a higher risk of corruption. Thus, once the ECC operation determines that the blocks of a particular region (e.g., the region with the nth lowest VN) are within a low-risk BER range, the background scan may stop. In addition, because within each region data is stored in (i.e., in association with) sequential LBAs, it is sufficient to scan only the physical memory blocks associated with the lower LBAs of each region, since these blocks have stored data for the longest time and therefore have the highest risk of corruption. Once the ECC operation determines that a particular number of (e.g., consecutive) physical blocks of a region are within the low-risk BER range, the background scan of that particular region may stop (and the next region may be scanned, if necessary). Thus, only a small portion of the physical blocks associated with each region and a limited number of regions may need to be scanned.
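The scan order and the two stop conditions can be sketched as follows; the thresholds, the consecutive-clean-block counter, and the clean-region counter are illustrative assumptions, since the paragraphs above only state that scanning may stop once enough blocks or regions fall within the low-risk BER range.

```python
from typing import Callable, List, Set, Tuple

def region_aware_scan(regions: List[Tuple[int, List[int]]],
                      bit_error_rate: Callable[[int], float],
                      risk_ber: float,
                      clean_blocks_to_stop_region: int,
                      clean_regions_to_stop_scan: int) -> Set[int]:
    """Scan regions in ascending VN order and, within each region, PBAs in LBA order.

    `regions` holds (version_number, [pba, ...]) with PBAs listed from the lowest LBA up.
    `bit_error_rate(pba)` stands in for an ECC read of the block. Returns PBAs to fold.
    """
    at_risk: Set[int] = set()
    clean_regions = 0
    for vn, pbas in sorted(regions, key=lambda r: r[0]):       # coldest regions first
        consecutive_clean = 0
        region_had_risk = False
        for pba in pbas:                                       # oldest data (lowest LBA) first
            if bit_error_rate(pba) > risk_ber:
                at_risk.add(pba)
                region_had_risk = True
                consecutive_clean = 0
            else:
                consecutive_clean += 1
                if consecutive_clean >= clean_blocks_to_stop_region:
                    break                                      # newer data in this region is lower risk
        clean_regions = 0 if region_had_risk else clean_regions + 1
        if clean_regions >= clean_regions_to_stop_scan:
            break                                              # hotter regions hold even newer data
    return at_risk
```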
FIG. 1 illustrates an example computing system 100 that includes a memory subsystem 110 according to some embodiments of the present disclosure. Memory subsystem 110 may include media, such as one or more volatile memory devices (e.g., memory device 140), one or more non-volatile memory devices (e.g., memory device 130), or a combination of such devices.
Memory subsystem 110 may be a memory device, a memory module, or a mixture of memory devices and memory modules. Examples of storage devices include Solid State Drives (SSDs), flash drives, Universal Serial Bus (USB) flash drives, embedded multimedia controller (eMMC) drives, Universal Flash Storage (UFS) drives, Secure Digital (SD) cards, and Hard Disk Drives (HDDs). Examples of memory modules include dual in-line memory modules (DIMMs), small outline DIMMs (SO-DIMMs), and various types of non-volatile dual in-line memory modules (NVDIMMs).
The computing system 100 may be a computing device, such as a desktop computer, a laptop computer, a network server, a mobile device, a vehicle (e.g., an aircraft, drone, train, automobile, or other vehicle), an internet of things (IoT) -enabled device, an embedded computer (e.g., an embedded computer included in a vehicle, industrial equipment, or networked business device), or such computing device including memory and processing devices.
The computing system 100 may include a host system 120 coupled to one or more memory subsystems 110. In some embodiments, host system 120 is coupled to different types of memory subsystems 110. FIG. 1 shows one example of a host system 120 coupled to one memory subsystem 110. As used herein, "coupled to" generally refers to a connection between components that may be an indirect communication connection or a direct communication connection (e.g., without intervening components), whether wired or wireless, including electrical, optical, magnetic, etc., connection.
The host system 120 may include a processor chipset and a software stack executed by the processor chipset. The processor chipset may include one or more cores, one or more caches, a memory controller (e.g., NVDIMM controller), and a storage protocol controller (e.g., PCIe controller, SATA controller). The memory subsystem 110 is used by the host system 120, for example, to write data to the memory subsystem 110 and to read data from the memory subsystem 110.
The host system 120 may be coupled to the memory subsystem 110 via a physical host interface. Examples of physical host interfaces include, but are not limited to, a Serial Advanced Technology Attachment (SATA) interface, a Peripheral Component Interconnect Express (PCIe) interface, a Universal Serial Bus (USB) interface, Fibre Channel, Serial Attached SCSI (SAS), a Double Data Rate (DDR) memory bus, a Small Computer System Interface (SCSI), a dual in-line memory module (DIMM) interface (e.g., a DIMM socket interface supporting Double Data Rate (DDR)), and so forth. The physical host interface may be used to transfer data between the host system 120 and the memory subsystem 110. The host system 120 may further utilize an NVM Express (NVMe) interface to access components (e.g., the memory device 130) when the memory subsystem 110 is coupled with the host system 120 over a PCIe interface. The physical host interface may provide an interface for passing control, address, data, and other signals between the memory subsystem 110 and the host system 120. By way of example, FIG. 1 illustrates a memory subsystem 110. In general, host system 120 may access multiple memory subsystems via the same communication connection, multiple separate communication connections, and/or a combination of communication connections.
Memory devices 130, 140 may include different types of non-volatile memory devices and/or any combination of volatile memory devices. Volatile memory devices, such as memory device 140, may be, but are not limited to, Random Access Memory (RAM), such as Dynamic Random Access Memory (DRAM) and Synchronous Dynamic Random Access Memory (SDRAM).
Some examples of non-volatile memory devices, such as memory device 130, include negative-and (NAND) type flash memory and write-in-place memory, such as three-dimensional cross-point ("3D cross-point") memory. A cross-point array of non-volatile memory can perform bit storage based on changes in bulk resistance, in conjunction with a stackable cross-gridded data access array. In addition, in contrast to many flash-based memories, cross-point non-volatile memory can perform write-in-place operations, in which a non-volatile memory cell can be programmed without the cell having been previously erased. NAND-type flash memory includes, for example, two-dimensional NAND (2D NAND) and three-dimensional NAND (3D NAND).
Each of memory devices 130 may include one or more arrays of memory cells. One type of memory cell, such as a Single Level Cell (SLC), can store one bit per cell. Other types of memory cells, such as multi-level cells (MLC), three-level cells (TLC), and four-level cells (QLC), may store multiple bits per cell. In some embodiments, each of the memory devices 130 may include one or more arrays such as SLC, MLC, TLC, QLC, or any combination thereof. In some embodiments, a particular memory device may include an SLC portion, as well as an MLC portion, a TLC portion, or a QLC portion of a memory cell. The memory cells of memory device 130 may be grouped into pages, which may refer to logical units of the memory device for storing data. In some types of memory (e.g., NAND), pages may be grouped to form blocks.
Although non-volatile memory components are described, such as 3D cross-point arrays of non-volatile memory cells and NAND-type flash memory (e.g., 2D NAND, 3D NAND), memory device 130 may be based on any other type of non-volatile memory, such as Read-Only Memory (ROM), Phase Change Memory (PCM), self-selecting memory, other chalcogenide-based memory, ferroelectric transistor random access memory (FeTRAM), ferroelectric random access memory (FeRAM), Magnetic Random Access Memory (MRAM), Spin Transfer Torque (STT) MRAM, Conductive Bridging RAM (CBRAM), Resistive Random Access Memory (RRAM), Oxide-based RRAM (OxRAM), negative-or (NOR) flash memory, and Electrically Erasable Programmable Read-Only Memory (EEPROM).
Memory subsystem controller 115 (or controller 115 for simplicity) may communicate with memory device 130 to perform operations, such as reading data, writing data, or erasing data at memory device 130, and other such operations. Memory subsystem controller 115 may include hardware such as one or more integrated circuits and/or discrete components, buffer memory, or a combination thereof. The hardware may comprise digital circuitry with dedicated (i.e., hard-coded) logic to perform the operations described herein. Memory subsystem controller 115 may be a microcontroller, special purpose logic circuitry (e.g., a Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), etc.), or other suitable processor.
Memory subsystem controller 115 may include a processor 117 (e.g., a processing device) configured to execute instructions stored in local memory 119. In the example shown, the local memory 119 of the memory subsystem controller 115 includes embedded memory configured to store instructions for executing various processes, operations, logic flows, and routines that control the operation of the memory subsystem 110, including handling communications between the memory subsystem 110 and the host system 120.
In some embodiments, local memory 119 may include memory registers that store memory pointers, fetched data, and so forth. Local memory 119 may also include Read Only Memory (ROM) for storing microcode. Although the example memory subsystem 110 in fig. 1 is shown to contain memory subsystem controller 115, in another embodiment of the present disclosure, memory subsystem 110 does not contain memory subsystem controller 115, but may rely on external control (e.g., provided by an external host or by a processor or controller separate from the memory subsystem).
In general, memory subsystem controller 115 may receive commands or operations from host system 120 and may convert the commands or operations into instructions or appropriate commands to achieve the desired access to memory device 130. The memory subsystem controller 115 may be responsible for other operations, such as wear leveling operations, garbage collection operations, error detection and Error Correction Code (ECC) operations, encryption operations, cache operations, and address translation between logical addresses (e.g., logical block addresses LBA, partition namespaces) and physical addresses (e.g., physical block addresses PBA) associated with the memory device 130. Memory subsystem controller 115 may further include host interface circuitry for communicating with host system 120 via a physical host interface. Host interface circuitry may convert commands received from the host system into command instructions to access memory device 130 and convert responses associated with memory device 130 into information for host system 120.
Memory subsystem 110 may also include additional circuitry or components not shown. In some embodiments, memory subsystem 110 may include a cache or buffer (e.g., DRAM) and address circuitry (e.g., row decoder and column decoder) that may receive an address from memory subsystem controller 115 and decode the address to access memory device 130.
In some embodiments, memory device 130 includes a local media controller 135 that operates in conjunction with memory subsystem controller 115 to perform operations on one or more memory cells of memory device 130. An external controller (e.g., memory subsystem controller 115) may manage memory device 130 externally (e.g., perform media management operations on memory device 130). In some embodiments, memory device 130 is a managed memory device, which is a raw memory device combined with a local controller (e.g., local controller 135) for media management within the same memory device package. An example of a managed memory device is a managed NAND (MNAND) device.
The memory subsystem 110 includes a region-aware memory management component (ZMC) 113 that can access the Version Number (VN) of a particular region in response to receiving a request to store data in the region, obtain the ECs of at least a subset of the available physical memory blocks (e.g., blocks that currently store no data or store data that has been invalidated), and select the PBA of the best block to store the data. The best block to store the data may be a block whose EC is negatively correlated with the region's VN, as described in more detail below with reference to FIG. 3, thereby ensuring uniform wear across memory devices 130 and/or 140. The ZMC 113 may further perform one or more region-aware background scans of the memory devices 130, 140 by scanning the regions in order of their version numbers and by scanning the physical blocks associated with the LBAs of each region starting from the lowest LBA (the earliest-written PBA). In some embodiments, ZMC 113 may stop scanning blocks within a given region once a first condition is met. For example, ZMC 113 may determine that a predetermined number of blocks (e.g., consecutive blocks) within the scanned region have a BER below a "risk" threshold. In some embodiments, the ZMC 113 may similarly stop scanning regions when a second condition is met. For example, ZMC 113 may determine that the number of risky blocks in one or more regions does not exceed a second predetermined number of blocks. However, different first and second conditions may be used in various other embodiments, as explained in more detail with reference to FIG. 6. During operation of the memory subsystem 110, the VNs for the various regions of the memory subsystem 110 may be stored in one of the memory devices 130 and/or 140 or in a separate data structure accessible to the memory controller 115 (or the local media controller 135). The memory device or separate data structure may be a volatile memory structure (e.g., DRAM). In some embodiments, when the memory subsystem 110 is powered on and the memory controller 115 (and/or the local media controller 135) initializes, the VNs may be fetched from non-volatile memory (e.g., NAND memory) into the volatile memory structure. When a region is invalidated, the VN for that region may be updated in the volatile memory structure and copied to non-volatile memory to preserve the VN in the event of an unexpected power loss.
In some embodiments, ZMC 113 may perform a background scan during idle time of the memory subsystem, such as when the number of write, erase, and/or read requests per unit time falls below some set threshold. In some embodiments, the ZMC 113 may perform a background scan in response to a certain number (e.g., percentage) of recent read or write operations failing. In some embodiments, the ZMC 113 may perform background scans at fixed time intervals from the start (or end) time of an earlier scan. In some embodiments, the ZMC 113 may perform background scans at specific times (e.g., hourly, daily, or at some other interval), as configured by the memory subsystem 110 and/or the host system 120 or by a human operator (e.g., a production engineer or system administrator). In some embodiments, the monitoring schedule for the background scan may be stored in local memory 119 (e.g., embedded memory).
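One way to combine these triggers is sketched below; the specific thresholds and the OR-combination are assumptions, since the paragraph above lists the triggers (idle time, recent failures, fixed schedule) without prescribing how they are combined.

```python
import time

def should_start_background_scan(requests_per_second: float,
                                 recent_failure_fraction: float,
                                 last_scan_end: float,
                                 idle_threshold: float = 10.0,
                                 failure_threshold: float = 0.01,
                                 scan_interval_s: float = 24 * 3600.0) -> bool:
    """Return True if any of the scheduling triggers described above has fired."""
    idle = requests_per_second < idle_threshold                  # idle-time trigger
    failing = recent_failure_fraction > failure_threshold        # recent read/write failures
    overdue = (time.time() - last_scan_end) > scan_interval_s    # fixed-interval trigger
    return idle or failing or overdue
```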
In some embodiments, memory subsystem controller 115 comprises at least a portion of ZMC 113. For example, memory subsystem controller 115 may include a processor 117 (processing device) configured to execute instructions stored in local memory 119 to perform the operations described herein. In some embodiments, the ZMC 113 is part of the host system 120, an application, or an operating system. In some embodiments, ZMC 113 may have configuration data, libraries, and other information stored in memory device 130 (and/or memory device 140).
FIGS. 2A-B schematically illustrate high-level descriptions of region-aware memory management in a memory subsystem. FIG. 2A depicts one embodiment 200 of memory management. FIG. 2B depicts another embodiment 202 of memory management. As shown in FIG. 2A, the host system 120 (e.g., its operating system or any application executing on the host system 120) may transmit read, write, or erase requests to the memory subsystem 110. A request may include an LBA 210 within the logical space of the memory subsystem 110, or within the portion of the logical space accessible to the initiator of the request. The LBA 210 may be used by the memory subsystem controller 115 to identify a physical memory partition (block, page, plane, etc.) associated with the received memory request. For example, the memory subsystem controller 115 may access the memory lookup table 212 to determine the PBA 216 corresponding to the LBA 210. The controller 115 may then execute the received memory request using the corresponding block identified by the determined PBA 216. For example, the controller 115 may read data stored in the corresponding block (and provide the data to the processing device of the host system 120) or write data into the corresponding block. In some cases, when data in a block is to be overwritten, or if no block is currently allocated to the LBA 210, controller 115 may first execute a dynamic Garbage Collection (GC) routine (214). For example, the controller 115 may identify an appropriate block (via its PBA 216), which may be the block with the lowest erase count or, more generally, a block that has been erased less often than the average block in the memory device. Upon such identification, the controller 115 may write the new data into the identified block. The controller 115 may also update the memory lookup table 212 with the new memory translation LBA → PBA (as indicated by the dashed line connecting blocks 214 and 212). In the case where the memory request is an overwrite request, the controller 115 may also invalidate the data already stored in the old block (previously identified by LBA 210) and update the memory lookup table 212 with the new memory translation LBA → PBA.
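The write path of FIG. 2A, as described above, can be summarized in a short sketch. The `LookupTable` class and the `program`/`invalidate` callbacks are hypothetical stand-ins for the memory lookup table 212 and the underlying device operations; the least-erased-block choice follows the example in the paragraph.

```python
from typing import Callable, Dict, Optional

class LookupTable:
    """Hypothetical in-memory LBA -> PBA map standing in for memory lookup table 212."""

    def __init__(self) -> None:
        self.lba_to_pba: Dict[int, int] = {}

    def translate(self, lba: int) -> Optional[int]:
        return self.lba_to_pba.get(lba)

    def update(self, lba: int, pba: int) -> None:
        self.lba_to_pba[lba] = pba

def handle_write(lookup: LookupTable, lba: int, data: bytes,
                 free_blocks: Dict[int, int],
                 program: Callable[[int, bytes], None],
                 invalidate: Callable[[int], None]) -> None:
    """Pick the least-erased free block, program it, invalidate the old copy, remap the LBA."""
    old_pba = lookup.translate(lba)
    new_pba = min(free_blocks, key=free_blocks.get)   # free block with the lowest erase count
    program(new_pba, data)
    if old_pba is not None:
        invalidate(old_pba)                           # the old copy becomes garbage for later GC
    lookup.update(lba, new_pba)
    del free_blocks[new_pba]                          # the block is no longer free
```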
After performing the memory request, for example during downtime or a scheduled GC time, the controller 115 may perform the static GC 218 and/or the background scan 220. During the static GC 218, "cold" data that is overwritten only a small number of times (e.g., less often than the average block of the memory device) may be identified and moved to a block that has an above-average Erase Count (EC) and is therefore closer to the end of its resource limit than other blocks. During the background scan 220, memory subsystem controller 115 may identify blocks at a higher risk of having too many corrupted bits, such that an Error Correction Code (ECC) may eventually fail to repair the data stored thereon. Such blocks may be folded (copied to a new location, with the old location erased or invalidated, i.e., scheduled for erasure) to other blocks, such as blocks with a lower EC. Upon completion of the static GC 218 and/or the background scan 220 (or, in some embodiments, during their execution), the controller 115 may update the memory lookup table 212 with the new memory translations LBA → PBA (as indicated by the corresponding dashed lines). In some embodiments, the dynamic GC 214 may include region-block matching 232, and the background scan 220 may include a region-aware background scan 236, as described in more detail below with reference to FIG. 2B.
As shown in FIG. 2B, host system 120 may generate a region write request 230 and communicate this request to memory subsystem 110, according to some embodiments of the present disclosure. Because write requests in the partitioned namespace architecture are performed using the sequential LBAs assigned to the respective regions, request 230 can identify the region but need not specify an LBA. Controller 115 may perform region-block matching 232, as described in more detail below with reference to FIG. 3. During region-block matching 232, the controller may identify the physical memory block to be used to store the data referenced in write request 230 so as to obtain the best performance of the memory device (e.g., maximize its lifetime and long-term capability). For example, the controller 115 may match a region with a low Version Number (VN) with a block with a high EC, in anticipation that a region that was rarely invalidated in the past may hold data for a long period of time in the future without the data being overwritten. Conversely, the controller 115 may match regions with a high VN (expected to be invalidated often in the future) with blocks with a low EC. The memory subsystem controller 115 may then update the memory lookup table 212 with an LBA → PBA translation, where the LBA is the next available logical address of the region and the PBA (234) is the physical address of the block identified during region-block matching 232 and used to perform the write request.
During downtime or a scheduled GC time, the controller 115 may perform the region-aware background scan 236. In particular, memory subsystem controller 115 may identify blocks that have a higher risk of data loss (e.g., blocks that have too many corrupted bits to be repaired via an ECC operation). Such blocks may be folded to other blocks, such as blocks with a lower EC. The region-aware background scan 236 may begin with the region having the lowest VN. Such regions are allocated physical memory blocks that have not been erased for the longest time and therefore carry the highest risk of data loss. Within each region, the blocks associated with LBAs near the beginning of the region are the blocks with the highest risk of data loss (because those blocks have gone the longest without being overwritten/erased among the blocks allocated to the region). Thus, the controller 115 may initiate block testing from the starting LBA of the region and stop the background scan of a particular region after a certain (e.g., predetermined) number of blocks successfully pass the block test criteria. Likewise, the controller 115 may test the different regions sequentially in ascending order of VN and stop the background scan of the memory device after a predetermined number of regions successfully pass the region test criteria. Blocks identified as "risky" blocks may be folded to other blocks using a procedure similar to the region-block matching 232 described above. During the region-aware background scan 236 (or after the scan is completed), the controller 115 may update the memory lookup table 212 (as indicated by the dashed line) by replacing the folded block's PBA with the new block's PBA in the stored LBA → PBA translation.
FIG. 3 schematically illustrates region-block matching 300 in region-aware memory management, according to some embodiments of the present disclosure. A schematic depiction of a set of regions 310 of the memory device, sorted in ascending order of their VN (from top to bottom), is shown on the left. Each region is denoted as "Region(VN)", where VN_1 is the currently lowest VN and VN_M is the currently highest VN among the regions of the memory device. There may be any number of regions with a given VN, and multiple regions may be associated with the same VN (e.g., as depicted, there are three regions with VN_3 and two regions with VN_M). As indicated by the gray shading (representing how full each region is), the regions 310 may be written sequentially, e.g., from lower LBAs to higher LBAs. When a region is completely filled, any subsequent write operation associated with the region requires the entire region to be invalidated. Likewise, a write or erase operation associated with an LBA belonging to the gray (written) portion of a region may be performed once the entire region has been invalidated.
A schematic depiction of a set of physical memory blocks 320 that may be used for the regions 310 is shown on the right side of FIG. 3. The blocks 320 are ordered in ascending order of their EC (from bottom to top). Each block is denoted as "Block(EC)", where EC_1 is the currently lowest EC and EC_N is the currently highest EC among the available blocks of the memory device. There may be any number of blocks with a particular EC, and multiple blocks may be associated with the same erase count (e.g., as depicted, there are four blocks with EC_5 and three blocks with EC_1).
In some embodiments, memory subsystem controller 115 performs region-block matching with the ultimate goal of ensuring that the distribution of the physical memory blocks of the memory subsystem over their erase counts is as narrow as possible. For example, region-block matching may be performed such that the difference between the maximum erase count and the minimum erase count among the blocks of the memory subsystem (or some portion of the memory subsystem) is within a target count gap (which may be set by the memory controller or the host system). In the ideal case, all memory blocks would have the same EC, ensuring uniform wear of the memory device. Thus, the region-block matching 300 may be performed such that the highest-EC blocks are provided to perform write requests associated with the regions having the lowest VN. Similarly, the lowest-EC blocks are provided to perform write requests associated with the regions having the highest VN. For example, in one embodiment, once the controller 115 determines that a region has the lowest VN (e.g., Region(VN_1)), the controller 115 may select the highest-EC block (e.g., Block(EC_N-1)) and execute the write request directed to the lowest-VN region using the selected block. Conversely, once the controller 115 determines that a region has the highest VN (e.g., Region(VN_M)), the controller 115 may select the lowest-EC block (e.g., Block(EC_1)) and execute the write request directed to the highest-VN region using the selected block.
In some embodiments, the blocks may be grouped, with each group containing blocks with similar (but not necessarily equal) ECs. For example, a group may include blocks such that the difference between the maximum erase count and the minimum erase count among the blocks in the group is within a target range (which may be set by the memory controller or the host system). For example, as depicted in FIG. 3, blocks with EC_1 and EC_2 may be included in group 1, blocks with EC_3 and EC_4 may be included in group 2, blocks with EC_N-1 and EC_N may be included in group p, and so on. The controller 115 may use groups to speed up block matching. For example, because the blocks within each group have similar ECs, the controller 115 may no longer need to determine the distribution of available blocks over ECs each time a write request is received, but may instead select a block from the corresponding group of blocks (e.g., a random block or the next block in the queue). Controller 115 may periodically (or at idle time, or both) determine the current distribution P(EC) of blocks over erase counts and reclassify the currently available blocks into groups. Similarly, the controller 115 may group the regions into classes (not depicted in FIG. 3), each class containing regions with similar (but not necessarily equal) VNs. To speed up region-block matching, the controller 115 may periodically (or at idle time, or both) determine the current distribution P(VN) of regions over version numbers and reclassify the current regions into different classes. In some embodiments, the number of region classes may be equal to the number of block groups. Once the region class and block group are determined, the controller 115 may select a block from block group j to match the selected block with a write request associated with a region belonging to region class j. The block within a given group may be selected by various algorithms. For example, in one embodiment, the block may be selected randomly from the group. In some embodiments, the blocks may be selected in order of the ECs of the blocks remaining in the group (e.g., starting with the lowest or highest EC).
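A sketch of this grouping shortcut is shown below. The equal-size bucketing and the index-mirroring between region classes and block groups are assumptions chosen to illustrate the idea; the description allows other grouping schemes (see the distribution-based examples that follow).

```python
from typing import Dict, List

def split_into_groups(value_by_id: Dict[int, int], num_groups: int) -> List[List[int]]:
    """Split ids (PBAs or region ids) into roughly equal buckets ordered by value (EC or VN)."""
    ordered = sorted(value_by_id, key=value_by_id.get)
    if not ordered:
        return [[] for _ in range(num_groups)]
    size = -(-len(ordered) // num_groups)                    # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

def candidate_blocks_for_region(region_id: int,
                                region_vns: Dict[int, int],
                                block_ecs: Dict[int, int],
                                num_groups: int = 3) -> List[int]:
    """Region class j (ascending VN) draws blocks from block group j (descending EC)."""
    region_classes = split_into_groups(region_vns, num_groups)       # class 0: lowest VNs
    block_groups = split_into_groups(block_ecs, num_groups)[::-1]    # group 0: highest ECs
    j = next(i for i, cls in enumerate(region_classes) if region_id in cls)
    return block_groups[min(j, len(block_groups) - 1)]
```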
FIG. 4 schematically illustrates distribution-based region-block matching 400 in region-aware memory management, according to some embodiments of the present disclosure. An exemplary distribution P(EC) of physical memory blocks over their erase counts is depicted in FIG. 4 (top). The distribution P(EC) is dynamic (time dependent) and gradually shifts to the right (towards higher EC) over the lifetime of the memory device. The depiction shown in FIG. 4 should be understood as a snapshot of the distribution of blocks over EC at some point during utilization of the memory device. An exemplary distribution P(VN) of regions over their version numbers is depicted in FIG. 4 (bottom). The distribution P(VN) is also dynamic and gradually shifts to the left (towards higher VN) over the lifetime of the memory device (at least until a region reset occurs, at which time the controller 115 or host system 120 resets the region allocations to the various applications supported by the memory device). The depiction shown in FIG. 4 should likewise be understood as a snapshot of the distribution of the regions over VN. The controller 115 may monitor the distributions P(EC) and P(VN) at periodic time intervals (e.g., quasi-continuously), during downtime, upon receipt of a write request, or after a predetermined number of write requests have been received since the last distribution monitoring.
The distributions P(EC) and P(VN) may be separated into groups and classes, which are indicated by reference numbers in FIG. 4. In some embodiments, the block groups may be populated to ensure that there are equal (or comparable) numbers of blocks within each block group. Likewise, the region classes may be populated to ensure that there are equal (or comparable) numbers of regions within each region class. For example, in one non-limiting embodiment, the groups and classes are based on the respective percentiles of the two distributions. More specifically, block group 1 may include blocks with an EC in the bottom 20% of all available blocks, group 2 may include blocks in the bottom 40% but not the bottom 20%, group 3 may include blocks above the bottom 40% but below the top 40%, group 4 may include blocks in the top 40% but not the top 20%, and group 5 may include blocks with an EC in the top 20% of all available blocks. In some embodiments, the "available" blocks do not include those blocks that have been erased more than a certain number of times, such as the blocks belonging to the tail of P(EC). Such blocks may be discarded (as schematically depicted in FIG. 4 with shading) and excluded from region-block matching. The region classes may be populated in the same manner, with the boundaries between classes (depicted with dashed lines in FIG. 4) set at the same percentiles.
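The percentile-based grouping in this example could look like the following sketch; the 20% cutoffs mirror the five-group example above, and discarding blocks past a maximum EC is shown as a simple filter.

```python
from typing import Dict, List

def percentile_block_groups(ec_by_pba: Dict[int, int],
                            max_ec: int,
                            cutoffs=(0.20, 0.40, 0.60, 0.80)) -> List[List[int]]:
    """Bucket available blocks by EC percentile: bottom 20%, 20-40%, 40-60%, 60-80%, top 20%."""
    # Blocks erased too many times are discarded (excluded from region-block matching).
    eligible = {pba: ec for pba, ec in ec_by_pba.items() if ec <= max_ec}
    ordered = sorted(eligible, key=eligible.get)            # ascending erase count
    n = len(ordered)
    bounds = [0] + [round(c * n) for c in cutoffs] + [n]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(len(bounds) - 1)]
```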
In some embodiments, controller 115 may populate the block groups and region classes according to different schemes. For example, the groups and classes may be populated based on the standard deviation σ of the respective distributions. For example, if there are three groups for region-block matching, group 2 may contain blocks with an EC within ±0.5σ (±0.6σ, ±0.7σ, etc.) of the mean (or median, mode, etc.) of the P(EC) distribution, group 1 may contain blocks with an EC more than 0.5σ (0.6σ, 0.7σ, etc.) below that value, and group 3 may contain blocks with an EC more than 0.5σ (0.6σ, 0.7σ, etc.) above that value. In some embodiments, the bounds of the region classes may mirror the bounds of the block groups. In particular, in the last example, class 1 may include regions with a VN more than 0.5σ (0.6σ, 0.7σ, etc.) above the mean (or median, mode, etc.) of the region distribution P(VN), while class 3 may include regions with a VN more than 0.5σ (0.6σ, 0.7σ, etc.) below it. However, in other embodiments, the boundaries of the region classes may be set differently than the boundaries of the corresponding block groups. For example, block group 2 may contain blocks within a range of ±0.5σ, while the corresponding region class 2 may contain regions within a range of ±0.8σ.
In some embodiments, populating the block groups and the region classes may be performed according to different schemes. For example, the block groups may be set using percentiles, while the region classes may be set with reference to the standard deviation (or any other parameter of the respective distribution), or vice versa. Those skilled in the art will recognize that there is an almost unlimited number of possibilities for associating intervals (ranges) of the distribution P(EC) of available blocks with intervals (ranges) of the distribution P(VN) of existing regions, achieving the goal of keeping both distributions as narrow as possible during operation of the memory device.
Fig. 5 and 6 illustrate a method 500 and a method 600, respectively. Methods 500 and 600 may be performed by processing logic that may comprise hardware (e.g., processing device, circuitry, dedicated logic, programmable logic, microcode, hardware of a device, integrated circuits, etc.), software (e.g., instructions run or executed on a processing device), or a combination thereof. Although shown in a particular order, the order of the operations may be modified unless otherwise specified. Thus, the illustrated embodiments should be understood as examples only, and the illustrated operations may be performed in a different order, some of which may be performed in parallel. In addition, in various embodiments, one or more operations may be omitted. Thus, not all of the operations of the methods 500 and 600 are required in every embodiment. Other operational flows are possible. In some embodiments, different operations may be used. It may be noted that aspects of the present disclosure may be used with any type of multi-bit memory cell.
FIG. 5 illustrates a flow diagram of an example method 500 of region-aware memory management of a memory device, according to some embodiments of the present disclosure. In one embodiment, the controller 115 of the memory subsystem (more specifically, the ZMC 113 of the controller 115) may perform the example method 500 based on instructions stored in embedded memory of the local memory 119. In some embodiments, firmware of memory subsystem 110 (or memory device 130) may perform example method 500. In some embodiments, an external processing device (e.g., a processing device of host system 120) may perform example method 500.
The method 500 may be implemented in a memory device that includes a number of physical memory blocks. The memory device may have a logical address space (e.g., a partitioned namespace) that includes multiple regions, each region having contiguous Logical Block Addresses (LBAs). The method 500 may involve receiving a request to store data (operation 510). The request may reference an LBA in association with which the data is to be stored. The LBA may belong to a first region of, for example, a memory device having multiple regions. (Although in some cases an order may be explicitly stated or implied, in other cases the terms "first," "second," etc., refer to any selected entities that need not be ordered in any way.) Each or some of the regions may be allocated to a respective application being executed by the host system. A region may be configured such that all LBAs of the region must be invalidated if any information referenced by an LBA belonging to the region is to be overwritten. At operation 520, a processing device executing the method 500 may obtain a Version Identifier (VI) of the first region. The VI of a region may depend on the number of invalidations of the region. For example, the VI of a region may be equal to the number of times the region has been invalidated, or may be a function of the number of invalidations N, e.g., VI = f(N). In some embodiments, the function may be an increasing or decreasing function of the number of invalidations of the region. For example, the first region may be the region having the lowest VI among the regions of the memory device (or among some subset of the regions of the memory device). In some embodiments, where the function f(N) is a decreasing function of the number N, the first region may be the region having the highest VI among the regions (or subset of regions) of the memory device. In some embodiments, f(N) = N, and the VI of the region is equal to the version number VN described above. In some embodiments, the version identifier may be a non-numeric representation (e.g., a letter or other symbolic marking) of the version number.
At operation 530, the processing device performing the method 500 may obtain erase values for available physical memory blocks of the memory device. The erase value may depend on the number of times the corresponding block has been erased (the number of completed erase operations). In some embodiments, the erase value may be the same as the erase count. In some embodiments, the erase value may be a numerical value, e.g., a mathematical function of the erase count. In some embodiments, the erase value may be a non-numeric representation of the erase count (e.g., a letter or other symbolic marking). The available blocks may be, for example, blocks that currently store no information, blocks that store information that has been invalidated, erased blocks, and so on. Some blocks of the memory device may be reserved for use by the memory controller 115, the host system 120 (or any application running thereon), etc., and thus may not be included among the plurality of available blocks. At operation 540, the method 500 may continue with selecting a first physical memory block in view of the version identifier of the first region and the erase values of the available blocks. In some embodiments, the first physical memory block has the highest erase value among a qualified subset of the available physical memory blocks. For example, the qualified subset may include blocks whose erase values do not exceed a threshold erase value. In one embodiment, blocks having an erase value higher than (or equal to) the threshold erase value may be discarded, no longer used to store new information, and excluded from the pool of available blocks.
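For illustration, a minimal Python sketch of this selection for the lowest-VI ("cold") region follows; the helper names, the threshold value, and the dictionary-based bookkeeping are assumptions made for the example and do not come from the disclosure:

# Hypothetical sketch of operations 530-540 for the case in which the write
# targets the region with the lowest VI: form the qualified subset of
# available blocks, then pick the qualified block with the highest erase value.

ERASE_THRESHOLD = 10_000  # assumed endurance limit, for illustration only

def qualified_blocks(erase_values: dict[int, int]) -> dict[int, int]:
    # Keep only blocks whose erase value does not reach the threshold;
    # blocks at or above the threshold are excluded from the available pool.
    return {blk: ev for blk, ev in erase_values.items() if ev < ERASE_THRESHOLD}

def select_block_for_cold_region(erase_values: dict[int, int]) -> int:
    # The least frequently rewritten region receives the most-worn qualified
    # block, since data stored there is expected to be overwritten rarely.
    qualified = qualified_blocks(erase_values)
    return max(qualified, key=qualified.get)

available = {101: 9_500, 102: 4_200, 103: 10_050, 104: 7_800}
print(select_block_for_cold_region(available))  # -> 101 (block 103 is disqualified)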
In some embodiments, to select the first physical memory block, the processing device may obtain a distribution P(VN) (first distribution) of the version identifiers of the regions, and may further obtain a distribution P(EC) (second distribution) of the erase values (or erase counts) of the available physical memory blocks (or of the qualified available blocks). The processing device may then select the first physical memory block based on the first distribution and the second distribution. For example, in one embodiment, the processing device may identify a specified number of intervals of the first distribution and the same specified number of intervals of the second distribution. The intervals may correspond to region classes and block groups, as described above with respect to Figs. 3 and 4. The intervals may be identified based on a predetermined scheme of grouping regions and blocks, such as intervals corresponding to predetermined percentiles, intervals associated with a standard deviation (or another parameter of the distribution), and so on. In some embodiments, the selection of the physical memory block may be performed such that the block erase value is negatively correlated with the region version identifier. In particular, the processing device may determine that the region's version identifier belongs to the j-th lowest version identifier interval of the first distribution and select the first physical memory block from the j-th highest erase value interval of the second distribution.
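One possible realization of this interval-based matching is sketched below in Python; the use of equal-count quantiles, the number of intervals, and the tie-breaking rule are assumptions chosen for the example rather than requirements of the disclosure:

# Hypothetical sketch: match the j-th lowest version identifier interval of
# the region distribution with the j-th highest erase value interval of the
# block distribution, and select a block from that interval.

import statistics

def interval_index(value: float, values: list[float], n_intervals: int) -> int:
    # 0-based index of the equal-count quantile interval into which `value`
    # falls within the distribution `values`.
    cuts = statistics.quantiles(values, n=n_intervals)
    return sum(value > c for c in cuts)

def select_block(region_vi: int, all_vis: list[int],
                 block_ev: dict[int, int], n_intervals: int = 4) -> int:
    evs = list(block_ev.values())
    j = interval_index(region_vi, all_vis, n_intervals)   # j-th lowest VI interval
    target = n_intervals - 1 - j                          # j-th highest EV interval
    # Among blocks falling into the target erase value interval, pick one
    # (here, the most-worn one) as the candidate first physical memory block.
    candidates = [b for b, ev in block_ev.items()
                  if interval_index(ev, evs, n_intervals) == target]
    return max(candidates, key=block_ev.get)

vis = [2, 3, 5, 9, 14, 21]
blocks = {0: 100, 1: 400, 2: 900, 3: 1500, 4: 2500, 5: 4000}
print(select_block(2, vis, blocks))  # lowest-VI region -> block 5 (highest-EV group)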
In one illustrative, non-limiting example, upon determining that the first region has the lowest number of invalidation operations (e.g., the lowest version identifier) among the regions, the processing device may select, as the first physical memory block, a block having the highest erase value among the available blocks. In another illustrative, non-limiting example, upon determining that the first region is not the region having the lowest number of invalidation operations among the regions (e.g., that it has a version identifier different from the lowest version identifier), the processing device may select, as the first physical memory block, a block having the lowest erase value among the available blocks.
At operation 550, the processing device performing method 500 may map the next available LBA of the first region to the selected first physical memory block, and may store the data in the first physical memory block at operation 560.
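A compact Python sketch of operations 550-560 is given below; the Region class, the logical-to-physical dictionary, and the in-memory stand-in for the physical blocks are assumptions used only to make the example self-contained:

# Hypothetical sketch: map the next available LBA of the region (tracked by a
# per-region write cursor) to the selected block, then store the data there.

class Region:
    def __init__(self, first_lba: int, n_lbas: int):
        self.first_lba = first_lba
        self.n_lbas = n_lbas
        self.cursor = 0                 # index of the next available LBA

l2p: dict[int, int] = {}                # logical-to-physical block mapping
physical_blocks: dict[int, bytes] = {}  # stand-in for the physical memory blocks

def write_to_region(region: Region, block_id: int, data: bytes) -> int:
    lba = region.first_lba + region.cursor
    region.cursor += 1                  # regions are written sequentially
    l2p[lba] = block_id                 # operation 550: map LBA -> block
    physical_blocks[block_id] = data    # operation 560: store the data
    return lba

r = Region(first_lba=0, n_lbas=7)
print(write_to_region(r, block_id=101, data=b"payload"))  # -> LBA 0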
Fig. 6A illustrates a flow diagram of an example method 600 of performing a region-aware background scan in a memory device, according to some embodiments of the present disclosure. In one embodiment, the controller 115 of the memory subsystem (more specifically, the ZMC 113 of the controller 115) may perform the example method 600 based on instructions stored in embedded memory of the local memory 119. In some embodiments, firmware of memory subsystem 110 (or memory device 130) may perform example method 600. In some embodiments, an external processing device (e.g., a processing device of host system 120) may perform example method 600. Fig. 6B is a flow diagram of one possible embodiment 601 of a method 600 according to some embodiments of the present disclosure.
The method 600 may involve initiating a scan of the memory device (operation 610). In some embodiments, the scan may be initiated at regular intervals, at idle times, or in response to a failure of a memory operation (e.g., a read or write operation, or a particular number of such operations). In some embodiments, the scan may be initiated by the controller 115 based on instructions stored in the local memory 119 or based on instructions received from the host system 120. In some embodiments, the scan may be initiated and performed by the host system 120. At operation 620, the processing device performing the method 600 may identify the first region as having the lowest version identifier among the regions of the memory device. In some embodiments, the physical memory blocks mapped to the LBAs of the first region have the same first erase value. In some embodiments, in addition to identifying the first region, the processing device may also identify a plurality of regions of the memory device that are to be scanned. The identified regions may be ordered by their version identifiers. As indicated by block 620-1 in Fig. 6B, in some embodiments the processing device may set a region counter, starting with the region having the lowest version identifier (e.g., the first region).
At operation 630, the method 600 may continue with the processing device performing an Error Correction Code (ECC) analysis to detect risky physical memory blocks mapped to consecutive LBAs of the first region. As indicated by block 630-1 in Fig. 6B, in some embodiments the processing device may set a block counter, starting from the beginning of the region (i.e., from the block storing the oldest data), and scan consecutive blocks (step 630-2, as shown in Fig. 6B). For example, the processing device performing the method 600 may select (e.g., randomly or according to a predetermined pattern) some pages of the current block and determine a Bit Error Rate (BER) of the current block. If it is determined at decision step 630-3 that the BER of the current block is above a certain predetermined BER threshold, the current block may be "folded". That is, data from the current block may be moved to an available physical memory block (which may be associated with the next available LBA of the region), and the current block may be marked as an invalid block.
In some embodiments, all blocks that currently store data may be scanned within the region under test (e.g., the first region). In other embodiments, a limited number of blocks may be tested, and the region scan may be stopped when some predetermined first condition is met. For example, if it is determined (step 630-5, as shown in Fig. 6B) that the BER of n of the m physical memory blocks associated with the last m LBAs is below the BER threshold, the scanning of the current region may be stopped. Because, in region-aware memory management, consecutive LBAs are used in chronological order for write operations, the untested LBAs are associated with data that has been stored for a shorter time (shorter than for the LBAs already tested) and are therefore less likely to map to risky physical memory blocks. If, however, the processing device determines that the scanning of the current region is to continue, the processing device may increment the block counter (operation 630-6, as shown in Fig. 6B) and repeat loop 630-2 through 630-5 for the memory block mapped to the next LBA of the region.
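The per-region scan loop 630-1 through 630-6 may be summarized by the following Python sketch; the BER threshold, the synthetic BER measurement, and the values of n and m are assumptions made for illustration only:

# Hypothetical sketch of the per-region scan: test blocks in write order,
# fold any block whose BER exceeds a threshold, and stop early once n of the
# last m tested blocks have passed.

import random

BER_THRESHOLD = 1e-3

def measure_ber(block_id: int) -> float:
    # Stand-in for sampling a few pages of the block and running ECC; an
    # actual implementation would read the memory device and count bit errors.
    random.seed(block_id)
    return random.uniform(0.0, 2e-3)

def fold(block_id: int) -> None:
    # Stand-in for relocating the data to an available block and marking
    # the current block as invalid.
    print(f"folding risky block {block_id}")

def scan_region(block_ids: list[int], n: int = 3, m: int = 4) -> list[bool]:
    results = []                              # True = "pass", False = "fail"
    for block_id in block_ids:                # blocks holding the oldest data first
        ok = measure_ber(block_id) < BER_THRESHOLD
        if not ok:
            fold(block_id)
        results.append(ok)
        if len(results) >= m and sum(results[-m:]) >= n:
            break                             # first (per-region) stop condition met
    return results

print(scan_region([201, 202, 203, 204, 205, 206, 207]))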
Once the scan of the first region is complete (all blocks have been tested) or stopped (a sufficient number of blocks in the region have been found not to be at risk), the method 600 may continue, at operation 640, with determining whether the scan of the memory device is to be stopped at the first region (provided that some predetermined second condition is met) or whether a second region is to be scanned in the same manner. In some embodiments, to stop the scanning at the first (or any subsequent) region, the processing device needs to determine that the first (or any subsequent) region has no more than a certain predetermined number p of risky blocks, or no more than a certain predetermined percentage of risky blocks, or at least a certain number of risk-free blocks, or satisfies essentially any other suitable criterion. In some embodiments, the untested blocks of a region (if the scanning of the region stops before all LBAs of the region have been tested) may be assumed to be risk-free. In some embodiments, the untested blocks may be assumed to be at risk in the same proportion as in the last m tested blocks of the region, or as in all tested blocks of the region, and so on. In some embodiments, more than one region must satisfy one of the aforementioned (or other) conditions.
If it is determined that the scan of the memory device is to be stopped at the first region (the "yes" branch of decision block 640-1, as shown in Fig. 6B), the processing device may stop the scan. If, however, it is determined that the scanning of the memory device is not to be stopped at the first (or any subsequent) region, the method 600 may continue with identifying the second (or any subsequent) region as having the second (or any subsequent) lowest version identifier among the plurality of regions (operation 650). In some embodiments, the physical memory blocks mapped to the LBAs of the second region have the same second erase value, which is higher than (or at least equal to) the first erase value. Accordingly, in one embodiment, the processing device may increment the region counter (block 650-1, as shown in Fig. 6B) and repeat loop 630-1 through 640-1 for the second (or any subsequent) region. In particular, at operation 660, the processing device may perform an error correction analysis (starting at the first LBA of the second region) to detect risky physical memory blocks mapped to consecutive LBAs of the second region and, at operation 670, determine whether to stop the scanning of the memory device at the second region in view of the detected risky physical memory blocks mapped to the LBAs of the second region. The method 600 may continue until either 1) all regions have been scanned, or 2) the scanning is stopped upon a determination (e.g., at operation 640-1) that no further regions need to be tested.
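A device-level loop corresponding to operations 640 through 670 might then be sketched as follows; the simplified scan_region stand-in, the particular stop criterion (at most one risky block in the most recently scanned region, untested blocks assumed risk-free), and all numeric values are assumptions for illustration:

# Hypothetical sketch of the device-level scan: visit regions in ascending
# order of version identifier and stop once the most recently scanned region
# contains no more than `max_risky` risky blocks.

import random

BER_THRESHOLD = 1e-3

def scan_region(block_ids: list[int]) -> list[bool]:
    # Simplified stand-in for the per-region scan sketched above:
    # returns a pass/fail result for each tested block.
    results = []
    for block_id in block_ids:
        random.seed(block_id)
        results.append(random.uniform(0.0, 2e-3) < BER_THRESHOLD)
    return results

def scan_device(regions: dict[int, list[int]], vis: dict[int, int],
                max_risky: int = 1) -> None:
    for region_id in sorted(regions, key=vis.get):     # lowest VI first
        risky = scan_region(regions[region_id]).count(False)
        print(f"region {region_id}: {risky} risky block(s)")
        if risky <= max_risky:                         # second stop condition met
            break                                      # stop the device scan

regions = {10: [201, 202, 203], 11: [301, 302, 303], 12: [401, 402, 403]}
vis = {10: 2, 11: 5, 12: 9}
scan_device(regions, vis)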
Fig. 6C schematically illustrates an example region-aware background scan 602 performed according to some embodiments of the present disclosure. The first six regions, each having seven LBAs mapped to respective physical memory blocks, are schematically depicted in ascending order of version identifier. The background scan is performed in the direction indicated by the arrows. Blocks with a BER above (or equal to) the threshold are marked "fail", while blocks with a BER below the threshold are marked "pass". In the illustrative example of Fig. 6C, the scanning of a given region is stopped once three of the last four tested blocks receive a "pass" label (the first condition). Untested blocks are indicated by shading. Further, in the illustrative example of Fig. 6C, the scanning of the memory device stops if more than two-thirds of the blocks associated with each of the last two scanned regions receive a "pass" label (the second condition), where untested blocks are assumed to have received a "pass" label. As depicted in Fig. 6C, all blocks of the first and second regions are scanned, but the scanning of the third region is stopped after the first six blocks are tested (the third region satisfies the first condition). The scanning of the fourth region is stopped after only three blocks are tested (the fourth region satisfies the first condition), at which point the scanning of the memory device also stops. No further regions are tested because more than two-thirds (5 out of 7) of the blocks associated with the third region and more than two-thirds (7 out of 7) of the blocks associated with the fourth region have received a "pass" label (the second condition is met).
Fig. 7 illustrates an example machine of a computer system 700 within which a set of instructions for causing the machine to perform any one or more of the methodologies discussed herein may be executed. In some embodiments, computer system 700 may correspond to a host system (e.g., host system 120 of fig. 1) that includes, is coupled to, or utilizes a memory subsystem (e.g., memory subsystem 110 of fig. 1), or may be used to perform the operations of a controller (e.g., to execute an operating system to perform operations corresponding to region-aware memory management component 113 of fig. 1). In alternative embodiments, the machine may be connected (e.g., networked) to other machines in a LAN, an intranet, an extranet, and/or the internet. The machine may operate in the capacity of a server or a client machine in a client-server network environment, as a peer machine in a peer-to-peer (or distributed) network environment, or as a server or a client machine in a cloud computing infrastructure or environment.
The machine may be a Personal Computer (PC), a tablet PC, a set-top box (STB), a Personal Digital Assistant (PDA), a cellular telephone, a web appliance, a server, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine. Additionally, while a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.
Example computer system 700 includes a processing device 702, a main memory 704 (e.g., Read Only Memory (ROM), flash memory, Dynamic Random Access Memory (DRAM) such as Synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc.), a static memory 706 (e.g., flash memory, Static Random Access Memory (SRAM), etc.), and a data storage system 718, which communicate with each other via a bus 730.
The processing device 702 represents one or more general-purpose processing devices such as a microprocessor, central processing unit, or the like. More specifically, the processing device may be a Complex Instruction Set Computing (CISC) microprocessor, Reduced Instruction Set Computing (RISC) microprocessor, Very Long Instruction Word (VLIW) microprocessor, or processor implementing other instruction sets, or processors implementing a combination of instruction sets. The processing device 702 may also be one or more special-purpose processing devices such as an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), network processor, or the like. The processing device 702 is configured to execute instructions 726 for performing the operations and steps discussed herein. The computer system 700 may further include a network interface device 708 that communicates over a network 720.
The data storage system 718 may include a machine-readable storage medium 724 (also referred to as a non-transitory computer-readable storage medium) on which is stored one or more sets of instructions 726 or software embodying any one or more of the methodologies or functions described herein. The instructions 726 may also reside, completely or at least partially, within the main memory 704 and/or within the processing device 702 during execution thereof by the computer system 700, the main memory 704, and the processing device 702 also constituting machine-readable storage media. The machine-readable storage media 724, data storage system 718, and/or main memory 704 may correspond to memory subsystem 110 of fig. 1.
In one embodiment, instructions 726 include instructions that implement the functionality corresponding to ZMC 113 of FIG. 1. While the machine-readable storage medium 724 is shown in an example embodiment to be a single medium, the term "machine-readable storage medium" should be taken to include a single medium or multiple media that store the one or more sets of instructions. The term "machine-readable storage medium" shall also be taken to include any medium that is capable of storing or encoding a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present disclosure. The term "machine-readable storage medium" shall accordingly be taken to include, but not be limited to, solid-state memories, optical media, and magnetic media.
Some portions of the preceding detailed description have been presented in terms of operations and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the most effective means used by those skilled in the data processing arts to convey the substance of their work to others skilled in the art. An algorithm or operation is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. The present disclosure may refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage systems.
The present disclosure also relates to an apparatus for performing the operations herein. This apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer-readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
The algorithms, operations, and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform the method. The structure for a variety of these systems will appear from the description below. In addition, the present disclosure is not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the disclosure as described herein.
The present disclosure may be provided as a computer program product or software which may include a machine-readable medium having stored thereon instructions which may be used to program a computer system (or other electronic devices) to perform a process according to the present disclosure. A machine-readable medium includes any mechanism for storing information in a form readable by a machine (e.g., a computer). In some embodiments, a machine-readable (e.g., computer-readable) medium includes a machine (e.g., computer) -readable storage medium, such as read only memory ("ROM"), random access memory ("RAM"), magnetic disk storage media, optical storage media, flash memory components, and so forth.
The word "example" or "exemplary" is used herein to mean serving as an example, instance, or illustration. Any aspect or design described herein as "example" or "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs. Rather, use of the word "example" or "exemplary" is intended to present concepts in a concrete fashion. As used in this application, the term "or" means an inclusive "or" rather than an exclusive "or". That is, unless specified otherwise or clear from context, "X comprises a or B" means any of the natural inclusive permutations. That is, if X contains A; x comprises B; or X includes both A and B, then "X includes A or B" is satisfied under any of the foregoing circumstances. In addition, the articles "a" and "an" as used in this application and the appended claims may generally be construed to mean "one or more" unless specified otherwise or clear from context to be directed to a singular form. Furthermore, the use of the terms "an embodiment" or "one embodiment" or the like throughout is not intended to denote the same embodiment or embodiment, unless so described. One or more embodiments described herein may be combined in particular embodiments. The terms "first," "second," "third," "fourth," and the like as used herein are intended as labels to distinguish between different elements and may not necessarily have an ordinal meaning consistent with their numerical designation.
In the foregoing specification, embodiments of the disclosure have been described with reference to specific example embodiments thereof. It will be evident that various modifications may be made thereto without departing from the broader spirit and scope of embodiments of the disclosure as set forth in the following claims. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than a restrictive sense.

Claims (20)

1. A system, comprising:
a memory device comprising a plurality of physical memory blocks and associated with a logical address space comprising a plurality of regions, wherein each region comprises a plurality of logical block addresses, LBAs; and
a processing device operatively coupled with the memory device to perform operations comprising:
receiving a request to store data referenced by an LBA associated with a first region of the plurality of regions;
obtaining a version identifier of the first region;
obtaining erase values for a plurality of available physical memory blocks of the memory device, wherein each erase value represents a number of completed erase operations for a respective one of the plurality of memory blocks;
selecting a first physical memory block of a plurality of available physical memory blocks in view of the version identifier and the erasure value for the first region;
mapping a next available LBA associated with the first region to the first physical memory block; and
storing the data in the first physical memory block.
2. The system of claim 1, wherein the version identifier of a region reflects a number of invalidations of the region.
3. The system of claim 1, wherein overwriting information referenced by the LBAs associated with the first region comprises invalidating at least some LBAs of the first region.
4. The system of claim 1, wherein the first region has a lowest number of failed operations among the plurality of regions, and wherein the first physical memory block has a highest erase value among a qualified subset of available physical memory blocks.
5. The system of claim 4, wherein the qualified subset of available physical memory blocks is a subset of the plurality of available physical memory blocks for which an erase value does not exceed a threshold erase value.
6. The system of claim 1, wherein the first region is different from a region having a lowest failure operation number among the plurality of regions, and wherein the first physical memory block has a lowest erasure value among the plurality of available physical memory blocks.
7. The system of claim 1, wherein selecting the first physical memory block comprises:
obtaining a first distribution of version identifiers for the plurality of regions;
obtaining a second distribution of erase values for the plurality of available physical memory blocks; and
selecting the first physical memory block based on the first distribution and the second distribution.
8. The system of claim 7, wherein selecting the first physical memory block based on the first distribution and the second distribution comprises:
identifying a specified number of intervals of the first distribution;
identifying the specified number of intervals of the second distribution;
determining that the version identifier of the first region belongs to a jth low version identifier interval of the first distribution; and
selecting the first physical memory block from a jth high erase interval of the second distribution.
9. The system of claim 1, wherein the request to store data is generated by an application executed by a host computing system communicatively coupled to the memory device, and wherein the first region is allocated to the application.
10. The system of claim 1, wherein the operations further comprise:
initiating a scan of the memory device;
identifying a second region of the plurality of regions as the region having the lowest version identifier;
performing error correction analysis to detect risky physical memory blocks mapped to consecutive LBAs of the first region, starting from a first LBA of the first region; and
determining whether to stop the scan of the memory device at the first region in view of detected risky physical memory blocks mapped to LBAs of the first region.
11. The system of claim 10, wherein the operations further comprise:
upon determining that the scanning of the memory device is not to be stopped at the first region, identifying a second region as having a second low version identifier in the plurality of regions;
performing an error correction analysis to detect risky physical memory blocks mapped to consecutive LBAs of the second region, starting from the first LBA of the second region; and
determining whether to stop the scan of the memory device at the second region in view of detected risky physical memory blocks mapped to LBAs of the second region.
12. A system, comprising:
a memory device comprising a plurality of physical memory blocks and associated with a logical address space comprising a plurality of regions, wherein each region comprises a plurality of logical block addresses, LBAs, and
a processing device operatively coupled with the memory device to perform operations comprising:
initiating a scan of the memory device;
identifying a first region of the plurality of regions as a region having a lowest version identifier, wherein the version identifier of a region reflects a number of invalidations to the region;
performing error correction analysis to detect risky physical memory blocks mapped to consecutive LBAs of the first region, starting from a first LBA of the first region; and
determining whether to stop the scan of the memory device at the first region in view of detected risky physical memory blocks mapped to LBAs of the first region.
13. The system of claim 12, wherein the operations further comprise:
stopping the scanning of the memory device upon determining that the scanning of the memory device is to be stopped at the first region.
14. The system of claim 12, wherein the operations further comprise:
upon determining that the scanning of the memory device is not to be stopped at the first region, identifying a second region as having a second low version identifier in the plurality of regions;
performing an error correction analysis to detect risky physical memory blocks mapped to consecutive LBAs of the second region, starting from the first LBA of the second region; and
determining whether to stop the scan of the memory device at the second region in view of detected risky physical memory blocks mapped to LBAs of the second region.
15. The system of claim 14, wherein the physical memory blocks mapped to the LBAs of the first region have a same first erase value.
16. The system of claim 15, wherein the physical memory blocks mapped to the LBAs of the second region have a same second erase value, wherein the second erase value is higher than the first erase value.
17. The system of claim 15, wherein the operations further comprise:
upon determining that the scanning of the memory device is not to be stopped at the second region, identifying a third region as having a third low version identifier in the plurality of regions;
performing an error correction analysis to detect risky physical memory blocks mapped to consecutive LBAs of the third region, starting from the first LBA of the third region; and
determining whether to stop the scan of the memory device at the third region in view of a number of detected risky physical memory blocks mapped to LBAs of the third region.
18. The system of claim 12, wherein the operations performed by the processing device further comprise: receiving a request to store data in association with the first region of the plurality of regions;
obtaining a version identifier of the first region;
obtaining erase values for a plurality of available physical memory blocks;
selecting a target physical memory block of the plurality of available physical memory blocks in view of the version identifier of the first region and the obtained erasure value;
mapping the target physical memory block to a next available LBA associated with the first region; and
storing the data in the target physical memory block.
19. A method, comprising:
receiving, by a processing device operatively coupled with a memory device, a request to store data referenced by a logical block address, LBA, associated with a first region of a plurality of regions, wherein each region includes a plurality of logical block addresses, LBAs, of a logical address space of the memory device, the memory device having a plurality of physical memory blocks;
obtaining, by the processing device, a version identifier of the first region;
obtaining, by the processing device, erase values for a plurality of available physical memory blocks of the memory device, wherein each erase value represents a number of completed erase operations for a respective one of the plurality of available memory blocks;
selecting, by the processing device, a first physical memory block of the plurality of available physical memory blocks in view of the version identifier and the erasure value for the first region;
mapping, by the processing device, a next available LBA associated with the first region to the first physical memory block; and
storing, by the processing device, the data in the first physical memory block.
20. The method of claim 19, wherein the first region has a lowest number of failed operations among the plurality of regions, and wherein the first physical memory block has a highest erase value among a qualified subset of available physical memory blocks.
CN202110676084.0A 2020-06-18 2021-06-18 Region-aware memory management in a memory subsystem Active CN113823346B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US16/946,377 US11550727B2 (en) 2020-06-18 2020-06-18 Zone-aware memory management in memory subsystems
US16/946,377 2020-06-18

Publications (2)

Publication Number Publication Date
CN113823346A true CN113823346A (en) 2021-12-21
CN113823346B CN113823346B (en) 2024-08-27

Family

ID=78923882

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110676084.0A Active CN113823346B (en) 2020-06-18 2021-06-18 Region-aware memory management in a memory subsystem

Country Status (2)

Country Link
US (3) US11550727B2 (en)
CN (1) CN113823346B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20220005111A (en) * 2020-07-06 2022-01-13 에스케이하이닉스 주식회사 Memory system, memory controller, and operating method of memory system
US11442646B2 (en) * 2020-09-09 2022-09-13 Western Digital Technologies Inc. Identified zones for optimal parity sharing zones
TWI808384B (en) * 2021-02-23 2023-07-11 慧榮科技股份有限公司 Storage device, flash memory control and control method thereo
US20220382668A1 (en) * 2021-05-28 2022-12-01 Advantest Corporation Systems and methods for concurrent and automated testing of zoned namespace solid state drives
US11593018B2 (en) * 2021-07-21 2023-02-28 Micron Technology, Inc. Block allocation and erase techniques for sequentially-written memory devices
US11687290B1 (en) * 2022-01-13 2023-06-27 Silicon Motion, Inc. Method for improve read disturbance phenomenon of flash memory module and associated flash memory controller and electronic device
US12107602B2 (en) * 2022-06-02 2024-10-01 Micron Technology, Inc. Dynamic decoding for memory systems

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5341339A (en) * 1992-10-30 1994-08-23 Intel Corporation Method for wear leveling in a flash EEPROM memory
US20050162930A1 (en) * 2003-12-26 2005-07-28 Tdk Corporation Memory controller, flash memory system, and method for recording data on flash memory
US20100268864A1 (en) * 2009-04-20 2010-10-21 Arunprasad Ramiya Mothilal Logical-to-Physical Address Translation for a Removable Data Storage Device
US9075714B1 (en) * 2014-05-13 2015-07-07 Western Digital Technologies, Inc. Electronic system with data management mechanism and method of operation thereof
US9448919B1 (en) * 2012-11-13 2016-09-20 Western Digital Technologies, Inc. Data storage device accessing garbage collected memory segments
US20180307417A1 (en) * 2015-10-19 2018-10-25 Huawei Technologies Co., Ltd. Method and device for determination of garbage collector thread number and activity management in log-structured file systems
US10126981B1 (en) * 2015-12-14 2018-11-13 Western Digital Technologies, Inc. Tiered storage using storage class memory
CN109726139A (en) * 2017-10-27 2019-05-07 东芝存储器株式会社 Storage system and control method

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110041039A1 (en) * 2009-08-11 2011-02-17 Eliyahou Harari Controller and Method for Interfacing Between a Host Controller in a Host and a Flash Memory Device
US8341498B2 (en) * 2010-10-01 2012-12-25 Sandisk Technologies Inc. System and method of data encoding
TWI444825B (en) * 2011-03-29 2014-07-11 Phison Electronics Corp Memory storage device, memory controller thereof, and method for programming data thereof
TWI529719B (en) * 2013-08-30 2016-04-11 慧榮科技股份有限公司 Data storage device and flash memory control method
US11726679B2 (en) * 2019-11-05 2023-08-15 Western Digital Technologies, Inc. Applying endurance groups to zoned namespaces

Also Published As

Publication number Publication date
US20230161712A1 (en) 2023-05-25
CN113823346B (en) 2024-08-27
US20210397562A1 (en) 2021-12-23
US11550727B2 (en) 2023-01-10
US20240256463A1 (en) 2024-08-01
US11960409B2 (en) 2024-04-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant