WO2022125101A1 - Distributed ecc scheme in memory controllers - Google Patents

Distributed ecc scheme in memory controllers

Info

Publication number
WO2022125101A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
nand flash
error correction
decoder
ecc
Prior art date
Application number
PCT/US2020/064310
Other languages
French (fr)
Inventor
Chaohong Hu
Chun Liu
Xin LIAO
Original Assignee
Futurewei Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Futurewei Technologies, Inc.
Priority to PCT/US2020/064310 (WO2022125101A1)
Priority to CN202080107284.8A (CN116490853A)
Publication of WO2022125101A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1048 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices using arrangements adapted for a specific error detection or correction feature
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present disclosure is generally related to SSD (Solid State Drive) controllers, and specifically to methods for using NAND (Not-AND) flash memory SRAM (Static Random Access Memory) in SSD controllers.
  • SSDs store data in solid state devices, rather than in a magnetic or optical medium.
  • a typical SSD comprises a controller and solid state memory devices.
  • a host device performs write and read operations on the SSD.
  • the SSD acknowledges receipt of the data, stores the data, and subsequently retrieves data. Reading and storing the data on the SSD is prone to errors.
  • Typical SSDs perform error correction in the host interface or the memory controller when data is read.
  • the error correction criterion comprises at least one of a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
  • the method includes determining that the first decoder is busy performing other operations; and in response to determining that the first decoder is busy performing other operations, communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
  • the method includes determining that the first decoder is busy performing other operations; and in response to determining that the first decoder is busy performing other operations, communicating the retrieved data and the ECC with which the data was encoded over the first channel to a second NAND flash device, the second NAND flash device decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the second NAND flash device to correct the one or more errors in the retrieved data.
  • the method includes receiving, by the flash memory controller of the SSD, the decoded data from the second NAND flash device over a second channel associated with the second NAND flash device.
  • the retrieved data and the ECC are communicated to the second NAND flash device via the flash memory controller of the SSD.
  • the method includes determining that the error correction criterion corresponds to prioritizing a latency parameter to reduce error correction latency; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using the second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data in parallel with decoding, based on the ECC with which the data was encoded, the data using the first decoder.
  • the method includes accessing the decoded data from whichever one of the first decoder and the second decoder completes decoding the data first.
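
The latency-prioritized variant above, in which both decoders work on the same data and the result is taken from whichever finishes first, can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function `decode_in_parallel` and the stand-in decoders `fast_decoder` and `slow_decoder` are hypothetical names, and threads merely simulate two independent hardware decoders.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def decode_in_parallel(raw, ecc, first_decoder, second_decoder):
    """Run both decoders on the same input; return (winner_name, decoded_data)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            pool.submit(first_decoder, raw, ecc): "first",
            pool.submit(second_decoder, raw, ecc): "second",
        }
        # Block until at least one decoder has produced a result.
        done, _pending = wait(list(futures), return_when=FIRST_COMPLETED)
        winner = next(iter(done))
        # Note: leaving the "with" block still waits for the slower decoder.
        # Real hardware would more likely abort or ignore it.
        return futures[winner], winner.result()


# Hypothetical stand-in decoders: they return the data unchanged and differ
# only in how long they take.
def fast_decoder(raw, ecc):
    return raw


def slow_decoder(raw, ecc):
    time.sleep(0.3)
    return raw
```

Calling `decode_in_parallel(b"d", b"e", fast_decoder, slow_decoder)` reports the first decoder as the winner; swapping the two arguments reports the second.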
  • the method includes determining that the error correction criterion corresponds to prioritizing a balanced parameter; and performing partial decoding, based on the ECC with which the data was encoded, of the data using the first decoder implemented on the first NAND flash device; and communicating the partially decoded data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller completing decoding the partially decoded data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller.
  • the first decoder is configured to perform a weak error correction comprising at least one of hard sensing of the data or a first number of iterations.
  • the second decoder is configured to perform a strong error correction comprising at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations.
  • the first and second decoders comprise different resource characteristics and different latencies.
  • the NAND flash devices comprise a 3D or 4D flash memory device.
  • the method includes generating an error correction result in the first decoder; determining that an uncorrectable error exists in the error correction result; and in response to determining that the uncorrectable error exists in the error correction result, transmitting, to the flash memory controller of the SSD, a data packet comprising the data and error correction information from the first NAND flash device.
  • the ECC comprises block codes.
  • a system for use in performing error correction in a solid state drive includes: a plurality of Not- AND (NAND) flash devices, each NAND flash device of the plurality of NAND flash devices having on-die NAND flash memory and a respective decoder of a plurality of decoders, a first NAND flash device of the plurality of NAND flash devices performs operations comprising: receiving a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; determining whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a
  • the operations further comprise: in response to determining that the parameter fails to satisfy the error correction criterion: bypassing the first decoder implemented on the first NAND flash device; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
  • the error correction criterion comprises at least one of a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
  • an apparatus for use in a solid state drive (SSD) comprising a plurality of Not-AND (NAND) flash devices includes: means for receiving, by a first NAND flash device of the plurality of NAND flash devices, a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; means for retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; means for determining, by the first NAND flash device, whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; means for in response to determining that the parameter sati
  • FIG. 1 is a schematic diagram of a NAND flash SSD, according to some embodiments.
  • FIG. 2 is a schematic diagram of the NAND flash devices of the SSD of FIG. 1, according to some embodiments.
  • FIGS. 3A-D are schematic diagrams of distributed error correction schemes, according to some embodiments.
  • FIGS. 4A, 4B, 5, 6 and 7 illustrate flow charts of performing distributed error correction, according to some embodiments.
  • FIG. 8 is a block diagram illustrating circuitry in the form of a processing system for implementing the systems and methods for performing distributed error correction, according to some embodiments.
  • Newly developed NAND flash memory chips include static random-access memory (SRAM) on the chip. Such chips may be so-called 3D NAND chips or 4D NAND chips. In this disclosure both types will be referred to, collectively, as ‘NAND chips with on-die SRAM.’ Some such NAND chips provide 1 MB (megabyte) of on-die SRAM, but others provide more or less than 1 MB of on-die SRAM.
  • the physical layout of such 3D NAND chips and 4D NAND chips provides increased memory storage and additional physical space for extra processing devices, such as encoders and decoders.
  • These processing devices are referred to as the backend memory controllers and can be used to distribute the error correction operations, such as decoding data stored by the 3D or 4D NAND chips on the die itself.
  • These decoding operations can supplement or replace decoding operations typically performed by flash memory controllers, referred to as the front-end memory controllers.
  • This disclosure presents novel processes for performing error correction, such as data decoding, using the on-die decoders of such NAND chips.
  • FIG. 1 is a schematic diagram of a NAND flash SSD 100.
  • the SSD 100 includes a main CPU 102 and a NAND Flash Interface (NFI) CPU 108.
  • the main CPU 102 includes a front-end CPU 104 and a back-end CPU 106.
  • the front-end CPU 104 implements a handler for commands received from a host device 130 via a PCIe bus (Peripheral Component Interconnect Express), SAS bus (Serial Attached SCSI (Small Computer System Interface)), or other suitable interface.
  • the front-end CPU 104 also implements a scheduler for Back End (BE) commands that are issued in response to received host commands.
  • the back-end CPU 106 implements back end firmware (FW), performing Flash Translation Layer (FTL), mapping, and other back-end functions.
  • the NFI CPU 108 controls and manages channels 122. Each channel 122 communicates data and commands to a subset of NAND flash chips in NAND flash devices 210 (which are described in greater detail with reference to FIG. 2). In other SSDs, the main CPU 102 and/or NFI CPU 108 may be implemented with other numbers or types of CPUs and/or other distributions of functionality.
  • the SSD 100 further includes Dynamic Random Access Memory (DRAM) 112, SRAM 114, Hardware (HW) Accelerators 116, and Other Peripherals 118.
  • the DRAM 112 is 32 Gigabytes (GB) in size, but may be larger or smaller in other SSDs.
  • the SRAM 114 is 10 Megabytes (MB), but may be larger or smaller in other SSDs.
  • the HW Accelerators 116 include an Exclusive-OR (XOR) engine, a buffer manager, and a HW Garbage Collection (GC) engine, and may include other HW circuits designed to independently handle specific, limited functions for the Main CPU 102 and the NFI CPU 108.
  • the Other Peripherals 118 may include circuits such as a Serial Peripheral Interface (SPI) circuit, a General Purpose Input/Output (GPIO) circuit, an Inter-Integrated Circuit (I2C) bus interface, a Universal Asynchronous Receiver/Transmitter (UART) circuit, and other interface circuits.
  • the SSD 100 further includes flash subsystems 120, which may include a Low Density Parity Check (LDPC) or other error correction circuit (e.g., decoder), a randomizer circuit, a flash signal processing circuit, and may include other circuits that provide processing relating to writing and reading data to the NAND flash devices 210.
  • the flash subsystems 120 are in some instances referred to herein as the front-end memory controller.
  • the decoders can be implemented on the front-end memory controller and on the NAND flash array 150. Specifically, when digital data is stored in nonvolatile memory, it is crucial to have a mechanism that can detect and correct a certain number of errors. This mechanism is known as data decoding.
  • Block code decoders operate on codes that are referred to as n and k codes. A block of k data bits is encoded to become a block of n bits called a code word. In block codes, the code words do not have any dependency on previously encoded messages.
  • Block codes can include linear and non-linear codes and either type can be systematic.
  • Linear codes include repetition, parity, Hamming and Cyclic codes.
  • Convolution code decoders operate on code words that depend on both the data message and a given number of previously encoded messages. The encoder changes state with every message processed.
  • LDPC (Low Density Parity Check) codes are linear block error correction codes, although convolutional LDPC variants also exist.
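
As a concrete instance of the block codes described above, the following sketch encodes k = 4 data bits into an n = 7 bit code word with the classic Hamming(7,4) code and corrects any single flipped bit. The disclosure does not prescribe this code; it is used here only because it is one of the linear block codes named above and is small enough to follow by hand. The function names are illustrative.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming code word."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over code word positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over code word positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over code word positions 4, 5, 6, 7
    # Code word layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code_word):
    """Correct at most one flipped bit; return (data_bits, error_position).

    error_position is 0 when no error was detected, otherwise the 1-based
    position of the bit that was corrected.
    """
    c = list(code_word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome read as a binary number
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position
```

Because each code word depends only on its own four data bits, blocks can be decoded independently of one another, which is the property that distinguishes block codes from convolutional codes.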
  • data is read from the NAND flash array 150 by the flash subsystems 120.
  • the flash subsystems 120 are configured to always perform error correction using the decoder implemented by the front-end memory controller to detect and correct for memory and bus errors. While such approaches generally work well, using the same decoder to handle all error correction operations creates a bottleneck and a single point of failure for reading, recovering and correcting data. This process also consumes bandwidth on the channels used to receive the data from the NAND flash array 150 as ECC information has to be transmitted over the channels in addition to the underlying data. This slows down the process of reading and decoding data from the NAND flash array 150.
  • As technology with NAND flash arrays 150 improves, additional physical space becomes available on the NAND flash arrays 150. This is because the NAND memory cells become increasingly smaller and can be physically arranged in a stacked and layered manner, which frees up physical space on the same size die. This physical space can be utilized to include additional decoders on the NAND flash arrays 150 themselves. This duplication and additional provision of decoders enables schemes to distribute error correction efforts between the front-end memory controller and the backend memory controllers. According to the disclosed embodiments, a distributed scheme for performing error correction on NAND flash arrays 150 is provided, specifically for performing error correction on 3D or 4D memory devices.
  • the distributed and coupled ECC scheme includes an ECC manager in the front-end memory controller that manages the ECC resources and distributes error correction operations. The ECC manager determines when and where the ECC function or partial ECC function will be executed.
  • the ECC manager coordinates ECC operations to cause error correction to be performed only on the front-end memory controller, only on the decoder implemented on one or more backend memory controllers, or partially on both the front-end memory controller and on one or more backend memory controllers.
  • the ECC manager considers data traffic, ECC resource characteristics, performance/energy priorities and various other factors in controlling where and when ECC operations are performed in the distributed scheme.
  • the ECC manager provides a parameter in a request to read data to a given backend memory controller to control whether the backend memory controller uses the decoder implemented on the NAND flash device to decode the data or whether such a decoder is bypassed to perform decoding by the front-end memory controller.
  • the NAND flash device determines whether the parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a decoder implemented on the NAND flash device or whether error correction will be performed by the front-end memory controller.
  • the error correction criterion can include a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
  • the NAND flash device based on determining that the parameter satisfies the error correction criterion, decodes, based on the ECC associated with the data, the data using the decoder implemented on the NAND flash device to correct one or more errors in the retrieved data and communicates the decoded data over a channel to the flash memory controller of the SSD.
  • the NAND flash device determines that the parameter fails to satisfy the error correction criterion.
  • the decoder implemented on the NAND flash device is bypassed (e.g., raw data along with its ECC is routed around the decoder of the NAND flash device).
  • the encoded data (raw data plus ECC) is communicated over the channel to the flash memory controller of the SSD which decodes the data, based on the associated ECC, using a decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
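
The device-side choice described above, between decoding on the NAND flash device and bypassing its decoder, can be sketched as follows. This is a simplified model under stated assumptions: `ReadResponse`, `handle_read`, and the `criterion` predicate are hypothetical names, and the disclosure does not specify how the parameter is represented.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ReadResponse:
    payload: bytes          # decoded data, or raw data when the decoder is bypassed
    ecc: Optional[bytes]    # sent along only when the on-die decoder is bypassed
    decoded_on_die: bool


def handle_read(
    raw: bytes,
    ecc: bytes,
    parameter: Any,
    criterion: Callable[[Any], bool],
    on_die_decode: Callable[[bytes, bytes], bytes],
) -> ReadResponse:
    """Model of a NAND flash device answering a read request."""
    if criterion(parameter):
        # Parameter satisfies the criterion: decode locally, return decoded data.
        return ReadResponse(on_die_decode(raw, ecc), None, True)
    # Parameter fails the criterion: route raw data and ECC around the decoder.
    return ReadResponse(raw, ecc, False)
```

In the bypass case the response carries the ECC bytes as well as the raw data, so that the flash memory controller can decode with its own decoder.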
  • a determination is made that the decoder of the NAND flash device is busy.
  • the retrieved data and the ECC associated with the data are communicated over a channel to a second NAND flash device.
  • the second NAND flash device decodes the data, based on the associated ECC, using a decoder implemented by the second NAND flash device to correct the one or more errors in the retrieved data.
  • the flash memory controller of the SSD receives the decoded data from the second NAND flash device over a channel associated with the second NAND flash device.
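
A minimal sketch of the busy-decoder case above: when the first device's decoder is occupied, the raw data and ECC are handed to a second device for decoding. The class `NandDevice`, the function `read_with_offload`, and the toy ECC (two extra copies of the data, decoded by bitwise majority vote) are all assumptions made for illustration and are not taken from the disclosure.

```python
class NandDevice:
    """Toy model of a NAND flash device with an on-die decoder."""

    def __init__(self, name, busy=False):
        self.name = name
        self.busy = busy

    def decode(self, raw, ecc):
        # Toy ECC: ecc holds two extra copies of the data. A bitwise majority
        # vote over the three copies corrects any bit that is wrong in only one.
        n = len(raw)
        copy_b, copy_c = ecc[:n], ecc[n:2 * n]
        return bytes(
            (x & y) | (x & z) | (y & z) for x, y, z in zip(raw, copy_b, copy_c)
        )


def read_with_offload(first, second, raw, ecc):
    """Decode on the first device unless its decoder is busy.

    Returns (name_of_decoding_device, decoded_data). What should happen when
    the second decoder is busy too is not modeled here.
    """
    target = second if first.busy else first
    return target.name, target.decode(raw, ecc)
```

For example, with `data = b"\x5a"`, `ecc = data * 2`, and a corrupted read `raw = b"\x5b"`, a busy `LUN0` hands the work to `LUN1`, which recovers `data`.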
  • the NAND flash device determines that the error correction criterion corresponds to prioritizing a balanced parameter.
  • the NAND flash device performs partial decoding, based on the ECC associated with the data, of the data using the decoder implemented on the NAND flash device.
  • the partially decoded data and the ECC associated with the partially decoded data are communicated over the channel to the flash memory controller of the SSD which completes decoding the partially decoded data, based on the associated ECC, using a decoder implemented by the flash memory controller.
  • the decoder of the NAND flash device can be configured to perform a weak error correction (e.g., by decoding based on at least one of hard sensing of the data or a first number of iterations).
  • the decoder of the front-end memory controller is configured to perform a strong error correction (e.g., by decoding the data based on at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations).
  • the NAND flash device may start decoding the encoded data for a first number of iterations of an LDPC error correction code and then communicate that partially decoded data and corresponding ECC information to the front-end memory controller to perform a remaining set of iterations of the LDPC error correction code to complete decoding the data.
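
The split of iterations between the two decoders can be sketched as a shared iteration budget. The sketch below does not implement LDPC decoding: `step` is a stand-in for one decoder iteration and `converged` for the check that no errors remain, both supplied by the caller. All names are hypothetical.

```python
def run_iterations(state, step, converged, max_iters):
    """Apply step() until converged or the budget is spent; return (state, used)."""
    used = 0
    while used < max_iters and not converged(state):
        state = step(state)
        used += 1
    return state, used


def split_decode(state, step, converged, first_iters, total_iters):
    """Run the first iterations on the NAND device, the remainder on the controller.

    Returns (final_state, device_iterations, controller_iterations, success).
    """
    # On-die decoder: at most first_iters iterations.
    state, on_device = run_iterations(state, step, converged, first_iters)
    # Front-end decoder: continues from the partially decoded state.
    state, on_controller = run_iterations(
        state, step, converged, total_iters - first_iters
    )
    return state, on_device, on_controller, converged(state)
```

With a toy state that counts unsatisfied parity checks and a step that clears one per iteration, `split_decode(5, lambda s: s - 1, lambda s: s == 0, 2, 8)` spends two iterations on the device and three on the controller.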
  • the NAND flash device attempts to decode the data read from the NAND flash device using the decoder of the NAND flash device according to a first error correction scheme.
  • the NAND flash device determines that there exist uncorrectable errors.
  • the NAND flash device communicates the ECC along with the raw data to the front-end memory controller with an indication that an uncorrectable error exists.
  • the front-end memory controller uses a more advanced decoder and error correction scheme to attempt to recover the uncorrectable errors.
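
The uncorrectable-error fallback above can be sketched as follows, assuming a toy ECC made of two extra copies of the data. The weak on-die decoder only compares the data against one copy and gives up on a mismatch; the stronger front-end decoder takes a bitwise majority vote across all three copies. The `Packet` layout and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    data: bytes
    ecc: bytes
    uncorrectable: bool  # set by the NAND device when its decoder gave up


def weak_decode(raw, ecc):
    """On-die stand-in: accept the data only if it matches the first copy."""
    return raw, raw == ecc[:len(raw)]


def strong_decode(raw, ecc):
    """Front-end stand-in: bitwise majority vote over the three copies."""
    n = len(raw)
    return bytes(
        (x & y) | (x & z) | (y & z) for x, y, z in zip(raw, ecc[:n], ecc[n:2 * n])
    )


def nand_read(raw, ecc):
    data, ok = weak_decode(raw, ecc)
    if ok:
        return Packet(data, b"", False)
    # Weak decoding failed: ship raw data and ECC with an uncorrectable flag.
    return Packet(raw, ecc, True)


def controller_receive(packet):
    if not packet.uncorrectable:
        return packet.data
    return strong_decode(packet.data, packet.ecc)
```

A read of `b"\x0e\xf0"` whose stored copies are both `b"\x0f\xf0"` is flagged by `nand_read` and then recovered by `controller_receive`.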
  • FIG. 2 is a schematic diagram of NAND flash devices 210 of the SSD 100 of FIG. 1.
  • Each channel 122 communicates data and commands from the flash subsystems 120 to a subset of NAND flash chips in the NAND flash devices 210.
  • the sixteen channels CH0, CH1, ... CH15 are coupled respectively to subsets 226a, 226b, ... 226p of the NAND flash devices 210.
  • Within each subset are sixteen NAND flash devices, identified as Logical Unit (LUN)0, LUN1, ... LUN15.
  • the terms NAND flash device and LUN are used interchangeably herein.
  • fewer channels or more channels may be used.
  • fewer or more NAND flash devices per channel may be provided.
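
For the sixteen-channel, sixteen-LUN arrangement described above, a flat device index can be mapped to a channel and LUN as sketched below. The flat numbering (index = channel × 16 + LUN) is an assumption made for illustration; the disclosure does not define an addressing scheme.

```python
CHANNELS = 16           # CH0 .. CH15
LUNS_PER_CHANNEL = 16   # LUN0 .. LUN15 within each subset


def locate(device_index):
    """Map a flat device index (0..255) to a (channel, lun) pair."""
    if not 0 <= device_index < CHANNELS * LUNS_PER_CHANNEL:
        raise ValueError("device index out of range")
    return divmod(device_index, LUNS_PER_CHANNEL)


def subset_label(channel):
    """Reference label of the subset served by a channel: 226a .. 226p."""
    if not 0 <= channel < CHANNELS:
        raise ValueError("channel out of range")
    return "226" + chr(ord("a") + channel)
```

Under this numbering, device 17 is LUN1 on CH1, and CH15 serves subset 226p.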
  • each of the subsets 226a, 226b, . . . 226p of the NAND flash devices 210 implements a respective decoder on its backend memory controller.
  • one decoder may be implemented on the flash memory controller, such as on the flash subsystems 120 and one or more additional decoder instances may be implemented on each of the subsets 226a, 226b, . . . 226p of the NAND flash devices 210.
  • one subset 226a may communicate with any one or more of the other subsets 226b-p via the channels 122. These communications may take place directly between the subsets and/or via processing devices in the flash subsystems 120.
  • the decoders implemented by the backend memory controllers may differ from the decoder implemented by the front-end memory controller.
  • the decoder implemented by the backend memory controllers may be configured to perform a weak error correction (e.g., error correction that includes at least one of hard sensing of the data or a first number of iterations) and the decoder implemented by the front-end memory controller may be configured to perform a strong error correction (e.g., error correction that includes at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations).
  • the decoders implemented by the backend memory controllers may have different resource characteristics and different latencies than the decoder implemented by the front-end memory controller.
  • the front-end memory controller may include a manager component that is configured to select how, when and where error correction is performed on the data read from a given subset 226a-p of the NAND flash devices 210.
  • the manner and configuration of the different ways in which error correction is distributed by the front-end memory controller are discussed in connection with FIGS. 3A-D.
  • the manager component configures the error correction distribution based on one or more criteria (e.g., balancing priorities with respect to data traffic, energy consumption, processing resource availability, and/or latency).
  • FIGS. 3A-D are schematic diagrams 300-303 of distributed error correction schemes, according to some embodiments.
  • the distributed error correction schemes include a flash memory controller 310 (e.g., a front-end memory controller) and one or more memory devices 320 (e.g., backend memory controllers).
  • the flash memory controller 310 may be implemented at least in part by the flash subsystems 120 and the one or more memory devices 320 may be implemented at least in part respectively by the subsets 226a, 226b, . . . 226p of the NAND flash devices 210.
  • the memory controller 310 may communicate with the one or more memory devices 320 over a channel, such as over respective channels 122.
  • the flash memory controller 310 includes a first decoder 312, an ECC manager 318 and an interface 314.
  • the flash memory controller 310 communicates with a host interface to receive a request to read data from the memory device 320.
  • the ECC manager 318 analyzes various parameters to select a distributed error correction scheme. For example, the ECC manager 318 may analyze data traffic patterns, energy consumption, a priority or latency associated with the request received from the host, and whether the first decoder 312 of the flash memory controller 310 is busy. Based on this analysis, the ECC manager 318 determines where and when decoding of the data read from the flash memory device 320 is to be performed.
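
One possible shape for the ECC manager's selection logic is sketched below. The mapping from inputs to schemes is an assumption pieced together from the behaviors described in this disclosure (parallel decoding for latency, split decoding for a balanced priority, falling back to the controller when the on-die decoder is busy). The choice made for the energy priority and the ordering of the checks are guesses, and the names `Priority`, `Scheme`, and `select_scheme` are hypothetical.

```python
from enum import Enum


class Priority(Enum):
    LATENCY = "latency"
    BALANCED = "balanced"
    ENERGY = "energy"


class Scheme(Enum):
    ON_DIE_ONLY = "on_die_only"          # decode only on the NAND flash device
    CONTROLLER_ONLY = "controller_only"  # bypass the on-die decoder
    SPLIT = "split"                      # partial decode on die, finish on controller
    PARALLEL = "parallel"                # both decode, first result wins


def select_scheme(priority, controller_busy, on_die_busy):
    """Pick where decoding happens for one read request."""
    if on_die_busy:
        return Scheme.CONTROLLER_ONLY
    if controller_busy:
        return Scheme.ON_DIE_ONLY
    if priority is Priority.LATENCY:
        return Scheme.PARALLEL
    if priority is Priority.BALANCED:
        return Scheme.SPLIT
    # Assumption: for the energy priority, decoding on the die avoids sending
    # ECC bytes over the channel.
    return Scheme.ON_DIE_ONLY
```

A fuller manager would also weigh data traffic on the channel, which this sketch leaves out.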
  • the ECC manager 318 determines that decoding is to be performed only by the second decoder 322 implemented on the flash memory device 320. This is illustrated by the solid line around the second decoder 322 and the dashed line around the first decoder 312.
  • the ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data and the parameter causes the memory device 320 to decode the data locally using the second decoder 322 implemented by the memory device 320 before providing the data back to the flash memory controller 310.
  • the flash memory controller 310 sends the request to read data and the parameter that controls where decoding will take place over a channel 122 associated with the memory device 320.
  • the memory device 320 receives the request over the channel 122 via the interface 324 of the memory device 320.
  • the memory device 320 reads the data and the ECC associated with the data from one or more memory cells 326 (e.g., implemented by the NAND flash devices 210).
  • the memory device 320 retrieves the parameter from the request and determines that the parameter satisfies an error correction criterion for performing error correction using the decoder 322 of the memory device 320.
  • the memory device 320 decodes the data using the second decoder 322 based on the ECC associated with the data.
  • the memory device 320 communicates a data packet that includes the decoded data back over the channel 122 via the interface 324 to the flash memory controller 310.
  • the second decoder 322 detects an uncorrectable error (UE).
  • the memory device 320 communicates a data packet that includes the ECC, the raw data, and the uncorrectable error to the flash memory controller 310.
  • the flash memory controller 310 may attempt to perform additional error decoding using the first decoder 312 or indicate to the host device that a UE exists in the data read from the memory device 320.
  • the ECC manager 318 determines that decoding is to be performed only by the first decoder 312 implemented on the flash memory controller 310. This is illustrated by the solid line around the first decoder 312 and the dashed line around the second decoder 322.
  • the ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data and the parameter causes the memory device 320 to bypass the second decoder 322 when returning data back to the flash memory controller 310.
  • the flash memory controller 310 sends the request to read data and the parameter that controls where decoding will take place over a channel 122 associated with the memory device 320.
  • the memory device 320 receives the request over the channel 122 via the interface 324 of the memory device 320.
  • the memory device 320 reads the data and the ECC associated with the data from one or more memory cells 326 (e.g., implemented by the NAND flash devices 210).
  • the memory device 320 retrieves the parameter from the request and determines that the parameter fails to satisfy an error correction criterion for performing error correction using the decoder 322 of the memory device 320.
  • the memory device 320 bypasses the second decoder 322 and routes the raw data and ECC information 330 directly to the interface 324 to be communicated to the flash memory controller 310.
  • the memory device 320 communicates a data packet that includes the raw data and ECC back over the channel 122 via the interface 324 to the flash memory controller 310.
  • the flash memory controller 310 decodes the data based on the ECC received from the memory device 320 using the first decoder 312.
  • the ECC manager 318 determines that decoding is to be performed by both the first decoder 312 implemented on the flash memory controller 310 and the second decoder 322 implemented on the memory device 320. This is illustrated by the solid line around the first decoder 312 and the solid line around the second decoder 322.
  • the ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data and the parameter causes the memory device 320 to decode the data using the second decoder 322 when returning data back to the flash memory controller 310.
  • the flash memory controller 310 sends the request to read data and the parameter that controls where decoding will take place over a channel 122 associated with the memory device 320.
  • the parameter may specify the level of decoding that is to be performed by the second decoder 322 relative to the level of decoding that is to be performed by the first decoder 312.
  • partial decoding takes place on the memory device 320 and remaining decoding takes place on the flash memory controller 310. This divides the work of decoding between the two devices which improves the overall efficiency and speed at which data is read from memory.
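
The disclosure does not define how the parameter is encoded in the read request. Purely as an illustration, the sketch below packs a decode mode and an on-die iteration budget into a single byte. The two-bit mode field, the six-bit iteration field, and the names `DecodeMode`, `pack_param`, and `unpack_param` are all hypothetical.

```python
from enum import IntEnum


class DecodeMode(IntEnum):
    BYPASS = 0   # on-die decoder bypassed; the controller decodes
    ON_DIE = 1   # the NAND flash device decodes fully
    SPLIT = 2    # the device decodes partially; the controller completes


def pack_param(mode, device_iters=0):
    """Pack the mode (upper 2 bits) and on-die iteration budget (lower 6 bits)."""
    if not 0 <= device_iters < 64:
        raise ValueError("device_iters must fit in 6 bits")
    return (int(mode) << 6) | device_iters


def unpack_param(param):
    """Inverse of pack_param; returns (mode, device_iters)."""
    return DecodeMode(param >> 6), param & 0x3F
```

For example, `pack_param(DecodeMode.SPLIT, 5)` yields 133, and `unpack_param(133)` returns the split mode with a budget of five on-die iterations.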
  • the memory device 320 receives the request over the channel 122 via the interface 324 of the memory device 320.
  • the memory device 320 reads the raw data and the ECC associated with the data from one or more memory cells 326 (e.g., implemented by the NAND flash devices 210).
  • the memory device 320 retrieves the parameter from the request and determines that the parameter satisfies an error correction criterion for performing error correction using the decoder 322 of the memory device 320.
  • the memory device 320 passes the raw data and ECC 330 read from the memory cells 326 to the second decoder 322 to perform an initial decoding of the data based on the ECC of the data using the second decoder 322.
  • the memory device 320 decodes the data using a first number of LDPC iterations and/or weak decoding operations (e.g., using only hard sensing of the data). If the initial decoding (e.g., the weak decoding by the second decoder 322) is successful, then, the memory device 320 provides back to the flash memory controller 310 a data packet that includes the partial decoding result (e.g., the data decoded using the first number of iterations of the LDPC code) and the originally-read ECC information. The flash memory controller 310 completes decoding the data based on the ECC information and the partially decoded data using the first decoder 312.
  • the first decoder 312 can process the data generated by the second decoder 322 and the ECC to perform a second number of iterations remaining in the LDPC code to complete decoding the data and/or can perform a stronger decoding technique (e.g., using hard and soft sensing of the data) to decode the data.
  • the partially decoded data from the second decoder 322 can be processed by the first decoder 312 to determine that no further errors are detected by the first decoder 312. In such circumstances, the first decoder 312 passes the partially decoded data received from the second decoder 322 to the requesting host.
  • the initial decoding by the second decoder 322 is unsuccessful.
  • the memory device 320 provides back to the flash memory controller 310 a data packet that includes the raw data and the originally-read ECC information.
  • the flash memory controller 310 attempts to decode the raw data based on the ECC information using the first decoder 312.
  • the first decoder 312 can process the data read from the memory cells and the ECC information using a strong decoding technique (e.g., using hard and soft sensing of the data).
  • the memory device 320 is instructed by the ECC manager 318 to perform local decoding of the data using the second decoder 322.
  • the memory device 320 determines that the second decoder 322 is currently busy performing other operations and cannot complete the request to perform local decoding.
  • the memory device 320 communicates with a second memory device 340 to perform the decoding operations. Namely, the memory device 320 can send a data packet that includes the raw data and the ECC 330 read from the memory 326 to the second memory device 340 with an instruction for the second memory device 340 to perform the decoding using a third decoder 342 implemented by the second memory device 340.
  • the second memory device 340 may be implemented as another instance of the subsets 226a, 226b, . . . 226p of the NAND flash devices 210. Namely, the memory device 320 may be a first subset 226a and the second memory device 340 may be a second subset 226b.
  • the memory device 320 provides a data packet that includes the data and the ECC 330 directly to the second memory device 340 through the interface 324 of the memory device 320 and the interface 344 of the second memory device 340.
  • the memory device 320 can communicate via the channels 122 directly with the second memory device 340 without passing information or messages via the flash memory controller 310.
  • the second memory device 340 uses the third decoder 342 implemented on the second memory device 340 to decode the data based on the ECC information. Once the data is decoded, the second memory device 340 can return the decoded data back to the memory device 320.
  • the memory device 320 sends the data, decoded by the second memory device 340, to the flash memory controller 310 via the channel associated with the memory device 320.
  • the second memory device 340 communicates the decoded data directly back to the flash memory controller 310 via the channel associated with the second memory device 340.
  • the memory device 320 provides a data packet that includes the data and the ECC 330 to the second memory device 340 via the flash memory controller 310. Specifically, the memory device 320 provides a data packet that includes the data and the ECC information to the flash memory controller 310.
  • the ECC manager 318 in the flash memory controller 310 finds a memory device (e.g., the second memory device 340) that is not busy or selects one at random or in a round-robin manner.
  • the flash memory controller 310 provides the data and the ECC associated with the data to the selected memory device 340 with an instruction to the selected memory device to decode the data using a decoder of the selected memory device.
  • the second memory device 340 uses the third decoder 342 to decode the data and returns the decoded data back to the flash memory controller 310.
  • the second memory device 340 can be used to completely decode the data read from the memory 326 of the memory device 320 or to partially or initially decode the data read from the memory 326 of the memory device 320.
  • the second memory device 340 returns the partially decoded data back to the flash memory controller with the ECC information for the first decoder 312 of the flash memory controller 310 to complete decoding the data.
  • FIGS. 4A, 4B, 5, 6 and 7 illustrate flow charts of performing distributed error correction, according to some embodiments.
  • the processes 400, 410, 500, 600 and 700 may be embodied in computer-readable instructions for execution by one or more processors or one or more servers, front-end memory controller, backend memory controllers, or combination thereof; accordingly, the processes 400, 410, 500, 600 and 700 are described below by way of example with reference thereto. However, in other embodiments, at least some of the operations of the processes 400, 410, 500, 600 and 700 may be deployed on various other hardware configurations. Some or all of the operations of the processes 400, 410, 500, 600 and 700 can be performed in parallel or out of order, or entirely omitted.
  • an ECC decoder result is received.
  • the memory device 320 receives a request to read data from the memory 326.
  • the memory device 320 determines that a parameter specified in the request satisfies a criterion to perform error correction using the second decoder 322 implemented on the memory device 320 (FIG. 3A).
  • the memory device 320 routes the data from the memory 326 along with the ECC of the data to the second decoder 322.
  • the second decoder 322 decodes the data based on the ECC of the data to generate a decoder result.
  • uncorrectable error (UE)
  • the raw data and error correction information are transferred to the front-end memory controller.
  • the ECC information and the raw data are provided to the flash memory controller 310 with an indication of a UE so that the first decoder 312 of the flash memory controller 310 can be used to decode the data read from the memory 326.
  • the process proceeds to operation 405 discussed in connection with FIG. 4B.
  • data and error correction information are received from a memory device.
  • the second decoder 322 provides partially decoded data to the flash memory controller 310 along with the ECC information.
  • the raw data and the ECC information are received by the flash memory controller 310 from the memory device 320 in case the decoder of the memory device 320 was unsuccessful in decoding the data and detected a UE.
  • error correction is performed with the front-end memory controller decoder.
  • the first decoder 312 is used to complete decoding data that has been partially decoded by the second decoder 322 of the memory device 320 or to correct data in which the second decoder 322 detected a UE.
  • the corrected data is transferred to a host interface.
  • an uncorrectable error flag is raised. For example, the host interface is notified that data read from the memory was not successfully decoded.
  • the process 500 demonstrates operations for identifying and using a decoder of another memory device when a decoder of a memory device from which data is read is busy or unavailable.
  • a read request is received from a front-end memory controller on a first memory device.
  • the memory device 320 receives a request to read data from the memory 326.
  • the memory device 320 determines that a parameter specified in the request satisfies a criterion to perform error correction using the second decoder 322 implemented on the memory device 320 (FIG. 3D).
  • the memory device 320 determines that the request to read the data from the memory device 320 needs to be completed with minimal latency and that there is not enough time to wait for the decoder of the memory device 320 to complete performing other operations (e.g., decoding data from a prior read request).
  • error correction is performed with the backend memory controller of the first memory device.
  • when the decoder of the memory device 320 is not busy, the second decoder 322 of the memory device 320 is used to decode the data read from the memory 326.
  • error correction is performed with the front-end memory controller.
  • the memory device 320 bypasses the second decoder 322 of the memory device 320 and communicates the raw data and the associated ECC information to the memory controller 310.
  • the memory controller 310 uses the first decoder 312 to decode the data read from the memory device 320.
  • error correction is performed with the backend memory controller of a second memory device.
  • the memory device 320 communicates the raw data and the associated ECC information to the second memory device 340 (directly without passing through the memory controller 310 or indirectly via the memory controller 310).
  • the second memory device 340 uses the third decoder 342 to decode the data read from the memory device 320 and provides the decoding result back to the flash memory controller 310 directly or via the memory device 320.
  • the memory device 340 is selected by the memory device 320 at random or in a round-robin manner.
  • the memory device 320 communicates with the ECC manager 318 to identify the memory device 340 from a set of available memory devices.
  • the process 600 demonstrates operations for identifying and using a decoder of a memory device based on error correction criteria.
  • the error correction criteria can include a latency parameter, a balanced parameter, an energy savings parameter, a bandwidth parameter, and various other conditions or parameters.
  • a read request is received from a front-end memory controller on a first memory device.
  • the memory device 320 receives a request to read data from the memory 326.
  • the memory device 320 determines that a parameter specified in the request satisfies a criterion to perform error correction using the second decoder 322 implemented on the memory device 320 (FIG. 3D).
  • a latency parameter (e.g., latency is prioritized over the other criteria)
  • a balanced parameter (e.g., a balance between latency and energy is prioritized over the other criteria)
  • the process proceeds to operation 606. If the distributed error correction criterion includes an energy
  • the ECC manager 318 can analyze various conditions and operations to select the parameter for performing distributed error correction.
  • the ECC manager 318 specifies which of the parameters (e.g., energy, latency, balanced, and so forth) are to be used by the memory device 320 in determining whether to perform error correction using the decoder of the memory device 320.
  • error correction is performed with the fastest decoder.
  • the first decoder 312 implemented by the flash memory controller 310 may be more complex and have more processing and decoder power than decoders implemented by the memory devices (e.g., the second decoder 322 and the third decoder 342).
  • the ECC manager specifies the parameter that causes the memory device 320 to bypass the second decoder 322 and provide the raw data and the associated ECC back to the flash memory controller 310 to perform the decoding using the first decoder 312.
  • the first decoder 312 implemented by the flash memory controller 310 and the second decoder 322 implemented by the memory device 320 are both instructed to decode the data read from the memory device 320 in parallel.
  • the raw data and the ECC are provided to both the first decoder 312 and the second decoder 322.
  • the ECC manager 318 monitors their decoding operations to determine which one of the decoders completes decoding the data first. If the first decoder 312 completes decoding the data first, the flash memory controller 310 uses the decoding result from the first decoder 312 to provide data back to the host interface. If the second decoder 322 completes decoding the data before the first decoder 312, the decoded data is communicated back to the flash memory controller 310 to be provided to the host interface.
  • the flash memory controller 310 may receive the raw data and the ECC from the memory device 320 and may determine that the first decoder 312 is busy decoding previously read data.
  • an alternate error correction device is searched for and found.
  • the flash memory controller 310 may send the raw data and the ECC to another flash memory controller 310, a host device, or a second memory device 340 or back to the memory device 320.
  • the ECC manager 318 may store an index of different decoders that are available and their processing speeds. If the decoder implemented by the flash memory controller 310 is busy and is at the top of the list as being the fastest decoder, the ECC manager 318 selects the next decoder that is on the list. In some cases, the decoder of the second memory device 340 may be faster than the decoder of the memory device 320. In such cases, the flash memory controller 310 provides the data read from the memory device 320 and the ECC information to the second memory device 340 to perform decoding, for example in the manner described in connection with FIG. 3D.
  • error correction is performed with the decoder of the first memory device and the decoder of the front-end memory controller.
  • the second decoder 322 can perform an initial or partial decoding of the data and provide the partially decoded data to the first decoder 312 to complete decoding the data, as explained above in connection with FIG. 3C.
  • error correction is performed with the energy efficient decoder.
  • the second decoder 322 implemented by the memory device 320 may be less complex and consume less energy than the decoder 312 implemented by the flash memory controller 310.
  • the ECC manager specifies the parameter that causes the memory device 320 to decode the data read from the memory 326 using the second decoder 322 and provide the decoded data back to the flash memory controller 310.
  • the process 700 demonstrates operations for decoding data with an on-board decoder of a NAND flash memory device of a NAND flash array.
  • at operation 701, a request to read data stored on the NAND flash memory is received.
  • the data and the ECC associated with the data are retrieved from the NAND flash memory.
  • the data is decoded using the decoder implemented on the NAND flash memory.
  • FIG. 8 is a block diagram illustrating circuitry in the form of a processing system for implementing the systems and methods for performing distributed error correction as described above with respect to FIGS. 1-7 according to some embodiments. All components need not be used in various embodiments.
  • One example computing device in the form of a computer 800 may include a processing unit 802, memory 803, a cache 807, removable storage 811, and non-removable storage 822. Although the example computing device is illustrated and described as the computer 800, the computing device may be in different forms in different embodiments. Devices such as smartphones, tablets, and smartwatches are generally collectively referred to as mobile devices or user equipment. Further, although the various data storage elements are illustrated as part of the computer 800, the storage may also or alternatively include cloud-based storage accessible via a network, such as the Internet, or server-based storage.
  • the memory 803 may include volatile memory 814 and non-volatile memory 808.
  • the computer 800 also may include - or have access to a computing environment that includes - a variety of computer-readable media, such as the volatile memory 814, non-volatile memory 808, removable storage 811, and non-removable storage 822.
  • Computer storage includes random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
  • the computer 800 may include or have access to a computing environment that includes an input interface 826, an output interface 824, and a communication interface 816.
  • the output interface 824 may include a display device, such as a touchscreen, that also may serve as an input device.
  • the input interface 826 may include one or more of a touchscreen, a touchpad, a mouse, a keyboard, a camera, one or more device-specific buttons, one or more sensors integrated within or coupled via wired or wireless data connections to the computer 800, and other input devices.
  • the computer 800 may operate in a networked environment using a communication connection to connect to one or more remote computers, which may include a personal computer (PC), server, router, network PC, peer device or other common DFD network switch, or the like.
  • the communication connection may include a Local Area Network (LAN), a Wide Area Network (WAN), a cellular network, a Wi-Fi network, a Bluetooth network, or other networks.
  • Computer-readable instructions such as a program 818, stored on a computer-readable medium are executable by the processing unit 802 of the computer 800.
  • the program 818 in some embodiments comprises software that, upon execution by the processing unit 802, performs the task distribution operations according to any of the embodiments included herein.
  • a hard drive, CD-ROM, and RAM are some examples of articles including a non-transitory computer-readable medium such as a storage device.
  • the terms “computer-readable medium” and “storage device” do not include carrier waves to the extent that carrier waves are deemed to be transitory. Storage can also include networked storage, such as a storage area network (SAN).
  • the computer program 818 also may include instruction modules that upon processing cause the processing unit 802 to perform one or more methods or algorithms described herein.
  • software including one or more computer-executable instructions that facilitate processing and operations as described above with reference to any one or all of steps of the disclosure can be installed in and sold with one or more computing devices consistent with the disclosure.
  • the software can be obtained and loaded into one or more computing devices, including obtaining the software through a physical medium or distribution system, including, for example, from a server owned by the software creator or from a server not owned but used by the software creator.
  • the software can be stored on a server for distribution over the Internet, for example.
  • the components of the illustrative devices, systems, and methods employed in accordance with the illustrated embodiments can be implemented, at least in part, in digital electronic circuitry, in analog electronic circuitry, in computer hardware, firmware, or software, or in combinations of them. These components can be implemented, for example, as a computer program product such as a computer program, program code, or computer instructions tangibly embodied in an information carrier, or in a machine-readable storage device, for execution by, or to control the operation of, data processing apparatus such as a programmable processor, a computer, or multiple computers.
  • a computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
  • a computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.
  • functional programs, code, and code segments for accomplishing the techniques described herein can be easily construed as within the scope of the claims by programmers skilled in the art to which the techniques described herein pertain.
  • Method steps associated with the illustrative embodiments can be performed by one or more programmable processors executing a computer program, code, or instructions to perform functions (e.g., by operating on input data and/or generating output). Method steps can also be performed by, and apparatus for performing the methods can be implemented as, special-purpose logic circuitry, e.g., an FPGA or an application-specific integrated circuit (ASIC).
  • DSP digital signal processor
  • a general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine.
  • a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
  • processors suitable for the execution of a computer program include, by way of example, both general- and special-purpose microprocessors, and any one or more processors of any kind of digital computer.
  • a processor will receive instructions and data from a read-only memory or a random-access memory or both.
  • the required elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data.
  • a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks.
  • Information carriers suitable for embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, e.g., erasable programmable read-only memory or ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory devices, and data storage disks (e.g., magnetic disks, internal hard disks, removable disks, magneto-optical disks, and CD-ROM and DVD-ROM disks).
  • machine-readable medium means a device able to store instructions and data temporarily or permanently and may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., erasable programmable read-only memory (EPROM)), and/or any suitable combination thereof.
  • machine-readable medium shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions for execution by one or more processors (e.g., the processing unit 802), such that the instructions, upon execution by the one or more processors, cause the one or more processors to perform any one or more of the methodologies described herein. Accordingly, “machine-readable medium” refers to a single storage apparatus or device, as well as “cloud-based” storage systems that include multiple storage apparatus or devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

Methods and systems are provided for: receiving, by a first NAND flash device, a request to read data stored on NAND flash memory of the first NAND flash device, the data stored on the NAND flash memory having been encoded with an ECC, and the request being received from a flash memory controller; retrieving the data and the ECC from the NAND flash memory of the first NAND flash device; determining whether a parameter specified in the request satisfies an error correction criterion for decoding the data using a first decoder implemented on the first NAND flash device; if the parameter satisfies the error correction criterion, decoding the data using the first decoder implemented on the first NAND flash device; and if the parameter fails to satisfy the error correction criterion, communicating the retrieved data and the ECC to the flash memory controller for decoding using a second decoder implemented by the flash memory controller.

Description

DISTRIBUTED ECC SCHEME IN MEMORY CONTROLLERS
TECHNICAL FIELD
[0001] The present disclosure is generally related to SSD (Solid State Drive) controllers, and specifically to methods for using NAND (Not-AND) flash memory SRAM (Static Random Access Memory) in SSD controllers.
BACKGROUND
[0002] SSDs store data in solid state devices, rather than in a magnetic or optical medium. A typical SSD comprises a controller and solid state memory devices. A host device performs write and read operations on the SSD. In response, the SSD acknowledges receipt of the data, stores the data, and subsequently retrieves data. Reading and storing the data on the SSD is prone to errors. Typical SSDs perform error correction at the host interface or the memory controller when data is read.
SUMMARY
[0003] Various examples are now described to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. The Summary is not intended to identify key or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
[0004] In some aspects, an error correction method provided for use in a solid state drive (SSD) comprising a plurality of Not-AND (NAND) flash devices includes: receiving, by a first NAND flash device of the plurality of NAND flash devices, a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; determining, by the first NAND flash device, whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; in response to determining that the parameter satisfies the error correction criterion: decoding, based on the ECC with which the data was encoded, the data using the first decoder implemented on the first NAND flash device to correct one or more errors in the retrieved data; and communicating the decoded data over the first channel to the flash memory controller of the SSD.
[0005] In some aspects, in response to determining that the parameter fails to satisfy the error correction criterion:
[0006] In some aspects, the error correction criterion comprises at least one of a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
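As an illustration only, the three priorities named in this aspect could be mapped to a decode plan along the following lines. The mapping follows the behavior described for process 600 (a latency priority favors the controller's decoder, a balanced priority splits the work, an energy priority favors the on-device decoder). The names `Priority` and `plan_decoding` and the stage labels are hypothetical and do not appear in the disclosure.

```python
from enum import Enum
from typing import Tuple


class Priority(Enum):
    """Hypothetical encoding of the parameter carried in the read request."""
    LATENCY = "latency"
    BALANCED = "balanced"
    ENERGY = "energy"


def plan_decoding(priority: Priority) -> Tuple[str, ...]:
    """Return the ordered decode stages for a given priority (illustrative labels)."""
    if priority is Priority.LATENCY:
        # Bypass the on-device decoder; the controller's decoder is assumed faster.
        return ("controller",)
    if priority is Priority.BALANCED:
        # Partial decoding on the NAND device, completed by the controller.
        return ("device_partial", "controller_complete")
    if priority is Priority.ENERGY:
        # The on-device decoder is assumed to be the lower-energy option.
        return ("device",)
    raise ValueError(f"unknown priority: {priority!r}")


if __name__ == "__main__":
    for p in Priority:
        print(p.value, "->", plan_decoding(p))
```

A real device would read the parameter from the request format it actually uses; the enum here only stands in for that field.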
[0007] In some aspects, the method includes determining that the first decoder is busy performing other operations; and in response to determining that the first decoder is busy performing other operations, communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
[0008] In some aspects, the method includes determining that the first decoder is busy performing other operations; and in response to determining that the first decoder is busy performing other operations, communicating the retrieved data and the ECC with which the data was encoded over the first channel to a second NAND flash device, the second NAND flash device decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the second NAND flash device to correct the one or more errors in the retrieved data.
[0009] In some aspects, the method includes receiving, by the flash memory controller of the SSD, the decoded data from the second NAND flash device over a second channel associated with the second NAND flash device.
[0010] In some aspects, the retrieved data and the ECC are communicated to the second NAND flash device via the flash memory controller of the SSD.
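The busy-decoder fallback described in the preceding aspects can be sketched as a simple selection routine. This is a minimal sketch under stated assumptions: the type `DecoderSlot`, the function `pick_decoder`, and the order of preference (local decoder, then the first idle peer NAND device, then the controller) are all hypothetical; the disclosure presents the peer and controller fallbacks as alternatives and does not fix an order.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DecoderSlot:
    owner: str          # hypothetical label, e.g. "nand0" or "controller"
    busy: bool = False  # True while the decoder is handling other operations


def pick_decoder(local: DecoderSlot,
                 peers: Sequence[DecoderSlot],
                 controller: DecoderSlot) -> DecoderSlot:
    """Choose where the retrieved data and ECC should be decoded."""
    if not local.busy:
        return local                 # decode on the device that read the data
    for peer in peers:
        if not peer.busy:
            return peer              # hand off to an idle second NAND device
    return controller                # otherwise fall back to the controller


if __name__ == "__main__":
    chosen = pick_decoder(
        DecoderSlot("nand0", busy=True),
        [DecoderSlot("nand1", busy=True), DecoderSlot("nand2")],
        DecoderSlot("controller"),
    )
    print(chosen.owner)
```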
[0011] In some aspects, the method includes determining that the error correction criterion corresponds to prioritizing a latency parameter to reduce error correction latency; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using the second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data in parallel with decoding, based on the ECC with which the data was encoded, the data using the first decoder.
[0012] In some aspects, the method includes accessing the decoded data from whichever one of the first decoder and the second decoder completes decoding the data first.
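As an illustration only, the latency-prioritized flow of paragraphs [0011] and [0012] can be sketched as follows. The two decoders are simulated by a hypothetical Python function with made-up delays, and threads stand in for the independent hardware decoders; none of these names come from the disclosure.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED

def simulated_decoder(name, delay_s, data):
    """Hypothetical stand-in for a hardware decoder: waits, then returns the data."""
    time.sleep(delay_s)
    return name, data

def decode_in_parallel(raw, on_die_delay_s, controller_delay_s):
    """Start both decoders on the same raw data and return the first result."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [
            pool.submit(simulated_decoder, "first decoder (on-die)", on_die_delay_s, raw),
            pool.submit(simulated_decoder, "second decoder (controller)", controller_delay_s, raw),
        ]
        # Block until at least one decoder has finished.
        done, _pending = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()

# Illustrative delays: here the controller decoder happens to finish first.
winner, decoded = decode_in_parallel(b"page", on_die_delay_s=0.3, controller_delay_s=0.01)
```

In this sketch the slower decoder is simply allowed to run to completion and its result is discarded; whether a real device would cancel the slower decode is not stated in the text.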
[0013] In some aspects, the method includes determining that the error correction criterion corresponds to prioritizing a balanced parameter; and performing partial decoding, based on the ECC with which the data was encoded, of the data using the first decoder implemented on the first NAND flash device; and communicating the partially decoded data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller completing decoding the partially decoded data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller.
[0014] In some aspects, the first decoder is configured to perform a weak error correction comprising at least one of hard sensing of the data or a first number of iterations, and the second decoder is configured to perform a strong error correction comprising at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations.
[0015] In some aspects, the first and second decoders comprise different resource characteristics and different latencies.
[0016] In some aspects, the NAND flash devices comprise a 3D or 4D flash memory device.
[0017] In some aspects, the method includes generating an error correction result in the first decoder; determining that an uncorrectable error exists in the error correction result; and in response to determining that the uncorrectable error exists in the error correction result, transmitting, to the flash memory controller of the SSD, a data packet comprising the data and error correction information from the first NAND flash device.
[0018] In some aspects, the ECC comprises block codes.
[0019] In some aspects, a system for use in performing error correction in a solid state drive (SSD) includes: a plurality of Not-AND (NAND) flash devices, each NAND flash device of the plurality of NAND flash devices having on-die NAND flash memory and a respective decoder of a plurality of decoders, a first NAND flash device of the plurality of NAND flash devices performs operations comprising: receiving a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; determining whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; in response to determining that the parameter satisfies the error correction criterion: decoding, based on the ECC with which the data was encoded, the data using the first decoder implemented on the first NAND flash device to correct one or more errors in the retrieved data; and communicating the decoded data over the first channel to the flash memory controller of the SSD.
[0020] In some aspects, the operations further comprise: in response to determining that the parameter fails to satisfy the error correction criterion: bypassing the first decoder implemented on the first NAND flash device; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
[0021] In some aspects, the error correction criterion comprises at least one of a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
[0022] In some aspects, an apparatus for use in a solid state drive (SSD) comprising a plurality of Not-AND (NAND) flash devices includes: means for receiving, by a first NAND flash device of the plurality of NAND flash devices, a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; means for retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; means for determining, by the first NAND flash device, whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; means for in response to determining that the parameter satisfies the error correction criterion: decoding, based on the ECC with which the data was encoded, the data using the first decoder implemented on the first NAND flash device to correct one or more errors in the retrieved data; and communicating the decoded data over the first channel to the flash memory controller of the SSD.
[0023] The explanations provided for each aspect and its implementation apply equally to the other aspects and the corresponding implementations. The different embodiments may be implemented in hardware, software, or any combination thereof. Also, any one of the foregoing examples may be combined with any one or more of the other foregoing examples to create a new embodiment within the scope of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0024] In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. The drawings illustrate generally, by way of example, but not by way of limitation, various embodiments discussed in the present document.
[0025] FIG. 1 is a schematic diagram of a NAND flash SSD, according to some embodiments.
[0026] FIG. 2 is a schematic diagram of the NAND flash devices of the SSD of FIG. 1, according to some embodiments.
[0027] FIGS. 3A-D are schematic diagrams of distributed error correction schemes, according to some embodiments.
[0028] FIGS. 4A, 4B, 5, 6 and 7 illustrate flow charts of performing distributed error correction, according to some embodiments.
[0029] FIG. 8 is a block diagram illustrating circuitry in the form of a processing system for implementing the systems and methods for performing distributed error correction, according to some embodiments.
DETAILED DESCRIPTION
[0030] It should be understood at the outset that although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods described with respect to FIGS. 1-8 may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the example designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
[0031] Newly developed NAND flash memory chips include static random-access memory (SRAM) on the chip. Such chips may be so-called 3D NAND chips or 4D NAND chips. In this disclosure both types will be referred to, collectively, as ‘NAND chips with on-die SRAM.’ Some such NAND chips provide 1 MB (megabyte) of on-die SRAM, but others provide more or less than 1 MB of on-die SRAM. The physical layout of such 3D NAND chips and 4D NAND chips provides increased memory storage and additional physical space for extra processing devices, such as encoders and decoders. These processing devices are referred to as the backend memory controllers and can be used to distribute the error correction operations, such as decoding data stored by the 3D or 4D NAND chips on the die itself. These decoding operations can supplement or replace decoding operations typically performed by flash memory controllers, referred to as the front-end memory controllers. This disclosure presents novel processes for performing error correction, such as data decoding, using the on-die decoders of such NAND chips.
[0032] FIG. 1 is a schematic diagram of a NAND flash SSD 100. The SSD 100 includes a main CPU 102 and a NAND Flash Interface (NFI) CPU 108. The main CPU 102 includes a front-end CPU 104 and a back-end CPU 106. The front-end CPU 104 implements a handler for commands received from a host device 130 via a PCIe bus (Peripheral Component Interconnect Express), SAS bus (Serial Attached SCSI (Small Computer System Interface)), or other suitable interface. The front-end CPU 104 also implements a scheduler for Back End (BE) commands that are issued in response to received host commands. The back-end CPU 106 implements back end firmware (FW), performing Flash Translation Layer (FTL), mapping, and other back-end functions.
[0033] The NFI CPU 108 controls and manages channels 122. Each channel 122 communicates data and commands to a subset of NAND flash chips in NAND flash devices 210 (which are described in greater detail with reference to FIG. 2). In other SSDs, the main CPU 102 and/or NFI CPU 108 may be implemented with other numbers or types of CPUs and/or other distributions of functionality.
[0034] The SSD 100 further includes Dynamic Random Access Memory (DRAM) 112, SRAM 114, Hardware (HW) Accelerators 116, and Other Peripherals 118. The DRAM 112 is 32 Gigabytes (GB) in size, but may be larger or smaller in other SSDs. The SRAM 114 is 10 Megabytes (MB), but may be larger or smaller in other SSDs.
[0035] The HW Accelerators 116 includes an Exclusive-OR (XOR) engine, a buffer manager, a HW Garbage Collection (GC) engine, and may include other HW circuits designed to independently handle specific, limited functions for the Main CPU 102 and the NFI CPU 108. The Other Peripherals 118 may include circuits such as a Serial Peripheral Interface (SPI) circuit, a General Purpose Input/Output (GPIO) circuit, an Inter-Integrated Circuit (I2C) bus interface, a Universal Asynchronous Receiver/Transmitter (UART) circuit, and other interface circuits.
[0036] The SSD 100 further includes flash subsystems 120, which may include a Low Density Parity Check (LDPC) or other error correction circuit (e.g., decoder), a randomizer circuit, a flash signal processing circuit, and may include other circuits that provide processing relating to writing and reading data to the NAND flash devices 210. The flash subsystems 120 are in some instances referred to herein as the front-end memory controller. The decoders, according to the disclosed techniques, can be implemented on the front-end memory controller and on the NAND flash array 150. Specifically, when digital data is stored in nonvolatile memory, it is crucial to have a mechanism that can detect and correct a certain number of errors. This mechanism is known as data decoding. Error correction code (ECC) encodes data in such a way that a decoder can identify and correct errors in the data. Typically, data strings are encoded by adding a number of redundant bits to them. When the original data is reconstructed, a decoder examines the encoded message to check for any errors. There are many types of ECC decoders, including block code decoders and convolutional code decoders. Block code decoders operate on codes that are referred to as (n, k) codes: a block of k data bits is encoded to become a block of n bits called a code word. In block codes, the code words do not have any dependency on previously encoded messages. Block codes can include linear and non-linear codes and either type can be systematic. Linear codes include repetition, parity, Hamming and cyclic codes. Convolutional code decoders operate on code words that depend on both the data message and a given number of previously encoded messages. The encoder changes state with every message processed. LDPC is a type of linear block error correction code.
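To make the block-code terminology concrete, the following is a small sketch of the classic Hamming(7,4) code, in which k = 4 data bits become an n = 7 bit code word and any single flipped bit can be located and corrected. It is offered only as an illustration of a block code; the disclosure does not state that its decoders use this particular code, and the function names are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit code word.

    Bit layout by 1-indexed position: p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code_word):
    """Correct at most one flipped bit.

    Returns (data_bits, error_position); error_position is 0 when no
    error was detected, otherwise the 1-indexed position that was fixed.
    """
    c = list(code_word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome spells out the bad position
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], position

# Usage: store a word, flip one bit to imitate a memory error, then decode.
stored = hamming74_encode([1, 0, 1, 1])
received = list(stored)
received[4] ^= 1  # error at 1-indexed position 5
data, fixed_at = hamming74_decode(received)  # data == [1, 0, 1, 1], fixed_at == 5
```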
[0037] Typically, data is read from the NAND flash array 150 by the flash subsystems 120. The flash subsystems 120 are configured to always perform error correction using the decoder implemented by the front-end memory controller to detect and correct for memory and bus errors. While such approaches generally work well, using the same decoder to handle all error correction operations creates a bottleneck and a single point of failure for reading, recovering and correcting data. This process also consumes bandwidth on the channels used to receive the data from the NAND flash array 150 as ECC information has to be transmitted over the channels in addition to the underlying data. This slows down the process of reading and decoding data from the NAND flash array 150.
[0038] As technology with NAND flash arrays 150 improves, additional physical space becomes available on the NAND flash arrays 150. This is because the NAND memory cells become increasingly smaller and can be physically arranged in a stacked and layered manner which frees up physical space on the same size die. This physical space can be utilized to include additional decoders on the NAND flash arrays 150 themselves. This duplication and additional provision of decoders enables schemes to distribute error correction efforts between the front-end memory controller and the backend memory controllers. According to the disclosed embodiments, a distributed scheme for performing error correction on NAND flash arrays 150 is provided, specifically for performing error correction on 3D or 4D memory devices. This enables near-data-computing such as data copy, data search, or any data processing functions, by utilizing available space in the logic chip of 3D and 4D memory. Decoders and processing circuits or devices on the NAND flash arrays 150 themselves are in some instances referred to as the backend memory controllers.
[0039] In some embodiments, the distributed and coupled ECC scheme includes an ECC manager in the front-end memory controller that manages the ECC resources and distributes error correction operations. The ECC manager determines when and where the ECC function or partial ECC function will be executed. Specifically, the ECC manager coordinates ECC operations to cause error correction to be performed only on the front-end memory controller, only on the decoder implemented on one or more backend memory controllers, or partially on both the front-end memory controller and on one or more backend memory controllers. The ECC manager considers data traffic, ECC resource characteristics, performance/energy priorities and various other factors in controlling where and when ECC operations are performed in the distributed scheme.
[0040] The ECC manager provides a parameter in a request to read data to a given backend memory controller to control whether the backend memory controller uses the decoder implemented on the NAND flash device to decode the data or whether such a decoder is bypassed to perform decoding by the front-end memory controller. Namely, the NAND flash device determines whether the parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a decoder implemented on the NAND flash device or whether error correction will be performed by the front-end memory controller. The error correction criterion can include a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter. The NAND flash device, based on determining that the parameter satisfies the error correction criterion, decodes, based on the ECC associated with the data, the data using the decoder implemented on the NAND flash device to correct one or more errors in the retrieved data and communicates the decoded data over a channel to the flash memory controller of the SSD.
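The device-side decision described in this paragraph can be sketched roughly as below. The sketch rests on assumptions that are not in the disclosure: the parameter is modeled as a two-valued enum (`EccParam`), the response is a hypothetical `ReadResponse` record, the on-die decoder is passed in as a plain function, and the ECC is omitted from the response once the data has been decoded on the die.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional

class EccParam(Enum):
    """Hypothetical values for the parameter carried in a read request."""
    DECODE_ON_DIE = 1
    BYPASS_ON_DIE = 2

@dataclass
class ReadResponse:
    data: bytes
    ecc: Optional[bytes]  # assumed to be omitted once the data is decoded
    decoded: bool

def handle_read(param: EccParam, raw: bytes, ecc: bytes,
                on_die_decode: Callable[[bytes, bytes], bytes]) -> ReadResponse:
    """Device-side handling of one read request."""
    if param is EccParam.DECODE_ON_DIE:
        # Parameter satisfies the criterion: decode locally, return clean data.
        return ReadResponse(on_die_decode(raw, ecc), None, True)
    # Parameter fails the criterion: route raw data and ECC around the decoder.
    return ReadResponse(raw, ecc, False)

# Usage with a placeholder decoder that returns a fixed "corrected" payload.
placeholder_decode = lambda raw, ecc: b"corrected"
local = handle_read(EccParam.DECODE_ON_DIE, b"raw", b"ecc", placeholder_decode)
bypass = handle_read(EccParam.BYPASS_ON_DIE, b"raw", b"ecc", placeholder_decode)
```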
[0041] In some embodiments, the NAND flash device determines that the parameter fails to satisfy the error correction criterion. In such cases, the decoder implemented on the NAND flash device is bypassed (e.g., raw data along with its ECC is routed around the decoder of the NAND flash device). The encoded data (raw data plus ECC) is communicated over the channel to the flash memory controller of the SSD which decodes the data, based on the associated ECC, using a decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
[0042] In some embodiments, a determination is made that the decoder of the NAND flash device is busy. In such cases, in response to determining that the decoder is busy performing other operations (e.g., decoding data for another NAND flash device or still decoding data from a prior read operation), the retrieved data and the ECC associated with the data are communicated over a channel to a second NAND flash device. The second NAND flash device decodes the data, based on the associated ECC, using a decoder implemented by the second NAND flash device to correct the one or more errors in the retrieved data. The flash memory controller of the SSD receives the decoded data from the second NAND flash device over a channel associated with the second NAND flash device.
[0043] In some embodiments, the NAND flash device determines that the error correction criterion corresponds to prioritizing a balanced parameter. In response, the NAND flash device performs partial decoding, based on the ECC associated with the data, of the data using the decoder implemented on the NAND flash device. The partially decoded data and the ECC associated with the partially decoded data are communicated over the channel to the flash memory controller of the SSD which completes decoding the partially decoded data, based on the associated ECC, using a decoder implemented by the flash memory controller. As an example, the decoder of the NAND flash device can be configured to perform a weak error correction (e.g., by decoding based on at least one of hard sensing of the data or a first number of iterations). In such cases, the decoder of the front-end memory controller is configured to perform a strong error correction (e.g., by decoding the data based on at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations). Namely, the NAND flash device may start decoding the encoded data for a first number of iterations of an LDPC error correction code and then communicate that partially decoded data and corresponding ECC information to the front-end memory controller to perform a remaining set of iterations of the LDPC error correction code to complete decoding the data.
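The idea of splitting an iteration budget between two decoders can be illustrated with a toy example. The sketch below is not an LDPC decoder: it runs a simple serial bit-flipping loop over a length-5 repetition code with five pairwise parity checks, chosen only because it is small enough to follow by hand. Ties are broken by lowest bit index, and the loop is not guaranteed to reach the nearest code word for every error pattern. All names are hypothetical.

```python
# Toy parity checks: each tuple lists the bit indices that one check covers.
CHECKS = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

def unsatisfied(word):
    """Return the parity checks that the word currently violates."""
    return [chk for chk in CHECKS if sum(word[i] for i in chk) % 2]

def bit_flip_iterations(word, max_iters):
    """Run up to max_iters serial bit-flipping iterations.

    Returns (word, converged, iterations_used).
    """
    word = list(word)
    for used in range(max_iters):
        bad = unsatisfied(word)
        if not bad:
            return word, True, used
        # Flip the bit that takes part in the most violated checks.
        counts = [sum(i in chk for chk in bad) for i in range(len(word))]
        word[counts.index(max(counts))] ^= 1
    return word, not unsatisfied(word), max_iters

# The stored code word was all zeros; two bits were corrupted on read.
raw = [1, 0, 1, 0, 0]

# On-die decoder: small iteration budget, hands over a partial result.
partial, done_on_die, _ = bit_flip_iterations(raw, max_iters=1)

# Controller decoder: continues from the partial result with a larger budget.
final, done, extra_iters = bit_flip_iterations(partial, max_iters=4)
```

Because this toy decoder is hard-decision only, the partially decoded word is the entire state that has to be handed over; a soft-decision decoder would need to carry more information between the two stages.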
[0044] In some embodiments, the NAND flash device attempts to decode the data read from the NAND flash device using the decoder of the NAND flash device according to a first error correction scheme. The NAND flash device determines that there exist uncorrectable errors. In response, the NAND flash device communicates the ECC along with the raw data to the front-end memory controller with an indication that an uncorrectable error exists. The front-end memory controller uses a more advanced decoder and error correction scheme to attempt to recover the uncorrectable errors.
[0045] FIG. 2 is a schematic diagram of NAND flash devices 210 of the SSD 100 of FIG. 1. Each channel 122 communicates data and commands from the flash subsystems 120 to a subset of NAND flash chips in the NAND flash devices 210. The sixteen channels CH0, CH1, ... CH15 are coupled respectively to subsets 226a, 226b, ... 226p of the NAND flash devices 210. Within each subset are sixteen NAND flash devices, identified as Logical Unit (LUN)0, LUN1, ... LUN15. The terms NAND flash device and LUN are used interchangeably herein. In other SSDs, fewer channels or more channels may be used. Similarly, in other SSDs, fewer or more NAND flash devices per channel may be provided.
[0046] According to some embodiments, each of the subsets 226a, 226b, . . . 226p of the NAND flash devices 210 implements a respective decoder on its backend memory controller. In this way, one decoder may be implemented on the flash memory controller, such as on the flash subsystems 120 and one or more additional decoder instances may be implemented on each of the subsets 226a, 226b, . . . 226p of the NAND flash devices 210. In some embodiments, one subset 226a may communicate with any one or more of the other subsets 226b-p via the channels 122. These communications may take place directly between the subsets and/or via processing devices in the flash subsystem 120.
[0047] The decoders implemented by the backend memory controllers may differ from the decoder implemented by the front-end memory controller. For example, the decoder implemented by the backend memory controllers may be configured to perform a weak error correction (e.g., error correction that includes at least one of hard sensing of the data or a first number of iterations) and the decoder implemented by the front-end memory controller may be configured to perform a strong error correction (e.g., error correction that includes at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations). In some implementations the decoders implemented by the backend memory controllers may include different resource characteristics and have different latencies than the decoder implemented by the front-end memory controller.
[0048] The front-end memory controller may include a manager component that is configured to select how, when and where error correction is performed on the data read from a given subset 226a-p of the NAND flash devices 210. The manner and configuration of the different ways in which error correction is distributed by the front-end memory controller are discussed in connection with FIGS. 3A-D. In some cases, the manager component configures the error correction distribution based on one or more criteria (e.g., balancing priorities with respect to data traffic, energy consumption, processing resource availability, and/or latency).
[0049] FIGS. 3A-D are schematic diagrams 300-303 of distributed error correction schemes, according to some embodiments. As shown in FIGS. 3A-D, the distributed error correction schemes include a flash memory controller 310 (e.g., a front-end memory controller) and one or more memory devices 320 (e.g., backend memory controllers). The flash memory controller 310 may be implemented at least in part by the flash subsystems 120 and the one or more memory devices 320 may be implemented at least in part respectively by the subsets 226a, 226b, . . . 226p of the NAND flash devices 210. The memory controller 310 may communicate with the one or more memory devices 320 over a channel, such as over respective channels 122.
[0050] The flash memory controller 310 includes a first decoder 312, an ECC manager 318 and an interface 314. The flash memory controller 310 communicates with a host interface to receive a request to read data from the memory device 320. In response, the ECC manager 318 analyzes various parameters to select a distributed error correction scheme. For example, the ECC manager 318 may analyze data traffic patterns, energy consumption, a priority or latency associated with the request received from the host, and whether the first decoder 312 of the flash memory controller 310 is busy. Based on this analysis, the ECC manager 318 determines where and when decoding of the data read from the flash memory device 320 is to be performed.
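One way the ECC manager's selection could be organized is sketched below. The policy is largely an assumption made for illustration: the mapping of a latency priority to parallel decoding and of a balanced priority to split decoding follows the summary paragraphs, but the order of the checks, the handling of a busy controller decoder, and the choice of on-die decoding for an energy priority are guesses, and all identifiers are hypothetical.

```python
from enum import Enum

class Priority(Enum):
    LATENCY = "latency"
    BALANCED = "balanced"
    ENERGY = "energy"

class Scheme(Enum):
    ON_DIE_ONLY = "decode on the memory device"
    CONTROLLER_ONLY = "decode on the flash memory controller"
    PARTIAL_SPLIT = "partial decode on device, finish on controller"
    PARALLEL = "decode on both, use the first result"

def select_scheme(priority, controller_decoder_busy, on_die_decoder_available=True):
    """Pick where decoding happens for one read request (illustrative policy)."""
    if not on_die_decoder_available:
        return Scheme.CONTROLLER_ONLY
    if controller_decoder_busy:
        # Assumption: keep work off a busy controller decoder.
        return Scheme.ON_DIE_ONLY
    if priority is Priority.LATENCY:
        return Scheme.PARALLEL
    if priority is Priority.BALANCED:
        return Scheme.PARTIAL_SPLIT
    # Assumption: for an energy priority, decode on the die so that ECC
    # bits do not have to cross the channel.
    return Scheme.ON_DIE_ONLY

choice = select_scheme(Priority.BALANCED, controller_decoder_busy=False)
```

A real manager would also weigh data traffic and decoder resource characteristics, which this sketch leaves out.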
[0051] As shown in the schematic diagram 300 of FIG. 3A, the ECC manager 318 determines that decoding is to be performed only by the second decoder 322 implemented on the flash memory device 320. This is illustrated by the solid line around the second decoder 322 and the dashed line around the first decoder 312. The ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data and the parameter causes the memory device 320 to decode the data locally using the second decoder 322 implemented by the memory device 320 before providing the data back to the flash memory controller 310. The flash memory controller 310 sends the request to read data and the parameter that controls where decoding will take place over a channel 122 associated with the memory device 320.
[0052] The memory device 320 receives the request over the channel 122 via the interface 324 of the memory device 320. The memory device 320 reads the data and the ECC associated with the data from one or more memory cells 326 (e.g., implemented by the NAND flash devices 210). The memory device 320 retrieves the parameter from the request and determines that the parameter satisfies an error correction criterion for performing error correction using the decoder 322 of the memory device 320. In response, the memory device 320 decodes the data using the second decoder 322 based on the ECC associated with the data. The memory device 320 communicates a data packet that includes the decoded data back over the channel 122 via the interface 324 to the flash memory controller 310. In some cases, the second decoder 322 detects an uncorrectable error (UE). In such cases, the memory device 320 communicates a data packet that includes the ECC, the raw data, and the uncorrectable error to the flash memory controller 310. At that point, the flash memory controller 310 may attempt to perform additional error decoding using the first decoder 312 or indicate to the host device that an UE exists in the data read from the memory device 320.
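The controller-side handling of an uncorrectable error (UE) reported by the memory device can be sketched as follows. The packet layout (`DevicePacket`), the status strings, and the convention that a decoder returns `None` on failure are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class DevicePacket:
    """Hypothetical packet returned by the memory device over the channel."""
    data: bytes
    ecc: Optional[bytes]   # present only when raw data is returned
    uncorrectable: bool    # set when the on-die decoder reported a UE

def controller_receive(
    packet: DevicePacket,
    first_decoder: Callable[[bytes, bytes], Optional[bytes]],
) -> Tuple[Optional[bytes], str]:
    """Return (data, status) for the host."""
    if not packet.uncorrectable:
        return packet.data, "ok"
    # The device could not correct the data: retry with the controller decoder.
    recovered = first_decoder(packet.data, packet.ecc)
    if recovered is not None:
        return recovered, "recovered by controller decoder"
    return None, "uncorrectable error reported to host"

# Placeholder decoders standing in for the controller's first decoder.
always_recovers = lambda raw, ecc: b"clean"
never_recovers = lambda raw, ecc: None

ue_packet = DevicePacket(data=b"raw", ecc=b"ecc", uncorrectable=True)
result_ok = controller_receive(ue_packet, always_recovers)
result_ue = controller_receive(ue_packet, never_recovers)
```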
[0053] As shown in the schematic diagram 301 of FIG. 3B, the ECC manager 318 determines that decoding is to be performed only by the first decoder 312 implemented on the flash memory controller 310. This is illustrated by the solid line around the first decoder 312 and the dashed line around the second decoder 322. The ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data and the parameter causes the memory device 320 to bypass the second decoder 322 when returning data back to the flash memory controller 310. The flash memory controller 310 sends the request to read data and the parameter that controls where decoding will take place over a channel 122 associated with the memory device 320.
[0054] The memory device 320 receives the request over the channel 122 via the interface 324 of the memory device 320. The memory device 320 reads the data and the ECC associated with the data from one or more memory cells 326 (e.g., implemented by the NAND flash devices 210). The memory device 320 retrieves the parameter from the request and determines that the parameter fails to satisfy an error correction criterion for performing error correction using the decoder 322 of the memory device 320. In response, the memory device 320 bypasses the second decoder 322 and routes the raw data and ECC information 330 directly to the interface 324 to be communicated to the flash memory controller 310. The memory device 320 communicates a data packet that includes the raw data and ECC back over the channel 122 via the interface 324 to the flash memory controller 310. The flash memory controller 310 decodes the data based on the ECC received from the memory device 320 using the first decoder 312.
[0055] As shown in the schematic diagram 302 of FIG. 3C, the ECC manager 318 determines that decoding is to be performed by both the first decoder 312 implemented on the flash memory controller 310 and the second decoder 322 implemented on the memory device 320. This is illustrated by the solid line around the first decoder 312 and the solid line around the second decoder 322. The ECC manager 318 inserts a parameter in the message sent to the memory device 320 to read the data and the parameter causes the memory device 320 to decode the data using the second decoder 322 when returning data back to the flash memory controller 310. The flash memory controller 310 sends the request to read data and the parameter that controls where decoding will take place over a channel 122 associated with the memory device 320. The parameter may specify the level of decoding that is to be performed by the second decoder 322 relative to the level of decoding that is to be performed by the first decoder 312. In this implementation, partial decoding takes place on the memory device 320 and remaining decoding takes place on the flash memory controller 310. This divides the work of decoding between the two devices which improves the overall efficiency and speed at which data is read from memory.
[0056] The memory device 320 receives the request over the channel 122 via the interface 324 of the memory device 320. The memory device 320 reads the raw data and the ECC associated with the data from one or more memory cells 326 (e.g., implemented by the NAND flash devices 210). The memory device 320 retrieves the parameter from the request and determines that the parameter satisfies an error correction criterion for performing error correction using the decoder 322 of the memory device 320. In response, the memory device 320 passes the raw data and ECC 330 read from the memory cells 326 to the second decoder 322 to perform an initial decoding of the data based on the ECC of the data using the second decoder 322. For example, the memory device 320 decodes the data using a first number of LDPC iterations and/or weak decoding operations (e.g., using only hard sensing of the data). If the initial decoding (e.g., the weak decoding by the second decoder 322) is successful, then the memory device 320 provides back to the flash memory controller 310 a data packet that includes the partial decoding result (e.g., the data decoded using the first number of iterations of the LDPC code) and the originally-read ECC information. The flash memory controller 310 completes decoding the data based on the ECC information and the partially decoded data using the first decoder 312. As an example, the first decoder 312 can process the data generated by the second decoder 322 and the ECC to perform a second number of iterations remaining in the LDPC code to complete decoding the data and/or can perform a stronger decoding technique (e.g., using hard and soft sensing of the data) to decode the data. In some cases, the partially decoded data from the second decoder 322 can be processed by the first decoder 312 to determine that no further errors are detected by the first decoder 312. In such circumstances, the first decoder 312 passes the partially decoded data received from the second decoder 322 to the requesting host.
[0057] In some cases, the initial decoding by the second decoder 322 is unsuccessful. In such cases, the memory device 320 provides back to the flash memory controller 310 a data packet that includes the raw data and the originally-read ECC information. The flash memory controller 310 attempts to decode the raw data based on the ECC information using the first decoder 312. As an example, the first decoder 312 can process the data read from the memory cells and the ECC information using a strong decoding technique (e.g., using hard and soft sensing of the data).
[0058] As shown in the schematic diagram 303 of FIG. 3D, the memory device 320 is instructed by the ECC manager 318 to perform local decoding of the data using the second decoder 322. The memory device 320 determines that the second decoder 322 is currently busy performing other operations and cannot complete the request to perform local decoding. In response, the memory device 320 communicates with a second memory device 340 to perform the decoding operations. Namely, the memory device 320 can send a data packet that includes the raw data and the ECC 330 read from the memory 326 to the second memory device 340 with an instruction for the second memory device 340 to perform the decoding using a third decoder 342 implemented by the second memory device 340. The second memory device 340 may be implemented as another instance of the subsets 226a, 226b, . . . 226p of the NAND flash devices 210. Namely, the memory device 320 may be a first subset 226a and the second memory device 340 may be a second subset 226b.
[0059] In one embodiment, the memory device 320 provides a data packet that includes the data and the ECC 330 directly to the second memory device 340 through the interface 324 of the memory device 320 and the interface 344 of the second memory device 340. For example, the memory device 320 can communicate via the channels 122 directly with the second memory device 340 without passing information or messages via the flash memory controller 310. In such cases, the second memory device 340 uses the third decoder 342 implemented on the second memory device 340 to decode the data based on the ECC information. Once the data is decoded, the second memory device 340 can return the decoded data back to the memory device 320. At that point, the memory device 320 sends the data, decoded by the second memory device 340, to the flash memory controller 310 via the channel associated with the memory device 320. In other embodiments, the second memory device 340 communicates the decoded data directly back to the flash memory controller 310 via the channel associated with the second memory device 340.
[0060] In another embodiment, the memory device 320 provides a data packet that includes the data and the ECC 330 to the second memory device 340 via the flash memory controller 310. Specifically, the memory device 320 provides a data packet that includes the data and the ECC information to the flash memory controller 310. The ECC manager 318 in flash memory controller 310 finds a memory device (e.g., the second memory device 340) that is not busy or selects one at random or in a round robin manner. The flash memory controller 310 provides the data and the ECC associated with the data to the selected memory device 340 with an instruction to the selected memory device to decode the data using a decoder of the selected memory device. As an example, the second memory device 340 uses the third decoder 342 to decode the data and returns the decoded data back to the flash memory controller 310. The second memory device 340 can be used to completely decode the data read from the memory 326 of the memory device 320 or to partially or initially decode the data read from the memory 326 of the memory device 320. In case of partially decoding the data, the second memory device 340 returns the partially decoded data back to the flash memory controller with the ECC information for the first decoder 312 of the flash memory controller 310 to complete decoding the data.
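One way the ECC manager 318 might choose the memory device that performs the offloaded decoding is sketched below in Python. The class and attribute names (`MemoryDevice`, `EccManager`, `decoder_busy`) are hypothetical. The sketch assumes that an idle device is preferred and that round-robin selection is used only when every other device is busy; the disclosure presents these as alternatives, together with random selection, without fixing an order.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import List


@dataclass
class MemoryDevice:
    name: str
    decoder_busy: bool = False


class EccManager:
    """Selects a memory device whose decoder will handle offloaded data."""

    def __init__(self, devices: List[MemoryDevice]) -> None:
        if not devices:
            raise ValueError("at least one memory device is required")
        self._devices = devices
        self._round_robin = cycle(devices)  # endless iterator over the devices

    def select_decoder_device(self, source: MemoryDevice) -> MemoryDevice:
        # First choice (assumed ordering): any other device with an idle decoder.
        for dev in self._devices:
            if dev is not source and not dev.decoder_busy:
                return dev
        # Otherwise rotate through the other devices in round-robin order.
        for _ in range(len(self._devices)):
            dev = next(self._round_robin)
            if dev is not source:
                return dev
        raise RuntimeError("no other memory device is available")
```

For example, with three devices where only the third has an idle decoder, `select_decoder_device` returns the third device; once all decoders are busy, successive calls alternate between the devices other than the source.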
[0061] FIGS. 4A, 4B, 5, 6 and 7 illustrate flow charts of performing distributed error correction, according to some embodiments. The processes 400, 410, 500, 600 and 700 may be embodied in computer-readable instructions for execution by one or more processors or one or more servers, front-end memory controller, backend memory controllers, or combination thereof; accordingly, the processes 400, 410, 500, 600 and 700 are described below by way of example with reference thereto. However, in other embodiments, at least some of the operations of the processes 400, 410, 500, 600 and 700 may be deployed on various other hardware configurations. Some or all of the operations of the processes 400, 410, 500, 600 and 700 can be performed in parallel or out of order, or entirely omitted.
[0062] At operation 401, an ECC decoder result is received. For example, the memory device 320 receives a request to read data from the memory 326. The memory device 320 determines that a parameter specified in the request satisfies a criterion to perform error correction using the second decoder 322 implemented on the memory device 320 (FIG. 3A). The memory device 320 routes the data from the memory 326 along with the ECC of the data to the second decoder 322. The second decoder 322 decodes the data based on the ECC of the data to generate a decoder result.
[0063] At operation 402, a determination is made as to whether the decoder result includes an uncorrectable error (UE). If so, the process proceeds to operation 403 and otherwise the process proceeds to operation 404. For example, if the second decoder 322 does not detect uncorrectable errors, the process proceeds to operation 404. If the second decoder 322 detects uncorrectable errors, the process proceeds to operation 403.
[0064] At operation 404, the corrected data and error correction information are transferred to the front-end memory controller. For example, the memory device 320 provides the decoded data, and optionally the ECC information, to the flash memory controller 310. Specifically, if the second decoder 322 only performs partial decoding, the ECC information is provided to the flash memory controller 310 to complete decoding the partially decoded data.
[0065] At operation 403, the raw data and error correction information are transferred to the front-end memory controller. For example, the ECC information and the raw data are provided to the flash memory controller 310 with an indication of a UE so that the first decoder 312 of the flash memory controller 310 can be used to decode the data read from the memory 326. After performing operation 403, the process proceeds to operation 405 discussed in connection with FIG. 4B.
[0066] At operation 405, data and error correction information are received from a memory device. For example, the second decoder 322 provides partially decoded data to the flash memory controller 310 along with the ECC information. As another example, the raw data and the ECC information are received by the flash memory controller 310 from the memory device 320 in case the decoder of the memory device 320 was unsuccessful in decoding the data and detected a UE.
[0067] At operation 406, error correction is performed with the front-end memory controller decoder. For example, the first decoder 312 is used to complete decoding data that has been partially decoded by the second decoder 322 of the memory device 320 or to correct data in which the second decoder 322 detected a UE.
[0068] At operation 407, a determination is made as to whether the decoder result includes an uncorrectable error (UE). If so, the process proceeds to operation 409 and otherwise the process proceeds to operation 408. For example, the first decoder 312 of the flash memory controller 310 determines whether the data can successfully be decoded (e.g., no UE exists).
[0069] At operation 408, the corrected data is transferred to a host interface.
[0070] At operation 409, an uncorrectable error flag is raised. For example, the host interface is notified that data read from the memory was not successfully decoded.
[0071] The process 500 demonstrates operations for identifying and using a decoder of another memory device when a decoder of a memory device from which data is read is busy or unavailable. At operation 501, a read request is received from a front-end memory controller on a first memory device. For example, the memory device 320 receives a request to read data from the memory 326. The memory device 320 determines that a parameter specified in the request satisfies a criterion to perform error correction using the second decoder 322 implemented on the memory device 320 (FIG. 3D).
[0072] At operation 502, a determination is made as to whether the decoder of a first memory device is busy. If so, the process proceeds to operation 504, otherwise the process proceeds to operation 503. For example, the memory device 320 determines that the request to read the data from the memory device 320 needs to be completed with minimal latency and that there is not enough time to wait for the decoder of the memory device 320 to complete performing other operations (e.g., decoding data from a prior read request).
[0073] At operation 503, error correction is performed with the backend memory controller of the first memory device. In case the decoder of the memory device 320 is not busy, the second decoder 322 of the memory device 320 is used to decode the data read from the memory 326.
[0074] At operation 504, a determination is made as to whether the error correction device of the front-end controller is busy. If so, the process proceeds to operation 506, otherwise the process proceeds to operation 505. If the decoder of the memory device 320 is busy, the memory device 320 communicates with the memory controller 310 to determine whether the first decoder 312 of the memory controller 310 is available to perform the decoding operations.
[0075] At operation 505, error correction is performed with the front-end memory controller. For example, the memory device 320 bypasses the second decoder 322 of the memory device 320 and communicates the raw data and the associated ECC information to the memory controller 310. The memory controller 310 uses the first decoder 312 to decode the data read from the memory device 320.
[0076] At operation 506, error correction is performed with the backend memory controller of a second memory device. For example, the memory device 320 communicates the raw data and the associated ECC information to the second memory device 340 (directly without passing through the memory controller 310 or indirectly via the memory controller 310). The second memory device 340 uses the third decoder 342 to decode the data read from the memory device 320 and provides the decoding result back to the flash memory controller 310 directly or via the memory device 320. In some cases, the memory device 340 is selected by the memory device 320 at random or in a round-robin manner. In some cases, the memory device 320 communicates with the ECC manager 318 to identify the memory device 340 from a set of available memory devices.
[0077] The process 600 demonstrates operations for identifying and using a decoder of a memory device based on error correction criteria. The error correction criteria can include a latency parameter, a balanced parameter, an energy savings parameter, a bandwidth parameter, and various other conditions or parameters.
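For reference, the busy checks of the process 500 described above (operations 502 to 506) reduce to a short decision function. The Python sketch below is illustrative only; the enum `DecodeTarget` and the function `route_process_500` are hypothetical names, and the two boolean inputs stand in for whatever busy indication the devices actually expose.

```python
from enum import Enum


class DecodeTarget(Enum):
    FIRST_MEMORY_DEVICE = "operation 503"
    FRONT_END_CONTROLLER = "operation 505"
    SECOND_MEMORY_DEVICE = "operation 506"


def route_process_500(device_decoder_busy: bool,
                      controller_decoder_busy: bool) -> DecodeTarget:
    """Pick where error correction runs, following operations 502 and 504."""
    # Operation 502: is the decoder of the first memory device busy?
    if not device_decoder_busy:
        return DecodeTarget.FIRST_MEMORY_DEVICE
    # Operation 504: is the decoder of the front-end controller busy?
    if not controller_decoder_busy:
        return DecodeTarget.FRONT_END_CONTROLLER
    # Both are busy: hand the raw data and ECC to a second memory device.
    return DecodeTarget.SECOND_MEMORY_DEVICE
```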
[0078] At operation 601, a read request is received from a front-end memory controller on a first memory device. For example, the memory device 320 receives a request to read data from the memory 326. The memory device 320 determines that a parameter specified in the request satisfies a criterion to perform error correction using the second decoder 322 implemented on the memory device 320 (FIG. 3D).
[0079] At operation 602, a determination is made as to the distributed error correction criterion and whether a parameter specified in the request satisfies a condition for performing error correction using a decoder of the memory device on which the data is stored. If the distributed error correction criterion includes a latency parameter (e.g., latency is prioritized over the other criteria), the process proceeds to operation 603. If the distributed error correction criterion includes a balanced parameter (e.g., a balance between latency and energy is prioritized over the other criteria), the process proceeds to operation 606. If the distributed error correction criterion includes an energy parameter (e.g., energy savings is prioritized over the other criteria), the process proceeds to operation 607. For example, the ECC manager 318 can analyze various conditions and operations to select the parameter for performing distributed error correction. The ECC manager 318 specifies which of the parameters (e.g., energy, latency, balanced, and so forth) are to be used by the memory device 320 in determining whether to perform error correction using the decoder of the memory device 320.
[0080] At operation 603, error correction is performed with the fastest decoder. For example, the first decoder 312 implemented by the flash memory controller 310 may be more complex and have more processing and decoder power than decoders implemented by the memory devices (e.g., the second decoder 322 and the third decoder 342). In such circumstances, the ECC manager specifies the parameter that causes the memory device 320 to bypass the second decoder 322 and provide the raw data and the associated ECC back to the flash memory controller 310 to perform the decoding using the first decoder 312.
[0081] As another example, the first decoder 312 implemented by the flash memory controller 310 and the second decoder 322 implemented by the memory device 320 are both instructed to decode the data read from the memory device 320 in parallel. In this case, the raw data and the ECC are provided to both the first decoder 312 and the second decoder 322. The ECC manager 318 monitors their decoding operations to determine which one of the decoders completes decoding the data first. If the first decoder 312 completes decoding the data first, the flash memory controller 310 uses the decoding result from the first decoder 312 to provide data back to the host interface. If the second decoder 322 completes decoding the data before the first decoder 312, the decoded data is communicated back to the flash memory controller 310 to be provided to the host interface.
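The parallel-decoding example above, in which the result of whichever decoder finishes first is used, can be sketched in Python with threads standing in for the two hardware decoders. This is only an illustration under stated assumptions: `decode_in_parallel` and `make_stub_decoder` are hypothetical names, and the stub decoders merely sleep and return their input rather than correcting errors.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from typing import Callable, Dict, Tuple

# Hypothetical decoder signature: (raw data, ECC) -> decoded data.
Decoder = Callable[[bytes, bytes], bytes]


def make_stub_decoder(delay_s: float) -> Decoder:
    """Stand-in decoder that only simulates a decoding delay."""
    def decode(raw: bytes, ecc: bytes) -> bytes:
        time.sleep(delay_s)
        return raw  # no real error correction is performed here
    return decode


def decode_in_parallel(raw: bytes, ecc: bytes,
                       decoders: Dict[str, Decoder]) -> Tuple[str, bytes]:
    """Give the same raw data and ECC to every decoder; use the first result."""
    with ThreadPoolExecutor(max_workers=len(decoders)) as pool:
        futures = {pool.submit(fn, raw, ecc): name
                   for name, fn in decoders.items()}
        done, _still_running = wait(futures, return_when=FIRST_COMPLETED)
        winner = next(iter(done))  # arbitrary pick if several finished together
        # Note: leaving the "with" block still waits for the slower decoder.
        return futures[winner], winner.result()
```

A hardware implementation would presumably cancel or ignore the slower decoder rather than wait for it; the thread pool here waits on exit only because that is how `ThreadPoolExecutor` behaves as a context manager.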
[0082] At operation 604, a determination is made as to whether the fastest decoder is busy. If so, the process proceeds to operation 605, otherwise error correction is performed with the fastest error correction device. For example, the flash memory controller 310 may receive the raw data and the ECC from the memory device 320 and may determine that the first decoder 312 is busy decoding previously read data.
[0083] At operation 605, an alternate error correction device is searched for and found. For example, the flash memory controller 310 may send the raw data and the ECC to another flash memory controller 310, a host device, or a second memory device 340 or back to the memory device 320. Namely, the ECC manager 318 may store an index of different decoders that are available and their processing speeds. If the decoder implemented by the flash memory controller 310 is busy and is at the top of the list as being the fastest decoder, the ECC manager 318 selects the next decoder that is on the list. In some cases, the decoder of the second memory device 340 may be faster than the decoder of the memory device 320. In such cases, the flash memory controller 310 provides the data read from the memory device 320 and the ECC information to the second memory device 340 to perform decoding, for example in the manner described in connection with FIG. 3D.
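The index of decoders and processing speeds kept by the ECC manager 318 could be modeled as in the following Python sketch. The record `DecoderEntry`, its `throughput_mbps` field, and the function `pick_decoder` are hypothetical. The sketch also assumes that the "next decoder on the list" is itself skipped when busy, which the description above does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DecoderEntry:
    name: str
    throughput_mbps: float  # hypothetical measure of processing speed
    busy: bool = False


def pick_decoder(index: List[DecoderEntry]) -> Optional[DecoderEntry]:
    """Return the fastest decoder that is not busy, or None if all are busy."""
    ranked = sorted(index, key=lambda e: e.throughput_mbps, reverse=True)
    for entry in ranked:
        if not entry.busy:
            return entry
    return None
```

For instance, if the controller decoder is the fastest entry but is marked busy, `pick_decoder` falls through to the next-fastest idle entry, which may be the decoder of the second memory device 340.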
[0084] At operation 606, error correction is performed with the decoder of the first memory device and the decoder of the front-end memory controller. For example, the second decoder 322 can perform an initial or partial decoding of the data and provide the partially decoded data to the first decoder 312 to complete decoding the data, as explained above in connection with FIG. 3C.
[0085] At operation 607, error correction is performed with the energy efficient decoder. For example, the second decoder 322 implemented by the memory device 320 may be less complex and consume less energy than the decoder 312 implemented by the flash memory controller 310. In such circumstances, the ECC manager specifies the parameter that causes the memory device 320 to decode the data read from the memory 326 using the second decoder 322 and provide the decoded data back to the flash memory controller 310.
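The branch taken at operation 602 of the process 600 can be summarized as a mapping from the prioritized criterion to a decoding plan. The Python sketch below is a simplification under stated assumptions: the enum `Criterion` and the function `plan_process_600` are hypothetical names, the plan is expressed as descriptive strings, and the busy check of operation 604 is omitted.

```python
from enum import Enum
from typing import List


class Criterion(Enum):
    LATENCY = "latency"
    BALANCED = "balanced"
    ENERGY = "energy"


def plan_process_600(criterion: Criterion) -> List[str]:
    """Map the prioritized criterion to the decoding stages of process 600."""
    if criterion is Criterion.LATENCY:
        # Operation 603: use the fastest available decoder.
        return ["fastest decoder"]
    if criterion is Criterion.BALANCED:
        # Operation 606: split the work between device and controller.
        return ["memory-device decoder (partial)",
                "front-end decoder (completion)"]
    if criterion is Criterion.ENERGY:
        # Operation 607: use the most energy-efficient decoder.
        return ["energy-efficient decoder"]
    raise ValueError(f"unsupported criterion: {criterion!r}")
```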
[0086] The process 700 demonstrates operations for decoding data with an on-board decoder of a NAND flash memory device of a NAND flash array. At operation 701, a request to read data stored on the NAND flash memory is received.
[0087] At operation 702, the data and the ECC associated with the data is retrieved from the NAND flash memory.
[0088] At operation 703, a determination is made as to whether a parameter specified in the request to read data satisfies an error correction criterion to decode data using a decoder implemented on the NAND flash memory.
[0089] At operation 704, the data is decoded using the decoder implemented on the NAND flash memory.
[0090] At operation 705, the decoded data is communicated over a channel to the flash memory controller.
[0091] FIG. 8 is a block diagram illustrating circuitry in the form of a processing system for implementing the systems and methods for performing distributed error correction as described above with respect to FIGS. 1-7 according to some embodiments. All components need not be used in various embodiments. One example computing device in the form of a computer 800 may include a processing unit 802, memory 803, a cache 807, removable storage 811, and non-removable storage 822. Although the example computing device is illustrated and described as the computer 800, the computing device may be in different forms in different embodiments. Devices such as smartphones, tablets, and smartwatches are generally collectively referred to as mobile devices or user equipment. Further, although the various data storage elements are illustrated as part of the computer 800, the storage may also or alternatively include cloud-based storage accessible via a network, such as the Internet, or server-based storage.
[0092] The memory 803 may include volatile memory 814 and non-volatile memory 808. The computer 800 also may include - or have access to a computing environment that includes - a variety of computer-readable media, such as the volatile memory 814, non-volatile memory 808, removable storage 811, and non-removable storage 822. Computer storage includes random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM) or electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD ROM), Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium capable of storing computer-readable instructions.
[0093] The computer 800 may include or have access to a computing environment that includes an input interface 826, an output interface 824, and a communication interface 816. The output interface 824 may include a display device, such as a touchscreen, that also may serve as an input device. The input interface 826 may include one or more of a touchscreen, a touchpad, a mouse, a keyboard, a camera, one or more device-specific buttons, one or more sensors integrated within or coupled via wired or wireless data connections to the computer 800, and other input devices. The computer 800 may operate in a networked environment using a communication connection to connect to one or more remote computers, which may include a personal computer (PC), server, router, network PC, peer device or other common DFD network switch, or the like. The communication connection may include a Local Area Network (LAN), a Wide Area Network (WAN), a cellular network, a Wi-Fi network, a Bluetooth network, or other networks. According to one embodiment, the various components of the computer 800 are connected with a system bus 820.
[0094] Computer-readable instructions, such as a program 818, stored on a computer-readable medium are executable by the processing unit 802 of the computer 800. The program 818 in some embodiments comprises software that, upon execution by the processing unit 802, performs the task distribution operations according to any of the embodiments included herein. A hard drive, CD-ROM, and RAM are some examples of articles including a non-transitory computer-readable medium such as a storage device. The terms “computer-readable medium” and “storage device” do not include carrier waves to the extent that carrier waves are deemed to be transitory. Storage can also include networked storage, such as a storage area network (SAN). The computer program 818 also may include instruction modules that upon processing cause the processing unit 802 to perform one or more methods or algorithms described herein.
[0095] Although a few embodiments have been described in detail above, other modifications are possible. For example, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. Other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Other embodiments may be within the scope of the following claims.
[0096] It should be further understood that software including one or more computer-executable instructions that facilitate processing and operations as described above with reference to any one or all of steps of the disclosure can be installed in and sold with one or more computing devices consistent with the disclosure. Alternatively, the software can be obtained and loaded into one or more computing devices, including obtaining the software through a physical medium or distribution system, including, for example, from a server owned by the software creator or from a server not owned but used by the software creator. The software can be stored on a server for distribution over the Internet, for example.
[0097] Also, it will be understood by one skilled in the art that this disclosure is not limited in its application to the details of construction and the arrangement of components set forth in the description or illustrated in the drawings. The embodiments herein are capable of other embodiments, and capable of being practiced or carried out in various ways. Also, it will be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of “including,” “comprising,” or “having” and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. Unless limited otherwise, the terms “connected,” “coupled,” and “mounted,” and variations thereof herein, are used broadly and encompass direct and indirect connections, couplings, and mountings. In addition, the terms “connected” and “coupled” and variations thereof are not restricted to physical or mechanical connections or couplings.
[0098] The components of the illustrative devices, systems, and methods employed in accordance with the illustrated embodiments can be implemented, at least in part, in digital electronic circuitry, in analog electronic circuitry, in computer hardware, firmware, or software, or in combinations of them. These components can be implemented, for example, as a computer program product such as a computer program, program code, or computer instructions tangibly embodied in an information carrier, or in a machine-readable storage device, for execution by, or to control the operation of, data processing apparatus such as a programmable processor, a computer, or multiple computers.
[0099] A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network. Also, functional programs, code, and code segments for accomplishing the techniques described herein can be easily construed as within the scope of the claims by programmers skilled in the art to which the techniques described herein pertain. Method steps associated with the illustrative embodiments can be performed by one or more programmable processors executing a computer program, code, or instructions to perform functions (e.g., by operating on input data and/or generating output). Method steps can also be performed by, and apparatus for performing the methods can be implemented as, special-purpose logic circuitry, e.g., an FPGA or an application-specific integrated circuit (ASIC), for example.
[0100] The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
[0101] Processors suitable for the execution of a computer program include, by way of example, both general- and special-purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory or both. The required elements of a computer are a processor for executing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. Information carriers suitable for embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, e.g., erasable programmable read-only memory or ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory devices, and data storage disks (e.g., magnetic disks, internal hard disks, removable disks, magneto-optical disks, and CD-ROM and DVD-ROM disks). The processor and the memory can be supplemented by or incorporated in special-purpose logic circuitry.
[0102] Those of skill in the art understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, symbols, and chips that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof.
[0103] As used herein, “machine-readable medium” means a device able to store instructions and data temporarily or permanently and may include, but is not limited to, random-access memory (RAM), read-only memory (ROM), buffer memory, flash memory, optical media, magnetic media, cache memory, other types of storage (e.g., erasable programmable read-only memory (EPROM)), and/or any suitable combination thereof. The term “machine-readable medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) able to store processor instructions. The term “machine-readable medium” shall also be taken to include any medium, or combination of multiple media, that is capable of storing instructions for execution by one or more processors (e.g., the processing unit 802), such that the instructions, upon execution by the one or more processors, cause the one or more processors to perform any one or more of the methodologies described herein. Accordingly, “machine-readable medium” refers to a single storage apparatus or device, as well as “cloud-based” storage systems that include multiple storage apparatus or devices.
[0104] In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component, whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the scope of the subject matter disclosed herein.
[0105] Although the present disclosure has been described with reference to specific features and embodiments thereof, it is evident that various modifications and combinations can be made thereto without departing from the scope of the disclosure. The specification and drawings are, accordingly, to be regarded simply as an illustration of the disclosure as defined by the appended claims, and are contemplated to cover any and all modifications, variations, combinations, or equivalents that fall within the scope of the present disclosure.

Claims

What is claimed is:
1. An error correction method for use in a solid state drive (SSD) comprising a plurality of Not-AND (NAND) flash devices, each NAND flash device of the plurality of NAND flash devices having on-die NAND flash memory and a respective decoder of a plurality of decoders, the method comprising: receiving, by a first NAND flash device of the plurality of NAND flash devices, a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; determining, by the first NAND flash device, whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; and in response to determining that the parameter satisfies the error correction criterion: decoding, based on the ECC with which the data was encoded, the data using the first decoder implemented on the first NAND flash device to correct one or more errors in the retrieved data; and communicating the decoded data over the first channel to the flash memory controller of the SSD.
2. The error correction method of claim 1, further comprising: in response to determining that the parameter fails to satisfy the error correction criterion: bypassing the first decoder implemented on the first NAND flash device; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
3. The error correction method of claim 1, wherein the error correction criterion comprises at least one of a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
4. The error correction method of claim 1, further comprising: determining that the first decoder is busy performing other operations; and in response to determining that the first decoder is busy performing other operations, communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
5. The error correction method of claim 1, further comprising: determining that the first decoder is busy performing other operations; and in response to determining that the first decoder is busy performing other operations, communicating the retrieved data and the ECC with which the data was encoded over the first channel to a second NAND flash device, the second NAND flash device decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the second NAND flash device to correct the one or more errors in the retrieved data.
6. The error correction method of claim 5 further comprising: receiving, by the flash memory controller of the SSD, the decoded data from the second NAND flash device over a second channel associated with the second NAND flash device.
7. The error correction method of claim 5, wherein the retrieved data and the ECC are communicated to the second NAND flash device via the flash memory controller of the SSD.
8. The error correction method of claim 5 further comprising: determining that the error correction criterion corresponds to prioritizing a latency parameter to reduce error correction latency; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using the second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data in parallel with decoding, based on the ECC with which the data was encoded, the data using the first decoder.
9. The error correction method of claim 8, further comprising accessing the decoded data from whichever one of the first decoder and the second decoder completes decoding the data first.
10. The error correction method of claim 1, further comprising: determining that the error correction criterion corresponds to prioritizing a balanced parameter; performing partial decoding, based on the ECC with which the data was encoded, of the data using the first decoder implemented on the first NAND flash device; and communicating the partially decoded data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller completing decoding the partially decoded data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller.
11. The error correction method of claim 10, wherein the first decoder is configured to perform a weak error correction comprising at least one of hard sensing of the data or a first number of iterations, and wherein the second decoder is configured to perform a strong error correction comprising at least one of soft and hard sensing of the data or a second number of iterations greater than the first number of iterations.
12. The error correction method of claim 10, wherein the first and second decoders comprise different resource characteristics and different latencies.
13. The error correction method of claim 1, wherein the NAND flash devices comprise a 3D or 4D flash memory device.
14. The error correction method of claim 1, further comprising: generating an error correction result in the first decoder; determining that an uncorrectable error exists in the error correction result; and in response to determining that the uncorrectable error exists in the error correction result, transmitting, to the flash memory controller of the SSD, a data packet comprising the data and error correction information from the first NAND flash device.
15. The error correction method of claim 1, wherein the ECC comprises block codes.
16. The error correction method of claim 1, wherein the ECC comprises convolution codes.
17. A system for use in performing error correction in a solid state drive (SSD), the system comprising: a plurality of Not-AND (NAND) flash devices, each NAND flash device of the plurality of NAND flash devices having on-die NAND flash memory and a respective decoder of a plurality of decoders, a first NAND flash device of the plurality of NAND flash devices performs operations comprising: receiving a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; determining whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; and in response to determining that the parameter satisfies the error correction criterion: decoding, based on the ECC with which the data was encoded, the data using the first decoder implemented on the first NAND flash device to correct one or more errors in the retrieved data; and communicating the decoded data over the first channel to the flash memory controller of the SSD.
18. The system of claim 17, wherein the operations further comprise: in response to determining that the parameter fails to satisfy the error correction criterion: bypassing the first decoder implemented on the first NAND flash device; and communicating the retrieved data and the ECC with which the data was encoded over the first channel to the flash memory controller of the SSD, the flash memory controller decoding the data, based on the ECC with which the data was encoded, using a second decoder implemented by the flash memory controller to correct the one or more errors in the retrieved data.
19. The system of claim 17, wherein the error correction criterion comprises at least one of a plurality of priorities comprising a latency parameter, a balanced parameter, or an energy consumption parameter.
20. An apparatus for use in a solid state drive (SSD) comprising a plurality of Not-AND (NAND) flash devices, each NAND flash device of the plurality of NAND flash devices having on-die NAND flash memory and a respective decoder of a plurality of decoders, the apparatus comprising: means for receiving, by a first NAND flash device of the plurality of NAND flash devices, a request to read data stored on the NAND flash memory of the first NAND flash device: the data stored on the NAND flash memory having been encoded with an error correction code (ECC); and the request being received by the first NAND flash device from a flash memory controller of the SSD over a first channel associated with the first NAND flash device; means for retrieving the data and the ECC with which the data was encoded from the NAND flash memory of the first NAND flash device; means for determining, by the first NAND flash device, whether a parameter specified in the request to read data satisfies an error correction criterion for decoding the data encoded with the ECC using a first decoder of the plurality of decoders implemented on the first NAND flash device; and means for in response to determining that the parameter satisfies the error correction criterion: decoding, based on the ECC with which the data was encoded, the data using the first decoder implemented on the first NAND flash device to correct one or more errors in the retrieved data; and communicating the decoded data over the first channel to the flash memory controller of the SSD.
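The read path recited in claims 1, 2, 4 and 14 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `ReadRequest`, `Priority`, `Route`, `handle_read` and `toy_repetition_decoder` are hypothetical, the error correction criterion is modeled (as an assumption) as a set of request priorities for which on-die decoding is chosen, and a three-copy repetition code stands in for a real ECC.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional, Set, Tuple


class Priority(Enum):
    """Hypothetical request parameter (the priorities named in claim 3)."""
    LATENCY = "latency"
    BALANCED = "balanced"
    ENERGY = "energy"


class Route(Enum):
    ON_DIE = "decoded on the NAND flash device"
    CONTROLLER = "data and ECC forwarded to the flash memory controller"


@dataclass
class ReadRequest:
    address: int
    priority: Priority


# A decoder maps (data, ecc) to corrected data, or None if uncorrectable.
Decoder = Callable[[bytes, bytes], Optional[bytes]]


def toy_repetition_decoder(data: bytes, ecc: bytes) -> Optional[bytes]:
    """Toy stand-in for an ECC decoder: ecc holds two extra copies of data."""
    n = len(data)
    copy_b, copy_c = ecc[:n], ecc[n:2 * n]
    out = bytearray()
    for a, b, c in zip(data, copy_b, copy_c):
        if a == b or a == c:
            out.append(a)
        elif b == c:
            out.append(b)
        else:
            return None  # no two copies agree, so the byte is uncorrectable
    return bytes(out)


def handle_read(
    request: ReadRequest,
    data: bytes,
    ecc: bytes,
    decoder: Decoder,
    on_die_priorities: Set[Priority],
    decoder_busy: bool = False,
) -> Tuple[Route, bytes, Optional[bytes]]:
    """Decide where retrieved data is decoded; returns (route, payload, ecc)."""
    # Claims 2 and 4: bypass the on-die decoder when the criterion is not
    # satisfied or the decoder is busy, and forward the data plus ECC.
    if request.priority not in on_die_priorities or decoder_busy:
        return Route.CONTROLLER, data, ecc
    corrected = decoder(data, ecc)
    # Claim 14: on an uncorrectable error, send the data and ECC onward.
    if corrected is None:
        return Route.CONTROLLER, data, ecc
    # Claim 1: the decoded data travels back over the same channel.
    return Route.ON_DIE, corrected, None


if __name__ == "__main__":
    ecc = b"flash" + b"flash"
    req = ReadRequest(address=0, priority=Priority.ENERGY)
    print(handle_read(req, b"flasH", ecc, toy_repetition_decoder, {Priority.ENERGY}))
```

In this sketch the controller-side second decoder is left out; a `Route.CONTROLLER` result simply marks the point at which the retrieved data and ECC would be handed to it.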
PCT/US2020/064310 2020-12-10 2020-12-10 Distributed ecc scheme in memory controllers WO2022125101A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/US2020/064310 WO2022125101A1 (en) 2020-12-10 2020-12-10 Distributed ecc scheme in memory controllers
CN202080107284.8A CN116490853A (en) 2020-12-10 2020-12-10 Distributed ECC scheme in a memory controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2020/064310 WO2022125101A1 (en) 2020-12-10 2020-12-10 Distributed ecc scheme in memory controllers

Publications (1)

Publication Number Publication Date
WO2022125101A1 true WO2022125101A1 (en) 2022-06-16

Family

ID=74183507

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2020/064310 WO2022125101A1 (en) 2020-12-10 2020-12-10 Distributed ecc scheme in memory controllers

Country Status (2)

Country Link
CN (1) CN116490853A (en)
WO (1) WO2022125101A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140337688A1 (en) * 2009-06-30 2014-11-13 Micron Technology, Inc. Switchable on-die memory error correcting engine
US20160034354A1 (en) * 2014-08-01 2016-02-04 Kabushiki Kaisha Toshiba Global error recovery system
US20190034269A1 (en) * 2017-12-21 2019-01-31 Intel Corporation Transfer of encoded data stored in non-volatile memory for decoding by a controller of a memory device

Also Published As

Publication number Publication date
CN116490853A (en) 2023-07-25

Similar Documents

Publication Publication Date Title
US9858015B2 (en) Solid-state storage management
US9274866B2 (en) Programming non-volatile memory using a relaxed dwell time
US9405672B2 (en) Map recycling acceleration
EP4020244A1 (en) Memory system architecture for heterogeneous memory technologies
US11430540B2 (en) Defective memory unit screening in a memory system
US11372564B2 (en) Apparatus and method for dynamically allocating data paths in response to resource usage in data processing system
US9390003B2 (en) Retirement of physical memory based on dwell time
US11726869B2 (en) Performing error control operation on memory component for garbage collection
US10223022B2 (en) System and method for implementing super word line zones in a memory device
CN112053730A (en) Redundant cloud memory storage for memory subsystems
CN113010098A (en) Apparatus and method for improving input/output throughput of memory system
JP6342013B2 (en) Method, system and computer program for operating a data storage system including a non-volatile memory array
TW202316259A (en) Apparatus and method for controlling a shared memory in a data processing system
CN115480707A (en) Data storage method and device
WO2021034464A1 (en) Configurable media structure
WO2022125101A1 (en) Distributed ecc scheme in memory controllers
EP3841474A1 (en) Data recovery within a memory sub-system
KR20200102527A (en) Identify read behaviors for storage devices based on the host system's workload
CN115291796A (en) Method and device for storing data
KR20210150779A (en) Memory system for processing an delegated task and operation method thereof
Zuolo et al. Memory driven design methodologies for optimal SSD performance
WO2024036473A1 (en) Selectable error handling modes in memory systems
US20230153023A1 (en) Storage device and method performing processing operation requested by host
EP4428699A1 (en) Memory device and method for scheduling block request
US20240004745A1 (en) Pausing memory system based on critical event

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20841796

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202080107284.8

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20841796

Country of ref document: EP

Kind code of ref document: A1