
CN112051964A - Data processing method and device - Google Patents

Info

Publication number
CN112051964A
Authority
CN
China
Prior art keywords
data
storage
storage space
temperature
storage medium
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910493541.5A
Other languages
Chinese (zh)
Other versions
CN112051964B (en
Inventor
吴忠杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201910493541.5A
Publication of CN112051964A
Application granted
Publication of CN112051964B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061: Improving I/O performance
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638: Organizing or formatting or addressing of data
    • G06F3/0644: Management of space entities, e.g. partitions, extents, pools
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0628: Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646: Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652: Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System (AREA)

Abstract

The application provides a data processing method and device. The method comprises: receiving a write request for first data; writing the first data into a first data stream and storing the first data into a first storage space of a storage medium; performing garbage collection on the first data, and converting the first data into second data; writing the second data into a second data stream and storing the second data into a second storage space of the storage medium; wherein the first data and the second data have different data temperatures. The write amplification factor of the storage medium can thereby be reduced, which improves the performance of the storage medium.

Description

Data processing method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus.
Background
At present, more and more users store data in a memory. As the amount of data stored in the memory grows, the storage space utilization of the memory is often improved as follows: the high-temperature data stored in the memory is processed to obtain medium-temperature data, which is stored in the memory; the medium-temperature data is then processed to obtain low-temperature data, which is also stored in the memory; and so on. Garbage collection is then performed on the data stored in the memory.
However, the inventors have found that performing garbage collection on the data stored in a memory increases the write amplification factor of the memory, thereby degrading the performance of the memory.
Disclosure of Invention
In order to solve the above technical problem, an embodiment of the present application shows a data processing method and apparatus.
In a first aspect, an embodiment of the present application shows a data processing method, where the method includes:
receiving a write request for first data;
writing the first data into a first data stream and storing the first data into a first storage space of a storage medium;
performing garbage collection on the first data, and converting the first data into second data;
writing the second data into a second data stream and storing the second data into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
In an optional implementation, the method further includes:
performing garbage collection on the second data, and converting the second data into third data;
writing the third data to a third data stream.
In an optional implementation, the method further includes:
and storing the third data stream into a third storage space of the storage medium.
In an optional implementation manner, the temperature of the second data is lower than that of the first data, and the temperature of the third data is lower than that of the second data.
In an optional implementation manner, the performing garbage collection on the first data and converting the first data into second data includes:
reading the first data from the first data stream, and converting the first data into erasure code data;
and generating the second data according to the erasure code data.
In an optional implementation manner, the performing garbage collection on the second data and converting the second data into third data includes:
and reading the second data from the second data stream, and aggregating the second data into third data.
In an alternative implementation manner, the writing the second data into the second data stream and storing the second data into the second storage space of the storage medium includes:
determining a temperature of the second data;
determining a second storage space for storing data of the temperature among a plurality of storage spaces included in the storage medium;
and writing the second data into a second data stream and storing the second data into the second storage space.
In an optional implementation manner, the determining the temperature of the second data includes:
obtaining the number of garbage collection passes involved in deriving the second data from the original data of the second data;
and determining the temperature of the second data according to the number of passes.
In an optional implementation manner, the determining, among a plurality of storage spaces included in the storage medium, a second storage space for storing data of the temperature includes:
determining a corresponding relation table between the temperature and the storage address of the storage space;
searching whether a storage address corresponding to the temperature exists in the corresponding relation table;
and if the corresponding relation table has a storage address corresponding to the temperature, determining the second storage space according to the storage address.
In an optional implementation, the method further includes:
and if the storage address corresponding to the temperature does not exist in the corresponding relation table, creating the second storage space in the storage medium.
In an optional implementation, the method further includes:
acquiring a storage address of the created second storage space;
and forming a corresponding table entry by the temperature and the storage address of the created second storage space, and storing the corresponding table entry in the corresponding relation table.
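The lookup-or-create behaviour of the correspondence table described above can be sketched as follows. This is only an illustrative sketch: the class name `TemperatureTable`, the bump-style address allocation, and the fixed `space_size` are assumptions made for the example and do not appear in the application.

```python
class TemperatureTable:
    """Correspondence table: data temperature -> storage address of a storage space."""

    def __init__(self, space_size=1024):
        self._table = {}              # temperature -> storage address
        self._next_address = 0        # hypothetical allocator state
        self._space_size = space_size

    def _create_space(self):
        # Stand-in for creating a new storage space in the storage medium.
        address = self._next_address
        self._next_address += self._space_size
        return address

    def space_for(self, temperature):
        # Search the table for a storage address matching this temperature.
        address = self._table.get(temperature)
        if address is None:
            # No entry yet: create the space, then store the new table entry.
            address = self._create_space()
            self._table[temperature] = address
        return address


table = TemperatureTable()
hot = table.space_for(2)    # first request for temperature 2 creates a space
warm = table.space_for(1)   # a different temperature gets a different space
```

A second call with the same temperature returns the address already recorded in the table rather than creating another space.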
In a second aspect, an embodiment of the present application shows a data processing apparatus, including:
the receiving module is used for receiving a write request of first data;
the first writing module is used for writing the first data into a first data stream, and the first storing module is used for storing the first data stream into a first storage space of a storage medium;
the first recovery module is used for performing garbage collection on the first data and converting the first data into second data;
a second writing module, configured to write the second data into a second data stream, and a second storing module, configured to store the second data stream into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
In an optional implementation, the apparatus further comprises:
the second recovery module is used for performing garbage collection on the second data and converting the second data into third data;
and the third writing module is used for writing the third data into a third data stream.
In an optional implementation, the apparatus further comprises:
and the third storing module is used for storing the third data stream into a third storage space of the storage medium.
In an optional implementation manner, the temperature of the second data is lower than that of the first data, and the temperature of the third data is lower than that of the second data.
In an optional implementation manner, the first recovery module includes:
a conversion unit, configured to read the first data from the first data stream, and convert the first data into erasure code data;
a first determining unit configured to generate the second data according to the erasure code data.
In an optional implementation, the second recovery module includes:
and the aggregation unit reads the second data from the second data stream and aggregates the second data into third data.
In an alternative implementation, the second storing module includes:
a second determining unit for determining the temperature of the second data;
a third determination unit configured to determine, among a plurality of storage spaces included in the storage medium, a second storage space for storing the data of the temperature;
and the storage unit is used for writing the second data into a second data stream and storing the second data into the second storage space.
In an optional implementation manner, the second determining unit includes:
the first obtaining subunit is configured to obtain the number of garbage collection passes involved in deriving the second data from the original data of the second data;
and the first determining subunit is used for determining the temperature of the second data according to the number of passes.
In an optional implementation manner, the third determining unit includes:
the second determining subunit is used for determining a corresponding relation table between the temperature and the storage address of the storage space;
the searching subunit is used for searching whether a storage address corresponding to the temperature exists in the corresponding relation table;
and a third determining subunit, configured to determine, if a storage address corresponding to the temperature exists in the correspondence table, the second storage space according to the storage address.
In an optional implementation manner, the third determining unit further includes:
and a creating subunit, configured to create the second storage space in the storage medium if a storage address corresponding to the temperature does not exist in the correspondence table.
In an optional implementation manner, the third determining unit further includes:
a second obtaining subunit, configured to obtain a storage address of the created second storage space;
and the storage subunit is used for forming a corresponding table entry by the temperature and the storage address of the created second storage space and storing the corresponding table entry in the corresponding relation table.
In a third aspect, an embodiment of the present application shows an electronic device, including:
a processor; and
a memory having executable code stored thereon that, when executed, causes the processor to perform:
receiving a write request for first data;
writing the first data into a first data stream and storing the first data into a first storage space of a storage medium;
performing garbage collection on the first data, and converting the first data into second data;
writing the second data into a second data stream and storing the second data into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
In an alternative implementation, the processor further performs:
performing garbage collection on the second data, and converting the second data into third data;
writing the third data to a third data stream.
In an alternative implementation, the processor further performs:
and storing the third data stream into a third storage space of the storage medium.
In an alternative implementation, the processor further performs:
reading the first data from the first data stream, and converting the first data into erasure code data;
and generating the second data according to the erasure code data.
In an alternative implementation, the processor further performs:
reading the second data from the second data stream, and converting the second data into erasure code data;
and generating third data according to the erasure code data.
In an alternative implementation, the processor further performs:
determining a temperature of the second data;
determining a second storage space for storing data of the temperature among a plurality of storage spaces included in the storage medium;
and writing the second data into a second data stream and storing the second data into the second storage space.
In an alternative implementation, the processor further performs:
obtaining the number of garbage collection passes involved in deriving the second data from the original data of the second data;
and determining the temperature of the second data according to the number of passes.
In an alternative implementation, the processor further performs:
determining a corresponding relation table between the temperature and the storage address of the storage space;
searching whether a storage address corresponding to the temperature exists in the corresponding relation table;
and if the corresponding relation table has a storage address corresponding to the temperature, determining the second storage space according to the storage address.
In an alternative implementation, the processor further performs:
and if the storage address corresponding to the temperature does not exist in the corresponding relation table, creating the second storage space in the storage medium.
In an alternative implementation, the processor further performs:
acquiring a storage address of the created second storage space;
and forming a corresponding table entry by the temperature and the storage address of the created second storage space, and storing the corresponding table entry in the corresponding relation table.
In a fourth aspect, embodiments of the present application show one or more machine-readable media having stored thereon executable code that, when executed, causes a processor to perform a data processing method as described in the first aspect.
Compared with the prior art, the embodiment of the application has the following advantages:
In the prior art, data of different temperatures is not distinguished when it is stored in a storage medium. As a result, when garbage collection is performed on the data in the storage medium, all of the data stored in the storage medium is often collected at the same time. The amount of data involved in the garbage collection process is therefore large, which increases the write amplification factor of the storage medium and reduces its performance.
Some data stored in the storage medium has a higher temperature and generates more garbage data, so it generally needs garbage collection. Other data has a lower temperature and generates little or no garbage data, so it often does not need garbage collection.
Therefore, in the present application, data of different temperatures can be stored in different storage spaces of the storage medium. When the data in one of the storage spaces needs garbage collection, garbage collection is performed only on the data in that storage space; the data in the other storage spaces need not be collected at the same time.
That is, the data stored in the storage medium is sorted into different storage spaces by temperature, and garbage collection is performed only on the storage space, or equivalently only on the data of the temperature, that needs it.
Therefore, compared with the prior art, the data involved in the garbage collection process in the present application is only part of the data stored in the storage medium, so the amount of data involved is smaller. The write amplification factor of the storage medium can thus be reduced and the performance of the storage medium improved.
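The effect on write amplification can be illustrated with a small worked calculation. The sketch below uses the common definition of the write amplification factor (total bytes written to the medium divided by bytes written by the host) and a deliberately simplified model in which garbage collection rewrites every valid byte in each storage space it touches. The space names and byte counts are invented for illustration and are not taken from the application.

```python
def write_amplification(host_written, gc_rewritten):
    """Write amplification factor: total medium writes / host writes."""
    return (host_written + gc_rewritten) / host_written


# Hypothetical occupancy of two storage spaces, in arbitrary units.
spaces = {
    "hot": {"valid": 40, "garbage": 60},   # frequently accessed, mostly garbage
    "cold": {"valid": 95, "garbage": 5},   # rarely accessed, almost no garbage
}
host_written = 100

# Undifferentiated collection: valid data in every space is rewritten.
rewritten_all = sum(space["valid"] for space in spaces.values())
waf_all = write_amplification(host_written, rewritten_all)

# Temperature-separated collection: only the hot space is collected.
rewritten_hot = spaces["hot"]["valid"]
waf_hot_only = write_amplification(host_written, rewritten_hot)
```

Under these assumed numbers, collecting only the hot space rewrites 40 units instead of 135, so the factor drops from 2.35 to 1.4.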
Drawings
FIG. 1 is a flow chart illustrating a method of data processing according to an exemplary embodiment.
FIG. 2 is a flow chart illustrating a method of data processing according to an exemplary embodiment.
FIG. 3 is a block diagram illustrating a data processing apparatus according to an example embodiment.
FIG. 4 is a block diagram illustrating a data processing apparatus according to an example embodiment.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, the present application is described in further detail with reference to the accompanying drawings and the detailed description.
Fig. 1 is a flowchart illustrating a data processing method according to an exemplary embodiment, where the method is applied to an electronic device, the electronic device includes a storage medium, such as an SSD (Solid State Disk) or SSHD (Solid State Hybrid Drive), and the method includes the following steps.
In step S101, a write request of first data is received;
in this application, when the first data needs to be stored in the storage medium of the electronic device, a write request of the first data may be input in the electronic device, and then the electronic device receives the input write request of the first data, and then step S102 is executed; alternatively, a write request of the first data is sent to the electronic device by the other device, and then the electronic device receives the write request of the first data sent by the other device, and then step S102 is executed.
In step S102, writing first data into a first data stream, and storing the first data into a first storage space of a storage medium;
In the present application, a plurality of different storage spaces may be created in the storage medium in advance, each used for storing data of a different temperature. When data needs to be stored in the storage medium, the temperature of the data is determined first; the data is then written into a data stream, and the data stream is stored in the storage space designated for data of that temperature. In this way, the data stored in different storage spaces of the storage medium differs in temperature.
The first data is original data, so a relatively high preset value, for example a value greater than 1, may be set as the temperature of the first data. A first storage space for storing data of that temperature is then determined among the plurality of storage spaces included in the storage medium, and the first data is written into the first data stream and stored in the first storage space.
In this application, the temperature of data is characterized by, for example, the frequency at which the data is accessed.
In step S103, performing garbage collection on the first data, and converting the first data into second data;
In this application, for any storage space in the storage medium, garbage data corresponding to a piece of data is generated in the storage space after that data is accessed; the more times the data is accessed, the more garbage data is generated in the storage space. When the ratio of the space occupied by garbage data in the storage space to the size of the storage space is greater than a preset threshold, it may be determined that the data stored in the storage space satisfies a preset condition, and garbage collection may be performed on the data in the storage space to delete the garbage data from the storage medium, thereby improving the utilization of the storage space. The same applies to every other storage space in the storage medium.
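The preset condition described above amounts to a simple ratio test. A minimal sketch follows, assuming a hypothetical function name and an arbitrary default threshold of 0.5; the application does not specify a threshold value.

```python
def needs_garbage_collection(garbage_bytes, space_bytes, threshold=0.5):
    """Return True when garbage occupies more than `threshold` of the space."""
    if space_bytes <= 0:
        raise ValueError("space_bytes must be positive")
    # The condition is a strict comparison: the ratio must exceed the threshold.
    return garbage_bytes / space_bytes > threshold


mostly_garbage = needs_garbage_collection(600, 1000)   # ratio 0.6
exactly_half = needs_garbage_collection(500, 1000)     # ratio 0.5
```

Because each storage space is tested on its own, a space holding cold data with little garbage is simply never selected for collection.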
That is, in the present application, in order to improve the utilization rate of the first storage space, when the data stored in the first storage space meets the preset condition, garbage collection may be performed on the first data, and the first data is converted into the second data, or garbage collection may be performed on the first data periodically, and the first data is converted into the second data, and the like. In one example, first data may be read from a first data stream, the first data may be converted to erasure coded data, and second data may be generated based on the erasure coded data.
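The application does not state which erasure code is used. Purely as an illustration of what converting data into erasure code data can look like, the sketch below splits the data into `k` fragments and adds a single XOR parity fragment, which tolerates the loss of any one fragment. The function names and the choice of a single-parity code are assumptions; practical systems often use stronger codes such as Reed-Solomon.

```python
from functools import reduce


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k=4):
    """Split data into k zero-padded fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)                  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(shards, missing_index):
    """Rebuild the shard at missing_index by XOR-ing all the other shards."""
    present = [s for i, s in enumerate(shards) if i != missing_index]
    return reduce(xor_bytes, present)


first_data = b"first data payload"
shards = encode(first_data)          # 4 data fragments + 1 parity fragment
rebuilt = recover(shards, 2)         # pretend fragment 2 was lost
```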
In step S104, writing the second data into the second data stream, and storing the second data into the second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
In the present application, each time a piece of data is garbage-collected to obtain erasure code data, the temperature of the resulting erasure code data is usually lower than that of the source data. Therefore, the temperature of the second data may be lower than the temperature of the first data.
When the second data needs to be stored in the storage medium, the temperature of the second data may be determined first; the second data is then written into a data stream and stored in the storage space designated for data of that temperature, so that the data stored in different storage spaces of the storage medium differs in temperature. The specific storing process is described in the embodiments below and is not detailed here.
In the prior art, data of different temperatures is not distinguished when it is stored in a storage medium. As a result, when garbage collection is performed on the data in the storage medium, all of the data stored in the storage medium is often collected at the same time. The amount of data involved in the garbage collection process is therefore large, which increases the write amplification factor of the storage medium and reduces its performance.
Some data stored in the storage medium has a higher temperature and generates more garbage data, so it generally needs garbage collection. Other data has a lower temperature and generates little or no garbage data, so it often does not need garbage collection.
Therefore, in the present application, data of different temperatures can be stored in different storage spaces of the storage medium. When the data in one of the storage spaces needs garbage collection, garbage collection is performed only on the data in that storage space; the data in the other storage spaces need not be collected at the same time.
That is, the data stored in the storage medium is sorted into different storage spaces by temperature, and garbage collection is performed only on the storage space, or equivalently only on the data of the temperature, that needs it.
Therefore, compared with the prior art, the data involved in the garbage collection process in the present application is only part of the data stored in the storage medium, so the amount of data involved is smaller. The write amplification factor of the storage medium can thus be reduced and the performance of the storage medium improved.
Further, the second data in the second storage space may be accessed later. Each time the second data is accessed, garbage data corresponding to the second data is generated in the second storage space; the more times the second data is accessed, the more garbage data is generated. When the ratio of the space occupied by garbage data in the second storage space to the size of the second storage space is greater than a preset threshold, it may be determined that the data stored in the second storage space satisfies the preset condition; garbage collection may then be performed on the second data, the second data converted into third data, and the third data written into a third data stream. Alternatively, garbage collection may be performed on the second data periodically, with the second data converted into third data and the third data written into a third data stream, and so on. In this way, the garbage data in the second storage space is deleted from the storage medium, which improves the utilization of the second storage space.
When performing garbage collection on the second data and converting the second data into third data, the second data may be read from the second data stream and then aggregated into the third data, for example, the second data is converted into erasure code data, then the third data is generated according to the erasure code data, and then the third data stream may also be stored in a third storage space of the storage medium.
In the present application, the temperature of the second data may be lower than the temperature of the first data, and the temperature of the third data may be lower than the temperature of the second data.
Further, garbage collection may then continue on the third data, converting the third data into fourth data, and so on, until the life cycle of the resulting data expires, i.e., until the resulting data is deleted from the storage medium.
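The cascade from the first data stream to the second, third, and later streams can be sketched as below. This is an assumed in-memory model, not the application's implementation: the class `Medium`, its method names, and the use of integer stream numbers as keys for storage spaces are all hypothetical, and the `convert` callback stands in for whatever conversion (for example erasure coding or aggregation) is applied during collection.

```python
class Medium:
    """Toy storage medium: one storage space per data stream number."""

    def __init__(self):
        self.spaces = {}   # stream number -> list of [payload, is_live]

    def write(self, payload, stream=1):
        # Write the payload into the given stream's storage space.
        self.spaces.setdefault(stream, []).append([payload, True])

    def invalidate(self, stream, index):
        # Mark one entry as garbage data.
        self.spaces[stream][index][1] = False

    def collect(self, stream, convert=lambda payload: payload):
        """Collect one space: drop garbage, convert live data, move it to the next stream."""
        entries = self.spaces.pop(stream, [])
        moved = 0
        for payload, is_live in entries:
            if is_live:
                self.write(convert(payload), stream + 1)
                moved += 1
        return moved


medium = Medium()
medium.write(b"a")
medium.write(b"b")
medium.invalidate(1, 0)              # b"a" becomes garbage
moved_first = medium.collect(1)      # live first data becomes second data
moved_second = medium.collect(2)     # second data becomes third data
```

Only the space being collected is read and rewritten; other spaces are left untouched, which is the property the application relies on to limit write amplification.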
In an embodiment of the present application, referring to fig. 2, when writing the second data into the second data stream and storing the second data into the second storage space of the storage medium, the following process may be performed, including:
in step S201, a temperature of the second data is determined;
The number of garbage collection passes involved in deriving the second data from its original data may be obtained, and the temperature of the second data is determined according to that number.
In the present application, erasure code data obtained by garbage-collecting a piece of data usually has a lower temperature than that data. Therefore, when the temperature of a piece of data needs to be determined, the number of garbage collection passes involved in deriving the data from its original data can be obtained, and the temperature of the data is then determined according to that number.
For example, suppose data 1 is obtained by garbage-collecting data 2 once, data 2 is obtained by garbage-collecting data 3 once, and data 3 was not obtained by garbage-collecting any data stored in the storage medium but was obtained elsewhere and stored directly in the storage medium. Then data 3 is original data. The number of garbage collection passes involved in deriving data 1 from data 3 is 2, so the temperature of data 1 can be determined from the number 2; the number of garbage collection passes involved in deriving data 2 from data 3 is 1, so the temperature of data 2 can be determined from the number 1.
In the present application, after a piece of data is garbage-collected to obtain erasure code data, the number of garbage collection passes involved in obtaining the erasure code data may be incremented in a preset field of the erasure code data.
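One way to picture the preset field is as a counter carried with each piece of data and incremented on every garbage collection pass. The sketch below is an assumption for illustration only: the `Record` type, the field name `gc_count`, and the `convert` callback are hypothetical and are not defined by the application.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    payload: bytes
    gc_count: int = 0   # hypothetical "preset field": collection passes so far


def garbage_collect(record, convert=lambda payload: payload):
    """Return the converted record with its pass counter incremented by one."""
    return replace(
        record,
        payload=convert(record.payload),
        gc_count=record.gc_count + 1,
    )


data_3 = Record(b"original")         # original data, never collected
data_2 = garbage_collect(data_3)     # one pass
data_1 = garbage_collect(data_2)     # two passes
```

The names `data_3`, `data_2`, and `data_1` mirror the example above: reading `gc_count` from `data_1` gives 2 and from `data_2` gives 1.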
In this way, in the present application, when the temperature of a certain data needs to be determined, the number of garbage collection processes included in the process of acquiring the data from the original data of the data may be acquired in a preset field in the data, and the higher the number of garbage collection processes included, the lower the temperature of the data, and the lower the number of garbage collection processes included, the higher the temperature of the data, so the inverse number of the number of garbage collection processes, etc. may be taken as the temperature of the data.
In the present application, the first data is stored directly in the first storage space of the storage medium according to the write request and is not obtained by garbage collecting other data, so its number of garbage collection passes is 0. Because the reciprocal of 0 is undefined, a relatively high preset value, for example a value greater than 1, may be set as the temperature of the first data. The second data is obtained by garbage collecting the first data, so its number of passes is 1, and 1 may be used directly as its temperature. The third data is obtained by garbage collecting the second data, so its number of passes is 2, and the reciprocal of 2, namely 1/2, may be used as its temperature.
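The mapping from the number of garbage collection passes to a temperature can be sketched as below. This is only one possible reading of the text: the function name `temperature` is hypothetical, and the preset value 2 for original data is an arbitrary assumption (the text only requires a value greater than 1). `Fraction` is used so that temperatures compare exactly.

```python
from fractions import Fraction

# Assumed preset temperature for data that has never been garbage collected.
# The description only requires a value greater than 1; 2 is arbitrary.
ORIGINAL_DATA_TEMPERATURE = Fraction(2)


def temperature(gc_count: int) -> Fraction:
    """Map a garbage collection count to a temperature.

    A count of 0 yields the preset value; any positive count yields its
    reciprocal, so more passes always mean a lower temperature.
    """
    if gc_count < 0:
        raise ValueError("gc_count must be non-negative")
    if gc_count == 0:
        return ORIGINAL_DATA_TEMPERATURE
    return Fraction(1, gc_count)
```

With this sketch the first, second, and third data receive the temperatures 2, 1, and 1/2 respectively, which preserves the ordering described above.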
In step S202, among a plurality of storage spaces included in the storage medium, a second storage space for storing data of the temperature is determined.
In the present application, different storage spaces in the storage medium have different storage addresses. For any storage space in the storage medium, the temperature of the data that the storage space is used to store and the storage address of the storage space may be combined into a table entry and stored in a correspondence table between temperatures and storage addresses of storage spaces. The same operation is performed for every other storage space in the storage medium.
Thus, in this step, the correspondence table between temperatures and storage addresses may be determined, and the table is searched for a storage address corresponding to the temperature. If the correspondence table contains a storage address corresponding to the temperature, the second storage space is determined according to that storage address; for example, the storage space corresponding to the storage address is determined as the second storage space.
However, if no storage address corresponding to the temperature exists in the correspondence table, a second storage space is created in the storage medium; for example, a new storage space is created in the storage medium and the new storage space is determined as the second storage space.
This situation can arise, for example, when garbage collection was previously performed on the data in the storage space corresponding to the storage address: during that garbage collection the storage space was released, and the table entry containing its storage address was deleted from the correspondence table. The storage space corresponding to the storage address then no longer exists in the storage medium, that is, there is no longer a second storage space for storing data of the temperature. A new storage space may therefore be created in the storage medium and determined as the second storage space for storing data of the temperature.
Further, other data of the same temperature may need to be stored in the storage medium later. To avoid creating yet another storage space for that data, and to instead continue storing it in the already created second storage space, in another embodiment of the present application the storage address of the created second storage space may be obtained; the temperature and the storage address of the created second storage space are then combined into a table entry, which is stored in the correspondence table between temperatures and storage addresses.
Therefore, if other data of the temperature needs to be stored in the storage medium later, the storage address of the created second storage space corresponding to the temperature can be found in the correspondence table, and the data can be stored in the created second storage space according to that address. This avoids creating an additional storage space for the data and thus saves storage resources in the storage medium.
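The lookup, creation, and registration steps above can be sketched as a small in-memory model. Everything here is a simplifying assumption rather than the implementation of the application: the class `StorageMedium`, its method names, and the address values are hypothetical, and a storage space is modelled as a plain list.

```python
from fractions import Fraction
from itertools import count


class StorageMedium:
    """Toy storage medium that keeps a temperature -> address table."""

    def __init__(self):
        # Hypothetical address allocator: 0x1000, 0x2000, 0x3000, ...
        self._next_address = count(start=0x1000, step=0x1000)
        self.spaces = {}  # storage address -> list of stored payloads
        self.table = {}   # temperature -> storage address

    def space_for(self, temp):
        """Return the address of the space for `temp`, creating it if absent."""
        address = self.table.get(temp)
        if address is None:
            # No entry for this temperature: create a space and register it
            # so that later data of the same temperature reuses it.
            address = next(self._next_address)
            self.spaces[address] = []
            self.table[temp] = address
        return address

    def store(self, temp, payload):
        """Store `payload` in the space for `temp` and return its address."""
        address = self.space_for(temp)
        self.spaces[address].append(payload)
        return address

    def release(self, temp):
        """Release the space for `temp`, as happens after garbage collection."""
        address = self.table.pop(temp)
        del self.spaces[address]


medium = StorageMedium()
a1 = medium.store(Fraction(1), b"x")     # creates a space for temperature 1
a2 = medium.store(Fraction(1), b"y")     # reuses the same space
a3 = medium.store(Fraction(1, 2), b"z")  # different temperature, new space
medium.release(Fraction(1))              # entry deleted from the table
a4 = medium.store(Fraction(1), b"w")     # a new space must be created
```

After `release`, the table no longer has an entry for temperature 1, so the final `store` call takes the creation branch, which corresponds to the case described above where the earlier space was released during garbage collection.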
In step S203, the second data is written into the second data stream and stored into the second storage space.
By this method, data of different temperatures can be stored in different storage spaces. When garbage collection is performed on data stored in the storage medium and only data of a particular temperature needs to be collected, garbage collection can be performed on the data of that temperature alone, which reduces the write amplification factor of the storage medium and improves its performance.
The storing of the third data stream into the third storage space of the storage medium can be implemented with reference to the above flow and is not described in detail here.
Fig. 3 is a block diagram illustrating a data processing apparatus according to an exemplary embodiment, the apparatus including, as shown in fig. 3:
a receiving module 11, configured to receive a write request of first data;
a first writing module 12, configured to write the first data into a first data stream, and a first storing module 13, configured to store the first data stream into a first storage space of a storage medium;
a first recovery module 14, configured to perform garbage collection on the first data, and convert the first data into second data;
a second writing module 15, configured to write the second data into a second data stream, and a second storing module 16, configured to store the second data stream into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
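As a non-limiting illustration, the cooperation of the modules listed above might be sketched as follows. The class `StreamStore` and its methods are hypothetical names, storage spaces are identified by small integers, and the conversion performed during garbage collection is replaced by a simple concatenation; none of these choices is taken from the application.

```python
from collections import defaultdict


class StreamStore:
    """Toy model: one data stream per storage space, keyed by an integer."""

    def __init__(self):
        # storage space id -> payloads written through that space's stream
        self.spaces = defaultdict(list)

    def receive_write(self, payload: bytes) -> None:
        """Receiving and first writing/storing modules: data goes to space 1."""
        self.spaces[1].append(payload)

    def garbage_collect(self, source: int) -> None:
        """Recovery module: convert data in `source` and store it one space on.

        Placeholder conversion (an assumption): the payloads are concatenated.
        The source space is released once its data has been converted.
        """
        payloads = self.spaces.pop(source, [])
        if payloads:
            self.spaces[source + 1].append(b"".join(payloads))


store = StreamStore()
store.receive_write(b"a")   # first data
store.receive_write(b"b")   # more first data
store.garbage_collect(1)    # first data becomes second data in space 2
```

Running `store.garbage_collect(2)` afterwards would move the result into space 3, mirroring the optional second recovery, third writing, and third storing modules.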
In an optional implementation, the apparatus further comprises:
a second recovery module, configured to perform garbage collection on the second data, and convert the second data into third data;
and a third writing module, configured to write the third data into a third data stream.
In an optional implementation, the apparatus further comprises:
and the third storing module is used for storing the third data stream into a third storage space of the storage medium.
In an optional implementation manner, the temperature of the second data is lower than that of the first data, and the temperature of the third data is lower than that of the second data.
In an optional implementation manner, the first recovery module includes:
a conversion unit, configured to read the first data from the first data stream, and convert the first data into erasure code data;
a first determining unit configured to generate the second data according to the erasure code data.
In an optional implementation, the second recovery module includes:
and the aggregation unit is used for reading the second data from the second data stream and aggregating the second data into third data.
In an optional implementation, the first storing module includes:
a second determining unit for determining the temperature of the second data;
a third determination unit configured to determine, among a plurality of storage spaces included in the storage medium, a second storage space for storing the data of the temperature;
and the storage unit is used for writing the second data into a second data stream and storing the second data into the second storage space.
In an optional implementation manner, the second determining unit includes:
the first obtaining subunit is configured to obtain the number of garbage collection processes included in the process of obtaining the second data according to the original data of the second data;
and the first determining subunit is used for determining the temperature of the second data according to the processing times.
In an optional implementation manner, the third determining unit includes:
the second determining subunit is used for determining a corresponding relation table between the temperature and the storage address of the storage space;
the searching subunit is used for searching whether a storage address corresponding to the temperature exists in the corresponding relation table;
and a third determining subunit, configured to determine, if a storage address corresponding to the temperature exists in the correspondence table, the second storage space according to the storage address.
In an optional implementation manner, the third determining unit further includes:
and a creating subunit, configured to create the second storage space in the storage medium if a storage address corresponding to the temperature does not exist in the correspondence table.
In an optional implementation manner, the third determining unit further includes:
a second obtaining subunit, configured to obtain a storage address of the created second storage space;
and the storage subunit is used for forming a corresponding table entry by the temperature and the storage address of the created second storage space and storing the corresponding table entry in the corresponding relation table.
In the prior art, data of different temperatures is not distinguished when stored in a storage medium. As a result, when garbage collection is performed on the storage medium, all the data stored in the medium is often garbage collected at the same time. The garbage collection process thus involves all the data stored in the storage medium, that is, a large amount of data, which increases the write amplification factor of the storage medium and reduces its performance.
Some data stored in the storage medium has a higher temperature and generates more garbage data, so higher-temperature data generally needs to be garbage collected. Other data has a lower temperature and generates little or no garbage data, so lower-temperature data often does not need to be garbage collected.
Therefore, in the present application, data of different temperatures can be stored in different storage spaces of the storage medium. Among the plurality of storage spaces included in the storage medium, garbage collection is performed only on the data in the storage space that needs it; data in the storage spaces that do not need garbage collection is left alone, and it is not necessary to garbage collect the data in all storage spaces at the same time.
That is, the data stored in the storage medium is sorted into different storage spaces according to temperature, and garbage collection is performed only on the data in whichever storage space, or equivalently on the data of whichever temperature, requires it.
Compared with the prior art, the garbage collection process in the present application therefore involves only part of the data stored in the storage medium, that is, a smaller amount of data, so the write amplification factor of the storage medium can be reduced and the performance of the storage medium improved.
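The effect on the amount of data involved in garbage collection can be illustrated with a small calculation. The figures below are invented purely for illustration and are not measurements from the application; the sketch also assumes, as is usual for log-structured storage, that garbage collection must rewrite the still-valid data of every space it collects. The names `Space` and `relocated_bytes` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Space:
    valid_bytes: int    # live data that garbage collection must rewrite
    garbage_bytes: int  # dead data that garbage collection reclaims


def relocated_bytes(spaces):
    """Bytes rewritten when every space in `spaces` is garbage collected."""
    return sum(space.valid_bytes for space in spaces)


# Hypothetical figures: the hot space is mostly garbage, the cold space is not.
hot = Space(valid_bytes=20, garbage_bytes=80)
cold = Space(valid_bytes=95, garbage_bytes=5)

collect_everything = relocated_bytes([hot, cold])  # no temperature separation
collect_hot_only = relocated_bytes([hot])          # only the space that needs it
```

Under these assumed figures, collecting only the hot space rewrites 20 units of valid data instead of 115 while still reclaiming most of the garbage, which is the kind of reduction in extra writes that lowers the write amplification factor.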
The present application further provides a non-transitory readable storage medium storing one or more modules (programs); when the one or more modules are applied to a device, the device is caused to execute the instructions of the method steps in the present application.
The present embodiments provide one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an electronic device to perform the data processing methods as described in one or more of the above embodiments. In the embodiment of the application, the electronic device comprises a server, a gateway, a sub-device and the like, wherein the sub-device is a device such as an internet of things device.
Embodiments of the present disclosure may be implemented as an apparatus, which may include electronic devices such as servers (clusters), terminal devices such as IoT devices, and the like, using any suitable hardware, firmware, software, or any combination thereof, for a desired configuration.
Fig. 4 schematically illustrates an example apparatus 1300 that can be used to implement various embodiments described herein.
For one embodiment, fig. 4 illustrates an example apparatus 1300 having one or more processors 1302, a control module (chipset) 1304 coupled to at least one of the processor(s) 1302, memory 1306 coupled to the control module 1304, non-volatile memory (NVM)/storage 1308 coupled to the control module 1304, one or more input/output devices 1310 coupled to the control module 1304, and a network interface 1312 coupled to the control module 1304.
Processor 1302 may include one or more single-core or multi-core processors, and processor 1302 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the apparatus 1300 can be a server device such as a gateway or a controller as described in the embodiments of the present application.
In some embodiments, apparatus 1300 may include one or more computer-readable media (e.g., memory 1306 or NVM/storage 1308) having instructions 1314 and one or more processors 1302, which in combination with the one or more computer-readable media, are configured to execute instructions 1314 to implement modules to perform actions described in this disclosure.
For one embodiment, control module 1304 may include any suitable interface controllers to provide any suitable interface to at least one of the processor(s) 1302 and/or any suitable device or component in communication with control module 1304.
The control module 1304 may include a memory controller module to provide an interface to the memory 1306. The memory controller module may be a hardware module, a software module, and/or a firmware module.
Memory 1306 may be used, for example, to load and store data and/or instructions 1314 for device 1300. For one embodiment, memory 1306 may comprise any suitable volatile memory, such as suitable DRAM. In some embodiments, the memory 1306 may comprise a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, control module 1304 may include one or more input/output controllers to provide an interface to NVM/storage 1308 and input/output device(s) 1310.
For example, NVM/storage 1308 may be used to store data and/or instructions 1314. NVM/storage 1308 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more Hard Disk Drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
NVM/storage 1308 may include storage resources that are physically part of the device on which apparatus 1300 is installed, or it may be accessible by the device and need not be part of the device. For example, NVM/storage 1308 may be accessible over a network via input/output device(s) 1310.
Input/output device(s) 1310 may provide an interface for apparatus 1300 to communicate with any other suitable device, and may include communication components, audio components, sensor components, and so forth. The network interface 1312 may provide an interface for the apparatus 1300 to communicate over one or more networks. The apparatus 1300 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example by accessing a wireless network based on a communication standard such as WiFi, 2G, 3G, 4G, or 5G, or a combination thereof.
For one embodiment, at least one of the processor(s) 1302 may be packaged together with logic for one or more controllers (e.g., memory controller modules) of the control module 1304. For one embodiment, at least one of the processor(s) 1302 may be packaged together with logic for one or more controllers of the control module 1304 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 1302 may be integrated on the same die with logic for one or more controller(s) of the control module 1304. For one embodiment, at least one of the processor(s) 1302 may be integrated on the same die with logic of one or more controllers of the control module 1304 to form a system on chip (SoC).
In various embodiments, apparatus 1300 may be, but is not limited to being: a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.), among other terminal devices. In various embodiments, apparatus 1300 may have more or fewer components and/or different architectures. For example, in some embodiments, device 1300 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
An embodiment of the present application provides an electronic device, including: one or more processors; and one or more machine readable media having instructions stored thereon, which when executed by the one or more processors, cause the processors to perform a data processing method as described in one or more of the embodiments of the present application:
receiving a write request for first data;
writing the first data into a first data stream, and storing the first data stream into a first storage space of a storage medium;
performing garbage collection on the first data, and converting the first data into second data;
writing the second data into a second data stream, and storing the second data stream into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
In an alternative implementation, the processor further performs:
performing garbage collection on the second data, and converting the second data into third data;
writing the third data to a third data stream.
In an alternative implementation, the processor further performs:
and storing the third data stream into a third storage space of the storage medium.
In an alternative implementation, the processor further performs:
reading the first data from the first data stream, and converting the first data into erasure code data;
and generating the second data according to the erasure code data.
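The application does not state which erasure code is used for this conversion. Purely as an assumed stand-in, the sketch below uses the simplest erasure code, a single XOR parity shard over `k` data shards, which can rebuild any one lost shard; production systems more commonly use codes such as Reed-Solomon. The function names `encode` and `recover` are hypothetical.

```python
from functools import reduce
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split `data` into `k` equal shards (zero padded) plus one parity shard."""
    if k < 1:
        raise ValueError("k must be at least 1")
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    parity = reduce(xor_bytes, shards)
    return shards + [parity]


def recover(shards: List[bytes], missing: int) -> bytes:
    """Rebuild the shard at index `missing` by XORing all the other shards."""
    present = [s for i, s in enumerate(shards) if i != missing]
    return reduce(xor_bytes, present)


shards = encode(b"hello world!", k=3)  # 3 data shards + 1 parity shard
rebuilt = recover(shards, missing=1)   # pretend shard 1 was lost
```

Because the parity shard is the XOR of all data shards, XORing the surviving shards reproduces whichever single shard is missing, including the parity shard itself.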
In an alternative implementation, the processor further performs:
and reading the second data from the second data stream, and aggregating the second data into third data.
In an alternative implementation, the processor further performs:
determining a temperature of the first data;
determining a first storage space for storing data of the temperature among a plurality of storage spaces included in the storage medium;
storing the first data stream in the first storage space.
In an alternative implementation, the processor further performs:
acquiring the processing times of garbage collection included in the process of acquiring the first data according to the original data of the first data;
and determining the temperature of the first data according to the processing times.
In an alternative implementation, the processor further performs:
determining a corresponding relation table between the temperature and the storage address of the storage space;
searching whether a storage address corresponding to the temperature exists in the corresponding relation table;
and if the corresponding relation table has a storage address corresponding to the temperature, determining the storage space corresponding to the storage address as a storage space for storing the data of the temperature.
In an alternative implementation, the processor further performs:
if the corresponding relation table does not have a storage address corresponding to the temperature, creating a first storage space in the storage medium;
and determining the created first storage space as a storage space for storing the data of the temperature.
In an alternative implementation, the processor further performs:
acquiring a storage address of the created first storage space;
and forming a corresponding table entry by the temperature and the storage address of the created first storage space, and storing the corresponding table entry in the corresponding relation table.
Because the apparatus embodiments are substantially similar to the method embodiments, they are described only briefly; for relevant details, refer to the corresponding parts of the description of the method embodiments.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the others, and for parts that are the same or similar, the embodiments may be referred to one another.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing terminal to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing terminal, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal to cause a series of operational steps to be performed on the computer or other programmable terminal to produce a computer implemented process such that the instructions which execute on the computer or other programmable terminal provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications of these embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the true scope of the embodiments of the application.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or terminal. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or terminal that comprises the element.
The data processing method and apparatus provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementation of the present application, and the description of the above embodiments is intended only to help in understanding the method and core idea of the present application. Meanwhile, a person skilled in the art may, following the idea of the present application, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (23)

1. A data processing method, comprising:
receiving a write request for first data;
writing the first data into a first data stream and storing the first data into a first storage space of a storage medium;
performing garbage collection on the first data, and converting the first data into second data;
writing the second data into a second data stream and storing the second data into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
2. The method of claim 1, further comprising:
performing garbage collection on the second data, and converting the second data into third data;
writing the third data to a third data stream.
3. The method of claim 2, further comprising:
and storing the third data stream into a third storage space of the storage medium.
4. The method of claim 3, wherein the second data is at a lower temperature than the first data, and the third data is at a lower temperature than the second data.
5. The method of claim 1, wherein performing garbage collection on the first data, converting the first data into second data, comprises:
reading the first data from the first data stream, and converting the first data into erasure code data;
and generating the second data according to the erasure code data.
6. The method of claim 2, wherein performing garbage collection on the second data and converting the second data into third data comprises:
and reading the second data from the second data stream, and aggregating the second data into third data.
7. The method of claim 1, wherein writing the second data into a second data stream and storing the second data into a second storage space of the storage medium comprises:
determining a temperature of the second data;
determining a second storage space for storing data of the temperature among a plurality of storage spaces included in the storage medium;
and writing the second data into a second data stream and storing the second data into the second storage space.
8. The method of claim 7, wherein the determining the temperature of the second data comprises:
acquiring the processing times of garbage collection included in the process of acquiring the second data according to the original data of the second data;
and determining the temperature of the second data according to the processing times.
9. The method according to claim 7, wherein the determining a second storage space for storing the data of the temperature among a plurality of storage spaces included in the storage medium comprises:
determining a corresponding relation table between the temperature and the storage address of the storage space;
searching whether a storage address corresponding to the temperature exists in the corresponding relation table;
and if the corresponding relation table has a storage address corresponding to the temperature, determining the second storage space according to the storage address.
10. The method of claim 9, further comprising:
and if the storage address corresponding to the temperature does not exist in the corresponding relation table, creating the second storage space in the storage medium.
11. The method of claim 10, further comprising:
acquiring a storage address of the created second storage space;
and forming a corresponding table entry by the temperature and the storage address of the created second storage space, and storing the corresponding table entry in the corresponding relation table.
12. A data processing apparatus, comprising:
a receiving module, configured to receive a write request of first data;
a first writing module, configured to write the first data into a first data stream, and a first storing module, configured to store the first data stream into a first storage space of a storage medium;
a first recovery module, configured to perform garbage collection on the first data, and convert the first data into second data;
a second writing module, configured to write the second data into a second data stream, and a second storing module, configured to store the second data stream into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
13. An electronic device, comprising:
a processor; and
a memory having executable code stored thereon that, when executed, causes the processor to perform:
receiving a write request for first data;
writing the first data into a first data stream and storing the first data into a first storage space of a storage medium;
performing garbage collection on the first data, and converting the first data into second data;
writing the second data into a second data stream and storing the second data into a second storage space of the storage medium;
wherein the first data and the second data have different data temperatures.
14. The electronic device of claim 13, wherein the processor further performs:
performing garbage collection on the second data, and converting the second data into third data;
writing the third data to a third data stream.
15. The electronic device of claim 14, wherein the processor further performs:
and storing the third data stream into a third storage space of the storage medium.
16. The electronic device of claim 13, wherein the processor further performs:
reading the first data from the first data stream, and converting the first data into erasure code data;
and generating the second data according to the erasure code data.
17. The electronic device of claim 14, wherein the processor further performs:
and reading the second data from the second data stream, and aggregating the second data into third data.
18. The electronic device of claim 13, wherein the processor further performs:
determining a temperature of the second data;
determining a second storage space for storing data of the temperature among a plurality of storage spaces included in the storage medium;
and writing the second data into a second data stream and storing the second data into the second storage space.
19. The electronic device of claim 18, wherein the processor further performs:
acquiring the processing times of garbage collection included in the process of acquiring the second data according to the original data of the second data;
and determining the temperature of the second data according to the processing times.
20. The electronic device of claim 18, wherein the processor further performs:
determining a correspondence table between temperatures and storage addresses of storage spaces;
searching the correspondence table for a storage address corresponding to the temperature;
and if the correspondence table contains a storage address corresponding to the temperature, determining the second storage space according to that storage address.
21. The electronic device of claim 20, wherein the processor further performs:
if no storage address corresponding to the temperature exists in the correspondence table, creating the second storage space in the storage medium.
22. The electronic device of claim 21, wherein the processor further performs:
acquiring a storage address of the created second storage space;
and forming a table entry from the temperature and the storage address of the created second storage space, and storing the table entry in the correspondence table.
23. One or more machine readable media having executable code stored thereon that, when executed, causes a processor to perform a data processing method as recited in one or more of claims 1-11.
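The temperature-based placement recited in claims 18 to 22 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `StorageMedium`, `space_for`, `write`, and `temperature_from_gc_count` are hypothetical, as are the fixed space size and the rule that the temperature level equals the garbage collection count; the claims only state that the temperature is determined according to that count.

```python
from dataclasses import dataclass, field
from typing import Dict, List


def temperature_from_gc_count(gc_count: int) -> int:
    """Map a garbage collection count to a temperature level.

    Assumption: the level is simply the count. The claims do not fix
    this mapping; any function of the count could be used instead.
    """
    if gc_count < 0:
        raise ValueError("gc_count must be non-negative")
    return gc_count


@dataclass
class StorageMedium:
    """Toy model of a storage medium split into fixed-size storage spaces."""

    space_size: int = 1024  # hypothetical size of one storage space
    next_address: int = 0  # start address for the next created space
    # Correspondence table: temperature -> storage address of its space.
    table: Dict[int, int] = field(default_factory=dict)
    # Contents of each storage space, keyed by storage address.
    spaces: Dict[int, List[bytes]] = field(default_factory=dict)

    def space_for(self, temperature: int) -> int:
        """Look up the storage address for a temperature, creating a space if absent."""
        address = self.table.get(temperature)
        if address is None:
            # No table entry: create a new storage space and record its address.
            address = self.next_address
            self.next_address += self.space_size
            self.spaces[address] = []
            self.table[temperature] = address
        return address

    def write(self, data: bytes, gc_count: int) -> int:
        """Store data in the space matching its temperature and return the address."""
        temperature = temperature_from_gc_count(gc_count)
        address = self.space_for(temperature)
        self.spaces[address].append(data)
        return address


medium = StorageMedium()
fresh = medium.write(b"new block", gc_count=0)
survivor = medium.write(b"collected once", gc_count=1)
also_fresh = medium.write(b"another new block", gc_count=0)
```

In this sketch, data with the same garbage collection count lands in the same storage space, and a new space is created only the first time a temperature is seen, which mirrors the lookup, create, and record steps of claims 20 to 22.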
CN201910493541.5A 2019-06-06 2019-06-06 Data processing method and device Active CN112051964B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910493541.5A CN112051964B (en) 2019-06-06 2019-06-06 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910493541.5A CN112051964B (en) 2019-06-06 2019-06-06 Data processing method and device

Publications (2)

Publication Number Publication Date
CN112051964A (en) 2020-12-08
CN112051964B (en) 2024-08-27

Family

ID=73608906

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910493541.5A Active CN112051964B (en) 2019-06-06 2019-06-06 Data processing method and device

Country Status (1)

Country Link
CN (1) CN112051964B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150347296A1 (en) * 2014-05-30 2015-12-03 Sandisk Enterprise Ip Llc Prioritizing Garbage Collection and Block Allocation Based on I/O History for Logical Address Regions
CN107967125A (en) * 2017-12-20 2018-04-27 北京京存技术有限公司 Management method and device for flash translation layer (FTL), and computer-readable storage medium
CN108196978A (en) * 2017-12-22 2018-06-22 新华三技术有限公司 Data storage method, device, data storage system and readable storage medium
CN108845770A (en) * 2018-06-22 2018-11-20 深圳忆联信息系统有限公司 Method, apparatus and computer device for reducing SSD write amplification
CN109343796A (en) * 2018-09-21 2019-02-15 新华三技术有限公司 Data processing method and device
CN109542358A (en) * 2018-12-03 2019-03-29 浪潮电子信息产业股份有限公司 Solid state disk cold and hot data separation method, device and equipment

Also Published As

Publication number Publication date
CN112051964B (en) 2024-08-27

Similar Documents

Publication Publication Date Title
JP6316974B2 (en) Flash memory compression
US8615499B2 (en) Estimating data reduction in storage systems
US9823945B2 (en) Method and apparatus for managing application program
CN105094709A (en) Dynamic data compression method for solid-state disc storage system
CN104281533A (en) Data storage method and device
US20160117116A1 (en) Electronic device and a method for managing memory space thereof
CN112748863A (en) Method, electronic device and computer program product for processing data
CN104461698A (en) Dynamic virtual disk mounting method, virtual disk management device and distributed storage system
KR101720101B1 (en) Writing method of writing data into memory system and writing method of memory system
US20140258247A1 (en) Electronic apparatus for data access and data access method therefor
CN107423425B (en) Method for quickly storing and inquiring data in K/V format
CN114817978B (en) Data access method and system, hardware unloading device, electronic device and medium
US9791911B2 (en) Determining whether a change in power usage is abnormal when power usage exceeds a threshold based on additional metrics of components in an electronic device
CN110019347B (en) Data processing method and device of block chain and terminal equipment
CN112051964B (en) Data processing method and device
CN106933499B (en) Method and device for improving performance of MLC flash memory system
CN114356591A (en) Inter-process communication method and device, Internet of things operating system and Internet of things equipment
CN110891033B (en) Network resource processing method, device, gateway, controller and storage medium
CN114904216B (en) Feedback enhancement processing method and system for virtual reality treadmill
CN113297267A (en) Data caching and task processing method, device, equipment and storage medium
CN113010113B (en) Data processing method, device and equipment
US11216320B2 (en) Method and apparatus for communication between processes
CN113849524A (en) Data processing method and device
CN116049246A (en) Resource data query method and device, electronic equipment and storage medium
JP6553650B2 (en) Data processing method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant