JP7548609B1 - Information processing device, information processing method, and program

Info

Publication number: JP7548609B1
Application number: JP2023028000A
Authority: JP
Inventors: 俊栗田
Original assignee: NEC Platforms Ltd
Current assignee: NEC Platforms Ltd
Priority date: 2023-02-27
Filing date: 2023-02-27
Publication date: 2024-09-10
Anticipated expiration: 2043-02-27
Also published as: JP2024125451A

Abstract

【課題】ストレージ装置に対するアクセス性能の低下・負荷集中を低減する。【解決手段】ホスト制御部３１は、ホスト装置１０とのデータの送受信を行う。ディスク制御部３２は、ディスク部４０と接続され、複数のＨＤＤ４１、…、４１へのデータの読み書きを行う。複数のＨＤＤ４１、…、４１は、所定の単位毎に、複数のＲＡＩＤ構成４２、…、４２を構成している。重複排除データ監視部３３は、定期的もしくはユーザからの指示があった場合に、重複排除管理テーブル３４を参照し、重複排除データのＲｅａｄ回数およびＷｒｉｔｅ回数に基づいて、複数の論理ディスク５１、５２の中から重複排除データの参照元として最適な論理ディスクを決定し、重複排除データを移動させた後、重複排除管理テーブル３４の情報を更新する。【選択図】図１ [Problem] To reduce degradation of access performance to, and load concentration on, a storage device. [Solution] A host control unit 31 transmits and receives data to and from a host device 10. A disk control unit 32 is connected to a disk unit 40 and reads and writes data to a plurality of HDDs 41, ..., 41. The plurality of HDDs 41, ..., 41 form a plurality of RAID configurations 42, ..., 42, one per predetermined unit. Periodically, or when instructed by a user, a deduplication data monitoring unit 33 refers to a deduplication management table 34, determines, based on the read count and write count of deduplication data, the optimal logical disk among a plurality of logical disks 51, 52 as the reference source of the deduplication data, moves the deduplication data, and then updates the information in the deduplication management table 34. [Selected Figure] FIG. 1

Description

本発明は、情報処理装置、情報処理方法およびプログラムに関する。 The present invention relates to an information processing device, an information processing method, and a program.

近年、ストレージ装置では、取り扱うデータ量の増加に伴い、ボリューム容量の利用効率を向上させるため、ボリュームに格納されるデータのデータ量、または格納するデータのデータ量を、圧縮技術や、重複排除技術などを利用して削減することが広く行われている。また、データの格納に際しては、ストレージ装置内のハードディスクドライブ（ＨＤＤ）や、ソリッドステートドライブ（ＳＳＤ）などに限らず、外部ストレージや、クラウドストレージなどを活用するストレージシステムが普及している。なお、以下では、ストレージシステムに含まれる、ストレージ装置や、外部ストレージ、およびクラウドストレージなどを総称してストレージ装置と称し、ホスト装置に直接接続されたストレージ装置を区別する際には、自ストレージ装置と称する。 In recent years, as the amount of data handled by storage devices has increased, it has become common to use compression, deduplication, and similar technologies to reduce the amount of data stored, or to be stored, in a volume in order to improve the utilization efficiency of volume capacity. In addition, storage systems that store data not only on hard disk drives (HDDs) and solid state drives (SSDs) within the storage device but also on external storage and cloud storage have become widespread. In the following, the storage devices, external storage, cloud storage, and the like included in a storage system are collectively referred to as storage devices, and a storage device directly connected to a host device is referred to as the local storage device when it needs to be distinguished.

このようなストレージシステムにおいても、前述したデータの重複排除技術を複数のストレージ装置を跨いで行うことにより、ストレージシステム全体としてボリューム容量の利用効率の改善が行われている（例えば、特許文献１）。 In such storage systems, the aforementioned data deduplication technology is applied across multiple storage devices, improving the utilization efficiency of the volume capacity of the entire storage system (for example, see Patent Document 1).

特開２０２０－１０６９９９号公報 JP 2020-106999 A

このような重複排除技術を用いたデータ管理では、重複排除したデータへのアクセス状況に応じて、重複排除したデータを移動させる必要性や、そのために重複排除したデータの参照元や、参照先などのアドレス更新などの処理が生じる。しかしながら、特許文献１による重複排除技術では、ストレージ装置間のデータ通信処理の発生や、データ通信の回線遅延の影響を考慮していないため、ストレージ装置に対するアクセス性能の低下・負荷集中が発生してしまうという問題があった。 In data management using such deduplication technology, it is necessary to move the deduplication data depending on the access status of the deduplication data, and therefore processing such as updating the addresses of the reference source and reference destination of the deduplication data occurs. However, the deduplication technology in Patent Document 1 does not take into account the occurrence of data communication processing between storage devices and the effects of line delays in data communication, which causes problems such as reduced access performance and load concentration to the storage devices.

そこで本発明は、ストレージ装置に対するアクセス性能の低下・負荷集中を低減する情報処理装置、情報処理方法およびプログラムを提供することを目的としている。 The present invention aims to provide an information processing device, information processing method, and program that reduce degradation of access performance and load concentration to a storage device.

上述した課題を解決するために、本発明の一態様は、複数の元データ間で重複するデータを、複数の論理ディスクに保存管理する際に、前記複数の論理ディスクのうち、１つの元データに対する１つの論理ディスクには前記重複するデータを重複排除データとして保存し、前記複数の論理ディスクのうち、前記１つの元データ以外の他の元データに対する論理ディスクには前記重複排除データを参照するためのアドレス情報を保存する重複排除手段と、前記複数の論理ディスクに対するアクセス状況を監視する監視手段と、前記アクセス状況に応じて、前記重複排除データの移動先となる論理ディスクを決定する移動先決定手段と、前記決定された前記重複排除データの移動先の論理ディスクに前記重複排除データを移動させる移動手段と、前記決定された前記重複排除データの移動先に応じて、前記重複排除データが保存されている前記論理ディスクに関する情報と、前記他の元データに対する前記論理ディスクに保存された前記重複排除データを参照するためのアドレス情報を更新する設定情報更新手段と、を備えることを特徴とする。 In order to solve the above-mentioned problems, one aspect of the present invention is characterized in that, when storing and managing data that overlaps among multiple original data on multiple logical disks, a deduplication means stores the duplicated data as deduplication data on one logical disk of the multiple logical disks for one of the original data, and stores address information for referencing the deduplication data on other logical disks of the multiple logical disks for original data other than the one original data, a monitoring means for monitoring access status to the multiple logical disks, a destination determination means for determining a logical disk to which the deduplication data is to be moved in accordance with the access status, a moving means for moving the deduplication data to the determined destination logical disk for the deduplication data, and a setting information update means for updating information about the logical disk on which the deduplication data is stored and address information for referencing the deduplication data stored on the logical disk for the other original data in accordance with the determined destination of the deduplication data.

また、本発明の一態様は、複数の元データ間で重複するデータを、複数の論理ディスクに保存管理する際に、前記複数の論理ディスクのうち、１つの元データに対する１つの論理ディスクには前記重複するデータを重複排除データとして保存し、前記複数の論理ディスクのうち、前記１つの元データ以外の他の元データに対する論理ディスクには前記重複排除データを参照するためのアドレス情報を保存するステップと、前記複数の論理ディスクに対するアクセス状況を監視するステップと、前記アクセス状況に応じて、前記重複排除データの移動先となる論理ディスクを決定するステップと、前記決定された前記重複排除データの移動先の論理ディスクに前記重複排除データを移動させるステップと、前記決定された前記重複排除データの移動先に応じて、前記重複排除データが保存されている前記論理ディスクに関する情報と、前記他の元データに対する前記論理ディスクに保存された前記重複排除データを参照するためのアドレス情報を更新するステップと、を含むことを特徴とする。 In addition, one aspect of the present invention is characterized in that, when storing and managing data that overlaps among multiple original data on multiple logical disks, the method includes the steps of: storing the overlapping data as deduplication data on one logical disk among the multiple logical disks for one of the original data, and storing address information for referencing the deduplication data on other logical disks among the multiple logical disks for the original data other than the one original data; monitoring access status to the multiple logical disks; determining a logical disk to which the deduplication data is to be moved according to the access status; moving the deduplication data to the determined logical disk to which the deduplication data is to be moved; and updating information about the logical disk on which the deduplication data is stored and address information for referencing the deduplication data stored on the logical disk for the other original data according to the determined destination of the deduplication data.

また、本発明の一態様は、情報処理装置のコンピュータを、複数の元データ間で重複するデータを、複数の論理ディスクに保存管理する際に、前記複数の論理ディスクのうち、１つの元データに対する１つの論理ディスクには前記重複するデータを重複排除データとして保存し、前記複数の論理ディスクのうち、前記１つの元データ以外の他の元データに対する論理ディスクには前記重複排除データを参照するためのアドレス情報を保存する重複排除機能、前記複数の論理ディスクに対するアクセス状況を監視する監視機能、前記アクセス状況に応じて、前記重複排除データの移動先となる論理ディスクを決定する移動先決定機能、前記決定された前記重複排除データの移動先の論理ディスクに前記重複排除データを移動させる移動機能、前記決定された前記重複排除データの移動先に応じて、前記重複排除データが保存されている前記論理ディスクに関する情報と、前記他の元データに対する前記論理ディスクに保存された前記重複排除データを参照するためのアドレス情報を更新する設定情報更新機能、として機能させることを特徴とする。 In addition, one aspect of the present invention is characterized in that, when storing and managing data that overlaps among multiple original data on multiple logical disks, a computer of an information processing device is configured to function as a deduplication function that stores the duplicated data as deduplication data on one logical disk of the multiple logical disks for one of the original data, and stores address information for referencing the deduplication data on logical disks of the other original data other than the one original data, a monitoring function that monitors the access status of the multiple logical disks, a destination determination function that determines a logical disk to which the deduplication data is to be moved in accordance with the access status, a moving function that moves the deduplication data to the determined destination logical disk for the deduplication data, and a setting information update function that updates information about the logical disk on which the deduplication data is stored and address information for referencing the deduplication data stored on the logical disk for the other original data in accordance with the determined destination of the deduplication data.

以上説明したように、ストレージ装置に対するアクセス性能の低下・負荷集中を低減することができるという利点が得られる。 As explained above, this has the advantage of reducing degradation of access performance and load concentration to storage devices.

本発明の実施形態による情報処理システム１の構成を示すブロック図である。 FIG. 1 is a block diagram showing the configuration of an information processing system 1 according to an embodiment of the present invention.
本実施形態による重複排除管理テーブル３４のデータ構成を示す概念図である。 FIG. 2 is a conceptual diagram showing the data configuration of a deduplication management table 34 according to the present embodiment.
本実施形態による情報処理システム１の動作を説明するためのフローチャートである。 FIG. 3 is a flowchart for explaining the operation of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システム１の第１動作例を説明するための概念図である。 FIG. 4 is a conceptual diagram for explaining a first operation example of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システム１の第１動作例を説明するための概念図である。 FIG. 5 is a conceptual diagram for explaining the first operation example of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システム１の第２動作例を説明するための概念図である。 FIG. 6 is a conceptual diagram for explaining a second operation example of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システム１の第２動作例を説明するための概念図である。 FIG. 7 is a conceptual diagram for explaining the second operation example of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システム１の第３動作例を説明するための概念図である。 FIG. 8 is a conceptual diagram for explaining a third operation example of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システム１の第３動作例を説明するための概念図である。 FIG. 9 is a conceptual diagram for explaining the third operation example of the information processing system 1 according to the present embodiment.
本実施形態による情報処理システムの最小構成を示すブロック図である。 FIG. 10 is a block diagram showing a minimum configuration of the information processing system according to the present embodiment.

以下、本発明の実施の形態を、図面を参照して説明する。 The following describes an embodiment of the present invention with reference to the drawings.

Ａ．実施形態の構成 A. Configuration of the embodiment
図１は、本発明の実施形態による情報処理ステムの構成を示すブロック図である。 FIG. 1 is a block diagram showing the configuration of an information processing system according to an embodiment of the present invention.
図１において、情報処理システム１は、ホスト装置１０、ストレージ装置２０、外部ストレージ２００、クラウドストレージ３００、およびネットワーク４００から構成されている。ホスト装置１０は、直接接続されているストレージ装置２０との間で各種データを送受信する。 In FIG. 1, an information processing system 1 includes a host device 10, a storage device 20, an external storage 200, a cloud storage 300, and a network 400. The host device 10 transmits and receives various data to and from the storage device 20, to which it is directly connected.

ストレージ装置２０は、コントローラ３０とディスク部４０とを有する。コントローラ３０は、ホスト制御部３１、ディスク制御部３２、重複排除データ監視部（重複排除手段、監視手段、移動先決定手段、移動手段、設定情報更新手段）３３、および重複排除管理テーブル３４を有する。ディスク部４０は、複数のＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）４１、…、４１からなる。 The storage device 20 has a controller 30 and a disk unit 40. The controller 30 has a host control unit 31, a disk control unit 32, a deduplication data monitoring unit (deduplication means, monitoring means, destination determination means, movement means, configuration information update means) 33, and a deduplication management table 34. The disk unit 40 is made up of multiple HDDs (Hard Disk Drives) 41, ..., 41.

ホスト制御部３１は、ホスト装置１０と接続され、ホスト装置１０とのデータの送受信を行う。なお、ホスト装置１０は、１つ、もしくは複数であってもよい。ディスク制御部３２は、ディスク部４０と接続され、複数のＨＤＤ４１、…、４１へのデータの読み書きを行う。複数のＨＤＤ４１、…、４１は、所定の単位毎に、複数のＲＡＩＤ構成４２、…、４２を構成している。なお、重複排除データ監視部３３および重複排除管理テーブル３４の詳細について後述する。 The host control unit 31 is connected to the host device 10 and transmits and receives data to and from the host device 10. There may be one or more host devices 10. The disk control unit 32 is connected to the disk unit 40 and reads and writes data to the multiple HDDs 41, ..., 41. The multiple HDDs 41, ..., 41 configure multiple RAID configurations 42, ..., 42 for each specified unit. The deduplication data monitoring unit 33 and the deduplication management table 34 will be described in detail later.

また、ストレージ装置２０は、ネットワーク４００を介して外部ストレージ２００およびクラウドストレージ３００に接続されている。外部ストレージ２００は、大容量の記憶媒体からなるボリューム２１０を有し、クラウドストレージ３００は、大容量の記憶媒体からなるボリューム３１０を有する。 The storage device 20 is also connected to an external storage 200 and a cloud storage 300 via a network 400. The external storage 200 has a volume 210 consisting of a large-capacity storage medium, and the cloud storage 300 has a volume 310 consisting of a large-capacity storage medium.

ストレージ装置２０は、複数のＲＡＩＤ構成４２、…、４２、および上記外部ストレージ２００のボリューム２１０およびクラウドストレージ３００のボリューム３１０において、ＬＤ（論理ディスク）５０を構成している。ＬＤ５０は、ＬＤＮ（論理ディスク番号；ＬＤを識別するための番号）で識別される複数のＬＤ５１、…、５１、５２…５２からなる。図示の例では、ＬＤ５１、…、５１が、各々、ＲＡＩＤ構成４２、…、４２に対応しており、ＬＤ５２、…、５２が、各々、外部ストレージ２００のボリューム２１０およびクラウドストレージ３００のボリューム３１０に対応している。データの実体は、ＲＡＩＤ構成４２、…、４２、外部ストレージ２００のボリューム２１０、およびクラウドストレージ３００のボリューム３１０のいずれかに格納される。なお、ＨＤＤ４１、…、４１は、例えばＳＳＤ（ＳｏｌｉｄＳｔａｔｅＤｉｓｋ）などの他の記憶媒体で構成してもよい。 The storage device 20 configures LDs (logical disks) 50 on the plurality of RAID configurations 42, ..., 42, the volume 210 of the external storage 200, and the volume 310 of the cloud storage 300. The LDs 50 consist of a plurality of LDs 51, ..., 51, 52, ..., 52, each identified by an LDN (logical disk number; a number for identifying an LD). In the illustrated example, the LDs 51, ..., 51 correspond to the RAID configurations 42, ..., 42, respectively, and the LDs 52, ..., 52 correspond to the volume 210 of the external storage 200 and the volume 310 of the cloud storage 300, respectively. The actual data is stored in one of the RAID configurations 42, ..., 42, the volume 210 of the external storage 200, or the volume 310 of the cloud storage 300. Note that the HDDs 41, ..., 41 may be replaced with other storage media such as SSDs (Solid State Disks).

図２は、本実施形態による重複排除管理テーブル３４のデータ構成を示す概念図である。図２において、重複排除管理テーブル３４は、ＬＤ６１（＃０、＃１、…、＃Ｎ）毎のリストで構成されている。リストは、ある単位（３２ＫＢ）毎のＬＢＡ（論理ブロックアドレス）６２、重複排除状態６３、重複排除数（重複排除処理の対象数）６４、参照先ＬＤＮ６５、参照先ＬＢＡ６６、データ格納ストレージ６７、Ｒｅａｄ回数６８、Ｗｒｉｔｅ回数６９で構成され、ＬＢＡ６２毎に各情報を保持している。ＬＢＡ６２は、ＬＤの論理ブロックアドレスを示している。重複排除状態６３には、重複排除データ（実データ）の参照元であるか、参照先であるか、重複排除未実施かの状態を示す情報が格納される。なお、「参照元」は、重複排除データが格納されていることを示し、「参照先」は、重複排除データを参照していることを示す。重複排除数６４は、重複排除処理の対象となる同じデータを示すデータ実態の数を示している。 FIG. 2 is a conceptual diagram showing the data configuration of the deduplication management table 34 according to this embodiment. In FIG. 2, the deduplication management table 34 consists of one list per LD 61 (#0, #1, ..., #N). Each list consists of an LBA (logical block address) 62 for each fixed unit (32 KB), a deduplication status 63, a number of deduplications (the number of targets of deduplication processing) 64, a referenced LDN 65, a referenced LBA 66, a data storage 67, a read count 68, and a write count 69, and holds this information for each LBA 62. The LBA 62 indicates a logical block address of the LD. The deduplication status 63 stores information indicating whether the LBA is a reference source of deduplication data (actual data), is a reference destination, or has not been deduplicated. Note that "reference source" indicates that the deduplication data is stored there, and "reference destination" indicates that the LBA refers to the deduplication data. The number of deduplications 64 indicates the number of data entities that held the same data and are subject to deduplication processing.

なお、重複排除管理テーブル３４は、３２ＫＢごとのボリュームを特定するためのＬＢＡ６２のリストで構成されており、データが重複しているかどうか（重複判定）は、ＬＢＡ６２単位で、すなわち３２ＫＢごとのデータで判定する。また、重複判定は、周知の方法（例えば、米国特許出願公開第２００７／０１０１０７４号明細書で示されている方法）などで行うものとする。重複判定とは、ＬＢＡ６２に含まれるデータが一致しているかどうかの判定である。なお、各ＬＤＮや、ＬＢＡ６２の単位および重複判定方法は、上述した値および方法のみに限定されない。重複判定の結果、複数のＬＢＡ６２でデータが重複していると判定した場合、ストレージ装置２０は重複排除の処理を行う。これにより重複しているデータを記憶する一方のＬＢＡ６２からはデータの実態が削除され、当該一方のＬＢＡ６２は他方のＬＢＡ６２を参照先として、他方のＬＢＡ６２にのみデータの実態が保存される。この場合、他方のＬＢＡ６２が参照元となる。重複排除の処理の結果、重複排除管理テーブル３４における重複排除状態６３のカラムには参照元または参照先のフラグが記録される。参照元はデータの実態を記憶するＬＢＡ６２、参照先は参照元の記憶するデータの実態を参照先として参照しているＬＢＡ６２であることを示す。 The deduplication management table 34 consists of a list of LBAs 62, each identifying a 32 KB region of a volume, and whether data is duplicated (duplication judgment) is judged per LBA 62, that is, per 32 KB of data. The duplication judgment is performed by a known method (for example, the method described in U.S. Patent Application Publication No. 2007/0101074). The duplication judgment determines whether the data contained in the LBAs 62 match. The LDNs, the unit of the LBA 62, and the duplication judgment method are not limited to the values and methods described above. When the duplication judgment finds that multiple LBAs 62 hold the same data, the storage device 20 performs deduplication processing. As a result, the actual data is deleted from one LBA 62 that stored the duplicated data, that LBA 62 refers to the other LBA 62, and the actual data is kept only in the other LBA 62. In this case, the other LBA 62 is the reference source. As a result of the deduplication processing, a reference source or reference destination flag is recorded in the deduplication status 63 column of the deduplication management table 34. A reference source is an LBA 62 that stores the actual data, and a reference destination is an LBA 62 that refers to the actual data stored at the reference source.
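The deduplication processing described above can be sketched as follows. This is a minimal illustration rather than the method of the cited publication: it assumes duplicates are detected by comparing SHA-256 digests of 32 KB blocks, and the names `BLOCK_SIZE`, `deduplicate`, and the dictionary layout are hypothetical.

```python
import hashlib

BLOCK_SIZE = 32 * 1024  # 32 KB unit used for the duplication judgment


def deduplicate(blocks):
    """Deduplicate a mapping {(ldn, lba): bytes}.

    Returns (store, table). `store` keeps actual data only at reference
    sources; `table` records the deduplication status of every address.
    Assumption: blocks with equal SHA-256 digests are treated as duplicates
    (a production system would typically also confirm byte equality).
    """
    by_digest = {}
    for addr in sorted(blocks):
        digest = hashlib.sha256(blocks[addr]).digest()
        by_digest.setdefault(digest, []).append(addr)

    store, table = {}, {}
    for addrs in by_digest.values():
        source = addrs[0]              # first address keeps the actual data
        store[source] = blocks[source]
        if len(addrs) == 1:
            table[source] = {"status": "not performed"}
            continue
        table[source] = {"status": "reference source", "count": len(addrs)}
        for addr in addrs[1:]:         # the others only point at the source
            table[addr] = {
                "status": "reference destination",
                "count": len(addrs),
                "ref": source,
            }
    return store, table
```

Choosing the lowest address as the initial reference source is arbitrary here; the embodiment later relocates the reference source according to access counts.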

参照先ＬＤＮ６５は、重複排除を実施している場合、重複排除データの参照先のＬＤ６１を示す。参照先ＬＢＡ６６は、重複排除を実施している場合に、重複排除データの参照先のＬＢＡ６２を示す。参照先ＬＤＮ６５と参照先ＬＢＡ６６とが、重複排除データが保存されているアドレスを示す。また、データ格納ストレージ６７は、重複排除を実施している場合に、重複排除データが格納されているストレージ装置の識別子を示す。Ｒｅａｄ回数６８およびＷｒｉｔｅ回数６９は、ホスト装置１０等の他の装置から該当ＬＢＡに対する一定期間内のＲｅａｄ（読出し）およびＷｒｉｔｅ（書込み）のアクセス回数が格納される。 The referenced LDN 65 indicates, when deduplication has been performed, the LD 61 holding the deduplication data being referenced. The referenced LBA 66 indicates, when deduplication has been performed, the LBA 62 holding the deduplication data being referenced. Together, the referenced LDN 65 and the referenced LBA 66 give the address at which the deduplication data is stored. The data storage 67 indicates, when deduplication has been performed, the identifier of the storage device in which the deduplication data is stored. The read count 68 and the write count 69 store the number of read and write accesses made to the LBA within a fixed period by other devices such as the host device 10.
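One way to picture a row of the deduplication management table 34 is as a small record type. The sketch below is only an illustration: the class name `DedupEntry`, its field names, and the helper `data_address` are hypothetical, and the status strings follow the English terms used in this description.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DedupEntry:
    """One row of the table, keyed by LD 61 and LBA 62 (hypothetical layout)."""
    ldn: int                     # LD 61 the row belongs to
    lba: int                     # LBA 62 (one 32 KB unit)
    status: str                  # deduplication status 63
    dedup_count: Optional[int]   # number of deduplications 64
    ref_ldn: Optional[int]       # referenced LDN 65 (None when invalid "-")
    ref_lba: Optional[int]       # referenced LBA 66 (None when invalid "-")
    storage: str                 # data storage 67, e.g. "A" or "B"
    reads: int                   # read count 68
    writes: int                  # write count 69

    def data_address(self) -> Tuple[int, int]:
        """Return the (LDN, LBA) pair where the actual data is stored."""
        if self.status == "reference destination":
            return (self.ref_ldn, self.ref_lba)
        return (self.ldn, self.lba)
```

For a reference destination the address is resolved through the referenced LDN and LBA; for any other status the row's own LDN and LBA are used.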

なお、ＬＢＡ（論理ブロックアドレス）とは、ＨＤＤなどの外部記憶装置においてアドレスを指定する方法の一種で、記憶媒体にアクセスできる単位（セクタ）すべてに通し番号のアドレスを振り、その通し番号によって記録位置へのアクセスを可能にする方式のことである。本実施形態においても、ＬＤの番号を示すＬＤＮと、ＬＤ内のアドレスを示すＬＢＡとによってデータが保存されているアドレスを指定するようになっている。 Note that LBA (logical block address) is a method of specifying addresses in external storage devices such as HDDs, in which a serial number address is assigned to every unit (sector) that can be accessed on the storage medium, making it possible to access a recording location using that serial number. In this embodiment, too, the address where data is stored is specified using the LDN, which indicates the LD number, and the LBA, which indicates the address within the LD.

重複排除データ監視部３３は、定期的もしくはユーザからの指示があった場合に、重複排除管理テーブル３４のＬＤ６１およびＬＢＡ６２毎に、重複排除データのＲｅａｄ回数６８およびＷｒｉｔｅ回数６９をチェックする。重複排除データ監視部３３は、Ｒｅａｄ回数６８および／またはＷｒｉｔｅ回数６９に基づいて、重複排除データの参照元として最適なＬＢＡ（論理上のストレージ）６２を決定し、新たに参照元として決定したＬＢＡ６２が、現在の重複排除データの参照元であるＬＢＡ６２と異なる場合は、重複排除データを新たに参照元として決定したＬＢＡ６２で示される保存領域に移動させる。その後、移動した重複排除データの参照元および参照先であるＬＢＡ６２に関して、重複排除状態６３、重複排除数６４、参照先ＬＤＮ６５、参照先ＬＢＡ６６、データ格納ストレージ６７を更新する。 Periodically, or when instructed by the user, the deduplication data monitoring unit 33 checks the read count 68 and write count 69 of the deduplication data for each LD 61 and LBA 62 in the deduplication management table 34. Based on the read count 68 and/or the write count 69, the deduplication data monitoring unit 33 determines the optimal LBA (logical storage) 62 as the reference source of the deduplication data. If the LBA 62 newly determined as the reference source differs from the LBA 62 that is the current reference source, the unit moves the deduplication data to the storage area indicated by the newly determined LBA 62. It then updates the deduplication status 63, the number of deduplications 64, the referenced LDN 65, the referenced LBA 66, and the data storage 67 for the LBAs 62 that are the reference source and reference destinations of the moved deduplication data.

Ｂ．実施形態の動作 B. Operation of the embodiment
図３は、本実施形態による情報処理システム１の動作を説明するためのフローチャートである。また、図４、図５、図６、図７、図８、および図９は、本実施形態による情報処理システム１の具体的な動作例を説明するための概念図である。 FIG. 3 is a flowchart for explaining the operation of the information processing system 1 according to the present embodiment. FIGS. 4, 5, 6, 7, 8, and 9 are conceptual diagrams for explaining specific operation examples of the information processing system 1 according to the present embodiment.

ここで、説明のための具体例として、ストレージ装置２０をストレージＡ、外部ストレージ２００又はクラウドストレージ３００をストレージＢとする。ストレージ装置２０は、ホスト装置１０に直接接続されているので、比較的アクセス頻度が多い重複排除データを格納するために用いられる。一方、外部ストレージ２００又はクラウドストレージ３００は、ネットワーク４００を介してアクセスされるためにアクセス性能の影響を受けやすいので、比較的アクセス頻度が少ない重複排除データを格納するために用いられる。 Here, as a specific example for explanation, the storage device 20 is called storage A, and the external storage 200 or the cloud storage 300 is called storage B. Because the storage device 20 is directly connected to the host device 10, it is used to store deduplication data that is accessed relatively frequently. The external storage 200 or the cloud storage 300, on the other hand, is accessed via the network 400, so its access performance is easily affected; it is therefore used to store deduplication data that is accessed relatively infrequently.

重複排除データ監視部３３は、定期的もしくはユーザが指示を行った場合に、重複排除管理テーブル３４のＬＤ６１およびＬＢＡ６２毎に以下の処理を実行する。重複排除データ監視部３３は、まず、監視対象のＬＤ６１の該当ＬＢＡ６２の重複排除状態６３が「未実施」であるか否かを判断する（ステップＳ１０）。そして、重複排除状態６３が「未実施」である場合には（ステップＳ１０のＹＥＳ）、重複排除データ監視部３３は、監視対象を次のＬＢＡ（次の３２ＫＢ）に設定し（ステップＳ２８）、ステップＳ１０に戻る。 The deduplication data monitoring unit 33 executes the following process for each LD 61 and LBA 62 in the deduplication management table 34 periodically or when instructed by the user. The deduplication data monitoring unit 33 first determines whether the deduplication status 63 of the corresponding LBA 62 of the monitored LD 61 is "not performed" (step S10). If the deduplication status 63 is "not performed" (YES in step S10), the deduplication data monitoring unit 33 sets the monitoring target to the next LBA (next 32 KB) (step S28) and returns to step S10.

一方、重複排除状態６３が「未実施」でない場合、すなわち該当ＬＢＡ６２の重複排除状態６３が「参照元」又は「参照先」である場合には（ステップＳ１０のＮＯ）、重複排除データ監視部３３は、監視対象のＬＢＡ６２とそのＬＢＡ６２に関連して重複排除を実施している他の全てのＬＢＡ６２におけるＲｅａｄ回数６８とＷｒｉｔｅ回数６９（Ｉ／Ｏ特性）を重複排除管理テーブル３４から取得する（ステップＳ１２）。つまり重複排除データ監視部３３は、参照元と参照先の関係にある各ＬＢＡ６２について、Ｒｅａｄ回数６８とＷｒｉｔｅ回数６９（Ｉ／Ｏ特性）を重複排除管理テーブル３４から取得する。 On the other hand, if the deduplication status 63 is not "not performed", that is, if the deduplication status 63 of the LBA 62 is "reference source" or "reference destination" (NO in step S10), the deduplication data monitoring unit 33 obtains from the deduplication management table 34 the read count 68 and write count 69 (I/O characteristics) of the monitored LBA 62 and of all other LBAs 62 deduplicated together with it (step S12). In other words, the deduplication data monitoring unit 33 obtains the read count 68 and write count 69 (I/O characteristics) of every LBA 62 in the same reference source and reference destination relationship.
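Steps S10 and S12 can be sketched as a lookup over the table. The code below is an illustrative assumption, not the embodiment's implementation: the table is modelled as a dictionary keyed by (LDN, LBA), the names `related_group` and `io_characteristics` are hypothetical, and `FIG4_TABLE` holds only the status, reference, and access-count values quoted for FIG. 4 in this description.

```python
# Values quoted for FIG. 4; keys are (LDN, LBA). The dict layout is assumed.
FIG4_TABLE = {
    (0, 1000): {"status": "reference source", "ref": None, "reads": 5, "writes": 0},
    (0, 2000): {"status": "not performed", "ref": None, "reads": 10, "writes": 0},
    (1, 11000): {"status": "reference destination", "ref": (0, 1000), "reads": 0, "writes": 0},
    (1, 12000): {"status": "not performed", "ref": None, "reads": 5, "writes": 5},
    (200, 201000): {"status": "reference destination", "ref": (0, 1000), "reads": 5, "writes": 0},
    (200, 202000): {"status": "not performed", "ref": None, "reads": 10, "writes": 0},
}


def related_group(table, addr):
    """Return every address sharing deduplication data with `addr`."""
    entry = table[addr]
    if entry["status"] == "not performed":      # step S10: nothing to gather
        return []
    # Resolve the reference source, then collect it and all rows pointing at it.
    source = addr if entry["status"] == "reference source" else entry["ref"]
    return sorted(a for a, e in table.items() if a == source or e["ref"] == source)


def io_characteristics(table, addr):
    """Step S12: read and write counts for the whole related group."""
    return {a: (table[a]["reads"], table[a]["writes"]) for a in related_group(table, addr)}
```

Starting from any member of the group, whether the reference source or a reference destination, yields the same set of addresses.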

次に、重複排除データ監視部３３は、Ｒｅａｄ回数６８およびＷｒｉｔｅ回数６９が共に閾値（例えば、Ｒｅａｄ回数閾値が１０、Ｗｒｉｔｅ回数閾値が５）以下であるか否かを判断する（ステップＳ１４）。そして、双方が閾値以下である場合には（ステップＳ１４のＹＥＳ）、重複排除データの参照元を、ネットワーク４００を介してアクセスする外部ストレージ２００又はクラウドストレージ３００、すなわちデータ格納ストレージ６７が「Ｂ」のＬＢＡ６２に決定する（ステップＳ２０）。つまり、アクセスが比較的少ない重複排除データについては、ネットワーク４００を介してアクセスする外部ストレージ２００又はクラウドストレージ３００「Ｂ」（ストレージとしての比較的能力の低いストレージ）に保存したとしても、アクセス性能の低下が発生しにくいためである。なお、ネットワーク４００を介して接続しているストレージであっても、当該ネットワーク回線における単位時間当たりのデータ伝送量が大きい場合は、そのストレージは比較的能力が高いとして、他の比較的能力の低いストレージを新たな参照元として決定してもよい。 Next, the deduplication data monitoring unit 33 determines whether the read count 68 and the write count 69 are both at or below their thresholds (for example, a read count threshold of 10 and a write count threshold of 5) (step S14). If both are at or below the thresholds (YES in step S14), the reference source of the deduplication data is determined to be the external storage 200 or the cloud storage 300 accessed via the network 400, that is, an LBA 62 whose data storage 67 is "B" (step S20). This is because, for deduplication data that is accessed relatively infrequently, access performance is unlikely to degrade even if the data is stored in the external storage 200 or the cloud storage 300 "B" accessed via the network 400 (a storage with relatively low capability). Note that even for a storage connected via the network 400, if the amount of data transferred per unit time over that network line is large, the storage may be regarded as having relatively high capability, and another storage with relatively low capability may be determined as the new reference source.

ここで、現在（重複排除処理後）の重複排除管理テーブル３４が、例えば、図４に示すように、ＬＤ６１「＃０」において、ＬＢＡ６２「１０００」では、重複排除状態６３が「参照元」、重複排除数６４が「３」、データ格納ストレージ６７が「Ａ」、Ｒｅａｄ回数６８が「５」、Ｗｒｉｔｅ回数６９が「０」であるとする。また、ＬＤ６１「＃０」において、ＬＢＡ６２「２０００」では、重複排除状態６３が「未実施」、データ格納ストレージ６７が「Ａ」、Ｒｅａｄ回数６８が「１０」、Ｗｒｉｔｅ回数６９が「０」であるとする。なお、図中、「－」は無効（データなし）を意味する。 Now, assume that the current deduplication management table 34 (after deduplication processing) is, for example, as shown in FIG. 4: in LD 61 "#0", at LBA 62 "1000", the deduplication status 63 is "reference source", the number of deduplications 64 is "3", the data storage 67 is "A", the read count 68 is "5", and the write count 69 is "0". Also assume that in LD 61 "#0", at LBA 62 "2000", the deduplication status 63 is "not performed", the data storage 67 is "A", the read count 68 is "10", and the write count 69 is "0". Note that in the figure, "-" means invalid (no data).

同様に、ＬＤ６１「＃１」において、ＬＢＡ６２「１１０００」では、重複排除状態６３が「参照先」、重複排除数６４が「３」、参照先ＬＤＮが「０」、参照先ＬＢＡが「１０００」、データ格納ストレージ６７が「Ａ」、Ｒｅａｄ回数６８が「０」、Ｗｒｉｔｅ回数６９が「０」であるとする。また、ＬＤ６１「＃１」において、ＬＢＡ６２「１２０００」では、重複排除状態６３が「未実施」、データ格納ストレージ６７が「Ａ」、Ｒｅａｄ回数６８が「５」、Ｗｒｉｔｅ回数６９が「５」であるとする。なお、参照先ＬＤＮが「０」、参照先ＬＢＡが「１０００」は、参照先のＬＤとＬＢＡを示しており、換言すれば、重複排除データの実態が、ＬＤ６１「＃０」のＬＢＡ６２「１０００」に保存されていることを意味する。 Similarly, in LD 61 "#1", at LBA 62 "11000", the deduplication status 63 is "reference destination", the number of deduplications 64 is "3", the referenced LDN is "0", the referenced LBA is "1000", the data storage 67 is "A", the read count 68 is "0", and the write count 69 is "0". Also, in LD 61 "#1", at LBA 62 "12000", the deduplication status 63 is "not performed", the data storage 67 is "A", the read count 68 is "5", and the write count 69 is "5". Note that the referenced LDN "0" and the referenced LBA "1000" indicate the LD and LBA being referenced; in other words, the actual deduplication data is stored at LBA 62 "1000" of LD 61 "#0".

そして、ＬＤ６１「＃２００」において、ＬＢＡ６２「２０１０００」では、重複排除状態６３が「参照先」、重複排除数６４が「３」、参照先ＬＤＮが「０」、参照先ＬＢＡが「１０００」、データ格納ストレージ６７が「Ｂ」、Ｒｅａｄ回数６８が「５」、Ｗｒｉｔｅ回数６９が「０」であるとする。また、ＬＤ６１「＃２００」において、ＬＢＡ６２「２０２０００」では、重複排除状態６３が「未実施」、データ格納ストレージ６７が「Ｂ」、Ｒｅａｄ回数６８が「１０」、Ｗｒｉｔｅ回数６９が「０」であるとする。参照先ＬＤＮ６５が「０」、参照先ＬＢＡ６６が「１０００」については上述した通りである。 In LD 61 "#200", at LBA 62 "201000", the deduplication status 63 is "reference destination", the number of deduplications 64 is "3", the referenced LDN is "0", the referenced LBA is "1000", the data storage 67 is "B", the read count 68 is "5", and the write count 69 is "0". In addition, in LD 61 "#200", at LBA 62 "202000", the deduplication status 63 is "not performed", the data storage 67 is "B", the read count 68 is "10", and the write count 69 is "0". The referenced LDN 65 "0" and the referenced LBA 66 "1000" are as described above.

図４に示す例では、ＬＤ６１「＃０」において、ＬＢＡ６２「１０００」におけるＲｅａｄ回数６８が「５」、Ｗｒｉｔｅ回数６９が「０」である（太線の矩形を参照）。この場合、Ｒｅａｄ回数６８およびＷｒｉｔｅ回数６９が共に閾値以下であるので、アクセス頻度が少ないことから、ネットワークを介して接続される、データ格納ストレージ６７「Ｂ（外部ストレージ２００又はクラウドストレージ３００）」であるＬＤ「＃２００」におけるＬＢＡ６２「２０１０００」を、重複排除データの参照元に決定する（太線の矩形を参照）。 In the example shown in FIG. 4, in LD 61 "#0", the read count 68 at LBA 62 "1000" is "5" and the write count 69 is "0" (see the thick-lined rectangle). Since both the read count 68 and the write count 69 are at or below their thresholds, the access frequency is low, so LBA 62 "201000" in LD 61 "#200", whose data storage 67 is "B" (the external storage 200 or the cloud storage 300) connected via the network, is determined as the reference source of the deduplication data (see the thick-lined rectangle).

次に、重複排除データ監視部３３は、現在の参照元のＬＢＡ６２から新たに参照元に決定したＬＢＡ６２に変更があったか否か、すなわち新たに参照元に決定したＬＢＡ６２の重複排除状態６３が既に「参照元」であるか否かを判断する（ステップＳ２２）。そして、現在の参照元のＬＢＡ６２から新たに参照元に決定したＬＢＡ６２に変更が無かった場合、すなわち新たに参照元に決定したＬＢＡ６２の重複排除状態６３が既に「参照元」である場合には（ステップＳ２２のＮＯ）、重複排除データ監視部３３は、監視対象を次のＬＢＡ（次の３２ＫＢ）に設定し（ステップＳ２８）、ステップＳ１０に戻る。 Next, the deduplication data monitoring unit 33 determines whether there has been a change from the current reference source LBA 62 to the newly determined reference source LBA 62, i.e., whether the deduplication status 63 of the newly determined reference source LBA 62 is already "reference source" (step S22). If there has been no change from the current reference source LBA 62 to the newly determined reference source LBA 62, i.e., if the deduplication status 63 of the newly determined reference source LBA 62 is already "reference source" (NO in step S22), the deduplication data monitoring unit 33 sets the next LBA (next 32 KB) to be monitored (step S28) and returns to step S10.

一方、参照元に決定したＬＢＡ６２の重複排除状態６３が「参照先」であった場合には（ステップＳ２２のＹＥＳ）、重複排除データ監視部３３は、参照元が変更されたと判断し、現在の参照元であるＬＢＡ６２に格納されている重複排除データの実態を、新たに参照元に決定したＬＤ６１のＬＢＡ６２に移動する（ステップＳ２４）。次に、重複排除データ監視部３３は、重複排除管理テーブル３４の各項目を更新する（ステップＳ２６）。 On the other hand, if the deduplication status 63 of the LBA 62 determined as the reference source is "reference destination" (YES in step S22), the deduplication data monitoring unit 33 determines that the reference source has changed and moves the actual deduplication data stored at the current reference source LBA 62 to the LBA 62 of the LD 61 newly determined as the reference source (step S24). Next, the deduplication data monitoring unit 33 updates each item in the deduplication management table 34 (step S26).

図４に示す例では、現在の参照元であるＬＤ６１「＃０」におけるＬＢＡ６２「１０００」に格納されている重複排除データの実態を、新たに参照元に決定したＬＤ６１「＃２００」におけるＬＢＡ６２「２０１０００」で示される保存領域に移動する。また、図５に示す斜線によるハッチングのように、ＬＤ６１「＃０」において、ＬＢＡ６２「１０００」では、重複排除状態６３を「参照元」から「参照先」、参照先ＬＤＮ６５を無効「－」から「２００」、参照先ＬＢＡ６６を無効「－」から「２０１０００」に更新する。また、ＬＤ６１「＃１」において、ＬＢＡ６２「１１０００」では、参照先ＬＤＮ６５を「０」から「２００」、参照先ＬＢＡ６６を「１０００」から「２０１０００」に更新する。そして、ＬＤ６１「＃２００」において、ＬＢＡ６２「２０１０００」では、重複排除状態６３を「参照先」から「参照元」、参照先ＬＤＮ６５を「０」から無効「－」、参照先ＬＢＡ６６を「１０００」から無効「－」に更新する。 In the example shown in FIG. 4, the actual deduplication data stored at LBA 62 "1000" in LD 61 "#0", the current reference source, is moved to the storage area indicated by LBA 62 "201000" in LD 61 "#200", the newly determined reference source. As shown by the diagonal hatching in FIG. 5, for LBA 62 "1000" in LD 61 "#0", the deduplication status 63 is updated from "reference source" to "reference destination", the referenced LDN 65 from invalid "-" to "200", and the referenced LBA 66 from invalid "-" to "201000". For LBA 62 "11000" in LD 61 "#1", the referenced LDN 65 is updated from "0" to "200" and the referenced LBA 66 from "1000" to "201000". For LBA 62 "201000" in LD 61 "#200", the deduplication status 63 is updated from "reference destination" to "reference source", the referenced LDN 65 from "0" to invalid "-", and the referenced LBA 66 from "1000" to invalid "-".
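The move and table update of steps S24 and S26 can be sketched as below. This is an illustration under assumptions: the table and the data store are modelled as dictionaries keyed by (LDN, LBA), `None` stands for the invalid "-" value, and the function name `change_reference_source` is hypothetical.

```python
def change_reference_source(table, store, old, new):
    """Move deduplication data from `old` to `new` and repoint all references.

    `table` maps (ldn, lba) -> {"status": ..., "ref": ...};
    `store` maps the reference source address to the actual data.
    """
    store[new] = store.pop(old)            # step S24: move the actual data

    for addr, entry in table.items():      # step S26: update the table rows
        if addr == new:
            # The new location now holds the data itself.
            entry["status"] = "reference source"
            entry["ref"] = None
        elif addr == old or entry.get("ref") == old:
            # The old source and every other referrer now point at `new`.
            entry["status"] = "reference destination"
            entry["ref"] = new
```

Applied to the FIG. 4 values with `old = (0, 1000)` and `new = (200, 201000)`, the rows end up in the state described for FIG. 5, while rows whose status is "not performed" are left untouched.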

Then, the deduplication data monitoring unit 33 sets the monitoring target to the next LBA (next 32 KB) (step S28) and returns to step S10.

In this way, when the read count 68 and the write count 69 are both at or below the threshold, the reference source of the deduplication data is changed to the external storage 200 or cloud storage 300 accessed via the network 400, where the effect of access performance is small, that is, to LD 61 "#200", whose data storage 67 is "B". Access performance for other data with large read counts 68 and write counts 69 is thereby maintained, which reduces degradation of access performance and load concentration on the storage device.

In step S14 described above, if the read count 68 and the write count 69 are not both at or below the threshold (NO in step S14), it is determined whether the read count 68 is equal to or greater than a threshold (for example, a read count threshold of 50) (step S16). If the read count 68 is equal to or greater than the threshold (YES in step S16), the reference source of the deduplication data is determined to be an LBA 62 in the storage device 20 directly connected to the host device 10, which is relatively easy to access (fast, with relatively high capability), that is, an LBA 62 whose data storage 67 is "A" (step S20). This is because, for deduplication data that is accessed relatively frequently, the storage device 20 directly connected to the host device 10 is less prone to degraded access performance than the external storage 200 or cloud storage 300 "B" accessed via the network 400.

Now, assume that the current deduplication management table 34 (after deduplication processing) is, for example, as shown in FIG. 6: in LD 61 "#0", at LBA 62 "4000", the deduplication status 63 is "reference destination", the deduplication count 64 is "3", the reference destination LDN 65 is "200", the reference destination LBA 66 is "204000", the data storage 67 is "A", the read count 68 is "50", and the write count 69 is "0". Also assume that in LD 61 "#0", at LBA 62 "5000", the deduplication status 63 is "not performed", the data storage 67 is "A", the read count 68 is "10", and the write count 69 is "0". In the figure, "-" means invalid (no data). The reference destination LDN "200" and reference destination LBA "204000" indicate the LD and LBA being referenced; in other words, the deduplication data is stored at LBA 62 "204000" of LDN "#200".

Similarly, in LD 61 "#1", at LBA 62 "14000", the deduplication status 63 is "reference destination", the deduplication count 64 is "3", the reference destination LDN 65 is "200", the reference destination LBA 66 is "204000", the data storage 67 is "A", the read count 68 is "20", and the write count 69 is "0". Also, in LD 61 "#1", at LBA 62 "15000", the deduplication status 63 is "not performed", the data storage 67 is "A", the read count 68 is "5", and the write count 69 is "5". The reference destination LDN 65 of "200" and the reference destination LBA 66 of "204000" are as described above.

Then, in LD 61 "#200", at LBA 62 "204000", the deduplication status 63 is "reference source", the deduplication count 64 is "3", the data storage 67 is "B", the read count 68 is "40", and the write count 69 is "0". Also, in LD 61 "#200", at LBA 62 "205000", the deduplication status 63 is "not performed", the data storage 67 is "B", the read count 68 is "10", and the write count 69 is "0".

The deduplication management table 34 shown in FIG. 6 means that the actual deduplication data is stored in the storage area indicated by LBA 62 "204000" in LD 61 "#200" (data storage "B" = external storage 200 or cloud storage 300), and that this data is referenced from LBA 62 "4000" in LD 61 "#0" and from LBA 62 "14000" in LD 61 "#1".
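How a reference destination resolves to the stored data can be illustrated with a small Python sketch. The row layout, the `locate_data` function, and the reading of the LBA strings as hexadecimal values are assumptions made for this illustration; the embodiment does not specify how the table is represented in memory.

```python
from typing import Dict, Optional, Tuple

Key = Tuple[int, int]  # (logical disk number, LBA)
# Each row: (deduplication status 63, referenced (LDN 65, LBA 66) or None for "-")
Row = Tuple[str, Optional[Key]]

# Rows taken from the FIG. 6 example (LBAs read as hexadecimal).
fig6: Dict[Key, Row] = {
    (0, 0x4000): ("reference destination", (200, 0x204000)),
    (0, 0x5000): ("not performed", None),
    (1, 0x14000): ("reference destination", (200, 0x204000)),
    (200, 0x204000): ("reference source", None),
}


def locate_data(table: Dict[Key, Row], key: Key) -> Key:
    """Return the (LD, LBA) that physically holds the data for `key`."""
    status, target = table[key]
    if status == "reference destination" and target is not None:
        return target
    # Reference sources and non-deduplicated blocks hold their own data.
    return key


where_ld0 = locate_data(fig6, (0, 0x4000))
where_ld1 = locate_data(fig6, (1, 0x14000))
```

In this state, every read of LD "#0", LBA "4000" is served from LD "#200", which sits on the network-attached data storage "B".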

In the example shown in FIG. 6, in LD 61 "#0", the read count 68 at LBA 62 "4000" is "50" (see the thick-lined rectangle). In this case, since the read count 68 is equal to or greater than the threshold, LD "#0", LBA 62 "4000", whose data storage 67 is "A (storage device 20)", is determined as the reference source of the deduplication data (see the thick-lined rectangle).

Next, as described above, the deduplication data monitoring unit 33 determines whether the newly determined reference source LBA 62 differs from the current reference source LBA 62, that is, whether the deduplication status 63 of the newly determined reference source LBA 62 is already "reference source" (step S22). If it is already "reference source" (NO in step S22), the deduplication data monitoring unit 33 sets the monitoring target to the next LBA (next 32 KB) (step S28) and returns to step S10. In other words, if the deduplication status 63 of the newly determined reference source LBA 62 is already "reference source", the deduplication data does not need to be moved, so processing proceeds to the next LBA (next 32 KB) without updating the deduplication management table 34.

On the other hand, if the deduplication status 63 of the LBA 62 determined as the reference source is "reference destination" (YES in step S22), the deduplication data monitoring unit 33 moves the actual deduplication data to the LBA 62 of the LD 61 newly determined as the reference source (step S24), as described above, and updates each item in the deduplication management table 34 (step S26).

In the example shown in FIG. 6, the deduplication data stored at LBA 62 "204000" in LD 61 "#200", which is the current reference source, is moved to the storage area indicated by LBA 62 "4000" in LD 61 "#0", which has been newly determined as the reference source. Also, as shown by the diagonal hatching in FIG. 7, in LD 61 "#0", at LBA 62 "4000", the deduplication status 63 is updated from "reference destination" to "reference source", the reference destination LDN 65 from "200" to invalid "-", and the reference destination LBA 66 from "204000" to invalid "-". In LD 61 "#1", at LBA 62 "14000", the reference destination LDN 65 is updated from "200" to "0" and the reference destination LBA 66 from "204000" to "4000". Then, in LD 61 "#200", at LBA 62 "204000", the deduplication status 63 is updated from "reference source" to "reference destination", the reference destination LDN 65 from invalid "-" to "0", and the reference destination LBA 66 from invalid "-" to "4000".

In this way, when it is determined in step S16 that the read count 68 is equal to or greater than the threshold, the reference source of the deduplication data is changed to the storage device 20, which is directly connected to the host device 10 and whose access performance is unlikely to degrade, that is, to LD 61 "#0", whose data storage 67 is "A". This reduces degradation of access performance and load concentration on the storage device.

In step S16 described above, if the read count 68 is not equal to or greater than the threshold (NO in step S16), it is determined whether the write count 69 for the reference source LBA is equal to or greater than a threshold (for example, a write count threshold of 30) (step S18). If the write count 69 is equal to or greater than the threshold (YES in step S18), the reference source of the deduplication data is determined to be an LBA 62 with a relatively low access frequency (the smallest total of read count and write count) (step S20). That is, when the write count 69 for the reference source LBA is relatively high (higher than the predetermined threshold), the deduplication data is likely to be changed, and once it has been changed, it is no longer appropriate to keep the actual deduplication data there as the reference source. For an LBA 62 that refers to the reference source, a change to the reference source may be inappropriate, and in such a case a new LBA 62 must be determined to hold the reference source. Deduplication processing then has to be executed to move the deduplication data, update the deduplication management table 34, and so on. Therefore, when the write count 69 is relatively high (higher than the predetermined threshold), access performance is less likely to degrade if the reference source of the deduplication data is changed to an LBA that has a large number of deduplicated LBAs and the smallest total of read count and write count.
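The three checks in steps S14, S16, and S18 can be summarized as a single decision function. The sketch below is an illustration under assumptions rather than the implementation of the embodiment: the `Placement` enum and the `choose_placement` function are hypothetical names, the value used for the step S14 threshold is made up for the example, and returning `None` when no check applies is an assumed behavior that the text does not spell out. The read threshold of 50 and the write threshold of 30 are the example values given in the description.

```python
from enum import Enum
from typing import Optional


class Placement(Enum):
    NETWORK_STORAGE = "B"                    # external storage 200 or cloud storage 300
    LOCAL_STORAGE = "A"                      # storage device 20 attached to the host device 10
    LEAST_ACCESSED_LBA = "least accessed"    # LBA with the smallest read + write total


def choose_placement(reads: int, writes: int, *, low_threshold: int,
                     read_threshold: int = 50,
                     write_threshold: int = 30) -> Optional[Placement]:
    """Pick where the reference source should live, following S14 -> S16 -> S18."""
    if reads <= low_threshold and writes <= low_threshold:  # step S14
        return Placement.NETWORK_STORAGE
    if reads >= read_threshold:                             # step S16
        return Placement.LOCAL_STORAGE
    if writes >= write_threshold:                           # step S18
        return Placement.LEAST_ACCESSED_LBA
    return None  # assumption: no relocation when none of the checks applies


LOW = 10  # hypothetical value for the step S14 threshold

fig6_choice = choose_placement(50, 0, low_threshold=LOW)   # LD "#0", LBA "4000" in FIG. 6
fig8_choice = choose_placement(20, 30, low_threshold=LOW)  # LD "#0", LBA "8000" in FIG. 8
```

The order of the checks matters: a block whose read count reaches the read threshold is placed on the local storage device even if its write count is also high, because step S18 is only reached on NO in step S16.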

Now, assume that the current deduplication management table 34 (after deduplication processing) is, for example, as shown in FIG. 8: in LD 61 "#0", at LBA 62 "8000", the deduplication status 63 is "reference source", the deduplication count 64 is "5", the data storage 67 is "A", the read count 68 is "20", and the write count 69 is "30". Also, in LD 61 "#0", at LBA 62 "9000", the deduplication status 63 is "not performed", the data storage 67 is "A", the read count 68 is "5", and the write count 69 is "5". In the figure, "-" means invalid (no data).

Similarly, in LD 61 "#1", at LBA 62 "18000", the deduplication status 63 is "reference destination", the deduplication count 64 is "5", the reference destination LDN 65 is "0", the reference destination LBA 66 is "8000", the data storage 67 is "A", the read count 68 is "20", and the write count 69 is "35". Also, in LD 61 "#1", at LBA 62 "19000", the deduplication status 63 is "not performed", the data storage 67 is "A", the read count 68 is "5", and the write count 69 is "5". The reference destination LDN "0" and reference destination LBA "8000" indicate the LD and LBA being referenced; in other words, the deduplication data is stored at LBA 62 "8000" of LDN "#0".

Then, in LD 61 "#200", at LBA 62 "208000", the deduplication status 63 is "reference destination", the deduplication count 64 is "5", the reference destination LDN 65 is "0", the reference destination LBA 66 is "8000", the data storage 67 is "B", the read count 68 is "5", and the write count 69 is "5". In LD 61 "#200", at LBA 62 "209000", the deduplication status 63 is "reference destination", the deduplication count 64 is "5", the reference destination LDN 65 is "0", the reference destination LBA 66 is "8000", the data storage 67 is "B", the read count 68 is "5", and the write count 69 is "0". In LD 61 "#200", at LBA 62 "20A000", the deduplication status 63 is "reference destination", the deduplication count 64 is "5", the reference destination LDN 65 is "0", the reference destination LBA 66 is "8000", the data storage 67 is "B", the read count 68 is "5", and the write count 69 is "0". The reference destination LDN 65 of "0" and the reference destination LBA 66 of "8000" are as described above.

In the example shown in FIG. 8, in LD 61 "#0", the write count 69 at LBA 62 "8000" is "30" (see the thick-lined rectangle). In this case, since the write count 69 is equal to or greater than the threshold, LBA 62 "20A000", which has a large number of deduplicated LBAs and the smallest total of read count 68 and write count 69, is determined as the reference source of the deduplication data (see the thick-lined rectangle). LBA 62 "209000" would also qualify; here, LBA 62 "20A000" was chosen at random as the reference source of the deduplication data.
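The selection in this example, including the random choice between tied candidates, might look like the following sketch. The function name `pick_least_accessed` and the dictionary layout are assumptions made for illustration, and the LBA strings are read as hexadecimal values. The sketch ranks candidates only by the read and write total; the additional criterion that the number of deduplicated LBAs be large is not modeled, since every candidate in FIG. 8 has the same deduplication count of 5.

```python
import random
from typing import Dict, Tuple

# (read count 68, write count 69) per LBA in the FIG. 8 deduplication group
fig8: Dict[int, Tuple[int, int]] = {
    0x8000: (20, 30),
    0x18000: (20, 35),
    0x208000: (5, 5),
    0x209000: (5, 0),
    0x20A000: (5, 0),
}


def pick_least_accessed(candidates: Dict[int, Tuple[int, int]], rng=random) -> int:
    """Return an LBA whose read + write total is smallest; ties are broken at random."""
    totals = {lba: reads + writes for lba, (reads, writes) in candidates.items()}
    lowest = min(totals.values())
    tied = sorted(lba for lba, total in totals.items() if total == lowest)
    return rng.choice(tied)


picked = pick_least_accessed(fig8)
```

With the FIG. 8 values, LBA "209000" and LBA "20A000" tie with a total of 5, so either may be returned. Passing a seeded `random.Random` instance as `rng` makes the choice reproducible.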

On the other hand, if the deduplication status 63 of the LBA 62 determined as the reference source is "reference destination" (YES in step S22), the deduplication data monitoring unit 33 moves the deduplication data to the LBA 62 of the LD 61 newly determined as the reference source (step S24), as described above, and updates each item in the deduplication management table 34 (step S26).

In the example shown in FIG. 8, the deduplication status 63 of the LBA 62 newly determined as the reference source is "reference destination", so the deduplication data stored at LBA 62 "8000" in LD 61 "#0", the current reference source, is moved to the storage area indicated by LBA 62 "20A000" in LD 61 "#200", the newly determined reference source.

Also, as shown by the diagonal hatching in FIG. 9, in LD 61 "#0", at LBA 62 "8000", the deduplication status 63 is updated from "reference source" to "reference destination", the reference destination LDN 65 from invalid "-" to "200", and the reference destination LBA 66 from invalid "-" to "20A000". In LD 61 "#1", at LBA 62 "18000", the reference destination LDN 65 is updated from "0" to "200" and the reference destination LBA 66 from "8000" to "20A000".

In addition, in LD 61 "#200", at LBA 62 "208000", the reference destination LDN 65 is updated from "0" to "200" and the reference destination LBA 66 from "8000" to "20A000". In LD 61 "#200", at LBA 62 "209000", the reference destination LDN 65 is updated from "0" to "200" and the reference destination LBA 66 from "8000" to "20A000". Then, in LD 61 "#200", at LBA 62 "20A000", the deduplication status 63 is updated from "reference destination" to "reference source", the reference destination LDN 65 from "0" to invalid "-", and the reference destination LBA 66 from "8000" to invalid "-".

In this way, when the write count 69 is equal to or greater than the threshold, the reference source of the deduplication data is changed to an LBA that has a large number of deduplicated LBAs and the smallest total of read count 68 and write count 69. This reduces how often deduplication processing must be executed because the deduplication data has changed and, as a result, reduces degradation of access performance and load concentration on the storage device.

In the embodiment described above, the read count 68 and/or the write count 69 are used to determine the reference source. However, the determination is not limited to this and may use any parameter that affects access performance, such as the load state of the storage device 20 or the access frequency of each host device 10.

According to the embodiment described above, the deduplication data monitoring unit 33 determines an appropriate logical disk 50 as the destination of the deduplication data in accordance with a priority order based on the access status of the logical disks 50, so the concentration of access on a specific storage device can be reduced. The priority order can be set as follows according to the access status of the logical disks 50 (read count and/or write count).

According to the embodiment described above, when accesses to the deduplication data (read count and write count) are equal to or less than the first threshold, the deduplication data is moved to the external storage 200 or cloud storage 300 connected via the network, so the concentration of access on the local storage device 20 can be reduced.

Furthermore, according to the embodiment described above, when accesses to the deduplication data (read count) are equal to or greater than the second threshold, the deduplication data is moved to the local storage device 20, which can send and receive data without going through a network that may introduce line delays, so a decrease in access performance can be avoided.

In addition, when accesses to the deduplication data (write count) are equal to or greater than the third threshold, the deduplication data is moved to an LBA of the logical disk 50 that has a large number of deduplicated LBAs and the smallest total of read count 68 and write count 69. This reduces how often deduplication processing must be executed because the deduplication data has changed and, as a result, reduces degradation of access performance and load concentration on the storage device as a whole.

FIG. 10 is a block diagram showing the minimum configuration of an information processing system according to this embodiment.
The information processing system according to this embodiment need only include at least a plurality of logical disks 90, 90, ..., 90, a deduplication means 100, a monitoring means 101, a destination determination means 102, a moving means 103, and a setting information update means 104. The plurality of logical disks 90, 90, ..., 90 are constructed from a plurality of physical storage devices. When data that is duplicated among a plurality of original data is stored and managed on the plurality of logical disks 90, 90, ..., 90, the deduplication means 100 stores the duplicated data as deduplication data on one logical disk 90, among the plurality of logical disks 90, 90, ..., 90, for one of the original data, and stores address information for referencing the deduplication data on the logical disks 90, among the plurality of logical disks 90, 90, ..., 90, for the original data other than the one original data. The monitoring means 101 monitors the access status of the plurality of logical disks 90, 90, ..., 90. The destination determination means 102 determines, in accordance with the access status, the logical disk 90 to which the deduplication data is to be moved. The moving means 103 moves the deduplication data to the determined destination logical disk 90. The setting information update means 104 updates, in accordance with the determined destination of the deduplication data, the information regarding the logical disk 90 on which the deduplication data is stored and the address information, stored on the logical disks 90 for the other original data, for referencing the deduplication data.

Although several embodiments of the present invention have been described above, the present invention is not limited to these and includes the inventions described in the claims and their equivalents.
The inventions described in the claims of this application are set forth below.

(Appendix 1)
An information processing device comprising: deduplication means for, when data duplicated among a plurality of original data is stored and managed on a plurality of logical disks, storing the duplicated data as deduplication data on one logical disk, among the plurality of logical disks, for one of the original data, and storing address information for referencing the deduplication data on the logical disks, among the plurality of logical disks, for the original data other than the one original data; monitoring means for monitoring the access status of the plurality of logical disks; destination determination means for determining, in accordance with the access status, a logical disk to which the deduplication data is to be moved; moving means for moving the deduplication data to the determined destination logical disk; and setting information update means for updating, in accordance with the determined destination of the deduplication data, information regarding the logical disk on which the deduplication data is stored and the address information, stored on the logical disks for the other original data, for referencing the deduplication data.

(Appendix 2)
The information processing device according to Appendix 1, wherein the destination determination means determines a logical disk with relatively low access performance as the destination of the deduplication data when the number of reads and the number of writes for the plurality of logical disks are equal to or less than a predetermined first threshold.

(Appendix 3)
The information processing device according to Appendix 1 or Appendix 2, wherein the logical disk with relatively low access performance is a storage device connected via a network.

(Appendix 4)
The information processing device according to any one of Appendices 2 to 3, wherein the destination determination means determines a logical disk with relatively high access performance as the destination of the deduplication data when the number of reads for the plurality of logical disks is equal to or greater than a predetermined second threshold.

(Appendix 5)
The information processing device according to any one of Appendices 2 to 4, wherein the logical disk with relatively high access performance is a storage device that is directly connected without going through a network.

(Appendix 6)
The information processing device according to any one of Appendices 1 to 5, wherein, when both the number of reads and the number of writes for the plurality of logical disks are equal to or greater than a predetermined third threshold, the destination determination means determines, as the destination of the deduplication data, a logical disk that has a large number of deduplicated data items and the smallest total of the number of reads and the number of writes.

(Appendix 7)
The information processing device according to any one of Appendices 1 to 6, wherein the destination determination means determines a logical disk with relatively low access performance as the destination of the deduplication data when the number of reads and the number of writes for the plurality of logical disks are equal to or less than a predetermined first threshold, determines a logical disk with relatively high access performance as the destination of the deduplication data when the number of reads for the plurality of logical disks is equal to or greater than a predetermined second threshold, and determines, as the destination of the deduplication data, a logical disk that has a large number of deduplicated data items and the smallest total of the number of reads and the number of writes when both the number of reads and the number of writes for the plurality of logical disks are equal to or greater than a predetermined third threshold.

A program for implementing the functions of the processing units in the present invention may be recorded on a computer-readable recording medium, and the control processing of the bonus information may be performed by having a computer system read and execute the program recorded on the recording medium. The term "computer system" as used here includes an OS and hardware such as peripheral devices. The "computer system" may also include a plurality of computer devices connected via a network that includes communication lines such as the Internet, a WAN, a LAN, or a dedicated line. The term "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" also includes media that hold a program for a certain period of time, such as the volatile memory (RAM) inside a computer system that acts as a server or client when the program is transmitted over a network. The program may implement only part of the functions described above. Furthermore, the program may be a so-called differential file (differential program), which implements the functions described above in combination with a program already recorded in the computer system.

Some or all of the functions described above may be realized as an integrated circuit such as an LSI (Large Scale Integration). Each function may be implemented as an individual processor, or some or all of the functions may be integrated into a single processor. The method of circuit integration is not limited to LSI; a dedicated circuit or a general-purpose processor may be used instead. Furthermore, if advances in semiconductor technology produce an integrated circuit technology that replaces LSI, an integrated circuit based on that technology may be used.

REFERENCE SIGNS LIST
1 Information processing system
10 Host device
20 Storage device
30 Controller
31 Host control unit
32 Disk control unit
33 Deduplication data monitoring unit
34 Deduplication management table
40 Disk unit
41 HDD
42 RAID configuration
50 LD (logical disk)
51, 52 LD (logical disk)
200 External storage
210 Volume
300 Cloud storage
310 Volume
61 LD (logical disk)
62 LBA (logical block address)
63 Deduplication status
64 Number of deduplications
65 Reference LDN (logical disk number)
66 Reference LBA (logical block address)
67 Data storage location
68 Read count
69 Write count
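
As an illustration only, reference signs 61 to 69 can be read as the columns of one row of the deduplication management table 34. The following Python sketch models such a row under that reading. The class name `DedupTableEntry`, the helper `record_access`, the field names, and the field types are all hypothetical; the document itself only names the columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DedupTableEntry:
    """One row of a deduplication management table (34); field meanings are assumed."""
    ld: int                  # 61: logical disk the row belongs to
    lba: int                 # 62: logical block address on that disk
    deduplicated: bool       # 63: deduplication status
    dedup_count: int         # 64: number of deduplications
    ref_ldn: Optional[int]   # 65: reference logical disk number, if any
    ref_lba: Optional[int]   # 66: reference logical block address, if any
    storage: str             # 67: storage that holds the data
    read_count: int = 0      # 68: read count
    write_count: int = 0     # 69: write count


def record_access(entry: DedupTableEntry, is_write: bool) -> None:
    """Increment the counter that a monitoring unit could later inspect."""
    if is_write:
        entry.write_count += 1
    else:
        entry.read_count += 1


# Hypothetical row: a block on LD 51 whose data is referenced from LD 52.
row = DedupTableEntry(ld=51, lba=0, deduplicated=True, dedup_count=2,
                      ref_ldn=52, ref_lba=128, storage="local")
record_access(row, is_write=False)
record_access(row, is_write=False)
record_access(row, is_write=True)
```

Keeping the counters on the row itself is one possible design; it lets a periodic scan of the table decide on migration without consulting a separate statistics store.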

Claims

1. An information processing device comprising:
a deduplication means for storing and managing, on a plurality of logical disks, duplicate data among a plurality of original data, the deduplication means storing the duplicate data as deduplication data in one logical disk, among the plurality of logical disks, corresponding to one of the original data, and storing address information for referencing the deduplication data in the other logical disks, among the plurality of logical disks, corresponding to the original data other than the one of the original data;
a monitoring means for monitoring an access status of the plurality of logical disks;
a destination determination means for determining, in accordance with the access status, a destination logical disk to which the deduplication data is to be moved;
a migration means for migrating the deduplication data to the determined destination logical disk; and
a configuration information update means for updating, in accordance with the determined destination of the deduplication data, information about the logical disk in which the deduplication data is stored and the address information, stored on the logical disks for the other original data, for referencing the deduplication data.

2. The information processing device according to claim 1, wherein the destination determination means determines a logical disk determined to have low access performance as the destination of the deduplication data when the number of reads and writes to the plurality of logical disks is equal to or less than a predetermined first threshold.

3. The information processing device according to claim 2, wherein the logical disk determined to have low access performance is a storage device connected via a network.

4. The information processing device according to claim 1, wherein the destination determination means determines a logical disk that is determined to have high access performance as the destination of the deduplication data when the number of reads from the plurality of logical disks is equal to or greater than a predetermined second threshold.

5. The information processing device according to claim 4, wherein the logical disk determined to have high access performance is a storage device that is directly connected without going through a network.

6. The information processing device according to claim 1, wherein, when both the number of reads and the number of writes for the plurality of logical disks are equal to or greater than a predetermined third threshold, the destination determination means determines, as the destination of the deduplication data, a logical disk having a large number of deduplication data and the smallest total number of reads and writes.

7. The information processing device according to claim 1, wherein the destination determination means:
determines, when the number of reads and writes to the plurality of logical disks is equal to or less than a predetermined first threshold, a logical disk determined to have low access performance as the destination of the deduplication data;
determines, when the number of reads from the plurality of logical disks is equal to or greater than a predetermined second threshold, a logical disk determined to have high access performance as the destination of the deduplication data; and
determines, when both the number of reads and the number of writes for the plurality of logical disks are equal to or greater than a predetermined third threshold, a logical disk having a large number of deduplication data and the smallest total number of reads and writes as the destination of the deduplication data.
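
The three threshold conditions recited for the destination determination means can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation. The names `LogicalDisk` and `choose_destination`, the order in which overlapping conditions are checked, the reading of "a large number of deduplication data" as "the largest deduplication count", and the choice of the least-loaded disk within a performance tier are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LogicalDisk:
    ldn: int                 # logical disk number
    high_performance: bool   # assumed flag: True for directly attached storage
    dedup_count: int         # amount of deduplication data on this disk
    reads: int               # reads observed on this disk
    writes: int              # writes observed on this disk


def choose_destination(disks: Sequence[LogicalDisk], reads: int, writes: int,
                       t1: int, t2: int, t3: int) -> Optional[LogicalDisk]:
    """Pick a destination for deduplication data from its read/write counts.

    Returns None when no condition applies or no suitable disk exists.
    """
    def least_loaded(candidates):
        return min(candidates, key=lambda d: d.reads + d.writes, default=None)

    # Assumed priority: the third condition is checked first because it
    # overlaps with the second one whenever t3 >= t2.
    if reads >= t3 and writes >= t3:
        # Largest deduplication count first, then smallest total reads + writes.
        return min(disks, key=lambda d: (-d.dedup_count, d.reads + d.writes),
                   default=None)
    if reads >= t2:
        return least_loaded([d for d in disks if d.high_performance])
    if reads <= t1 and writes <= t1:
        return least_loaded([d for d in disks if not d.high_performance])
    return None


disks = [
    LogicalDisk(ldn=0, high_performance=True, dedup_count=5, reads=100, writes=50),
    LogicalDisk(ldn=1, high_performance=True, dedup_count=5, reads=20, writes=10),
    LogicalDisk(ldn=2, high_performance=False, dedup_count=1, reads=3, writes=2),
]
cold = choose_destination(disks, reads=1, writes=1, t1=10, t2=100, t3=1000)
hot_read = choose_destination(disks, reads=500, writes=0, t1=10, t2=100, t3=1000)
hot_both = choose_destination(disks, reads=2000, writes=2000, t1=10, t2=100, t3=1000)
```

With these hypothetical inputs, rarely accessed data goes to the low-performance disk 2, read-heavy data goes to the less busy high-performance disk 1, and data that is both read and written heavily goes to disk 1 because it ties for the largest deduplication count and has the smaller total load.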

8. An information processing method comprising:
storing and managing, on a plurality of logical disks, duplicate data among a plurality of original data, by storing the duplicate data as deduplication data in one logical disk, among the plurality of logical disks, for one of the original data, and storing address information for referencing the deduplication data in the other logical disks, among the plurality of logical disks, for the original data other than the one of the original data;
monitoring an access status of the plurality of logical disks;
determining, in accordance with the access status, a destination logical disk to which the deduplication data is to be moved;
migrating the deduplication data to the determined destination logical disk; and
updating, in accordance with the determined destination of the deduplication data, information about the logical disk in which the deduplication data is stored and the address information, stored on the logical disks for the other original data, for referencing the deduplication data.
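
The migrating and updating steps can be illustrated with a small in-memory model. In this Python sketch, a block store maps an address (logical disk number, logical block address) to data, and each reference records where the shared deduplication data currently lives. The names `Address`, `Reference`, and `migrate`, and the copy-then-repoint-then-delete ordering, are assumptions for illustration and are not taken from the document.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Address = Tuple[int, int]  # (logical disk number, logical block address)


@dataclass
class Reference:
    owner: Address    # where the original data logically resides
    target: Address   # where the shared deduplication data is stored


def migrate(blocks: Dict[Address, bytes], refs: List[Reference],
            src: Address, dst: Address) -> None:
    """Move deduplication data from src to dst and repoint every reference."""
    if src == dst:
        return
    if dst in blocks:
        raise ValueError("destination block is already in use")
    blocks[dst] = blocks[src]      # copy first so the data is never missing
    for ref in refs:
        if ref.target == src:
            ref.target = dst       # update the address information
    del blocks[src]                # release the old location last


# Hypothetical state: two originals share one block stored on LD 51.
blocks = {(51, 0): b"shared"}
refs = [
    Reference(owner=(51, 0), target=(51, 0)),
    Reference(owner=(52, 7), target=(51, 0)),
    Reference(owner=(52, 9), target=(52, 9)),  # unrelated entry, left untouched
]
migrate(blocks, refs, src=(51, 0), dst=(52, 3))
```

Copying before deleting keeps the data readable at one address throughout the move, at the cost of briefly holding two copies.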

9. A program that causes a computer of an information processing device to implement:
a deduplication function for storing and managing, on a plurality of logical disks, duplicate data among a plurality of original data, the duplicate data being stored as deduplication data in one logical disk, among the plurality of logical disks, for one of the original data, and address information for referencing the deduplication data being stored in the other logical disks, among the plurality of logical disks, for the original data other than the one of the original data;
a monitoring function for monitoring an access status of the plurality of logical disks;
a destination determination function for determining, in accordance with the access status, a destination logical disk to which the deduplication data is to be migrated;
a migration function for migrating the deduplication data to the determined destination logical disk; and
a configuration information update function for updating, in accordance with the determined destination of the deduplication data, information about the logical disk in which the deduplication data is stored and the address information, stored on the logical disks for the other original data, for referencing the deduplication data.