JP2008052407A - Cluster system

Info

Publication number: JP2008052407A
Application number: JP2006226364A
Authority: JP
Inventors: Yusuke Kaneki; 佑介金木
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2006-08-23
Filing date: 2006-08-23
Publication date: 2008-03-06

Abstract

PROBLEM TO BE SOLVED: In a conventional cluster system, a failure of one service affects the other services and the cluster manager; the applications and the cluster manager must both support the OS in use, which reduces versatility; and a failure processing mechanism that detects software failures from the type of interrupt cannot detect some failures.
SOLUTION: The cluster system comprises a plurality of computers, a virtual machine installed on each computer, a host OS, a cluster manager, zero or more guest OSs, and a shared disk that holds OS images and is accessible from each computer. The cluster manager starts and stops each guest OS, monitors the status of each service, recovers a failed service, exchanges heartbeats, and, on detecting a failure of another cluster manager, fails over that manager's services using the OS images on the shared disk. Each guest OS is dedicated to a specific service and is managed as a single service.
COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a cluster system built from a plurality of computers, each having a virtual machine, and to a technique for improving the reliability of such a system.

FIG. 8 shows a conventional cluster system. An operating system 15A runs on the computer 10A with a cluster manager 20A running on top of it, and an operating system 15B runs on the computer 10B with a cluster manager 20B, forming a two-node cluster system. The cluster managers are connected to each other by a heartbeat 90.
Applications 30A to 30C run on the operating system 15A; 30A and 30B provide a service 31A, and 30C provides a service 31B. The computers 10A and 10B and clients 40A and 40B are connected by a LAN 91, and the clients 40A and 40B access the services 31A and 31B.

Next, the operation will be described.
The cluster manager 20A monitors the services 31A and 31B and the cluster manager 20B. When it detects a failure of the service 31A, it stops the applications 30A and 30B and requests the cluster manager 20B to start them, so that the service 31A is restored on the computer 10B.
If the failure of the service 31A interferes with the operation of the OS, for example by monopolizing the CPU, the service 31B and the cluster manager 20A also become unstable. In this case, the cluster manager 20B detects the failure of the cluster manager 20A, stops the computer 10A, and restores the services 31A and 31B on the computer 10B.
Operating in this way improves the reliability of the system.

FIG. 9 is a configuration diagram of a conventional cluster system using logical computers, described in JP 2003-330740 A. Computer resource partitioning mechanisms 2A and 2B reside on physical computers 1A and 1B, and logical computers 3A to 3D run on them. The computer resource partitioning mechanisms 2A and 2B contain the functions of the cluster managers 20A and 20B, and the logical computers 3A to 3D form a hot standby cluster system in which, for example, the logical computers 3A and 3C are paired and the logical computers 3B and 3D are paired, as in FIG. 9. FIG. 10 shows the configuration of a cluster manager that implements this conventional method. The cluster manager has a logical computer allocation mechanism, a failure processing mechanism, a physical computer configuration information table, a logical computer configuration information table, and a cluster table.

Next, the operation will be described. The failure processing mechanism detects the interrupt raised when a failure occurs. If the type of interrupt indicates a software failure, the logical computer that raised the interrupt is restarted. If a hardware failure is detected, the logical computer assigned to the failed processor is reassigned to another processor and restarted. If the entire physical computer 1A stops because of a failure, the logical computers 3C and 3D on the physical computer 1B take over the work of the logical computers 3A and 3B and restore it.
The prior art thus describes a mechanism that restarts logical computers appropriately in response to hardware failures on a physical computer or software failures on a logical computer, improving the reliability of the whole system.

JP 2003-330740 A

Conventional cluster systems have the following problems.
(1) In the prior art, the applications that provide each service and the cluster manager run on the same OS, so a failure of one service affects the other services and the cluster manager. This can be avoided by assigning each service to its own computer, but the system then becomes complex and its management and introduction costs increase.
(2) In the prior art, because the applications that provide each service and the cluster manager run on the same OS, both must support the OS in use. This narrows the choice of applications, cluster managers, and OSs, and increases system construction cost.
(3) In JP 2003-330740 A, the cluster manager is implemented inside the computer resource partitioning mechanism, so the two are tightly coupled and versatility is lost. This increases the cost of functional extension and maintenance.
(4) In JP 2003-330740 A, the failure processing mechanism detects software failures of a logical computer from the type of interrupt, so it cannot detect failures in which the application runs normally at the interrupt level but fails to provide its service.

The cluster system according to the present invention comprises:
a plurality of computers; a virtual machine installed on each of the computers; one host OS; a cluster manager that runs only on the host OS;
a guest OS, on at least one of the computers, that runs an application for a service provided to the outside; and
a shared disk that holds the OS image of the guest OS and is accessible from each computer.
The cluster manager uses the functions of the virtual machine to start and stop each guest OS and to monitor the status of each service; on detecting a service failure, it recovers the service according to a preset failover policy. The cluster managers also have a heartbeat function for monitoring each other's status; on detecting a failure of another cluster manager, a cluster manager performs a failover that recovers all services managed by the failed cluster manager using the OS images on the shared disk.
Each guest OS is dedicated to a specific service and is associated with a failover policy that specifies how that service is to be recovered; the guest OS itself is regarded as one service and is managed by the cluster manager.

According to the cluster system of the present invention, because the cluster manager monitors the services and the other cluster managers, a software or hardware failure can be detected and the service recovered. Assigning each service to its own OS prevents a failure of one service from affecting other services. This improves the reliability of the system.
Furthermore, running each OS as a guest OS on a virtual machine avoids an unnecessary increase in the number of computers.

Embodiment 1.
FIG. 1 shows a two-node, shared-disk hot standby cluster system composed of computers 10A and 10B and a shared disk 81.
On the computer 10A, a virtual machine 90A runs, together with a guest OS 12A, a guest OS 12B, and a host OS 11A. The computer 10A also has storage (not shown) holding OS images (not shown) of the guest OS 12A and the guest OS 12B. Applications 30A and 30B run on the guest OS 12A and provide a service 31A. An application 30C runs on the guest OS 12B and provides a service 31B. A cluster manager 20A runs on the host OS 11A and can access the virtual machine 90A through a virtual machine control I/F 91A. The cluster manager 20A also holds a failover policy 60A that describes how the services 31A and 31B are to be recovered.

On the computer 10B, a virtual machine 90B runs, together with a host OS 11B. The computer 10B also has storage (not shown) holding OS images (not shown) of the guest OS 12A and the guest OS 12B. A cluster manager 20B runs on the host OS 11B and can access the virtual machine 90B through a virtual machine control I/F 91B. The cluster manager 20B also holds a failover policy 60B that describes how the services 31A and 31B are to be recovered.

The shared disk 81 is connected to the computers 10A and 10B and holds OS images 13A and 13B of the guest OSs.
The computers 10A and 10B are connected by a heartbeat 90, and the cluster managers 20A and 20B communicate with each other over the heartbeat 90. The computers 10A and 10B and clients 40A and 40B are connected by a LAN 91, through which the clients 40A and 40B can reach the services 31A and 31B and the host OSs 11A and 11B.

Next, the operation will be described.
The cluster manager 20A accesses the services 31A and 31B over the LAN 91 at regular intervals to check for failures.
When the cluster manager 20A detects a failure of the service 31A, it uses the virtual machine control I/F 91A to stop the guest OS 12A on which the service 31A runs. Once the guest OS 12A has stopped, the cluster manager 20A starts the guest OS 12A on the computer 10A or the computer 10B, according to the failover policy 60A.
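The periodic service check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `monitor_once` and `monitor_loop`, the probe callables, and the `rounds` parameter are all assumptions introduced here, and a real probe would issue an actual request to the service over the LAN.

```python
import time
from typing import Callable, Dict, List, Optional


def monitor_once(probes: Dict[str, Callable[[], bool]]) -> List[str]:
    """Probe each service once and return the names of those that failed."""
    failed = []
    for service, probe in probes.items():
        try:
            healthy = probe()
        except Exception:
            # Assumption: a probe that raises (timeout, refused connection)
            # is treated the same as an incorrect response.
            healthy = False
        if not healthy:
            failed.append(service)
    return failed


def monitor_loop(
    probes: Dict[str, Callable[[], bool]],
    on_failure: Callable[[str], None],
    interval_s: float = 5.0,
    rounds: Optional[int] = None,
) -> None:
    """Run monitor_once every interval_s seconds; rounds=None runs forever."""
    done = 0
    while rounds is None or done < rounds:
        for service in monitor_once(probes):
            on_failure(service)  # e.g. stop the guest OS and start recovery
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
```

Passing the probes and the failure handler as callables keeps the loop independent of any particular service protocol or virtual machine control interface.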

If the failover policy 60A specifies that the guest OS 12A be started on the computer 10A, the cluster manager 20A uses the virtual machine control I/F 91A to start the guest OS 12A from the OS image 13A on the shared disk 81; the service 31A is restored and service to the clients 40A and 40B resumes.
If the failover policy 60A specifies the computer 10B, the cluster manager 20A requests the cluster manager 20B to start the guest OS 12A on the computer 10B.
On receiving the request, the cluster manager 20B uses the virtual machine control I/F 91B to start the guest OS 12A from the OS image in the storage of the computer 10B, and the service 31A is restored.
The cluster managers 20A and 20B also access each other over the heartbeat 90 at regular intervals to check for failures.
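The policy-driven recovery of a failed service described above might look like the following sketch. The `ClusterManager` class, the `LOCAL`/`REMOTE` policy values, and the in-memory `running` set are hypothetical; the source does not specify the policy format. Two further assumptions are labeled in the code: a service without a policy entry is restarted on the peer, and a manager with no peer restarts locally.

```python
from typing import Dict, Optional, Set

LOCAL = "local"    # restart on the computer where the failure occurred
REMOTE = "remote"  # restart on the peer computer


class ClusterManager:
    """Toy cluster manager; guest OS start/stop only updates in-memory state."""

    def __init__(self, name: str, policy: Dict[str, str]) -> None:
        self.name = name
        self.policy = policy  # service name -> LOCAL or REMOTE
        self.peer: Optional["ClusterManager"] = None
        self.running: Set[str] = set()

    def stop_guest(self, guest: str) -> None:
        self.running.discard(guest)

    def start_guest(self, guest: str) -> None:
        self.running.add(guest)

    def recover(self, service: str, guest: str) -> str:
        """Stop the failed guest OS, then restart it where the policy says.

        Returns the name of the manager that restarted the guest OS.
        """
        self.stop_guest(guest)
        # Assumptions: default to REMOTE, fall back to LOCAL without a peer.
        if self.policy.get(service, REMOTE) == LOCAL or self.peer is None:
            self.start_guest(guest)
            return self.name
        self.peer.start_guest(guest)
        return self.peer.name
```

In this sketch the request to the peer is a direct method call; in the described system it would travel between the cluster managers over the network.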

When the cluster manager 20B detects a failure of the cluster manager 20A, it stops the computer 10A. After the computer 10A has stopped, the cluster manager 20B uses the virtual machine control I/F 91B to start the guest OSs 12A and 12B from the OS images 13A and 13B on the shared disk 81, restores on the computer 10B the services 31A and 31B that the cluster manager 20A had been managing, and resumes service to the clients 40A and 40B.
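Heartbeat-based detection of a failed cluster manager, followed by takeover of its services, can be sketched as below. The timeout rule, the `HeartbeatMonitor` class, and the `take_over` helper are assumptions for illustration; the source only states that the managers check each other at regular intervals. Timestamps are passed in explicitly so the logic can be exercised without real clocks.

```python
from typing import Callable, Iterable, List, Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat received from the peer cluster manager."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Optional[float] = None

    def beat(self, now: float) -> None:
        """Record a heartbeat received at time `now` (seconds)."""
        self.last_seen = now

    def peer_failed(self, now: float) -> bool:
        # Assumption: a peer that has never been heard from is not
        # declared failed; only silence after a first heartbeat counts.
        if self.last_seen is None:
            return False
        return now - self.last_seen > self.timeout_s


def take_over(
    stop_peer_computer: Callable[[], None],
    guests: Iterable[str],
    start_guest: Callable[[str], None],
) -> List[str]:
    """Stop the failed peer's computer, then start all of its guest OSs."""
    stop_peer_computer()
    started = []
    for guest in guests:
        start_guest(guest)  # e.g. boot from the OS image on the shared disk
        started.append(guest)
    return started
```

Stopping the peer's computer before starting its guest OSs mirrors the order given in the text and avoids two instances of the same guest OS using one image at once.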

As described above, monitoring the services and the cluster managers makes it possible to detect a software or hardware failure and recover the service. Assigning each service to its own OS prevents a failure of one service from affecting other services. This improves the reliability of the system.
Furthermore, running each OS as a guest OS on a virtual machine avoids an unnecessary increase in the number of computers.

Embodiment 2.
In Embodiment 1 the cluster is built around the shared disk 81; Embodiment 2 is a data replication cluster that uses local disks.
FIG. 2 shows the configuration of Embodiment 2. The computers 10A and 10B have local disks 80A and 80B, each of which holds the OS images 13A and 13B of the guest OS 12A and the guest OS 12B. The host OSs 11A and 11B have replicators 70A and 70B, which are connected through the LAN 91.

Next, the operation will be described.
The replicators 70A and 70B can synchronize files on the local disks 80A and 80B over the LAN 91. The cluster managers 20A and 20B manage the replicators 70A and 70B and synchronize the OS images 13A and 13B between the local disks 80A and 80B at regular intervals.
When the cluster manager 20B receives a request from the cluster manager 20A to start the guest OS 12A following detection of a failure of the service 31A, the cluster manager 20B uses the virtual machine control I/F 91B to start the guest OS 12A from the OS image 13A on the local disk 80B.
In this way, the effects of Embodiment 1 are obtained without a shared disk, and costs are reduced because no expensive shared disk is needed.
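One synchronization pass of a replicator can be sketched as a file copy that skips images whose content is unchanged. This is only an illustration under stated assumptions: the source does not say how the replicators detect changes or transfer data, so the SHA-256 comparison, the `sync_images` function, and the directory layout are hypothetical, and both disks are modeled as local directories rather than machines on a LAN.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Iterable, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sync_images(src_dir: Path, dst_dir: Path, names: Iterable[str]) -> List[str]:
    """Copy each named image to dst_dir if it is missing or differs.

    Returns the names of the images that were actually copied.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        src, dst = src_dir / name, dst_dir / name
        if not dst.exists() or file_digest(src) != file_digest(dst):
            shutil.copyfile(src, dst)
            copied.append(name)
    return copied
```

A cluster manager would call such a pass at its chosen interval; the interval bounds how stale the standby copy of an image can be at failover.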

Embodiment 3.
In Embodiment 2, each guest OS is handled as a single OS image. In Embodiment 3, as shown in FIG. 3, the image that makes up the guest OS 12A is divided into a system image and a data image. The system image is the partition that stores the operating system and the applications. The data image is the partition in which the applications store their data.
When the guest OS is started, it is assembled from both the system image and the data image. While the service is running, the applications store important data in the partition provided by the data image and do not write to the system image. As shown in FIG. 4, the replicator synchronizes only the data image.

As described above, separating the OS image into a system image and a data image and having the replicator synchronize only the data image reduces the amount of data to be synchronized. This lightens the load on the system and allows a shorter synchronization interval, which improves service reliability and the service level.
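The reduction in synchronized volume can be illustrated with a small sketch. The `GuestImages` type, the image names, and the sizes are hypothetical values chosen for illustration; the source gives no figures.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GuestImages:
    """Images that make up one guest OS (names are illustrative)."""
    system_image: str
    data_image: str


def images_to_sync(guests: List[GuestImages], data_only: bool) -> List[str]:
    """Return the image names the replicator would transfer."""
    names = []
    for g in guests:
        if not data_only:
            names.append(g.system_image)
        names.append(g.data_image)
    return names


def sync_volume(names: List[str], sizes: Dict[str, int]) -> int:
    """Total size of the named images, in the unit used by `sizes`."""
    return sum(sizes[n] for n in names)


guests = [GuestImages("12A-system", "12A-data")]
sizes = {"12A-system": 8_000, "12A-data": 500}  # hypothetical sizes in MB

# Whole guest OS image versus data image only.
full = sync_volume(images_to_sync(guests, data_only=False), sizes)
data = sync_volume(images_to_sync(guests, data_only=True), sizes)
```

With these made-up sizes, each pass moves 500 MB instead of 8,500 MB, which is what makes a shorter synchronization interval practical.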

Embodiment 4.
In Embodiments 1 to 3, the cluster managers 20A and 20B start the guest OSs 12A and 12B from an OS image on the shared disk or a local disk. Embodiment 4 describes starting a guest OS from a snapshot image instead of an OS image.
When the service 31A is registered in the cluster system, the OS image 13A of the guest OS 12A is first prepared and stored on disk, and a snapshot image is then created from the OS image 13A.
FIG. 5 shows how the snapshot image is created. First, the host OS starts the guest OS 12A using the virtual machine control I/F (S81). The service 31A is then accessed at regular intervals (S82), and each time it is determined whether the service 31A returns a correct response (S83). A correct response indicates that the applications 30A and 30B have started and that the service 31A is being provided. The virtual machine control I/F 91A is then requested to create a snapshot image, that is, a snapshot taken immediately after the applications have started (S84), and the resulting snapshot image of the guest OS 12A is saved to disk (S85).
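Steps S81 to S85 can be sketched as a single function. The callables stand in for the virtual machine control I/F and the service probe, whose real interfaces the source does not describe; `create_snapshot`, its parameters, and the `max_attempts` limit are assumptions added here (the described flow simply repeats the check until a correct response arrives).

```python
import time
from typing import Any, Callable


def create_snapshot(
    start_guest: Callable[[], None],
    probe_service: Callable[[], bool],
    take_snapshot: Callable[[], Any],
    save: Callable[[Any], None],
    interval_s: float = 1.0,
    max_attempts: int = 60,
) -> Any:
    """Start the guest OS, wait for the service, then snapshot and save."""
    start_guest()                       # S81: start the guest OS
    for _ in range(max_attempts):
        if probe_service():             # S82, S83: access and check response
            snapshot = take_snapshot()  # S84: snapshot right after startup
            save(snapshot)              # S85: store the snapshot on disk
            return snapshot
        time.sleep(interval_s)
    # Assumption: give up after max_attempts instead of waiting forever.
    raise TimeoutError("service did not return a correct response")
```

Taking the snapshot only after the first correct response is what lets a later start from the snapshot skip both OS boot and application startup.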

When the cluster manager 20A requests the cluster manager 20B to start the guest OS 12A, for example because it has detected a failure of the service 31A, the cluster manager 20B starts the guest OS 12A from the snapshot image through the virtual machine control I/F 91B.

As described above, starting the guest OS from a snapshot image taken after service provision has begun skips the OS and application startup steps that are needed when starting from an OS image, as shown in FIG. 6. Services are therefore restored faster at failover, which improves service reliability and the service level.

Embodiment 5.
In Embodiments 1 to 4, services are monitored only through accesses made by the cluster manager over the LAN 91. In Embodiment 5, in addition to monitoring services over the LAN 91, the cluster manager obtains the state of each guest OS through the virtual machine control I/F and uses that information for service monitoring as well.
The operation of Embodiment 5 will be described.
The cluster manager 20A accesses the service 31A over the LAN 91 at regular intervals to monitor it and, also at regular intervals, obtains the state of the guest OS 12A through the virtual machine control I/F 91A. If the state indicates a failure such as a crash, the cluster manager 20A treats it as a failure of the service 31A and recovers the service 31A.
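Combining the two monitoring sources amounts to declaring a service failed when either one reports a problem. In the sketch below, the state names in `FAILURE_STATES` and the function `service_failed` are hypothetical; the source mentions only "a failure state such as a crash" and does not list the states a virtual machine control I/F reports.

```python
# Hypothetical guest OS states that count as failures.
FAILURE_STATES = frozenset({"crashed"})


def service_failed(lan_probe_ok: bool, guest_state: str) -> bool:
    """Return True if either the LAN probe or the guest OS state shows a failure."""
    return (not lan_probe_ok) or guest_state in FAILURE_STATES
```

Because the guest OS state is read through the virtual machine rather than the network, a crash can be detected even when the LAN probe has not yet timed out.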

Including the state of the guest OS in service monitoring in this way allows failures to be detected from the guest OS state as well, which improves service reliability.

Embodiment 6.
In Embodiment 6, the guest OSs 12A and 12B and the host OS 11A use different types of operating systems, or operating systems with different configurations and parameters.
For example, in FIG. 7 the host OS runs Linux1, the guest OS 12A runs Linux1', and the guest OS 12B runs Windows (registered trademark). Linux1 on the host OS runs the cluster manager; it contains only the modules and packages needed for that function, and its kernel parameters and other settings are tuned for the cluster manager. Linux1' on the guest OS 12A is the same OS as Linux1 but contains only the modules and packages needed to run the applications 30A and 30B, and its kernel parameters and other settings are tuned for those applications. The guest OS 12B runs Windows (registered trademark), which hosts the application 30C, an application that supports only Windows (registered trademark).

As described above, using for each guest OS an OS specialized for providing its service suppresses the operation of unnecessary modules and makes effective use of computer resources. OS compatibility between the cluster manager and the applications no longer needs to be considered, and the most suitable OS can be chosen for each application and for the cluster manager. This simplifies system construction and also allows the service level to be improved.

In Embodiments 1 to 6, a hot standby cluster system is built from two computers, but three or more computers may be used. Other failover schemes, such as active-active, may also be used.
This improves service reliability, for example by allowing the system to cope with failures of more than one computer.

The cluster system of the present invention is applicable to server systems and the like whose reliability is to be improved by combining cluster software with virtual machine technology.

Brief description of the drawings

FIG. 1 is a configuration diagram of a cluster system according to Embodiment 1 of the present invention.
FIG. 2 is a configuration diagram of a cluster system according to Embodiment 2.
FIG. 3 is an explanatory diagram of the configuration of the guest OS according to Embodiment 3.
FIG. 4 is an explanatory diagram of the operation of the replicator according to Embodiment 3.
FIG. 5 is a flowchart of the snapshot image creation method according to Embodiment 4.
FIG. 6 shows flowcharts of startup from the OS image of a guest OS and of startup from a snapshot image.
FIG. 7 is an explanatory configuration diagram in which different types of operating systems are used for the host OS and the guest OSs.
FIG. 8 is a configuration diagram of a conventional cluster system.
FIG. 9 is a configuration diagram of a conventional cluster system using logical computers.
FIG. 10 is a configuration diagram of the cluster manager of a conventional cluster system.

Explanation of symbols

1A, 1B: physical computer; 2A, 2B: computer resource partitioning mechanism; 3A to 3D: logical computer; 10A, 10B: computer; 11A, 11B: host OS; 12A, 12B: guest OS; 13A, 13B: OS image; 15A, 15B: operating system; 20A, 20B: cluster manager; 30A, 30B, 30C: application; 31A, 31B: service; 40A, 40B: client; 60A, 60B: failover policy; 70A, 70B: replicator; 80A, 80B: local disk; 81: shared disk; 90: heartbeat; 90A, 90B: virtual machine; 91: LAN; 91A, 91B: virtual machine control I/F.

Claims

1. A cluster system comprising:
a plurality of computers; a virtual machine installed on each of the plurality of computers; one host OS; a cluster manager that runs only on the host OS;
a guest OS, on at least one of the plurality of computers, that runs an application for a service provided to the outside; and
a shared disk that holds an OS image of the guest OS and is accessible from each computer,
wherein the cluster manager has: a function of using the functions of the virtual machine to start and stop each guest OS and to monitor the status of each service and, on detecting a service failure, recovering the service from the failure according to a preset failover policy; a heartbeat function by which the cluster managers monitor each other's status; and a failover function of, on detecting a failure of another cluster manager, recovering all services managed by that cluster manager from the failure using the OS image on the shared disk, and
wherein each guest OS is dedicated to a specific service, is associated with a failover policy that specifies how each service is to be recovered, and is itself regarded as one service and managed by the cluster manager.

2. The cluster system according to claim 1, wherein the shared disk holding the OS image of the guest OS is formed by local disks of the respective computers, and each host OS installed on each computer has a replicator and synchronizes the OS images on the local disks between the computers.

3. The cluster system according to claim 1 or 2, wherein the OS image of the guest OS is composed of a system image, which is an image of a disk partition storing the OS and the applications, and a data image, which is an image of a disk partition in which the applications store data.

4. The cluster system according to any one of claims 1 to 3, wherein the shared disk holds, in addition to the OS image of the guest OS, a snapshot image that is a snapshot taken immediately after the guest OS has started and the application has started, and the cluster manager starts the guest OS from the snapshot image when starting it during operation.

5. The cluster system according to any one of claims 1 to 4, wherein the cluster manager, in addition to monitoring services by access over a LAN, obtains the state of each guest OS through a virtual machine control I/F and handles that information as service monitoring information.

6. The cluster system according to any one of claims 1 to 5, wherein the guest OS and the host OS installed on the computers use different types of operating systems, or operating systems with different configurations and parameters.