Cluster basics

Understand the basic clustering concepts before you begin to design and customize a cluster to satisfy your needs.

There are two basic concepts related to clusters: cluster nodes and cluster resource groups. A cluster node is either an iSeries™ system or a logical partition that is a member of the cluster. When you create a cluster, you specify the systems or logical partitions that you want to include in the cluster as nodes. A cluster resource group (CRG) serves as the control object for a collection of resilient resources. A CRG may contain a subset or all of the nodes within the cluster. An iSeries cluster supports four types of CRGs: application, data, device, and peer. All of these CRG types share two common elements: a recovery domain and an exit program.

A recovery domain defines the role of each node in the CRG. When you create a CRG in a cluster, the CRG object is created on all nodes specified to be included in the recovery domain. However, the cluster provides a single system image of the CRG object, which you can access from any active node in the CRG's recovery domain. That is, any changes made to the CRG are made on all nodes in the recovery domain.

An exit program is called during cluster-related events for the CRG. One such event is moving an access point from one node to another node.
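As a rough illustration of the idea, an exit program can be thought of as a callback that the CRG invokes when an event occurs. The sketch below is a conceptual model only and is not the iSeries exit program interface: the `ResourceGroup` class, the callback signature, the event name `access_point_moved`, and the choice to call the exit program after the move are all assumptions made for this example.

```python
from typing import Callable, List, Tuple

# Hypothetical callback signature: (crg_name, event, from_node, to_node) -> None
ExitProgram = Callable[[str, str, str, str], None]


class ResourceGroup:
    """Toy stand-in for a CRG that owns a single access point."""

    def __init__(self, name: str, access_point: str, exit_program: ExitProgram):
        self.name = name
        self.access_point = access_point
        self.exit_program = exit_program

    def move_access_point(self, to_node: str) -> None:
        """Move the access point, then notify the exit program of the event."""
        from_node = self.access_point
        self.access_point = to_node
        # Assumption: the event name and the call ordering are illustrative.
        self.exit_program(self.name, "access_point_moved", from_node, to_node)


events: List[Tuple[str, str, str, str]] = []


def log_event(crg: str, event: str, from_node: str, to_node: str) -> None:
    """Example exit program that only records what happened."""
    events.append((crg, event, from_node, to_node))


crg = ResourceGroup("EXAMPLE_CRG", "NODE_A", log_event)
crg.move_access_point("NODE_B")
```

After the call, `events` holds one record describing the move from `NODE_A` to `NODE_B`.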

There are two models of CRGs that can be created in a cluster: the primary-backup model and the peer model. In the primary-backup model, the nodes in the recovery domain of the CRG can be defined as follows:

The primary node is the cluster node that is the primary point of access for the resilient cluster resource.
A backup node is a cluster node that will take over the role of primary access if the present primary node fails or a manual switchover is initiated.
A replicate node is a cluster node that has copies of cluster resources, but is unable to assume the role of primary or backup.
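The three roles above can be sketched as a small data model. This is a conceptual sketch, not a cluster API: the `Role` enum, the `DomainEntry` class, the numeric `order` field for ranking backups, and the node names are all hypothetical. It shows only that an active backup node is a candidate to take over as primary, while a replicate node never is.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Role(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"
    REPLICATE = "replicate"


@dataclass
class DomainEntry:
    node: str
    role: Role
    order: int = 0       # hypothetical: 1 = first backup, 2 = second backup
    active: bool = True


def next_primary(domain: List[DomainEntry]) -> Optional[str]:
    """Return the node that would take over as primary, or None if there is none.

    Only active backup nodes are considered; replicate nodes are skipped.
    """
    backups = [e for e in domain if e.role is Role.BACKUP and e.active]
    if not backups:
        return None
    return min(backups, key=lambda e: e.order).node


domain = [
    DomainEntry("NODE_A", Role.PRIMARY),
    DomainEntry("NODE_B", Role.BACKUP, order=1),
    DomainEntry("NODE_C", Role.BACKUP, order=2),
    DomainEntry("NODE_D", Role.REPLICATE),
]
print(next_primary(domain))  # prints "NODE_B"
```

If `NODE_B` were marked inactive, the same call would select `NODE_C`. With no active backup left it returns `None`, because `NODE_D` holds only a replicate role.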

In a peer model, the recovery domain of a peer CRG defines a peer relationship between nodes. The nodes in the recovery domain of the peer CRG can be defined as follows:

A peer node is a cluster node that can be an active access point for cluster resources.
A replicate node is a cluster node that has copies of cluster resources. Nodes defined as replicate in a peer CRG represent the inactive access point for cluster resources.

With a peer CRG, the nodes in the recovery domain are equivalent with respect to the role the nodes play in recovery. Because each node in this peer CRG has essentially the same role, the concepts of failover and switchover do not apply. The nodes have a peer relationship, and when one of the nodes fails, the other peer nodes continue operating.
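The difference from the primary-backup model can be shown with a minimal sketch. It assumes a recovery domain represented as a plain dictionary of node name to role. The function name and the node names are hypothetical and do not correspond to any cluster interface.

```python
from typing import Dict, List, Set


def active_access_points(domain: Dict[str, str], failed: Set[str]) -> List[str]:
    """Return the peer nodes that are still active access points.

    `domain` maps a node name to "peer" or "replicate". No node is promoted
    when another fails; the surviving peers simply keep operating.
    """
    return sorted(
        node for node, role in domain.items()
        if role == "peer" and node not in failed
    )


peer_domain = {"NODE_A": "peer", "NODE_B": "peer", "NODE_C": "replicate"}

print(active_access_points(peer_domain, failed=set()))       # ['NODE_A', 'NODE_B']
print(active_access_points(peer_domain, failed={"NODE_A"}))  # ['NODE_B']
```

Note that `NODE_C`, the replicate node, never appears in the result, which matches its role as an inactive access point.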

You can also create a cluster administrative domain, which is represented by a peer CRG. The nodes in a cluster administrative domain are all peer nodes in the CRG's recovery domain. There are no replicate nodes.

In the example below, one CRG of each type is present:

Data CRG

The data CRG is present on Node 1, Node 2 and Node 3. This means that the recovery domain for the data CRG has specified a role for Node 1 (primary), Node 2 (first backup) and Node 3 (second backup). In the example, Node 1 is currently serving as the primary point of access. Node 2 is defined as the first backup in the recovery domain. This means that Node 2 contains a copy of the resource which is kept current through logical replication. Should a failover or switchover occur, Node 2 becomes the primary point of access.

Application CRG

The application CRG is present on Node 4 and Node 5. This means that the recovery domain for the application CRG has specified Node 4 and Node 5. In the example, Node 4 is currently serving as the primary point of access. Should a failover or switchover occur, Node 5 becomes the primary point of access for the application. An application CRG requires a takeover IP address.

Peer CRG

The peer CRG is present on Node 6 and Node 7. This means that the recovery domain for the peer CRG has specified Node 6 and Node 7. In this example, Node 6 and Node 7 can be either peer or replicate nodes. If this is a cluster administrative domain that is represented by a peer CRG, any change to a resource that the cluster administrative domain monitors is synchronized across the domain represented by Node 6 and Node 7, regardless of the node on which the change originated.
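The synchronization behavior described for a cluster administrative domain can be modeled in a few lines. This is a toy model under stated assumptions, not how the system implements synchronization: the `AdminDomain` class, its `change` method, and the resource name `example_resource` are invented for illustration.

```python
from typing import Dict, Iterable


class AdminDomain:
    """Toy model: every node keeps its own copy of the monitored resources."""

    def __init__(self, nodes: Iterable[str]):
        self.state: Dict[str, Dict[str, str]] = {node: {} for node in nodes}

    def change(self, origin: str, resource: str, value: str) -> None:
        """Apply a change made on `origin` to every node in the domain."""
        if origin not in self.state:
            raise KeyError(f"{origin} is not in the administrative domain")
        for node_state in self.state.values():
            node_state[resource] = value


admin = AdminDomain(["Node 6", "Node 7"])
admin.change("Node 7", "example_resource", "new value")  # change originates on Node 7
print(admin.state["Node 6"]["example_resource"])         # prints "new value"
```

The originating node affects only the membership check. Once the change is accepted, every node in the domain ends up with the same value.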

Device CRG

The device CRG is present on Node 2 and Node 3. This means that the recovery domain for the device CRG has specified Node 2 and Node 3. In the example, Node 2 is currently serving as the primary point of access. This means that the resilient device owned by the device CRG can currently be accessed from Node 2. Should a failover or switchover occur, Node 3 becomes the primary point of access for the device.

A device CRG requires a resilient device called an independent disk pool (also called an independent auxiliary storage pool or independent ASP) to be configured on an external device, an expansion unit (tower) or IOP in a logical partition.

The nodes in the recovery domain of a device CRG must also be members of the same device domain. The example below illustrates a device CRG with Node L and Node R in its recovery domain. Both nodes are also members of the same device domain.

A data CRG featuring a device domain and an external expansion unit
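The requirement that all nodes in a device CRG's recovery domain belong to the same device domain can be expressed as a simple check. The sketch below assumes a dictionary that maps each node to the name of its device domain. The function name and the device domain names `DD1` and `DD2` are hypothetical.

```python
from typing import Dict, List


def same_device_domain(recovery_domain: List[str],
                       device_domain_of: Dict[str, str]) -> bool:
    """Return True if every node in the recovery domain is a member of one
    and the same device domain.

    A node missing from `device_domain_of` is treated as belonging to no
    device domain, so the check fails.
    """
    domains = {device_domain_of.get(node) for node in recovery_domain}
    return len(domains) == 1 and None not in domains


membership = {"Node L": "DD1", "Node R": "DD1"}
print(same_device_domain(["Node L", "Node R"], membership))  # True

membership["Node R"] = "DD2"
print(same_device_domain(["Node L", "Node R"], membership))  # False
```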