Recover a damaged cluster object

Although you are unlikely to ever encounter one, cluster resource services objects can become damaged.

If the system is an active node, it attempts to recover the damaged object from another active node in the cluster. The system performs the following recovery steps:

For a damaged internal object

  1. The node that has the damage ends.
  2. If there is at least one other active node within the cluster, the damaged node automatically restarts itself and rejoins the cluster. Rejoining the cluster corrects the damage.

For a damaged cluster resource group

  1. The node that has a damaged CRG fails any operation currently in progress that is associated with that CRG. The system then attempts to automatically recover the CRG from another active node.
  2. If there is at least one active member in the recovery domain, the CRG recovery succeeds. Otherwise, the CRG job ends.
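The automatic behavior described above can be summarized as a small decision function. This is only an illustrative Python sketch of the rules as stated in this topic, not system code: the names `Damaged`, `Outcome`, and `automatic_recovery` are hypothetical, and the count of active recovery domain members is assumed to exclude the damaged node.

```python
from enum import Enum


class Damaged(Enum):
    """Kind of damaged cluster resource services object."""
    INTERNAL_OBJECT = "internal object"
    CRG = "cluster resource group"


class Outcome(Enum):
    """What the system ends up doing on its own."""
    NODE_REJOINS = "node ends, restarts, and rejoins the cluster"
    CRG_RECOVERED = "in-progress CRG operations fail; CRG is recovered from another node"
    CRG_JOB_ENDS = "in-progress CRG operations fail; CRG job ends"
    MANUAL_RECOVERY = "node ends; manual recovery steps are needed"


def automatic_recovery(damaged, other_active_nodes, active_domain_members=0):
    """Model the automatic recovery rules for a damaged object.

    other_active_nodes: active cluster nodes besides the damaged one.
    active_domain_members: active recovery domain members for the CRG
        (assumption: counted without the damaged node).
    """
    if other_active_nodes < 0 or active_domain_members < 0:
        raise ValueError("counts must not be negative")

    if damaged is Damaged.INTERNAL_OBJECT:
        # The damaged node always ends; it can only rejoin if a peer is active.
        if other_active_nodes >= 1:
            return Outcome.NODE_REJOINS
        return Outcome.MANUAL_RECOVERY

    # Damaged CRG: recovery needs an active member in the recovery domain.
    if active_domain_members >= 1:
        return Outcome.CRG_RECOVERED
    return Outcome.CRG_JOB_ENDS
```

For example, `automatic_recovery(Damaged.CRG, other_active_nodes=1, active_domain_members=0)` returns `Outcome.CRG_JOB_ENDS`, which corresponds to the case where the CRG cannot be recovered automatically.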

If the system cannot identify or reach any other active node, you must perform the following recovery steps manually.

For a damaged internal object

You receive an internal clustering error (CPFBB46, CPFBB47, or CPFBB48).

  1. End clustering for the node that contains the damage.
  2. Restart clustering for the node that contains the damage. Do this from another active node in the cluster.
  3. If steps 1 and 2 do not solve the problem, remove the damaged node from the cluster.
  4. Add the system back into the cluster and into the recovery domain for the appropriate cluster resource groups.
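The escalation in the steps above (try a restart first, and remove and re-add the node only if that fails) can be sketched as a checklist builder. This is an illustrative Python sketch under stated assumptions: the function name and the `restart_fixed_it` flag are hypothetical, and the CL command names in parentheses (ENDCLUNOD, STRCLUNOD, RMVCLUNODE, ADDCLUNODE, ADDCRGNODE) are a suggested mapping that this topic does not itself name, so verify them and their parameters against the CL command reference for your release.

```python
from typing import List

# Message IDs this topic lists for internal clustering errors.
INTERNAL_ERROR_IDS = {"CPFBB46", "CPFBB47", "CPFBB48"}


def internal_object_checklist(message_id: str, restart_fixed_it: bool) -> List[str]:
    """Return the manual steps for a damaged internal object, in order."""
    if message_id.upper() not in INTERNAL_ERROR_IDS:
        raise ValueError(f"{message_id} is not an internal clustering error")

    # Steps 1 and 2 are always attempted first.
    steps = [
        "End clustering for the damaged node (assumed command: ENDCLUNOD)",
        "Restart clustering for the damaged node from another active node "
        "(assumed command: STRCLUNOD)",
    ]

    # Steps 3 and 4 apply only when the restart did not solve the problem.
    if not restart_fixed_it:
        steps += [
            "Remove the damaged node from the cluster (assumed command: RMVCLUNODE)",
            "Add the system back into the cluster and into the recovery domain "
            "of each appropriate CRG (assumed commands: ADDCLUNODE, ADDCRGNODE)",
        ]
    return steps
```

Calling `internal_object_checklist("CPFBB47", restart_fixed_it=False)` returns all four steps, while passing `restart_fixed_it=True` returns only the first two.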

For a damaged cluster resource group

You receive an error stating that an object is damaged (CPF9804).

  1. End clustering on the node that contains the damaged cluster resource group.
  2. Delete the CRG (using the DLTCRG command).
  3. If no other active node in the cluster contains the CRG object, restore it from media.
  4. Start clustering on the node that contains the damaged cluster resource group. This can be done from any active node.
  5. When you start clustering, the system resynchronizes all of the cluster resource groups. If no other node in the cluster contains the CRG, you may need to re-create it.
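The CRG procedure above branches on whether another active node still holds a copy of the CRG. The following Python sketch lays out that branch as an ordered checklist. It is a hedged illustration only: the function and parameter names are hypothetical, DLTCRG is the only command this topic names, and the wording of each step paraphrases the list above.

```python
from typing import List

# Message ID this topic gives for a damaged object.
CRG_DAMAGE_ID = "CPF9804"


def crg_checklist(crg_on_other_active_node: bool) -> List[str]:
    """Return the manual steps for a damaged CRG, in order.

    crg_on_other_active_node: True if some other active node in the
        cluster still contains the CRG object.
    """
    steps = [
        "End clustering on the node that contains the damaged CRG",
        "Delete the CRG with the DLTCRG command",
    ]

    if not crg_on_other_active_node:
        # No surviving copy in the cluster, so the object must come from media.
        steps.append("Restore the CRG object from media")

    steps.append("Start clustering on the node (from any active node)")

    if crg_on_other_active_node:
        # Resynchronization on start brings the CRG back from a peer.
        steps.append("Let the system resynchronize the cluster resource groups")
    else:
        steps.append("Re-create the CRG if it is still missing after resynchronization")
    return steps
```

With `crg_on_other_active_node=True` the checklist has four entries and relies on resynchronization; with `False` it has five, adding the restore from media and a possible re-creation of the CRG.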