See how an example failover scenario works.
The following happens when a cluster resource group for a resilient application fails over due to exceeding the retry limit or if the job is canceled:
- The cluster resource group exit program is called on all active nodes in the recovery domain for the CRG with an action code of failover. This indicates that cluster resource services is preparing to
failover the application's point of access to the first backup.
- Cluster resource services ends the takeover Internet Protocol (IP) connection on the primary node. For more information about the takeover IP address, see Manage application CRG IP addresses.
- Cluster resource services starts the takeover IP address on the first backup (new primary) node.
- Cluster resource services submits a job that calls the cluster resource group exit program only on the new primary node with an action code of Start. This action restarts the application.
The above example shows how one failover scenario works. Other failover scenarios can work differently.