This topic provides an overview of different data resilience technologies that can be used with clusters to enhance high availability.
Data resiliency allows data to remain available to applications and users even if the system that originally hosted the data fails. Choosing the right set of data resilience technologies in the context of your overall business continuity strategy can be complex and difficult, so it is important to understand the different data resilience solutions that can enhance availability in multiple-system environments. You can choose a single solution or use a combination of these technologies to meet your needs.
For more details on these solutions, see Data Resilience Solutions for IBM® i5/OS™ High Availability Clusters, which contains a detailed comparison of the attributes of each of these technologies.
Logical replication is the process of copying objects from one node in a cluster to one or more other nodes in the cluster, which makes the objects on all the systems identical.
A replicated resource allows objects, such as an application and its data, to be copied from one node in the cluster to one or more other nodes, keeping the objects on all servers in the resource's recovery domain identical. When you change an object on one node in the cluster, the change is replicated to the other nodes, so that if a failover or switchover occurs, a backup node can seamlessly take on the role of the primary node. The server or servers that act as backups are defined in the recovery domain: when an outage occurs on the server defined as the primary node and a switchover or failover is initiated, the node designated as the backup becomes the primary access point for the resource.
Replication requires the use of either a custom-written application or a software application written by a cluster middleware business partner. See Plan for logical replication for details.
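The replication and failover behavior described above can be illustrated with a minimal sketch. This is not the i5/OS implementation or any middleware product's API; all names here (`JournalEntry`, `Node`, `replicate`) are hypothetical, and the sketch only models the key invariant: journal entries are applied in order, exactly once, on every backup node so that the copies stay identical.

```python
# Illustrative sketch of journal-based logical replication.
# Real solutions use i5/OS journaling or cluster middleware, not this code.
from dataclasses import dataclass, field

@dataclass
class JournalEntry:
    sequence: int    # monotonically increasing journal sequence number
    obj: str         # name of the object the change applies to
    operation: str   # e.g. "UPDATE" or "DELETE"
    data: str        # after-image of the change

@dataclass
class Node:
    name: str
    objects: dict = field(default_factory=dict)
    last_applied: int = 0   # highest sequence number applied so far

    def apply_entry(self, entry: JournalEntry) -> None:
        # Apply entries exactly once and in order so replicas stay identical.
        if entry.sequence <= self.last_applied:
            return  # duplicate delivery; already applied
        if entry.operation == "DELETE":
            self.objects.pop(entry.obj, None)
        else:
            self.objects[entry.obj] = entry.data
        self.last_applied = entry.sequence

def replicate(journal: list, nodes: list) -> None:
    # Ship every journal entry, in sequence order, to every backup node
    # in the recovery domain.
    for entry in sorted(journal, key=lambda e: e.sequence):
        for node in nodes:
            node.apply_entry(entry)
```

Because every backup applies the same ordered stream of changes, any backup node can become the primary access point after a failover without data divergence.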
Switchable disks enable resources, such as data and applications, that reside on an expansion unit, on an input/output processor (IOP) on a shared bus, or in an I/O pool for a logical partition, to be switched between a cluster's primary node and a backup node. When the server currently using a set of disk units experiences an outage and a failover or switchover occurs, those disk units can be accessed from a second server, one defined as a backup node in the cluster resource group's recovery domain.
Taking advantage of switchable resources in your cluster requires the use of independent disk pools. See Plan for independent disk pools for more information.
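The switchover sequence for a switchable independent disk pool can be sketched as follows. This is illustrative only: on i5/OS the switch is driven by a device cluster resource group (CRG), not by application code, and the class and method names here (`IndependentDiskPool`, `DeviceCRG`, `switchover`) are hypothetical. The sketch captures the essential ordering: vary the pool off at the old primary, switch hardware ownership, then vary it on at the new primary.

```python
# Illustrative sketch of a switchover for a switchable independent disk pool.
# All names are hypothetical; i5/OS performs this through a device CRG.
class IndependentDiskPool:
    def __init__(self, name: str, owner: str):
        self.name = name
        self.owner = owner      # node that currently owns the hardware
        self.varied_on = False  # whether the pool is available for use

class DeviceCRG:
    def __init__(self, pool: IndependentDiskPool, recovery_domain: list):
        # recovery_domain[0] is the primary; the rest are ordered backups.
        self.pool = pool
        self.recovery_domain = list(recovery_domain)

    def switchover(self) -> str:
        """Move the access point from the primary to the first backup."""
        old_primary, *backups = self.recovery_domain
        if not backups:
            raise RuntimeError("no backup node defined in recovery domain")
        new_primary = backups[0]
        self.pool.varied_on = False    # vary off at the old primary
        self.pool.owner = new_primary  # switch ownership of the disk units
        self.pool.varied_on = True     # vary on at the new primary
        # The old primary becomes the last backup in the recovery domain.
        self.recovery_domain = backups + [old_primary]
        return new_primary
```

Note that only one node owns the disk units at any moment; that single disk subsystem is why switchable disks alone do not protect against a disk failure.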
Cross-site mirroring, combined with the geographic mirroring function, enables you to mirror data on disks at sites that can be separated by a significant geographic distance. This technology can be used to extend the functionality of a device cluster resource group (CRG) beyond the limits of physical component connection. Geographic mirroring provides the ability to replicate changes made to the production copy of an independent disk pool to a mirror copy of that independent disk pool. As data is written to the production copy of an independent disk pool, the operating system mirrors that data to a second copy of the independent disk pool through another system. This process keeps multiple identical copies of the data.
Through the device CRG, if a failover or switchover occurs, a backup node can seamlessly take on the role of the primary node. The server or servers that act as backups are defined in the recovery domain, and they can be at the same physical location as the primary or at a different one. When an outage occurs on the server defined as the primary node and a switchover or failover is initiated, the node designated as the backup becomes the primary access point for the resource and takes ownership of the production copy of the independent disk pool. In this way, you gain protection from the single point of failure associated with switchable resources.
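The geographic mirroring write path described above can be sketched in a few lines. This is illustrative only: in i5/OS the operating system performs this mirroring transparently, and the names here (`MirroredDiskPool`, `write`, `in_sync`) are hypothetical. The sketch models a synchronous write: the change is not complete until the mirror copy at the remote site has also received it, which keeps the two copies identical at all times.

```python
# Illustrative sketch of a synchronous geographic-mirroring write path.
# All names are hypothetical; i5/OS performs this mirroring in the OS.
class MirroredDiskPool:
    def __init__(self):
        self.production = {}  # production copy of the pool (local site)
        self.mirror = {}      # mirror copy of the pool (remote site)

    def write(self, page: int, data: bytes) -> None:
        # The write completes only after both copies have the data.
        self.production[page] = data
        self._send_to_remote(page, data)

    def _send_to_remote(self, page: int, data: bytes) -> None:
        # Stand-in for transmission over the communications line
        # between the two sites; latency on this path is the source
        # of the geographic mirroring performance overhead.
        self.mirror[page] = data

    def in_sync(self) -> bool:
        # Both copies hold identical data after every completed write.
        return self.production == self.mirror
```

The round trip to the remote site on every write is also why geographic dispersion is limited by performance considerations rather than by a fixed hardware distance.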
| Factor | Replication | Switchable disks | Cross-site mirroring |
| --- | --- | --- | --- |
| Flexibility | 10s of servers | 2 servers | 4 servers |
| Single point of failure | None | Disk subsystem | None |
| Cost | | | |
| Performance | Replication overhead | Little impact | Geographic mirroring overhead |
| Real-time coverage | Journaled objects | Objects contained in independent disk pool | Objects contained in independent disk pool |
| Geographic dispersion | Limited by performance considerations | Limited attach distance, because servers and expansion units must be attached to the HSL OptiConnect loop (250 meters maximum) | Limited by performance considerations (no limits are imposed by the system, but response time and throughput over the selected communications lines can dictate a practical limit) |
| Disaster recovery protection | Yes | No | Yes |
| Concurrent backup | Yes | No | No |
| Setup | | | |