Data resilience solutions for i5/OS Clusters

This topic provides an overview of the different data resilience technologies that can be used with i5/OS™ Clusters to enhance high availability in multiple system environments.

Data resilience is the ability of data to remain accessible to the application even if the system that originally hosted the data fails. Choosing the correct set of data resilience technologies in the context of your overall business continuity strategy can be complex and difficult. It is important to understand the different data resilience solutions that can be used alone or with clusters to enhance availability in multiple system environments. You can either choose a single solution or use a combination of these technologies to meet your needs.

For more details on these solutions, see Data Resilience Solutions for IBM i5/OS High Availability Clusters. The section called "Comparison characteristics" contains a detailed comparison of the attributes for each of these technologies.

Replication

With replication, changes to an object are copied to a saved version with near real-time accuracy. Replication is one of the most widely used high availability solutions in multiple system environments. On the iSeries™, this solution is most often implemented through a business partner.

Consider replication when you have the following requirements:
  • You need two or more copies of the data.
  • You want concurrent access to the second copy of data.
  • You need backup window reduction.
  • You need to selectively replicate objects within a library or directory.
  • Your IT staff can monitor the state of the replication environment.
  • You need geographic dispersion between copies, especially if they need distances greater than what can be achieved by hardware solutions.
  • You have already deployed a solution using logical object replication.
  • You need a solution that has no special hardware configuration requirements.
  • Failover and switchover times should not exceed tens of minutes.
  • Transaction level integrity is important for all journaled objects.

Switchable disk pools

Switchable disk pools are storage devices on the operating system that are independent of a particular system. This allows you to "switch" disk pools from one system to another without performing a full IPL. The key benefits of switchable disk pools are their simple design and maintenance. One copy of the data is always current, with no other version to synchronize, so there is minimal administration.

Consider switchable disk pools when you have the following requirements:
  • Only one copy of the data with hardware protection satisfies your requirement and you have considered or addressed avoiding unplanned outages due to disk subsystem failures.
  • You need a simple, low cost and low maintenance solution.
  • Disaster recovery (DR) is not needed.
  • You only need coverage for planned outages and certain types of unplanned outages.
  • The source and target system are at the same site.
  • You want consistent failover and switchover times within minutes and that do not depend on transaction volumes.
  • Transaction-level integrity is important for all objects.
  • You need immediate availability of all object changes with no loss of in-flight data.
  • Objects not within an independent disk pool either do not need to be replicated or are handled via some other mechanism.
  • You need the highest throughput environment.
  • Your environment calls for multiple, independent databases that can be moved between systems.

Cross-site mirroring

Cross-site mirroring, combined with the geographic mirroring function, enables you to mirror data on disks at sites that can be separated by a significant geographic distance. Geographic mirroring provides the ability to replicate changes made to the production copy of an independent disk pool to a mirror copy of that independent disk pool. As data is written to the production copy of an independent disk pool, the operating system mirrors that data to a second copy of the independent disk pool through another system. This process keeps multiple identical copies of the data.

Consider cross-site mirroring when you have the following requirements:
  • You want a system-generated second copy of the data (at an IASP level).
  • You need two copies of data, but do not need concurrent access to a second copy.
  • A relatively low cost and low maintenance solution is desired, but you also need disaster recovery.
  • Geographic dispersion between copies is needed, but your distance requirement does not adversely impact your acceptable production performance goals.
  • You want consistent failover and switchover times within minutes and that do not depend on transaction volumes.
  • Transaction-level integrity is important for all objects.
  • You need immediate availability of all object changes with no loss of in-flight data.
  • Objects not within an independent disk pool either do not need to be replicated or are handled via some other mechanism.
  • A second copy that is unavailable during resynchronization still fits within your service level objectives.

IBM TotalStorage® Enterprise Storage Server® PPRC used with the iSeries Copy Services for ESS toolkit

This solution involves the replication of data at the storage controller level to a second storage system using IBM TotalStorage Enterprise Storage Server (ESS) copy services. An independent disk pool is the basic unit of storage for the ESS peer-to-peer remote copy (PPRC) function. PPRC generates a second copy of the independent disk pool on another ESS. The toolkit comes as part of the iSeries Copy Services for ESS services offering. It provides a set of functions to combine the PPRC, IASP, and i5/OS cluster services for coordinated switchover and failover processing through a cluster resource group.

This solution provides the benefit of the remote copy function and coordinated switching operations, which gives you good data resiliency capability if the replication is done synchronously. The toolkit enables you to attach the second copy to a backup system without an IPL. No load source recovery is involved in the operations. You also have the ability to combine this solution with other ESS-based copy services functions, such as FlashCopy, for additional benefits such as save window reduction.

Consider IBM TotalStorage Enterprise Storage Server (ESS) peer-to-peer remote copy (PPRC) with IASP and Toolkit when you have the following requirements:
  • You desire a storage-based solution for DR, especially if multiple platforms are involved.
  • You do not need complete high availability (HA), but seek to cover disaster recovery and some planned outages for critical application data.
  • You want consistent failover and switchover times within minutes and that do not depend on transaction volumes.
  • You want two copies of data, but do not need concurrent access to a second copy.
  • Geographic dispersion between copies is needed, but your distance requirement does not adversely impact your acceptable production performance goals. Alternatively, consider Peer-to-Peer Remote Copy (PPRC) Global Mirror, which is also known as asynchronous PPRC.
  • Transaction-level integrity is important for all objects.
  • You need availability of all object changes with no loss of in-flight data.
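
The requirement lists in this topic can be condensed into a small lookup that narrows the candidates. The following Python sketch is illustrative only: the Technology class, its field names, and the shortlist function are hypothetical and not part of any i5/OS interface. It captures just four attributes taken from the lists above (number of copies, concurrent access to the second copy, suitability for disaster recovery, and typical switchover time) and ignores the rest.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Technology:
    # Hypothetical summary record; the fields paraphrase the requirement lists above.
    name: str
    copies: int                   # number of data copies the solution maintains (at least)
    concurrent_second_copy: bool  # second copy can be accessed concurrently
    disaster_recovery: bool       # fits when DR / geographic dispersion is required
    switch_time: str              # failover and switchover time as stated in this topic


TECHNOLOGIES = [
    Technology("Replication", 2, True, True, "tens of minutes"),
    Technology("Switchable disk pools", 1, False, False, "minutes"),
    Technology("Cross-site mirroring", 2, False, True, "minutes"),
    Technology("ESS PPRC with toolkit", 2, False, True, "minutes"),
]


def shortlist(need_dr=False, need_concurrent_second_copy=False, min_copies=1):
    """Return the names of technologies that satisfy the stated requirements."""
    return [
        t.name
        for t in TECHNOLOGIES
        if (t.disaster_recovery or not need_dr)
        and (t.concurrent_second_copy or not need_concurrent_second_copy)
        and t.copies >= min_copies
    ]


if __name__ == "__main__":
    # Example: disaster recovery is required, concurrent access is not.
    for name in shortlist(need_dr=True):
        print(name)
```

A shortlist produced this way is only a starting point. Criteria such as hardware configuration, distance versus production performance, and staff availability to monitor the environment still need to be weighed against the full lists.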