Tunable cluster communications parameters

The Change Cluster Resource Services (QcstChgClusterResourceServices) API enables some of the cluster topology services and cluster communications performance and configuration parameters to be tuned to better suit the many unique application and networking environments in which clustering occurs. This API is available to any cluster that is running at cluster version 2 or later.

The Change Cluster Configuration Tuning (CHGCLUCFG) command provides a base level of tuning, while the QcstChgClusterResourceServices API provides both base and advanced levels of tuning.

The QcstChgClusterResourceServices API and the CHGCLUCFG command can be used to tune cluster performance and configuration. Both provide a base level of tuning support, in which the cluster adjusts to a predefined set of values identified for high, low, and normal timeout and messaging interval settings. If an advanced level of tuning is desired, typically with the help of IBM® support personnel, individual parameters can be tuned through the API within a predefined value range. Inappropriate changes to individual parameters can easily lead to degraded cluster performance.

When and how to tune cluster parameters

The CHGCLUCFG command and the QcstChgClusterResourceServices API provide a fast path to setting cluster performance and configuration parameters without requiring you to understand the details. This base level of tuning primarily affects the heartbeating sensitivity and the cluster message timeout values. The valid values for the base level of tuning support are:

1 (High Timeout Values / Less Frequent Heartbeats)
Adjustments are made to cluster communications to increase the heartbeating interval and increase the various message timeout values. With less frequent heartbeats and longer timeout values, the cluster is slower to respond (less sensitive) to communications failures.
2 (Default Values)
Normal default values are used for cluster communications performance and configuration parameters. This setting may be used to return all parameters to the original default values.
3 (Low Timeout Values / More Frequent Heartbeats)
Adjustments are made to cluster communications to decrease the heartbeating interval and decrease the various message timeout values. With more frequent heartbeats and shorter timeout values, the cluster is quicker to respond (more sensitive) to communications failures.

The following table shows example response times, at each setting, for a heartbeat failure leading to a node partition:

                                          Detection of
Network            Setting                Heartbeat Problem   Analysis   Total
Single subnet      1 (Less sensitive)     00:24               01:02      01:26
Single subnet      2 (Default)            00:12               00:30      00:42
Single subnet      3 (More sensitive)     00:04               00:14      00:18
Multiple subnets   1 (Less sensitive)     00:24               08:30      08:54
Multiple subnets   2 (Default)            00:12               04:14      04:26
Multiple subnets   3 (More sensitive)     00:04               02:02      02:06

Note: Times are specified in minutes:seconds format.
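
As a rough planning aid, the example times in the table can be encoded as a lookup and compared against a target partition-detection time. The following Python sketch is illustrative only: the names RESPONSE_TIMES, total_seconds, least_sensitive_level, and fmt are hypothetical and are not part of the cluster API or the CHGCLUCFG command, and the figures are the example values from the table, not guarantees for any particular network.

```python
# Example response times from the table above, converted to seconds.
# Keys are (network, tuning level); values are (detection, analysis).
# All names in this sketch are hypothetical, not part of any IBM API.
RESPONSE_TIMES = {
    ("single", 1): (24, 62),
    ("single", 2): (12, 30),
    ("single", 3): (4, 14),
    ("multiple", 1): (24, 510),
    ("multiple", 2): (12, 254),
    ("multiple", 3): (4, 122),
}


def total_seconds(network, level):
    """Total example time: heartbeat problem detection plus analysis."""
    detection, analysis = RESPONSE_TIMES[(network, level)]
    return detection + analysis


def least_sensitive_level(network, max_total_seconds):
    """Return the least sensitive level (1, then 2, then 3) whose example
    total time fits within max_total_seconds, or None if none does."""
    for level in (1, 2, 3):
        if total_seconds(network, level) <= max_total_seconds:
            return level
    return None


def fmt(seconds):
    """Format seconds as minutes:seconds, matching the table."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"
```

For instance, least_sensitive_level("single", 60) selects level 2, because the level 1 example total of 01:26 exceeds one minute while the level 2 total of 00:42 does not. Preferring the least sensitive level that meets the target is a design choice of this sketch, made to limit the risk of partitions caused only by peak network loads.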

Depending on typical network loads and the specific physical media being used, a cluster administrator might choose to adjust the heartbeating sensitivity and message timeout levels. For example, with a high-speed, high-reliability transport, such as OptiConnect with all systems in the cluster on a common OptiConnect bus, the administrator might want a more sensitive environment to ensure quick detection and therefore faster failover. In that case, option 3 would be chosen. Conversely, on a heavily loaded 10 Mbps Ethernet bus where the default settings lead to occasional partitions caused only by peak network loads, option 1 could be chosen to reduce clustering sensitivity to those peaks.

The Change Cluster Resource Services API also allows tuning of specific individual parameters where the network environment presents unique requirements. For example, consider again a cluster with all nodes on a common OptiConnect bus. Performance of cluster messages can be greatly enhanced by setting the Message Fragment Size parameter to the maximum of 32,500 bytes, which matches the OptiConnect Maximum Transmission Unit (MTU) size better than the default of 1,464 bytes does. This reduces the overhead of fragmentation and reassembly of large messages. The benefit, of course, depends on the cluster applications and how heavily they use cluster messaging. Other parameters are defined in the API documentation and may be used either to tune the performance of cluster messaging or to change the sensitivity of the cluster to partitioning.
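
To illustrate why a larger Message Fragment Size reduces fragmentation overhead, the following Python sketch counts how many fragments a message of a given size would need at the default and maximum sizes quoted above. It is a simplified model under stated assumptions: it ignores any per-fragment header bytes, and the names fragment_count, DEFAULT_FRAGMENT_SIZE, and MAX_FRAGMENT_SIZE are hypothetical, not part of the API.

```python
# Fragment sizes quoted in the text, in bytes.
DEFAULT_FRAGMENT_SIZE = 1464
MAX_FRAGMENT_SIZE = 32500


def fragment_count(message_bytes, fragment_size):
    """Number of fragments needed to carry message_bytes.

    Simplifying assumption: every fragment carries fragment_size bytes
    of payload, with no allowance for per-fragment headers.
    """
    if message_bytes < 0 or fragment_size <= 0:
        raise ValueError("sizes must be non-negative and fragment_size positive")
    # Integer ceiling division.
    return (message_bytes + fragment_size - 1) // fragment_size
```

Under this model, a 100,000-byte message needs 69 fragments at the default size but only 4 at the maximum size, so far fewer fragments have to be sent and reassembled.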

Related concepts
Tune cluster performance