<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="task" />
<meta name="DC.Title" content="Change partitioned nodes to failed" />
<meta name="abstract" content="Sometimes, a partitioned condition is reported when there really was a node outage. This can occur when cluster resource services loses communications with one or more nodes, but cannot detect if the nodes are still operational. When this condition occurs, a simple mechanism exists for you to indicate that the node has failed." />
<meta name="description" content="Sometimes, a partitioned condition is reported when there really was a node outage. This can occur when cluster resource services loses communications with one or more nodes, but cannot detect if the nodes are still operational. When this condition occurs, a simple mechanism exists for you to indicate that the node has failed." />
<meta name="DC.Relation" scheme="URI" content="rzaigtroubleshootpartitionerrors.htm" />
<meta name="DC.Relation" scheme="URI" content="rzaigconceptsmerge.htm" />
<meta name="DC.Relation" scheme="URI" content="../cl/chgclunode.htm" />
<meta name="DC.Relation" scheme="URI" content="../apis/clcntchgcne.htm" />
<meta name="DC.Relation" scheme="URI" content="../cl/strclunod.htm" />
<meta name="DC.Relation" scheme="URI" content="../apis/clcntstcn.htm" />
<meta name="DC.Relation" scheme="URI" content="rzaigconceptsrejoin.htm" />
<meta name="DC.Relation" scheme="URI" content="rzaigtroubleshoottipclusterpartitions.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rzaigtroubleshootchangepartitionednodes" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Change partitioned nodes to failed</title>
</head>
<body id="rzaigtroubleshootchangepartitionednodes"><a name="rzaigtroubleshootchangepartitionednodes"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Change partitioned nodes to failed</h1>
<div><p>Sometimes a partitioned condition is reported when a node outage
has actually occurred. This can happen when cluster resource services loses communications
with one or more nodes but cannot detect whether those nodes are still operational.
When this condition occurs, you can use a simple mechanism to indicate
that the node has failed.</p>
<div class="section"><div class="attention"><span class="attentiontitle">Attention:</span> Telling cluster resource services that
a node has failed makes recovery from the partition state simpler. However,
do not change the node status to failed when the node is in fact still active
and a true partition has occurred. Doing so can cause a
node in more than one partition to assume the primary role for a cluster resource
group. When two nodes each act as the primary node, data such as files
or databases can become disjoint or corrupted as the nodes independently
change their own copies of the files. In addition, the two partitions cannot
be merged back together when a node in each partition has been assigned the
primary role.</div>
<p>When the status of a node is changed to Failed, the
roles of the nodes in the recovery domain for each cluster resource group (CRG) in the
partition may be reordered. The node being set to Failed is assigned
as the last backup. If multiple nodes have failed and their status needs to
be changed, the order in which you change the nodes affects the final
order of the recovery domain's backup nodes. If the failed node was the primary
node for a CRG, the first active backup is reassigned as the new primary
node.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rzaigtroubleshootpartitionerrors.htm" title="Certain cluster conditions are easily corrected. If a cluster partition has occurred, you can learn how to recover. This topic also tells you how to avoid a cluster partition and gives you an example of how to merge partitions back together.">Partition errors</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="rzaigconceptsmerge.htm" title="A merge operation is similar to a rejoin operation except that it occurs when nodes that are partitioned begin communicating again.">Merge</a></div>
<div><a href="rzaigconceptsrejoin.htm" title="Rejoin means to become an active member of a cluster after having been a nonparticipating member.">Rejoin</a></div>
</div>
<div class="reltasks"><strong>Related tasks</strong><br />
<div><a href="rzaigtroubleshoottipclusterpartitions.htm" title="Use these tips for cluster partitions.">Tips: Cluster partitions</a></div>
</div>
<div class="relref"><strong>Related reference</strong><br />
<div><a href="../cl/chgclunode.htm">CHGCLUNODE command</a></div>
<div><a href="../apis/clcntchgcne.htm">Change Cluster Node Entry API (QcstChangeClusterNodeEntry)</a></div>
<div><a href="../cl/strclunod.htm">STRCLUNOD command</a></div>
<div><a href="../apis/clcntstcn.htm">Start Cluster Node (QcstStartClusterNode) API</a></div>
</div>
</div><div class="nested1" xml:lang="en-us" id="usingiseriesnavigator"><a name="usingiseriesnavigator"><!-- --></a><h2 class="topictitle2">Using iSeries Navigator</h2>
<div><div class="section"><p>Using iSeries Navigator for this task requires Option 41 (<span class="keyword">i5/OS™</span> -
HA Switchable Resources) to be installed and licensed.</p>
<p>When cluster
resource services has lost communications with a node but cannot detect if
the node is still operational, a cluster node will have a status of <span class="uicontrol">Not
communicating</span> in the Nodes container in iSeries™ Navigator. You may need to change
the status of the node from <span class="uicontrol">Not communicating</span> to <span class="uicontrol">Failed</span>.
You will then be able to restart the node.</p>
<p>To change the status of a
node from <span class="uicontrol">Not communicating</span> to <span class="uicontrol">Failed</span>,
follow these steps:</p>
</div>
<ol><li><span>In iSeries Navigator,
expand <span class="uicontrol">Management Central</span>.</span></li>
<li><span>Expand <span class="uicontrol">Clusters</span>.</span></li>
<li><span>Expand the cluster that contains the node for which you want to
change the status.</span></li>
<li><span>Click <span class="uicontrol">Nodes</span>.</span></li>
<li><span>Right-click the node for which you want to change the status, and
select <span class="menucascade"><span class="uicontrol">Cluster</span> &gt; <span class="uicontrol">Change Status</span></span>.</span></li>
</ol>
<div class="section"><p>To restart the node, follow these
steps:</p>
<ol><li>Right-click the node, and select <span class="menucascade"><span class="uicontrol">Cluster</span> &gt; <span class="uicontrol">Start</span></span>.</li>
</ol>
</div>
</div>
</div>
<div class="nested1" xml:lang="en-us" id="usingclapis"><a name="usingclapis"><!-- --></a><h2 class="topictitle2">Using CL commands and APIs</h2>
<div><div class="section">To change the status of a node from <span class="uicontrol">Not communicating</span> to <span class="uicontrol">Failed</span>,
follow these steps:</div>
<ol><li><span>Use the <a href="../cl/chgclunode.htm"><span class="cmdname">CHGCLUNODE</span> command</a> or
the <a href="../apis/clcntchgcne.htm"><span class="apiname">Change
Cluster Node Entry (QcstChangeClusterNodeEntry)</span> API</a> to change
the status of a node from partitioned to failed. Do this for every
node that has actually failed, and only for those nodes.</span></li>
<li><span>Use the <a href="../cl/strclunod.htm"><span class="cmdname">STRCLUNOD</span> command</a> or
the <a href="../apis/clcntstcn.htm"><span class="apiname">Start
Cluster Node (QcstStartClusterNode)</span> API</a> to start the cluster
node, allowing the node to rejoin the cluster.</span></li>
</ol>
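<div class="section"><p>The two steps above might look like the following when entered as CL commands.
This is a minimal sketch, not a definitive procedure: the cluster name <span class="keyword">MYCLUSTER</span> and the
node name <span class="keyword">NODE02</span> are hypothetical placeholders, and <span class="keyword">OPTION(*CHGSTS)</span> is assumed
to be the <span class="cmdname">CHGCLUNODE</span> option that changes node status on your release. Check the parameters
against the <span class="cmdname">CHGCLUNODE</span> and <span class="cmdname">STRCLUNOD</span> command references before you use them.</p>
<pre>/* Hypothetical names: cluster MYCLUSTER, failed node NODE02.        */
/* Confirm that NODE02 is really down before you run step 1.         */

/* Step 1: change the node status from partitioned to failed.        */
/* OPTION(*CHGSTS) is an assumption; verify it for your release.     */
CHGCLUNODE CLUSTER(MYCLUSTER) NODE(NODE02) OPTION(*CHGSTS)

/* Step 2: when NODE02 is available again, start the cluster node    */
/* so that it can rejoin the cluster.                                */
STRCLUNOD CLUSTER(MYCLUSTER) NODE(NODE02)</pre>
<p>If several nodes have failed, repeat step 1 for each of them. The order in which you change the nodes
determines the final order of the backup nodes in each recovery domain.</p>
</div>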
</div>
</div>
</body>
</html>