<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="task" />
<meta name="DC.Title" content="Change partitioned nodes to failed" />
<meta name="abstract" content="Sometimes, a partitioned condition is reported when there really was a node outage. This can occur when cluster resource services loses communications with one or more nodes, but cannot detect if the nodes are still operational. When this condition occurs, a simple mechanism exists for you to indicate that the node has failed." />
<meta name="description" content="Sometimes, a partitioned condition is reported when there really was a node outage. This can occur when cluster resource services loses communications with one or more nodes, but cannot detect if the nodes are still operational. When this condition occurs, a simple mechanism exists for you to indicate that the node has failed." />
<meta name="DC.Relation" scheme="URI" content="rzaigtroubleshootpartitionerrors.htm" />
<meta name="DC.Relation" scheme="URI" content="rzaigconceptsmerge.htm" />
<meta name="DC.Relation" scheme="URI" content="../cl/chgclunode.htm" />
<meta name="DC.Relation" scheme="URI" content="../apis/clcntchgcne.htm" />
<meta name="DC.Relation" scheme="URI" content="../cl/strclunod.htm" />
<meta name="DC.Relation" scheme="URI" content="../apis/clcntstcn.htm" />
<meta name="DC.Relation" scheme="URI" content="rzaigconceptsrejoin.htm" />
<meta name="DC.Relation" scheme="URI" content="rzaigtroubleshoottipclusterpartitions.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rzaigtroubleshootchangepartitionednodes" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Change partitioned nodes to failed</title>
</head>
<body id="rzaigtroubleshootchangepartitionednodes"><a name="rzaigtroubleshootchangepartitionednodes"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Change partitioned nodes to failed</h1>
<div><p>Sometimes a partitioned condition is reported when a node outage has
actually occurred. This can happen when cluster resource services loses communications
with one or more nodes but cannot detect whether those nodes are still operational.
When this condition occurs, you can tell cluster resource services
that the node has failed.</p>
<div class="section"><div class="attention"><span class="attentiontitle">Attention:</span> Telling cluster resource services that
a node has failed makes recovery from the partition state simpler. However,
do not change the node status to failed when the node is in fact still active
and a true partition has occurred. Doing so can cause a
node in more than one partition to assume the primary role for a cluster resource
group. When two nodes each act as the primary node, data such as files
or databases can become disjoint or corrupted because the nodes independently
change their own copies of the files. In addition, the two partitions cannot
be merged back together once a node in each partition has been assigned the
primary role.</div>
<p>When the status of a node is changed to Failed, the
roles of the nodes in the recovery domain for each cluster resource group (CRG) in the
partition may be reordered. The node being set to Failed is assigned
as the last backup. If multiple nodes have failed and their status needs to
be changed, the order in which you change them affects the final
order of the recovery domain's backup nodes. If the failed node was the primary
node for a CRG, the first active backup is reassigned as the new primary
node. For example, in a recovery domain ordered NODE1 (primary), NODE2, NODE3,
where NODE2 and NODE3 are active, setting NODE1 to Failed would make NODE2 the
primary node and move NODE1 to last backup, behind NODE3.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rzaigtroubleshootpartitionerrors.htm" title="Certain cluster conditions are easily corrected. If a cluster partition has occurred, you can learn how to recover. This topic also tells you how to avoid a cluster partition and gives you an example of how to merge partitions back together.">Partition errors</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="rzaigconceptsmerge.htm" title="A merge operation is similar to a rejoin operation except that it occurs when nodes that are partitioned begin communicating again.">Merge</a></div>
<div><a href="rzaigconceptsrejoin.htm" title="Rejoin means to become an active member of a cluster after having been a nonparticipating member.">Rejoin</a></div>
</div>
<div class="reltasks"><strong>Related tasks</strong><br />
<div><a href="rzaigtroubleshoottipclusterpartitions.htm" title="Use these tips for cluster partitions.">Tips: Cluster partitions</a></div>
</div>
<div class="relref"><strong>Related reference</strong><br />
<div><a href="../cl/chgclunode.htm">CHGCLUNODE command</a></div>
<div><a href="../apis/clcntchgcne.htm">Change Cluster Node Entry API (QcstChangeClusterNodeEntry)</a></div>
<div><a href="../cl/strclunod.htm">STRCLUNOD command</a></div>
<div><a href="../apis/clcntstcn.htm">Start Cluster Node (QcstStartClusterNode) API</a></div>
</div>
</div><div class="nested1" xml:lang="en-us" id="usingiseriesnavigator"><a name="usingiseriesnavigator"><!-- --></a><h2 class="topictitle2">Using iSeries Navigator</h2>
<div><div class="section"><p>Using iSeries Navigator for this task requires Option 41 (<span class="keyword">i5/OS™</span> -
HA Switchable Resources) to be installed and licensed.</p>
<p>When cluster
resource services has lost communications with a node but cannot detect whether
the node is still operational, the cluster node has a status of <span class="uicontrol">Not
communicating</span> in the Nodes container in iSeries™ Navigator. You may need to change
the status of the node from <span class="uicontrol">Not communicating</span> to <span class="uicontrol">Failed</span>.
You can then restart the node.</p>
<p>To change the status of a
node from <span class="uicontrol">Not communicating</span> to <span class="uicontrol">Failed</span>,
follow these steps:</p>
</div>
<ol><li><span>In iSeries Navigator,
expand <span class="uicontrol">Management Central</span>.</span></li>
<li><span>Expand <span class="uicontrol">Clusters</span>.</span></li>
<li><span>Expand the cluster that contains the node for which you want to
change the status.</span></li>
<li><span>Click <span class="uicontrol">Nodes</span>.</span></li>
<li><span>Right-click the node for which you want to change the status, and
select <span class="menucascade"><span class="uicontrol">Cluster</span> &gt; <span class="uicontrol">Change Status</span></span>.</span></li>
</ol>
<div class="section"><p>To restart the node, follow this step:</p>
<ol><li>Right-click the node, and select <span class="menucascade"><span class="uicontrol">Cluster</span> &gt; <span class="uicontrol">Start</span></span>.</li>
</ol>
</div>
</div>
</div>
<div class="nested1" xml:lang="en-us" id="usingclapis"><a name="usingclapis"><!-- --></a><h2 class="topictitle2">Using CL commands and APIs</h2>
<div><div class="section">To change the status of a node from <span class="uicontrol">Not communicating</span> to <span class="uicontrol">Failed</span>,
follow these steps:</div>
<ol><li><span>Use the <a href="../cl/chgclunode.htm"><span class="cmdname">CHGCLUNODE</span> command</a> or
the <a href="../apis/clcntchgcne.htm"><span class="apiname">Change
Cluster Node Entry (QcstChangeClusterNodeEntry)</span> API</a> to change
the status of a node from partitioned to failed. Do this for every
node that has actually failed.</span></li>
<li><span>Use the <a href="../cl/strclunod.htm"><span class="cmdname">STRCLUNOD</span> command</a> or
the <a href="../apis/clcntstcn.htm"><span class="apiname">Start
Cluster Node (QcstStartClusterNode)</span> API</a> to start the cluster
node, which allows the node to rejoin the cluster.</span></li>
</ol>
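<div class="section"><p>As a rough sketch, the two steps above might look like the following from a CL command line.
The cluster name MYCLUSTER and the node name NODE02 are hypothetical placeholders, and
OPTION(*CHGSTS) is assumed here to be the CHGCLUNODE option that changes a node's status from
partitioned to failed. Check the parameters against the linked command references before you run anything.</p>
<pre>/* Step 1: mark the partitioned node as failed.                      */
/* Do this only after confirming that NODE02 is really down.         */
/* MYCLUSTER and NODE02 are placeholder names for illustration.      */
CHGCLUNODE CLUSTER(MYCLUSTER) NODE(NODE02) OPTION(*CHGSTS)

/* Step 2: start the node so that it can rejoin the cluster.         */
STRCLUNOD CLUSTER(MYCLUSTER) NODE(NODE02)</pre>
<p>If more than one node has actually failed, repeat the CHGCLUNODE command for each of them.
The order in which the nodes are changed affects the final order of the backup nodes in
each recovery domain.</p>
</div>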
</div>
</div>
</body>
</html>