<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
<LINK rel="stylesheet" type="text/css" href="../../../rzahg/ic.css">
<title>Troubleshoot: Detect hung threads in J2EE applications</title>
</head>
<BODY>
<!-- Java sync-link -->
<SCRIPT LANGUAGE="Javascript" SRC="../../../rzahg/synch.js" TYPE="text/javascript"></SCRIPT>
<h3><a name="trbhangdet"></a>Troubleshoot: Detect hung threads in J2EE applications</h3>
<p>A common error in J2EE applications is a hung thread. A hung thread can
result from a simple software defect (such as an infinite loop) or a more
complex cause (for example, a resource deadlock). A hung thread can consume
system resources, such as CPU time, when it runs an unbounded code path such
as an infinite loop. Alternatively, a system can become unresponsive even
though all resources are idle, as in a deadlock scenario. Unless an end user
or a monitoring tool reports the problem, the system might remain in this
degraded state indefinitely.</p>
<p>Hang detection is turned on by default in WebSphere Application Server - Express.
You can configure a hang detection policy to suit your applications and
environment so that potential hangs are reported, which allows earlier
detection of failing servers. When a hung thread is detected, WebSphere
Application Server - Express notifies you so that you can troubleshoot the problem.</p>
<p>Using the hang detection policy, you can specify the length of time after
which a unit of work is considered to be taking too long to complete. The
thread monitor checks all managed threads in the system (for example, Web
container threads and object request broker (ORB) threads). Unmanaged threads,
which are threads created by applications, are not monitored.</p>
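<p>The check described above can be pictured with a minimal Java sketch. It is
an illustration only and is not the WebSphere thread monitor implementation:
the class <code>HangCheckSketch</code> and its methods are hypothetical names,
and the sketch assumes that each managed thread reports when a unit of work
starts and finishes.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; not the WebSphere thread monitor implementation.
class HangCheckSketch {
    // Start time (in milliseconds) of the unit of work that each managed thread is running.
    private final Map<String, Long> activeSince = new ConcurrentHashMap<>();
    private final long thresholdMillis;

    HangCheckSketch(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Called when a managed thread starts a unit of work.
    void workStarted(String threadName, long nowMillis) {
        activeSince.put(threadName, nowMillis);
    }

    // Called when the unit of work completes.
    void workFinished(String threadName) {
        activeSince.remove(threadName);
    }

    // Returns the names of threads that have been active longer than the threshold.
    List<String> findSuspects(long nowMillis) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, Long> entry : activeSince.entrySet()) {
            if (nowMillis - entry.getValue() > thresholdMillis) {
                suspects.add(entry.getKey());
            }
        }
        return suspects;
    }
}
```

<p>Threads that never call <code>workStarted</code> are invisible to this
sketch, which mirrors the point that unmanaged threads are not monitored.</p>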
<p>When WebSphere Application Server - Express detects that a thread has been active
longer than the time defined by the thread monitor threshold, the application
server takes the following actions:</p>
<ul>
<li><p>Logs a warning in the WebSphere Application Server - Express SystemOut.log
file that indicates the name of the thread that is hung and how long it has
already been active. The following message is written to the log:</p>
<pre>
WSVR0605W: Thread <em>threadname</em> has been active for <em>hangtime</em> and may be hung.
There are <em>totalthreads</em> threads in total in the server that may be hung.
</pre>
<p>where <em>threadname</em> is the name that appears in a JVM thread dump,
<em>hangtime</em> is the approximate length of time that the thread has been
active, and <em>totalthreads</em> is the number of threads in the server that
may be hung.</p>
</li>
<li><p>Issues a Java Management Extensions (JMX) notification. This notification
enables third-party tools to catch the event and take appropriate action,
such as triggering a JVM thread dump of the server or sending a page or an
e-mail. The following JMX notification events are defined in the
com.ibm.websphere.management.NotificationConstants class:</p>
<ul>
<li>TYPE_THREAD_MONITOR_THREAD_HUNG: This event is triggered by the detection
of a (potentially) hung thread.</li>
<li>TYPE_THREAD_MONITOR_THREAD_CLEAR: This event is triggered if a thread
that was previously reported as hung completes its work. See
<a href="#falsealarms">False Alarms</a> for more information.</li>
</ul>
</li>
<li><p>Updates the performance monitoring infrastructure (PMI) data counters.
Various tools use these PMI data counters to provide performance analysis.</p></li>
</ul>
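<p>A tool that reacts to the JMX notifications listed above might be built
around a listener like the following minimal sketch. It uses only the standard
<code>javax.management</code> interfaces. The class name
<code>HungThreadListenerSketch</code> and the two type strings are placeholders
chosen for illustration; in a real server you would compare against the
constants in com.ibm.websphere.management.NotificationConstants and register
the listener with the appropriate MBean, neither of which is shown here.</p>

```java
import javax.management.Notification;
import javax.management.NotificationListener;

// Hypothetical illustration of a tool-side listener for hang notifications.
class HungThreadListenerSketch implements NotificationListener {
    // Placeholder type strings (assumptions). In a real server, compare against
    // NotificationConstants.TYPE_THREAD_MONITOR_THREAD_HUNG and
    // NotificationConstants.TYPE_THREAD_MONITOR_THREAD_CLEAR instead.
    static final String HUNG = "example.thread.hung";
    static final String CLEAR = "example.thread.clear";

    private int suspectedHangs = 0;

    @Override
    public synchronized void handleNotification(Notification notification, Object handback) {
        String type = notification.getType();
        if (HUNG.equals(type)) {
            // A real tool might request a JVM thread dump or notify an operator here.
            suspectedHangs++;
        } else if (CLEAR.equals(type)) {
            // The thread completed its work, so the earlier report was a false alarm.
            suspectedHangs = Math.max(0, suspectedHangs - 1);
        }
    }

    // Number of threads currently reported as hung and not yet cleared.
    synchronized int suspectedHangs() {
        return suspectedHangs;
    }
}
```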
<p><a name="falsealarms"></a><b>False Alarms</b></p>
<p>If a thread that was reported as hung later completes its work, a second set
of messages, notifications, and PMI events is produced to identify the false
alarm. The following message is written to the SystemOut.log file:</p>
<pre>
WSVR0606W: Thread <em>threadname</em> was previously reported to be hung but has completed.
It was active for approximately <em>hangtime</em>.
There are <em>totalthreads</em> threads in total in the server that still may be hung.
</pre>
<p>where <em>threadname</em> is the name that appears in a JVM thread dump,
<em>hangtime</em> is the approximate length of time that the thread was active,
and <em>totalthreads</em> is the number of threads in the server that still may
be hung.</p>
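<p>Because each false alarm appears in the log as a WSVR0605W message followed
later by a WSVR0606W message for the same thread, a log-scanning tool can pair
them. The following is a minimal sketch under that assumption. The class
<code>FalseAlarmCounterSketch</code> is a hypothetical name, and the regular
expressions are written against the message templates shown in this topic, so
they might need adjusting for the exact text in your log.</p>

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: pairs hang and clear messages read from a log.
class FalseAlarmCounterSketch {
    private static final Pattern HUNG = Pattern.compile(
            "WSVR0605W: Thread (.+?) has been active for (.+?) and may be hung\\.");
    private static final Pattern CLEARED = Pattern.compile(
            "WSVR0606W: Thread (.+?) was previously reported to be hung but has completed\\.");

    // Threads that were reported as hung and have not been cleared yet.
    private final Set<String> reportedHung = new HashSet<>();
    private int falseAlarms = 0;

    // Feeds one log line to the counter; unrelated lines are ignored.
    void accept(String logLine) {
        Matcher hung = HUNG.matcher(logLine);
        if (hung.find()) {
            reportedHung.add(hung.group(1));
            return;
        }
        Matcher cleared = CLEARED.matcher(logLine);
        if (cleared.find() && reportedHung.remove(cleared.group(1))) {
            // A clear message that matches an earlier hang message is one false alarm.
            falseAlarms++;
        }
    }

    int falseAlarms() {
        return falseAlarms;
    }

    Set<String> stillSuspect() {
        return reportedHung;
    }
}
```

<p>Threads that remain in <code>stillSuspect()</code> after the log has been
read were reported as hung and never cleared, so they are the ones to
investigate first.</p>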
<p><b>Automatic adjustment of the hang time threshold</b></p>
<p>If the thread monitor determines that too many false alarms are being issued
(based on the number of pairs of hang and clear messages), it can automatically
adjust the threshold. When this adjustment occurs, the following message is
written to the SystemOut.log file:</p>
<pre>
WSVR0607W: Too many thread hangs have been falsely reported.
The hang threshold is now being set to <em>thresholdtime</em>.
</pre>
<p>where <em>thresholdtime</em> is the time (in seconds) that a thread can be
active before it is considered hung.</p>
<p>You can prevent WebSphere Application Server - Express from automatically
adjusting the hang time threshold.</p>
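<p>The adjustment behavior can be illustrated with a minimal Java sketch. This
is not the product's algorithm: the class <code>ThresholdAdjusterSketch</code>
is hypothetical, and both the number of false alarms tolerated and the factor
by which the threshold grows are assumed parameters that this topic does not
specify.</p>

```java
// Hypothetical illustration; the limit and growth factor are assumed parameters.
class ThresholdAdjusterSketch {
    private long thresholdSeconds;
    private final int falseAlarmLimit;  // assumed: false alarms tolerated before adjusting
    private final double growthFactor;  // assumed: multiplier applied to the threshold
    private int falseAlarms = 0;

    ThresholdAdjusterSketch(long thresholdSeconds, int falseAlarmLimit, double growthFactor) {
        this.thresholdSeconds = thresholdSeconds;
        this.falseAlarmLimit = falseAlarmLimit;
        this.growthFactor = growthFactor;
    }

    // Records one hang/clear pair. Returns true if the threshold was raised.
    boolean recordFalseAlarm() {
        falseAlarms++;
        if (falseAlarms < falseAlarmLimit) {
            return false;
        }
        // Too many false alarms: raise the threshold and start counting again.
        thresholdSeconds = Math.round(thresholdSeconds * growthFactor);
        falseAlarms = 0;
        return true;
    }

    long thresholdSeconds() {
        return thresholdSeconds;
    }
}
```

<p>In this sketch, preventing automatic adjustment corresponds to never calling
<code>recordFalseAlarm</code>, which leaves the threshold at its configured
value.</p>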
<p>For more information, see the following topics:</p>
<ul>
<li><a href="trbadjhangdet.htm">Adjust the hang detection policy</a></li>
<li><a href="trbconfighangdet.htm">Configure the hang detection policy</a></li>
</ul>
</body>
</html>