ibm-information-center/dist/eclipse/plugins/i5OS.ic.dbmult_5.4.0.1/whyusedb2.htm

120 lines
8.8 KiB
HTML

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="Why you should use DB2 Multisystem" />
<meta name="abstract" content="Performance improvements can be quite significant for certain queries." />
<meta name="description" content="Performance improvements can be quite significant for certain queries." />
<meta name="DC.subject" content="commitment control, two-phase commit protocols, data integrity, protected conversation, definition, with commitment control, protected conversation, conversation, protected, performance, improvement, Symmetric Multiprocessing (SMP), query optimizer, SMP (Symmetric Multiprocessing), improving query processing, query, performance improvements, example, performance improvement, Symmetric Multiprocessing (SMP) enhancements" />
<meta name="keywords" content="commitment control, two-phase commit protocols, data integrity, protected conversation, definition, with commitment control, protected conversation, conversation, protected, performance, improvement, Symmetric Multiprocessing (SMP), query optimizer, SMP (Symmetric Multiprocessing), improving query processing, query, performance improvements, example, performance improvement, Symmetric Multiprocessing (SMP) enhancements" />
<meta name="DC.Relation" scheme="URI" content="RZAF3C5-gen6.htm" />
<meta name="DC.Relation" scheme="URI" content="tip.htm" />
<meta name="DC.Relation" scheme="URI" content="../apiref/api.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="whyusedb2" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Why you should use DB2 Multisystem</title>
</head>
<body id="whyusedb2"><a name="whyusedb2"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Why you should use DB2 Multisystem</h1>
<div><p>Performance improvements can be quite significant for certain queries.</p>
<p>Testing has shown that for queries that have a large amount
of data to be processed, but with a relatively small result set, the performance
gain is almost proportional to the number of systems the file is distributed
across. For example, suppose you have a 5 million (5 000 000) record file
that you want to query for the top 10 revenue producers. With DB2<sup>®</sup> Multisystem,
the response time for the query is cut by nearly one-half by partitioning
the file equally across two systems. On three systems, the response time is
nearly one-third the time of running the query on a single system. This best
case scenario does not apply to complex join operations where data needs to
be moved between nodes.</p>
<p>If a file is fairly small or is primarily used for single-record read/write
processing, little or no performance gain can be realized from partitioning
the file. Instead, a slight degradation in performance might occur. In these
cases, query performance becomes more dependent on the speed of the physical
connection. However, even in these situations, the users on all the systems
in the node group still have the advantage of being able to access the data,
even though it is distributed, using the traditional “local file” database
methods with which they are familiar. In all environments, users have the
benefits of this local-system transparency and the possible elimination of
data redundancy across the systems in the node group.</p>
<p>Another parallelism feature is <span class="keyword">DB2 UDB Symmetric Multiprocessing</span>.
With symmetric multiprocessing (SMP), when a partitioned file is processed
and if any of the systems are multiprocessor systems, you can achieve a multiplier
effect in terms of performance gains. If you partitioned a file across three
systems and each system is a 4-way processor system, the functions of DB2 Multisystem
and SMP work together. Using the previous 5 million record example, the response
time is approximately one-twelfth of what it would have been had the query
been run without using any of the parallelism features. The file sizes and
the details of the query can affect the improvement that you actually see.</p>
<p>When you do queries, the bulk of the work to run the query
is done in parallel, which improves the overall performance of query processing.
The system divides up the query and processes the appropriate parts of the
query on the appropriate system. This makes for the most efficient processing,
and it is done automatically; you do not need to specify anything to make
this highly efficient processing occur.</p>
<div class="note"><span class="notetitle">Note:</span> The performance of some queries might not improve, especially if a large
volume of data has to be moved.</div>
<div class="p">Each node only has to process the records that are physically stored on
that node. If the query specifies selection against the partitioning key,
the query optimizer might determine that only one node needs to be queried.
In the following example, the ZIP code field is the partitioning key within
the SQL statement for the ORDERS file: <pre> SELECT NAME, ADDRESS, BALANCE FROM PRODLIB/ORDERS WHERE ZIP='48009'</pre>
</div>
<p>When the statement is run, the optimizer determines that only one node
needs to be queried. Remember that all the records that contain the 48009
ZIP code are distributed to the same node.</p>
<div class="p">In the next SQL statement example, the processor capability of all the <span class="keyword">iSeries™</span> servers in the node group can
be used to process the statement in parallel: <pre> SELECT ZIP, SUM(BALANCE) FROM PRODLIB/ORDERS GROUP BY ZIP</pre>
</div>
<p>Another advantage of having the optimizer direct I/O requests only to systems
that contain pertinent data is that queries can still run if one or more of
the systems are not active. An example is a file that is partitioned such
that each branch of a business has its data stored on a different system.
If one system is unavailable, file I/O operations can still be performed for
data associated with the remaining branches. The I/O request fails for the
branch that is not active.</p>
<p>The optimizer uses two-phase commit protocols to ensure
the integrity of the data. Because multiple systems are being accessed, if
you request commitment control, all of the systems use protected conversations.
A <dfn class="term">protected conversation</dfn> means that if a system failure occurs
in the middle of a transaction or in the middle of a single database
operation, all of the changes made up to that point are rolled back.</p>
<p>When protected conversations are used, some commitment
control options are changed at the remote nodes to enhance performance. The
Wait for outcome option is set to Y, and the Vote read-only permitted option
is set to Y. To further enhance performance, you can use the Change Commitment
Options (QTNCHGCO) API to change the Wait for outcome option to N on the system
where your queries are initiated. Refer to the APIs topic in the Information
Center to understand the effects of these commitment option values.</p>
</div>
<div>
<ul class="ullinks">
<li class="ulchildlink"><strong><a href="tip.htm">Performance enhancement tip with DB2 Multisystem</a></strong><br />
For performance enhancement, you can specify some values for the distribute data (DSTDTA) parameter.</li>
</ul>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="RZAF3C5-gen6.htm" title="DB2 Multisystem can help you increase your database capacity, realize improvements in query performance, and provide remote database access through an easier method.">Performance and scalability with DB2 Multisystem</a></div>
</div>
<div class="relref"><strong>Related reference</strong><br />
<div><a href="../apiref/api.htm">APIs</a></div>
</div>
</div>
</body>
</html>