ibm-information-center/dist/eclipse/plugins/i5OS.ic.dbmult_5.4.0.1/joinopt.htm

81 lines
5.1 KiB
HTML
Raw Permalink Normal View History

2024-04-02 14:02:31 +00:00
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="Join optimization with DB2 Multisystem" />
<meta name="abstract" content="The distributed query optimizer generates a plan to join distributed files." />
<meta name="description" content="The distributed query optimizer generates a plan to join distributed files." />
<meta name="DC.subject" content="join, optimization" />
<meta name="keywords" content="join, optimization" />
<meta name="DC.Relation" scheme="URI" content="joins.htm" />
<meta name="DC.Relation" scheme="URI" content="partkeysjoin.htm" />
<meta name="DC.Relation" scheme="URI" content="joins.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="joinopt" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Join optimization with DB2 Multisystem</title>
</head>
<body id="joinopt"><a name="joinopt"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Join optimization with DB2<sup>®</sup> Multisystem</h1>
<div><p>The distributed query optimizer generates a plan to join distributed
files.</p>
<p>The distributed query optimizer looks at file sizes, expected number of
records selected for each file, and the type of distributed joins that are
possible; and then the optimizer breaks the query into multiple steps. Each
step creates an intermediate result file that is used as input for the next
step.</p>
<p>During optimization, a cost is calculated for each join step based on the
type of distributed join. The cost reflects, in part, the amount of data movement
required for that join step. The cost is used to determine the final distributed
plan.</p>
<p>As much processing as possible is completed during each step; for example,
record selection isolated to a given step is performed during that step, and
as many files as possible are joined for each step. Each join step might involve
more than one type of distributed join. A collocated join and a directed join
can be combined into one collocated join by directing the necessary file first.
A directed join and a re-partitioned join can be combined by directing all
the files first and then performing the join. Note that directed and re-partitioned
joins are really just a collocated join, with one or more files being directed
before the join occurs.</p>
<p>When joining distributed files with local files, the distributed
query optimizer calculates a cost, similar to the cost calculated when joining
distributed files. Based on this cost, the distributed query optimizer can
choose to perform one of the following actions:</p>
<ul><li>Broadcast all of the local files to the data nodes of the distributed
file and perform a collocated join.</li>
<li>Broadcast all of the local and distributed files to the data nodes of
the largest distributed file and perform a collocated join.</li>
<li>Direct the distributed files back to the coordinator node and perform
the join there.</li>
</ul>
</div>
<div>
<ul class="ullinks">
<li class="ulchildlink"><strong><a href="partkeysjoin.htm">Partitioning keys over join fields with DB2 Multisystem</a></strong><br />
From the preceding sections on the types of joins, you can see that data movement is required for all distributed join types except a collocated join.</li>
</ul>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="joins.htm" title="In addition to the performance considerations for nondistributed join queries, more performance considerations exist for queries involving distributed files.">Implementation and optimization of join operations with DB2 Multisystem</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="joins.htm" title="In addition to the performance considerations for nondistributed join queries, more performance considerations exist for queries involving distributed files.">Implementation and optimization of join operations with DB2 Multisystem</a></div>
</div>
</div>
</body>
</html>