81 lines
5.1 KiB
HTML
81 lines
5.1 KiB
HTML
|
<?xml version="1.0" encoding="UTF-8"?>
|
||
|
<!DOCTYPE html
|
||
|
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
|
||
|
<html lang="en-us" xml:lang="en-us">
|
||
|
<head>
|
||
|
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
|
||
|
<meta name="security" content="public" />
|
||
|
<meta name="Robots" content="index,follow" />
|
||
|
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
|
||
|
<meta name="DC.Type" content="concept" />
|
||
|
<meta name="DC.Title" content="Join optimization with DB2 Multisystem" />
|
||
|
<meta name="abstract" content="The distributed query optimizer generates a plan to join distributed files." />
|
||
|
<meta name="description" content="The distributed query optimizer generates a plan to join distributed files." />
|
||
|
<meta name="DC.subject" content="join, optimization" />
|
||
|
<meta name="keywords" content="join, optimization" />
|
||
|
<meta name="DC.Relation" scheme="URI" content="joins.htm" />
|
||
|
<meta name="DC.Relation" scheme="URI" content="partkeysjoin.htm" />
|
||
|
<meta name="DC.Relation" scheme="URI" content="joins.htm" />
|
||
|
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
|
||
|
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
|
||
|
<meta name="DC.Format" content="XHTML" />
|
||
|
<meta name="DC.Identifier" content="joinopt" />
|
||
|
<meta name="DC.Language" content="en-us" />
|
||
|
<!-- All rights reserved. Licensed Materials Property of IBM -->
|
||
|
<!-- US Government Users Restricted Rights -->
|
||
|
<!-- Use, duplication or disclosure restricted by -->
|
||
|
<!-- GSA ADP Schedule Contract with IBM Corp. -->
|
||
|
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
|
||
|
<link rel="stylesheet" type="text/css" href="./ic.css" />
|
||
|
<title>Join optimization with DB2 Multisystem</title>
|
||
|
</head>
|
||
|
<body id="joinopt"><a name="joinopt"><!-- --></a>
|
||
|
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
|
||
|
<h1 class="topictitle1">Join optimization with DB2<sup>®</sup> Multisystem</h1>
|
||
|
<div><p>The distributed query optimizer generates a plan to join distributed
|
||
|
files.</p>
|
||
|
<p>The distributed query optimizer looks at file sizes, expected number of
|
||
|
records selected for each file, and the type of distributed joins that are
|
||
|
possible; and then the optimizer breaks the query into multiple steps. Each
|
||
|
step creates an intermediate result file that is used as input for the next
|
||
|
step.</p>
|
||
|
<p>During optimization, a cost is calculated for each join step based on the
|
||
|
type of distributed join. The cost reflects, in part, the amount of data movement
|
||
|
required for that join step. The cost is used to determine the final distributed
|
||
|
plan.</p>
|
||
|
<p>As much processing as possible is completed during each step; for example,
|
||
|
record selection isolated to a given step is performed during that step, and
|
||
|
as many files as possible are joined for each step. Each join step might involve
|
||
|
more than one type of distributed join. A collocated join and a directed join
|
||
|
can be combined into one collocated join by directing the necessary file first.
|
||
|
A directed join and a re-partitioned join can be combined by directing all
|
||
|
the files first and then performing the join. Note that directed and re-partitioned
|
||
|
joins are really just a collocated join, with one or more files being directed
|
||
|
before the join occurs.</p>
|
||
|
<p>When joining distributed files with local files, the distributed
|
||
|
query optimizer calculates a cost, similar to the cost calculated when joining
|
||
|
distributed files. Based on this cost, the distributed query optimizer can
|
||
|
choose to perform one of the following actions:</p>
|
||
|
<ul><li>Broadcast all of the local files to the data nodes of the distributed
|
||
|
file and perform a collocated join.</li>
|
||
|
<li>Broadcast all of the local and distributed files to the data nodes of
|
||
|
the largest distributed file and perform a collocated join.</li>
|
||
|
<li>Direct the distributed files back to the coordinator node and perform
|
||
|
the join there.</li>
|
||
|
</ul>
|
||
|
</div>
|
||
|
<div>
|
||
|
<ul class="ullinks">
|
||
|
<li class="ulchildlink"><strong><a href="partkeysjoin.htm">Partitioning keys over join fields with DB2 Multisystem</a></strong><br />
|
||
|
From the preceding sections on the types of joins, you can see that data movement is required for all distributed join types except a collocated join.</li>
|
||
|
</ul>
|
||
|
|
||
|
<div class="familylinks">
|
||
|
<div class="parentlink"><strong>Parent topic:</strong> <a href="joins.htm" title="In addition to the performance considerations for nondistributed join queries, more performance considerations exist for queries involving distributed files.">Implementation and optimization of join operations with DB2 Multisystem</a></div>
|
||
|
</div>
|
||
|
<div class="relconcepts"><strong>Related concepts</strong><br />
|
||
|
<div><a href="joins.htm" title="In addition to the performance considerations for nondistributed join queries, more performance considerations exist for queries involving distributed files.">Implementation and optimization of join operations with DB2 Multisystem</a></div>
|
||
|
</div>
|
||
|
</div>
|
||
|
</body>
|
||
|
</html>
|