<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="Nested loop join implementation" />
<meta name="abstract" content="DB2 Universal Database for iSeries provides a nested loop join method. For this method, the processing of the tables in the join is ordered. This order is called the join order. The first table in the final join order is called the primary table. The other tables are called secondary tables. Each join table position is called a dial." />
<meta name="description" content="DB2 Universal Database for iSeries provides a nested loop join method. For this method, the processing of the tables in the join is ordered. This order is called the join order. The first table in the final join order is called the primary table. The other tables are called secondary tables. Each join table position is called a dial." />
<meta name="DC.subject" content="optimization, nested loop join, definitions, primary table, secondary tables, dial, hash join, join, hash" />
<meta name="keywords" content="optimization, nested loop join, definitions, primary table, secondary tables, dial, hash join, join, hash" />
<meta name="DC.Relation" scheme="URI" content="perf24.htm" />
<meta name="DC.Relation" scheme="URI" content="rzajq1parallel.htm" />
<meta name="DC.Relation" scheme="URI" content="dsscan.htm" />
<meta name="DC.Relation" scheme="URI" content="rzajqslistprobe.htm" />
<meta name="DC.Relation" scheme="URI" content="rzajqhtblprobe.htm" />
<meta name="DC.Relation" scheme="URI" content="rzajqrinprobe.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="c23nl" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Nested loop join implementation</title>
</head>
<body id="c23nl"><a name="c23nl"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Nested loop join implementation</h1>
<div><p><span class="keyword">DB2 Universal Database™ for iSeries™</span> provides
a <strong>nested loop</strong> join method. For this method, the tables
in the join are processed in a specific order, called the <strong>join order</strong>. The first
table in the final join order is called the <strong>primary table</strong>. The other
tables are called <strong>secondary tables</strong>. Each join table position is called
a <strong>dial</strong>. </p>
<p>The nested loop is implemented using an index on the secondary
tables, a hash table, a sorted list, or a table scan (arrival sequence) on the secondary
tables. In general, the join is implemented using either an index or
a hash table.</p>
<div class="section"><h4 class="sectiontitle">Index nested loop join implementation</h4><p>During the
join, <span class="keyword">DB2 Universal Database for iSeries</span>: </p>
<ol><li id="c23nl__rle1"><a name="c23nl__rle1"><!-- --></a>Accesses the first primary table row selected by the predicates
local to the primary table.</li>
<li>Builds a key value from the join columns in the primary table.</li>
<li>Depending on the access to the first secondary table: <ul><li>If using an index to access the secondary table, a Radix Index Probe
locates the first row that satisfies the join condition for the first
secondary table, using an index whose keys match the join condition or the
local row selection columns of the secondary table.</li>
<li>Applies bitmap selection, if applicable. <p>All rows that satisfy the
join condition from each secondary dial are located using an index. Rows are
retrieved from secondary tables in random sequence, and this random disk I/O
often accounts for a large percentage of the processing time of the query.
A given secondary dial is searched once for each combination of rows from the
primary table and the preceding secondary dials that satisfies the join conditions
of those preceding dials, so a large number of searches may
be performed against the later dials. Any inefficiency in the processing
of the later dials can significantly inflate the query processing time. For
this reason, attention to performance considerations
can reduce the run time of a join query from hours to minutes.</p>
<p>If an
efficient index cannot be found, a temporary index may be created. Some join
queries build temporary indexes over secondary dials even when an index exists
for all of the join keys. Because efficiency is very important for the secondary
dials of longer running queries, the query optimizer may choose to build a
temporary index that contains only the entries that pass the local row selection
for that dial. This preprocessing allows the database manager
to apply row selection in one pass instead of each time rows are matched
for a dial.</p>
</li>
<li>If using a Hash Table Probe to access the secondary table, a hash temporary
result table is created on the first probe. It contains all of the rows selected by local selection
against the table. The structure of the hash table is such
that rows with the same join value are loaded into the same hash table partition
(clustered). The location of the rows for any given join value can be found
by applying a hashing function to the join value. <div class="p">A nested loop join using
a Hash Table Probe has several advantages over a nested loop join using an
Index Probe: <ul><li>The structure of a hash temporary result table is simpler than that of
an index, so less CPU processing is required to build and probe a hash table.</li>
<li>The rows in the hash result table contain all of the data required by
the query, so there is no need to access the dataspace of the table with random
I/O when probing the hash table.</li>
<li>Like join values are clustered, so all matching rows for a given join
value can typically be accessed with a single I/O request. </li>
<li>The hash temporary result table can be built using SMP parallelism.</li>
<li>Unlike indexes, entries in hash tables are not updated to reflect changes
of column values in the underlying table. The existence of a hash table does
not affect the processing cost of other updating jobs in the server.</li>
</ul>
</div>
</li>
<li>If using a Sorted List Probe to access the secondary table, a sorted list
result is created on the first probe. It contains all of the rows selected by local selection
against the table. The structure of the sorted list
is such that rows with the same join value are sorted together in the list.
The location of the rows for any given join value can be found by probing
using the join value. </li>
<li>If using a table scan to access the secondary table, the secondary table is scanned
to locate the first row that satisfies the join condition for the first secondary
table, matching on the join condition or local row selection
columns of the secondary table. The join may be implemented with a table scan
when the secondary table is a user-defined table function.</li>
</ul>
</li>
<li id="c23nl__rlee"><a name="c23nl__rlee"><!-- --></a>Determines if the row is selected by applying any remaining
selection local to the first secondary dial. <p>If the secondary dial row
is not selected, then the next row that satisfies the join condition is located.
Steps 1 through 4 are repeated until a row that satisfies both the join condition
and any remaining selection is selected from all secondary tables.</p>
</li>
<li>Returns the result join row.</li>
<li>Processes the last secondary table again to find the next row that satisfies
the join condition in that dial. <p>During this processing, when no more
rows that satisfy the join condition can be selected, the processing backs
up to the logically previous dial and attempts to read the next row that satisfies
its join condition.</p>
</li>
<li>Ends processing when all selected rows from the primary table are processed.</li>
</ol>
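<p>The steps above can be sketched in a few lines of Python. This is only an illustration under
simplifying assumptions, not the database manager's implementation: a sorted list of
(key value, row number) pairs stands in for a radix index, tables are in-memory lists of rows,
and the names build_index, index_probe, make_dial, and nested_loop_join, along with the sample
tables, are hypothetical.</p>

```python
from bisect import bisect_left


def build_index(table, key):
    """Sorted (key value, row number) pairs: a simple stand-in for a keyed index."""
    return sorted((row[key], rid) for rid, row in enumerate(table))


def index_probe(index, table, value):
    """Yield every row whose key equals value, located by binary search."""
    pos = bisect_left(index, (value, -1))  # row numbers are never negative
    while pos != len(index) and index[pos][0] == value:
        yield table[index[pos][1]]  # fetch the row by row number (random access)
        pos += 1


def make_dial(table, key, select=lambda row: True):
    """A secondary dial: the table, an index on its join column, and local selection."""
    return table, build_index(table, key), key, select


def nested_loop_join(primary, primary_select, dials):
    """Join the primary table to each secondary dial, in join order."""

    def walk(joined, depth):
        if depth == len(dials):
            yield joined  # step 5: return the result join row
            return
        table, index, key, select = dials[depth]
        # Steps 2 and 3: take the key value from the row built so far, then probe.
        for row in index_probe(index, table, joined[key]):
            if select(row):  # step 4: remaining selection local to this dial
                yield from walk({**joined, **row}, depth + 1)
        # Step 6: once the probe is exhausted, control backs up to the previous dial.

    for row in primary:  # steps 1 and 7: each selected primary row, until none remain
        if primary_select(row):
            yield from walk(row, 0)


customers = [{"cust": 10, "region": "EU"}, {"cust": 20, "region": "US"}]
orders = [{"order": 1, "cust": 10}, {"order": 2, "cust": 20}, {"order": 3, "cust": 10}]
items = [{"order": 1, "sku": "A"}, {"order": 3, "sku": "B"}, {"order": 3, "sku": "C"}]

dials = [make_dial(orders, "cust"), make_dial(items, "order")]
result = list(nested_loop_join(customers, lambda r: r["region"] == "EU", dials))
```

<p>In this sketch the last dial (items) is probed once for every order row that survives the
earlier dials, which is why the cost of probing later dials tends to dominate.</p>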
<p>Note the following characteristics of a nested loop join: </p>
<ul><li>If ordering or grouping is specified and all the columns are over a single
table and that table is eligible to be the primary, then the optimizer costs
the join with that table as the primary table and with the grouping and ordering
performed through an index. </li>
<li>If ordering or grouping is specified on two or more tables, or if temporaries
are allowed, <span class="keyword">DB2 Universal Database for iSeries</span> breaks
the processing of the query into two parts: <ol><li>The join selection is performed without the ordering or grouping processing,
and the result rows are written to a temporary work table. This allows the optimizer
to consider any table of the join query as a candidate for the primary table.</li>
<li>The ordering or grouping processing is then performed on the data in the
temporary work table.</li>
</ol>
</li>
</ul>
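<p>The two-part processing described in the list above might look like the following sketch. It assumes
in-memory lists in place of tables and of the temporary work table, and the function names
join_unordered and order_work_table are hypothetical. The point it illustrates is that when the
ordering columns come from two tables, the ordering is applied to the materialized join result
rather than supplied by an index over a single table.</p>

```python
from operator import itemgetter


def join_unordered(left, right, key):
    """Part 1: join with no ordering, so either table could serve as the primary."""
    for lrow in left:
        for rrow in right:
            if lrow[key] == rrow[key]:
                yield {**lrow, **rrow}


def order_work_table(join_rows, order_by):
    """Part 2: write the join result to a temporary work table, then sort it."""
    work_table = list(join_rows)  # hypothetical in-memory temporary work table
    work_table.sort(key=itemgetter(*order_by))
    return work_table


customers = [{"cust": 10, "region": "US"}, {"cust": 20, "region": "EU"}]
orders = [{"cust": 10, "amount": 5}, {"cust": 20, "amount": 9}, {"cust": 20, "amount": 3}]

# ORDER BY region, amount: the ordering columns come from two different tables.
ordered = order_work_table(join_unordered(orders, customers, "cust"), ["region", "amount"])
```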
</div>
<div class="section"><h4 class="sectiontitle">Queries that cannot use hash join</h4><p>Hash join cannot
be used for queries that: </p>
<ul><li>Involve physical files or tables
that have read triggers.</li>
<li>Require that the cursor position be restored as the result of the SQL
ROLLBACK HOLD statement or the ROLLBACK CL command. For SQL applications using
a commitment control level other than *NONE, this requires that *ALLREAD be
specified as the value for the ALWBLK precompiler parameter.</li>
<li>Join a table on a join condition that uses
something other than an equals operator.</li>
<li>CQE does not support hash join if the query contains any of the following: <ul><li>Subqueries, unless all subqueries in the query can be transformed to inner
joins.</li>
<li>UNION or UNION ALL.</li>
<li>A left outer or exception join. </li>
<li>A DDS-created join logical file. </li>
</ul>
</li>
</ul>
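<p>A small sketch of a hash table probe may help explain the equals-operator restriction. It is an
illustration only: the class name HashProbe and its parameters are hypothetical, and a Python
dictionary stands in for the hash temporary result table. The table is built on the first probe
from the rows that pass local selection, it copies only the columns the query needs, and a lookup
can only return rows whose join value equals the probe value.</p>

```python
from collections import defaultdict


class HashProbe:
    """Hash temporary result, built on the first probe from locally selected rows."""

    def __init__(self, table, key, select, columns):
        self.table, self.key, self.select, self.columns = table, key, select, columns
        self.buckets = None

    def probe(self, value):
        if self.buckets is None:  # build once, on the first probe
            self.buckets = defaultdict(list)
            for row in self.table:
                if self.select(row):
                    # Copy only the columns the query needs into the hash table.
                    needed = {c: row[c] for c in self.columns}
                    self.buckets[row[self.key]].append(needed)
        # Hashing finds only rows whose join value EQUALS the probe value.
        return self.buckets.get(value, [])


orders = [
    {"cust": 10, "amount": 5, "status": "open"},
    {"cust": 10, "amount": 7, "status": "closed"},
    {"cust": 20, "amount": 9, "status": "open"},
]
open_orders = HashProbe(orders, "cust", lambda r: r["status"] == "open", ["cust", "amount"])
matches = open_orders.probe(10)
```

<p>A condition such as greater-than gives the hashing function nothing to look up, so every
partition would have to be examined. The sketch also reflects the point that hash table entries
are not maintained: rows added to the underlying list after the first probe do not appear in
later probes.</p>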
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="perf24.htm" title="A join operation is a complex function that requires special attention in order to achieve good performance. This section describes how DB2 Universal Database for iSeries implements join queries and how optimization choices are made by the query optimizer. It also describes design tips and techniques which help avoid or solve performance problems.">Join optimization</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="rzajq1parallel.htm" title="The DB2 UDB Symmetric Multiprocessing feature provides the optimizer with additional methods for retrieving data that include parallel processing. Symmetrical multiprocessing (SMP) is a form of parallelism achieved on a single server where multiple (CPU and I/O) processors that share memory and disk resource work simultaneously toward achieving a single end result.">Objects processed in parallel</a></div>
</div>
<div class="relref"><strong>Related reference</strong><br />
<div><a href="dsscan.htm" title="A table scan is the easiest and simplest operation that can be performed against a table. It sequentially processes all of the rows in the table to determine if they satisfy the selection criteria specified in the query. It does this in a way to maximize the I/O throughput for the table.">Table scan</a></div>
<div><a href="rzajqslistprobe.htm" title="A sorted list probe operation is used to retrieve rows from a temporary sorted list based upon a probe lookup operation.">Sorted list probe</a></div>
<div><a href="rzajqhtblprobe.htm" title="A hash table probe operation is used to retrieve rows from a temporary hash table based upon a probe lookup operation.">Hash table probe</a></div>
<div><a href="rzajqrinprobe.htm" title="A radix index probe operation is used to retrieve the rows from a table in a keyed sequence. The main difference between the Radix Index Probe and the Radix Index Scan is that the rows being returned must first be identified by a probe operation to subset the rows being retrieved.">Radix index probe</a></div>
</div>
</div>
</body>
</html>