<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="reference" />
<meta name="DC.Title" content="Grouping hash implementation" />
<meta name="abstract" content="This technique uses the base hash access method to perform grouping or summarization of the selected table rows. For each selected row, the specified grouping value is run through the hash function. The computed hash value and grouping value are used to quickly find the entry in the hash table corresponding to the grouping value." />
<meta name="description" content="This technique uses the base hash access method to perform grouping or summarization of the selected table rows. For each selected row, the specified grouping value is run through the hash function. The computed hash value and grouping value are used to quickly find the entry in the hash table corresponding to the grouping value." />
<meta name="DC.Relation" scheme="URI" content="groupopt.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="grpinghashimp" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Grouping hash implementation</title>
</head>
<body id="grpinghashimp"><a name="grpinghashimp"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Grouping hash implementation</h1>
<div><p>This technique uses the base hash access method to perform grouping
or summarization of the selected table rows. For each selected row, the specified
grouping value is run through the hash function. The computed hash value
and grouping value are used to quickly find the entry in the hash table corresponding
to the grouping value.</p>
<div class="section"><p>If the current grouping value already has an entry in the hash table,
the entry is retrieved and summarized (updated) with the current
table row values based on the requested grouping column operations (such as
SUM or COUNT). If no hash table entry is found for the current grouping
value, a new entry is inserted into the hash table and initialized with the
current grouping value.</p>
</div>
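<div class="section"><p>The lookup, update, and insert steps described above can be sketched
in a few lines of Python. This is only an illustration under stated assumptions, not the database
manager's implementation: a Python dictionary stands in for the hash table, and the names
<tt>Row</tt>, <tt>region</tt>, <tt>amount</tt>, and <tt>hash_group</tt> are hypothetical.</p>
<pre>
```python
from collections import namedtuple

# Hypothetical row layout, used only for this illustration.
Row = namedtuple("Row", ["region", "amount"])


def hash_group(rows):
    """Group rows by region, keeping SUM(amount) and COUNT(*) per group."""
    table = {}  # a Python dict is a hash table keyed by the grouping value
    for row in rows:
        entry = table.get(row.region)
        if entry is None:
            # No entry yet: insert one initialized with the grouping value.
            entry = {"region": row.region, "sum": 0, "count": 0}
            table[row.region] = entry
        # Summarize (update) the entry with the current row values.
        entry["sum"] += row.amount
        entry["count"] += 1
    return table


rows = [Row("EAST", 10), Row("WEST", 5), Row("EAST", 7)]
summary = hash_group(rows)
print(summary["EAST"])  # {'region': 'EAST', 'sum': 17, 'count': 2}
```
</pre>
<p>Each input row costs one hash lookup, so the work grows with the number of selected rows
while the table grows only with the number of distinct grouping values.</p>
</div>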
<div class="section"><p>The time required to receive the first group result for this implementation
is most likely longer than for other grouping implementations because the
hash table must be built and populated first. Once the hash table is completely
populated, the database manager uses the table to start returning the grouping
results. Before returning any results, the database manager must apply any
specified grouping selection criteria or ordering to the summary entries in
the hash table.</p>
</div>
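<div class="section"><p>The following Python sketch illustrates why the first result is delayed:
no group can be returned until every selected row has been summarized, and the grouping selection
criteria and ordering are applied to the finished summary entries. The function
<tt>group_results</tt> and its <tt>having</tt> and <tt>order_key</tt> parameters are hypothetical
names chosen for this example.</p>
<pre>
```python
def group_results(rows, having, order_key):
    """Return group results only after every row has been summarized."""
    table = {}
    for key, value in rows:
        table[key] = table.get(key, 0) + value  # SUM per grouping value
    # The table is complete only at this point; nothing is returned earlier.
    kept = [(key, total) for key, total in table.items() if having(total)]
    return sorted(kept, key=order_key)


rows = [("EAST", 10), ("WEST", 5), ("EAST", 7), ("NORTH", 1)]
results = group_results(
    rows,
    having=lambda total: total >= 5,   # grouping selection criteria
    order_key=lambda item: item[0],    # ordering by grouping value
)
print(results)  # [('EAST', 17), ('WEST', 5)]
```
</pre>
</div>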
<div class="section"><h4 class="sectiontitle">Where the grouping hash method is most effective</h4><p>The
grouping hash method is most effective when the consolidation ratio is high.
The <strong>consolidation ratio</strong> is the ratio of the selected table rows to
the computed grouping results. If every database table row has its own unique
grouping value (a consolidation ratio of 1), the hash table holds one entry per row
and becomes too large, which in turn slows down the hashing access method.</p>
</div>
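<div class="section"><p>As a small worked example, the consolidation ratio can be computed directly
from a list of grouping values. The function name <tt>consolidation_ratio</tt> is hypothetical and
the data is invented for illustration.</p>
<pre>
```python
def consolidation_ratio(grouping_values):
    """Ratio of selected rows to distinct grouping results."""
    selected = len(grouping_values)
    groups = len(set(grouping_values))
    return selected / groups if groups else 0.0


# Eight selected rows collapse into two groups: a high ratio.
high = consolidation_ratio(["A", "B", "A", "B", "A", "B", "A", "B"])
# Every row has its own grouping value: the lowest possible ratio.
low = consolidation_ratio(["A", "B", "C", "D"])
print(high, low)  # 4.0 1.0
```
</pre>
</div>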
<div class="section"><p>The optimizer estimates the consolidation ratio by first determining
the number of unique values in the specified grouping columns (that is, the
expected number of groups in the database table). The optimizer then examines
the total number of rows in the table and the specified selection criteria
and uses the result of this examination to estimate the consolidation ratio.</p>
</div>
<div class="section"><p>Indexes over the grouping columns can make the optimizer's
ratio estimate more accurate, because indexes contain
statistics that include the average number of duplicate values for the key
columns.</p>
</div>
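<div class="section"><p>The optimizer's actual formula is not given here, so the following Python
sketch is only one plausible way to combine the inputs named above. It assumes that the number of
groups is approximated by dividing the table row count by the average number of duplicate key
values taken from an index, and that the selection criteria are summarized as a single
selectivity fraction. The function and parameter names are hypothetical.</p>
<pre>
```python
def estimate_consolidation_ratio(total_rows, selectivity, avg_duplicates):
    """Rough estimate only; not the optimizer's real formula."""
    # Assumption: distinct grouping values are about rows / average duplicates.
    expected_groups = max(1.0, total_rows / avg_duplicates)
    # Assumption: the selection criteria keep this fraction of the rows.
    selected_rows = total_rows * selectivity
    # There cannot be more groups than selected rows.
    expected_groups = min(expected_groups, max(1.0, selected_rows))
    return selected_rows / expected_groups


ratio = estimate_consolidation_ratio(
    total_rows=100000, selectivity=0.5, avg_duplicates=200
)
print(ratio)  # 100.0
```
</pre>
<p>In this invented case, about 50,000 selected rows fall into about 500 groups, which
suggests a high consolidation ratio.</p>
</div>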
<div class="section"><p>The optimizer also uses its estimate of the expected number of groups
to compute the number of partitions in the hash table.
The hashing access method is more effective when the hash table is well-balanced.
The number of hash table partitions directly affects how entries are distributed
across the hash table and the uniformity of this distribution.</p>
</div>
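<div class="section"><p>To see how the partition count affects the distribution of entries, the
following Python sketch assigns grouping values to partitions and counts the entries in each one.
It is an illustration only: <tt>zlib.crc32</tt> stands in for the database's hash function, which
is not described here, and <tt>partition_counts</tt> and the sample keys are hypothetical.</p>
<pre>
```python
import zlib
from collections import Counter


def partition_counts(keys, partitions):
    """Count how many grouping values land in each hash table partition."""
    return Counter(
        zlib.crc32(key.encode("utf-8")) % partitions for key in keys
    )


keys = ["CUST%04d" % n for n in range(1000)]
few = partition_counts(keys, 4)
many = partition_counts(keys, 64)
# Largest partition in each layout; a smaller value means shorter searches.
print(max(few.values()), max(many.values()))
```
</pre>
<p>With too few partitions for the number of groups, each partition holds many entries.
With a suitable count, the entries spread out and each lookup inspects fewer of them.</p>
</div>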
<div class="section"><p>The hash function performs better when the grouping values consist
of columns that have non-numeric data types. Among the numeric data types, the integer
(binary) data type is the exception and also hashes well. In addition, grouping
columns that are neither variable-length nor null-capable allow
the hash function to perform more effectively.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="groupopt.htm" title="DB2 Universal Database for iSeries has certain techniques to use when the optimizer encounters grouping. The query optimizer chooses its methods for optimizing your query.">Grouping optimization</a></div>
</div>
</div>
</body>
</html>