<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="reference" />
<meta name="DC.Title" content="Grouping hash implementation" />
<meta name="abstract" content="This technique uses the base hash access method to perform grouping or summarization of the selected table rows. For each selected row, the specified grouping value is run through the hash function. The computed hash value and grouping value are used to quickly find the entry in the hash table corresponding to the grouping value." />
<meta name="description" content="This technique uses the base hash access method to perform grouping or summarization of the selected table rows. For each selected row, the specified grouping value is run through the hash function. The computed hash value and grouping value are used to quickly find the entry in the hash table corresponding to the grouping value." />
<meta name="DC.Relation" scheme="URI" content="groupopt.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="grpinghashimp" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Grouping hash implementation</title>
</head>
<body id="grpinghashimp"><a name="grpinghashimp"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Grouping hash implementation</h1>
<div><p>This technique uses the base hash access method to perform grouping
or summarization of the selected table rows. For each selected row, the specified
grouping value is run through the hash function.  The computed hash value
and grouping value are used to quickly find the entry in the hash table corresponding
to the grouping value.</p>
<div class="section"><p>If the current grouping value already has a row in the hash table,
the hash table entry is retrieved and summarized (updated) with the current
table row values based on the requested grouping column operations (such as
SUM or COUNT). If a hash table entry is not found for the current grouping
value, a new entry is inserted into the hash table and initialized with the
current grouping value.</p>
</div>
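<div class="section"><p>The lookup, update, and insert steps described above can be sketched
as follows. This is only an illustrative sketch, not the database manager's
implementation: the function name <samp>hash_group</samp>, the sample rows,
and the use of a Python dictionary as the hash table are all assumptions made
for the example.</p>

```python
def hash_group(rows, key_column, sum_column):
    """Group rows by key_column, keeping a COUNT and a SUM per group."""
    # A Python dict stands in for the hash table (assumption for this sketch).
    hash_table = {}
    for row in rows:
        key = row[key_column]
        entry = hash_table.get(key)
        if entry is None:
            # No entry for this grouping value yet: insert a new entry
            # initialized with the grouping value.
            entry = {"group": key, "count": 0, "sum": 0}
            hash_table[key] = entry
        # Summarize (update) the entry with the current row's values.
        entry["count"] += 1
        entry["sum"] += row[sum_column]
    return hash_table


# Hypothetical sample data.
rows = [
    {"dept": "A", "salary": 100},
    {"dept": "B", "salary": 200},
    {"dept": "A", "salary": 50},
]
summary = hash_group(rows, "dept", "salary")
```

<p>In this sketch, the three input rows consolidate into two hash table entries,
one for each distinct grouping value.</p>
</div>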
<div class="section"><p>The time required to receive the first group result for this implementation
will most likely be longer than for other grouping implementations because the
hash table must be built and completely populated before any result can be
returned. Once the hash table is populated, the database manager applies any
specified grouping selection criteria or ordering to the summary entries in
the hash table and then uses the table to start returning the grouping
results.</p>
</div>
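<div class="section"><p>The final step, applying grouping selection and ordering to the
completed summary entries, might look like the following sketch. The function
name <samp>finish_grouping</samp>, its parameters, and the sample entries are
hypothetical; the filter plays the role of a HAVING clause and the sort key
plays the role of an ORDER BY clause.</p>

```python
def finish_grouping(hash_table, having, order_key):
    """Apply grouping selection and ordering to completed summary entries."""
    # Selection runs only after every selected row has been summarized.
    selected = [entry for entry in hash_table.values() if having(entry)]
    return sorted(selected, key=order_key)


# Hypothetical summary entries, as they might look once the table is populated.
hash_table = {
    "B": {"group": "B", "count": 1, "sum": 200},
    "A": {"group": "A", "count": 2, "sum": 150},
    "C": {"group": "C", "count": 1, "sum": 40},
}
results = finish_grouping(
    hash_table,
    having=lambda entry: entry["sum"] >= 100,   # like HAVING SUM(...) >= 100
    order_key=lambda entry: entry["group"],     # like ORDER BY the group column
)
```

<p>Because both steps operate on whole summary entries, neither can begin until
the last selected row has been processed, which is why the first result is
delayed.</p>
</div>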
<div class="section"><h4 class="sectiontitle">Where the grouping hash method is most effective</h4><p>The
grouping hash method is most effective when the consolidation ratio is high.
The <strong>consolidation ratio</strong> is the ratio of the selected table rows to
the computed grouping results. If every selected table row has its own unique
grouping value, the ratio is 1 and the hash table needs one entry for each row.
A hash table that large slows down the hashing access method.</p>
</div>
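<div class="section"><p>As a small worked example of the definition above, the following
sketch computes the consolidation ratio for two hypothetical sets of grouping
values. The function name <samp>consolidation_ratio</samp> and the data are
assumptions made for illustration.</p>

```python
def consolidation_ratio(grouping_values):
    """Ratio of selected rows to distinct groups (input must be non-empty)."""
    return len(grouping_values) / len(set(grouping_values))


# 1000 selected rows that fall into only 2 groups: high ratio (500.0).
high = consolidation_ratio(["A"] * 900 + ["B"] * 100)

# 1000 selected rows, each with a unique grouping value: ratio of 1.0.
low = consolidation_ratio(list(range(1000)))
```

<p>The first case needs only two hash table entries for 1000 rows. The second
case needs 1000 entries, so hashing consolidates nothing.</p>
</div>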
<div class="section"><p>The optimizer estimates the consolidation ratio by first determining
the number of unique values in the specified grouping columns (that is, the
expected number of groups in the database table). The optimizer then examines
the total number of rows in the table and the specified selection criteria
and uses the result of this examination to estimate the consolidation ratio.</p>
</div>
<div class="section"><p>Indexes over the grouping columns can help make the optimizer's
ratio estimate more accurate.  Indexes improve the accuracy because they contain
statistics that include the average number of duplicate values for the key
columns.</p>
</div>
<div class="section"><p>The optimizer also uses its estimate of the expected number of groups
to compute the number of partitions in the hash table. The hashing access
method is more effective when the hash table is well-balanced, and the number
of hash table partitions directly affects how uniformly entries are distributed
across the hash table.</p>
</div>
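<div class="section"><p>To illustrate how the partition count affects distribution, the
following sketch assigns hypothetical grouping values to partitions and counts
the entries in each. It is not how the database manager partitions its hash
tables: the CRC-32 checksum stands in for the hash function, assignment by
remainder is an assumption, and the function name
<samp>partition_counts</samp> is invented for the example.</p>

```python
import zlib
from collections import Counter


def partition_counts(group_values, num_partitions):
    """Count how many grouping values land in each partition."""
    counts = Counter()
    for value in group_values:
        # CRC-32 is a stand-in hash function (assumption for this sketch).
        hash_value = zlib.crc32(str(value).encode("utf-8"))
        counts[hash_value % num_partitions] += 1
    return counts


# 1000 hypothetical grouping values.
groups = [f"CUST{i:05d}" for i in range(1000)]
few = partition_counts(groups, 4)     # at least 250 entries in the fullest partition
many = partition_counts(groups, 64)   # far fewer entries per partition on average
```

<p>With too few partitions for the number of groups, each partition must hold
many entries regardless of how good the hash function is. This is why an
accurate estimate of the number of groups matters when sizing the table.</p>
</div>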
<div class="section"><p>The hash function performs better when the grouping columns have
non-numeric data types or the integer (binary) data type than when they have
other numeric data types. In addition, the hash function performs more
effectively when the grouping columns are neither variable-length nor
null-capable.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="groupopt.htm" title="DB2 Universal Database for iSeries has certain techniques to use when the optimizer encounters grouping. The query optimizer chooses its methods for optimizing your query.">Grouping optimization</a></div>
</div>
</div>
</body>
</html>