<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="Choose partitioning keys with DB2 Multisystem" />
<meta name="abstract" content="For the system to process the partitioned file in the most efficient manner, there are some tips you can consider when setting up or using a partitioning key." />
<meta name="description" content="For the system to process the partitioned file in the most efficient manner, there are some tips you can consider when setting up or using a partitioning key." />
<meta name="DC.subject" content="partitioning key, choosing" />
<meta name="keywords" content="partitioning key, choosing" />
<meta name="DC.Relation" scheme="URI" content="ption.htm" />
<meta name="DC.Relation" scheme="URI" content="qqp.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="choosepart" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Choose partitioning keys with DB2 Multisystem</title>
</head>
<body id="choosepart"><a name="choosepart"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Choose partitioning keys with DB2 Multisystem</h1>
<div><p>For the system to process a partitioned file efficiently,
consider some tips when you set up or use a partitioning
key.</p>
<p>These tips are:</p>
<ul><li>The best partitioning key is one that has many different values, so that
partitioning distributes the records of data evenly across the nodes.
Customer numbers, last names, claim numbers, ZIP codes (regional mailing
address codes), and telephone area codes are examples of good choices for
partitioning keys. <p>Gender, because only two choices
exist, male or female, is an example of a poor choice for a partitioning key.
It causes too much data to be distributed to a single node instead of
being spread across the nodes. Also, a query against a file that uses gender as
the partitioning key forces the system to process too many records of data;
another field or fields can narrow the scope of the query and make
it much more efficient. A partitioning key based on gender is a poor choice
when you want an even distribution of data rather than
distribution based on specific values.</p>
<p>When preparing to change a local
file into a distributed file, you can use the HASH function to see
how the data would be distributed. Because the HASH function can be used against
local files and with any combination of columns, you can try different partitioning
keys before actually changing the file to be distributed. For example, if
you plan to use the ZIP code field of a file, you can run the HASH function
against that field to see how many records hash to each
partition number. This helps you choose your partitioning key fields
or create the partition map in your node groups.</p>
</li>
<li>Do not choose a field that needs to be updated often. The value of a
partitioning key field can be updated only if
the update does not force the record to a different node.</li>
<li>Do not use many fields in the partitioning key; the best choice is
a single field. Using many fields forces the system to do more work at I/O
time.</li>
<li>Choose a simple data type, such as fixed-length character or integer,
for your partitioning key. This might help performance because
the hashing is then done on a single field of a simple data type.</li>
<li>When choosing a partitioning key, consider the join and grouping
criteria of the queries you typically run. For example, if a file is involved
in joins, choosing a partitioning key field that is never used as a join field
can adversely affect join performance. See Query design for performance with DB2<sup>®</sup> Multisystem
for information about running queries involving distributed files.</li>
</ul>
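<p>The effect of the number of distinct key values on distribution can also be
illustrated outside the database. The following Python sketch is only an
illustration under stated assumptions: it uses a CRC-32 checksum as a stand-in
for the system's hashing (it is not the DB2 HASH function), it assumes 1024
partitions assigned to nodes in round-robin order, and the names
<code>partition_number</code>, <code>records_per_node</code>, and the sample
records are hypothetical.</p>

```python
import zlib

NUM_PARTITIONS = 1024  # assumed partition count for this sketch


def partition_number(key_value) -> int:
    """Map a key value to a partition number with CRC-32.

    CRC-32 is a stand-in chosen for illustration; it is not DB2's HASH.
    """
    return zlib.crc32(str(key_value).encode("utf-8")) % NUM_PARTITIONS


def records_per_node(records, key_field, num_nodes):
    """Count how many records land on each node for a candidate key field.

    Partitions are assigned to nodes round-robin (a hypothetical partition map).
    """
    counts = {node: 0 for node in range(num_nodes)}
    for record in records:
        node = partition_number(record[key_field]) % num_nodes
        counts[node] += 1
    return counts


# Hypothetical sample data: 10,000 customers with two candidate key fields.
records = [
    {"customer_number": n, "gender": "F" if n % 2 else "M"}
    for n in range(10_000)
]

by_gender = records_per_node(records, "gender", num_nodes=4)
by_customer = records_per_node(records, "customer_number", num_nodes=4)

print("gender key:         ", by_gender)
print("customer number key:", by_customer)
```

<p>Because the gender field has only two distinct values, its records can
occupy at most two of the four nodes in this sketch, whereas the customer
numbers spread across all of them. On the system itself, running the HASH
function against the real file remains the way to check actual data.</p>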
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="ption.htm" title="Partitioning is the process of distributing a file across the nodes in a node group.">Partitioning with DB2 Multisystem</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="qqp.htm" title="This topic provides you with some guidelines for designing queries so that they use query resources more efficiently when you run queries that use distributed files.">Query design for performance with DB2 Multisystem</a></div>
</div>
</div>
</body>
</html>