<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="Choose partitioning keys with DB2 Multisystem" />
<meta name="abstract" content="For the system to process the partitioned file in the most efficient manner, there are some tips you can consider when setting up or using a partitioning key." />
<meta name="description" content="For the system to process the partitioned file in the most efficient manner, there are some tips you can consider when setting up or using a partitioning key." />
<meta name="DC.subject" content="partitioning key, choosing" />
<meta name="keywords" content="partitioning key, choosing" />
<meta name="DC.Relation" scheme="URI" content="ption.htm" />
<meta name="DC.Relation" scheme="URI" content="qqp.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="choosepart" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Choose partitioning keys with DB2 Multisystem</title>
</head>
<body id="choosepart"><a name="choosepart"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Choose partitioning keys with DB2 Multisystem</h1>
<div><p>To help the system process a partitioned file as efficiently as possible,
consider the following tips when you set up or use a partitioning key:</p>
<ul><li>The best partitioning key is one that has many different values, so that
partitioning distributes the records of data evenly across the nodes. Customer
numbers, last names, claim numbers, ZIP codes (regional mailing address codes),
and telephone area codes are examples of good categories for partitioning
keys. <p>Gender is an example of a poor choice for a partitioning key because
only two choices exist, male or female. With so few values, too much data is
distributed to a single node instead of being spread across the nodes. Also,
when you run a query, a partitioning key based on gender causes the system
to process too many records of data. Another field, or combination of fields,
can narrow the scope of the query and make it much more efficient. A partitioning
key based on gender is a poor choice in cases where even distribution of data
is wanted rather than distribution based on specific values.</p>
<p>When preparing to change a local
file into a distributed file, you can use the HASH function to get an idea
of how the data will be distributed. Because the HASH function can be used against
local files and with any combination of columns, you can try different partitioning
keys before actually changing the file to be distributed. For example, if
you plan to use the ZIP code field of a file, you can run the HASH function
on that field to see how many records hash to each
partition number. This helps you choose your partitioning key fields
or create the partition map in your node groups.</p>
</li>
<li>Do not choose a field that needs to be updated often. The values of
partitioning key fields can be updated only if the update does not force
the record to a different node.</li>
<li>Do not use many fields in the partitioning key; the best choice is to
use one field. Using many fields forces the system to do more work at I/O
time.</li>
<li>Choose a simple data type, such as fixed-length character or integer,
as your partitioning key. This consideration might help performance because
the hashing is done for a single field of a simple data type.</li>
<li>When choosing a partitioning key, consider the join and grouping
criteria of the queries you typically run. For example, if a file is involved
in joins, choosing a partitioning key field that is never used as a join field
can adversely affect join performance. See Query design for performance with DB2<sup>®</sup> Multisystem
for information about running queries involving distributed files.</li>
</ul>
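<p>The idea of checking how a candidate key distributes records before you
partition a file can be sketched outside the database. The following Python
sketch is illustrative only. It assumes a stand-in hash (CRC-32), an assumed
count of 1024 partition numbers, and a hypothetical round-robin partition map.
None of these names or values come from this topic, and the real HASH function
might distribute the same values differently.</p>

```python
import zlib
from collections import Counter

# Assumption: 1024 partition numbers. This value is not stated in this topic.
PARTITION_COUNT = 1024


def partition_number(value, partitions=PARTITION_COUNT):
    """Stand-in for the HASH function: map a key value to a partition number.

    CRC-32 is used only for illustration; it is NOT the system's algorithm.
    """
    return zlib.crc32(str(value).encode("utf-8")) % partitions


def node_for(partition, node_count):
    """Hypothetical partition map: assign partition numbers to nodes round-robin."""
    return partition % node_count


def records_per_node(key_values, node_count):
    """Count the records that each node would receive for a candidate key."""
    counts = Counter({node: 0 for node in range(node_count)})
    for value in key_values:
        counts[node_for(partition_number(value), node_count)] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data: 10,000 records with two candidate key fields.
    customer_numbers = range(10000, 20000)
    genders = ["F" if n % 2 == 0 else "M" for n in customer_numbers]

    print("customer number:", dict(records_per_node(customer_numbers, 4)))
    print("gender:         ", dict(records_per_node(genders, 4)))
```

<p>Because the gender field has only two values, it can reach at most two of
the four nodes in this sketch, whatever hash is used. A candidate key whose
counts are close to each other across all nodes is the better choice when
even distribution is the goal.</p>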
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="ption.htm" title="Partitioning is the process of distributing a file across the nodes in a node group.">Partitioning with DB2 Multisystem</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="qqp.htm" title="This topic provides you with some guidelines for designing queries so that they use query resources more efficiently when you run queries that use distributed files.">Query design for performance with DB2 Multisystem</a></div>
</div>
</div>
</body>
</html>