<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="reference" />
<meta name="DC.Title" content="Unicode considerations for database files" />
<meta name="abstract" content="Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. This topic describes how to specify DDS positions 30 through 37 and positions 45 through 80 when describing database files. Positions not mentioned have no special considerations for Unicode." />
<meta name="description" content="Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. This topic describes how to specify DDS positions 30 through 37 and positions 45 through 80 when describing database files. Positions not mentioned have no special considerations for Unicode." />
<meta name="DC.subject" content="Unicode, physical and logical files, physical files, logical files, DDS file considerations" />
<meta name="keywords" content="Unicode, physical and logical files, physical files, logical files, DDS file considerations" />
<meta name="DC.Relation" scheme="URI" content="kickoff.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2length.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2dtype.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2dec.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2kwd.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 2001, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 2001, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="ucs2ap" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Unicode considerations for database files</title>
</head>
<body id="ucs2ap"><a name="ucs2ap"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Unicode considerations for database files</h1>
<div><p>Unicode is a universal encoding scheme for written characters and
text that enables the exchange of data internationally. This topic describes
how to specify DDS positions 30 through 37 and positions 45 through 80 when
describing database files. Positions not mentioned have no special
considerations for Unicode.</p>
<div class="section"><p>A Unicode field can contain all types of characters
used on an IBM<sup>®</sup> iSeries™ server,
including double-byte character set (DBCS) characters. Unicode data is composed
of <em>code units</em>. A code unit is the smallest byte combination that can
represent a unit of text.</p>
<p>Physical and logical file DDS supports three transformation formats
(encoding forms) of Unicode:</p>
<ul><li><strong>UTF-8</strong> is an 8-bit encoding form designed for ease of use with existing
ASCII-based systems. UTF-8 data is stored in character data types. The CCSID
value for data in UTF-8 format is 1208. <p>A UTF-8 code unit is 1 byte in
length. A UTF-8 character can be 1, 2, 3, or 4 code units in length. A UTF-8
data string can contain any character, including supplementary characters
(those that require surrogates in UTF-16) and combining characters.</p>
</li>
<li><strong>UTF-16</strong> is a 16-bit encoding form designed to provide code values
for over a million characters. It is a superset of UCS-2. UTF-16 data is stored
in graphic data types. The CCSID value for data in UTF-16 format is 1200. <p>A
UTF-16 code unit is 2 bytes in length. A UTF-16 character can be 1 or 2 code
units (2 or 4 bytes) in length. A UTF-16 data string can contain any character,
including UTF-16 surrogates and combining characters.</p>
</li>
<li><strong>UCS-2</strong> is the Universal Character Set coded in 2 octets, which means
that each character is represented in 16 bits. UCS-2 data is stored
in graphic data types. The CCSID value for data in UCS-2 format is 13488. <p>UCS-2
is a subset of UTF-16 and can no longer represent all of the characters defined
by Unicode. UCS-2 is identical to UTF-16, except that UTF-16 also supports
combining characters and surrogates. If you do not need support for combining
characters and surrogates, you can use the UCS-2 type, because
more database functionality is available for it.</p>
</li>
</ul>
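<p>The code-unit counts in the list above can be checked outside of DDS. The
following Python sketch is illustrative only and is not part of DDS or the
database. The helper names <code>code_units</code> and <code>fits_in_ucs2</code>
and the sample characters are assumptions introduced for this example. It
counts the code units that each sample character needs in UTF-8 and UTF-16,
and treats a character as representable in UCS-2 when it needs exactly one
16-bit code unit.</p>

```python
# Illustrative sketch only: helper names and sample characters are assumptions.

def code_units(text, encoding, unit_size):
    """Return how many code units of unit_size bytes encode the text."""
    return len(text.encode(encoding)) // unit_size


def fits_in_ucs2(char):
    """Assume a character fits in UCS-2 if it needs one 16-bit code unit."""
    return code_units(char, "utf-16-be", 2) == 1


samples = {
    "LATIN CAPITAL LETTER A": "\u0041",
    "LATIN SMALL LETTER E WITH ACUTE": "\u00e9",
    "EURO SIGN": "\u20ac",
    "MUSICAL SYMBOL G CLEF": "\U0001d11e",  # supplementary character
}

results = {}
for name, char in samples.items():
    utf8_units = code_units(char, "utf-8", 1)       # 1-byte code units
    utf16_units = code_units(char, "utf-16-be", 2)  # 2-byte code units, no BOM
    results[name] = (utf8_units, utf16_units)
    print(f"{name}: UTF-8 = {utf8_units}, UTF-16 = {utf16_units}, "
          f"UCS-2 = {fits_in_ucs2(char)}")
```

<p>In this sketch the G clef needs 4 UTF-8 code units and 2 UTF-16 code units
(a surrogate pair), so it does not fit in a UCS-2 field. The other three
samples need a single 16-bit code unit.</p>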
<div class="note"><span class="notetitle">Note:</span> In this topic, references to UTF-16 imply UCS-2 as well.</div>
</div>
</div>
<div>
<ul class="ullinks">
<li class="ulchildlink"><strong><a href="ucs2length.htm">Length (positions 30 through 34)</a></strong><br />
Specify the length of the field in these positions. The length of a field containing UTF-16 data can range from 1 through 16 383 code units. The length of a field containing UTF-8 data can range from 1 through 32 766 code units.</li>
<li class="ulchildlink"><strong><a href="ucs2dtype.htm">Data type (position 35)</a></strong><br />
The valid data types for Unicode data are the G (Graphic) data type and the A (Character) data type.</li>
<li class="ulchildlink"><strong><a href="ucs2dec.htm">Decimal positions (positions 36 and 37)</a></strong><br />
Leave these positions blank when using Unicode data.</li>
<li class="ulchildlink"><strong><a href="ucs2kwd.htm">Keyword considerations (positions 45 through 80)</a></strong><br />
Learn how Unicode data is used with some DDS keywords.</li>
</ul>

<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="kickoff.htm" title="You can use data description specifications (DDS) to define physical and logical database files. This topic provides the information you need to code the positional and keyword entries that define these files.">DDS for physical and logical files</a></div>
</div>
</div>
</body>
</html>