<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="reference" />
<meta name="DC.Title" content="Unicode considerations for database files" />
<meta name="abstract" content="Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. This topic explains how to specify DDS positions 30 through 37 and positions 45 through 80 when describing database files. Positions not mentioned have no special considerations for Unicode." />
<meta name="description" content="Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. This topic explains how to specify DDS positions 30 through 37 and positions 45 through 80 when describing database files. Positions not mentioned have no special considerations for Unicode." />
<meta name="DC.subject" content="Unicode, physical and logical files, physical files, logical files, DDS file considerations" />
<meta name="keywords" content="Unicode, physical and logical files, physical files, logical files, DDS file considerations" />
<meta name="DC.Relation" scheme="URI" content="kickoff.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2length.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2dtype.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2dec.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2kwd.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 2001, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 2001, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="ucs2ap" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Unicode considerations for database files</title>
</head>
<body id="ucs2ap"><a name="ucs2ap"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Unicode considerations for database files</h1>
<div><p>Unicode is a universal encoding scheme for written characters and
text that enables the exchange of data internationally. This topic explains
how to specify DDS positions 30 through 37 and positions 45 through 80 when
describing database files. Positions not mentioned have no special
considerations for Unicode.</p>
<div class="section"><p>A Unicode field can contain all types of characters
used on an IBM<sup>®</sup> iSeries™ server,
including double-byte character set (DBCS) characters. Unicode data is composed
of <em>code units</em>. A code unit is the smallest byte combination that can
represent a unit of text in a given encoding form.</p>
<p>There are three transformation formats (encoding
forms) of Unicode that are supported with physical and logical file DDS:</p>
<ul><li><strong>UTF-8</strong> is an 8-bit encoding form designed for ease of use with existing
ASCII-based systems. UTF-8 data is stored in character data types. The CCSID
value for data in UTF-8 format is 1208. <p>A UTF-8 code unit is 1 byte in
length. A UTF-8 character can be 1, 2, 3, or 4 code units in length. A UTF-8
data string can contain any character, including combining characters and
supplementary characters (those represented by surrogate pairs in UTF-16).</p>
</li>
<li><strong>UTF-16</strong> is a 16-bit encoding form designed to provide code values
for over a million characters. It is a superset of UCS-2. UTF-16 data is stored
in graphic data types. The CCSID value for data in UTF-16 format is 1200. <p>A
UTF-16 code unit is 2 bytes in length. A UTF-16 character can be 1 or 2 code
units (2 or 4 bytes) in length. A UTF-16 data string can contain any character,
including UTF-16 surrogates and combining characters.</p>
</li>
<li><strong>UCS-2</strong> is the Universal Character Set coded in 2 octets, which means
that each character is represented in 16 bits. UCS-2 data is stored
in graphic data types. The CCSID value for data in UCS-2 format is 13488. <p>UCS-2
is a subset of UTF-16 and can no longer represent all of the characters defined
by Unicode. UCS-2 is otherwise identical to UTF-16, which additionally supports
combining characters and surrogates. If you do not need support for combining
characters and surrogates, you can use the UCS-2 type, because
more database functionality is available for it.</p>
</li>
</ul>
<div class="note"><span class="notetitle">Note:</span> In this topic, references to UTF-16 imply UCS-2 as well.</div>
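<p>To illustrate the code-unit counts described above, the following Python sketch
counts the UTF-8 and UTF-16 code units of a string and reports whether any character
needs two UTF-16 code units (a surrogate pair). It is not part of DDS. The function
names code_units and needs_surrogates are hypothetical, and the check does not
consider combining characters.</p>

```python
def code_units(text):
    """Return (UTF-8 code units, UTF-16 code units) for text."""
    utf8_units = len(text.encode("utf-8"))            # 1 byte per UTF-8 code unit
    utf16_units = len(text.encode("utf-16-be")) // 2  # 2 bytes per UTF-16 code unit
    return utf8_units, utf16_units


def needs_surrogates(text):
    """Return True if any character takes two UTF-16 code units."""
    # len(text) counts code points; each surrogate pair adds one extra code unit.
    return code_units(text)[1] != len(text)


# Illustrative samples only: 1-, 2-, 3-, and 4-byte UTF-8 characters.
samples = {
    "latin capital A": "A",
    "e with acute": "\u00e9",
    "euro sign": "\u20ac",
    "musical G clef": "\U0001d11e",
}

for name, ch in samples.items():
    utf8_units, utf16_units = code_units(ch)
    print(name, utf8_units, utf16_units, needs_surrogates(ch))
```

<p>In this sketch, a string for which needs_surrogates returns True cannot be
represented in UCS-2 and would need a UTF-16 or UTF-8 field instead.</p>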
</div>
</div>
<div>
<ul class="ullinks">
<li class="ulchildlink"><strong><a href="ucs2length.htm">Length (positions 30 through 34)</a></strong><br />
Specify the length of the field in these positions. The length of a field containing UTF-16 data can range from 1 through 16 383 code units. The length of a field containing UTF-8 data can range from 1 through 32 766 code units.</li>
<li class="ulchildlink"><strong><a href="ucs2dtype.htm">Data type (position 35)</a></strong><br />
The valid data types for Unicode data are the G (Graphic) data type and the A (Character) type.</li>
<li class="ulchildlink"><strong><a href="ucs2dec.htm">Decimal positions (positions 36 and 37)</a></strong><br />
Leave these positions blank when using Unicode data.</li>
<li class="ulchildlink"><strong><a href="ucs2kwd.htm">Keyword considerations (positions 45 through 80)</a></strong><br />
Learn about how Unicode data is used with some DDS keywords.</li>
</ul>
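<p>The length ranges listed above are expressed in code units, not characters.
The following Python sketch shows one way to check whether a string fits in a field
of a given declared length. It uses only the ranges stated in this topic. The names
MAX_CODE_UNITS, count_code_units, and fits_field are hypothetical, and the sketch
does not model any other DDS rule.</p>

```python
# Maximum field lengths in code units, as stated in this topic.
MAX_CODE_UNITS = {"utf-16": 16383, "utf-8": 32766}


def count_code_units(text, encoding):
    """Count code units: 1 byte each for UTF-8, 2 bytes each for UTF-16."""
    if encoding == "utf-8":
        return len(text.encode("utf-8"))
    if encoding == "utf-16":
        return len(text.encode("utf-16-be")) // 2
    raise ValueError("unsupported encoding: " + encoding)


def fits_field(text, encoding, declared_length):
    """Return True if text fits in a field of declared_length code units."""
    units = count_code_units(text, encoding)  # also rejects unknown encodings
    if declared_length not in range(1, MAX_CODE_UNITS[encoding] + 1):
        raise ValueError("declared length out of range for " + encoding)
    return declared_length >= units


# One supplementary character needs 2 UTF-16 code units.
print(fits_field("\U0001d11e", "utf-16", 1))
print(fits_field("\U0001d11e", "utf-16", 2))
```

<p>Because the check counts code units, a field sized by character count alone
can be too short for data that contains multi-unit characters.</p>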
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="kickoff.htm" title="You can use data description specifications (DDS) to define physical and logical database files. This topic provides the information you need to code the positional and keyword entries that define these files.">DDS for physical and logical files</a></div>
</div>
</div>
</body>
</html>