<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="UCS-2 and its relationship to Unicode" />
<meta name="abstract" content="Because the UCS-2 standard is limited to 65 536 code points, and the data processing industry needs over 94 000 characters, the UCS-2 standard is in the process of being superseded by the Unicode UTF-16 standard." />
<meta name="description" content="Because the UCS-2 standard is limited to 65 536 code points, and the data processing industry needs over 94 000 characters, the UCS-2 standard is in the process of being superseded by the Unicode UTF-16 standard." />
<meta name="DC.Relation" scheme="URI" content="rbagsunicodeucs2.htm" />
<meta name="DC.Relation" scheme="URI" content="rbagsutf16.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rbagsucs2" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>UCS-2 and its relationship to Unicode</title>
</head>
<body id="rbagsucs2"><a name="rbagsucs2"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">UCS-2 and its relationship to Unicode</h1>
<div><p>Because the UCS-2 standard is limited to 65 536 code points,
and the data processing industry needs over 94 000 characters, the UCS-2 standard
is in the process of being superseded by the Unicode UTF-16 standard.</p>
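<p>Because UTF-16 is a superset of the existing UCS-2 standard, you can develop
your applications with the existing UCS-2 support, as long as the applications
treat UCS-2 data as if it were UTF-16. Every character that UCS-2 can encode
has the same 16-bit value in UTF-16. UTF-16 additionally uses pairs of 16-bit
values (surrogate pairs) to represent characters beyond the UCS-2 range.</p>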
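<p>The relationship between the two encodings can be sketched in a few lines
of Python. This is an illustration only, not i5/OS code. Python has no codec
named UCS-2, so the sketch assumes that the <samp>utf-16-be</samp> codec is an
acceptable stand-in; the helper names <samp>utf16_units</samp> and
<samp>surrogate_pair</samp> are hypothetical.</p>

```python
def utf16_units(text):
    """Return the big-endian UTF-16 code units of text as 4-digit hex strings."""
    data = text.encode("utf-16-be")
    return [data[i:i + 2].hex() for i in range(0, len(data), 2)]


def surrogate_pair(code_point):
    """Compute the UTF-16 surrogate pair for a code point above U+FFFF."""
    high_bits, low_bits = divmod(code_point - 0x10000, 0x400)
    return [format(0xD800 + high_bits, "04x"), format(0xDC00 + low_bits, "04x")]


# A character in the UCS-2 range is a single 16-bit unit in UTF-16.
print(utf16_units("A"))            # ['0041']

# U+10400 lies outside the UCS-2 range, so UTF-16 uses two 16-bit units.
print(utf16_units("\U00010400"))   # ['d801', 'dc00']
print(surrogate_pair(0x10400))     # ['d801', 'dc00']
```

<p>An application that treats its UCS-2 data as UTF-16 must be prepared to
see such a pair and to handle the two 16-bit values as one character.</p>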
<p>i5/OS™ supports UCS-2 encoding with CCSID 13488.</p>
<div class="section"><h4 class="sectiontitle">UCS, UCS-2 (Universal Multiple-Octet Coded Character Set)</h4><p>The
ISO/IEC 10646 standard defines a character code designed to encode text for
storage in computer files. Its design is based on the prevalent character
code, ASCII, and on ISO 8859-1, an extended version of the ASCII code.
Unlike ASCII, which can encode only the basic Latin alphabet, ISO/IEC 10646
provides the capability to encode all of the characters used for written
languages throughout the world.</p>
</div>
<div class="section"><h4 class="sectiontitle">Two UCS encoding schemes</h4><p>To accommodate
the many thousands of characters used in international text, ISO/IEC 10646
specifies the Universal Multiple-Octet Coded Character Set (UCS). UCS can
be implemented through two encoding schemes:</p>
<ul><li>UCS-2: Each character is represented by 16 bits, or 2 bytes. (The number
2 in UCS-2 indicates 2 bytes.) For example, uppercase A is represented by
hexadecimal 0041.</li>
<li>UCS-4: Each character is represented by 32 bits, or 4 bytes. (The number
4 in UCS-4 indicates 4 bytes.) For example, uppercase A is represented by
hexadecimal 0000 0041.</li>
</ul>
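<p>The two representations of uppercase A can be reproduced with a short
Python sketch. It is an illustration only: Python does not expose codecs
named UCS-2 or UCS-4, so the sketch assumes that <samp>utf-16-be</samp> and
<samp>utf-32-be</samp> are suitable stand-ins, which holds for this
character.</p>

```python
# UTF-16BE and UTF-32BE stand in for UCS-2 and UCS-4 (an assumption of this sketch).
ucs2_bytes = "A".encode("utf-16-be")
ucs4_bytes = "A".encode("utf-32-be")

print(ucs2_bytes.hex(), len(ucs2_bytes))   # 0041 2
print(ucs4_bytes.hex(), len(ucs4_bytes))   # 00000041 4
```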
<p>The major difference between the 2-byte and 4-byte representations
is capacity: a 2-byte value can distinguish at most 65 536 code points, whereas
the 4-byte representation can encode additional characters beyond the
capability of UCS-2.</p>
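<p>As a rough illustration of that limit, the following Python sketch checks
whether every character in a string fits in a single 16-bit value. The names
<samp>UCS2_MAX</samp> and <samp>fits_in_ucs2</samp> are hypothetical, and the
check is deliberately simple: it does not treat the surrogate range
specially.</p>

```python
UCS2_MAX = 0xFFFF  # highest code point a single 16-bit value can hold


def fits_in_ucs2(text):
    """Return True if every code point in text fits in one 16-bit value."""
    return all(UCS2_MAX >= ord(ch) for ch in text)


print(UCS2_MAX + 1)                      # 65536 possible 16-bit values
print(fits_in_ucs2("A\u00e9\u65e5"))     # True
print(fits_in_ucs2("\U0001D11E"))        # False: U+1D11E is beyond the 16-bit range
```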
<p>i5/OS does not provide a CCSID value for UCS-4 encoding.</p>
</div>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rbagsunicodeucs2.htm" title="Unicode is a standard that precisely defines a character set as well as a small number of encodings for it. It enables you to handle text in any language efficiently. It allows a single application to work for a global audience.">Work with Unicode</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="rbagsutf16.htm" title="UTF-16 is an encoding of Unicode in which each character is composed of either one or two 16-bit elements.">UTF-16</a></div>
</div>
</div>
</body>
</html>