<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="UTF-16" />
<meta name="abstract" content="UTF-16 is an encoding of Unicode in which each character is composed of either one or two 16-bit elements." />
<meta name="description" content="UTF-16 is an encoding of Unicode in which each character is composed of either one or two 16-bit elements." />
<meta name="DC.Relation" scheme="URI" content="rbagsunicodeucs2.htm" />
<meta name="DC.Relation" scheme="URI" content="http://www.unicode.org" />
<meta name="DC.Relation" scheme="URI" content="rbagsucs2.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rbagsutf16" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>UTF-16</title>
</head>
<body id="rbagsutf16"><a name="rbagsutf16"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">UTF-16</h1>
<div><p>UTF-16 is an encoding of Unicode in which each character is composed
of either one or two 16-bit elements.</p>
<p>The operating system supports UTF-16 encoding with
CCSID 1200.</p>
<p>Unicode was originally designed as a pure 16-bit encoding, aimed at representing
all modern scripts. Over time, and especially after the addition of over 14
500 composite characters for compatibility with established sets, it became
clear that 16 bits were not sufficient to cover every character that users needed.
UTF-16 arose to address this.</p>
<p>UTF-16 gives access to 63 488 characters as single 16-bit code units (the
65 536 possible 16-bit values minus the 2 048 values reserved for surrogates).
A mechanism known as surrogate pairs gives access to an additional
1 048 576 characters.</p>
<p>Two ranges of Unicode code values are reserved for the high (first) and
low (second) values of these pairs. High surrogates range from 0xD800 to 0xDBFF,
and low surrogates from 0xDC00 to 0xDFFF. Because the most common characters are
already encoded in the first 65 536 values, characters that require surrogate
pairs are relatively rare.</p>
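<p>The surrogate mechanism can be illustrated with a minimal Python sketch. The
function and constant names are hypothetical and chosen only for this example;
the sketch uses the standard UTF-16 arithmetic on the two ranges given above and
is not specific to CCSID 1200 or to any operating system interface.</p>

```python
HIGH_START = 0xD800     # first high (first) surrogate value
LOW_START = 0xDC00      # first low (second) surrogate value
SURROGATE_END = 0xE000  # one past the last low surrogate, 0xDFFF
BMP_END = 0x10000       # one past the last value that fits in 16 bits
UNICODE_END = 0x110000  # one past the last Unicode code point


def to_utf16_units(code_point):
    """Return the 16-bit code units (one or two) for a single code point."""
    if code_point not in range(UNICODE_END):
        raise ValueError("code point is outside the Unicode range")
    if code_point in range(HIGH_START, SURROGATE_END):
        raise ValueError("surrogate values do not encode characters by themselves")
    if code_point in range(BMP_END):
        return [code_point]                # fits in one 16-bit unit
    offset = code_point - BMP_END          # 20-bit value
    high = HIGH_START + offset // 0x400    # upper 10 bits of the offset
    low = LOW_START + offset % 0x400       # lower 10 bits of the offset
    return [high, low]


def from_surrogate_pair(high, low):
    """Combine a high and a low surrogate back into one code point."""
    if high not in range(HIGH_START, LOW_START):
        raise ValueError("first value is not a high surrogate")
    if low not in range(LOW_START, SURROGATE_END):
        raise ValueError("second value is not a low surrogate")
    return BMP_END + (high - HIGH_START) * 0x400 + (low - LOW_START)


# Each range holds 1 024 values, so the pairs address 1 024 * 1 024 code points.
PAIR_COUNT = (LOW_START - HIGH_START) * (SURROGATE_END - LOW_START)

if __name__ == "__main__":
    print([hex(unit) for unit in to_utf16_units(0x41)])
    print([hex(unit) for unit in to_utf16_units(0x1F600)])
    print(hex(from_surrogate_pair(0xD83D, 0xDE00)))
    print(PAIR_COUNT)
```

<p>In this sketch a code point below 0x10000 is returned unchanged as one unit,
while a code point such as 0x1F600 is split into the pair 0xD83D, 0xDE00.</p>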
<p>UTF-16 is designed as a compromise between ease of handling and storage
space: all commonly used characters can be stored with one code unit
per code point. This is the default encoding for Unicode.</p>
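<p>As a rough check of the one-unit-per-code-point claim, the following sketch
counts 16-bit code units using the built-in <code>utf-16-be</code> codec in Python.
The helper name <code>utf16_unit_count</code> is hypothetical, and the sample
characters are arbitrary choices for illustration.</p>

```python
def utf16_unit_count(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    # "utf-16-be" writes no byte order mark, so every two bytes are one unit.
    return len(text.encode("utf-16-be")) // 2


if __name__ == "__main__":
    samples = {
        "Latin capital A": "\u0041",
        "Euro sign": "\u20ac",
        "CJK ideograph U+4E2D": "\u4e2d",
        "Emoji U+1F600": "\U0001F600",
    }
    for label, char in samples.items():
        print(label, utf16_unit_count(char))
```

<p>Only the last sample, which lies beyond the first 65 536 values, needs two code
units.</p>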
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rbagsunicodeucs2.htm" title="Unicode is a standard that precisely defines a character set as well as a small number of encodings for it. It enables you to handle text in any language efficiently. It allows a single application to work for a global audience.">Work with Unicode</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="http://www.unicode.org" target="_blank">Unicode</a></div>
<div><a href="rbagsucs2.htm" title="Because the UCS-2 standard is limited to 65 535 characters, and the data processing industry needs over 94 000 characters, the UCS-2 standard is in the process of being superseded by the Unicode UTF-16 standard.">UCS-2 and its relationship to Unicode</a></div>
</div>
</div>
</body>
</html>