<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="UTF-16" />
<meta name="abstract" content="UTF-16 is an encoding of Unicode in which each character is composed of either one or two 16-bit elements." />
<meta name="description" content="UTF-16 is an encoding of Unicode in which each character is composed of either one or two 16-bit elements." />
<meta name="DC.Relation" scheme="URI" content="rbagsunicodeucs2.htm" />
<meta name="DC.Relation" scheme="URI" content="http://www.unicode.org" />
<meta name="DC.Relation" scheme="URI" content="rbagsucs2.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rbagsutf16" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>UTF-16</title>
</head>
<body id="rbagsutf16"><a name="rbagsutf16"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">UTF-16</h1>
<div><p>UTF-16 is an encoding of Unicode in which each character is composed
of either one or two 16-bit elements.</p>
<p>The operating system supports UTF-16 encoding with
CCSID 1200.</p>
<p>Unicode was originally designed as a pure 16-bit encoding, aimed at representing
all modern scripts. Over time, and especially after the addition of over
14 500 composite characters for compatibility with established character sets, it became
clear that 16 bits were not sufficient to meet the needs of all users. Out of this
need arose UTF-16.</p>
<p>UTF-16 represents about 63 000 characters as single 16-bit
units. It represents roughly 1 000 000 additional characters through a mechanism known
as surrogate pairs, in which two 16-bit units together encode one character.</p>
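<p>The approximate figures above can be derived from the size of the 16-bit value
space and the two reserved surrogate ranges (0xD800 to 0xDBFF and 0xDC00 to 0xDFFF).
The following Python sketch shows only the arithmetic; the constant and variable names
are illustrative assumptions and do not come from any operating system interface.</p>

```python
# Illustrative arithmetic only; all names here are assumptions of this sketch.
CODE_UNIT_VALUES = 0x10000               # 65 536 possible 16-bit values
HIGH_SURROGATES = 0xDBFF - 0xD800 + 1    # 1 024 high (first) values
LOW_SURROGATES = 0xDFFF - 0xDC00 + 1     # 1 024 low (second) values

# Values left over for characters stored in a single 16-bit unit.
single_unit_code_points = CODE_UNIT_VALUES - HIGH_SURROGATES - LOW_SURROGATES

# Every high value can be combined with every low value.
pair_code_points = HIGH_SURROGATES * LOW_SURROGATES

print(single_unit_code_points)  # 63488
print(pair_code_points)         # 1048576
```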
<p>Two ranges of Unicode code values are reserved for the high (first) and
low (second) values of these pairs. High values range from 0xD800 to 0xDBFF, and low
values from 0xDC00 to 0xDFFF. Because the most common characters are already
encoded in the first 64 000 values, characters that require surrogate pairs
are relatively rare.</p>
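<p>As a minimal sketch of how a surrogate pair is formed, the following Python
example splits a code point above 0xFFFF into a high and a low value and combines
them again. It assumes the standard UTF-16 rule: subtract 0x10000, then place the upper
10 bits in the high value and the lower 10 bits in the low value. The function names
<samp>to_surrogate_pair</samp> and <samp>from_surrogate_pair</samp> are hypothetical
and exist only for this illustration.</p>

```python
HIGH_START = 0xD800             # first high (first) surrogate value
LOW_START = 0xDC00              # first low (second) surrogate value
SUPPLEMENTARY_START = 0x10000   # first code point that needs a pair


def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into (high, low) 16-bit values."""
    if code_point not in range(SUPPLEMENTARY_START, 0x110000):
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - SUPPLEMENTARY_START   # fits in 20 bits
    high = HIGH_START + offset // 0x400         # upper 10 bits
    low = LOW_START + offset % 0x400            # lower 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Combine a (high, low) pair back into a single code point."""
    if high not in range(0xD800, 0xDC00) or low not in range(0xDC00, 0xE000):
        raise ValueError("not a valid surrogate pair")
    return SUPPLEMENTARY_START + (high - HIGH_START) * 0x400 + (low - LOW_START)


high, low = to_surrogate_pair(0x1F600)
print(hex(high), hex(low))                   # 0xd83d 0xde00
print(hex(from_surrogate_pair(high, low)))   # 0x1f600
```

<p>The range checks reject values that fall outside the reserved ranges, so a lone
high or low value is never silently treated as a complete character.</p>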
<p>UTF-16 is designed as a compromise between ease of handling
and storage space: all commonly used characters can be stored with one code unit
per code point. This is the default encoding for Unicode.</p>
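<p>The one-unit-per-character property can be checked with a short Python sketch.
It assumes Python's built-in <samp>utf-16-be</samp> codec, which writes no byte order
mark; the helper name <samp>utf16_code_units</samp> is hypothetical.</p>

```python
def utf16_code_units(text):
    """Return the number of 16-bit code units needed to store text in UTF-16."""
    # Each code unit is two bytes, and utf-16-be adds no byte order mark.
    return len(text.encode("utf-16-be")) // 2


print(utf16_code_units("A"))            # 1
print(utf16_code_units("\u20ac"))       # 1 (euro sign)
print(utf16_code_units("\U0001F600"))   # 2 (needs a surrogate pair)
```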
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rbagsunicodeucs2.htm" title="Unicode is a standard that precisely defines a character set as well as a small number of encodings for it. It enables you to handle text in any language efficiently. It allows a single application to work for a global audience.">Work with Unicode</a></div>
</div>
<div class="relconcepts"><strong>Related concepts</strong><br />
<div><a href="http://www.unicode.org" target="_blank">Unicode</a></div>
<div><a href="rbagsucs2.htm" title="Because the UCS-2 standard is limited to 65 535 characters, and the data processing industry needs over 94 000 characters, the UCS-2 standard is in the process of being superseded by the Unicode UTF-16 standard.">UCS-2 and its relationship to Unicode</a></div>
</div>
</div>
</body>
</html>