<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="UTF-8" />
<meta name="abstract" content="A Unicode Transformation Format (UTF) is the algorithmic mapping from every Unicode value to a unique byte sequence." />
<meta name="description" content="A Unicode Transformation Format (UTF) is the algorithmic mapping from every Unicode value to a unique byte sequence." />
<meta name="DC.Relation" scheme="URI" content="rbagsunicodeucs2.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 1998, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="rbagsutf8" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>UTF-8</title>
</head>
<body id="rbagsutf8"><a name="rbagsutf8"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">UTF-8</h1>
<div><p>A Unicode Transformation Format (UTF) is an algorithmic mapping
from every Unicode code point to a unique byte sequence.</p>
<p>UTF-8 algorithmically converts Unicode data so that the result:</p>
<ul><li>Contains no null (00) bytes, unless the null character itself was intended.</li>
<li>Uses 8-bit code units (bytes) to encode the data.</li>
<li>Encodes every ASCII code from 00 to 7F as itself, in a single byte.</li>
</ul>
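<p>The three properties listed above can be checked with a short script. The following is a minimal sketch that assumes Python 3 and its built-in <code>utf-8</code> codec; the sample string and variable names are illustrative and are not part of the operating system support described here.</p>

```python
# Illustrative check of the three UTF-8 properties, using Python's built-in codec.
sample = "A\u00e9\u20ac"  # 'A', e-acute, euro sign: 1-, 2-, and 3-byte characters

encoded = sample.encode("utf-8")

# Property 1: no zero bytes appear unless U+0000 itself is encoded.
has_null = 0 in encoded
null_only = "\u0000".encode("utf-8")

# Property 2: the result is a sequence of 8-bit units (each value is in 0..255).
all_bytes = all(b in range(256) for b in encoded)

# Property 3: ASCII codes 00 to 7F are encoded as themselves, one byte each.
ascii_text = "".join(chr(c) for c in range(0x80))
ascii_preserved = ascii_text.encode("utf-8") == bytes(range(0x80))

print(encoded.hex(), has_null, all_bytes, ascii_preserved)
```

<p>In this sketch only the two non-ASCII characters expand to more than one byte, and none of their bytes is 00.</p>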
<p>For example, the string "ABC" in 16-bit Unicode (UCS-2) is "004100420043"x. However, in
UTF-8 it is "414243"x.</p>
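<p>The "ABC" example can be reproduced as follows. This sketch assumes that the 16-bit form is big-endian; it uses Python's <code>utf-16-be</code> codec, which yields the same bytes as UCS-2 for these three characters.</p>

```python
# Compare the 16-bit and UTF-8 byte sequences for the string "ABC".
text = "ABC"

# Assumption: big-endian 16-bit code units, matching the hex string in the text.
ucs2_hex = text.encode("utf-16-be").hex().upper()
utf8_hex = text.encode("utf-8").hex().upper()

print(ucs2_hex)
print(utf8_hex)
```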
<p>Because UTF-8 allows Unicode data to flow over an 8-bit network without
the network needing to know that it is Unicode, UTF-8 is used to store Unicode
data on several UNIX<sup>®</sup> platforms
and is the default encoding for most new Internet standards.</p>
<p>UTF-8 is used mainly as a direct replacement for older multibyte character set (MBCS) encodings,
which also use 8-bit code units, although processing UTF-8 takes somewhat more code.
It is a good encoding if most of your data (for example, 90%) is English, because
every English letter uses only one byte.</p>
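<p>The effect of the data mix on storage size can be estimated with a small comparison. This is a sketch only: the helper names <code>utf8_size</code> and <code>utf16_size</code> and the sample strings are hypothetical, chosen for illustration.</p>

```python
def utf8_size(text):
    """Return the number of bytes that text occupies in UTF-8."""
    return len(text.encode("utf-8"))


def utf16_size(text):
    """Return the number of bytes that text occupies in UTF-16 (no byte order mark)."""
    return len(text.encode("utf-16-be"))


# Hypothetical sample strings for illustration.
english = "Unicode"                       # 7 ASCII letters
greek = "\u03a9\u03bc\u03ad\u03b3\u03b1"  # 5 Greek letters
japanese = "\u65e5\u672c\u8a9e"           # 3 CJK ideographs

for label, text in (("english", english), ("greek", greek), ("japanese", japanese)):
    # Columns: label, character count, UTF-8 bytes, UTF-16 bytes
    print(label, len(text), utf8_size(text), utf16_size(text))
```

<p>For these samples, the English string takes half as many bytes in UTF-8 as in UTF-16, the Greek string takes the same number, and the Japanese string takes more.</p>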
<p>The operating system supports UTF-8 encoding with
CCSID 1208.</p>
</div>
<div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="rbagsunicodeucs2.htm" title="Unicode is a standard that precisely defines a character set as well as a small number of encodings for it. It enables you to handle text in any language efficiently. It allows a single application to work for a global audience.">Work with Unicode</a></div>
</div>
</div>
</body>
</html>