<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html lang="en-us" xml:lang="en-us">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="security" content="public" />
<meta name="Robots" content="index,follow" />
<meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.icra.org/ratingsv02.html" l gen true r (cz 1 lz 1 nz 1 oz 1 vz 1) "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0) "http://www.classify.org/safesurf/" l gen true r (SS~~000 1))' />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="Unicode considerations for display files" />
<meta name="abstract" content="Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. Two transformation formats of Unicode, UTF-16 and UCS-2, are supported with DDS." />
<meta name="description" content="Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. Two transformation formats of Unicode, UTF-16 and UCS-2, are supported with DDS." />
<meta name="DC.subject" content="Unicode, display files, DDS file considerations" />
<meta name="keywords" content="Unicode, display files, DDS file considerations" />
<meta name="DC.Relation" scheme="URI" content="kickoff.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2pos.htm" />
<meta name="DC.Relation" scheme="URI" content="ucs2kwd.htm" />
<meta name="copyright" content="(C) Copyright IBM Corporation 2001, 2006" />
<meta name="DC.Rights.Owner" content="(C) Copyright IBM Corporation 2001, 2006" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="dspfil" />
<meta name="DC.Language" content="en-us" />
<!-- All rights reserved. Licensed Materials Property of IBM -->
<!-- US Government Users Restricted Rights -->
<!-- Use, duplication or disclosure restricted by -->
<!-- GSA ADP Schedule Contract with IBM Corp. -->
<link rel="stylesheet" type="text/css" href="./ibmdita.css" />
<link rel="stylesheet" type="text/css" href="./ic.css" />
<title>Unicode considerations for display files</title>
</head>
<body id="dspfil"><a name="dspfil"><!-- --></a>
<!-- Java sync-link --><script language="Javascript" src="../rzahg/synch.js" type="text/javascript"></script>
<h1 class="topictitle1">Unicode considerations for display files</h1>
<div><p>Unicode is a universal encoding scheme for written characters and
text that enables the exchange of data internationally. Two transformation
formats of Unicode, UTF-16 and UCS-2, are supported with DDS.</p>
<p>A Unicode field in a display file can contain UCS-2 or UTF-16 data. Unicode
data is composed of <em>code units</em>, each of which is the minimal byte
combination that can represent a unit of text.</p>
<p>The following two transformation formats (encoding forms) of Unicode are
supported with DDS:</p>
<ul><li><strong>UTF-16</strong> is a 16-bit encoding form designed to provide code values
for over a million characters. It is a superset of UCS-2. UTF-16 data is stored
in graphic data types. The CCSID value for data in UTF-16 format is 1200. <p>A
UTF-16 code unit is 2 bytes in length. A UTF-16 character can be 1 or 2 code
units (2 or 4 bytes) in length. A UTF-16 data string can contain any character,
including UTF-16 surrogates and combining characters.</p>
</li>
<li><strong>UCS-2</strong> is the Universal Character Set coded in 2 octets, which means
that each character is represented in 16 bits. In this topic, the size of a
UCS-2 character is described as one code unit. UCS-2 data is
stored in graphic data types. The CCSID value for data in UCS-2 format is
13488. <p>UCS-2 is a subset of UTF-16 and can no longer support all of the
characters defined by Unicode. UCS-2 is identical to UTF-16 except that UTF-16
also supports combining characters and surrogates. If you do not need
support for combining characters and surrogates, you can
continue to use the UCS-2 format.</p>
</li>
</ul>
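<p>The relationship between characters, code units, and bytes described above
can be checked with a short Python sketch. It is only an illustration and is not
part of DDS: the function names are hypothetical, and Python's utf-16-be codec is
assumed to stand in for big-endian UTF-16 field data.</p>

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    # Assumption: big-endian byte order with no byte order mark.
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_in_ucs2(text):
    """Return True if every character occupies exactly one code unit."""
    # A character that needs a surrogate pair takes 4 bytes instead of 2.
    return len(text.encode("utf-16-be")) == 2 * len(text)


# "A" is one code unit (2 bytes).
print([hex(u) for u in utf16_code_units("A")])
# U+1D11E lies outside the 16-bit range, so it becomes a surrogate pair (4 bytes).
print([hex(u) for u in utf16_code_units("\U0001D11E")])
# A base letter followed by a combining accent is two code units, no surrogates.
print([hex(u) for u in utf16_code_units("e\u0301")])
```

<p>In this sketch, fits_in_ucs2 returns False only for text that contains a
character requiring a surrogate pair.</p>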
<p>Unicode data is not supported on display devices that currently support
the 5250 data stream. Therefore, conversions between Unicode and
EBCDIC are necessary during input and output. On output, the Unicode data
is converted to the CCSID of the device. On input, the data is converted from
the device CCSID to the Unicode CCSID.</p>
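<p>The two conversion directions can be sketched in Python as follows. This is
an illustration under stated assumptions, not the system's conversion code: the
function names are hypothetical, Python's cp037 codec is assumed to stand in for
a device CCSID of 37, and utf-16-be is assumed to stand in for the Unicode
field data.</p>

```python
DEVICE_CODEC = "cp037"       # assumption: an SBCS EBCDIC device, CCSID 37
UNICODE_CODEC = "utf-16-be"  # assumption: big-endian Unicode field data


def to_device(field_bytes):
    """Output direction: Unicode field data to device (EBCDIC) bytes."""
    return field_bytes.decode(UNICODE_CODEC).encode(DEVICE_CODEC)


def from_device(device_bytes):
    """Input direction: device (EBCDIC) bytes to Unicode field data."""
    return device_bytes.decode(DEVICE_CODEC).encode(UNICODE_CODEC)


field = "HELLO".encode(UNICODE_CODEC)   # 5 code units, 10 bytes
on_screen = to_device(field)            # 5 EBCDIC bytes
print(on_screen.hex())
print(from_device(on_screen) == field)  # the round trip restores the field
```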
<p>Because the device CCSID, which is determined from the device configuration,
determines what the Unicode data is converted to, the converted data can
appear differently on different devices. For example, a Unicode code unit
that maps to an SBCS character is displayed as a DBCS replacement character
on a graphic-DBCS-capable device. On a DBCS-capable or SBCS-capable device, the code
unit appears as an SBCS character. A Unicode code unit that maps to a DBCS
character is displayed as a graphic-DBCS character on a graphic-DBCS-capable
device. On a DBCS device, the DBCS character is displayed bracketed
(enclosed in a shift-out and shift-in). On an SBCS device, an SBCS replacement
character is displayed.</p>
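<p>The device dependence can be illustrated with two SBCS code pages in Python.
This sketch makes several assumptions: render_on_device is a hypothetical helper,
the cp037 and cp875 codecs stand in for two differently configured devices, and
Python's "?" substitution stands in for the system's replacement character,
which is not the same character. DBCS and graphic-DBCS devices are not modeled.</p>

```python
def render_on_device(field_bytes, device_codec):
    """Return the text a device with the given code page would show."""
    text = field_bytes.decode("utf-16-be")
    # Characters with no mapping in the device code page are substituted.
    device_bytes = text.encode(device_codec, errors="replace")
    return device_bytes.decode(device_codec)


# The same Unicode field: Latin "A" followed by Greek small alpha (U+03B1).
field = "A\u03b1".encode("utf-16-be")

print(render_on_device(field, "cp037"))  # alpha has no mapping in this code page
print(render_on_device(field, "cp875"))  # the Greek code page can show both
```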
<p>It is also recommended that all Unicode-capable fields be initialized
in the output buffer before the fields are written to the screen. Unpredictable
results might occur if default initialization is allowed to take place.</p>
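<p>As a rough illustration of explicit initialization, the following Python
sketch builds the byte image of a Unicode field before it is written. The helper
name is hypothetical, and filling unused positions with Unicode blanks (U+0020)
is an assumption made for the example, not a value that this topic prescribes.</p>

```python
BLANK = " ".encode("utf-16-be")  # U+0020 as one 2-byte code unit


def init_unicode_field(length, value=""):
    """Return a field image of `length` code units, padded with blanks."""
    data = value.encode("utf-16-be")
    used = len(data) // 2  # code units occupied by the value
    if used > length:
        raise ValueError("value is longer than the field")
    return data + BLANK * (length - used)


# A 10-code-unit field is 20 bytes, with every byte explicitly set.
buffer_image = init_unicode_field(10, "HELLO")
print(len(buffer_image))
print(buffer_image.decode("utf-16-be"))
```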
</div>
<div>
<ul class="ullinks">
<li class="ulchildlink"><strong><a href="ucs2pos.htm">Positional entry considerations for display files that use Unicode data</a></strong><br />
The topic describes, by position, DDS for describing display files. Positions not mentioned have no special considerations for Unicode.</li>
<li class="ulchildlink"><strong><a href="ucs2kwd.htm">Keyword considerations for display files that use Unicode data (positions 45 through 80)</a></strong><br />
The DFT keyword can contain SBCS, bracketed-DBCS, or bracketed-DBCS-graphic character strings when specified on a Unicode-capable field.</li>
</ul>

<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a href="kickoff.htm" title="Use data description specifications (DDS) to define display files. This topic collection provides the information you need to code the positional and keyword entries that define these files.">DDS for display files</a></div>
</div>
</div>
</body>
</html>