Unicode considerations for printer files

This topic describes Unicode considerations for positional entries and keyword entries for printer files. It also describes the CCSID keyword for Unicode data in printer files.

Unicode is a universal encoding scheme for written characters and text that enables the exchange of data internationally. A Unicode field can contain all types of characters used on an iSeries™ server, including ideographic (DBCS) characters. The term code unit is used in this topic to mean the minimal bit combination that can represent a unit of encoded text for processing or interchange.

DDS printer files support two transformation formats (encoding forms) of Unicode:

UTF-16 is a 16-bit encoding form that is designed to provide code values for over a million characters. It is a superset of UCS-2. UTF-16 data is stored in graphic data types. The CCSID value for data in UTF-16 format is 1200.
A UTF-16 code unit is 2 bytes in length. A UTF-16 character can be 1 or 2 code units (2 or 4 bytes) in length. A UTF-16 data string can contain any character, including UTF-16 surrogates and combining characters.
UCS-2 is the Universal Character Set coded in 2 octets, which means that characters are represented in 16 bits per character. UCS-2 data is stored in graphic data types. The CCSID value for data in UCS-2 format is 13488.
UCS-2 is a subset of UTF-16 and can no longer support all of the characters defined by Unicode. UCS-2 is identical to UTF-16 except that UTF-16 also supports combining characters and surrogates. If you do not need combining characters and surrogates, you might choose to use UCS-2.
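
The distinction between the two encoding forms can be illustrated outside of DDS. The following Python sketch is not part of DDS or of any IBM API; the function names utf16_code_units and fits_ucs2 are hypothetical and exist only for this illustration. It counts the 2-byte code units that a string occupies in UTF-16, and it checks whether a string stays within what this topic describes as UCS-2, that is, text with no surrogates and no combining characters.

```python
import unicodedata


def utf16_code_units(text: str) -> int:
    """Return the number of 16-bit code units needed to store text as UTF-16."""
    # Encode without a byte-order mark; each code unit occupies 2 bytes.
    return len(text.encode("utf-16-be")) // 2


def fits_ucs2(text: str) -> bool:
    """Return True if text needs no surrogates and no combining characters.

    Assumption: this mirrors the description in this topic, which treats
    surrogates and combining characters as the UTF-16-only features.
    """
    for ch in text:
        if ord(ch) > 0xFFFF:
            # Code points above U+FFFF need a surrogate pair (2 code units).
            return False
        if unicodedata.combining(ch) != 0:
            # A nonzero canonical combining class marks a combining character.
            return False
    return True


print(utf16_code_units("A"))           # 1 code unit (2 bytes)
print(utf16_code_units("\U00020000"))  # 2 code units (4 bytes)
print(fits_ucs2("printer"))            # True
print(fits_ucs2("e\u0301"))            # False: U+0301 is a combining accent
```

If a check of this kind returns False for the data that you expect to print, CCSID 1200 (UTF-16) is likely the safer choice over CCSID 13488 (UCS-2).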

Positional entry considerations for printer files that use UTF-16 data
See DDS for describing printer files by position in this topic. Positions that are not mentioned have no special considerations for UTF-16. In these topics, UTF-16 also implies UCS-2.
Keyword considerations for printer files that use UTF-16 data (positions 45 through 80)
The CCSID keyword for printer files specifies that a G-type field supports UTF-16 data instead of DBCS-graphic data.
