Unicode is a universal encoding scheme for written characters and text that enables the international exchange of data. DDS supports two Unicode transformation formats: UTF-16 and UCS-2.
A Unicode field in a display file can contain UCS-2 or UTF-16 data. Unicode data is composed of code units; a code unit is the minimal byte combination that can represent a unit of text.
There are two transformation formats (encoding forms) of Unicode that are supported with DDS:
A UTF-16 code unit is 2 bytes in length. A UTF-16 character can be 1 or 2 code units (2 or 4 bytes) in length. A UTF-16 data string can contain any character including UTF-16 surrogates and combining characters.
UCS-2 is a subset of UTF-16 and can no longer support all of the characters defined by Unicode. UCS-2 is identical to UTF-16 except that UTF-16 also supports the combining of characters and surrogates. If you do not need support for the combining of characters and surrogates, you can choose to continue to use the UCS-2 format.
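The difference between code units and characters can be illustrated with a minimal Python sketch. It is not part of DDS, and the helper names `utf16_code_units` and `needs_surrogates` are hypothetical, introduced only for this illustration. It counts 2-byte UTF-16 code units and reports whether a string contains characters that require a surrogate pair.

```python
def utf16_code_units(text: str) -> int:
    """Return the number of 2-byte UTF-16 code units needed to encode text."""
    # Big-endian UTF-16 without a byte order mark: 2 bytes per code unit.
    return len(text.encode("utf-16-be")) // 2


def needs_surrogates(text: str) -> bool:
    """Return True if any character lies above U+FFFF (needs a surrogate pair)."""
    return any(ord(ch) > 0xFFFF for ch in text)


if __name__ == "__main__":
    samples = {
        "Latin letter A": "A",                     # one code unit (2 bytes)
        "e + combining acute": "e\u0301",          # two characters, one code unit each
        "CJK Extension B ideograph": "\U00020000", # one character, two code units
    }
    for label, value in samples.items():
        print(label, utf16_code_units(value), needs_surrogates(value))
```

The check in `needs_surrogates` only detects surrogate pairs. A combining sequence such as `e` followed by U+0301 uses one code unit per character, so whether a field needs UTF-16 rather than UCS-2 for combining characters is a separate decision that this sketch does not make.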
Unicode data is not supported on display devices that currently support the 5250 data stream. Therefore, conversions between the Unicode data and EBCDIC are necessary during input and output. On output, the Unicode data is converted to the CCSID of the device. On input, the data is converted from the device CCSID to the Unicode CCSID.
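The output and input conversions can be sketched in Python as a rough analogy. This is not how the system performs the conversion. The sketch assumes the device CCSID is 37 and uses Python's `cp037` codec as a stand-in for it; the function names `to_device` and `from_device` are hypothetical.

```python
DEVICE_CODEC = "cp037"  # assumption: stands in for a device CCSID of 37


def to_device(unicode_bytes: bytes) -> bytes:
    """Output direction: convert UTF-16 (big-endian) field data to EBCDIC."""
    text = unicode_bytes.decode("utf-16-be")
    # Characters with no EBCDIC mapping are replaced. Python substitutes "?",
    # which may differ from the substitution character the real system uses.
    return text.encode(DEVICE_CODEC, errors="replace")


def from_device(device_bytes: bytes) -> bytes:
    """Input direction: convert EBCDIC device data back to UTF-16 (big-endian)."""
    return device_bytes.decode(DEVICE_CODEC).encode("utf-16-be")


if __name__ == "__main__":
    field = "HELLO".encode("utf-16-be")
    on_device = to_device(field)
    print(on_device.hex())                  # EBCDIC bytes sent to the device
    print(from_device(on_device) == field)  # round trip preserves mappable data
```

Data that maps cleanly to the device code page survives the round trip. A character with no mapping is replaced on output, so the original value cannot be recovered on input.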
Because the device CCSID, which is determined from the device configuration, determines what the Unicode data is converted to, the converted data can appear differently on different devices. For example, a Unicode code unit that maps to an SBCS character is displayed as a DBCS replacement character on a graphic-DBCS capable device; on a DBCS or SBCS capable device, the same code unit appears as an SBCS character. A Unicode code unit that maps to a DBCS character is displayed as a graphic-DBCS character on a graphic-DBCS capable device. On a DBCS device, it is displayed as a bracketed DBCS character (enclosed in a shift-out and a shift-in). On an SBCS device, it is displayed as an SBCS replacement character.
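The device-dependent behavior described here can be summarized as a lookup table. The Python sketch below only restates the cases from the text; the key strings and the function name `displayed_as` are hypothetical labels chosen for this illustration, not system values.

```python
# Keys: (kind of character the code unit maps to, kind of device).
DISPLAY_RESULT = {
    ("SBCS", "graphic-DBCS"): "DBCS replacement character",
    ("SBCS", "DBCS"): "SBCS character",
    ("SBCS", "SBCS"): "SBCS character",
    ("DBCS", "graphic-DBCS"): "graphic-DBCS character",
    ("DBCS", "DBCS"): "DBCS character bracketed by shift-out and shift-in",
    ("DBCS", "SBCS"): "SBCS replacement character",
}


def displayed_as(char_kind: str, device_kind: str) -> str:
    """Look up how a converted Unicode code unit appears on a device."""
    return DISPLAY_RESULT[(char_kind, device_kind)]


if __name__ == "__main__":
    for (char_kind, device_kind), result in DISPLAY_RESULT.items():
        print(f"{char_kind} character on {device_kind} device: {result}")
```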
It is also suggested that you initialize all Unicode-capable fields in the output buffer before writing them to the screen. Unpredictable results might occur if default initialization is allowed to take place.