Specify the length of the field in these positions. The length
of a field containing UTF-16 data can range from 1 through 16,383 code
units. The length of a field containing UTF-8 data can range from 1 through
32,766 code units.
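For illustration only, the following DDS sketch (with hypothetical file and
field names) defines one UTF-16 field and one UTF-8 field in a physical file.
CCSID(1200) identifies UTF-16 data and CCSID(1208) identifies UTF-8 data; the
length entry is given in code units, and the exact column rules of a real DDS
source member apply.

     A          R UNIREC
     A*  NAME16 holds up to 50 UTF-16 code units (100 bytes of data)
     A            NAME16        50G         CCSID(1200)
     A*  NAME8 holds up to 100 UTF-8 code units (100 bytes of data)
     A            NAME8        100A         CCSID(1208)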
When determining the program length of a field
containing Unicode data, consider the following rules:
- Each UTF-16 code unit is 2 bytes long. The length of a UTF-16 field is
specified in UTF-16 code units; for example, a field containing 3 UTF-16
code units holds 6 bytes of data.
- Each UTF-8 code unit is 1 byte long. A UTF-8 character can be 1, 2, 3,
or 4 code units in length.
After conversion between Unicode and EBCDIC, the resulting data can be
equal in length to, longer than, or shorter than the original data. For
example, 1 UTF-16 code unit is composed of 2 bytes of data. That code unit
might convert to 1 single-byte character set (SBCS) character composed of
1 byte of data, 1 graphic double-byte character set (DBCS) character
composed of 2 bytes of data, or 1 bracketed DBCS character composed of 4
bytes of data. Therefore, when a Unicode field in the physical file is
converted to a field of a different type in the logical file, it is
recommended that the logical file field be defined with the VARLEN keyword,
with a length large enough to hold the maximum size that the Unicode field
can convert to. This accounts for the expansion that can occur, as the
sketch below shows.
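As a minimal sketch of this recommendation, assume a physical file UNIPF
whose field EMPNAME is defined as 10 UTF-16 code units (10G with
CCSID(1200)). The logical file below (hypothetical names) redefines the
field as variable-length DBCS-open data, sized at 4 bytes per code unit to
cover the bracketed DBCS worst case described above:

     A          R UNIREC                    PFILE(MYLIB/UNIPF)
     A*  40 bytes = 10 code units x 4 bytes (worst-case expansion)
     A            EMPNAME       40O         VARLEN

Because the field is variable length, shorter conversion results simply
occupy less of the declared maximum.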
On a logical file, if the length is not specified and a UTF-16
to EBCDIC conversion will take place, the length of the corresponding
physical file field is used, except in the following case:
If the physical file field is UTF-16 capable and the logical file field
has a data type of O, the length of the logical file field is 2 times the
length of the physical file field.
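For example (hypothetical names), if the physical file field EMPNAME is
UTF-16 capable and defined as 10 code units, the following logical file
field changes the data type to O and leaves the length entry blank; its
length therefore defaults to 2 × 10 = 20 bytes:

     A          R UNIREC                    PFILE(MYLIB/UNIPF)
     A*  No length given: defaults to 2 x 10 = 20 bytes
     A            EMPNAME         O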