One of the interesting possibilities in a distributed relational database is that the database might not only span different types of computers, but those computers might be in different countries or regions. Even servers of the same type, such as iSeries™ servers, can encode data differently depending on the language used on the server.
Different types of servers encode data differently. For instance, an S/390®, an iSeries server, and a PS/2® system encode numeric data in their own unique formats. In addition, an S/390 and an iSeries server use the EBCDIC encoding scheme to encode character data, while a PS/2 system uses an ASCII encoding scheme.
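The following minimal Python sketch illustrates the kind of difference described above. It is not taken from any of the products mentioned. Byte order is used only as one familiar example of how numeric formats can differ, and Python's `cp037` codec (EBCDIC, USA/Canada) stands in for an EBCDIC server.

```python
import struct

# Numeric data: the same 32-bit integer laid out in two byte orders.
value = 1
big_endian = struct.pack(">i", value)     # most significant byte first
little_endian = struct.pack("<i", value)  # least significant byte first

# Character data: the same letter under an EBCDIC code page and under ASCII.
ebcdic_a = "A".encode("cp037")
ascii_a = "A".encode("ascii")

print(big_endian.hex(), little_endian.hex())  # 00000001 01000000
print(ebcdic_a.hex(), ascii_a.hex())          # c1 41
```

The value and the letter are the same in each case; only the stored bytes differ.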
For numeric data, these differences do not matter. Dissimilar systems that provide DRDA® support automatically convert any differences between the way a number is represented in one computer system and the way it is represented in another. For example, if an iSeries application program reads numeric data from a DB2 Universal Database™ for z/OS® database, DB2® UDB for z/OS sends the numeric data in S/390 format and the i5/OS™ database management system converts it to iSeries numeric format.
The handling of character data is more complex, but this too can be handled within a distributed relational database.
Not only can there be differences in encoding schemes, such as Extended Binary Coded Decimal Interchange Code (EBCDIC) versus American Standard Code for Information Interchange (ASCII), but there can also be differences related to language.
For instance, systems configured for different languages can assign different characters to the same code, or different codes to the same character. A system configured for U.S. English can assign the same code to the character } that a system configured for the Danish language assigns to å. Conversely, those two systems can assign different codes to the same character, such as $.
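Both effects can be observed with a short Python sketch. Python's standard library has no Danish EBCDIC codec, so this example assumes `cp500` (international EBCDIC) as a stand-in for the second language configuration. The characters involved therefore differ from the } and å pair described above, but the principle is the same. The function names are hypothetical.

```python
# cp037 is EBCDIC for USA/Canada. cp500 is a stand-in (an assumption made
# for illustration) for a system configured for another language.
US_EBCDIC = "cp037"
OTHER_EBCDIC = "cp500"

def same_code_different_character(code: bytes) -> bool:
    """True when one code decodes to different characters on the two systems."""
    return code.decode(US_EBCDIC) != code.decode(OTHER_EBCDIC)

def same_character_different_code(char: str) -> bool:
    """True when one character is stored under different codes on the two systems."""
    return char.encode(US_EBCDIC) != char.encode(OTHER_EBCDIC)

print(same_code_different_character(b"\x4a"))  # one code, two characters
print(same_character_different_code("!"))      # one character, two codes
print(same_character_different_code("A"))      # letters agree in both code pages
```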
If data is to be shared across different servers, character data needs to be seen by users and applications the same way. In other words, a PS/2 user in New York and an iSeries server user in Copenhagen both need to see a $ as a $, even though $ might be encoded differently in each server. Furthermore, the user in Copenhagen needs to see a }, if that is the character that was stored at New York, even though the code might be the same as a Danish å. For this to happen, the $ must be converted to the proper character encoding for a PS/2 system (that is, U.S. English character set, ASCII) when it goes from Copenhagen to New York, and converted back to Danish encoding (that is, Danish character set, EBCDIC) when it goes from New York to Copenhagen. This sort of character conversion is provided by the iSeries server as well as the other IBM® distributed relational database managers. The conversion is done consistently, in accordance with the Character Data Representation Architecture (CDRA).
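The idea of converting codes so that characters stay the same can be sketched in a few lines of Python. This is only an illustration of the principle, not the mechanism the database managers use. It assumes `cp037` (EBCDIC, USA/Canada) for the EBCDIC side and `cp437` (an ASCII-based PC code page) for the PS/2 side, and the `convert` helper is hypothetical.

```python
def convert(data: bytes, source_codec: str, target_codec: str) -> bytes:
    """Re-encode character data so the receiver sees the same characters."""
    text = data.decode(source_codec)   # sender's codes -> characters
    return text.encode(target_codec)   # characters -> receiver's codes

ebcdic_data = "$}".encode("cp037")     # as stored on the EBCDIC system
pc_data = convert(ebcdic_data, "cp037", "cp437")

print(ebcdic_data.hex())               # 5bd0
print(pc_data)                         # b'$}'

# Without conversion, the PC would interpret the EBCDIC codes as other characters.
print(ebcdic_data.decode("cp437") == "$}")  # False
```

Converting `pc_data` back with `convert(pc_data, "cp437", "cp037")` yields the original EBCDIC bytes, which mirrors the round trip between the two sites.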
CDRA specifies the way to identify the attributes of character data so that the data can be understood across servers, even if the servers use different character sets and encoding schemes. For conversion to happen across servers, each server must understand the attributes of the character data it is receiving from the other server. CDRA specifies that these attributes be identified through a coded character set identifier (CCSID). All character data in DB2 Universal Database for z/OS®, DB2 Universal Database™ for VM, and the i5/OS database management systems has a CCSID, which indicates a specific combination of encoding scheme, character set, and code page. All character data in an Extended Services® environment has a code page only (but the other database managers treat that code page identification as a CCSID). A code page is a specific set of assignments between characters and internal codes.
For example, CCSID 37 means encoding scheme 4352 (EBCDIC), character set 697 (Latin, single-byte characters), and code page 37 (USA/Canada country extended code page). CCSID 5026 means encoding scheme 4865 (extended EBCDIC), character set 1172 with code page 290 (single-byte character set for Katakana/Kanji), and character set 370 with code page 300 (double-byte character set for Katakana/Kanji).
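The two CCSIDs described above can be modeled as a small data structure. The following Python sketch uses a hypothetical `Ccsid` class; the numbers are the ones given in the text, and the layout is only one possible way to represent them.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Ccsid:
    """Hypothetical record of what a CCSID identifies."""
    ccsid: int
    encoding_scheme: int
    # One (character set, code page) pair for each byte width in use.
    pairs: Tuple[Tuple[int, int], ...]

CCSID_37 = Ccsid(37, 4352, ((697, 37),))
CCSID_5026 = Ccsid(5026, 4865, ((1172, 290), (370, 300)))

for entry in (CCSID_37, CCSID_5026):
    # The encoding scheme is often written in hexadecimal.
    print(entry.ccsid, hex(entry.encoding_scheme), entry.pairs)
```

A single-byte CCSID such as 37 carries one pair, while a mixed CCSID such as 5026 carries one pair for its single-byte data and one for its double-byte data.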
DRDA-enabled systems include mechanisms to convert character data between a wide range of CCSID-to-CCSID pairs and CCSID-to-code page pairs. Character conversion for many CCSIDs and code pages is already built into these products. For more information on CCSIDs supported by iSeries, see the i5/OS globalization topic. For a description of the use of CCSIDs on the iSeries server, see coded character set identifier (CCSID).
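A conversion keyed by CCSID pairs might be sketched as follows. The table `CCSID_TO_CODEC` and the function `convert_ccsid` are hypothetical and cover only a handful of CCSIDs for which Python has a built-in codec; the set of conversions a real product supports is far larger and is documented in the topics referenced above.

```python
# Hypothetical mapping from a few CCSIDs to Python codec names.
CCSID_TO_CODEC = {
    37: "cp037",      # EBCDIC, USA/Canada
    437: "cp437",     # PC ASCII-based code page
    500: "cp500",     # EBCDIC, international
    819: "latin_1",   # ISO 8859-1
    1208: "utf_8",    # UTF-8
}

def convert_ccsid(data: bytes, from_ccsid: int, to_ccsid: int) -> bytes:
    """Convert data tagged with one CCSID into the encoding of another."""
    try:
        source = CCSID_TO_CODEC[from_ccsid]
        target = CCSID_TO_CODEC[to_ccsid]
    except KeyError as missing:
        # The pair is not in this sketch's table, so no conversion is attempted.
        raise LookupError(f"no conversion available for CCSID {missing}") from None
    return data.decode(source).encode(target)

print(convert_ccsid(b"\xc1", 37, 1208))  # b'A'
```

Raising an error for an unknown pair, rather than passing the bytes through unchanged, avoids silently delivering data that the receiver would misread.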