Literal strings in native methods

Literal strings are easier to encode in UTF-8 when they consist only of characters that have a 7-bit American Standard Code for Information Interchange (ASCII) representation.

If the string can be represented in ASCII, as most can, bracket it with #pragma convert directives that change the current code page of the compiler. The compiler then stores the string internally in the UTF-8 form that the Java Native Interface (JNI) requires. If the string cannot be represented in ASCII, it is easier to treat the original Extended Binary Coded Decimal Interchange Code (EBCDIC) string as a dynamic string and convert it with iconv() before passing it to the JNI. For more information about dynamic strings, see the related task on converting dynamic strings.

For example, to find the class named java/lang/String, the code looks like this:

    #pragma convert(819)
    myClass = (*env)->FindClass(env,"java/lang/String");
    #pragma convert(0)

The first pragma, with the number 819, tells the compiler to store all subsequent double-quoted strings (literal strings) in ASCII. The second pragma, with the number 0, tells the compiler to revert to its default code page for double-quoted strings, which is usually EBCDIC code page 37. Because a 7-bit ASCII string has the same byte values in UTF-8, bracketing the call with these pragmas satisfies the JNI requirement that string parameters be encoded in UTF-8.
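The difference between the two encodings can be seen at the byte level. The following is a minimal sketch, not part of the JNI or of the compiler: the helper name is_printable_ascii and the two sample arrays are hypothetical, and the EBCDIC bytes assume code page 37. It could serve as a debug check before a literal is passed to the JNI, because letters and digits in code page 37 all have byte values above 0x7F.

```c
#include <stddef.h>

/*
 * Hypothetical debug helper (not part of the JNI): returns 1 if every byte
 * of s is a printable 7-bit ASCII character (0x20..0x7E), otherwise 0.
 * An empty string returns 1.
 */
int is_printable_ascii(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    for (; *p != 0; ++p) {
        if (*p < 0x20 || *p > 0x7E) {
            return 0;
        }
    }
    return 1;
}

/* "java" as the bytes stored while #pragma convert(819) is in effect */
static const char java_ascii[] = { 0x6A, 0x61, 0x76, 0x61, 0x00 };

/* "java" as the bytes stored under EBCDIC code page 37 (assumed default) */
static const char java_ebcdic[] = {
    (char)0x91, (char)0x81, (char)0xA5, (char)0x81, 0x00
};
```

With these definitions, is_printable_ascii(java_ascii) returns 1 and is_printable_ascii(java_ebcdic) returns 0. The check is a heuristic only: an EBCDIC string made up entirely of punctuation could still pass.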

Caution: Be careful with text substitutions. For example, if your code looks like this:

    #pragma convert(819)
    #define MyString "java/lang/String"
    #pragma convert(0)
    myClass = (*env)->FindClass(env,MyString);

Then the resulting string is EBCDIC, because the value of MyString is substituted into the FindClass call during compilation, and at the point of that substitution pragma 819 is no longer in effect. The literal string is therefore not stored in ASCII.
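One possible way to avoid the macro pitfall is to define the string as a constant array instead, so that the literal itself appears in the source while pragma 819 is in effect. This is a sketch under that assumption, not a documented recommendation. The accessor my_class_name is a hypothetical name added only for illustration, and compilers that do not recognize #pragma convert ignore it.

```c
/*
 * The literal is compiled at this point in the source, while convert(819)
 * is active, so its bytes are stored in ASCII. A #define would instead
 * defer the literal to wherever the macro is expanded.
 */
#pragma convert(819)
static const char MyString[] = "java/lang/String";
#pragma convert(0)

/* Hypothetical accessor that returns the stored class name. */
const char *my_class_name(void)
{
    return MyString;
}
```

The call site stays the same as before: myClass = (*env)->FindClass(env,MyString); and it now receives a pointer to the array rather than a literal produced by macro expansion.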

Related tasks
Convert dynamic strings to and from EBCDIC, Unicode, and UTF-8