Combined characters

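Some characters can be made up of more than one Unicode character: a base character followed by one or more combining characters, rather than a single precomposed character.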

For example, characters that have an accent (such as é or à) or an umlaut (such as ä or ö) must be changed, or normalized, to a common format before they are stored in the directory, so that every object has a unique name. Normalizing a combined character puts the character in a known and predictable format. The format chosen for *TYPE2 directories is the canonical composed form. If the names of two objects in a *TYPE1 directory contain the same combined characters, both names are normalized to the same name. This causes a collision, even if one name contains composed combined characters and the other contains decomposed combined characters. Therefore, one of the objects has its name changed before it is linked in the *TYPE2 directory.
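
The collision described above can be illustrated with a minimal Python sketch. It assumes that the canonical composed form corresponds to Unicode Normalization Form C (NFC) as implemented by the standard unicodedata module. The function names normalize_name and find_collisions are hypothetical and exist only for this illustration; the sketch does not show how the system itself detects collisions or which new name it assigns.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Return the name in canonical composed form (assumed here to be NFC)."""
    return unicodedata.normalize("NFC", name)


def find_collisions(names):
    """Return pairs of distinct names that normalize to the same name.

    Hypothetical helper: it only reports collisions and does not rename anything.
    """
    seen = {}        # normalized name -> first original name that produced it
    collisions = []
    for name in names:
        key = normalize_name(name)
        if key in seen:
            collisions.append((seen[key], name))
        else:
            seen[key] = name
    return collisions


composed = "caf\u00e9"      # é stored as one precomposed character (U+00E9)
decomposed = "cafe\u0301"   # e followed by a combining acute accent (U+0301)

# The two names are different character sequences before normalization...
print(composed == decomposed)
# ...but they are identical after normalization, so they collide.
print(normalize_name(composed) == normalize_name(decomposed))
print(find_collisions([composed, decomposed, "plain"]))
```

Both names display identically, which is why the collision can be surprising: the difference exists only in the underlying character sequence.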