What are combining characters
If you cannot find the character you need in any font you have installed or can locate, you may be able to create it by combining two or more existing characters in an EQ field. The field places each successive character on top of the previous one. You can use as many characters as you want; just separate them with commas. Use commas as the separators if the decimal symbol for your system is a period, as specified in the regional settings in the Microsoft Windows Control Panel. If the decimal symbol for your system is a comma, use semicolons.
Open the Field dialog (Word and earlier) and select the Eq field. Then click the Field Codes button. In the ensuing dialog, click the Options… button, which opens the Field Options dialog. You still have to type your elements (the characters) into the text box.
To use a comma, open parenthesis, or backslash character as one of the characters, precede the symbol with a backslash. When you insert the field braces directly, they appear with two spaces between them and the insertion point placed between the spaces. After typing the field code, press F9 to toggle field codes and update the field.
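The separator and escaping rules above can be sketched in a few lines of Python. This is only an illustration: `eq_overstrike` is a hypothetical helper, and it assumes the overstrike switch of the EQ field is written `\o(...)`, which the text above does not spell out.

```python
def eq_overstrike(chars, decimal_symbol="."):
    """Build the text of an EQ overstrike field (hypothetical helper).

    Assumes the overstrike switch is written \\o(...).
    """
    # Commas separate the elements when the decimal symbol is a period;
    # otherwise semicolons are used.
    separator = "," if decimal_symbol == "." else ";"
    escaped = []
    for ch in chars:
        # A comma, open parenthesis, or backslash used as an element
        # must be preceded by a backslash.
        if ch in {",", "(", "\\"}:
            ch = "\\" + ch
        escaped.append(ch)
    return "EQ \\o(" + separator.join(escaped) + ")"


print(eq_overstrike(["0", "/"]))       # EQ \o(0,/)
print(eq_overstrike(["0", "/"], ","))  # EQ \o(0;/)
print(eq_overstrike(["a", ","]))       # EQ \o(a,\,)
```

The returned string is what would be typed between the field braces; the braces themselves have to be inserted by Word, not typed.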
Formatting the field
Each character in the field is printed within an invisible character box. Options align the boxes on top of one another. Note that center alignment is the default; you do not need to use an alignment option if you want the characters centered.
Inserting characters in the field is just a rough start. To get the proper relationship between the characters, you may need to format one or more of them as raised, lowered, superscript, or subscript through the Format Font dialog, or use a different font altogether. You can also change the font size and spacing of the individual characters.
In fact, you can apply any kind of font formatting to the characters in the field that you might apply to any text anywhere. This may require considerable trial and error, but ultimately you should be able to get the effect you desire. Once you get the desired result, you will probably want to save it as an AutoText entry for ease of reuse.
The purpose of a slashed zero is to distinguish a zero from a capital O. A slashed zero can still be useful in situations where letters and numbers are mixed in such a way that it may not be clear what is what, such as product key codes.
Occasionally you will see a suggestion to use Unicode character 00D8. This is not a zero at all but rather a capital O with a stroke; needless to say, it is not satisfactory. Although it is possible to create a slashed zero by using an EQ field to combine a slash and a zero, consider these caveats. A slashed zero is rarely necessary in modern printed text, since contemporary fonts clearly distinguish O and 0, and the two are quite distinct electronically even if they appear similar to the eye. The zero will also not be usable in computation.
Navajo orthography uses the ogonek, which is the hook to the right, for nasalization; that is not the same as the cedilla, which is the hook to the left.
In Unicode Normalization Form C, the a and the ogonek would be replaced by the single code for a-ogonek. For display and printing, these combinations should just show the whole letters, with both accents placed properly. Up-to-date Microsoft Windows systems, for example, will do that automatically and correctly for you.
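The composition described above can be checked with Python's standard `unicodedata` module. The sketch below assumes the second accent is a combining acute (U+0301), as in Navajo a with ogonek and acute; that particular sequence is an illustrative choice.

```python
import unicodedata

# a + combining acute (U+0301) + combining ogonek (U+0328)
decomposed = "a\u0301\u0328"

# NFC first puts the marks into canonical order (ogonek before acute),
# then composes a + ogonek into the single code point U+0105.
composed = unicodedata.normalize("NFC", decomposed)

code_points = [f"U+{ord(c):04X}" for c in composed]
print(code_points)                    # ['U+0105', 'U+0301']
print(unicodedata.name(composed[0]))  # LATIN SMALL LETTER A WITH OGONEK
```

The acute stays a separate combining mark because no single code point exists for a with both ogonek and acute.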
See also the web page Where Is My Character?
Yes, I can represent, for example, X with circumflex by use of X with a combining circumflex. But it doesn't display correctly.
Fonts that properly support a repertoire with the combination you need should have the correct display. If the font doesn't support the repertoire, you can end up with various glitches in display. Exactly how things appear in that case will depend on internal details regarding how the font may handle combining marks.
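As a small illustration of the X-with-circumflex case, the following sketch uses Python's `unicodedata` module to show that the sequence stays two code points even after normalization to NFC, whereas e with circumflex composes to one. How either string is drawn is still up to the font and rendering engine, as described above.

```python
import unicodedata

# X + combining circumflex (U+0302): no precomposed code point exists,
# so NFC leaves the sequence as two code points.
x_circumflex = unicodedata.normalize("NFC", "X\u0302")

# e + combining circumflex composes to U+00EA under NFC.
e_circumflex = unicodedata.normalize("NFC", "e\u0302")

print(len(x_circumflex))  # 2
print(len(e_circumflex))  # 1
```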
To compare the possible displays of sequences with those that could have resulted if X-circumflex had been encoded as a precomposed character, see the following table. Some fonts, such as the Doulos and Charis fonts, which are freely available for download, contain large repertoires of appropriate precomposed glyphs for use by linguists and writers of minority languages.
Try checking out those fonts to see if they might cover your repertoire needs. See also Display Problems. There is no fundamental need for a precomposed character to be encoded in the standard at all in order for the font to have and display the correct precomposed glyph for the combination you need.
The hard work, in either case, is in the design for the precomposed glyph. Conceptually it seems simple enough to add a precomposed glyph to a font — after all, typically the base glyph will be in the font already. But professional font design requires considerable effort.
Any time a new accented glyph is added, attention must be paid to design integrity compared to other accented glyphs, kerning issues with all other glyphs, and the possible need for yet other ligatures. Most of this work then has to be repeated for each face of the font. The amount of work for testing the font is multiplied manyfold, because the new glyph needs testing not only by itself but also in interaction with the other glyphs in the font.
This is the fundamental reason why commercial fonts are relatively slow to adopt large new collections of precomposed glyphs into their supported repertoires.
Is there a way for font designers to provide flexible support for arbitrary accented combinations?
Yes, many modern fonts support dynamic positioning of diacritical marks using aligning anchors on base and mark glyphs or similar mechanisms. For example, such mechanisms are defined in the OpenType font specification, and many fonts in Windows 7 and later versions have this feature. Other systems, such as Mac OS X, can provide such dynamic display even in the absence of explicit font support.
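The idea of aligning anchors can be sketched conceptually as follows. This is a simplified illustration, not the OpenType data format: the `Point` type, the `mark_offset` function, and the coordinates are all made up, and the sketch assumes both glyphs are drawn from the same origin, ignoring advance widths.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def mark_offset(base_anchor: Point, mark_anchor: Point) -> Point:
    """Offset to apply to the mark so its anchor lands on the base anchor."""
    return Point(base_anchor.x - mark_anchor.x, base_anchor.y - mark_anchor.y)


# Made-up font-unit coordinates, each relative to its own glyph origin.
x_top_anchor = Point(342, 700)        # "top" anchor on the base glyph X
circumflex_anchor = Point(-250, 560)  # matching anchor on the combining circumflex

offset = mark_offset(x_top_anchor, circumflex_anchor)
print(offset)  # Point(x=592, y=140)
```

Because the offset is computed from the anchors at display time, one mark glyph can be attached to any base glyph that defines a matching anchor, without a separate precomposed glyph for each combination.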
Why are new combinations of Latin letters with diacritical marks not suitable for addition to Unicode?
There are several reasons. First, Unicode encodes many diacritical marks, and the combinations can already be produced, as noted in the answers to some questions above. If precomposed equivalents were added, the number of multiple spellings would be increased, and decompositions would need to be defined and maintained for them, adding to the complexity of existing decomposition tables in implementations.
Finally, Normalization Form C (NFC), the composed form favored for use on the Web, is frozen: no new letter combinations can be added to it. Therefore, the normalized NFC representation of any new precomposed letters would still use decomposed sequences, which can already be expressed by combining character sequences in Unicode.
Nothing would be gained by adding the letter with diacritical mark as a precomposed character; on the contrary, adding such a letter would add one or more multiple spellings to be reckoned with, incrementally complicating all Unicode implementations for no net gain.
The combining grapheme joiner (CGJ, U+034F) is not a format control character, but rather a combining mark. The presence of a combining grapheme joiner in the midst of a combining character sequence does not interrupt the combining character sequence.
And the CGJ does not have any visible display of its own. Of course, as for any such character in the Unicode Standard with no visible display, it is always possible to use a visible glyph when deliberately showing hidden characters, as for an editor's Show Symbol or Show Hidden mode. Despite its name, the combining grapheme joiner neither joins graphemes together in the way punctuation might, nor does it create new graphemes by combinations of other characters.
In particular, it cannot be used to construct grapheme clusters out of arbitrary character sequences, or to extend the scope of subsequent combining characters. It has no impact on line breaking, except that, as for other combining marks, it should be kept with its base when breaking a line.
It has several functions. First, in collation, its primary function is to prevent contractions from forming. This usage requires no tailoring of either the combining grapheme joiner or the sequence. It is possible to give sequences of characters which include the combining grapheme joiner special tailored weights; however, such an application of CGJ is not recommended. Second, the insertion of a combining grapheme joiner into a sequence of combining marks will block canonical reordering of those combining marks.
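The blocking of canonical reordering can be observed with Python's `unicodedata` module. The sketch below uses an acute (U+0301) and a dot below (U+0323) on the letter a; the choice of marks is arbitrary and serves only as an illustration.

```python
import unicodedata

CGJ = "\u034f"  # COMBINING GRAPHEME JOINER


def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in text]


# a + acute (combining class 230) + dot below (combining class 220)
plain = "a\u0301\u0323"
with_cgj = "a\u0301" + CGJ + "\u0323"

# The CGJ has canonical combining class 0, so marks are not reordered across it.
print(unicodedata.combining(CGJ))  # 0

nfd_plain = unicodedata.normalize("NFD", plain)
nfd_cgj = unicodedata.normalize("NFD", with_cgj)

# Without CGJ, the dot below is moved in front of the acute.
print(code_points(nfd_plain))  # ['U+0061', 'U+0323', 'U+0301']
# With CGJ, the original order of the marks is preserved.
print(code_points(nfd_cgj))    # ['U+0061', 'U+0301', 'U+034F', 'U+0323']
```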
This can be used in some unusual circumstances where two sequences of combining marks need to be distinguished, but where the different sequences would be neutralized by normalization. Such usage will also cause differences in collation for the affected sequences.
What shall I do?
For the Latin script, the Unicode Standard does not distinguish identically appearing diacritical marks with different functions. Doing so would result in confusion in implementations and among users.
The semantics of CGJ are such that it should impact only searching and sorting, for systems which have been tailored to distinguish it, while being otherwise ignored in interpretation. The CGJ character was encoded with this purpose in mind. This eases the interoperability problem. Both sequences will display as they should.
Implementations which need to distinguish the two for searching and sorting may systematically maintain weighting distinctions. Existing collation, searching, and matching based on the Unicode Collation Algorithm will continue to behave as originally specified.
Is it possible to apply a diacritic or combining enclosing mark to a sequence of more than one non-combining character?
What should I do?
Because the combination of the letter i or I and diacritic is already covered by characters in Unicode, no precomposed characters for Egyptological yod were separately encoded. For the diacritic, three choices are available. The placement of the diacritic is up to the font designer and rendering engine, so you should test available fonts with the preferred diacritic.
For further information on the set of comma-form and half-ring diacritics in Unicode and their relationships, see Unicode Technical Note.
Should I expect issues when using this approach for representation of Egyptological yod?
The display may work better with some fonts and on certain platforms. To help other users get the best results when viewing your webpages, it may be advisable to also include a note on your webpage, identifying which fonts and which browsers provide the best results.
I am digitizing textual materials for a language whose script contains a small letter " ", as well as a capitalized version, depicted as the letter "A" with a circle around it.