Spelling (Orthography) of Pinyin

Pinyin spells Chinese words somewhat differently from the romanization systems that came before it. For example, syllables that begin with "u" are written with a "w" (as in "wang" instead of "uang"). The same approach is taken for initial "i," which becomes a "y" (as in "yang" instead of "iang"). Standalone "u" is spelled "wu," and standalone "i" is spelled "yi." Likewise, syllables starting with "ü" are spelled with an initial "y," followed by "u" without the dieresis and then the rest of the final (as in "üe" being spelled "yue"). In fact, "ü" is written as "u" whenever it would not cause confusion.
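The respelling of syllables that have no initial consonant can be sketched in a few lines of Python. This is only an illustration: the function name spell_zero_initial is hypothetical, and the input is assumed to be the unsimplified final (for example "iou" rather than "iu").

```python
def spell_zero_initial(final: str) -> str:
    """Spell a final that stands alone as a syllable (no initial consonant)."""
    if final.startswith("ü"):
        # ü, üe, üan, ün -> yu, yue, yuan, yun (the dieresis is dropped)
        return "yu" + final[1:]
    if final.startswith("i"):
        # i, in, ing keep their vowel and gain a y: yi, yin, ying
        if final in ("i", "in", "ing"):
            return "y" + final
        # otherwise the i itself is replaced by y: iang -> yang
        return "y" + final[1:]
    if final.startswith("u"):
        # standalone u gains a w: wu
        if final == "u":
            return "wu"
        # otherwise the u itself is replaced by w: uang -> wang
        return "w" + final[1:]
    # finals beginning with a, o, or e are left unchanged
    return final
```

For instance, spell_zero_initial("uang") gives "wang" and spell_zero_initial("üe") gives "yue".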

It is essential to be aware that Pinyin simplifies the spelling in many cases, even though the pronunciation remains more complex. For example, "iou," "uei," and "uen" become "iu," "ui," and "un" respectively when following a consonant, even though that is not the way they are pronounced. Similarly, the "uo" final is written as just "o" after the labial initials "b," "p," "m," and "f" (as in "buo," which is spelled "bo").
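A minimal Python sketch of these simplifications follows. The names CONTRACTIONS, LABIALS and spell_with_initial are hypothetical, and the sketch assumes that the "uo" to "o" simplification applies only after the labial initials b, p, m and f.

```python
# Finals that are shortened in writing when they follow a consonant.
CONTRACTIONS = {"iou": "iu", "uei": "ui", "uen": "un"}

# Assumption: "uo" is written "o" only after these labial initials.
LABIALS = {"b", "p", "m", "f"}


def spell_with_initial(initial: str, final: str) -> str:
    """Combine an initial consonant with an unsimplified final."""
    if final in CONTRACTIONS:
        final = CONTRACTIONS[final]
    elif final == "uo" and initial in LABIALS:
        final = "o"
    return initial + final
```

Under these assumptions, spell_with_initial("l", "iou") gives "liu" and spell_with_initial("b", "uo") gives "bo," while spell_with_initial("d", "uo") stays "duo."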

The apostrophe is employed to separate syllables that might be confused if rendered without it. For example, "pi'ao" and "piao" are two separate words: the first has two syllables and is written with two characters, while the second has one syllable and is written with one character.
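The apostrophe rule can be sketched in Python as follows. The function name join_syllables is hypothetical, and the sketch assumes the common convention that an apostrophe goes before any non-initial syllable beginning with "a," "o," or "e," with or without a tone mark.

```python
import unicodedata


def join_syllables(syllables: list[str]) -> str:
    """Join the syllables of one word, adding apostrophes where needed."""
    parts = []
    for position, syllable in enumerate(syllables):
        # Decompose so that a toned vowel such as "ǎ" starts with plain "a".
        first_letter = unicodedata.normalize("NFD", syllable)[0].lower()
        if position > 0 and first_letter in "aoe":
            parts.append("'")
        parts.append(syllable)
    return "".join(parts)
```

With this sketch, join_syllables(["pí", "ǎo"]) yields "pí'ǎo," whereas the single syllable in join_syllables(["piāo"]) is returned unchanged.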

The vowel "ê" (pronounced /ɛ/) is written with the circumflex only when it stands all by itself as a syllable. Otherwise, within finals such as "ie" and "üe," the same sound is written as plain "e."


Each character in Chinese is a separate syllable. Though many characters are compounds of two or more other characters, they are written together as a single character. Mandarin is polysyllabic, meaning that many "words" are really combinations of more than one character. The spacing in Pinyin reflects the division between words, not characters. With long words of combined meanings, spaces are inserted to divide them along their combination lines (much as English writes "stainless steel" as two words).

Duplicated words (like rénrén, meaning "everybody") are written together without a space. If the duplication is ABAB, there is a space between the sets (as in yánjiū yánjiū, meaning "to study" or "to research"). Duplicated words of the form AABB are hyphenated (as in láilái-wǎngwǎng, meaning "back and forth").

Nouns are written as a single word, including prefixes and suffixes. Location words that follow a noun are usually written separately, however (as in hé li, "in the river"). A few are joined to the noun by tradition (as in tiānshang, "in the sky"). A family name and a given name are each written as one word, with a space between the two. Names are capitalized, but titles are not if they follow the name (for example, Lǐ xiānsheng, "Mr. Li"). They are capitalized if they are honorifics coming before the name (as in Lǎo Qián, "elder Mr. Qian"). Geographical names are also capitalized.

Verbs with suffixes are written as one word. "Le," when used as a sentence-ending particle, is written as a separate word. Verbs are separated from their objects. A verb and its complement are written together only when both are single syllables; otherwise they are separated, as in zhěnglǐ hǎo (straighten out).

Pronouns are joined to their plural suffix (as in wǒ and wǒmen, "I" and "we"). Demonstrative and interrogative pronouns are generally separated from the words that follow them. There are a few exceptions: nàli (there), zhèbian (over here) and a few others.

Numbers and counters (measure words that indicate the units in which a noun is counted, like "ream" for paper) are separated from the nouns they modify.

Tones in Pinyin

[Figure: How the Four Tones Relate to Each Other]

Pinyin marks the four tones of Mandarin with diacritics, normally placed on the non-medial vowel. Tone marks can even appear over consonants in vowel-less exclamations. Neutral-tone syllables, which carry no tonal contour of their own, are usually left unmarked. Sometimes this neutral quality is classified as a fifth tone, and sometimes as a "non-tone." Sometimes it is printed with a dot before the syllable, as in "·ma" (a particle used to ask questions and written as 嗎 or 吗).

The four diacritics employed are roughly pictures of the tones themselves. They are:

• Tone 1 (high and flat): the macron or "long mark," as in "mā" (meaning "mother," written as 媽 or 妈).
• Tone 2 (middle rising to high): the acute accent, as in "má" (meaning "hemp," written as 麻).
• Tone 3 (middle falling, then rising to high): the caron or háček, as in "mǎ" (meaning "horse," written as 馬 or 马).
• Tone 4 (high, falling to low): the grave accent, as in "mà" (meaning "scold," written as 罵 or 骂).

Tone marks are often omitted in everyday use, but teaching materials keep them in order to ensure the correct pronunciation of Mandarin. They are also used to resolve ambiguities that context alone would leave open when the tone (and hence the character) is not otherwise indicated.

The tone mark is placed over the vowel if there is only one. With more than one vowel, the mark goes by priority order, with "a" at the top, followed by "o" and then "e." If none of these is present, as in "iu" and "ui," the mark goes on the last vowel. (Do not consider "w" or "y" a vowel for this purpose.)
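The placement rule can be sketched in Python as follows. This is an illustration of the commonly taught version of the rule ("a," then "o," then "e," and otherwise the last vowel), not an official algorithm; the names TONE_MARKS, mark_index and add_tone are hypothetical.

```python
import unicodedata

# Combining diacritics for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}


def mark_index(syllable: str):
    """Return the position of the vowel that carries the tone mark."""
    lowered = syllable.lower()
    # Priority order: a, then o, then e.
    for vowel in "aoe":
        if vowel in lowered:
            return lowered.index(vowel)
    # Otherwise the last of i, u, ü (covers "iu", "ui" and single vowels).
    for position in range(len(lowered) - 1, -1, -1):
        if lowered[position] in "iuü":
            return position
    return None


def add_tone(syllable: str, tone: int) -> str:
    """Add the diacritic for tones 1-4; other tone values leave it unmarked."""
    syllable = unicodedata.normalize("NFC", syllable)
    position = mark_index(syllable)
    if tone not in TONE_MARKS or position is None:
        return syllable
    # Compose the vowel and the combining mark into a single character.
    marked = unicodedata.normalize("NFC", syllable[position] + TONE_MARKS[tone])
    return syllable[:position] + marked + syllable[position + 1:]
```

For example, add_tone("guai", 4) places the mark on the "a" and returns "guài," while add_tone("liu", 2) places it on the final "u" and returns "liú."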

When the diacritical marks are not available in a character set, a common accommodation is to add a number (1, 2, 3 or 4) after each syllable to indicate the tone. The neutral tone can be left without a number, or it might bear a "0" or "5," depending on the writer's preference.
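As a rough illustration, the following Python sketch converts a syllable written with a tone mark into the numbered form. The names MARK_TO_TONE and to_numbered are hypothetical, and the neutral parameter stands in for the writer's preference ("", "0" or "5").

```python
import unicodedata

# Combining diacritics mapped to tone numbers: macron, acute, caron, grave.
MARK_TO_TONE = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}


def to_numbered(syllable: str, neutral: str = "") -> str:
    """Replace a tone diacritic with a trailing tone number."""
    tone = neutral
    kept = []
    # Decompose so each tone mark becomes a separate combining character.
    for char in unicodedata.normalize("NFD", syllable):
        if char in MARK_TO_TONE:
            tone = MARK_TO_TONE[char]
        else:
            kept.append(char)
    # Recompose so that a remaining dieresis rejoins its vowel ("ü").
    return unicodedata.normalize("NFC", "".join(kept)) + tone
```

Here to_numbered("mǎ") returns "ma3," and to_numbered("ma", neutral="5") returns "ma5." The dieresis is not a tone mark, so "lǚ" becomes "lü3."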

The Letter "ü"

Pinyin treats the front rounded vowel "ü" differently from other letters. The dieresis is written when the /y/ sound follows "l" or "n," so that, for example, "lü" can be distinguished from "lu." The former is a front high rounded vowel and the latter a back high rounded vowel. Tone marks are placed above the dieresis. In all other situations, the /y/ sound is written without the dieresis. This is the case after "j," "q," "x" or "y," as in "yú," meaning "fish" (鱼). The dieresis is not there, but the vowel is pronounced /y/ as if it were. (A Taiwanese version of Pinyin does retain the dieresis.)
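The treatment of "ü" after different initials can be sketched in Python. The function name spell_front_rounded and the set NO_DIERESIS_AFTER are hypothetical names chosen for this illustration.

```python
# Initials after which the dieresis is dropped in writing.
NO_DIERESIS_AFTER = {"j", "q", "x", "y"}


def spell_front_rounded(initial: str, final: str) -> str:
    """Combine an initial with a final, dropping the dieresis where allowed."""
    if final.startswith("ü") and initial in NO_DIERESIS_AFTER:
        # After j, q, x, y there is no contrast with plain u.
        final = "u" + final[1:]
    # After l and n the dieresis stays, keeping "lü" distinct from "lu".
    return initial + final
```

So spell_front_rounded("l", "ü") keeps the dieresis and gives "lü," while spell_front_rounded("x", "üe") gives "xue."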

Pinyin Beyond Mandarin

A system similar to Pinyin has been devised in the southern provinces. Guangdong Romanization is the name for the romanization schemes promulgated by Guangdong province for Cantonese, Teochew, Hakka and Hainanese.

Taiwan also adopted a modified form of Pinyin at the national level in 2002. This system, called Tongyong Pinyin, created a political stir, as those favoring reunification preferred using Hanyu Pinyin. So Tongyong Pinyin is used at the national level, whereas the Taipei City government, to name one important example, has switched to Hanyu Pinyin.

Since 1976, place names throughout China have been transliterated into Pinyin so that they can be pronounced by local non-Mandarin speakers. Thus, in Inner Mongolia and Tibet, for example, Pinyin is the system employed for spelling localities in phonetic form.

