Romanization
Adapted from Wikipedia
In linguistics, romanization or romanisation is changing text from a different writing system into the Roman (Latin) script. This helps people read and understand languages that use letters or symbols not found in the Roman alphabet.
There are different ways to do romanization. One common method is transliteration, which represents the written symbols of the original text as closely as possible. Another method is transcription, which represents how the words sound when spoken. Transcription can be phonemic, recording only the sounds that tell one word from another, or phonetic, recording every small sound very precisely.
Romanization is important because it makes it easier for people who only know the Roman alphabet to read and study other languages. It is used in many books, maps, and computer systems around the world.
Methods
There are many ways to change text from one writing system into the Roman (Latin) alphabet. We pick a method based on what we need, like making text easy to read or keeping the original sounds.
- Source language: Some methods work best for one language, keeping its special sounds. Others work for many languages.
- Target language: Most methods are made for people who speak a certain language.
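- Simplicity: The basic Latin alphabet has fewer letters than many other writing systems, so letter pairs (digraphs), accent marks (diacritics), or special characters are needed to represent everything. This affects how easy the romanized text is to type, store, and read.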
- Reversibility: Some methods let you go back to the original text, while others do not.
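The reversibility point can be illustrated with a small Python sketch. It copies two columns (DMG 1969 and UN 2012) for three Persian letters from the Persian table in this article; the function names `is_reversible` and `invert` are made up for this illustration. The check works on single letters only, so a real system could still be ambiguous once letters are joined into words.

```python
# Two romanization columns for three Persian letters, copied from the
# Persian table in this article (DMG 1969 and UN 2012).
DMG = {"ث": "s̱", "س": "s", "ص": "ṣ"}
UN_2012 = {"ث": "s", "س": "s", "ص": "s"}


def is_reversible(table: dict[str, str]) -> bool:
    """Return True if no two source letters share a romanization."""
    return len(set(table.values())) == len(table)


def invert(table: dict[str, str]) -> dict[str, str]:
    """Build the Latin-to-original table (hypothetical helper).

    Only safe when is_reversible(table) holds; otherwise information is lost.
    """
    if not is_reversible(table):
        raise ValueError("lossy table: several letters share one romanization")
    return {latin: original for original, latin in table.items()}


print(is_reversible(DMG))      # each letter keeps its own spelling
print(is_reversible(UN_2012))  # three letters collapse into "s"
```

With the DMG column the original letter can be looked up again from the Latin spelling. With the UN 2012 column it cannot, because "s" could stand for any of the three letters.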
Transliteration
Main article: Transliteration
Transliteration maps each symbol of the original script to Latin letters. It focuses on the written symbols, not on how they sound. For example, the Nihon-shiki system for Japanese lets an informed reader reconstruct the original kana exactly, but the reader needs extra knowledge to pronounce the words correctly.
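As a rough sketch, symbol-by-symbol transliteration is a lookup table applied one character at a time. The Python example below uses a few letters from the "National system (2002)" column of the Georgian table in this article. The function name `transliterate` and the choice to pass unknown characters through unchanged are assumptions made for this illustration.

```python
# A few Georgian letters with their "National system (2002)" values,
# copied from the Georgian table in this article. Not the full alphabet.
GEORGIAN_NATIONAL = {
    "ა": "a", "ე": "e", "ვ": "v", "თ": "t", "ლ": "l",
    "ო": "o", "რ": "r", "ს": "s", "ქ": "k",
}


def transliterate(text: str, table: dict[str, str]) -> str:
    """Replace each character using the table; keep unknown characters as-is."""
    return "".join(table.get(ch, ch) for ch in text)


# საქართველო is the Georgian name of the country Georgia.
print(transliterate("საქართველო", GEORGIAN_NATIONAL))  # sakartvelo
```

The function never looks at pronunciation. It only swaps written symbols, which is what separates transliteration from transcription.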
Transcription
Main article: Transcription (linguistics)
Phonemic
See also: Phonemic orthography
Most romanizations help people who don’t know the original script say the words right. They try to show the main sounds of the original language. The Hepburn system for Japanese is made to help English speakers say words correctly.
Phonetic
See also: Phonetic transcription
A phonetic method tries to show every sound in the original language, even if it makes the writing harder to read. The International Phonetic Alphabet is a common way to do this.
Compromise
For most languages, a good romanization means finding a balance. Pure transcription usually isn't possible because the original language has sounds that the reader's language lacks. Most romanizations today try to help people say words right, rather than just showing the symbols. For example, the Japanese word 柔術 is written zyûzyutu in the Nihon-shiki system, which lets a reader who knows Japanese recover the original kana, but most English readers would find the Hepburn spelling jūjutsu easier to pronounce.
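The 柔術 example can be reproduced with a short Python sketch. The word is written じゅうじゅつ in kana. The two tables below hold only the three kana units this one word needs, in the style of Nihon-shiki and Hepburn. The function `romanize` and its simple "uu" rule for long vowels are assumptions for this illustration and would not be correct for Japanese in general.

```python
# Only the kana units needed for じゅうじゅつ; real systems cover all kana.
NIHON_SHIKI = {"じゅ": "zyu", "う": "u", "つ": "tu"}
HEPBURN = {"じゅ": "ju", "う": "u", "つ": "tsu"}


def romanize(kana: str, table: dict[str, str], long_u: str) -> str:
    """Romanize kana, trying two-character units before single characters."""
    parts = []
    i = 0
    while i < len(kana):
        pair = kana[i:i + 2]
        if len(pair) == 2 and pair in table:
            parts.append(table[pair])
            i += 2
        else:
            parts.append(table[kana[i]])  # raises KeyError for unknown kana
            i += 1
    # Simplified long-vowel rule: write "uu" as one marked letter.
    return "".join(parts).replace("uu", long_u)


word = "じゅうじゅつ"
print(romanize(word, NIHON_SHIKI, "\u00fb"))  # zyûzyutu (û = u with circumflex)
print(romanize(word, HEPBURN, "\u016b"))      # jūjutsu  (ū = u with macron)
```

Both spellings come from the same kana. The Nihon-shiki table follows the kana grid more closely, while the Hepburn table uses spellings that English readers find easier to pronounce.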
Romanization of specific writing systems
See also: Category:Romanization
Arabic
The Arabic script is used to write Arabic, Persian, Urdu, Pashto, Sindhi, and other languages. Romanization standards include:
Arabic
- Deutsche Morgenländische Gesellschaft (1936)
- BS 4280 (1968)
- SATTS (1970s)
- UNGEGN (1972)
- DIN 31635 (1982)
- ISO 233 (1984)
- Qalam (1985)
- ISO 233-2 (1993)
- Buckwalter transliteration (1990s)
- ALA-LC (1997)
- Arabic chat alphabet
Persian
See also: Category:Persian orthography
Armenian
Georgian
Greek
There are romanization systems for both Modern and Ancient Greek.
Hebrew
The Hebrew alphabet is romanized using several standards:
Indic (Brahmic) scripts
See also: Devanagari transliteration, Romanization of Bengali, and Romanisation of Malayalam
The Brahmic family of abugidas is used for languages of India and south-east Asia. Various transliteration conventions have been used for Indic scripts.
- ISO 15919 (2001)
- The National Library at Kolkata romanization
- Harvard-Kyoto
- ITRANS
- ISCII (1988)
Devanagari–nastaʿlīq (Hindustani)
Hindustani is an Indo-Aryan language. Two standardized registers, Standard Hindi and Standard Urdu, are recognized as official languages in India and Pakistan.
The Hamari Boli Initiative aims to end the split between the two scripts used for Hindustani, Devanagari and nastaʿlīq, by writing the language in Roman letters.
Chinese
Romanization of the Sinitic languages, particularly Mandarin, has been difficult. Many romanization tables include Chinese characters plus one or more romanizations.
Mandarin
- ALA-LC
- EFEO
- Latinxua Sin Wenz (1926)
- Lessing-Othmer
- Postal romanization (1906)
- Wade–Giles (1892)
- Yale (1942)
- Legge romanization
China
- Hanyu Pinyin (1958)
Taiwan
Main article: Chinese language romanization in Taiwan
- Gwoyeu Romatzyh (GR, 1928–1986)
- Mandarin Phonetic Symbols II (MPS II, 1986–2002)
- Tongyong Pinyin (2002–2008)
- Hanyu Pinyin (since January 1, 2009)
Singapore
Main article: Chinese language romanisation in Singapore
Cantonese
- Barnett–Chao
- Guangdong (1960)
- Hong Kong Government
- Jyutping
- Macau Government
- Meyer–Wempe
- Sidney Lau
- Yale (1942)
- ILE romanization of Cantonese
Wu
See also: Romanization of Wu Chinese
Min Nan or Hokkien
See also: Comparison of Hokkien writing systems
Teochew
- Guangdong (1960)
Min Dong
Min Bei
Japanese
Romanization is called "rōmaji" in Japanese. The most common systems are:
- Hepburn (1867)
- Nihon-shiki (1885)
- Kunrei-shiki (1937)
- JSL (1987)
- ALA-LC
- Wāpuro
Korean
The following systems are widely used:
- McCune–Reischauer ("MR"; 1939)
- Revised Romanization of Korean (2000)
- Yale romanization of Korean (1942)
Thai
Thai is written with its own script.
- Royal Thai General System of Transcription
- ISO 11940 (1998), transliteration
- ISO 11940-2 (2007), transcription
- ALA-LC
Nuosu
The Nuosu language is written with the Yi script. The only romanization system is YYPY.
Tibetan
The Tibetan script has two official romanization systems: Tibetan Pinyin (for Lhasa Tibetan) and Roman Dzongkha (for Dzongkha).
Cyrillic
In English-language library catalogues and most academic publications, the Library of Congress (ALA-LC) transliteration method is used.
In linguistics, scientific transliteration is used.
Belarusian
See also: Belarusian Latin alphabet
- BGN/PCGN romanization of Belarusian (1979)
- Scientific transliteration
- ALA-LC romanization (1997)
- ISO 9:1995
Bulgarian
A system based on scientific transliteration was official from the 1970s. Bulgarian authorities switched to the Streamlined System in 2009.
Kyrgyz
Macedonian
Russian
There is no single accepted system of writing Russian using the Latin script. Systems include:
- BGN/PCGN (1947)
- GOST 16876-71 (1971)
- United Nations romanization system (1987)
- ISO 9 (1995)
- ALA-LC (1997)
- "Volapuk" encoding (1990s)
- Streamlined System
- Comparative transliteration
Syriac
Main article: Syriac alphabet § Latin alphabet and romanization
Ukrainian
See also: Ukrainian Latin alphabet
- ALA-LC
- ISO 9
- Ukrainian National transliteration
Table: Romanization of Persian consonants
| Unicode | Persian letter | IPA | DMG (1969) | ALA-LC (1997) | BGN/PCGN (1958) | EI (1960) | EI (2012) | UN (1967) | UN (2012) | Pronunciation |
|---|---|---|---|---|---|---|---|---|---|---|
| U+0627 | ا | ʔ, ∅ | ʾ, — | ʼ, — | ʾ | - as in uh-oh | ||||
| U+0628 | ب | b | b | b | b | b | b | b | b | B as in Bob |
| U+067E | پ | p | p | p | p | p | p | p | p | P as in pet |
| U+062A | ت | t | t | t | t | t | t | t | t | T as in tall |
| U+062B | ث | s | s̱ | s̱ | s̄ | t͟h | ṯ | s̄ | s | S as in sand |
| U+062C | ج | dʒ | ǧ | j | j | d͟j | j | j | j | J as in jam |
| U+0686 | چ | tʃ | č | ch | ch | č | č | ch | č | Ch as in Charlie |
| U+062D | ح | h | ḥ | ḥ | ḩ/ḥ | ḥ | ḥ | ḩ | h | H as in holiday |
| U+062E | خ | x | ḫ | kh | kh | k͟h | ḵ | kh | x | somewhat resembling German Ch |
| U+062F | د | d | d | d | d | d | d | d | d | D as in Dave |
| U+0630 | ذ | z | ẕ | ẕ | z̄ | d͟h | ḏ | z̄ | z | Z as in zero |
| U+0631 | ر | r | r | r | r | r | r | r | r | R as in rabbit |
| U+0632 | ز | z | z | z | z | z | z | z | z | Z as in zero |
| U+0698 | ژ | ʒ | ž | zh | zh | z͟h | ž | zh | ž | S as in television or G as in genre |
| U+0633 | س | s | s | s | s | s | s | s | s | S as in Sam |
| U+0634 | ش | ʃ | š | sh | sh | s͟h | š | sh | š | Sh as in sheep |
| U+0635 | ص | s | ṣ | ṣ | ş/ṣ | ṣ | ṣ | ş | s | S as in Sam |
| U+0636 | ض | z | ż | z̤ | ẕ | ḍ | ż | ẕ | z | Z as in zero |
| U+0637 | ط | t | ṭ | ṭ | ţ/ṭ | ṭ | ṭ | ţ | t | t as in tank |
| U+0638 | ظ | z | ẓ | ẓ | z̧/ẓ | ẓ | ẓ | z̧ | z | Z as in zero |
| U+0639 | ع | ʕ | ʿ | ʻ | ʼ | ʻ | ʻ | ʿ | ʿ | _____ |
| U+063A | غ | ɢ~ɣ | ġ | gh | gh | g͟h | ḡ | gh | q | somewhat resembling French R |
| U+0641 | ف | f | f | f | f | f | f | f | f | F as in Fred |
| U+0642 | ق | ɢ~ɣ | q | q | q | ḳ | q | q | q | somewhat resembling French R |
| U+06A9 | ک | k | k | k | k | k | k | k | k | C as in card |
| U+06AF | گ | ɡ | g | g | g | g | g | g | g | G as in go |
| U+0644 | ل | l | l | l | l | l | l | l | l | L as in lamp |
| U+0645 | م | m | m | m | m | m | m | m | m | M as in Michael |
| U+0646 | ن | n | n | n | n | n | n | n | n | N as in name |
| U+0648 | و | v~w | v | v, w | v | V as in vision | ||||
| U+0647 | ه | h | h | h | h | h | h | h | h | H as in hot |
| U+0629 | ة | ∅, t | — | h | — | t | h | — | — | |
| U+06CC | ی | j | y | y | y | y | y | y | y | Y as in Yale |
| U+0621 | ء | ʔ, ∅ | ʾ | ʼ | ʾ | |||||
| U+0623 | أ | ʔ, ∅ | ʾ | ʼ | ʾ | |||||
| U+0624 | ؤ | ʔ, ∅ | ʾ | ʼ | ʾ | |||||
| U+0626 | ئ | ʔ, ∅ | ʾ | ʼ | ʾ | |||||
Table: Romanization of Persian vowels
| Unicode | Final | Medial | Initial | Isolated | IPA | DMG (1969) | ALA-LC (1997) | BGN/PCGN (1958) | EI (2012) | UN (1967) | UN (2012) | Pronunciation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| U+064E | ـَ | ـَ | اَ | اَ | æ | a | a | a | a | a | a | A as in cat |
| U+064F | ـُ | ـُ | اُ | اُ | o | o | o | o | u | o | o | O as in go |
| U+0648 U+064F | ـو | ـو | — | — | o | o | o | o | u | o | o | O as in go |
| U+0650 | ـِ | ـِ | اِ | اِ | e | e | i | e | e | e | e | E as in ten |
| U+064E U+0627 | ـَا | ـَا | آ | آ | ɑː~ɒː | ā | ā | ā | ā | ā | ā | O as in hot |
| U+0622 | ـآ | ـآ | آ | آ | ɑː~ɒː | ā, ʾā | ā, ʼā | ā | ā | ā | ā | O as in hot |
| U+064E U+06CC | ـَی | — | — | — | ɑː~ɒː | ā | á | á | ā | á | ā | O as in hot |
| U+06CC U+0670 | ـیٰ | — | — | — | ɑː~ɒː | ā | á | á | ā | ā | ā | O as in hot |
| U+064F U+0648 | ـُو | ـُو | اُو | اُو | uː, oː | ū | ū | ū | u, ō | ū | u | U as in actual |
| U+0650 U+06CC | ـی | ـیـ | ایـ | ای | iː, eː | ī | ī | ī | i, ē | ī | i | Y as in happy |
| U+064E U+0648 | ـَو | ـَو | اَو | اَو | ow~aw | au | aw | ow | ow, aw | ow | ow | O as in go |
| U+064E U+06CC | ـَی | ـَیـ | اَیـ | اَی | ej~aj | ai | ay | ey | ey, ay | ey | ey | Ay as in play |
| U+064E U+06CC | ـیِ | — | — | — | –e, –je | –e, –ye | –i, –yi | –e, –ye | –e, –ye | –e, –ye | –e, –ye | Ye as in yes |
| U+06C0 | ـهٔ | — | — | — | –je | –ye | –ʼi | –ye | –ye | –ye | –ye | Ye as in yes |
Table: Romanization of Georgian letters
| Georgian letter | IPA | National system (2002) | BGN/PCGN (1981–2009) | ISO 9984 (1996) | ALA-LC (1997) | Unofficial system | Kartvelo translit | NGR2 |
|---|---|---|---|---|---|---|---|---|
| ა | /ɑ/ | a | a | a | a | a | a | a |
| ბ | /b/ | b | b | b | b | b | b | b |
| გ | /ɡ/ | g | g | g | g | g | g | g |
| დ | /d/ | d | d | d | d | d | d | d |
| ე | /ɛ/ | e | e | e | e | e | e | e |
| ვ | /v/ | v | v | v | v | v | v | v |
| ზ | /z/ | z | z | z | z | z | z | z |
| ჱ | /eɪ/ | ey | ē | ē | é | ej | ẽ | |
| თ | /tʰ/ | t | tʼ | t̕ | tʻ | T or t | t | t / t̊ |
| ი | /i/ | i | i | i | i | i | i | i |
| კ | /kʼ/ | kʼ | k | k | k | k | ǩ | k̉ |
| ლ | /l/ | l | l | l | l | l | l | l |
| მ | /m/ | m | m | m | m | m | m | m |
| ნ | /n/ | n | n | n | n | n | n | n |
| ჲ | /i/, /j/ | j | y | y | j | ĩ | ||
| ო | /ɔ/ | o | o | o | o | o | o | o |
| პ | /pʼ/ | pʼ | p | p | p | p | p̌ | p̉ |
| ჟ | /ʒ/ | zh | zh | ž | ž | J, zh or j | ž | g̃ |
| რ | /r/ | r | r | r | r | r | r | r |
| ს | /s/ | s | s | s | s | s | s | s |
| ტ | /tʼ/ | tʼ | t | t | t | t | t̆ | t̉ |
| ჳ | /w/ | w | w | ŭ | f̃ | |||
| უ | /u/ | u | u | u | u | u | u | u |
| ფ | /pʰ/ | p | pʼ | p̕ | pʻ | p or f | p | p / p̊ |
| ქ | /kʰ/ | k | kʼ | k̕ | kʻ | q or k | q or k | k / k̊ |
| ღ | /ʁ/ | gh | gh | ḡ | ġ | g, gh or R | g, gh or R | q̃ |
| ყ | /qʼ/ | qʼ | q | q | q | y | q | q |
| შ | /ʃ/ | sh | sh | š | š | sh or S | š | x |
| ჩ | /t͡ʃ(ʰ)/ | ch | chʼ | č̕ | čʻ | ch or C | č | c̃ |
| ც | /t͡s(ʰ)/ | ts | tsʼ | c̕ | cʻ | c or ts | c | c |
| ძ | /d͡z/ | dz | dz | j | ż | dz or Z | ʒ | d̃ |
| წ | /t͡sʼ/ | tsʼ | ts | c | c | w, c or ts | ʃ | c̉ |
| ჭ | /t͡ʃʼ/ | chʼ | ch | č | č | W, ch or tch | ʃ̌ | j̉ |
| ხ | /χ/ | kh | kh | x | x | x or kh (rarely) | x | k̃ |
| ჴ | /q/, /qʰ/ | qʼ | ẖ | x̣ | q̌ | q̊ | ||
| ჯ | /d͡ʒ/ | j | j | ǰ | j | j | - | j |
| ჰ | /h/ | h | h | h | h | h | h | h |
| ჵ | /oː/ | ō | ō | ȯ | h̃ |
Overview and summary
The chart below shows the most common phonemic transcriptions used to romanize several alphabets. It is good enough for many casual users, but each alphabet has alternative systems, and there are many exceptions. For more information, see the sections for each language above. (Hangul characters are broken down into jamo components.)
| Romanized | IPA | Greek | Cyrillic | Amazigh | Hebrew | Arabic | Persian | Katakana | Hangul | Bopomofo |
|---|---|---|---|---|---|---|---|---|---|---|
| A | a | A | А | ⴰ | ַ, ֲ, ָ | َ, ا | ا, آ | ア | ㅏ | ㄚ |
| AE | ai̯/ɛ | ΑΙ | | | | | | | ㅐ | |
| AI | ai | | | | י ַ | | | | | ㄞ |
| B | b | ΜΠ, Β | Б | ⴱ | בּ | ﺏ ﺑ ﺒ ﺐ | ﺏ ﺑ | | ㅂ | ㄅ |
| C | k/s | Ξ | | | | | | | | ㄘ |
| CH | ʧ | TΣ̈ | Ч | | צ׳ | | چ | | ㅊ | ㄔ |
| CHI | ʨi | | | | | | | チ | | |
| D | d | ΝΤ, Δ | Д | ⴷ, ⴹ | ד | ﺩ — ﺪ, ﺽ ﺿ ﻀ ﺾ | د | | ㄷ | ㄉ |
| DH | ð | Δ | | | דֿ | ﺫ — ﺬ | | | | |
| DZ | ʣ | ΤΖ | Ѕ | | | | | | | |
| E | e/ɛ | Ε, ΑΙ | Э | ⴻ | , ֱ, י ֵֶ, ֵ, י ֶ | | | エ | ㅔ | ㄟ |
| EO | ʌ | | | | | | | | ㅓ | |
| EU | ɯ | | | | | | | | ㅡ | |
| F | f | Φ | Ф | ⴼ | פ (or its final form ף ) | ﻑ ﻓ ﻔ ﻒ | ﻑ | | | ㄈ |
| FU | ɸɯ | | | | | | | フ | | |
| G | ɡ | ΓΓ, ΓΚ, Γ | Г | ⴳ, ⴳⵯ | ג | | گ | | ㄱ | ㄍ |
| GH | ɣ | Γ | Ғ | ⵖ | גֿ, עֿ | ﻍ ﻏ ﻐ ﻎ | ق غ | | | |
| H | h | Η | Һ | ⵀ, ⵃ | ח, ה | ﻩ ﻫ ﻬ ﻪ, ﺡ ﺣ ﺤ ﺢ | ه ح ﻫ | | ㅎ | ㄏ |
| HA | ha | | | | | | | ハ | | |
| HE | he | | | | | | | ヘ | | |
| HI | hi | | | | | | | ヒ | | |
| HO | ho | | | | | | | ホ | | |
| I | i/ɪ | Η, Ι, Υ, ΕΙ, ΟΙ | И, І | ⵉ | ִ, י ִ | دِ | | イ | ㅣ | ㄧ |
| IY | ij | | | | | دِي | | | | |
| J | ʤ | TZ̈ | ДЖ, Џ | ⵊ | ג׳ | ﺝ ﺟ ﺠ ﺞ | ج | | ㅈ | ㄐ |
| JJ | ʦ͈/ʨ͈ | | | | | | | | ㅉ | |
| K | k | Κ | К | ⴽ, ⴽⵯ | כּ | ﻙ ﻛ ﻜ ﻚ | ک | | ㅋ | ㄎ |
| KA | ka | | | | | | | カ | | |
| KE | ke | | | | | | | ケ | | |
| KH | x | X | Х | ⵅ | כ, חֿ (or its final form ך ) | ﺥ ﺧ ﺨ ﺦ | خ | | | |
| KI | ki | | | | | | | キ | | |
| KK | k͈ | | | | | | | | ㄲ | |
| KO | ko | | | | | | | コ | | |
| KU | kɯ | | | | | | | ク | | |
| L | l | Λ | Л | ⵍ | ל | ﻝ ﻟ ﻠ ﻞ | ل | | ㄹ | ㄌ |
| M | m | Μ | М | ⵎ | מ (or its final form ם ) | ﻡ ﻣ ﻤ ﻢ | م | | ㅁ | ㄇ |
| MA | ma | | | | | | | マ | | |
| ME | me | | | | | | | メ | | |
| MI | mi | | | | | | | ミ | | |
| MO | mo | | | | | | | モ | | |
| MU | mɯ | | | | | | | ム | | |
| N | n | Ν | Н | ⵏ | נ (or its final form ן ) | ﻥ ﻧ ﻨ ﻦ | ن | ン | ㄴ | ㄋ |
| NA | na | | | | | | | ナ | | |
| NE | ne | | | | | | | ネ | | |
| NG | ŋ | | | | | | | | ㅇ | |
| NI | ɲi | | | | | | | ニ | | |
| NO | no | | | | | | | ノ | | |
| NU | nɯ | | | | | | | ヌ | | |
| O | o | Ο, Ω | О | , ֳ, וֹֹ | ُا | オ | ㅗ | |||
| OE | ø | | | | | | | | ㅚ | |
| P | p | Π | П | | פּ | | پ | | ㅍ | ㄆ |
| PP | p͈ | | | | | | | | ㅃ | |
| PS | ps | Ψ | | | | | | | | |
| Q | q | Θ | | ⵇ | ק | ﻕ ﻗ ﻘ ﻖ | غ ق | | | ㄑ |
| R | r | Ρ | Р | ⵔ, ⵕ | ר | ﺭ — ﺮ | ر | | ㄹ | ㄖ |
| RA | ɾa | | | | | | | ラ | | |
| RE | ɾe | | | | | | | レ | | |
| RI | ɾi | | | | | | | リ | | |
| RO | ɾo | | | | | | | ロ | | |
| RU | ɾɯ | | | | | | | ル | | |
| S | s | Σ | С | ⵙ, ⵚ | ס, שׂ | ﺱ ﺳ ﺴ ﺲ, ﺹ ﺻ ﺼ ﺺ | س ث ص | | ㅅ | ㄙ |
| SA | sa | | | | | | | サ | | |
| SE | se | | | | | | | セ | | |
| SH | ʃ | Σ̈ | Ш | ⵛ | שׁ | ﺵ ﺷ ﺸ ﺶ | ش | | | ㄕ |
| SHCH | ʃʧ | | Щ | | | | | | | |
| SHI | ɕi | | | | | | | シ | | |
| SO | so | | | | | | | ソ | | |
| SS | s͈ | | | | | | | | ㅆ | |
| SU | sɯ | | | | | | | ス | | |
| T | t | Τ | Т | ⵜ, ⵟ | ט, תּ, ת | ﺕ ﺗ ﺘ ﺖ, ﻁ ﻃ ﻄ ﻂ | ت ط | | ㅌ | ㄊ |
| TA | ta | | | | | | | タ | | |
| TE | te | | | | | | | テ | | |
| TH | θ | Θ | | | תֿ | ﺙ ﺛ ﺜ ﺚ | | | | |
| TO | to | | | | | | | ト | | |
| TS | ʦ | ΤΣ | Ц | | צ (or its final form ץ ) | | | | | |
| TSU | ʦɯ | | | | | | | ツ | | |
| TT | t͈ | | | | | | | | ㄸ | |
| U | u | ΟΥ, Υ | У | ⵓ | , וֻּ | دُ | | ウ | ㅜ | ㄩ |
| UI | ɰi | | | | | | | | ㅢ | |
| UW | uw | | | | | دُو | | | | |
| V | v | B | В | | ב | | و | | | |
| W | w | Ω | | ⵡ | ו, וו | ﻭ — ﻮ | | | | |
| WA | wa | | | | | | | ワ | ㅘ | |
| WAE | wɛ | | | | | | | | ㅙ | |
| WE | we | | | | | | | ヱ | ㅞ | |
| WI | y/ɥi | | | | | | | ヰ | ㅟ | |
| WO | wo | | | | | | | ヲ | ㅝ | |
| X | x/ks | Ξ, Χ | | | | | | | | ㄒ |
| Y | j | Υ, Ι, ΓΙ | Й, Ы, Ј | ⵢ | י | ﻱ ﻳ ﻴ ﻲ | ی | | | |
| YA | ja | | Я | | | | | ヤ | ㅑ | |
| YAE | jɛ | | | | | | | | ㅒ | |
| YE | je | | Е, Є | | | | | | ㅖ | |
| YEO | jʌ | | | | | | | | ㅕ | |
| YI | ji | | Ї | | | | | | | |
| YO | jo | | Ё | | | | | ヨ | ㅛ | |
| YU | ju | | Ю | | | | | ユ | ㅠ | |
| Z | z | Ζ | З | ⵣ, ⵥ | ז | ﺯ — ﺰ, ﻅ ﻇ ﻈ ﻆ | ز ظ ذ ض | | | ㄗ |
| ZH | ʐ/ʒ | Ζ̈ | Ж | | ז׳ | | ژ | | | ㄓ |
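The note above the chart says that Hangul characters are broken down into jamo. As a hedged Python sketch, the standard `unicodedata.normalize` function with the "NFD" form splits each Hangul syllable block into its jamo, which can then be looked up one by one. The small table `JAMO_TO_LATIN` covers only the six jamo in the word 한글, with values taken from the H, A, N, G, EU and L rows of the chart. The table, the function names, and the choice of "l" for final ㄹ are assumptions for this illustration.

```python
import unicodedata

# Jamo found in the word 한글, keyed by the code points that NFD produces.
JAMO_TO_LATIN = {
    "\u1112": "h",   # initial ㅎ
    "\u1161": "a",   # vowel ㅏ
    "\u11ab": "n",   # final ㄴ
    "\u1100": "g",   # initial ㄱ
    "\u1173": "eu",  # vowel ㅡ
    "\u11af": "l",   # final ㄹ (assumed "l" at the end of a syllable)
}


def to_jamo(text: str) -> str:
    """Split Hangul syllable blocks into their individual jamo."""
    return unicodedata.normalize("NFD", text)


def romanize_hangul(text: str) -> str:
    """Look up each jamo; raises KeyError for jamo missing from the table."""
    return "".join(JAMO_TO_LATIN[jamo] for jamo in to_jamo(text))


print(len(to_jamo("한글")))     # 6: two syllable blocks, three jamo each
print(romanize_hangul("한글"))  # hangeul
```

A full romanizer would need every jamo and the sound-change rules between syllables, which this sketch leaves out.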
This article is a child-friendly adaptation of the Wikipedia article on Romanization, available under CC BY-SA 4.0.