Safekipedia

Romanization

Adapted from Wikipedia · Adventurer experience

In linguistics, romanization or romanisation is changing text from a different writing system into the Roman (Latin) script. This helps people read and understand languages that use letters or symbols not found in the Roman alphabet.

There are different ways to do romanization. One common method is transliteration, which represents the written text as closely as possible. Another is transcription, which tries to capture how words sound when spoken. Transcription can be phonemic, recording only the sounds that tell one word apart from another, or phonetic, recording every small sound of speech very precisely.

Romanization is important because it makes it easier for people who only know the Roman alphabet to read and study other languages. It is used in many books, maps, and computer systems around the world.

Methods

There are many ways to change text from one writing system into the Roman (Latin) alphabet. We pick a method based on what we need, like making text easy to read or keeping the original sounds.

  • Source language: Some methods work best for one language, keeping its special sounds. Others work for many languages.
  • Target language: Most methods are made for people who speak a certain language.
  • Simplicity: The basic Latin alphabet has fewer letters than many other writing systems, so letter pairs (digraphs), accent marks (diacritics), or special characters are needed to show all the sounds.
  • Reversibility: Some methods let you go back to the original text, while others do not.
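
The reversibility point can be checked with a few lines of Python. This is only a small sketch, not an official standard. The function name `invert` and both sample tables are made up for illustration. The first table gives each Cyrillic letter its own Latin letter, in the style of ISO 9. The second follows the sound-based chart at the end of this article, where the Persian letters س, ث and ص are all written s.

```python
def invert(table):
    """Return the reverse table, or None if two symbols share one romanization."""
    reverse = {}
    for symbol, latin in table.items():
        if latin in reverse:
            # Two different symbols became the same Latin text,
            # so the original cannot be recovered.
            return None
        reverse[latin] = symbol
    return reverse

# Hypothetical sample tables (tiny subsets, for illustration only).
one_to_one = {"ж": "ž", "ш": "š", "ч": "č"}
sound_based = {"س": "s", "ث": "s", "ص": "s"}

print(invert(one_to_one) is not None)   # True: can be reversed
print(invert(sound_based) is not None)  # False: cannot be reversed
```

A one-to-one table can always be turned around, while a table that merges several symbols into one letter loses information.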

Transliteration

Main article: Transliteration

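Transliteration replaces each symbol of the original script with a letter or group of letters from the Latin alphabet. It follows the written symbols, not the way they sound. For example, the Nihon-shiki system for Japanese lets an informed reader work out the original kana symbols exactly, although extra knowledge is needed to pronounce the words correctly.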

Transcription

Main article: Transcription (linguistics)

Phonemic

See also: Phonemic orthography

Most romanizations help people who don’t know the original script say the words right. They try to show the main sounds of the original language. The Hepburn system for Japanese is made to help English speakers say words correctly.

Phonetic

See also: Phonetic transcription

A phonetic method tries to show every sound in the original language, even if it makes the writing harder to read. The International Phonetic Alphabet is a common way to do this.

Compromise

For most languages, a good romanization means finding a balance between transliteration and transcription. Pure transcription is usually not possible, because the original language has sounds that the reader's language does not. Most romanizations today try to help people say words right, rather than just showing the symbols. For example, the Japanese word 柔術 is written zyûzyutu in the Nihon-shiki system, which lets a reader who knows Japanese recover the original kana, but most English readers find the Hepburn spelling jūjutsu easier to say.
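
The 柔術 example can be sketched in Python. In kana the word is written じゅうじゅつ; zyûzyutu is its Nihon-shiki spelling and jūjutsu is its Hepburn spelling. The sketch below is a simplified illustration, not a full romanizer. The function name `romanize` is made up, the two tables hold only the kana needed for this one word, and the long-vowel rule (turning "uu" into one marked letter) is a shortcut that real systems handle with more care.

```python
# Hypothetical mini-tables: only the kana needed for じゅうじゅつ.
NIHON_SHIKI = {"じゅ": "zyu", "じ": "zi", "う": "u", "つ": "tu"}
HEPBURN = {"じゅ": "ju", "じ": "ji", "う": "u", "つ": "tsu"}

def romanize(kana, table, long_u):
    """Romanize kana with a lookup table, trying two-kana pairs first."""
    parts = []
    i = 0
    while i < len(kana):
        for size in (2, 1):
            chunk = kana[i:i + size]
            if chunk in table:
                parts.append(table[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"no entry for {kana[i]!r}")
    # Simplified long-vowel rule: write "uu" as one marked letter.
    return "".join(parts).replace("uu", long_u)

print(romanize("じゅうじゅつ", NIHON_SHIKI, "û"))  # zyûzyutu
print(romanize("じゅうじゅつ", HEPBURN, "ū"))      # jūjutsu
```

Both spellings come from the same kana. Only the lookup table changes, which is why choosing a romanization system is mostly a choice about who the reader is.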

Romanization of specific writing systems

See also: Category:Romanization

Arabic

The Arabic script is used to write Arabic, Persian, Urdu, Pashto, Sindhi, and other languages. Romanization standards include:

Arabic

Persian

See also: Category:Persian orthography

Armenian

Georgian

Greek

There are romanization systems for both Modern and Ancient Greek.

Hebrew

The Hebrew alphabet is romanized using several standards.

Indic (Brahmic) scripts

See also: Devanagari transliteration, Romanization of Bengali, and Romanisation of Malayalam

The Brahmic family of abugidas is used for languages of India and south-east Asia. Various transliteration conventions have been used for Indic scripts.

Devanagari–nastaʿlīq (Hindustani)

Hindustani is an Indo-Aryan language. Its two standardized registers, Standard Hindi (usually written in Devanagari) and Standard Urdu (usually written in the Nastaʿlīq style of the Arabic script), are recognized as official languages in India and Pakistan.

The Hamari Boli Initiative aims to give Hindustani one shared script through romanization, so that the same text does not need to be written in two different scripts.

Chinese

Romanization of the Sinitic languages, particularly Mandarin, has been difficult. Many romanization tables include Chinese characters plus one or more romanizations.

Mandarin

China

Taiwan

Main article: Chinese language romanization in Taiwan

  1. Gwoyeu Romatzyh (GR, 1928–1986)
  2. Mandarin Phonetic Symbols II (MPS II, 1986–2002)
  3. Tongyong Pinyin (2002–2008)
  4. Hanyu Pinyin (since January 1, 2009)

Singapore

Main article: Chinese language romanisation in Singapore

Cantonese

Wu

See also: Romanization of Wu Chinese

Min Nan or Hokkien

See also: Comparison of Hokkien writing systems

Teochew
  • Guangdong (1960)

Min Dong

Min Bei

Japanese

Romanization is called "rōmaji" in Japanese. The most common systems are Hepburn, Kunrei-shiki, and Nihon-shiki.

Korean

Widely used systems include the Revised Romanization of Korean, McCune–Reischauer, and Yale.

Thai

Thai is written with its own script.

Nuosu

The Nuosu language is written with the Yi script. The only romanization system is YYPY.

Tibetan

The Tibetan script has two official romanization systems: Tibetan Pinyin (for Tibetan) and Roman Dzongkha (for Dzongkha).

Cyrillic

In English-language library catalogues and most academic publications, the Library of Congress (ALA-LC) transliteration method is used.

In linguistics, scientific transliteration is used.

Belarusian

See also: Belarusian Latin alphabet

Bulgarian

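A system based on scientific transliteration was official in Bulgaria from the 1970s. Since the late 1990s, Bulgarian authorities have used the Streamlined System, which avoids diacritics, and a law passed in 2009 made it mandatory for public use.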

Kyrgyz

Macedonian

Russian

There is no single accepted system of writing Russian using the Latin script. Systems include:

  • BGN/PCGN (1947)
  • GOST 16876-71 (1971)
  • United Nations romanization system (1987)
  • ISO 9 (1995)
  • ALA-LC (1997)
  • "Volapuk" encoding (1990s)
  • Streamlined System
  • Comparative transliteration

Syriac

Main article: Syriac alphabet § Latin alphabet and romanization

Ukrainian

See also: Ukrainian Latin alphabet

  • ALA-LC
  • ISO 9
  • Ukrainian National transliteration

Persian consonants

Columns: Unicode, Persian letter, IPA, DMG (1969), ALA-LC (1997), BGN/PCGN (1958), EI (1960), EI (2012), UN (1967), UN (2012), Pronunciation
U+0627اʔ, ∅ʾ, —ʼ, —ʾ- as in uh-oh
U+0628بbbB as in Bob
U+067EپppP as in pet
U+062AتttT as in tall
U+062Bثst͟hsS as in sand
U+062Cجǧjjd͟jjjJ as in jam
U+0686چčchchčchčCh as in Charlie
U+062Dحhḩ/ḥhH as in holiday
U+062Eخxkhkhk͟hkhxsomewhat resembling German Ch
U+062FدddD as in Dave
U+0630ذzd͟hzZ as in zero
U+0631رrrR as in rabbit
U+0632زzzZ as in zero
U+0698ژʒžzhzhz͟hžzhžS as in television
or G as in genre
U+0633سssS as in Sam
U+0634شʃšshshs͟hšshšSh as in sheep
U+0635صsş/ṣşsS as in Sam
U+0636ضzżżzZ as in zero
U+0637طtţ/ṭţtt as in tank
U+0638ظzz̧/ẓzZ as in zero
U+0639عʕʿʻʼʻʻʿʿ_____
U+063Aغɢ~ɣġghghg͟hghqsomewhat resembling French R
U+0641فffF as in Fred
U+0642قɢ~ɣqqsomewhat resembling French R
U+06A9کkkC as in card
U+06AFگɡgG as in go
U+0644لllL as in lamp
U+0645مmmM as in Michael
U+0646نnnN as in name
U+0648وv~wvv, wvV as in vision
U+0647هhhhhhhhH as in hot
U+0629ة∅, thth
U+06CCیjyY as in Yale
U+0621ءʔ, ∅ʾʼʾ
U+0623أʔ, ∅ʾʼʾ
U+0624ؤʔ, ∅ʾʼʾ
U+0626ئʔ, ∅ʾʼʾ

Persian vowels

Columns: Unicode, Final, Medial, Initial, Isolated, IPA, DMG (1969), ALA-LC (1997), BGN/PCGN (1958), EI (2012), UN (1967), UN (2012), Pronunciation
U+064EـَـَاَاَæaaaaaaA as in cat
U+064FـُـُاُاُoooouooO as in go
U+0648 U+064FـوـوoooouooO as in go
U+0650ـِـِاِاِeeieeeeE as in ten
U+064E U+0627ـَاـَاآآɑː~ɒːāāāāāāO as in hot
U+0622ـآـآآآɑː~ɒːā, ʾāā, ʼāāāāāO as in hot
U+064E U+06CCـَیɑː~ɒːāááāáāO as in hot
U+06CC U+0670ـیٰɑː~ɒːāááāāāO as in hot
U+064F U+0648ـُوـُواُواُوuː, oːūūūu, ōūuU as in actual
U+0650 U+06CCـیـیـایـایiː, eːīīīi, ēīiY as in happy
U+064E U+0648ـَوـَواَواَوow~awauawowow, awowowO as in go
U+064E U+06CCـَیـَیـاَیـاَیej~ajaiayeyey, ayeyeyAy as in play
U+064E U+06CCـیِ–e, –je–e, –ye–i, –yi–e, –ye–e, –ye–e, –ye–e, –yeYe as in yes
U+06C0ـهٔ–je–ye–ʼi–ye–ye–ye–yeYe as in yes
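
The Unicode column in the Persian tables gives each symbol's code point, and some rows list two code points that combine into one written form. As a small sketch of how to read that column, the Python below turns such a label back into characters. The function name `from_code_points` is made up for this example.

```python
def from_code_points(label):
    """Turn a label such as 'U+064E U+0627' into the characters it names."""
    # Drop the 'U+' prefix of each part and read the rest as a hexadecimal number.
    return "".join(chr(int(part[2:], 16)) for part in label.split())

print(from_code_points("U+0628"))              # ب
print(len(from_code_points("U+064E U+0627")))  # 2 (two code points)
```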

Georgian romanization systems

Columns: Georgian letter, IPA, National system (2002), BGN/PCGN (1981–2009), ISO 9984 (1996), ALA-LC (1997), Unofficial system, Kartvelo translit, NGR2
/ɑ/aaaaaaa
/b/bbbbbbb
/ɡ/ggggggg
/d/ddddddd
/ɛ/eeeeeee
/v/vvvvvvv
/z/zzzzzzz
/eɪ/eyēēéej
/tʰ/tT or ttt / t̊
/i/iiiiiii
/kʼ/kkkkǩ
/l/lllllll
/m/mmmmmmm
/n/nnnnnnn
/i/, /j/jyyjĩ
/ɔ/ooooooo
/pʼ/pppp
/ʒ/zhzhžžJ, zh or jž
/r/rrrrrrr
/s/sssssss
/tʼ/tttt
/w/wwŭ
/u/uuuuuuu
/pʰ/pp or fpp / p̊
/kʰ/kq or kq or kk / k̊
/ʁ/ghghġg, gh or Rg, gh or R
/qʼ/qqqyqq
/ʃ/shshššsh or Sšx
/t͡ʃ(ʰ)/chchʼč̕čʻch or Cč
/t͡s(ʰ)/tstsʼc or tscc
/d͡z/dzdzjżdz or Zʒ
/t͡sʼ/tsʼtsccw, c or tsʃ
/t͡ʃʼ/chʼchččW, ch or tchʃ̌
/χ/khkhxxx or kh (rarely)x
/q/, /qʰ/
/d͡ʒ/jjǰjj-j
/h/hhhhhhh
/oː/ōōȯ

Overview and summary

The chart below shows the most common sound-based (phonemic) romanization for several writing systems. It is good enough for many everyday uses, but each writing system also has other romanizations, and some letters do not always follow the same rules. For more information, see the sections for each language above. (Hangul characters are broken down into jamo pieces.)

Columns: Romanized, IPA, Greek, Cyrillic, Amazigh, Hebrew, Arabic, Persian, Katakana, Hangul, Bopomofo
AaAАַ, ֲ, ָَ, اا, آ
AEai̯/ɛΑΙ
AIaiי ַ
BbΜΠ, ΒБבּﺏ ﺑ ﺒ ﺐﺏ ﺑ
Ck/sΞ
CHʧTΣ̈Чצ׳چ
CHIʨi
DdΝΤ, ΔДⴷ, ⴹדﺩ — ﺪ, ﺽ ﺿ ﻀ ﺾد
DHðΔדֿﺫ — ﺬ
DZʣΤΖЅ
Ee/ɛΕ, ΑΙЭ, ֱ, י ֵֶ, ֵ, י ֶ
EOʌ
EUɯ
FfΦФפ (or its final form ף )ﻑ ﻓ ﻔ ﻒ
FUɸɯ
GɡΓΓ, ΓΚ, ΓГⴳ, ⴳⵯגگ
GHɣΓҒגֿ, עֿﻍ ﻏ ﻐ ﻎق غ
HhΗҺⵀ, ⵃח, הﻩ ﻫ ﻬ ﻪ, ﺡ ﺣ ﺤ ﺢه ح ﻫ
HAha
HEhe
HIhi
HOho
Ii/ɪΗ, Ι, Υ, ΕΙ, ΟΙИ, Іִ, י ִدِ
IYijدِي
JʤTZ̈ДЖ, Џג׳ﺝ ﺟ ﺠ ﺞج
JJʦ͈/ʨ͈
KkΚКⴽ, ⴽⵯכּﻙ ﻛ ﻜ ﻚک
KAka
KEke
KHxXХכ, חֿ (or its final form ך )ﺥ ﺧ ﺨ ﺦخ
KIki
KK
KOko
KU
LlΛЛלﻝ ﻟ ﻠ ﻞل
MmΜМמ (or its final form ם )ﻡ ﻣ ﻤ ﻢم
MAma
MEme
MImi
MOmo
MU
NnΝНנ (or its final form ן )ﻥ ﻧ ﻨ ﻦن
NAna
NEne
NGŋ
NIɲi
NOno
NU
OoΟ, ΩО, ֳ, וֹֹُا
OEø
PpΠПפּپ
PP
PSpsΨ
QqΘקﻕ ﻗ ﻘ ﻖغ ق
RrΡРⵔ, ⵕרﺭ — ﺮر
RAɾa
REɾe
RIɾi
ROɾo
RUɾɯ
SsΣСⵙ, ⵚס, שׂﺱ ﺳ ﺴ ﺲ, ﺹ ﺻ ﺼ ﺺس ث ص
SAsa
SEse
SHʃΣ̈Шשׁﺵ ﺷ ﺸ ﺶش
SHCHʃʧЩ
SHIɕi
SOso
SS
SU
TtΤТⵜ, ⵟט, תּ, תﺕ ﺗ ﺘ ﺖ, ﻁ ﻃ ﻄ ﻂت ط
TAta
TEte
THθΘתֿﺙ ﺛ ﺜ ﺚ
TOto
TSʦΤΣЦצ (or its final form ץ )
TSUʦɯ
TT
UuΟΥ, ΥУ, וֻּدُ
UIɰi
UWuwدُو
VvBВבو
WwΩו, ווﻭ — ﻮ
WAwa
WAE
WEwe
WIy/ɥi
WOwo
Xx/ksΞ, Χ
YjΥ, Ι, ΓΙЙ, Ы, Јיﻱ ﻳ ﻴ ﻲی
YAjaЯ
YAE
YEjeЕ, Є
YEO
YIjiЇ
YOjoЁ
YUjuЮ
ZzΖЗⵣ, ⵥזﺯ — ﺰ, ﻅ ﻇ ﻈ ﻆز ظ ذ ض
ZHʐ/ʒΖ̈Жז׳ژ
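
The note about Hangul being broken down into jamo pieces can be tried out in Python. The sketch below uses the standard `unicodedata` module. Unicode's NFD normalization splits each Hangul syllable block into its jamo. The function name `to_jamo` and the sample syllable 한 (han) are chosen only for illustration.

```python
import unicodedata

def to_jamo(text):
    """Split Hangul syllable blocks into jamo using NFD normalization."""
    return list(unicodedata.normalize("NFD", text))

pieces = to_jamo("한")
# One syllable block becomes three jamo: h, a, n.
print([f"U+{ord(p):04X}" for p in pieces])  # ['U+1112', 'U+1161', 'U+11AB']
```

Each jamo can then be looked up in a romanization table one piece at a time, which is how the chart above treats Hangul.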

This article is a child-friendly adaptation of the Wikipedia article on Romanization, available under CC BY-SA 4.0.