The design of the Niji script

March 25, 2018

This document describes the design of the Niji script. For more information about the script, see The Niji Script.

The Niji script applies the ideas underlying the Korean Hangeul script to the Japanese language. It provides separate consonant and vowel characters to enable a logical phonemic writing system.

The script is based on the following principles:

The script is phonemic. It is designed such that a person familiar with the script’s small set of characters and simple rules can pronounce text based just on the written information, and write spoken text just by listening, without deep understanding of the text. It is detailed enough to record all differences in pronunciation that are necessary to distinguish different words. It is not phonetic, that is, it doesn’t record details of the pronunciation that don’t help distinguish words.
The script uses separate consonant and vowel characters that can be combined as needed to represent the syllables used in Japanese and loanwords.
The script adds new consonants as needed rather than relying on modifiers such as dakuten ﾞ, handakuten ﾟ, and small letters ぁぃぅぇぉゃゅょァィゥェォャュョ.
Characters are based on Hangeul, but adapted to the Japanese language, whose pronunciation is quite different from Korean.
Consonants and vowels are combined in syllable blocks to enable a compact representation of text that’s reminiscent of Chinese characters.

The new characters are not necessarily featural in the way that many Hangeul characters are: The shapes of the simplest Hangeul consonants are designed to reflect the way they’re produced in the mouth, while others add strokes to indicate other aspects such as aspiration. This characteristic is interesting, but not essential to understand or use the script. Some completely new characters in Niji include a half-circle, reflecting the rounded shapes of the corresponding Hiragana characters, while for others strokes are added or removed to indicate voiced or voiceless pronunciation.

The discussion below makes references to columns of the Hiragana syllable table, where each column represents one or more related consonants, and each row a vowel (the only pure consonant, ん n, is not shown).

	∅	k	s	t	n	h	m	y	r	w
a	あ	か	さ	た	な	は	ま	や	ら	わ
i	い	き	し	ち	に	ひ	み		り
u	う	く	す	つ	ぬ	ふ	む	ゆ	る
e	え	け	せ	て	ね	へ	め		れ
o	お	こ	そ	と	の	ほ	も	よ	ろ	を

Vowels and Y

The Japanese vowels あ a, い i, う u, え e, and お o are represented by  a,  i,  u,  e, and  o. Four of them are directly taken from Hangeul; the  used here for e however represents the Korean eo sound in Hangeul. We’re using this character instead of the Hangeul ㅔ because it’s simpler and opens the opportunity to streamline the representation of long vowels.

Niji follows Hangeul in requiring that a complete syllable either starts with an initial consonant or indicates its absence with the special non-consonant character . For example, the complete syllable あ a is written as . In Hangeul, this may be necessary to disambiguate Korean vowel and diphthong sequences; in Niji, it only serves to fill out syllable blocks.

Long vowels are indicated by adding a horizontal bar  as a long vowel mark, as known from Katakana. Fonts may support a compact form though, in which the horizontal bar is merged as a second bar into the vowel. For example,  uu can be rendered as , and  ii as  (if these look the same, try a better browser).

The Japanese consonant y is special in that it can occur both as an initial consonant, indicated in Hiragana by large characters や ya, ゆ yu, and よ yo, and as a medial consonant, indicated in Hiragana by small characters ゃ ya, ゅ yu, and ょ yo. Niji follows the model of Hangeul and treats y as modifying the following vowel, with an initial y requiring the non-consonant character . For example, や ya is represented as . In the Unicode character encoding, the consonant is encoded separately; fonts are required to form a ligature with the following vowel, and keyboards should enable direct input of the ligature.
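
The treatment of y can be illustrated with a minimal Python sketch. It works on romanized labels rather than on Niji characters, and the names SMALL_Y, LARGE_Y, I_ROW_INITIAL, and decompose are hypothetical, chosen only for this illustration. The empty string stands in for the non-consonant character, and only a few i-row kana are covered.

```python
# Illustrative model only: romanized labels, not actual Niji code points.
LARGE_Y = {"や": "a", "ゆ": "u", "よ": "o"}   # y as initial consonant in Hiragana
SMALL_Y = {"ゃ": "a", "ゅ": "u", "ょ": "o"}   # y as medial consonant in Hiragana

# Initial consonant of a few i-row kana that combine with small y kana (subset).
I_ROW_INITIAL = {"き": "k", "に": "n", "ひ": "h", "み": "m"}


def decompose(kana):
    """Return (initial, medial, vowel) for a y syllable.

    An empty initial means the non-consonant character is required.
    """
    if kana in LARGE_Y:
        # Initial y: non-consonant character, then y modifying the vowel.
        return ("", "y", LARGE_Y[kana])
    if len(kana) == 2 and kana[0] in I_ROW_INITIAL and kana[1] in SMALL_Y:
        # Medial y: real initial consonant, then y modifying the vowel.
        return (I_ROW_INITIAL[kana[0]], "y", SMALL_Y[kana[1]])
    raise ValueError(f"not a y syllable covered by this sketch: {kana!r}")


print(decompose("や"))    # initial y
print(decompose("きゃ"))  # medial y
```

In both cases y ends up in the same slot, directly before the vowel, which is the point of treating it as a vowel modifier.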

Simple initial consonants: N, M, L, and W

A few Japanese consonants can be directly mapped to Hangeul consonants: The n of な na, に ni, ぬ nu, ね ne, の no to ; the m of ま ma, み mi, む mu, め me, も mo to ; the r of ら ra, り ri, る ru, れ re, ろ ro to . The last column, traditionally transliterated with r, is actually closer to l in current pronunciation, and so the Unicode character name for the Niji character  uses L.

The consonant w of わ wa, on the other hand, has no direct counterpart in Hangeul. In Korean, w is interpreted as an o or u before another vowel, and can occur with or without initial consonant: 와 wa, 과 gwa. In Japanese, on the other hand, w is treated as a consonant. Niji therefore introduces a new character,  w. Its shape reflects that of the Hiragana わ.

Voiceless and voiced consonants: K and G

For several consonants, Japanese distinguishes voiceless and voiced variants, and the distinction is phonemic, that is, it makes words differ. In Korean, voicing depends on context and does not mark different words. On the other hand, Korean distinguishes between plain, aspirated, and tense variants of consonants, a distinction that doesn’t exist in Japanese.

The consonants transliterated into Latin with K and G are one such case: The k of か ka, き ki, く ku, け ke, こ ko is voiceless, while the g of が ga, ぎ gi, ぐ gu, げ ge, ご go is voiced, as indicated by the dakuten ﾞ. Korean has plain ㄱ g, aspirated ㅋ k, and tense ㄲ gg, but the latter two don’t exist in Japanese, and whether the first is pronounced voiceless or voiced depends on context.

In this case, Niji adopts the Hangeul consonant ㄱ g as  k, for the voiceless variant, and adds a stroke for the voiced variant  g. The added stroke is different from the one that Hangeul adds for aspiration; in fact, in this case (but not in all other cases) the result looks somewhat similar to the tense Hangeul variant. A long stroke has the advantage of being more visible than dakuten ﾞ, which in small or low-resolution renderings are often hard to make out.

Voiceless, voiced, and modified consonants: S, Z, and SH

The Hiragana and Katakana scripts have separate base glyphs corresponding to only 9 consonants: k, s, t, n, h, m, y, r, w. Any other consonants are produced by modifying these or, in one case, a vowel. With g, we have already seen the use of the dakuten ﾞ to indicate voiced variants. Another trick is to take a syllable that happens to be pronounced differently from the others in a column, and add small characters to it to essentially create a new column. With s syllables, both modifications are used.

The s of さ sa, す su, せ se, そ so is a voiceless consonant, and the z of ざ za, ず zu, ぜ ze, ぞ zo the voiced counterpart. Like for k, Niji adopts the Hangeul consonant  s, and adds a stroke for  z.

However, し is not pronounced si, but shi. And this difference in pronunciation, with additional small characters, is used to also produce しゃ sha, しゅ shu, しぇ she (very rare), and しょ sho. Niji instead adds a new consonant  sh, whose shape reflects the Hiragana し shi.

し and its derivatives are also combined with dakuten, resulting in じ ji and more. However, in standard pronunciation し is not actually the voiceless counterpart of じ. We’ll find the actual counterpart in the next section.

Multiple modified consonants: T, D, CH, J, TS

Within the t column, only た ta, て te, and と to actually have a t sound, and only だ da, で de, and ど do a d sound. Niji adopts the Hangeul ㄷ d as  t, and adds a stroke for  d.

ち is pronounced chi, and is expanded with small characters to the full set of ちゃ cha, ち chi, ちゅ chu, ちぇ che (very rare), and ちょ cho. Niji adopts the Hangeul ㅈ j as  ch.

The consonant j is in Hiragana normally written as the voiced variant of sh, that is, as じゃ ja, じ ji, じゅ ju, ジェ je (Katakana only), and じょ jo. However, its standard pronunciation is actually the voiced counterpart of ch, and so Niji adds a stroke to  ch to obtain  j.

つ is pronounced tsu, and in Katakana is expanded with small characters to the full set of ツァ tsa, ツィ tsi, ツ tsu, ツェ tse, and ツォ tso. Niji adds a new consonant  ts, whose shape reflects the Hiragana つ tsu.

Multiple modified consonants: H, P, B, F, V

Syllables from the h column can be combined not only with dakuten ﾞ to produce a voiced b, but also with handakuten ﾟ to produce a voiceless p: ば ba, ぱ pa. In modern pronunciation, the b is the voiced counterpart of p, not of h. Niji expresses this by using related glyphs for p and b, and a separate one for h. In this case, we adopt the Hangeul ㅂ b for  b and remove a stroke for voiceless  p to arrive at a pair analogous to  t and  d. For h, Niji removes the stroke indicating aspiration from the Hangeul ㅎ h to get  h (Hangeul originally had a character ᅙ representing the glottal stop, but that’s no longer used).

ふ is pronounced fu, and in Katakana is expanded with small characters to the full set of ファ fa, フィ fi, フ fu, フェ fe, and フォ fo. Niji adds a new consonant  f, whose shape reflects the Hiragana ふ fu.

For a few foreign words, Katakana combines the vowel ウ u with dakuten ﾞ and small characters to represent ヴァ va, ヴィ vi, ヴェ ve, and ヴォ vo. The v here is the voiced counterpart to f, and so Niji adds a stroke to  f to obtain  v.
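
The consonant analysis described in the sections above can be summarized as a small lookup. The following Python sketch uses romanized labels in place of Niji characters; the table covers only a handful of syllables, and the names NIJI_ANALYSIS, VOICED_COUNTERPART, and voiced_counterpart are assumptions made for illustration.

```python
# Illustrative subset: Hiragana syllable -> (consonant, vowel) as Niji analyzes it.
NIJI_ANALYSIS = {
    "か": ("k", "a"), "が": ("g", "a"),
    "さ": ("s", "a"), "ざ": ("z", "a"),
    "し": ("sh", "i"), "じ": ("j", "i"),   # じ pairs with ち, not with し
    "た": ("t", "a"), "だ": ("d", "a"),
    "ち": ("ch", "i"), "つ": ("ts", "u"),
    "は": ("h", "a"), "ば": ("b", "a"), "ぱ": ("p", "a"),
    "ふ": ("f", "u"),
}

# Voiceless -> voiced pairs that the text relates by adding or removing a stroke.
VOICED_COUNTERPART = {"k": "g", "s": "z", "t": "d", "ch": "j", "p": "b", "f": "v"}


def voiced_counterpart(consonant):
    """Return the voiced partner of a voiceless consonant, or None if there is none."""
    return VOICED_COUNTERPART.get(consonant)


ch = NIJI_ANALYSIS["ち"][0]
print(ch, "->", voiced_counterpart(ch))
print("sh", "->", voiced_counterpart("sh"))
```

Note that h and sh have no entry in the pairing: in this analysis b belongs with p, and j belongs with ch.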

Final consonants: N, TS

Japanese has two generic consonants that can occur at the end of a syllable: ん n and っ small tsu. They’re generic in the sense that their pronunciation depends on context. ん is pronounced m before m, p, and b; ng before k and g; and has a few more variations besides the default n. っ doubles the following consonant, as in かった katta. These generic consonants could be mapped to a full range of actual final consonants in Niji, but since their pronunciation is very regular, Niji follows tradition and simply supports syllable-final  n and  ts. These two consonants are interpreted as final if they immediately follow a vowel or a prolonged sound mark and are not immediately followed by a medial consonant or a vowel; they are interpreted as initial consonants otherwise.
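
The regularity of ん can be sketched in a few lines of Python. The function name realize_final_n is hypothetical, consonants are given as romanized labels, and the sketch covers only the variants listed above; the additional variations mentioned in the text are deliberately left out and fall back to the default n.

```python
def realize_final_n(next_consonant):
    """Approximate pronunciation of syllable-final n before the given consonant.

    next_consonant is a romanized label such as "p" or "k", or None at the
    end of a word. Only the cases named in the text are modeled.
    """
    if next_consonant in {"m", "p", "b"}:
        return "m"
    if next_consonant in {"k", "g"}:
        return "ng"
    return "n"  # default, also used for contexts this sketch does not model


for following in ("p", "k", "t", None):
    print(following, "->", realize_final_n(following))
```

Because the pronunciation follows from the context in this way, a single written final n is enough.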

Particles

A few grammatical particles in Japanese are written with Hiragana syllables that don’t reflect their current pronunciation. In Niji, these are written according to their pronunciation:

は, pronounced as wa when used as the topic marker: .
を, pronounced as o when used as the object marker: .
へ, pronounced as e when used as a direction marker: .
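
As a minimal illustration, the particle rule amounts to a lookup that overrides the usual reading. The names PARTICLE_READING and niji_reading are hypothetical, readings are romanized, and deciding whether a syllable is in fact a particle is assumed to happen elsewhere.

```python
# Readings used when the syllable functions as a grammatical particle.
PARTICLE_READING = {"は": "wa", "を": "o", "へ": "e"}


def niji_reading(kana, usual_reading, is_particle):
    """Return the reading to write: the particle reading if applicable, else the usual one."""
    if is_particle:
        return PARTICLE_READING.get(kana, usual_reading)
    return usual_reading


print(niji_reading("は", "ha", is_particle=True))
print(niji_reading("は", "ha", is_particle=False))
```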

Pitch

Japanese distinguishes between high and low pitch, and pitch sometimes helps tell words apart. However, pitch varies significantly between regions, and listeners rely more on context than on pitch to distinguish words. Japanese writing usually doesn’t record pitch, except in some dictionaries. Hangeul originally recorded pitch using dots on the left side of characters, but standard Korean no longer has pitch distinctions. Niji follows the trend and does not record pitch.

Syllable blocks

Niji follows the model of Hangeul in combining the characters of a syllable into a syllable block, although the presence of a prolonged sound mark may break up a syllable into two blocks.

Niji syllables consist of an initial consonant, or the non-consonant character  to indicate the lack of an initial consonant, optionally a medial consonant  y, a vowel, optionally a prolonged sound mark, and optionally a final consonant. Every consonant except  y can be used as an initial consonant. Only  y can be used as a medial consonant; it cannot be used with  i. Only  n and  ts can be used as final consonants.  n and  ts are interpreted as final consonants if they immediately follow a vowel or a prolonged sound mark and are not immediately followed by a medial consonant or a vowel; they are interpreted as initial consonants otherwise.
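
The rule for telling final from initial consonants can be sketched as follows. This is an illustrative Python model, not part of the Niji specification: it works on a list of romanized tokens, uses "0" as a stand-in for the non-consonant character and "-" as a stand-in for the prolonged sound mark, and the names is_final and split_syllables are hypothetical.

```python
VOWELS = {"a", "i", "u", "e", "o"}
MEDIAL = "y"
LONG_MARK = "-"            # stand-in for the prolonged sound mark
NO_CONSONANT = "0"         # stand-in for the non-consonant character
FINAL_CAPABLE = {"n", "ts"}


def is_final(tokens, i):
    """True if tokens[i] is n or ts in syllable-final position."""
    if tokens[i] not in FINAL_CAPABLE:
        return False
    prev = tokens[i - 1] if i > 0 else None
    nxt = tokens[i + 1] if i + 1 < len(tokens) else None
    follows_nucleus = prev in VOWELS or prev == LONG_MARK
    starts_syllable = nxt in VOWELS or nxt == MEDIAL
    return follows_nucleus and not starts_syllable


def split_syllables(tokens):
    """Group tokens into syllables; every syllable starts with an initial or "0"."""
    syllables = []
    for i, tok in enumerate(tokens):
        continues = tok in VOWELS or tok in (MEDIAL, LONG_MARK) or is_final(tokens, i)
        if continues:
            if not syllables:
                raise ValueError("a syllable must start with an initial consonant or '0'")
            syllables[-1].append(tok)
        else:
            syllables.append([tok])  # initial consonant or non-consonant character
    return syllables


print(split_syllables(["k", "a", "n", "t", "a", "n"]))  # かんたん
print(split_syllables(["k", "a", "n", "i"]))            # かに
print(split_syllables(["k", "a", "ts", "t", "a"]))      # かった
```

The same n token is grouped with the preceding vowel in かんたん but starts a new syllable in かに, purely because of what precedes and follows it.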

The consonants and vowels of each syllable are combined into a visual entity that leaves the components recognizable, but also shows that they belong together. The vowels  a,  i, and  e render to the right side of the initial consonant; the vowels  u and  o below the initial consonant. A medial consonant  y forms a ligature with the following vowel by doubling that vowel’s short stroke; the ligature is positioned like the vowel that it includes. A prolonged sound mark  either renders to the right of the group formed by initial consonant, medial consonant, and vowel, or forms a ligature with the vowel by doubling the vowel’s main stroke. A final consonant renders below the group formed by the initial consonant and the vowel or vowel ligature, or below the prolonged sound mark if present and rendered separately.
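
The placement rules can be expressed as a small Python sketch. It only models where the vowel and the final consonant go relative to the initial consonant; ligatures with y and with the prolonged sound mark are not modeled. The names vowel_position and block_layout and the position labels are assumptions made for this illustration.

```python
RIGHT_VOWELS = {"a", "i", "e"}   # rendered to the right of the initial consonant
BELOW_VOWELS = {"u", "o"}        # rendered below the initial consonant


def vowel_position(vowel):
    """Return where the vowel is placed relative to the initial consonant."""
    if vowel in RIGHT_VOWELS:
        return "right"
    if vowel in BELOW_VOWELS:
        return "below"
    raise ValueError(f"unknown vowel: {vowel!r}")


def block_layout(initial, vowel, final=None):
    """Return (component, position) pairs for a simple syllable block."""
    parts = [(initial, "initial"), (vowel, vowel_position(vowel))]
    if final is not None:
        # A final consonant goes below the group of initial consonant and vowel.
        parts.append((final, "below group"))
    return parts


print(block_layout("k", "a", "n"))
print(block_layout("m", "o"))
```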

As in traditional Hangeul, fonts may be designed to give syllables uniform height and width, but, as in contemporary Hangeul, they’re not required to do so.

Layout, punctuation, and digits

Niji text is written from left to right. Words are separated by spaces, with precise rules yet to be determined. Common script punctuation and digits are used.

More information

For more information about the Niji script, see The Niji Script.
