Super useful web tool – auto convert between modern and traditional kanji

Since both traditional and simplified characters are still in active use in the Chinese-speaking world, not only IME software but also software to automatically convert between the two is readily available, for example as a feature in OpenOffice (and MS Word?) and as part of the Chinese-language edition of Wikipedia. In the case of Japanese, however, traditional characters are for the most part archaic, and almost nobody ever has any reason to input more than a couple of 繁体字 at a time (for example, to input an unusual or old name), except of course for academics dealing with old documents that are not readily available in digital form. Well, I just did a quick search and came across such a tool for Japanese. The web form lets you either input modern Japanese into the top field and have it converted to 舊字體, or paste old Japanese into the bottom field and click to convert it into modern kanji. Note that it does not change the kana portion, so if you need to enter a bunch of archaic Japanese text you will still have to make those alterations yourself, but for kanji at least this looks like it could save a fair amount of time compared with either searching the dictionaries one by one or even using the Pinyin/繁体字 IME.

For comparison, here’s a random passage I had open, before:

文部科学省の定義は、「我が国では、学校教育法により、小・中・高等学校等の教科書について教科書検定制度が採用されています。教科書の検定とは、民間で著作・編集された図書について、文部科学大臣が教科書として適切か否かを審査し、これに合格したものを教科書として使用することを認めることです。

And after:

文部科學省の定義は、「我が國では、學校教育法により、小・中・高等學校等の教科書について教科書檢定制度が採用されています。教科書の檢定とは、民間で著作・編集された圖書について、文部科學大臣が教科書として適切か否かを審査し、これに合格したものを教科書として使用することを認めることです。

The page also includes a handy reference chart. Note that it only seems to convert relatively common characters, i.e. cases where the modern character is a simplified form of the same traditional character. It won’t actually help at all for all those times you have to enter kanji that are either variants （異体字） or just plain archaic.
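I have no idea how the site itself is implemented, but the behavior described above (swap individual kanji, leave kana and everything else alone) can be sketched with a simple lookup table. The table below contains only the four pairs visible in the sample passage, and the names `SHIN_TO_KYU`, `to_kyujitai` and `to_shinjitai` are made up for illustration; a real table would need the full chart.

```python
# Minimal sketch of a table-driven kanji converter.
# Assumption: the mapping is one-to-one; only the four pairs that appear
# in the sample passage above are included here.
SHIN_TO_KYU = {
    "学": "學",
    "国": "國",
    "検": "檢",
    "図": "圖",
}

# str.maketrans accepts a dict of single-character keys.
_TO_KYU = str.maketrans(SHIN_TO_KYU)
# Invert the table for the old-to-modern direction.
_TO_SHIN = str.maketrans({old: new for new, old in SHIN_TO_KYU.items()})


def to_kyujitai(text: str) -> str:
    """Replace modern kanji found in the table; kana and other text pass through."""
    return text.translate(_TO_KYU)


def to_shinjitai(text: str) -> str:
    """Replace traditional kanji found in the table with their modern forms."""
    return text.translate(_TO_SHIN)


if __name__ == "__main__":
    print(to_kyujitai("文部科学省の定義は、我が国では"))
    print(to_shinjitai("教科書檢定"))
```

Anything not in the table is left untouched, which would also explain why variants and plain archaic characters are not converted.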

3 thoughts on “Super useful web tool – auto convert between modern and traditional kanji”

This type of conversion is fairly trivial to do. You just need a mapping table between modern and traditional. The Unicode Consortium keeps track of such data in their Unihan project. See TR38: http://www.unicode.org/reports/tr38/

Using Unihan_Variants.txt, it took me less than 10 minutes to parse the file and generate a list of all kTraditionalVariant mappings, as given below. Notice that there are occasionally multiple traditional characters to choose from, so it is not necessarily a 1-to-1 conversion.

There are other fields in the database as well, so the results can be expanded and customized as needed.

㑩 -> 儸
㓥 -> 劏
㔉 -> 劚
㖊 -> 噚
㖞 -> 喎
㛟 ->

Well yes, it is trivial, but still a pain in the ass to do if you’re a non-programmer like 99% of people, or a lapsed programmer like myself, who would have to spend a fair amount of time getting up to speed before writing even something as simple as that. And I’m not aware of any other pre-made tool for doing the conversion other than this one.

BTW Ben, your list is actually simplified Chinese hanzi. The Japanese chart of simplified characters is both different and much smaller.
