Cursive Japanese and OCR: Using KuroNet

The Center for Open Data in the Humanities’ KuroNet Kuzushiji Ninshiki Sābisu (KuroNetくずし字認識サービス, KuroNet Kuzushiji Recognition Service) launched late last year. KuroNet is a free OCR (Optical Character Recognition) platform that allows users to convert images of documents written in cursive Japanese into printed text. It was a development that I and many others awaited with anticipation, excitement and perhaps some apprehension. Would this be another of the terribly ineffective free Japanese OCR platforms that seem to abound on the internet, or would it actually work? When I was a teenager, my friend’s father, who was a computer engineer, told me never to jump on software or OS updates as soon as they are released, but to wait a while until bugs and other faults have been worked out. I often live by this philosophy in my digitally oriented work, so when KuroNet launched I decided to wait a while before using it. Of course, I knew long before I tried it that it must be something special, since the post-launch hype in both traditional and social media was immense. Having now tried the software, I decided that this week I would write a brief tutorial and review of KuroNet.

The instructions for KuroNet are available online in Japanese. Unless I am mistaken, they are not available in other languages, although parts of the interface can be used in either English or Japanese. KuroNet is quite easy to use. First, one needs to open the KuroNet Kuzushiji Recognition Viewer/IIIF Curation Viewer, which will look something like this:

The KuroNet Kuzushiji Recognition Viewer/IIIF Curation Viewer.

After opening the viewer, one needs to log in to an account. You can sign in with Google, Facebook, Twitter or email. Once you’ve logged in, your name will appear in the top right corner of the screen.

The KuroNet Kuzushiji Recognition Viewer after signing in.

Next we need to find an image of a document written in cursive Japanese that we want to convert to printed text. I imagine that the reader will already have a text in mind, but for the sake of this tutorial we will use the second volume of the 1639 version of the Kirishitan Monogatari 吉利支丹物語, since it relates to my own research. This version of the Kirishitan Monogatari is available via the National Diet Library Digital Collections. The source matters because any image used with KuroNet must conform to the International Image Interoperability Framework (IIIF). IIIF-compliant images are available through many databases and online archives, so users of KuroNet should find them easy to locate.

The Kirishitan Monogatari on the National Diet Library Digital Collections Website.

Once one has found an image or set of images, its IIIF Manifest URI (some sites label it the IIIF Manifest URL) must be dragged and dropped into the KuroNet Kuzushiji Recognition Viewer. On the National Diet Library Digital Collections website, the IIIF Manifest URI is located in the bottom left corner of the screen.

Dragging the IIIF Manifest URI into the viewer.
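For readers wondering what the manifest being dragged into the viewer actually contains, here is a minimal sketch of reading one in Python. It assumes a manifest in the IIIF Presentation API 2.x layout (sequences, canvases, images), which not every archive uses. The sample manifest, its example.org URLs and the function name list_canvases are made up for illustration and are not taken from the National Diet Library or from KuroNet.

```python
import json

# Hypothetical, heavily trimmed manifest in the IIIF Presentation API 2.x shape.
# A real manifest would be downloaded from the archive's manifest URI.
SAMPLE_MANIFEST = json.loads("""
{
  "@type": "sc:Manifest",
  "label": "Sample document",
  "sequences": [{
    "canvases": [
      {"label": "1",
       "images": [{"resource": {
           "@id": "https://example.org/iiif/p1/full/full/0/default.jpg",
           "service": {"@id": "https://example.org/iiif/p1"}}}]},
      {"label": "2",
       "images": [{"resource": {
           "@id": "https://example.org/iiif/p2/full/full/0/default.jpg",
           "service": {"@id": "https://example.org/iiif/p2"}}}]}
    ]
  }]
}
""")


def list_canvases(manifest):
    """Return (page label, image service base URI) pairs in page order."""
    rows = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                service = image.get("resource", {}).get("service", {})
                # Some manifests give the service as a list; take the first entry.
                if isinstance(service, list):
                    service = service[0] if service else {}
                rows.append((canvas.get("label"), service.get("@id")))
    return rows


pages = list_canvases(SAMPLE_MANIFEST)
for label, service_id in pages:
    print(label, service_id)
```

Each page of the document is a canvas, and each canvas points to an image service. This is why a single manifest URI is enough for the viewer to display the whole volume.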

Now the document can be seen and read within the viewer. To navigate between pages, one can either click the arrows in the top left corner or click on the Thumbnails tab to jump to a specific page. At this stage we are also given a series of options in the top right corner, including adding the image to a list, downloading the image, and viewing copyright information.

The image in KuroNet Kuzushiji Recognition Viewer.

Next we click on the square symbol in the top right hand corner of our image. This allows us to draw a rectangular box around the cursive text that we want to convert to printed script. Once we have drawn the box it will look something like this.

Drawing the box around the text.
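A box like the one drawn above corresponds to a rectangular region of the page image. In IIIF such a region can be requested directly through the Image API's URL pattern of region, size, rotation, quality and format. The following is a minimal sketch assuming an Image API 2.x service (version 3 uses "max" where 2.x uses "full" for the size). The helper name iiif_region_url and the example.org address are hypothetical, and I have not confirmed that KuroNet builds its own links in exactly this way.

```python
def iiif_region_url(service_id, x, y, width, height, size="full", fmt="jpg"):
    """Build a IIIF Image API URL for a pixel region of an image.

    service_id is the image service base URI (hypothetical in the example below).
    x, y, width and height are given in pixels of the full-size image.
    """
    if x < 0 or y < 0:
        raise ValueError("x and y must not be negative")
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    region = f"{x},{y},{width},{height}"
    # Pattern: {base}/{region}/{size}/{rotation}/{quality}.{format}
    return f"{service_id.rstrip('/')}/{region}/{size}/0/default.{fmt}"


url = iiif_region_url("https://example.org/iiif/p1", 1200, 300, 900, 2400)
print(url)
# prints https://example.org/iiif/p1/1200,300,900,2400/full/0/default.jpg
```

Because the region is part of the URL, a cropped piece of a page can be shared or passed to another service without downloading and editing the image first.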

After drawing the box, we are prompted to click on it in order to view its info. When we click the box, two links appear. The first link says “KuroNet Kuzushiji Recognition Service” and the second is a URL. Clicking the second link opens a tab containing an image of the area of text that we have highlighted, whereas clicking the first link (KuroNet Kuzushiji Recognition Service) takes us to our personal dashboard, where we can perform OCR on the image. The dashboard will look something like this:

Dashboard with queued items.

The dashboard shows us our image and the time and date we uploaded it, and contains a link that says “OCR予約” (reserve OCR). All we need to do now is click the “OCR予約” link. This takes us to a loading screen and then back to the dashboard. If we refresh the dashboard, we can then view the results of the OCR for the image by clicking on the “OCR成功:閲覧” (OCR succeeded: view) button in the “OCR結果” (OCR result) column. If for some reason OCR could not be performed, “OCR失敗:消去” (OCR failed: delete) will appear in this column instead. In my case it took about 6 to 10 seconds for the software to complete the task for each image, but I imagine this varies with the text and the number of active users.

Dashboard following the completion of OCR.

Once we click on the “OCR成功:閲覧” button we are transported back to our image in the KuroNet Kuzushiji Recognition Viewer, but now the cursive characters are overlaid with printed red text. And that is the end of the process.

Cursive characters overlaid with printed characters.

A close up version of part of the text.

As I hope the reader has been able to ascertain, KuroNet is extremely simple to use and produces accurate results quite quickly. That does not mean that it is without limitations. I found that on some double pages the printed text and the cursive text did not align properly. I am uncertain, but this is possibly the result of my having drawn the box too large. It isn’t really a problem: the order of the printed sentences does not change, so one can easily work out which printed line corresponds to which line of cursive text.

Cursive and printed text being unaligned.

On occasion the software also skips characters, although I found that redrawing and moving the boxes resolved this issue. Generally speaking, one can infer the readings or meanings of missing characters either through one’s own knowledge of cursive Japanese or through the context of the sentence, so this shouldn’t cause much of an issue for most users.

The first sentence here should read Kirishitan, but the OCR has only recognized shitan.

After redrawing the box, the OCR recognized the ki in the sentence also. Alignment issues were also fixed.

A final limitation with KuroNet arises when a user wants to use it on images of documents in their personal archive. As noted, all images must conform to the International Image Interoperability Framework, so in order to use KuroNet on pieces from a personal collection, one needs to create an IIIF-compatible image using a tool such as Omeka. This is a minor inconvenience, but it adds time to the process of using KuroNet.

Minor limitations aside, KuroNet is a fantastic and easy-to-use tool that will doubtless become an essential part of the arsenal of those involved in the study of historical Japanese texts. If you haven’t tried it out already, I highly recommend exploring the platform and its functions.
