Cursive Japanese and OCR: Using KuroNet

The Center for Open Data in the Humanities’ KuroNet Kuzushiji Ninshiki Sābisu (KuroNetくずし字認識サービス) launched late last year. KuroNet is a free OCR (Optical Character Recognition) platform which allows users to convert images of documents written in cursive Japanese into printed text. It was a development that I and many others awaited with anticipation, excitement and perhaps some apprehension. Would this be another of the terribly ineffective, free Japanese OCR platforms that seem to abound on the internet, or would it actually work? When I was a teenager, my friend’s father, who was a computer engineer, told me never to jump on software or OS updates as soon as they are released, but to wait a while until bugs and other such faults are worked out. I often live by this philosophy in my digitally oriented work, so when KuroNet was launched I decided to wait a while before using it. Of course, I knew long before I tried it that it must be something special, since the post-launch hype in both traditional and social media was immense. After trying out the software, I decided that this week I would write a brief tutorial and review of KuroNet.

The instructions for KuroNet are available in Japanese, here. Unless I am mistaken, instructions do not appear to be available in other languages, although parts of the interface can be used in English or Japanese. It is quite easy to use. First, one needs to open the KuroNet Kuzushiji Recognition Viewer/IIIF Curation Viewer which will look something like this:

The KuroNet Kuzushiji Recognition Viewer/IIIF Curation Viewer.

After opening the viewer one needs to log into an account. You can sign in with Google, Facebook, Twitter or Email. Once you’ve logged in your name will appear in the top right corner of the screen.

The KuroNet Kuzushiji Recognition Viewer after signing in.

Next we need to find an image of a document written in cursive Japanese text that we want to convert to printed text. I imagine that the reader will already have a text in mind, but for the sake of this tutorial we will use the second volume of the 1639 version of the Kirishitan Monogatari 吉利支丹物語 due to its relation to my own research. This version of the Kirishitan Monogatari is available via the National Diet Library Digital Collections. This is important since the image that users decide to use must conform to the International Image Interoperability Framework (IIIF). IIIF-compliant images are available through many databases and online archives, so users of KuroNet should find them easy to locate.

The Kirishitan Monogatari on the National Diet Library Digital Collections Website.

Once one has found an image or set of images the IIIF Manifest URI or IIIF Manifest URL must be dragged and dropped into the KuroNet Kuzushiji Recognition Viewer. On the National Diet Library Digital Collections Website the IIIF Manifest URI is located in the bottom left corner of the screen.

Dragging the IIIF Manifest URI into the viewer.
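
For readers curious about what the viewer receives when a manifest is dropped into it: a IIIF manifest is a JSON document that lists the pages (canvases) of an item together with their images. The following is a minimal sketch, assuming a manifest that follows version 2 of the IIIF Presentation API and uses plain-string labels. The sample manifest and its example.org URLs are hand-written for illustration and are not taken from the National Diet Library.

```python
def canvas_images(manifest):
    """Return (canvas label, image URL) pairs from a IIIF Presentation 2.x manifest."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                pairs.append((canvas.get("label", ""), resource.get("@id", "")))
    return pairs


# Hypothetical two-page manifest, written by hand for this example only.
SAMPLE_MANIFEST = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/book/manifest.json",
    "@type": "sc:Manifest",
    "label": "Sample book",
    "sequences": [
        {
            "@type": "sc:Sequence",
            "canvases": [
                {
                    "@id": "https://example.org/iiif/book/canvas/1",
                    "@type": "sc:Canvas",
                    "label": "1",
                    "images": [
                        {"resource": {"@id": "https://example.org/iiif/page1/full/full/0/default.jpg"}}
                    ],
                },
                {
                    "@id": "https://example.org/iiif/book/canvas/2",
                    "@type": "sc:Canvas",
                    "label": "2",
                    "images": [
                        {"resource": {"@id": "https://example.org/iiif/page2/full/full/0/default.jpg"}}
                    ],
                },
            ],
        }
    ],
}

for label, url in canvas_images(SAMPLE_MANIFEST):
    print(label, url)
```

In practice the manifest would be loaded from the manifest URI with a JSON parser rather than typed out by hand.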

Now the document can be seen and read within the viewer. To navigate pages one can either click the arrows in the top left corner, or click on the Thumbnails tab to jump to a specific page. At this stage we are also given a series of options in the top right corner including the abilities to add the image to a list, download the image, and look at copyright information etc.

The image in KuroNet Kuzushiji Recognition Viewer.

Next we click on the square symbol in the top right hand corner of our image. This allows us to draw a rectangular box around the cursive text that we want to convert to printed script. Once we have drawn the box it will look something like this.

Drawing the box around the text.
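
A rectangular selection of this kind corresponds to the "region" parameter of the IIIF Image API, in which part of an image is requested by its pixel coordinates. The sketch below builds such a URL under the Image API 2.x URL pattern. The function name, the example.org address and the coordinates are hypothetical, and I have not confirmed that the viewer constructs its own links in exactly this way.

```python
def region_url(image_service, x, y, width, height, size="full", fmt="jpg"):
    """Build a IIIF Image API 2.x URL for a rectangular region given in pixels.

    Pattern: {service}/{region}/{size}/{rotation}/{quality}.{format}
    Rotation is fixed at 0 and quality at "default" in this sketch.
    """
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise ValueError("region needs a non-negative origin and a positive size")
    region = f"{x},{y},{width},{height}"
    return f"{image_service.rstrip('/')}/{region}/{size}/0/default.{fmt}"


# Hypothetical image service address and box coordinates.
print(region_url("https://example.org/iiif/page1", 120, 80, 600, 900))
```

Servers that implement version 3 of the Image API use "max" rather than "full" for the size parameter, which is why it is left as an argument here.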

After drawing the box, one is prompted to click on it in order to view info. When we click the box two links appear. The first link says “KuroNet Kuzushiji Recognition Service” and the second is a URL. Clicking the second link opens a tab containing an image of the area of the text that we have highlighted, whereas clicking the first link (KuroNet Kuzushiji Recognition Service) takes us to our personal dashboard, where we can perform OCR on the image. One’s dashboard will look something like this:

Dashboard with queued items.

The dashboard shows us our image, the time and date we uploaded it, and will contain a link that says “OCR予約.” All we need to do now is click the “OCR予約” link. This takes us to a loading screen and then back to the dashboard. If we refresh our dashboard, we can then view the results of the OCR for the image by clicking on the “OCR成功:閲覧” button in the “OCR結果” column. If for some reason OCR could not be performed “OCR失敗:消去” will appear in this column instead. In my case it took about 6 to 10 seconds for the software to complete the task for each image, but I imagine this may vary based on the text and the number of active users etc.

Dashboard following the completion of OCR.

Once we click on the “OCR成功:閲覧” button we are transported back to our image in the KuroNet Kuzushiji Recognition Viewer, but now the cursive characters are overlaid with printed red text. And that is the end of the process.

Cursive characters overlaid with printed characters.

A close up version of part of the text.

As I hope the reader has been able to ascertain, KuroNet is extremely simple to use and produces accurate results quite quickly. That does not mean that it is without limitations. I found that on some double pages the printed text and cursive text did not align properly. I am uncertain, but this is possibly the result of my having drawn the box too large. This isn’t really a problem: the order of the printed sentences does not change, so one can easily work out which printed line corresponds to which line of cursive text.

Cursive and printed text being unaligned.

On occasion the software also skips characters, although I found that some redrawing and moving of boxes resolved this issue. Generally speaking, one is able to infer the readings or meanings of missing characters either through one’s own knowledge of cursive Japanese or through the context of the sentence, so this shouldn’t cause much of an issue for most users.

The first sentence here should read Kirishitan, but the OCR has only recognized shitan.

After redrawing the box, the OCR recognized the ki in the sentence also. Alignment issues were also fixed.

A final limitation arises when a user wants to run KuroNet on images of documents in a personal archive. As noted, all images must conform to the International Image Interoperability Framework, so in order to use KuroNet on pieces from a personal collection, one needs to create an IIIF-compatible image using a tool such as Omeka. This is a minor inconvenience, but it adds time to the process of using KuroNet.
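
For those who want to see what such a file involves, the following is a minimal sketch of generating a manifest for one's own images, assuming version 2 of the IIIF Presentation API. The function name, the example.org URLs and the page dimensions are all hypothetical. I have not tested whether KuroNet accepts a manifest produced this way; the file and the images would in any case need to be hosted at publicly reachable URLs, and the service may additionally expect the images to be served through a IIIF image server.

```python
import json


def build_manifest(manifest_url, label, pages):
    """Build a minimal IIIF Presentation 2.x manifest as a Python dict.

    pages: list of dicts with the keys "image_url", "width" and "height".
    """
    canvases = []
    for number, page in enumerate(pages, start=1):
        canvas_id = f"{manifest_url}/canvas/{number}"
        canvases.append({
            "@id": canvas_id,
            "@type": "sc:Canvas",
            "label": str(number),
            "width": page["width"],
            "height": page["height"],
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "on": canvas_id,  # the annotation paints the image onto this canvas
                "resource": {
                    "@id": page["image_url"],
                    "@type": "dctypes:Image",
                    "format": "image/jpeg",
                    "width": page["width"],
                    "height": page["height"],
                },
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": manifest_url,
        "@type": "sc:Manifest",
        "label": label,
        "sequences": [{"@type": "sc:Sequence", "canvases": canvases}],
    }


# Hypothetical single-page letter; URL and dimensions are placeholders.
manifest = build_manifest(
    "https://example.org/letters/manifest.json",
    "Sample letter",
    [{"image_url": "https://example.org/letters/page1.jpg", "width": 2000, "height": 3000}],
)
print(json.dumps(manifest, indent=2, ensure_ascii=False))
```

The printed JSON is what would be saved as the manifest file; each additional page simply adds another entry to the list passed in.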

Minor limitations aside, KuroNet is a fantastic and easy-to-use tool which will undoubtedly become an essential part of the arsenal used by those involved in the study of historical Japanese texts. If you haven’t tried it out already, I highly recommend exploring the platform and its functions.

5 thoughts on “Cursive Japanese and OCR: Using KuroNet”

  1. This is really an amazing application. I’m interested in Japanese woodblock prints and frequently poems in the print are written in Japanese cursive script. It is very difficult to read these poems especially for someone who can’t read or speak Japanese. This application can be a great help to transcribe the cursive text in recognizable Japanese characters.
    I have a question. Is there a possibility to copy and print the OCR result?

    1. There wasn’t an option to copy or print the results when I wrote the piece. I checked now and it doesn’t seem that the option has been added yet. One of the developers is creating an app which will allow people to photograph a document with their phones or tablets – perhaps this will have such a feature.

  2. I’m interested in reading pre-WWII letters from a paper (non-digitized) archive. But even after creating an IIIF compatible image via Omeka.net, Kuronet requires to drag-and-drop the manifest, not the image itself. This manifest is a bit of Python programming in JSON format, specifying image properties, file structure, etc. Institutions like the Diet Library who publish digitized collections have software/programmers develop their manifests; I’ve asked Omeka.net where to find or how to generate manifests but they say Omeka does not do that for users. Since I can’t modify their file structure, am not very good at Python and do not have my own server on which to install standalone Omeka Classic, the manifest is a dead end for me. Time being, I conclude Kuronet is not helpful until they port it to phones/tablets. Ability to copy/print would also be good as you point out.

    1. Hi Michael,

      Thanks for your comment!
      I haven’t tried using it with my own IIIF documents – perhaps there is another method?

      KuroNet exists on phones and tablets in the form of Miwo (they both use the same dataset and recognition system), so you should be able to see if it’s helpful by using that. There is also Komonjo Kamera (which only works within Japan), which may be useful.
      Also good news – KuroNet now allows users to download the text of the transcriptions, so perhaps this gets around the copy-paste issue!

      James
