Category Archives: OCR

ScanTailor: Installation Instructions and Impressions

When we cannot find a digitized version on the internet, we photograph or scan a book or article ourselves. We end up with photos of pages that are warped and rotated, often with a lot of the surroundings showing, saved

Some Thoughts about Arabic-Script OCR

Recently, as a result of my current research project, an edition and translation of al-Maqrizi’s fifteenth-century chronicle al-Suluk for the Ayyubid period (1171-1193), I have been pondering issues related to Optical Character Recognition (OCR). Part of my work involves investigating from

Cursive Japanese and OCR: Using KuroNet

The Center for Open Data in the Humanities’ KuroNet Kuzushiji Ninshiki Sābisu (KuroNetくずし字認識サービス) launched late last year. KuroNet is a free OCR (Optical Character Recognition) platform which allows users to convert images of documents written in cursive Japanese into printed

Using Kraken to Train your own OCR Models

This is a contribution by Christine Roughan of NYU. Connect with her on Twitter @cmroughan. Over the summer of 2019, inspired by the promising results in articles like Romanov et al. 2017, I set out to use the Kraken OCR software on a variety of texts. Kraken (see their website or their repository) is open-source command-line software that is capable
