Automatic Arabic Translation Using Google: A Test

Several months ago a rather interesting blog post was brought to my attention: a post by Christopher Rose detailing how to translate research texts quickly and automatically from one language into another (see the original post here). The test shown there was for translating Arabic into English and, as that is my main area of research, I was interested to see this and to try it out myself. This blog post is thus essentially a review and test of that earlier post’s suggestion.

To test this approach, I first used my book scanner to create PDFs of the pages that I wanted to have translated. In this case, these were the pages from Ibn Wasil’s (d. 1298) Mufarrij al-kurub fi akhbar Bani Ayyub that contain his account of the Fifth Crusade. Using my scanner (a Czur Aura X Pro) produced a perfectly clear, useable copy of the original text:

After that, I uploaded the PDF to my Google Drive and then opened it using Google Docs. The output is below:

As should be clear to those who read Arabic, this is a very close rendering of the original. The formatting is a bit odd, but the Arabic itself is extremely close to the original, and Google even shows where it is not wholly sure by marking the text in yellow. (On a side note here, as part of a large research project in which I am involved, one of my colleagues has been carrying out a comparison of around twenty Arabic OCR programs, and this free one by Google consistently comes at or near the top, outperforming various paid-for versions.)

The next step was to highlight the text to be translated. According to Rose’s original, the easiest way to do this was via Google Translate. Yet it is not possible to translate large chunks of text, because Google Translate can only cope with 5,000 characters at a time. So, instead, I worked a paragraph at a time or, with long paragraphs, five or six lines (two or three sentences) at a time. All one has to do is copy the relevant text. From this example, I took the heading and the first paragraph and ran them through Google Translate. The resulting text:

He mentioned the disturbance of the military against the full king, his delay in his position and the plunder by the Franks of the Muslims’ weights
When the full king reached the death of his father, which is his status known as Adiliyya in Barr Damietta, straddling the Franks, he sat in the open, and the death of his father was greatly exaggerated over him in such a difficult time: he was afraid that his brothers would abandon him and would not be able to pay the Franks from the Egyptian lands. And in their possession of her Boar Islam altogether
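
For longer texts, the splitting step described above could be automated rather than done by hand. What follows is only a minimal Python sketch under stated assumptions: the function name `split_for_translation` is my own invention, the text is assumed to have one paragraph per line (as copied out of Google Docs), and sentence ends are guessed from basic punctuation, which classical Arabic texts often lack. It treats each paragraph as one piece and breaks a paragraph into groups of whole sentences only when it exceeds the 5,000-character limit.

```python
import re

MAX_CHARS = 5000  # the Google Translate limit mentioned in this post


def split_for_translation(text, limit=MAX_CHARS):
    """Return a list of pieces, each at most `limit` characters long.

    Each non-empty line is treated as one paragraph. Paragraphs that are
    too long are broken into groups of whole sentences.
    """
    chunks = []
    for paragraph in text.splitlines():
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        if len(paragraph) <= limit:
            chunks.append(paragraph)
            continue

        # Long paragraph: split after ., !, ? or the Arabic question mark.
        current = ""
        for sentence in re.split(r"(?<=[.!?؟])\s+", paragraph):
            # A single sentence longer than the limit is cut hard.
            while len(sentence) > limit:
                if current:
                    chunks.append(current)
                    current = ""
                chunks.append(sentence[:limit])
                sentence = sentence[limit:]
            if not sentence:
                continue
            candidate = f"{current} {sentence}" if current else sentence
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = sentence
        if current:
            chunks.append(current)
    return chunks
```

Each returned piece can then be pasted into Google Translate in turn. The sketch only does the splitting; it does not call any translation service.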

As may be clear, Google’s output is not a great translation. This is partly caused by the punctuation marks in the original Arabic; replacing them with more ‘modern’ punctuation helps the flow of the Arabic and thus of the translation. Replacing ‘the full king’ with the original Arabic honorific, al-Malik al-Kamil, also helps:

‘He mentioned the disturbance of the military against al-Malik al-Kamil, his delay in his position and the plunder by the Franks of the Muslims’ weights.
When al-Malik al-Kamil reached the death of his father, which is his position known as Adiliyya in Barr Damietta, straddling the Franks, he sat in the open, and the death of his father was magnified by the very sad death of his father at such a difficult time. And he was afraid that his brothers would abandon him, and I would not be able to pay the Franks from the Egyptian homes’.
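
The honorific substitution is the kind of fix that could be scripted if it had to be repeated across many passages. The following Python sketch assumes that the replacement is applied to the English output rather than to the Arabic input; the names `GLOSSARY` and `apply_glossary` are hypothetical, and the glossary holds only the one pair taken from this example.

```python
import re

# Literal renderings produced by machine translation, mapped to the
# transliterated form preferred here. Only this one pair comes from the
# example above; any further entries would have to be added by hand.
GLOSSARY = {
    "the full king": "al-Malik al-Kamil",
}


def apply_glossary(text, glossary=GLOSSARY):
    """Replace each literal rendering with its preferred form."""
    for literal, preferred in glossary.items():
        # Whole-phrase, case-insensitive match.
        pattern = re.compile(r"\b" + re.escape(literal) + r"\b", re.IGNORECASE)
        # A function is used as the replacement so that the preferred
        # form is inserted verbatim.
        text = pattern.sub(lambda match: preferred, text)
    return text
```

For example, `apply_glossary("against the full king, his delay")` gives `"against al-Malik al-Kamil, his delay"`. A glossary of this sort only patches names and titles; it does nothing for the grammatical problems in the rest of the passage.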

A few more tweaks to the English, based on my knowledge of the events themselves, do, however, allow me to produce something a little more useable:

‘Account of the revolt of the army against al-Malik al-Kamil, his delay in his position and the plunder by the Franks of the Muslims’ camp.
When news of the death of his father reached al-Malik al-Kamil, who was in his position known as Adiliyya, in Barr Damietta, across from the Franks, he sat in the open, and the death of his father was magnified by it being such a difficult time. And he was afraid that his brothers would abandon him, and he would not be able to drive away the Franks from the lands of Egypt’.

This is now a workable, legible translation. However, the amount of time and effort that had to go into producing it means that it would have been quicker simply to read the Arabic myself, and the automatic process on its own produced an inaccurate translation. As such, this technique really cannot be used in any serious research work.

3 thoughts on “Automatic Arabic Translation Using Google: A Test”

  1. Agreed! I find the OCR function far more useful than the translation, since the all-powerful Google is better at recognizing possible words where the scan isn’t clear. If I just want a quick-and-dirty version to see if something specific is mentioned, it’s helpful, but for ‘real’ work and quotation in an academic publication, I’ll translate it myself.
