On Friday, Oct 16, Brown University hosted a conference bringing together those in Islamic Studies who are working on making computer technology useful for research purposes. It was recorded and you can watch the conference here.
It was a very fruitful day, with intense discussions from 9 to 6. The abstracts can be read here. Some recurring themes are the following:
This is the future: Clearly, as we enter the digital age, our workflow ought to change to keep up with the efficiency that others show. Computer technology also allows us to answer questions that were impossible to answer before. Parts of computer technology will influence the entire field. Other parts will remain exclusive to the few invested in what is called Digital Humanities.
Can somebody figure out OCR already!? The biggest hurdle by far is the lack of a reliable Optical Character Recognition technology for Arabic. How hard can this be, people!? And surely the economic rewards are great. Yet, for some reason, the nut has not been cracked. To have the corpus of al-Maktaba al-shāmila, estimated at 800 million words, is great, but it is still only a fraction of the Islamic heritage, and a very specific one at that. Only with OCR technology will we be able to create text databases for our own specific uses.
Required technical knowledge: The technical skills of the participants varied greatly. A majority were not producing tools but rather exploring how to use the tools that are already available. Some were cooperating with computer scientists and had a reading knowledge of programming. Only a few were actual programmers themselves, actively developing new tools. Developing computer solutions for specific problems is time-consuming and specialized work. To do that and to produce real research is a tough combination. Perhaps the best solution is cooperation with programmers, but finding a good match is not easy, nor is maintaining that working relationship in the long run.
Limited tools, limited availability: The tools that are out there are limited in their use, often developed for specific problems. Moreover, they are not always shared. Even when they are meant to be shared, communication is not straightforward. There is no repository of tools for Islamic Studies, nor a listserv, nor any other common communication channel for finding out what is available. Nor is there any clear knowledge of tools available outside of Islamic Studies.
It seems, then, that Digital Humanities in Islamic Studies is still at an early stage. There were plenty of ‘proofs of concept’, and certainly enough big ideas. To make all of that a reality will require more manpower and greater funding.
We need to help with the OCR development. There are technical specialists who can either develop something new or improve something that is already available, but they need our help, for example with training data. If each one of us prepares the equivalent of several hundred pages for them to work with, we can expect some results.