This post is the second part of a conversation between Ephrem A. Ishac and David Michelson on the state of Digital Syriac Studies. The first part can be found here.
4. Enhancement of Wright’s Catalogue
The Syriac Manuscripts in the British Library (SMBL) is described as a “digital enhancement” of William Wright’s foundational catalogue. We know Wright’s work is essential but over 150 years old. What specific forms does this “enhancement” take—beyond simple digitization—to make the British Library data more usable, searchable, and LOD-compliant in the 21st century?
Prof. David Michelson: That’s a good angle into the project. Our goal was not merely to create electronic text out of Wright, though even that is already really useful. Now you can search all of the Syriac, which of course is not searchable in the printed volume. I want to thank Robert Aydin, who prepared this text between 2010 and 2012, when HTR (Handwritten Text Recognition) for Syriac did not yet exist, so this is hand-typed and hand-proofed Syriac text. Of course, now perhaps with OCR it could be done faster.
But I would say the main difference is that we have taken William Wright’s data and applied a slightly different understanding of the codex than Wright’s. Wright has a bit of an idiosyncratic system. He doesn’t follow the shelf marks of the British Library; as anybody who’s used William Wright’s catalog knows, he has these Roman numerals, yet at the same time he is not entirely strict about reconstructing what the manuscripts must have looked like earlier. It is very hard from Wright’s catalog to understand what the manuscripts must have been like in Deir al-Surian, where the majority of them came from.
What we’ve done is take a stricter approach to separating different codices. So our catalog has more entries than Wright’s even though we’re using the same material. To give probably the easiest example: often Wright was not very interested in flyleaves, but you and I know that if a manuscript has a yellow paste-down or flyleaves, each of those represents some other Syriac manuscript that must have existed at some time, from which we now have only one leaf surviving. Our project has treated those as separate entities in the hope that a matching flyleaf might show up somewhere. At the very least, they are evidence for past manuscripts that did exist. We tried to come up with a more rational basis, different from either Wright or the shelf marks in the British Library, because we know that neither of those divisions really represents the manuscripts as they existed at their time of creation, or even as they existed up until the moment they entered the British Library.
The other enhancement is we have tried to pay much closer attention to connecting titles and authors. Wright did a very good job of this, so I don’t want to denigrate his work. It’s Wright’s work and it’s amazing, especially considering it was the work of one person without a computer. But the way we’ve approached authors and titles will allow a more specific search of the catalog. If you’re looking for someone famous—one of the Cappadocians—Wright’s catalog will help you find that quickly. If you’re looking for a lesser-known author, it can be very difficult. There’s still a lot of work to be done in terms of authority control and titles. We haven’t disambiguated titles at all; often Wright only gives an abbreviated title. We didn’t go back in and recatalogue the manuscripts. Hopefully, some future Syriac scholar who’s only in elementary school right now will get a grant in 20 years and do that!
I’ll just add one more thing: we’ve also tried to mark up the parts of the manuscript. In Wright’s catalog, if you wanted to search all manuscripts that have illuminations or illustrations, that’s basically impossible. The data is there, and so we’ve pulled it out. There are about 180 manuscripts where Wright tells us they do have illustrations, and now it’s searchable. We even use the categorization that HMML uses, so you can filter them by type—whether it means animal depictions, or just the use of red/blue inks. We’ve done the same thing marking up colophons and marginalia. Now you can search only in colophons, again something that would be very difficult to do in the print version.
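To make the idea of searching by decoration type or only within colophons concrete, here is a minimal Python sketch. It assumes a simplified record structure: the class, field names, category labels, and sample entries are all invented for illustration and are not the actual SMBL data model or HMML’s vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class ManuscriptRecord:
    """Hypothetical catalogue record; all field names are assumptions."""
    identifier: str
    decoration_types: set[str] = field(default_factory=set)
    colophon: str = ""
    marginalia: list[str] = field(default_factory=list)


def filter_by_decoration(records, decoration_type):
    """Return records tagged with the given decoration category."""
    return [r for r in records if decoration_type in r.decoration_types]


def search_colophons(records, term):
    """Case-insensitive search restricted to colophon text only."""
    needle = term.casefold()
    return [r for r in records if needle in r.colophon.casefold()]


# Invented sample data, not real catalogue entries.
SAMPLE = [
    ManuscriptRecord("ms-1", {"animal", "coloured-ink"}, "Completed in the monastery."),
    ManuscriptRecord("ms-2", {"coloured-ink"}, ""),
    ManuscriptRecord("ms-3", set(), "Written by a scribe of the monastery.", ["monastery note"]),
]

if __name__ == "__main__":
    print([r.identifier for r in filter_by_decoration(SAMPLE, "coloured-ink")])
    print([r.identifier for r in search_colophons(SAMPLE, "monastery")])
```

Because the colophon is stored as its own field, a search for a word there ignores the same word in marginalia, which is the distinction a flat printed text cannot offer.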

5. Technical Challenges and Lessons
Wright’s catalogue is historically complex. Could you discuss the biggest technical challenge in translating this legacy data into the structured, machine-readable format required by the Srophé platform and the Linked Open Data standard? What lessons did your team learn that could benefit other DH projects working with similar legacy manuscript catalogues?
Prof. David Michelson: We encountered a number of challenges. One is that the lifecycle of digital humanities scholarship is much shorter than the lifecycle of print scholarship. From the beginning of the project to the end, we discovered that we needed to update our own software. You mentioned the Srophé app; we began work on that in 2010. It’s now 15 years old, which for a software application is quite old. So we have actually switched to a new app which we call Gaddel, and we will be porting all of the Syriaca.org published apps to that version. For the technically interested, it’s a serverless app. One of our goals is actually to move to a flat or minimal computing format—in other words, that all of the Syriaca.org publications would have static HTML web pages as a way to future-proof them. You won’t be relying on any particular JavaScript. If 20 years from now you have a web browser that can open old web pages with text, it’ll be able to open Syriaca.org and read them. They’ll look outdated, but they’ll still open.
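As a rough illustration of the minimal-computing idea, the following Python sketch renders records to plain static HTML files with no JavaScript and no external assets. It is not Gaddel’s code or the Syriaca.org publishing pipeline; the function names and the shape of the input data are assumptions made for the example.

```python
from html import escape
from pathlib import Path


def render_static_page(title, paragraphs):
    """Build a self-contained HTML page with no scripts or external assets."""
    body = "\n".join(f"    <p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "  <head>\n"
        '    <meta charset="utf-8">\n'
        f"    <title>{escape(title)}</title>\n"
        "  </head>\n"
        "  <body>\n"
        f"    <h1>{escape(title)}</h1>\n"
        f"{body}\n"
        "  </body>\n"
        "</html>\n"
    )


def write_site(records, out_dir):
    """Write one .html file per record.

    `records` is a hypothetical mapping of slug -> (title, paragraphs).
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, (title, paragraphs) in records.items():
        path = out / f"{slug}.html"
        path.write_text(render_static_page(title, paragraphs), encoding="utf-8")
        written.append(path)
    return written
```

The output is ordinary text files that any web server, or a browser opening them from disk, can display without a running application behind them.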

Thinking more specifically about Wright’s catalog and technical challenges, I would point to 19th-century mentalities and assumptions. You no doubt are familiar with many of Wright’s prejudices. He has quite a few nasty things to say about particular scribes. There’s a certain Western arrogance and superiority. In fact, one of the most interesting parts of the project was a discovery I made while working in the library. Wright says very nasty things about the medieval Syriac scribes erasing Greek manuscripts and writing over them in Syriac. Now, from these scribes’ point of view, of course, they’re just promoting their literary heritage and covering over something they no longer need. Amazingly, we have preserved in the British Library a Syriac manuscript in which, in the 19th century, people working in the British Library went ahead and erased the Syriac so they could get at the Greek underneath! Before they did it, some Western scholars had made a transcription of what they erased. You can go and look at this manuscript today. So there’s a bit of hypocrisy in perspective: Wright was willing to criticize the medieval Syriac scribes for erasing things they found useless, yet we see 19th-century scholars in the British Library willing to erase the Syriac manuscripts themselves when they found something else useful.
So we had to ask: where are these historical assumptions in Wright’s catalog? How might they keep our work from being useful, and how could we use digital technology to undo them? One place they show up is the hierarchical nature of Wright’s catalog. He was interested primarily in great authors, and in how we could use Syriac to get to ‘more important’ things like Greek texts. We at no point wanted to erase Wright’s work. Even when Wright has made a mistake or said something that’s clearly erroneous or prejudiced, we’ve kept it, because that is part of the historical record. But we’ve also created digital ways to correct the mistake. For example, some of the works of Palladius are attributed to Jerome in the Paradise of the Fathers. Wright mentions Jerome. Someone using the catalog might want to see where Jerome is mentioned, so we’ve kept that, but we’ve also noted that this is not now generally understood to be by Jerome.
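One way to picture keeping Wright’s statement while layering a correction on top of it is the small Python sketch below. The class and field names are hypothetical and do not reflect how SMBL actually encodes attributions; only the Jerome and Palladius example comes from the conversation above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribution:
    """Hypothetical record pairing the historical and current attribution."""
    work: str
    as_given_by_wright: str
    current_attribution: Optional[str] = None
    note: str = ""


def find_by_author(attributions, name):
    """Match a name against either Wright's attribution or the current one."""
    needle = name.casefold()
    hits = []
    for a in attributions:
        names = [a.as_given_by_wright, a.current_attribution or ""]
        if any(needle == n.casefold() for n in names):
            hits.append(a)
    return hits


# Illustrative entry based on the example discussed in the interview.
SAMPLE = [
    Attribution(
        work="Portion of the Paradise of the Fathers",
        as_given_by_wright="Jerome",
        current_attribution="Palladius",
        note="Not now generally understood to be by Jerome.",
    ),
]

if __name__ == "__main__":
    for hit in find_by_author(SAMPLE, "Jerome"):
        print(hit.work, "|", hit.as_given_by_wright, "|", hit.note)
```

A reader searching for Jerome still finds the entry, since Wright’s wording is untouched, while a reader searching for Palladius reaches the same entry through the added correction.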
I will just mention one other unexpected challenge: the British Library suffered a devastating cyberattack which took down their systems. This had a sort of silver lining in that, by the time of the attack, our catalog was online and thus was the only available finding aid. Of course, we’re very frustrated that the attack happened and set back everyone’s work, but we were delighted that our finding aid was available. It’s been a joy to me to be in the British Library and see people use it there! I think it’s actually made the British Library even more open to conversations and partnerships with external collaborators, which is good for all of us in our field.

6. International Collaboration
Your current residency here at the Austrian Academy of Sciences (IMAFO), under the umbrella of the Eurasian Transformation Cluster, represents an important cross-border academic exchange. How do these temporary, physical collaborations contribute to the success and sustainability of long-term digital infrastructure projects like SMBL and Syriaca.org?
Prof. David Michelson: They’re 100% essential. Just to be clear, digital projects are often done remotely and we can have Zoom meetings. It may just be that I’m a scholar who recently became more than a half-century old, and maybe I’m an old-fashioned scholar, but I think that these in-person meetings are really essential to the collegiality that is required.
Clarification is key: when working digitally or remote-only, misunderstandings can develop. So I think meeting in person is important both for establishing a baseline for collaboration and for hearing about what’s in progress but not yet released. I’ve been on sabbatical in Europe and Asia this semester and I was anxious to do that simply because, being in the United States, it can be difficult to hear about works in progress. It’s such a delight for me to be here with you, to be learning about the Eurasian Transformation Cluster, getting all sorts of updates, and giving updates like in this conversation.

Ephrem A. Ishac: Thank you so much!
Prof. David Michelson: Thank you!
