Balancing Innovation and Preservation: Sustainability in the Oral History of Tibetan Studies Project

This post was co-written with Daniel Wojahn.

The Oral History of Tibetan Studies (OHTS) project aims to collect and preserve the memories of the pioneers of Tibetan studies, both through recorded interviews and visual materials made available in our digital archive. Since its launch in 2017, OHTS has collected over 80 interviews – totalling more than 300 hours of footage – in a multitude of languages, with individuals across Europe, North America, and Asia. 

This article addresses some of the challenges encountered during the project’s development, including the systematic digitisation and curation of materials, the balance between preserving raw data and ensuring accessibility, and the integration of rapidly evolving technologies into our workflow. We present the OHTS approach to these challenges and its strategies for making its collection tangible to a wide audience without compromising on sustainability and data reusability. In the process, we seek to adopt digital tools that are not only user-friendly and intuitive, but also meet modern standards in line with the FAIR principles and ultimately bridge the gap between innovation and sustainability.

Project background

The Oral History of Tibetan Studies was founded by Anna Sehnalova and Rachael Griffiths in 2017 with two objectives: in the short term, to capture the memories of the ‘pioneers’ of Tibetan studies and related disciplines through audio/visual interviews; in the long term, to preserve these interviews, along with supplementary materials such as conference programmes and photographs, in an online archive. The project was initially conceived as a pet project alongside their PhD studies: a basic data curation pipeline was established, but there was no comprehensive data management plan. The focus was on collecting and conserving interviews (mainly with Tibetologists visiting Oxford) and on securing funding to sustain the project.

When Daniel Wojahn joined the project in 2019 as digital content lead, the project expanded its scope to include digitisation and archival efforts, although the focus was still very much on capturing and editing interviews (logging and rough edit) to eventually go online. Funding from the John Fell Fund facilitated 33 interviews across Europe. By December 2021, when we launched our online archive, we had collected 39 interviews – over 120 hours of recordings! However, only 15 interviews were archived on the website at the time. Several factors contributed to this, including underestimating the time and resources required to sustain a project of this size, as well as expanding our aims to make the interviews more accessible. For example, we time-stamped the questions and different sections of the interviews to help viewers better navigate the materials, but this added at least five hours, and in some cases as many as 15, to the editing process for each interview. Finally, to prepare for future funding applications, the project expanded its public outreach by giving presentations, writing papers, and creating a social media account.

The OHTS website

After the archive was launched, we reflected on the strengths and weaknesses of our project and workflow. In the first phase of the project (2017–2021), we created a best practice document for the technical, thematic, and aesthetic requirements of our interviews, but we also wanted to ensure the long-term usability and accessibility of our raw data. Accordingly, we had to decide what to prioritise in the second phase of the project (2022–2025). This was challenging, as we had to strike a balance between what makes sense for the project and what will attract funding. We decided that in the short term (2021–2022) we would continue to focus on capturing interviews (this time in North America and Asia), as we believed this to be more popular with funders, and would later focus on doing more with the data.

As of now, we have collected 89 interviews, amounting to more than 370 hours of footage. This expansion is incredible, but the scale of the project, developments in digital humanities (DH), and our own reflections on our work have prompted some large shifts in our aims. Most importantly, how do we continue to sustain a project of this size (not just financially, but also by juggling priorities with limited capacity and resources), and how do we think more broadly about accessibility (captioning, making the archive accessible to Tibetans, or extracting meaningful data from our interviews)?

Funding institutions versus further development: How to balance project goals?

Moving a project like the Oral History of Tibetan Studies out of the conservation phase and into the database era, as described by Ilona Budapesti (Budapesti 2019, 25-40), is easier said than done. The difficulty can be seen in the repeated life cycle of many digital projects: after initial project funding of three to five years, the first results are published on a website provided by the hosting institution, and (hopefully) a few repositories are posted to Zenodo or GitHub. Once the funding has expired, however, the project members disperse and join their next research group, and the project and its website pass into oblivion. A study by Brigitte Mathiak and her colleagues at the Leibniz Institute for the Social Sciences, Cologne, found that the average lifespan of digital outputs was 8.5 years (Naumann 2022, 78-80).

Therefore, we have asked ourselves how we can strike a balance between our goals. While making our collection fully accessible to the extent we desire would require additional funding and time, we still want to ensure that the material is preserved and available for funders to see and for others to use. To address this, we adopted a staged approach, seeking funding for specific project phases aligned with our goals. These stages also facilitate ongoing reflection and provide us with space and flexibility to incorporate new technologies and adapt our workflow. Some examples of this are discussed below.   

Streamlining the editing workflow

Our editing process underwent several optimisations to enhance efficiency and accessibility. Due to the sensitive nature of the project, we are bound by strict ethical guidelines that require the full consent of our interviewees. We therefore meticulously check the raw material for possible content-related issues and technical corrections. We have extended this review so that, in the same pass, we time-stamp the interview questions and note key terms. These annotations facilitate navigation within lengthy interviews and lay the groundwork for future data analysis.

When the finished videos are uploaded to YouTube, the time-stamps provide us with automatically generated chapter markers that allow users to navigate easily through an interview that lasts several hours (some exceed 11 hours!). These chapter markers could be reused as links on the website by connecting them to the embedded video using a simple JavaScript function within the YouTube API. In addition, this provides further data for the fuzzy search of our website.

Example of OHTS interview with time-stamps
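
To make the idea concrete, here is a minimal Python sketch (not the project's actual code) of how one list of time-stamps could yield both the chapter lines for a video description and a link per chapter for a web page. The video ID, chapter titles, and function names are hypothetical. The sketch assumes that chapters are declared as description lines starting at 0:00 and that a `t=` parameter given in seconds opens the video at that point.

```python
def to_seconds(stamp: str) -> int:
    """Convert 'M:SS', 'MM:SS' or 'H:MM:SS' into a number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def description_block(chapters: list[tuple[str, str]]) -> str:
    """Render the chapter list as lines for a video description."""
    return "\n".join(f"{stamp} {title}" for stamp, title in chapters)


def chapter_links(video_id: str, chapters: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Pair each chapter title with a URL that opens the video at that time."""
    base = "https://www.youtube.com/watch"
    return [
        (title, f"{base}?v={video_id}&t={to_seconds(stamp)}s")
        for stamp, title in chapters
    ]


# Hypothetical chapter list: titles and times are invented for illustration.
chapters = [
    ("0:00", "Introduction"),
    ("12:30", "Early studies"),
    ("1:05:10", "Fieldwork"),
]

print(description_block(chapters))
for title, url in chapter_links("VIDEO_ID", chapters):
    print(title, url)
```

Keeping the time-stamps in one plain list means the same data can feed the video description, the website links, and a search index without being retyped.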

After completing the initial videos, we decided to add summaries of the interviewee’s answers in the form of bullet points and to note people, institutions, and terms using a keyword list we had created. We did this with the free web app oTranscribe by Elliot Bentley, which can save the result as a Markdown file. Although we did run several interviews through AI-assisted transcription programmes, correcting the output was not worth the time and effort. In particular, Sanskrit and Tibetan terms and personal names were either not recognised or rendered as erroneous approximations.

Example of interview summary using oTranscribe
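
As an illustration of why such Markdown files lend themselves to later reuse, the following Python sketch reads a summary and builds a small index of key terms. The file layout is an assumption made for this example (one bullet per answer, a time-stamp in square brackets, key terms in bold); it is not necessarily what oTranscribe exports or what the OHTS files look like, and all names are placeholders.

```python
import re

# Assumed layout: "- [MM:SS] text with **key terms** in bold"
LINE = re.compile(r"^- \[(\d{1,2}:\d{2}(?::\d{2})?)\]\s+(.*)$")
TERM = re.compile(r"\*\*(.+?)\*\*")


def parse_summary(markdown: str) -> list[dict]:
    """Return one entry per time-stamped bullet; other lines are skipped."""
    entries = []
    for line in markdown.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # headings, blank lines, free text
        stamp, text = match.groups()
        entries.append({
            "time": stamp,
            "summary": TERM.sub(r"\1", text),  # text without the bold markers
            "terms": TERM.findall(text),
        })
    return entries


def index_terms(entries: list[dict]) -> dict[str, list[str]]:
    """Map each key term to the time-stamps at which it is mentioned."""
    index: dict[str, list[str]] = {}
    for entry in entries:
        for term in entry["terms"]:
            index.setdefault(term, []).append(entry["time"])
    return index


# Hypothetical summary with placeholder names.
sample = """\
# Interview summary

- [00:00] Introduction and consent
- [12:30] Studied under **Person A** at **Institution B**
- [1:05:10] Returned to **Institution B** to teach
"""

entries = parse_summary(sample)
print(index_terms(entries))
```

An index of this kind could supply extra terms for a site search or the raw material for linking interviews to one another.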

Sustainability

This streamlined first stage helped us achieve a public archive with a few basic accessibility features. We also connected the website to the International Association for Tibetan Studies (IATS), which has a long-term hosting agreement to ensure the longevity of our project. The OHTS website was built using WordPress but is served as a static site, and therefore does not rely on critical updates. Hosting the videos on YouTube serves the same purpose.

Keeping our time-stamps and brief summaries in Markdown format, alongside the terms we highlighted as important, should make it easier to return to the whole collection once more funding has been secured, and to employ additional tools, techniques, and elements that improve accessibility over time.

Connecting the dots: The next stage

Although all project members initially came to the project with a preservationist mindset, new ideas and possibilities developed organically over the many years of engagement with our interviews. This left us with a conundrum: how to preserve the anthropological nature of the interviews (semi-structured, with little to no editing) while at the same time opening the archive further to the interested public and perhaps even bringing the stories and data collected into conversation with each other.

To this end, we are considering integrating an open Obsidian project into our website. This would allow users to work through the various Markdown files or use Obsidian's graph view to identify themes, trends, and intellectual lineages emerging from OHTS. It would also allow us to integrate other pioneers and predecessors who have already passed away, increasing the complexity of the network.
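
A network of this kind rests on links between notes. As a rough sketch, assuming each interview note marks people and institutions with Obsidian-style `[[wikilinks]]`, the following Python code builds such a graph and finds what two interviews have in common. The note names, contents, and function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches [[Target]], [[Target|alias]] and [[Target#heading]]; captures Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def build_graph(notes: dict[str, str]) -> dict[str, set[str]]:
    """Build an undirected graph: note name -> names it is linked with."""
    graph: dict[str, set[str]] = defaultdict(set)
    for name, text in notes.items():
        graph[name]  # ensure notes without links still appear as nodes
        for target in WIKILINK.findall(text):
            target = target.strip()
            graph[name].add(target)
            graph[target].add(name)
    return dict(graph)


def shared_neighbours(graph: dict[str, set[str]], a: str, b: str) -> set[str]:
    """Return the nodes that both a and b are linked to."""
    return graph.get(a, set()) & graph.get(b, set())


# Hypothetical notes with placeholder names.
notes = {
    "Interview 01": "Studied with [[Teacher X]] at [[Institution B|the institute]].",
    "Interview 02": "Also a student of [[Teacher X]].",
}

graph = build_graph(notes)
print(shared_neighbours(graph, "Interview 01", "Interview 02"))
```

Here the two interviews are connected through "Teacher X", which is the sort of intellectual lineage a graph visualisation would surface.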

It is also possible to include RDF Linked Open Data (LOD) syntax in the Markdown files, which would allow for compatibility with other projects. Digital humanities is a cross-disciplinary and collaborative field that emphasises the aggregation of knowledge and data and uses various digital approaches to make archived material more accessible. Future research will benefit from better linking of our information, using LOD best practices and semantic infrastructures such as Wikidata. 
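
One possible shape for such linked data, sketched here in Python, is a small JSON-LD description of a person that points to a Wikidata entity. The schema.org terms (`Person`, `name`, `sameAs`, `affiliation`) and the Wikidata entity URI pattern are real; the function name, the person, the institution, and the identifier `Q0` are placeholders, and how the result would be embedded in a Markdown file is left open.

```python
import json
from typing import Optional

WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"


def person_jsonld(
    name: str,
    wikidata_id: Optional[str] = None,
    affiliation: Optional[str] = None,
) -> dict:
    """Describe a person with schema.org terms, optionally linked to Wikidata."""
    doc = {"@context": "https://schema.org", "@type": "Person", "name": name}
    if wikidata_id:
        doc["sameAs"] = WIKIDATA_ENTITY + wikidata_id
    if affiliation:
        doc["affiliation"] = {"@type": "Organization", "name": affiliation}
    return doc


# "Q0" is a placeholder, not a real Wikidata identifier.
record = person_jsonld("Person A", wikidata_id="Q0", affiliation="Institution B")
print(json.dumps(record, indent=2, ensure_ascii=False))
```

Because the identifier rather than the spelling of a name carries the link, variant transliterations of Tibetan and Sanskrit names could all resolve to the same entity.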

We also assume that future DH tools will open up even more possibilities for us to analyse our archive, and we therefore think that the Markdown structure is a good long-term basis for this.

An alternative, but also very time-consuming, option is the open-source framework OHMS (Oral History Metadata Synchronizer). The Louie B. Nunn Center for Oral History at the University of Kentucky Libraries created this web-based system to enhance access to oral history online through word-level search capability and a time-correlated transcript or indexed interview.

While it essentially provides a full suite of tools similar to those outlined above, datafying our collection would take considerably more time due to its XML and tag-based infrastructure. The final metadata would include (1) section and chapter summaries, (2) keywords and subjects tagged in accordance with the Library of Congress Subject Headings catalogue, (3) geo-tagged places and indexed persons, and (4) bilingual translations (mostly English and Tibetan). 
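
To give a sense of what a tag-based record for one interview section involves, here is a hedged Python sketch that holds the four kinds of metadata listed above and serialises them to XML. The element and attribute names are invented for illustration and are not the OHMS schema; the sample values are placeholders, and the coordinates are approximate.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Segment:
    start_seconds: int
    title: dict[str, str]    # language code -> text, e.g. "en", "bo" (Tibetan)
    summary: dict[str, str]  # language code -> text
    subjects: list[str] = field(default_factory=list)  # subject headings
    persons: list[str] = field(default_factory=list)
    places: list[tuple[str, float, float]] = field(default_factory=list)  # name, lat, lon


def segment_to_xml(seg: Segment) -> ET.Element:
    """Serialise one segment using illustrative (non-OHMS) element names."""
    node = ET.Element("segment", start=str(seg.start_seconds))
    for tag, texts in (("title", seg.title), ("summary", seg.summary)):
        for lang, text in texts.items():
            ET.SubElement(node, tag, lang=lang).text = text
    for subject in seg.subjects:
        ET.SubElement(node, "subject").text = subject
    for person in seg.persons:
        ET.SubElement(node, "person").text = person
    for name, lat, lon in seg.places:
        ET.SubElement(node, "place", lat=str(lat), lon=str(lon)).text = name
    return node


# Placeholder values throughout.
segment = Segment(
    start_seconds=750,
    title={"en": "Early studies", "bo": "(Tibetan translation)"},
    summary={"en": "Placeholder summary."},
    subjects=["Oral history"],
    persons=["Person A"],
    places=[("Oxford", 51.752, -1.258)],
)
print(ET.tostring(segment_to_xml(segment), encoding="unicode"))
```

Even this reduced record needs several tagged fields per section, which suggests why fully indexing hundreds of hours of interviews in this way would be a substantial undertaking.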

Summary

In conclusion, OHTS stands at a pivotal juncture, navigating the delicate balance between innovation and its original goal of preservation, while steadfastly adhering to principles of sustainability. Since its inception in 2017, the project has evolved from a humble initiative to a comprehensive archival endeavour, collecting over 80 interviews across continents and amassing a wealth of invaluable historical narratives totalling more than 350 hours of footage.

Throughout its development, the project has encountered and addressed numerous challenges, from digitisation and curation to ensuring accessibility without compromising on sustainability and data reusability. By adopting a meticulous approach to editing and annotation, coupled with strategic partnerships and a staged funding model, OHTS has not only created a public archive with basic accessibility features but also laid the groundwork for future enhancements and collaborations.

Looking ahead, OHTS is poised to enter its next phase, exploring innovative avenues for enhancing the usability and interconnectedness of its archive. Whether through the integration of open Obsidian projects or the incorporation of RDF Linked Open Data syntax, the project is committed to fostering interdisciplinary collaboration and advancing the field of digital humanities.

In essence, the OHTS project exemplifies the intricate dance between tradition and modernity, preservation and accessibility. As it continues to evolve and adapt to the ever-changing landscape of technology and scholarship, it remains dedicated to its core mission of preserving the rich tapestry of Tibetan studies for generations to come.

References

Budapesti, Ilona. 2019. “Past, Present, and Future of Digital Buddhology.” In Digital Humanities and Buddhism: An Introduction, edited by Daniel Veidlinger, 25-40. Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783110519082-002.

Naumann, Kai. 2022. “Databases for 2080 – Preserving database content for the long term.” ABI Technik 42, no. 1: 78-80. https://doi.org/10.1515/abitech-2022-0009.
