Archives in Indian Languages as Digital Commons in India

This is a follow-up to part one on the “Future of the Commons” conference published earlier this week. While that article focused on developments in AI for Indian languages, this one documents the state of archiving in Indian languages across the country. 

The Centre for Internet and Society, Bengaluru, and Maharashtra Knowledge Corporation Ltd, Pune, brought together archivists, curators, technologists, academicians, civil society organisations, and journalists for a conference on ways of thinking about the future of the commons in India. Archives in Indian languages were an important sub-theme of the programme. Representatives from People’s Archive of Rural India (PARI), Keystone Foundation, Rekhta Foundation, Curating for Culture, Pratham Foundation, India Water Portal, The White Swan Foundation, and Translation Panacea were present to share their experiences of working with Indian languages. Over two days, the participants discussed how conducive digitality has been for heritage resources in Indian languages.

Here are six takeaways from the event for those interested in questions around Indian languages and digital commons: 

1. Lack of open source solutions

While there are many open source solutions for archives maintained in English, these do not respond well to Indian languages and materials. Even simple requirements, such as support for right-to-left scripts, become heavy demands on technology. If any projects in India are dedicated to addressing such challenges, unfortunately, not much is known about them.

2. What to digitise?

Resources—digital, financial, or labour-related—for archives are limited. Does that mean one must digitise everything? What about the duplication of efforts, or information pollution, when a number of archives focus on the same object for digitisation? Similar problems haunt different communities and institutions working on AI in Indian languages: lack of communication among the stakeholders and duplication of efforts that can be easily avoided. 

If some of the focus could be shifted to preservation, it would shape into a better archiving model for India. 

3. An imagination of engagement

Because the technological resources to work with Indian languages are so few, it becomes a challenge to get users and communities to interact with an archive. A lot of resources require enhancement from the point of view of UX/UI to attract users. For instance, Omeka, a platform for archiving, could be adapted to make navigation from the textual artifact to plain text much easier. 

If developers could align these needs for technical resources with the aesthetic requirements, it would likely help the cause of the archives.

4. Talking to each other

Archives must be in a position to talk to each other. However, not many virtual museums or mega library projects in India are in a position to do that: stakeholders do not want to keep their archives open and up for collaboration, for a variety of reasons, including financial, and there is no forum to centralise and coordinate such initiatives. Interoperability and standardisation need to be made priorities for archivists and curators in India for the larger cause of the commons. 

5. Capacity building

Archivists suffer from technological and financial anxieties around digitisation. There need to be more courses or training forums to help them understand the processes of archiving, and also to identify co-archivists and collaborators. A large question that most archivists are unable to grapple with is: how does one work with the processes of transfer as archives outlive the archivists?

Various tools for image generation, translation, OCR, photogrammetry, speech-to-text conversion, gamification, data visualisation, and data analysis exist, but ways to access them (such as through financial resources, capacity building, and awareness-generation campaigns) need to be identified.

While solutions such as those provided by Google Arts and Culture are popular, a lot more is needed for the survival of archives. Public engagement, for instance, is essential. Identity theft and privacy are also huge areas of concern: while these concerns affect the digital world as a whole, the specific problem that arises with digital archives is that they might make certain individuals prone to targeting by specific groups. For instance, journalistic coverage of an individual’s story of exploitation is opened up to further scrutiny when it is present in an archive. What could simply be deleted as a post on social media acquires a different aura, that of a record or a statement of dissent or resistance, when it is part of an archive. In such a situation, deletion is not a solution, while its continued presence in the archive remains a matter of concern.

Additionally, not all archivists are trained in fundraising or strategy building to ensure the survival of archives. A basic education in what resources are available can help address many gaps in archive building in India.

6. Let’s de-romanticise archives

From the keynote by Palagummi Sainath, founder of PARI, to the interventions from fellow participants, a refrain in the conference was that in order for archives to be meaningful in India, they need to be de-romanticised as things that belong in fancy art galleries or elite spaces and instead brought into the fold of what should be known as “poor man’s GLAM” (galleries, libraries, archives, museums). For this to happen, archives need to be made available in Indian languages. It would be a radical act in the democratisation of archives and archival processes. 

Overall, the argument that emerged from the conference was that while a lot of archives—personal, community, and institutional—are beginning to be built now thanks to the harnessing of social media and other digital technologies, there is still a lot more that can be done. The greatest lesson to be learned from surveying the archives scene in India has been that of community building. The most successful archives focused on bringing people together, not just through crowdsourcing but by keeping them nourished with ideas and possibilities of working together on social media or through community festivals, and then working on the development of the content or resources. 

If one can replicate the success of, for instance, Rekhta in leveraging the love for the Urdu language, then other things that archives usually struggle with can be taken care of: getting people to contribute to and engage with the archive, as well as getting them concerned about the resources needed to run it.

Another example that stood out was PARI’s attempt to make their resources available in 15 Indian languages so that the stories from rural India reach a much wider audience rather than staying within the English-speaking demographic. The translators working with PARI have, among themselves, an interesting body of work that speaks to the challenges of making ideas and concepts travel from one language (and region) to another. For instance, the ‘salt pan’ from peninsular India becomes ‘noon-er bhaanti’ in Bangla, evoking a sense of furnace rather than that of a (salt) pan. This shared understanding of Indian languages also deserves to be seen as a critical part of the digital commons that the conference set out to unpack.

It is hoped that future editions of the conference will pick up on these themes and invite more intense deliberations on languages that did not find presence in the conversations: languages that do not have a script, Adivasi languages, languages from the Northeast, and so on. Any conversation about digital inclusivity needs to address such languages on their own terms. 


Cover image by Vanshhuyaar. Languages of Bharat.
