In my previous post, I wrote about the Sindhi Halchal Archive: a passion project that extends the work of the PG Sindhi Library by drawing attention to Sindhi books and their reading and publishing contexts in very visual, colourful ways. In this post, I follow up with another project that complements the PG Sindhi Library by collecting metadata from online and offline libraries where Sindhi books are housed.
Sindhi Sanchaya
Sindhi Sanchaya is a repository of Sindhi books available across different libraries and digital archives. Think of it as a specialised version of WorldCat for Sindhi books. Funded by the JP Narayan Centre for Excellence in Humanities at IIT Indore, the project aims to help readers find relevant sources on Sindhi language and literature. Among the sources featured are Sindhi Sangat, websites dedicated to individual Sindhi authors (Sundri Uttamchandani or Gobind Malhi, for instance), and the relevant public domain catalogue data of international institutions such as the British Library or India Office Records. Together, the metadata constitutes a bibliographical treasure. The hope is that the database can be studied to map writers, titles, and institutions, and so to visualise how and where Sindhi books have travelled. What’s the history of displacement of Sindhi literature? That’s one of the questions this project hopes to answer.
Methodology
A key issue for Sindhi literature is that it’s scattered. Some archival work on Sindhi literature is available online, on individual author sites or on general archive websites, but the information has not been consolidated anywhere. Sindhi Sanchaya can put seekers of information in touch with the right online and offline resources where books can be traced. It can draw traffic to these resources and contribute to their usage for further research. Its metadata can also be of immense interest in building the big picture of the state of Sindhi as a modern literature.
The collection of resources around Sindhi publishing was envisioned in the following ways:
- Collection of materials for the database such as library catalogues, publishers’ catalogues, or titles in individual spaces (online as well as offline) spread across different locations in India (and a handful of international collections)
- Documentation and cataloguing of information collected from the different sources
- Curation and standardisation of the categories and labels to arrive at some form of consistency among records found in different sources
- Production of metadata (title, author, year and location of publication, publisher details) in 3 scripts (Perso-Arabic, Devanagari, and Roman) to make the database accessible and interactive
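One way to picture a record with these fields is a small data structure that keeps each script as a separate value. The sketch below is illustrative only: the names (ScriptText, BookRecord, and the field labels) are assumptions made for this post, not the project’s actual schema, and the sample values are placeholders.

```python
from dataclasses import dataclass
from typing import Optional

# The three scripts named in the methodology; these labels are hypothetical.
SCRIPTS = ("perso_arabic", "devanagari", "roman")


@dataclass
class ScriptText:
    """One piece of text (a title or a name) in up to three scripts."""
    perso_arabic: Optional[str] = None
    devanagari: Optional[str] = None
    roman: Optional[str] = None

    def missing_scripts(self) -> list[str]:
        """Return the scripts for which no text has been entered yet."""
        return [s for s in SCRIPTS if not getattr(self, s)]


@dataclass
class BookRecord:
    """Hypothetical shape of one catalogue entry."""
    title: ScriptText
    author: ScriptText
    year: Optional[int] = None
    place: Optional[str] = None
    publisher: Optional[str] = None
    holding_library: Optional[str] = None


# Placeholder values, not a real catalogue entry.
record = BookRecord(
    title=ScriptText(roman="Example Title"),
    author=ScriptText(roman="Example Author"),
    year=1950,
)
print(record.title.missing_scripts())
```

Keeping the scripts as separate fields makes it easy to see which transliterations are still pending for a given entry.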
Technology
The Sindhi Sanchaya digital archive is built on a modern, modular web technology stack designed to manage rich multilingual content and provide a seamless user experience for accessing Sindhi literary heritage. At its core, the platform uses Django CMS for backend content management and Next.js for the frontend interface, both of which are connected through APIs to provide a responsive and scalable architecture.
The frontend is built using Next.js, a React-based framework optimised for performance and server-side rendering. This helps the archive load quickly and gives it search-engine-friendly routing. For deployment, both Render (backend) and Vercel (frontend) are configured for continuous deployment: any code changes pushed to the GitHub repositories automatically trigger builds and deploy updates to production. This allows the site to evolve rapidly without downtime or manual intervention.
The backend, hosted on Render, uses the Python Django framework combined with a Supabase database for storing structured data on authors, books, and multilingual content in Devanagari, Roman, and Perso-Arabic scripts. The backend also exposes RESTful APIs that serve data securely to the frontend application. The frontend, hosted on Vercel, connects to the Django APIs using environment variables such as NEXT_PUBLIC_BACKEND, allowing dynamic content delivery.
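To give a sense of what multilingual data looks like when it travels over such an API, here is a minimal sketch using only Python’s standard library. The key names are assumptions and the values are placeholder text (the word “Sindhi” in each script); this is not the project’s actual response format.

```python
import json

# Hypothetical payload for one record; key names are illustrative only.
book = {
    "title": {
        "perso_arabic": "سنڌي",
        "devanagari": "सिंधी",
        "roman": "Sindhi",
    },
    "year": None,
}


def to_json(payload: dict) -> str:
    """Serialise a record, keeping non-Latin characters readable.

    ensure_ascii=False writes the characters themselves instead of
    \\uXXXX escape sequences.
    """
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


body = to_json(book)
restored = json.loads(body)
print(body)
```

Both forms decode to the same data; leaving the characters unescaped mainly makes the raw responses easier for people to read and check.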
Uploading or updating data is simplified through scripts or API calls that ingest content from structured Google Sheets into Supabase. The platform accommodates various data entry methods, including manual additions and automated web scraping for online collections, enabling comprehensive coverage of available resources.
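The cleaning part of such an ingestion script might look like the sketch below. It assumes a sheet has been exported as CSV and uses hypothetical column names (title_roman, author_roman, year, publisher). The final write to the database is left out, since it depends on details not described here.

```python
import csv
import io
import unicodedata


def clean(value):
    """Normalise Unicode to NFC, collapse whitespace, return None if empty."""
    text = unicodedata.normalize("NFC", value or "")
    text = " ".join(text.split())
    return text or None


def parse_year(value):
    """Accept only a four-digit year; anything else becomes None."""
    text = clean(value)
    if text and text.isdecimal() and len(text) == 4:
        return int(text)
    return None


def rows_from_csv(csv_text):
    """Turn exported sheet text into a list of cleaned row dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for raw in reader:
        row = {
            "title_roman": clean(raw.get("title_roman")),
            "author_roman": clean(raw.get("author_roman")),
            "year": parse_year(raw.get("year")),
            "publisher": clean(raw.get("publisher")),
        }
        if row["title_roman"]:  # skip rows that have no title at all
            rows.append(row)
    return rows


# Placeholder data standing in for an exported sheet.
sample = (
    "title_roman,author_roman,year,publisher\n"
    "  Example   Title ,Example Author,1950,\n"
    ",Nobody,,\n"
)
rows = rows_from_csv(sample)
print(rows)
```

Unicode normalisation matters for Perso-Arabic and Devanagari text in particular, where the same visible string can be stored as different character sequences.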
Challenges
Collecting the data about where Sindhi books might be present in libraries, institutions, or private collections required some networking and reaching out to the custodians to explain the project. The request was that they share lists of books or catalogues for the project. It seemed a straightforward proposition: most of these institutions did not have a digital presence (yet), lacking an OPAC (Online Public Access Catalogue) or even a website, and nobody could access their records unless they came in person or called up the staff. Private collections were never planned by their collectors as websites anyway.
Sindhi Sanchaya was intended as a space for gathering all this data about Sindhi publications, a bibliographic resource, especially given the scattered nature of Sindhi archival material: some of its publishing network exists in India, some in Pakistan, and some in the UK. However, not everyone saw the project as an exercise in data collection. Some thought it was meant to take credit and Sindhi resources away from them. Some thought that the metadata about the books was their “copyright” and not suitable for sharing with anyone. While some were supportive and forthcoming in sharing the lists of their books, others got the project entangled in regular bureaucracy. Some could not share the complete catalogue or metadata because they did not have the staff to maintain this data. In some of these cases, Technical Assistants were onboarded to go through the physical copies of the books and record the metadata.
The objective was to include all aspects of the metadata of the books: title, author, year of publication, and publisher, along with details such as accession or classification number of the book wherever applicable (especially if the book is in a library). However, in most cases, it has been difficult to collect all this information for some of the reasons stated above.
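Gaps of this kind can at least be measured. The sketch below counts how many records lack each field. The field names and sample records are placeholders, not figures from the project.

```python
# Hypothetical field names for the metadata described above.
FIELDS = ("title", "author", "year", "publisher", "accession_number")


def missing_field_counts(records):
    """Count, for each field, how many records have no value for it."""
    counts = {name: 0 for name in FIELDS}
    for record in records:
        for name in FIELDS:
            # A field counts as missing if it is absent, None, or empty.
            if record.get(name) in (None, ""):
                counts[name] += 1
    return counts


# Placeholder records, not real catalogue data.
records = [
    {"title": "Example A", "author": "Author A", "year": 1950,
     "publisher": "", "accession_number": None},
    {"title": "Example B", "author": None, "year": None,
     "publisher": "Press B"},
]
report = missing_field_counts(records)
print(report)
```

A report like this can show which sources need a follow-up visit or a second round of cataloguing.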
Theoretical Implications
As a minor language in India and as a partitioned language and literary tradition, Sindhi needs greater presence on the Internet. Sindhi Sanchaya is envisioned as a comprehensive and interactive online database of Sindhi literature connecting the communities of readers, researchers and enthusiasts to Sindhi texts scattered in different places online as well as offline. The transliteration of the database into 3 scripts (Perso-Arabic, Devanagari, and Roman) remains an elusive goal because of constraints of time, financial support, and personnel who understand OCR mechanisms. But perhaps initiatives such as Sindhi Sanchaya that take the angle of dissemination seriously can help the language and the community get there.
The point is that low-resource languages must go through several iterations of smaller datasets before they can pursue larger goals such as OCR, transliteration, or LLMs. It is hoped that with a larger presence, Sindhi gets some attention and the technology becomes an easier problem to solve. The infrastructure needs to exist before more sophisticated tools can be developed.
Conclusion
While there is enthusiasm about Sindhi cuisine or folklore among the Sindhi online communities, there is hardly any conversation about Sindhi as a language of knowledge and aesthetics. PG Sindhi Library, Sindhi Halchal Archive, and Sindhi Sanchaya attempt to address the creative and epistemological energies of Sindhi. While the Sindhi Halchal Archive shows us the economic networks that sustained Sindhi publishing, Sindhi Sanchaya maps the institutional and individual networks that preserved it. Together, they’re creating a clearer picture of how Sindhi literature survived and traveled after Partition. Let’s see what else can be done.
