Imagine an immense library housing thousands of Arabic, Persian and Ottoman manuscripts: works on medicine, astronomy, poetry, philosophy, and law. The traditional catalogue card, with its predefined fields (such as Author, Title, Date, Place), is like a still image: it provides us with data, but kills the story. It tells us “who” and “what”, but rarely “why” and “how”. This approach, inherited from the colonial era and the birth of large Western museums and libraries, often reflects an “Orientalist” view in the sense described by Edward Said: a tendency to categorise, label and possess “Other” cultures, making them understandable and controllable through Eurocentric structures of thought. A 10th-century medical treatise is classified under “Arabic Science”, isolating it from the global scientific debate of which it was an active part. A Koranic commentary is separated from the historical and political context that generated it. The crucial question is this: can we use the most advanced technology not to perpetuate these rigid structures, but to challenge them? The answer, increasingly clear in the field of digital studies, is yes, and it comes from the Semantic Web and its ontologies.
In computer science, an ontology is not a treatise on metaphysics, but a formal model that describes a domain of knowledge by defining concepts (the “classes”), properties (the “relations”), and instances (the “specific objects”). Think of it as a hyper-precise mind map that machines can read.
The following examples illustrate the difference in information representation between a traditional catalogue and an ontology (Figure 1), with the latter representing data as a set of triples composed of a Subject (S), Predicate (P), and Object (O).
Traditional Catalogue
- Object: Manuscript A
- Author: Al-Razi
- Subject: Medicine
Ontology (Figure 1)
- Manuscript A (S) – was written by (P) – Al-Razi (O)
- Al-Razi (S) – was influenced by (P) – Galen (O)
- Manuscript A (S) – cites (P) – Manuscript B (located in another library) (O)
Figure 1 – Example of how data are organised in the Semantic Web, i.e. in triples composed of a Subject (S), a Predicate (P), and an Object (O).
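The triples of Figure 1 can also be written down in a few lines of code. What follows is only a minimal sketch in plain Python, not a real triple store: the names `Triple`, `match` and `influences_behind` are hypothetical and chosen for illustration, and the preposition “by” is attached to the predicate so that the object is simply the person's name.

```python
from collections import namedtuple

# One statement = one (subject, predicate, object) record.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# The three statements from Figure 1.
triples = [
    Triple("Manuscript A", "was written by", "Al-Razi"),
    Triple("Al-Razi", "was influenced by", "Galen"),
    Triple("Manuscript A", "cites", "Manuscript B"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def influences_behind(store, manuscript):
    """Follow two links: manuscript -> author -> whoever influenced the author."""
    found = []
    for written in match(store, subject=manuscript, predicate="was written by"):
        for link in match(store, subject=written.obj, predicate="was influenced by"):
            found.append(link.obj)
    return found


if __name__ == "__main__":
    print(influences_behind(triples, "Manuscript A"))
```

Asking for the influences behind Manuscript A walks from the manuscript to Al-Razi and on to Galen. A catalogue card has no field for that question, whereas in a graph it is simply a path of two links.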
The difference is enormous. The first is a label; the second is a network of narrative relationships. Ontologies allow us to move from data to contextual information, and from information to knowledge.
To build these dynamic networks, the global community has developed shared standards. The most important one for cultural heritage is the CIDOC Conceptual Reference Model (CRM). It is not a ready-made catalogue schema with fixed fields, but a formal, high-level ontology: a “skeleton”, a common language for describing events, roles, places, and times in the life cycle of cultural heritage. The CIDOC-CRM does not simply say “this is a book”, but models the idea that “this is a material object that participates in a creation event that was carried out by a person in the role of author”. To bring this abstract scheme to life, ontology editors such as Protégé are used, which act as a veritable laboratory for the cultural data modeller. It is in this digital space that classes are created (e.g. “Manuscript”, “Philosophical Concept”, “Historical Event”, representing the subjects and objects of the triple), properties are defined (such as “was written by”, “was influenced by”, “cites”, representing the predicates of the triple), and everything is populated with real data from the collections.
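The event-centred sentence above can be sketched as a small graph. This is an illustration under stated assumptions rather than an official mapping: the `ex:` identifiers are hypothetical, the split between the physical manuscript and the text it carries is one possible reading of the sentence, and the class and property labels follow CIDOC-CRM 7.x naming (older releases call E22 “Man-Made Object”), so they should be checked against the release actually in use. The author's role is left out for brevity.

```python
# Each statement is a (subject, predicate, object) tuple.
# "crm:" labels follow CIDOC-CRM naming; "ex:" identifiers are hypothetical.
graph = {
    ("ex:manuscript_A", "rdf:type", "crm:E22_Human-Made_Object"),
    ("ex:manuscript_A", "crm:P128_carries", "ex:text_A"),
    ("ex:text_A", "rdf:type", "crm:E33_Linguistic_Object"),
    ("ex:writing_of_text_A", "rdf:type", "crm:E65_Creation"),
    ("ex:writing_of_text_A", "crm:P94_has_created", "ex:text_A"),
    ("ex:writing_of_text_A", "crm:P14_carried_out_by", "ex:al_razi"),
    ("ex:al_razi", "rdf:type", "crm:E21_Person"),
}


def objects(graph, subject, predicate):
    """All objects reachable from `subject` through `predicate`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def subjects(graph, predicate, obj):
    """All subjects that point at `obj` through `predicate`."""
    return sorted(s for s, p, o in graph if p == predicate and o == obj)


def creators_of(graph, manuscript):
    """Walk manuscript -> carried text -> creation event -> actor."""
    people = []
    for text in objects(graph, manuscript, "crm:P128_carries"):
        for event in subjects(graph, "crm:P94_has_created", text):
            people.extend(objects(graph, event, "crm:P14_carried_out_by"))
    return people


if __name__ == "__main__":
    print(creators_of(graph, "ex:manuscript_A"))
```

Note that the person is never attached to the manuscript directly. The link always passes through the creation event, which is what later makes room for dates, places and several participants in the same event.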
Let’s take a concrete case and see how this methodology can rewrite the history of a collection of Arabic manuscripts. Consider a digital collection of scientific texts. A traditional approach would classify them into separate compartments: “Astronomy”, “Mathematics”, “Medicine”. With an ontology based on CIDOC-CRM, we can instead model a completely different and much richer history. Instead of a monolithic “Astronomy” class, we can model dynamic concepts such as Planetary Theory, Latitude Calculation and Astronomical Instrument. We can then show how the same concept of Planetary Theory appears in an Arabic manuscript, is discussed and modified in extensive correspondence between scholars in Baghdad and Cordoba, and, finally, influences a medieval Latin treatise. The focus thus shifts from “where it is catalogued” to “how it travels and transforms” across cultures. We can create a property dedicated to paths, called, for example, “was transmitted through”. A manuscript can thus be transmitted through a Latin Translation in Toledo, an event that in turn involved Jewish, Christian and Muslim translators. This single event culturally links Arabic manuscripts from Spain to Latin manuscripts from Italy, creating a cultural pathway that separate catalogues tend to obscure. Furthermore, an ontology can model the complexity of authorship in ancient texts, moving beyond the idea of a single author. A work does not have a single Author, but may have been started by one Person, commented on by another, translated by a third, and copied by a fourth. Each agent has a specific Role in a specific Event, restoring the idea of collective and processual knowledge, rather than a monolithic legacy.
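The idea that each agent holds a specific Role in a specific Event, and that a work “was transmitted through” certain events, can be sketched as a small data structure. Everything here is hypothetical and for illustration only: the class names `Event` and `Work`, the `kind` labels, the work's title, and the placeholder people (“Person 1” and so on) are assumptions, not data from any real collection.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    kind: str            # e.g. "creation", "transmission", "production"
    place: str
    participants: dict = field(default_factory=dict)  # person -> role


@dataclass
class Work:
    title: str
    events: list = field(default_factory=list)  # in chronological order

    def agents_by_role(self):
        """Group everyone involved in the work's history by the role they played."""
        roles = {}
        for event in self.events:
            for person, role in event.participants.items():
                roles.setdefault(role, []).append(person)
        return roles

    def transmitted_through(self):
        """Names of the events through which the work travelled."""
        return [e.name for e in self.events if e.kind == "transmission"]


# Placeholder data: no real persons or dates are implied.
work = Work(
    title="Treatise on planetary theory",
    events=[
        Event("Composition", "creation", "unknown", {"Person 1": "author"}),
        Event("Commentary", "creation", "unknown", {"Person 2": "commentator"}),
        Event("Latin translation in Toledo", "transmission", "Toledo",
              {"Person 3": "translator", "Person 4": "translator"}),
        Event("Copying", "production", "unknown", {"Person 5": "copyist"}),
    ],
)

if __name__ == "__main__":
    print(work.agents_by_role())
    print(work.transmitted_through())
```

There is no single `author` field anywhere in the structure. Authorship emerges from the list of events, so adding a later commentator or copyist means appending an event rather than overwriting a value.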
This modelling power does not come, however, without challenges and profound responsibility. The biggest challenge is the risk of creating new biases. An ontology is a mirror: it reflects the questions, priorities, and worldviews of those who construct it. Deciding which relationships to model (for example, choosing between “influence” and “appropriation”) is an interpretative act fraught with consequences. It is therefore essential that interdisciplinary teams, composed of historians, philologists, and experts in the cultures of origin, work side by side with computer scientists at every stage of the process. The second challenge is technical in nature: interoperability. The ultimate goal is to connect data from different archives, museums, and libraries around the world. Standards such as CIDOC-CRM are fundamental to this global dialogue, but their implementation requires a coordinated effort and significant resources.
In conclusion, the use of ontologies for cultural heritage pathways, and in particular for Oriental studies, is not merely an exercise in cataloguing 2.0. It is a genuine paradigm shift. It is the transition from heritage seen as a museum of static objects, arranged in display cases according to fixed and often inherited categories, to heritage understood as a dynamic ecosystem of relationships. This approach allows us to dismantle the taxonomic “cages” of Orientalism, not to deny categorisations, but to show their limitations and provisional nature. It invites us to see collections of Oriental manuscripts not as exotic treasures to be admired in isolation, but as vibrant nodes in a global network of exchanges, influences, conflicts, and dialogues that have shaped world history. Ultimately, we are not just building a smarter archive. We are using the language of computational logic to restore history to its deepest truth: its inherently relational, complex, and multifaceted nature. And in this, perhaps, we find one of the most powerful antidotes to simplification and prejudice.
References
Berners-Lee, Tim, and James Hendler. “Publishing on the Semantic Web.” Nature 410, no. 6832 (2001): 1023–24.
Berners-Lee, Tim, James Hendler, and Ora Lassila. “The Semantic Web: A New Form of Web Content That Is Meaningful to Computers Will Unleash a Revolution of New Possibilities.” In Linking the World’s Information: Essays on Tim Berners-Lee’s Invention of the World Wide Web, edited by James Hendler and Wendy Hall, 91–103. Scientific American, 2023.
Berners-Lee, Tim, Wendy Hall, James A. Hendler, Nigel Shadbolt, and Daniel J. Weitzner. “Creating a Science of the Web.” Science 313, no. 5788 (2006): 769–71.
Biagetti, Maria Teresa. “An Ontological Model for the Integration of Cultural Heritage Information: CIDOC-CRM.” JLIS.it: Italian Journal of Library, Archives and Information Science 7, no. 1 (2016): 43–77.
Castelli, Lisa, Achille Felicetti, and Fabio Proietti. “Heritage Science and Cultural Heritage: Standards and Tools for Establishing Cross-Domain Data Interoperability.” International Journal on Digital Libraries 22, no. 3 (2021): 279–87.
Doerr, Martin, Christian-Emil Ore, and Stephen Stead. “The CIDOC Conceptual Reference Model: A New Standard for Knowledge Sharing.” In Tutorials, Posters, Panels and Industrial Contributions at the 26th International Conference on Conceptual Modeling, 51–56. ER ’07. Australia: Australian Computer Society, Inc., 2007.
ICOM CIDOC Documentation Standards WG. “CIDOC CRM.” Accessed November 25, 2025. https://www.cidoc-crm.org/versions-of-the-cidoc-crm.
Melo, Dora, Irene Pimenta Rodrigues, and Davide Varagnolo. “A Strategy for Archives Metadata Representation on CIDOC-CRM and Knowledge Discovery.” Semantic Web 14, no. 3 (2023): 553–84.
Stanford Center for Biomedical Informatics Research. “Protégé.” Stanford University. Accessed November 25, 2025. https://protege.stanford.edu/.
Tomasi, Francesca. “Archival Finding Aids in Linked Open Data between Description and Interpretation.” JLIS.it: Italian Journal of Library, Archives & Information Science 14, no. 3 (2023): 1–20.

