Online Resources for Chinese Palaeography – Part One

In the field of early China, the end of the 20th century will be remembered by many as a time that revolutionized the study of ancient Chinese texts. In the early 1970s, the world was mesmerized by the recovery of silk paintings and texts, among many other objects, from the archeological site now known as Mawangdui 馬王堆 (Hunan province; most items are displayed at the Hunan Provincial Museum 湖南省博物館). In 1993, while working on a tomb near Guodian 郭店 (Hubei province) after its looting, archeologists found bamboo strips dating to the 3rd century BCE, now one of the most famous manuscript collections in the field. Increasingly, manuscripts have become prized by scholars in several disciplines, while also attracting the attention of looters. These discoveries of ancient texts are by no means the earliest in Chinese history, but, for the first time, the quantity of manuscripts, as well as the technologies now available to scholars (such as HD photography and the creation of e-books), make more than 300 manuscripts accessible to scholars all around the world.

Part of the excitement generated by these manuscripts lies in their dating: several corpora predate the foundation of the Qin empire in 221 BCE, when the Chinese script was standardized as a result of the reforms proposed by Chancellor Li Si 李斯 (c. 289–208 BCE). They represent, therefore, an invaluable set of data for the study of the scripts and sounds of ancient Chinese. And this is where the digital humanities have come to the rescue of scholars eager to immerse themselves in the growing number of manuscripts by navigating their content.

Reproduction of Li Si’s portrait. Unknown Author. Credit: Alice Panato (Instagram @a.li.ce.p).

The Chinese Writing System: Background

Several websites launched by institutions or centers of research allow us to search the content of manuscripts. Understanding the nature of the Chinese script is a fundamental prerequisite for using these websites meaningfully. Ancient Chinese was written in a logographic script, in which the representation of a word (i.e., the graph) may or may not include its phonetic value. In addition, two words sharing the same phonetic value can be represented by the same graph for that value. To use an example from modern China, the word “new”, xin 新, could be written with the character xin 心 (meaning “heart”) in light of their shared phonetic value “xin” (this is one of the many ways in which writers attempt to bypass censorship). The same applied to ancient China: the graph you 又 (see the image below, from the Da Yu ding 大盂鼎 bronze inscription) can represent several words with the same phonetic value, the most common being “also” and “to have” (reconstructions in Old Chinese follow the system by Bill Baxter and Laurent Sagart):

又 yòu < *ɢʷəʔ-s, “also.”

有 yǒu < *[ɢ]ʷəʔ, “to have.”

The graph you 又 from the Da Yu ding 大盂鼎 bronze inscription.
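The one-graph, several-words relationship just described can be sketched as a small lookup table. This is only an illustration in Python: the entries are the two words quoted above, and the names `ENTRIES`, `words_by_graph` and `TABLE` are hypothetical, not taken from any existing database.

```python
from collections import defaultdict

# Each entry pairs a graph with one word it can write, using the readings
# and Baxter-Sagart reconstructions quoted in the text above.
# Layout: (graph, modern reading, Old Chinese reconstruction, gloss)
ENTRIES = [
    ("又", "yòu", "*ɢʷəʔ-s", "also"),
    ("又", "yǒu", "*[ɢ]ʷəʔ", "to have"),
]


def words_by_graph(entries):
    """Group the candidate words under the graph that writes them."""
    table = defaultdict(list)
    for graph, reading, old_chinese, gloss in entries:
        table[graph].append((reading, old_chinese, gloss))
    return dict(table)


TABLE = words_by_graph(ENTRIES)

# A reader who meets the graph 又 must choose among these candidates.
for reading, old_chinese, gloss in TABLE["又"]:
    print(f"又 may write {reading} < {old_chinese}, '{gloss}'")
```

The table makes the reader's task explicit: the graph alone does not settle which word is meant, so context has to decide among the candidates.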

Again, a word can be written with a graph that would normally represent a different word, because the two share a phonetic value. Both possibilities are features of the ancient Chinese language that make its study so exciting and so frustrating at the same time. When navigating websites that collect data from ancient manuscripts, scholars depend on the decisions about the meaning of graphs that were made when these websites were built. Let us explore one of these websites and see its potential as well as its limitations.

The Center of Bamboo Silk Manuscripts

The BSM is a center based at Wuhan University 武漢大學, one of the main centers for research on excavated manuscripts in Mainland China. Among the many resources this website offers, there is a searchable database of pre-imperial and imperial sources, which is constantly, albeit slowly, being updated. The user can search for a single word (dan zi 單字) or a radical (pian pang 偏旁) and restrict the search to specific texts or corpora:

The results show a selection of 10 graphs (a subscription is required for more) from the selected material singled out individually, along with the title of the manuscripts from which these were taken (in light yellow) and the strip numbers (in light blue):

The potential of such a database is twofold: first, it allows scholars interested in reading the manuscript itself to quickly identify the strip numbers. Quotes that may not include strip numbers can thus be quickly located. Second, it gives at-a-glance visualizations of how the same word was written in different texts and by different scribes.

There are two tricky aspects in using this kind of resource. First, individual decisions on how to represent a word differ. Consider the following example. Two manuscripts of the same text, the Natural Dispositions come from Endowment (Xing zi ming chu 性自命出) and Discussions on Natural Dispositions and Emotions (Xing qing lun 性情論), open with the sentence “[Natural] dispositions await on externalities and then arise” (Dai wu erhou zuo 待物而後作). Yet, searching for the word dai 待 “to await” in the BSM database for these two manuscripts leads to no results:

This is because whoever is responsible for matching the graph as it appears on the strips to its typable Modern Chinese equivalent decided to do so according to what the graph actually represents, namely si 寺, and not the word that is written (as said, dai 待). Searching for 寺 in fact yields the expected results:

The second aspect is related to the same problem: the variety of opinions in interpreting the manuscript. To remain within the manuscript Natural Dispositions, the graph (strip 7) is composed of {亻+長}, which the BSM database presumes to represent the word “to grow”. It therefore shows up through a search for chang 倀:

However, the interpretation of this graph is still subject to debate, and as such it does not appear through a search for 倀 in two other major online resources for ancient Chinese manuscripts, HUMANUM 漢語多功能字庫 and the Open Ancient Chinese Characters Glyphs Database 開放古文字字形庫. I will discuss this further in the next post.
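The first difficulty, searching by word when the database is indexed by graph, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the BSM database's actual schema: the `GraphRecord` class, its field names, and the `search` function are all hypothetical, and the two records simply restate the 待/寺 example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphRecord:
    manuscript: str  # title of the manuscript
    graph: str       # direct transcription of what is written on the strip
    reading: str     # word the editors take the graph to write


# Hypothetical records modelled on the example above (strip numbers omitted).
RECORDS = [
    GraphRecord(manuscript="性自命出", graph="寺", reading="待"),
    GraphRecord(manuscript="性情論", graph="寺", reading="待"),
]


def search(records, query, field="graph"):
    """Return the records whose chosen field equals the query."""
    if field not in ("graph", "reading"):
        raise ValueError("field must be 'graph' or 'reading'")
    return [record for record in records if getattr(record, field) == query]


# Searching for the word finds nothing when the index is keyed on the graph.
print(len(search(RECORDS, "待")))
# Searching for the graph finds both manuscripts.
print(len(search(RECORDS, "寺")))
# An index keyed on the interpreted word would find them as well.
print(len(search(RECORDS, "待", field="reading")))
```

In this sketch the three searches return 0, 2 and 2 records respectively. The second difficulty lies in the same place: the value stored in `reading` is an editorial interpretation, so two databases that disagree on it will return different results for the same query.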

Both of these aspects can be a real headache even for the most experienced scholars working with manuscripts (let alone students who are just starting to approach this area of study), and are two of the reasons that a complete grasp of this material requires years of work with manuscripts, copying them manually (a training Chinese palaeographers undergo), and a solid grasp of the Chinese writing system. They significantly slow the pace of work: understanding a single sentence can easily take a few days. Although these websites are routinely updated, the addition of new material takes precedence over the revision of previous identifications, and not without good reason: understanding these manuscripts is hard work. The earlier these manuscripts are made available to the public, the more people can study them and contribute to their decoding.

6 thoughts on “Online Resources for Chinese Palaeography – Part One”

  1. Dear Maddalena Poli,
    Thank you for your precious evidence.
    Where I see that a common practice to ‘reduce’ a character to its phonetic element in excavated texts is also fairly common, here, knowing this possibility, we can refer to the linked menu below:
    http://www.bsm.org.cn/zxcl/index_cl.php

    to see what the ‘glyph-giver’ prompt to (and this could be your answer?):
    上博性情論1號簡~(待)兌(悅)而句(後)行
    上博性情論1號簡~(待)習而句(後)〇(奠)

    1. Dear Prof. Galassi, thank you for this. Yes, hovering over the thumbnails gives the context and how the graph has been interpreted – something I will discuss in the next entry for this amazing online journal. I do not think of it in terms of an answer, since there is nothing wrong in using 寺 to identify that word. Si 寺 is, after all, how the word was written in the Chinese writing system of the time. I wanted to point out, however, what can be seen as a discrepancy, especially by students or scholars first approaching manuscripts, who are not yet clear on the distinction between graphs and words. And they may indeed be looking for an answer.
