The Shang Dynasty Has An Unexpected Ally: AI

This account is based on the articles linked throughout and listed at the end of the post. I thank Kevin Huang for sharing them with me.

The 1920s were a pivotal period for scholars interested in the emergence of writing in Chinese history, as that decade marks the first archeological excavations to recover oracle bone inscriptions (甲骨文), i.e., animal bones used in divination rituals with writing on them. Quite a few of these today present themselves as well-composed artifacts, such as this one, ready for us to read and study. But as the image clearly shows, the original state of these bones was fragmentary. Of the 160,000 Shang and Zhou dynasty oracle bones, 90% were in fragments. Highly trained scholars joined these fragments (綴合) by collating them one by one, making sense of the edges and the writing.

The fragmentary nature of this evidence was challenging not only because of the need to realign all the pieces. It also made each piece easier to move and/or lose. The first discoveries date a bit earlier than 1920: around 1899, locals in Anyang had found some fragments, thought of them as “dragon bones,” and seemingly smashed and sold some to cure malaria and other diseases, until imperial scholar Wang Yirong 王懿榮 came into possession of these fragments and understood their nature. Even after this realization, archeological excavations were slow to secure the area and were interrupted several times by the tumultuous events that took place in China in the first half of the 20th century. As a result, a lot of archeological evidence was looted and/or sold by individuals in financial distress.

It also meant that when scholars began to catalogue these fragments, they were facing a real headache. Unlike finds from archeological excavations, where everything is systematically catalogued, pieces that were either looted or excavated but then sold were scattered all over the place, including Europe. For some, rubbings existed, but the physical fragment was not publicly available (e.g., it was in a private collection rather than a museum). Databases such as the Zhui yu lian zhu 缀玉联珠 (a metaphor by Tang Emperor Xuanzong 宣宗 for beautiful poetic writing), recently made available online, are extremely important for scholars to see which fragments have been matched, and to what. (In order to use the database, you need to know the collection name and number assigned to a fragment.)

Now, artificial intelligence can help with this process. A program named “Diviner,” created by Microsoft in collaboration with the Center for the Study of Oracle Bone Inscriptions at Capital Normal University 首都师范大学甲骨文研究中心, is being trained to identify fragments’ shapes, in order to assist scholars in identifying them. For example, it recognizes fragments that have been published several times, by different scholars, as one single element. Or it identifies as one fragment what was originally thought to be two different pieces. In the example with elements #25224 and #22629 in this article introducing Diviner, #25224 is a rubbing of an inscription, of which only a partial fragment remains, namely #22629 (this may be due to losses after the rubbing was made; preserving this material is a delicate operation).

In other words, Diviner is helping scholars compare fragments and find duplicates (校重), or match fragments to rubbings. I could not find detailed descriptions of how Diviner was trained, but from the examples given in the article I am guessing that it is not trained to identify characters or read the inscriptions; it focuses on the features of rubbings and fragments, matching the graphs line by line (without knowing, however, that they are graphs). So far, a relatively small number of new fragments have been discovered; you can find a list here, with more detailed examples of what Diviner can do.

For those worried about AI replacing human beings: far from it! Diviner is a very fast, never-tiring assistant, as the articles describe it. Reading and making sense of these ancient writings can only be done by humans, especially when the textual content is unattested. But the application of AI in the field of excavated documents is exciting, and can potentially be extended to other materials, such as bamboo strips, wooden tablets and stone tablets, which are also prone to fragmentation. The database for Pre-Qin research, 先秦史研究室, includes a few examples of fragments joined by scholars. It is a fascinating line of inquiry, with the potential to yield more accurate research, but also to increase the digitization of the material and the possibility of it being open-access.

___

Articles used for this post: 

90后博士用大数据为甲骨残片“拼图” (A post-90s PhD uses big data to “piece together” oracle bone fragments)

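人工智能开启甲骨文整理研究新范式 (Artificial intelligence opens a new paradigm for the collation and study of oracle bone inscriptions)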
