Cracking the Oracle Bones: the Yinxu Oracle Bone Inscription Digital Database (Part 1)


In April, the Zhonghua Book Company (中華書局) launched the (subscription-only) Yinxu Oracle Bone Inscription Digital Database 殷墟甲骨文數據庫 (YOD), containing over 143,000 Shang dynasty (c. 1600-1046 BCE) oracle bone inscriptions. In terms of quantity, the YOD is now the most extensive digitized collection of inscriptions. So far, however, little has been made public regarding its genesis and development. The website describes the project as a collaborative effort between the Zhonghua Book Company’s subsidiary Gulian (古聯) and the oracle bone scholar Professor Chen Nianfu 陳年福, based at Zhejiang Normal University. Whereas other oracle bone databases, such as Academia Sinica’s free Oracle Bone Inscriptions and Rubbings Database (甲骨文拓片資料庫), offer detailed information about their digitization process, sources, and contributors, the YOD provides no such information. Scholars of Early China, however, will likely greet this news with blank faces: as the digitization of Early Chinese texts is increasingly handled by private companies, methodological opacity is becoming commonplace.

In this two-part series, we’ll walk through some of the YOD’s main features. In the process, I’ll try to identify some of the main challenges facing oracle bone digitization and think through their implications for future projects. The first article will look at issues relating to images, inscriptional “themes” (主題), interpretations, and fonts. Then, in part two, I’ll discuss the “Original Text Search” (原文檢索), “Oracle Bone Dictionary” (甲骨字典), and “Oracle Graph Input Assistant” (甲骨文輔助輸入) features.

What are Oracle Bone Inscriptions?

Shang oracle bone inscriptions are the inscriptional records of pyromantic divinations conducted by the religious elite of the ancient Chinese Shang 商 dynasty. During divination, hollows drilled into one face of a bone plate (see below, right) – typically a turtle shell or ox scapula – were subjected to intense heat. The rapid expansion of the bone produced cracks on the reverse face (see below, left), which were then interpreted as auspicious or inauspicious omens in relation to a “charge” (e.g., “It will rain tomorrow”). However, it was not until the reign of King Wu Ding 武丁 in the late Shang (c. 1200 BCE) that records of these divinations were first inscribed onto the divination bones themselves. Indeed, these records are the earliest known examples of Chinese writing proper.

Oracle bone Heji 9658 rubbings

The most detailed inscriptions include the date of the crack-making (e.g., “the 3rd day of the 60-day cycle”), the diviner’s name, the charge, the king’s prognostication based on his interpretation of the cracks, and the divination outcome (e.g., “it really did rain the next day”). Because the Shang divined only on the most pressing matters – military alliances, invasions, harvests, and illnesses, among others – these records have become an invaluable source for Early Chinese historians. And since there are hundreds of thousands of known inscriptions, digital corpus analysis may offer a workable way to manage the superabundant data. So, with introductions out of the way, let’s crack on…

Inscriptional Context and Topic Tags 

If you’ve cleared the first hurdle of convincing your institution to purchase a subscription, at the welcome screen you’ll be greeted with the database’s five main features: “Browse Inscriptions” (卜辭瀏覽), “Search Interpretations” (釋文檢索), “Search Original Text” (原文檢索), “Oracle Bone Dictionary” (甲骨字典) and an inscription number search box.

Database homepage

Clicking “Browse Inscriptions” returns a list of every inscription in the database – 14,578 pages of them.

Browse inscription function

Numbers aside, no other oracle bone database currently provides quite as many powerful filters as the YOD: published collections, excavation sites, institutional and personal collections, reassembled bones, diviner and orthographic groupings, and 32 inscriptional “themes” ranging from “tribute” to “illness and dreams.”

Clicking an inscription from the list opens a viewer displaying the bone rubbing/image/hand-drawn reproduction, its inscriptions, and several useful tools (rotate, invert black and white, zoom). Clicking “Add to Collection” (收藏) saves the current bone to a handy list, while “Add to Comparison Window” (加入對照欄) lets the user view multiple bones in the same window.

Bone viewer

One of the YOD’s best structural features is that, like the Academia Sinica database, it’s organized around high-quality images/rubbings/reproductions of whole bones. Compared to the CHinese ANcient Texts Project (CHANT) database, which provides only cropped pictures limited to single inscriptions (see below), viewing an inscription in the context of the whole bone is probably the more convenient option.

CHANT Database inscription viewer

I say this because oracle bones typically contain multiple divinations on the same events. If you’re struggling to interpret a short inscription, returning to the original bone plate can thus provide key contextual clues. For example, if you’re faced with a pithy inscription containing the graph qiang 羌, and you want to know whether it writes the name of a Shang royal ancestor or the term for “captive,” you can examine surrounding inscriptions on the same bone to determine the most plausible reading.

However, this feature of oracle bones also brings into view a slight caveat with the inscriptional “theme” search mentioned above. Let’s take a look at two inscriptions from the same bone:

In the first, the diviner Bin tests whether the king should order Lord Qiang to attack the enemy polity Zhu. Although the second inscription is very short, it clearly refers to the same events. In fact, it is a “subcharge” concerning the attack’s timing. Short “subcharges” are a common oracle bone feature and typically address specific details of the main “charge.” While the editors tag the first inscription with the themes “commanding,” “lords,” “military affairs,” and “enemy states,” they tag the second with only “commanding” and “lords.” Hence, a search for “military affairs” will not return an exhaustive list of relevant inscriptions – it will return only those containing keywords associated with the topic. A researcher filtering by a specific theme would have to return to the original bone to check for other topically related inscriptions. Even then, a bone fragment containing only a “subcharge” with no explicit theme may elude the initial theme search entirely. For the time being, then, the theme filter should probably be treated more as a quick and convenient starting point for investigations. One suggestion for future digitization efforts would be to first group inscriptions from the same bone into “thematic clusters” consisting of “charges” and their associated “subcharges.” Developers could then assign these clusters a “collective theme tag” to ensure “subcharges” don’t fall between the cracks of the search engine.
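To make the “thematic cluster” suggestion concrete, here is a minimal Python sketch. It does not reflect how the YOD actually stores or tags its data, which is not public: the Inscription record, the cluster label, the identifiers “bone-A/1” and “bone-A/2”, and both function names are hypothetical, invented purely for illustration. The idea is simply that every inscription inherits the union of the themes assigned to its cluster on the same bone.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass
class Inscription:
    # All field names are hypothetical; they are not the YOD's schema.
    inscription_id: str
    bone_id: str
    themes: Set[str] = field(default_factory=set)
    cluster: Optional[str] = None  # editor-assigned label linking a charge to its subcharges


def collective_themes(inscriptions: Iterable[Inscription]) -> Dict[str, Set[str]]:
    """Map each inscription id to its own themes plus those pooled from its cluster."""
    inscriptions = list(inscriptions)
    pooled: Dict[Tuple[str, str], Set[str]] = {}
    for ins in inscriptions:
        if ins.cluster is not None:
            pooled.setdefault((ins.bone_id, ins.cluster), set()).update(ins.themes)
    return {
        ins.inscription_id: ins.themes | pooled.get((ins.bone_id, ins.cluster), set())
        for ins in inscriptions
    }


def search_by_theme(inscriptions: Iterable[Inscription], theme: str) -> List[str]:
    """Return ids of inscriptions whose collective theme tags include `theme`."""
    inscriptions = list(inscriptions)
    tags = collective_themes(inscriptions)
    return [ins.inscription_id for ins in inscriptions if theme in tags[ins.inscription_id]]


# The charge and subcharge discussed above, with the themes the editors assigned.
charge = Inscription(
    "bone-A/1", "bone-A",
    {"commanding", "lords", "military affairs", "enemy states"},
    cluster="attack-on-Zhu",
)
subcharge = Inscription("bone-A/2", "bone-A", {"commanding", "lords"}, cluster="attack-on-Zhu")

results = search_by_theme([charge, subcharge], "military affairs")
print(results)
```

With the cluster label in place, a search for “military affairs” returns the subcharge as well as the charge, even though the subcharge’s own tags never mention it. The hard part, of course, is the editorial work of deciding which inscriptions belong together; the code only shows what a search engine could do once that grouping exists.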

Reading Between the Cracks: Interpretations

The boxes on the right-hand side of the viewing window display the “original text” (原文) and “interpretation” (釋文) fields in a special, copyrighted font developed by the Zhonghua Book Company which users must download before browsing. The “original text” field ostensibly reproduces the original graphs while the “interpretation” field displays the words the interpreter believes the bone graphs write. Without getting too bogged down in linguistics here, a single oracle bone graph can write several words, and different bone graphs can be used to write the same word.

But sometimes – especially where context is lacking – identifying graphs can be challenging, let alone determining what words they write. Indeed, there are many cases where a graph might plausibly write several different words. For this reason, oracle bone scholarship typically provides generous footnotes filled with interpretive justifications. The Academia Sinica oracle bone database addresses this issue by providing a window for the more ambiguous inscriptions which contains references to different published interpretations:

Academia Sinica’s interpretation sources

YOD transcriptions appear to be based on the scholarship of the co-developer, Professor Chen Nianfu, but it’s not always clear. The user will sometimes come across interpretations identical to those already published. For the sake of transparency, it would be helpful to know if similarities between the YOD and published interpretations result from the YOD’s digitization process (e.g., OCR-scanning the aforementioned published collections) or the independent evaluation of the YOD’s researchers. Going forward, therefore, oracle bone databases might consider following Academia Sinica, either by listing their sources or by explicitly stating where no sources are used. In addition, providing multiple published interpretations for problematic inscriptions would both alert users to inscriptional ambiguities and allow them to quickly compare possible readings to form their own judgments where appropriate.

Oracle Bone Inscription Fonts

My colleague Maddalena has already addressed issues relating to fonts and Early Chinese manuscripts, much of which also applies to oracle bone fonts. Creating an oracle bone font is no easy task. There are many oracle bone graphs for which the modern equivalent is a non-regular character (one with no standard regular-script, or kaishu 楷書, form) and others for which we have no modern equivalent at all. In the first case, the YOD provides its own standardized version of the character (built into its specialized font) in the “interpretation” field, and in the second, it reproduces the original bone graph. Hence, even though there is an option to render interpretations in a copy-and-pasteable format, unless your browser uses a highly specialized font (not provided by the YOD) this function will return blank boxes for certain codepoints (see below).

For the time being, it seems the only workaround is to find a published version of the non-regular or oracle bone graph and paste it into your word processor as an image.
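If you want to know in advance which characters in a copied transcription are likely to turn into blank boxes, a quick check along the following lines may help. This is a sketch under a loud assumption: I do not know how the YOD encodes its non-standard characters, and the code simply supposes that at least some of them sit on Unicode Private Use Area or unassigned codepoints, which only a custom font can render. The function name flag_nonportable and the sample string are my own. Rare but properly encoded characters that your fonts happen not to cover would not be caught.

```python
import unicodedata


def flag_nonportable(text):
    """List (position, codepoint, reason) for characters that need a custom font.

    Assumption: problem characters are encoded as private-use ("Co") or
    unassigned ("Cn") codepoints. This is a guess, not documented YOD behavior.
    """
    flagged = []
    for position, char in enumerate(text):
        category = unicodedata.category(char)
        if category == "Co":
            reason = "private use"
        elif category == "Cn":
            reason = "unassigned"
        else:
            continue
        flagged.append((position, f"U+{ord(char):04X}", reason))
    return flagged


# Invented sample: two ordinary characters around one private-use codepoint.
sample = "王\ue000貞"
flagged = flag_nonportable(sample)
print(flagged)
```

Anything the function flags is a candidate for the image workaround described above; anything it passes should at least be a standard Unicode character, whether or not your current font can display it.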

So, what about the “original text” field? Unlike the “interpretation” field, there is no way to copy-and-paste the “original text” field – doing so results in a string of random characters. As mentioned above, this is because the YOD employs its own oracle bone font which can’t be used outside of the database. For over two decades, proposals have been made for an official, Unicode oracle bone font. The Chinese Foundation for Digitization Technology made the most recent proposal in 2019. While Unicode has tentatively allocated space for it in its “Roadmap to the TIP (Tertiary Ideographic Plane),” the proposal has not yet been formally accepted.

But what might this Unicode font look like in practice? Here, I believe the YOD might offer us a glimpse into the future. True, the YOD’s oracle bone font does not exactly reproduce bone graphs – the developers standardize the graphs, assigning nearly-identical bone graphs to the same character. That said, it’s hard to imagine how an oracle bone font would even be possible without some standardization. Even when the YOD does standardize, it does an impressive job of distinguishing between almost-identical variants – consider the ten (!) variant forms provided for ren 人 (“person”):

YOD’s bone graph font: variant forms of 人 “person.”

The question is: in the context of the database, does this standardization really matter? The answer, I think, depends on how a researcher uses the YOD. For example, a user investigating bone graph evolution could not rely on the font alone. And yet, they could find a workaround by filtering inscriptions according to chronological diviner groups and going back to examine the original bone plates. For questions less concerned with orthography, the oracle bone font allows the user to quickly make sense of an inscription without having to constantly return to the original images. As we’ll see in the second part of this series, the font also introduces the exciting possibility of searching by oracle bone graphs and their variants. As a research tool, at least, I think it’s fair to say the font represents a remarkable feat of digitization and points the way forward for the development of oracle bone fonts.

Looking back, I think the main issue raised by our discussion of images, theme tags, interpretations, and fonts is the importance of methodological transparency. Of course, the risk with any database working with texts over three thousand years old is that its glossy interface inadvertently sweeps aside important uncertainties or ambiguities. However, each of the YOD features examined above also reflects exciting trends in digitization which will surely open the door to new kinds of methodological approaches to oracle bone analysis. In the second part of this series, I’ll discuss how the “dictionary,” “oracle graph input,” and search features contribute to this overall project and set new standards for future digitization projects.
