The Digital Orientalist

Practical examples and theoretical reflections on the do's and don'ts of using digital tools for your study and research in African and Asian Studies.

ISSN: 2772-8374

Tag: LLM

To Merge or Not to Merge: The Pitfalls of Chinese Tokenization in General-Purpose LLMs
Chinese Language, DH in Practice, LLM, New Post, Sinology

Tokenization, the process of transforming human text into machine-understandable units of meaning (tokens), is a foundational step in language modeling. …
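
To make the idea of "units" concrete, here is a minimal Python sketch that splits one short Chinese string in two different ways: one unit per character, and one unit per UTF-8 byte (the base alphabet that byte-level tokenizers start from before any merging). The sample string and variable names are assumptions chosen for illustration; this is not the method described in the post, and no real LLM tokenizer is called.

```python
# Illustrative only: the sample text and names below are assumptions,
# not taken from the post, and no real LLM tokenizer is involved.
text = "数字人文"  # "digital humanities"

# View 1: one unit per Unicode character.
char_units = list(text)

# View 2: one unit per UTF-8 byte. Byte-level tokenizers start from
# this alphabet and then merge frequent byte sequences into tokens.
byte_units = list(text.encode("utf-8"))

print(len(char_units), char_units)
print(len(byte_units), byte_units)
```

Each of these four characters occupies three bytes in UTF-8, so the same string is 4 units in the first view and 12 in the second. How many tokens a given model ends up with depends on which merges its tokenizer has learned.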

Cognitive Stylometry as an LLM-Based Methodology in Chinese Literary Studies
Chinese Language, DH in Practice, Machine Learning, New Post, Sinology, Textual Analysis, Theory

Recent years have witnessed a wave of fascinating research findings in both natural and artificial intelligence. While scholars remain cautious …
