JieZi: A Large-Scale Expert-Audited Dataset and Benchmark for Ancient Chinese Character Exegesis

AI-powered exegesis of ancient Chinese characters

JieZi Framework

Abstract

The scholarly exegesis of ancient Chinese characters demands integrating visual observation, linguistic analysis, and historical context. However, existing computational approaches focus narrowly on subtasks such as character recognition and retrieval, and lack the structured datasets and benchmarks required for comprehensive scholarly analysis. To address this limitation, we introduce Ancient Chinese Character Exegesis (ACCE), a vision-language question answering (VQA) task that models the scholarly exegesis process. ACCE is organized into four progressive levels: basic character identification, glyph-form analysis, meaning exegesis, and diachronic evolution analysis. To support this task, we construct two complementary resources. JieZi-Dataset is the first large-scale, expert-audited VQA training dataset for ACCE, comprising over 500K QA pairs. It is constructed via a pipeline that reduces factual errors by constraining generation with expert-designed templates and source-text references. Human verification is further applied at each key stage to ensure scholarly accuracy. JieZi-Bench is an evaluation benchmark aligned with the exegesis process, constructed and verified by human experts to ensure evaluation reliability. It consists of four levels, with reference answers curated from authoritative lexicographic works that are held separate from the training data. Experiments on multimodal large language models show that current models perform well on basic identification but struggle with glyph analysis, semantic reasoning, and diachronic understanding. Fine-tuning on JieZi-Dataset substantially improves performance across all four levels. Code and dataset are available at https://github.com/Ran00w/JieZi.

Dataset Overview

JieZi comprises four progressive levels of ancient Chinese character exegesis, from basic identification to diachronic evolution analysis, spanning 500K+ expert-audited VQA pairs across 7 scripts.

500K+ VQA Pairs
4 Progressive Levels
7 Scripts
130K+ Images
Sample images (five per script):
OBI 甲骨文
Bronze 金文
Warring States 战国文字
Seal 篆书
Clerical 隶书
Regular 楷书
Cursive 草书
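
To make the structure described above concrete, the following is a minimal sketch of how one VQA pair might be represented in Python, assuming each record carries an image, one of the seven scripts, one of the four ACCE levels, a question, and an answer. The class names, field names, file paths, and the `count_by_level` helper are hypothetical illustrations, not the schema used in the released JieZi files; consult the repository for the actual format.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """The four progressive ACCE levels named in the abstract."""
    IDENTIFICATION = 1
    GLYPH_ANALYSIS = 2
    MEANING_EXEGESIS = 3
    DIACHRONIC_EVOLUTION = 4


class Script(Enum):
    """The seven scripts listed in the dataset overview."""
    OBI = "甲骨文"
    BRONZE = "金文"
    WARRING_STATES = "战国文字"
    SEAL = "篆书"
    CLERICAL = "隶书"
    REGULAR = "楷书"
    CURSIVE = "草书"


@dataclass(frozen=True)
class VQARecord:
    """Hypothetical record layout; field names are assumptions."""
    image_path: str
    script: Script
    level: Level
    question: str
    answer: str


def count_by_level(records):
    """Return the number of records per level, including empty levels."""
    counts = Counter(r.level for r in records)
    return {level: counts.get(level, 0) for level in Level}


# Placeholder records for illustration only; not taken from the dataset.
records = [
    VQARecord("img/placeholder_1.png", Script.OBI, Level.IDENTIFICATION,
              "Which modern character and script is this?", "<answer>"),
    VQARecord("img/placeholder_2.png", Script.SEAL, Level.GLYPH_ANALYSIS,
              "Describe the components of this glyph.", "<answer>"),
]

for level, n in count_by_level(records).items():
    print(level.name, n)
```

Modeling levels and scripts as enumerations keeps per-level and per-script breakdowns simple, which matches how results are reported across the four levels.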

ACCE Task Demo

Explore the four progressive levels of Ancient Chinese Character Exegesis (ACCE).

Level 1: Identify the modern head character (字头) and the script type corresponding to the ancient character.