Welcome

aozora is a pure-functional Rust parser for 青空文庫記法 (Aozora Bunko notation) — the in-text annotation language used by 青空文庫, the long-running volunteer digital library of Japanese literature in the public domain.

It handles ruby (|青梅《おうめ》), bouten / bousen ([#「X」に傍点]), 縦中横, gaiji references (※[#…、第3水準1-85-54]), kunten / kaeriten, indent and align containers ([#ここから2字下げ]… [#ここで字下げ終わり]), and page / section breaks — every notation that appears in a real Aozora Bunko .txt source.
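To make the ruby notation above concrete, here is a minimal, self-contained sketch that pulls the base text and reading out of one explicit-delimiter span such as |青梅《おうめ》. It is an illustration only: the `Ruby` struct and `first_explicit_ruby` function are hypothetical names invented for this example, not the aozora crate's API, and the real parser covers far more cases (implicit base text, nesting, diagnostics).

```rust
/// One explicit-delimiter ruby annotation: `|base《reading》`.
/// Hypothetical illustration type; NOT part of the aozora crate's API.
#[derive(Debug, PartialEq)]
struct Ruby<'a> {
    base: &'a str,
    reading: &'a str,
}

/// Finds the first `|base《reading》` span in `src`.
/// Both fields borrow from the input, so no source bytes are copied.
/// Returns `None` if any of the three delimiters is missing.
fn first_explicit_ruby(src: &str) -> Option<Ruby<'_>> {
    // Byte offset of the vertical bar that opens the base text.
    let bar = src.find('|')?;
    let after_bar = bar + '|'.len_utf8();

    // The opening bracket of the reading, searched only after the bar.
    let open = after_bar + src[after_bar..].find('《')?;
    let after_open = open + '《'.len_utf8();

    // The closing bracket of the reading, searched only after the opener.
    let close = after_open + src[after_open..].find('》')?;

    Some(Ruby {
        base: &src[after_bar..open],
        reading: &src[after_open..close],
    })
}
```

Calling `first_explicit_ruby("|青梅《おうめ》")` yields a `Ruby` whose `base` is `青梅` and whose `reading` is `おうめ`; input without a complete span yields `None`.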

The repository contains no CommonMark or Markdown handling: aozora deals only with the 青空文庫 notation. The renderer emits semantic HTML5; the lexer reports structured diagnostics; the AST is a borrowed-arena tree that can be walked in O(n) without copying source bytes. If you want a Markdown dialect that also understands aozora notation, see the sibling project afm, which is built on top of this parser.

What this handbook is for

A practical tour and a deep reference, in one document.

Project shape

aozora is a single-author, green-field project that takes the opportunity to reach for the right algorithm and data structure for each problem rather than the obvious naive one. That orientation permeates every chapter: when you read about the scanner, the arena, or the gaiji table, you’ll find the reason each technique was chosen spelled out, not just what the code does.

Status

Released versions track GitHub Releases; the bindings — the CLI, the Rust library, WASM, the C ABI, Go, Python, and the Extism host-SDK — all build and pass CI smoke tests. Public crates.io publication is gated on the v1.0 API freeze; in the meantime, depend on a tagged commit (see Install for the current pin).

A live build of this site lives at https://p4suta.github.io/aozora/; the rustdoc API reference is layered underneath at https://p4suta.github.io/aozora/api/aozora/.