Zero-copy, arena-allocated AST.
This module defines the AST that the aozora-lex pipeline produces and
that downstream consumers (aozora-render, aozora, and the FFI /
WASM / Python drivers) walk.
§Lifetime model
Every type carries a single lifetime parameter 'src, the
lifetime of the source text being parsed and of the arena
allocator that owns the tree’s storage. By convention the
enclosing Document<'src> owns both, so 'src is the borrow of
that document.
All AST types are Copy because they only contain Copy data:
&'src references, primitives, and Copy enums. This means a
parsed AozoraNode can be passed by value without ceremony and
the visitor pattern in aozora-render does not need
&mut self for traversal.
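As a rough illustration of the paragraph above, the sketch below shows Copy nodes carrying a single 'src lifetime and a visitor that only needs &self. The names Node, Ruby, PlainText and render are hypothetical stand-ins invented for this example; they are not the crate's actual definitions.

```rust
// Hypothetical stand-ins for the real AST types; only the shape matters here.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Ruby<'src> {
    pub base: &'src str,
    pub reading: &'src str,
}

// Every field is a `&'src str` or a Copy struct, so the enum can derive Copy.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Node<'src> {
    Text(&'src str),
    Ruby(Ruby<'src>),
}

/// A visitor that takes `&self`: nodes are handed over by value, so
/// traversal needs no mutable access to the visitor or to the tree.
pub struct PlainText;

impl PlainText {
    pub fn visit<'src>(&self, node: Node<'src>) -> &'src str {
        match node {
            Node::Text(text) => text,
            Node::Ruby(ruby) => ruby.base,
        }
    }
}

/// Concatenates the base text of every node, dropping ruby readings.
pub fn render<'src>(nodes: &[Node<'src>]) -> String {
    let visitor = PlainText;
    nodes.iter().map(|&node| visitor.visit(node)).collect()
}
```

Because the nodes are Copy, `|&node|` copies each one out of the slice; nothing is moved and the slice stays usable afterwards.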
§Memory ownership
Construction allocates into an Arena (a thin wrapper over
bumpalo::Bump). Every &'src str inside the tree points either
to the arena (rewritten / synthesised text) or to the source
string (zero-copy borrow of original bytes). When the arena drops,
the entire tree drops as a single deallocation; per-node Drop
never runs.
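The two possible origins of a &'src str can be told apart by address range. The sketch below uses only the standard library: a plain String named scratch stands in for arena-owned text (the real Arena wraps bumpalo::Bump, which is not used here), and borrows_from, Payload and demo are hypothetical helpers written for this illustration.

```rust
/// Returns true when `s` lies entirely inside `source`'s byte range,
/// i.e. it is a zero-copy borrow of the original text.
pub fn borrows_from(source: &str, s: &str) -> bool {
    let start = source.as_ptr() as usize;
    let end = start + source.len();
    let p = s.as_ptr() as usize;
    p >= start && p + s.len() <= end
}

/// Hypothetical node payload: one lifetime covers both possible origins.
#[derive(Clone, Copy, Debug)]
pub struct Payload<'src> {
    pub text: &'src str,
}

/// Builds one payload from each origin and reports where each points.
pub fn demo() -> (bool, bool) {
    let source = String::from("  aozora bunko  ");
    // Assumption: stands in for text the parser synthesised into the arena.
    let scratch = String::from("Aozora Bunko");

    let borrowed = Payload { text: source.trim() };
    let synthesised = Payload { text: scratch.as_str() };

    (
        borrows_from(&source, borrowed.text),
        borrows_from(&source, synthesised.text),
    )
}
```

Calling `demo()` returns `(true, false)`: the trimmed slice points into the source buffer, while the synthesised text lives in separate storage. Both payloads still share one lifetime, which is what lets every node carry a single 'src parameter.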
§Why “borrowed”?
Every type here borrows its payload from the source / arena rather than owning a heap copy. The “observable equivalence” purity contract permits arena mutation behind the scenes while keeping the public surface deterministic.
Modules§
- arena 🔒
- Per-document bump arena.
- intern 🔒
- Arena-backed string interner.
- non_empty 🔒
- Non-emptiness invariant for Content<'src>.
- registry 🔒
- Sentinel-position → AozoraNode lookup table.
- types 🔒
- Borrowed AST types parameterised by the source/arena lifetime
'src.
Structs§
- Annotation
- Generic annotation.
- AozoraHeading
- Aozora heading (窓見出し / 副見出し).
- Arena
- Bump-allocator arena owning all AST node storage for a single parse.
- Bouten
- Emphasis dots / sidelines.
- ContainerPair
- Resolved (open, close) container-marker pair, in normalized coordinates.
- DoubleRuby
- Double angle-bracket payload.
- Gaiji
- Gaiji (out-of-character-range glyph).
- HeadingHint
- Forward-reference heading hint.
- InternStats
- Diagnostic counters surfaced by Interner::stats.
- Interner
- Open-addressing intern table over arena-allocated strings.
- Kaeriten
- Chinese-reading-order mark (返り点).
- NonEmpty
- Non-emptiness wrapper for an AST payload.
- NonEmptyStr
- Non-empty &'src str newtype.
- Registry
- Whole-document registry — single Eytzinger-keyed table.
- Ruby
- Ruby (furigana).
- Sashie
- Illustration metadata.
- TateChuYoko
- Tate-chu-yoko (horizontal embedding).
- Warichu
- Warichu (split annotation).
Enums§
- AozoraNode
- Every Aozora-specific AST node, in borrowed form.
- Content
- Body content for nodes whose textual payload may carry nested Aozora constructs.
- NodeRef
- Unified view over a registry hit, returned by Registry::node_at.
- Segment
- One element of a Content::Segments run.
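The Registry entry describes a single Eytzinger-keyed table for position lookups. For readers unfamiliar with that layout, here is a minimal sketch of an exact-match lookup over keys stored in Eytzinger (level-order binary tree) order. EytzingerTable, from_sorted and get are hypothetical names, and the u32 key type is an assumption; the actual Registry layout and the signature of Registry::node_at may differ.

```rust
use std::cmp::Ordering;

/// Fills `out` in level order (0-based) by walking the implicit binary
/// tree in order and consuming `sorted` from left to right.
fn fill<V: Copy>(sorted: &[(u32, V)], out: &mut [(u32, V)], next: &mut usize, k: usize) {
    if k < out.len() {
        fill(sorted, out, next, 2 * k + 1); // left subtree: smaller keys
        out[k] = sorted[*next];
        *next += 1;
        fill(sorted, out, next, 2 * k + 2); // right subtree: larger keys
    }
}

/// Hypothetical position-to-value table stored in Eytzinger order.
pub struct EytzingerTable<V> {
    slots: Vec<(u32, V)>,
}

impl<V: Copy> EytzingerTable<V> {
    /// `entries` must be sorted by key and contain no duplicate keys.
    pub fn from_sorted(entries: &[(u32, V)]) -> Self {
        let mut slots = entries.to_vec();
        let mut next = 0;
        fill(entries, &mut slots, &mut next, 0);
        EytzingerTable { slots }
    }

    /// Descends from the root; children of slot `k` sit at `2k + 1` and `2k + 2`.
    pub fn get(&self, key: u32) -> Option<V> {
        let mut k = 0;
        while k < self.slots.len() {
            let (slot_key, value) = self.slots[k];
            match key.cmp(&slot_key) {
                Ordering::Equal => return Some(value),
                Ordering::Less => k = 2 * k + 1,
                Ordering::Greater => k = 2 * k + 2,
            }
        }
        None
    }
}
```

The appeal of this layout is that the first few levels of the search tree sit next to each other in memory, so a lookup touches fewer cache lines than a binary search over the sorted array would.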