Crate aozora


aozora — the public meta crate.

Single front door for parsing Aozora Bunko notation. Downstream consumers should depend on this crate alone; everything they need is re-exported through this surface or accessed via Document and AozoraTree.

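use aozora::Document;

// read_to_string requires valid UTF-8; Shift_JIS input must be decoded first
// (see the encoding module).
let source = std::fs::read_to_string("crime_and_punishment.txt").unwrap();
// Document takes ownership of the source buffer.
let doc = Document::new(source);
// The tree borrows from doc, so doc must outlive it.
let tree = doc.parse();
let html = tree.to_html();
println!("{html}");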

Tunable parses go through the builder chain:

use aozora::{Document, DiagnosticPolicy};

let doc = Document::options()
    .arena_capacity(64 * 1024)
    .diagnostic_policy(DiagnosticPolicy::DropInternal)
    .build("|青梅《おうめ》");
let tree = doc.parse();
assert!(!tree.serialize().is_empty());
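
How a builder chain of this shape carries settings into the final value can be sketched with the standard library alone. Everything below (Doc, Options, Policy, the KeepAll variant, the 4096-byte default) is a hypothetical stand-in chosen for illustration, not the aozora API; only the method names mirror the example above.

```rust
/// Hypothetical diagnostic policy; `KeepAll` is an assumed default variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    KeepAll,
    DropInternal,
}

/// Hypothetical options builder. Each setter consumes and returns `self`,
/// which is what makes the calls chainable.
#[derive(Debug)]
pub struct Options {
    arena_capacity: usize,
    policy: Policy,
}

/// Hypothetical document handle that records the settings it was built with.
#[derive(Debug)]
pub struct Doc {
    pub source: String,
    pub arena_capacity: usize,
    pub policy: Policy,
}

impl Doc {
    /// Entry point of the chain; the defaults here are assumptions.
    pub fn options() -> Options {
        Options { arena_capacity: 4096, policy: Policy::KeepAll }
    }
}

impl Options {
    pub fn arena_capacity(mut self, bytes: usize) -> Self {
        self.arena_capacity = bytes;
        self
    }

    pub fn diagnostic_policy(mut self, policy: Policy) -> Self {
        self.policy = policy;
        self
    }

    /// Terminal step: consumes the builder and takes ownership of the source.
    pub fn build(self, source: impl Into<String>) -> Doc {
        Doc {
            source: source.into(),
            arena_capacity: self.arena_capacity,
            policy: self.policy,
        }
    }
}
```

Calling `Doc::options().arena_capacity(64 * 1024).diagnostic_policy(Policy::DropInternal).build("|青梅《おうめ》")` yields a value whose fields reflect each setter. Settings that are never mentioned keep their defaults.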

§Architecture

Document owns the source buffer plus a bumpalo-backed arena. AozoraTree borrows from that arena for as long as the &self borrow taken by Document::parse is alive, so a tree can never outlive its Document. Every per-node allocation lives inside the arena, with the Interner deduplicating repeated string content; dropping the Document releases the entire tree in a single Bump::reset step.
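
The ownership relationship can be sketched without the crate. This is a simplified illustration, not the actual implementation: it uses no bumpalo arena, the node type is a plain string slice, and the names OwnedDoc and BorrowedTree are hypothetical. What it does show is the same borrow shape: the tree is tied to the `&self` borrow of the owner, and a toy interner counts distinct strings.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the owning handle: it owns the source buffer.
pub struct OwnedDoc {
    source: String,
}

/// Hypothetical stand-in for the borrowed view: every token borrows from
/// the owner, so the compiler rejects any use after the owner is dropped.
pub struct BorrowedTree<'a> {
    pub tokens: Vec<&'a str>,
    /// Number of distinct tokens seen by the toy interner.
    pub unique: usize,
}

impl OwnedDoc {
    pub fn new(source: impl Into<String>) -> Self {
        Self { source: source.into() }
    }

    /// The elided lifetime ties the returned tree to this `&self` borrow.
    pub fn parse(&self) -> BorrowedTree<'_> {
        // Toy interner: maps each distinct token to a small id.
        let mut interner: HashMap<&str, usize> = HashMap::new();
        let mut tokens = Vec::new();
        for tok in self.source.split_whitespace() {
            let next_id = interner.len();
            interner.entry(tok).or_insert(next_id);
            tokens.push(tok);
        }
        BorrowedTree { tokens, unique: interner.len() }
    }
}
```

With this shape, code that drops the owner while still holding the tree fails to compile, which is the property the borrowed-view design relies on.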

Internal building-block crates (aozora-spec, aozora-syntax, aozora-pipeline, aozora-render, aozora-encoding) are marked publish = false and are reachable only through this meta crate’s pipeline / syntax / render / encoding / wire modules. Depend on aozora alone; see the Architecture chapter of the handbook for the layered design.

Modules§

codes
Stable identifier strings for known Diagnostic variants.
document 🔒
Document, the single owning handle to a parsed Aozora source buffer, and AozoraTree<'a>, the borrowed view a caller walks to render output.
encoding
Re-export of aozora_encoding — Shift_JIS decoding and gaiji resolution.
html
Borrowed-AST HTML rendering.
pipeline
Re-export of aozora_pipeline under a stable name.
render
Re-export of aozora_render: HTML / serialize emitters and the visitor trait.
serialize
Borrowed-AST Aozora-source serializer.
syntax
Re-export of aozora_syntax — AST node types, arena, interner.
wire
Driver-shared wire format for serialising aozora parser output.

Structs§

AlignEnd
Borrowed-AST node types editor surfaces match against (LSP inlay hints, hover, completion, code actions, semantic tokens). Re-exported so external consumers don’t have to depend on aozora-syntax directly — aozora is the single editor-facing front door.
Annotation
Generic annotation.
AozoraHeading
Aozora heading (窓見出し / 副見出し).
AozoraTree
Borrowed view into a parsed Aozora document.
BorrowedLexOutput
Borrowed-AST output of the lex pipeline.
Bouten
Emphasis dots / sidelines.
Document
Single owning handle to a parsed Aozora source.
DoubleRuby
Double angle-bracket payload.
Gaiji
Gaiji (out-of-character-range glyph).
HeadingHint
Forward-reference heading hint.
Indent
Borrowed-AST node types editor surfaces match against (LSP inlay hints, hover, completion, code actions, semantic tokens). Re-exported so external consumers don’t have to depend on aozora-syntax directly — aozora is the single editor-facing front door.
Kaeriten
Chinese-reading-order mark (返り点).
NormalizedOffset
Byte offset into the normalized text the lex pipeline emits for the downstream CommonMark parser.
PairLink
Resolved open/close pair, as observed by Phase 2.
ParseOptions
Builder for the Document::parse entry point.
Ruby
Ruby (furigana).
Sashie
Illustration metadata.
SlugEntry
One row of the slug catalogue.
SourceNode
Source-keyed registry entry — pairs a source-byte span with the classified node landed there. Lives in the bumpalo arena.
SourceOffset
Byte offset into the sanitized source text (Phase 0 output).
Span
Byte-range span. Both endpoints are guaranteed to fall on UTF-8 character boundaries when produced by the parser; callers can safely slice the source with them.
TateChuYoko
Tate-chu-yoko (horizontal embedding).
Warichu
Warichu (split annotation).
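
The Span entry in this list states that parser-produced endpoints fall on UTF-8 character boundaries, so slicing the source with them is safe. A minimal sketch of what that guarantee buys, using a hypothetical ByteSpan type whose field names are assumptions rather than the crate's actual layout:

```rust
/// Hypothetical byte-range span; field names are assumed for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteSpan {
    pub start: usize,
    pub end: usize,
}

impl ByteSpan {
    /// Returns the covered text, or `None` when the range is out of bounds
    /// or would split a multi-byte character.
    pub fn slice<'a>(&self, source: &'a str) -> Option<&'a str> {
        // `str::get` performs the bounds and char-boundary checks.
        source.get(self.start..self.end)
    }
}
```

Each character in `青梅《おうめ》` occupies three bytes, so a span of 0..6 covers `青梅`, while 1..6 starts inside a character and yields `None`. A span produced under the documented guarantee would never hit the `None` branch; the checked accessor is only there to show what goes wrong for hand-built offsets.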

Enums§

AnnotationKind
Borrowed-AST node types editor surfaces match against (LSP inlay hints, hover, completion, code actions, semantic tokens). Re-exported so external consumers don’t have to depend on aozora-syntax directly — aozora is the single editor-facing front door.
AozoraHeadingKind
Borrowed-AST node types editor surfaces match against (LSP inlay hints, hover, completion, code actions, semantic tokens). Re-exported so external consumers don’t have to depend on aozora-syntax directly — aozora is the single editor-facing front door.
AozoraNode
Every Aozora-specific AST node, in borrowed form.
BoutenKind
Borrowed-AST node types editor surfaces match against (LSP inlay hints, hover, completion, code actions, semantic tokens). Re-exported so external consumers don’t have to depend on aozora-syntax directly — aozora is the single editor-facing front door.
BoutenPosition
Which side of the vertical-writing base text the bouten marks sit on.
ContainerKind
The kinds of Aozora container blocks the lexer classifies.
Content
Body content for nodes whose textual payload may carry nested Aozora constructs.
Diagnostic
Observation emitted by any lexer phase.
DiagnosticPolicy
Diagnostic policy applied at parse time.
DiagnosticSource
Origin of a Diagnostic — distinguishes user-input issues from library-internal sanity-check failures.
InternalCheckCode
Identifier of a specific pipeline-internal sanity check.
NodeKind
Cross-cutting tag for an AST node or NodeRef projection.
NodeRef
Unified view over a registry hit, returned by Registry::node_at.
PairKind
Pair kind. The variants enumerate every balanced delimiter Aozora notation recognises.
SectionKind
Borrowed-AST node types editor surfaces match against (LSP inlay hints, hover, completion, code actions, semantic tokens). Re-exported so external consumers don’t have to depend on aozora-syntax directly — aozora is the single editor-facing front door.
Segment
One element of a Content::Segments run.
Sentinel
Sentinel kind tag.
Severity
Severity of a Diagnostic.
SlugFamily
Family / coarse category a slug belongs to. Used by the LSP completion UI to group entries (CompletionItem::sort_text) and pick an appropriate CompletionItemKind icon.
TriggerKind
Classification of a single trigger character (or merged double).
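
PairKind and PairLink describe balanced delimiters and their resolved open/close positions. One common way to resolve such pairs is a stack scan, sketched below. This is not the crate's Phase 2 algorithm: the Kind and Link types, the choice of 《 》 and [ ] as the only delimiters, and the policy of ignoring mismatched closers are all assumptions made for illustration.

```rust
/// Hypothetical pair kinds; the real `PairKind` variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Ruby,    // 《 … 》
    Bracket, // [ … ]
}

/// Hypothetical resolved pair: byte offsets of the open and close delimiters.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Link {
    pub kind: Kind,
    pub open: usize,
    pub close: usize,
}

/// Scans `source` once and returns pairs in the order their closers appear.
pub fn link_pairs(source: &str) -> Vec<Link> {
    let mut stack: Vec<(Kind, usize)> = Vec::new();
    let mut links = Vec::new();
    for (offset, ch) in source.char_indices() {
        match ch {
            '《' => stack.push((Kind::Ruby, offset)),
            '[' => stack.push((Kind::Bracket, offset)),
            '》' | ']' => {
                let want = if ch == '》' { Kind::Ruby } else { Kind::Bracket };
                // Link only when the innermost open delimiter has the same
                // kind; stray or mismatched closers are ignored in this sketch.
                if let Some((kind, open)) = stack.last().copied() {
                    if kind == want {
                        stack.pop();
                        links.push(Link { kind, open, close: offset });
                    }
                }
            }
            _ => {}
        }
    }
    links
}
```

Nested input such as `[#a《b》]` yields the inner ruby pair first and the enclosing bracket pair second, because a pair is recorded when its closer is seen. Unclosed openers simply remain on the stack and produce no link.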

Constants§

ALL_SENTINELS
All four sentinels in declaration order.
BLOCK_CLOSE_SENTINEL
Paired-container close line (e.g. [#ここで字下げ終わり]).
BLOCK_LEAF_SENTINEL
Block-leaf Aozora line (page break, section break, leaf indent, sashie).
BLOCK_OPEN_SENTINEL
Paired-container open line (e.g. [#ここから字下げ]).
INLINE_SENTINEL
Inline Aozora span (ruby / bouten / annotation / gaiji / TCY / kaeriten).
SLUGS
Canonical slug catalogue. See module docs.

Functions§

canonicalise_slug
Snap an input slug body (with the surrounding [# … ] already stripped) to the canonical form, if one is recognised.
lex_into_arena
Run the lex pipeline and collect the result into arena.
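
The idea behind canonicalise_slug, mapping an already-stripped slug body onto a canonical catalogue entry, can be sketched as a table lookup. The function name snap_slug, the catalogue rows, and the alias spelling below are hypothetical; the real SLUGS catalogue and its matching rules are not shown on this page.

```rust
/// Hypothetical catalogue rows: (accepted spelling, canonical slug).
/// The first two bodies come from the sentinel examples on this page;
/// the third is an assumed alias added only to show the snapping step.
const CATALOGUE: &[(&str, &str)] = &[
    ("ここから字下げ", "ここから字下げ"),
    ("ここで字下げ終わり", "ここで字下げ終わり"),
    ("ここで字下げ終り", "ここで字下げ終わり"),
];

/// Returns the canonical form of `body` (the text between `[#` and `]`),
/// or `None` when the body is not recognised.
pub fn snap_slug(body: &str) -> Option<&'static str> {
    // Surrounding whitespace is ignored in this sketch.
    let body = body.trim();
    CATALOGUE
        .iter()
        .find(|row| row.0 == body)
        .map(|row| row.1)
}
```

Returning `Option` keeps "not recognised" distinct from any valid slug, which matches the "if one is recognised" wording in the function summary.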