Library Usage
afm ships as a Rust library (afm-markdown) alongside the CLI. The
binary is a thin wrapper over the same public API that every embedder
calls. The CLI has no private rendering path of its own, so a CLI run
and a library embed produce byte-identical HTML for the same input.
Add the dependency
afm is not on crates.io yet; depend on it directly by git URL:
[dependencies]
afm-markdown = { git = "https://github.com/P4suta/afm" }
The sibling crate aozora-encoding provides Shift_JIS decoding when
you need it; depend on it by git URL as well, from the aozora repository:
[dependencies]
aozora-encoding = { git = "https://github.com/P4suta/aozora" }
Render to HTML — the simple path
use afm_markdown::{Options, render_to_string};

fn main() {
    let rendered = render_to_string(
        "彼は|青梅《おうめ》に行った。",
        &Options::afm_default(),
    );
    println!("{}", rendered.html);
    for diag in &rendered.diagnostics {
        eprintln!("warning: {diag}");
    }
}
Options::afm_default() enables three things: the GFM extensions afm
uses on top of CommonMark (strikethrough, tables, autolinks, task
lists); hard breaks, so that each source newline becomes a <br>,
because verse and dialogue line boundaries are load-bearing in Aozora
Bunko (青空文庫) source; and the Aozora pre-pass.
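To make the ruby notation in the example above concrete, here is a toy recognizer that rewrites `|base《reading》` into an HTML `<ruby>` element. It is a standalone sketch using only the standard library. The function name `ruby_to_html` and the exact output shape are assumptions made for illustration, and this is not how afm's pre-pass is implemented.

```rust
/// Toy stand-in for one Aozora construct: rewrite `|base《reading》`
/// into `<ruby>base<rt>reading</rt></ruby>`.
/// Hypothetical helper; it does no HTML escaping and ignores every
/// other Aozora construct.
fn ruby_to_html(src: &str) -> String {
    let mut out = String::new();
    let mut rest = src;
    while let Some(bar) = rest.find('|') {
        let after_bar = &rest[bar + '|'.len_utf8()..];
        // A ruby span needs `《` and then a closing `》`; otherwise stop
        // and emit the remaining text unchanged.
        let Some(open) = after_bar.find('《') else { break };
        let after_open = &after_bar[open + '《'.len_utf8()..];
        let Some(close) = after_open.find('》') else { break };

        out.push_str(&rest[..bar]); // text before the bar
        out.push_str("<ruby>");
        out.push_str(&after_bar[..open]); // base text
        out.push_str("<rt>");
        out.push_str(&after_open[..close]); // reading
        out.push_str("</rt></ruby>");
        rest = &after_open[close + '》'.len_utf8()..];
    }
    out.push_str(rest);
    out
}

fn main() {
    println!("{}", ruby_to_html("彼は|青梅《おうめ》に行った。"));
}
```

The HTML that afm actually emits may differ in attributes or structure; treat `rendered.html` as the source of truth.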
For pure CommonMark or pure GFM behaviour (no Aozora recognition),
use Options::commonmark_only() or Options::gfm_only() — these are
also what the CommonMark 0.31.2 and GFM 0.29 spec runners exercise.
Render to a structured IR
render_to_ir returns the same HTML alongside a typed IrDocument
that mirrors the TypeScript IRDocument consumed by afm-obsidian:
use afm_markdown::ir::{IrBlock, IrInline};
use afm_markdown::{Options, render_to_ir};

fn main() {
    let rendered = render_to_ir(
        "# 第一章\n\n|青梅《おうめ》",
        &Options::afm_default(),
    );
    for block in &rendered.ir.blocks {
        match block {
            IrBlock::Heading { level, .. } => println!("h{level}"),
            IrBlock::Paragraph { children, .. } => {
                let ruby_count = children
                    .iter()
                    .filter(|c| matches!(c, IrInline::Ruby { .. }))
                    .count();
                println!("paragraph with {ruby_count} ruby span(s)");
            }
            other => println!("{other:?}"),
        }
    }
}
The IR carries every Aozora-side construct (Ruby, DoubleRuby,
Bouten, Tcy, Gaiji, Annotation, Container, PageBreak,
SectionBreak) plus the markdown-side block / inline shapes — so
JS-side renderers in afm-obsidian / afm-logseq can pick their own
output target (DOM fragment, CodeMirror RangeSet, semantic tokens)
without re-parsing the HTML.
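The idea of choosing an output target from the IR can be sketched with local stand-in types. The variant names below follow the text above, but the fields (`base`, `reading`, a `Text` variant, `level: u8`) are assumptions made for this example and are not afm-markdown's actual definitions. The sketch projects a small document to plain text without touching any HTML.

```rust
// Local stand-ins: variant names follow the documentation above, but the
// fields are assumptions for this sketch, not afm-markdown's types.
#[derive(Debug)]
enum IrInline {
    Text(String),
    Ruby { base: String, reading: String },
}

#[derive(Debug)]
enum IrBlock {
    Heading { level: u8, children: Vec<IrInline> },
    Paragraph { children: Vec<IrInline> },
}

/// Project inlines to plain text, writing ruby as `base(reading)`.
fn inline_text(children: &[IrInline]) -> String {
    children
        .iter()
        .map(|c| match c {
            IrInline::Text(t) => t.clone(),
            IrInline::Ruby { base, reading } => format!("{base}({reading})"),
        })
        .collect()
}

/// Render blocks to a plain-text outline, one block per line.
fn to_plain_text(blocks: &[IrBlock]) -> String {
    blocks
        .iter()
        .map(|b| match b {
            IrBlock::Heading { level, children } => {
                format!("{} {}", "#".repeat(*level as usize), inline_text(children))
            }
            IrBlock::Paragraph { children } => inline_text(children),
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let doc = vec![
        IrBlock::Heading {
            level: 1,
            children: vec![IrInline::Text("第一章".into())],
        },
        IrBlock::Paragraph {
            children: vec![IrInline::Ruby {
                base: "青梅".into(),
                reading: "おうめ".into(),
            }],
        },
    ];
    println!("{}", to_plain_text(&doc));
}
```

A renderer for a different target would keep the same two `match` expressions and change only what each arm produces.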
Render block-by-block (streaming)
For long documents where you want to checkpoint between blocks
(afm-obsidian uses this for AbortSignal cancellation in chunked
post-processors), use render_blocks_to_ir:
use afm_markdown::{Options, render_blocks_to_ir};

fn main() {
    let (blocks, diagnostics) = render_blocks_to_ir(
        "first paragraph\n\n|second《せかんど》paragraph",
        &Options::afm_default(),
    );
    for block in blocks {
        println!("{} ir nodes at line {}", block.ir.len(), block.source_line);
        println!("{}", block.html);
    }
    assert!(diagnostics.is_empty());
}
The shared StreamingIrBuilder threads the sentinel cursor across
calls, so per-block IR projection stays in lockstep with the
whole-document path. A block may carry zero IR entries (e.g.
container-open paragraphs that drain at the next call boundary) or
more than one (a container that finally closes).
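Checkpointing between blocks can be sketched on the consumer side as a loop that tests a cancellation flag at each block boundary. This is a minimal standalone illustration: `RenderedBlock` and `emit_until_cancelled` are hypothetical names, and the `AtomicBool` stands in for whatever cancellation signal the host provides (an AbortSignal in the afm-obsidian case).

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-in for one rendered block; only `html` is used here.
struct RenderedBlock {
    html: String,
}

/// Append block HTML to `out` until `cancelled` is set, checking the flag
/// once per block. Returns how many blocks were written.
fn emit_until_cancelled(
    blocks: impl IntoIterator<Item = RenderedBlock>,
    cancelled: &AtomicBool,
    out: &mut String,
) -> usize {
    let mut written = 0;
    for block in blocks {
        // Checkpoint: a block boundary is a safe place to stop.
        if cancelled.load(Ordering::Relaxed) {
            break;
        }
        out.push_str(&block.html);
        written += 1;
    }
    written
}

fn main() {
    let blocks = vec![
        RenderedBlock { html: "<p>first</p>".into() },
        RenderedBlock { html: "<p>second</p>".into() },
    ];
    let cancelled = AtomicBool::new(false);
    let mut out = String::new();
    let n = emit_until_cancelled(blocks, &cancelled, &mut out);
    println!("{n} block(s): {out}");
}
```

Because the flag is only read between blocks, output never contains a half-written block.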
Reading Shift_JIS input
Aozora Bunko ships its text files in Shift_JIS. aozora-encoding
exposes a transparent decoder so your pipeline doesn’t need to know
the encoding ahead of time:
use afm_markdown::{Options, render_to_string};
use aozora_encoding::decode_sjis;

fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("tsumito_batsu.txt")?;
    // Panics if the bytes cannot be decoded as Shift_JIS.
    let utf8 = decode_sjis(&bytes).expect("input should be valid Shift_JIS");
    let rendered = render_to_string(&utf8, &Options::afm_default());
    std::fs::write("tsumito_batsu.html", rendered.html)?;
    Ok(())
}
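If a pipeline receives a mix of UTF-8 and Shift_JIS files, one common heuristic is to try strict UTF-8 first and fall back to a legacy decoder only when that fails. The sketch below is standalone: `decode_utf8_or` and `stub_sjis` are hypothetical names, and the stub decoder stands in for a real Shift_JIS decoder, whose signature is not assumed here. The heuristic is not airtight, since a Shift_JIS byte sequence can occasionally also be valid UTF-8.

```rust
/// Decode `bytes` as UTF-8 when they are valid UTF-8; otherwise hand them
/// to `legacy`, a placeholder for a Shift_JIS decoder.
fn decode_utf8_or<F>(bytes: &[u8], legacy: F) -> Option<String>
where
    F: FnOnce(&[u8]) -> Option<String>,
{
    match std::str::from_utf8(bytes) {
        Ok(text) => Some(text.to_owned()),
        Err(_) => legacy(bytes),
    }
}

/// Stub decoder for the example: it knows one Shift_JIS sequence,
/// 0x82 0xA0, which encodes "あ". A real decoder would go here.
fn stub_sjis(bytes: &[u8]) -> Option<String> {
    match bytes {
        [0x82, 0xA0] => Some("あ".to_string()),
        _ => None,
    }
}

fn main() {
    // Valid UTF-8 passes straight through.
    println!("{:?}", decode_utf8_or("青空".as_bytes(), stub_sjis));
    // 0x82 cannot start a UTF-8 sequence, so the fallback runs.
    println!("{:?}", decode_utf8_or(&[0x82, 0xA0], stub_sjis));
}
```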
Round-tripping through the lexer
afm_markdown::serialize is the inverse of the lex pre-pass: it
replays the borrowed-AST registry to reconstruct the original afm
markup byte-for-byte (modulo the lexer’s Phase-0 sanitisation). This
is what the upstream 17k-work corpus sweep exercises as I3
(round-trip fixed point):
use afm_markdown::serialize;

fn main() {
    let source = "彼は|青梅《おうめ》に行った。";
    assert_eq!(serialize(source), source);
}
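A fixed-point check over many inputs can be sketched as a small harness that collects every input the round trip fails to reproduce. The harness below is standalone and generic over the round-trip function; `fixed_point_failures` is a hypothetical name, and the closure in `main` is a deliberately lossy stand-in rather than afm's serializer.

```rust
/// Return every input for which `roundtrip(input) != input`.
/// An empty result means the round trip is a fixed point on all inputs.
fn fixed_point_failures<'a, F>(inputs: &[&'a str], roundtrip: F) -> Vec<&'a str>
where
    F: Fn(&str) -> String,
{
    inputs
        .iter()
        .copied()
        .filter(|&src| roundtrip(src) != src)
        .collect()
}

fn main() {
    let inputs = ["彼は|青梅《おうめ》に行った。", "trailing space "];
    // Lossy stand-in: trimming changes the second input, so it is reported.
    let failures = fixed_point_failures(&inputs, |s: &str| s.trim_end().to_string());
    println!("{} failure(s): {:?}", failures.len(), failures);
}
```

In real use the closure would wrap the serializer under test, and a corpus sweep would feed it one file's contents per input.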
More examples
End-to-end snippets live under crates/afm-markdown/examples/ in the
repository:
- render-utf8.rs: UTF-8 source → HTML on stdout.
- render-sjis.rs: Shift_JIS source via aozora-encoding.
- ast-walk.rs: walk the parsed AST and tally AozoraNode variants.
- serialize-round-trip.rs: verify serialize ∘ lex ≡ id on one file.
Run any of them with:
cargo run --example <name> -p afm-markdown -- <path>