Phase D — Sentinel enum + single-table registry results
The single-table registry collapsed four per-kind sentinel position
tables into one position-keyed EytzingerMap dispatched through a
NodeRef enum. Before the refactor, the registry held four independent
EytzingerMaps (inline, block_leaf, block_open, block_close), and
Registry::node_at(pos) swept them in declaration order through a
chain of four if let Some(...) = table.get(&pos) checks. The current
shape performs one binary search per lookup, with the variant tag
carried on the entry itself.
Structural changes
old : Registry { inline, block_leaf, block_open, block_close } // 4× EytzingerMap
node_at(pos) → 4-way if-let chain, up to 4 binary searches (worst case)
now : Registry { table: EytzingerMap<u32, NodeRef<'src>> } // 1× EytzingerMap
node_at(pos) → one binary search, NodeRef variant tags the kind
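The "now" shape can be sketched as follows. This is a minimal illustration, not the crate's code: a Vec sorted by position stands in for EytzingerMap (whose API is not shown here), and the &str payloads on the NodeRef variants are hypothetical placeholders for the real node types.

```rust
/// Variant tag carried on the entry itself.
/// The `&'src str` payloads are hypothetical stand-ins for the real node types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeRef<'src> {
    Inline(&'src str),
    BlockLeaf(&'src str),
    BlockOpen(&'src str),
    BlockClose(&'src str),
}

/// Single position-keyed table. A sorted Vec is used here as an assumed
/// stand-in for EytzingerMap<u32, NodeRef<'src>>.
struct Registry<'src> {
    table: Vec<(u32, NodeRef<'src>)>,
}

impl<'src> Registry<'src> {
    /// Build the table once, sorted by sentinel position.
    fn new(mut entries: Vec<(u32, NodeRef<'src>)>) -> Self {
        entries.sort_by_key(|&(pos, _)| pos);
        Registry { table: entries }
    }

    /// One binary search per lookup; the caller learns the kind from the
    /// returned variant instead of from which table matched.
    fn node_at(&self, pos: u32) -> Option<NodeRef<'src>> {
        self.table
            .binary_search_by_key(&pos, |&(p, _)| p)
            .ok()
            .map(|i| self.table[i].1)
    }
}
```

A lookup at a position with no sentinel returns None after that single search, where the old layout would have probed all four tables before giving up.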
Renderers (crates/aozora-render/src/html.rs,
crates/aozora-render/src/serialize.rs) replaced the parallel
4-way if let Some(...) = registry.<kind>.get(...) chains with
a single (Structural, NodeRef) cross-product match — the
compiler now enforces variant coverage at the call site.
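A cross-product match of this kind might look like the sketch below. The variants of Structural, the payload types, the render_sentinel function, and the emitted HTML are all assumptions made for illustration; only the (Structural, NodeRef) pairing comes from the text. Mismatched pairs are spelled out instead of using a wildcard arm, so adding a variant to either enum stops the match from compiling until it is handled.

```rust
/// Hypothetical: the kind of sentinel the scanner saw in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Structural {
    Inline,
    BlockLeaf,
    BlockOpen,
    BlockClose,
}

/// Hypothetical payloads; the real node types are not shown in this document.
#[derive(Debug, Clone, Copy, PartialEq)]
enum NodeRef<'src> {
    Inline(&'src str),
    BlockLeaf(&'src str),
    BlockOpen(&'src str),
    BlockClose(&'src str),
}

/// Render one sentinel hit. Returns false (and writes nothing) when the
/// sentinel kind and the registry entry disagree.
fn render_sentinel(kind: Structural, node: NodeRef<'_>, out: &mut String) -> bool {
    use NodeRef as N;
    use Structural as S;
    match (kind, node) {
        (S::Inline, N::Inline(text)) => {
            out.push_str("<span>");
            out.push_str(text);
            out.push_str("</span>");
            true
        }
        (S::BlockLeaf, N::BlockLeaf(text)) => {
            out.push_str("<p>");
            out.push_str(text);
            out.push_str("</p>");
            true
        }
        (S::BlockOpen, N::BlockOpen(tag)) => {
            out.push('<');
            out.push_str(tag);
            out.push('>');
            true
        }
        (S::BlockClose, N::BlockClose(tag)) => {
            out.push_str("</");
            out.push_str(tag);
            out.push('>');
            true
        }
        // Explicit mismatches: no `_` arm, so a new variant is a compile error.
        (S::Inline, N::BlockLeaf(_) | N::BlockOpen(_) | N::BlockClose(_))
        | (S::BlockLeaf, N::Inline(_) | N::BlockOpen(_) | N::BlockClose(_))
        | (S::BlockOpen, N::Inline(_) | N::BlockLeaf(_) | N::BlockClose(_))
        | (S::BlockClose, N::Inline(_) | N::BlockLeaf(_) | N::BlockOpen(_)) => false,
    }
}
```

Listing the twelve mismatched pairs is more verbose than a catch-all arm, but a catch-all would silently accept any variant added later, which is the failure mode the single match is meant to rule out.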
Expected runtime impact
Theoretical: the per-lookup binary search count drops from ≤ 4 to 1.
The render hot path is dominated by registry lookups inside the
memchr2_iter loop in html::render_into (one lookup per PUA
sentinel hit), so the savings scale with sentinel density. Aozora
corpus profiling against the four-table layout showed registry
lookups at ~12 % of render time on bouten-heavy documents. Because
one search per lookup remains, the unified dispatch can recover at
most about three quarters of that share (roughly 9 % of render
time), and less wherever the old chain usually hit an early table.
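As a rough sanity check on what the measurement could show, the ceiling on the saving can be worked out from the figures above. The sketch assumes that lookup time is essentially all binary-search time and that a search in the unified table costs no less than one in the old tables; max_render_saving is a hypothetical helper, not part of the codebase.

```rust
/// Upper bound on the fraction of total render time saved when the number
/// of binary searches per lookup drops from `searches_before` to
/// `searches_after`.
///
/// Assumptions (not measured): lookup time is dominated by the searches,
/// every old lookup paid the full `searches_before`, and per-search cost is
/// unchanged. Real savings should come in below this bound.
fn max_render_saving(lookup_share: f64, searches_before: u32, searches_after: u32) -> f64 {
    let remaining = f64::from(searches_after) / f64::from(searches_before);
    lookup_share * (1.0 - remaining)
}
```

With the ~12 % lookup share and the 4 to 1 search count from the text, max_render_saving(0.12, 4, 1) comes to about 0.09, so a measured render-time improvement well above 9 % on bouten-heavy documents would point to some effect other than the search count.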
Measurement procedure
Run before each minor release:
# Take a baseline against the previous release tag
git checkout v0.3.0
just samply-corpus --repeat 5 --out before.json.gz
git checkout -
# Take a current measurement
just samply-corpus --repeat 5 --out after.json.gz
# Diff at the function level
xtask trace compare before.json.gz after.json.gz
Numbers go in the table below at release time:
| Metric | Four-table | Single-table | Δ |
|---|---|---|---|
| Render hot path (corpus median, ns/doc) | to fill | to fill | to fill |
| Registry lookup CPU share (%) | to fill | to fill | to fill |
| End-to-end parse + render p50 (ms/doc) | to fill | to fill | to fill |
The repro environment is recorded in perf/samply.md. Pin the host
CPU, corpus version, and Rust toolchain so the table is comparable
across releases.