Module scan


Streaming $MFT scanner (ADR-0011).

The $MFT’s data runs are read in aligned 16 MiB chunks through our own volume handle. Records are fixed up and parsed per chunk, and the buffers are recycled, so peak RAM stays bounded at a few chunks. ntfs-reader provides the bootstrap (boot-sector geometry + record 0’s data runs) and the per-record attribute parsing types.
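As a rough illustration of the chunking described above, the sketch below splits a run map into read requests of at most one chunk each, rounded down to a whole number of records. It is not the crate's planner: `plan_chunks`, the `(start, len)` byte-range representation of a run, and the assumption that every run length is a multiple of the record size are all hypothetical simplifications.

```rust
/// Hypothetical planner: turns `(byte_offset, byte_len)` runs into read
/// requests no larger than `max_chunk`, each a whole number of records.
/// Assumes (for this sketch only) that every run length is a multiple of
/// `record_size`, so no record straddles two runs.
fn plan_chunks(runs: &[(u64, u64)], max_chunk: u64, record_size: u64) -> Vec<(u64, u64)> {
    assert!(record_size > 0 && max_chunk >= record_size);
    // Largest multiple of the record size that fits in one chunk.
    let step = max_chunk - max_chunk % record_size;
    let mut plan = Vec::new();
    for &(start, len) in runs {
        let mut off = 0;
        while off < len {
            // The last piece of a run may be shorter than a full chunk.
            let n = step.min(len - off);
            plan.push((start + off, n));
            off += n;
        }
    }
    plan
}
```

With a 16 MiB chunk and 1 KiB records, a 40 MiB run becomes two 16 MiB reads followed by one 8 MiB read, and a shorter run is passed through as a single read.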

Two layers of overlap (entry order stays byte-for-byte identical to a sequential scan):

  • a dedicated I/O thread reads chunk N+1 while chunk N parses (pipeline::run_chunk_pipeline; degrades to inline reads if the thread can’t start — scan_pipeline_fallbacks)
  • within a chunk, record sub-ranges parse on rayon workers that carry the WTF-8 encoding too (parse::parse_chunk); the builder then appends the worker batches in chunk order, so EntryId assignment is deterministic.
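The two layers of overlap can be sketched with the standard library alone. This is a simplified model, not the crate's code: it uses scoped `std::thread` workers where the real scanner uses rayon, in-memory `Vec<u32>` values stand in for chunks read from the volume, and `Entry`, `scan`, and the doubling "parse" step are hypothetical. The point it illustrates is that appending worker batches in chunk order makes ID assignment independent of the worker count.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for one indexed entry.
#[derive(Debug, PartialEq)]
struct Entry {
    id: usize,   // assigned by the builder, in append order
    record: u32, // value "parsed" from the chunk
}

/// Reads chunks on a dedicated thread while the caller parses the previous one.
fn scan(chunks: Vec<Vec<u32>>, workers: usize) -> Vec<Entry> {
    // Capacity 1: the reader stays at most one chunk ahead of the parser.
    let (tx, rx) = sync_channel::<Vec<u32>>(1);
    let reader = thread::spawn(move || {
        for chunk in chunks {
            if tx.send(chunk).is_err() {
                break; // parser went away; stop reading
            }
        }
    });

    let mut entries = Vec::new();
    for chunk in rx {
        // Split the chunk into contiguous sub-ranges, one per worker.
        let w = workers.max(1);
        let size = ((chunk.len() + w - 1) / w).max(1);
        let batches: Vec<Vec<u32>> = thread::scope(|s| {
            let handles: Vec<_> = chunk
                .chunks(size)
                .map(|sub| s.spawn(move || sub.iter().map(|r| r * 2).collect::<Vec<u32>>()))
                .collect();
            // Joining in spawn order keeps the batches in chunk order.
            handles.into_iter().map(|h| h.join().unwrap()).collect()
        });
        // The builder appends batches sequentially, so IDs are deterministic.
        for batch in batches {
            for record in batch {
                let id = entries.len();
                entries.push(Entry { id, record });
            }
        }
    }
    reader.join().unwrap();
    entries
}
```

Calling `scan` with one worker or with several yields the same entries with the same IDs, which mirrors the "identical to a sequential scan" property stated above.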

Modules§

deferred 🔒
Deferred $ATTRIBUTE_LIST name resolution (ADR-0011): name-bearing extension records are cached in RAM while the $MFT streams through, so this pass resolves names without disk reads; anything missing (cache cap, torn records) falls back to a targeted read of the live volume.
parse 🔒
Parallel chunk parsing (ADR-0011): record sub-ranges of one chunk fan out across rayon workers, each producing a ParsedBatch; the builder appends the batches in chunk order, so EntryId assignment is deterministic.
pipeline 🔒
Read-ahead pipeline (ADR-0011): record-aligned chunk planning over the $MFT run map, plus the dedicated I/O thread that reads chunk N+1 while chunk N parses. If the thread can’t start, the scan degrades to inline sequential reads (scan_pipeline_fallbacks).
probe 🔒
I/O strategy probe (fmf io-probe).
volume_io 🔒
Raw volume access: \\.\C:-style handles, the NTFS update-sequence fixup, and the logical→physical run map of the $MFT data stream.
walk
Non-elevated folder-walk scanner for scope mode (ADR-0024).
walk_id 🔒
Synthetic record numbers for scope-mode (folder-walk) indexing (ADR-0024).
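For readers unfamiliar with the update-sequence fixup that volume_io applies, here is a minimal sketch of the general NTFS mechanism. It is not this crate's implementation: `apply_fixup` and its error strings are hypothetical, and the 512-byte stride is assumed. The record header stores the update sequence array's offset at bytes 4..6 and its entry count at bytes 6..8; the first entry is the sequence number stamped over the last two bytes of each 512-byte sector, and the remaining entries hold the original bytes to restore.

```rust
const SECTOR: usize = 512; // fixup stride assumed by this sketch

/// Hypothetical helper: verifies and undoes the update-sequence protection
/// of one multi-sector record in place. Validates every sector before
/// writing, so a torn record is left unmodified.
fn apply_fixup(record: &mut [u8]) -> Result<(), &'static str> {
    if record.len() < 8 {
        return Err("record too short");
    }
    let usa_off = u16::from_le_bytes([record[4], record[5]]) as usize;
    let usa_count = u16::from_le_bytes([record[6], record[7]]) as usize;
    if usa_count < 2 {
        return Err("update sequence array too small");
    }
    let sectors = usa_count - 1; // first entry is the sequence number itself
    if usa_off + usa_count * 2 > record.len() || sectors * SECTOR > record.len() {
        return Err("update sequence array out of bounds");
    }
    let usn = [record[usa_off], record[usa_off + 1]];
    // Pass 1: every sector must end with the sequence number.
    for i in 0..sectors {
        let tail = (i + 1) * SECTOR - 2;
        if record[tail..tail + 2] != usn {
            return Err("torn record: sector tail does not match sequence number");
        }
    }
    // Pass 2: restore the original bytes saved in the array.
    for i in 0..sectors {
        let tail = (i + 1) * SECTOR - 2;
        let src = usa_off + 2 + i * 2;
        let saved = [record[src], record[src + 1]];
        record[tail..tail + 2].copy_from_slice(&saved);
    }
    Ok(())
}
```

A mismatch in pass 1 is the kind of "torn record" case that the deferred module describes falling back to a targeted read for.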

Structs§

ProbeStats
Throughput result of one measured $MFT read pass.
ScanStats
Statistics from a full index build.

Enums§

IoProbeMode
I/O strategy to measure for one $MFT read pass (ADR-0011).

Functions§

io_probe
Measure one $MFT read pass under mode. Elevation required (the same volume-handle rule as the scan).
scan_volume
Full initial scan: stream the volume’s $MFT and build the in-memory index. drive is a drive letter spec like C:.