Architecture and FFI Canonical Contract

This file is the canonical FFI contract — both the engine (Rust) and UI (C#) follow it; change signatures here first, then both sides. Design judgment and rationale live in docs/adr/.

Overall Structure

┌────────────────────────────────────────────────────┐
│ WinUI 3 app (C#/.NET, asInvoker)                     │
│   ViewModels ── IEngineClient (swap boundary)         │
│        ├─ PipeEngineClient (default: named pipe)      │
│        ├─ FfiEngineClient (--engine=inproc, elevated) │
│        └─ FakeEngineClient (--fake-engine)            │
└───────┬──────────────────────────┬─────────────────┘
        │ named pipe               │ C ABI (in-proc)
┌───────▼────────────────────┐ ┌──▼──────────────────┐
│ fmf-service (priv service,  │ │ fmf_engine.dll        │
│  LocalSystem, least-priv)   │ │  (fmf-ffi crate,      │
│  pipe server+SCM+flush     │ │   cdylib) conversion, │
│  wire def = fmf-proto rlib   │ │  handle mgmt,         │
│                            │ │  catch_unwind only    │
├────────────────────────────┴─┴─────────────────────┤
│ fmf-core (rlib): VolumeIndex / query /               │
│   mft scan (ntfs-reader) / usn tail / persist        │
└──────────────────────────────────────────────────────┘

1 FFI function = 1 pipe opcode, event callback = pipe push notification. The wire spec is canonical in the "Pipe Protocol" section of this document (design judgment in ADR-0016 / ADR-0017).

Module Map (1 file = 1 responsibility)

Narrative order = data-flow order (ingest: mft/scan→usn→index, search: query→engine, cross-cutting: diag/metrics).

fmf-contract/src/ machine-readable canonical contract (ADR-0018, zero deps, no logic): codes / opcodes
                 / events(EventKind) / options(SortKey/CaseMode/VolumeState+from_u32)
                 / pod(repr(C)+const layout pin) / volume(label 16B padded) / versions
                 / limits / counters(counter roster) / bin/gen-contract(EngineContract.g.cs
                 emitter) / tests/drift(generated-output match — always within cargo test)
fmf-core/src/
├─ mft.rs        $MFT record format (consumed by scan)
├─ scan/         mod(scan_volume+ScanStats) / volume_io(raw volume open+fixup)
│                / pipeline(16MiB×3 read-ahead+sequential degrade) / parse(rayon parallel+RecordArena)
│                / deferred(NameCache 128Ki+LazyRecordReader — degrade returned via stats)
│                / probe(io-probe measurement; independent of main flow)
├─ usn/          records / apply / session(journal tailing)
├─ index/        mod(types+re-exports+in-place merge) / core(VolumeIndex+reads+derived caches)
│                / mutate(USN mutations) / snapshot(persistence; unsafe POD confined here)
│                / builder(2-pass build+EXCLUDED propagation) / compact(compaction) / frn
│                / testutil(TestDir RAII etc.; feature "testutil" for other crates' tests)
├─ query/        mod(AST/compile surface+wire→QueryOptions conversion) / exec(search driver+materialize)
│                / sweep(pool-sweep candidate gen) / matchers(residual eval) / memo(DirPaths/OffsetTable)
├─ engine/       mod(Engine+lifecycle+EngineEvent::to_wire=single point of event mapping)
│                / volume(VolumeSlot+install_index+checkpoint — home of state)
│                / worker(volume thread+pure transition-decision fns: snapshot_decision etc. — drives flow)
│                / seams(SnapshotStore+JournalSource, 2 traits only; no additional ports = ADR-0018)
│                / worker_tests(non-elevated deterministic replay of failure paths)
│                / search(cross-volume+k-way merge) / results(ResultSet+fill_page=single impl of
│                  row+blob build+STALE check) / tests
├─ diag.rs       init_diag(sole bootstrap for all entry points) / resolve_log_dir / error_chain(4KiB)
│                / degrade!(warn+counter, atomic) / diag ring+sink
├─ metrics.rs / wtf8.rs
fmf-ffi/src/     lib(contract re-export+export pin) / error / handle / events
                 / volumes / blob / results / contract_tests(literal absolute-value pin+ABI layout
                 +null/error paths; an independent tripwire for canonical-source mis-edits). clippy.toml
                 forbids unwrap_or_default (compile-time rejection of silent swallow)
fmf-proto/src/   lib(contract re-export) / frame(16B header+length-prefixed codec)
                 / messages(payload codec — types in contract) / tests/golden(corpus pin)
fmf-service/src/ lib(module exposure — loopback tests drive the real server)
                 / pipe(overlapped I/O as Read/Write+listener; accept is a 2-wait on connect/stop Event)
                 / server(per connection: reader+2 workers+write mutex) / dispatch(opcode→Engine,
                 catch_unwind firewall, result-handle LRU64=evict is counter+warn) / events(Subscribe
                 +bounded queue 256) / config(service.json) / host(lock-loser 5s→60s retry)
                 / faults(--debug-faults: !!lag/!!panic/!!drop)
                 / security(SDDL build pin+SID capture+connect-time token check+dir DACL)
                 / svc(common serve core+SCM entry: Stop/PRESHUTDOWN→flush→graceful)
                 / main(run/install/uninstall --purge-data/start/stop/status). clippy.toml same as above
fmf-cli/src/     main(clap defs+dispatch only) / cmd/{index,stats,bench,io_probe,criterion_gate,diag}
                 / bench_support(BENCH_QUERIES+baseline JSON shape+median+TempSnapshotGuard)
app/FindMyFiles/
├─ Engine/       IEngineClient(boundary — interface+exception types only; CancellationToken on all async)
│                / EngineTypes(DTOs — synced with golden's actual shape) / EngineJson(sole definition of snake_case settings)
│                / Generated/EngineContract.g.cs(gen-contract generated; no hand-editing)
│                / EngineEventMarshaler(sole point of event→IDispatcher crossing)
│                / FakeEngineClient(contract-conformant: shares invalid_queries.json+BumpEpoch)
│                / PipeProtocol(codec — constants reference Generated) / PageCodec(row decode — same)
│                / NativeEngine(P/Invoke signatures+the other half of generated structs+startup SizeOf assert)
│                / EngineClientFactory(CLI>settings>auto selection)
│                / Transport/ PipeEngineClient(supervision+multiplexing only) / PipeConnection(ownership
│                  unit of a single connection — structural resolution of disconnect races) / PipeSearchResult / PipeServerIdentity
│                  / FfiEngineClient(callback guarded by generation counter)
├─ ViewModels/   MainViewModel(composition root) / SearchOrchestrator / ResultsPresenter
│                / NotificationCenter / PerfPanelViewModel / StatusFormatter / ResultRow
├─ Views/        PerfPanel(custom control for the F12 panel)
├─ Controls/     ResultsViewportManager(viewport save/restore, selection restore — UI thread only)
├─ Converters/   UiConverters(x:Bind static pure functions)
├─ Virtualization/ VirtualResultList(single lifetime+Reassign/epoch+per-epoch ct=double defense)
├─ Services/     IDispatcher(test seam) / DispatcherQueueDispatcher / Notifier / FileLog / ShellOps
│                / ExceptionPolicy(3 handlers+single home of crash marker)
│                / AppSettings(%APPDATA%\settings.json: engine mode etc.; corruption→warn+default+.bad save-aside)
└─ FindMyFiles.Tests/  xUnit(ManualDispatcher fake deterministically mimics the UI thread)
                 / Contract/(EngineClientContractTests abstract suite×4 derivations
                   + GoldenCorpusTests=identical byte pin across both languages)

Default visibility for new fields/methods is "within that responsibility's directory" (pub(super)). Exposure outside the crate is only via pub use in mod.rs.

Engine Internals Key Points

Only the current structure is described here. For decision rationale, measured evidence, and rejected alternatives, see docs/adr/.

  • VolumeIndex (per volume, struct-of-arrays): names use the fold-overflow layout (ADR-0004) — the sweep target is the single folded lower_pool; the original is kept only on a mismatch via orig_pool+orig_off (u32::MAX=identical to fold). Fold is length-preserving (ADR-0003). Size is a u32 column+overflow map (ADR-0007). FRN→EntryId is a sorted id permutation, keyed by indirection through the frn column (ADR-0005). The only always-maintained sort permutation is name; size/mtime order is lazily derived (ADR-0006). Path strings are not retained but lazily built via the parent chain. Deletions are tombstoned; compaction runs above a threshold.
  • Maintaining sort structure on USN batches: binary search for the insertion point+in-place segment move (index/mod.rs merge_sorted_tail, ADR-0008).
  • Compaction: the volume thread decides per batch apply (len≥100k && (tombstone>12.5% || dead_name_bytes>32MiB)). An ascending old-id remap means the perm/FRN indexes need no re-sort (ADR-0009). A copy is built under a read guard→install_index swaps it+structural bump→open result handles become hard STALE. Children of a dead dir go to root (push_raw's orphan policy).
  • FRN index lookup semantics: unmerged tail (newest first)→binary search. Always tombstone-survivor filtered (even with multiple pairs for the same key, at most one survives). The initial scan defers parent resolution to the parallel pass in finish().
  • Default exclusion (EXCLUDED): raw H/S attributes+a computed EXCLUDED bit (self or an ancestor is H|S). Queries skip these by default (lifted via include_hidden_system). Inheritance is propagated O(n) at scan finish, and recomputed from the parent on USN insert/move. Limitation: a subtree move out of an excluded branch is stale until the next rescan.
  • 2-layer generation: content_generation increments per USN batch (existing result handles can keep reading). structural_generation increments only on compaction/full rescan (existing handles become hard STALE=FMF_E_STALE). Replacement always goes through VolumeSlot::install_index (inheriting old+1; initial/snapshot restore does not bump). Not persisted in the snapshot (in-process monotonicity is enough).
  • Query-time materialize: per volume, one-pass-filter the permutation→a sort-order-finalized contiguous array+multi-volume k-way merge (single volume is a direct copy). Subsequent page fetches are O(1) slices. A column click=re-issue with a different sort.
  • Incremental search (query cache): VolumeSlot::last_query holds the previous (compiled, options, both generations, ids). When the conservative subsumption rules in query/subsume.rs (same sort, single AND group, needle containment/range narrowing/filter addition only; fold bridging is orig→folded direction only) provably narrow the result, query::refine filters the previous ids via full evaluation — O(previous hit count). Correctness via oracle test (refine==fresh), kill switch FMF_QUERY_CACHE=0, observed in QueryTrace.cache.
  • Locking: parking_lot::RwLock. Search=read, USN batch apply=write. The index has a single writer: one volume thread.
  • Threads: initial scan=1 thread per volume. USN tailing=1 thread per volume (blocking read→drain→batch apply). Stop via CancelSynchronousIo.
  • Initial scan: $MFT is streamed in 16MiB chunks (1 read-ahead thread+3 buffers; startup failure degrades to sequential read+counter); within a chunk, rayon parses 1MiB subranges in parallel. Chunk-order append makes EntryId assignment deterministically match the sequential version (equivalence gate=admin test). Deferred ($ATTRIBUTE_LIST) names are resolved from a RAM cache of extension records (ADR-0011).
  • Search execution: query→AST→CompiledTerm sequence (cost order, AND short-circuit). rayon parallel over 64k chunks. The sweep is always on lower_pool. An uppercase needle / Sensitive does a superset sweep of the fold needle+original residual verification, resolving the fold-identical entry O(1) (ADR-0004). dm: is local TZ. No NFC/NFD normalization (known limitation). Trigram index not adopted (ADR-0002).
  • Derived caches (OffsetTable/DirPaths/SizePerm/MtimePerm): generation-managed per content_generation, extended incrementally from the previous generation where possible (OffsetTable fully rebuilds above a stale ratio of n/8; watermark mismatch→warn+counter+rebuild). DirPaths is lazily built on the first path query, with separate fold/orig slots, extended incrementally as long as the dir-topology generation is unchanged. Byte counts are charged to the B/entry gate via IndexStats.derived_cache_bytes.
  • Persistence: {index_dir}\{drive-letter}.fmfidx (e.g. c.fmfidx), format FMFIDX04 (ADR-0010). temp→MoveFileEx(REPLACE_EXISTING). On startup: load→verify→USN replay→live tail. Failure always falls back to a full rescan.
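The compaction trigger and the two-layer generation rule above can be sketched as follows. This is a minimal illustration, not the engine's code: `should_compact`, `Generations`, and `ResultHandle` are hypothetical names, and it assumes the 12.5% threshold is the tombstone count relative to the entry count.

```rust
/// Thresholds quoted from the compaction bullet above.
const MIN_ENTRIES: usize = 100_000;
const DEAD_NAME_BYTES_LIMIT: u64 = 32 * 1024 * 1024; // 32 MiB

/// Status value from the error code table.
pub const FMF_E_STALE: i32 = 2;

/// Hypothetical helper: should this batch apply trigger a compaction?
/// `tombstones * 8 > len` is the integer form of "tombstone > 12.5%"
/// (assumption: the ratio is tombstones / entries).
pub fn should_compact(len: usize, tombstones: usize, dead_name_bytes: u64) -> bool {
    len >= MIN_ENTRIES
        && (tombstones * 8 > len || dead_name_bytes > DEAD_NAME_BYTES_LIMIT)
}

/// The two generation counters of one volume.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Generations {
    pub content: u64,
    pub structural: u64,
}

impl Generations {
    /// A USN batch only advances the content generation.
    pub fn on_usn_batch(&mut self) {
        self.content += 1;
    }

    /// Compaction or a full rescan advances the structural generation.
    pub fn on_structural_replace(&mut self) {
        self.structural += 1;
    }
}

/// Hypothetical result handle that remembers the structural generation
/// it was materialized against.
pub struct ResultHandle {
    structural_at_query: u64,
}

impl ResultHandle {
    pub fn new(g: Generations) -> Self {
        Self { structural_at_query: g.structural }
    }

    /// Content changes keep the handle readable; a structural change
    /// makes it hard STALE.
    pub fn check(&self, current: Generations) -> Result<(), i32> {
        if self.structural_at_query == current.structural {
            Ok(())
        } else {
            Err(FMF_E_STALE)
        }
    }
}
```

In this sketch a handle created before a USN batch stays valid afterwards, and only `on_structural_replace` turns `check` into `Err(FMF_E_STALE)`, which is the signal for the UI to re-issue the query.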

FFI Contract (C ABI)

Common conventions:

  • DLL name fmf_engine. All functions return an int32_t status (FMF_OK=0)+output args.
  • Strings are UTF-8 (file names are WTF-8: invalid surrogates preserved; the C# side restores UTF-16 via a dedicated decode).
  • Handles are opaque pointers. All functions are thread-safe. FFI re-entry from within a callback is forbidden.
  • catch_unwind at every entry → FMF_E_PANIC. The detail message is in fmf_last_error (thread-local).
  • Pointer/length contract (caller's responsibility): at the C ABI boundary, Rust cannot validate array length or allocated capacity.
    • (buf, cap) output buffer (fmf_list_volumes / fmf_index_status): buf must point to cap writable FmfVolumeStatus. The engine writes at most cap entries and returns the true total in *count (buf=NULL is a size query that writes only *count).
    • (volumes, n) input array (fmf_index_start): volumes must point to n valid NUL-terminated UTF-8 char*.
    • (roots, n, excludes, m) input arrays (fmf_index_start_scope): roots/excludes must point to n/m valid NUL-terminated UTF-8 char* (scope mode, ADR-0024/-0025).
    • POD pointers (FmfQueryOptions* / FmfVolumeStatus* / FmfEvent* …) must satisfy the declared #[repr(C)] size/alignment (C# marshals with the corresponding explicit layout, and fmf-contract pins it with compile-time offset_of assertions).
    • The engine null-checks every pointer and writes up to the cap limit, but cannot detect a length claim exceeding the actual allocation (undefined behavior). This contract is guaranteed by the sole caller, FfiEngineClient, constructing each array together with its length as a unit (this is why fmf-ffi uses #![allow(clippy::missing_safety_doc)] to delegate per-function safety notes to this section).
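The (buf, cap) convention above can be exercised with a two-call pattern: a size query with buf=NULL, then a fill call with a buffer built together with its length. The sketch below uses a stand-in function and struct (`demo_list_volumes`, `DemoVolumeStatus`, both hypothetical; the real FmfVolumeStatus layout is defined in fmf-contract), and returning FMF_E_INVALID_ARG for a null count pointer is an assumption.

```rust
pub const FMF_OK: i32 = 0;
pub const FMF_E_INVALID_ARG: i32 = 1;

/// Hypothetical stand-in for FmfVolumeStatus (not the real layout).
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq)]
pub struct DemoVolumeStatus {
    pub state: u32,
    pub entries: u64,
}

/// Stand-in for the engine side of fmf_list_volumes / fmf_index_status.
/// Writes at most `cap` entries and always reports the true total.
///
/// # Safety
/// `buf` must be null or point to `cap` writable entries; `count` must be
/// null or point to a writable u32.
pub unsafe fn demo_list_volumes(
    src: &[DemoVolumeStatus],
    buf: *mut DemoVolumeStatus,
    cap: u32,
    count: *mut u32,
) -> i32 {
    if count.is_null() {
        return FMF_E_INVALID_ARG; // assumption: the code used for a null out-pointer
    }
    unsafe { *count = src.len() as u32 };
    if buf.is_null() {
        return FMF_OK; // size query: only *count is written
    }
    let n = src.len().min(cap as usize);
    unsafe { std::ptr::copy_nonoverlapping(src.as_ptr(), buf, n) };
    FMF_OK
}

/// Caller side: the buffer and its capacity are constructed as one unit,
/// so the length claim can never exceed the allocation.
pub fn list_all(src: &[DemoVolumeStatus]) -> Result<Vec<DemoVolumeStatus>, i32> {
    let mut total = 0u32;
    let rc = unsafe { demo_list_volumes(src, std::ptr::null_mut(), 0, &mut total) };
    if rc != FMF_OK {
        return Err(rc);
    }
    let mut buf = vec![DemoVolumeStatus::default(); total as usize];
    let cap = buf.len() as u32;
    let rc = unsafe { demo_list_volumes(src, buf.as_mut_ptr(), cap, &mut total) };
    if rc != FMF_OK {
        return Err(rc);
    }
    // The total may have changed between the two calls; keep what was written.
    buf.truncate((total as usize).min(cap as usize));
    Ok(buf)
}
```

Deriving `cap` from `buf.len()` rather than passing an independent number is the point of the sketch: it mirrors the rule that the sole caller builds each array and its length together.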
// ── lifecycle ──
uint32_t fmf_abi_version(void);                         // currently 1; C# side checks at startup
// config_json: { "index_dir": "...", "log_dir": "...", "log_level": "info" } (required keys)
int32_t fmf_engine_create(const char* config_json, FmfEngineHandle* out);
int32_t fmf_engine_destroy(FmfEngineHandle h);          // joins internal threads+saves (explicit save is fmf_flush)

// ── events (fired from internal engine threads; receiver marshals to DispatcherQueue) ──
// kind: 1=Progress(volume, scanned) / 2=VolumeReady(volume, entries)
//       / 3=IndexChanged(200ms engine-side debounce, the only throttle)
//       / 4=RescanStarted(volume) / 5=VolumeFailed(volume) / 6=EngineError(severity)
typedef void (*FmfEventCb)(const FmfEvent* ev /*POD*/, void* user);
int32_t fmf_set_event_callback(FmfEngineHandle h, FmfEventCb cb, void* user); // cb=NULL to clear

// ── volumes and index ──
int32_t fmf_list_volumes(FmfEngineHandle h, FmfVolumeStatus* buf, uint32_t cap, uint32_t* count);
int32_t fmf_index_start(FmfEngineHandle h, const char* const* volumes, uint32_t n); // explicit start, async; elements are drive labels "C:"
int32_t fmf_index_start_scope(FmfEngineHandle h, const char* const* roots, uint32_t n, const char* const* excludes, uint32_t m); // non-elevated folder-walk (ADR-0024); excludes prune matching subtrees at walk time (ADR-0025)
int32_t fmf_index_status(FmfEngineHandle h, FmfVolumeStatus* buf, uint32_t cap, uint32_t* count);
// FmfVolumeStatus.state: Scanning / Ready / Rescanning / Failed
// queries always succeed over "Ready volumes only" (UI judges the partial-result InfoBar by state)

// ── query (synchronous, fast; sort finalized at query time) ──
// options: { sort: Name|Size|Mtime, dir: Asc|Desc, case_mode: Smart|Insensitive|Sensitive,
//            include_hidden_system: bool (default false = exclude H/S attributes and their descendants),
//            regex_mode: u32 (bit0=interpret the whole query as one regex, bit1=scope 0=name/1=full path) }
int32_t fmf_query(FmfEngineHandle h, const char* query_utf8,
                  const FmfQueryOptions* options, FmfResultHandle* out, uint64_t* out_count,
                  FmfBlob** out_trace /* nullable: QueryTrace JSON */);

// ── observability (JSON blob; same "engine allocates+free" pattern as FmfPage) ──
// FmfBlob { data: *const u8, len: u32 } — UTF-8 JSON
int32_t fmf_engine_stats(FmfEngineHandle h, FmfBlob** out); // MetricsSnapshot (recent trace, histograms, USN feed, per-column memory)
int32_t fmf_blob_free(FmfBlob*);
// ── page fetch: an engine-allocated contiguous block (row-header array+string blob). 1 P/Invoke, 1 copy ──
// FmfRow (48 bytes, no padding; fmf-ffi's contract_tests fix size/offset):
//   { entry_ref u64, frn u64, size u64, mtime i64,
//     name_off u32, parent_path_off u32, flags u32, name_len u16, parent_path_len u16 } + trailing blob
// returns FMF_E_STALE = structural_generation mismatch. UI re-issues the same query
int32_t fmf_result_page(FmfResultHandle r, uint64_t offset, uint32_t count, FmfPage** out);
int32_t fmf_page_free(FmfPage* p);
int32_t fmf_result_free(FmfResultHandle r);

// ── diagnostics ──
// len is in/out: in=buffer capacity, out=length written (excluding NUL). Insufficient capacity is silently
// truncated (always NUL-terminated). buf=NULL queries the required size.
int32_t fmf_last_error(char* buf, uint32_t* len);
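
As an illustration of the FmfRow layout described above, the following Rust sketch mirrors the listed fields, pins the 48-byte size at compile time, and slices a name out of the trailing blob. The field order follows the comment above; `slice_blob` is a hypothetical helper, and treating name_len/parent_path_len as byte lengths is an assumption.

```rust
use std::mem::{offset_of, size_of};

/// Mirror of the FmfRow fields listed in the contract comment.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FmfRow {
    pub entry_ref: u64,
    pub frn: u64,
    pub size: u64,
    pub mtime: i64,
    pub name_off: u32,
    pub parent_path_off: u32,
    pub flags: u32,
    pub name_len: u16,
    pub parent_path_len: u16,
}

// Compile-time layout pins: 4*8 + 3*4 + 2*2 = 48 bytes, no padding.
const _: () = assert!(size_of::<FmfRow>() == 48);
const _: () = assert!(offset_of!(FmfRow, name_off) == 32);
const _: () = assert!(offset_of!(FmfRow, name_len) == 44);

/// Hypothetical helper: bounds-checked slice of the string blob.
/// Returns raw bytes because file names are WTF-8, not guaranteed UTF-8.
pub fn slice_blob(blob: &[u8], off: u32, len: u16) -> Option<&[u8]> {
    let start = off as usize;
    let end = start.checked_add(len as usize)?;
    blob.get(start..end)
}

/// Name bytes of a row, with offsets relative to the start of the blob.
pub fn name_bytes<'a>(row: &FmfRow, blob: &'a [u8]) -> Option<&'a [u8]> {
    slice_blob(blob, row.name_off, row.name_len)
}

/// Parent path bytes of a row.
pub fn parent_path_bytes<'a>(row: &FmfRow, blob: &'a [u8]) -> Option<&'a [u8]> {
    slice_blob(blob, row.parent_path_off, row.parent_path_len)
}
```

A row whose offset or length runs past the blob yields `None` rather than a panic, which is the behavior a decoder of untrusted page data would want.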

Error code table (shared with the pipe protocol. Append-only, no renumbering — contract_tests pin the values): FMF_OK=0, FMF_E_INVALID_ARG=1, FMF_E_STALE=2, FMF_E_NOT_ADMIN=3, FMF_E_VOLUME=4, FMF_E_QUERY_SYNTAX=5, FMF_E_IO=6, FMF_E_LOCKED=7, FMF_E_PANIC=99. FMF_E_LOCKED = another process holds the index_dir writer lock (cross-process enforcement of the single-writer invariant; see the "Pipe Protocol" section).
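
The table above can be mirrored on the Rust side roughly as follows. The numeric values are taken from the table; the enum name `FmfStatus` and the `from_i32` helper are hypothetical and may not match fmf-contract's actual definitions.

```rust
/// Status codes shared by the C ABI and the pipe protocol.
/// Append-only: existing discriminants must never be renumbered.
#[repr(i32)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FmfStatus {
    Ok = 0,
    InvalidArg = 1,
    Stale = 2,
    NotAdmin = 3,
    Volume = 4,
    QuerySyntax = 5,
    Io = 6,
    Locked = 7,
    Panic = 99,
}

impl FmfStatus {
    /// Maps a raw status to a known code; unknown values yield None so a
    /// newer peer's appended codes are not silently misread.
    pub fn from_i32(v: i32) -> Option<Self> {
        Some(match v {
            0 => Self::Ok,
            1 => Self::InvalidArg,
            2 => Self::Stale,
            3 => Self::NotAdmin,
            4 => Self::Volume,
            5 => Self::QuerySyntax,
            6 => Self::Io,
            7 => Self::Locked,
            99 => Self::Panic,
            _ => return None,
        })
    }
}
```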

// ── explicit save (materialized in v2) ──
// Snapshot-saves only Ready volumes that are dirty (content_generation advanced since the last save).
// The service calls this internally on a schedule+at stop. Not exposed on the pipe
// (opcode 11 is a number reservation only — client-driven flush spamming is a DoS path that stops USN apply).
int32_t fmf_flush(FmfEngineHandle h);

Intentionally not included: fmf_entry_full_path (unnecessary since a row carries name+parent_path) / query cancel (queries are expected to take tens of ms, and the UI drops stale results via the generation counter; fmf_query_cancel can still be added later if queries ever get heavy).

Pipe Protocol (v2 service split)

The wire spec between fmf-service (privileged service) and the non-privileged UI. This section is canonical. The machine-readable definitions (error codes, opcodes, event kinds, POD, limits, version numbers) are held as the single canonical source by the zero-dependency leaf crate fmf-contract, and fmf-proto (the encode/decode implementation), fmf-ffi, and fmf-service radiate from it (ADR-0018). The former claim that "a cdylib cannot be depended on, so constants must be duplicated" misread Cargo: the only impossible direction is depending on a cdylib, and a cdylib can itself depend on a shared leaf crate such as fmf-contract. fmf-ffi's contract_tests remain as literal absolute-value pins, serving as an independent tripwire that detects mis-edits of the canonical source itself.

Transport

  • pipe name: \\.\pipe\fmf-engine-v2 (the protocol version is in the name; an incompatible change bumps the whole name — v1→v2 was the incompatible change of adding regex_mode to FmfQueryOptions, growing it 16→20B. ADR-0023)
  • byte mode (PIPE_TYPE_BYTE)+length-prefixed framing (message mode not used)
  • creation flags: FILE_FLAG_FIRST_PIPE_INSTANCE on the first instance only (detects name pre-emption; the 2nd and later instances use the same SDDL with no flag — squatting is impossible as long as the server holds the first instance)
    • PIPE_REJECT_REMOTE_CLIENTS on all instances. Instance limit 8 (excess is connection-rejected+pipe_connections_rejected counter)
  • DACL: explicit SDDL D:P(A;;GA;;;SY)(A;;GRGW;;;<user SID>) — only SYSTEM and the user SID captured at install. Authenticated Users not adopted (name leak on multi-user machines). Allowing Administrators also fails (a UAC-filtered token becomes deny-only, so the non-elevated UI cannot connect). As defense in depth, on connection accept the client token is checked against authorized_sids in service.json (ImpersonateNamedPipeClient reads the client SID)
  • The client opens the pipe at identification level (C# TokenImpersonationLevel.Identification / Rust SECURITY_SQOS_PRESENT | SECURITY_IDENTIFICATION). Left at the default anonymous level, the server's ImpersonateNamedPipeClient only gets an anonymous token, and the SID check above rejects even an authorized user's connection (pipe client token rejected). This trap is not exposed by console-mode tests that skip the check because authorized_sids is empty — it only shows up with an installed service
  • client-side verification: for the default pipe name, GetNamedPipeServerProcessId → checked against the PID of the SCM-registered fmf-engine service (QueryServiceStatusEx) (anti-fake-server). Works in the non-elevated UI — a SYSTEM process's token cannot be opened non-elevated (ACCESS_DENIED), and the session 0 identity is unobtainable, so SYSTEM token checking cannot be used. A squatter cannot do SCM registration (admin required), so the PID will not match. When --pipe-name is specified (tests), verification is skipped

Frame (16-byte LE header+payload)

struct FrameHeader {            // 16 bytes, little-endian
    uint32_t len;               // payload length (excluding header). limit 16 MiB
    uint16_t opcode;            // see table below
    uint16_t flags;             // bit0=response, bit1=event push
    uint32_t request_id;        // request/response correlation. event push is 0
    int32_t  status;            // valid only on responses. error code table (shared with FFI)
};
  • malformed frame (unknown opcode, len overflow, truncation) = disconnect+pipe_malformed_frames counter+warn
  • an error response (status != 0) carries UTF-8 detail in the payload (the mapping of fmf_last_error — thread-local pull does not exist on the pipe)
  • requests are multiplexed by request_id (out-of-order completion allowed)
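
A minimal sketch of the 16-byte header codec, assuming the field order of the struct above. The flag constant names and `FrameKind` are hypothetical, the 16 MiB limit is treated as inclusive (an assumption), and unknown-opcode rejection is left out.

```rust
pub const HEADER_LEN: usize = 16;
pub const MAX_PAYLOAD: u32 = 16 * 1024 * 1024; // 16 MiB
pub const FLAG_RESPONSE: u16 = 1 << 0;
pub const FLAG_EVENT: u16 = 1 << 1;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FrameHeader {
    pub len: u32,
    pub opcode: u16,
    pub flags: u16,
    pub request_id: u32,
    pub status: i32,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FrameKind {
    Request,
    Response,
    Event,
}

impl FrameHeader {
    /// Serializes the header as 16 little-endian bytes.
    pub fn encode(&self) -> [u8; HEADER_LEN] {
        let mut b = [0u8; HEADER_LEN];
        b[0..4].copy_from_slice(&self.len.to_le_bytes());
        b[4..6].copy_from_slice(&self.opcode.to_le_bytes());
        b[6..8].copy_from_slice(&self.flags.to_le_bytes());
        b[8..12].copy_from_slice(&self.request_id.to_le_bytes());
        b[12..16].copy_from_slice(&self.status.to_le_bytes());
        b
    }

    /// Parses a header; an oversized `len` is a malformed frame, which the
    /// server answers by disconnecting.
    pub fn decode(b: &[u8; HEADER_LEN]) -> Result<Self, &'static str> {
        let len = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        if len > MAX_PAYLOAD {
            return Err("payload length exceeds the 16 MiB limit");
        }
        Ok(Self {
            len,
            opcode: u16::from_le_bytes([b[4], b[5]]),
            flags: u16::from_le_bytes([b[6], b[7]]),
            request_id: u32::from_le_bytes([b[8], b[9], b[10], b[11]]),
            status: i32::from_le_bytes([b[12], b[13], b[14], b[15]]),
        })
    }

    /// Classifies by flags first: event frames reuse opcode numbers 1..=6,
    /// so the opcode alone cannot identify the frame kind.
    pub fn kind(&self) -> FrameKind {
        if self.flags & FLAG_EVENT != 0 {
            FrameKind::Event
        } else if self.flags & FLAG_RESPONSE != 0 {
            FrameKind::Response
        } else {
            FrameKind::Request
        }
    }
}
```

Checking the event bit before anything else is deliberate: an event frame and a request frame can carry the same opcode number.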

Opcode table (correspondence to FFI functions)

Payload-notation legend: a type-annotated {} = little-endian, no-padding POD byte sequence. "JSON" = UTF-8 JSON, field names are snake_case (serde default). POD+variable-length data are concatenated with no gaps in the listed order. The volume identifier is everywhere a drive-label string "C:" (GUIDs not used). For both binary and JSON, the representative messages are pinned as identical golden frames (byte sequences) in both the Rust and C# suites. The canonical corpus is contract/golden/ (repository root): fmf-proto tests/golden.rs and fmf-core tests/golden_json.rs capture and pin them, and on the C# side GoldenCorpusTests independently decode→re-encode the same files and pin them. Re-capture is only an explicit run with FMF_BLESS=1 (the ritual for an intentional contract change — ADR-0018).

op | name | FFI mapping | payload (req → resp)
1 | Hello | fmf_abi_version | {protocol_version:u32} → {protocol_version:u32, abi_version:u32, server_pid:u32} (version mismatch is INVALID_ARG+disconnect)
2 | Subscribe | fmf_set_event_callback(cb≠NULL) | empty → empty. events pushed to this connection thereafter
3 | Unsubscribe | fmf_set_event_callback(NULL) | empty → empty
4 | ListVolumes | fmf_list_volumes | empty → JSON [{"volume":"C:","state":0,"entries":0}] (state equals FmfVolumeStatus.state)
5 | IndexStart | fmf_index_start | JSON {"volumes":["C:"]} → empty (persisted to service.json)
6 | IndexStatus | fmf_index_status | empty → JSON (same shape as ListVolumes)
7 | Query | fmf_query | FmfQueryOptions (20B POD below)+UTF-8 query string (length derived from frame len, no NUL terminator) → {result_id:u64, count:u64}+QueryTrace JSON
8 | ResultPage | fmf_result_page | {result_id:u64, offset:u64, count:u32} → {row_count:u32, blob_len:u32}+FmfRow (48B)× row_count (densely packed)+string blob (blob_len bytes, WTF-8). name_off/parent_path_off are byte offsets relative to the start of the blob (same layout as the FFI FmfPage)
9 | ResultFree | fmf_result_free | {result_id:u64} → empty
10 | Stats | fmf_engine_stats | empty → MetricsSnapshot JSON (same shape as FFI, snake_case)
11 | (Flush reserved) | fmf_flush | number reserved only, not implemented: client-driven flush spamming is a local DoS path that stops USN apply by repeatedly holding index.read(). Saving is the service's internal responsibility
12 | ServiceInfo | (service-specific) | empty → JSON {uptime_ms, connections, version}

FmfQueryOptions (20B, no padding, LE — pinned by a contract test like FmfRow): { sort:u32@0(0=Name 1=Size 2=Mtime), desc:u32@4(0=Asc 1=Desc), case_mode:u32@8(0=Smart 1=Insensitive 2=Sensitive), include_hidden_system:u32@12(0/1), regex_mode:u32@16(bit0=treat the whole query as one regex, bit1=scope 0=name/1=full path, high bits reserved 0) }
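
To make the 20-byte layout and the Query request payload concrete, here is a sketch under the offsets given above. `encode_query_payload` and `decode_query_payload` are hypothetical helper names, not the fmf-proto API.

```rust
use std::mem::{offset_of, size_of};

/// Mirror of FmfQueryOptions: five u32 fields, no padding.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FmfQueryOptions {
    pub sort: u32,                  // 0=Name 1=Size 2=Mtime
    pub desc: u32,                  // 0=Asc 1=Desc
    pub case_mode: u32,             // 0=Smart 1=Insensitive 2=Sensitive
    pub include_hidden_system: u32, // 0/1
    pub regex_mode: u32,            // bit0=whole query is a regex, bit1=scope
}

// Compile-time layout pins matching the @offset annotations.
const _: () = assert!(size_of::<FmfQueryOptions>() == 20);
const _: () = assert!(offset_of!(FmfQueryOptions, case_mode) == 8);
const _: () = assert!(offset_of!(FmfQueryOptions, regex_mode) == 16);

/// Builds the opcode 7 request payload: 20 LE bytes of options followed by
/// the UTF-8 query with no NUL terminator (its length comes from frame len).
pub fn encode_query_payload(opts: &FmfQueryOptions, query: &str) -> Vec<u8> {
    let mut out = Vec::with_capacity(20 + query.len());
    for field in [
        opts.sort,
        opts.desc,
        opts.case_mode,
        opts.include_hidden_system,
        opts.regex_mode,
    ] {
        out.extend_from_slice(&field.to_le_bytes());
    }
    out.extend_from_slice(query.as_bytes());
    out
}

/// Splits a request payload back into options and query text.
/// Returns None on a short payload or invalid UTF-8.
pub fn decode_query_payload(payload: &[u8]) -> Option<(FmfQueryOptions, &str)> {
    if payload.len() < 20 {
        return None;
    }
    let (head, tail) = payload.split_at(20);
    let u = |i: usize| u32::from_le_bytes([head[i], head[i + 1], head[i + 2], head[i + 3]]);
    let opts = FmfQueryOptions {
        sort: u(0),
        desc: u(4),
        case_mode: u(8),
        include_hidden_system: u(12),
        regex_mode: u(16),
    };
    let query = std::str::from_utf8(tail).ok()?;
    Some((opts, query))
}
```

Encoding field by field instead of transmuting the struct keeps the wire bytes little-endian regardless of the host.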

Mapping exceptions (C ABI specific, not present on the pipe): fmf_engine_create/fmf_engine_destroy (absorbed into connection establish/disconnect and service lifetime), fmf_page_free/fmf_blob_free (ownership moves to the client on frame receipt), fmf_last_error (inline detail in error responses).

Event push

  • To a Subscribed connection, push flags=event, request_id=0, opcode=event kind (equal to FFI kind 1–6) with the FmfEvent-equivalent POD {kind:u32, _pad:u32, entries:u64, volume:[u8;16]}. volume is a UTF-8 drive label ("C:") 0x00-padded (not a GUID)
  • per connection a bounded queue (256)+a dedicated writer thread. When full, drop the oldest+pipe_events_dropped counter+warn — a slow/non-reading client never blocks the volume thread (never hangs). A dropped IndexChanged-class event self-heals on the next re-query
  • because an event frame carries the event kind (1–6) in opcode, its number overlaps with request opcodes — always discriminate first by the event bit in flags (do not dispatch on opcode alone)
  • the client's (re)connect sequence is fixed (this section is canonical): Hello → Subscribe → IndexStatus → forced IndexChanged fire. The last IndexChanged is synthesized locally by the client (the server does not send it) — to pick up, via re-query, changes missed while disconnected
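
The drop-oldest policy of the per-connection event queue can be sketched as below. `EventQueue` is a hypothetical, single-threaded illustration; the real queue sits between the volume thread and a dedicated writer thread and would need synchronization, and the counter here only stands in for pipe_events_dropped.

```rust
use std::collections::VecDeque;

/// Bounded FIFO that never blocks the producer: when full, the oldest
/// event is discarded and counted.
pub struct EventQueue<T> {
    items: VecDeque<T>,
    cap: usize,
    dropped: u64,
}

impl<T> EventQueue<T> {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be non-zero");
        Self { items: VecDeque::with_capacity(cap), cap, dropped: 0 }
    }

    /// Enqueues an event, evicting the oldest one if the queue is full.
    pub fn push(&mut self, ev: T) {
        if self.items.len() == self.cap {
            self.items.pop_front();
            self.dropped += 1;
        }
        self.items.push_back(ev);
    }

    /// Dequeues the oldest surviving event (writer side).
    pub fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }

    pub fn is_empty(&self) -> bool {
        self.items.is_empty()
    }

    /// Number of events discarded so far.
    pub fn dropped(&self) -> u64 {
        self.dropped
    }
}
```

With a capacity of 256, a client that stops reading loses only the oldest events, and the producer's `push` always completes in constant time.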

Result handle (result_id) lifetime

  • the server holds ResultSets in a per-connection registry. Freed by ResultFree or disconnect
  • limit 64/connection. On excess, evict the least-recently-accessed (LRU), and a subsequent ResultPage for that result_id returns FMF_E_STALE (detail includes "evicted" to make it distinguishable from a structural generation change). The client recovers via the existing STALE→re-query path
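
A per-connection registry with this eviction rule might look like the following sketch. `ResultRegistry` is a hypothetical name; the linear scan for the least-recently-used entry is acceptable at a limit of 64, and returning FMF_E_STALE for an id that was never issued is an assumption of the sketch.

```rust
use std::collections::HashMap;

pub const FMF_E_STALE: i32 = 2;

/// Holds at most `cap` result sets, evicting the least recently accessed.
pub struct ResultRegistry<T> {
    cap: usize,
    tick: u64,    // logical clock for recency
    next_id: u64, // last issued result_id
    evicted: u64, // stand-in for the eviction counter
    entries: HashMap<u64, (u64, T)>, // id -> (last_used, value)
}

impl<T> ResultRegistry<T> {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be non-zero");
        Self { cap, tick: 0, next_id: 0, evicted: 0, entries: HashMap::new() }
    }

    /// Registers a result set and returns its result_id.
    pub fn insert(&mut self, value: T) -> u64 {
        if self.entries.len() >= self.cap {
            let victim = self
                .entries
                .iter()
                .min_by_key(|(_, slot)| slot.0)
                .map(|(id, _)| *id);
            if let Some(id) = victim {
                self.entries.remove(&id);
                self.evicted += 1;
            }
        }
        self.tick += 1;
        self.next_id += 1;
        let id = self.next_id;
        self.entries.insert(id, (self.tick, value));
        id
    }

    /// Looks up a result set and marks it as recently used.
    pub fn get(&mut self, id: u64) -> Result<&T, i32> {
        self.tick += 1;
        let tick = self.tick;
        match self.entries.get_mut(&id) {
            Some(slot) => {
                slot.0 = tick;
                Ok(&slot.1)
            }
            None => Err(FMF_E_STALE),
        }
    }

    /// ResultFree: returns true if the id was present.
    pub fn free(&mut self, id: u64) -> bool {
        self.entries.remove(&id).is_some()
    }

    pub fn evicted(&self) -> u64 {
        self.evicted
    }
}
```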

Single-writer exclusion (cross-process)

  • Engine::new opens {index_dir}\.writer.lock in share-mode 0 and holds it for its lifetime. Failure is FMF_E_LOCKED. It auto-releases when the OS handle vanishes, so a stale lock never occurs
  • the service as the loser (in-proc UI got there first): backoff retry (5s→60s cap)+logs the holding process pid. Stops with an exit code that does not trigger an SCM failure-recovery (restart) loop
  • the UI as the loser (--engine=inproc while the service is running): an explanatory InfoBar ("Service is running. To use in-proc, run just service-stop")
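
The lock-loser retry schedule (5s→60s cap) can be modeled with a small capped backoff. The text above only gives the start and the cap, so the doubling step is an assumption, and `Backoff` is a hypothetical helper.

```rust
use std::time::Duration;

/// Capped exponential backoff: each delay doubles until it reaches `cap`.
pub struct Backoff {
    initial: Duration,
    cap: Duration,
    next: Duration,
}

impl Backoff {
    pub fn new(initial: Duration, cap: Duration) -> Self {
        Self { initial, cap, next: initial.min(cap) }
    }

    /// Returns the delay to wait before the next attempt and advances
    /// the schedule (assumed doubling).
    pub fn next_delay(&mut self) -> Duration {
        let d = self.next;
        self.next = (self.next * 2).min(self.cap);
        d
    }

    /// Call after a successful attempt to start over from `initial`.
    pub fn reset(&mut self) {
        self.next = self.initial.min(self.cap);
    }
}
```

With `Backoff::new(Duration::from_secs(5), Duration::from_secs(60))` the delays run 5s, 10s, 20s, 40s, then stay at 60s. The same shape would fit the client's 250ms→5s reconnect schedule.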

Per-machine settings %ProgramData%\find-my-files\service.json (service-owned)

{ "volumes": ["C:"], "log_level": "info", "flush_interval_secs": 300, "authorized_sids": ["S-1-5-21-…"] }
  • fmf-service install creates it together with capturing the user SID. IndexStart receipt persists volumes. The initial default is all fixed NTFS volumes. The non-elevated UI forwards its own SID via --owner-sid, and install validates it with validate_user_sid (accepts only the real user type=SidTypeUser) before appending it to authorized_sids — because under OTS elevation (elevating with a different admin account) install's own SID differs from the everyday user's
  • authorized_sids is read exactly once at service start and baked into DACL construction and connect-time token checking (immutable while running). Reflecting an added SID requires fmf-service restart (= stop→start) — an in-place install alone does not affect a running instance (it keeps rejecting with the old allow list). The app's "register/re-register the service" runs install→restart in sequence
  • ownership is separated from the per-user %APPDATA%\find-my-files\settings.json (UI-owned)

C# Side Contract

  • IEngineClient (swap boundary): SearchAsync(query, options) → SearchOutcome(ISearchResult, QueryTrace) / GetStatsAsync / ListVolumesAsync / StartIndexingAsync / GetStatusAsync (3 methods changed to return Task in v2 — a synchronous call across the pipe is a "never hang" violation on the UI thread) / event IndexChanged / event VolumeUpdated / event EngineErrorOccurred / EngineConnectionState Connection { get; } + event ConnectionChanged (InProc | Connecting | Connected | Reconnecting; Ffi/Fake are fixed to InProc). The 3 implementations Fake/FFI/Pipe follow the same interface.
  • Engine selection (EngineClientFactory): CLI --fake-engine / --engine=pipe|inproc > settings.json "engine" (default auto) > auto. Auto runs a 250ms pipe probe: on success it uses Pipe; on failure it branches on service state (ServiceSetup.QueryState):
    • service running (= it holds writer.lock, so a probe failure means the token is unauthorized): do not create in-proc even when elevated (reliably avoids an FMF_E_LOCKED collision); use an empty engine with a "re-register" affordance.
    • service absent/stopped and the process is elevated: Ffi (writer.lock is free).
    • neither possible: an explanatory InfoBar+empty engine (zero-result FakeEngineClient.CreateEmpty(), badge "not connected"; no demo data, because fake data has no practical use in a search app)+a "Restart as administrator" button (explicit action only; no automatic runas loop; forwards the non-elevated user's SID via --setup-owner).
    • When started elevated in-proc with the service unregistered/stopped, an in-app notification sets it up with one click (ServiceSetup → fmf-service install --owner-sid+restart; install is idempotent on the service side, and restart reflects the new authorized_sids). This gives an onboarding path that never opens a terminal: normal start → "Restart as administrator" → "Register and start the service" → normal start forever after.
    • The Fake with data is --fake-engine (development/UI test) only.
  • Disconnect and reconnect (PipeEngineClient): on disconnect, fail in-flight requests immediately with EngineUnavailableException and epoch-invalidate surviving ISearchResults (afterwards GetRangeAsync → StaleResultException, so the existing re-query mechanism is the recovery path), then reconnect indefinitely with backoff (250ms→5s). The reconnect sequence is canonical in the "Pipe Protocol" section (VolumeUpdated events are synthesized and fired from the IndexStatus response). Requests have a default timeout of 10s.
  • SearchResultHandle : SafeHandle. Page fetches bracket DangerousAddRef/Release, and do not release the underlying object even after Dispose() until in-flight fetches complete.
  • page received → copy to ResultRow → immediately fmf_page_free.
  • the callback delegate is held in a client field (prevents GC reclamation). After receipt, to the UI via DispatcherQueue.TryEnqueue.
  • Search pipeline responsibility split (MainViewModel is the composition root only):
    • SearchOrchestrator — when and what to search: 50ms debounce (clear is immediate), Dispose of stale results via the generation counter, RequeryOrigin classification, bounded Stale retry (1×), exception classification. An empty query is not sent to the engine (the product rule that an empty field has no results to return; a match-all enumeration would have its IDs shift every USN tick, so the start screen would redraw forever) — empty screen via PresentEmpty (idempotent). During IME composition the query is held (TextCompositionStarted/Ended; only the committed string flows through the normal debounce). Focused mode (focused search) = a pure query rewrite just before passing to the engine (FocusedQueryRewriter: add a !path: exclusion and one ext: whitelist item to each OR group; do not add ext to an explicit ext:/regex: group, nor an exclusion to a group containing path:/\) — does not touch the engine; settings in settings.json, ADR-0019.
    • ResultsPresenter — presenting results: prefetch the visible-range page before publishing, then publish atomically via VirtualResultList.Reassign (the old results stay on screen until the new ones are ready=zero blank frames). Count text and viewport placement events.
  • two re-query families (RequeryOrigin classifies): type/clear/sort/filter-originated=reset to top / IndexChanged/VolumeReady/Stale-originated=save the top visible index→restore, and selection restored best-effort only when an EntryRef in the seed matches.
  • VirtualResultList (non-generic IList+INCC+IItemsRangeInfo): a single instance with the same lifetime as the page (ItemsSource is x:Bind OneTime; replacing it discards the ListView virtualization state and causes flicker). New results are Reassign(result, seeds) = epoch++ → discard the page cache → apply seeds → emit INCC Reset once (UI thread only). A re-query that returns the same result (guaranteed by the engine via QueryTrace.unchanged: same text+options and the ID sequence memcmp-matches on every volume) is RefreshInPlace = epoch++ → swap the handle → in-place fill the visible seed into existing row instances (the MVVM setter notifies only on value change) → no Reset, count text unchanged. As a result, the screen does not redraw on the re-query that idle USN traffic (logs, telemetry, etc.) triggers every 200ms, and an in-place update of size/mtime touches only the cells whose value changed. The indexer never fetches; it returns a placeholder (out of range throws immediately: no negative index, no fabricated fake page). On RangesChanged, background-fetch the visible range ±1 page in 64-row units → fill properties of existing ResultRows. Completion of an old-epoch fetch is silently discarded. Page LRU limit 4096 rows. Hard STALE receipt → BecameStale (only on epoch match) → the Orchestrator re-queries.
  • IList contract invariant (do not falsely affirm membership): XAML blindly trusts the answers of Contains/IndexOf/GetAt via the WinRT adapter. A false "absent" is fixed by container re-realization, but a false "present" causes a crash deep in XAML at GetAt(staleIndex) (proven: the root of the Int32.MaxValue-1 exception that reliably reproduced on search-with-results→clear-all). Membership is defined as "index is below Count AND the corresponding slot in the current page cache is that same instance". A row of an old result, a row of an LRU-evicted page, and a temporary row for enumeration always answer absent. Enumeration/CopyTo do not disturb the virtualization state (LRU). The UI-thread check of the mutation family (Reassign/RefreshInPlace) is always active in Release.
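
The engine-selection precedence in the EngineClientFactory bullet can be written as a small pure function. The sketch below is illustrative only: the real factory is C#, and the type and function names here (Requested, ServiceState, Engine, resolve_request, select_engine) are hypothetical. It also assumes that an explicit --engine=pipe|inproc choice is honored as-is, which the text does not spell out.

```rust
/// What the CLI or settings.json asked for (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Requested { Auto, Pipe, InProc, Fake }

/// Coarse service state, as a stand-in for ServiceSetup.QueryState.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ServiceState { Running, Stopped, Absent }

/// Outcome of the selection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Engine {
    Pipe,
    Ffi,
    FakeWithData,
    /// Zero-result engine plus a "re-register" affordance.
    EmptyOfferReRegister,
    /// Zero-result engine plus a "Restart as administrator" button.
    EmptyOfferRestartAsAdmin,
}

/// CLI beats settings.json; settings.json beats the default (auto).
fn resolve_request(cli: Option<Requested>, settings: Option<Requested>) -> Requested {
    cli.or(settings).unwrap_or(Requested::Auto)
}

/// `probe_ok` is the result of the 250ms pipe probe.
fn select_engine(
    requested: Requested,
    probe_ok: bool,
    service: ServiceState,
    elevated: bool,
) -> Engine {
    match requested {
        Requested::Fake => Engine::FakeWithData,
        // Assumption: explicit choices are honored without further checks.
        Requested::Pipe => Engine::Pipe,
        Requested::InProc => Engine::Ffi,
        Requested::Auto if probe_ok => Engine::Pipe,
        Requested::Auto => match (service, elevated) {
            // A running service holds writer.lock: never go in-proc.
            (ServiceState::Running, _) => Engine::EmptyOfferReRegister,
            // Service absent/stopped and elevated: writer.lock is free.
            (_, true) => Engine::Ffi,
            // Nothing usable: explain and offer an explicit restart.
            (_, false) => Engine::EmptyOfferRestartAsAdmin,
        },
    }
}
```

Keeping the decision free of I/O (the probe result and service state are passed in) makes every branch of the matrix cheap to unit-test.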

Error Handling and Diagnostics (principle: "never crash, never hang, never go silent")

Every anomaly always reaches 3 paths: (1) the log file (2) the diag ring (=auto-displayed in the F12 panel/fmf stats) (3) the UI InfoBar. No telemetry is sent (local only).

  • Logs: engine=%ProgramData%\find-my-files\logs\engine.log (daily rotation, filter via the FMF_LOG env var), app=%APPDATA%\find-my-files\logs\app.log (one-generation rotation at 2MB)
  • diag ring (fmf-core::diag): holds the most recent 128 tracing events at WARN or above+panics (with backtrace). Always included in MetricsSnapshot.recent_errors
  • panic: caught by a global hook→log+ring. The volume thread has a catch_unwind firewall, so even on panic the UI always receives VolumeFailed (no silent hang)
  • Event kind 6 FMF_EVENT_ENGINE_ERROR: a POD notification that a diag event occurred (entries=severity 1=warn/2=error/3=panic). Detail text is pulled from the stats JSON (push notification+pull detail)
  • Degradation recording convention (ADR-0018): a degradation path uses fmf_core::degrade! (the only way to do tracing::warn!+counter increment atomically; rg degrade! = enumerates all degradation paths). The batch path inside scan is the sole exception, returning the degradation in a ScanStats field and mapping it to counters+warn in one place at the worker layer (do not scatter the macro across the hot path). The boundary crates (fmf-ffi / fmf-service) forbid unwrap_or_default via disallowed-methods in clippy.toml — a silent fallback is rejected at compile time
  • Canonical source of counter names: fmf-contract::counters::COUNTER_NAMES (C#'s CountersData is generated by gen-contract, and fmf-core's golden test reconciles CountersSnapshot's serde keys with the roster — a missing addition is mechanically detected)
  • Degradation counters (MetricsSnapshot.counters, shown in F12 if nonzero): stat_fetch_failures / usn_batches_truncated / snapshot_load_failures / snapshot_save_failures / deferred_names_unresolved / corrupt_mft_records / journal_rescans / scan_pipeline_fallbacks (scan read-ahead I/O thread startup failure→degrade to sequential read) / offset_table_rebuild_fallbacks (offset table watermark mismatch→degrade to full rebuild) / lazy_perm_rebuild_fallbacks (the same kind of defense for the lazy sort permutation) / compaction_aborts (generation mismatch during compaction→discard the copy. Detects a break of the single-writer invariant) / pipe_malformed_frames (malformed frame→disconnect) / pipe_events_dropped (event bounded-queue overflow→drop oldest) / pipe_connections_rejected (instance limit exceeded) / deferred_name_cache_overflow (extension-record name cache full→degrade to disk read) / deferred_name_read_failures (disk-read failure of lazy name resolution) / pipe_results_evicted (LRU eviction of a result handle) / trace_serialize_failures (QueryTrace JSON-ification failure→respond with an empty trace)
  • Single implementation of error detail: fmf_core::diag::error_chain (joins all causes, 4KiB limit+"…" truncation) — both FFI fmf_last_error and the pipe error-response payload use this
  • Single home of diagnostics init: fmf_core::diag::init_diag(log_dir, level) (logging+panic hook+diag ring connect, idempotent) is called by all entry points: FFI / service / CLI. log_dir resolution is resolve_log_dir: explicit specification (config/CLI) > a logs subdir of the engine's index_dir (co-located with the index, so it shares the index's writable, non-machine-wide pollution domain) — there is no machine-wide fallback (%ProgramData% dirtied the machine for non-elevated callers and panicked when unwritable); the machine service still logs to %ProgramData%\find-my-files\logs by passing it explicitly. This priority is implemented in only this one place
  • C# convention: fire-and-forget always uses task.Forget(area) (exception→app.log+InfoBar). Shell operations go through ShellOps. A global exception handler writes a crash marker and notifies on the next start
  • Diagnostics copy: the F12 panel's "Copy diagnostics" = stats JSON+tail of app.log+environment info
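
As an illustration of the "single implementation of error detail" bullet, here is a minimal sketch of a cause-joining helper with a 4KiB cap. It is not the actual fmf_core::diag::error_chain: the ": " separator, the choice to count the "…" inside the 4KiB limit, and the Layered test type are assumptions made for the example.

```rust
use std::error::Error;
use std::fmt;

/// 4KiB cap on the joined detail text (from the text above).
const MAX_DETAIL_BYTES: usize = 4 * 1024;

/// Joins an error and all of its sources into one string.
/// Assumption: causes are separated by ": ".
fn error_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = err.to_string();
    let mut cur = err.source();
    while let Some(cause) = cur {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        cur = cause.source();
    }
    truncate_with_ellipsis(out, MAX_DETAIL_BYTES)
}

/// Truncates to at most `limit` bytes, ending with "…" when cut.
/// `limit` must be at least 3 (the UTF-8 length of "…").
fn truncate_with_ellipsis(mut s: String, limit: usize) -> String {
    if s.len() <= limit {
        return s;
    }
    let ellipsis = "…";
    let mut cut = limit - ellipsis.len();
    // Never split a multi-byte character.
    while !s.is_char_boundary(cut) {
        cut -= 1;
    }
    s.truncate(cut);
    s.push_str(ellipsis);
    s
}

/// Hypothetical error type with an optional cause, for demonstration.
#[derive(Debug)]
struct Layered {
    msg: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl fmt::Display for Layered {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.msg)
    }
}

impl Error for Layered {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}
```

Having one helper shared by the FFI and pipe paths means both surfaces truncate identically, so a diagnostic copied from either side reads the same.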

FFI code | Meaning | UI behavior | Retry
FMF_E_QUERY_SYNTAX(5) | query syntax error | shown in the status bar | fix input
FMF_E_STALE(2) | structural generation change | auto re-issue the same query | automatic
FMF_E_NOT_ADMIN(3) | insufficient elevation | InfoBar+explanation | restart
FMF_E_LOCKED(7) | index_dir held by another engine | InfoBar+explanation ("Service is running. Use in-proc after just service-stop") | restart after stopping the service
FMF_E_PANIC(99) | panic inside the engine | InfoBar+pointer to engine.log | not possible (report)
others (1,4,6) | argument/volume/IO | InfoBar | depends
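
The code-to-recovery mapping in this table can be expressed as a total function over the status codes. The sketch below uses the numeric values from the contract reference; the Rust names Status, Recovery, and recovery_for are hypothetical, and the real UI-side handling lives in C#.

```rust
/// Status codes with the values listed in the contract reference.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
enum Status {
    Ok = 0,
    InvalidArg = 1,
    Stale = 2,
    NotAdmin = 3,
    Volume = 4,
    QuerySyntax = 5,
    Io = 6,
    Locked = 7,
    Panic = 99,
}

impl Status {
    /// Returns None for values outside the (append-only) roster.
    fn from_u32(v: u32) -> Option<Self> {
        Some(match v {
            0 => Self::Ok,
            1 => Self::InvalidArg,
            2 => Self::Stale,
            3 => Self::NotAdmin,
            4 => Self::Volume,
            5 => Self::QuerySyntax,
            6 => Self::Io,
            7 => Self::Locked,
            99 => Self::Panic,
            _ => return None,
        })
    }
}

/// What the caller should do next (hypothetical classification).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Recovery {
    NotNeeded,
    FixInput,
    AutoRequery,
    RestartElevated,
    RestartAfterStoppingService,
    ReportOnly,
    Depends,
}

fn recovery_for(status: Status) -> Recovery {
    match status {
        Status::Ok => Recovery::NotNeeded,
        Status::QuerySyntax => Recovery::FixInput,
        Status::Stale => Recovery::AutoRequery,
        Status::NotAdmin => Recovery::RestartElevated,
        Status::Locked => Recovery::RestartAfterStoppingService,
        Status::Panic => Recovery::ReportOnly,
        Status::InvalidArg | Status::Volume | Status::Io => Recovery::Depends,
    }
}
```

Because the match has no wildcard arm, appending a new status code forces a compile error until its recovery is decided.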

Latency Budget (breakdown of the change→on-screen ≤1s AC)

USN batch commit ≤100ms + engine IndexChanged debounce 200ms (the only throttle) + UI re-query ≤100ms + render ≤100ms = ≤500ms (2× margin). Do not place an additional throttle on the UI side.
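
The budget arithmetic can be pinned as constants so that a change to any stage is checked against the 1s acceptance criterion at compile time. This is only a sketch; the constant names are made up for illustration.

```rust
// Stage budgets in milliseconds, taken from the breakdown above.
const USN_BATCH_COMMIT_MS: u32 = 100;
const INDEX_CHANGED_DEBOUNCE_MS: u32 = 200;
const UI_REQUERY_MS: u32 = 100;
const RENDER_MS: u32 = 100;

/// Acceptance criterion: change → on-screen within 1s.
const ACCEPTANCE_MS: u32 = 1000;

const fn worst_case_ms() -> u32 {
    USN_BATCH_COMMIT_MS + INDEX_CHANGED_DEBOUNCE_MS + UI_REQUERY_MS + RENDER_MS
}

// Compile-time guard: the summed budget must keep the 2× margin.
const _: () = assert!(worst_case_ms() * 2 <= ACCEPTANCE_MS);
```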

Additional budget for the pipe path (canonical here — other docs' numbers reference this section): ResultPage 64-row round trip p99 ≤5ms (provisional — the loopback integration test asserts it, to be finalized by measurement). Continuously observed via F12's PageRttEwma. Event push is one hop after the debounce above, so the budget structure does not change.

Pipe test gates: protocol round-trip and loopback integration (unique pipe name + insert_ready_volume) run unconditionally under non-elevated cargo test. The C# client × real fmf-service integration is gated by FMF_PIPE_TESTS=1 (just test-pipe). The service E2E using real volumes is, as before, gated by FMF_ADMIN_TESTS=1 (elevated).

Contract reference (generated)

Machine-generated snapshot of the FFI / pipe contract from the values defined in the fmf-contract crate. The prose canonical source for semantics and the safety contract is Architecture; for Rust API details see rustdoc (/doc/fmf_contract/). Numbers are the actual offset_of! / size_of! values.

Versions and pipe name

Item | Value
ABI_VERSION | 2
PROTOCOL_VERSION | 2
PIPE_NAME | \\.\pipe\fmf-engine-v2
PIPE_NAME_SHORT | fmf-engine-v2
SERVICE_NAME | fmf-engine

Status codes (frame header status / FFI return values. append-only)

Name | Value
OK | 0
INVALID_ARG | 1
STALE | 2
NOT_ADMIN | 3
VOLUME | 4
QUERY_SYNTAX | 5
IO | 6
LOCKED | 7
PANIC | 99

Pipe opcodes (event pushes reuse 1..=6 as the kind)

Name | Value
HELLO | 1
SUBSCRIBE | 2
UNSUBSCRIBE | 3
LIST_VOLUMES | 4
INDEX_START | 5
INDEX_STATUS | 6
QUERY | 7
RESULT_PAGE | 8
RESULT_FREE | 9
STATS | 10
FLUSH_RESERVED | 11
SERVICE_INFO | 12

Event kinds (FFI FmfEvent.kind = pipe event-push opcode)

Kind | Value
Progress | 1
VolumeReady | 2
IndexChanged | 3
RescanStarted | 4
VolumeFailed | 5
EngineError | 6

Limits (protocol facts. not tunable)

Constant | Value
MAX_PAYLOAD_LEN | 16777216
MAX_RESULTS_PER_CONN | 64
EVENT_QUEUE_CAP | 256
PAGE_ROWS | 64
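
EVENT_QUEUE_CAP pairs with the drop-oldest policy described for pipe_events_dropped. A minimal single-threaded sketch of such a bounded queue follows; the EventQueue type and its method names are hypothetical, and a real implementation would need synchronization between the producer and the pipe writer.

```rust
use std::collections::VecDeque;

/// Queue capacity from the limits table.
const EVENT_QUEUE_CAP: usize = 256;

/// Bounded FIFO that drops the oldest entry on overflow, so the
/// producer never blocks (hypothetical sketch, not thread-safe).
struct EventQueue<T> {
    items: VecDeque<T>,
    cap: usize,
    /// Would feed the pipe_events_dropped counter.
    dropped: u64,
}

impl<T> EventQueue<T> {
    fn with_capacity(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be nonzero");
        Self { items: VecDeque::with_capacity(cap), cap, dropped: 0 }
    }

    /// Never fails and never blocks; evicts the oldest event when full.
    fn push(&mut self, event: T) {
        if self.items.len() == self.cap {
            self.items.pop_front();
            self.dropped += 1;
        }
        self.items.push_back(event);
    }

    fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }

    fn len(&self) -> usize {
        self.items.len()
    }
}
```

Dropping the oldest entry rather than rejecting the newest keeps the most recent state change deliverable, which suits notifications that the client answers with a fresh query anyway.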

POD layouts (#[repr(C)], actual offset_of! values)

FrameHeader (16 B)

Field | offset
len | 0
opcode | 4
flags | 6
request_id | 8
status | 12
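
A layout like this can be pinned at compile time with offset_of! and size_of!. The sketch below is not the fmf-contract source: the field widths (u32/u16/u16/u32/u32) are inferred from the gaps between the listed offsets and the 16 B total, and are therefore assumptions.

```rust
use std::mem::{offset_of, size_of};

/// Mirror of the FrameHeader table. Field types are inferred from the
/// offsets, not taken from the real crate.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FrameHeader {
    len: u32,
    opcode: u16,
    flags: u16,
    request_id: u32,
    status: u32,
}

// Compile-time layout pin: any drift from the table fails the build.
const _: () = {
    assert!(size_of::<FrameHeader>() == 16);
    assert!(offset_of!(FrameHeader, len) == 0);
    assert!(offset_of!(FrameHeader, opcode) == 4);
    assert!(offset_of!(FrameHeader, flags) == 6);
    assert!(offset_of!(FrameHeader, request_id) == 8);
    assert!(offset_of!(FrameHeader, status) == 12);
};
```

Checking the layout in a const block means a reordered or resized field is caught by the compiler rather than by a failing round-trip at runtime.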

FmfRow (48 B)

Field | offset
entry_ref | 0
frn | 8
size | 16
mtime | 24
name_off | 32
parent_path_off | 36
flags | 40
name_len | 44
parent_path_len | 46
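
The name_off/name_len and parent_path_off/parent_path_len pairs suggest that row strings live in a separate blob. The sketch below shows bounds-checked slicing under that reading. It assumes the offsets are byte offsets into the page blob; the field types are inferred from the offset gaps (mtime as i64 is a guess), and the helper names are hypothetical.

```rust
use std::mem::size_of;

/// Mirror of the FmfRow table; field types are inferred, not confirmed.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct FmfRow {
    entry_ref: u64,
    frn: u64,
    size: u64,
    mtime: i64,
    name_off: u32,
    parent_path_off: u32,
    flags: u32,
    name_len: u16,
    parent_path_len: u16,
}

const _: () = assert!(size_of::<FmfRow>() == 48);

/// Returns blob[off .. off+len], or None if the range is out of bounds.
fn slice_blob(blob: &[u8], off: u32, len: u16) -> Option<&[u8]> {
    let start = off as usize;
    let end = start.checked_add(len as usize)?;
    blob.get(start..end)
}

/// Assumption: name_off/name_len index into the page blob in bytes.
fn row_name<'a>(row: &FmfRow, blob: &'a [u8]) -> Option<&'a [u8]> {
    slice_blob(blob, row.name_off, row.name_len)
}

fn row_parent_path<'a>(row: &FmfRow, blob: &'a [u8]) -> Option<&'a [u8]> {
    slice_blob(blob, row.parent_path_off, row.parent_path_len)
}
```

Returning Option instead of indexing directly keeps a corrupt offset from turning into a panic on the consumer side.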

FmfQueryOptions (20 B)

Field | offset
sort | 0
desc | 4
case_mode | 8
include_hidden_system | 12
regex_mode | 16

FmfPage (32 B)

Field | offset
row_count | 0
rows | 8
blob | 16
blob_len | 24

FmfEvent (32 B)

Field | offset
kind | 0
entries | 8
volume | 16

FmfVolumeStatus (32 B)

Field | offset
label | 0
state | 16
entries | 24

FmfBlob (16 B)

Field | offset
data | 0
len | 8

Degradation counter names (snake_case keys of the stats JSON. append-only)

  • stat_fetch_failures
  • usn_batches_truncated
  • snapshot_load_failures
  • snapshot_save_failures
  • deferred_names_unresolved
  • corrupt_mft_records
  • journal_rescans
  • scan_pipeline_fallbacks
  • offset_table_rebuild_fallbacks
  • lazy_perm_rebuild_fallbacks
  • compaction_aborts
  • pipe_malformed_frames
  • pipe_events_dropped
  • pipe_connections_rejected
  • deferred_name_cache_overflow
  • deferred_name_read_failures
  • pipe_results_evicted
  • trace_serialize_failures
  • walk_read_errors
  • walk_depth_truncated

Verified technical facts (researched 2026-06-10, primary sources confirmed)

Design decisions assume this file. Sources at the end of each item.

NTFS / MFT / USN journal

  • FSCTL_ENUM_USN_DATA (DeviceIoControl, winioctl.h, documented) is the official API to enumerate MFT records. Call it repeatedly with MFT_ENUM_DATA_V0/V1 as input, starting from StartFileReferenceNumber=0. The returned USN_RECORD_V2 has FRN, parent FRN, file name, and FileAttributes, but no file size or timestamp (TimeStamp is the journal-record time). Indexing with size and date requires reading the raw $MFT ($STANDARD_INFORMATION/$FILE_NAME/$DATA) or an extra per-file query. https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_enum_usn_data https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ns-winioctl-usn_record_v2
  • Incremental monitoring: FSCTL_QUERY_USN_JOURNAL to get UsnJournalID/NextUsn → FSCTL_READ_USN_JOURNAL (READ_USN_JOURNAL_DATA_V0, blocking subscription possible with BytesToWaitFor>0). The state to persist is the UsnJournalID + last-processed USN pair. The journal is maintained by the OS, so changes made while the app is stopped can be caught up. https://learn.microsoft.com/en-us/windows/win32/api/winioctl/ni-winioctl-fsctl_read_usn_journal
  • Error fallback (standard pattern): ERROR_JOURNAL_NOT_ACTIVE → create with FSCTL_CREATE_USN_JOURNAL (admin required). ERROR_JOURNAL_DELETE_IN_PROGRESS (deletion continues across reboots). Saved USN older than FirstUsn → ERROR_JOURNAL_ENTRY_DELETED. These plus a JournalID mismatch fall back to a full rescan. https://learn.microsoft.com/en-us/windows/win32/fileio/creating-modifying-and-deleting-a-change-journal
  • FRN→path: USN records have no path string. Hold an FRN→(name, parent FRN) map for all directories and build paths lazily by walking the parent chain up to the root (fixed at MFT record 5 on NTFS). A folder rename/move updates only that one record; no records are emitted for its children. FRN is 64-bit on NTFS (low 48 bits = record number + high 16 bits = sequence). ReFS is 128-bit (USN_RECORD_V3) — out of scope for MVP but accounted for in the ID type design.
  • Privileges: Opening a volume handle (\\.\C:) requires admin (CreateFile official Remarks: "The caller must have administrative privileges"). FSCTL_READ_UNPRIVILEGED_USN_JOURNAL allows non-elevated journal reads, but it is undocumented and has no ENUM equivalent, so the initial scan requires elevation. https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew
  • Hard links: Multiple $FILE_NAME attributes within a single MFT record. A USN record's file name is normally only the "first link name". → MVP uses "one representative name per FRN".
  • Symbolic links / junctions: not followed (cycle-detection cost). Index the reparse point itself as a single entry.
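
The FRN→path item above (48-bit record number plus 16-bit sequence, lazy path building through the parent chain) can be sketched as follows. This is a minimal illustration, not the fmf-core implementation: the DirEntry type, the map keyed by full FRN, the use of String for names, and the max_depth guard are all assumptions.

```rust
use std::collections::HashMap;

/// Low 48 bits of an NTFS FRN are the MFT record number.
const RECORD_MASK: u64 = (1 << 48) - 1;
/// The NTFS root directory is MFT record 5.
const ROOT_RECORD: u64 = 5;

fn record_number(frn: u64) -> u64 {
    frn & RECORD_MASK
}

/// High 16 bits are the sequence number.
fn sequence(frn: u64) -> u16 {
    (frn >> 48) as u16
}

/// One directory: its own name and its parent's FRN (hypothetical type).
struct DirEntry {
    name: String,
    parent: u64,
}

/// Walks the parent chain up to the root and joins the names.
/// Returns None on a missing parent or when `max_depth` is exceeded
/// (a guard against a corrupt, cyclic chain).
fn build_path(dirs: &HashMap<u64, DirEntry>, start: u64, max_depth: usize) -> Option<String> {
    let mut parts: Vec<&str> = Vec::new();
    let mut cur = start;
    for _ in 0..max_depth {
        if record_number(cur) == ROOT_RECORD {
            parts.reverse();
            return Some(format!("\\{}", parts.join("\\")));
        }
        let entry = dirs.get(&cur)?;
        parts.push(entry.name.as_str());
        cur = entry.parent;
    }
    None
}
```

Because paths are assembled on demand, renaming a directory only changes its own DirEntry; every descendant picks up the new name on the next build_path call without any per-child update.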

Search syntax (real-usage research)

  • Real-usage research (HN etc.) centers on substring default, space=AND, |=OR, !=NOT, ""=phrase, *? (whole-filename match), ext: path: size: dm: (ranges a..b >x). regex:/content: are niche, and content search is inherently slow. → Supports the syntax scope and the "filename-only indexing" tradeoff (ADR-0001).

Competitors / prior art (as of 2026-06)

  • "Rust engine + native WinUI 3 + truly FOSS" is an empty niche. The strongest competitor, omni-search (Eul45, started 2026-02, 517 stars, MIT), is Tauri v2 + React + C++, requireAdministrator approach.
  • Past FOSS clones are all stalled: Orange (Rust/Tauri/Tantivy, walk-based without MFT, stopped 2023-10), FastFileSearch (2016), Indexer++ (2019), SwiftSearch (actually CC BY-NC = non-FOSS, 2019).

Real C: name/size statistics (2026-06-11, fmf stats C: --name-stats, 1,268,450 entries)

Primary data for layout decisions and synthetic-benchmark calibration (re-measure with this command):

  • fold-identical (lower==orig) = 73.2% / unique names 53.2% / unique after fold 53.0%
  • name length (WTF-8 bytes): mean 29.7 / p50 18 / p90 90 / p99 110 / max 171
  • files over 4GiB = 10 (0.0008%)

See docs/adr/ for design and rejection decisions and their numeric rationale.

Rust crates (existence and maturity confirmed)

  • ntfs-reader 0.4.5 (MIT/Apache-2.0, updated 2026-03): full raw-$MFT record scan (README benchmark: Vec Cache 3.756s / HashMap 4.981s / No Cache 12.3s, environment not stated). FileInfo gives name/path/size/created/modified. Cannot retrieve all hard-link names (one representative name).
  • usn-journal-rs (wangfu91, MIT, updated 2026-05): MFT enumeration + USN monitoring + FRN path resolution. Read as a reference implementation (policy: do not depend on it).
  • windows-sys 0.61: complete FSCTL constants, MFT_ENUM_DATA, USN_RECORD, etc. The USN wrapper is implemented in-house (~200 lines).
  • memchr (memmem::Finder = SIMD substring), rayon, parking_lot, thiserror, tracing, xxhash-rust.

WinUI 3 (Windows App SDK)

  • Data virtualization: random access with a known count uses non-generic IList + INotifyCollectionChanged + IItemsRangeInfo + placeholders. Explicitly supported in current WASDK (MS Learn updated 2026-03). IList<T> alone does not work (#1809). ISupportIncrementalLoading has crash reports (#6883), avoid it. ItemsView/ItemsRepeater support neither interface. Setting ItemsPanel to anything other than ItemsStackPanel disables virtualization. https://learn.microsoft.com/en-us/windows/apps/develop/performance/listview-and-gridview-data-optimization
  • Tray / hotkey: no native support. H.NotifyIcon.WinUI + in-house RegisterHotKey + an HWND_MESSAGE hidden window (WM_HOTKEY).
  • DPI: the WinUI 3 template defaults to Per-Monitor V2.
  • MSIX × requireAdministrator is a poor fit (allowElevation etc. constraints, almost always rejected in Store review) → unpackaged + self-contained distribution.
  • Known constraints of elevated processes: D&D from Explorer is not possible (UIPI). ShellExecute directly from an elevated process launches the associated app elevated too → de-elevate via explorer.exe "<path>" (standard pattern).
  • WASDK 1.6+ supports Native AOT (official sample cuts startup by about 50%). However, the "instant launch" experience is best ensured by a resident tray + hotkey.

Security — v2 service separation (researched 2026-06-11, primary sources confirmed)

A privileged-indexer → non-privileged-UI design carries an information-disclosure risk: exposing file names and paths that should be invisible per ACL. The v2 threat model and defenses are in docs/SECURITY.md; decision records are ADR-0016/0017. Below is the supporting research:

  • PIPE_REJECT_REMOTE_CLIENTS (CreateNamedPipeW dwPipeMode): officially stated as "Connections from remote clients are automatically rejected". Direct mechanism for remote rejection. https://learn.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-createnamedpipew
  • FILE_FLAG_FIRST_PIPE_INSTANCE: creating a second instance fails with ERROR_ACCESS_DENIED (officially stated). Defends against pipe-name squatting. Same source as above.
  • GetNamedPipeServerProcessId: a client can get the server process PID (fake-server detection: PID → verify the token is SYSTEM). https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-getnamedpipeserverprocessid
  • Anonymous access (caution): the default for anonymous restriction via NullSessionPipes is machine-type/policy dependent (enabled on DC/standalone, Not defined on member/client). Make an explicit DACL (no anonymous ACE = default deny) the primary defense for blocking anonymous access. https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-10/security/threat-protection/security-policy-settings/network-access-restrict-anonymous-access-to-named-pipes-and-shares
  • Deny-only Administrators in a UAC-filtered token: in a non-elevated process the BUILTIN\Administrators SID becomes SE_GROUP_USE_FOR_DENY_ONLY and is not used for allow ACEs (only for deny-ACE matching). A pipe DACL that "allows Administrators" cannot be connected to by a non-elevated UI → naming the user's individual SID is mandatory. https://learn.microsoft.com/en-us/windows/win32/secauthz/sid-attributes-in-an-access-token
  • ImpersonateNamedPipeClient: the server can obtain and inspect the client's token (SID matching at connect time = defense in depth against a misconfigured DACL). https://learn.microsoft.com/en-us/windows/win32/ipc/impersonating-a-named-pipe-client
  • SERVICE_CONFIG_REQUIRED_PRIVILEGES_INFO (ChangeServiceConfig2): declaring required privileges makes the SCM strip undeclared privileges from the process token at startup (SeChangeNotifyPrivilege always remains; for shared-process services the union applies). Used to disarm LocalSystem. https://learn.microsoft.com/en-us/windows/win32/api/winsvc/ns-winsvc-service_required_privileges_infow
  • SERVICE_CONTROL_PRESHUTDOWN (caution): the default grace period is 10 seconds on Windows 10 1703 and later (3 minutes before that). Saving a large snapshot requires explicitly extending it via SERVICE_PRESHUTDOWN_INFO (dwPreshutdownTimeout). https://learn.microsoft.com/en-us/windows/win32/api/winsvc/ns-winsvc-service_preshutdown_info
  • windows-service crate (Mullvad, v0.8.1 2026-05, MIT/Apache-2.0): provides define_windows_service! and service_control_handler::register. A PRESHUTDOWN handler can be registered. https://github.com/mullvad/windows-service-rs
  • SeBackupPrivilege and raw-volume reads: what is documented goes only as far as "retrieving content of normal files by bypassing the ACL". There is no documented guarantee that a raw volume handle to \\.\C: can be opened with SeBackupPrivilege alone (research scope: Managing Privileges in a File System and others). Volume handles require admin (see "Privileges" item above) → the basis on which ADR-0017 rejected the dedicated low-privilege-account proposal. https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/privileges

Regex engine (rust regex crate, researched 2026-06-15, premise for first-class status = ADR-0023)

  • Linear-time guarantee, no ReDoS: the regex crate is implemented with finite automata (lazy DFA / Pike VM) and does not backtrack. Matching is linear in "input length × pattern length", and the catastrophic backtracking that plagues regex services (ReDoS runtime exponential blowup) cannot occur structurally, as officially stated. Even malicious (a+)+$-style patterns run linearly. https://docs.rs/regex/latest/regex/#untrusted-input https://docs.rs/regex/latest/regex/#performance
  • Remaining attack surface = compile time/memory: when accepting untrusted patterns, the only DoS surface is the compile-time program/DFA size demanded by a huge pattern (bounded-repetition expansion like a{1000}{1000}). The crate provides RegexBuilder::size_limit (byte cap on the compiled program, default 10 MiB) and dfa_size_limit (byte cap on the lazy DFA cache, default 2 MiB); on overflow build() returns an Error (CompiledTooBig equivalent). nest_limit (default 250) caps parse-tree depth. The docs recommend tightening both size limits for untrusted patterns. https://docs.rs/regex/latest/regex/struct.RegexBuilder.html#method.size_limit https://docs.rs/regex/latest/regex/struct.RegexBuilder.html#method.dfa_size_limit
  • find-my-files uses 1 MiB each (with name length p99 ≈110B this is excessively generous; legitimate patterns never reach it, and malicious patterns are cleanly rejected with FMF_E_QUERY_SYNTAX). Decision and re-examination triggers are in ADR-0023.

Security — Threat Model and Defenses (v2 service split)

Current architecture: a privileged service fmf-service (LocalSystem, least privilege) reads NTFS $MFT/USN, and the non-privileged UI connects over a named pipe. Decision history and rejected options are in ADR-0016 / ADR-0017; API spec verification is in RESEARCH.md.

Threats and Defenses

# | Threat | Defense
1 | ACL-bypass name leak: the privileged indexer exposes file names invisible under the user's own ACL to another user | Restrict the pipe DACL to SYSTEM + the user SID (SID captured at install time + the everyday-user SID forwarded by the non-elevated UI via --owner-sid. The latter is accepted only if it is a real-user type via validate_user_sid; this keeps the everyday user from being locked out even under OTS elevation, while preventing injection of an arbitrary SID). No Authenticated Users / Everyone ACE (deny by default) + token check on connect
2 | Remote connection | PIPE_REJECT_REMOTE_CLIENTS (+ server features are permanently out of scope per the won't-do list)
3 | Anonymous connection | No anonymous ACE in the explicit DACL = deny by default (the NullSessionPipes default is policy-dependent, so do not rely on it)
4 | Pipe-name squatting / spoofed server | Server: FILE_FLAG_FIRST_PIPE_INSTANCE on the first instance only (no flag on subsequent instances; name preemption is impossible as long as the first instance is held). Client: for the default pipe name, GetNamedPipeServerProcessId → match against the SCM-registered fmf-engine service PID (QueryServiceStatusEx; this works non-elevated, whereas a SYSTEM process token cannot be opened non-elevated [ACCESS_DENIED], and a session 0 process identity is not obtainable either. A squatter cannot register with the SCM [requires admin] so its PID will not match)
5 | Malicious client input (malformed frame, huge len, unknown opcode, pathological regex) | 16 MiB length cap; validation failure drops the connection + pipe_malformed_frames counter. The whole dispatcher is a catch_unwind firewall (panic returns FMF_E_PANIC, the service survives). Regex is linear-time matching (no ReDoS) + compile caps size_limit/dfa_size_limit=1 MiB to gracefully reject computational DoS (overflow returns FMF_E_QUERY_SYNTAX. ADR-0023, RESEARCH.md)
6 | Local DoS (connection flood, handle exhaustion, flush spamming) | Pipe instance cap 8 (overflow rejects the connection + pipe_connections_rejected). Result handle cap 64/connection (LRU evict → STALE). Flush is not exposed over the pipe (only the service-internal periodic flush and flush on stop). Events use a bounded queue + drop to protect the USN thread. Note that only the authorized same user can even reach this (#1)
7 | Leak of the data file itself (.fmfidx contains every file name on every volume) | At install, apply a protective DACL to %ProgramData%\find-my-files (SYSTEM + Administrators; user read only on the logs subdirectory). Uninstall keeps data by default (shows guidance about leftovers); --purge-data deletes it
8 | Residual risk (accepted) | An authorized user can search the "name/path" of files invisible under their own ACL (a structural property of name-only indexing; the contents and the actual ACL cannot be read). Targets single-user machines primarily; multi-user authorization is a re-examination trigger in ADR-0017
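
For threat 5, the header check that precedes any payload read might look like the sketch below. It uses the field offsets from the FrameHeader layout, the 16 MiB MAX_PAYLOAD_LEN, and the request opcode range 1..=12. It assumes little-endian encoding and that len counts payload bytes only, neither of which is stated here; the ParsedHeader, FrameError, and parse_header names are hypothetical.

```rust
const HEADER_LEN: usize = 16;
const MAX_PAYLOAD_LEN: u32 = 16_777_216; // 16 MiB
const OPCODE_MIN: u16 = 1; // HELLO
const OPCODE_MAX: u16 = 12; // SERVICE_INFO

#[derive(Debug, PartialEq, Eq)]
struct ParsedHeader {
    len: u32,
    opcode: u16,
    request_id: u32,
}

/// Any of these would drop the connection and bump pipe_malformed_frames.
#[derive(Debug, PartialEq, Eq)]
enum FrameError {
    TooShort,
    PayloadTooLarge(u32),
    UnknownOpcode(u16),
}

/// Validates a request header before a single payload byte is read.
/// Assumption: fields are little-endian at the documented offsets.
fn parse_header(buf: &[u8]) -> Result<ParsedHeader, FrameError> {
    if buf.len() < HEADER_LEN {
        return Err(FrameError::TooShort);
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let opcode = u16::from_le_bytes([buf[4], buf[5]]);
    let request_id = u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]);
    // Reject oversized frames before allocating a payload buffer.
    if len > MAX_PAYLOAD_LEN {
        return Err(FrameError::PayloadTooLarge(len));
    }
    if !(OPCODE_MIN..=OPCODE_MAX).contains(&opcode) {
        return Err(FrameError::UnknownOpcode(opcode));
    }
    Ok(ParsedHeader { len, opcode, request_id })
}
```

Checking len before allocation is the point of the cap: a hostile client cannot make the service reserve more than 16 MiB per frame.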

Distribution Integrity (code signing)

Authenticode signing of the distributed binaries is done with SSL.com eSigner (individual IV). The wiring is built into release.yml in a dormant state and is activated once the GitHub Secrets are set after obtaining the certificate. The acquisition/activation steps are in SIGNING.md; the rationale for the choice is in ADR-0020. Signing is limited to the tag-driven release.yml (the ci.yml dev artifacts are not signed).

Manual Verification Checklist (run once before each release; record the result and date here)

Items that cannot be automated (require another user's token or another machine). The structure of the SDDL-building functions is pinned by unit tests.

  • A pipe connection from another user (non-authorized SID) is rejected
  • A remote connection to \\<host>\pipe\fmf-engine-v2 is rejected
  • %ProgramData%\find-my-files\index\*.fmfidx cannot be read from a non-elevated process (2026-06-14: elevated on-machine verification found a bug where it was readable with Users:RX → fixed install → confirmed via icacls that both index/ and c.fmfidx are SYSTEM + Administrators only. See the implementation record below)
  • %ProgramData%\find-my-files\logs\engine.log can be read from a non-elevated process (F12 diagnostics path) (2026-06-14: confirmed via icacls SYSTEM + Administrators + install-user read)
  • After OTS elevation (elevated with a different admin account), the everyday user can still connect to the pipe non-elevated (--owner-sid propagation)
  • "Re-register" to a running service → restart reflects authorized_sids, and a previously-rejected user can connect (the "pipe client token rejected" failures stop)
  • Leftovers after fmf-service uninstall match the guidance / are removed by --purge-data

Implementation record:

  • Code-level audit (2026-06-14): traced each defense in the table above to the implementation and confirmed it; conclusion was that the code paths are sound and covered by existing tests, no code change needed. See the commit for the full trace.
  • Real bug found and fixed in elevated on-machine verification (2026-06-14, threat 7): install did not explicitly apply the protective DACL to index/ (while logs/ was correctly applied explicitly), so the snapshot index/c.fmfidx (every file name on every volume) was readable by all local users with BUILTIN\Users:(RX). Cause: index/ inherits the Users ACE from %ProgramData% at creation time, and protecting the root afterward with D:P does not re-propagate to existing children, because SetFileSecurityW (used by set_dir_dacl) does not propagate inheritance changes to them. The asymmetry (logs/ is correct and only index/ is exposed) is evidence of this root cause, and it reproduces even on a clean install. Fix: added set_dir_dacl(&data_dir.join("index"), &data_dir_sddl()) to install in fmf-service/src/main.rs (the same explicit application as logs/). After rebuild + reinstall + icacls, confirmed both index/ and c.fmfidx are SYSTEM + Administrators only; existing files were remediated with icacls /reset.
  • Runtime sign-off (remaining, not yet done): items requiring another user's token, a remote host, OTS elevation, or uninstall leftovers must be run in an elevated multi-user/network environment before release, with the date recorded.

Code Signing — Authenticode Signing of Distributables

Runbook for Authenticode-signing the project's own binaries (FindMyFiles.exe and others) in the distribution zip with SSL.com eSigner (cloud HSM signing). For the decision rationale and rejected alternatives, see ADR-0020.

Current state

The signing step in .github/workflows/release.yml is wired up but dormant. Signing is non-blocking: until the repository Secrets (ES_USERNAME / CREDENTIAL_ID) are in place, cutting a tag still completes, leaving the binaries unsigned and emitting a ::warning::. Signing activates automatically from the next tag after you obtain a certificate and register the 4 Secrets below. No other CI changes are needed.

Only the project's own PE files are signed — FindMyFiles.exe (the executable the user launches = the main target of SmartScreen evaluation), fmf.exe, fmf-service.exe, fmf_engine.dll. The bundled .NET / WindowsAppSDK runtime DLLs are already Microsoft-signed and are not re-signed (to avoid wasting the signing quota and signing others' copyrighted works).

Background (why this setup)

  • An individual residing in Japan is not eligible for the individual tier of Azure Artifact Signing (formerly Trusted Signing) (US/CA/EU/UK only).
  • EV signing no longer grants immediate SmartScreen trust (Microsoft changed this in March 2024). This app ships no kernel driver, so taking EV brings almost no practical benefit. Therefore individual-name IV (Individual Validation) is sufficient.
  • SmartScreen is reputation-based. Even when signed, a warning may appear on first run and disappears as download history accumulates. The immediate effect of signing is that "unknown publisher" disappears and your name appears in the properties.

Activation procedure (when you want a certificate)

A. Obtain the certificate (SSL.com)

  1. Create an account at SSL.com.
  2. Purchase a Code Signing certificate. Choose Individual Validation (IV) with eSigner (cloud signing) support (the cloud version, not the USB token version). Expect roughly $130–250 per year.
    • Only if you want the EV title, you may choose Sole Proprietor EV (no corporate registration required). No changes to this repository's CI (same Action, same 4 Secrets). But SmartScreen behavior is the same as IV.

B. Identity verification (IV validation)

  1. Government-issued ID + identity verification (documents/video). No corporate registration required. There is a track record of Japanese individuals / sole proprietors obtaining it.

C. Configure eSigner for automated signing

  1. In the SSL.com dashboard:
    • Note the Credential ID of the signing certificate.
    • Issue and note the TOTP (2FA) secret for automated signing (a Base32 string).
    • The account username / password.

D. Register 4 GitHub Secrets

  1. In the repository → Settings → Secrets and variables → Actions → New repository secret:

    Secret name | Value
    ES_USERNAME | SSL.com username
    ES_PASSWORD | SSL.com password
    CREDENTIAL_ID | Credential ID of the signing certificate
    ES_TOTP_SECRET | TOTP secret for eSigner automated signing (Base32)

    → On the next vX.Y.Z tag (or release via workflow_dispatch), HAVE_SIGNING becomes true and signing runs.

E. Verification

  1. Dry run (possible right now, even before obtaining a certificate): with Secrets unset, run Actions → release via workflow_dispatch → confirm that the signing step is skipped, a ::warning:: is emitted, and the zip / checksum / Release creation complete without failure (= the dormant wiring does not break the pipeline).
  2. Real signing (after registering Secrets): cut a test tag (e.g. v0.0.1-rc1) and run. Confirm that "Sign staged binaries" runs and "Verify signatures" turns green with all 4 files showing signed: ... - CN=<your name>.
  3. Local confirmation: extract the Release zip and on Windows:
    signtool verify /pa /v build\dist\FindMyFiles\FindMyFiles.exe   # → Successfully verified
    Get-AuthenticodeSignature build\dist\FindMyFiles\FindMyFiles.exe # → Status: Valid
    
    In the properties of FindMyFiles.exe → "Digital Signatures" tab, your name and a timestamp appear.

Renewal (handling expiry)

  • The validity period of a publicly trusted code signing certificate is, per CA/Browser Forum rules, at most ~460 days (about 15 months). Renew at SSL.com before expiry.
  • Update the corresponding Secret only if the Credential ID or the TOTP secret changes on renewal.

Troubleshooting

  • Verify signatures fails: batch_sign may have written the signed files to a separate folder instead of overwriting the originals in place (override). Check the output location in the Action log and, if needed, specify output_path in the signing step of release.yml so the copy source for "Copy signed binaries back" matches it.
  • Get-AuthenticodeSignature returns UnknownError: the public trust chain is unresolved. Check details with signtool verify /pa.
  • A SmartScreen warning still appears on first launch: expected (reputation is shallow). It disappears as downloads accumulate. Same with EV.

Supply Chain and Provenance

This document describes the mechanisms and verification procedures that let users machine-verify that a distributable was "built from this commit of this repository, by an untampered CI." For code signing (Authenticode), see SIGNING.md; the scope here is build provenance (SLSA provenance), SBOM, and dependency pinning.

For users: verify a download

release.yml (tag-driven) issues GitHub-native keyless attestations. There is no private key; the workflow signs through Sigstore (Fulcio/Rekor) using its OIDC token. All you need to verify is gh:

# Verify build provenance (which commit / workflow / runner built it)
gh attestation verify find-my-files-vX.Y.Z-win-x64.zip --repo P4suta/find-my-files

# Verify that the SBOM is bound to the same zip (CycloneDX predicate)
gh attestation verify find-my-files-vX.Y.Z-win-x64.zip --repo P4suta/find-my-files \
  --predicate-type https://cyclonedx.org/bom

Successful verification means "the artifact's digest matches an attestation issued from P4suta/find-my-files's release.yml." The release attaches the following:

Asset | Contents
find-my-files-vX.Y.Z-win-x64.zip | App + engine binaries (Authenticode-signed when signing is enabled)
SHA256SUMS.txt | SHA-256 of the zip
fmf-engine.cdx.json | SBOM of the Rust engine (CycloneDX 1.6; generated by cargo-sbom; all workspace dependencies)
app.cdx.json | SBOM of the C# app (CycloneDX 1.6; generated by the CycloneDX dotnet tool; NuGet graph)

The zip and SHA256SUMS have a build-provenance attestation, and each SBOM has an SBOM attestation (listed in the repository's Attestations tab; 3 total).

Dependency and build controls

Aspect | Mechanism
Rust dependency lock | engine/Cargo.lock / xtask/Cargo.lock (committed)
C# dependency lock | app/FindMyFiles/packages.lock.json / app/FindMyFiles.Tests/packages.lock.json. CI fails on a stale lock file via -p:RestoreLockedMode=true
Vulnerabilities | cargo-audit (RustSec, weekly + on lock change). C# uses CodeQL + Dependabot
License/provenance | cargo-deny (bans / licenses / sources; unknown registries and git sources are denied)
Auto-update | Dependabot (cargo / nuget×2 / github-actions, weekly)
Action pinning | Third-party actions in all workflows are pinned to a 40-char commit SHA (with # vX.Y.Z alongside). Dependabot updates the SHA and comment. actionlint validates workflows in the hygiene job
Posture monitoring | OpenSSF Scorecard (weekly, SARIF to the Security tab, README badge)
Reproducible build | C# sets ContinuousIntegrationBuild=true in CI (normalizes embedded source paths; Deterministic is the SDK default). Rust is deterministic by default

For maintainers: runbook for the first attested release

The attestation/SBOM steps fire only on tags, so do a dry-run before the real tag to confirm the OIDC/permission path:

  1. Manually run release via workflow_dispatch (input tag_name) with an existing test tag (or a throwaway tag). Confirm that the permissions (id-token: write / attestations: write) are granted and that each step passes.
  2. For production, run just release as usual (version bump + tag push) → release.yml fires automatically.
  3. After completion, confirm that gh attestation verify <zip> --repo P4suta/find-my-files succeeds, the Attestations tab has 3 items (provenance + SBOM×2), and the release has zip / SHA256SUMS / *.cdx.json.

Notes

  • SBOM tools are CI/release-only (not added to the mise.toml development loop). For Rust, cargo install cargo-sbom; for C#, dotnet tool install --global CycloneDX (both version-pinned). Both languages standardize on CycloneDX 1.6.
  • For lock file updates, Dependabot's nuget PR regenerates packages.lock.json. After changing a package version locally, run dotnet restore for both csproj files, then commit. The floating 4.* of Roslynator.Analyzers is pinned to its resolved version by the lock file, so a bump requires regenerating the lock file (this is the intended determinism).
  • Optional future extension: Microsoft.SourceLink.GitHub (makes PDBs traceable to commits). Deferred because it adds dependency surface. Consider adding it if there is debugging demand for distributed PDBs.

OpenSSF Scorecard

This repo publishes an OpenSSF Scorecard report (.github/workflows/scorecard.yml, badge in the README). Scorecard grades supply-chain hygiene across ~18 checks. This page records which checks we act on in-repo, which need a one-time maintainer action, and which we deliberately leave — so the score is understood, not chased blindly.

Baseline when this page was written (2026-06-17): 5.9 / 10.

In-repo (already applied)

Check | Before | What changed
Token-Permissions | 0 | codeql.yml / pages.yml / release.yml declared write scopes at the top level. Moved each to the single job that needs it; top level is now contents: read. Same effective token, least-privilege, and Scorecard rewards job-scoped writes.
Fuzzing | 0 | Added cargo-fuzz harnesses (engine/fuzz/) + a bounded Linux smoke (fuzz.yml). See Fuzzing scope below.
Branch-Protection | 3 | Added .github/CODEOWNERS (the file half of "require Code Owner reviews"). The settings half is the runbook below.

Maintainer actions (cannot be done by committing files)

Branch-Protection — settings runbook

The score (3) means protection exists but lacks approval/code-owner reviews. The rules live in GitHub repo settings, not the tree. Apply with gh (needs admin; run it yourself — we don't store admin tokens or write ad-hoc scripts).

There is a real tension: requiring ≥1 approval conflicts with solo self-merge (you can't approve your own PR → every merge blocks). Pick a mode.

Mode A — keep self-merge, harden everything else (modest bump, no workflow change):

gh api --method PUT repos/P4suta/find-my-files/branches/main/protection --input - <<'JSON'
{
  "required_status_checks": { "strict": true, "contexts": ["ci-required"] },
  "enforce_admins": true,
  "required_pull_request_reviews": null,
  "restrictions": null,
  "required_linear_history": true,
  "allow_force_pushes": false,
  "allow_deletions": false,
  "required_conversation_resolution": true
}
JSON

Mode B — require review (largest bump; needs a second reviewer, or you stop self-merging): same body but with

  "required_pull_request_reviews": {
    "required_approving_review_count": 1,
    "require_code_owner_reviews": true,
    "dismiss_stale_reviews": true
  },

Optionally also require signed commits:

gh api --method POST repos/P4suta/find-my-files/branches/main/protection/required_signatures

Verify either way:

gh api repos/P4suta/find-my-files/branches/main/protection

Confirm the status-check context name (ci-required) matches the aggregate job in ci.yml as GitHub reports it — check Settings → Branches if the PUT rejects the context.

Honest ceiling: Scorecard's higher Branch-Protection tiers are gated on review requirements, so a solo project realistically tops out in the mid range under Mode A. Mode B is the only path to the top tier.

CII / OpenSSF Best Practices — enrollment runbook

Self-attested, external (bestpractices.dev). Scorecard only credits it once the badge reaches passing (in-progress = 0).

  1. Register the project at https://www.bestpractices.dev/en/projects/new (repo URL https://github.com/P4suta/find-my-files).

  2. Answer the passing questionnaire. This repo already satisfies the bulk of it — evidence map below.

  3. Take the assigned project id NNNN and add the badge under the Scorecard badge in README.md (do this after enrollment so it never renders broken):

    [![OpenSSF Best Practices](https://www.bestpractices.dev/projects/NNNN/badge)](https://www.bestpractices.dev/projects/NNNN)
    

Evidence map for the passing criteria (most are already met):

Criterion | Evidence in this repo
Project homepage / description | README.md, GitHub Pages (pages.yml)
Version-controlled source, public | this Git repo
OSI license | LICENSE (Apache-2.0)
Contribution guide | CONTRIBUTING.md
Bug/issue reporting | .github/ISSUE_TEMPLATE/, Discussions
Vulnerability reporting process | .github/SECURITY.md (private advisories)
Build + automated tests | just build / just test / just test-app, enforced in ci.yml
Tests run on contributions (CI) | ci.yml on every PR
Static analysis | CodeQL (codeql.yml), Clippy, cargo-audit
Secured release / signing | release.yml (Authenticode + Sigstore attestations + SBOMs), docs/SIGNING.md, docs/SUPPLY_CHAIN.md
Unique versioning + release notes | SemVer tags, auto-generated release notes
HTTPS | GitHub + Pages are HTTPS-only

The open items are typically a couple of "describe X" free-text answers, not new engineering.

Fuzzing scope (and the deferred fmf-core work)

fuzz.yml runs engine/fuzz/ on Linux because cargo-fuzz is effectively a Linux/nightly tool. The harnesses target fmf-proto + fmf-contract — the named-pipe wire codec. That is deliberate, not a shortcut:

  • It is the privilege boundary (non-elevated UI → elevated fmf-service, see docs/SECURITY.md). A hostile local client sending malformed frames to the elevated service hits these parsers first — the highest-value fuzz target in the project.
  • It is the only untrusted-input parser that builds cross-platform. The richer parsers (wtf8, query parser, usn::records, index::snapshot) live in fmf-core, which depends unconditionally on ntfs-reader / windows-sys and therefore does not build on Linux (same reason pages.yml notes cargo doc fails there). Those are already covered by proptest no-panic property tests.

Deferred follow-up: to fuzz the fmf-core parsers under libFuzzer, make the crate Linux-buildable by moving ntfs-reader (and the windows-sys call sites in scan/, mft.rs, engine/volume.rs, usn/session.rs, query/dates.rs) behind [target.'cfg(windows)'] + #[cfg(windows)], leaving the pure modules to compile everywhere. That is an fmf-core change touching the disciplined core and brushing the "no cross-platform" scope rule, so it is a separate decision, not bundled here.

ADR-0021 note: cargo-fuzz writes corpus/, artifacts/, target/ next to the fuzz crate and its dir model doesn't compose with the build/ redirect, so those are git-ignored in place (.gitignore). They're nightly, CI-only, machine-local.

Deliberately left (not movable by repo changes)

Check | Score | Why we leave it
Code-Review | 0 | Solo self-merge → 0 approved changesets. Needs a second reviewer (see Branch-Protection Mode B).
Contributors | 0 | Scores contributors' org affiliations; not meaningful for a solo personal repo.
Maintained | 0 | Repo is < 90 days old. Resolves with time + activity.
Signed-Releases | -1 | No release cut yet (inconclusive, excluded from the average). The infra (release.yml: Authenticode + attestations + SBOMs) is ready; the first tagged release should score well.
Packaging | -1 | Same: no published release to detect yet.

Cutting the first release is a product decision, not a Scorecard chore, so it's out of scope for this hardening pass.

Re-checking the score

After merging, trigger a fresh scan (scorecard.yml runs weekly, on push to main, and via Actions → scorecard → Run workflow), then read the badge or https://scorecard.dev/viewer/?uri=github.com/P4suta/find-my-files.

ADR index

  • 0001 — Filename-only index (no content/property index)
  • 0002 — Linear pool sweep + incremental search; trigram inverted index rejected
  • 0003 — Names stored as WTF-8; fold is length-preserving single-char lowercasing only
  • 0004 — Single folded pool + original-text overflow (−16B/entry)
  • 0005 — FRN index is a sorted id permutation (25→12→4B/entry)
  • 0006 — size/mtime permutations are a lazy derived cache (−8B/entry)
  • 0007 — size column u32 + overflow map (−4B/entry)
  • 0008 — USN batch is insertion-point merge (54.6→2.0ms@1M)
  • 0009 — Compaction remaps in ascending old-id order, no re-sort
  • 0010 — Snapshot is raw POD + full validation, no backward compatibility
  • 0011 — Streaming scan adopted; I/O multiplexing rejected (+14.4% < +30%)
  • 0012 — Default allocator + RecordArena; mimalloc rejected (WS +260MB)
  • 0013 — Measurement discipline: cold, back-to-back, real-volume absolute gate
  • 0014 — rust-lld/sccache/nextest rejected; rationale for codegen-units=1
  • 0015 — WinUI 3 data virtualization (IList+INCC+IItemsRangeInfo)
  • 0016 — v2 service split: fmf-service + named pipe; rejected transport options; flush public surface
  • 0017 — Service security model: LocalSystem + least privilege, 4-layer pipe DACL
  • 0018 — Single source of truth for the contract (fmf-contract) + capture-first golden corpus, 2-seam ceiling
  • 0019 — Focused search mode = UI-layer query rewrite (ext:+!path:); in-engine bits/ranking rejected
  • 0020 — Code signing = SSL.com eSigner/individual IV; Azure (not available to Japanese individuals) and EV (no longer grants immediate SmartScreen trust) rejected
  • 0021 — Build output consolidated into a single build/ (per-workspace .cargo/config.toml, C# bin via BaseOutputPath, obj left in place)
  • 0022 — OS/shell/UI boundaries require testable seams + behavioral tests (lesson from reveal day-one breakage, mutation/coverage gate)
  • 0023 — First-class regex: literal-prefilter-driven + 1MiB compile cap, contract narrowed to 20B (pipe v2); trigram still rejected
  • 0024 — Non-elevated scope index mode: folder walk + ReadDirectoryChangesW as the second implementation of the 2 seams; synthetic FRN (path hash) keeps the format unchanged; a single-point revision of ADR-0001's "no folder walk"

ADR-0001: Index filenames only

Date: 2026-06-11 / Status: Accepted

Decision

The index holds only filename, size, modified time, and attributes. No content index, no property/tag index, no preview.

Rationale

  • Speed and low RAM both come from the "index filenames only" tradeoff. The RAM gate is engine-only ≤110B/file (M2)
  • A content index inflates RAM by orders of magnitude (it can reach the 8GB class), which is incompatible with that gate
  • Filename-only indexing lands at ≈100B/file (the ≤110B RAM gate is the target derived from this figure)
  • Real-world search syntax usage centers on substring, ext:, path:, size:, dm:; content:/regex: are niche (docs/RESEARCH.md)

Consequences

  • No search over file contents or meta-properties
  • Under the same scope freeze, FTP/HTTP/ETP servers, FAT/exFAT/network drives (MVP), ReFS (MVP), and cross-platform support are also out of scope

Re-examination triggers

  • The core (filename-only index, content index excluded) is a permanent decision (canonical source: the "do-not list" in CLAUDE.md)
  • Exception: the single point "volume-level only, no folder-walk index" was overridden by ADR-0024 (non-elevated scope index mode), justified by unlocking the non-elevated (corporate PC where elevation is forbidden) persona. The "filename-only index" core stays unchanged even under ADR-0024

ADR-0002: Linear pool sweep + incremental search (no trigram inverted index)

Date: 2026-06-11 / Status: Accepted

Decision

Search is a linear sweep of the folded name pool (SIMD memmem, rayon 64k-chunk parallelism). A re-query that provably narrows the previous query is handled by query::refine, which re-evaluates only the previous hit set (conservative subsumption rules in query/subsume.rs). No trigram inverted index.
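
The relationship between a fresh sweep and a refine can be sketched in a few lines. This is a minimal illustration, not the engine's code: it assumes plain `str::contains` instead of SIMD memmem, no rayon chunking, and a single subsumption rule (the new needle contains the previous one). The names `sweep`, `narrows`, and `refine` are hypothetical.

```rust
/// Fresh search: linear sweep over every name, returning matching ids.
fn sweep(names: &[&str], needle: &str) -> Vec<u32> {
    names
        .iter()
        .enumerate()
        .filter(|(_, name)| name.contains(needle))
        .map(|(id, _)| id as u32)
        .collect()
}

/// Conservative subsumption rule (assumed here: needle containment only).
/// If `next` contains `prev`, every hit for `next` is also a hit for `prev`.
fn narrows(prev: &str, next: &str) -> bool {
    next.contains(prev)
}

/// Re-evaluates only the previous hit set. Sound only when `narrows` holds.
fn refine(names: &[&str], prev_hits: &[u32], next: &str) -> Vec<u32> {
    prev_hits
        .iter()
        .copied()
        .filter(|&id| names[id as usize].contains(next))
        .collect()
}
```

Because `refine` walks ids in the order of the previous hit set, its output matches a fresh sweep whenever the rule holds, which is the property an oracle test of the form "refine == fresh search" checks.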

Rationale

  • A synthetic 1M-entry cold 3-char query is about 2.9ms (query-cache MISS + derived-cache warm, materialize included). That is an order of magnitude below the criterion "per-volume scan_us p99 > 25ms @1M"
  • Posting maintenance costs +10-15B/file under the RAM ≤110B/file constraint, plus diff maintenance per USN batch. Not worth it
  • Incremental search is O(previous hit count), skipping both the scan and the O(n) materialize

Consequences

  • refine applies only under conservative subsumption rules (same sort, single AND group, needle containment / range shrink / filter addition only). Correctness is held by an oracle property test (refine == fresh search)
  • Kill switch FMF_QUERY_CACHE=0; observability via QueryTrace.cache (miss/refine/partial)

Re-examination triggers (only if all hold)

  1. Cache-MISS cold 3-char scan_us p99 > 25ms @1M
  2. The measured estimate from fmf stats --trigram-estimate is ≤15B/file and the total stays ≤110B/file
  3. Posting diff maintenance ≤2ms/batch
  4. Real demand for a single volume exceeding 4M entries

ADR-0003: WTF-8 storage and length-preserving fold

Date: 2026-06-11 / Status: Accepted

Decision

Names are stored as WTF-8. The search fold applies only "single-character lowercasing that keeps the same encoded length" (wtf8.rs). No NFC/NFD normalization.
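
A length-preserving fold can be sketched as follows. This is an illustration under stated assumptions, not the contents of wtf8.rs: it works on well-formed UTF-8 (`&str`) only, uses the standard library's `char::to_lowercase`, and the names `fold_char` and `fold` are hypothetical. A real WTF-8 implementation would also have to pass unpaired surrogates through unchanged.

```rust
/// Folds one char: use the lowercase form only when it is exactly one char
/// with the same UTF-8 encoded length; otherwise keep the char as-is.
fn fold_char(c: char) -> char {
    let mut lower = c.to_lowercase();
    match (lower.next(), lower.next()) {
        (Some(l), None) if l.len_utf8() == c.len_utf8() => l,
        _ => c, // multi-char expansion or length change: leave untouched
    }
}

/// Folds a whole name. The output has the same byte length as the input,
/// so offsets and lengths stay valid for both views.
fn fold(name: &str) -> String {
    name.chars().map(fold_char).collect()
}
```

Two cases show why the length check matters: U+0130 lowercases to two chars, and U+212A (Kelvin sign, 3 bytes) lowercases to a 1-byte `k`. Both are left unfolded in this sketch, which is the kind of case the known limitation under Consequences refers to.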

Rationale

  • NTFS names are raw UTF-16 sequences and may contain ill-formed (unpaired) surrogates. WTF-8 preserves these losslessly (FFI contract: strings are UTF-8, filenames are WTF-8; the C# side restores UTF-16 via a dedicated decode)
  • Because the fold is length-preserving, name_off / name_len can be shared between the folded pool view and the original-text view (the precondition for the fold-overflow layout = ADR-0004)
  • A match position in the folded pool maps to the same byte position in the original text (anchor-position preservation). Residual verification holds without offset conversion

Consequences

  • Lowercasing that expands to multiple characters and normalization equivalence (NFC/NFD) are not absorbed by search (known limitation)
  • The fold rule is shared between the engine core and the fmf stats --name-stats measurement (no rule mismatch between statistics and implementation)

Re-examination triggers

  • If real-world harm from NFC/NFD mismatch is reported repeatedly (even then storage stays WTF-8; only the fold layer is redesigned)

ADR-0004: fold-overflow name layout

Date: 2026-06-11 / Status: Accepted

Decision

The only full-length pool that can be swept contiguously is the single folded lower_pool. The original text is stored in orig_pool only when it differs from the fold, referenced by orig_off (u32, sentinel u32::MAX = identical to fold). name() lends the lower slice directly for fold-identical entries.
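
The layout can be pictured with a small sketch. The column names follow the text above; everything else is an assumption for illustration (the `Names` type, its methods, a `u16` name_len, and taking the folded form as a parameter).

```rust
/// Sentinel in `orig_off`: the original text is byte-identical to the fold.
const SAME_AS_FOLD: u32 = u32::MAX;

#[derive(Default)]
struct Names {
    lower_pool: Vec<u8>, // folded names, contiguous, the only swept pool
    orig_pool: Vec<u8>,  // original text, only for names that differ
    name_off: Vec<u32>,  // offset into lower_pool, per entry
    name_len: Vec<u16>,  // byte length, shared by both views (assumed width)
    orig_off: Vec<u32>,  // offset into orig_pool, or SAME_AS_FOLD
}

impl Names {
    fn push(&mut self, orig: &str, folded: &str) -> u32 {
        assert_eq!(orig.len(), folded.len(), "fold must be length-preserving");
        let id = self.name_off.len() as u32;
        self.name_off.push(self.lower_pool.len() as u32);
        self.name_len.push(orig.len() as u16);
        self.lower_pool.extend_from_slice(folded.as_bytes());
        if orig == folded {
            self.orig_off.push(SAME_AS_FOLD);
        } else {
            self.orig_off.push(self.orig_pool.len() as u32);
            self.orig_pool.extend_from_slice(orig.as_bytes());
        }
        id
    }

    /// Folded view, always served from lower_pool.
    fn folded(&self, id: u32) -> &[u8] {
        let (off, len) = (self.name_off[id as usize] as usize, self.name_len[id as usize] as usize);
        &self.lower_pool[off..off + len]
    }

    /// Original text: lends the lower slice directly for fold-identical entries.
    fn name(&self, id: u32) -> &[u8] {
        let len = self.name_len[id as usize] as usize;
        match self.orig_off[id as usize] {
            SAME_AS_FOLD => self.folded(id),
            off => &self.orig_pool[off as usize..off as usize + len],
        }
    }
}
```

Only names whose original differs from the fold consume bytes in `orig_pool`, which is where the per-entry saving comes from.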

Rationale

  • Real C: measurement (1,268,450 entries): fold-identical (lower == orig byte match) = 73.2%. With double storage, about 3/4 of the names would be stored twice as identical bytes
  • Measured −16B/entry. The single largest term toward the M2 RAM gate (≤110B/entry)
  • Three soundness pillars: (1) the fold is length-preserving (ADR-0003) (2) original match ⇒ fold match (a superset sweep is sound; same algebra as bridge_needle in subsume.rs) (3) a fold-unstable needle (a needle differing from its own fold) cannot appear in a fold-identical name → an O(1) rejection via a single orig_off sentinel resolves 73% of candidates
  • Alternative (i) "sorted (entry, orig_off) pairs" is predicted at −19.6B/entry, 1.9B better than the u32-column approach (−17.7B), but adds a binary search per residual verification, so the u32 column was chosen to avoid p99 risk

Consequences

  • Needles containing uppercase (smart case) and Sensitive mode cannot sweep the original directly; they become a superset sweep of the folded needle + original-text residual verification
  • Accepted regression: real C: uppercase needle ("Win", smart-case) p50 2.5→3.6ms (7% of the p99 budget of 50ms)
  • The snapshot shrinks by the same amount (FMFIDX04)

Re-examination triggers

  • If a real volume shows a name distribution where the fold-identical ratio collapses substantially (below roughly 50%)

ADR-0005: FRN index is a sorted id permutation

Date: 2026-06-11 / Status: Accepted

Decision

The FRN→EntryId index is held only as a sorted id permutation (ids u32 = 4B/entry, index/frn.rs). The comparison key is read by indirection into the frn column. lookup scans the unmerged tail newest-first (the latest upsert within a batch wins) → binary-searches the body, always passing through the tombstone liveness filter.
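
The lookup order described above can be sketched like this. It is a simplified model, not index/frn.rs: the `FrnIndex` type and its field names are hypothetical, tombstones are a plain `Vec<bool>`, and FRN masking is omitted.

```rust
struct FrnIndex {
    frn: Vec<u64>,   // column: FRN per entry id (the comparison key lives here)
    dead: Vec<bool>, // tombstones per entry id
    body: Vec<u32>,  // merged part: ids sorted by frn[id]
    tail: Vec<u32>,  // unmerged part: ids in arrival order
}

impl FrnIndex {
    fn live(&self, id: u32) -> bool {
        !self.dead[id as usize]
    }

    fn lookup(&self, key: u64) -> Option<u32> {
        // 1) Unmerged tail, newest first, so the latest upsert wins.
        if let Some(id) = self
            .tail
            .iter()
            .rev()
            .copied()
            .find(|&id| self.frn[id as usize] == key && self.live(id))
        {
            return Some(id);
        }
        // 2) Binary search the body through the indirection into `frn`,
        //    then skip dead duplicates that share the same key.
        let start = self.body.partition_point(|&id| self.frn[id as usize] < key);
        self.body[start..]
            .iter()
            .copied()
            .take_while(|&id| self.frn[id as usize] == key)
            .find(|&id| self.live(id))
    }
}
```

The index itself stores only 4 bytes per entry (`body` plus `tail`); every key comparison costs one extra read into the `frn` column.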

Rationale

  • An FxHashMap implementation is ~25B/entry (16-byte slot + bucket capacity padding + control bytes; real C: frn row 31.2MB), the largest RAM term after the name pool
  • Splitting into two arrays keys u64 + ids u32 gives 12B/entry (frn row 31.2→15.1MB, WS 157→140B/entry; first time under the M0 gate ≤150B)
  • keys is a pure redundant copy of masked(frn[ids[i]]) → removed to reach 4B/entry (−8B/entry, ~10MB on real C:)
  • lookup is on the critical path only for the USN apply path and the builder's parent resolution; the search hot path does not touch it. The +1 cache miss from indirection is acceptable
  • Side benefit: restore goes from a million serial hashmap inserts → one parallel sort, criterion load_1m 89.4→58.9ms (−34%)

Consequences

  • Deletion is tombstone-only with no unmap. rename / NTFS record reuse leaves dead duplicates of the same key, but under the liveness filter live count is always at most 1 (pinned by a byte-identical test against the forward-merge reference under random rename/delete storms)
  • The first-scan builder defers parent resolution and resolves it in bulk on the parallel path of finish() (a per-lookup scan of the unmerged 1M-entry tail would be O(n²)). build_ms 13→64ms, invisible within the read-bound 2.1s scan

Re-examination triggers

  • If a design change lands where the search hot path requires an FRN lookup

ADR-0006: Lazy sort permutations (only perm_name is always maintained)

Date: 2026-06-11 / Status: Accepted

Decision

The only always-maintained, persisted fast-sort permutation is perm_name. size/mtime order is a derived-cache lazy permutation (SizePerm/MtimePerm in query/memo.rs): one par_sort on the first sort query, then per-generation incremental extension via the same insertion-position merge as perm_name; not included in the snapshot (non-persistent).
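
The "build on first use" half of this decision can be sketched with `std::sync::OnceLock`. This is an assumption-laden illustration: the `Columns` type and `by_size` are hypothetical, a sequential std sort stands in for par_sort, and the per-generation incremental extension is not modeled.

```rust
use std::sync::OnceLock;

struct Columns {
    size: Vec<u64>,
    /// Derived cache: empty until the first size-sorted query asks for it.
    perm_size: OnceLock<Vec<u32>>,
}

impl Columns {
    fn new(size: Vec<u64>) -> Self {
        Self { size, perm_size: OnceLock::new() }
    }

    /// Ids in ascending size order, built once on first call.
    fn by_size(&self) -> &[u32] {
        self.perm_size.get_or_init(|| {
            let mut ids: Vec<u32> = (0..self.size.len() as u32).collect();
            // The id tie-break makes the order unique, so a lazy build and a
            // fresh sort can be compared for equality.
            ids.sort_unstable_by_key(|&id| (self.size[id as usize], id));
            ids
        })
    }
}
```

Sessions that never sort by size never pay for the permutation, neither in RAM nor in the snapshot.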

Rationale

  • −8B/entry (two permutations' worth) + ~8MB snapshot reduction. Many sessions never request a size/mtime sort
  • Initial construction is one par_sort, ~60ms-class @1M. One-off, so it does not sit on the always-on path of the query p99 gate (50ms)
  • Maintenance cost reduction: apply_batch_1k 6.67→1.96ms (−70.6%; permutations to merge go 3→1)
  • The only regression from going lazy is first_query_sorted_size +6.5% (2.0→2.1ms, within the 10% gate)

Consequences

  • The first size/mtime column click accepts a one-off construction cost (~60ms-class @1M)
  • After snapshot restore, the first use re-sorts (stale order from stat updates is also reset at the same time)
  • Correctness is pinned by an extend oracle (byte-equality of lazy == fresh-sort). Watermark inconsistency triggers warn + lazy_perm_rebuild_fallbacks counter + full rebuild (does not go silent)

Re-examination triggers

  • Real demand for a single volume large enough that the measured first sort-click exceeds the perceptual threshold (100ms-class)

ADR-0007: size column is u32 + overflow map

Date: 2026-06-11 / Status: Accepted

Decision

Hold the size column as u32. For 4GiB and above, store sentinel u32::MAX and offload the real value to an overflow map keyed by entry.
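
A minimal sketch of the sentinel scheme follows. The `SizeColumn` type, its method names, and the choice of `BTreeMap` are assumptions for illustration. One more assumption is labeled in the code: a size equal to u32::MAX itself is routed to the map, because it would otherwise collide with the sentinel.

```rust
use std::collections::BTreeMap;

/// Sentinel in the u32 column: the real size lives in the overflow map.
const OVERFLOW: u32 = u32::MAX;

#[derive(Default)]
struct SizeColumn {
    size_lo: Vec<u32>,
    overflow: BTreeMap<u32, u64>, // entry id -> real size (assumed map type)
}

impl SizeColumn {
    fn set(&mut self, id: u32, size: u64) {
        let i = id as usize;
        if i >= self.size_lo.len() {
            self.size_lo.resize(i + 1, 0);
        }
        // Assumption: sizes >= u32::MAX overflow, so the sentinel stays unambiguous.
        if size >= OVERFLOW as u64 {
            self.size_lo[i] = OVERFLOW;
            self.overflow.insert(id, size);
        } else {
            self.size_lo[i] = size as u32;
            self.overflow.remove(&id); // a file that shrank returns to the u32 side
        }
    }

    fn size(&self, id: u32) -> u64 {
        match self.size_lo[id as usize] {
            OVERFLOW => self.overflow[&id],
            lo => lo as u64,
        }
    }
}
```

With only a handful of oversized files per volume, the map stays tiny and the common path is a single compare against the sentinel.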

Rationale

  • Measured on real C:: 10 of 1,268,450 files exceed 4GiB (0.0008%)
  • −4B/entry (u64→u32)
  • The sentinel branch in size() is effectively zero-cost; the map is negligible in size

Impact

  • The snapshot (FMFIDX04) gains a size-overflow section (ids+sizes). On load, structurally validate "all pairs ↔ sentinel correspondence, ascending order, truly overflowing"
  • Files that shrink below 4GiB are correctly returned to the u32 side via the sentinel

Re-examination trigger

  • If a volume where the share of files over 4GiB reaches the several-percent range (e.g. a dedicated video-archive machine) becomes a primary target

ADR-0008: USN batch merge via insertion-point binary search + in-place segment move

Date: 2026-06-11 / Status: Accepted

Decision

Applying a USN batch to sorted structures (perm_name, FRN index; index/mod.rs merge_sorted_tail) finds each batch element's insertion point by binary search and moves the intervening segment once each with copy_within. No full-length element comparison, no full-length reallocation. Capacity is reserved with reserve_exact(max(add, len/64)).
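
The merge can be sketched on plain integer keys. This is an illustration of the technique, not merge_sorted_tail itself: the function name is hypothetical, keys are bare `u32` values rather than ids compared through a column, and the batch is assumed to be already sorted.

```rust
/// Merges a sorted `batch` into the sorted `body` in place.
/// Works back to front: each batch element's insertion point is found by
/// binary search, and the body segment to its right is moved exactly once.
fn merge_sorted_batch(body: &mut Vec<u32>, batch: &[u32]) {
    let (n, add) = (body.len(), batch.len());
    if body.capacity() - n < add {
        // Growth floor of len/64 bounds both reallocation frequency and slack.
        body.reserve_exact(add.max(n / 64));
    }
    body.resize(n + add, 0);

    let mut src_end = n; // end of the not-yet-moved part of the old body
    let mut dst_end = n + add; // end of the not-yet-filled part of the output
    for &x in batch.iter().rev() {
        // Insert after equal keys, so existing elements keep their order.
        let pos = body[..src_end].partition_point(|&y| y <= x);
        let seg = src_end - pos;
        body.copy_within(pos..src_end, dst_end - seg); // one memmove per element
        dst_end -= seg + 1;
        body[dst_end] = x;
        src_end = pos;
    }
}
```

The gap between `dst_end` and `src_end` always equals the number of batch elements still to place, so every moved segment lands strictly to the right of its source and the untouched prefix never moves.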

Rationale

  • Batch ~1k vs existing ~1M. The full-length rebuild approach paid, per batch, a comparison against every existing element (for perm_name, a string comparison for every file in the index)
  • Measured: apply_batch_1k 54.6→2.0ms@1M (54.6→6.3ms from the insertion-point merge; 2.0ms with permutation lazification = ADR-0006 included). The ~30MB/batch reallocation churn on the FRN index side also disappears
  • Complexity is O(batch·log n) comparisons + one bounded memmove, no allocation
  • A doubling capacity policy puts a permanent 2× slack on the RAM gate. With a len/64 floor, the full-length copy is amortized over each ~1.6% growth, and the slack ceiling is also ~1.6%

Impact

  • Existing elements are not reordered. Because the id tie-break makes the sort result unique, byte-identity with the old code can be pinned in tests (random-batch comparison against a forward-merge reference)
  • In-place stat updates leave perm_size/perm_mtime locally stale-sorted (as before; under the purview of lazy permutation = ADR-0006)

Re-examination trigger

  • If usage where the batch length is large relative to the existing length becomes the norm (in that regime, full-length merge wins)

ADR-0009: compaction is old-id ascending remap (no re-sort)

Date: 2026-06-11 / Status: Accepted

Decision

Compaction reclaims tombstone rows and dead bytes in the pool. Live entries are renumbered in old-id ascending order — because relative order is preserved, every (key, id)-ordered structure (perm_name, FRN index) is carried over by an O(n) filter+remap copy with no re-sort. The volume thread evaluates the threshold on each batch apply: len≥100k AND (tombstone_ratio>12.5% OR dead_name_bytes>32MiB).
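
The threshold and the remap step can be sketched as follows. The constants come from the text above; the function names, the `DEAD` marker, and the use of a `&[bool]` liveness slice are assumptions made for the example.

```rust
const MIN_LEN: usize = 100_000;
const MAX_DEAD_NAME_BYTES: u64 = 32 << 20; // 32 MiB

/// len >= 100k AND (tombstone_ratio > 12.5% OR dead_name_bytes > 32MiB).
fn should_compact(len: usize, tombstones: usize, dead_name_bytes: u64) -> bool {
    // tombstones / len > 1/8, written without floating point
    len >= MIN_LEN && (tombstones * 8 > len || dead_name_bytes > MAX_DEAD_NAME_BYTES)
}

/// Marker for "this old id has no new id".
const DEAD: u32 = u32::MAX;

/// Old id -> new id, assigning new ids in ascending old-id order.
fn remap_table(live: &[bool]) -> Vec<u32> {
    let mut next = 0u32;
    live.iter()
        .map(|&is_live| {
            if is_live {
                let id = next;
                next += 1;
                id
            } else {
                DEAD
            }
        })
        .collect()
}

/// Carries a sorted id permutation over by filter + remap, with no re-sort.
fn remap_perm(perm: &[u32], remap: &[u32]) -> Vec<u32> {
    perm.iter()
        .map(|&old| remap[old as usize])
        .filter(|&new| new != DEAD)
        .collect()
}
```

Because the remap is monotonic over live ids, two live entries keep their relative order under any (key, id) ordering, which is why the filtered permutation is still sorted.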

Rationale

  • Tombstone rows and the name bytes abandoned by renames accumulate without bound — a slow leak against the B/entry RAM gate (previously reclaimable only by a full rescan)
  • With old-id ascending remap, the relative order of live entries is preserved, so sorted structures can be filtered+remapped byte-equivalently (O(n), zero sort cost)
  • The decision input is dead_name_bytes observability (IndexStats.pool_garbage_ratio). Thresholds are set on the premise of real-volume observation

Impact

  • The copy build runs under a read guard (queries run concurrently; the only writer is the single volume thread). The swap goes through install_index with a µs-scale write lock + structural generation bump
  • Result handles open across a compaction go hard STALE (FMF_E_STALE) → the UI auto-reissues the same query (existing mechanism)
  • Children of dead directories are reparented to root (same as the orphan policy of push_raw)
  • A defensive generation-check failure increments the compaction_aborts counter + discards the copy (detection of a broken single-writer invariant; does not stay silent)

Re-examination triggers

  • Observation of compaction_aborts > 0 (revisit the single-writer invariant)
  • If the thresholds show, in real operation, that compaction fires too often or reclaims too little (threshold re-tuning)

ADR-0010: snapshot is a raw POD dump + full validation, no backward compatibility

Date: 2026-06-11 / Status: Accepted

Decision

Persistence is a homegrown binary FMFIDX04: magic + UsnJournalID + last USN + raw column-array dumps + xxhash64. Sections are lower_pool / orig_pool / orig_off / name_off / name_len / parent / size_lo / size-overflow ids+sizes / mtime / frn / flag / perm_name. No backward compatibility — a version mismatch or validation failure is always Err → full rescan.

Rationale

  • Real C:: 92.4MiB for 1.27M entries (−28% from the old 128.6MiB format), restore p50 81ms — ample margin against the restore→ready ≤2s gate
  • A rescan is cheap at 2.0s (ADR-0011). Not worth the maintenance and test cost of migration code
  • On load, beyond the checksum, perform structural validation of all slice bounds and overflow correspondence (Err → rescan instead of panicking on corrupt input)
  • The size/mtime permutations and the FRN index are not persisted (parallel-sort rebuild at restore/first-use time is faster than a serial load: load_1m −34%, ADR-0005/0006)

Impact

  • Accept one full rescan per volume (2s-scale, requires elevation) on each format version bump
  • structural_generation is not persisted (0 at restore). Since result handles do not cross processes, in-process monotonicity is sufficient
  • Writes are temp → MoveFileEx(REPLACE_EXISTING). Failures go to the snapshot_load_failures / snapshot_save_failures counters

Re-examination trigger

  • If a scale where the initial scan takes minutes becomes a primary target and the felt cost of a rescan per version bump becomes a problem

ADR-0011: streaming scan pipeline (I/O multiplexing rejected)

Date: 2026-06-11 / Status: Accepted

Decision

The initial scan reads $MFT as buffered synchronous streaming in 16MiB chunks; a single dedicated I/O thread prefetches chunk N+1 (3 buffers fix the RAM ceiling), and within a chunk rayon parses 1MiB record-boundary subranges in parallel. Name resolution for $ATTRIBUTE_LIST (deferred) RAM-caches the extension records that carry $FILE_NAME during streaming (capped at 128Ki entries ≈ 128MiB temporary) and runs with zero disk reads. I/O multiplexing via NO_BUFFERING + overlapped is not adopted.
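
The prefetch pipeline with a fixed buffer count can be sketched with two channels from the standard library. This is a simplified model under stated assumptions: `stream_chunks` is a hypothetical name, any `Read` stands in for the volume handle, the consumer is a plain closure rather than a rayon parse, and buffers are recycled through a "free" channel so that only `nbuf` of them ever exist.

```rust
use std::io::{self, Read};
use std::sync::mpsc;
use std::thread;

/// Reads `src` in `chunk`-sized pieces on a dedicated I/O thread and hands
/// each filled buffer to `consume` on the calling thread.
fn stream_chunks<R, F>(mut src: R, chunk: usize, nbuf: usize, mut consume: F) -> io::Result<()>
where
    R: Read + Send + 'static,
    F: FnMut(&[u8]),
{
    assert!(chunk > 0 && nbuf > 0);
    let (free_tx, free_rx) = mpsc::channel::<Vec<u8>>();
    let (full_tx, full_rx) = mpsc::channel::<io::Result<Vec<u8>>>();
    for _ in 0..nbuf {
        free_tx.send(Vec::with_capacity(chunk)).expect("receiver is alive");
    }

    // I/O thread: takes a free buffer, fills it, sends it downstream.
    // It runs ahead of the consumer until it runs out of free buffers.
    let io_thread = thread::spawn(move || {
        while let Ok(mut buf) = free_rx.recv() {
            buf.clear();
            match src.by_ref().take(chunk as u64).read_to_end(&mut buf) {
                Ok(0) => break, // end of input
                Ok(_) => {
                    if full_tx.send(Ok(buf)).is_err() {
                        break; // consumer went away
                    }
                }
                Err(e) => {
                    let _ = full_tx.send(Err(e));
                    break;
                }
            }
        }
    });

    let mut result = Ok(());
    for msg in full_rx {
        match msg {
            Ok(buf) => {
                consume(&buf);
                let _ = free_tx.send(buf); // recycle the buffer
            }
            Err(e) => {
                result = Err(e);
                break;
            }
        }
    }
    drop(free_tx); // unblocks the I/O thread if it is waiting for a buffer
    io_thread.join().expect("I/O thread panicked");
    result
}
```

With three buffers, one can be in the consumer, one queued, and one being filled, which matches the "prefetch chunk N+1" shape while keeping the RAM ceiling fixed.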

Rationale

  • Measured deferred path: 2.9s with the disk-read version → 8ms with the RAM cache. Random reads on \\.\C: are serialized in the kernel regardless of the number of outstanding handles, so they do not shrink with parallel I/O
  • Whole scan 5.0s→2.1s (read at 1.6s is the limiter)
  • fmf io-probe C: ($MFT 1.54GiB) measured: buffered sync 962.6 / +SEQUENTIAL_SCAN 960.9 / NO_BUFFERING sync 958.2 / NO_BUFFERING+overlapped QD2 1075.9 / QD4 1101.6 MB/s (+14.4%). Below the adoption bar (read +30% for Stage 2; adopt at whole-scan −25%). The whole-scan effect is projected at 2.0→~1.85s, not worth it given the current state already clearing the M2 scan gate (60s) by 30×
  • EntryId assignment appends worker batches in chunk order, so it matches the sequential version deterministically (admin test streaming_scan_matches_reference is the equivalence gate)

Impact

  • Temporary RAM: 3×16MiB pipeline buffers + extension-record cache (cap 128Ki entries). Overflow increments the ext_name_cache_skipped counter + falls back to disk
  • I/O thread startup failure increments the scan_pipeline_fallbacks counter + degrades to sequential reads (does not stay silent)
  • fmf io-probe is kept on hand as a measurement tool

Re-examination trigger

  • If a multi-volume concurrent-scan requirement arises, or the scan gate is tightened to 10s or below (re-evaluate Stage 2 overlapped multiplexing)

ADR-0012: keep the default allocator + RecordArena for scan temporaries (mimalloc rejected)

Date: 2026-06-11 / Status: Accepted

Decision

Do not swap out the global allocator; keep the default. Scan temporaries (the many ~1KiB records of the deferred/extension-record cache) stop using individual Boxes and are allocated contiguously into a slot-addressed RecordArena (scan.rs).
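A slot-addressed arena of this kind can be sketched in a few lines. This is an illustration under assumptions, not the RecordArena in scan.rs: the field names, the (offset, len) span table, and the u32 slot type are all hypothetical. The point it shows is that records land in one contiguous allocation, so freeing the arena releases one block instead of many ~1KiB heap fragments.

```rust
/// Contiguous storage for many small records, addressed by slot number.
/// Hypothetical sketch; the real RecordArena may differ in layout and API.
pub struct RecordArena {
    bytes: Vec<u8>,
    spans: Vec<(usize, usize)>, // (offset, len) per slot
}

impl RecordArena {
    pub fn with_capacity(records: usize, typical_len: usize) -> Self {
        Self {
            bytes: Vec::with_capacity(records * typical_len),
            spans: Vec::with_capacity(records),
        }
    }

    /// Copies the record to the end of the arena and returns its slot.
    pub fn push(&mut self, record: &[u8]) -> u32 {
        let slot = self.spans.len() as u32;
        self.spans.push((self.bytes.len(), record.len()));
        self.bytes.extend_from_slice(record);
        slot
    }

    /// Returns the record stored in `slot`, or None for an unknown slot.
    pub fn get(&self, slot: u32) -> Option<&[u8]> {
        let (offset, len) = *self.spans.get(slot as usize)?;
        Some(&self.bytes[offset..offset + len])
    }
}
```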

Rationale

  • Individual Boxes leave heap fragments after free that persist as a WS delta not visible in accounting. Going to RecordArena: real-C: steady-state WS 124.2→119.9MiB (−4.3MiB, consistent across 3 measurements)
  • mimalloc A/B measured (fmf-cli feature gate, after a real-C: scan): steady-state WS 119.9MiB → ~380MiB (+260MiB). mimalloc keeps freed segments in its own cache and does not return them to the OS, so scan temporaries stay resident. Query p50 improves by a few percent, but that is out of the question against the WS gate (≤110B/entry) → rejected
  • RecordArena is a homegrown implementation with zero dependencies

Impact

  • RAM is measured on the engine process WorkingSet basis (CLAUDE.md performance pass line), so the allocator's OS-return behavior lands directly on the gate. This pins the premise that a small self-accounted footprint alone is not enough
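The WorkingSet gate arithmetic can be checked with a small helper. A sketch only: the function names are hypothetical, and the entry count used in the usage note (1,268,560, the real-C: figure recorded elsewhere in this document) is assumed to apply to the mimalloc measurement as well.

```rust
/// WS gate from this ADR: at most 110 bytes of WorkingSet per indexed entry.
pub const WS_GATE_BYTES_PER_ENTRY: f64 = 110.0;

/// Converts a WorkingSet reading in MiB into bytes per entry.
pub fn bytes_per_entry(working_set_mib: f64, entries: u64) -> f64 {
    working_set_mib * 1024.0 * 1024.0 / entries as f64
}

/// True when the measured WorkingSet stays inside the per-entry gate.
pub fn passes_ws_gate(working_set_mib: f64, entries: u64) -> bool {
    bytes_per_entry(working_set_mib, entries) <= WS_GATE_BYTES_PER_ENTRY
}
```

For 119.9MiB at 1,268,560 entries this gives about 99B/entry (inside the gate); for ~380MiB at the same entry count it gives about 314B/entry (far outside it).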

Re-examination trigger

  • Only if mimalloc gains a stably-provided "return segments to the OS immediately" setting AND the WS gap (measured WS − self-accounting) widens beyond 10B/entry

ADR-0013: Measurement discipline (cold machine, back-to-back, real-volume absolute gate)

Date: 2026-06-11 / Status: Accepted

Decision

Performance judgments are fixed as follows: (1) baseline recording and perf-gate/bench-check run only on a cold, idle machine (confirm % Processor Performance ≥95% beforehand with typeperf); (2) criterion comparisons are limited to back-to-back A/B within the same session; (3) the final judgment is the real-volume absolute gate (query p99 ≤50ms, restore p50 ≤1s) plus a query p50 relative +50% gate. The name distribution of the synthetic 1M benchmark is calibrated to measured real C: data (identical fold 73.2% / unique names 53.2% / mean WTF-8 length 29.7B), and build_synthetic asserts those ratios every run.

Rationale

  • This machine throttles to ~75% clock after a few minutes of all-core load, drifting p50 uniformly +30 to +46% (including snapshot restore, which is pure fixed-CPU work). Confirmed via simultaneous old/new A/B that "both equally slow = machine drift".
  • criterion is also state-dependent: measuring the same code 40 minutes apart drifts +30% (parse_compile, a µs-class pure-CPU bench).
  • p99-of-50-runs is effectively max (a single OS hiccup trips it). Even at 200 runs it swings ±60%, so p99 is gated only by the absolute budget (50ms).
  • Synthetic criterion benches move ±12 to 23% from code layout alone (a synthetic "regression" that did not reproduce on real C: and was actually −4%). Real breakage shows up at +48% / 5x class, clearly outside the p50 relative +50% gate.
  • The pre-calibration synthetic index had all-unique, lowercase-only names, making it useless for judging pool/column layout.
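The "p99-of-50-runs is effectively max" point follows directly from nearest-rank percentiles: with n = 50 the rank is ceil(0.99 × 50) = 50, the largest sample. A minimal sketch, assuming the nearest-rank definition (the document does not say which percentile definition bench-check uses); percentile and passes_query_gate are hypothetical names, and only the query half of the gate is modeled.

```rust
/// Nearest-rank percentile: the value at 1-based rank ceil(pct * n / 100).
/// Integer arithmetic avoids floating-point rounding in the rank.
pub fn percentile(samples: &[f64], pct: u32) -> f64 {
    assert!(!samples.is_empty() && (1..=100).contains(&pct));
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let rank = (pct as usize * sorted.len() + 99) / 100;
    sorted[rank - 1]
}

/// Query gate from this ADR: absolute p99 <= 50ms and p50 within +50% of baseline.
pub fn passes_query_gate(query_ms: &[f64], baseline_p50_ms: f64) -> bool {
    percentile(query_ms, 99) <= 50.0 && percentile(query_ms, 50) <= baseline_p50_ms * 1.5
}
```

With 49 samples at 1.0ms and one 80ms hiccup, p50 stays at 1.0ms while p99 is the 80ms outlier itself, which is why p99 is judged only against the absolute budget.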

Consequences

  • p50 regressions under +50% are not detected by the real-volume gate (detection is handled by the back-to-back 10% median gate in bench-micro-check).
  • "all items including restore degrade uniformly" is treated as a thermal signature, not judged a code regression (re-measure cold).
  • The baseline is machine-dependent. Re-record when the volume's entry count drifts more than 10% from the baseline.

Re-examination triggers

  • If a thermally stable machine dedicated to measurement (constant clock >=95%) becomes available, reconsider tightening the relative gate.

ADR-0014: Build tooling rejection record and codegen-units=1

Date: 2026-06-11 / Status: Rejections recorded (codegen-units=1 accepted)

Decision

Do not adopt rust-lld, sccache, or cargo-nextest. The release profile keeps codegen-units = 1 + lto = "thin" (engine/Cargo.toml).

Rationale

  • Fair A/B of rust-lld vs MSVC link.exe (3 crates in the engine workspace): fmf-cli incremental 1.72s vs 1.73s, full test link after fmf-core change 3.44s vs 3.46s — no difference. Zero measured improvement does not justify the risk of a non-standard linker (DLL output, CI divergence).
  • sccache rejected because it disables incremental compilation. cargo-nextest rejected because the test suite is small and shows no benefit (both A/B decisions on the same day).
  • codegen-units=1: rustc splits codegen units per module, so splitting the query kernel into exec/sweep/matchers/memo loses cross-module inlining and costs ~10% in query latency (A/B measured in the same machine state). With 1 unit, hot-path inlining is independent of module layout.

Consequences

  • Release build time grows by the codegen-units=1 amount (acceptable).
  • The query kernel's file-split refactoring can be done independently of runtime performance.
  • Build-speedup proposals should check this ADR first (re-proposal prevention).
  • rust-cache (Swatinem/rust-cache = GitHub Actions cache) is not a target of this ADR's rejection: unlike sccache it does not wrap rustc invocations; it only archives/restores ~/.cargo and target as artifacts, so it does not break incremental compilation. CI's CARGO_INCREMENTAL=0 is also CI-workflow-only and does not propagate to local incremental. CI speedups (parallel job split, shared-key cache sharing, dll artifact sharing, PR cancel-in-progress) fall under this and are permitted (ci.yml).

Re-examination triggers

  • Re-measure rust-lld only if the workspace grows and link reaches the tens-of-seconds class.

ADR-0015: WinUI 3 data virtualization (non-generic IList+INCC+IItemsRangeInfo)

Date: 2026-06-11 / Status: Accepted

Decision

Result-list virtualization uses non-generic IList + INotifyCollectionChanged + IItemsRangeInfo + placeholders (VirtualResultList). Do not use ISupportIncrementalLoading, ItemsView, or ItemsRepeater. ItemsPanel is fixed to ItemsStackPanel. VirtualResultList is a single instance with the same lifetime as the page (x:Bind OneTime), and ItemsSource is not swapped. New results are published via Reassign (apply prefetched seed + one INCC Reset); a re-query where the engine returns QueryTrace.unchanged=true (same query, ID sequence memcmp-equal across the whole volume) uses RefreshInPlace (no Reset, in-place fill of visible rows, count text unchanged).

Rationale

  • For random-access virtualization with a known count, "non-generic IList + INCC + IItemsRangeInfo + placeholders" is the explicitly supported path in current WASDK. IList<T> alone does not work (microsoft-ui-xaml#1809).
  • ISupportIncrementalLoading has crash reports, so avoid it (microsoft-ui-xaml#6883).
  • ItemsView / ItemsRepeater do not support the above interfaces. Setting ItemsPanel to anything other than ItemsStackPanel disables virtualization.
  • Swapping ItemsSource discards the ListView's virtualization state and reintroduces flicker.
  • Windows is never silent even when idle (USN batches from logs, telemetry, etc.). IndexChanged-driven re-queries return identical results every 200ms, so re-issuing Reset would churn the screen constantly — RefreshInPlace on unchanged (the MVVM setter notifies only on value change) brings redraw of the same screen to zero.

Consequences

  • IList residency contract: residency = "index is less than Count, and the corresponding slot in the current page cache is that same instance". A false "residency" causes GetAt(staleIndex) to crash deep in XAML (demonstrated: search with results -> clear all reliably reproduces an Int32.MaxValue-1 exception. Fix A/B: UIA stress went from 4 errors on the old code to 0).
  • The indexer throws immediately when the index is out of range, and it never fetches (a non-resident row returns a placeholder). Enumeration/CopyTo do not disturb the page LRU (cap 4096 rows).
  • The UI-thread check in Reassign/RefreshInPlace is always enabled in Release.
  • In-place updates only update cells whose value changed (e.g. the size of a grown file).

Re-examination triggers

  • If WASDK officially provides known-count random-access virtualization (IItemsRangeInfo equivalent) for the ItemsView family.

ADR-0016: v2 service split — fmf-service + named pipe

Date: 2026-06-11 / Status: Accepted (only the duplicated contract constants + value-pin sync operation is superseded by ADR-0018)

Decision

Host the engine in a privileged service fmf-service (hosts fmf-core directly, LocalSystem), make the UI non-privileged (asInvoker), and connect over a named pipe. Wire definitions live in a new rlib fmf-proto, and PipeEngineClient becomes the third implementation of IEngineClient. The canonical spec is the "Pipe protocol" section of docs/ARCHITECTURE.md. The FFI (fmf_engine.dll) and in-proc paths persist for now (--engine=inproc, requires manual elevation).

Rationale

  • The MVP's requireAdministrator runs the whole app as administrator: UIPI kills Explorer→window drag & drop (known limitation in README), and "open" needed an explorer.exe de-elevation workaround
  • The design reserved this split from the start: the fmf-ffi no-logic rule, the IEngineClient swap boundary, the per-machine index under %ProgramData%, and the "shared with the pipe protocol" note in the error-code table
  • A resident service achieves "the index stays fresh via USN tracking even when the UI is not running"

Rejected transports

  • COM / RPC (out-of-process) — registry registration, marshalling definitions, and elevation-boundary complexity; worse wire observability vs. a length-prefixed named pipe
  • gRPC / HTTP (localhost) — network stack drifts toward the "won't do" server features; dependency (tokio/tonic) clashes with fmf-core's synchronous threading; HTTP/2 overkill for local IPC
  • Shared memory + events — fastest page transfer, but self-designing lifetime/permissions/generation loses the "1 FFI function = 1 message" mapping; unneeded since the pipe round-trip has budget headroom (baseline in ARCHITECTURE.md latency-budget section)
  • async runtime (tokio) — at most a few connections; blocking I/O + threads fit the existing design; only adds dependency and build time
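For comparison with the rejected transports, length-prefixed framing over a byte stream is small enough to sketch in full. The layout here (a u32 little-endian length followed by the payload) is an assumption for illustration only; the actual frame layout is whatever the "Pipe Protocol" section specifies, and encode_frame / decode_frame are hypothetical names.

```rust
/// Prefixes `payload` with its length as a little-endian u32 (assumed layout).
pub fn encode_frame(payload: &[u8]) -> Vec<u8> {
    assert!(payload.len() <= u32::MAX as usize);
    let mut frame = Vec::with_capacity(4 + payload.len());
    frame.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    frame.extend_from_slice(payload);
    frame
}

/// Splits one complete frame off the front of `buf`.
/// Returns (payload, rest), or None while the frame is still incomplete.
pub fn decode_frame(buf: &[u8]) -> Option<(&[u8], &[u8])> {
    if buf.len() < 4 {
        return None;
    }
    let len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let body = &buf[4..];
    if body.len() < len {
        return None;
    }
    Some(body.split_at(len))
}
```

Because every frame is a plain byte sequence, the same bytes can be pinned as golden fixtures by tests on both sides of the pipe.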

flush exposure surface (3 options)

The premise is to materialize Engine::flush() (VolumeSlot's shared checkpoint + generation-pair dirty-skip). Three options for the exposure surface were compared:

  • Option 1: expose as a pipe opcode — rejected. Client-driven flush spamming repeatedly holds index.read(), a local DoS path that stalls USN application (SECURITY.md threat 6)
  • Option 2: not even in FFI, service-internal function only — rejected. The in-proc (--engine=inproc) path and tests cannot reproduce save timing, and it punches a hole in the contract mapping table (1 FFI function = 1 message)
  • Option 3: adopted — FFI fmf_flush is exported, the pipe only reserves opcode 11 as a number

Saving is a service-internal responsibility — periodic (default 300s, staggered across volumes, dirty only) + on SCM Stop/PRESHUTDOWN. Because the PRESHUTDOWN default grace has been shortened to 10 seconds on current Windows (docs/RESEARCH.md), set an explicit extension via SERVICE_PRESHUTDOWN_INFO at install time.
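The periodic policy (fixed period, staggered across volumes, dirty only) can be sketched as two small functions. Both are assumptions: the even spacing period × i / n is one plausible way to stagger, and the two fields of Generations are hypothetical stand-ins for the generation pair that Engine::flush() compares.

```rust
use std::time::Duration;

/// Spreads the flush times of `volumes` volumes evenly over one period
/// (assumed staggering scheme: volume i fires at period * i / volumes).
pub fn staggered_offsets(period: Duration, volumes: u32) -> Vec<Duration> {
    (0..volumes).map(|i| period * i / volumes).collect()
}

/// Hypothetical generation pair recorded at the last successful save.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Generations {
    pub index: u64,
    pub checkpoint: u64,
}

/// Dirty-skip: a volume is written only if something changed since the last save.
pub fn needs_flush(saved: Generations, current: Generations) -> bool {
    saved != current
}
```

With the default 300s period and three volumes, the offsets come out as 0s, 100s, and 200s, so the volumes never save at the same moment.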

Distribution

MSIX/installer is deferred for this milestone (WindowsPackageType=None kept). Service deployment is established via fmf-service install (sc.exe cannot substitute, because SID capture, DACL setup, and privilege stripping must be done atomically) + a justfile recipe + README instructions. Switching to asInvoker is conditioned on a working service-deployment mechanism (default behavior when the service is not deployed: an InfoBar with explanation + fake fallback + a "Restart as administrator" button).

Consequences

  • 2 new crates (fmf-proto / fmf-service). fmf-ffi and the DLL name fmf_engine are unchanged
  • The 3 synchronous IEngineClient methods (ListVolumes/StartIndexing/GetStatus) become Task-returning (sync across the pipe = a violation of the UI-thread "must not freeze" rule)
  • The single-writer invariant extends across processes: {index_dir}\.writer.lock + FMF_E_LOCKED=7
  • Both the Rust and C# test suites pin identical golden frames (byte sequences), fixing wire drift the same way as contract_tests
  • Removal trigger for FfiEngineClient (--engine=inproc): completion of a one-release soak after service GA
  • drag-out (results→Explorer) is filed separately as a new feature outside this milestone (only the drop direction is resolved here)

Verification (measured 2026-06-11. Canonical numbers are the CLAUDE.md performance pass-line and the ARCHITECTURE.md latency-budget section)

  • First index, real C: 2.31s @1,268,560 entries (gate: 1M≈60s. just bench-check). End-to-end via the real service binary (service_admin.rs, console-mode child process) also confirms real-C: scan→Ready→query
  • USN→event 250.9ms (gate 1s. Measured with periodic flush at a 10s interval firing. Almost all of it is the intended engine-side 200ms debounce. The UI side adds the existing 50ms debounce + rendering)
  • kill→restart→restore 1.25s (including process startup. Gate 2s. The engine-alone restore p50 is 108ms). The same test also proves the snapshot survives (durability) via the periodic flush before the hard kill
  • Search p99 ≤5.6ms for all queries on real C: (gate 50ms) / the loopback round-trip for a 64-row ResultPage p99 ≤5ms is constantly asserted by the test (pipe_loopback.rs::page_roundtrip_stays_inside_the_latency_budget)
  • RAM: the engine is the same code as the fmf-cli measurement (~99B/entry, WS 119.9MiB @1.27M). The fmf-service addition is only the pipe threads and queues (event queue cap 256×32B/connection)
  • SCM registration (fmf-service install → start → stop → uninstall) real-machine smoke — left as a manual procedure. Registering the persistent LocalSystem auto-start service is done by user action (just service-install). The SCM-path code goes through the windows-service crate, and the serve core is shared with the console E2E
  • SECURITY.md manual verification checklist (other-user reject / remote reject need a separate token / separate machine)

Re-examination triggers

  • If environments where pipe page-fetch p99 exceeds 5ms become routine (re-evaluate a multi-page batch-fetch opcode, or shared-memory page transfer)
  • Real demand for concurrent multi-user use (fmf-service authorize <user> to register multiple authorization SIDs)

ADR-0017: Service security model

Date: 2026-06-11 / Status: Accepted

Decision

fmf-service runs as LocalSystem and, at install time, strips privileges to a minimal set via SERVICE_CONFIG_REQUIRED_PRIVILEGES_INFO (SCM removes undeclared privileges from the token — docs/RESEARCH.md). The pipe has 4-layer defense — (1) explicit SDDL (SYSTEM + only the user SID captured at install time) (2) PIPE_REJECT_REMOTE_CLIENTS (3) FILE_FLAG_FIRST_PIPE_INSTANCE (4) token check on connect (ImpersonateNamedPipeClient) — guaranteeing "same user only, reject remote, reject anonymous". %ProgramData%\find-my-files gets a protective DACL at install time (SYSTEM+Administrators; user read only on the logs subdirectory). The standing threat-model document is docs/SECURITY.md (this ADR records the decision only).

Rationale

  • Adopt LocalSystem / reject a dedicated low-privilege account + SeBackupPrivilege: the verified fact only goes as far as "opening a volume handle (\\.\C:) requires administrator". There is no documented guarantee that SeBackupPrivilege grants raw volume reads (docs/RESEARCH.md only describes ACL bypass for regular files). Rather than bet on an unverified privilege configuration, narrow the attack surface with the verified SYSTEM + privilege stripping + zero network capability + minimal pipe-surface opcodes.
  • Name the user SID / reject Authenticated Users: Authenticated Users RW lets other users on a multi-user machine search every file name (a name leak that bypasses ACLs). Allowing Administrators also fails: in a UAC-filtered token the Administrators SID becomes SE_GROUP_USE_FOR_DENY_ONLY and is not used in allow ACEs (docs/RESEARCH.md). So store the install-running user's individual SID in service.json and use it in both the SDDL and the token check.
  • Handling OTS elevation (elevating with a different administrator account): when a standard user enters a different administrator credential at UAC, the install-running user (= the admin used to elevate) != the everyday user, and the everyday user can no longer connect to their own service. The non-elevated UI forwards its own SID to install via --owner-sid, and install validates it via validate_user_sid (accepting only SIDs for which LookupAccountSid returns SidTypeUser) before recording it alongside. The validation defends against threat 7 (injecting an arbitrary SID so someone else reads all file names) — install requires elevation and already has sc.exe-equivalent rights, but unresolvable / non-user-type SIDs are silently dropped (install itself does not fail).
  • Applying authorized_sids requires a restart: the service reads service.json once at startup and bakes that value into both DACL construction and the connect-time token check (immutable while running). So to add a SID to a running instance, install (idempotent append) must be followed by fmf-service restart (stop->start) — start alone is a no-op with ERROR_SERVICE_ALREADY_RUNNING and keeps rejecting with the old allow list (the root cause of the regression that appeared as repeated pipe client token rejected on real hardware). The app's registration flow runs install->restart consecutively.
  • Reason for defense in depth: a mistake building the SDDL string is the accident pattern of "silently wide open". Pin the structure of the build function with a non-elevated unit test, and place the connect-accept token check independently. Blocking anonymous access is primarily defended by the explicit DACL (no anonymous ACE = default deny) — do not rely on NullSessionPipes defaults, which are machine-type/policy-dependent (docs/RESEARCH.md).
  • Protective DACL on %ProgramData%: under the default ACL a general user can directly read .fmfidx (which contains every file name) — no matter how hard the pipe is locked down, it leaks from the side. Leave user read only on logs (to keep the non-elevated F12 "copy diagnostic info" flow working).
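The "pin the structure of the build function" idea can be sketched as a pure string builder plus a trustee extractor that a non-elevated unit test can assert on. This is an illustration under assumptions: the function names are hypothetical, the granted rights (GA for SYSTEM, GRGW for the user) are placeholders rather than the project's actual ACEs, and the syntactic SID check stands in for the LookupAccountSid-based validate_user_sid described above. The SDDL tokens themselves (D: for DACL, P for protected, A for allow, SY for Local System) are standard.

```rust
/// Syntactic check only: "S-1-" followed by at least two numeric components.
pub fn is_sid_syntax(s: &str) -> bool {
    match s.strip_prefix("S-1-") {
        Some(rest) => {
            let parts: Vec<&str> = rest.split('-').collect();
            parts.len() >= 2
                && parts
                    .iter()
                    .all(|p| !p.is_empty() && p.bytes().all(|b| b.is_ascii_digit()))
        }
        None => false,
    }
}

/// Builds a protected DACL that names only SYSTEM and one user SID.
/// Rights are placeholders (assumption); refuses anything that is not SID-shaped.
pub fn build_pipe_sddl(user_sid: &str) -> Option<String> {
    if !is_sid_syntax(user_sid) {
        return None;
    }
    Some(format!("D:P(A;;GA;;;SY)(A;;GRGW;;;{user_sid})"))
}

/// Extracts the trustee (sixth field) of every ACE for structure pinning.
pub fn ace_trustees(sddl: &str) -> Vec<&str> {
    sddl.split('(')
        .skip(1)
        .filter_map(|ace| ace.trim_end_matches(')').split(';').nth(5))
        .collect()
}
```

A test can then assert that the trustee list is exactly SYSTEM plus the captured SID, so a wide trustee such as WD (Everyone) or AU (Authenticated Users) fails the build rather than shipping silently.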

Consequences

  • In addition to SCM registration, install atomically does SID capture -> service.json, the directory DACL, privilege stripping, and explicit SERVICE_PRESHUTDOWN_INFO (current Windows' default grace is only 10 seconds) -> not expressible via sc.exe, so the fmf-service install subcommand is the only choice (making the logic unit-testable).
  • uninstall keeps data by default (--purge-data deletes .fmfidx/logs/service.json). The leftover artifacts are documented in README and SECURITY.md.
  • Client-connection prerequisites (verified on real hardware with the non-elevated UI): (1) the client opens the pipe at Identification level (C# TokenImpersonationLevel.Identification / Rust SECURITY_SQOS_PRESENT) — at the default anonymous level the server's ImpersonateNamedPipeClient gets an anonymous token, and the connect-time SID check rejects even authorized users entirely. (2) The client-side fake-server check (threat 4) is done not by SYSTEM-token comparison but by PID comparison of the SCM-registered service (QueryServiceStatusEx) — the non-elevated UI cannot open a SYSTEM process's token (ACCESS_DENIED) and cannot get the session-0 identity. Both were blind spots not exposed in console-mode tests where authorized_sids is empty and the token check is skipped; they only appear with the installed service.
  • "reject other users" and "reject remote" cannot be auto-verified on the dev machine/CI (they need another user's token / another machine) -> substituted by structure-pinning the SDDL build function + the manual checklist in SECURITY.md. Do not create a pipe-creation code path that bypasses the build function (review point).
  • Residual risk (accepted): an authorized user can also search the "name and path" of files invisible under their own ACL (a structural property of a file-name-only index; contents are unreadable). Documented in SECURITY.md.

Re-examination triggers

  • If a documented demonstration of a low-privilege indexer appears (e.g. a raw volume read means equivalent to FSCTL_READ_UNPRIVILEGED_USN_JOURNAL) -> re-evaluate demoting LocalSystem.
  • SERVICE_SID_TYPE_RESTRICTED + an explicit ACE on the index directory (a v2.1 hardening candidate).
  • Real demand for multi-user machines (UX for registering multiple authorized SIDs).

ADR-0018: Contract single source of truth (fmf-contract) + capture-first golden corpus

Date: 2026-06-11 / Status: Accepted (supersedes only ADR-0016's "duplicate contract constants and sync by value pinning" practice. The named-pipe adoption, rejected transport alternatives, flush public surface, and distribution decisions are unchanged)

Decision

Introduce a dependency-free leaf crate fmf-contract (rlib) at the bottom of the dependency graph as the machine-readable source of truth for the engine contract (status codes, opcodes, event kinds, wire PODs, QueryOptions, limits, version numbers, pipe name), radiating a single definition to all Rust consumers (fmf-core / fmf-proto / fmf-ffi / fmf-service). For C#, the gen-contract binary inside fmf-contract radiates app/FindMyFiles/Engine/Generated/EngineContract.g.cs (constants, enums, [StructLayout(Explicit)] structs, CountersData DTO) as a checked-in generated artifact, and fmf-contract/tests/drift.rs (byte match between regeneration and the committed artifact) continuously detects drift inside cargo test --workspace.

The contract semantics are carried by contract/golden/ at the repository root (manifest + byte streams + shared JSON fixtures) as an executable specification. The corpus is captured from the current implementation before the refactor begins (capture-first); thereafter both Rust (fmf-proto) and the independently hand-written C# codec (PipeProtocol/PageCodec) pin the same files. Re-capture (bless) only happens via explicit invocation with FMF_BLESS=1 — the ritual for an intentional contract change; normal test runs require a match against the existing bytes.
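The match-or-bless rule can be sketched as one helper. A minimal illustration, assuming a file-per-fixture corpus; check_golden, the Golden enum, and bless_requested are hypothetical names, and only the FMF_BLESS=1 switch comes from the text above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Outcome of comparing freshly produced bytes with a golden file.
#[derive(Debug, PartialEq)]
pub enum Golden {
    Match,
    Blessed,
    Drift { first_diff: usize },
}

/// True only when re-capture was requested explicitly via FMF_BLESS=1.
pub fn bless_requested() -> bool {
    std::env::var("FMF_BLESS").map(|v| v == "1").unwrap_or(false)
}

/// Normal runs require a byte match; a bless run overwrites the golden file.
pub fn check_golden(path: &Path, actual: &[u8], bless: bool) -> io::Result<Golden> {
    if bless {
        fs::write(path, actual)?;
        return Ok(Golden::Blessed);
    }
    let expected = fs::read(path)?;
    if expected.as_slice() == actual {
        return Ok(Golden::Match);
    }
    // Report where the streams diverge; a pure length change diverges at the shorter length.
    let first_diff = expected
        .iter()
        .zip(actual)
        .position(|(a, b)| a != b)
        .unwrap_or(expected.len().min(actual.len()));
    Ok(Golden::Drift { first_diff })
}
```

A test would call check_golden(path, &bytes, bless_requested()) and fail on anything other than Match or Blessed.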

Additionally, limit the engine's internal OS-effect seams to SnapshotStore / JournalSource (2 traits only) (to push the volume worker's failure paths down into non-elevated, deterministic tests), and forbid additional porting beyond this cap.

Rationale

The duplication rationale was based on a misreading of Cargo

The current fmf-proto/src/lib.rs:3-5 and fmf-ffi/Cargo.toml claim that "fmf-ffi is a cdylib, so it cannot depend on / be depended upon, therefore the error-code table is duplicated and synced via value pinning in contract_tests." This is false: only the direction "another crate depends on a cdylib" is impossible; a cdylib depending on an rlib is perfectly fine (fmf-ffi already depends on fmf-core, an rlib). Placing a dependency-free rlib at the bottom replaces "duplicated definitions detected after the fact in tests" with "one definition that cannot drift," which structurally eliminates 3 of the 6 confirmed high-severity audit findings (duplicate code table, scattered event-kind magic numbers, unmet "pin the same golden bytes" claim).

capture-first (corpus first, refactor second)

Generating the golden corpus "from the new contract crate" would bake generator bugs into the spec itself (self-consistency trap: a test that only proves the generator agrees with itself). Capturing and sealing the current implementation's bytes first means (1) "wire/ABI bytes unchanged" from S1 onward is proven by byte match rather than circumstantial evidence, and (2) the generator is put on the side required to "reproduce the captured bytes."

Generation method: explicit command + check-in + drift test (not subject to ADR-0014)

gen-contract is not wired into MSBuild/build hooks (consistent with the no-custom-Directory.Build.props rule and ADR-0014's rejection of build complexity). Equivalent guarantees come from explicit just contract-gen invocation + committing the artifact + drift verification inside cargo test --workspace (rides on the existing lefthook pre-push / CI test job with no changes). FieldOffset and similar values are taken from the offset_of! actual values of compiled Rust types, so there is zero hand calculation and value drift is impossible in the type system. Missing enum entries are detected three ways: drift + golden + a C# startup Marshal.SizeOf assert.
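Taking offsets from the compiled type rather than computing them by hand can be sketched with std::mem::offset_of! (stable since Rust 1.77). Everything named here is hypothetical: EventSketch is not the real FmfEvent, its fields are invented for illustration, and the emitted C# lines only approximate what gen-contract writes.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical 32-byte wire POD; not the real FmfEvent layout.
#[repr(C)]
pub struct EventSketch {
    pub kind: u32,
    pub volume: u32,
    pub generation: u64,
    pub arg0: u64,
    pub arg1: u64,
}

// Layout pin: compilation fails if the size ever changes.
const _: () = assert!(size_of::<EventSketch>() == 32);

/// (field name, C# type, offset) with offsets read from the compiled layout.
pub fn meta() -> Vec<(&'static str, &'static str, usize)> {
    vec![
        ("kind", "uint", offset_of!(EventSketch, kind)),
        ("volume", "uint", offset_of!(EventSketch, volume)),
        ("generation", "ulong", offset_of!(EventSketch, generation)),
        ("arg0", "ulong", offset_of!(EventSketch, arg0)),
        ("arg1", "ulong", offset_of!(EventSketch, arg1)),
    ]
}

/// Emits one C#-style field line per entry, using the measured offsets.
pub fn emit_field_offsets() -> String {
    let mut out = String::new();
    for (name, cs_type, offset) in meta() {
        out.push_str(&format!("[FieldOffset({offset})] public {cs_type} {name};\n"));
    }
    out
}
```

Reordering or resizing a field changes the emitted text, which is exactly what a byte-match drift test against the committed artifact would catch.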

Rejected alternatives

  • Full Platform porting (~10 traits + new fmf-win crate): speculative generalization against the Windows-only charter; doubles I/O-seam maintenance permanently. Adopt only the 2 seams with demonstrated test value.
  • Wire version bump (pipe name v2 / event opcode cleanup / PROTOCOL_VERSION=2): contradicts the bytes-unchanged principle and ruins the captured corpus's regression-oracle property; the benefit does not justify the ritual cost. Do it in a separate ADR if needed.
  • Macro DSL for contract definitions (contract_consts! etc.): over-machinery for ~40 constants + 6 PODs; plain definitions + a meta() function (direct offset_of!) give the same guarantee.
  • A dedicated crate just for gen: fmf-contract/src/bin/gen-contract.rs suffices and keeps the crate count at 6.
  • fmf-proto → fmf-core dependency (put the conversion layer on the core side): the contract source's leaf property would no longer be enforced by Cargo. Unify the dependency direction to core→contract and eliminate the conversion layer itself.
  • Wholesale vocabulary replacement (scan→ingest / diag→obs etc.): permanently diverges from the language of 17 historical ADRs and degrades the "read the relevant ADR before changing structure" workflow. Adopt only "narration order = flow order"; naming is unchanged.
  • Full state-machine rewrite of the volume worker: rewriting concurrency invariants pinned only in prose (checkpoint-after-apply, compaction-generation recheck) cannot be proven old/new equivalent by new tests. Limit to behavior-preserving pure-function extraction + 2 seams.

Impact

  • 1 new crate (fmf-contract). DLL name fmf_engine, pipe name fmf-engine-v1, ABI_VERSION=1, PROTOCOL_VERSION=1, FMFIDX04 are all bytes-unchanged (no version bump).
  • fmf-ffi's contract_tests is promoted from "duplicate equality pin" to "literal absolute-value pin + ABI layout pin" and lives on — an independent tripwire where a downstream test catches an accidental edit of the single source itself.
  • Canonical contract-change flow (one-directional radiation): docs/ARCHITECTURE.md (prose) → fmf-contract (definitions) → FMF_BLESS=1 re-capture → just contract-gen → both-language tests green. The error-code table remains append-only / no renumbering as before.
  • C# decisions (user-confirmed): CountersData is also a generation target (counter additions auto-follow into C#); CancellationToken is fully propagated to ISearchResult.GetRangeAsync too (double defense with the epoch mechanism, fixed by a behavior test).
  • Migration is 11 stages (S0→S0.5→S1a→S1b→S2 strict order; S3⇔S4, S5a/S5b⇔S4/S4b may run in parallel). Each stage compiles standalone + all tests green, mergeable to main. fmf-core-touching stages (S1b/S3/S4/S4b) require just perf-gate green in an elevated shell as a merge condition.

S4 (scan.rs teardown) rollback clause

If the scan/ split exceeds the criterion 10% gate, immediately roll back to file consolidation before investigating the cause, and re-judge with ADR-0013's measurement procedure (same-time alternating A/B against the baseline-commit worktree). Because of codegen-units=1 (ADR-0014), module boundaries should be neutral to inlining, but measurement takes priority over hypothesis.

Verification

  • S0.5: capture corpus pinned by both Rust/C# suites on the same files (non-elevated cargo test + just test-app)
  • S1a: after dependency inversion, corpus match proves wire bytes unchanged. All tests pass with C# unchanged (double proof)
  • S2: byte match generated corpus == captured corpus (self-consistency trap closed) + drift test running
  • S4: streaming_scan_matches_reference (elevated) + perf-gate green
  • S4b: worker failure paths (snapshot corruption→rescan / journal-gone→Rescan→Ready / save failure) green in non-elevated, deterministic tests; old/new behavior identical in a real C: smoke
  • S6: perf-gate + FMF_ADMIN_TESTS + FMF_PIPE_TESTS all green in an elevated shell, compared numerically against this appendix's starting point

Re-examination triggers

  • A real regression in an admin-only failure path that the 2 seams cannot cover (port addition gets its own ADR then)
  • Contract-change frequency rises and the bless ritual becomes friction (re-evaluate build integration of generation)
  • pipe page-fetch p99 > 5ms becomes the norm (inherits ADR-0016's re-examination trigger)

Appendix: old→new path mapping (for history investigation; aids git log --follow)

Old → New
fmf-proto codes/PIPE_NAME/PROTOCOL_VERSION → fmf-contract codes/versions (proto re-exports)
fmf-proto QueryOptionsWire/WireRow/EventWire → fmf-contract pod::{FmfQueryOptions, FmfRow, FmfEvent}
fmf-ffi FMF_* constants / POD definitions / volume_bytes → re-exported from fmf-contract / volume::encode_label
fmf-ffi error_chain / fmf-service/dispatch error_chain → fmf-core diag::error_chain (4KiB cap)
fmf-core engine::VolumePhase → fmf-contract options::VolumeState (name unified too)
fmf-core scan.rs (1165 lines) → scan/{mod,volume_io,pipeline,parse,deferred,probe}.rs
fmf-core engine/volume.rs thread body → engine/worker.rs (+seams.rs+worker_tests.rs)
fmf-cli main.rs (878 lines) → main.rs (135 lines)+cmd/{index,stats,bench,io_probe,criterion_gate,diag}.rs+bench_support.rs
C# NativeEngine struct / status constants → Engine/Generated/EngineContract.g.cs (generated, partial NativeEngine)
C# DTOs inside IEngineClient.cs → Engine/EngineTypes.cs (CountersData moves to the generated artifact)
C# connection / result handle inside PipeEngineClient.cs → Engine/Transport/{PipeConnection,PipeSearchResult}.cs
C# MainPage.xaml.cs (452 lines) viewport/perf/converter → Controls/ResultsViewportManager / Views/PerfPanel / Converters/UiConverters (181 lines remain)
C# App.xaml.cs 3 exception handlers → Services/ExceptionPolicy.cs
per-test unique tempdir duplication (%TEMP%) → fmf-core index::testutil::TestDir (build/engine/test-tmp, RAII)

Stage commits: S0=9f7f4a6 / S0.5=c3916df / S1a=c9eb007 / S1b=fdb5407 / S2=7ce58e7 / S3=6855336 / S4=4e99077 / S4b=261fbb7 / S5a=289e60a / S5b=540d79c / S6a=6226ea8 / S6b=287f659+9d7a30d (+doc convergence commit).

Appendix: starting-point record (at refactor start)

  • Baseline commit: 97df250 (= feat/v2-service-split complete, ff-merged to main)
  • Measured values (2026-06-11, from ADR-0016 verification section): first index real C: 2.31s @1,268,560 entries / USN→event 250.9ms / kill→restore 1.25s (restore p50 108ms) / search p99 ≤5.6ms / loopback ResultPage p99 ≤5ms / RAM ~99B/entry (WS 119.9MiB @1.27M)
  • Non-elevated gate (just verify) green confirmed: run right after branch creation 2026-06-11 — fmt-check / clippy -D warnings / cargo test --workspace / C# 80/80 all pass

Appendix: final gate judgment (2026-06-12, all stages complete)

  • FMF_ADMIN_TESTS=1 (elevated): green — streaming_scan_matches_reference (scan/ split equivalence gate), real C: E2E, USN live, service kill→restore all pass
  • Real-volume absolute gate (just bench-check, elevated): green, no regression — 1,289,867 entries 2.05s / search p99 all queries ≤6.2ms (gate 50ms) / restore p50 79ms (gate 2s)
  • criterion 10% gate: 2 items exceeded initially → adjudicated with ADR-0013's alternating A/B/A:
    • post_usn/apply_batch_1k +10.6%: noise. Re-measuring identical code (97df250 itself vs its own baseline) gives CI −5.9%~+10.4% — this bench's intrinsic spread is about the same as the threshold. Re-measurement B was +2.5% (p=0.52)
    • parse_compile +13.7%→re-measure +5.7% (reproducible): a real difference but accepted — the absolute value is ~100ns/query out of 1.9µs (0.0002% of the p99 budget of 50ms). Probable cause: contract unification made SortKey/CaseMode repr(u32) (formerly rustc's default 1 byte), or a code-layout shift from changed declaration order. Since the source of truth (real-volume absolute gate) is green with wide margins on all items, it does not meet the trigger condition for the S4 rollback (file-consolidation rollback)
  • criterion "committed" baseline re-recorded at the refactor tip (baseline for the next optimization session)
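
The "accepted" judgment for parse_compile rests on simple arithmetic, which can be re-checked with a few lines of Rust. This is only a sketch: the function names are hypothetical, and the inputs are the figures quoted above (1.9µs per query, +5.7%, a 50ms p99 budget).

```rust
/// Absolute slowdown in nanoseconds for a relative regression
/// (`regression_percent`) on a baseline of `baseline_ns`.
pub fn regression_ns(baseline_ns: f64, regression_percent: f64) -> f64 {
    baseline_ns * regression_percent / 100.0
}

/// Share of a latency budget consumed by an added cost, in percent.
pub fn budget_share_percent(added_ns: f64, budget_ns: f64) -> f64 {
    added_ns / budget_ns * 100.0
}
```

With these, regression_ns(1900.0, 5.7) is about 108ns (the "~100ns/query" above), and budget_share_percent(100.0, 50_000_000.0) is 0.0002.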

ADR-0019: Focused mode (focused search) is a pure query rewrite in the UI layer

Date: 2026-06-12 / Status: Accepted

Decision

For the request "the files a general user looks for are limited in both directory and format", focused mode is realized as a pure query rewrite in the UI layer (ViewModels/FocusedQueryRewriter.Compose — static, no side effects). It does not touch the engine (Rust), the wire contract, or the index at all.

  • Split the user query on top-level | (do not split | inside quotes — same quoting rules as the engine's tokenizer), append config-derived suffixes to each OR group, and rejoin:
    • Excluded paths: for each entry p, !path:"p" (quoted negation — noise areas such as \windows\)
    • Format whitelist: ext:e1;e2;… as one term (the value of ext: is OR semantics, so do not add more terms)
  • Collision avoidance (the user's explicit intent always wins): if a group contains ext:/regex:, do not append the ext whitelist; if it contains path: or \, do not append excluded paths. The check is a simple substring test on the group string (over-matching only ever falls toward "skip the append" = the safe side).
  • An empty query is returned empty (the rule "do not throw an empty query at the engine" remains owned by the Orchestrator). An excluded-path config value containing " is unescapable in the query language, so it is ignored + warned (first time only).
  • Settings live in %APPDATA%\find-my-files\settings.json (UI-owned): focused_search (default true) / focused_exclude_paths / focused_extensions. The UI is a ToggleButton next to the search box; a toggle change is a filter-originated re-query (RequeryOrigin.Filter = reset to top).
  • SearchOrchestrator.FocusedSearch defaults to false (to keep existing tests and existing behavior intact). Only the product wiring (MainViewModel) passes in the value from settings (default true).
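
The rewrite rules above can be sketched as follows. The product code is C# (FocusedQueryRewriter.Compose); this Rust version is an illustration only, not the shipped implementation. FocusedConfig, split_top_level_or, the " | " rejoin separator, the suffix order, and the handling of empty OR groups are all assumptions made for the sketch. The product also warns once when it ignores an excluded path containing a double quote; the sketch skips it silently.

```rust
/// Hypothetical config mirroring the focused_exclude_paths /
/// focused_extensions settings keys.
pub struct FocusedConfig {
    pub exclude_paths: Vec<String>,
    pub extensions: Vec<String>,
}

/// Split on `|` only where it appears outside double quotes.
fn split_top_level_or(query: &str) -> Vec<&str> {
    let mut groups = Vec::new();
    let mut in_quotes = false;
    let mut start = 0;
    for (i, ch) in query.char_indices() {
        match ch {
            '"' => in_quotes = !in_quotes,
            '|' if !in_quotes => {
                groups.push(&query[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    groups.push(&query[start..]);
    groups
}

/// Pure rewrite: append config-derived suffixes to each OR group.
pub fn compose(query: &str, cfg: &FocusedConfig) -> String {
    // Assumption: a whitespace-only query counts as empty.
    if query.trim().is_empty() {
        return String::new();
    }
    // The whitelist is ONE term; `ext:` values are OR-ed with `;`.
    let ext_term = if cfg.extensions.is_empty() {
        None
    } else {
        Some(format!("ext:{}", cfg.extensions.join(";")))
    };
    let groups: Vec<String> = split_top_level_or(query)
        .into_iter()
        .map(|group| {
            let mut out = group.trim().to_string();
            if out.is_empty() {
                return out; // assumption: leave empty groups untouched
            }
            // Plain substring tests: over-matching only skips an append.
            let user_set_format = group.contains("ext:") || group.contains("regex:");
            let user_set_path = group.contains("path:") || group.contains('\\');
            if !user_set_path {
                // A path containing `"` cannot be quoted, so it is skipped.
                for p in cfg.exclude_paths.iter().filter(|p| !p.contains('"')) {
                    out.push_str(&format!(" !path:\"{}\"", p));
                }
            }
            if !user_set_format {
                if let Some(term) = &ext_term {
                    out.push(' ');
                    out.push_str(term);
                }
            }
            out
        })
        .collect();
    groups.join(" | ")
}
```

With exclude_paths = [\windows\] and extensions = [pdf, docx], the query report becomes report !path:"\windows\" ext:pdf;docx, while a ext:txt | b keeps the user's ext:txt in the first group and receives the whitelist only in the second.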

Rationale

  • Everything is expressible in the existing query language: ext: is a 1-term OR (;-separated), path: supports quotes + ! negation, | is an OR group (fmf-core/src/query/ast.rs). No new operator or new filter mechanism is needed.
  • The residual cost is proportional to hit count: the engine's linear sweep is unchanged, and the appended terms only reduce the candidate set. The rewrite itself is string concatenation (a few µs per keystroke) and does not eat into the latency budget.
  • No engine contact = no perf-gate needed: since fmf-core is untouched, it can ship without the elevated-bench / regression-gate ritual. Wire bytes, contract, and golden corpus are also unchanged.

Rejected alternatives

  • In-engine preset filter (index bit): precomputing a "noise-area flag" per entry would make the residual cost nearly zero, but it incurs an index layout change (RAM budget, snapshot version), a full recompute on config change, and an ownership cross between UI settings and the service-owned index. This is an optimization to consider after the rewrite approach is measured to be slow, not a cost to pay upfront.
  • Ranking (relevance-order sort): the proper way to get "exactly a few hits" is scoring, not filtering, but it needs an entire scoring foundation (feature values such as usage frequency, recency, and path depth, plus a permutation cache) and is not orthogonal to the current lazy-sort permutation (ADR-0006). Noted as future work — not built while filtering suffices.

Consequences

  • Change surface: FocusedQueryRewriter (new) + 3 AppSettings keys + one rewrite point in SearchOrchestrator + a ToggleButton in MainPage. Engine, contract/golden, and Generated are unchanged.
  • The query notation in the F12 panel/logs is post-rewrite (the string the engine actually saw) — when investigating, keep the two lists in settings.json in mind.
  • Because it is ON by default, triage "the file should exist but does not show up" inquiries by first turning the toggle OFF (the existence of exclusions is already noted in the tooltip).

Re-examination triggers

  • Focused-ON search p99 > 50ms (exceeds the performance pass line — re-evaluate an in-engine preset filter).
  • When filtering's "exactness" falls short and ranking (a scoring foundation) becomes necessary.

ADR-0020: Code-signing provider selection (SSL.com eSigner / individual IV)

Date: 2026-06-13 / Status: Accepted (wiring dormant — certificate not yet obtained. Runbook docs/SIGNING.md)

Decision

Authenticode signing of the distributed binaries is done with SSL.com eSigner (a cloud HSM signing service) + a personal Individual Validation (IV) certificate. Signing is kept as a CI-environment-specific YAML step in release.yml (tag-driven), not placed in xtask/. Until the certificate is obtained it stays non-blocking and dormant (if Secrets are unset, finish unsigned + ::warning::).

The signing targets are only the 4 in-house PEs: FindMyFiles.exe / fmf.exe / fmf-service.exe / fmf_engine.dll. The bundled .NET / WindowsAppSDK runtime DLLs are Microsoft-signed, so they are not re-signed.

Rationale

  • Azure Artifact Signing (formerly Trusted Signing) not adopted: it is managed and easy to integrate into CI (release.yml was originally wired to this service), but as of 2026 the personal tier is limited to US/CA/EU/UK, and individuals residing in Japan cannot apply. Eliminated by the geographic requirement.
  • EV not adopted (IV adopted): since March 2024, EV no longer grants instant SmartScreen trust (per Microsoft). SmartScreen is purely reputation-based: reputation accrues to the signer certificate + file hash through download history, and "first-time warning -> cleared by track record" is the same for EV/OV/IV. This app does not ship a kernel driver (do-not-do list), so EV's remaining practical benefits (driver signing, corporate procurement requirements) do not apply. IV, the cheapest option obtainable under a personal name, is the rational choice. The budget (100,000 yen/year) puts EV in range too, but all it would buy is the "EV" title.
  • SSL.com eSigner adopted: cloud HSM signing needs no hardware token on the runner. Fully unattended CI signing via TOTP. A GitHub Action (SSLcom/esigner-codesign) exists. It supports both personal IV and Sole Proprietor EV (no corporate registration required), and is obtainable from Japan. Best fit for the "fully outsourced managed signing" requirement.
    • The alternative Certum personal (about $50/15 months) is cheapest but SimplySign requires a phone OTP per signature, which is a poor fit for unattended CI. SignPath Foundation (FOSS, free) requires review and may put new projects on hold. Both are inferior on the "throw-it-over-the-wall managed" requirement.
  • Keep signing as a YAML step (not in xtask): signing is CI-environment-specific processing that depends on GitHub Secrets and an Action; it is not the "portable release procedure logic" that xtask/ consolidates. Follows the precedent set by the Azure version (a YAML step).
  • Sign in-house PEs only: re-signing MS runtime DLLs wastes eSigner quota and is meaningless signing of others' copyrighted work. Collect just the 4 in a staging directory, batch_sign (1 OTP), and after copy-back hard-verify with Get-AuthenticodeSignature (do not silently succeed unsigned when signing was requested = the "do not stay silent" principle).

Consequences

  • The signing step in release.yml has already been swapped from Azure to SSL.com eSigner. The gate HAVE_SIGNING is decided by the presence of ES_USERNAME + CREDENTIAL_ID. Enabling it is just adding 4 Secrets (docs/SIGNING.md, section D).
  • Publicly trusted certificates expire after at most ~460 days (CA/Browser Forum 2026). Renewal procedure is in docs/SIGNING.md.
  • Signing is limited to the tag-driven release.yml. ci.yml (PR/push) does not sign (do not distribute development intermediates, conserve quota, fork PRs cannot access Secrets).
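
The gate and the "do not stay silent" check described above amount to two small decisions. The real logic lives in YAML steps in release.yml; the Rust sketch below only illustrates it. SigningPlan, plan_signing, and verify_signed are hypothetical names, and treating an empty secret as unset is an assumption.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SigningPlan {
    /// Both gate secrets are present: sign, then hard-verify.
    Sign,
    /// Dormant mode: finish unsigned and emit a warning.
    SkipWithWarning,
}

/// HAVE_SIGNING-style gate: requires ES_USERNAME and CREDENTIAL_ID.
pub fn plan_signing(es_username: Option<&str>, credential_id: Option<&str>) -> SigningPlan {
    // Assumption: a blank value is treated the same as an unset secret.
    let present = |v: Option<&str>| v.map_or(false, |s| !s.trim().is_empty());
    if present(es_username) && present(credential_id) {
        SigningPlan::Sign
    } else {
        SigningPlan::SkipWithWarning
    }
}

/// Post-signing check: when signing was requested, every target must
/// carry a valid signature. `results` pairs a file name with whether
/// its signature verified.
pub fn verify_signed(plan: &SigningPlan, results: &[(&str, bool)]) -> Result<(), String> {
    if *plan == SigningPlan::SkipWithWarning {
        return Ok(()); // unsigned is expected in dormant mode
    }
    let unsigned: Vec<&str> = results
        .iter()
        .filter(|(_, ok)| !*ok)
        .map(|(name, _)| *name)
        .collect();
    if unsigned.is_empty() {
        Ok(())
    } else {
        Err(format!("signing requested but unsigned: {}", unsigned.join(", ")))
    }
}
```

The point of the split is that "no secrets" and "secrets present but signing failed" never collapse into the same green result.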

Re-examination triggers

  • If Azure Artifact Signing opens to individuals in Japan, re-evaluate on managed-ness and CI affinity.
  • If this project comes to have a kernel driver, EV becomes a mandatory requirement.
  • If a corporate EV procurement requirement (enterprise distribution, store requirements, etc.) arises, reconsider Sole Proprietor EV / corporate EV.
  • If SmartScreen's reputation model changes and first-time behavior again differs by signing type, revisit.

ADR-0021: Consolidate build output into a single build/ tree

Date: 2026-06-14 / Status: Accepted

Decision

Consolidate all build artifacts into a single build/ tree at the repository root.

build/
├── engine/        # cargo target-dir for the engine workspace
├── xtask/         # cargo target-dir for the xtask workspace
├── app/           # C# bin output (FindMyFiles / FindMyFiles.Tests)
├── dist/FindMyFiles/   # publish bundle
├── package/       # release zip + SHA256SUMS.txt
├── sbom/          # CycloneDX SBOM (release.yml)
├── site/          # GitHub Pages assembly (landing + book + doc)
└── docs-book/     # mdBook output

Mechanism (all means that do not violate the prohibition rules):

  • Rust: per-workspace [build] target-dir in .cargo/config.toml (engine/.cargo → ../build/engine, xtask/.cargo → ../build/xtask). Relative paths resolve against the parent directory of .cargo/ (confirmed empirically with cargo metadata). A single config at the repository root is rejected (both workspaces would share one target and break the ADR-0018 separation rule).
  • C# bin: each csproj's BaseOutputPath (..\..\build\app\<proj>\).
  • dist/package/site: xtask/src/paths.rs as the single source of truth (build_root/dist_dir/package_dir/engine_release_dir/site_dir).
  • mdBook: build.build-dir = ../build/docs-book in docs/book.toml.
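
A minimal sketch of what a single-source-of-truth paths module could look like. The function names come from the list above; the signatures (taking the repository root as an argument) and the bodies are assumptions, not the actual contents of xtask/src/paths.rs.

```rust
use std::path::{Path, PathBuf};

/// Root of the consolidated output tree: <repo>/build.
pub fn build_root(repo_root: &Path) -> PathBuf {
    repo_root.join("build")
}

/// Publish bundle: build/dist/FindMyFiles.
pub fn dist_dir(repo_root: &Path) -> PathBuf {
    build_root(repo_root).join("dist").join("FindMyFiles")
}

/// Release zip + SHA256SUMS.txt: build/package.
pub fn package_dir(repo_root: &Path) -> PathBuf {
    build_root(repo_root).join("package")
}

/// Release profile output of the engine workspace: build/engine/release.
pub fn engine_release_dir(repo_root: &Path) -> PathBuf {
    build_root(repo_root).join("engine").join("release")
}

/// GitHub Pages assembly output: build/site.
pub fn site_dir(repo_root: &Path) -> PathBuf {
    build_root(repo_root).join("site")
}
```

Deriving every directory from build_root keeps the "delete build/ and everything is gone" property checkable in one place.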

Rationale

  • Artifacts were scattered across engine/target, xtask/target, app/**/bin, root dist/, root zip, root SBOM, and site/, making them costly to track and clean. A single build/ means "delete it and everything is gone" plus an effectively one-line .gitignore.
  • The target-dir in .cargo/config.toml is not a toolchain pin, so it does not violate the rule against placing rust-toolchain.toml/global.json (avoiding double management with mise).

Consequences

  • C# obj stays put (app/**/obj/). Relocating obj requires BaseIntermediateOutputPath to take effect during pre-restore evaluation, which effectively requires Directory.Build.props, but CLAUDE.md prohibits that file (it silently shadows the analyzer injection of winapp run). obj is intermediate output and already gitignored, so there is no real harm.
  • The dev-tree fmf-service.exe lookup (ServiceSetup.cs production + pipe/contract tests) follows build/engine/release.
  • The test-tmp fallback default in testutil.rs is build/engine (because the config.toml target-dir does not set the CARGO_TARGET_DIR env var).
  • CI (ci/release/pages) artifact, SBOM, package, and Pages paths are all updated to under build/. site/ remains the committed landing source; assembly output goes to build/site.
  • Tools that assumed the old engine/target etc. (rust-analyzer, etc.) follow because they respect config.toml (reload if needed).

Re-examination triggers

  • If demand to also move C# obj under build/ grows strong and the winapp run analyzer-injection mechanism changes to no longer depend on Directory.Build.props, re-evaluate whether to allow props.

ADR-0022: OS/shell/UI boundaries must use testable seams + behavioral tests

Date: 2026-06-15 / Status: Accepted

Decision

Code that touches the OS, shell, processes, file I/O, or UI events must go through an injectable seam (an interface, or an internal core with paths/dependencies passed as arguments), and must come with tests that verify its behavior via dotnet test / cargo test. Do not ship with only pure helpers or argument construction tested while "actual behavior is unverified."

Canonical patterns: app/FindMyFiles/Engine/IEngineClient.cs (Fake/Ffi/Pipe), Services/IDispatcher.cs, Services/IProcessRunner.cs / Services/IRevealApi.cs, the path-parameterized core of Services/FileLog.cs. On the engine side, engine/crates/fmf-core/.../seams.rs (SnapshotStore / JournalSource; the two-seam cap is ADR-0018).
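
As an illustration of the seam + behavioral-test pattern, here is a Rust analogue of the reveal case. It is not the C# code in Services/IRevealApi.cs: the RevealApi trait, its method, the reveal function, and FakeRevealApi are all hypothetical names chosen for the sketch.

```rust
use std::cell::RefCell;
use std::path::{Path, PathBuf};

/// Hypothetical seam: the only place that would talk to the shell.
pub trait RevealApi {
    fn open_folder_and_select(&self, folder: &Path, file_name: &str) -> Result<(), String>;
}

/// Logic under test: decides what to ask the shell for.
pub fn reveal(api: &dyn RevealApi, full_path: &Path) -> Result<(), String> {
    let folder = full_path
        .parent()
        .ok_or_else(|| format!("no parent folder: {}", full_path.display()))?;
    let name = full_path
        .file_name()
        .and_then(|n| n.to_str())
        .ok_or_else(|| format!("no file name: {}", full_path.display()))?;
    api.open_folder_and_select(folder, name)
}

/// Fake that records calls, so a test can assert on behavior
/// (what was requested) rather than on argument construction alone.
#[derive(Default)]
pub struct FakeRevealApi {
    pub calls: RefCell<Vec<(PathBuf, String)>>,
}

impl RevealApi for FakeRevealApi {
    fn open_folder_and_select(&self, folder: &Path, file_name: &str) -> Result<(), String> {
        self.calls
            .borrow_mut()
            .push((folder.to_path_buf(), file_name.to_string()));
        Ok(())
    }
}
```

A behavioral test passes a FakeRevealApi to reveal and asserts that exactly one call was recorded with the expected folder and file name, and that a path with no parent returns an error without reaching the seam.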

Rationale

  • "Open folder and select file" (reveal) was broken from day one: the actual behavior of ShellOps.Reveal (SHOpenFolderAndSelectItems) was never tested; only the pure helper BuildOpenStartInfo was green, and CI kept passing. The tests did not guarantee quality.
  • Root-cause type: if the runtime/OS boundary stays static + direct P/Invoke, behavior cannot be swapped with a fake and behavioral verification cannot be written. Argument/structure tests do not make "passes = not broken" hold.
  • The C# coverage gate being Threshold=15 (nominal only) also allowed unverified code to ship.

Consequences

  • New boundary code is required at review to have "seam + behavioral test" (construction-only tests are deemed insufficient).
  • Live UI automation for C# would rely on a PowerShell script (ui-tests.ps1); script execution is disabled by execution policy on this machine, and operating policy rules it out. Therefore UI-adjacent logic is pushed into ViewModels / core and verified via dotnet test (no dependence on live UI automation).
  • Mutation testing is used to detect vacuous tests (those that pass even when broken): Rust = just mutants (cargo-mutants), C# = just stryker (Stryker.NET). Informational for now; gated incrementally.
  • The C# coverage gate is raised incrementally from 15% (ratchet).

Re-examination triggers

  • If live UI automation becomes adoptable without PowerShell dependence (e.g., integrating FlaUI into dotnet test) → re-evaluate direct testing of UI flows.
  • Signs that seam proliferation distorts the design (the engine side keeps the two-seam cap = ADR-0018).