my-pdf-tools-java

Tools for self-scanned (自炊) Japanese-book PDFs and their bitonal page images.

Three focused command-line tools, the unified pdfbook pipeline that chains them, and a Spring Boot web app, all built on a shared hexagonal core. Together they turn a raw book scan into clean, uniform, right-to-left two-page spreads that read correctly in any PDF viewer.

API documentation (Javadoc) | Source on GitHub

The tools

register

Aligns bitonal scans of vertically typeset (縦書き, tategaki) novel pages onto a fixed paper-size canvas, removing per-scan jitter in size, position and rotation.

despeckle

Removes scanner pepper noise from bitonal scans while protecting ruby (振り仮名, furigana), punctuation marks (句読点), and the dakuten / handakuten voicing marks.
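One way to picture the idea is a connected-component filter that only clears small blobs when nothing else is printed nearby, so a small mark sitting next to a glyph survives. The sketch below is an illustration under that assumption, not the algorithm despeckle actually uses. The class name, the maxArea and guard parameters, and the boolean[][] page representation are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of isolation-aware speck removal; not the tool's real algorithm. */
final class DespeckleSketch {

    /**
     * Clears 8-connected black components of at most maxArea pixels, unless some
     * other black pixel lies within guard pixels of the component's bounding box.
     * The input (true = black) is left untouched; a cleaned copy is returned.
     */
    static boolean[][] despeckle(boolean[][] black, int maxArea, int guard) {
        int h = black.length, w = black[0].length;
        boolean[][] out = new boolean[h][];
        for (int y = 0; y < h; y++) out[y] = black[y].clone();
        boolean[][] seen = new boolean[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!black[y][x] || seen[y][x]) continue;
                List<int[]> comp = floodFill(black, seen, x, y);
                if (comp.size() <= maxArea && isIsolated(black, comp, guard)) {
                    for (int[] p : comp) out[p[1]][p[0]] = false;
                }
            }
        }
        return out;
    }

    /** Collects the 8-connected component containing (sx, sy) as {x, y} pairs. */
    static List<int[]> floodFill(boolean[][] black, boolean[][] seen, int sx, int sy) {
        int h = black.length, w = black[0].length;
        List<int[]> comp = new ArrayList<>();
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        seen[sy][sx] = true;
        queue.add(new int[] {sx, sy});
        while (!queue.isEmpty()) {
            int[] p = queue.poll();
            comp.add(p);
            for (int dy = -1; dy <= 1; dy++) {
                for (int dx = -1; dx <= 1; dx++) {
                    int nx = p[0] + dx, ny = p[1] + dy;
                    if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                    if (!black[ny][nx] || seen[ny][nx]) continue;
                    seen[ny][nx] = true;
                    queue.add(new int[] {nx, ny});
                }
            }
        }
        return comp;
    }

    /** True when the guard-expanded bounding box holds no black pixel outside comp. */
    static boolean isIsolated(boolean[][] black, List<int[]> comp, int guard) {
        int h = black.length, w = black[0].length;
        int minX = w, minY = h, maxX = -1, maxY = -1;
        for (int[] p : comp) {
            minX = Math.min(minX, p[0]); maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]); maxY = Math.max(maxY, p[1]);
        }
        int count = 0;
        for (int y = Math.max(0, minY - guard); y <= Math.min(h - 1, maxY + guard); y++) {
            for (int x = Math.max(0, minX - guard); x <= Math.min(w - 1, maxX + guard); x++) {
                if (black[y][x]) count++;
            }
        }
        // Every pixel of comp lies inside the box, so any surplus belongs to a neighbour.
        return count == comp.size();
    }
}
```

A known weakness of this simple rule is that two specks close to each other protect one another, and an isolated punctuation mark smaller than maxArea would be cleared. A real implementation needs more context than a bounding-box test.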

tate-yoko-pdf

Converts scanned PDFs of vertically typeset (縦書き) books into right-to-left (RTL) two-page spreads that read correctly in any viewer.
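The core of an RTL spread is page placement: the earlier page of each pair goes on the right, the later one on the left. The sketch below shows only that pairing rule. It assumes pages pair up from page 1 with no standalone cover and that a trailing odd page sits alone on the right; tate-yoko-pdf may handle both cases differently. RtlSpreads and Spread are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: pair page numbers into right-to-left spreads. */
final class RtlSpreads {

    /** One spread; left is null when the spread holds a single page. */
    record Spread(Integer left, Integer right) {}

    /**
     * Pairs pages 1..pageCount in reading order. The earlier page of each pair
     * goes on the RIGHT, the later page on the LEFT.
     * Assumption: no standalone cover; a trailing odd page sits alone on the right.
     */
    static List<Spread> pair(int pageCount) {
        List<Spread> spreads = new ArrayList<>();
        for (int p = 1; p <= pageCount; p += 2) {
            Integer later = (p + 1 <= pageCount) ? p + 1 : null;
            spreads.add(new Spread(later, p));
        }
        return spreads;
    }

    public static void main(String[] args) {
        for (Spread s : pair(5)) {
            System.out.println(s);
        }
    }
}
```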

pdfbook

The unified pipeline: extract once, then despeckle → register → RTL spreads in a single pass with no intermediate PDFs. Includes a guided interactive mode (-i).

webapp

A Spring Boot front end for pdfbook: upload a scan, watch per-page progress over SSE, and download the finished book.

One pass, no intermediate PDFs

pdfbook scan.pdf -o book.pdf
extract → despeckle (noise) → register (align) → RTL spreads → one repacked PDF
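As a rough picture of what "no intermediate PDFs" can mean, each extracted page can flow through the stages in memory as a chain of functions, with a PDF written only once at the end. The sketch below shows just that chaining and its order. Page, DESPECKLE and REGISTER are hypothetical stand-ins that record which stage ran, not the project's real types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of in-memory stage chaining; the stage bodies are placeholders. */
final class PipelineSketch {

    /** Stand-in for a decoded bitonal page; a real page would carry pixel data. */
    record Page(int number, List<String> history) {
        Page after(String stage) {
            List<String> h = new ArrayList<>(history);
            h.add(stage);
            return new Page(number, List.copyOf(h));
        }
    }

    // Placeholder stages: each only records that it ran.
    static final UnaryOperator<Page> DESPECKLE = p -> p.after("despeckle");
    static final UnaryOperator<Page> REGISTER = p -> p.after("register");

    /** Runs every extracted page through both stages in pipeline order, in memory. */
    static List<Page> process(List<Page> extracted) {
        return extracted.stream()
                .map(DESPECKLE)
                .map(REGISTER)
                .toList();
    }

    public static void main(String[] args) {
        List<Page> pages = List.of(new Page(1, List.of()), new Page(2, List.of()));
        process(pages).forEach(System.out::println);
    }
}
```

The processed pages would then be laid out as RTL spreads and packed into the single output PDF.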

CLI conventions

consistent

Every tool shares -h/--help, -V/--version, -v/--verbose, --completion and --man, with sysexits-flavored exit codes.
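For readers unfamiliar with sysexits, the sketch below maps a few failure types to the conventional BSD sysexits.h values. The numeric values are the standard ones. Which codes the tools actually return, and for which failures, is an assumption made for illustration, as are the class and method names.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.NoSuchFileException;

/** Hypothetical mapping from failures to sysexits.h-style exit codes. */
final class ExitCodes {
    static final int OK = 0;
    static final int USAGE = 64;      // EX_USAGE: bad command-line arguments
    static final int NOINPUT = 66;    // EX_NOINPUT: input file missing or unreadable
    static final int SOFTWARE = 70;   // EX_SOFTWARE: internal error
    static final int CANTCREAT = 73;  // EX_CANTCREAT: output cannot be created
    static final int IOERR = 74;      // EX_IOERR: I/O error during processing

    /** Picks an exit code; the specific IOException subclasses are tested first. */
    static int forThrowable(Throwable t) {
        if (t instanceof IllegalArgumentException) return USAGE;
        if (t instanceof NoSuchFileException) return NOINPUT;
        if (t instanceof FileAlreadyExistsException) return CANTCREAT;
        if (t instanceof IOException) return IOERR;
        return SOFTWARE;
    }
}
```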

safe

Existing output is refused by default (--force to overwrite); an interrupted run leaves no temp files behind.
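A common way to get this behavior is to write to a temporary file beside the target and move it into place only on success. The sketch below shows that pattern with java.nio.file. It is an illustration, not the tools' actual code: SafeOutput, its write method and the ".pdfbook-" temp prefix are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical sketch: refuse existing output unless forced, and clean up temp files. */
final class SafeOutput {

    static void write(Path target, byte[] content, boolean force) throws IOException {
        // Refuse early so no work is done when the output already exists.
        if (!force && Files.exists(target)) {
            throw new FileAlreadyExistsException(target.toString());
        }
        // Create the temp file next to the target so the final move stays on one filesystem.
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, ".pdfbook-", ".tmp");
        try {
            Files.write(tmp, content);
            if (force) {
                Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
            } else {
                // Without REPLACE_EXISTING the move fails if the target appeared meanwhile.
                Files.move(tmp, target);
            }
        } finally {
            // No-op after a successful move; removes the temp file after any failure.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The finally block covers failures raised as exceptions. Cleaning up after a killed process (for example Ctrl-C) would additionally need something like a JVM shutdown hook, which this sketch leaves out.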

bilingual errors

A failure carries only a stable error kind, never a message. The CLI renders that kind in English and the web UI renders it in Japanese, so the two surfaces share no message strings.
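A minimal sketch of that separation, assuming an enum of kinds and one message table per surface. The kind names, the message texts, and the ToolException, renderCli and renderWeb names are all hypothetical; the project's real kinds and wording are not listed here.

```java
import java.util.Map;

/** Hypothetical sketch: failures carry a kind, each surface owns its own messages. */
final class ErrorKinds {

    /** Illustrative kinds only. */
    enum ErrorKind { INPUT_NOT_FOUND, OUTPUT_EXISTS, NOT_A_PDF }

    /** Carries the kind and nothing else: no human-readable message. */
    static final class ToolException extends RuntimeException {
        final ErrorKind kind;

        ToolException(ErrorKind kind) {
            this.kind = kind;
        }
    }

    // English table, used only by the CLI.
    static final Map<ErrorKind, String> CLI_EN = Map.of(
            ErrorKind.INPUT_NOT_FOUND, "input file not found",
            ErrorKind.OUTPUT_EXISTS, "output already exists (use --force to overwrite)",
            ErrorKind.NOT_A_PDF, "input is not a PDF");

    // Japanese table, used only by the web UI.
    static final Map<ErrorKind, String> WEB_JA = Map.of(
            ErrorKind.INPUT_NOT_FOUND, "入力ファイルが見つかりません",
            ErrorKind.OUTPUT_EXISTS, "出力ファイルは既に存在します",
            ErrorKind.NOT_A_PDF, "PDF ファイルではありません");

    static String renderCli(ToolException e) {
        return "error: " + CLI_EN.get(e.kind);
    }

    static String renderWeb(ToolException e) {
        return WEB_JA.get(e.kind);
    }
}
```

Because the core only ever throws a kind, adding a third surface or another language means adding one more table, with no change to the core.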