Architecture Source Ledger
publication-target: Marius architecture build reference
tracked-raw-text: no
raw-pdf-tracked: no
Source
- file: source.pdf
- original-path: redacted from tracked ledger; see ignored extraction report when local provenance is needed
- cache-path: .cache/architecture-reference/source-corpus/source.pdf
- sha256: d7d5b77acaf501ceed0d41f07162d8dc3cdf6b6253a089817678477e3be4bc45
- size-bytes: 216602954
- source-pages: 442
- extracted-pages: 442
- extracted-at: 2026-06-16T13:38:33.478Z
- extractor: Bun orchestration over pdftotext, fallback mutool, fallback uvx markitdown unless --no-markitdown, optional OCR with --allow-ocr
- source-override: pass --source=/path/to/source.pdf or set ARCHITECTURE_REFERENCE_SOURCE_PDF
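The override and fallback rules in the list above can be sketched as two small pure functions. This is a minimal sketch, not the generator's actual code: the function names are hypothetical, the precedence of the --source flag over the environment variable is an assumption, and so is falling back to the cache path when neither is given. Only the flag names, the variable name, and the tool order come from this ledger.

```typescript
// Hypothetical helpers; only the flags, the environment variable name, the
// cache path, and the tool order are taken from the ledger.
const DEFAULT_SOURCE = ".cache/architecture-reference/source-corpus/source.pdf";

type Env = Record<string, string | undefined>;

// Assumption: an explicit --source= flag wins over the environment variable,
// and the cached copy is used when neither is set.
function resolveSourcePath(args: string[], env: Env): string {
  const prefix = "--source=";
  const flag = args.find((arg) => arg.startsWith(prefix));
  if (flag !== undefined && flag.length > prefix.length) {
    return flag.slice(prefix.length);
  }
  const fromEnv = env["ARCHITECTURE_REFERENCE_SOURCE_PDF"];
  if (fromEnv !== undefined && fromEnv.length > 0) {
    return fromEnv;
  }
  return DEFAULT_SOURCE;
}

// Returns the extractors to try, in order. Treating OCR as the last step of
// the chain is an assumption; the ledger only says it is optional.
function extractorChain(args: string[]): string[] {
  const chain = ["pdftotext", "mutool"];
  if (!args.includes("--no-markitdown")) {
    chain.push("uvx markitdown");
  }
  if (args.includes("--allow-ocr")) {
    chain.push("ocr");
  }
  return chain;
}
```

In a Bun script these would typically be called with `process.argv.slice(2)` and `process.env`; keeping them pure makes the precedence rules easy to test without touching the real environment.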
Registered Reference Seeds
- source: architecture web corpus
- url: registered-reference-url
- status: registered metadata only, not scraped by this generator
- allowed-use: future curated URL selection, short source pointers after explicit capture
- excluded-use: blind full-site mirroring, raw article dumps, copied diagram text or images
Corpus Boundary
This repo stores only build-reference notes, topic indexes, corpus pointers, and extraction tooling. The raw PDF, raw page text, OCR images, and intermediate chunk JSON stay under .cache/architecture-reference/source-corpus/, which is ignored by git.
Use this as an architecture build reference: explain patterns in operational words, cite corpus pages for inspection, and keep raw captures out of git.
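One way to enforce this boundary is a check that classifies a repo-relative path as raw capture or tracked note before anything is staged. The sketch below is illustrative only: `isRawCapturePath` and `normalize` are hypothetical names, and it assumes POSIX-style relative paths. The cache directory is the one named above.

```typescript
// Directory named in the ledger as the home of all raw captures.
const CACHE_ROOT = ".cache/architecture-reference/source-corpus/";

// Collapse "." and ".." segments of a repo-relative POSIX path.
// Assumption: paths never need to climb above the repo root.
function normalize(relPath: string): string {
  const out: string[] = [];
  for (const segment of relPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return out.join("/");
}

// True when the path is the cache directory itself or anything beneath it.
function isRawCapturePath(relPath: string): boolean {
  return (normalize(relPath) + "/").startsWith(CACHE_ROOT);
}
```

A pre-commit hook or CI step could call `isRawCapturePath` on each staged file and fail when it returns true, as a second guard alongside the git ignore rule.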
Extraction Report
- report: .cache/architecture-reference/source-corpus/extraction-report.json
- chunks: 18
- failed-or-partial-ranges: p. 1-25 (partial), p. 301-325 (partial), p. 351-375 (partial), p. 376-400 (partial)
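The chunk count is consistent with fixed 25-page chunks over the 442 source pages. The 25-page size is inferred from the ranges listed above and is not stated in the ledger, so treat it as an assumption. A minimal sketch with hypothetical names:

```typescript
interface PageRange {
  start: number; // first page, 1-based
  end: number;   // last page, inclusive
}

// Split pages 1..totalPages into consecutive ranges of at most chunkSize pages.
function chunkRanges(totalPages: number, chunkSize: number): PageRange[] {
  const ranges: PageRange[] = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    ranges.push({ start, end: Math.min(start + chunkSize - 1, totalPages) });
  }
  return ranges;
}

// Format a range the way the report lists it, e.g. "p. 301-325".
function formatRange(range: PageRange): string {
  return `p. ${range.start}-${range.end}`;
}

// Assumption: chunkSize = 25, inferred from the listed partial ranges.
const ledgerChunks = chunkRanges(442, 25);
```

Under that assumption `ledgerChunks` has 18 entries, the last one covers p. 426-442, and the four partial ranges correspond to chunks 1, 13, 15, and 16.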