AI Models and LLM Systems

AI Models and LLM Systems Graphics Coverage

Primary chapter graphics: LLM Glossary Map, Generative AI Learning Roadmap, Transformer Encoder Decoder Flow, AI Concepts Map, MCP Server Catalog, Frontier Model One-Pager, Generative AI Tech Stack, Reasoning Model Training Path, Open Model Inference Flow, RAG vs Fine-Tuning, Full Fine-Tuning vs LoRA vs RAG, Open Source AI Stack, How LLMs See Text, Visual Token Context Compression. Accepted graphics: 14. Reviewed non-signal pages: 5. Open graphics in review: 0. QA status lives in the graphics audit and the visual review ledger.

Corpus pages: p. 38, p. 43, p. 74, p. 131, p. 147, p. 158, p. 175, p. 194-195, p. 200-201, p. 230, p. 242, p. 266-267, p. 277, p. 326-327, p. 335-336, p. 343, p. 364, p. 373-374

Coverage: 24 pages; low-confidence extraction ranges: p. 364, p. 373-374

This chapter is part of Marius's owned architecture build corpus. The text routes decisions; durable implementation signal is carried by accepted graphics, reviewed non-signal decisions, and the linked QA audit.

Chapter Visuals

Accepted graphics carry the canonical design signal for this chapter. Each selected source page is either accepted as a graphic or explicitly marked non-signal in the source-faithful ledger. Review and QA state live in visual inventory, visual review ledger, and graphics audit.

  • LLM Glossary Map
  • Generative AI Learning Roadmap
  • Transformer Encoder Decoder Flow
  • MCP Server Catalog
  • AI Concepts Map
  • Frontier Model One-Pager
  • Generative AI Tech Stack
  • Reasoning Model Training Path
  • Open Model Inference Flow
  • RAG vs Fine-Tuning
  • Full Fine-Tuning vs LoRA vs RAG
  • Open Source AI Stack
  • How LLMs See Text
  • Visual Token Context Compression

Open Review Queue

  • none

Reviewed Non-Signal Pages

  • AI Models And LLM Systems: LLM + Vector Search Map: source p. 195; batch 08; status non-signal/reviewed; ledger reason in visual-review-ledger.json
  • AI Models And LLM Systems: LLM + Tool Map: source p. 230; batch 08; status non-signal/reviewed; ledger reason in visual-review-ledger.json
  • AI Models And LLM Systems: LLM + Tool Map: source p. 242; batch 11; status non-signal/reviewed; ledger reason in visual-review-ledger.json
  • AI Models And LLM Systems: LLM + Embedding Map: source p. 277; batch 12; status non-signal/reviewed; ledger reason in visual-review-ledger.json
  • AI Models And LLM Systems: Embedding + Pattern Map: source p. 131; batch 16; status non-signal/reviewed; ledger reason in visual-review-ledger.json

Use When

  • A workflow needs language understanding, extraction, classification, summarization, or generation.

Avoid When

  • A deterministic rule is cheaper, clearer, and more reliable.

Core Model

  • Model systems combine prompt, context, tool constraints, evaluation, fallback, and cost controls.
  • Prefer explicit ownership over accidental coupling. Every boundary should say who owns correctness, cost, data, recovery, and change.
  • Use corpus page pointers for inspection, and keep the chapter notes focused on reusable design decisions.
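
The way prompt, context, evaluation, fallback, and cost controls fit together can be sketched as a single wrapper. This is a minimal illustration, not a prescribed design: `run_with_fallback`, `stub_model`, `rule_based`, `is_valid`, and `estimate_cost` are hypothetical names, and the cost estimate is a made-up per-character figure standing in for a real pricing model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelResult:
    text: str
    source: str  # which path produced the answer


def run_with_fallback(
    prompt: str,
    context: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
    validate: Callable[[str], bool],
    estimate_cost: Callable[[str], float],
    budget_usd: float,
) -> ModelResult:
    """Call the model once; use the fallback on budget overrun, error, or invalid output."""
    full_prompt = f"{prompt}\n\nContext:\n{context}"
    # Cost control: check the estimate before spending anything.
    if estimate_cost(full_prompt) > budget_usd:
        return ModelResult(fallback(full_prompt), "fallback:over_budget")
    try:
        text = primary(full_prompt)
    except Exception:
        return ModelResult(fallback(full_prompt), "fallback:error")
    # Evaluation: reject output that does not match the expected shape.
    if not validate(text):
        return ModelResult(fallback(full_prompt), "fallback:invalid")
    return ModelResult(text, "primary")


# Hypothetical stand-ins for a real model client and a deterministic rule.
def stub_model(full_prompt: str) -> str:
    return "category: billing"


def rule_based(full_prompt: str) -> str:
    return "category: unknown"


def is_valid(text: str) -> bool:
    return text.startswith("category:")


def estimate_cost(full_prompt: str) -> float:
    # Assumed pricing: roughly 4 characters per token at 0.000001 USD per token.
    return len(full_prompt) / 4 * 0.000001


result = run_with_fallback(
    "Classify the ticket.", "Invoice was charged twice.",
    stub_model, rule_based, is_valid, estimate_cost, budget_usd=0.01,
)
print(result.source, result.text)
```

Recording which path produced each answer (`source`) gives the owner of the boundary a direct measure of how often the fallback is carrying the workflow.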

Implementation Guidance

  • Define task type, acceptable error, source context, output schema, review threshold, and regression set.
  • Write the smallest useful design note: purpose, inputs, outputs, state, failure behavior, observability, and rollback.
  • Choose the first implementation that can be tested against the real workflow without hiding a known production risk.
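
The six items in the first bullet can be written down as a small, checkable record before any model is called. The sketch below assumes a hypothetical `TaskSpec` type with a flat name-to-type output schema; a real project might use a schema library instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    task_type: str                # e.g. "classification", "extraction"
    acceptable_error_rate: float  # agreed with the workflow owner
    source_context: str           # where the model's input comes from
    output_schema: dict           # field name -> expected Python type
    review_threshold: float       # confidence below this goes to a human
    regression_set: tuple = ()    # (input, expected output) pairs

    def schema_errors(self, output: dict) -> list:
        """Return a list of problems; an empty list means the output conforms."""
        errors = []
        for name, expected_type in self.output_schema.items():
            if name not in output:
                errors.append(f"missing field: {name}")
            elif not isinstance(output[name], expected_type):
                errors.append(f"wrong type for {name}")
        return errors

    def needs_review(self, confidence: float) -> bool:
        return confidence < self.review_threshold


# Illustrative values only.
spec = TaskSpec(
    task_type="classification",
    acceptable_error_rate=0.05,
    source_context="support ticket body",
    output_schema={"category": str, "confidence": float},
    review_threshold=0.8,
    regression_set=(("Invoice was charged twice.", "billing"),),
)

print(spec.schema_errors({"category": "billing"}))
print(spec.needs_review(0.6))
```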

Tradeoffs

  • Fine-tuning can specialize behavior; retrieval keeps knowledge fresh without changing weights.
  • Centralization reduces duplicated work but can become a bottleneck when every team needs exceptions.
  • Specialized infrastructure helps at scale, but it must earn its operational cost.
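
The retrieval side of the first tradeoff can be shown with a toy example: updating a stored document changes what the model is given, with no retraining step. The word-overlap scoring here is a deliberately naive stand-in for a real retriever, and `retrieve` and `build_prompt` are hypothetical helper names.

```python
def retrieve(query: str, documents: list, k: int = 1) -> list:
    """Rank documents by the number of lowercase words shared with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, documents: list) -> str:
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = ["Refund window is 30 days.", "Shipping takes 5 business days."]
question = "What is the refund window?"

prompt_before = build_prompt(question, docs)

# Knowledge update: edit the document store, not the model weights.
docs[0] = "Refund window is 14 days."
prompt_after = build_prompt(question, docs)

print(prompt_after)
```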

Failure Modes

  • A model is evaluated by vibes instead of test cases tied to business risk.
  • The diagram shows boxes but not ownership, retry behavior, data freshness, or user-visible failure.
  • The system has no proof path for the highest-risk assumption.

Decision Checklist

  • Measure accuracy, refusal behavior, hallucination rate, latency, cost, and human review load.
  • Name the owner, source of truth, timeout, retry policy, and evidence that the path works.
  • Add one regression check for the failure mode most likely to recur.
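
The measurements in the first bullet and the regression check in the last can be computed from one set of evaluation records. This is a sketch under assumptions: `EvalRecord`, `summarize`, and `regression_ok` are hypothetical names, the `hallucinated` flag is assumed to come from a reviewer or a separate checker, and accuracy is computed over answered (non-refused) items only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    expected: str
    actual: str
    refused: bool
    hallucinated: bool    # assumed to be labelled by a reviewer or checker
    latency_s: float
    cost_usd: float
    sent_to_review: bool


def summarize(records: list) -> dict:
    """Aggregate the checklist metrics over a list of EvalRecord items."""
    n = len(records)
    if n == 0:
        raise ValueError("no evaluation records")
    answered = [r for r in records if not r.refused]
    correct = sum(r.actual == r.expected for r in answered)
    return {
        "accuracy": correct / len(answered) if answered else 0.0,
        "refusal_rate": sum(r.refused for r in records) / n,
        "hallucination_rate": sum(r.hallucinated for r in records) / n,
        "mean_latency_s": mean([r.latency_s for r in records]),
        "total_cost_usd": sum(r.cost_usd for r in records),
        "review_load": sum(r.sent_to_review for r in records) / n,
    }


def regression_ok(current: dict, baseline: dict, tolerance: float = 0.02) -> bool:
    """Pass only if accuracy has not dropped more than `tolerance` below baseline."""
    return current["accuracy"] >= baseline["accuracy"] - tolerance
```

Running `summarize` on the same regression set before and after a prompt or model change, then gating the change on `regression_ok`, turns the checklist into a repeatable test.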

Neutral Automation Examples

  • A classifier suggests categories with confidence and routes low-confidence items to a reviewer.
  • An internal automation starts with test fixtures, then adds credentials, permissions, and production scheduling only after the boundary is tested.
  • A customer-facing workflow keeps irreversible actions behind explicit approval until metrics show it is safe to automate further.
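
The first and third examples reduce to two small gates: one on confidence, one on reversibility. The sketch below uses hypothetical names (`Suggestion`, `route`, `execute`) and an assumed threshold of 0.8; the right threshold depends on the workflow's acceptable error.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    item_id: str
    category: str
    confidence: float


def route(suggestion: Suggestion, review_threshold: float = 0.8) -> str:
    """Return 'auto' for confident suggestions and 'review' otherwise."""
    return "auto" if suggestion.confidence >= review_threshold else "review"


def execute(action: str, irreversible: bool, approved: bool) -> str:
    """Run reversible actions directly; hold irreversible ones until approved."""
    if irreversible and not approved:
        return f"held for approval: {action}"
    return f"executed: {action}"


print(route(Suggestion("t-1", "billing", 0.93)))
print(route(Suggestion("t-2", "shipping", 0.41)))
print(execute("issue refund", irreversible=True, approved=False))
```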