Contributing¶
First step¶
Update uv before anything else:
uv self update
If uv was installed via Homebrew, use brew update && brew upgrade uv
instead.
TDD workflow¶
Always follow red/green TDD:
- Write a failing test first (red).
- Write the minimum code to make it pass (green).
- Refactor if needed.
Import errors do not count as red. A test that fails due to an
ImportError or ModuleNotFoundError is not a valid red test — the
module or function must exist before the test can legitimately fail for
the right reason.
Live integration tests¶
Code that talks to a real external service (OpenRouter, yt-dlp /
YouTube, future providers) gets a @pytest.mark.live test in
tests/test_live_integration.py alongside its mocked unit tests.
Live tests are skipped by default and not run in CI: they cost real money (OpenRouter) and depend on third-party availability (YouTube). Opt in locally:
uv run pytest --run-live
Each test additionally skips if its required env var is missing
(OPENROUTER_API_KEY for OpenRouter), so --run-live on a fresh
checkout still passes for whatever subset the runner has credentials
for. Override the OpenRouter model with LIVE_TEST_MODEL=... to run
against a cheaper model than the project default.
When you add or change a real-world integration boundary, add or update the matching live test in the same commit.
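The opt-in behaviour described above could be wired up in a `conftest.py` roughly as follows. This is a sketch, not the project's actual configuration: the helper names `live_skip_reason` and `live_model` are hypothetical, and so is the convention of passing the required variable as `env=` on the marker. Only the pytest hooks and methods themselves are standard pytest API.

```python
import os
from typing import Mapping, Optional


def live_skip_reason(
    run_live: bool, required_env: Optional[str], env: Mapping[str, str]
) -> Optional[str]:
    """Reason to skip a live test, or None if it should run."""
    if not run_live:
        return "needs --run-live"
    if required_env and not env.get(required_env):
        return f"{required_env} not set"
    return None


def live_model(env: Mapping[str, str], default: str) -> str:
    """LIVE_TEST_MODEL wins over default when set and non-empty."""
    return env.get("LIVE_TEST_MODEL") or default


def pytest_addoption(parser):
    parser.addoption(
        "--run-live",
        action="store_true",
        default=False,
        help="run tests marked live",
    )


def pytest_collection_modifyitems(config, items):
    import pytest  # local import: helpers above stay usable without pytest

    run_live = config.getoption("--run-live")
    for item in items:
        marker = item.get_closest_marker("live")
        if marker is None:
            continue
        # assumption: marker optionally carries env="VAR_NAME"
        reason = live_skip_reason(run_live, marker.kwargs.get("env"), os.environ)
        if reason:
            item.add_marker(pytest.mark.skip(reason=reason))
```

Under these assumptions, a test decorated with `@pytest.mark.live(env="OPENROUTER_API_KEY")` is skipped unless `--run-live` is passed and the variable is set, which is why a fresh checkout without credentials still passes.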
QA seeding¶
auto-lorebook seed-ingest --at=<stage> mints a fresh disposable
qa-<hex> source_id and lays down synthetic stage-input artifacts, so
a single pipeline stage can be exercised in isolation without running
the prior stages or hitting the LLM.
uv run auto-lorebook seed-ingest --at=plan
# Seeded source qa-1a2b3c4d at stage 'plan' from fixture 'tiny-aldara'.
# Next: auto-lorebook replan qa-1a2b3c4d
Stage ladder (each --at value seeds everything from prior levels too):
| --at | Next command |
|---|---|
| structure | generate-reading <sid> (runs Stage 1a + 1b) |
| summarize | regenerate-reading <sid> --from=summarize (runs 1b) |
| approve | approve-reading <sid> --yes (runs approve + plan + extract) |
| plan | replan <sid> (runs Stage 2 + 3) |
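The cumulative nature of the ladder can be illustrated with a small sketch. The stage order and next commands are copied from the table above; the function names are hypothetical, and the 8-character hex suffix is an assumption based on the `qa-1a2b3c4d` example, not a statement about how the CLI actually mints IDs.

```python
import secrets

# Order taken from the stage ladder table; everything else is illustrative.
STAGE_LADDER = ["structure", "summarize", "approve", "plan"]

NEXT_COMMAND = {
    "structure": "generate-reading {sid}",
    "summarize": "regenerate-reading {sid} --from=summarize",
    "approve": "approve-reading {sid} --yes",
    "plan": "replan {sid}",
}


def mint_source_id() -> str:
    # assumption: 8 hex chars, matching the qa-1a2b3c4d example
    return f"qa-{secrets.token_hex(4)}"


def seeded_levels(at: str) -> list[str]:
    """Levels whose artifacts get laid down for a given --at value."""
    if at not in STAGE_LADDER:
        raise ValueError(f"unknown stage: {at!r}")
    return STAGE_LADDER[: STAGE_LADDER.index(at) + 1]


def next_command(at: str, sid: str) -> str:
    return "auto-lorebook " + NEXT_COMMAND[at].format(sid=sid)
```

For example, `seeded_levels("approve")` includes `structure` and `summarize` as well, mirroring the rule that each level seeds everything from the prior levels.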
Fixtures live in the package at src/auto_lorebook/_qa_fixtures/; the
default is tiny-aldara (a 4-cue SRT with two segments). Add new
fixtures by dropping a sibling directory containing the same set of
artifacts.
Clean up with reject-ingest <sid> (which knows how to remove pending
artifacts and any contributions written into the wiki). The interactive
review and reading-approval gates are out of scope for QA seeding —
exercise them manually if you need to test those paths.
After every code assignment¶
Run these in order:
uv sync --dev # keep dependencies fresh
uv run ruff check # lint
uv run ruff format # format
uv run ty check # type-check
uv run pytest # run tests
If the change touched any of:
- Environment variables or config keys.
- On-disk layout under sources/, <category>/, or ~/.auto-lorebook/.
- Architecture (pipeline stages, entity model, artifact hashing).
Update the matching page under docs/ in the same commit, then run:
uv run mkdocs build --strict
Doc drift is a CI failure, not a future cleanup task.
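If you prefer a single entry point, the sequence can be scripted. The sketch below is a hypothetical helper, not something shipped with the project. It runs the same commands in the same order, stops at the first failure, and appends the docs build only when asked. The `run` parameter exists so the logic can be exercised without invoking uv.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Commands copied from the checklist above, in order.
CHECKS: List[List[str]] = [
    ["uv", "sync", "--dev"],
    ["uv", "run", "ruff", "check"],
    ["uv", "run", "ruff", "format"],
    ["uv", "run", "ty", "check"],
    ["uv", "run", "pytest"],
]
DOCS_CHECK: List[str] = ["uv", "run", "mkdocs", "build", "--strict"]


def _run(cmd: Sequence[str]) -> int:
    # inherits stdout/stderr so tool output stays visible
    return subprocess.run(cmd).returncode


def run_checks(
    docs_changed: bool = False,
    run: Callable[[Sequence[str]], int] = _run,
) -> Optional[List[str]]:
    """Run checks in order; return first failing command, or None."""
    commands = CHECKS + ([DOCS_CHECK] if docs_changed else [])
    for cmd in commands:
        if run(cmd) != 0:
            return cmd  # stop early: later checks assume earlier ones passed
    return None
```

Calling `run_checks(docs_changed=True)` would execute all six commands and return `None` only if every one exits with status 0.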
Technical writing style¶
Docstrings and inline comments use telegraphic style:
- Omit articles ("the", "a"), auxiliary verbs, unnecessary prepositions, filler words.
- Keep nouns, verbs, adjectives, key modifiers.
- Don't restate types already in signatures.
- Don't summarize functions when the name is self-explanatory.
- Start inline comments lowercase.
- Use periods only for full sentences.
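As an illustration of these rules, here is a hypothetical helper (`chunk_cues` is not part of the project) documented in telegraphic style. The docstring skips articles and does not repeat the types from the signature, and the inline comment starts lowercase with no trailing period.

```python
def chunk_cues(cues: list[str], size: int) -> list[list[str]]:
    """Split into consecutive groups, at most `size` each.

    Last group may be shorter. Raises ValueError if size < 1.
    """
    if size < 1:
        raise ValueError("size must be >= 1")
    # step by size so groups never overlap
    return [cues[i : i + size] for i in range(0, len(cues), size)]
```

A non-telegraphic version of the same docstring might read "This function takes a list of strings and an integer and returns a list of lists of strings", which restates the signature and adds nothing.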
Wiki pages (docs/*.md) keep full sentences for readability, but still
aim for concision: trim wordiness, filler, and restatement.
Behavior guidelines¶
These reduce common LLM coding mistakes. For trivial tasks, use judgment; the guidelines bias toward caution over speed.
Think before coding¶
- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
Simplicity first¶
- No features beyond what was asked.
- No abstractions for single-use code.
- No flexibility or configurability that wasn't requested.
- No error handling for impossible scenarios.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
Surgical changes¶
- Touch only what you must.
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If you notice unrelated dead code, mention it — don't delete it.
Every changed line should trace directly to the user's request.
Goal-driven execution¶
Transform tasks into verifiable goals:
- "Add validation" → "Write tests for invalid inputs, then make them pass."
- "Fix the bug" → "Write a test that reproduces it, then make it pass."
- "Refactor X" → "Ensure tests pass before and after."
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
Pull requests¶
Keep PRs narrow: one feature or fix per PR. Reference the spec section
or docs page that any non-obvious decision draws from. If the change
touches architecture, update docs/architecture/ in the same commit.