provenote

Space Governance Runbook

Use this runbook when you need to understand or reclaim Provenote-related disk usage without guessing.

Classification Model

Space governance uses five retention classes only:

Clear actions are intentionally separate from rebuildability:

Audit Commands

Human-readable audit:

bash tooling/scripts/ops/audit_space_surfaces.sh

JSON output for automation:

bash tooling/scripts/ops/audit_space_surfaces.sh --format json

Machine-cache audit lane:

bash tooling/scripts/ops/cleanup_machine_cache.sh --mode audit-only

Housekeeping inventory for repo-managed cleanup candidates:

bash tooling/scripts/ops/audit_space_surfaces.sh \
  --inventory-class repo_managed_candidate \
  --action-filter safe_clear,cautious_clear

If you specifically need the repo-internal execution view that matches cleanup_runtime_cache.sh, use:

bash tooling/scripts/ops/audit_space_surfaces.sh \
  --cleanup-owner cleanup_runtime_cache.sh \
  --action-filter safe_clear,cautious_clear

Explicit operator path that chains repo-local runtime cleanup, repo-related machine cache review, and Docker/buildx review:

make cleanup-operator-audit
make cleanup-operator-apply

Detailed machine-cache dry run, including stale bootstrap snapshots:

bash tooling/scripts/ops/cleanup_machine_cache.sh \
  --mode dry-run \
  --include-stale-bootstrap-snapshots

Schema and contract validation:

bash tooling/scripts/runtime/run_uv_managed.sh run python tooling/scripts/ci/check_space_surfaces.py

Operator Path

Use one explicit operator path instead of remembering separate buildx and repo-local cleanup commands by hand.

make cleanup-operator-dry-run
make cleanup-operator-rebuildable
make cleanup-operator-aggressive

What each lane means:

Cleanup Buckets

Bucket 1: Safe Clear

These are small repo-local transient caches such as:

Bucket 2: Cautious Clear

These are repo-exclusive and rebuildable, but clearing them slows the next run. Some are repo-internal execution targets, while repo-external entries are candidate inventory only:

Important boundary:

Boundary note:

Bucket 3: Verify Before Clear

These keep current proof, backup, state, or tracked worktree context:

Bucket 4: Do Not Clear In Repo Automation

These are not repo-managed cleanup targets:

Recovery Commands

Frontend dependencies:

(cd apps/web && npm ci)

Managed Python environment and uv cache:

bash tooling/scripts/runtime/run_uv_managed.sh sync --frozen --extra dev

Repo-specific Playwright browsers:

(cd apps/web && npm run test:e2e:install)

Consistent-container caches:

bash tooling/scripts/ci/run_in_consistent_container.sh --profile python -- \
  bash -lc 'bash tooling/scripts/runtime/run_uv_managed.sh sync --frozen --extra dev'

Repo-local managed Python environment:

bash tooling/scripts/runtime/run_uv_managed.sh sync --frozen --extra dev

Single-Container Log Truth

The canonical single-container supervisor log roots are:

Current truth boundary:

Why Shared Layers Are Not Auto-Cleared

Shared layers are like a building-wide storage room: Provenote may use them, but other projects can be using the same storage at the same time.

That is why these paths stay advisory-only in repo automation:

They can still appear in audits, but repo-owned cleanup scripts must not treat them as Provenote-exclusive reclaim targets.

The isolated Chrome user-data root is different:

It is repo-exclusive, but it is a permanent browser state surface, not a clearable download cache. Repo automation must treat it as protected browser state and keep it out of TTL/cap trimming.
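The protection rule above can be pictured as a guard that a trimming script consults before it touches any candidate path. This is a minimal sketch, not the repo's implementation: the function name `is_under_protected_root` is hypothetical, and the caller supplies the protected roots, since this runbook does not spell out the actual paths.

```shell
#!/usr/bin/env bash
# Sketch only: return 0 when a candidate path is a protected root or lives
# inside one. The function name and calling convention are assumptions made
# for illustration; they are not taken from the Provenote scripts.
is_under_protected_root() {
  local candidate="${1%/}" root
  shift
  for root in "$@"; do
    root="${root%/}"                 # normalize a trailing slash
    case "$candidate" in
      "$root"|"$root"/*) return 0 ;; # exact match or a descendant
    esac
  done
  return 1
}
```

A TTL or size-cap trimmer could call the guard once per candidate and skip anything for which it succeeds. Matching on `"$root"/*` rather than a bare prefix keeps a sibling directory that merely shares a name prefix from being treated as protected.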

Docker Runtime Operator Path

Provenote now keeps Docker/buildx cleanup explicit instead of leaving it as tribal knowledge.

This split matters:

Inventory Versus Execution

Two inventory views intentionally coexist:

Docker attribution uses three states:

Until a dedicated per-repo Docker attribution lane exists, Docker Desktop should stay in the first two states only.

Machine Cache Namespace

The canonical repo-specific machine cache root is:

Important subtrees inside that namespace:

Treat this namespace like a repo-owned download shed: it is still Provenote-related space even though it lives outside the checkout, but it should only contain reusable download caches.

Historical Candidates Versus Strict Confirmed Usage

The audit intentionally separates four ideas:

This is why the distinct summary can show historical candidates separately from strict confirmed totals. It prevents parent/child double counting and keeps unresolved named candidates from being silently mixed into the confirmed repo footprint.
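To illustrate the parent/child double-counting concern, the sketch below drops any path that sits inside another listed path before sizes are summed. It is an illustration under assumptions, not the audit script's actual logic: the helper name `drop_nested_paths` is hypothetical, and paths are assumed to be normalized, one per line, with no trailing slashes.

```shell
#!/usr/bin/env bash
# Sketch only: read one path per line on stdin and print only the paths that
# are not nested under another listed path. Hypothetical helper, not part of
# audit_space_surfaces.sh.
drop_nested_paths() {
  # Byte-order sorting guarantees a parent is read before its children.
  LC_ALL=C sort -u | awk '
    {
      # Skip the path if any already-kept path is its ancestor.
      for (i = 1; i <= n; i++)
        if (index($0, kept[i] "/") == 1) next
      kept[++n] = $0
      print
    }'
}
```

Summing sizes over the surviving paths counts each byte once, because a child directory is already included in its parent's total.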

~/.cache/provenote-* entries are migration-only historical candidates. The canonical machine cache root is ~/.cache/provenote, and entrypoint-triggered machine-cache cleanup now removes stray legacy roots instead of preserving them indefinitely.

Bootstrap Snapshot Governance

.runtime-cache/ci-host/bootstrap/apps-web-node-modules stores lock-hash keyed frontend dependency snapshots for the consistent-container bootstrap flow.

Those snapshots must be classified before cleanup:

The repo-local cleanup lane must never wipe the bootstrap root wholesale. It may only consider stale snapshots individually, while preserving the active hash and any locked snapshot directories.
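The per-snapshot rule can be sketched as a listing step that names stale candidates individually and never returns the bootstrap root itself. The layout here is assumed rather than confirmed by this runbook: each snapshot is taken to be a direct child directory named after its lock hash, and a locked snapshot is taken to contain a marker file named `.lock`. The function name `list_stale_snapshots` is likewise hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: print the names of stale snapshot directories under a
# bootstrap root. Assumptions: snapshots are direct child directories named
# by lock hash, and a ".lock" marker file means the snapshot is locked.
list_stale_snapshots() {
  local root="$1" active_hash="$2" dir name
  [ -d "$root" ] || return 0
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue                 # glob matched nothing
    name="$(basename "$dir")"
    [ "$name" = "$active_hash" ] && continue  # preserve the active hash
    [ -e "$dir/.lock" ] && continue           # preserve locked snapshots
    printf '%s\n' "$name"
  done
}
```

The function only prints candidates. Keeping the listing separate from the deletion means a dry run can show exactly what an apply step would remove, one snapshot at a time.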