SourceHarbor

Architecture

SourceHarbor is easiest to understand as a single knowledge pipeline with four outward-facing surfaces.

Figure: SourceHarbor architecture, showing source intake, the API and worker pipeline, artifact generation, retrieval and MCP surfaces, and the web command center.
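
As a rough orientation aid, the four runtime surfaces named on this page can be written down as a small lookup table. This is a hypothetical sketch, not code from the repository: only the directory names (apps/api, apps/worker, apps/mcp, apps/web) and their one-line roles come from this page, while the `Surface` type, the `SURFACES` tuple, and the `surface_for_path` helper are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Surface:
    """One outward-facing runtime surface (illustrative type, not repo code)."""
    name: str
    path: str
    role: str


# Directory names and roles are taken from this page; everything else
# here is an assumption made for the sake of the example.
SURFACES = (
    Surface("API", "apps/api", "exposes HTTP endpoints"),
    Surface("Worker", "apps/worker", "runs the asynchronous pipeline"),
    Surface("MCP", "apps/mcp", "exposes the same system as agent tools"),
    Surface("Web", "apps/web", "operator command center"),
)


def surface_for_path(path: str) -> Optional[Surface]:
    """Return the surface that owns a repository path, or None if no surface does."""
    for surface in SURFACES:
        # Match the directory itself or anything beneath it, but not
        # sibling directories that merely share a prefix.
        if path == surface.path or path.startswith(surface.path + "/"):
            return surface
    return None
```

Given a file path (the example path `apps/worker/pipeline.py` is made up), `surface_for_path` would report the Worker surface, and it returns `None` for paths outside the four app directories, which is one way to think about code that belongs to the shared layer rather than to a single surface.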

The Product Model

Long-form sources come in through two honest intake lanes:

Everything else in the repository exists to make that loop reliable, inspectable, and reusable.

The Four Runtime Surfaces

API

apps/api exposes HTTP endpoints for:

Worker

apps/worker runs the asynchronous pipeline:

MCP

apps/mcp exposes the same system as agent tools:

Web

apps/web is the operator command center:

Shared Surfaces

Under Evaluation, Not Runtime Surfaces

Two directions are intentionally kept outside the current runtime surface:

See reference/project-positioning.md for the stable public summary of those future-direction boundaries.

Public docs path

Public docs keep this system map at a summary level. Reader-first surfaces, shared proof ladders, and intake/tracking pillars stay front and center, while blueprint-level contracts stay inside the internal planning ledger. When you need the stable public boundary for agent autopilot, hosted workspaces, or other bids, link to reference/project-positioning.md and reference/ecosystem-and-big-bet-decisions.md instead of citing the internal blueprints directly.

Design Principles

Deferred Directions

Two directions remain explicitly outside the current runtime surface:

Those boundaries are deliberate. They protect the repository’s source-first and local-proof-first contract until auth, isolation, approval, and remote-proof layers exist for real.