Resource-Driven Development

Resource-driven development is the design philosophy behind this architecture. Instead of writing imperative scripts or maintaining per-cluster YAML, you define resources as structured data and let templates + reconciliation handle the rest.

The Idea

Every entity in the platform is a resource with a schema:

```mermaid
erDiagram
    CLUSTER ||--o{ COMPONENT_REF : "has platform_components"
    CLUSTER ||--o{ NAMESPACE_REF : "has namespaces"
    CLUSTER ||--o{ ROLEBINDING_REF : "has rolebindings"
    CLUSTER ||--o{ PATCH : "has patches"
    COMPONENT_REF }o--|| CATALOG_ENTRY : "references"
    NAMESPACE_REF }o--|| NAMESPACE_DEF : "references"
    ROLEBINDING_REF }o--|| ROLEBINDING_DEF : "references"

    CLUSTER {
        string id PK
        string cluster_name
        string cluster_dns
        string environment
    }
    COMPONENT_REF {
        string id FK
        boolean enabled
        string oci_tag "nullable override"
        string component_path "nullable override"
    }
    CATALOG_ENTRY {
        string id PK
        string component_path
        string component_version
        string oci_url
        string oci_tag
        boolean cluster_env_enabled
        string[] depends_on
    }
    NAMESPACE_REF {
        string id
    }
    ROLEBINDING_REF {
        string id
    }
    NAMESPACE_DEF {
        string id PK
        object labels
        object annotations
    }
    ROLEBINDING_DEF {
        string id PK
        string role
        object[] subjects
    }
    PATCH {
        string component_id FK
        object key_values
    }
```

cluster.namespaces and cluster.rolebindings are reference arrays (id only). Full namespace/rolebinding payloads live in their own definition resources and are resolved during merge.
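The resolution step above can be sketched in Python. The document shapes here are illustrative (the catalog entries, cluster fields, and override rules are assumptions drawn from the diagram; the real merge logic lives in the Rust API service):

```python
# Hypothetical in-memory stores standing in for the real resource backends.
CATALOG = {
    "cert-manager": {"oci_url": "oci://registry.example/cert-manager",
                     "oci_tag": "v1.14.0", "component_path": "./"},
}
NAMESPACE_DEFS = {
    "monitoring": {"labels": {"team": "platform"}, "annotations": {}},
}

def resolve_cluster(cluster: dict) -> dict:
    """Expand id-only reference arrays into full payloads during merge."""
    components = []
    for ref in cluster.get("platform_components", []):
        entry = dict(CATALOG[ref["id"]])  # start from catalog defaults
        # Nullable per-cluster overrides win over catalog values.
        for key in ("oci_tag", "component_path"):
            if ref.get(key) is not None:
                entry[key] = ref[key]
        entry["id"] = ref["id"]
        entry["enabled"] = ref.get("enabled", True)
        components.append(entry)
    namespaces = [dict(NAMESPACE_DEFS[ref["id"]], id=ref["id"])
                  for ref in cluster.get("namespaces", [])]
    return {**cluster, "platform_components": components,
            "namespaces": namespaces}

cluster = {
    "cluster_name": "prod-eu-1",
    "platform_components": [{"id": "cert-manager", "oci_tag": "v1.15.0"}],
    "namespaces": [{"id": "monitoring"}],
}
merged = resolve_cluster(cluster)
```

Note how the cluster's `oci_tag` override beats the catalog default while every other field falls through — the computed response is fully expanded, so consumers never chase references themselves.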

Resources are declared, not scripted. The API merges them. Templates render them. Flux reconciles them.

Three-Layer Separation

The architecture cleanly separates what from how from where:

| Layer | Responsibility | Who Owns It | Example |
|---|---|---|---|
| Data | What should exist on each cluster | Platform operators via API/CLI | "Cluster X should have cert-manager v1.14.0 with 3 replicas" |
| Templates | How resources are rendered into Kubernetes manifests | Platform engineers via Git | ResourceSet template that turns an input into a HelmRelease |
| Reconciliation | Where and when resources are applied | Flux Operator (automated) | Flux detects drift and applies the diff |

This separation means:

  • Operators change cluster state by updating data (API calls), not by writing YAML
  • Engineers change how things are deployed by updating templates (Git PRs), not by touching every cluster
  • Flux handles the convergence loop — no manual kubectl apply, no configuration management playbooks, no custom deployment scripts
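To make the template layer concrete, here is a minimal sketch of turning one merged component input into a HelmRelease-shaped document. The field values are assumptions for illustration; in the real system this rendering is done by a Flux ResourceSet template in Git, not application code:

```python
def render_helmrelease(component: dict, namespace: str = "flux-system") -> dict:
    """Render one merged component input into a HelmRelease-shaped dict."""
    return {
        "apiVersion": "helm.toolkit.fluxcd.io/v2",
        "kind": "HelmRelease",
        "metadata": {"name": component["id"], "namespace": namespace},
        "spec": {
            # Assumed convention: an OCIRepository per component, same name.
            "chartRef": {"kind": "OCIRepository", "name": component["id"]},
            "interval": "10m",
        },
    }

hr = render_helmrelease({"id": "cert-manager", "oci_tag": "v1.14.0"})
```

The point of the separation is that this function (the "how") changes via Git PR, while its input (the "what") changes via API call — neither change touches the other layer.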

How a Change Flows Through the System

Example: Adding a new platform component to 50 clusters

Traditional approach:

  1. Write Helm values for 50 clusters (or complex overlay structure)
  2. Open PR to add component to each cluster’s directory
  3. Wait for PR review and merge
  4. Watch tier-by-tier rollout
  5. Debug failures per-cluster

Resource-driven approach:

  1. Add the component to the catalog (one API call)
  2. Add a component reference to each cluster’s platform_components array (one API call per cluster, or a batch script)
  3. Done — Flux picks it up on next poll

```mermaid
flowchart LR
    A["API call:<br/>Add component<br/>to catalog"] --> B["API call:<br/>Add component_ref<br/>to cluster doc"]
    B --> C["Next poll cycle:<br/>Provider fetches<br/>updated inputs"]
    C --> D["ResourceSet renders<br/>HelmRepo + HelmRelease"]
    D --> E["Flux reconciles:<br/>component installed"]
```
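The batch script mentioned in step 2 could be as small as one POST per cluster. The endpoint path, payload shape, and auth header below are assumptions for illustration; the real API surface is generated from the Firestone schemas:

```python
import json
import urllib.request

API_BASE = "https://platform-api.example.internal"  # assumed host

def build_add_ref_request(cluster_id: str, component_id: str,
                          token: str) -> urllib.request.Request:
    """Build the POST that appends a component_ref to one cluster document."""
    payload = json.dumps({"id": component_id, "enabled": True}).encode()
    return urllib.request.Request(
        f"{API_BASE}/v2/clusters/{cluster_id}/platform_components",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
    )

# One urlopen per cluster; Flux picks the change up on its next poll:
# for cluster_id in all_cluster_ids:
#     urllib.request.urlopen(
#         build_add_ref_request(cluster_id, "cert-manager", token))
```

Whether this runs as a loop or 50 parallel calls, the key property holds: no per-cluster YAML is written and no pipeline is triggered.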

Example: Patching a component value on one cluster

```mermaid
flowchart LR
    A["CLI:<br/>patch-component podinfo<br/>--set replicaCount=3"] --> B["API updates<br/>cluster.patches.podinfo"]
    B --> C["Provider fetches<br/>updated inputs"]
    C --> D["ResourceSet renders<br/>HelmRelease with<br/>valuesFrom ConfigMap"]
    D --> E["Flux reconciles:<br/>podinfo scales to 3"]
```

No Git PR. No pipeline. The data change flows through the system automatically.
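A plausible sketch of how a `patches` entry overlays a component's default values during rendering (a hypothetical helper; in practice the override travels as the valuesFrom ConfigMap shown in the diagram):

```python
def deep_merge(base: dict, patch: dict) -> dict:
    """Recursively overlay patch key/values onto base without mutating it."""
    out = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(out.get(key), dict):
            out[key] = deep_merge(out[key], value)
        else:
            out[key] = value
    return out

# Assumed component defaults plus the patch from the CLI example above.
default_values = {"replicaCount": 1,
                  "resources": {"requests": {"cpu": "100m"}}}
patch = {"replicaCount": 3}  # from `patch-component podinfo --set replicaCount=3`
values = deep_merge(default_values, patch)
```

Only the patched key changes; untouched defaults (and the stored defaults themselves) are left intact, which is what makes per-cluster patches safe to apply and remove.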

Resource Schemas as API Contracts

Each resource type has a defined schema managed via Firestone — a resource-based API specification generator that converts JSON Schema definitions into OpenAPI specs, CLI tools, and downstream code generation artifacts.

The schemas:

  • cluster (v2) — the full cluster document with arrays of component refs, namespace refs, rolebinding refs, and a patches object
  • platform_component (v1) — the catalog entry with OCI URLs, versions, dependencies
  • namespace (v1) — namespace with labels and annotations
  • rolebinding (v1) — role binding with subjects

These schemas are the single source of truth for:

  • OpenAPI spec generation (openapi/openapi.yaml) — used for API documentation and client generation
  • Rust model generation (src/models/, src/apis/) — the structs the API service uses
  • CLI code generation (src/generated/cli/) — the CLI commands for each resource type

When a schema changes, make generate regenerates all downstream artifacts. This ensures the API, CLI, and documentation stay in sync with the resource definitions. See the Firestone documentation for the full schema language and generator options.
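As a toy illustration of schemas acting as contracts, a required-keys check can reject a malformed rolebinding before it is written. The field list is an assumption taken from the diagram above; the real validation comes from the Firestone-generated OpenAPI spec and Rust models, not hand-rolled code:

```python
# Assumed minimal contract for the rolebinding (v1) resource.
ROLEBINDING_SCHEMA = {"required": ["id", "role", "subjects"]}

def validate(doc: dict, schema: dict) -> list[str]:
    """Return the list of required keys missing from doc."""
    return [key for key in schema["required"] if key not in doc]

errors = validate({"id": "dev-admins", "role": "admin"}, ROLEBINDING_SCHEMA)
# errors == ["subjects"]
```

Because the same schema also drives the generated CLI and API models, a document that passes validation here is by construction one the rest of the system knows how to handle.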

Benefits for Enterprise

Auditability

Every state change goes through the API. The API can log who changed what, when. Combined with Git history for templates, you have a full audit trail.

Consistency

The merge logic guarantees that every cluster gets a consistent, computed response. No hand-edited YAML files that drift.

Velocity

Operators can change cluster state in seconds. No PR cycles for operational changes. Reserve Git PRs for template/structural changes.

Testability

Because resources are structured data, you can:

  • Validate schemas before applying
  • Unit test merge logic
  • Integration test API responses against the ExternalService contract
  • Dry-run template rendering

Separation of Permissions

  • Template changes (how things deploy) require Git PR review
  • Data changes (what is deployed where) require API auth tokens
  • Reconciliation is automated — no human in the loop