Declarative bootstrap for Kubernetes

One static binary that takes a freshly created cluster from “API server answers” to “workloads can be deployed.” No kubectl, no helm binary, no bash, no sleep 30.


bootstrap.yaml

apiVersion: khook.io/v1
kind: Khook
metadata:
  name: bootstrap
defaults:
  timeout: 5m
steps:
  - name: cilium
    helm:
      chart: cilium
      repo: https://helm.cilium.io/
      version: 1.18.4
      namespace: kube-system
      atomic: true
  - name: all-ready
    needs: [cilium]
    wait:
      for: condition=Ready
      on: pods
      allNamespaces: true

zsh

$ khook apply -f bootstrap.yaml
✓ cilium (helm)  24.108s
✓ all-ready (wait)  9.412s

The problem

The gap nobody owns

Terraform (or eksctl, or CAPI) hands you a cluster. ArgoCD takes over once it’s installed. In between lives everybody’s least favorite artifact, the bootstrap script: a few hundred lines of kubectl apply, helm upgrade --install, sleep 30, and retry loops, duct-taped into a null_resource and feared by everyone on call. khook replaces that script with a declarative spec: a DAG of steps that it validates, plans, and converges the same way every time.

Why khook

Built for the day-zero window

Everything between “cluster exists” and “GitOps has the wheel” — sequenced, wait-heavy, run-once-converge-always.

📦

One binary, zero dependencies

The Kubernetes and Helm SDKs are embedded. khook never shells out — nothing to install on the runner but khook itself.

🕸️

A DAG, not a script

Steps declare their dependencies with needs:. khook topologically sorts them and runs the steps in each level in parallel. Cycles are caught before anything touches the cluster.

🔁

Idempotent by design

Re-running a spec is always safe: Helm release history decides install vs. upgrade, applies converge existing resources, and deletes treat “already gone” as success. Run it on every terraform apply.

🚨

Fails loud, precisely

Per-step timeouts and retries, onError: fail | continue, and a summary naming exactly what succeeded, failed, or was skipped — with distinct exit codes for validation vs execution.

🔍

Plan before you touch

khook plan predicts install / upgrade / no-op per step against the live cluster; --diff renders exact object changes via server-side dry-run. Reads only — plan never mutates.

🤝

Bootstrap, then hand off

khook installs your CNI, secrets tooling, and GitOps controller — then gets out of the way. It is deliberately not a GitOps engine.
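
The level-by-level ordering described under “A DAG, not a script” can be illustrated with Kahn's algorithm. This is a minimal sketch, not khook's actual implementation: the Step type and the levels function are hypothetical names invented for the example, and the extra step names in main (cert-manager, argocd) are placeholders.

```go
package main

import (
	"fmt"
	"sort"
)

// Step is a hypothetical, simplified stand-in for a spec step:
// a name plus the names it lists under needs.
type Step struct {
	Name  string
	Needs []string
}

// levels groups steps into waves using Kahn's algorithm. Every step in a
// wave depends only on steps from earlier waves, so a wave could run in
// parallel. Duplicate names, unknown dependencies, and cycles are errors,
// and all of them are detected before any step would run.
func levels(steps []Step) ([][]string, error) {
	indegree := make(map[string]int, len(steps))
	dependents := make(map[string][]string)
	for _, s := range steps {
		if _, dup := indegree[s.Name]; dup {
			return nil, fmt.Errorf("duplicate step %q", s.Name)
		}
		indegree[s.Name] = 0
	}
	for _, s := range steps {
		for _, n := range s.Needs {
			if _, ok := indegree[n]; !ok {
				return nil, fmt.Errorf("step %q needs unknown step %q", s.Name, n)
			}
			indegree[s.Name]++
			dependents[n] = append(dependents[n], s.Name)
		}
	}

	// The first wave is every step with no dependencies.
	var ready []string
	for name, d := range indegree {
		if d == 0 {
			ready = append(ready, name)
		}
	}

	var out [][]string
	done := 0
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic order within a wave
		out = append(out, ready)
		done += len(ready)
		var next []string
		for _, name := range ready {
			for _, dep := range dependents[name] {
				indegree[dep]--
				if indegree[dep] == 0 {
					next = append(next, dep)
				}
			}
		}
		ready = next
	}
	// Steps that never reached indegree zero are part of a cycle.
	if done != len(steps) {
		return nil, fmt.Errorf("dependency cycle among %d step(s)", len(steps)-done)
	}
	return out, nil
}

func main() {
	plan, err := levels([]Step{
		{Name: "cilium"},
		{Name: "cert-manager", Needs: []string{"cilium"}},
		{Name: "argocd", Needs: []string{"cilium"}},
		{Name: "all-ready", Needs: []string{"cert-manager", "argocd"}},
	})
	fmt.Println(plan, err) // [[cilium] [argocd cert-manager] [all-ready]] <nil>
}
```

Because the whole graph is sorted up front, a cycle or a typo in needs surfaces as a validation error rather than as a half-applied cluster.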

The DSL

Seven verbs cover the bootstrap surface

A spec is a set of steps, each with exactly one action — the key implies the type, no discriminator field. Variables (${VAR}, ${VAR:-default}, sprig pipelines) and when: CEL conditions keep one spec serving many environments.

Verb	What it does	Instead of
`helm:`	install-or-upgrade a chart (history decides)	`helm repo add` + `helm upgrade --install`
`apply:`	apply manifests — inline, file, URL, or kustomize — optionally waiting on them	`kubectl apply -f/-k` (`&& kubectl wait`)
`delete:`	remove resources by manifest, selector, or Helm release	`kubectl delete` / `helm uninstall`
`patch:`	modify a resource in place (strategic/merge/json)	`kubectl patch`
`wait:`	block until a condition or jsonpath holds (or gone)	`kubectl wait` + `sleep`-and-pray
`rollout:`	restart / await workload rollouts	`kubectl rollout restart/status`
`job:`	run a container to completion in-cluster	one-off `kubectl run` / bash scripts

Variables & conditions

One spec, many environments

Env-first variables: values come from --set, --var-file, or KHOOK_VAR_* environment variables. A missing variable is a validation error, and all missing variables are reported together in one pass.
Secrets stay out of your logs: KHOOK_SECRET_* values (and anything derived from them through pipelines) are redacted from logs, plan, and diff.
Sprig pipelines on values, Helm-style: the function set is hermetic, so plan and apply always see the same spec.
CEL conditions: a when: expression is evaluated at load time, before anything touches the cluster. Skipped steps still satisfy their dependents’ needs.

variables.yaml

steps:
  - name: app-namespace
    apply:
      manifests:
        - inline: |
            apiVersion: v1
            kind: Namespace
            metadata:
              name: ${APP_NAME | lower | trunc 63}

  - name: argocd
    needs: [app-namespace]
    when: vars.get("ENABLE_ARGOCD", "false") == "true"
    helm:
      chart: argo-cd
      repo: https://argoproj.github.io/argo-helm
      version: ${ARGOCD_CHART_VERSION:-7.7.5}
      namespace: argocd
      createNamespace: true
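
The ${VAR} and ${VAR:-default} forms used in the spec above, together with the rule that missing variables are reported together, can be sketched in a few lines of Go. This is an illustration only: expand is a hypothetical function, sprig pipelines such as ${APP_NAME | lower} are out of scope, and treating an empty value like an unset one for :- is borrowed from POSIX shell rather than confirmed khook behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// ref matches ${NAME} and ${NAME:-default}. The second group keeps the
// ":-" prefix so that an empty default can be told apart from no default.
var ref = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(:-[^}]*)?\}`)

// expand substitutes variable references in text. It collects every
// missing variable and returns them in a single error instead of
// stopping at the first one.
func expand(text string, vars map[string]string) (string, error) {
	missing := map[string]bool{}
	out := ref.ReplaceAllStringFunc(text, func(m string) string {
		parts := ref.FindStringSubmatch(m)
		name, fallback := parts[1], parts[2]
		v, ok := vars[name]
		if fallback != "" {
			// Assumption: like POSIX shell, use the default when the
			// variable is unset or empty.
			if v == "" {
				return strings.TrimPrefix(fallback, ":-")
			}
			return v
		}
		if !ok {
			missing[name] = true
			return m
		}
		return v
	})
	if len(missing) > 0 {
		names := make([]string, 0, len(missing))
		for n := range missing {
			names = append(names, n)
		}
		sort.Strings(names)
		return "", fmt.Errorf("missing variables: %s", strings.Join(names, ", "))
	}
	return out, nil
}

func main() {
	spec := "name: ${APP_NAME}\nversion: ${ARGOCD_CHART_VERSION:-7.7.5}"
	out, err := expand(spec, map[string]string{"APP_NAME": "shop"})
	fmt.Println(out, err)

	_, err = expand("${A} ${B} ${A}", nil)
	fmt.Println(err) // missing variables: A, B
}
```

Collecting the missing names before failing means one run is enough to learn everything an environment still has to provide.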

vs Terraform

Isn’t this just Terraform’s `kubernetes`/`helm` providers?

No chicken-and-egg. Providers are configured at plan time, but the cluster’s endpoint only exists after apply. khook runs strictly after: a kubeconfig is an input, never a cycle in the graph.
Plans need a live API schema. Ship custom resource definitions and the resources that use them in one Terraform apply and the plan fails — the type doesn’t exist yet. khook executes in DAG order against the real server, so they land in order.
Waiting is first-class. Bootstrap is mostly waiting; readiness gates are steps, not time_sleep and local-exec.
No state hostage-taking. khook keeps no object inventory — the cluster is the source of truth, and handing off to GitOps is the design, not a turf war.


Quickstart

Try it in 60 seconds

zsh

$ go install github.com/dvrkn/khook/cmd/khook@latest
$ k3d cluster create dev
$ khook apply -f bootstrap.yaml
✓ cilium (helm)  24.108s
✓ all-ready (wait)  9.412s

Run it again — everything converges, nothing breaks. That’s the point.