pithkit — LLM-authored programs, compiled to typed Go

Three primitives. Nineteen verbs. Twelve types. Six step kinds.

The closure is the contribution. The LLM cannot invent new vocabulary — it must compose what exists. The validator enforces the closure; the codegen reads it. Everything else falls out.

Entities

Data shapes, by name

Field-name → abstract type. No methods, no inheritance. A Link in one leaf's output is recognizably the same Link in another leaf's input.
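As a rough illustration of what "field-name → abstract type" could look like on the Go side, the sketch below maps the Link entity from the URL shortener program on this page onto a plain struct. The type mapping (Text to string, Optional<Timestamp> to *time.Time), the JSON tags, and the describe helper are assumptions made for illustration, not pithkit's documented output.

```go
package main

import (
	"fmt"
	"time"
)

// Link mirrors the entity fields slug, original_url and expires_at.
// Assumed mapping: Text -> string, Optional<Timestamp> -> *time.Time (nil means absent).
type Link struct {
	Slug        string     `json:"slug"`
	OriginalURL string     `json:"original_url"`
	ExpiresAt   *time.Time `json:"expires_at,omitempty"`
}

// describe is a free function (entities carry no methods) used only for this demo.
func describe(l Link) string {
	if l.ExpiresAt == nil {
		return fmt.Sprintf("%s -> %s (no expiry)", l.Slug, l.OriginalURL)
	}
	return fmt.Sprintf("%s -> %s (expires %s)", l.Slug, l.OriginalURL, l.ExpiresAt.Format(time.RFC3339))
}

func main() {
	exp := time.Date(2030, 1, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(describe(Link{Slug: "abc", OriginalURL: "https://example.com"}))
	fmt.Println(describe(Link{Slug: "xyz", OriginalURL: "https://example.org", ExpiresAt: &exp}))
}
```

Because the struct is just named fields, any two leaves that mention Link share one Go type.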

Leaves

Single-responsibility atoms

Verb-first (one of 19), typed I/O, ≥ 3 examples. Cannot call other leaves. The atom codegen fills the body; examples become Go tests.
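To make "examples become Go tests" concrete, here is a minimal sketch of a leaf body plus an example table checked in a loop. The validation rule (absolute http or https URL), the "ok" and "reason" keys, and the example values are all hypothetical; only the leaf name validate_url and its Map<Text,Text> output come from the program on this page.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateURL is a hypothetical body for the validate_url leaf.
// The output is modelled as map[string]string to match Map<Text,Text>.
func validateURL(raw string) map[string]string {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" || (u.Scheme != "http" && u.Scheme != "https") {
		return map[string]string{"ok": "false", "reason": "need an absolute http(s) URL"}
	}
	return map[string]string{"ok": "true"}
}

// example pairs an input with the expected "ok" value.
type example struct {
	in     string
	wantOK string
}

// Three examples, matching the stated minimum per leaf (values are made up).
var examples = []example{
	{"https://example.com/a", "true"},
	{"ftp://example.com/a", "false"},
	{"not a url", "false"},
}

// runExamples checks every example and returns the number of failures.
func runExamples() int {
	failed := 0
	for _, ex := range examples {
		if got := validateURL(ex.in)["ok"]; got != ex.wantOK {
			fmt.Printf("FAIL %q: got ok=%s, want %s\n", ex.in, got, ex.wantOK)
			failed++
		}
	}
	return failed
}

func main() {
	fmt.Println("failed examples:", runExamples())
}
```

In a generated package the same table would more naturally live in a _test.go file and report through testing.T; the loop is kept in main here so the sketch runs on its own.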

Composites

Entry points with bodies

HTTP routes, browser events, CLI commands. The body composes leaf calls using 6 step kinds: let, call, check, return, decide, map.
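The closed vocabulary is easy to picture as a membership check. The sketch below validates step kinds against the six named above; the Step type and the error wording are assumptions, and a real validator would presumably check much more (verbs, types, argument references).

```go
package main

import "fmt"

// stepKinds is the closed set of six step kinds named in the text.
var stepKinds = map[string]bool{
	"let": true, "call": true, "check": true,
	"return": true, "decide": true, "map": true,
}

// Step is a hypothetical minimal view of one body step.
type Step struct {
	Kind string
}

// validateBody returns one error per step whose kind is outside the closed set.
func validateBody(body []Step) []error {
	var errs []error
	for i, s := range body {
		if !stepKinds[s.Kind] {
			errs = append(errs, fmt.Errorf("step %d: unknown kind %q", i, s.Kind))
		}
	}
	return errs
}

func main() {
	body := []Step{{Kind: "let"}, {Kind: "check"}, {Kind: "loop"}, {Kind: "return"}}
	for _, err := range validateBody(body) {
		fmt.Println(err) // step 2: unknown kind "loop"
	}
}
```

An invented kind such as loop is rejected outright rather than interpreted, which is the sense in which the model has to compose what already exists.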

# URL shortener — one file, one LLM call
intent: "Backend API that shortens long URLs into short slugs and redirects."
acceptance_criteria:
  - "POST /shorten generates a slug (auto or user-supplied)"
  - "GET /{slug} redirects to original URL"

entities:
  Link: { slug: Text, original_url: Text, expires_at: Optional<Timestamp> }

leaves:
  - id: validate_url
    verb: validate
    output: { type: "Map<Text,Text>" }
    supports_criteria: [0]

composites:
  - id: create_short_link
    trigger: "POST /shorten"
    body:
      - { kind: let,    bind: v,     value: { fn: validate_url, args: { url: $input.original_url } } }
      - { kind: check,  cond: $v.ok }
      - { kind: let,    bind: slug,  value: { fn: create_slug, args: {} } }
      - { kind: return, value: { fn: compose_shorten_response, args: { slug: $slug } } }
    satisfies_criteria: [0]

Why we stopped building an 8-stage compiler

We spent six months on a staged compiler: eight LLM calls in a row, each emitting an intermediate artifact for the next. It worked on small intents and exhausted its cross-stage retry budget on hard ones. The diagnosis took us a while.

Before — Path A (frozen)

8 LLM calls in a sequence
Stage 1 lexer → Stage 8 codegen
Cross-stage retries: $5–15 / hard intent
Hard intents often didn't build
LLM never saw the whole program at once

After — Path B (this repo)

1 LLM call emits the whole program
Validator rejects misuses
Edit-loop converges in 1–2 turns
Stages 1–6 become internal cognitive moves, never crystallised
The LLM holds the whole program in working memory
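The Path B loop described in the list above can be sketched as: one call produces the whole program, the validator returns complaints, and the complaints are fed back until the draft passes. The Author and Validate signatures, the string-typed draft, and the turn cap are hypothetical stand-ins, not pithkit's actual interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// Author stands in for the LLM call: given the previous draft and the
// validator's complaints, it returns a new draft. Hypothetical signature.
type Author func(prev string, complaints []string) string

// Validate stands in for the validator: it returns complaints, empty when valid.
type Validate func(draft string) []string

// editLoop requests one whole-program draft, then feeds validator complaints
// back to the author until the draft passes or maxTurns edit turns are used.
// It returns the last draft and the number of edit turns consumed.
func editLoop(author Author, validate Validate, maxTurns int) (string, int, error) {
	draft := author("", nil) // the single whole-program call
	for turn := 0; ; turn++ {
		complaints := validate(draft)
		if len(complaints) == 0 {
			return draft, turn, nil
		}
		if turn == maxTurns {
			return draft, turn, errors.New("did not converge")
		}
		draft = author(draft, complaints)
	}
}

func main() {
	// Fake author: the first draft is invalid, any edit produces a valid one.
	author := func(prev string, complaints []string) string {
		if len(complaints) == 0 {
			return "draft with unknown step kind"
		}
		return "fixed draft"
	}
	// Fake validator: accepts only the fixed draft.
	validate := func(draft string) []string {
		if draft != "fixed draft" {
			return []string{"unknown step kind"}
		}
		return nil
	}
	draft, turns, err := editLoop(author, validate, 2)
	fmt.Println(draft, turns, err)
}
```

The point of the shape is that every turn sees the whole program plus concrete complaints, instead of passing partial artifacts between stages.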

Honest caveat: Path A was our own work. The numbers reported here are architectural evidence from our development log, not a controlled comparison against an independent baseline. Read the 12-page essay for the discussion of threats to validity.

See it on a real program

The viewer renders any pithkit Program JSON. Click an acceptance criterion to highlight the leaves and composites that claim it.
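The highlighting amounts to a reverse lookup over the criterion indices that leaves (supports_criteria) and composites (satisfies_criteria) declare. A minimal sketch, assuming a flattened Node view that is not part of the documented Program JSON:

```go
package main

import "fmt"

// Node is a hypothetical flattened view of a leaf or composite: its id plus
// the acceptance-criterion indices it claims.
type Node struct {
	ID       string
	Criteria []int
}

// claimants returns the ids of nodes that claim criterion idx, in input order.
func claimants(nodes []Node, idx int) []string {
	var ids []string
	for _, n := range nodes {
		for _, c := range n.Criteria {
			if c == idx {
				ids = append(ids, n.ID)
				break // count each node once even if the index repeats
			}
		}
	}
	return ids
}

func main() {
	// Ids and indices taken from the URL shortener program on this page.
	nodes := []Node{
		{ID: "validate_url", Criteria: []int{0}},
		{ID: "create_short_link", Criteria: []int{0}},
	}
	fmt.Println(claimants(nodes, 0)) // [validate_url create_short_link]
	fmt.Println(claimants(nodes, 1)) // []
}
```

An empty result for a criterion, as for index 1 here, is itself useful: it flags an acceptance criterion that nothing in the program claims to cover.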

Example programs: URL shortener, Event attendance, Mini e-commerce.

Give the LLM a constrained surface to author programs in. Not free-form Go.