NAME
GOAL.md — instruction file for Claude Code
SYNOPSIS
./
METADATA
DESCRIPTION
GOAL.md is a file format for directing autonomous AI agent optimization loops. Inspired by Andrej Karpathy's autoresearch pattern, it provides a structured way to define what an agent should optimize, how to measure success, and what actions it can take.
Unlike CLAUDE.md (which is instructional — describing how to work in a codebase), GOAL.md is directive and metric-driven. It defines fitness functions with quantitative targets, improvement loops with explicit iteration strategies, and action catalogs that constrain the agent's search space.
The core pattern is: observe metrics, hypothesize a change, implement it, measure the result, and decide whether to keep or revert. This creates a feedback loop that allows an agent to make incremental progress toward a measurable goal without human intervention at each step.
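The cycle described above can be sketched as a small driver function. This is a minimal illustration, not part of the GOAL.md format itself: `run_loop`, `StepResult`, and the four callbacks (`measure`, `propose`, `apply`, `revert`) are hypothetical names, and the sketch assumes a single metric where lower is better.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    iteration: int
    metric: float
    kept: bool


def run_loop(
    measure: Callable[[], float],
    propose: Callable[[float], float],
    apply: Callable[[float], None],
    revert: Callable[[float], None],
    target: float,
    max_iterations: int = 10,
) -> Tuple[float, List[StepResult]]:
    """Observe, hypothesize, implement, measure, decide (lower is better)."""
    history: List[StepResult] = []
    best = measure()                      # observe the baseline
    for i in range(1, max_iterations + 1):
        if best < target:                 # stop once the target is met
            break
        change = propose(best)            # hypothesize a change
        apply(change)                     # implement it
        value = measure()                 # measure the result
        kept = value < best               # decide: keep only improvements
        if kept:
            best = value
        else:
            revert(change)                # undo a change that did not help
        history.append(StepResult(i, value, kept))
    return best, history


# Toy stand-in for a real project: the "change" is just a latency delta.
state = {"latency_ms": 450.0}
deltas = iter([-100.0, 30.0, -120.0, -50.0])

best, history = run_loop(
    measure=lambda: state["latency_ms"],
    propose=lambda _current: next(deltas),
    apply=lambda d: state.update(latency_ms=state["latency_ms"] + d),
    revert=lambda d: state.update(latency_ms=state["latency_ms"] - d),
    target=200.0,
)
print(best, [step.kept for step in history])
```

In a real setup the callbacks would wrap the measurement commands and version-control operations; the toy state only makes the control flow visible.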
GOAL.md is particularly useful for optimization tasks (performance tuning, bundle size reduction, test coverage improvement) where the objective is clear and measurable but the path to get there requires experimentation. The action catalog prevents the agent from taking unbounded risks, while the fitness functions ensure every change is evaluated objectively.
The format is designed to be committed to version control alongside CLAUDE.md, creating a complete agent configuration: CLAUDE.md provides the working context, GOAL.md provides the optimization target.
STRUCTURE
ANNOTATED EXAMPLE
# GOAL.md - Autonomous Optimization Target

## Objective
Reduce cold-start latency of the API server to under 200ms
while maintaining 100% test pass rate.

## Fitness Functions

| Metric             | Baseline | Target | Measurement                       |
|--------------------|----------|--------|-----------------------------------|
| Cold-start latency | 450ms    | <200ms | `time node dist/server.js`        |
| Bundle size        | 2.4MB    | <1.5MB | `du -sh dist/`                    |
| Test pass rate     | 100%     | 100%   | `npm test -- --reporter=json`     |
| Import count       | 47       | <30    | `grep -c "^import" src/server.ts` |

## Improvement Loop

1. **Observe**: Run all fitness function measurements
2. **Hypothesize**: Identify the largest contributor to cold-start time
3. **Implement**: Make ONE targeted change (lazy import, tree-shake, etc.)
4. **Measure**: Re-run fitness functions
5. **Decide**:
   - If latency improved AND tests pass → commit and continue
   - If tests fail → revert immediately
   - If latency unchanged after 3 attempts on same target → move to next

**Stop when**: All targets met, OR 10 iterations completed.

## Action Catalog

| Action                             | Risk   | Precondition                  |
|------------------------------------|--------|-------------------------------|
| Convert static import → dynamic    | Low    | Module not used at startup    |
| Remove unused dependency           | Low    | No import references found    |
| Replace heavy lib with lighter alt | Medium | Equivalent API coverage       |
| Refactor module structure          | Medium | Tests cover affected code     |
| Modify build config                | High   | Snapshot current config first |

## Progress Log

| Iteration | Change   | Latency | Tests | Status |
|-----------|----------|---------|-------|--------|
| 0         | Baseline | 450ms   | PASS  | -      |
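To show how the Fitness Functions table in the example could be consumed by tooling, here is a rough sketch that reads the table and checks a measured value against a target string such as `<200ms`. The helpers `parse_table` and `meets_target` are hypothetical; the sketch assumes cells contain no `|` characters, that the measured value is already in the target's unit, and that a target without an operator (such as `100%`) means exact equality.

```python
import operator
import re

GOAL_MD = """\
## Fitness Functions

| Metric | Baseline | Target | Measurement |
|--------|----------|--------|-------------|
| Cold-start latency | 450ms | <200ms | `time node dist/server.js` |
| Bundle size | 2.4MB | <1.5MB | `du -sh dist/` |
| Test pass rate | 100% | 100% | `npm test -- --reporter=json` |
| Import count | 47 | <30 | `grep -c "^import" src/server.ts` |

## Improvement Loop
"""


def parse_table(markdown, heading):
    """Return the rows of the pipe table under `## heading` as dicts."""
    rows, header, in_section = [], None, False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            in_section = stripped[3:].strip() == heading
            continue
        if not in_section or not stripped.startswith("|"):
            continue
        cells = [c.strip() for c in stripped.strip("|").split("|")]
        if all(re.fullmatch(r":?-+:?", c) for c in cells):
            continue  # skip the |---|---| separator row
        if header is None:
            header = cells  # first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


_TARGET = re.compile(r"(<=|>=|<|>)?\s*(\d+(?:\.\d+)?)\s*[A-Za-z%]*")
_OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
        ">=": operator.ge, None: operator.eq}  # no operator: exact match


def meets_target(measured, target):
    """Compare a measured number with a target such as '<200ms' or '100%'."""
    match = _TARGET.fullmatch(target.strip())
    if match is None:
        raise ValueError(f"unrecognized target: {target!r}")
    return _OPS[match.group(1)](measured, float(match.group(2)))


rows = parse_table(GOAL_MD, "Fitness Functions")
for row in rows:
    print(row["Metric"], "->", row["Target"])
```

Keeping the target as a string in the file and interpreting it at check time keeps the table readable for humans while still giving the loop a yes/no answer per metric.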
COMMON MISTAKES
CLAUDE.md provides instructional context (code style, architecture, commands). GOAL.md provides directive context (what to optimize, how to measure success). They serve different purposes: CLAUDE.md tells the agent how to work; GOAL.md tells it what to achieve.
Autonomous loops need quantitative feedback to function. Without measurable targets, the agent cannot evaluate whether its changes are improvements. Every fitness function should have a baseline value, a numeric target, and a measurement method.
Without explicit rollback conditions, an autonomous agent can compound bad changes across iterations. Each improvement loop should specify what constitutes a failed experiment and how to recover.
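One way to make rollback conditions explicit is to encode the decision as a pure function that the loop calls after every measurement. The sketch below is an assumption-laden illustration: `Verdict` and `decide` are hypothetical names, a passing but non-improving change is treated as a failed experiment and reverted, and failing tests do not count toward the stalled-attempt limit.

```python
from enum import Enum


class Verdict(Enum):
    COMMIT = "commit and continue"
    REVERT = "revert and retry"
    NEXT_TARGET = "revert and move to the next target"


def decide(before_ms, after_ms, tests_pass, stalled_attempts, max_stalled=3):
    """Return (verdict, updated count of non-improving attempts)."""
    if not tests_pass:
        # A failing test suite always means rollback.
        return Verdict.REVERT, stalled_attempts
    if after_ms < before_ms:
        # Improvement with green tests: keep it and reset the counter.
        return Verdict.COMMIT, 0
    stalled_attempts += 1
    if stalled_attempts >= max_stalled:
        # Too many attempts without progress on this target: give up on it.
        return Verdict.NEXT_TARGET, 0
    return Verdict.REVERT, stalled_attempts


print(decide(450, 400, True, 0))
print(decide(450, 450, True, 2))
```

The caller would map each verdict to a concrete recovery step, for example committing on `COMMIT` and discarding the working-tree changes otherwise.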
An open-ended action space is dangerous in autonomous loops. The action catalog constrains what the agent can try, preventing it from making risky changes (deleting files, changing APIs) without explicit permission.
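As an illustration of how a catalog constrains the search space, the following sketch denies any action that is not listed, whose precondition has not been confirmed, or whose risk exceeds a ceiling. The entries mirror the Action Catalog in the example; `Action`, `is_allowed`, and the `max_risk` ceiling are hypothetical additions for this sketch, not part of the format.

```python
from dataclasses import dataclass

RISK_ORDER = {"Low": 0, "Medium": 1, "High": 2}


@dataclass(frozen=True)
class Action:
    name: str
    risk: str
    precondition: str


CATALOG = [
    Action("Convert static import → dynamic", "Low", "Module not used at startup"),
    Action("Remove unused dependency", "Low", "No import references found"),
    Action("Replace heavy lib with lighter alt", "Medium", "Equivalent API coverage"),
    Action("Refactor module structure", "Medium", "Tests cover affected code"),
    Action("Modify build config", "High", "Snapshot current config first"),
]


def is_allowed(action_name, satisfied_preconditions, max_risk="Medium"):
    """Allow only catalogued actions within the risk ceiling whose
    precondition has been confirmed by the caller."""
    for action in CATALOG:
        if action.name == action_name:
            within_risk = RISK_ORDER[action.risk] <= RISK_ORDER[max_risk]
            return within_risk and action.precondition in satisfied_preconditions
    return False  # anything not in the catalog is denied by default


print(is_allowed("Remove unused dependency", {"No import references found"}))
print(is_allowed("Delete files", set()))
```

Denying by default is the design choice that matters here: an action the catalog does not name never runs, regardless of how plausible it looks to the agent.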