Discovering Ralph#
Over the holidays, I kept seeing excitement about “Ralph” in AI coding circles - an autonomous coding loop technique that lets Claude Code work through tasks iteratively without constant human intervention. I started experimenting with it in early January, refining my approach until it worked reliably.
The concept comes from Geoffrey Huntley’s original technique, which is beautifully simple:
while :; do cat PROMPT.md | claude-code ; done
That’s it. A shell loop that feeds a prompt to Claude Code, lets it work, and when Claude exits, feeds the same prompt again. Each iteration is a fresh process with fresh context.
This simplicity masks a profound insight: by starting fresh each iteration, Claude must re-discover its work through files, not conversation memory. There’s no context rot from accumulated tool calls and intermediate reasoning. Each iteration reads the current state of the codebase and decides what to do next.
Why does this matter? Long-running Claude Code sessions accumulate context - previous file reads, tool call history, intermediate reasoning, stale information from earlier in the session. After 10-20 iterations, this context bloat causes slower responses and reduced reasoning quality, and the session eventually hits context limits. Fresh context each iteration solves this elegantly.
The Problem with Existing Implementations#
After discovering Ralph, I naturally searched for existing implementations. I tried several community versions that were shared and recommended. Then I found the official Anthropic ralph-loop plugin and thought: “Perfect, the official implementation should be the gold standard.”
It wasn’t.
The Anthropic plugin uses Stop hooks to intercept session exit and feed the prompt back, keeping everything within a single session. This sounds efficient, but it fundamentally deviates from the original Ralph philosophy. Instead of fresh context each iteration, context accumulates.
Issue #16440 on the Claude Code repository documents this problem:
The original Ralph technique uses an external Bash loop… Each iteration is a new process with fresh context. The ralph-wiggum plugin attempts to replicate this using Stop hooks, but currently the context accumulates instead of resetting.
Here’s how the approaches compare:
| Aspect | Original Ralph | Anthropic Plugin | My Implementation |
|---|---|---|---|
| Context per iteration | Fresh (new process) | Accumulated (same session) | Fresh (new process) |
| Context size over time | Constant (~40k tokens) | Growing (40k → 200k+ tokens) | Constant (~40k tokens) |
| Iteration limit | Unlimited | ~20 before overflow | Unlimited (with -u flag) |
| Learning persistence | Via files | Via conversation memory | Via WORKLOG.md |
| Behavior consistency | Consistent | Degrades over time | Consistent |
The issues are real:
- Context overflow: After 10-20 iterations, context exceeds limits
- Degraded reasoning: Attention gets diluted across irrelevant history
- Different behavior: The accumulated approach produces fundamentally different results
The Solution: ralph-claude-code#
I built ralph-claude-code to solve these problems while adding practical improvements for real-world use.
The core philosophy: fresh context each iteration, but learnings persist through files.
This is accomplished through a two-file system:
- BRIEF.md: Your static task specification (never changes during execution except to mark completion)
- WORKLOG.md: Dynamic learnings accumulated across iterations
┌─────────────────────────────────────────────────────────────┐
│                    Ralph Iteration Loop                     │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
                     ┌──────────────────┐
                     │  Read BRIEF.md   │
                     │  Find first [ ]  │
                     └─────────┬────────┘
                               │
                               ▼
                     ┌──────────────────┐
                     │  Read WORKLOG.md │
                     │  Check learnings │
                     └─────────┬────────┘
                               │
                               ▼
                     ┌──────────────────┐
                     │   Execute task   │
                     │  (one at a time) │
                     └─────────┬────────┘
                               │
                               ▼
                     ┌──────────────────┐
                     │     Validate     │
                     │  tests/lint/type │
                     └─────────┬────────┘
                               │
               ┌───────────────┴───────────────┐
               │                               │
               ▼                               ▼
      ┌────────────────┐              ┌────────────────┐
      │      PASS      │              │      FAIL      │
      │    Mark [x]    │              │   Leave [ ]    │
      │ Commit changes │              │ Log learnings  │
      │  Log success   │              │ Next iteration │
      └────────┬───────┘              └────────┬───────┘
               │                               │
               └───────────────┬───────────────┘
                               │
                               ▼
                     ┌──────────────────┐
                     │  All tasks [x]?  │
                     └─────────┬────────┘
                               │
               ┌───────────────┴───────────────┐
               │                               │
               ▼                               ▼
      ┌────────────────┐              ┌────────────────┐
      │      YES       │              │       NO       │
      │    COMPLETE    │              │ Next iteration │
      └────────────────┘              └────────────────┘
Each iteration spawns a new claude process, reads the current state from files, and works with fresh context. Failed attempts are logged to WORKLOG.md, so the next iteration can learn from mistakes without carrying the baggage of accumulated conversation history.
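The "find first [ ]" and "all tasks [x]?" steps can be approximated with a couple of small shell helpers. This is only a sketch: it assumes tasks are headed by `### TASK-` lines and tracked with `- [ ]` / `- [x]` checkboxes, and the function names `first_open_task` and `all_tasks_done` are made up for illustration rather than taken from the ralph source.

```shell
# Hypothetical helpers, not part of ralph itself.

# Print the heading of the first task that still has an unchecked "- [ ]" box.
first_open_task() {
    awk '
        /^### TASK-/ { task = $0 }
        /^- \[ ]/    { if (task != "") { print task; exit } }
    ' "$1"
}

# Succeed (exit status 0) only when the brief has no unchecked boxes left.
all_tasks_done() {
    ! grep -q '^- \[ ]' "$1"
}
```

Something like `first_open_task BRIEF.md` would print the next task to work on, and `all_tasks_done BRIEF.md` could gate the loop's exit.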
Key Features#
Fresh Context Per Iteration#
The core loop spawns a new process each time:
# Simplified from the actual implementation
while true; do
    timeout "$TIMEOUT" claude --print "$PROMPT" | tee "$OUTPUT_FILE"
    if grep -q "<promise>COMPLETE</promise>" "$OUTPUT_FILE"; then
        break # All tasks done
    fi
done
Each claude invocation starts fresh. No accumulated context, no degradation over time.
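To make the shape of such a loop easier to experiment with, here is a minimal sketch that adds an iteration cap and takes the command to run as arguments, so it can be exercised with a stub instead of a real Claude call. The name `run_loop` and its argument order are assumptions for illustration; only the `<promise>COMPLETE</promise>` marker comes from the snippet above.

```shell
# Sketch only: run_loop and its parameters are illustrative, not ralph's own code.
run_loop() {
    local max_iterations=$1 output_file=$2 i=1
    shift 2    # remaining arguments: the command to run on every iteration
    while [ "$i" -le "$max_iterations" ]; do
        # Each pass starts a brand-new process; only files on disk carry over.
        "$@" > "$output_file" 2>&1 || true
        if grep -q '<promise>COMPLETE</promise>' "$output_file"; then
            echo "complete after $i iteration(s)"
            return 0
        fi
        i=$((i + 1))
    done
    echo "stopped after $max_iterations iteration(s) without completion"
    return 1
}
```

A call might look like `run_loop 10 /tmp/ralph-output.txt claude --print "$PROMPT"`. Returning 1 when the cap is reached mirrors the "max iterations reached" exit code listed later in this post.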
Learning Loop Architecture#
The WORKLOG.md file accumulates learnings without context bloat:
# Work Log
## Learnings
- Tests use BATS framework: run with `bats test/`
- Config is loaded from ~/.myapprc
- The API uses snake_case for all parameters
---
## Iteration 1 - TASK-001: Create config module
- What was implemented: Config loading with defaults
- Files changed: src/config.js, test/config.test.js
- Learnings for future iterations:
  - Use dotenv for env vars
  - Config validates on load
---
## Iteration 2 - TASK-001: Create config module (retry)
- Previous attempt failed: Missing required ENV var
- What was fixed: Added fallback defaults
- Files changed: src/config.js
---
The key insight: Claude reads this file at the start of each iteration, so it has access to all learnings from previous iterations - but only the distilled learnings, not the full conversation history.
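For a sense of how little machinery this needs, the following sketch appends one entry in roughly the format shown above. In practice Claude writes these entries itself during an iteration; `log_iteration` is a hypothetical helper, useful mainly for a wrapper script that wants to record something on Claude's behalf.

```shell
# Hypothetical helper: append one iteration entry to the work log.
log_iteration() {
    local worklog=$1 iteration=$2 task=$3 note=$4
    # Create the file with its header the first time it is used.
    if [ ! -f "$worklog" ]; then
        printf '%s\n' '# Work Log' '## Learnings' '---' > "$worklog"
    fi
    {
        printf '%s\n' "## Iteration $iteration - $task"
        printf '%s\n' "- $note"
        printf '%s\n' '---'
    } >> "$worklog"
}
```

For example, `log_iteration WORKLOG.md 2 "TASK-001: Create config module (retry)" "Added fallback defaults"` adds a three-line entry and leaves everything already in the file untouched.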
The /brief Skill#
Writing good BRIEF.md files is surprisingly tricky and time-consuming. Tasks need to be:
- Small enough to fit in one iteration’s context window
- Sequenced correctly (data layer before UI)
- Verifiable with objective acceptance criteria
The /brief skill automates this through Claude’s AskUserQuestion tool:
claude
# Then type: /brief I want my application to do xyz
The skill guides you through interactive requirements gathering:
- Existing file detection: Checks if BRIEF.md exists and asks what to do
- Task scoping: Ensures each task fits within a single iteration - oversized tasks are decomposed
- Dependency ordering: Sequences tasks correctly (data layer → logic → UI)
- Verifiable criteria: Every acceptance criterion is objectively checkable
- Required validations: Automatically adds “Testing passes” and “Linting passes” to each task
Flexible Controls#
# Run with defaults (10 iterations, 3s sleep between each)
ralph
# Run up to 20 iterations
ralph -n 20
# Run unlimited iterations until all tasks complete
ralph -u
# Run with no pause between iterations (fastest)
ralph -S
# Reset worklog and start fresh
ralph -r
# Keep existing worklog and continue
ralph -k
# Preview what would happen without running
ralph -d
# Combine flags: unlimited iterations, no sleep, verbose
ralph -u -S -v
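Short flags like these map naturally onto `getopts`. The sketch below is one way such parsing could look, not ralph's actual parser: the variable names are invented, only the short flags shown above are covered, and the defaults (10 iterations, 3 second sleep) are taken from the first example.

```shell
# Sketch of short-flag parsing; variable names are illustrative assumptions.
parse_flags() {
    MAX_ITERATIONS=10    # default iteration cap
    SLEEP_SECONDS=3      # default pause between iterations
    UNLIMITED=0 DRY_RUN=0 VERBOSE=0 WORKLOG_MODE=ask
    local opt
    OPTIND=1             # reset so the function can be called more than once
    while getopts 'n:uSrkdv' opt; do
        case "$opt" in
            n) MAX_ITERATIONS=$OPTARG ;;
            u) UNLIMITED=1 ;;
            S) SLEEP_SECONDS=0 ;;
            r) WORKLOG_MODE=reset ;;
            k) WORKLOG_MODE=keep ;;
            d) DRY_RUN=1 ;;
            v) VERBOSE=1 ;;
            *) return 2 ;;   # unknown flag or missing argument
        esac
    done
}
```

Calling `parse_flags "$@"` at the top of a script leaves the settings in plain variables that the rest of the script can read.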
Edge Cases and Solutions#
This is where the real work went - handling all the things that go wrong in practice:
File Cleanup and Creation#
Ralph handles BRIEF.md and WORKLOG.md intelligently:
- Checks if BRIEF.md exists before starting (errors if missing)
- Prompts about existing WORKLOG.md (reset or continue?)
- The --reset and --keep flags skip the prompt for scripted use
- The --cleanup flag removes both files after completion
Gitignore Management#
You probably don’t want to commit BRIEF.md and WORKLOG.md to your repo. Ralph auto-detects missing gitignore entries:
⚠️ Workflow files not in .gitignore
BRIEF.md and WORKLOG.md should probably be gitignored.
Add them now? (y/n)
The --add-gitignore flag auto-adds them, and --skip-gitignore skips the check entirely.
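The underlying check is simple enough to sketch. The helper below, with the made-up name `ensure_gitignored`, appends any entry that is not already present as an exact line, and is safe to run repeatedly. It illustrates the idea and is not ralph's own code.

```shell
# Hypothetical helper: add workflow files to .gitignore if they are not listed yet.
ensure_gitignored() {
    local gitignore=$1 entry
    shift
    touch "$gitignore"
    # If the file does not end with a newline, add one so entries are not glued together.
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
        echo >> "$gitignore"
    fi
    for entry in "$@"; do
        # -x matches whole lines only, -F treats the name as a literal string.
        if ! grep -qxF -- "$entry" "$gitignore"; then
            printf '%s\n' "$entry" >> "$gitignore"
        fi
    done
}
```

For example, `ensure_gitignored .gitignore BRIEF.md WORKLOG.md` adds only the names that are missing.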
Stuck Iteration Timeouts#
Sometimes Claude gets stuck. Each iteration has a configurable timeout (default 10 minutes):
ralph --timeout 3600 # 1 hour timeout per iteration
Timeouts are logged to WORKLOG.md so the next iteration knows something went wrong and can try a different approach.
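A rough sketch of the mechanism, assuming GNU `timeout` semantics (exit status 124 when the limit is hit) and the `gtimeout` name that Homebrew's coreutils uses on macOS. The function name `run_with_timeout` and the wording of the log line are assumptions for illustration.

```shell
# Sketch, assuming GNU timeout: exit status 124 means the time limit was reached.
run_with_timeout() {
    local seconds=$1 worklog=$2 timeout_cmd status=0
    shift 2
    # Homebrew's coreutils on macOS installs the command as gtimeout.
    if command -v timeout >/dev/null 2>&1; then
        timeout_cmd=timeout
    else
        timeout_cmd=gtimeout
    fi
    "$timeout_cmd" "$seconds" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # Leave a note so the next fresh iteration knows this attempt stalled.
        printf '%s\n' "- Iteration timed out after ${seconds}s; try a different approach" >> "$worklog"
    fi
    return "$status"
}
```

A call such as `run_with_timeout 600 WORKLOG.md claude --print "$PROMPT"` would match the 10 minute default mentioned above.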
Debugging with Prompt Output#
When things consistently fail, you need to see exactly what Claude is doing:
ralph -P # or --prompt
This outputs the exact prompt being sent to Claude, which you can paste into an interactive session to debug why iterations are failing.
Signal Handling#
Clean Ctrl+C interruption with proper exit codes:
- Exit 0: All tasks completed successfully
- Exit 1: Error or max iterations reached
- Exit 130: Interrupted by user (Ctrl+C)
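The 130 follows the usual shell convention of 128 plus the signal number (SIGINT is 2). A minimal sketch of wiring that up with `trap` is shown here; the function names are hypothetical, and a real handler would probably also clean up temporary files before exiting.

```shell
# Sketch: exit with 128 + 2 (SIGINT) = 130 when the user presses Ctrl+C.
# on_interrupt and install_interrupt_trap are hypothetical names.
on_interrupt() {
    echo "Interrupted by user" >&2
    exit 130
}

install_interrupt_trap() {
    trap on_interrupt INT
}
```

Calling `install_interrupt_trap` once before the loop starts is enough. The shell runs the handler as soon as the command it is currently waiting on returns.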
The BRIEF.md Format#
Here’s what a well-structured brief looks like:
# Brief: Comment Threading System
## Introduction
Enable nested replies on comments so users can have focused discussions.
Comments can have replies up to 3 levels deep, with collapse/expand controls
and visual indentation.
## Objectives
- Support threaded replies on any comment
- Limit nesting to 3 levels to maintain readability
- Allow collapsing/expanding reply threads
- Show reply count on collapsed threads
## Tasks
### TASK-001: Add parent reference to comments table
**Description:** As a developer, I need to track comment relationships
so replies link to their parent.
**Acceptance Criteria:**
- [ ] Add nullable `parent_id` foreign key column
- [ ] Add index on `parent_id` for query performance
- [ ] Migration runs without errors
- [ ] Testing passes
- [ ] Linting passes
### TASK-002: Create reply submission endpoint
**Description:** As a user, I need to submit a reply to an existing comment.
**Acceptance Criteria:**
- [ ] POST endpoint accepts `parent_id` and `content`
- [ ] Validates parent exists and nesting depth <= 3
- [ ] Returns 422 if max depth exceeded
- [ ] Testing passes
- [ ] Linting passes
### TASK-003: Render nested comment tree
**Description:** As a user, I want to see replies indented beneath
their parent comment.
**Acceptance Criteria:**
- [ ] Replies render with increasing left margin per level
- [ ] Maximum 3 indentation levels displayed
- [ ] Reply count badge shows on comments with replies
- [ ] Testing passes
- [ ] Linting passes
- [ ] Verify changes work in browser
## Out of Scope
- No @mentions or notifications for replies
- No editing or deleting replies after posting
- No pagination within threads
## Implementation Notes
- Leverage existing Comment component, add depth prop
- Use recursive rendering for nested structure
Key principles:
- Each task is small enough for one iteration
- Tasks are sequenced by dependencies (schema → API → UI)
- Acceptance criteria are objectively verifiable
- Every task includes “Testing passes” and “Linting passes”
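The last principle is easy to check mechanically. As a sketch, the hypothetical `check_brief` helper below prints every `### TASK-` section that lacks a "Testing passes" or "Linting passes" checkbox and returns a non-zero status if it finds any. It assumes the heading and checkbox layout of the example brief above.

```shell
# Hypothetical lint for a brief: list tasks missing a required criterion.
check_brief() {
    awk '
        # Report the task collected so far if either required box is absent.
        function report() {
            if (task != "" && (!testing || !linting)) { print task; bad = 1 }
        }
        /^### TASK-/                 { report(); task = $0; testing = 0; linting = 0 }
        /^## /                       { report(); task = "" }
        /^- \[[ x]\] Testing passes/ { testing = 1 }
        /^- \[[ x]\] Linting passes/ { linting = 1 }
        END                          { report(); exit bad + 0 }
    ' "$1"
}
```

Running `check_brief BRIEF.md` before starting a long unattended run is a cheap way to catch a task that would otherwise never be validated.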
Installation#
Prerequisites#
- Bash 4.0+: Ralph uses modern bash features
  - macOS: Install via Homebrew (brew install bash) - the system bash is v3.x
  - Linux: Usually already installed
- Claude Code CLI: The claude command must be in your PATH
  - Install from: https://docs.anthropic.com/en/docs/claude-code
- timeout command: Used for iteration timeouts
  - macOS: Install via Homebrew (brew install coreutils) - provides gtimeout
  - Linux: Usually already installed as timeout
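A quick pre-flight check for these prerequisites might look like the sketch below. Both function names are made up for illustration; `BASH_VERSINFO` is a variable that bash sets itself, so the version test simply fails under other shells.

```shell
# Hypothetical pre-flight check: print each required command that is not on PATH.
missing_commands() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Succeed only on Bash 4.0 or newer (BASH_VERSINFO holds the major version first).
bash_is_supported() {
    [ "${BASH_VERSINFO:-0}" -ge 4 ]
}
```

On Linux you might run `missing_commands claude timeout`, and on macOS `missing_commands claude gtimeout`. Any name printed is something still to install.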
Installation Steps#
# Clone the repository
git clone https://github.com/mmenanno/ralph-claude-code.git
cd ralph-claude-code
# Make ralph executable
chmod +x ralph
# Symlink to PATH
ln -s "$(pwd)/ralph" /usr/local/bin/ralph
# Install the brief skill (optional but recommended)
mkdir -p ~/.claude/skills
ln -s "$(pwd)/skill/brief" ~/.claude/skills/brief
# Verify installation
ralph --help
Real-World Usage#
My typical workflow:
# Navigate to project
cd my-project
# Create a brief using the skill
claude
# > /brief My application needs feature xyz added
# > to it with these requirements...
# Run ralph
ralph -u # unlimited iterations until complete
# When done, review the results then clean up
ralph --cleanup
When to Use Ralph vs. Normal Sessions#
Use Ralph for:
- Multi-step features with clear acceptance criteria
- Tasks that benefit from fresh context between steps
- Long-running work that would hit context limits
- Projects with good test coverage (Ralph needs validation)
- Prototyping where you have an idea and a direction and just want to see how far Claude can get on its own
Use normal Claude Code for:
- Exploratory work where you’re still figuring things out
- Quick fixes and single-task work
- Interactive debugging and pairing
Lessons Learned#
Understanding Original Intent Matters#
The Anthropic plugin tried to be clever by using Stop hooks to keep everything in one session. This breaks the core insight of Ralph: fresh context prevents degradation. Sometimes the simple approach (spawn new processes) is better than the clever approach (hooks and session management).
Fresh Context Is a Feature, Not a Bug#
It’s tempting to think accumulated context would help - Claude would “remember” what it already tried. In practice, the opposite is true. Fresh context with file-based learnings produces better results than accumulated context with conversation memory.
Try It Yourself#
Ralph is MIT licensed and available on GitHub:
github.com/mmenanno/ralph-claude-code
Contributions welcome! The repository includes:
- Complete implementation with 130+ tests
- The /brief skill for creating well-structured briefs
- Comprehensive documentation
- CONTRIBUTING.md with development guidelines
If you’ve been frustrated by context limits in long Claude Code sessions, or want to automate multi-step development workflows, give Ralph a try. The fresh-context-per-iteration approach makes a real difference.