Version analyzed: March 31, 2026 leak
This page traces the exact path from when a user types a message to when they see the final output.

Overview

Claude Code’s control flow is a sophisticated pipeline that transforms user input into agentic actions. The system operates in a continuous loop, executing tools, gathering results, and deciding whether to continue or stop.

Complete Flow Diagram

Phase-by-Phase Breakdown

Phase 1: User Input Processing

Location: src/utils/processUserInput/processUserInput.ts

When the user presses Enter:
  1. Raw Input Capture
    • Text content
    • Pasted images/files
    • Cursor position
  2. Slash Command Detection
    Illustrative pseudocode (not verbatim source).
    if (input.startsWith('/')) {
      // Parse command name and arguments
      // Queue command for execution
    }
    
  3. Context Key Expansion
    • #File → Read file content
    • #Folder → List directory
    • #Problems → Get diagnostics
    • #Terminal → Capture terminal output
    • #Git → Show git diff
  4. Attachment Processing
    • Images → Base64 encode, resize if needed
    • PDFs → Extract text, handle page ranges
    • Files → Read content with line numbers
  5. Memory Attachment
    • Auto-attach relevant memory files
    • Filter duplicates
    • Add freshness notes
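The context-key expansion in step 3 can be sketched as a lookup table of expanders applied over the raw input. Everything here (the `expanders` map, `expandContextKeys`, the placeholder output strings) is illustrative, not taken from the actual source:

```typescript
// Hypothetical sketch of context-key expansion; names and output
// formats are assumptions, not the real Claude Code implementation.
type Expander = (arg: string) => string;

const expanders: Record<string, Expander> = {
  // Each key maps to a function that fetches the referenced context.
  File: (path) => `<file path="${path}">…content…</file>`,
  Folder: (path) => `<folder path="${path}">…listing…</folder>`,
  Problems: () => "<diagnostics>…</diagnostics>",
};

// Replace "#Key" or "#Key:arg" tokens in the input with expanded context.
function expandContextKeys(input: string): string {
  return input.replace(/#(\w+)(?::(\S+))?/g, (match, key, arg) => {
    const expand = expanders[key];
    return expand ? expand(arg ?? "") : match; // unknown keys pass through
  });
}
```

Unknown keys passing through unchanged matters: a literal `#hashtag` in user text must not be swallowed by the expansion step.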

Phase 2: Query Engine Initialization

Location: src/QueryEngine.ts

The QueryEngine orchestrates the entire agentic loop:
  1. Load Session State
    • Previous messages
    • Tool permission context
    • File state cache
    • Cost tracking
  2. Prepare Tools
    • Filter by permission mode
    • Apply deny rules
    • Inject MCP tools
    • Build tool schemas
  3. Initialize Tracking
    • Token budget
    • Turn counter
    • Cost accumulator
    • Abort controller
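The three initialization steps above can be condensed into one sketch. The shapes (`SessionState`, `QueryContext`, `initQueryContext`) are assumptions for illustration, not verbatim from src/QueryEngine.ts:

```typescript
// Minimal sketch of query-engine initialization under assumed types.
interface SessionState {
  messages: unknown[];
  costUSD: number;
}

interface QueryContext {
  state: SessionState;
  tools: string[];
  turn: number;
  abort: AbortController;
}

function initQueryContext(
  state: SessionState,
  allTools: string[],
  denyRules: Set<string>,
): QueryContext {
  // Prepare tools: drop anything matched by a deny rule.
  const tools = allTools.filter((t) => !denyRules.has(t));
  // Initialize tracking: the turn counter starts at zero and the
  // AbortController lets the user cancel the whole agentic loop.
  return { state, tools, turn: 0, abort: new AbortController() };
}
```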

Phase 3: Prompt Assembly

Location: src/constants/prompts.ts, src/utils/queryContext.ts

The system prompt is built in layers:
Illustrative pseudocode (not verbatim source).
// 1. Static prefix (cacheable)
const prefix = "You are Claude Code, Anthropic's official CLI for Claude."

// 2. Tool instructions
const toolGuidance = buildToolInstructions(tools)

// 3. User context (dynamic)
const userContext = {
  cwd: "/path/to/project",
  git_branch: "main",
  git_status: "modified: 2 files",
  os: "macOS 14.0",
  shell: "zsh"
}

// 4. System context
const systemContext = {
  date: "2026-03-31",
  capabilities: ["file_read", "bash", "web_search"]
}

// 5. Memory
const memory = loadMemoryPrompt() // MEMORY.md content

// 6. MCP instructions
const mcpInstructions = buildMcpInstructions(mcpClients)

// Final assembly
const systemPrompt = [
  prefix,
  ...toolGuidance,
  DYNAMIC_BOUNDARY, // Cache split point
  userContext,
  systemContext,
  memory,
  mcpInstructions
]
Cache Optimization:
  • Everything before DYNAMIC_BOUNDARY uses scope: 'global'
  • Dynamic content after boundary uses scope: 'session'
  • Reduces API costs significantly
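One plausible way the boundary maps onto the wire format: the `cache_control` field below follows Anthropic's public prompt-caching API (a breakpoint marker on the last stable block), while the `global`/`session` scope values are this document's terminology. The builder itself is a hedged sketch, not the actual source:

```typescript
// Sketch: everything up to and including the last static block is marked
// as a cache breakpoint; dynamic per-session content follows it.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

function buildSystemBlocks(staticParts: string[], dynamicParts: string[]): SystemBlock[] {
  const blocks: SystemBlock[] = staticParts.map((text) => ({ type: "text", text }));
  if (blocks.length > 0) {
    // The marker on the final static block tells the API to cache the
    // whole prefix up to this point.
    blocks[blocks.length - 1].cache_control = { type: "ephemeral" };
  }
  // Dynamic content after the boundary is never part of the cached prefix.
  for (const text of dynamicParts) blocks.push({ type: "text", text });
  return blocks;
}
```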

Phase 4: API Communication

Location: src/query.ts, src/services/api/claude.ts

The query function handles API communication:
  1. Message Normalization
    Illustrative pseudocode (not verbatim source).
    // Convert internal format to API format
    const apiMessages = normalizeMessagesForAPI(messages)
    
  2. Token Budget Check
    Illustrative pseudocode (not verbatim source).
    if (estimatedTokens > contextWindow) {
      // Trigger auto-compact
      messages = await compactMessages(messages)
    }
    
  3. Stream Request
    Illustrative pseudocode (not verbatim source).
    const stream = await anthropic.messages.stream({
      model: mainLoopModel,
      max_tokens: 8192,
      system: systemPrompt,
      messages: apiMessages,
      tools: toolSchemas,
      stream: true
    })
    
  4. Handle Stream Events
    • message_start → Initialize response
    • content_block_start → New content block
    • content_block_delta → Incremental text/thinking
    • content_block_stop → Block complete
    • message_stop → Response complete
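The event handling in step 4 can be sketched as a reducer over the stream. The event shapes below are simplified stand-ins for the SDK's full types, kept just large enough to show the accumulation logic:

```typescript
// Simplified stream events; real SDK events carry more fields.
type StreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number }
  | { type: "content_block_delta"; index: number; text: string }
  | { type: "content_block_stop"; index: number }
  | { type: "message_stop" };

// Fold a stream of events into the final per-block text.
function reduceStream(events: StreamEvent[]): string[] {
  const blocks: string[] = [];
  for (const ev of events) {
    switch (ev.type) {
      case "content_block_start":
        blocks[ev.index] = ""; // new, empty content block
        break;
      case "content_block_delta":
        blocks[ev.index] += ev.text; // append incremental text
        break;
      // message_start / content_block_stop / message_stop carry no text here.
      default:
        break;
    }
  }
  return blocks;
}
```

In the real client the deltas are also rendered to the terminal as they arrive; this sketch only shows the accumulation side.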

Phase 5: The Agentic Loop

Location: src/query.ts, src/services/tools/toolOrchestration.ts

The core loop that makes Claude Code “agentic”:
Illustrative pseudocode (not verbatim source).
while (true) {
  // 1. Get response from API
  const response = await streamResponse()
  
  // 2. Check for tool uses
  const toolUses = extractToolUses(response)
  
  if (toolUses.length === 0) {
    // No tools → check stop reason
    if (stopReason === 'max_tokens' && tokenBudgetRemaining > 0) {
      // Output was truncated → retry with a higher limit
      continue
    }
    // end_turn / stop_sequence → natural completion
    break
  }
  
  // 3. Execute tools (parallel when possible)
  const toolResults = await executeTools(toolUses, {
    parallel: true,
    permissionCheck: true
  })
  
  // 4. Add results to conversation
  messages.push(...toolResults)
  
  // 5. Check budget
  if (exceedsTokenBudget() || exceedsTurnLimit()) {
    break
  }
  
  // 6. Continue loop with tool results
}

Phase 6: Tool Execution

Location: src/services/tools/toolOrchestration.ts, individual tool files

Tool execution follows a strict protocol:
  1. Permission Check
    Illustrative pseudocode (not verbatim source).
    const decision = await checkPermission(toolName, input, context)
    
    if (decision.type === 'deny') {
      return { error: decision.reason }
    }
    
    if (decision.type === 'prompt') {
      const approved = await promptUser(decision.message)
      if (!approved) {
        return { error: 'User denied' }
      }
    }
    
  2. Input Validation
    Illustrative pseudocode (not verbatim source).
    const validated = toolSchema.parse(input)
    
  3. Execution
    Illustrative pseudocode (not verbatim source).
    const result = await tool.execute(validated, context)
    
  4. Result Formatting
    Illustrative pseudocode (not verbatim source).
    return {
      type: 'tool_result',
      tool_use_id: toolUse.id,
      content: formatResult(result)
    }
    
  5. Progress Reporting
    Illustrative pseudocode (not verbatim source).
    // For long-running tools
    onProgress?.({ 
      status: 'running',
      message: 'Processing...'
    })
    
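Chained together, steps 1 through 4 might look like the following. The `Decision` and `Tool` shapes are assumptions for illustration, not the actual toolOrchestration types; note that a permission denial becomes an error result fed back to the model, never an exception:

```typescript
// Hypothetical end-to-end tool pipeline under assumed types.
type Decision = { type: "allow" } | { type: "deny"; reason: string };
type Tool = {
  validate: (input: unknown) => unknown; // schema check (step 2)
  execute: (input: unknown) => Promise<string>; // the tool itself (step 3)
};

async function runTool(
  tool: Tool,
  toolUseId: string,
  input: unknown,
  checkPermission: (input: unknown) => Decision,
): Promise<{ type: "tool_result"; tool_use_id: string; content: string }> {
  // Step 1: permission check — denials are returned as results.
  const decision = checkPermission(input);
  if (decision.type === "deny") {
    return { type: "tool_result", tool_use_id: toolUseId, content: `Error: ${decision.reason}` };
  }
  // Steps 2–4: validate, execute, and format the result.
  const validated = tool.validate(input);
  const result = await tool.execute(validated);
  return { type: "tool_result", tool_use_id: toolUseId, content: result };
}
```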

Phase 7: Stop Decision

Location: src/query.ts

The system decides when to stop based on:
  1. Stop Reasons from API
    • end_turn → Natural completion
    • max_tokens → Output limit reached (retry with higher limit)
    • stop_sequence → Explicit stop marker
    • tool_use → Waiting for tool results
  2. Budget Checks
    • Token budget exhausted
    • Turn limit reached
    • Cost limit exceeded
    • Time limit exceeded
  3. User Interruption
    • Ctrl+C pressed
    • Stop command issued
  4. Error Conditions
    • API error (non-retryable)
    • Tool execution failure (critical)
    • Permission denial (blocking)
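These signals can be folded into a single predicate. A sketch with hypothetical names, following the rules above (interruption and fatal errors stop immediately, exhausted budgets stop, `tool_use` and `max_tokens` keep the loop alive):

```typescript
// Illustrative stop predicate; field names are assumptions.
interface LoopStatus {
  stopReason: "end_turn" | "max_tokens" | "stop_sequence" | "tool_use";
  tokensRemaining: number;
  turnsRemaining: number;
  userInterrupted: boolean;
  fatalError: boolean;
}

function shouldStop(s: LoopStatus): boolean {
  if (s.userInterrupted || s.fatalError) return true; // Ctrl+C or hard error
  if (s.tokensRemaining <= 0 || s.turnsRemaining <= 0) return true; // budget
  if (s.stopReason === "tool_use") return false; // waiting on tool results
  // max_tokens is retried with a higher limit, so it does not stop the loop;
  // end_turn and stop_sequence are natural completion.
  return s.stopReason !== "max_tokens";
}
```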

Phase 8: State Persistence

Location: src/utils/sessionStorage.ts

After each turn:
  1. Record Transcript
    {"type":"user","content":"...","timestamp":1234567890}
    {"type":"assistant","content":"...","timestamp":1234567891}
    {"type":"tool_use","name":"bash","input":{...},"timestamp":1234567892}
    {"type":"tool_result","content":"...","timestamp":1234567893}
    
  2. Update State
    • Cost tracking
    • Token usage
    • File state cache
    • Permission context
  3. Flush to Disk
    await flushSessionStorage()
    
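The transcript format above is JSON Lines: one self-contained JSON object per line, which makes the file cheap to append to and easy to replay. A minimal append sketch, with an in-memory buffer standing in for the real file writer:

```typescript
// Transcript entry shape inferred from the records shown above;
// extra fields (content, name, input, …) ride along untyped.
type TranscriptEntry = {
  type: "user" | "assistant" | "tool_use" | "tool_result";
  timestamp: number;
  [key: string]: unknown;
};

function appendJsonl(buffer: string, entry: TranscriptEntry): string {
  // One JSON object per line; the trailing newline keeps the file appendable.
  return buffer + JSON.stringify(entry) + "\n";
}
```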

Special Cases

Thinking Mode

When extended thinking is enabled, thinking blocks are:
  • Displayed in real-time
  • Preserved in conversation
  • Token accounting behavior depends on provider/model/runtime handling
  • Can be redacted in some modes

REPL Mode

When REPL mode is enabled, the primitive tools (Read, Write, Edit, Bash) are wrapped in a single tool call.

Auto-Compact

When the conversation approaches the context window limit, older messages are automatically compacted (the compactMessages step shown in Phase 4) so the session can continue without losing recent context.
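One common shape for such compaction: summarize everything except the most recent messages into a single synthetic message. The sketch below assumes a `summarize` helper (a stand-in for a model call) and a `keepRecent` count; both are hypothetical, not the actual implementation:

```typescript
// Hypothetical compaction: collapse older history into one summary message.
type Msg = { role: "user" | "assistant"; content: string };

function compactMessages(
  messages: Msg[],
  keepRecent: number,
  summarize: (older: Msg[]) => string,
): Msg[] {
  if (messages.length <= keepRecent) return messages; // nothing to compact
  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(messages.length - keepRecent);
  const summary: Msg = {
    role: "user",
    content: `[Summary of earlier conversation] ${summarize(older)}`,
  };
  // Recent messages survive verbatim; only the older span is lossy.
  return [summary, ...recent];
}
```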

Performance Characteristics

Latency Breakdown

Illustrative latency ranges for reasoning about bottlenecks (not a benchmark from instrumented source measurements):
| Phase | Time | Notes |
| --- | --- | --- |
| Input Processing | 10-50ms | Depends on attachments |
| Prompt Assembly | 50-100ms | Cached after first turn |
| API First Token | 200-500ms | Network + model startup |
| Streaming | Variable | ~50 tokens/sec |
| Tool Execution | Variable | Depends on tool |
| State Persistence | 10-30ms | Async, non-blocking |

Optimization Strategies

  1. Parallel Tool Execution
    • Independent tools run concurrently
    • Reduces total turn time
  2. Prompt Caching
    • Static prefix cached globally
    • Session context cached per-session
    • Saves ~80% of prompt tokens
  3. Streaming Display
    • Text rendered as it arrives
    • User sees progress immediately
    • Perceived latency reduced
  4. Lazy Loading
    • Heavy modules loaded on demand
    • Faster startup time
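Strategy 1 amounts to starting every independent tool call before awaiting any of them, so a turn costs roughly the slowest tool rather than the sum of all of them. A minimal sketch (the helper name is illustrative):

```typescript
// Run independent async tasks concurrently; results keep input order.
async function executeParallel<T>(
  tasks: Array<() => Promise<T>>,
): Promise<T[]> {
  // Every task is started by map() before Promise.all awaits them.
  return Promise.all(tasks.map((run) => run()));
}
```

Dependent tools (e.g. an Edit that relies on a prior Read) still have to run sequentially; only independent calls qualify.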

Error Handling

Retry Logic

Illustrative pseudocode (not verbatim source).
const retryConfig = {
  maxRetries: 3,
  backoff: 'exponential',
  retryableErrors: [
    'overloaded_error',
    'timeout',
    'rate_limit_error'
  ]
}
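A retry loop consistent with this config: exponential backoff, retrying only the listed error codes. The `sleep` helper and the `err.code` error shape are assumptions, not the actual client's internals:

```typescript
// Codes taken from the retry config above.
const RETRYABLE = new Set(["overloaded_error", "timeout", "rate_limit_error"]);

const sleep = (ms: number) => new Promise<void>((res) => setTimeout(res, ms));

async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      const retryable = RETRYABLE.has(err?.code);
      // Non-retryable errors and exhausted budgets propagate to the caller.
      if (!retryable || attempt >= maxRetries) throw err;
      // Exponential backoff: base, 2×base, 4×base, …
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```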

Fallback Models

Illustrative pseudocode (not verbatim source).
if (primaryModelFails) {
  // Try fallback model
  const fallbackResponse = await query({
    ...params,
    model: fallbackModel
  })
}

Graceful Degradation

  • Tool failures don’t crash the session
  • Permission denials return error messages
  • API errors show user-friendly messages
  • State always persists, even on crash

Related Pages

  • Architecture Overview: High-level system architecture
  • Agent Loop: Deep dive into the agentic loop
  • Prompt Assembly: How system prompts are constructed
  • Tools Overview: Understanding tool execution