Version analyzed: March 31, 2026 leak
This page traces the exact path from when a user types a message to when they see the final output.
Overview
Claude Code’s control flow is a pipeline that turns user input into agentic actions. The system runs in a continuous loop: it executes tools, gathers results, and decides whether to continue or stop.
Complete Flow Diagram
Phase-by-Phase Breakdown
Phase 1: User Input Processing
Location: src/utils/processUserInput/processUserInput.ts
When the user presses Enter:
- Raw Input Capture
  - Text content
  - Pasted images/files
  - Cursor position
- Slash Command Detection
  Illustrative pseudocode (not verbatim source).
- Context Key Expansion
  - #File → Read file content
  - #Folder → List directory
  - #Problems → Get diagnostics
  - #Terminal → Capture terminal output
  - #Git → Show git diff
- Attachment Processing
  - Images → Base64 encode, resize if needed
  - PDFs → Extract text, handle page ranges
  - Files → Read content with line numbers
- Memory Attachment
  - Auto-attach relevant memory files
  - Filter duplicates
  - Add freshness notes
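The slash command detection and context key expansion steps above can be sketched as a single parsing function. This is a minimal sketch under stated assumptions: `parseUserInput`, `ParsedInput`, and the regular expressions are hypothetical and are not taken from `processUserInput.ts`; only the five context key names come from the list above.

```typescript
// Hypothetical names throughout; this is not the real processUserInput API.
type ContextKey = "File" | "Folder" | "Problems" | "Terminal" | "Git";

type ParsedInput =
  | { kind: "command"; command: string; args: string }
  | { kind: "prompt"; text: string; contextKeys: ContextKey[] };

const CONTEXT_KEYS: ContextKey[] = ["File", "Folder", "Problems", "Terminal", "Git"];

function parseUserInput(raw: string): ParsedInput {
  const trimmed = raw.trim();

  // Slash command: a leading "/" plus a command word, then optional arguments.
  const slash = /^\/([A-Za-z][\w-]*)(?:\s+([\s\S]*))?$/.exec(trimmed);
  if (slash) {
    return { kind: "command", command: slash[1], args: slash[2] ?? "" };
  }

  // Otherwise collect each known #Key once, in order of first appearance.
  const found: ContextKey[] = [];
  const pattern = /#([A-Za-z]+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(trimmed)) !== null) {
    const word = match[1];
    const key = CONTEXT_KEYS.find((candidate) => candidate === word);
    if (key !== undefined && found.indexOf(key) === -1) found.push(key);
  }
  return { kind: "prompt", text: trimmed, contextKeys: found };
}
```

Returning a discriminated union lets the caller branch on `kind` before any attachment or memory processing runs, so a slash command never pays for context expansion.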
Phase 2: Query Engine Initialization
Location: src/QueryEngine.ts
The QueryEngine orchestrates the entire agentic loop:
- Load Session State
  - Previous messages
  - Tool permission context
  - File state cache
  - Cost tracking
- Prepare Tools
  - Filter by permission mode
  - Apply deny rules
  - Inject MCP tools
  - Build tool schemas
- Initialize Tracking
  - Token budget
  - Turn counter
  - Cost accumulator
  - Abort controller
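The "Prepare Tools" step above can be sketched as a small filter pipeline that follows the listed order. Everything here is an assumption made for illustration: `ToolSpec`, `ToolSchema`, the two-value `SketchMode`, and exact-name deny matching are hypothetical and may differ from what `QueryEngine.ts` does.

```typescript
// All names here are hypothetical stand-ins, not the real QueryEngine types.
interface ToolSpec {
  name: string;
  description: string;
  readOnly: boolean; // true if the tool never mutates files or runs commands
}

interface ToolSchema {
  name: string;
  description: string;
}

type SketchMode = "readOnly" | "full"; // assumed modes, for illustration only

function prepareTools(
  builtIn: ToolSpec[],
  mcpTools: ToolSpec[],
  mode: SketchMode,
  deniedNames: string[],
): ToolSchema[] {
  const allowed = builtIn
    // Filter by permission mode: a read-only mode drops mutating tools.
    .filter((tool) => mode === "full" || tool.readOnly)
    // Apply deny rules: an exact-name match removes the tool.
    .filter((tool) => deniedNames.indexOf(tool.name) === -1);

  return allowed
    // Inject MCP tools after the built-in set has been narrowed.
    .concat(mcpTools)
    // Build the schema objects that would be sent to the model.
    .map((tool) => ({ name: tool.name, description: tool.description }));
}
```

Whether deny rules also apply to injected MCP tools is not stated above; this sketch simply mirrors the order of the list.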
Phase 3: Prompt Assembly
Location: src/constants/prompts.ts, src/utils/queryContext.ts
The system prompt is built in layers:
Illustrative pseudocode (not verbatim source).
Prompt caching is split at a boundary marker:
- Everything before DYNAMIC_BOUNDARY uses scope: 'global'
- Dynamic content after the boundary uses scope: 'session'
- This split reduces API costs significantly
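The boundary rule above can be sketched as a function that tags each prompt layer with a cache scope. This is a sketch under assumptions: `PromptBlock`, `scopeLayers`, and the placeholder marker value are hypothetical; only the `DYNAMIC_BOUNDARY` name and the `'global'` and `'session'` scopes come from the text.

```typescript
// Hypothetical shapes; the real prompt builder is not reproduced here.
type CacheScope = "global" | "session";

interface PromptBlock {
  text: string;
  scope: CacheScope;
}

// Placeholder marker value (assumed); only the constant's name is from the source.
const DYNAMIC_BOUNDARY = "__DYNAMIC_BOUNDARY__";

function scopeLayers(layers: string[]): PromptBlock[] {
  const boundary = layers.indexOf(DYNAMIC_BOUNDARY);
  // With no boundary present, treat every layer as static.
  const split = boundary === -1 ? layers.length : boundary;
  return layers
    .filter((layer) => layer !== DYNAMIC_BOUNDARY)
    .map((text, index): PromptBlock => ({
      text,
      scope: index < split ? "global" : "session",
    }));
}
```

For example, `scopeLayers(["identity", "tool rules", DYNAMIC_BOUNDARY, "cwd: /repo"])` marks the first two layers as `global` and the last as `session`, so only the session-specific tail changes between users.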
Phase 4: API Communication
Location: src/query.ts, src/services/api/claude.ts
The query function handles API communication:
- Message Normalization
  Illustrative pseudocode (not verbatim source).
- Token Budget Check
  Illustrative pseudocode (not verbatim source).
- Stream Request
  Illustrative pseudocode (not verbatim source).
- Handle Stream Events
  - message_start → Initialize response
  - content_block_start → New content block
  - content_block_delta → Incremental text/thinking
  - content_block_stop → Block complete
  - message_stop → Response complete
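The stream event handling above can be sketched as a fold over events that builds up the response. The event type names match the list; the payload shapes (`index`, `blockType`, `delta`) are simplified assumptions, and `assembleReply` is a hypothetical helper rather than anything in `query.ts`.

```typescript
// Simplified event shapes (assumed); real payloads carry more fields.
type ApiStreamEvent =
  | { type: "message_start" }
  | { type: "content_block_start"; index: number; blockType: "text" | "thinking" }
  | { type: "content_block_delta"; index: number; delta: string }
  | { type: "content_block_stop"; index: number }
  | { type: "message_stop" };

interface AssembledBlock {
  blockType: "text" | "thinking";
  text: string;
  done: boolean;
}

interface AssembledReply {
  blocks: AssembledBlock[];
  complete: boolean;
}

function assembleReply(events: ApiStreamEvent[]): AssembledReply {
  const reply: AssembledReply = { blocks: [], complete: false };
  for (const event of events) {
    switch (event.type) {
      case "message_start": // initialize the response
        reply.blocks = [];
        reply.complete = false;
        break;
      case "content_block_start": // open a new content block
        reply.blocks[event.index] = { blockType: event.blockType, text: "", done: false };
        break;
      case "content_block_delta": { // append incremental text or thinking
        const block = reply.blocks[event.index];
        if (block) block.text += event.delta; // ignore deltas for unknown blocks
        break;
      }
      case "content_block_stop": { // mark the block complete
        const block = reply.blocks[event.index];
        if (block) block.done = true;
        break;
      }
      case "message_stop": // the whole response is complete
        reply.complete = true;
        break;
    }
  }
  return reply;
}
```

A streaming UI would apply the same transitions one event at a time and re-render after each delta, rather than folding over a finished array.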
Phase 5: The Agentic Loop
Location: src/query.ts, src/services/tools/toolOrchestration.ts
The core loop that makes Claude Code “agentic”:
Illustrative pseudocode (not verbatim source).
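As a minimal sketch of the loop described in the overview (execute tools, gather results, decide whether to continue), the following function alternates between a model call and tool runs. It is synchronous and takes the model and tool runner as parameters to stay self-contained; the real loop is presumably asynchronous and streaming. `runAgentLoop`, `ModelTurn`, `ToolCall`, and `LoopMessage` are hypothetical names.

```typescript
// Hypothetical types; not the real message or tool-call shapes.
interface ToolCall {
  id: string;
  tool: string;
  input: string;
}

interface ModelTurn {
  text: string;
  toolCalls: ToolCall[];
}

type LoopMessage =
  | { role: "user" | "assistant"; text: string }
  | { role: "tool"; callId: string; text: string };

function runAgentLoop(
  prompt: string,
  callModel: (history: LoopMessage[]) => ModelTurn,
  runTool: (call: ToolCall) => string,
  maxTurns: number,
): LoopMessage[] {
  const history: LoopMessage[] = [{ role: "user", text: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = callModel(history);
    history.push({ role: "assistant", text: reply.text });
    // No tool calls means the model is done; stop the loop.
    if (reply.toolCalls.length === 0) break;
    // Otherwise run each requested tool and feed the results back in.
    for (const call of reply.toolCalls) {
      history.push({ role: "tool", callId: call.id, text: runTool(call) });
    }
  }
  return history;
}
```

The `maxTurns` bound is what keeps a model that always asks for another tool from looping forever.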
Phase 6: Tool Execution
Location: src/services/tools/toolOrchestration.ts, individual tool files
Tool execution follows a strict protocol:
- Permission Check
  Illustrative pseudocode (not verbatim source).
- Input Validation
  Illustrative pseudocode (not verbatim source).
- Execution
  Illustrative pseudocode (not verbatim source).
- Result Formatting
  Illustrative pseudocode (not verbatim source).
- Progress Reporting
  Illustrative pseudocode (not verbatim source).
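The five protocol steps above can be sketched as one wrapper that never throws and always returns a result for the model. The `SketchTool` interface, the `isAllowed` callback, and the result shape are assumptions made for illustration, not the contract used by `toolOrchestration.ts`.

```typescript
// Hypothetical interfaces for illustration; not the real tool contract.
interface SketchTool {
  name: string;
  validate: (input: unknown) => string | null; // error message, or null if valid
  run: (input: unknown, onProgress: (note: string) => void) => string;
}

interface SketchToolResult {
  ok: boolean;
  content: string;
}

function executeTool(
  tool: SketchTool,
  input: unknown,
  isAllowed: (toolName: string) => boolean,
  onProgress: (note: string) => void,
): SketchToolResult {
  // 1. Permission check: a denial becomes an error result, not an exception.
  if (!isAllowed(tool.name)) {
    return { ok: false, content: `Permission denied for ${tool.name}` };
  }
  // 2. Input validation.
  const problem = tool.validate(input);
  if (problem !== null) {
    return { ok: false, content: `Invalid input: ${problem}` };
  }
  try {
    // 3. Execution, with 5. progress reports forwarded to the caller.
    const output = tool.run(input, onProgress);
    // 4. Result formatting.
    return { ok: true, content: output };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { ok: false, content: `Tool failed: ${message}` };
  }
}
```

Turning denials and failures into ordinary results means the model sees what went wrong and can try a different approach on the next turn.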
Phase 7: Stop Decision
Location: src/query.ts
The system decides when to stop based on:
- Stop Reasons from API
  - end_turn → Natural completion
  - max_tokens → Output limit reached (retry with higher limit)
  - stop_sequence → Explicit stop marker
  - tool_use → Waiting for tool results
- Budget Checks
  - Token budget exhausted
  - Turn limit reached
  - Cost limit exceeded
  - Time limit exceeded
- User Interruption
  - Ctrl+C pressed
  - Stop command issued
- Error Conditions
  - API error (non-retryable)
  - Tool execution failure (critical)
  - Permission denial (blocking)
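The stop conditions above can be combined into a single decision function. This sketch covers the API stop reasons, two of the budget checks, and user interruption. The `LoopBudget` fields, the `NextStep` labels, and in particular the precedence (interruption and budgets checked before the API stop reason) are assumptions, not behavior confirmed by the source.

```typescript
// Stop reasons are the four listed above; everything else is hypothetical.
type StopReason = "end_turn" | "max_tokens" | "stop_sequence" | "tool_use";

interface LoopBudget {
  turnsUsed: number;
  maxTurns: number;
  tokensUsed: number;
  maxTokens: number;
  aborted: boolean; // set when the user interrupts
}

type NextStep = "stop" | "run_tools" | "retry_with_higher_limit";

function decideNextStep(reason: StopReason, budget: LoopBudget): NextStep {
  // Assumed precedence: interruption and exhausted budgets override the API's reason.
  if (budget.aborted) return "stop";
  if (budget.turnsUsed >= budget.maxTurns) return "stop";
  if (budget.tokensUsed >= budget.maxTokens) return "stop";

  if (reason === "tool_use") return "run_tools";
  if (reason === "max_tokens") return "retry_with_higher_limit";
  return "stop"; // end_turn and stop_sequence
}
```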
Phase 8: State Persistence
Location: src/utils/sessionStorage.ts
After each turn:
- Record Transcript
- Update State
  - Cost tracking
  - Token usage
  - File state cache
  - Permission context
- Flush to Disk
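The record-then-flush pattern above can be sketched as a small buffer that batches entries and hands them to a writer in one call. The one-JSON-object-per-line format, the `TranscriptBuffer` class, and the injected `sink` function are all assumptions for illustration; the actual on-disk format used by `sessionStorage.ts` is not described here.

```typescript
// Hypothetical sketch: one JSON object per line is an assumed format.
interface TranscriptEntry {
  turn: number;
  role: string;
  text: string;
}

class TranscriptBuffer {
  private pending: string[] = [];
  private readonly sink: (chunk: string) => void;

  // The sink stands in for a file append; it is injected to keep the sketch self-contained.
  constructor(sink: (chunk: string) => void) {
    this.sink = sink;
  }

  // Record Transcript: serialize the entry and hold it in memory.
  record(entry: TranscriptEntry): void {
    this.pending.push(JSON.stringify(entry));
  }

  // Flush to Disk: write everything recorded since the last flush in a single call.
  flush(): void {
    if (this.pending.length === 0) return;
    this.sink(this.pending.join("\n") + "\n");
    this.pending = [];
  }
}
```

Batching per turn keeps the number of writes low, while an append-only log means a partial write can only affect the entries at the end.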
Special Cases
Thinking Mode
When extended thinking is enabled, thinking blocks are:
- Displayed in real time
- Preserved in the conversation
- Subject to token accounting that depends on provider, model, and runtime handling
- Redactable in some modes
REPL Mode
When REPL mode is enabled, the primitive tools (Read, Write, Edit, Bash) are wrapped in a single tool call.
Auto-Compact
When approaching context limits:
Performance Characteristics
Latency Breakdown
Illustrative latency ranges for reasoning about bottlenecks (not a benchmark from instrumented source measurements):
| Phase | Time | Notes |
|---|---|---|
| Input Processing | 10-50ms | Depends on attachments |
| Prompt Assembly | 50-100ms | Cached after first turn |
| API First Token | 200-500ms | Network + model startup |
| Streaming | Variable | ~50 tokens/sec |
| Tool Execution | Variable | Depends on tool |
| State Persistence | 10-30ms | Async, non-blocking |
Optimization Strategies
- Parallel Tool Execution
  - Independent tools run concurrently
  - Reduces total turn time
- Prompt Caching
  - Static prefix cached globally
  - Session context cached per-session
  - Saves ~80% of prompt tokens
- Streaming Display
  - Text rendered as it arrives
  - User sees progress immediately
  - Perceived latency reduced
- Lazy Loading
  - Heavy modules loaded on demand
  - Faster startup time
Error Handling
Retry Logic
Illustrative pseudocode (not verbatim source).
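A common shape for this kind of retry logic is exponential backoff around a retryable operation. The sketch below shows that general technique. It is not confirmed to be the strategy the source uses, and `withRetry`, `RetryOptions`, and the injected `sleep` function are hypothetical.

```typescript
// Hypothetical options; the source's actual retry policy is not shown here.
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  isRetryable: (error: unknown) => boolean;
  sleep: (ms: number) => Promise<void>; // injected so callers choose the timer
}

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let attempt = 0;
  while (true) {
    attempt++;
    try {
      return await operation();
    } catch (error) {
      // Give up on non-retryable errors or once the attempt budget is spent.
      if (!options.isRetryable(error) || attempt >= options.maxAttempts) throw error;
      // Exponential backoff: base, 2x base, 4x base, and so on.
      await options.sleep(options.baseDelayMs * Math.pow(2, attempt - 1));
    }
  }
}
```

In production code the `sleep` argument would typically wrap a timer and add random jitter so that many clients do not retry in lockstep.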
Fallback Models
Illustrative pseudocode (not verbatim source).
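One way to express a model fallback is to try an ordered list of models and move to the next only when the failure is of a kind that warrants it. The sketch below is synchronous for brevity (a real API call would be asynchronous), and `callWithFallback`, the `shouldFallBack` predicate, and any model names passed in are hypothetical.

```typescript
// Hypothetical helper; the source's actual fallback rules are not shown here.
function callWithFallback<T>(
  models: string[],
  callModel: (model: string) => T,
  shouldFallBack: (error: unknown) => boolean,
): { model: string; value: T } {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return { model, value: callModel(model) };
    } catch (error) {
      // Only errors the caller marks as fallback-worthy move on to the next model.
      if (!shouldFallBack(error)) throw error;
      lastError = error;
    }
  }
  // Every model failed: surface the most recent error.
  throw lastError;
}
```

Returning the model name alongside the value lets the caller record which model actually answered, which matters for cost tracking.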
Graceful Degradation
- Tool failures don’t crash the session
- Permission denials return error messages
- API errors show user-friendly messages
- State always persists, even on crash
Related Links
- Architecture Overview: High-level system architecture
- Agent Loop: Deep dive into the agentic loop
- Prompt Assembly: How system prompts are constructed
- Tools Overview: Understanding tool execution