AI Architecture

Hone integrates AI at multiple levels: inline code completion, multi-turn chat, automated code review, and an autonomous agent. The AI system is provider-agnostic, with a registry of adapters for major LLM providers.

Provider System

ProviderRegistry

Central registry managing AI provider adapters. Each adapter implements a common interface for chat completions and streaming.

Adapters:

Provider        Description
Anthropic       Claude models (direct API)
OpenAI          GPT models (direct API)
Google          Gemini models
Ollama          Local models via Ollama
Azure OpenAI    OpenAI models via Azure
Bedrock         AWS Bedrock (Anthropic, etc.)
Vertex          Google Cloud Vertex AI
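A minimal sketch of what the adapter interface and registry might look like. The names (`ProviderAdapter`, `ChatMessage`, the method signatures) are illustrative assumptions, not Hone's actual API:

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Common interface each provider adapter implements: one-shot chat
// completion plus optional token-by-token streaming.
interface ProviderAdapter {
  id: string;
  complete(messages: ChatMessage[]): Promise<string>;
  stream?(messages: ChatMessage[]): AsyncIterable<string>;
}

class ProviderRegistry {
  private adapters = new Map<string, ProviderAdapter>();

  register(adapter: ProviderAdapter): void {
    this.adapters.set(adapter.id, adapter);
  }

  get(id: string): ProviderAdapter {
    const adapter = this.adapters.get(id);
    if (!adapter) throw new Error(`Unknown provider: ${id}`);
    return adapter;
  }

  available(): string[] {
    return [...this.adapters.keys()];
  }
}
```

Keeping the interface this narrow is what makes the system provider-agnostic: every surface (completion, chat, review, agent) talks to a `ProviderAdapter`, never to a vendor SDK directly.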

ModelRouter

Selects the appropriate model for a given task based on:

  • User preferences (configured default model)
  • Task type (completion vs. chat vs. review)
  • Model capabilities (context window, tool use support)
  • Provider availability
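The routing criteria above can be sketched as a selection function. The model metadata fields and the concrete rules (e.g. the context-window threshold for review) are assumptions for illustration:

```typescript
type TaskType = "completion" | "chat" | "review";

interface ModelInfo {
  id: string;
  provider: string;
  contextWindow: number;
  supportsTools: boolean; // capability checks (tool use) would filter similarly
}

function selectModel(
  task: TaskType,
  models: ModelInfo[],
  availableProviders: Set<string>,
  userDefault?: string,
): ModelInfo | undefined {
  // Provider availability: drop models whose provider is unreachable.
  const usable = models.filter((m) => availableProviders.has(m.provider));
  // User preference wins when its provider is available.
  const preferred = usable.find((m) => m.id === userDefault);
  if (preferred) return preferred;
  // Task type: review favors the largest context window available.
  if (task === "review") {
    const large = usable
      .filter((m) => m.contextWindow >= 100_000)
      .sort((a, b) => b.contextWindow - a.contextWindow);
    return large[0] ?? usable[0];
  }
  return usable[0];
}
```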

Token Estimation

Estimates token counts before sending requests to:

  • Stay within context window limits
  • Truncate or chunk context when necessary
  • Provide usage feedback to the user
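A common cheap estimate is roughly four characters per token for English text and code; a production estimator would use a provider-specific tokenizer instead. A sketch under that assumption:

```typescript
// Heuristic estimate: ~4 characters per token. Not exact, but cheap
// enough to run on every keystroke before a request goes out.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Truncate context to fit a token budget, keeping the most recent
// (trailing) text, which is usually closest to the cursor.
function fitToBudget(context: string, maxTokens: number): string {
  const maxChars = maxTokens * 4;
  return context.length <= maxChars ? context : context.slice(-maxChars);
}
```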

AI Surfaces

1. Inline Completion (ai/inline/)

Ghost text suggestions that appear as the user types.

  • Fill-In-Middle (FIM) formatting – constructs prompts with prefix, suffix, and cursor position for infilling
  • Debouncing – waits for a pause in typing before sending requests (avoids flooding the provider)
  • Caching – recently generated completions are cached and reused when the user types matching characters
  • Request lifecycle – manages in-flight requests, cancels stale requests when the cursor moves

The completion appears as dimmed ghost text. The user accepts with Tab or dismisses by continuing to type.
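The FIM prompt construction can be sketched as follows. The `<|fim_*|>` sentinel tokens shown are a common convention among infilling models, not necessarily the ones Hone's providers use:

```typescript
// Split the document at the cursor and wrap the halves in FIM
// sentinels so the model generates the "middle" at the cursor.
function buildFimPrompt(text: string, cursorOffset: number): string {
  const prefix = text.slice(0, cursorOffset);
  const suffix = text.slice(cursorOffset);
  return `<|fim_prefix|>${prefix}<|fim_suffix|>${suffix}<|fim_middle|>`;
}
```

The suffix is what distinguishes infilling from plain left-to-right completion: the model sees the code after the cursor and generates text that joins the two halves.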

2. Chat (ai/chat/)

Multi-turn AI chat panel integrated into the IDE.

  • Multi-turn conversation – maintains conversation history with system prompt, user messages, and assistant responses
  • Context collection – automatically gathers relevant context:
    • Current file content and cursor position
    • Selected text
    • Visible viewport
    • Diagnostics (errors/warnings)
    • Open file list
  • Code block extraction – parses assistant responses to identify code blocks with language and file path annotations
  • Streaming renderer – renders assistant responses token-by-token as they stream in
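Code block extraction might look like the sketch below, which assumes the "language plus file path" annotation lives in the fence's info string (e.g. a fence opened with `ts src/a.ts`); the real annotation convention may differ:

```typescript
interface CodeBlock {
  language: string;
  filePath?: string;
  code: string;
}

// Built from a repeated backtick so this sketch doesn't contain a
// literal triple-backtick fence itself.
const FENCE = "`".repeat(3);

function extractCodeBlocks(markdown: string): CodeBlock[] {
  const blocks: CodeBlock[] = [];
  // Match: fence, info string (language [path]), body, closing fence.
  const fenceRe = new RegExp(
    FENCE + "([^\\n`]*)\\n([\\s\\S]*?)" + FENCE,
    "g",
  );
  let match: RegExpExecArray | null;
  while ((match = fenceRe.exec(markdown)) !== null) {
    const [language = "", filePath] = match[1].trim().split(/\s+/);
    blocks.push({ language, filePath, code: match[2] });
  }
  return blocks;
}
```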

3. Review (ai/review/)

AI-powered code review for diffs and pull requests.

  • Diff chunking – splits large diffs into reviewable chunks that fit within the model’s context window
  • Annotation parsing – extracts structured annotations (line number, severity, suggestion) from the model’s response
  • Review engine – orchestrates the review process:
    1. Collects diff hunks
    2. Chunks them for the model
    3. Sends each chunk with review instructions
    4. Parses and aggregates annotations
    5. Displays inline annotations in the diff view
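Annotation parsing (step 4) could be sketched as below. The `line:severity:message` wire format is an assumption; the actual structured format the review prompt requests from the model may be JSON or something else entirely:

```typescript
interface ReviewAnnotation {
  line: number;
  severity: "info" | "warning" | "error";
  message: string;
}

// Scan each response line for the assumed "line:severity:message"
// shape, ignoring prose lines that don't match.
function parseAnnotations(response: string): ReviewAnnotation[] {
  const annotations: ReviewAnnotation[] = [];
  for (const raw of response.split("\n")) {
    const m = raw.match(/^(\d+):(info|warning|error):(.+)$/);
    if (m) {
      annotations.push({
        line: Number(m[1]),
        severity: m[2] as ReviewAnnotation["severity"],
        message: m[3].trim(),
      });
    }
  }
  return annotations;
}
```

Parsing defensively matters here: models interleave prose with structured output, so anything that does not match the expected shape is simply skipped rather than treated as an error.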

Agent System (ai/agent/)

Autonomous task execution with tool calling.

Plan and Execute Loop

  1. User describes a task in natural language
  2. Agent creates a plan (sequence of steps)
  3. Agent executes steps using tools:
    • File read/write
    • Terminal commands
    • Search (workspace-wide)
    • LSP queries (find references, go to definition)
  4. Agent observes results and adjusts plan
  5. Repeats until task is complete or blocked
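The execute phase of the loop (step 3) can be sketched as below. The `Tool` and `Step` shapes are assumptions; a real agent would re-enter planning on failure rather than simply stopping:

```typescript
interface Tool {
  name: string;
  run(args: Record<string, string>): Promise<string>;
}

interface Step {
  tool: string;
  args: Record<string, string>;
  status: "pending" | "completed" | "failed";
  result?: string;
}

// Run each planned step through its tool, recording results so the
// agent can observe them and adjust the plan.
async function executePlan(
  steps: Step[],
  tools: Map<string, Tool>,
): Promise<Step[]> {
  for (const step of steps) {
    const tool = tools.get(step.tool);
    if (!tool) {
      step.status = "failed";
      step.result = `no such tool: ${step.tool}`;
      continue;
    }
    try {
      step.result = await tool.run(step.args);
      step.status = "completed";
    } catch (err) {
      step.status = "failed";
      step.result = String(err);
      break; // a real agent would replan here instead of stopping
    }
  }
  return steps;
}
```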

Approval Flows

  • Auto-approve – read-only operations (file read, search) execute without confirmation
  • Prompt for approval – write operations (file edit, terminal commands) require user confirmation
  • Batch approval – user can approve all remaining steps in a plan
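The three flows reduce to a small decision function. The tool names in the read-only set are assumptions about how Hone classifies its tools:

```typescript
// Tools assumed to have no side effects; safe to auto-approve.
const READ_ONLY_TOOLS = new Set(["file_read", "search", "lsp_query"]);

type Approval = "auto" | "prompt";

function approvalFor(tool: string, batchApproved: boolean): Approval {
  if (READ_ONLY_TOOLS.has(tool)) return "auto"; // read-only: no confirmation
  if (batchApproved) return "auto";             // user approved the rest of the plan
  return "prompt";                              // write operation: ask first
}
```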

Error Recovery

When a tool call fails, the agent:

  1. Observes the error message
  2. Adjusts its approach
  3. Retries with a different strategy
  4. Escalates to the user if repeated failures occur
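A retry-with-escalation wrapper capturing this flow might look like the sketch below; the retry limit is an assumed default, and feeding the previous error back into the next attempt is what lets the agent "adjust its approach":

```typescript
async function withRecovery<T>(
  attempt: (previousError?: string) => Promise<T>,
  maxRetries = 3,
): Promise<T> {
  let lastError: string | undefined;
  for (let i = 0; i <= maxRetries; i++) {
    try {
      // The previous error is passed in so the caller (the agent's
      // prompt) can incorporate it and try a different strategy.
      return await attempt(lastError);
    } catch (err) {
      lastError = String(err);
    }
  }
  // Escalate: surface the persistent failure to the user.
  throw new Error(
    `Escalating after ${maxRetries + 1} failed attempts: ${lastError}`,
  );
}
```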

Activity Logging

All agent actions are logged for transparency:

  • Tool calls with arguments and results
  • Plan steps with status (pending, in-progress, completed, failed)
  • Token usage per step
  • Total elapsed time
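A possible shape for the log, with field names chosen for illustration rather than taken from Hone's source:

```typescript
interface ActivityEntry {
  timestamp: number;
  kind: "tool_call" | "plan_step";
  detail: string;   // e.g. tool name plus arguments, or step description
  tokens?: number;  // token usage attributed to this entry, if known
}

class ActivityLog {
  private entries: ActivityEntry[] = [];

  record(kind: ActivityEntry["kind"], detail: string, tokens?: number): void {
    this.entries.push({ timestamp: Date.now(), kind, detail, tokens });
  }

  // Aggregate per-step token usage into the session total.
  totalTokens(): number {
    return this.entries.reduce((sum, e) => sum + (e.tokens ?? 0), 0);
  }
}
```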