AI Architecture
Hone integrates AI at multiple levels: inline code completion, multi-turn chat, automated code review, and an autonomous agent. The AI system is provider-agnostic, with a registry of adapters for major LLM providers.
Provider System
ProviderRegistry
Central registry managing AI provider adapters. Each adapter implements a common interface for chat completions and streaming.
Adapters:
| Provider | Description |
|---|---|
| Anthropic | Claude models (direct API) |
| OpenAI | GPT models (direct API) |
| Gemini | Gemini models (direct API) |
| Ollama | Local models via Ollama |
| Azure OpenAI | OpenAI models via Azure |
| Bedrock | AWS Bedrock (Anthropic, etc.) |
| Vertex | Google Cloud Vertex AI |
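A minimal sketch of what the adapter interface and registry could look like. The names (`ProviderAdapter`, `chat`, `stream`) are illustrative, not Hone's actual API:

```typescript
// Hypothetical shape of the provider adapter contract and registry.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProviderAdapter {
  readonly name: string;
  // One-shot chat completion.
  chat(messages: ChatMessage[], model: string): Promise<string>;
  // Token-by-token streaming completion.
  stream(messages: ChatMessage[], model: string): AsyncIterable<string>;
}

class ProviderRegistry {
  private adapters = new Map<string, ProviderAdapter>();

  register(adapter: ProviderAdapter): void {
    this.adapters.set(adapter.name, adapter);
  }

  get(name: string): ProviderAdapter {
    const adapter = this.adapters.get(name);
    if (!adapter) throw new Error(`No adapter registered for "${name}"`);
    return adapter;
  }

  available(): string[] {
    return [...this.adapters.keys()];
  }
}
```

Because every adapter satisfies the same interface, the rest of the AI system never branches on which provider is in use.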
ModelRouter
Selects the appropriate model for a given task based on:
- User preferences (configured default model)
- Task type (completion vs. chat vs. review)
- Model capabilities (context window, tool use support)
- Provider availability
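The selection criteria above can be sketched as a filter-then-prefer function. The field names and fallback rule (largest context window) are assumptions for illustration:

```typescript
// Hypothetical model routing: filter by availability and capability,
// prefer the user's configured default when it qualifies.
type TaskType = "completion" | "chat" | "review";

interface ModelInfo {
  id: string;
  provider: string;
  contextWindow: number; // tokens
  supportsTools: boolean;
}

interface RouteRequest {
  task: TaskType;
  promptTokens: number;
  needsTools: boolean;
  userDefault?: string;
}

function routeModel(
  models: ModelInfo[],
  availableProviders: Set<string>,
  req: RouteRequest,
): ModelInfo | undefined {
  const candidates = models.filter(
    (m) =>
      availableProviders.has(m.provider) &&
      m.contextWindow >= req.promptTokens &&
      (!req.needsTools || m.supportsTools),
  );
  // Honor the user's configured default if it is a valid candidate...
  const preferred = candidates.find((m) => m.id === req.userDefault);
  // ...otherwise fall back to the candidate with the largest context window.
  return preferred ?? candidates.sort((a, b) => b.contextWindow - a.contextWindow)[0];
}
```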
Token Estimation
Estimates token counts before sending requests, in order to:
- Stay within context window limits
- Truncate or chunk context when necessary
- Provide usage feedback to the user
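A rough sketch of estimation and budget fitting, using the common ~4 characters per token heuristic (Hone's actual estimator may be tokenizer-aware):

```typescript
// Heuristic token estimate: ~4 characters per token. This is an
// assumption for illustration, not Hone's actual estimator.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Truncate from the start so the estimate fits the budget,
// keeping the most recent (closest-to-cursor) text.
function fitToBudget(text: string, maxTokens: number): string {
  const maxChars = maxTokens * 4;
  return text.length <= maxChars ? text : text.slice(text.length - maxChars);
}
```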
AI Surfaces
1. Inline Completion (ai/inline/)
Ghost text suggestions that appear as the user types.
- Fill-In-Middle (FIM) formatting – constructs prompts with prefix, suffix, and cursor position for infilling
- Debouncing – waits for a pause in typing before sending requests (avoids flooding the provider)
- Caching – recently generated completions are cached and reused when the user types matching characters
- Request lifecycle – manages in-flight requests, cancels stale requests when the cursor moves
The completion appears as dimmed ghost text. The user accepts with Tab or dismisses by continuing to type.
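FIM prompt construction can be sketched as splitting the document at the cursor and wrapping the halves in model-specific sentinel tokens. The `<|fim_*|>` sentinels below follow a common convention; the real tokens vary per model and would come from the adapter:

```typescript
// Sketch of Fill-In-Middle prompt construction. Sentinel tokens are
// illustrative; real models define their own.
interface FimSentinels {
  prefix: string;
  suffix: string;
  middle: string;
}

function buildFimPrompt(
  document: string,
  cursorOffset: number,
  s: FimSentinels,
): string {
  const before = document.slice(0, cursorOffset); // text before the cursor
  const after = document.slice(cursorOffset);     // text after the cursor
  // The model is asked to generate the text that belongs at the cursor,
  // i.e. the "middle" between prefix and suffix.
  return `${s.prefix}${before}${s.suffix}${after}${s.middle}`;
}
```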
2. Chat (ai/chat/)
Multi-turn AI chat panel integrated into the IDE.
- Multi-turn conversation – maintains conversation history with system prompt, user messages, and assistant responses
- Context collection – automatically gathers relevant context:
- Current file content and cursor position
- Selected text
- Visible viewport
- Diagnostics (errors/warnings)
- Open file list
- Code block extraction – parses assistant responses to identify code blocks with language and file path annotations
- Streaming renderer – renders assistant responses token-by-token as they stream in
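Code block extraction can be sketched as scanning for fenced blocks and pulling out the language tag and an optional file-path annotation. The `path=` info-string convention here is an assumption, not necessarily the annotation format Hone uses:

```typescript
// Sketch of fenced code block extraction from an assistant response.
// The "path=..." annotation convention is assumed for illustration.
interface ExtractedBlock {
  language: string | null;
  path: string | null;
  code: string;
}

const FENCE = "`".repeat(3); // build the fence so this example nests cleanly

function extractCodeBlocks(markdown: string): ExtractedBlock[] {
  const blocks: ExtractedBlock[] = [];
  // Match: fence, info string, newline, lazy body, closing fence.
  const fenceRe = new RegExp(`${FENCE}([^\\n]*)\\n([\\s\\S]*?)${FENCE}`, "g");
  let m: RegExpExecArray | null;
  while ((m = fenceRe.exec(markdown)) !== null) {
    const info = m[1].trim();
    const firstWord = info ? info.split(/\s+/)[0] : "";
    // The first word is the language unless it is itself the path annotation.
    const language = firstWord && !firstWord.startsWith("path=") ? firstWord : null;
    const pathMatch = info.match(/path=(\S+)/);
    blocks.push({
      language,
      path: pathMatch ? pathMatch[1] : null,
      code: m[2],
    });
  }
  return blocks;
}
```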
3. Review (ai/review/)
AI-powered code review for diffs and pull requests.
- Diff chunking – splits large diffs into reviewable chunks that fit within the model’s context window
- Annotation parsing – extracts structured annotations (line number, severity, suggestion) from the model’s response
- Review engine – orchestrates the review process:
- Collects diff hunks
- Chunks them for the model
- Sends each chunk with review instructions
- Parses and aggregates annotations
- Displays inline annotations in the diff view
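Annotation parsing can be sketched as matching one structured line per finding. The `line=<n> severity=<level> <suggestion>` format below is an assumed instruction given to the model, not necessarily the format Hone's review engine specifies:

```typescript
// Sketch of parsing structured review annotations from a model response.
// The line format is an assumption for illustration.
type Severity = "info" | "warning" | "error";

interface ReviewAnnotation {
  line: number;        // line number within the diff hunk
  severity: Severity;
  suggestion: string;
}

function parseAnnotations(response: string): ReviewAnnotation[] {
  const out: ReviewAnnotation[] = [];
  const re = /^line=(\d+)\s+severity=(info|warning|error)\s+(.*)$/;
  for (const raw of response.split("\n")) {
    const m = raw.trim().match(re);
    if (m) {
      out.push({
        line: Number(m[1]),
        severity: m[2] as Severity,
        suggestion: m[3],
      });
    }
    // Non-matching lines (prose, summaries) are ignored.
  }
  return out;
}
```

Parsing per chunk and concatenating the results gives the aggregated annotation list that the diff view renders inline.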
Agent System (ai/agent/)
Autonomous task execution with tool calling.
Plan and Execute Loop
- User describes a task in natural language
- Agent creates a plan (sequence of steps)
- Agent executes steps using tools:
- File read/write
- Terminal commands
- Search (workspace-wide)
- LSP queries (find references, go to definition)
- Agent observes results and adjusts plan
- Repeats until task is complete or blocked
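The execute half of the loop can be sketched as stepping through a plan and dispatching each step to a tool. This is a minimal synchronous stand-in; in the real agent the plan comes from the model and is revised after each observation:

```typescript
// Minimal sketch of plan execution with tool dispatch.
// Tool names and the plan structure are illustrative.
type ToolFn = (args: string) => string;

interface Step {
  tool: string;
  args: string;
  status: "pending" | "completed" | "failed";
  result?: string;
}

function executePlan(steps: Step[], tools: Map<string, ToolFn>): Step[] {
  for (const step of steps) {
    const tool = tools.get(step.tool);
    if (!tool) {
      step.status = "failed"; // the observe-and-adjust phase would replan here
      continue;
    }
    try {
      step.result = tool(step.args); // observation fed back to the model
      step.status = "completed";
    } catch {
      step.status = "failed";
    }
  }
  return steps;
}
```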
Approval Flows
- Auto-approve – read-only operations (file read, search) execute without confirmation
- Prompt for approval – write operations (file edit, terminal commands) require user confirmation
- Batch approval – user can approve all remaining steps in a plan
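The three flows reduce to a small policy function: read-only tools auto-approve, mutating tools prompt, and a session-level batch flag short-circuits the prompt. Tool names below are illustrative:

```typescript
// Sketch of the approval policy. Tool names are illustrative.
const READ_ONLY_TOOLS = new Set(["file_read", "search", "lsp_query"]);

type Decision = "auto" | "prompt";

function approvalFor(tool: string, batchApproved: boolean): Decision {
  // Read-only operations never need confirmation.
  if (READ_ONLY_TOOLS.has(tool)) return "auto";
  // Write operations prompt unless the user approved the remaining plan.
  return batchApproved ? "auto" : "prompt";
}
```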
Error Recovery
When a tool call fails, the agent:
- Observes the error message
- Adjusts its approach
- Retries with a different strategy
- Escalates to the user if repeated failures occur
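This recovery loop can be sketched as bounded retries with an escalation callback. The retry count is passed to the attempt so the caller can vary its strategy; the model-driven "adjust approach" step is abstracted behind that hook:

```typescript
// Sketch of retry-with-escalation around a tool call.
// The attempt callback stands in for the model adjusting its strategy.
function runWithRecovery(
  attempt: (retryCount: number) => string,
  maxRetries: number,
  escalate: (lastError: Error) => void,
): string | undefined {
  let lastError: Error | undefined;
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return attempt(i); // retryCount lets the strategy differ per attempt
    } catch (e) {
      lastError = e as Error; // observe the error, then retry
    }
  }
  if (lastError) escalate(lastError); // repeated failures surface to the user
  return undefined;
}
```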
Activity Logging
All agent actions are logged for transparency:
- Tool calls with arguments and results
- Plan steps with status (pending, in-progress, completed, failed)
- Token usage per step
- Total elapsed time
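A structured log entry covering the fields above might look like the following sketch (field names are illustrative):

```typescript
// Sketch of an in-memory activity log with structured entries.
// Field names are assumptions for illustration.
interface ActivityEntry {
  kind: "tool_call" | "plan_step";
  detail: string; // tool name + arguments, or step description
  status: "pending" | "in-progress" | "completed" | "failed";
  tokens: number;    // token usage attributed to this entry
  elapsedMs: number; // wall-clock time for this entry
}

class ActivityLog {
  private entries: ActivityEntry[] = [];

  record(entry: ActivityEntry): void {
    this.entries.push(entry);
  }

  totalTokens(): number {
    return this.entries.reduce((sum, e) => sum + e.tokens, 0);
  }

  totalElapsedMs(): number {
    return this.entries.reduce((sum, e) => sum + e.elapsedMs, 0);
  }
}
```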