AI Architecture
Hone integrates AI at multiple levels: inline code completion, multi-turn chat, automated code review, and an autonomous agent. The AI system is provider-agnostic, with a registry of adapters for major LLM providers.
Provider System
ProviderRegistry
Central registry managing AI provider adapters. Each adapter implements a common interface for chat completions and streaming.
Adapters:
| Provider | Description |
|---|---|
| Anthropic | Claude models (direct API) |
| OpenAI | GPT models (direct API) |
| Gemini | Gemini models (direct API) |
| Ollama | Local models via Ollama |
| Azure OpenAI | OpenAI models via Azure |
| Bedrock | AWS Bedrock (Anthropic, etc.) |
| Vertex | Google Cloud Vertex AI |
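A minimal sketch of what the adapter interface and registry could look like. The names (`ProviderAdapter`, `chat`, `stream`) are illustrative, not Hone's actual API:

```typescript
// Hypothetical shape of the provider adapter contract and registry.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ProviderAdapter {
  readonly name: string;
  // One-shot chat completion.
  chat(messages: ChatMessage[], model: string): Promise<string>;
  // Token-by-token streaming completion.
  stream(messages: ChatMessage[], model: string): AsyncIterable<string>;
}

class ProviderRegistry {
  private adapters = new Map<string, ProviderAdapter>();

  register(adapter: ProviderAdapter): void {
    this.adapters.set(adapter.name, adapter);
  }

  get(name: string): ProviderAdapter {
    const adapter = this.adapters.get(name);
    if (!adapter) throw new Error(`No adapter registered for "${name}"`);
    return adapter;
  }

  available(): string[] {
    return [...this.adapters.keys()];
  }
}
```

Because every adapter satisfies the same interface, the rest of the AI system never branches on which provider is in use.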
ModelRouter
Selects the appropriate model for a given task based on:
- User preferences (configured default model)
- Task type (completion vs. chat vs. review)
- Model capabilities (context window, tool use support)
- Provider availability
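The selection criteria above can be sketched as a filter-then-prefer function. The field names and fallback rule (largest context window) are assumptions for illustration:

```typescript
// Hypothetical model routing: filter by availability and capability,
// prefer the user's configured default when it qualifies.
type TaskType = "completion" | "chat" | "review";

interface ModelInfo {
  id: string;
  provider: string;
  contextWindow: number; // tokens
  supportsTools: boolean;
}

interface RouteRequest {
  task: TaskType;
  promptTokens: number;
  needsTools: boolean;
  userDefault?: string;
}

function routeModel(
  models: ModelInfo[],
  availableProviders: Set<string>,
  req: RouteRequest,
): ModelInfo | undefined {
  const candidates = models.filter(
    (m) =>
      availableProviders.has(m.provider) &&
      m.contextWindow >= req.promptTokens &&
      (!req.needsTools || m.supportsTools),
  );
  // Honor the user's configured default if it is a valid candidate...
  const preferred = candidates.find((m) => m.id === req.userDefault);
  // ...otherwise fall back to the candidate with the largest context window.
  return preferred ?? candidates.sort((a, b) => b.contextWindow - a.contextWindow)[0];
}
```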
Token Estimation
Estimates token counts before sending requests, in order to:
- Stay within context window limits
- Truncate or chunk context when necessary
- Provide usage feedback to the user
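A rough sketch of estimation and budget fitting, using the common ~4 characters per token heuristic (Hone's actual estimator may be tokenizer-aware):

```typescript
// Heuristic token estimate: ~4 characters per token. This is an
// assumption for illustration, not Hone's actual estimator.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Truncate from the start so the estimate fits the budget,
// keeping the most recent (closest-to-cursor) text.
function fitToBudget(text: string, maxTokens: number): string {
  const maxChars = maxTokens * 4;
  return text.length <= maxChars ? text : text.slice(text.length - maxChars);
}
```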
AI Surfaces
1. Inline Completion (ai/inline/)
Ghost text suggestions that appear as the user types.
- Fill-In-Middle (FIM) formatting – constructs prompts with prefix, suffix, and cursor position for infilling
- Debouncing – waits for a pause in typing before sending requests (avoids flooding the provider)
- Caching – recently generated completions are cached and reused when the user types matching characters
- Request lifecycle – manages in-flight requests, cancels stale requests when the cursor moves
The completion appears as dimmed ghost text. The user accepts with Tab or dismisses by continuing to type.
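FIM prompt construction can be sketched as splitting the document at the cursor and wrapping the halves in model-specific sentinel tokens. The `<|fim_*|>` sentinels below follow a common convention; the real tokens vary per model and would come from the adapter:

```typescript
// Sketch of Fill-In-Middle prompt construction. Sentinel tokens are
// illustrative; real models define their own.
interface FimSentinels {
  prefix: string;
  suffix: string;
  middle: string;
}

function buildFimPrompt(
  document: string,
  cursorOffset: number,
  s: FimSentinels,
): string {
  const before = document.slice(0, cursorOffset); // text before the cursor
  const after = document.slice(cursorOffset);     // text after the cursor
  // The model is asked to generate the text that belongs at the cursor,
  // i.e. the "middle" between prefix and suffix.
  return `${s.prefix}${before}${s.suffix}${after}${s.middle}`;
}
```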
2. Chat (ai/chat/)
Multi-turn AI chat panel integrated into the IDE.
- Multi-turn conversation – maintains conversation history with system prompt, user messages, and assistant responses
- Context collection – automatically gathers relevant context:
- Current file content and cursor position
- Selected text
- Visible viewport
- Diagnostics (errors/warnings)
- Open file list
- Code block extraction – parses assistant responses to identify code blocks with language and file path annotations
- Streaming renderer – renders assistant responses token-by-token as they stream in
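Code block extraction can be sketched as scanning for fenced blocks and pulling out the language tag and an optional file-path annotation. The `path=` info-string convention here is an assumption, not necessarily the annotation format Hone uses:

```typescript
// Sketch of fenced code block extraction from an assistant response.
// The "path=..." annotation convention is assumed for illustration.
interface ExtractedBlock {
  language: string | null;
  path: string | null;
  code: string;
}

const FENCE = "`".repeat(3); // build the fence so this example nests cleanly

function extractCodeBlocks(markdown: string): ExtractedBlock[] {
  const blocks: ExtractedBlock[] = [];
  // Match: fence, info string, newline, lazy body, closing fence.
  const fenceRe = new RegExp(`${FENCE}([^\\n]*)\\n([\\s\\S]*?)${FENCE}`, "g");
  let m: RegExpExecArray | null;
  while ((m = fenceRe.exec(markdown)) !== null) {
    const info = m[1].trim();
    const firstWord = info ? info.split(/\s+/)[0] : "";
    // The first word is the language unless it is itself the path annotation.
    const language = firstWord && !firstWord.startsWith("path=") ? firstWord : null;
    const pathMatch = info.match(/path=(\S+)/);
    blocks.push({
      language,
      path: pathMatch ? pathMatch[1] : null,
      code: m[2],
    });
  }
  return blocks;
}
```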
3. Review (ai/review/)
AI-powered code review for diffs and pull requests.
- Diff chunking – splits large diffs into reviewable chunks that fit within the model’s context window
- Annotation parsing – extracts structured annotations (line number, severity, suggestion) from the model’s response
- Review engine – orchestrates the review process:
- Collects diff hunks
- Chunks them for the model
- Sends each chunk with review instructions
- Parses and aggregates annotations
- Displays inline annotations in the diff view
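Annotation parsing can be sketched as matching one structured line per finding. The `line=<n> severity=<level> <suggestion>` format below is an assumed instruction given to the model, not necessarily the format Hone's review engine specifies:

```typescript
// Sketch of parsing structured review annotations from a model response.
// The line format is an assumption for illustration.
type Severity = "info" | "warning" | "error";

interface ReviewAnnotation {
  line: number;        // line number within the diff hunk
  severity: Severity;
  suggestion: string;
}

function parseAnnotations(response: string): ReviewAnnotation[] {
  const out: ReviewAnnotation[] = [];
  const re = /^line=(\d+)\s+severity=(info|warning|error)\s+(.*)$/;
  for (const raw of response.split("\n")) {
    const m = raw.trim().match(re);
    if (m) {
      out.push({
        line: Number(m[1]),
        severity: m[2] as Severity,
        suggestion: m[3],
      });
    }
    // Non-matching lines (prose, summaries) are ignored.
  }
  return out;
}
```

Parsing per chunk and concatenating the results gives the aggregated annotation list that the diff view renders inline.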
Agent System (ai/agent/)
Autonomous task execution with tool calling.
Plan and Execute Loop
- User describes a task in natural language
- Agent creates a plan (sequence of steps)
- Agent executes steps using tools:
- File read/write
- Terminal commands
- Search (workspace-wide)
- LSP queries (find references, go to definition)
- Agent observes results and adjusts plan
- Repeats until task is complete or blocked
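The execute half of the loop can be sketched as stepping through a plan and dispatching each step to a tool. This is a minimal synchronous stand-in; in the real agent the plan comes from the model and is revised after each observation:

```typescript
// Minimal sketch of plan execution with tool dispatch.
// Tool names and the plan structure are illustrative.
type ToolFn = (args: string) => string;

interface Step {
  tool: string;
  args: string;
  status: "pending" | "completed" | "failed";
  result?: string;
}

function executePlan(steps: Step[], tools: Map<string, ToolFn>): Step[] {
  for (const step of steps) {
    const tool = tools.get(step.tool);
    if (!tool) {
      step.status = "failed"; // the observe-and-adjust phase would replan here
      continue;
    }
    try {
      step.result = tool(step.args); // observation fed back to the model
      step.status = "completed";
    } catch {
      step.status = "failed";
    }
  }
  return steps;
}
```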
Approval Flows
- Auto-approve – read-only operations (file read, search) execute without confirmation
- Prompt for approval – write operations (file edit, terminal commands) require user confirmation
- Batch approval – user can approve all remaining steps in a plan
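The three flows reduce to a small policy function: read-only tools auto-approve, mutating tools prompt, and a session-level batch flag short-circuits the prompt. Tool names below are illustrative:

```typescript
// Sketch of the approval policy. Tool names are illustrative.
const READ_ONLY_TOOLS = new Set(["file_read", "search", "lsp_query"]);

type Decision = "auto" | "prompt";

function approvalFor(tool: string, batchApproved: boolean): Decision {
  // Read-only operations never need confirmation.
  if (READ_ONLY_TOOLS.has(tool)) return "auto";
  // Write operations prompt unless the user approved the remaining plan.
  return batchApproved ? "auto" : "prompt";
}
```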
Error Recovery
When a tool call fails, the agent:
- Observes the error message
- Adjusts its approach
- Retries with a different strategy
- Escalates to the user if repeated failures occur
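This recovery loop can be sketched as bounded retries with an escalation callback. The retry count is passed to the attempt so the caller can vary its strategy; the model-driven "adjust approach" step is abstracted behind that hook:

```typescript
// Sketch of retry-with-escalation around a tool call.
// The attempt callback stands in for the model adjusting its strategy.
function runWithRecovery(
  attempt: (retryCount: number) => string,
  maxRetries: number,
  escalate: (lastError: Error) => void,
): string | undefined {
  let lastError: Error | undefined;
  for (let i = 0; i <= maxRetries; i++) {
    try {
      return attempt(i); // retryCount lets the strategy differ per attempt
    } catch (e) {
      lastError = e as Error; // observe the error, then retry
    }
  }
  if (lastError) escalate(lastError); // repeated failures surface to the user
  return undefined;
}
```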
Activity Logging
All agent actions are logged for transparency:
- Tool calls with arguments and results
- Plan steps with status (pending, in-progress, completed, failed)
- Token usage per step
- Total elapsed time
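A structured log entry covering the fields above might look like the following sketch (field names are illustrative):

```typescript
// Sketch of an in-memory activity log with structured entries.
// Field names are assumptions for illustration.
interface ActivityEntry {
  kind: "tool_call" | "plan_step";
  detail: string; // tool name + arguments, or step description
  status: "pending" | "in-progress" | "completed" | "failed";
  tokens: number;    // token usage attributed to this entry
  elapsedMs: number; // wall-clock time for this entry
}

class ActivityLog {
  private entries: ActivityEntry[] = [];

  record(entry: ActivityEntry): void {
    this.entries.push(entry);
  }

  totalTokens(): number {
    return this.entries.reduce((sum, e) => sum + e.tokens, 0);
  }

  totalElapsedMs(): number {
    return this.entries.reduce((sum, e) => sum + e.elapsedMs, 0);
  }
}
```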