LLM Providers & Models
Mudabbir supports five LLM providers, from free local models to cloud APIs, plus an auto-detect mode. This guide covers everything: which provider to pick, which models to use, how they interact with agent backends, and how to configure them.
Quick Start: Which Provider Should I Use?
| Goal | Provider | Cost | Recommended Backend |
|---|---|---|---|
| Best quality, zero config | Anthropic | Paid API | Claude Agent SDK or Mudabbir Native |
| Free cloud API, great quality | Gemini | Free tier available | Mudabbir Native |
| Fully local, no cloud | Ollama | Free | Claude Agent SDK or Mudabbir Native |
| Use OpenAI models | OpenAI | Paid API | Open Interpreter |
| Custom endpoint (OpenRouter, vLLM, etc.) | OpenAI-Compatible | Varies | Mudabbir Native |
New to Mudabbir? Start with Gemini (free) or Anthropic (best quality). You can switch providers at any time from Settings without losing data.
Provider Details
Anthropic (Recommended)
The recommended provider for maximum capability. Claude models excel at coding, tool use, and complex reasoning.
Configuration:
export MUDABBIR_LLM_PROVIDER="anthropic"export MUDABBIR_ANTHROPIC_API_KEY="sk-ant-..."export MUDABBIR_ANTHROPIC_MODEL="claude-sonnet-4-5-20250929"Available Models:
| Model | ID | Best For |
|---|---|---|
| Claude Sonnet 4.5 | claude-sonnet-4-5-20250929 | Best balance of speed and quality (default) |
| Claude Opus 4.5 | claude-opus-4-5-20251101 | Most capable, complex reasoning |
| Claude Haiku 4.5 | claude-haiku-4-5-20251001 | Fastest, cheapest, simple tasks |
| Claude Sonnet 4 | claude-sonnet-4-20250514 | Previous-gen balanced model |
| Claude Opus 4 | claude-opus-4-20250514 | Previous-gen flagship |
Get an API key at console.anthropic.com.
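If Mudabbir reports authentication errors, you can sanity-check the key directly against the Anthropic Messages API (a minimal sketch; it uses the Haiku model from the table above to keep the test cheap):
```bash
# One-off request to verify the API key outside Mudabbir
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $MUDABBIR_ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{"model": "claude-haiku-4-5-20251001", "max_tokens": 16,
       "messages": [{"role": "user", "content": "ping"}]}'
```
A valid key returns a JSON message object; an invalid one returns an authentication_error.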
Gemini
Google’s Gemini models via their OpenAI-compatible API. Free tier available with generous limits. Internally, Mudabbir routes Gemini through the OpenAI-compatible code path using Google’s endpoint at https://generativelanguage.googleapis.com/v1beta/openai/.
Configuration:
export MUDABBIR_LLM_PROVIDER="gemini"export MUDABBIR_GOOGLE_API_KEY="AIza..."export MUDABBIR_GEMINI_MODEL="gemini-2.5-flash"Available Models:
| Model | ID | Best For |
|---|---|---|
| Gemini 2.5 Flash | gemini-2.5-flash | Best price-performance (default) |
| Gemini 2.5 Pro | gemini-2.5-pro | State-of-the-art reasoning |
| Gemini 2.5 Flash Lite | gemini-2.5-flash-lite | Fastest, cheapest |
| Gemini 3 Pro | gemini-3-pro-preview | Latest flagship (preview) |
| Gemini 3 Flash | gemini-3-flash-preview | Latest fast model (preview) |
Get a free API key at AI Studio. The same key is also used for Mudabbir’s image generation tool.
Gemini 2.0 Flash and 2.0 Flash Lite are deprecated and will be retired on March 31, 2026. Use 2.5 Flash or newer.
Gemini requires the Mudabbir Native or Open Interpreter backend. It does not work with the Claude Agent SDK backend (the SDK uses Anthropic’s message format, which Gemini doesn’t support). Switch to Mudabbir Native in Settings → General → Agent Backend.
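If a key misbehaves, you can hit Google's OpenAI-compatible endpoint directly, outside Mudabbir (a minimal sketch using the endpoint Mudabbir routes through):
```bash
# One-off chat request against Google's OpenAI-compatible endpoint
curl https://generativelanguage.googleapis.com/v1beta/openai/chat/completions \
  -H "Authorization: Bearer $MUDABBIR_GOOGLE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-2.5-flash",
       "messages": [{"role": "user", "content": "ping"}]}'
```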
Ollama (Local)
Run models entirely on your machine. No API keys, no cloud, no costs. Requires Ollama installed and running.
Configuration:
export MUDABBIR_LLM_PROVIDER="ollama"export MUDABBIR_OLLAMA_HOST="http://localhost:11434"export MUDABBIR_OLLAMA_MODEL="llama3.2"Recommended Models:
| Model | ollama pull | Parameters | Best For |
|---|---|---|---|
| Llama 3.2 | ollama pull llama3.2 | 3B | Fast, general use (default) |
| Llama 3.1 | ollama pull llama3.1 | 8B | Better quality, more VRAM |
| Qwen 2.5 Coder | ollama pull qwen2.5-coder | 7B | Coding tasks |
| Mistral | ollama pull mistral | 7B | General reasoning |
| DeepSeek Coder V2 | ollama pull deepseek-coder-v2 | 16B | Advanced coding |
Verify connectivity:
```bash
mudabbir --check-ollama
```
This runs four checks: server reachable, model available, API compatibility, and tool calling support.
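If any check fails, querying Ollama directly helps isolate whether the problem is the server or Mudabbir's config (a sketch assuming the default host):
```bash
# Confirm the server is reachable and see which models are installed
curl http://localhost:11434/api/tags
# Pull the default model if it's missing from the list
ollama pull llama3.2
```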
Ollama automatically skips smart model routing since there’s only one model to route to.
OpenAI
Use OpenAI’s GPT models directly.
Configuration:
export MUDABBIR_LLM_PROVIDER="openai"export MUDABBIR_OPENAI_API_KEY="sk-..."export MUDABBIR_OPENAI_MODEL="gpt-4o"Available Models:
| Model | ID | Best For |
|---|---|---|
| GPT-4o | gpt-4o | Best overall (default) |
| GPT-4o Mini | gpt-4o-mini | Fast, cost-effective |
| o1 | o1 | Complex reasoning |
| o3-mini | o3-mini | Balanced reasoning |
The OpenAI provider works with Open Interpreter only. It does not work with the Claude Agent SDK (different API format) or Mudabbir Native. To use OpenAI models with Mudabbir Native, use the OpenAI-Compatible provider with base URL https://api.openai.com/v1, as shown below.
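A minimal sketch of that workaround, reusing the OpenAI-Compatible variables documented later in this guide:
```bash
# Route OpenAI models through the OpenAI-Compatible provider so the
# Mudabbir Native backend can be used
export MUDABBIR_LLM_PROVIDER="openai_compatible"
export MUDABBIR_OPENAI_COMPATIBLE_BASE_URL="https://api.openai.com/v1"
export MUDABBIR_OPENAI_COMPATIBLE_API_KEY="sk-..."   # your OpenAI API key
export MUDABBIR_OPENAI_COMPATIBLE_MODEL="gpt-4o"
```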
OpenAI-Compatible
Connect to any endpoint that implements the OpenAI Chat Completions API. This includes hosted services (OpenRouter, Together AI, Fireworks) and self-hosted servers (vLLM, LiteLLM, text-generation-inference).
Configuration:
export MUDABBIR_LLM_PROVIDER="openai_compatible"export MUDABBIR_OPENAI_COMPATIBLE_BASE_URL="https://openrouter.ai/api/v1"export MUDABBIR_OPENAI_COMPATIBLE_API_KEY="sk-or-..."export MUDABBIR_OPENAI_COMPATIBLE_MODEL="anthropic/claude-3.5-sonnet"export MUDABBIR_OPENAI_COMPATIBLE_MAX_TOKENS=0 # 0 = no limitPopular services:
| Service | Base URL | Notes |
|---|---|---|
| OpenRouter | https://openrouter.ai/api/v1 | 100+ models, pay-per-token |
| Together AI | https://api.together.xyz/v1 | Open-source models |
| Fireworks AI | https://api.fireworks.ai/inference/v1 | Fast inference |
| LiteLLM Proxy | http://localhost:4000/v1 | Self-hosted proxy to any provider |
| vLLM | http://localhost:8000/v1 | Self-hosted model serving |
Verify connectivity:
```bash
mudabbir --check-openai-compatible
```
This runs two checks: API connectivity and tool calling support.
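You can also reproduce the tool-calling check by hand with a minimal function-calling request (a sketch; the get_time tool is a throwaway definition invented for this test):
```bash
# Probe tool-calling support on the configured endpoint
curl "$MUDABBIR_OPENAI_COMPATIBLE_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $MUDABBIR_OPENAI_COMPATIBLE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "'"$MUDABBIR_OPENAI_COMPATIBLE_MODEL"'",
    "messages": [{"role": "user", "content": "What time is it in UTC?"}],
    "tools": [{"type": "function", "function": {
      "name": "get_time",
      "description": "Get the current UTC time",
      "parameters": {"type": "object", "properties": {}}
    }}]
  }'
```
An endpoint with working tool support should respond with a tool_calls entry rather than plain text.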
The max_tokens setting (0 by default) controls the maximum output tokens per request. Set to 0 for no limit, which is recommended for most models.
Backend note: OpenAI-Compatible endpoints work with Mudabbir Native (recommended) and Open Interpreter. They only work with the Claude Agent SDK if the endpoint implements the Anthropic Messages API format (e.g., LiteLLM proxy). Most endpoints (OpenRouter, Together AI, Fireworks, vLLM) speak OpenAI format only.
Auto (Default)
When llm_provider is set to "auto" (the default), Mudabbir auto-detects the best available provider:
- Anthropic — if `MUDABBIR_ANTHROPIC_API_KEY` is set
- OpenAI — if `MUDABBIR_OPENAI_API_KEY` is set
- Ollama — fallback (no key needed)
This means Mudabbir works out of the box with Ollama if no API keys are configured.
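A quick illustration of the fallback (assumes Ollama is installed locally):
```bash
# With no cloud keys in the environment, auto-detection lands on Ollama
unset MUDABBIR_ANTHROPIC_API_KEY MUDABBIR_OPENAI_API_KEY
export MUDABBIR_LLM_PROVIDER="auto"   # already the default; shown for clarity
ollama pull llama3.2                  # make sure the default model is present
```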
Backend Compatibility Matrix
Not every provider works with every agent backend. Here’s the full compatibility matrix:
| Provider | Claude Agent SDK | Mudabbir Native | Open Interpreter |
|---|---|---|---|
| Anthropic | Yes | Yes | Yes |
| Gemini | No | Yes | Yes |
| Ollama | Yes | Yes | Yes |
| OpenAI | No | No | Yes |
| OpenAI-Compatible | Partial | Yes | Yes |
The backends also differ in feature support:
| Feature | Claude Agent SDK | Mudabbir Native | Open Interpreter |
|---|---|---|---|
| Smart Model Routing | Yes | Yes (Anthropic only) | No |
| Fast-Path (simple msgs) | Yes | No | No |
| Built-in Tools | 8 SDK tools | 6 native tools | Code execution |
| MCP Server Support | Yes | Yes | No |
| Streaming | Yes | Yes | Yes |
| Security Hooks | PreToolUse hooks | Regex + path jail | None |
| Tool Call Format | Anthropic Messages API | Auto-converted | LiteLLM |
API format matters. The Claude Agent SDK speaks the Anthropic Messages API format. It only works with endpoints that understand this format: Anthropic (native), Ollama v0.14+ (added Anthropic compat), and Anthropic-compatible proxies like LiteLLM. Endpoints that only speak OpenAI format (Gemini, OpenAI, OpenRouter, Together AI) will not work with Claude SDK — use Mudabbir Native instead.
Mudabbir Native is the most versatile backend for non-Anthropic providers. It auto-converts between Anthropic and OpenAI API formats, so it works with every provider except plain OpenAI (use OpenAI-Compatible with https://api.openai.com/v1 as a workaround).
Claude Agent SDK (Recommended)
The default and most capable backend. Uses the official Claude Agent SDK as a subprocess with built-in agentic tools.
Important: API format requirement. The Claude SDK uses the Anthropic Messages API format for all communication. This means it only works with endpoints that understand Anthropic’s message format — not endpoints that only speak OpenAI’s Chat Completions format.
How it connects to providers:
Provider credentials and endpoints are passed to the Claude CLI subprocess via environment variables (ANTHROPIC_BASE_URL, ANTHROPIC_API_KEY):
| Provider | Works? | How |
|---|---|---|
| Anthropic | Yes | Native — just set ANTHROPIC_API_KEY |
| Ollama | Yes | Ollama v0.14+ implements the Anthropic Messages API natively |
| Gemini | No | Gemini only speaks OpenAI format — use Mudabbir Native instead |
| OpenAI | No | OpenAI only speaks OpenAI format |
| OpenAI-Compatible | Partial | Only if the endpoint speaks Anthropic format (e.g., LiteLLM proxy) |
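As a rough illustration of the mechanism, selecting the Ollama provider amounts to the subprocess seeing something like the following (an assumption based on the variables named above, not the literal implementation):
```bash
# Approximately what Mudabbir exports for the Claude CLI subprocess
# when the Ollama provider is selected (illustrative values)
export ANTHROPIC_BASE_URL="http://localhost:11434"  # from MUDABBIR_OLLAMA_HOST
export ANTHROPIC_API_KEY="ollama"                   # placeholder; Ollama ignores it
```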
The Claude Agent SDK also supports enterprise cloud providers like Amazon Bedrock, Google Vertex AI, and Azure AI Foundry for Anthropic models.
Using Gemini or OpenAI? Switch to Mudabbir Native backend in Settings → General → Agent Backend. Mudabbir Native handles the API format conversion automatically.
Built-in SDK tools: Bash, Read, Write, Edit, Glob, Grep, WebSearch, WebFetch — these are provided by the Claude SDK itself and don’t need configuration.
Fast-path optimization: For messages classified as SIMPLE by the model router, Mudabbir bypasses the CLI subprocess entirely and calls the Anthropic API directly. This saves ~1.5–3 seconds of subprocess startup time per simple message. The fast-path requires an API key.
Security: Uses PreToolUse hooks to intercept and block dangerous Bash commands before execution.
Persistent client: The subprocess is reused across messages. It only reconnects when the model or tool configuration changes, avoiding repeated startup overhead.
MCP servers: Fully supported. Loads enabled MCP server configs and passes them to the SDK subprocess. Supports stdio, SSE, and HTTP transports.
Best for: Anthropic and Ollama users who want maximum capability, coding tasks, complex multi-step reasoning, and tool-heavy workflows.
Mudabbir Native
Custom orchestrator that uses the Anthropic SDK for reasoning and Open Interpreter for code execution. Provides a transparent, user-controlled agentic loop with integrated security layers. This is the most versatile backend — it supports every provider by auto-converting between API formats.
How it connects to providers:
Mudabbir Native creates different API clients depending on the provider, automatically handling format differences:
| Provider | Client | SDK | Notes |
|---|---|---|---|
| Anthropic | AsyncAnthropic | Anthropic SDK | Direct API, no base URL needed |
| Ollama | AsyncAnthropic | Anthropic SDK | base_url=ollama_host, api_key="ollama" |
| Gemini | AsyncOpenAI | OpenAI SDK | Auto-converts tool format |
| OpenAI-Compatible | AsyncOpenAI | OpenAI SDK | Auto-converts tool format |
| OpenAI | Not supported | — | Use OpenAI-Compatible instead |
Automatic format conversion: When using Gemini or OpenAI-Compatible providers, Mudabbir Native automatically:
- Converts Anthropic-format tool definitions to OpenAI function-calling format
- Converts message history between Anthropic and OpenAI formats
- Maps OpenAI's `finish_reason` to Anthropic's `stop_reason`
- Handles `tool_calls` response objects with `id`, `function.name`, and `function.arguments`
This means you get the same tool-use experience regardless of provider — the conversion is invisible.
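To make the conversion concrete, here is the same hypothetical read_file tool in both wire formats (field names follow the public Anthropic and OpenAI schemas; the tool itself is invented for illustration):
```bash
# Anthropic Messages API shape: JSON Schema lives under "input_schema"
anthropic_tool='{
  "name": "read_file",
  "description": "Read a file from disk",
  "input_schema": {"type": "object",
                   "properties": {"path": {"type": "string"}},
                   "required": ["path"]}
}'
# OpenAI function-calling shape: nested under "function",
# with the schema renamed to "parameters"
openai_tool='{
  "type": "function",
  "function": {"name": "read_file",
               "description": "Read a file from disk",
               "parameters": {"type": "object",
                              "properties": {"path": {"type": "string"}},
                              "required": ["path"]}}
}'
printf '%s\n%s\n' "$anthropic_tool" "$openai_tool"
```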
API timeouts:
| Provider | Timeout | Retries | Why |
|---|---|---|---|
| Anthropic | 90s | 2 | Reliable API, higher retries |
| Ollama | 120s | 1 | Local, may be slow on first load |
| Gemini | 180s | 1 | Longer for thinking models |
| OpenAI-Compatible | 180s | 1 | Longer for thinking models |
Built-in tools: shell, read_file, write_file, edit_file, list_dir, remember, recall, forget — plus any MCP tools configured.
Security layers:
- Dangerous command regex — blocks `rm -rf /`, `mkfs`, `dd`, etc.
- Sensitive path protection — SSH keys, AWS credentials, `.env` files
- File jail — restricts filesystem access to `~/.mudabbir/` by default
- Output redaction — hides API keys and passwords from responses
Smart routing: Supported for the Anthropic provider only; disabled for Ollama, Gemini, and OpenAI-Compatible.
MCP servers: Supported. Tools are registered with mcp_<server>__<tool> naming convention.
Best for: Users who want fine-grained control, transparency, and strong security guardrails. Works well with Gemini for a free, full-featured setup.
Open Interpreter
Lightweight wrapper around Open Interpreter. Delegates model management entirely to Open Interpreter’s own provider system, which uses LiteLLM under the hood.
How it connects to providers:
| Provider | Model Format | API Base | Notes |
|---|---|---|---|
| Anthropic | claude-sonnet-4-5-20250929 | (auto) | API key passed directly |
| Gemini | gemini-2.5-flash | (auto) | Via LiteLLM |
| Ollama | ollama/llama3.2 | Ollama host URL | Auto-prefixed with ollama/ |
| OpenAI | gpt-4o | (auto) | API key passed directly |
| OpenAI-Compatible | Model name | Your base URL | API key passed directly |
Open Interpreter uses LiteLLM internally, which supports 100+ LLM providers. Any model supported by LiteLLM can be used.
Ollama prefix: Ollama models are automatically prefixed with ollama/ (e.g., llama3.2 becomes ollama/llama3.2). This is required by LiteLLM to identify the provider.
Streaming: Runs in a thread pool with queue-based chunk streaming. Filters verbose console output and emits only user-facing messages and tool results.
Tool execution: Open Interpreter handles code execution directly — it generates and runs code in a sandboxed environment. Mudabbir wraps the output into standardized tool_use/tool_result events for the Activity panel.
Limitations:
- No MCP server support
- No smart model routing
- No `PreToolUse` security hooks
- Less granular tool control compared to other backends
Best for: Quick prototyping, users familiar with Open Interpreter, scenarios where code execution is the primary use case.
Smart Model Routing
When using Anthropic as the provider, Mudabbir can automatically select the model size based on message complexity. This optimizes cost by using cheaper models for simple queries.
How It Works
Each incoming message is classified into a tier:
| Tier | Description | Default Model | When |
|---|---|---|---|
| Simple | Greetings, yes/no, quick facts | claude-haiku-4-5-20251001 | Short messages matching simple patterns |
| Moderate | General tasks, search, moderate reasoning | claude-sonnet-4-5-20250929 | Default fallback |
| Complex | Coding, debugging, multi-step planning | claude-opus-4-6 | Long messages or 1+ complex keywords |
Classification Signals
- Simple patterns: “hi”, “hello”, “thanks”, “what is X?”, “remind me”, short messages under 30 characters
- Complex signals: “plan”, “architect”, “debug”, “implement”, “analyze”, “optimize”, “research”
- Threshold: 1 complex signal + message > 30 chars = COMPLEX; 2+ complex signals = COMPLEX; > 400 chars = COMPLEX
Configuration
```bash
export MUDABBIR_SMART_ROUTING_ENABLED=true
export MUDABBIR_MODEL_TIER_SIMPLE="claude-haiku-4-5-20251001"
export MUDABBIR_MODEL_TIER_MODERATE="claude-sonnet-4-5-20250929"
export MUDABBIR_MODEL_TIER_COMPLEX="claude-opus-4-6"
```
Smart routing is automatically skipped for Ollama, Gemini, and OpenAI-Compatible providers since they use a single configured model.
Configuration Reference
All LLM-Related Settings
| Setting | Env Variable | Default | Description |
|---|---|---|---|
| llm_provider | MUDABBIR_LLM_PROVIDER | auto | Provider selection |
| anthropic_api_key | MUDABBIR_ANTHROPIC_API_KEY | — | Anthropic API key |
| anthropic_model | MUDABBIR_ANTHROPIC_MODEL | claude-sonnet-4-5-20250929 | Anthropic model |
| google_api_key | MUDABBIR_GOOGLE_API_KEY | — | Google API key (Gemini + image gen) |
| gemini_model | MUDABBIR_GEMINI_MODEL | gemini-2.5-flash | Gemini model |
| openai_api_key | MUDABBIR_OPENAI_API_KEY | — | OpenAI API key |
| openai_model | MUDABBIR_OPENAI_MODEL | gpt-4o | OpenAI model |
| ollama_host | MUDABBIR_OLLAMA_HOST | http://localhost:11434 | Ollama server URL |
| ollama_model | MUDABBIR_OLLAMA_MODEL | llama3.2 | Ollama model |
| openai_compatible_base_url | MUDABBIR_OPENAI_COMPATIBLE_BASE_URL | — | Endpoint URL |
| openai_compatible_api_key | MUDABBIR_OPENAI_COMPATIBLE_API_KEY | — | Endpoint API key |
| openai_compatible_model | MUDABBIR_OPENAI_COMPATIBLE_MODEL | — | Endpoint model name |
| openai_compatible_max_tokens | MUDABBIR_OPENAI_COMPATIBLE_MAX_TOKENS | 0 | Max output tokens (0 = no limit) |
| smart_routing_enabled | MUDABBIR_SMART_ROUTING_ENABLED | true | Enable auto model selection |
| model_tier_simple | MUDABBIR_MODEL_TIER_SIMPLE | claude-haiku-4-5-20251001 | Model for simple tasks |
| model_tier_moderate | MUDABBIR_MODEL_TIER_MODERATE | claude-sonnet-4-5-20250929 | Model for moderate tasks |
| model_tier_complex | MUDABBIR_MODEL_TIER_COMPLEX | claude-opus-4-6 | Model for complex tasks |
Setting via config.json
All settings can also be set in ~/.mudabbir/config.json:
{ "llm_provider": "gemini", "gemini_model": "gemini-2.5-flash", "smart_routing_enabled": false}Setting via Dashboard
Open the web dashboard (default at http://localhost:8888) and go to Settings > General to select your provider and model from dropdown menus.
Security Notes
- API keys are encrypted at rest using Fernet + PBKDF2 in `~/.mudabbir/secrets.enc`. They are never stored in plaintext in `config.json`.
- Environment variables override config file values and are never written to disk.
- The Gemini provider reuses the `google_api_key` field, which is also used for image generation. One key covers both features.
Troubleshooting
Ollama: “Model not found”
```bash
ollama list              # See available models
ollama pull llama3.2     # Download a model
mudabbir --check-ollama  # Run connectivity check
```
Gemini: “API key invalid”
- Go to AI Studio and create/verify your key
- Make sure the key is saved in Settings > API Keys > Google API Key
- Ensure the Gemini API is enabled for your Google Cloud project
OpenAI-Compatible: “Cannot connect”
```bash
mudabbir --check-openai-compatible  # Run connectivity check
```
Verify:
- The base URL is correct and includes `/v1` if needed
- The server is running and accessible
- The model name matches what the endpoint expects
Wrong provider being used
If llm_provider is "auto", Mudabbir picks the first available key (Anthropic > OpenAI > Ollama). Set an explicit provider to override:
export MUDABBIR_LLM_PROVIDER="gemini"