Appendix E: Model Selection Guide
Different models have different strengths and price points. This guide helps you choose the best model for your needs.
📅 Data updated April 2026. Prices are based on OpenRouter public pricing (per million input tokens).
Choose by Scenario
Daily Chat & General Tasks
| Model | Provider | Context | Price | Recommendation |
|---|---|---|---|---|
| GLM-5.1 | Zhipu | 202K | $0.95/M | ⭐⭐⭐⭐⭐ |
| GPT-5.4 Nano | OpenAI | 400K | $0.2/M | ⭐⭐⭐⭐⭐ |
| Gemini 2.5 Flash | Google | 1M | $0.3/M | ⭐⭐⭐⭐ |
| DeepSeek V3.2 | DeepSeek | 128K | $0.26/M | ⭐⭐⭐⭐ |
| Qwen3.5 Plus | Alibaba Cloud | 1M | $0.26/M | ⭐⭐⭐⭐ |
| MiniMax M2.5 | MiniMax | 128K | $0.118/M | ⭐⭐⭐⭐ |
Programming & Development
| Model | Provider | Strength | Recommendation |
|---|---|---|---|
| Claude Sonnet 4.6 | Anthropic | Code understanding, refactoring, review | ⭐⭐⭐⭐⭐ |
| GPT-5.4 | OpenAI | All-rounder, long code generation | ⭐⭐⭐⭐⭐ |
| Qwen3 Coder Plus | Alibaba Cloud | Code generation, 1M context | ⭐⭐⭐⭐ |
| GLM-5.1 | Zhipu | Great value, Chinese code | ⭐⭐⭐⭐ |
| DeepSeek V3.2 | DeepSeek | Strong code reasoning, ultra-low price | ⭐⭐⭐⭐ |
Complex Reasoning
| Model | Provider | Strength | Recommendation |
|---|---|---|---|
| Claude Opus 4.6 | Anthropic | Best reasoning, 1M context | ⭐⭐⭐⭐⭐ |
| GPT-5.4 Pro | OpenAI | Best overall, 1M context | ⭐⭐⭐⭐⭐ |
| o3 | OpenAI | Math/logic reasoning specialist | ⭐⭐⭐⭐⭐ |
| Gemini 2.5 Pro | Google | Long-doc reasoning, 1M context | ⭐⭐⭐⭐ |
Local Deployment (Privacy First)
| Model | Parameters | Minimum Config | Recommendation |
|---|---|---|---|
| Qwen3-14B-Instruct | 14B | 16GB RAM | ⭐⭐⭐⭐ |
| Llama4-Scout-8B | 8B | 8GB RAM | ⭐⭐⭐⭐ |
| DeepSeek-R1-7B | 7B | 8GB RAM | ⭐⭐⭐ |
| Qwen3-72B-Instruct | 72B | 48GB VRAM | ⭐⭐⭐⭐⭐ |
Choose by Budget
Free Tier
| Model | Provider | Limitations |
|---|---|---|
| GLM-4-Flash | Zhipu | Generous free quota |
| Gemini 2.5 Flash | Google | Free API |
| HuggingFace Open Models | HF | $0.1/month free quota |
| Local Models | Self-hosted | Completely free |
Low Budget (< $10/month)
| Model | Provider | Approx. Monthly Cost |
|---|---|---|
| GLM-5.1 | Zhipu | ¥50-100 |
| GPT-5.4 Nano | OpenRouter | $2-4 |
| DeepSeek V3.2 | OpenRouter | $2-4 |
| Qwen3.5 Plus | Alibaba Cloud | $3-5 |
| MiniMax M2.5 | MiniMax | $1-3 |
Medium Budget ($10-50/month)
| Model | Provider | Approx. Monthly Cost |
|---|---|---|
| Claude Sonnet 4.6 | OpenRouter | $15-30 |
| GPT-5.4 | OpenRouter | $15-30 |
| Gemini 2.5 Pro | Google | $10-25 |
| GLM-5.1 | Zhipu | ¥100-300 |
High Budget (No Limits)
| Model | Provider | Approx. Monthly Cost |
|---|---|---|
| Claude Opus 4.6 | Anthropic | $50-100+ |
| GPT-5.4 Pro | OpenAI | $50-100+ |
Model Switching Tips
Use different models for different tasks to optimize costs:
# Daily chat — use the cheap one
/model zai/glm-5.1
# Programming tasks — use the strong one
/model anthropic/claude-sonnet-4.6
# Complex reasoning — use the strongest one
/model openai/gpt-5.4-pro
Pair with fallback_model for automatic downgrading:
# config.yaml
fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4.6  # Auto-switch when primary model is rate-limited
Context Window Comparison
| Model | Context Window | Approximately |
|---|---|---|
| GPT-5.4 / GPT-5.4 Pro | 1M | ~750 pages of text |
| Claude Opus/Sonnet 4.6 | 1M | ~750 pages of text |
| Gemini 2.5 Pro/Flash | 1M | ~750 pages of text |
| Qwen3.6 Plus | 1M | ~750 pages of text |
| GLM-5.1 | 202K | ~150 pages of text |
| DeepSeek V3.2 | 128K | ~100 pages of text |
| Local Qwen3 (Ollama) | 4K-32K | Depends on configuration |
Context Window ≠ Available Window
The agent's system prompt, tool definitions, and loaded skills all consume context before the conversation starts, so the window actually available for conversation is typically 20-40% smaller than the stated value.
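As a rough illustration of that 20-40% overhead, the Python sketch below estimates the usable range for a few models from the comparison table. The helper name `usable_context` is hypothetical, and the sketch assumes that "1M" means 1,000,000 tokens and "K" means 1,000 tokens; real overhead depends on which tools and skills are loaded.

```python
def usable_context(stated_tokens: int, overhead_percent: int) -> int:
    """Tokens left for conversation after agent overhead (integer math)."""
    if not 0 <= overhead_percent < 100:
        raise ValueError("overhead_percent must be in the range 0-99")
    return stated_tokens * (100 - overhead_percent) // 100


# Stated windows taken from the comparison table (assumed decimal units).
STATED_WINDOWS = {
    "GPT-5.4": 1_000_000,
    "GLM-5.1": 202_000,
    "DeepSeek V3.2": 128_000,
}

for name, tokens in STATED_WINDOWS.items():
    worst = usable_context(tokens, 40)  # 40% consumed by the agent
    best = usable_context(tokens, 20)   # 20% consumed by the agent
    print(f"{name}: roughly {worst:,} to {best:,} usable tokens")
```

For a 128K model this leaves somewhere between about 77K and 102K tokens, which is worth keeping in mind when choosing a model for long documents.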
April 2026 Model Pricing Overview
Prices are per million input tokens. Output prices are typically 2-5x the input price. Check each platform's official website for exact figures.
International Models
| Model | Provider | Context | Input Price | Notes |
|---|---|---|---|---|
| GPT-5.4 Pro | OpenAI | 1M | $30/M | Best overall |
| GPT-5.4 | OpenAI | 1M | $2.5/M | All-around flagship |
| GPT-5.4 Mini | OpenAI | 400K | $0.75/M | Performance/price balance |
| GPT-5.4 Nano | OpenAI | 400K | $0.2/M | Ultra-cheap |
| o3 | OpenAI | 200K | $2/M | Reasoning specialist |
| o4-mini | OpenAI | 200K | $1.1/M | Lightweight reasoning |
| Claude Opus 4.6 | Anthropic | 1M | $5/M | Best reasoning |
| Claude Opus 4.6 Fast | Anthropic | 1M | $3/M | Fast reasoning |
| Claude Sonnet 4.6 | Anthropic | 1M | $3/M | Programming powerhouse |
| Gemini 2.5 Pro | Google | 1M | $1.25/M | Long-text expert |
| Gemini 2.5 Flash | Google | 1M | $0.3/M | Fast and cheap |
Chinese Domestic Models
| Model | Provider | Context | Input Price | Notes |
|---|---|---|---|---|
| GLM-5.1 | Zhipu | 202K | $0.95/M | Strong overall |
| GLM-5 Turbo | Zhipu | 128K | $1.2/M | Fast response |
| DeepSeek V3.2 | DeepSeek | 128K | $0.26/M | Extreme value |
| DeepSeek V3.2 Speciale | DeepSeek | 128K | $0.4/M | Specially optimized |
| Qwen3.6 Plus | Alibaba Cloud | 1M | $0.325/M | 1M ultra-long context |
| Qwen3.5 Plus | Alibaba Cloud | 1M | $0.26/M | Great value |
| Qwen3 Max | Alibaba Cloud | 32K | $0.78/M | Fine-grained tasks |
| Qwen3 Coder Plus | Alibaba Cloud | 1M | $0.65/M | Code specialist |
| Kimi K2.5 | Moonshot | 262K | $0.38/M | Long-text understanding |
| Kimi K2 Thinking | Moonshot | 262K | $0.6/M | Deep thinking |
| MiniMax M2.7 | MiniMax | 128K | $0.3/M | General capability |
| MiniMax M2.5 | MiniMax | 128K | $0.118/M | Ultra-cheap |
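To turn these per-million prices into a monthly figure, a small estimator can help. The Python sketch below is illustrative only: the function name `monthly_cost`, the default output multiplier of 3 (picked from the 2-5x range stated above), and the token volumes in the usage loop are all assumptions, not measured values. Input prices are copied from the tables above.

```python
# USD per million input tokens, copied from the pricing tables.
INPUT_PRICE_PER_M = {
    "GPT-5.4": 2.5,
    "Claude Sonnet 4.6": 3.0,
    "DeepSeek V3.2": 0.26,
    "MiniMax M2.5": 0.118,
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 output_multiplier: float = 3.0) -> float:
    """Estimate monthly spend in USD.

    output_multiplier is an assumed ratio of output price to input
    price; check each platform's pricing page for the real figure.
    """
    price_in = INPUT_PRICE_PER_M[model]
    price_out = price_in * output_multiplier
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


# Hypothetical usage: 4M input tokens and 1M output tokens per month.
for model in INPUT_PRICE_PER_M:
    print(f"{model}: ${monthly_cost(model, 4_000_000, 1_000_000):.2f}/month")
```

Because output tokens are priced higher, workloads that generate long responses (for example, code generation) can cost noticeably more than the input price alone suggests.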
Chinese Domestic Model Configuration Quick Reference
| Model | Provider | Environment Variable | Get Key |
|---|---|---|---|
| GLM-5.1 | Zhipu | GLM_API_KEY | open.bigmodel.cn |
| Qwen | Alibaba Cloud | DASHSCOPE_API_KEY | modelstudio.console.alibabacloud.com |
| DeepSeek | DeepSeek | DEEPSEEK_API_KEY | platform.deepseek.com |
| Kimi | Moonshot | KIMI_API_KEY | moonshot.ai |
| MiniMax | MiniMax | MINIMAX_API_KEY | minimax.io |
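Before switching providers, it can be useful to confirm which of these keys are actually set in your shell. The following Python sketch checks the environment variable names from the table; the helper `missing_keys` is a hypothetical name for illustration and is not part of any agent tooling.

```python
import os

# Environment variable names from the quick-reference table.
KEY_VARS = {
    "GLM-5.1": "GLM_API_KEY",
    "Qwen": "DASHSCOPE_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Kimi": "KIMI_API_KEY",
    "MiniMax": "MINIMAX_API_KEY",
}


def missing_keys(env=None):
    """Return the variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [var for var in KEY_VARS.values() if not env.get(var)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All provider keys are set.")
```

You only need the key for the provider you plan to use, so a non-empty "Not set" list is not an error by itself.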
Configuration Example
# config.yaml: using Zhipu GLM-5.1
provider:
  name: zai
  api_key: ${GLM_API_KEY}
  model: glm-5.1
# Or use international models via OpenRouter relay
provider:
  name: openrouter
  api_key: ${OPENROUTER_API_KEY}
  model: anthropic/claude-sonnet-4.6
Chinese Domestic Model Pros & Cons
| Model | Strengths | Weaknesses | Best For |
|---|---|---|---|
| GLM-5.1 | Strong overall, good Chinese, good tool calling | Slightly weaker English | Daily chat, Chinese content |
| Qwen3.6 Plus | 1M ultra-long context, multilingual | Strict API rate limits | Long documents, Q&A |
| DeepSeek V3.2 | Strong code reasoning, ultra-low price | Queue during peak hours | Programming assistance |
| Kimi K2.5 | Excellent long-text understanding | Occasional hallucinations | Long document analysis |
| MiniMax M2.5 | Cheapest overall | Slightly weaker overall capability | High-volume calls |
⚠️ Most Chinese domestic model platforms require a Chinese phone number for registration. Some also accept email registration if you're overseas.