
Appendix E: Model Selection Guide

Different models have different strengths and price points. This guide helps you choose the best model for your needs.

📅 Data updated April 2026. Prices are based on OpenRouter public pricing and are quoted per million input tokens.

Choose by Scenario

Daily Chat & General Tasks

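| Model | Provider | Context | Input Price | Recommendation |
| --- | --- | --- | --- | --- |
| GLM-5.1 | Zhipu | 202K | $0.95/M | ⭐⭐⭐⭐⭐ |
| GPT-5.4 Nano | OpenAI | 400K | $0.2/M | ⭐⭐⭐⭐⭐ |
| Gemini 2.5 Flash | Google | 1M | $0.3/M | ⭐⭐⭐⭐ |
| DeepSeek V3.2 | DeepSeek | 128K | $0.26/M | ⭐⭐⭐⭐ |
| Qwen3.5 Plus | Alibaba Cloud | 1M | $0.26/M | ⭐⭐⭐⭐ |
| MiniMax M2.5 | MiniMax | 128K | $0.118/M | ⭐⭐⭐⭐ |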

Programming & Development

| Model | Provider | Strength | Recommendation |
| --- | --- | --- | --- |
| Claude Sonnet 4.6 | Anthropic | Code understanding, refactoring, review | ⭐⭐⭐⭐⭐ |
| GPT-5.4 | OpenAI | All-rounder, long code generation | ⭐⭐⭐⭐⭐ |
| Qwen3 Coder Plus | Alibaba Cloud | Code generation, 1M context | ⭐⭐⭐⭐ |
| GLM-5.1 | Zhipu | Great value, Chinese code | ⭐⭐⭐⭐ |
| DeepSeek V3.2 | DeepSeek | Strong code reasoning, ultra-low price | ⭐⭐⭐⭐ |

Complex Reasoning

| Model | Provider | Strength | Recommendation |
| --- | --- | --- | --- |
| Claude Opus 4.6 | Anthropic | Best reasoning, 1M context | ⭐⭐⭐⭐⭐ |
| GPT-5.4 Pro | OpenAI | Best overall, 1M context | ⭐⭐⭐⭐⭐ |
| o3 | OpenAI | Math/logic reasoning specialist | ⭐⭐⭐⭐⭐ |
| Gemini 2.5 Pro | Google | Long-doc reasoning, 1M context | ⭐⭐⭐⭐ |

Local Deployment (Privacy First)

| Model | Parameters | Minimum Config | Recommendation |
| --- | --- | --- | --- |
| Qwen3-14B-Instruct | 14B | 16GB RAM | ⭐⭐⭐⭐ |
| Llama4-Scout-8B | 8B | 8GB RAM | ⭐⭐⭐⭐ |
| DeepSeek-R1-7B | 7B | 8GB RAM | ⭐⭐⭐ |
| Qwen3-72B-Instruct | 72B | 48GB VRAM | ⭐⭐⭐⭐⭐ |

Choose by Budget

Free Tier

| Model | Provider | Limitations |
| --- | --- | --- |
| GLM-4-Flash | Zhipu | Generous free quota |
| Gemini 2.5 Flash | Google | Free API |
| HuggingFace Open Models | HF | $0.1/month free quota |
| Local Models | Self-hosted | Completely free |

Low Budget (< $10/month)

| Model | Provider | Approx. Monthly Cost |
| --- | --- | --- |
| GLM-5.1 | Zhipu | ¥50-100 |
| GPT-5.4 Nano | OpenRouter | $2-4 |
| DeepSeek V3.2 | OpenRouter | $2-4 |
| Qwen3.5 Plus | Alibaba Cloud | $3-5 |
| MiniMax M2.5 | MiniMax | $1-3 |

Medium Budget ($10-50/month)

| Model | Provider | Approx. Monthly Cost |
| --- | --- | --- |
| Claude Sonnet 4.6 | OpenRouter | $15-30 |
| GPT-5.4 | OpenRouter | $15-30 |
| Gemini 2.5 Pro | Google | $10-25 |
| GLM-5.1 | Zhipu | ¥100-300 |

High Budget (No Limits)

| Model | Provider | Approx. Monthly Cost |
| --- | --- | --- |
| Claude Opus 4.6 | Anthropic | $50-100+ |
| GPT-5.4 Pro | OpenAI | $50-100+ |

Model Switching Tips

Use different models for different tasks to optimize costs:

```bash
# Daily chat: use the cheap one
/model zai/glm-5.1

# Programming tasks: use the strong one
/model anthropic/claude-sonnet-4.6

# Complex reasoning: use the strongest one
/model openai/gpt-5.4-pro
```
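The task-to-model mapping above can also be written down as a small lookup table, for example in a personal helper script. The sketch below is only an illustration: the model IDs are copied from the `/model` commands above, while the task labels and the `model_command` helper are hypothetical names, not part of the Agent.

```python
# Model IDs come from the /model commands in this guide.
# The task labels ("chat", "coding", "reasoning") are hypothetical.
MODEL_BY_TASK = {
    "chat": "zai/glm-5.1",
    "coding": "anthropic/claude-sonnet-4.6",
    "reasoning": "openai/gpt-5.4-pro",
}


def model_command(task, default_task="chat"):
    """Return the /model command for a task, falling back to the cheap default."""
    model = MODEL_BY_TASK.get(task, MODEL_BY_TASK[default_task])
    return f"/model {model}"


print(model_command("coding"))
print(model_command("translation"))  # unknown task, so the default is used
```

Defaulting unknown tasks to the cheapest model keeps an unclassified request from silently landing on the most expensive one.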

Pair with fallback_model for automatic downgrading:

```yaml
# config.yaml
fallback_model:
  provider: openrouter
  model: anthropic/claude-sonnet-4.6  # Auto-switch when primary model is rate-limited
```
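Conceptually, a fallback of this kind amounts to "try the primary model, and if it is rate-limited, retry on the fallback". The sketch below shows that idea in isolation. It is not the Agent's actual implementation: `RateLimitError`, `call_with_fallback`, and the two stand-in model functions are all hypothetical.

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def call_with_fallback(prompt, primary, fallback):
    """Call the primary model; on a rate limit, retry once on the fallback."""
    try:
        return primary(prompt)
    except RateLimitError:
        return fallback(prompt)


def rate_limited_primary(prompt):
    # Simulates a primary model that is currently rate-limited.
    raise RateLimitError("primary model is rate-limited")


def sonnet_fallback(prompt):
    # Simulates the fallback model named in the config above.
    return f"[anthropic/claude-sonnet-4.6] {prompt}"


result = call_with_fallback("hello", rate_limited_primary, sonnet_fallback)
print(result)
```

Only the rate-limit error triggers the switch here. Other failures propagate, so a genuine bug in the request is not hidden by the fallback.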

Context Window Comparison

| Model | Context Window | Approx. Equivalent |
| --- | --- | --- |
| GPT-5.4 / GPT-5.4 Pro | 1M | ~750 pages of text |
| Claude Opus/Sonnet 4.6 | 1M | ~750 pages of text |
| Gemini 2.5 Pro/Flash | 1M | ~750 pages of text |
| Qwen3.6 Plus | 1M | ~750 pages of text |
| GLM-5.1 | 202K | ~150 pages of text |
| DeepSeek V3.2 | 128K | ~100 pages of text |
| Local Qwen3 (Ollama) | 4K-32K | Depends on configuration |
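The page figures in the table follow from a single ratio: 1M tokens for roughly 750 pages. A minimal sketch of that conversion, assuming the ratio holds across models (real token counts vary by language and tokenizer), with `approx_pages` as a hypothetical helper name:

```python
# Ratio taken from the table: a 1M-token window is listed as ~750 pages.
PAGES_PER_MILLION_TOKENS = 750


def approx_pages(context_tokens):
    """Convert a context window in tokens to an approximate page count."""
    return context_tokens * PAGES_PER_MILLION_TOKENS / 1_000_000


for name, tokens in [("GLM-5.1", 202_000), ("DeepSeek V3.2", 128_000)]:
    print(f"{name}: about {approx_pages(tokens):.0f} pages")
```

The results land close to the rounded values in the table (~150 and ~100 pages).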

Context Window ≠ Available Window

The Agent's system prompt, tool definitions, and loaded skills all consume a significant share of the context. The window actually available for conversation is typically 20-40% smaller than the stated value.
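To make the 20-40% figure concrete, the sketch below computes the usable range for a stated window. The percentages are the ones quoted above. `usable_context` is a hypothetical helper, and the "K" values are read as thousands of tokens.

```python
def usable_context(stated_tokens, overhead_low_pct=20, overhead_high_pct=40):
    """Return (min_usable, max_usable) tokens after Agent overhead.

    Integer percentages keep the arithmetic exact.
    """
    min_usable = stated_tokens * (100 - overhead_high_pct) // 100
    max_usable = stated_tokens * (100 - overhead_low_pct) // 100
    return min_usable, max_usable


for name, stated in [("GLM-5.1", 202_000), ("DeepSeek V3.2", 128_000)]:
    low, high = usable_context(stated)
    print(f"{name}: {low:,} to {high:,} usable tokens of {stated:,}")
```

For a 128K model this leaves roughly 77K to 102K tokens for the conversation itself, which matters when choosing a model for long documents.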


April 2026 Model Pricing Overview

Prices are per million input tokens. Output prices are typically 2-5x the input price. Check each platform's official website for exact figures.
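As a rough worked example of this pricing rule, the sketch below estimates a monthly bill from token volumes. The 3x output multiplier is an arbitrary value inside the 2-5x range quoted above, not a published price, and `estimate_monthly_cost` is a hypothetical helper.

```python
def estimate_monthly_cost(input_tokens, output_tokens, input_price_per_m,
                          output_multiplier=3.0):
    """Rough monthly cost in USD.

    input_price_per_m: USD per million input tokens, as listed in the tables.
    output_multiplier: assumed ratio of output price to input price
    (this guide quotes 2-5x; 3.0 is only an illustrative midpoint).
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * input_price_per_m * output_multiplier
    return input_cost + output_cost


# Hypothetical usage: 5M input and 1M output tokens on a $0.26/M model.
cost = estimate_monthly_cost(5_000_000, 1_000_000, 0.26)
print(f"${cost:.2f}")  # prints $2.08
```

Because the real output price differs by model, treat the result as an order-of-magnitude estimate and confirm against the provider's price list.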

International Models

| Model | Provider | Context | Input Price | Notes |
| --- | --- | --- | --- | --- |
| GPT-5.4 Pro | OpenAI | 1M | $30/M | Best overall |
| GPT-5.4 | OpenAI | 1M | $2.5/M | All-around flagship |
| GPT-5.4 Mini | OpenAI | 400K | $0.75/M | Performance/price balance |
| GPT-5.4 Nano | OpenAI | 400K | $0.2/M | Ultra-cheap |
| o3 | OpenAI | 200K | $2/M | Reasoning specialist |
| o4-mini | OpenAI | 200K | $1.1/M | Lightweight reasoning |
| Claude Opus 4.6 | Anthropic | 1M | $5/M | Best reasoning |
| Claude Opus 4.6 Fast | Anthropic | 1M | $3/M | Fast reasoning |
| Claude Sonnet 4.6 | Anthropic | 1M | $3/M | Programming powerhouse |
| Gemini 2.5 Pro | Google | 1M | $1.25/M | Long-text expert |
| Gemini 2.5 Flash | Google | 1M | $0.3/M | Fast and cheap |

Chinese Domestic Models

| Model | Provider | Context | Input Price | Notes |
| --- | --- | --- | --- | --- |
| GLM-5.1 | Zhipu | 202K | $0.95/M | Strong overall |
| GLM-5 Turbo | Zhipu | 128K | $1.2/M | Fast response |
| DeepSeek V3.2 | DeepSeek | 128K | $0.26/M | Extreme value |
| DeepSeek V3.2 Speciale | DeepSeek | 128K | $0.4/M | Specially optimized |
| Qwen3.6 Plus | Alibaba Cloud | 1M | $0.325/M | 1M ultra-long context |
| Qwen3.5 Plus | Alibaba Cloud | 1M | $0.26/M | Great value |
| Qwen3 Max | Alibaba Cloud | 32K | $0.78/M | Fine-grained tasks |
| Qwen3 Coder Plus | Alibaba Cloud | 1M | $0.65/M | Code specialist |
| Kimi K2.5 | Moonshot | 262K | $0.38/M | Long-text understanding |
| Kimi K2 Thinking | Moonshot | 262K | $0.6/M | Deep thinking |
| M2.7 | MiniMax | 128K | $0.3/M | General capability |
| M2.5 | MiniMax | 128K | $0.118/M | Ultra-cheap |
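One way to use a pricing table like this is to filter by the context window a task needs and then pick the cheapest match. The sketch below does that for a handful of rows copied from the table. `cheapest_with_context` is a hypothetical helper, and "K"/"M" are read as thousands and millions of tokens.

```python
# (model, context window in tokens, USD per million input tokens),
# copied from a few rows of the table above.
MODELS = [
    ("GLM-5.1", 202_000, 0.95),
    ("DeepSeek V3.2", 128_000, 0.26),
    ("Qwen3.5 Plus", 1_000_000, 0.26),
    ("Kimi K2.5", 262_000, 0.38),
    ("MiniMax M2.5", 128_000, 0.118),
]


def cheapest_with_context(models, min_context):
    """Return the cheapest (name, context, price) row with enough context, or None."""
    eligible = [row for row in models if row[1] >= min_context]
    if not eligible:
        return None
    return min(eligible, key=lambda row: row[2])


print(cheapest_with_context(MODELS, 100_000))
print(cheapest_with_context(MODELS, 200_000))
```

Price is only one axis. The Notes column and the pros and cons in this guide should still decide between models that pass the filter.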

Chinese Domestic Model Configuration Quick Reference

| Model | Provider | Environment Variable | Get Key |
| --- | --- | --- | --- |
| GLM-5.1 | Zhipu | GLM_API_KEY | open.bigmodel.cn |
| Qwen | Alibaba Cloud | DASHSCOPE_API_KEY | modelstudio.console.alibabacloud.com |
| DeepSeek | DeepSeek | DEEPSEEK_API_KEY | platform.deepseek.com |
| Kimi | Moonshot | KIMI_API_KEY | moonshot.ai |
| MiniMax | MiniMax | MINIMAX_API_KEY | minimax.io |
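Before switching providers, it can help to check which of these variables are actually set in your shell. The sketch below is a standalone check, not an Agent feature. The variable names come from the table above, and `missing_keys` is a hypothetical helper.

```python
import os

# Environment variable names as listed in the quick-reference table.
KEY_VARS = {
    "GLM-5.1": "GLM_API_KEY",
    "Qwen": "DASHSCOPE_API_KEY",
    "DeepSeek": "DEEPSEEK_API_KEY",
    "Kimi": "KIMI_API_KEY",
    "MiniMax": "MINIMAX_API_KEY",
}


def missing_keys(environ=os.environ):
    """Return the variable names that are unset or empty in the given mapping."""
    return [var for var in KEY_VARS.values() if not environ.get(var)]


for var in missing_keys():
    print(f"not set: {var}")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to test without touching the real environment.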

Configuration Example

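```yaml
# config.yaml: using Zhipu GLM-5.1
provider:
  name: zai
  api_key: ${GLM_API_KEY}
  model: glm-5.1

# Alternative: use international models via the OpenRouter relay.
# YAML mapping keys must be unique, so use this block instead of the
# one above rather than alongside it.
# provider:
#   name: openrouter
#   api_key: ${OPENROUTER_API_KEY}
#   model: anthropic/claude-sonnet-4.6
```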
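The `${GLM_API_KEY}` placeholder in the configuration above stands for a value read from the environment. How the Agent resolves it is not described here, so the sketch below only illustrates the general idea using Python's standard `string.Template`, which happens to use the same `${NAME}` syntax. `expand_env` is a hypothetical helper.

```python
from string import Template


def expand_env(text, environ):
    """Replace ${NAME} placeholders with values from the given mapping.

    Raises KeyError if a placeholder has no matching entry, so a missing
    API key fails loudly instead of being sent as an empty string.
    """
    return Template(text).substitute(environ)


line = "api_key: ${GLM_API_KEY}"
print(expand_env(line, {"GLM_API_KEY": "sk-example"}))
```

Keeping keys in environment variables rather than in `config.yaml` means the file can be shared or committed without exposing credentials.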

Chinese Domestic Model Pros & Cons

| Model | Strengths | Weaknesses | Best For |
| --- | --- | --- | --- |
| GLM-5.1 | Strong overall, good Chinese, good tool calling | Slightly weaker English | Daily chat, Chinese content |
| Qwen3.6 Plus | 1M ultra-long context, multilingual | Strict API rate limits | Long documents, Q&A |
| DeepSeek V3.2 | Strong code reasoning, ultra-low price | Queue during peak hours | Programming assistance |
| Kimi K2.5 | Excellent long-text understanding | Occasional hallucinations | Long document analysis |
| MiniMax M2.5 | Cheapest overall | Slightly weaker overall capability | High-volume calls |

⚠️ Platforms for Chinese domestic models typically require a Chinese phone number for registration. Some support email registration for users outside China.

Released under CC BY-NC-SA 4.0 | GitHub