Chapter 4 Sub-agents & Parallel Orchestration
One person can't keep up? Let Cuttlefish clone itself — handle multiple independent tasks simultaneously and double the speed.
Two Parallelism Mechanisms
Hermes Agent provides two parallel processing methods:
| Mechanism | Tool | Best For |
|---|---|---|
| Sub-agents | delegate_task | Complex tasks requiring reasoning |
| Script batching | execute_code | Mechanical multi-step operations |
Which to Choose?
Need reasoning/decision-making? ──→ delegate_task (sub-agent)
Pure mechanical operations? ──→ execute_code (script)
3+ tool calls with logic? ──→ execute_code
Single tool call? ──→ Call the tool directly
Need user interaction? ──→ Neither; handle it yourself
delegate_task Sub-agents
How It Works
┌────────────────────────────────────┐
│ Main Agent (your conversation)     │
│                                    │
│ delegate_task ──→ Sub-agent 1      │ → Research topic A
│ delegate_task ──→ Sub-agent 2      │ → Research topic B
│ delegate_task ──→ Sub-agent 3      │ → Write code C
│                                    │
│ ← Aggregate all sub-agent results  │
│ ← Integrate and reply to you       │
└────────────────────────────────────┘
Each sub-agent has:
- Its own conversation context
- Its own terminal session
- Its own toolset
- A single return value: only its final summary goes back to the main Agent
Two Modes
Single Task Mode:
delegate_task(
    goal="Analyze code statistics of the hermes-agent repository",
    context="Repository path: /path/to/hermes-agent",
    toolsets=["terminal", "file"]
)
Batch Mode (Parallel):
delegate_task(
    tasks=[
        {"goal": "Research Python asyncio", "context": "..."},
        {"goal": "Research Rust tokio", "context": "..."},
        {"goal": "Research Go goroutines", "context": "..."}
    ]
)
Up to 3 sub-agents can run simultaneously.
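Because at most 3 sub-agents run at once, a longer task list has to be split into groups before it is delegated. The following is a minimal sketch of that chunking in plain Python. `chunk_tasks` and `MAX_PARALLEL` are hypothetical names, the fourth topic is added only for illustration, and the sketch assumes that a single `delegate_task` call should not receive more than 3 entries in `tasks`.

```python
MAX_PARALLEL = 3  # parallel limit stated in this chapter

def chunk_tasks(tasks, size=MAX_PARALLEL):
    """Split a task list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]

# Four topics: one more than fits into a single parallel batch.
topics = ["Python asyncio", "Rust tokio", "Go goroutines", "Kotlin coroutines"]
tasks = [{"goal": f"Research {t}", "context": "..."} for t in topics]

batches = chunk_tasks(tasks)
# Each batch would then be passed to one delegate_task(tasks=batch) call.
```

Here `batches` holds one group of three tasks and one group with the remaining task, so the work would take two delegation rounds.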
Two Key Parameters
| Parameter | Description |
|---|---|
| goal | Task objective (specific, self-contained) |
| context | Background info (file paths, error messages, constraints) |
WARNING
Sub-agents don't have memory of your conversation! You must pass all necessary information through context.
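One way to make sure nothing is left out is to assemble the `context` string from explicit lists rather than writing it from memory. The helper below is a hypothetical sketch (`build_context` is not part of Hermes Agent). It only renders plain text, on the assumption that `context` accepts a free-form string as in the examples above.

```python
def build_context(paths=(), errors=(), constraints=()):
    """Render background facts as plain text for the `context` parameter."""
    sections = [
        ("File paths", paths),
        ("Error messages", errors),
        ("Constraints", constraints),
    ]
    lines = []
    for title, items in sections:
        if items:  # leave out sections that have nothing to say
            lines.append(f"{title}:")
            lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)

context = build_context(
    paths=["/path/to/hermes-agent"],
    constraints=["Read-only: do not modify files"],
)
```

The resulting string lists the repository path and the constraint under their own headings, which is everything a sub-agent without conversation memory would need for this task.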
execute_code Script Batching
How It Works
execute_code runs scripts in a Python sandbox and can call Hermes tools:
from hermes_tools import terminal, read_file, write_file, search_files
# Batch operations
files = search_files(pattern="*.py", target="files", path="./src")
for f in files["matches"][:10]:
    content = read_file(path=f)
    # Process...
Features
| Feature | Description |
|---|---|
| Built-in tools | terminal, read_file, write_file, search_files, patch, etc. |
| Built-in functions | json_parse, shell_quote, retry |
| Limits | 5-minute timeout, 50KB stdout cap, max 50 tool calls |
Use Cases
- Need to process N files in a loop
- Need conditional branching (if/else)
- Need data processing between tool calls
- Need retry logic
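The retry case can be sketched in plain Python. The signature of the built-in `retry` function is not shown in this chapter, so the example defines its own hypothetical `with_retry` helper and uses a stand-in function in place of a real tool call.

```python
import time

def with_retry(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on an exception, wait and try again, up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, then 1s, then 2s, ...

# Stand-in for a flaky tool call: fails twice, then succeeds.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

# Pass a no-op sleep so the demo does not actually wait.
result = with_retry(flaky, sleep=lambda seconds: None)
```

If retried tool calls count toward the tool call cap, which seems likely but is not confirmed here, a small `attempts` value keeps a failing loop from using up the budget.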
Hands-On Examples
Example 1: Parallel Research
You: Help me compare and analyze three AI Agent frameworks:
Hermes Agent, OpenAI Agents SDK, and LangGraph.
Compare them across three dimensions: architecture design, tool system, and deployment.
The Agent creates 3 sub-agents for parallel research, then generates a comparison table after aggregation.
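The batch the Agent builds for this request might look roughly like the list produced below. This is an illustrative sketch only: `build_research_tasks` is a hypothetical helper, and the exact wording the Agent chooses for `goal` and `context` is not documented here.

```python
frameworks = ["Hermes Agent", "OpenAI Agents SDK", "LangGraph"]
dimensions = ["architecture design", "tool system", "deployment"]

def build_research_tasks(frameworks, dimensions):
    """Return one self-contained task dict per framework."""
    dims = ", ".join(dimensions)
    return [
        {
            "goal": f"Research {name} and summarize its {dims}",
            # Sub-agents cannot see the conversation, so the output format
            # they should follow is spelled out in the context.
            "context": f"Report one short section per dimension: {dims}. "
                       "The results will be merged into a comparison table.",
        }
        for name in frameworks
    ]

tasks = build_research_tasks(frameworks, dimensions)
# The list has exactly 3 entries, so it fits in one delegate_task(tasks=tasks) call.
```

Giving every sub-agent the same dimensions in the same order makes the three summaries easier to merge into one table.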
Example 2: Batch File Processing
You: Translate the docstrings in all Python files under src/ to English
The Agent uses execute_code to process files in a loop, avoiding the overhead of calling each one individually.
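The per-file step of such a loop needs to locate the docstrings before anything can be translated. A minimal sketch using Python's standard `ast` module is shown below. `find_docstrings` is a hypothetical helper, the sample source is invented for the demo, and the translation itself is left out.

```python
import ast

def find_docstrings(source):
    """Return (name, docstring) pairs for the module and its functions and classes."""
    tree = ast.parse(source)
    documented = (ast.Module, ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, documented):
            doc = ast.get_docstring(node)
            if doc is not None:
                # Module nodes have no `name` attribute, so use a placeholder.
                found.append((getattr(node, "name", "<module>"), doc))
    return found

sample = '''"""Modulo de ejemplo."""

def saludar():
    """Dice hola."""
    return "hola"
'''

docstrings = find_docstrings(sample)
```

In a real run, `source` would come from reading each matched file, and only the collected docstrings would need to be translated and written back.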
Example 3: Code Review + Testing in Parallel
You: Review the code changes in this PR while running the test suite
┌─ Sub-agent 1: Code review (loads github-code-review skill)
│
├─ Sub-agent 2: Run tests (executes pytest in terminal)
│
└─ Main agent: Consolidate both reports
Example 4: Multi-Repository Sync
You: Check my 5 active projects. For each project:
1. Check for uncommitted changes
2. Check if behind remote
3. Check CI status
The Agent uses execute_code to loop through the 5 repositories and collect all the information in a single run.
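The first two checks can be sketched with ordinary git commands. The example uses Python's `subprocess` as a stand-in for the `terminal` tool, whose call signature is not shown in this chapter. The helper names are hypothetical, and the CI status check is omitted because it depends on the hosting service.

```python
import subprocess

def git(repo, *args):
    """Run a git command inside `repo` and return its stdout."""
    done = subprocess.run(
        ["git", "-C", repo, *args],
        capture_output=True, text=True, check=True,
    )
    return done.stdout

def summarize(porcelain, behind_count):
    """Turn raw git output into the two facts the request asks for."""
    changed = [line for line in porcelain.splitlines() if line.strip()]
    return {
        "uncommitted_files": len(changed),
        "commits_behind": int(behind_count.strip() or 0),
    }

def check_repo(repo):
    """Collect uncommitted-change and behind-remote counts for one repository."""
    git(repo, "fetch", "--quiet")  # refresh remote-tracking refs first
    return summarize(
        git(repo, "status", "--porcelain"),
        git(repo, "rev-list", "--count", "HEAD..@{upstream}"),
    )

# Demo on canned output, so the sketch runs without a real repository.
report = summarize(" M src/app.py\n?? notes.txt\n", "2\n")
```

Looping `check_repo` over the five project paths would cost three commands per repository, well within the limits of a single script run.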
Performance & Limitations
delegate_task Limits
| Limit | Value |
|---|---|
| Max parallel | 3 |
| Max tool call rounds | 50 (configurable) |
| Context isolation | Fully isolated |
| User interaction | Not supported |
execute_code Limits
| Limit | Value |
|---|---|
| Execution timeout | 5 minutes |
| stdout cap | 50KB |
| Tool call cap | 50 calls |
| Foreground only | No background execution |
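The tool call cap determines how much work fits into one script. As a rough planning aid, the arithmetic can be sketched as follows. `runs_needed` is a hypothetical helper, and the assumption of one overhead call per run (for example an initial file search) is illustrative.

```python
import math

TOOL_CALL_CAP = 50  # per execute_code run, from the table above

def runs_needed(items, calls_per_item, cap=TOOL_CALL_CAP, overhead_calls=1):
    """Estimate how many execute_code runs a loop over `items` requires."""
    per_run = (cap - overhead_calls) // calls_per_item  # items that fit in one run
    if per_run < 1:
        raise ValueError("one item alone exceeds the tool call cap")
    return math.ceil(items / per_run)

# 120 files, one read and one write per file, plus one search call per run:
# (50 - 1) // 2 = 24 files per run, and ceil(120 / 24) = 5 runs.
runs = runs_needed(items=120, calls_per_item=2)
```

The estimate ignores the 5-minute timeout and the 50KB stdout cap, which may force smaller batches for slow or verbose operations.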
When Not to Use Parallelism
- Single tool call → Call directly
- Tasks have sequential dependencies → Execute serially
- Need user feedback → Handle yourself
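Taken together, the selection rules in this chapter can be written as a small decision function. This is a simplified sketch: `choose_mechanism` is a hypothetical helper, and the order in which the rules are checked is an assumption, since the chapter does not say which rule wins when several apply.

```python
def choose_mechanism(needs_user_input, needs_reasoning, tool_calls, sequential=False):
    """Map a task description to one of the outcomes named in this chapter."""
    if needs_user_input:
        return "handle it yourself"      # neither mechanism supports interaction
    if tool_calls <= 1:
        return "call the tool directly"  # no orchestration needed
    if sequential:
        return "execute serially"        # dependent steps gain nothing from parallelism
    if needs_reasoning:
        return "delegate_task"           # sub-agents for reasoning-heavy work
    return "execute_code"                # mechanical multi-step work

decision = choose_mechanism(needs_user_input=False, needs_reasoning=True, tool_calls=5)
```

For a reasoning-heavy task with several independent tool calls, `decision` is `"delegate_task"`.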
Advanced Patterns
ACP Sub-agents
You can have sub-agents use different AI models:
delegate_task(
    goal="Deep code review",
    acp_command="claude",  # Use Claude Code as sub-agent
    acp_args=["--acp", "--stdio", "--model", "claude-opus-4"]
)
Sub-agent Toolset Control
Restrict the tools available to sub-agents for improved security:
delegate_task(
    goal="Search papers",
    toolsets=["web"],  # Only give web search tools
    max_iterations=10  # Limit to max 10 rounds
)
Nesting Limits
Sub-agents cannot create further sub-agents (to prevent infinite recursion). However, sub-agents can use execute_code.
Best Practices
- Be specific with goals: Don't say "help me do research" — say "analyze code statistics of the xxx repository, including LOC, language distribution, and file count"
- Provide complete context: Include all file paths, URLs, and error messages
- Control parallelism: At most 3 sub-agents run at once; stagger larger workloads across successive batches to avoid API rate limits
- Aggregate results: Sub-agents only return summaries; the main Agent handles integration