
Chapter 4 Sub-agents & Parallel Orchestration

Too much for one agent? Let Cuttlefish clone itself and work on several independent tasks at once, which can substantially shorten the total run time.

Two Parallelism Mechanisms

Hermes Agent provides two parallel processing methods:

Mechanism        Tool           Best For
Sub-agents       delegate_task  Complex tasks requiring reasoning
Script batching  execute_code   Mechanical multi-step operations

Which to Choose?

Need reasoning/decision-making? ──→ delegate_task (sub-agent)
Pure mechanical operations?      ──→ execute_code (script)
3+ tool calls with logic?        ──→ execute_code
Single tool call?                ──→ Call the tool directly
Need user interaction?           ──→ Neither — handle it yourself
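
The chart above can be sketched as a small function. This is only an illustration: `choose_mechanism` and its parameters are hypothetical names, not part of Hermes Agent, and checking user interaction first is an assumption about precedence that the chart does not state.

```python
def choose_mechanism(needs_user_interaction: bool,
                     needs_reasoning: bool,
                     tool_calls: int) -> str:
    """Return the mechanism the decision chart suggests for a task."""
    # Assumption: user interaction rules out both mechanisms, so check it first.
    if needs_user_interaction:
        return "handle it yourself"
    # Reasoning or decision-making goes to a sub-agent.
    if needs_reasoning:
        return "delegate_task"
    # A single mechanical call does not justify a script.
    if tool_calls <= 1:
        return "call the tool directly"
    # Remaining case: several mechanical tool calls, possibly with logic.
    return "execute_code"


print(choose_mechanism(False, True, 5))
print(choose_mechanism(False, False, 4))
```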

delegate_task Sub-agents

How It Works

┌────────────────────────────────────┐
│  Main Agent (your conversation)    │
│                                    │
│  delegate_task ──→ Sub-agent 1     │ → Research topic A
│  delegate_task ──→ Sub-agent 2     │ → Research topic B
│  delegate_task ──→ Sub-agent 3     │ → Write code C
│                                    │
│  ← Aggregate all sub-agent results │
│  ← Integrate and reply to you      │
└────────────────────────────────────┘

Each sub-agent has:

  • Its own conversation context
  • Its own terminal session
  • Its own toolset

Only the sub-agent's final summary is returned to the main Agent.

Two Modes

Single Task Mode:

python
delegate_task(
  goal="Analyze code statistics of the hermes-agent repository",
  context="Repository path: /path/to/hermes-agent",
  toolsets=["terminal", "file"]
)

Batch Mode (Parallel):

python
delegate_task(
  tasks=[
    {"goal": "Research Python asyncio", "context": "..."},
    {"goal": "Research Rust tokio", "context": "..."},
    {"goal": "Research Go goroutines", "context": "..."}
  ]
)

Up to 3 sub-agents can run simultaneously.
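
Because at most 3 sub-agents run at once, a longer task list has to be spread over several batches. Below is a minimal sketch of that split, assuming a hypothetical helper `chunk_tasks` (not part of Hermes Agent) and an invented fourth topic for illustration. This chapter does not say whether `delegate_task` rejects or queues longer lists, so the sketch simply splits the list up front; each batch would become the `tasks` argument of one call.

```python
MAX_PARALLEL = 3  # limit stated in this chapter


def chunk_tasks(tasks, size=MAX_PARALLEL):
    """Split a task list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]


# Four illustrative research tasks: one more than a single batch can hold.
topics = ["Python asyncio", "Rust tokio", "Go goroutines", "Erlang processes"]
tasks = [{"goal": f"Research {t}", "context": "..."} for t in topics]

batches = chunk_tasks(tasks)
for number, batch in enumerate(batches, start=1):
    print(number, [task["goal"] for task in batch])
```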

Two Key Parameters

Parameter  Description
goal       Task objective (specific, self-contained)
context    Background info (file paths, error messages, constraints)

WARNING

Sub-agents don't have memory of your conversation! You must pass all necessary information through context.
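
One way to avoid forgetting something is to assemble the context from explicit pieces before delegating. The helper below is a hypothetical sketch (`build_context` is not a Hermes function, and the sample error and constraint are invented). It only shows the idea of packing paths, errors, and constraints into one self-contained string.

```python
def build_context(paths=(), errors=(), constraints=(), notes=""):
    """Assemble a self-contained context string for a sub-agent."""
    sections = []
    if notes:
        sections.append(notes.strip())
    # Each non-empty category becomes its own labelled bullet list.
    for title, items in (("Relevant paths", paths),
                         ("Error messages", errors),
                         ("Constraints", constraints)):
        if items:
            bullets = "\n".join(f"- {item}" for item in items)
            sections.append(f"{title}:\n{bullets}")
    return "\n\n".join(sections)


context = build_context(
    paths=["/path/to/hermes-agent"],
    errors=["ModuleNotFoundError: No module named 'yaml'"],  # invented sample
    constraints=["Do not modify files; report findings only"],
)
print(context)
```

The resulting string would be passed as the `context` argument alongside a specific `goal`.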


execute_code Script Batching

How It Works

execute_code runs scripts in a Python sandbox and can call Hermes tools:

python
from hermes_tools import terminal, read_file, write_file, search_files

# Batch operations
files = search_files(pattern="*.py", target="files", path="./src")
for f in files["matches"][:10]:
    content = read_file(path=f)
    # Process...

Features

Feature             Description
Built-in tools      terminal, read_file, write_file, search_files, patch, etc.
Built-in functions  json_parse, shell_quote, retry
Limits              5-minute timeout, 50KB stdout cap, max 50 tool calls

Use Cases

  • Need to process N files in a loop
  • Need conditional branching (if/else)
  • Need data processing between tool calls
  • Need retry logic

Hands-On Examples

Example 1: Parallel Research

You: Help me compare and analyze three AI Agent frameworks:
    Hermes Agent, OpenAI Agents SDK, and LangGraph.
    Compare them across three dimensions: architecture design, tool system, and deployment.

The Agent creates 3 sub-agents for parallel research, then generates a comparison table after aggregation.
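
The aggregation step can be pictured as merging one summary per framework into a single table. The sketch below assumes the summaries have already been organised into a dict keyed by framework and dimension. How sub-agent summaries are actually returned is not specified here, so that shape, the `comparison_table` helper, and the placeholder strings are all hypothetical.

```python
def comparison_table(findings, dimensions):
    """Render {framework: {dimension: summary}} as a plain-text table."""
    header = ["Framework"] + list(dimensions)
    rows = [header]
    for name, summary in findings.items():
        # Missing dimensions are shown as "n/a" instead of failing.
        rows.append([name] + [summary.get(d, "n/a") for d in dimensions])
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    return "\n".join(
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


dimensions = ["architecture", "tools", "deployment"]
findings = {  # placeholders, not real research results
    "Hermes Agent": {"architecture": "<summary A1>", "tools": "<summary A2>",
                     "deployment": "<summary A3>"},
    "OpenAI Agents SDK": {"architecture": "<summary B1>", "tools": "<summary B2>",
                          "deployment": "<summary B3>"},
    "LangGraph": {"architecture": "<summary C1>", "tools": "<summary C2>"},
}
table = comparison_table(findings, dimensions)
print(table)
```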

Example 2: Batch File Processing

You: Translate the docstrings in all Python files under src/ to English

The Agent uses execute_code to process the files in a loop inside one script, instead of issuing a separate tool call from the conversation for each file.

Example 3: Code Review + Testing in Parallel

You: Review the code changes in this PR while running the test suite

┌─ Sub-agent 1: Code review (loads github-code-review skill)
├─ Sub-agent 2: Run tests (executes pytest in terminal)
└─ Main Agent: Consolidate both reports

Example 4: Multi-Repository Sync

You: Check my 5 active projects. For each project:
    1. Check for uncommitted changes
    2. Check if behind remote
    3. Check CI status

The Agent uses execute_code to loop through the 5 repositories and collect all the information in a single script run.
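
A rough sketch of such a loop follows. To stay runnable outside the sandbox it takes a `run` callable instead of importing `hermes_tools`. In the sandbox `run` might wrap the `terminal` tool, but that tool's signature is not shown in this chapter, so this is an assumption. `check_repo`, `check_repos`, `fake_run`, and the sample paths are hypothetical. The CI check is left out because it depends on the CI provider, and the behind-remote count only reflects the last `git fetch`.

```python
import shlex


def check_repo(path, run):
    """Collect uncommitted-change and behind-remote status for one repository.

    `run` is any callable that takes a shell command string and returns
    the command's stdout as text.
    """
    repo = shlex.quote(path)
    # Empty porcelain output means a clean working tree.
    dirty = run(f"git -C {repo} status --porcelain").strip()
    # Commits on the upstream branch that are not yet in HEAD.
    behind = run(f"git -C {repo} rev-list --count HEAD..@{{u}}").strip()
    return {
        "path": path,
        "uncommitted": bool(dirty),
        "behind_remote": int(behind or 0),
    }


def check_repos(paths, run):
    """Run the checks for every repository and return one report per repo."""
    return [check_repo(p, run) for p in paths]


def fake_run(cmd):
    """Stand-in for a real shell runner, so the sketch works without git."""
    if "status --porcelain" in cmd:
        return " M README.md\n" if "proj-a" in cmd else ""
    return "2\n" if "proj-a" in cmd else "0\n"


report = check_repos(["/tmp/proj-a", "/tmp/proj-b"], fake_run)
for entry in report:
    print(entry)
```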


Performance & Limitations

delegate_task Limits

Limit                 Value
Max parallel          3
Max tool call rounds  50 (configurable)
Context isolation     Fully isolated
User interaction      Not supported

execute_code Limits

Limit              Value
Execution timeout  5 minutes
stdout cap         50KB
Tool call cap      50 calls
Foreground only    No background execution
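
These caps are easy to plan for with a little arithmetic before writing a script. The helpers below are a hypothetical sketch (`max_items` and `fits_stdout` are not Hermes functions), and reading "50KB" as 50 × 1024 bytes is an assumption.

```python
TOOL_CALL_CAP = 50            # from the table above
STDOUT_CAP_BYTES = 50 * 1024  # "50KB", assuming 1 KB = 1024 bytes


def max_items(calls_per_item, overhead_calls=0, cap=TOOL_CALL_CAP):
    """How many loop items fit in one script run under the tool-call cap."""
    if calls_per_item < 1:
        raise ValueError("calls_per_item must be at least 1")
    return max(0, (cap - overhead_calls) // calls_per_item)


def fits_stdout(text, cap=STDOUT_CAP_BYTES):
    """True if printing `text` would stay within the stdout cap."""
    return len(text.encode("utf-8")) <= cap


# One search call up front, then one read and one write per file.
print(max_items(calls_per_item=2, overhead_calls=1))
# Three checks per repository, as in Example 4.
print(max_items(calls_per_item=3))
```

If a job needs more items than fit, it can be split across several script runs.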

When Not to Use Parallelism

  • Single tool call → Call directly
  • Tasks have sequential dependencies → Execute serially
  • Need user feedback → Handle yourself

Advanced Patterns

ACP Sub-agents

You can have sub-agents use different AI models:

python
delegate_task(
  goal="Deep code review",
  acp_command="claude",     # Use Claude Code as sub-agent
  acp_args=["--acp", "--stdio", "--model", "claude-opus-4"]
)

Sub-agent Toolset Control

Restrict the tools available to sub-agents for improved security:

python
delegate_task(
  goal="Search papers",
  toolsets=["web"],          # Only give web search tools
  max_iterations=10          # Limit to max 10 rounds
)

Nesting Limits

Sub-agents cannot create further sub-agents (to prevent infinite recursion). However, sub-agents can use execute_code.


Best Practices

  1. Be specific with goals: Don't say "help me do research" — say "analyze code statistics of the xxx repository, including LOC, language distribution, and file count"
  2. Provide complete context: Include all file paths, URLs, and error messages
  3. Control parallelism: Run at most 3 sub-agents at once, and stagger larger workloads across successive batches to avoid hitting API rate limits
  4. Aggregate results: Sub-agents only return summaries; the main Agent handles integration

Released under CC BY-NC-SA 4.0 | GitHub