Research Assistant

Give Hermes a research direction and it will search papers, read abstracts, perform comparative analysis, and even draft a literature review.

Scenario Description

You're doing research on AI Agents and need to track the latest papers, compare different methods, and compile literature reviews. Let Hermes Agent accelerate your work through the arxiv skill and search capabilities.

Prerequisites

  • arxiv skill installed
  • web-content-fetch or defuddle skill installed
  • LLM configured (long-context models preferred)
Step 1: Paper Search

You: Search arXiv for papers about AI Agents from the past week,
    filter the 5 most-cited ones,
    give me the title, translated abstract, and core contribution.

Hermes will:

  1. Call the arxiv skill to search
  2. Filter and rank results
  3. Translate abstracts and extract key information
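The filter-and-rank step above can be sketched in plain Python. This is a minimal illustration, not the skill's actual implementation; the paper records below reuse the example results shown later on this page, and in practice they would come from the arxiv skill's search output (note that the arXiv API itself does not report citation counts, so real citation data would need a source such as Semantic Scholar):

```python
# Illustrative records, reusing this page's example search results.
papers = [
    {"title": "AgentBench: Evaluating LLMs as Agents", "citations": 1200},
    {"title": "ToolLLM: Facilitating Large Language Models to Master Tools", "citations": 750},
    {"title": "AutoGen: Enabling Next-Gen LLM Applications", "citations": 980},
]

def top_cited(papers, n=5):
    """Return the n most-cited papers, highest first."""
    return sorted(papers, key=lambda p: p["citations"], reverse=True)[:n]

for p in top_cited(papers, n=3):
    print(f"{p['title']} ({p['citations']}+ citations)")
```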

Example search results:

Hermes: Found 5 relevant papers —

📄 1. "AgentBench: Evaluating LLMs as Agents"
   Core contribution: Proposed a multi-dimensional Agent evaluation benchmark covering 8 real-world scenarios
   Citations: 1,200+ | Published: 2024-03
   
📄 2. "AutoGen: Enabling Next-Gen LLM Applications"
   Core contribution: Multi-Agent conversation framework supporting custom inter-Agent communication
   Citations: 980+ | Published: 2024-02

📄 3. "ToolLLM: Facilitating Large Language Models to Master Tools"
   Core contribution: Large-scale tool learning, supporting 16,000+ API calls
   Citations: 750+ | Published: 2024-01
   ...

Step 2: Deep Reading

You: Read this paper for me https://arxiv.org/abs/xxxx.xxxxx
    Focus on:
    1. What is the core method?
    2. What are the innovations compared to existing methods?
    3. What are the experimental results?
    4. What are the limitations?

Example deep reading output:

Hermes: Paper analysis as follows —

🔍 Core Method
  Proposes the ReAct framework, interleaving Reasoning and Acting
  rather than the traditional chain-of-thought-then-act approach.

🆕 Innovations
  - First to prove "think while doing" outperforms "think then do" by 15-20%
  - Achieved SOTA on HotpotQA and ALFWorld
  
📊 Experimental Results
  HotpotQA: EM 64.2% (+8.1%)
  ALFWorld: Success rate 74.5% (+12.3%)
  
⚠️ Limitations
  - Each inference requires 3-5 LLM calls, cost is relatively high
  - Performance is unstable on tasks requiring long-term planning
  - Depends on prompt engineering, robustness needs further validation
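Before fetching a paper, the agent needs its arXiv ID from the URL you pasted. A minimal sketch of that extraction step, assuming new-style IDs only (old-style IDs like `cs/0601001` are not handled); `2210.03629` is used here as a sample ID for the ReAct paper:

```python
import re

def extract_arxiv_id(url):
    """Pull a new-style arXiv ID (e.g. 2210.03629) out of an abs/pdf URL."""
    m = re.search(r"arxiv\.org/(?:abs|pdf)/(\d{4}\.\d{4,5})", url)
    return m.group(1) if m else None

print(extract_arxiv_id("https://arxiv.org/abs/2210.03629"))  # → 2210.03629
```

With the ID in hand, the skill can fetch the paper's metadata or full text for analysis.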

Step 3: Comparative Analysis

You: Compare the methodological differences among these three papers:
    1. Paper A: ReAct framework
    2. Paper B: Plan-and-Execute framework  
    3. Paper C: Adaptive Agent framework
    Create a comparison table across method, effectiveness, and applicable scenarios.
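A sketch of how such a comparison table could be assembled as Markdown. The ReAct row paraphrases the Step 2 analysis above; the other rows are left as placeholders, since their details are not covered on this page:

```python
# Rows: (framework, method, effectiveness, applicable scenarios).
rows = [
    ("ReAct", "interleaves reasoning and acting",
     "SOTA on HotpotQA and ALFWorld", "multi-step QA, embodied tasks"),
    ("Plan-and-Execute", "...", "...", "..."),
    ("Adaptive Agent", "...", "...", "..."),
]

def comparison_table(rows):
    """Render the rows as a Markdown table with a fixed header."""
    lines = [
        "| Framework | Method | Effectiveness | Applicable scenarios |",
        "| --- | --- | --- | --- |",
    ]
    lines += [f"| {' | '.join(r)} |" for r in rows]
    return "\n".join(lines)

print(comparison_table(rows))
```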

Step 4: Generate a Literature Review

You: Based on the papers we've discussed,
    write a ~2,000-word literature review
    on "Autonomous Agents Driven by Large Language Models",
    with proper citation formatting and clear logic.

Step 5: Ongoing Tracking

You: Create a weekly task to search arXiv for new AI Agent papers,
    filter those relevant to my research direction,
    generate a summary report and send to Feishu.
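Under the hood, a weekly task like this amounts to a scheduled job. As a rough sketch only, a crontab-style schedule could look like the following (the `hermes` command name and task identifier are assumptions for illustration, not a documented CLI):

```
# Every Monday at 09:00: run the weekly arXiv digest task
0 9 * * 1  hermes task run arxiv-weekly-digest
```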

Advanced Tips

  • Knowledge base integration: Auto-save search results to Obsidian
  • Multi-source search: Simultaneously search arXiv, Semantic Scholar, and Google Scholar
  • Paper recommendations: Recommend related papers based on your reading history
  • Experiment reproduction: Let Hermes help you understand and reproduce experiments from papers

Released under CC BY-NC-SA 4.0 | GitHub