Skip to content

feat(hermes): capture tool calls individually and remove content truncation in sync_turn#1012

Open
ggoldani wants to merge 2 commits into
rohitg00:mainfrom
ggoldani:feat/hermes-sync-turn-rich-capture
Open

feat(hermes): capture tool calls individually and remove content truncation in sync_turn#1012
ggoldani wants to merge 2 commits into
rohitg00:mainfrom
ggoldani:feat/hermes-sync-turn-rich-capture

Conversation

@ggoldani

@ggoldani ggoldani commented Jul 4, 2026

Copy link
Copy Markdown

Summary

The Hermes integration's sync_turn had two issues that degraded memory quality for Hermes Agent users:

  1. Content truncationtool_input was hard-cut to 500 chars and tool_output to 2000 chars, losing long prompts, code blocks, and tool results. The agentmemory server already truncates to 8000 chars server-side, so the plugin-side limits were needlessly aggressive.

  2. No per-tool-call capture — Hermes passes the full OpenAI-format messages list (including assistant tool_calls and matching tool results) via the messages kwarg, but sync_turn ignored it entirely. Every turn produced a single "conversation" observation with concatenated text, losing granular tool-call data.

Changes

  • Remove [:500]/[:2000] truncation — let the server handle size limits
  • Add _extract_tool_observations() helper — walks the messages list, matches assistant tool_calls to their tool results by tool_call_id, and emits one observation per call/result pair
  • Cap at 10 most-recent tool calls per turn to avoid observation flooding
  • Keep the conversation observation as a fallback — backward compatible

Behavior

Before After
1 obs/turn (conversation) 1+N obs/turn (N tool calls + conversation)
tool_input cut at 500 chars Full (server truncates at 8000)
tool_output cut at 2000 chars Full (server truncates at 8000)
messages kwarg ignored Parsed for individual tool calls

Backward compatibility

  • Turns without messages kwarg → identical (conversation-only)
  • Turns with messages but no tool calls → identical (conversation-only)
  • Turns with messages and tool calls → conversation + individual tool-call obs

Test plan

  • 9 new assertions in test/hermes-plugin.test.ts
  • Full suite: 1418/1439 pass (21 pre-existing failures confirmed on baseline)
  • E2E validated against live agentmemory server

Summary by CodeRabbit

  • New Features

    • Improved sync capture to include tool-related activity alongside standard conversation updates.
    • Expanded tool observation collection by correlating tool calls with their corresponding results.
  • Bug Fixes

    • More accurate matching of tool requests to tool outputs using call/result identifiers.
    • Reduced truncation of tool input/output and conversation text to retain more complete context.
    • Ensures tool observations are sent when message data is available, with a fallback to conversation-only capture when not.

…cation in sync_turn

The Hermes integration's sync_turn had two issues that degraded memory
quality for Hermes Agent users:

1. Content truncation — tool_input was hard-cut to 500 chars and
   tool_output to 2000 chars, losing long prompts, code blocks, and
   tool results. The agentmemory server already truncates to 8000 chars
   server-side, so the plugin-side limits were needlessly aggressive.

2. No per-tool-call capture — Hermes passes the full OpenAI-format
   messages list (including assistant tool_calls and matching tool
   results) via the messages kwarg, but sync_turn ignored it entirely.
   Every turn produced a single conversation observation with concatenated
   text, losing granular tool-call data.

Changes:
- Remove [:500]/[:2000] truncation
- Add _extract_tool_observations() helper that walks the messages list,
  matches assistant tool_calls to their tool results by tool_call_id,
  and emits one observation per call/result pair
- Cap at 10 most-recent tool calls per turn to avoid observation flooding
- Keep the conversation observation as a fallback (backward compatible)

Tested: 9 new assertions in hermes-plugin.test.ts, E2E validated against
live agentmemory server.
@vercel

vercel Bot commented Jul 4, 2026

Copy link
Copy Markdown

@ggoldani is attempting to deploy a commit to the rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented Jul 4, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 958e7491-b440-4530-9eec-6d18fb315cfa

📥 Commits

Reviewing files that changed from the base of the PR and between 3e964d2 and 32f4208.

📒 Files selected for processing (1)
  • integrations/hermes/__init__.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • integrations/hermes/init.py

📝 Walkthrough

Walkthrough

Adds a Hermes helper that derives per-tool observations from messages, and updates AgentMemoryProvider.sync_turn to post those tool observations plus the conversation observation with shared session and timestamp values. Tests assert the new extraction and payload behavior.

Changes

Hermes sync_turn rich capture

Layer / File(s) Summary
Tool observation extraction helper
integrations/hermes/__init__.py
Adds _extract_tool_observations(messages, max_results) to parse assistant tool_calls, match them to results via tool_call_id, normalize tool arguments, build {tool_name, tool_input, tool_output} entries, reverse newest-first, and cap the list.
sync_turn integration and payload changes
integrations/hermes/__init__.py, test/hermes-plugin.test.ts
Initializes _first_prompt_sent, updates sync_turn to use shared session_id and timestamp, posts extracted tool observations from kwargs["messages"], updates the conversation observation payload, removes prior text truncation, and adds tests covering the new behavior.

Estimated code review effort: 2 (Simple) | ~15 minutes

Possibly related issues

Possibly related PRs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main Hermes sync_turn changes: per-tool call capture and removal of truncation.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/hermes-plugin.test.ts (1)

73-104: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy lift

Regex-on-source assertions don't verify actual runtime behavior.

These tests only check that certain string patterns exist/don't exist in the Python source; they don't exercise _extract_tool_observations with real message payloads, so they can't catch logic bugs (e.g., matching order, id-collision handling, max_results actually capping the returned list). This follows the existing style in this file (e.g. readAgentMemoryProviderHookMethods), but consider adding behavioral tests (e.g., via a Python subprocess harness or pytest) that call the helper directly with sample messages and assert on the returned observation list.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/hermes-plugin.test.ts` around lines 73 - 104, The current checks in
hermes-plugin.test.ts only inspect source text and do not validate
_extract_tool_observations behavior at runtime. Replace or supplement the regex
assertions with behavioral tests that execute the Python helper with real
messages payloads, verifying tool_call_id matching, fallback conversation
handling, and that max_results actually limits the returned observations. Use
the existing sync_turn / _extract_tool_observations symbols to locate the code
under test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@integrations/hermes/__init__.py`:
- Around line 208-218: The pending tool-call tracking in the tool-call parsing
block can collide when call.get("id", "") is empty, causing multiple calls to
overwrite the same entry and drop earlier observations. Update the logic around
the call_id assignment and pending[...] insertion to ensure every tool call gets
a unique key even when the Hermes payload omits an id, and keep the existing
tool_name/tool_input mapping intact while preventing empty-string collisions.

---

Nitpick comments:
In `@test/hermes-plugin.test.ts`:
- Around line 73-104: The current checks in hermes-plugin.test.ts only inspect
source text and do not validate _extract_tool_observations behavior at runtime.
Replace or supplement the regex assertions with behavioral tests that execute
the Python helper with real messages payloads, verifying tool_call_id matching,
fallback conversation handling, and that max_results actually limits the
returned observations. Use the existing sync_turn / _extract_tool_observations
symbols to locate the code under test.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8349cdda-622d-4283-a91f-0495288820d6

📥 Commits

Reviewing files that changed from the base of the PR and between 93ae9bc and 3e964d2.

📒 Files selected for processing (2)
  • integrations/hermes/__init__.py
  • test/hermes-plugin.test.ts

Comment on lines +208 to +218
call_id = call.get("id", "")
try:
args_str = fn.get("arguments", "")
if isinstance(args_str, dict):
args_str = json.dumps(args_str)
except (TypeError, ValueError):
args_str = str(fn.get("arguments", ""))
pending[call_id] = {
"tool_name": name,
"tool_input": args_str,
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Missing tool_call_id collisions can silently drop observations.

When call.get("id", "") is empty/missing, multiple tool calls in the same or different assistant messages will all key into pending under "", so an earlier unmatched call gets silently overwritten before it can be paired with its result.

🐛 Proposed fix to avoid collisions on missing ids
                 call_id = call.get("id", "")
+                if not call_id:
+                    call_id = f"__noid_{len(pending)}_{name}"
                 try:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
call_id = call.get("id", "")
try:
args_str = fn.get("arguments", "")
if isinstance(args_str, dict):
args_str = json.dumps(args_str)
except (TypeError, ValueError):
args_str = str(fn.get("arguments", ""))
pending[call_id] = {
"tool_name": name,
"tool_input": args_str,
}
call_id = call.get("id", "")
if not call_id:
call_id = f"__noid_{len(pending)}_{name}"
try:
args_str = fn.get("arguments", "")
if isinstance(args_str, dict):
args_str = json.dumps(args_str)
except (TypeError, ValueError):
args_str = str(fn.get("arguments", ""))
pending[call_id] = {
"tool_name": name,
"tool_input": args_str,
}
🧰 Tools
🪛 ast-grep (0.44.1)

[info] 211-211: use jsonify instead of json.dumps for JSON output
Context: json.dumps(args_str)
Note: [CWE-116] Improper Encoding or Escaping of Output.

(use-jsonify)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@integrations/hermes/__init__.py` around lines 208 - 218, The pending
tool-call tracking in the tool-call parsing block can collide when
call.get("id", "") is empty, causing multiple calls to overwrite the same entry
and drop earlier observations. Update the logic around the call_id assignment
and pending[...] insertion to ensure every tool call gets a unique key even when
the Hermes payload omits an id, and keep the existing tool_name/tool_input
mapping intact while preventing empty-string collisions.

CodeRabbit review caught that empty call_id ('') causes all id-less tool
calls to overwrite the same pending[''] entry, dropping earlier calls.
Skip calls without an id — the conversation fallback observation already
captures the full turn when tool calls can't be matched to results.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant