feat(onboard): add Tavily as web search provider #2105
lakshyaag-tavily wants to merge 8 commits into NVIDIA:main from
📝 Walkthrough
This pull request adds support for Tavily as a web search provider alongside Brave. A new Tavily preset configuration is introduced with network policies for the Tavily API endpoint. The web search infrastructure is refactored to be provider-agnostic, with support for provider selection, credential validation, environment variable mapping, and session state preservation. Sandbox and Dockerfile configuration are updated to inject provider-specific settings.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🧹 Nitpick comments (2)
test/onboard.test.ts (1)
163-175: Keep the `knownPresetNames` fixture synchronized with Tavily.
Line 174 expects `tavily`, but the shared `known` fixture still omits it. Including it will keep filter-driven cases consistent and prevent false negatives in future provider-specific tests.
♻️ Suggested update
      const known = [
        "npm",
        "pypi",
        "huggingface",
        "brew",
        "brave",
    +   "tavily",
        "slack",
        "discord",
        "telegram",
        "jira",
        "outlook",
        "local-inference",
      ];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.ts` around lines 163 - 175, The test's knownPresetNames fixture (known) is missing "tavily" while the expectation in the it-block calls computeSetupPresetSuggestions("balanced", { enabledChannels: [], knownPresetNames: known }) and expects "tavily" in the result; update the known array to include "tavily" so the filter-driven test aligns with computeSetupPresetSuggestions and avoids false negatives (locate the known array near the test and add the string "tavily").
nemoclaw-blueprint/policies/presets/tavily.yaml (1)
14-19: Narrow the Tavily preset to the endpoint this integration actually uses.
The new onboarding flow validates Tavily with `POST /search`, but this preset grants `GET /**` and `POST /**` on the whole domain. If nothing else needs broader access, tighten it now so the balanced tier doesn't silently authorize more of Tavily's API surface than required.
🔒 Suggested tightening
      rules:
    -   - allow: { method: GET, path: "/**" }
    -   - allow: { method: POST, path: "/**" }
    +   - allow: { method: POST, path: "/search" }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoclaw-blueprint/policies/presets/tavily.yaml` around lines 14 - 19, The preset currently allows overly-broad access via rules allowing "allow: { method: GET, path: "/**" }" and "allow: { method: POST, path: "/**" }"; restrict the rules to only the exact endpoint used by the integration by replacing those entries with a single rule permitting POST to "/search" (e.g., "allow: { method: POST, path: \"/search\" }") and remove the global GET/POST /** rules so only POST /search is authorized; keep protocol, enforcement and tls as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 994-1032: The Brave and Tavily provider validate functions call
runCurlProbe with raw curl args and lack the standard validation timeouts;
update the validate implementations for the "brave" and "tavily" entries to use
getValidationProbeCurlArgs() so the timeout/standard flags are applied (e.g.,
build the provider-specific args as before and pass them through
getValidationProbeCurlArgs(...) or merge its output with your args, then call
runCurlProbe with that result). Ensure you reference the existing validate
functions, runCurlProbe, and getValidationProbeCurlArgs when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: b32a5ac9-dc9b-4d55-82f4-e279869b178e
📒 Files selected for processing (11)
- Dockerfile
- nemoclaw-blueprint/policies/presets/tavily.yaml
- nemoclaw-blueprint/policies/tiers.yaml
- src/lib/onboard-session.test.ts
- src/lib/onboard-session.ts
- src/lib/onboard.ts
- src/lib/web-search.test.ts
- src/lib/web-search.ts
- test/onboard.test.ts
- test/policies.test.ts
- test/policy-tiers.test.ts
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/lib/onboard.ts (2)
994-1010: ⚠️ Potential issue | 🟡 Minor
Apply the standard validation probe timeouts here.
These onboarding probes bypass `getValidationProbeCurlArgs()`, so a dead Brave or Tavily endpoint can stall this step much longer than the rest of the wizard.
Also applies to: 1017-1034
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 994 - 1010, The validate probe is using runCurlProbe with a raw arg array (inside the validate property) which omits the standard timeout settings; replace the explicit arg array with the shared getValidationProbeCurlArgs(...) helper (or call getValidationProbeCurlArgs() and merge/add the custom headers and query args) so the probe includes the standard timeout/retry flags, and apply the same fix to the other similar probe block later in this file (the other runCurlProbe usage around the Tavily/Brave validation probes). Ensure you still pass the X-Subscription-Token header and the same query params while using getValidationProbeCurlArgs to derive the base curl args.
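A minimal sketch of the requested change for the Brave probe follows. It only builds the argument array and assumes the shared helper returns plain curl timeout flags; the stand-in `getValidationProbeCurlArgs()` body, the timeout values, the `q=ping` query, and the `buildBraveProbeArgs` name are all hypothetical and not taken from the PR.

```typescript
// Hypothetical stand-in for the repo's getValidationProbeCurlArgs().
// The real helper's flags and values may differ; these are assumptions.
function getValidationProbeCurlArgs(): string[] {
  return ["--connect-timeout", "10", "--max-time", "30"];
}

// Builds curl arguments for a Brave key probe with the shared timeout
// flags spread in, instead of a raw argument list with no time limit.
// buildBraveProbeArgs and the query parameter are illustrative only.
function buildBraveProbeArgs(apiKey: string): string[] {
  return [
    "-sS",
    ...getValidationProbeCurlArgs(),
    "--compressed",
    "-H",
    "Accept: application/json",
    "-H",
    `X-Subscription-Token: ${apiKey}`,
    "--get",
    "--data-urlencode",
    "q=ping",
    "https://api.search.brave.com/res/v1/web/search",
  ];
}
```

The resulting array would be handed to `runCurlProbe` as before. The same spread would apply to the Tavily probe, with its own headers and request body.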
1134-1143: ⚠️ Potential issue | 🟠 Major
Use saved web-search credentials in the non-interactive path.
This still only inspects `process.env`, so a non-interactive run with a saved Brave/Tavily key but no exported env var will skip web search entirely or fail validation, even though `ensureValidatedWebSearchCredential()` already supports saved credentials.
Also applies to: 1169-1187
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 1134 - 1143, resolveNonInteractiveWebSearchProvider currently only inspects process.env and will ignore saved Brave/Tavily credentials; update it to fallback to the saved credential retrieval used elsewhere (call the existing ensureValidatedWebSearchCredential or the internal saved-credential loader) when process.env[webSearch.BRAVE_API_KEY_ENV] / process.env[webSearch.TAVILY_API_KEY_ENV] are not set, then normalize those saved values with normalizeCredentialValue and preserve the current precedence (brave before tavily). Apply the same change to the other non-interactive check block referenced (the duplicate at the later range) so both paths use saved credentials instead of only environment variables.
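One way the fallback could look is sketched below, assuming a saved-credential lookup keyed by env var name. The `loadSaved` callback and the body of `normalizeCredentialValue` are hypothetical stand-ins; the PR's real helpers may have different signatures.

```typescript
type WebSearchProvider = "brave" | "tavily";

const BRAVE_API_KEY_ENV = "BRAVE_API_KEY";
const TAVILY_API_KEY_ENV = "TAVILY_API_KEY";

// Hypothetical normalizer: trims whitespace and maps missing values to "".
function normalizeCredentialValue(value: string | null | undefined): string {
  return (value ?? "").trim();
}

// Picks a provider for non-interactive runs. For each provider, in
// brave-then-tavily order, the exported env var is checked first and the
// saved credential second. loadSaved is an assumed lookup, not the repo's API.
function resolveNonInteractiveWebSearchProvider(
  env: Record<string, string | undefined>,
  loadSaved: (envName: string) => string | undefined,
): WebSearchProvider | null {
  const candidates: Array<[WebSearchProvider, string]> = [
    ["brave", BRAVE_API_KEY_ENV],
    ["tavily", TAVILY_API_KEY_ENV],
  ];
  for (const [provider, envName] of candidates) {
    const key =
      normalizeCredentialValue(env[envName]) ||
      normalizeCredentialValue(loadSaved(envName));
    if (key) return provider;
  }
  return null;
}
```

In this sketch a saved Brave key wins over an exported Tavily key. Whether that ordering is desirable is a design choice the review comment leaves open; checking both env vars before any saved credential would be an equally valid reading.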
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 1161-1167: The code persists webSearchConfig using a noncanonical
field fetchEnabled which breaks round-trip and reuse detection; update the logic
that currently returns { fetchEnabled: true, provider } (in the existingConfig
branch and the other similar spots referenced) to return { enabled: true,
provider } and ensure all reads/gates check config?.enabled === true (not
truthiness of the whole object). Also normalize default objects to { enabled:
false, provider: "brave" } and update any change-detection or revalidation
checks (e.g., the code paths around fetchEnabled usage and the places noted at
the other ranges) to use enabled so the enable bit is preserved across
save/load.
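The round-trip concern can be illustrated with a small normalizer. This is a sketch under the assumption that the persisted shape is `{ enabled, provider }`; the `WebSearchConfig` interface and the `normalizeWebSearchConfig` and `isWebSearchEnabled` names are hypothetical, not the PR's code.

```typescript
type WebSearchProvider = "brave" | "tavily";

interface WebSearchConfig {
  enabled: boolean;
  provider: WebSearchProvider;
}

const DEFAULT_WEB_SEARCH_CONFIG: WebSearchConfig = {
  enabled: false,
  provider: "brave",
};

// Coerces whatever was loaded from a saved session into the canonical shape.
// Only a literal `enabled: true` turns web search on; a stray noncanonical
// field such as fetchEnabled is ignored rather than treated as truthy.
function normalizeWebSearchConfig(raw: unknown): WebSearchConfig {
  if (typeof raw !== "object" || raw === null) {
    return { ...DEFAULT_WEB_SEARCH_CONFIG };
  }
  const candidate = raw as { enabled?: unknown; provider?: unknown };
  return {
    enabled: candidate.enabled === true,
    provider: candidate.provider === "tavily" ? "tavily" : "brave",
  };
}

// Gate on the enable bit itself, not on the object being present.
function isWebSearchEnabled(config: WebSearchConfig | null | undefined): boolean {
  return config?.enabled === true;
}
```

With this shape, a config saved as `{ enabled: true, provider: "tavily" }` reloads with both fields intact, while an object carrying only `fetchEnabled` normalizes to disabled.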
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 069ae16b-7955-4ea0-b3ee-541e9cdb361b
📒 Files selected for processing (1)
src/lib/onboard.ts
✨ Thanks for submitting this PR that proposes adding Tavily as a web search provider during NemoClaw onboarding. The changes and test plan you provided will help us review this further.
Force-pushed from 02647d6 to 1e920b5
♻️ Duplicate comments (1)
src/lib/onboard.ts (1)
1159-1175: ⚠️ Potential issue | 🟡 Minor
Use standard validation timeout args in provider probes.
Both web-search validation probes should include `getValidationProbeCurlArgs()` to avoid long hangs during onboarding when endpoints are slow/unreachable.
Suggested patch
      brave: {
    @@
        validate: (apiKey) =>
          runCurlProbe([
            "-sS",
    +       ...getValidationProbeCurlArgs(),
            "--compressed",
    @@
      tavily: {
    @@
        validate: (apiKey) =>
          runCurlProbe([
            "-sS",
    +       ...getValidationProbeCurlArgs(),
            "--compressed",
Also applies to: 1182-1199
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/onboard.ts` around lines 1159 - 1175, The validation probe for the Brave web-search provider calls runCurlProbe with a static arg list that lacks timeout/standard validation args; update the validate callbacks (the function that builds args passed into runCurlProbe for the web-search probes) to include the standard validation curl args by merging/concatenating the result of getValidationProbeCurlArgs() into the argument array (e.g., [...getValidationProbeCurlArgs(), "-sS", "--compressed", ... , "https://api.search.brave.com/res/v1/web/search"]). Do the same for the other web-search probe that also constructs args for runCurlProbe so both probes use getValidationProbeCurlArgs() to prevent long hangs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: dd53ec3e-a925-4887-b96f-ea79b7fee2ba
📒 Files selected for processing (11)
- Dockerfile
- nemoclaw-blueprint/policies/presets/tavily.yaml
- nemoclaw-blueprint/policies/tiers.yaml
- src/lib/onboard-session.test.ts
- src/lib/onboard-session.ts
- src/lib/onboard.ts
- src/lib/web-search.test.ts
- src/lib/web-search.ts
- test/onboard.test.ts
- test/policies.test.ts
- test/policy-tiers.test.ts
✅ Files skipped from review due to trivial changes (7)
- src/lib/web-search.test.ts
- test/policies.test.ts
- nemoclaw-blueprint/policies/tiers.yaml
- test/policy-tiers.test.ts
- src/lib/web-search.ts
- nemoclaw-blueprint/policies/presets/tavily.yaml
- test/onboard.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/lib/onboard-session.ts
- src/lib/onboard-session.test.ts
Force-pushed from 1e920b5 to faa137c
Force-pushed from faa137c to d713cda
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/lib/web-search.ts (1)
14-16: Future-proof `webSearchEnvFor` with an exhaustive switch.
Current behavior is correct, but a new provider added later would silently fall back to Brave.
♻️ Suggested refactor
      export function webSearchEnvFor(provider: WebSearchProvider): string {
    -   return provider === "tavily" ? TAVILY_API_KEY_ENV : BRAVE_API_KEY_ENV;
    +   switch (provider) {
    +     case "brave":
    +       return BRAVE_API_KEY_ENV;
    +     case "tavily":
    +       return TAVILY_API_KEY_ENV;
    +     default: {
    +       const _exhaustive: never = provider;
    +       return _exhaustive;
    +     }
    +   }
      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/lib/web-search.ts` around lines 14 - 16, The function webSearchEnvFor currently uses a ternary that returns BRAVE_API_KEY_ENV for any provider other than "tavily", which would silently mis-handle new providers; replace the ternary with an exhaustive switch on provider in webSearchEnvFor that handles "tavily" and "brave" explicitly and uses a default case that throws an error (or asserts unreachable) so adding a new WebSearchProvider will surface a compile/runtime failure; reference the existing symbols TAVILY_API_KEY_ENV and BRAVE_API_KEY_ENV and ensure the switch returns those values accordingly.
test/onboard.test.ts (1)
289-289: Make the `knownPresetNames` fixture explicitly include `tavily`.
Line 289 expects Tavily, but the local `known` list doesn’t contain it; adding it makes this test intent clearer and more regression-resistant.
🧪 Suggested test-fixture tweak
      const known = [
        "npm",
        "pypi",
        "huggingface",
        "brew",
        "brave",
    +   "tavily",
        "slack",
        "discord",
        "telegram",
        "jira",
        "outlook",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.ts` at line 289, The test assertion expects "tavily" in the suggestions but the test fixture list of known presets (the knownPresetNames / known array used to seed suggestions in test/onboard.test.ts) does not include it; update that fixture by adding the string "tavily" (lowercase) to the knownPresetNames/known list so the expected array ["npm", "pypi", "huggingface", "brew", "brave", "tavily"] matches the seeded data.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/generate-openclaw-config.py`:
- Around line 219-224: Normalize the raw environment value from env.get before
validating: take the value returned into _ws_provider, call strip() and lower()
on it (so inputs like "TAVILY" or " tavily " become "tavily"), then validate
against ("brave","tavily") and use that normalized value to look up _ws_env_key
from {"brave":"BRAVE_API_KEY","tavily":"TAVILY_API_KEY"}; ensure the
normalization happens before the not-in check and before the dictionary lookup
so the correct provider and env key are selected.
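The script under review is Python; the sketch below uses TypeScript, matching the rest of the PR, purely to show the order of operations the comment asks for: normalize first, then validate, then look up the env key. The function names, the `brave` fallback for an unset value, and the choice to throw on an unknown provider are assumptions, not the script's actual behavior.

```typescript
const WS_ENV_KEYS = {
  brave: "BRAVE_API_KEY",
  tavily: "TAVILY_API_KEY",
} as const;

type WebSearchProvider = keyof typeof WS_ENV_KEYS;

// Explicit comparison avoids prototype keys that an `in` check would accept.
function isWebSearchProvider(value: string): value is WebSearchProvider {
  return value === "brave" || value === "tavily";
}

// Normalizes the raw value BEFORE validating it, so " TAVILY " resolves to
// "tavily" instead of being rejected or silently mapped to the wrong key.
function resolveProviderEnvKey(
  rawProvider: string | undefined,
  fallback: WebSearchProvider = "brave", // assumed default when unset
): { provider: WebSearchProvider; envKey: string } {
  const normalized = (rawProvider ?? fallback).trim().toLowerCase();
  if (!isWebSearchProvider(normalized)) {
    // Assumed failure mode; the real script may fall back instead of raising.
    throw new Error(`Unsupported web search provider: ${JSON.stringify(rawProvider)}`);
  }
  return { provider: normalized, envKey: WS_ENV_KEYS[normalized] };
}
```

The Python equivalent of the first step would be calling `strip()` and `lower()` on the value returned by `env.get`, as the comment describes.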
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 62eb6b6d-eb98-4b42-843f-881716ac04c4
📒 Files selected for processing (12)
- nemoclaw-blueprint/policies/presets/tavily.yaml
- nemoclaw-blueprint/policies/tiers.yaml
- scripts/generate-openclaw-config.py
- src/lib/onboard-session.test.ts
- src/lib/onboard-session.ts
- src/lib/onboard.ts
- src/lib/redact.ts
- src/lib/web-search.test.ts
- src/lib/web-search.ts
- test/onboard.test.ts
- test/policies.test.ts
- test/policy-tiers.test.ts
✅ Files skipped from review due to trivial changes (5)
- test/policies.test.ts
- test/policy-tiers.test.ts
- src/lib/web-search.test.ts
- nemoclaw-blueprint/policies/presets/tavily.yaml
- nemoclaw-blueprint/policies/tiers.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
- src/lib/onboard-session.test.ts
- src/lib/onboard-session.ts
- src/lib/onboard.ts
Force-pushed from 7346a70 to c9f6956
Force-pushed from c9f6956 to 32eb01a
Force-pushed from 05efcbb to 3664b62

Summary
This PR adds Tavily as a first-class web search provider during NemoClaw onboarding, including provider-aware credential validation, sandbox configuration, and policy preset selection.
Changes
- […] `tavily` network policy preset and include it in the `balanced` and `open` policy tiers.
- […] `brave` and `tavily`, including provider-to-env-var mapping via `BRAVE_API_KEY` and `TAVILY_API_KEY`.
- […] `/search` API, and carry the selected provider through sandbox creation, env injection, preset suggestions, and resume flows.
- […] `tools.web.search.provider` and the matching `apiKey` placeholder at runtime.
- […] `webSearchConfig.provider` in onboard session save/load paths, redact `TAVILY_API_KEY` in persisted failure messages, and add regression coverage for provider round-tripping and Tavily-specific onboarding behavior.
Type of Change
Verification
- `npx prek run --all-files` passes
- `npm test` passes
- `make docs` builds without warnings (doc changes only)
AI Disclosure
Signed-off-by: Lakshya Agarwal [email protected]