[KYUUBI #7379][2b/4] Data Agent Engine: agent runtime, middleware stack, and OpenAI provider#7417
[KYUUBI #7379][2b/4] Data Agent Engine: agent runtime, middleware stack, and OpenAI provider#7417wangzhigang1999 wants to merge 10 commits intoapache:masterfrom
Conversation
c534fd7 to
3011909
Compare
…re stack, OpenAI provider, and live E2E tests
This PR delivers the runtime layer of the Data Agent Engine on top of the tool
system and data source plumbing from 2a/4:
- ReactAgent: ReAct-style loop with streaming LLM responses, per-step tool
dispatch, and AgentRunContext tracking token usage, iterations, and session.
- Middleware stack (AgentMiddleware + ReactAgent.Builder):
* LoggingMiddleware -- structured per-step/LLM/tool/finish logs with MDC.
* ApprovalMiddleware -- CompletableFuture-based resolve for DESTRUCTIVE
tools; modes NORMAL / STRICT / AUTO_APPROVE.
* CompactionMiddleware -- token-threshold-triggered history summarization
with KEEP_RECENT_TURNS=4, emits a Compaction AgentEvent so clients can
observe the mechanism firing.
* ToolResultOffloadMiddleware -- spills large tool outputs to disk and
surfaces `read_tool_output` / `grep_tool_output` companion tools for the
LLM to re-query truncated previews.
- OpenAiProvider: single shared ReactAgent, per-session ConversationMemory,
streaming chat completions, Hikari-pooled JDBC data source; reads model and
thresholds from KyuubiConf.
- ExecuteStatement (Scala): encodes all AgentEvents (including compaction and
approval_request) as SSE JSON rows streamed through the JDBC reply column.
- KyuubiConf: new keys for LLM provider/api-url/model/api-key, approval mode,
compaction trigger tokens, offload root/thresholds, max iterations, etc.
- Tests:
* Unit tests for runtime, middlewares, offload store, and event shapes.
* Live tests gated on DATA_AGENT_LLM_API_KEY covering full LLM round-trips:
ReactAgentLiveTest (offload+grep, approval approve/deny), DataAgentE2ESuite
and DataAgentApprovalE2ESuite (JDBC layer), DataAgentCompactionE2ESuite
(JDBC-observable compaction event + post-compaction recovery),
CompactionMiddlewareLiveTest.
* Compatibility verified against qwen3.6-plus, glm-5, and kimi-k2.5 via
per-call `model=` logging in ReactAgent.
3011909 to
ce4eecc
Compare
MySQL Connector/J is GPL-licensed and cannot be bundled in an Apache binary release. Users who need the MySQL/StarRocks datasource at runtime should provide the driver jar themselves on the engine classpath. Addresses review feedback on apache#7417.
Evidence: runtime under real workload
TL;DR
Setup
ResultsOverall: 500 questions, By difficulty:
By database (sorted by EX):
Cost: ~45 min wall time at concurrency=8, ~21M tokens total. What this evidence supports
Scope disclaimers
Follow-up: Spark backend runSame harness, same 500 BIRD questions, but the agent targets a real EMR Kyuubi + Spark 3.5.3 cluster via
This setup is materially stricter than the official BIRD evaluation. BIRD pins the target What this run confirms for the PR:
|
ReactAgent Execution Flow |
|
Hi @pan3793, when you have time, could I ask for a review on this one? 🙏 Third PR of the Data Agent Engine series (umbrella #7379, labeled 2b/4) — adds the It's on the larger side (~5.3k lines, almost all under No rush — thanks! |
… drop SQLite/PostgreSQL bundle, pin kotlin/okhttp/okio Two pom-level cleanups requested in apache#7417 review: 1. Drop sqlite-jdbc and postgresql JDBC drivers from the binary bundle. sqlite-jdbc moves to test scope (still needed for unit tests); postgresql is no longer declared. Users targeting those databases provide the driver jar on the engine classpath the same way they do for any other JDBC source. Trims ~14 MB from the bundled tgz. 2. Pin the kotlin runtime and okhttp/okio versions transitively introduced by openai-java in the data-agent module's pom, so any drift across openai-java upgrades becomes a deliberate change rather than a silent transitive shift. Versions pinned at the values openai-java currently resolves to (kotlin-stdlib* 1.8.0, kotlin-reflect 2.0.21, okhttp 4.12.0, okio 3.6.0); the dependency tree is identical to before. Addresses review feedback on apache#7417.
…ename OpenAiProvider to ChatCompletionProvider
The previous config namespace and provider class name were ambiguous —
both implied an LLM/vendor identity (OpenAI) when in practice they
denote the OpenAI-compatible chat-completion protocol that virtually
every modern LLM endpoint speaks. Reviewers asked for vendor-neutral
naming aligned with Trino's ai.* function configuration style.
Config keys:
kyuubi.engine.data.agent.llm.api.key -> openai.api.key
kyuubi.engine.data.agent.llm.api.url -> openai.endpoint
kyuubi.engine.data.agent.llm.model -> model
Provider class:
org.apache.kyuubi.engine.dataagent.provider.openai.OpenAiProvider
-> org.apache.kyuubi.engine.dataagent.provider.chatcompletion
.ChatCompletionProvider
Env vars consumed by tests/E2E suites and the regenerated settings.md
follow the same renames.
Addresses review feedback on apache#7417.
…ialect class names Reviewer asked for proper acronym casing in class names. Rename: SqliteDialect -> SQLiteDialect MysqlDialect -> MySQLDialect and update test method names that embed the same tokens (testSqlite*, testMysql*, testDatasourceSqlite, testDatasourceMysql). Addresses review feedback on apache#7417.
…it sentinel actions in AgentMiddleware
Reviewer pushed back on null propagation in AgentMiddleware return types.
Apply the same sealed-style pattern uniformly across the three hooks
that historically used null to mean "do nothing":
beforeLlmCall -> LlmCallAction { LlmNoopAction | LlmSkip | LlmModifyMessages }
beforeToolCall -> ToolCallAction { ToolCallApproval | ToolCallDenial }
afterToolCall -> ToolResultAction { ToolResultUnchanged | ToolResultReplace }
Each base type is non-instantiable, the no-op subtype is a singleton
(*.INSTANCE), and the active subtype carries its payload. Defaults and
all built-in middleware (Logging, Approval, Compaction, ToolResultOffload)
return the appropriate sentinel; the ReactAgent dispatchers switch from
null checks to instanceof checks. Tests assert on the singleton or on
instanceof + cast to read the payload.
No behavior change; the goal is just to remove null from the contract.
Addresses review feedback on apache#7417.
…ter in AgentTool.execute Reviewer asked for the per-invocation context to come first so the parameter list reads context-then-payload, matching the conventional shape of "function(context, args)" used elsewhere in the codebase. Update the AgentTool interface signature, all production tool implementations (ReadToolOutputTool, GrepToolOutputTool, RunSelectQueryTool, RunMutationQueryTool), the ToolRegistry call site, and every test/test-helper call site that exercises tool.execute(...). Addresses review feedback on apache#7417.
…n types under Decision<T> Collapse the three sealed action hierarchies (LlmCallAction, ToolCallAction, ToolResultAction) plus nullable onEvent into a single generic Decision<T> with proceed / replace / abort. Pack tool-call (id, name, args) into ToolInvocation so beforeToolCall can rewrite args (e.g. inject SQL LIMIT, redact params), and align afterLlmCall by moving its dispatch ahead of the memory write so replace actually rewrites what enters memory and tool-call extraction.
…ATE in approval live test testApprovalApproveFlow asked the model to increment a counter and return the new value, but UPDATE returns no value, so weaker models (e.g. kimi-k2.5) hallucinated "0" instead of running a follow-up SELECT. Make the instruction explicit so behavior converges across models.
…lient + composite MiddlewareDispatcher ReactAgent had grown to mix three concerns: the ReAct control loop, OpenAI streaming/chunk assembly, and middleware fold logic. Extract: - LlmStreamClient: owns one streaming chat completion call, accumulates content + tool-call deltas, and exposes StreamResult.toAssistantMessage for SDK message construction. Depends only on the OpenAI SDK and AgentRunContext (emits ContentDelta via ctx.emit, no dispatcher reference). - MiddlewareDispatcher: implements AgentMiddleware as a composite over the configured list. ReactAgent calls onAgentStart / onEvent / beforeLlmCall etc. on the composite the same way it would call any middleware; resolveApproval stays as a non-interface accessor for the approval flow's special case. Also: afterToolCall now returns Decision<String> for symmetry with the other interceptor hooks; ABORT marks ToolResult.isError=true so listeners can distinguish a middleware-vetoed result from a successful one. The emit-then-forward step splits cleanly: the composite runs onEvent, and ReactAgent's ctx.setEventEmitter lambda forwards the filtered event to the user's raw consumer. ReactAgent's run() drops the eventConsumer parameter threading through internal helpers — everywhere downstream uses ctx.emit().
|
Thanks @pan3793 — pushed followup commits addressing all the feedback (deps & bundling, Trino-style config keys, Also folded in some internal cleanup: split |
|
thanks, merged to master |
Why are the changes needed?
Part 2b of 4 for the Data Agent Engine (umbrella, KPIP-7373).
This PR adds the ReAct agent runtime that drives the LLM <-> tool loop, a composable middleware stack around it, and a production
OpenAiProvider. It sits on top of the tool system and data source abstraction introduced in PR 2a, and is consumed by the REST layer in PR 3.Changes include:
ReactAgent— ReAct loop with streaming, tool-call dispatch, turn budget, malformed-tool-call recoveryConversationMemory— message history with cumulative prompt-token trackingAgentRunContext/AgentInvocation/ApprovalMode— per-run state plumbingToolOutputStore— size-gated tool-output offload, keyed by session+call-id, withReadToolOutputTool/GrepToolOutputToolfor LLM-driven retrievalAgentMiddlewareinterface withonRegisterhook for tool wiring, plus four middlewares:LoggingMiddleware— structured request/response loggingApprovalMiddleware— risk-level-based approval gateCompactionMiddleware— token-threshold-driven history summarization keyed by sessionToolResultOffloadMiddleware— transparently owns theToolOutputStoreand registers retrieval toolsOpenAiProvider— OpenAI-compatible chat completions with streaming and tool callsExecuteStatement.scala— SSE encoding extended to emitCompactioneventsdatasource.dialectpackage for organizationkyuubi.engine.data.agent.compaction.trigger.tokensconfiguration entryMockLlmProvider— deterministic mock for middleware and runtime testsmysql-connector-jmoved totestscope (GPL-licensed; cannot be bundled in an Apache binary release — addresses review feedback on [KYUUBI #7379][2b/4] Data Agent Engine: agent runtime, middleware stack, and OpenAI provider #7417)How was this patch tested?
ConversationMemoryTest,ToolOutputStoreTest,ApprovalMiddlewareTest,CompactionMiddlewareTest,ToolResultOffloadMiddlewareTest,event/EventTest, plus updates toToolRegistryThreadSafetyTest/ToolTest/RunSelectQueryToolTest/RunMutationQueryToolTest/JdbcDialectTest/ MySQLDialectTestDATA_AGENT_LLM_API_KEY/DATA_AGENT_LLM_API_URL/DATA_AGENT_LLM_MODEL):ReactAgentLiveTest,CompactionMiddlewareLiveTest— exercise the full loop against a real OpenAI-compatible endpointDataAgentE2ESuiteextended with OpenAI-provider paths; newDataAgentCompactionE2ESuiteobserves compaction via JDBCWas this patch authored or co-authored using generative AI tooling?
Partially assisted by Claude Code (Claude Opus 4.7) for test generation, code review, and PR formatting. Core design and implementation are human-authored.