feat: add Model Serving connector and plugin#239

Open
pkosiec wants to merge 2 commits into main from pkosiec/serving-1-core

pkosiec (Member) commented Apr 3, 2026

Summary

  • Add serving connector layer wrapping the Databricks SDK for endpoint invocation (invoke + SSE stream)
  • Add serving plugin with Express routes for /api/serving/:alias/invoke and /api/serving/:alias/stream
  • Add UPSTREAM_ERROR SSE error code for propagating Databricks API errors
  • Support named endpoint aliases for routing to multiple serving endpoints
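
As a rough illustration of the alias routing listed above, here is a minimal sketch of how an alias might map to an endpoint name and to the two route paths. `AliasMap`, `resolveEndpoint`, and `routePaths` are hypothetical names, not taken from the PR; only the `/api/serving/:alias/invoke` and `/api/serving/:alias/stream` paths come from the summary.

```typescript
// Hypothetical shapes for illustration; these names are not from the PR.
type AliasMap = Record<string, string>;

// Looks up the serving endpoint name registered under an alias.
// Uses an own-property check so inherited keys such as "constructor" are rejected.
function resolveEndpoint(aliases: AliasMap, alias: string): string {
  if (!Object.prototype.hasOwnProperty.call(aliases, alias)) {
    throw new Error(`Unknown serving alias: ${alias}`);
  }
  return aliases[alias];
}

// Builds the two route paths named in the summary for a given alias.
function routePaths(alias: string): { invoke: string; stream: string } {
  const segment = encodeURIComponent(alias);
  return {
    invoke: `/api/serving/${segment}/invoke`,
    stream: `/api/serving/${segment}/stream`,
  };
}
```

Rejecting unknown aliases up front keeps a typo in the URL from turning into an upstream request against an unintended endpoint.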

Demo

model-serving-demo-compressed.mp4

PR Stack — Model Serving

| # | PR | Description |
|---|------|------|
| 1 | this PR | Serving connector & plugin |
| 2 | #240 | Type generator, Vite plugin & UI hooks |
| 3 | #241 | Dev-playground, template & docs |

Add the core Model Serving plugin that provides an authenticated proxy
to Databricks Model Serving endpoints. Includes the connector layer
(SDK client wrapper) and the plugin layer (Express routes for
invoke/stream). Also adds UPSTREAM_ERROR SSE error code for propagating
API errors.

Signed-off-by: Pawel Kosiec <[email protected]>
pkosiec force-pushed the pkosiec/serving-1-core branch from 76a2618 to 41a0074 on April 3, 2026 10:14
The serving plugin was not forwarding the abort signal to the serving
connector, unlike the genie plugin. Without the signal, the connector's
fetch request cannot be cancelled and the abort-check loop never triggers.

Signed-off-by: Pawel Kosiec <[email protected]>
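
The fix in the commit message above can be sketched roughly as follows. This is not the PR's code: `RequestInitLike`, `buildRequestInit`, and `readUntilAborted` are hypothetical names, and the only assumption about the real connector is what the commit message states, namely that it passes a signal to `fetch` and checks it inside a read loop.

```typescript
// Hypothetical helpers for illustration; none of these names come from the PR.
interface RequestInitLike {
  method: "POST";
  body: string;
  signal?: AbortSignal;
}

// Builds the options object for fetch, forwarding the caller's signal so the
// in-flight request itself can be cancelled.
function buildRequestInit(body: unknown, signal?: AbortSignal): RequestInitLike {
  return { method: "POST", body: JSON.stringify(body), signal };
}

// Re-yields items from a source and stops once the signal is aborted.
// If no signal is passed, the check can never fire, which mirrors the bug
// described in the commit message.
async function* readUntilAborted<T>(
  source: AsyncIterable<T>,
  signal?: AbortSignal,
): AsyncGenerator<T> {
  for await (const item of source) {
    if (signal?.aborted) return;
    yield item;
  }
}
```

A caller would create one `AbortController`, pass `controller.signal` to both helpers, and call `controller.abort()` when the client disconnects.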

logger.debug("Streaming from endpoint %s at %s", endpointName, url);

const res = await fetch(url, {
Contributor

Did we use fetch because the WorkspaceClient does not have this method yet? (Same question for the non-streaming endpoint.)

endpointName: string,
body: Record<string, unknown>,
options?: ServingInvokeOptions,
): Promise<unknown> {
Contributor

We don't know the response type?
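
While the connector returns `Promise<unknown>`, a caller can still narrow the result at the call site. The sketch below shows one way to do that with a type guard. `ChatLikeResponse` is a made-up shape used purely for illustration; the PR does not say what any endpoint returns.

```typescript
// Hypothetical response shape, for illustration only; not defined by the PR.
interface ChatLikeResponse {
  choices: { message: { content: string } }[];
}

// Type guard that narrows an unknown payload to the hypothetical shape above.
function isChatLikeResponse(value: unknown): value is ChatLikeResponse {
  if (typeof value !== "object" || value === null) return false;
  const choices = (value as { choices?: unknown }).choices;
  return (
    Array.isArray(choices) &&
    choices.every((c) => typeof c?.message?.content === "string")
  );
}

// Returns the first message text, or undefined if the payload has another shape.
function firstMessage(payload: unknown): string | undefined {
  return isChatLikeResponse(payload) ? payload.choices[0]?.message.content : undefined;
}
```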

"permission": "CAN_QUERY",
"fields": {
"name": {
"env": "DATABRICKS_SERVING_ENDPOINT",
Contributor

As discussed offline: could we use name, since that makes it clearer that we're not expecting the actual URL?


buffer += decoder.decode(value, { stream: true });

if (buffer.length > MAX_BUFFER_SIZE) {
Contributor

I don't have a strong opinion, but would an error be preferable?
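
For reference, the error variant raised in this comment could look roughly like the sketch below. The excerpt does not show what the PR currently does when the limit is exceeded, so this illustrates only the alternative. `appendChunk` and `drainLines` are hypothetical helpers, and the 1 MiB value for `MAX_BUFFER_SIZE` is an assumption.

```typescript
// Assumed limit for illustration; the PR's actual MAX_BUFFER_SIZE is not shown.
const MAX_BUFFER_SIZE = 1024 * 1024;

// Appends a decoded chunk to the buffer and throws once the buffer outgrows
// the limit, instead of silently continuing.
function appendChunk(buffer: string, chunk: string, maxSize: number = MAX_BUFFER_SIZE): string {
  const next = buffer + chunk;
  if (next.length > maxSize) {
    throw new Error(`SSE buffer exceeded ${maxSize} characters`);
  }
  return next;
}

// Splits complete lines off the buffer and returns them together with the
// trailing partial line, which stays buffered until more data arrives.
function drainLines(buffer: string): { lines: string[]; rest: string } {
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? "";
  return { lines: parts, rest };
}
```

Throwing makes the failure visible to the caller, which could then surface it to the client, for example through the UPSTREAM_ERROR code this PR adds.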

Comment on lines +291 to +297
exports(): ServingFactory {
  return ((alias?: string) => ({
    invoke: (body: Record<string, unknown>) =>
      this.invoke(alias ?? "default", body),
    stream: (body: Record<string, unknown>) =>
      this.stream(alias ?? "default", body),
  })) as ServingFactory;
Collaborator

This will break the asUser functionality. You can probably follow the same approach the files plugin uses.

Comment on lines +92 to +101
// Always strip `stream` from the body — the connector controls this
const { stream: _stream, ...cleanBody } = body;

const headers = new Headers({
  "Content-Type": "application/json",
  Accept: "application/json",
});
await client.config.authenticate(headers);

logger.debug("Invoking endpoint %s at %s", endpointName, url);
Collaborator

A bit hacky, no? 😅 First, why aren't we using the client to call the model serving endpoint instead of writing our own fetch? Is it because there's no streaming option? If so, I would add a prominent comment explaining why we are doing this.

Member Author

There's a caveat: not all of the endpoints support streaming.
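
To make the body handling in this thread concrete, here is a minimal sketch of stripping a caller-supplied `stream` flag so that only the connector decides it. `buildUpstreamBody` is a hypothetical helper, and setting `stream: true` on the streaming path is an assumption; the excerpt confirms only that `stream` is always stripped from the incoming body.

```typescript
// Hypothetical helper; the PR excerpt only shows the destructuring step.
function buildUpstreamBody(
  body: Record<string, unknown>,
  streaming: boolean,
): Record<string, unknown> {
  // Drop whatever `stream` value the caller sent.
  const { stream: _ignored, ...cleanBody } = body;
  // Assumption: the streaming path re-adds the flag itself.
  return streaming ? { ...cleanBody, stream: true } : cleanBody;
}
```

With this shape, a request sent to the invoke route with `stream: true` in the body still produces a non-streaming upstream call, and a request to the stream route cannot opt out with `stream: false`.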

Comment on lines +141 to +149
const headers = new Headers({
  "Content-Type": "application/json",
  Accept: "text/event-stream",
});
await client.config.authenticate(headers);

logger.debug("Streaming from endpoint %s at %s", endpointName, url);

const res = await fetch(url, {
Collaborator

same here

Comment on lines +64 to +77
const cacheFile = path.join(
  process.cwd(),
  "node_modules",
  ".databricks",
  "appkit",
  ".appkit-serving-types-cache.json",
);
this.schemaAllowlists = await loadEndpointSchemas(cacheFile);
if (this.schemaAllowlists.size > 0) {
  logger.debug(
    "Loaded schema allowlists for %d endpoint(s)",
    this.schemaAllowlists.size,
  );
}
Collaborator

This assumes that the Vite plugin is added to the server, right? But this PR doesn't include the plugin. I'm also not keen on reading this file from the plugin; we should probably follow the same approach as we did with analytics.

Comment on lines +130 to +169
injectRoutes(router: IAppRouter) {
  if (this.isNamedMode) {
    this.route(router, {
      name: "invoke",
      method: "post",
      path: "/:alias/invoke",
      handler: async (req: express.Request, res: express.Response) => {
        await this.asUser(req)._handleInvoke(req, res);
      },
    });

    this.route(router, {
      name: "stream",
      method: "post",
      path: "/:alias/stream",
      handler: async (req: express.Request, res: express.Response) => {
        await this.asUser(req)._handleStream(req, res);
      },
    });
  } else {
    this.route(router, {
      name: "invoke",
      method: "post",
      path: "/invoke",
      handler: async (req: express.Request, res: express.Response) => {
        req.params.alias = "default";
        await this.asUser(req)._handleInvoke(req, res);
      },
    });

    this.route(router, {
      name: "stream",
      method: "post",
      path: "/stream",
      handler: async (req: express.Request, res: express.Response) => {
        req.params.alias = "default";
        await this.asUser(req)._handleStream(req, res);
      },
    });
  }
Collaborator

So we use OBO on all routes by default?

3 participants