Skip to content

Replace hapi with hono#131

Merged
taoeffect merged 80 commits intomasterfrom
replace-hapi-with-hono
May 8, 2026
Merged

Replace hapi with hono#131
taoeffect merged 80 commits intomasterfrom
replace-hapi-with-hono

Conversation

@corrideat
Copy link
Copy Markdown
Member

@corrideat corrideat commented Apr 14, 2026

This closes #75 and closes #69

@corrideat corrideat self-assigned this Apr 14, 2026
@corrideat corrideat marked this pull request as ready for review April 14, 2026 00:05
Copilot AI review requested due to automatic review settings April 14, 2026 00:05
@corrideat corrideat marked this pull request as draft April 14, 2026 00:05
devin-ai-integration[bot]

This comment was marked as resolved.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates the backend HTTP layer from Hapi to Hono, updating server bootstrapping, routing/auth, and static file serving, and adds new Deno tests to validate endpoint behavior.

Changes:

  • Replaced the Hapi server with a Hono app + @hono/node-server HTTP server adaptor.
  • Ported REST endpoints, auth, and static asset serving from Hapi/Joi/Boom patterns to Hono + HTTPException.
  • Added a new Deno test suite (helpers + read/write/KV/ZKPP/stateless tests) to cover the migrated routes.

Reviewed changes

Copilot reviewed 19 out of 20 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/serve/server.ts Replace Hapi server bootstrap with Hono + node-server adaptor and update WS version info payload.
src/serve/routes.ts Port route definitions to Hono handlers, replace Boom/Joi validation with manual checks, and implement static file serving.
src/serve/auth.ts Replace Hapi auth plugin with Hono middleware that sets credentials and authStrategy on context.
src/serve/dashboard-server.ts Replace Hapi/Inert dashboard server with Hono + serveStatic.
src/serve/database.ts Replace Boom exceptions with typed backend errors (NotFound/Gone/Conflict) for mapping at the HTTP layer.
src/serve/errors.ts Add BackendErrorConflict to support conflict mapping.
src/serve/index.ts Update shutdown log message to remove Hapi reference.
src/pin.ts Use fullContractName when updating config keys and logging contract name.
src/eventsAfter.ts Remove Boom-specific wording in comment.
deno.json Add Hono imports and adjust test task flags.
deno.lock Update dependency lock entries for Hono packages and other version shifts.
src/serve/routes-*.test.ts, src/serve/routes-test-helpers.ts New route-level Deno tests covering the migrated endpoints.
build/* Regenerated build artifacts reflecting the server/runtime changes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/serve/routes.ts Outdated
Comment thread src/serve/routes.ts Outdated
Comment on lines +519 to +526
// Check content-length before reading body
const contentLength = parseInt(c.req.header('content-length') || '0', 10)
if (contentLength > FILE_UPLOAD_MAX_BYTES) {
throw new HTTPException(413, { message: 'Payload too large' })
}
const ourHash = createCID((request.payload as { payload: Uint8Array }[])[i].payload, multicodes.SHELTER_FILE_CHUNK)
if ((request.payload as { payload: Uint8Array }[])[i].payload.byteLength !== chunk[0]) {
throw Boom.badRequest('bad chunk size')

const formData = await c.req.formData()
const manifestFile = formData.get('manifest') as File | null
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upload size guard relies only on content-length, but for chunked uploads or missing/invalid content-length this becomes 0 and c.req.formData() will still buffer the entire body into memory. This is a regression from Hapi’s maxBytes and can allow large uploads/DoS. Consider enforcing a hard byte limit while reading the request body (or requiring a valid Content-Length and rejecting otherwise), and ensure the limit is applied before fully materializing formData().

Copilot uses AI. Check for mistakes.
Comment thread src/serve/routes.ts Outdated
Comment on lines +791 to +792
const payloadBuffer = Buffer.from(await c.req.arrayBuffer())

Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Buffer.from(await c.req.arrayBuffer()) reads the entire KV payload into memory with no size limit. Previously this endpoint had a maxBytes cap; without an explicit limit this can be abused for memory/CPU DoS. Add a request-size limit (e.g., reject when Content-Length exceeds the configured cap and/or enforce a streaming limit while reading the body).

Suggested change
const payloadBuffer = Buffer.from(await c.req.arrayBuffer())
const MAX_KV_PAYLOAD_BYTES = 1024 * 1024
const readRequestBodyWithLimit = async (): Promise<Buffer> => {
const contentLength = c.req.header('content-length')
if (contentLength !== undefined) {
const parsedLength = Number(contentLength)
if (!Number.isFinite(parsedLength) || parsedLength < 0) {
throw new HTTPException(400, { message: 'Invalid Content-Length' })
}
if (parsedLength > MAX_KV_PAYLOAD_BYTES) {
throw new HTTPException(413, { message: 'Payload too large' })
}
}
const body = c.req.raw.body
if (!body) return Buffer.alloc(0)
const reader = body.getReader()
const chunks: Uint8Array[] = []
let totalBytes = 0
try {
while (true) {
const { done, value } = await reader.read()
if (done) break
if (!value) continue
totalBytes += value.byteLength
if (totalBytes > MAX_KV_PAYLOAD_BYTES) {
throw new HTTPException(413, { message: 'Payload too large' })
}
chunks.push(value)
}
} finally {
reader.releaseLock()
}
return Buffer.concat(chunks.map((chunk) => Buffer.from(chunk)), totalBytes)
}
const payloadBuffer = await readRequestBodyWithLimit()

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really a regression, but the issue is valid. See #133.

Comment thread src/serve/dashboard-server.ts Outdated
Comment thread deno.json Outdated
Comment thread src/pin.ts Outdated
Comment thread src/serve/routes.ts Outdated
@taoeffect
Copy link
Copy Markdown
Member

taoeffect commented Apr 14, 2026

/crush_fast


AI review started.

@github-actions
Copy link
Copy Markdown

Advanced AI Review

  • Type: Agentic (crush)
  • Model: glm-5.1
Click to expand review

Review generated using glm-5.1 via Z.AI. Comment /crush_fast to re-run.

Comment thread src/serve/auth.ts Outdated
Comment thread src/serve/auth.ts Outdated
devin-ai-integration[bot]

This comment was marked as resolved.

@taoeffect
Copy link
Copy Markdown
Member

taoeffect commented Apr 27, 2026

/crush_fast {{openrouter, deepseek/deepseek-v4-pro}}


AI review started.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 27, 2026

Advanced AI Review

  • Type: Agentic (crush)
  • Model: deepseek/deepseek-v4-pro
Click to expand review

Now let me check for any remaining issues with the route order and the serveAsset function, plus explore the isClosing dead code:

Now I have the full picture. Here's my review:


🔴 Issue 1: Server starts but database is immediately closed — server becomes non-functional

  • Addressed
  • Dismissed

src/serve.ts:97-139 — The old code had await new Promise(() => {}) to keep the process (and database connection) alive indefinitely after starting servers. This line was removed. Now after startApplicationServer() resolves, the finally block runs closeDB(), which closes the database connection while the HTTP servers are still accepting requests.

// src/serve.ts:97-138
export async function serve (args: ArgumentsCamelCase<Params>) {
  await initDB()
  try {
    await startDashboardServer()     // non-blocking after listen
    if (args.dev) {
      await watch(args)              // non-blocking (starts detached watcher)
    } else {
      await deployManifests(args)    // blocks until deploy completes, then returns
    }
    await startApplicationServer()   // non-blocking after listen
    // OLD CODE HAD: await new Promise(() => {})  <-- REMOVED
  } catch (error) {
    console.error(colors.red('❌ Failed to start server:'), error)
    process.exit(1)
  } finally {
    await closeDB()                  // 💥 DB closes while server is still running!
  }
}

Effect: Every API request after the first few seconds will fail because the database backend has been torn down. The process stays alive because the HTTP server sockets are open, but the server is effectively dead.

Fix: Either restore await new Promise(() => {}) before the finally block, or move closeDB() registration into the signal handlers:

// Option A — restore the blocking promise:
    await startApplicationServer()
    // Keep the process alive
    await new Promise(() => {})
  } catch (error) {
    console.error(colors.red('❌ Failed to start server:'), error)
    process.exit(1)
  }

// Option B — move closeDB to signal handlers and rely on stopServer():
export async function serve (args: ArgumentsCamelCase<Params>) {
  await initDB()
  try {
    await startDashboardServer()
    if (args.dev) {
      await watch(args)
    } else {
      await deployManifests(args)
    }
    await startApplicationServer()
    // Block forever; cleanup happens in stopServer() via signal handlers
    await new Promise(() => {})
  } catch (error) {
    console.error(colors.red('❌ Failed to start server:'), error)
    process.exit(1)
  }
}

🟡 Issue 2: safePathWithin rejects valid paths when base has a trailing slash

  • Addressed
  • Dismissed

src/serve/routes.ts:251-255 — The path traversal guard concatenates path.sep to base before the startsWith check, producing a double slash if base already ends with a separator:

function safePathWithin (base: string, subpath: string): string | null {
  const resolved = path.resolve(base, subpath)
  if (!resolved.startsWith(base + path.sep) && resolved !== base) return null
  return resolved
}

If base = "/app/assets/" and subpath = "foo.js":

  • resolved = path.resolve("/app/assets/", "foo.js")"/app/assets/foo.js"
  • base + path.sep"/app/assets//"
  • "/app/assets/foo.js".startsWith("/app/assets//")false → returns null (incorrect rejection)

Fix: Normalize base first:

function safePathWithin (base: string, subpath: string): string | null {
  const normalizedBase = path.resolve(base)
  const resolved = path.resolve(normalizedBase, subpath)
  if (!resolved.startsWith(normalizedBase + path.sep) && resolved !== normalizedBase) return null
  return resolved
}

🟡 Issue 3: errorMapper doesn't handle BackendErrorConflict, causing unexpected 500 responses

  • Addressed
  • Dismissed

src/serve/routes.ts:220-232 — The error mapper handles BackendErrorNotFound, BackendErrorGone, and BackendErrorBadData, but BackendErrorConflict (defined in src/serve/errors.ts:6) falls through to the default 500 handler:

const errorMapper = (e: Error): HTTPException => {
  switch (e?.name) {
    case 'BackendErrorNotFound':
      return new HTTPException(404)
    case 'BackendErrorGone':
      return new HTTPException(410)
    case 'BackendErrorBadData':
      return new HTTPException(422, { message: e.message })
    default:
      console.error(e, 'Unexpected backend error')
      return new HTTPException(500, { message: e.message ?? 'internal error' })
  }
}

While the /event route handler manually catches BackendErrorConflict and converts it to a 409 (line 416-417), other routes using errorMapper — or future code — would get a confusing 500 instead of 409.

Fix:

    case 'BackendErrorConflict':
      return new HTTPException(409, { message: e.message })

🟡 Issue 4: ctEq contains misleading non-null assertions and can be simplified

  • Addressed
  • Dismissed

src/serve/routes.ts:181-187 — When actual is longer than expected, expected.codePointAt(i) returns undefined for out-of-bounds indices, and the ! assertion suppresses what is genuinely an undefined value:

const ctEq = (expected: string, actual: string): boolean => {
  let r = actual.length ^ expected.length
  for (let i = 0; i < actual.length; i++) {
    r |= actual.codePointAt(i)! ^ expected.codePointAt(i)!  // ! on undefined
  }
  return r === 0
}

The function is functionally correct (the bitwise ^ coerces undefined to 0, and the initial length XOR already makes r non-zero for mismatched lengths), but the assertions are misleading. Also, charCodeAt is more appropriate since callers only pass hex hashes (ASCII).

Fix:

const ctEq = (expected: string, actual: string): boolean => {
  let r = actual.length ^ expected.length
  for (let i = 0; i < actual.length && i < expected.length; i++) {
    r |= actual.charCodeAt(i) ^ expected.charCodeAt(i)
  }
  return r === 0
}

Adding i < expected.length to the loop condition avoids accessing out-of-bounds indices entirely, making the logic obviously correct.


⚪️ Issue 5: initDB contains a dead isClosing check inside the queued promise

  • Addressed
  • Dismissed

src/serve/database.ts:207-210 — The check if (isClosing) throw ... is inside the initPromise closure, but initDB() first awaits getCurrentClosePromise() (line 204), which guarantees closeDB() has finished before initDB() proceeds. So isClosing can never be true when this check runs:

export const initDB = async ({ skipDbPreloading }: { skipDbPreloading?: boolean } = {}) => {
  await getCurrentClosePromise()   // waits for closeDB to finish

  const thisInitPromise = initPromise ?? (async () => {
    if (isClosing) {               // dead: can never be true here
      throw new Error('Cannot init DB while closing is in progress')
    }
    // ...

The guard is never reachable because the await above serializes access.

Fix: Remove the dead check:

  const thisInitPromise = initPromise ?? (async () => {
    installBaseSelectorsOnce()
    // ...

⚪️ Issue 6: lock/overwrite called on every initDB() in development mode

  • Addressed
  • Dismissed

src/serve/database.ts:232 — The condition !setSelectors || !(import.meta as ImportMeta).lockDbSelectors means that when lockDbSelectors is false (development), both overwrite and lock are called on every initDB() invocation, not just the first:

if (!setSelectors || !(import.meta as ImportMeta).lockDbSelectors) {
  sbp('sbp/selectors/overwrite', { ... })
  sbp('sbp/selectors/lock', [...])
  setSelectors = true
}

If overwrite implicitly unlocks (common in SBP), this is wasteful but harmless. However, some SBP implementations may warn or throw on repeated lock calls on already-locked selectors.

Fix: Guard both overwrite and lock independently:

if (!setSelectors) {
  sbp('sbp/selectors/overwrite', { ... })
  if ((import.meta as ImportMeta).lockDbSelectors) {
    sbp('sbp/selectors/lock', [...])
    setSelectors = true
  }
}

</details>

---
*Review [generated](https://github.com/okTurtles/chel/actions/runs/25015789346) using `deepseek/deepseek-v4-pro` via OpenRouter. Comment `/crush_fast {{openrouter, deepseek/deepseek-v4-pro}}` to re-run.*

@corrideat
Copy link
Copy Markdown
Member Author

corrideat commented Apr 27, 2026

/crush_fast


AI review started.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 27, 2026

Advanced AI Review

  • Type: Agentic (crush)
  • Model: glm-5.1
Click to expand review

Let me check a few more specific things to validate my findings:---

Code Review: Replace Hapi with Hono


🔴 Issue 1: Server shuts down DB immediately after startup; DB never properly closed on process exit

  • Addressed
  • Dismissed

File: src/serve.ts:97-138

The old code had await new Promise(() => {}) at the end of the try block to keep serve() alive indefinitely. The finally { closeDB() } only ran when the process was shutting down (via the SERVER_EXITING event chain). That blocking line was removed:

// src/serve.ts — current code
export async function serve (args: ArgumentsCamelCase<Params>) {
  await initDB()                        // dbRefs: 0 → 1
  try {
    await startDashboardServer()         // resolves immediately after listen()
    await deployManifests(args)          // resolves quickly
    await startApplicationServer()       // resolves immediately after listen()
    // ← try block ends here, nothing keeps it alive
  } catch (error) {
    process.exit(1)
  } finally {
    await closeDB()                      // dbRefs: 2 → 1 (doesn't actually close!)
  }
}

Meanwhile, server.ts:540 also calls initDB(), so dbRefs reaches 2. The finally block decrements it back to 1, meaning the DB stays open but the ref count is permanently off. When a signal arrives, exit() (in src/serve/index.ts:91-103) calls process.exit()but stopServer() is never invoked and no SERVER_EXITING handler exists to close the DB, terminate workers, close the pubsub server, or shut down the HTTP server. The old database.ts had a SERVER_EXITING handler that called close() — that handler was removed and nothing replaced it.

Impact: Every graceful shutdown leaves the DB un-cleanly closed (risking data corruption with SQLite), workers un-terminated, and the pubsub server's intervals running.

Suggested fix: Register a SERVER_EXITING listener that calls stopServer() (which already does all the right cleanup), and remove the redundant initDB()/closeDB() from serve.ts since startServer() in server.ts handles it:

// src/serve/server.ts — add after installServerSelectorsOnce():

sbp('okTurtles.events/once', SERVER_EXITING, () => {
  sbp('okTurtles.eventQueue/queueEvent', SERVER_EXITING, async () => {
    await stopServer()
  })
})
// src/serve.ts — simplify:
export async function serve (args: ArgumentsCamelCase<Params>) {
  try {
    await startDashboardServer()
    // ...
    await startApplicationServer()
  } catch (error) {
    console.error(colors.red('❌ Failed to start server:'), error)
    process.exit(1)
  }
}

🔴 Issue 2: SERVER_EXITING event has no listeners — graceful shutdown is broken

  • Addressed
  • Dismissed

File: src/serve/index.ts:91-103, src/serve/database.ts (old handler removed)

In the old codebase, database.ts registered:

sbp('okTurtles.events/once', SERVER_EXITING, () => {
  sbp('okTurtles.eventQueue/queueEvent', SERVER_EXITING, async () => {
    await close()
  })
})

This was the only SERVER_EXITING handler. It has been removed. A search of the entire src/ directory confirms SERVER_EXITING is now only emitted (in index.ts:102) and listened-to by the exit() function itself to coordinate the final process.exit() — but no handler actually performs cleanup.

The stopServer() function in server.ts:668-698 does the right thing (closes intervals, pubsub, HTTP server, workers, persistent actions, DB) but is never called during shutdown. It's only exposed as an SBP selector backend/server/stop that nobody invokes on shutdown.


🟡 Issue 3: ImportMeta type definition missing lockDbSelectors

  • Addressed
  • Dismissed

File: src/types/build.d.ts:1-5

scripts/build.ts injects 'import.meta.lockDbSelectors': 'true', and the code uses it at src/serve/database.ts:107 and src/serve/server.ts:13 as (import.meta as ImportMeta).lockDbSelectors. But the ImportMeta interface in build.d.ts doesn't declare it:

// src/types/build.d.ts — current
interface ImportMeta {
  VERSION: string,
  ownerSizeTotalWorker?: string,
  creditsWorker?: string
}

Should be:

interface ImportMeta {
  VERSION: string,
  ownerSizeTotalWorker?: string,
  creditsWorker?: string,
  lockDbSelectors?: boolean
}

🟡 Issue 4: dashboard-server.ts imports serveStatic from hono/deno but runs via @hono/node-server

  • Addressed
  • Dismissed

File: src/serve/dashboard-server.ts:4-5

import { serveStatic } from 'npm:hono/deno'
import { createAdaptorServer } from 'npm:@hono/node-server'

The HTTP server is created using @hono/node-server (which creates a Node.js http.Server), but serveStatic is imported from hono/deno (which uses Deno.open/Deno.stat). While this works when running under Deno (since Deno supports both APIs), it will break if the built output is ever run under plain Node.js (which the platform: 'node' setting in scripts/build.ts suggests is intended). The correct import for a Node.js-compatible build would be:

import { serveStatic } from 'npm:@hono/node-server/serve-static'

Or conditionally import based on the runtime.


🟡 Issue 5: exit() in eventsAfter.ts/get.ts bypasses finally { closeDB() }

  • Addressed
  • Dismissed

Files: src/eventsAfter.ts:25-30, src/get.ts:39-43

// src/eventsAfter.ts
} catch (error) {
  exit(error)         // calls Deno.exit(1) — terminates immediately
} finally {
  if (dbOpen) {
    await closeDB()   // never reached on error path
  }
}

The exit() utility (src/utils.ts) calls Deno.exit(1), which terminates the process synchronously. The finally block's await closeDB() never executes in the error path. The finally block only runs on the success path, where it's still useful. This isn't a regression (the old code had the same issue), but the new finally blocks give a false impression that cleanup happens on errors.


⚪️ Issue 6: GitHub Actions workflow uses non-standard runner

  • Addressed
  • Dismissed

File: .github/workflows/pull-request-bin.yml:11

runs-on: ubuntu-slim

ubuntu-slim is not a standard GitHub-hosted runner. The standard runners are ubuntu-latest, ubuntu-22.04, ubuntu-24.04, etc. This will fail on GitHub-hosted runners unless a self-hosted runner with that label exists.


⚪️ Issue 7: bin/chel.exe is a text placeholder masquerading as a binary

  • Addressed
  • Dismissed

File: bin/chel.exe

placeholder, will be replaced by real binary. DO NOT DELETE THIS FILE!

This is a plain text file with a .exe extension. If npm's post-install script fails to replace it, running bin/chel.exe on Windows would produce a confusing error rather than a clear "binary not found" message. Consider using an empty file or naming it chel.exe.placeholder instead.


⚪️ Issue 8: Redundant initDB() call in serve.ts

  • Addressed
  • Dismissed

File: src/serve.ts:98 and src/serve/server.ts:540

Both serve() and startServer() call initDB(). Since startServer() is called by startApplicationServer() which is called by serve(), the DB is initialized twice. The ref-counting mechanism handles this correctly, but it results in dbRefs being 2 instead of 1, meaning the first closeDB() only decrements the counter without actually closing. Remove the initDB() from serve.ts and let server.ts own the lifecycle.


Review generated using glm-5.1 via Z.AI. Comment /crush_fast to re-run.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 new potential issues.

View 42 additional findings in Devin Review.

Open in Devin Review

Comment thread src/serve/routes.ts Outdated
Comment thread .github/workflows/pull-request-bin.yml
Comment thread src/serve/database.ts Outdated
Comment thread src/serve/routes.ts
@corrideat
Copy link
Copy Markdown
Member Author

corrideat commented May 2, 2026

/crush_fast


AI review started.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 2, 2026

Advanced AI Review

  • Type: Agentic (crush)
  • Model: glm-5.1
Click to expand review

Let me investigate the codebase to understand the changes better, particularly the new server implementation and related files.Now let me look at the latestState.ts command to check for DB cleanup there too:Now let me check the upload.ts to understand how it handles URL vs dir:Now let me check the test files to see if there are any test failures expected:Now let me look at the CI workflow file more closely:Let me check the safePathWithin function more carefully:Now let me check the z.stringbool() usage:Now let me check one more important area - the hash.ts describe string that looks like a bug:Now let me look at the parseArgs.ts more carefully to understand the handler flow:Now let me check the serve function's flow more carefully - specifically the DB reference counting interaction with startServer:Now I have a thorough understanding of all the changes. Let me compile my review.

Code Review: Replace Hapi with Hono

Issue 1 — safePathWithin compares against non-normalized base, enabling path traversal

🔴

  • Addressed
  • Dismissed

src/serve/routes.ts:253-258

function safePathWithin (base: string, subpath: string): string | null {
  const normalizedBase = path.resolve(base)
  const resolved = path.resolve(normalizedBase, subpath)
  if (!resolved.startsWith(base + path.sep) && resolved !== normalizedBase) return null
  return resolved
}

The check on line 256 compares resolved against the original base string, not normalizedBase. If base is a relative path like "./assets", then resolved (an absolute path) will never start with "./assets" + path.sep, so the check will always return null, and all static file serving will fail. If base is absolute, this works by accident.

The fix is to compare against normalizedBase:

function safePathWithin (base: string, subpath: string): string | null {
  const normalizedBase = path.resolve(base)
  const resolved = path.resolve(normalizedBase, subpath)
  if (!resolved.startsWith(normalizedBase + path.sep) && resolved !== normalizedBase) return null
  return resolved
}

Issue 2 — serve() calls closeDB() in finally but startServer() is a long-running promise that blocks the finally from ever running while the server is alive

🔴

  • Addressed
  • Dismissed

src/serve.ts:88-133

export async function serve (args: ArgumentsCamelCase<Params>) {
  await initDB()
  try {
    // ...
    await startServer()  // This resolves once the server is running, not when it stops
  } catch (error) {
    // ...
    throw error
  } finally {
    await closeDB()
  }
}

startServer() (in src/serve/index.ts:109-130) resolves as soon as the server is listening — it does not block until the server stops. The old code had await new Promise(() => {}) to block forever. Now the finally block runs immediately after startup, closing the DB while the server is still running.

The startServer function in src/serve/index.ts waits for SERVER_RUNNING:

return await new Promise((resolve, reject) => {
  sbp('okTurtles.events/on', SERVER_RUNNING, function onRunning (info) {
    sbp('okTurtles.events/off', SERVER_RUNNING, onRunning)
    resolve({ uri: info.info.uri })  // resolves immediately
  })
  startServerImpl().catch(reject)
})

Since startServer resolves once the server is up, closeDB() in the finally will execute immediately. However, startServerImpl() also calls initDB() (in server.ts:540), so the DB ref count is 2. The finally closeDB() will decrement to 1, so it won't actually close. But this is fragile and relies on the ref-counting being exactly right. If the ref count is off (e.g., due to an error path), the DB gets closed under a running server.

If the intent is that serve() should keep the process alive, you need to re-add a blocking promise or use a different mechanism:

export async function serve (args: ArgumentsCamelCase<Params>) {
  await initDB()
  // ...
  await startServer()
  // Block forever — the server stops via signal handlers calling stopServer()
  await new Promise(() => {})
}

And handle closeDB() in the stopServer() path instead (which it already does via server.ts:693).


Issue 3 — Broken string literal in hash.ts describe

🔴

  • Addressed
  • Dismissed

src/hash.ts:33-34

  describe:
    'Computes and logs the content identifier (CID) for the given file.\n\' + \'File contents will be interpreted as raw binary data, unless the file extension is \'.json\'.',

This was a pre-existing bug that got reformatted but not fixed. The \' + \' is a literal string concatenation that appears verbatim in the output as ' + '. The original code was almost certainly:

  describe:
    "Computes and logs the content identifier (CID) for the given file.\n" +
    "File contents will be interpreted as raw binary data, unless the file extension is '.json'.",

Or as a single string:

  describe:
    'Computes and logs the content identifier (CID) for the given file.\nFile contents will be interpreted as raw binary data, unless the file extension is \'.json\'.',

Issue 4 — Dashboard server uses serveStatic from hono/deno which won't work under Node.js

🟡

  • Addressed
  • Dismissed

src/serve/dashboard-server.ts:4

import { serveStatic } from 'npm:hono/deno'

The rest of the server code uses @hono/node-server explicitly (since this is a Deno project compiled with esbuild targeting platform: 'node'). Using serveStatic from hono/deno will fail at runtime in a compiled Node.js binary because Deno-specific APIs (Deno.open, Deno.stat, etc.) won't be available. This should likely be:

import { serveStatic } from 'npm:@hono/node-server/serve-static'

Issue 5 — getClientIP trusts X-Real-IP header without validation, enabling IP spoofing for rate limiting bypass

🟡

  • Addressed
  • Dismissed

src/serve/routes.ts:236-245

export function getClientIP (c: Context): string {
  const headerIP = c.req.header('x-real-ip')
  if (headerIP) return headerIP
  // ...
}

This is documented as a requirement in the POST /event handler comment, but the header is trusted globally for all rate-limited endpoints. If the server is exposed directly to the internet (no reverse proxy), any client can set X-Real-IP to bypass rate limiting entirely. The code comment says proxies should set this, but there's no configuration to disable header-based IP detection, and no validation that the header contains a valid IP.

Consider either:

  • Only trusting the header when explicitly configured (e.g., trustedProxy config option)
  • Validating the header value is actually an IP address using isIP()

Issue 6 — authMiddleware in optional mode may proceed with invalid credentials instead of rejecting

🟡

  • Addressed
  • Dismissed

src/serve/auth.ts:45-68

export function authMiddleware (
  strategies: string | string[],
  mode: 'required' | 'optional' = 'required'
): MiddlewareHandler {
  // ...
  return async (c, next) => {
    for (const strategy of strategyList) {
      const extractor = extractors[strategy]
      if (!extractor) throw new Error(`Unknown auth strategy: ${strategy}`)
      const credentials = extractor(c)
      if (credentials) {
        c.set('credentials', credentials)
        c.set('authStrategy', strategy)
        return next()
      }
    }
    if (mode === 'optional') {
      c.set('credentials', {} as AuthCredentials)
      c.set('authStrategy', '')
      return next()
    }
    throw new HTTPException(401, { message: 'Unauthorized' })
  }
}

The old Hapi code would distinguish between "no Authorization header at all" (which is fine for optional mode) and "Authorization header present but invalid" (which should fail). The current implementation treats both cases identically — if extractor(c) returns null (e.g., malformed or wrong scheme), it falls through to the optional branch and proceeds with empty credentials. This means a client sending Authorization: bearer (empty token) or Authorization: Basic abc gets treated as unauthenticated rather than rejected.

For the POST /event endpoint which uses authMiddleware('chel-shelter', 'optional'), this could mean that a malformed shelter auth header silently falls through to the unauthenticated path, creating identity contracts without proper attribution when the user actually intended to authenticate.


Issue 7 — extractBearer and extractShelter use inconsistent case sensitivity

⚪️

  • Addressed
  • Dismissed

src/serve/auth.ts:18-37

function extractBearer (c: Context): AuthCredentials | null {
  const authorization = c.req.header('authorization')
  if (!authorization) return null
  const prefix = 'bearer '
  if (authorization.slice(0, prefix.length).toLowerCase() !== prefix) return null
  return { token: authorization.slice(prefix.length) }
}

function extractShelter (c: Context): AuthCredentials | null {
  const authorization = c.req.header('authorization')
  if (!authorization) return null
  const prefix = 'shelter '
  if (authorization.slice(0, prefix.length).toLowerCase() !== prefix) return null

Both use .toLowerCase() on the prefix portion, which is correct per RFC 7235. However, the token value at authorization.slice(prefix.length) preserves original casing. For bearer tokens this is fine, but for extractShelter, the rest of the authorization header (after "Shelter ") is passed to verifyShelterAuthorizationHeader — this is correct since the signature payload is case-sensitive.

The issue is minor: the prefix comparison uses toLowerCase() on the full prefix including the space, which works correctly but is slightly unusual. This is fine and not a real bug, just noting the pattern is correct.


Issue 8 — x-real-ip header is used but CORS is set to origin: '*'

⚪️

  • Addressed
  • Dismissed

src/serve/server.ts:419 and src/serve/routes.ts:237

currentApp.use('*', cors({ origin: '*' }))
export function getClientIP (c: Context): string {
  const headerIP = c.req.header('x-real-ip')
  if (headerIP) return headerIP

CORS origin: * means any website can make cross-origin requests to the server. Combined with the X-Real-IP trust issue, a malicious website could craft requests that include X-Real-IP headers (if the browser allows it, which it generally doesn't for forbidden headers — but proxies might). This is more of a defense-in-depth concern than a direct vulnerability.


Issue 9 — CI workflow fails on direct pushes to master (empty BASE_REF)

⚪️

  • Addressed
  • Dismissed

.github/workflows/pull-request-bin.yml:7-10,23

on:
  push:
    branches: [ "master" ]
  pull_request:
    branches: [ "master" ]
BASE_REF="${{ github.event.pull_request.base.ref }}"
git fetch origin "$BASE_REF" --depth=1

When triggered by a push event (not a PR), github.event.pull_request.base.ref is empty, causing git fetch origin "" --depth=1 and git diff --name-only "origin/"..HEAD which will either fail or produce incorrect results. The workflow should either skip the check on push events or guard against an empty BASE_REF.

if: github.event_name == 'pull_request'

Issue 10 — bin/chel.exe placeholder is a text file but bin/chel is binary — this will cause runtime failures on Windows

⚪️

  • Addressed
  • Dismissed
bin/chel.exe  →  placeholder, will be replaced by real binary. DO NOT DELETE THIS FILE!
bin/chel      →  (binary)

The bin/chel.exe file contains the literal text placeholder, will be replaced by real binary. DO NOT DELETE THIS FILE!\n. If the post-install script fails or doesn't run, users on Windows will get a non-executable file. This is acknowledged by the .gitignore comment referencing issue #125, but worth noting as a known limitation.


Issue 11 — deployManifests in serve.ts opens and closes DB via deployupload, but serve() calls initDB() to prevent closure

⚪️

  • Addressed
  • Dismissed

src/serve.ts:89-91

  // `deployManifests` / `watch` will open the database and then close it.
  // By callig `initDB` here, we ensure that the DB isn't closed.
  await initDB()

The comment says "calling initDB here" but the word "callig" has a typo. More importantly, the DB ref-counting logic is subtle: serve calls initDB() → ref count = 1, then deployManifestsdeployuploadinitDB → ref count = 2, then upload's closeDB → ref count = 1. Then startServerserver.ts's initDB() → ref count = 2. Then serve's finallycloseDB → ref count = 1. This works correctly as long as nothing goes wrong in between, but the reference counting is fragile and error-prone. Consider structured patterns like RAII or a more explicit lifecycle manager.


Review generated using glm-5.1 via Z.AI. Comment /crush_fast to re-run.

Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 new potential issue.

View 43 additional findings in Devin Review.

Open in Devin Review

Comment thread .github/workflows/pull-request-bin.yml
@corrideat
Copy link
Copy Markdown
Member Author

One preexisting but important issue is this:

Issue 5 — getClientIP trusts X-Real-IP header without validation, enabling IP spoofing for rate limiting bypass

If the server is directly exposed to the Internet, for whatever reason, limits are trivial to circumvent by setting to x-real-ip to any arbitrary, client-chosen value.

@taoeffect
Copy link
Copy Markdown
Member

taoeffect commented May 4, 2026

GPT-5.5:

Details

1. 🟡 Security headers are skipped for CORS preflights and thrown errors

  • Addressed
  • Dismissed

The Hapi implementation added X-Frame-Options: DENY from an onPreResponse extension for both normal and Boom error responses. In the Hono version, the header is installed as middleware after the CORS middleware in src/serve/server.ts:419-425:

currentApp.use('*', cors({ origin: '*' }))

currentApp.use('*', async (c, next) => {
  await next()
  c.header('X-Frame-Options', 'DENY')
})

Because Hono's CORS middleware handles OPTIONS preflight requests without calling later middleware, those responses never get the header. Also, if a route throws an HTTPException or other error, await next() rejects and the header-setting line is skipped. This is a regression from the previous Hapi onPreResponse path that handled Boom responses explicitly.

Move the header before CORS and make error responses set it too, for example:

currentApp.use('*', async (c, next) => {
  await next()
  c.header('X-Frame-Options', 'DENY')
})
currentApp.use('*', cors({ origin: '*' }))
currentApp.onError((err, c) => {
  const response = err instanceof HTTPException
    ? err.getResponse()
    : c.text('Internal Server Error', 500)
  response.headers.set('X-Frame-Options', 'DENY')
  return response
})

2. 🟡 Missing dashboard assets return index.html with status 200

  • Addressed
  • Dismissed

The dashboard fallback in src/serve/dashboard-server.ts:35-43 tries static serving and then returns index.html for every unmatched path:

app.get('/*', async (c) => {
  const staticResponse = await staticMiddleware(c, async () => {})
  if (staticResponse) {
    return staticResponse
  }
  const indexResponse = await indexMiddleware(c, async () => {})
  return indexResponse || c.text('Not Found', 404)
})

This means a missing script, stylesheet, or image such as /assets/missing.js is served as the SPA HTML with 200 OK. Browsers will then fail with confusing MIME/type errors and caches may store the wrong response. The old Hapi /assets/{path*} static route would return a real 404 for missing files instead of falling through to the SPA.

Only use the SPA fallback for navigation requests or extensionless paths, and return 404 for asset-like paths:

app.get('/*', async (c) => {
  const staticResponse = await staticMiddleware(c, async () => {})
  if (staticResponse) return staticResponse

  const requestedPath = new URL(c.req.url).pathname
  const acceptsHtml = c.req.header('accept')?.includes('text/html')
  if (path.extname(requestedPath) && !acceptsHtml) {
    return c.text('Not Found', 404)
  }

  return await indexMiddleware(c, async () => {}) || c.text('Not Found', 404)
})

3. ⚪️ Bare /app and /dashboard application routes no longer match

  • Addressed
  • Dismissed

The application SPA route now registers only '/app/*' or '/dashboard/*' in src/serve/routes.ts:1054. Hono wildcard routes do not cover the bare prefix without a trailing slash, so direct links to /app or /dashboard fall through instead of serving or redirecting to the SPA. Hapi's /{path*} style route accepted the empty trailing path.

Register the bare prefix explicitly next to the wildcard route:

const spaPrefix = isCheloniaDashboard ? '/dashboard' : '/app'
app.get(spaPrefix, (c) => c.redirect(`${spaPrefix}/`))
app.get(`${spaPrefix}/*`, etag(), async function (c) {
  const file = await Deno.readFile(staticServeConfig.distIndexHtml)
  return c.body(file, 200, { 'Content-Type': 'text/html' })
})

4. ⚪️ Reinitializing the DB with the in-memory backend after a persistent backend leaves stale selectors

  • Addressed
  • Dismissed

The new explicit DB lifecycle closes a persistent backend by setting currentBackend = null in src/serve/database.ts:317-327, but initDB() only overwrites the chelonia.db/* selectors inside the persistence && persistence !== 'mem' branch in src/serve/database.ts:217-285. If a long-lived process or test suite initializes a persistent backend, calls closeDB(), and later initializes with database.backend = 'mem', the persistent selectors remain installed and call currentBackend!.readData(...) while currentBackend is null.

Either restore the default in-memory selectors on close, or overwrite selectors for the mem path as well. At minimum, reset setSelectors when closing an unlocked persistent backend so the next non-mem init cannot reuse stale closures:

if (!(import.meta as ImportMeta).lockDbSelectors) {
  setSelectors = false
  preloaded = false
  currentCache = null
}

This should be paired with an explicit mem-selector restore if switching back to mem in the same process is supported.

@corrideat
Copy link
Copy Markdown
Member Author

corrideat commented May 4, 2026

/crush_fast


AI review started.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 4, 2026

Advanced AI Review

  • Type: Agentic (crush)
  • Model: glm-5.1
Click to expand review

Let me start by examining the key files that changed to understand the full scope of this migration from Hapi to Hono.Now let me check some more files for the full review:Now let me check for a few more things that might be concerning:Let me check a few more things:Here is my complete review:


Code Review: Replace Hapi with Hono

Issue 1 — 🔴 x-real-ip header trust allows IP spoofing to bypass rate limiting

  • Addressed
  • Dismissed

src/serve/routes.ts:237-244 and src/serve/pubsub.ts:299 both trust the x-real-ip and x-forwarded-for headers unconditionally, which allows any client to spoof their IP address. This directly undermines the signup rate limiting logic at src/serve/routes.ts:375-384.

Routes:

// src/serve/routes.ts:237-244
export function getClientIP (c: Context): string {
  const headerIP = c.req.header('x-real-ip')
  if (headerIP) return headerIP
  try {
    const info = getConnInfo(c)
    return info.remote.address || 'unknown'
  } catch {
    return 'unknown'
  }
}

Pubsub:

// src/serve/pubsub.ts:299
socket.ip = (request.headers['x-real-ip'] as unknown as string) || (request.headers['x-forwarded-for'] as unknown as string)?.split(',')[0].trim() || request.socket.remoteAddress

While the POST /event route has a comment noting this (IMPORTANT: IT IS A REQUIREMENT THAT ANY PROXY SERVERS (E.G. nginx) IN FRONT OF US SET THE X-Real-IP HEADER! OTHERWISE THIS IS EASILY SPOOFED!), this was carried over from the Hapi version and remains unmitigated. There should be a configuration option to control whether to trust these headers (e.g., only when behind a known proxy or from loopback addresses), or they should be stripped on ingress.


Issue 2 — 🔴 serveStatic import from hono/deno will crash in Node.js runtime

  • Addressed
  • Dismissed

src/serve/dashboard-server.ts:4:

import { serveStatic } from 'npm:hono/deno'

This imports Hono's Deno-specific serveStatic, which uses Deno.open(), Deno.lstatSync(), and other Deno-only APIs. While this works when run via deno bundle (the current build pipeline), the main server at src/serve/server.ts already uses @hono/node-server (the Node.js adapter). If the built output is ever run with Node.js, or if the build pipeline changes, this will crash with ReferenceError: Deno is not defined.

Suggested fix:

// Use the Node.js compatible serveStatic (already a dependency)
import { serveStatic } from 'npm:@hono/node-server/serve-static'

Issue 3 — 🟡 onError handler doesn't log unexpected errors

  • Addressed
  • Dismissed

src/serve/server.ts:427-433:

currentApp.onError((err, c) => {
    const response = err instanceof HTTPException
      ? err.getResponse()
      : c.text('Internal Server Error', 500)
    response.headers.set('X-Frame-Options', 'DENY')
    return response
})

When an unexpected (non-HTTPException) error reaches this handler, it returns a generic 500 response but does not log the error. This means bugs in route handlers that throw unexpected errors will be silently swallowed — the client gets a 500 but no server-side record exists. This is a regression from the previous Hapi error handling which logged errors.

Suggested fix:

currentApp.onError((err, c) => {
    if (!(err instanceof HTTPException)) {
      console.error(err, 'Unhandled error in route handler')
    }
    const response = err instanceof HTTPException
      ? err.getResponse()
      : c.text('Internal Server Error', 500)
    response.headers.set('X-Frame-Options', 'DENY')
    return response
})

Issue 4 — 🟡 bin/chel.exe is a shared binary with Linux bin/chel

  • Addressed
  • Dismissed

The diff shows bin/chel.exe is new file mode 100644 with content placeholder, will be replaced by real binary. DO NOT DELETE THIS FILE!, and both bin/chel and bin/chel.exe have the same hash 14726f0. This means the .exe is just the Linux binary (or placeholder), which will not work on Windows. This is presumably intentional as a placeholder (the .gitignore ignores everything in bin/), but the identical hash for both files is suspicious — if someone does git lfs or similar, they'd get the same binary for both platforms.

This may be intentional, but worth confirming the post-install script handles Windows correctly.


Issue 5 — 🟡 CI workflow ubuntu-slim runner may not exist

  • Addressed
  • Dismissed

.github/workflows/pull-request-bin.yml:13:

runs-on: ubuntu-slim

The standard GitHub Actions runner label is ubuntu-latest or ubuntu-22.04. ubuntu-slim is not a standard runner and would need to be a self-hosted runner or custom label. If this is intentional (self-hosted runner), this is fine. Otherwise, the workflow will fail to find a runner.


Issue 6 — 🟡 POST /kv/:contractID/:key handler returns raw Response instead of using Hono context

  • Addressed
  • Dismissed

src/serve/routes.ts:956-963 and src/serve/routes.ts:971-977:

return new Response(existing || '', {
  status: 412,
  headers: {
    'ETag': `"${cid}"`,
    'x-cid': `"${cid}"`
  }
})

This returns a raw Response object inside a Hono handler. While Hono does support this, it bypasses Hono's response processing pipeline (including the X-Frame-Options middleware registered at server.ts:423-426). This means these 412 and 409 responses will not have the X-Frame-Options: DENY header, while all other responses do.

Suggested fix — use Hono's context methods:

return c.body(existing || '', 412, {
  'ETag': `"${cid}"`,
  'x-cid': `"${cid}"`
})

Issue 7 — 🟡 POST /event handler catches Response as a thrown value

  • Addressed
  • Dismissed

src/serve/routes.ts:986:

} catch (parseErr) {
  if (parseErr instanceof Response) throw parseErr
  throw new HTTPException(422)
}

This catch block checks if the thrown value is a Response instance. This is a code smell — it suggests that chelonia/parseEncryptedOrUnencryptedDetachedMessage can throw a Response object as an exception. This is not a standard pattern and is confusing. It also means a Response thrown from within the try block will skip the HTTPException(422) and be caught by the outer catch at line 456 instead, which would re-throw it if it's not an HTTPException, causing it to hit the onError handler as an unexpected error type.


Issue 8 — 🟡 pin.ts nconf.get('appManifest') returns empty string instead of undefined

  • Addressed
  • Dismissed

src/pin.ts:141 and src/pin.ts:199:

const configPath = nconf.get('appManifest') || join(projectRoot, 'chelonia.json')

In src/serve.ts:172, the --app-manifest option has default: '' (empty string). This means nconf.get('appManifest') returns '' when not specified, and the || fallback correctly handles it. However, this is fragile — if the default ever changes to something falsy-but-not-empty, or if someone sets it via environment variable, the behavior could be unexpected. Consider using explicit undefined or null as the default.


Issue 9 — ⚪️ safePathWithin does not resolve symlinks

  • Addressed
  • Dismissed

src/serve/routes.ts:253-258:

function safePathWithin (base: string, subpath: string): string | null {
  const normalizedBase = path.resolve(base)
  const resolved = path.resolve(normalizedBase, subpath)
  if (!resolved.startsWith(normalizedBase + path.sep) && resolved !== normalizedBase) return null
  return resolved
}

path.resolve normalizes .. segments but does not resolve symlinks. If an attacker can create a symlink inside assetsDir pointing outside of it, they could read arbitrary files. This is a low-risk issue since it requires write access to the assets directory, but using fs.realpathSync before the check would be more robust.


Issue 10 — ⚪️ Node builtins plugin filter may incorrectly match user-land modules

  • Addressed
  • Dismissed

scripts/build.ts:40-46:

{
  name: 'node-builtins',
  setup (build) {
    build.onResolve({ filter: /^[0-9a-zA-Z_/]+$/, namespace: 'file' }, ({ path }) => {
      if (nodeBuiltins.has(path)) {
        return { path: `node:${path}`, external: true }
      }
      return null
    })
  }
}

This plugin's filter /^[0-9a-zA-Z_/]+$/ matches any path consisting of alphanumeric characters, underscores, and slashes. While the actual resolution only externalizes known builtins, the filter is very broad and runs on every import. The return null is correct (tells esbuild to continue to the next handler), but it runs before the skip plugin which handles other protocol-prefixed imports. Since the npm: plugin runs first and strips the prefix, and this plugin is registered second, the ordering is fine — but the filter could be narrowed to something like /^(fs|path|http|...|zlib)$/ to be more explicit and avoid unexpected interactions.


Issue 11 — ⚪️ dashboard-server.ts doesn't return the server for cleanup

  • Addressed
  • Dismissed

src/serve/dashboard-server.ts:18startDashboard() returns ServerType, but in src/serve.ts:95, the return value is ignored:

await startDashboard()

When the main server shuts down, stopServer() in server.ts cleans up the main HTTP server but never closes the dashboard server. The dashboard server will keep the process alive until it's forcefully terminated. Consider storing the dashboard server instance and closing it during shutdown.


Review generated using glm-5.1 via Z.AI. Comment /crush_fast to re-run.

@corrideat
Copy link
Copy Markdown
Member Author

corrideat commented May 4, 2026

/crush_fast


AI review started.

Copy link
Copy Markdown
Member

@taoeffect taoeffect left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @corrideat! 👏 👏 👏

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 4, 2026

Advanced AI Review

  • Type: Agentic (crush)
  • Model: glm-5.1
Click to expand review

Let me start by examining the full changes in detail, focusing on the key server files.Now let me examine some specific areas more carefully:Now let me check the upload.ts for the typo I spotted, and verify a few more things:The typo was in the diff hunk header, not the actual file. The code is correct. Now let me verify one more thing about the node-builtins plugin ordering:Let me verify the z.stringbool() usage in routes.ts:Good, z.stringbool() is valid in Zod v4. Let me also verify one more concern about the eventsAfter endpoint's since parameter:Good. Now let me verify the eventsAfter.ts CLI - the route uses since as a height parameter but the CLI calls it height:## Code Review: Replace Hapi with Hono


🔴 Issue 1: serve() may exit before the server finishes starting

  • Addressed
  • Dismissed

src/serve.ts:131-142

The serve() function no longer has a mechanism to keep the process alive after starting the servers. Previously, await new Promise(() => {}) at the end of serve() created an intentionally unresolved promise that kept the Node.js event loop alive. Now, serve() will resolve after startServer() returns, and the finally block calls closeDB():

  } finally {
    // Safe to close because the server has opened the DB, increasing the
    // reference count.
    await closeDB()
  }

When serve() is called from main.ts, the handlerState.postHandler() promise resolves, execution continues past it, and the process exits — killing the server almost immediately after it starts. The startServer() function in src/serve/index.ts does install signal handlers that call process.exit() on signals, but those only fire on signals. Normal execution flow will end the process.

Additionally, the closeDB() in the finally block runs unconditionally, even on success. Since startServer() in src/serve/server.ts:521 also calls initDB() (incrementing dbRefs to 2), the closeDB() in serve.ts will decrement it back to 1 — so the DB stays open. But this is fragile and relies on reference-counting implementation details.

Suggested fix: Either keep the process-alive mechanism (the unresolved promise or something equivalent), or ensure the caller (main.ts) knows to keep the process alive for the serve command:

export async function serve (args: ArgumentsCamelCase<Params>) {
  await initDB()
  try {
    await startDashboard()
    if (args.dev) {
      await watch(args)
    } else {
      await deployManifests(args)
    }
    await startServer()
  } catch (error) {
    console.error(colors.red('❌ Failed to start server:'), error)
    throw error
  } finally {
    await closeDB()
  }
}

Since startServer() in src/serve/index.ts already sets up signal handlers for graceful shutdown and process.exit(), the serve function should never return during normal operation. But without the old await new Promise(() => {}), if startServer() resolves without error (which it does once the server is listening), control returns to main.ts and the process exits. Consider adding back a keep-alive:

    await startServer()
    await new Promise(() => {}) // Keep process alive until signal

🔴 Issue 2: Error in main.ts masks re-throwable errors with exit()

  • Addressed
  • Dismissed

src/main.ts:16-20

try {
  await handlerState.postHandler()
} catch (e) {
  exit(e)
}

The exit() function in src/utils.ts:43-48 calls Deno.exit(1), which terminates the process immediately. This is a behavioral change from the old code, which used a finally block to emit SERVER_EXITING for cleanup. While the old approach had its own issues (emitting on every command, not just the server), the new approach has a worse problem: commands that rely on re-throwing errors for control flow will now terminate the process.

For example, migrate.ts has a pattern like:

// src/migrate.ts:28-30
  } catch (e) {
    console.error('Error setting up database')
    throw e
  }

With the old code, the error from initDB would be caught, logged, and re-thrown, then caught by the finally block in main.ts. With the new code, the re-thrown error is caught by exit(e) which immediately terminates the process with Deno.exit(1). This means error messages from inner catch blocks that intentionally re-throw will be double-printed (once by the inner handler, once by exit), and more importantly, the error context from exit() only shows e.message, losing the stack trace for errors that aren't Error instances.


🟡 Issue 3: migrate.ts has a reference-counting race with closeDB() in the finally block

  • Addressed
  • Dismissed

src/migrate.ts:102-106

  } finally {
    await Promise.all([backendTo?.close(), closeDB()])
  }

If the migration loop throws an error (e.g., on a bad key read), backendTo will be closed in the finally block. However, backendTo may be undefined if the error occurred during its initialization (before the loop starts). The backendTo?.close() handles this with optional chaining, which is fine. But there's a subtle issue: if initDB() succeeded (setting dbRefs to 1), and then the backend init fails before the try block, the finally block calls closeDB() which properly decrements. This looks correct.

However, the old code had more granular error handling with exit() calls inside the migration loop that would terminate the process. The new code relies on throw propagating to the finally. If the migration is interrupted mid-way (e.g., backendTo writes some keys but fails on others), the target database is left in a partially migrated state. The old code had the same issue, but it called exit(1) explicitly. The new code is actually slightly worse because the error just propagates up, and main.ts calls exit(e) which calls Deno.exit(1) — same effect, but with a less clear error message.


🟡 Issue 4: zValidator query schema uses .strict() but z.stringbool() may conflict

  • Addressed
  • Dismissed

src/serve/routes.ts:470-472

    zValidator('query', z.object({
      keyOps: z.stringbool().optional()
    }).strict()),

Using .strict() on the query parameter schema means that any additional query parameters beyond keyOps will cause a validation failure. This is fine for a strict API, but combined with z.stringbool(), there's a subtle issue: z.stringbool() in Zod v4 expects string values like "true", "1", "yes", etc. However, query parameters like ?keyOps (without a value) are parsed by most frameworks as keyOps="" (empty string), which is not a valid stringbool value. This means ?keyOps without a value would fail validation, and only explicit ?keyOps=true or ?keyOps=1 would work. Make sure this is the intended behavior.


🟡 Issue 5: node-builtins esbuild plugin may intercept subpath imports incorrectly

  • Addressed
  • Dismissed

scripts/build.ts:39-46

        build.onResolve({ filter: /^[0-9a-zA-Z_/]+$/, namespace: 'file' }, ({ path }) => {
          if (nodeBuiltins.has(path)) {
            return { path: `node:${path}`, external: true }
          }
          return null
        })

nodeBuiltins is a Set<string> of builtin module names like "fs", "stream", etc. It does not contain subpath entries like "stream/promises" or "fs/promises". When the code encounters an import like from 'stream/promises', nodeBuiltins.has('stream/promises') returns false, and the plugin returns null, passing the resolution to the next handler. The skip plugin has filter /^[\w\d]+:/ which won't match stream/promises (no colon), so it falls through to esbuild's default resolution, which may try to bundle it or fail.

However, node:stream/promises would be caught by the skip plugin (due to the node: prefix). So imports already using the node: prefix work fine. Only bare subpath imports like stream/promises might be an issue. If the codebase only uses node: prefixed imports for subpaths (which is the standard in modern Node/Deno), this is fine.


🟡 Issue 6: No graceful event listener for Deno signals in migrate.ts

  • Addressed
  • Dismissed

src/migrate.ts:87-94

      const handleSignal = (signal: string, code: number) => {
        process.on(signal, () => {
          shouldExit = 128 + code
          if (++interruptCount < 3) {
            console.error(`Received signal ${signal} (${code}). Finishing current operation.`)
          } else {
            console.error(`Received signal ${signal} (${code}). Force quitting.`)
            reportStatus()
            exit(shouldExit)
          }
        })
      }

The signal handlers use process.on(signal, ...) which adds a new listener every time migrate() is called. While this is unlikely in practice (migrate runs once), the handlers are never removed. More importantly, in Deno, process.on('SIGINT', ...) may not work as expected depending on the Deno version and permissions. The --allow-sys permission is needed for signal handling. Since main.ts includes --allow-sys in its shebang, this should be fine for direct execution, but it could be an issue when running via deno task.


⚪️ Issue 7: bin/chel.exe placeholder is a text file but bin/chel is binary

  • Addressed
  • Dismissed

bin/chel.exe (new file):

+placeholder, will be replaced by real binary. DO NOT DELETE THIS FILE!

The .gitignore for bin/ has * which should ignore everything, but both bin/chel and bin/chel.exe are committed. The .gitignore rule * ignores all files, but since these files are already tracked (the diff shows they're being modified/added), .gitignore won't affect them. The pull-request-bin.yml workflow protects against changes, which is good. However, bin/chel.exe is a plain text placeholder while bin/chel is a binary — this inconsistency could confuse the post-install script.


⚪️ Issue 8: github.com/actions/checkout pinned to specific SHA but ubuntu-slim runner may not exist

  • Addressed
  • Dismissed

.github/workflows/pull-request-bin.yml:14

    runs-on: ubuntu-slim

The runner ubuntu-slim is not a standard GitHub Actions runner. Standard options are ubuntu-latest, ubuntu-22.04, ubuntu-24.04, etc. If this is a self-hosted runner or a custom label, this is fine, but if not, the workflow will fail to find a runner and hang indefinitely.


⚪️ Issue 9: Dashboard serveStatic uses Deno adapter for a Node.js server

  • Addressed
  • Dismissed

src/serve/dashboard-server.ts:5

import { serveStatic } from 'npm:hono/deno'

The dashboard server is created with @hono/node-server's createAdaptorServer, which is a Node.js HTTP server. But serveStatic is imported from npm:hono/deno, which uses Deno-specific file APIs. Since this project runs under Deno (which provides Node.js compatibility), this should work. But it would be cleaner and more correct to use serveStatic from @hono/node-server/serve-static instead.


⚪️ Issue 10: eventsAfter.ts CLI param name mismatch with route

  • Addressed
  • Dismissed

src/eventsAfter.ts:109 vs src/serve/routes.ts:464

The CLI defines the second positional parameter as height:

      .positional('height', {
        describe: 'Height',
        demandOption: true,
        type: 'number'
      })
  command: 'eventsAfter <contractID> <height>',

But the route defines it as since:

  app.get('/eventsAfter/:contractID/:since/:limit?',

The CLI function parameter is also named height in the Params type. This is fine functionally (they're just names), but could be confusing for developers reading the code.


Summary of Issues by Priority

# Severity File Summary
1 🔴 src/serve.ts Process may exit immediately after server starts — missing keep-alive
2 🔴 src/main.ts exit() in catch masks re-thrown errors and loses stack traces
3 🟡 src/migrate.ts finally cleanup race with reference-counted DB
4 🟡 src/serve/routes.ts .strict() + z.stringbool() may reject empty query param
5 🟡 scripts/build.ts node-builtins plugin doesn't handle subpath imports like fs/promises
6 🟡 src/migrate.ts Signal handlers added via process.on never removed
7 ⚪️ bin/chel.exe Placeholder text file inconsistency with binary bin/chel
8 ⚪️ .github/workflows/pull-request-bin.yml ubuntu-slim runner may not exist
9 ⚪️ src/serve/dashboard-server.ts Uses Deno serveStatic with Node.js adaptor server
10 ⚪️ src/eventsAfter.ts CLI param height vs route param since naming mismatch

Review generated using glm-5.1 via Z.AI. Comment /crush_fast to re-run.

@taoeffect taoeffect merged commit a9f2ad1 into master May 8, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move off of Hapi? (CodeQL issue + perf) 'Failed to fetch' issues when running the Cypress tests

3 participants