
fix: Database dumps do not work on large databases (#59) #218

Open

IbrahimLaeeq wants to merge 1 commit into outerbase:main from IbrahimLaeeq:fix/bounty-issue-59

Conversation

@IbrahimLaeeq

Fixes #59.

/claim #59

Problem: the previous handler concatenated every schema line and INSERT statement into a single in-memory dumpContent string and returned it as a Blob, so the entire database had to fit in memory at once.

Both also meant nothing was sent to the client until the entire dump finished, so any dump exceeding the ~30s request window failed with no partial output.

Fix (src/export/dump.ts), sketched after the list below:

  • The response body is now a ReadableStream. Each schema line and INSERT statement is enqueued as it's generated, so memory stays flat regardless of database size and bytes start flowing to the client immediately — keeping the connection alive well past the 30s window for large dumps.
  • Table data is read in bounded pages via LIMIT 1000 OFFSET n, looping until a short page signals end-of-table, so no single large table is ever fully held in memory.
  • The initial table-list query stays outside the stream so a dead connection still produces a proper 500 response; errors during streaming go through controller.error.
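A rough sketch of the new handler shape (not the exact diff: executeOperation, dataResult, dataSource, config, DUMP_PAGE_SIZE, and the LIMIT/OFFSET query match the change, while listTables, schemaStatementFor, and insertStatementFor are placeholder helpers and the result shape is assumed):

// Hypothetical helper signatures standing in for the project's real ones.
declare function executeOperation(
    queries: { sql: string }[],
    dataSource: any,
    config: any
): Promise<{ rows: Record<string, unknown>[] }[]>
declare function listTables(dataSource: any, config: any): Promise<string[]>
declare function schemaStatementFor(table: string): string
declare function insertStatementFor(table: string, row: Record<string, unknown>): string

const DUMP_PAGE_SIZE = 1000

export async function dumpDatabase(dataSource: any, config: any): Promise<Response> {
    // The table-list query runs before the stream is created, so a dead
    // connection still produces an ordinary 500 instead of a broken stream.
    const tables = await listTables(dataSource, config)

    const encoder = new TextEncoder()
    const stream = new ReadableStream<Uint8Array>({
        async start(controller) {
            try {
                for (const table of tables) {
                    // Emit the schema statement for this table first.
                    controller.enqueue(encoder.encode(schemaStatementFor(table) + '\n'))

                    // Read table data in bounded pages; a short page signals end-of-table.
                    let offset = 0
                    while (true) {
                        const dataResult = await executeOperation(
                            [{ sql: `SELECT * FROM ${table} LIMIT ${DUMP_PAGE_SIZE} OFFSET ${offset};` }],
                            dataSource,
                            config
                        )
                        const rows = dataResult[0]?.rows ?? [] // result shape assumed here
                        for (const row of rows) {
                            controller.enqueue(encoder.encode(insertStatementFor(table, row) + '\n'))
                        }
                        if (rows.length < DUMP_PAGE_SIZE) break
                        offset += DUMP_PAGE_SIZE
                    }
                }
                controller.close()
            } catch (error: any) {
                console.error('Database Dump Error:', error)
                controller.error(error)
            }
        },
    })

    // Response headers approximated; the point is that the body is a stream.
    return new Response(stream, { headers: { 'Content-Type': 'text/plain' } })
}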

Test (src/export/dump.test.ts): added a case that mocks a full 1000-row page followed by a partial page, asserting that the dump paginates (4 executeOperation calls) and issues a LIMIT 1000 OFFSET 0 query followed by an OFFSET 1000 query. All existing tests pass unchanged, since streamed responses are still readable via response.text().
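For reference, the new case looks roughly like this (a vitest-style sketch: the vi.mock wiring, the entry-point name, and the breakdown of the four executeOperation calls are assumptions, not copied from the suite):

import { expect, it, vi } from 'vitest'

// In the real suite executeOperation would be replaced via vi.mock so the dump
// route picks it up; that wiring is omitted here, only the shape is shown.
const executeOperation = vi.fn()
declare const dataSource: any
declare const config: any
declare function dumpDatabase(dataSource: any, config: any): Promise<Response>

it('paginates table data until a short page is returned', async () => {
    const fullPage = Array.from({ length: 1000 }, (_, i) => ({ id: i }))
    const shortPage = [{ id: 1000 }]

    executeOperation
        .mockResolvedValueOnce([{ rows: [{ name: 'users' }] }]) // table list (assumed call order)
        .mockResolvedValueOnce([{ rows: [] }])                  // schema query
        .mockResolvedValueOnce([{ rows: fullPage }])            // full page -> keep paging
        .mockResolvedValueOnce([{ rows: shortPage }])           // short page -> stop

    const response = await dumpDatabase(dataSource, config)
    const text = await response.text() // streamed body is still readable as text

    expect(executeOperation).toHaveBeenCalledTimes(4)
    const sqlOf = (n: number) => executeOperation.mock.calls[n][0][0].sql as string
    expect(sqlOf(2)).toContain('LIMIT 1000 OFFSET 0')
    expect(sqlOf(3)).toContain('OFFSET 1000')
    expect(text).toContain('INSERT INTO')
})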

Note: this is the focused, dependency-free fix. The issue's fuller "proposed solution" (R2 binding, DO alarms, callback URLs) is a much larger feature requiring new infrastructure bindings and config; the streaming + pagination change is the smallest correct fix that removes the memory ceiling and the hard timeout failure.


Verified against the repository's own test suite before submission.


Copilot AI left a comment


Pull request overview

Fixes the /export/dump endpoint so large databases can be exported without exhausting memory or hitting the ~30s request window. The handler now streams output via a ReadableStream and reads each table in bounded LIMIT/OFFSET pages instead of buffering the entire database into a single string.

Changes:

  • Replace the in-memory dumpContent string + Blob response with a ReadableStream that enqueues schema and INSERT statements as they are generated.
  • Paginate per-table data reads with LIMIT 1000 OFFSET n, stopping when a short page is returned.
  • Add a test verifying that a 1000-row full page triggers a follow-up OFFSET 1000 query and that pagination uses LIMIT/OFFSET.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

  • src/export/dump.ts: Switches the dump response body to a ReadableStream and paginates per-table data via LIMIT/OFFSET.
  • src/export/dump.test.ts: Adds a pagination test asserting four executeOperation calls and correct LIMIT/OFFSET usage across pages.


Comment thread src/export/dump.ts
Comment on lines +62 to +70
const dataResult = await executeOperation(
    [
        {
            sql: `SELECT * FROM ${table} LIMIT ${DUMP_PAGE_SIZE} OFFSET ${offset};`,
        },
    ],
    dataSource,
    config
)
Comment thread src/export/dump.ts
Comment on lines +97 to +100
} catch (error: any) {
    console.error('Database Dump Error:', error)
    controller.error(error)
}