Pushwork - Claude's Notes

Always update this file as you learn new things about the codebase — patterns, pitfalls, performance considerations, architectural decisions. This is your persistent memory across sessions.

What to do after changing code

Always run npm run build (which runs tsc) after finishing changes to verify compilation.

Code style

src/core/sync-engine.ts and src/commands.ts use tabs for indentation
src/utils/network-sync.ts and src/cli.ts use 2-space indentation
Adding a new CLI command requires changes in 4 places: types (src/types/config.ts for options interface), engine (src/core/sync-engine.ts for method), commands (src/commands.ts for command function), CLI (src/cli.ts for registration + import)

What pushwork is

Pushwork is a CLI tool for bidirectional file synchronization using Automerge CRDTs. It maps a local filesystem directory to a tree of Automerge documents, syncing changes in both directions through a relay server. Multiple users can edit the same files and changes merge automatically without conflicts.

Architecture overview

CLI (cli.ts) -> Commands (commands.ts) -> SyncEngine (core/sync-engine.ts)
                                              |
                                    +---------+---------+
                                    |         |         |
                              ChangeDetector  |    MoveDetector
                                         SnapshotManager

Key files

src/cli.ts - Commander.js CLI entry point, defines all commands
src/commands.ts - Command implementations, setupCommandContext() is the shared setup
src/core/sync-engine.ts - The heart of the system. Two-phase sync: push local changes, then pull remote changes
src/core/change-detection.ts - Compares local filesystem state against snapshot to find changes
src/core/move-detection.ts - Detects file renames/moves by content similarity
src/core/snapshot.ts - Manages .pushwork/snapshot.json, tracks what's been synced
src/core/config.ts - Config loading/merging (defaults < global < local)
src/utils/text-diff.ts - spliceText() for character-level CRDT edits, updateTextContent() for handling legacy immutable strings, readDocContent() for normalizing content reads

Type definitions

src/types/documents.ts - FileDocument, DirectoryDocument, DirectoryEntry
src/types/config.ts - DirectoryConfig, GlobalConfig, all CLI option interfaces
src/types/snapshot.ts - SyncSnapshot, SnapshotFileEntry, SyncResult

How sync works

Data model

Every file becomes an Automerge document (FileDocument) with content stored as either collaborative text (for text files, supporting character-level merge) or raw bytes (for binary files). Directories become DirectoryDocuments containing a docs array of {name, type, url} entries pointing to children. The whole thing forms a tree rooted at one directory document.
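
A rough sketch of the shapes described above. Only the docs array and its {name, type, url} entries come from these notes; the other field names, the "file"/"folder" values, and the listFilePaths helper are assumptions for illustration and may not match src/types/documents.ts.

```typescript
// Hypothetical shapes; only `docs` and {name, type, url} are taken from the notes.
type UrlString = string; // stand-in for the real Automerge URL type

interface DirectoryEntry {
  name: string;
  type: "file" | "folder"; // assumed discriminator values
  url: UrlString;
}

interface DirectoryDocument {
  docs: DirectoryEntry[];
}

interface FileDocument {
  // Text files hold collaborative text; binary files hold raw bytes.
  content: string | Uint8Array;
}

// Walk an in-memory tree (keyed by URL) and list every file path under a root.
function listFilePaths(
  docsByUrl: Map<UrlString, DirectoryDocument | FileDocument>,
  rootUrl: UrlString,
  prefix = ""
): string[] {
  const node = docsByUrl.get(rootUrl);
  if (!node || !("docs" in node)) return [];
  const paths: string[] = [];
  for (const entry of node.docs) {
    const path = prefix ? `${prefix}/${entry.name}` : entry.name;
    if (entry.type === "file") paths.push(path);
    else paths.push(...listFilePaths(docsByUrl, entry.url, path));
  }
  return paths;
}
```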

Two-phase sync

Push (local -> remote): Detect local filesystem changes vs snapshot. New files get new Automerge docs. Modified files get spliced. Deleted files are removed from their parent directory document (the orphaned doc is left as-is).
Network sync (between the two phases): Wait for documents to reach the relay server, level-by-level deepest-first (children before parents).
Pull (remote -> local): Re-detect changes after network sync. Write remote-only changes to the local filesystem.

Snapshot

The snapshot (.pushwork/snapshot.json) records:

rootDirectoryUrl - the root Automerge document URL
files - map of relative path -> {url, head} for every tracked file
directories - map of relative path -> {url, head} for every tracked directory

Change detection relies on head (the Automerge document heads recorded in the snapshot): if a document's current heads differ from the snapshot heads, it has changed.
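
A minimal sketch of that comparison, assuming heads are arrays of hash strings whose order does not matter. headsEqual and hasChanged are illustrative names, not functions from the codebase.

```typescript
// Assumption: heads are an unordered set of hash strings.
type Heads = string[];

interface SnapshotEntry {
  url: string;
  head: Heads;
}

function headsEqual(a: Heads, b: Heads): boolean {
  if (a.length !== b.length) return false;
  const sortedA = [...a].sort();
  const sortedB = [...b].sort();
  return sortedA.every((hash, i) => hash === sortedB[i]);
}

// A document counts as changed when its current heads differ from the snapshot's.
function hasChanged(entry: SnapshotEntry, currentHeads: Heads): boolean {
  return !headsEqual(entry.head, currentHeads);
}
```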

Versioned URLs

Automerge URLs can include heads (e.g. automerge:docid#head1,head2). Pushwork stores versioned URLs in directory entries so clients can fetch the exact version. getPlainUrl() strips heads when you need a mutable handle; getVersionedUrl() adds current heads.
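
For illustration, stripping heads can be sketched as plain string handling on the '#' separator shown in the example URL. stripHeads and hasHeads are hypothetical stand-ins; the real getPlainUrl() may rely on automerge-repo's own URL parsing instead.

```typescript
// Hypothetical stand-in for getPlainUrl(): drop everything from '#' onward.
function stripHeads(url: string): string {
  const hashIndex = url.indexOf("#");
  return hashIndex === -1 ? url : url.slice(0, hashIndex);
}

// True when the URL carries a heads suffix.
function hasHeads(url: string): boolean {
  return url.includes("#");
}
```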

Immutable string handling

Old Automerge documents may store text content as RawString (aka ImmutableString) instead of the collaborative text CRDT. You can't splice into these. Two strategies:

updateTextContent() - Inside a change callback, detects if the field is a regular string (splice-able) or legacy immutable (assign directly to convert it).
updateRemoteFile() nuclear path - If A.isImmutableString(content) is true, throws away the old document entirely, creates a brand new one with proper mutable text, and replaces the entry in the parent directory via replaceFileInDirectory().

readDocContent() normalizes RawString to plain strings when reading.

CLI commands

pushwork init [path] - Initialize, creates root directory document
pushwork clone <url> <path> - Clone from an Automerge URL
pushwork sync [path] - Full bidirectional sync (default: force mode — uses default config, preserves snapshot for incremental change detection)
- --dry-run - Preview only
- --gentle - Use merged config instead of defaults
- --nuclear - Recreate all Automerge documents from scratch (except root)
- --force - Silently accepted for backwards compatibility (does nothing, force is now the default)
pushwork track <url> [path] - Set root directory URL without full init (creates minimal .pushwork/snapshot.json). root is a hidden alias.
pushwork commit [path] - Save to Automerge docs without network sync
pushwork status [path] - Show sync status
pushwork diff [path] - Show changes
pushwork url [path] - Print root Automerge URL
pushwork ls [path] - List tracked files
pushwork config [path] - View config
pushwork watch [path] - Watch + build + sync loop
pushwork rm [path] - Remove local .pushwork data

Config

Stored in .pushwork/config.json (local) and ~/.pushwork/config.json (global). Merged: defaults < global < local.

Key fields:

sync_enabled: boolean - Whether to do network sync
sync_server: string - WebSocket relay URL (default: wss://sync3.automerge.org)
sync_server_storage_id: StorageId - Server identity for sync verification
exclude_patterns: string[] - Gitignore-style patterns (default: .git, node_modules, *.tmp, .pushwork, .DS_Store)
sync.move_detection_threshold: number - Similarity threshold for move detection (0-1, default 0.7)
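
The defaults < global < local precedence can be sketched as a layered merge. This is an illustration only: the PushworkConfig shape is a subset of the fields listed above, the sync_enabled default is assumed, and the real loader in src/core/config.ts may merge arrays or nested objects differently.

```typescript
interface PushworkConfig {
  sync_enabled: boolean;
  sync_server: string;
  exclude_patterns: string[];
  sync: { move_detection_threshold: number };
}

type PartialConfig = Partial<Omit<PushworkConfig, "sync">> & {
  sync?: Partial<PushworkConfig["sync"]>;
};

const DEFAULTS: PushworkConfig = {
  sync_enabled: true, // assumed default
  sync_server: "wss://sync3.automerge.org",
  exclude_patterns: [".git", "node_modules", "*.tmp", ".pushwork", ".DS_Store"],
  sync: { move_detection_threshold: 0.7 },
};

// Later layers win. The nested `sync` object is merged key by key;
// arrays such as exclude_patterns are replaced wholesale (an assumption).
function mergeConfig(...layers: PartialConfig[]): PushworkConfig {
  return layers.reduce<PushworkConfig>(
    (acc, layer) => ({ ...acc, ...layer, sync: { ...acc.sync, ...layer.sync } }),
    DEFAULTS
  );
}
```

Usage: mergeConfig(globalConfig, localConfig), where each argument is whatever was parsed from the corresponding config.json (or an empty object if the file is missing).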

Network sync details

Uses waitForSync() to verify documents reach the server by comparing local and remote heads
Uses waitForBidirectionalSync() to poll until document heads stabilize (no more incoming changes)
- Accepts optional handles param to check only specific handles instead of full tree traversal (used post-push in sync())
- Timeout scales dynamically: max(timeoutMs, 5000 + docCount * 50) so large trees don't prematurely time out
- Tree traversal (collectHeadsRecursive) fetches siblings concurrently via Promise.all
Documents sync level-by-level, deepest first, so children are on the server before their parents
handlesByPath map tracks which documents changed and need syncing
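
The timeout scaling above is simple arithmetic; a sketch follows (effectiveTimeoutMs is an illustrative name, not the function in network-sync.ts).

```typescript
// max(timeoutMs, 5000 + docCount * 50): small trees keep the caller's timeout,
// large trees get 50 ms of extra budget per document on top of a 5 s floor.
function effectiveTimeoutMs(timeoutMs: number, docCount: number): number {
  return Math.max(timeoutMs, 5000 + docCount * 50);
}
```

With a 10 s base timeout, 10 documents keep the 10 s budget (5000 + 500 is smaller), while 1000 documents get 55 s.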

Leaf-first ordering

pushLocalChanges() processes directories deepest-first via batchUpdateDirectory(), propagating subdirectory URL updates as it walks up toward the root. This ensures directory entries always point to the latest version of their children.
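
A sketch of the deepest-first ordering, assuming directories are identified by "/"-separated relative paths with the root written as the empty string. depthOf and deepestFirst are hypothetical helpers, not the code in pushLocalChanges().

```typescript
// Depth of a relative directory path; the root ("") has depth 0.
function depthOf(relPath: string): number {
  return relPath === "" ? 0 : relPath.split("/").length;
}

// Order directories so children are always handled before their parents.
// Ties are broken alphabetically to keep the order deterministic.
function deepestFirst(dirPaths: string[]): string[] {
  return [...dirPaths].sort(
    (a, b) => depthOf(b) - depthOf(a) || a.localeCompare(b)
  );
}
```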

The `changeWithOptionalHeads` helper

Used throughout sync-engine: if heads are available, calls handle.changeAt(heads, cb) to branch from a known version; otherwise falls back to handle.change(cb). This is important for conflict-free merging when multiple peers are editing.
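
The helper's branching can be sketched against a minimal structural handle type. ChangeableHandle is an assumption standing in for the real DocHandle, and treating an empty heads array as "no heads" is also an assumption.

```typescript
// Minimal structural stand-in for the parts of DocHandle used here.
interface ChangeableHandle<T> {
  change(cb: (doc: T) => void): void;
  changeAt(heads: string[], cb: (doc: T) => void): unknown;
}

function changeWithOptionalHeads<T>(
  handle: ChangeableHandle<T>,
  heads: string[] | undefined,
  cb: (doc: T) => void
): void {
  if (heads && heads.length > 0) {
    // Branch from the known version so concurrent edits merge cleanly.
    handle.changeAt(heads, cb);
  } else {
    // No known version: apply the change on top of the current state.
    handle.change(cb);
  }
}
```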

Performance pitfalls

Avoid splicing large text deletions. Automerge text CRDTs track every character as an individual op. A.splice(doc, path, 0, largeString.length) to clear a large file is O(n) in CRDT ops and very slow. This is why deleteRemoteFile() no longer clears content — it just lets the document become orphaned when removed from its parent directory.
Avoid diffing artifact files. diffChars() is O(n*m) and pointless for artifact directories since they use RawString (immutable snapshots). Artifact files should always be replaced with a fresh document rather than diffed+spliced. This applies to updateRemoteFile(), applyMoveToRemote(), and change detection. ChangeDetector skips getContentAtHead() and getCurrentRemoteContent() for artifact paths — it uses a SHA-256 contentHash stored in the snapshot to detect local changes, and checks heads to detect remote changes. If neither changed, the artifact is skipped entirely. The contentHash field on SnapshotFileEntry is optional and only populated for artifact files.
Artifact directories are always nuked. batchUpdateDirectory uses a plain dirHandle.change() (not changeWithOptionalHeads) for artifact directory paths and rebuilds the entire docs array from scratch. This avoids changeAt forking from stale heads, which previously caused bugs like deleted entries resurrecting. The rebuild reads the current entries, applies all changes (deletes, updates, additions, subdir URL updates), then splices out the old array and pushes the computed entries.
Sync timeout recovery. waitForSync() returns { failed: DocHandle[] } instead of throwing. When documents fail to sync (timeout or unavailable), recreateFailedDocuments() creates new Automerge docs with the same content, updates snapshot entries and parent directory references, then retries once. If documents still fail after recreation, the failure is reported as an error (not a warning) so the sync shows as "PARTIAL" rather than "SYNCED".
Document availability during clone. repo.find() rejects with "Document X is unavailable" if the sync server doesn't have the document yet. DocHandle.doc() is synchronous and throws if the handle isn't ready. For clone scenarios, sync() retries repo.find() for the root document with exponential backoff (up to 6 attempts). ChangeDetector.findDocument() wraps repo.find() + doc() with retry logic for all document fetches during change detection.
Server load. enableRemoteHeadsGossiping is disabled — pushwork syncs directly with the server so the gossip protocol is unnecessary overhead. waitForSync processes documents in batches of 10 (SYNC_BATCH_SIZE) to avoid flooding the server with concurrent sync messages. Without batching, syncing 100+ documents simultaneously can overwhelm the sync server (which is single-threaded with no backpressure).
waitForBidirectionalSync on large trees. Full tree traversal (getAllDocumentHeads) is expensive because it calls repo.find() on every document. For post-push stabilization, pass the handles option to check only the documents that actually changed. The initial pre-pull call still needs the full scan to discover remote changes. The dynamic timeout adds the first scan's duration on top of the base timeout: the first scan only establishes a baseline, so its duration shouldn't count against the stability-wait time.
Versioned URLs and repo.find(). repo.find(versionedUrl) returns a view handle whose .heads() returns the VERSION heads, not the current document heads. Always use getPlainUrl() when you need the current/mutable state. The snapshot head update loop at the end of sync() must use getPlainUrl(snapshotEntry.url) — without this, artifact directories (which store versioned URLs) get stale heads written to the snapshot, causing changeAt() to fork from the wrong point on the next sync. This was the root cause of the artifact deletion resurrection bug: batchUpdateDirectory would changeAt from an empty directory state where the file entry didn't exist yet, so the splice found nothing to delete.
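
The batching described under "Server load" above can be sketched as follows. processInBatches and its worker callback are illustrative; only the batch size of 10 (SYNC_BATCH_SIZE) comes from these notes.

```typescript
const SYNC_BATCH_SIZE = 10;

// Run `worker` over all items, at most `batchSize` at a time.
// Each batch must fully settle before the next one starts.
async function processInBatches<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  batchSize: number = SYNC_BATCH_SIZE
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    results.push(...(await Promise.all(batch.map((item) => worker(item)))));
  }
  return results;
}
```

Waiting for a whole batch before starting the next one is the simplest way to cap concurrency; a sliding window would keep the server busier but is harder to reason about.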

Subduction sync backend (`--sub`)

The --sub flag switches from the default WebSocket sync adapter to the Subduction backend built into automerge-repo@2.6.0-subduction.9. The Repo manages a SubductionSource internally — pushwork just passes subductionWebsocketEndpoints and the Repo handles connection management, sync, and retries.

How it works

repo-factory.ts: Initializes Subduction Wasm via ESM dynamic import, then creates Repo. When sub: true, passes subductionWebsocketEndpoints: [syncServer] and periodicSyncInterval: 2000 (CLI needs fast sync, not the default 10s). When sub: false, uses the traditional WebSocket network adapter instead.
Default server: wss://subduction.sync.inkandswitch.com (vs wss://sync3.automerge.org for WebSocket)
network-sync.ts: When no StorageId is provided (Subduction mode), waitForSync falls back to head-stability polling (3 consecutive stable checks at 100ms intervals) instead of getSyncInfo-based verification
sync-engine.ts: In sub mode, skips recreateFailedDocuments — SubductionSource has its own heal-sync retry logic
Everything else (push/pull phases, artifact handling, nukeAndRebuildDocs, change detection) is identical
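
A sketch of head-stability polling under stated assumptions: "stable" is read as the heads being unchanged since the previous poll, the getHeads callback and the timeoutMs option are hypothetical, and only the 3 checks at 100 ms intervals come from these notes.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves true once the heads have been unchanged for
// `requiredStableChecks` consecutive polls, false on timeout.
async function waitForStableHeads(
  getHeads: () => string[],
  { requiredStableChecks = 3, intervalMs = 100, timeoutMs = 5000 } = {}
): Promise<boolean> {
  const key = (heads: string[]) => [...heads].sort().join(",");
  const deadline = Date.now() + timeoutMs;
  let last = key(getHeads());
  let stable = 0;
  while (Date.now() < deadline) {
    await sleep(intervalMs);
    const current = key(getHeads());
    if (current === last) {
      stable++;
      if (stable >= requiredStableChecks) return true;
    } else {
      // Heads moved: restart the count from the new value.
      stable = 0;
      last = current;
    }
  }
  return false;
}
```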

Wasm initialization

As of automerge-repo@2.6.0-subduction.9, the Repo constructor always creates a SubductionSource internally (even without Subduction endpoints), which imports MemorySigner and set_subduction_logger from @automerge/automerge-subduction/slim. The /slim entry does NOT auto-init the Wasm — so Wasm must be initialized before any new Repo() call, including the default WebSocket path.

automerge-repo exports initSubduction() which dynamically imports @automerge/automerge-subduction (the non-/slim entry that auto-inits Wasm). Pushwork calls this via repoMod.initSubduction() after loading the Repo module — no direct dependency on @automerge/automerge-subduction is needed.

repo-factory.ts uses a new Function("specifier", "return import(specifier)") wrapper to perform real ESM import() calls that Node.js evaluates as ESM. This is necessary because TypeScript with "module": "commonjs" compiles await import("x") to require("x"), which resolves CJS entries. The CJS and ESM module graphs have separate Wasm instances, so initializing via CJS require() doesn't help the ESM /slim imports inside automerge-repo. The new Function trick bypasses tsc's transformation and shares the same ESM module graph as the Repo's internal imports.

The Repo class itself is also loaded via this ESM dynamic import (cached after first call) so that new Repo() sees the initialized Wasm module.
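
A minimal sketch of the wrapper and the caching described above. esmImport, moduleCache, and loadEsmModule are illustrative names; the actual code in repo-factory.ts may be structured differently.

```typescript
// tsc does not rewrite code inside a string, so this import() survives
// "module": "commonjs" compilation and runs as a real ESM dynamic import.
const esmImport = new Function(
  "specifier",
  "return import(specifier)"
) as (specifier: string) => Promise<unknown>;

// Cache the pending promise so every caller shares one module instance.
const moduleCache = new Map<string, Promise<unknown>>();

function loadEsmModule(specifier: string): Promise<unknown> {
  let pending = moduleCache.get(specifier);
  if (!pending) {
    pending = esmImport(specifier);
    moduleCache.set(specifier, pending);
  }
  return pending;
}
```

Caching the promise rather than the resolved module means concurrent callers during startup still trigger only one import.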

Packaging notes

automerge-repo@2.6.0-subduction.9 correctly pins @automerge/automerge-subduction@0.7.0 — no pnpm override needed (unlike subduction.7 which required an override to fix a version mismatch).
RepoConfig now properly types all Subduction options (subductionWebsocketEndpoints, periodicSyncInterval, batchSyncInterval, signer, subductionPolicy, subductionAdapters) — no as any cast needed.
The automerge-repo-network-websocket adapter's NetworkAdapter types are slightly behind the repo's NetworkAdapterInterface (missing state() method in declarations). The adapter works at runtime; the type mismatch is worked around with as unknown as NetworkAdapterInterface.
New "heal-exhausted" event on Repo fires when self-healing sync gives up after all retry attempts for a document. Not currently used by pushwork but available for better error reporting.

Subduction mode persistence

--sub is only accepted on init and clone. It persists subduction: true in .pushwork/config.json. All subsequent commands (sync, watch, etc.) read it from config via config.subduction ?? false. The force-defaults path in setupCommandContext preserves subduction alongside root_directory_url.

When Subduction mode is active, commands print a banner: "Using Subduction sync backend (from config)".

Every sync run prints the root Automerge URL at the end.

Corrupt storage recovery

repo-factory.ts scans .pushwork/automerge/ for 0-byte files before creating the Repo. These indicate incomplete writes from a previous run (process exited before storage flushed). If any are found, the entire automerge cache is wiped and recreated — data will re-download from the sync server. The snapshot (.pushwork/snapshot.json) is preserved so all document URLs are retained.

This is a safety net for the Subduction HydrationError: LooseCommit too short crash. The upstream fix (Repo.shutdown() now calls flush() and SubductionSource.shutdown() awaits pending writes) prevents the corruption from happening in the first place, but edge cases (SIGKILL, OOM, power loss) can still produce 0-byte files.
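
The zero-byte scan and wipe can be sketched with Node's fs module. findZeroByteFiles and wipeIfCorrupt are hypothetical names, and the recursive walk is an assumption about how the cache directory is laid out.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Recursively collect 0-byte files under `dir` (empty list if dir is missing).
function findZeroByteFiles(dir: string): string[] {
  if (!fs.existsSync(dir)) return [];
  const hits: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) hits.push(...findZeroByteFiles(full));
    else if (entry.isFile() && fs.statSync(full).size === 0) hits.push(full);
  }
  return hits;
}

// Wipe and recreate the cache directory if any 0-byte file is found.
// Returns true when a wipe happened. The snapshot lives outside this
// directory, so it is left untouched.
function wipeIfCorrupt(cacheDir: string): boolean {
  if (findZeroByteFiles(cacheDir).length === 0) return false;
  fs.rmSync(cacheDir, { recursive: true, force: true });
  fs.mkdirSync(cacheDir, { recursive: true });
  return true;
}
```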
