
feat(audit): forensic audit log v2 (Goggles contract) #623

Open
erskingardner wants to merge 8 commits into master from claude/trusting-pasteur-affb6c

Conversation

@erskingardner erskingardner commented Jun 25, 2026

Member

Implements the full Marmot forensic audit-log v2 contract end-to-end across darkmatter, matching the schema merged on Goggles master. Built in six reviewable phases, plus a flaky-test fix and a final schema sync.

What changed

  • Phase 1 — data-mode spine (marmot-forensics): AuditDataMode (obfuscated_sensitive_data default / full_data) stamped on every event; v2 schema; recorder_started reshaped; audit_data_mode_changed kind; ForensicRecorder::set_data_mode rotates the file on a real mode change; versioned audit-<engine_id>-v2.jsonl filenames so existing v1 files are left untouched.
  • Phase 2 — settings/storage/FFI plumbing: data_mode through StoredAuditLogSettings (additive column migration), AuditLogSettings, and AuditLogSettingsFfi/AuditDataModeFfi; recorder opens in the persisted mode; MarmotAppRuntime::set_audit_log_settings hot-swaps a live recorder with a clear boundary when the mode changes.
  • Phase 3 — transport wire evidence: transport_received kind + a reusable AuditTransportWire envelope on inbound (transport_received/context) and outbound (publish_*) rows. Inbound wire data sourced from the Nostr adapter; emitted by the engine before ingest.
  • Phase 4 — recipient expectations: recipient_expectation rows; send_outcome/create_group_outcome carry an outbound_messages inventory. Group messages/commits target all other current members; welcomes target only the added member. Full recipient pubkeys are full-data only.
  • Phase 5 — convergence traces: stable convergence_run_id, convergence_run_state lifecycle rows, and a reshaped convergence_decision carrying every candidate + score, a rule_trace recording each selector rule and the decisive one, and losing branches. The trace is a pure function of the candidate set (order-independent).
  • Phase 6 — full-data decoded content: message_content_decoded (decoded app event, author identity, NIP-94 attachments) emitted at engine ingest, strictly gated on full_data; source_context; group_state_changed value object + actor/subject pubkeys; schema allOf guards forbidding every full-data-only field in obfuscated mode.
  • Schema sync to Goggles master: adopted the merged schema verbatim (darkmatter's copy is byte-identical) and reconciled the model — wire_kind is now a string, the welcome_* wire fields split, membership_change_source added, convergence_decision.candidates required, plus several optional fields (artifact_kind, relay_url, detail, origin_commit_id, state_digest, last_input_time_ms) and pattern tightenings.
  • Flaky-test fix: de-flaked engine_ingest_buffers_future_epoch_app_message_as_convergence_witness (pre-existing on master; mixed the engine's real monotonic convergence clock with a logical now_ms).

Privacy posture

Obfuscated mode (the default) never logs plaintext, decoded content, full author/recipient/group-state values, or account pubkeys; full-data is an explicit opt-in. Neither mode ever logs bearer/upload tokens, auth headers, private keys, ciphertext, or raw MLS bytes.
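
The gating rule above can be sketched as follows. This is a minimal illustration, not the marmot-forensics implementation: `RecipientRow`, `member_ref`, and `recipient_row` are hypothetical names, and `DefaultHasher` is only a stand-in for whatever salted hash the recorder actually uses (it is not cryptographic).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// The two modes named in the PR description; obfuscated is the default.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub enum AuditDataMode {
    #[default]
    ObfuscatedSensitiveData,
    FullData,
}

/// Hypothetical row shape: the ref is always present, the pubkey only in full-data mode.
#[derive(Debug, PartialEq, Eq)]
pub struct RecipientRow {
    pub member_ref: String,
    pub pubkey_hex: Option<String>,
}

/// Stand-in for a salted hash. DefaultHasher is NOT cryptographic; illustration only.
pub fn member_ref(salt: &[u8], pubkey_hex: &str) -> String {
    let mut hasher = DefaultHasher::new();
    salt.hash(&mut hasher);
    pubkey_hex.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Builds a row whose sensitive field is populated only under the explicit opt-in.
pub fn recipient_row(mode: AuditDataMode, salt: &[u8], pubkey_hex: &str) -> RecipientRow {
    RecipientRow {
        member_ref: member_ref(salt, pubkey_hex),
        pubkey_hex: match mode {
            AuditDataMode::FullData => Some(pubkey_hex.to_owned()),
            AuditDataMode::ObfuscatedSensitiveData => None,
        },
    }
}
```

Keeping the ref identical across modes means rows from an obfuscated file and a full-data file can still be correlated, provided the salt is the same.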

Reviewer notes

  • The schema↔Rust kind catalog is kept in lockstep by a test, and a recursive conformance test proves serialized output uses only schema-allowed keys (guards additionalProperties: false), since no JSON-Schema validator crate is available offline.
  • The messageId/digestHex tightening is non-breaking: real Nostr ids, payload digests, and pubkeys are already 64-hex (only synthetic test fixtures were shorter).
  • convergenceCandidate.state_digest and per-candidate last_input_time_ms are intentionally left unset (cost / not per-candidate), noted inline.
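
The recursive key check mentioned in the first note can be sketched roughly as below. This is an assumption-laden illustration: `Json` is a tiny stand-in value type so the example needs no external crates, and a single flat `allowed` set replaces the per-definition key sets a real `additionalProperties: false` check would need.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Tiny stand-in for a JSON value so the sketch needs no external crates.
#[derive(Debug, Clone)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

/// Returns the dotted path of every object key, at any depth, that is not in `allowed`.
pub fn unknown_keys(value: &Json, allowed: &BTreeSet<&str>) -> Vec<String> {
    fn walk(value: &Json, allowed: &BTreeSet<&str>, path: &str, out: &mut Vec<String>) {
        match value {
            Json::Object(map) => {
                for (key, child) in map {
                    let child_path = if path.is_empty() {
                        key.clone()
                    } else {
                        format!("{path}.{key}")
                    };
                    if !allowed.contains(key.as_str()) {
                        out.push(child_path.clone());
                    }
                    // Recurse even under unknown keys so nested offenders are reported too.
                    walk(child, allowed, &child_path, out);
                }
            }
            Json::Array(items) => {
                // Array elements share the parent's path.
                for item in items {
                    walk(item, allowed, path, out);
                }
            }
            _ => {}
        }
    }
    let mut out = Vec::new();
    walk(value, allowed, "", &mut out);
    out
}
```

A conformance test in this style would serialize one sample per event kind and assert that the returned list is empty.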

Verification

just fast-ci green workspace-wide (incl. OTLP feature builds); fmt + clippy clean; marmot-forensics, cgka-engine, cgka-traits (snapshots unchanged), cgka-session, marmot-account, cgka-conformance-simulator, marmot-app, and marmot-uniffi audit suites all green. Rebased on latest origin/master.

🤖 Generated with Claude Code



erskingardner and others added 8 commits June 25, 2026 16:13
Introduce the V2 forensic audit contract foundation:

- AuditDataMode (obfuscated_sensitive_data default / full_data), stamped
  on every AuditEvent.
- audit-log-event.v2.schema.json scaffold (v1 kinds carried forward plus
  the data-mode additions); bump AUDIT_LOG_SCHEMA_VERSION to v2.
- recorder_started v2 shape (session id moves to the top-level field);
  new audit_data_mode_changed kind.
- ForensicRecorder::set_data_mode rotates the backing store on a real
  mode change and writes a clear mode boundary; data_mode accessor.
- Version v2 filenames (audit-<engine_id>-v2.jsonl) so existing v1 files
  are left untouched rather than appended to.
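
A minimal sketch of the filename versioning and rotate-on-change behavior listed above. `SketchRecorder` is a hypothetical in-memory stand-in (each inner `Vec` plays the role of one file), and writing the boundary row as the first row of the new file is an assumption, not something this commit message specifies.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub enum AuditDataMode {
    #[default]
    ObfuscatedSensitiveData,
    FullData,
}

impl AuditDataMode {
    pub fn as_str(self) -> &'static str {
        match self {
            AuditDataMode::ObfuscatedSensitiveData => "obfuscated_sensitive_data",
            AuditDataMode::FullData => "full_data",
        }
    }
}

/// Versioned name, so a v1 file for the same engine is never appended to.
pub fn audit_file_name(engine_id: &str) -> String {
    format!("audit-{engine_id}-v2.jsonl")
}

/// Hypothetical in-memory recorder: `files` holds one Vec of rows per rotated file.
pub struct SketchRecorder {
    mode: AuditDataMode,
    files: Vec<Vec<String>>,
}

impl SketchRecorder {
    pub fn open(mode: AuditDataMode) -> Self {
        let started = format!("recorder_started {}", mode.as_str());
        Self { mode, files: vec![vec![started]] }
    }

    /// Rotates only on a real mode change; returns whether a rotation happened.
    pub fn set_data_mode(&mut self, new_mode: AuditDataMode) -> bool {
        if new_mode == self.mode {
            return false; // unchanged mode is a no-op
        }
        let boundary = format!(
            "audit_data_mode_changed {} -> {}",
            self.mode.as_str(),
            new_mode.as_str()
        );
        self.mode = new_mode;
        self.files.push(vec![boundary]); // assumption: boundary opens the new file
        true
    }

    pub fn files(&self) -> &[Vec<String>] {
        &self.files
    }
}
```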

Tests: schema<->Rust lockstep, all-variant serde round-trip, mode
stamping (default + full-data open), rotation-with-boundary, no-op on
unchanged mode, and v1-files-left-untouched.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Wire the audit data mode end-to-end so it can be toggled and persisted:

- StoredAuditLogSettings.data_mode (TEXT) with an additive ALTER migration
  for pre-v2 databases; AuditLogSettings.data_mode (AuditDataMode) DTO and
  conversions; AuditLogSettingsFfi.data_mode + AuditDataModeFfi enum.
- open_audit_recorder opens in the persisted mode.
- MarmotAppRuntime::set_audit_log_settings hot-swaps a live recorder when
  the mode changes (engine/session set_audit_recorder_data_mode -> recorder
  set_data_mode), rotating the file with a clear audit_data_mode_changed
  boundary (requirement #4). New SetAuditDataMode worker command.
- marmot-app re-exports AuditDataMode for FFI/CLI consumers.

Tests: storage default/persist + legacy-column migration, app settings
round-trip incl. data_mode, uniffi smoke round-trip + persistence, and a
runtime integration test proving a live full_data toggle rotates the
recorder with an entirely-full_data file and a boundary row.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Capture transport-layer identifiers so an analyzer can correlate engine
activity with raw transport traffic (requirement #8):

- forensics: AuditTransportWire reusable envelope (wire id/kind/pubkey,
  transport_group_id, relay url, subscription id, nostr event id/kind/
  pubkey, gift-wrap/welcome ids, publish_result_id) + transport_received
  kind; wire attached to AuditTransportContext and to publish_attempt/
  outcome/failure. Schema grows to match; lockstep + serde tests updated.
- inbound: nostr adapter populates a generic TransportWireMetadata on
  TransportDeliverySource (h-tag transport group id read from the
  peeler-mapped envelope); session maps it to AuditTransportWire (mirroring
  wire_* into nostr_* for nostr); engine emits transport_received before
  ingest_entry, reusing the ingest payload digest.
- outbound: account runtime stamps the available wire envelope (transport
  source + transport group id) on publish rows; post-wrap relay event id is
  produced in the adapter and left for later.

Wire identifiers are transport-layer (e.g. ephemeral nostr pubkeys), never
the author's account identity, so they are safe in both data modes.
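
The session-side mapping described above (generic wire metadata mirrored into `nostr_*` fields for the Nostr transport) might look roughly like this. The struct shapes, the `transport` discriminator, and `to_audit_wire` are hypothetical; only the field names come from this commit message, and `wire_kind` is rendered as a string to match the later schema sync.

```rust
/// Hypothetical, trimmed-down view of the generic inbound metadata.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct WireMetadata {
    pub transport: String, // assumption: a plain transport name such as "nostr"
    pub wire_id: Option<String>,
    pub wire_kind: Option<u16>,
    pub wire_pubkey: Option<String>,
    pub transport_group_id: Option<String>,
}

/// Hypothetical, trimmed-down audit envelope; absent fields stay None.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct AuditWire {
    pub wire_id: Option<String>,
    pub wire_kind: Option<String>,
    pub wire_pubkey: Option<String>,
    pub transport_group_id: Option<String>,
    pub nostr_event_id: Option<String>,
    pub nostr_kind: Option<u16>,
    pub nostr_pubkey: Option<String>,
}

/// Copies the generic wire fields and mirrors them into nostr_* only for Nostr.
pub fn to_audit_wire(meta: &WireMetadata) -> AuditWire {
    let mut wire = AuditWire {
        wire_id: meta.wire_id.clone(),
        wire_kind: meta.wire_kind.map(|k| k.to_string()),
        wire_pubkey: meta.wire_pubkey.clone(),
        transport_group_id: meta.transport_group_id.clone(),
        ..AuditWire::default()
    };
    if meta.transport == "nostr" {
        wire.nostr_event_id = meta.wire_id.clone();
        wire.nostr_kind = meta.wire_kind; // the numeric kind stays on nostr_kind
        wire.nostr_pubkey = meta.wire_pubkey.clone();
    }
    wire
}
```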

Tests: engine producer test (transport_received precedes ingest_entry with
the wire fields); traits snapshots unchanged (optional fields skip when
None); all producer crates green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Record who each outbound message is expected to reach, from authenticated
membership at send time (requirement #9):

- forensics: recipient_expectation kind + RecipientExpectation /
  MessageArtifactKind / RecipientScope; reshape send_outcome and
  create_group_outcome to an outbound_messages inventory
  ({msg_id, artifact_kind, transport?, recipient_expectation?}). Schema
  grows (recipientExpectation/outboundMessage/messageArtifactKind/
  recipientScope/pubkeyHex); lockstep + serde samples updated.
- engine: recipient_expectation_records computes, per outbound message,
  the expected recipients — group messages/commits target all OTHER
  current members (roster minus self via do_members + identity.self_id);
  welcomes target only the added member (from the welcome envelope
  recipient). Emitted as recipient_expectation rows after the outcome.
  Full member pubkeys are included only in full_data mode; member refs
  (salted hashes) + counts are always emitted.
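
The scoping rule above can be sketched as a small pure function. `Artifact`, `Expectation`, and `expected_recipients` are hypothetical names for illustration; the real engine derives the roster from `do_members` and the self id from `identity.self_id`, which are replaced here by plain arguments.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecipientScope {
    AllOtherCurrentGroupMembers,
    AddedMemberOnly,
}

/// Hypothetical artifact type; a welcome carries the member it was addressed to.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Artifact {
    GroupMessage,
    Commit,
    Welcome { added_member: String },
}

#[derive(Debug, PartialEq, Eq)]
pub struct Expectation {
    pub scope: RecipientScope,
    pub recipients: BTreeSet<String>,
}

/// Computes who an outbound artifact is expected to reach, from membership at send time.
pub fn expected_recipients(
    roster: &BTreeSet<String>,
    self_id: &str,
    artifact: &Artifact,
) -> Expectation {
    match artifact {
        // Welcomes target only the member being added.
        Artifact::Welcome { added_member } => Expectation {
            scope: RecipientScope::AddedMemberOnly,
            recipients: BTreeSet::from([added_member.clone()]),
        },
        // Group messages and commits target the roster minus the sender.
        Artifact::GroupMessage | Artifact::Commit => Expectation {
            scope: RecipientScope::AllOtherCurrentGroupMembers,
            recipients: roster
                .iter()
                .filter(|member| member.as_str() != self_id)
                .cloned()
                .collect(),
        },
    }
}
```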

Tests: engine producer test proving a welcome scopes added_member_only and
an app message scopes all_other_current_group_members, with no recipient
pubkeys in the default obfuscated mode.

Note: engine_ingest_buffers_future_epoch_app_message_as_convergence_witness
is a pre-existing flake (fails ~1/3 multi-threaded on master too);
unrelated to this change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Emit a rich, deterministic trace of distributed-convergence decisions
(requirement #10):

- convergence.rs: select_canonical_branch_traced returns a BranchSelectionTrace
  (per-candidate scores + eligibility, the rule-by-rule comparison between the
  winner and runner-up marking the decisive rule, and losing branches). The
  trace is a pure function of the candidate SET — candidates, losing ids, and
  per-candidate app_witnesses are ordered by id/(epoch,sender) so convergence
  stays input-order independent.
- canonicalization.rs: CanonicalizationResult.selection_trace carries it out.
- distributed_convergence.rs: a salted, per-run convergence run_id; emits
  convergence_run_state lifecycle rows (started/waiting/blocked/unrecoverable/
  applied/stable) plus the reshaped convergence_decision, all correlated by the
  run_id on a new convergence audit context.
- forensics: AuditConvergenceContext, ConvergencePhase, ConvergenceCandidate/
  Score/AppWitness/RuleEvaluation, convergence_run_state kind, reshaped
  convergence_decision (candidates + rule_trace + losing branches). Full
  committer/witness pubkeys are full-data only; refs/digests always emitted.
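
To illustrate why sorting makes the trace a pure function of the candidate set, here is a deliberately simplified sketch. It is not `select_canonical_branch_traced`: the single numeric `score` and the smallest-id tiebreak are assumptions standing in for the real rule-by-rule selector, and candidate ids are assumed unique.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub id: String,
    pub score: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub struct SelectionTrace {
    pub candidates: Vec<Candidate>, // always sorted by id
    pub winner: Option<String>,
    pub losing_ids: Vec<String>, // always sorted by id
}

/// Produces the same trace for any permutation of `input`.
pub fn select_traced(input: &[Candidate]) -> SelectionTrace {
    // Normalize order first so nothing downstream can observe input order.
    let mut candidates = input.to_vec();
    candidates.sort_by(|a, b| a.id.cmp(&b.id));

    // Highest score wins; ties go to the smallest id (hypothetical tiebreak).
    let winner = candidates
        .iter()
        .max_by(|a, b| a.score.cmp(&b.score).then_with(|| b.id.cmp(&a.id)))
        .map(|c| c.id.clone());

    // Losers inherit the sorted order of `candidates`.
    let losing_ids: Vec<String> = candidates
        .iter()
        .filter(|c| Some(&c.id) != winner.as_ref())
        .map(|c| c.id.clone())
        .collect();

    SelectionTrace { candidates, winner, losing_ids }
}
```

Because every list in the trace is sorted before it is emitted, two peers that see the same candidates in different arrival orders serialize byte-identical traces, which is what an order-independence property test can compare.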

Tests: traced-selection unit test (decisive rule, candidates, losers,
eligibility), engine producer test (run_state + decision share a run_id), and
the order-independence canonicalization proptest (now green with the sorted
trace). Schema lockstep + serde + conformance suite all pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

`engine_ingest_buffers_future_epoch_app_message_as_convergence_witness`
failed ~1/3 of multi-threaded runs (passed single-threaded / in isolation;
reproduced on master, pre-existing).

Root cause: `carol.ingest(...)` buffers the future-epoch messages and stamps
`last_convergence_relevant_input_ms` with the engine's real monotonic clock
(`convergence_now_ms`). The test then converged with a logical `now_ms` of
2_000, and settlement requires `now_ms - last_input >= settlement_quiescence_ms`
(1_000). Under parallel load the real elapsed time from engine creation to the
buffering ingest can exceed ~1s, so `last_input > 1_000` and `2_000 - last_input`
falls below the quiescence window -> ConvergenceStatus::Resolving instead of Settled.

Fix: converge with a logical `now_ms` far past the quiescence window
(1_000_000), matching the ~20 other ingest-then-converge tests in this file, so
the settle is independent of real elapsed time. Verified with 10 multi-threaded
runs of the full file (0 failures).
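
The arithmetic behind the flake and the fix, as a worked sketch. `is_settled` is a hypothetical helper that restates the quiescence rule quoted above; the `saturating_sub` guard and the sample `last_input_ms` values (400 and 1_200) are assumptions chosen for illustration.

```rust
/// Quiescence rule from the root-cause analysis:
/// settled when `now_ms - last_input_ms >= quiescence_ms`.
pub fn is_settled(now_ms: u64, last_input_ms: u64, quiescence_ms: u64) -> bool {
    // saturating_sub avoids underflow if the real clock is ahead of the logical one.
    now_ms.saturating_sub(last_input_ms) >= quiescence_ms
}

/// Walks through the three cases: fast machine, loaded machine, and the fix.
pub fn flake_cases() -> [bool; 3] {
    let quiescence_ms = 1_000;
    [
        // Lightly loaded: ingest stamped at ~400 ms, 2_000 - 400 = 1_600 >= 1_000.
        is_settled(2_000, 400, quiescence_ms),
        // Under parallel load: stamped at ~1_200 ms, 2_000 - 1_200 = 800 < 1_000.
        is_settled(2_000, 1_200, quiescence_ms),
        // With the fix: 1_000_000 - 1_200 is far past the window.
        is_settled(1_000_000, 1_200, quiescence_ms),
    ]
}
```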

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Final audit-v2 phase: surface decrypted content under explicit full_data
opt-in (requirements #6, #7) and finish the obfuscated/full-data contract:

- forensics: message_content_decoded kind (+ MessageAuthor / DecodedPayload /
  DecodedApplicationEvent / AttachmentMetadata), source_context kind (+
  AuditSourceContext, also on the audit context), and GroupStateChanged
  reshaped to a `value` object (digest/len always; text/json/pubkeys full-data)
  plus actor/subject pubkeys. Schema grows to match and gains the obfuscated
  `allOf` guards that forbid every full-data-only field (decoded content,
  account/actor/subject/committer/witness/recipient pubkeys, cleartext
  group-state values) when audit_data_mode is obfuscated_sensitive_data.
- engine: on a successful application-message decrypt, full_data mode decodes
  the MarmotAppEvent and emits message_content_decoded (author member ref +
  full pubkey, decoded kind/content/tags/created_at, NIP-94 imeta attachments);
  obfuscated mode never decodes. GroupStateChanged carries the value object +
  actor/subject pubkeys gated on full_data.
- marmot-app: open_audit_recorder emits a source_context row (account_label
  always; account pubkey full-data only).

Tests: forensics serde+lockstep over all new kinds; engine full-data-positive
(decoded content + author pubkey present, every line full_data) and
obfuscated-negative (ingest recorded, no decoded content) producer tests.
fmt + clippy clean; engine, app, storage, uniffi audit suites green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Goggles merged the audit-log v2 PRD; adopt its schema verbatim (darkmatter's
copy is now byte-identical) and reconcile the Rust model + producers.

Schema/model deltas:
- transportWireEnvelope: wire_kind is now a string (the numeric Nostr kind
  stays on nostr_kind); welcome_event_id replaced by welcome_nostr_event_id,
  welcome_rumor_event_id, welcome_key_package_tag.
- group_state_changed: new membership_change_source (self_leave / admin_action /
  convergence / remote_commit / unknown), derived from the change + actor;
  change_kind gains topic_changed.
- convergence_decision: candidates now required (always serialized);
  convergenceCandidate gains state_digest + last_input_time_ms.
- added optional fields: artifact_kind on publish_attempt/outcome/failure,
  message_state_changed, peeler_outcome; relay_url on publish_*; detail +
  required_acks on publish_failure; origin_commit_id on epoch_confirmed;
  state_digest on snapshot_created.
- pattern tightenings (msg_id->messageId/digestHex, *_digest->digestHex,
  wire/nostr pubkeys->pubkeyHex): no behavioral change — real Nostr ids,
  payload digests, and pubkeys are already 64-hex (only synthetic test ids
  were shorter).
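
As a rough illustration of the tightened patterns, a 64-hex check could be written as below. `is_hex64` is a hypothetical helper, and restricting it to lowercase hex is an assumption; the exact regex lives in the schema and is not quoted in this commit message.

```rust
/// True when `s` is exactly 64 lowercase hexadecimal characters.
/// Assumption: the schema pattern is lowercase-only, as Nostr ids conventionally are.
pub fn is_hex64(s: &str) -> bool {
    s.len() == 64 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}
```

Short synthetic fixture ids would fail this check, which is consistent with the note that only test fixtures needed updating.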

Producers updated in cgka-engine (audit_helpers + ingest + engine), cgka-session
(wire mapping: wire_kind->string), marmot-account (publish rows: artifact_kind,
relay_url, detail, required_acks), and transport-nostr-adapter / traits
(drop the unused welcome_event_id carrier field). State-digest and per-candidate
last_input are left unset (cost / not per-candidate) and noted inline.

Tests: schema<->Rust lockstep + all-variant serde round-trip updated; new
recursive conformance test proves serialized output uses only schema-allowed
keys (guards additionalProperties:false). fast-ci, engine, account, conformance,
app, and uniffi suites all green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

coderabbitai Bot commented Jun 25, 2026


Warning

Review limit reached

@erskingardner, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 38 minutes and 36 seconds. Learn how PR review limits work.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c4b5c865-8186-4ec0-8dd2-b913156fe0cd

📥 Commits

Reviewing files that changed from the base of the PR and between 5f919b6 and aa368ae.

📒 Files selected for processing (38)
  • crates/cgka-conformance-simulator/tests/candidate_state_graph.rs
  • crates/cgka-conformance-simulator/tests/openmls_replay_probe.rs
  • crates/cgka-engine/src/audit_helpers.rs
  • crates/cgka-engine/src/canonicalization.rs
  • crates/cgka-engine/src/convergence.rs
  • crates/cgka-engine/src/distributed_convergence.rs
  • crates/cgka-engine/src/engine.rs
  • crates/cgka-engine/src/message_processor/ingest.rs
  • crates/cgka-engine/src/message_processor/mod.rs
  • crates/cgka-engine/src/openmls_projection.rs
  • crates/cgka-engine/tests/audit_log.rs
  • crates/cgka-engine/tests/distributed_convergence.rs
  • crates/cgka-engine/tests/group_creation.rs
  • crates/cgka-session/src/lib.rs
  • crates/marmot-account/src/runtime.rs
  • crates/marmot-account/tests/runtime.rs
  • crates/marmot-app/src/audit_log.rs
  • crates/marmot-app/src/client/audit.rs
  • crates/marmot-app/src/conversions.rs
  • crates/marmot-app/src/lib.rs
  • crates/marmot-app/src/runtime/account_worker.rs
  • crates/marmot-app/src/runtime/mod.rs
  • crates/marmot-app/src/tests.rs
  • crates/marmot-app/tests/audit_logs.rs
  • crates/marmot-app/tests/relay_runtime.rs
  • crates/marmot-forensics/AGENTS.md
  • crates/marmot-forensics/schema/audit-log-event.v2.schema.json
  • crates/marmot-forensics/src/audit.rs
  • crates/marmot-forensics/src/audit/tests.rs
  • crates/marmot-forensics/src/lib.rs
  • crates/marmot-uniffi/src/conversions/audit.rs
  • crates/marmot-uniffi/src/lib.rs
  • crates/marmot-uniffi/tests/smoke.rs
  • crates/storage-sqlite/src/shared.rs
  • crates/traits/src/lib.rs
  • crates/traits/src/transport_adapter.rs
  • crates/traits/tests/snapshots.rs
  • crates/transport-nostr-adapter/src/lib.rs

@stage-review

stage-review Bot commented Jun 25, 2026
