[RPC] VM error visibility indexing by jordanjennings-mysten · Pull Request #26595 · MystenLabs/sui

jordanjennings-mysten · 2026-05-12T06:56:23Z

Description

Adds JSON-RPC index support for VM execution error visibility.

This threads richer ExecutionErrorContext through fullnode execution, extracts execution error metadata/source for failed txs, and writes it to a new execution_error_metadata index by tx digest. Validators continue using the on-chain data ExecutionFailure path and drop the extra metadata.

The new index table is prunable by transaction sequence number and has a round-trip unit test.
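
As a rough illustration of what "indexed by tx digest, prunable by transaction sequence number" can look like, here is a minimal in-memory sketch. It is not the table layout from this PR: `ErrorMetadataIndex`, `TxDigest`, `insert`, `get`, and `prune_below` are hypothetical names, and the metadata is treated as opaque bytes.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for a transaction digest.
type TxDigest = [u8; 32];

/// Toy index: lookups go by digest, pruning goes by sequence number.
#[derive(Default)]
struct ErrorMetadataIndex {
    by_digest: HashMap<TxDigest, Vec<u8>>,
    digest_by_seq: BTreeMap<u64, TxDigest>,
}

impl ErrorMetadataIndex {
    /// Store metadata for a failed tx; empty payloads create no row
    /// (assumed to match the "empty metadata is ignored" test name).
    fn insert(&mut self, seq: u64, digest: TxDigest, metadata: Vec<u8>) {
        if metadata.is_empty() {
            return;
        }
        self.by_digest.insert(digest, metadata);
        self.digest_by_seq.insert(seq, digest);
    }

    fn get(&self, digest: &TxDigest) -> Option<&[u8]> {
        self.by_digest.get(digest).map(|m| m.as_slice())
    }

    /// Drop every row whose sequence number is below `watermark`.
    /// Returns the number of rows removed.
    fn prune_below(&mut self, watermark: u64) -> usize {
        // `split_off` keeps keys < watermark in place and returns keys >= watermark.
        let kept = self.digest_by_seq.split_off(&watermark);
        let pruned = std::mem::replace(&mut self.digest_by_seq, kept);
        for digest in pruned.values() {
            self.by_digest.remove(digest);
        }
        pruned.len()
    }
}

fn main() {
    let mut index = ErrorMetadataIndex::default();
    index.insert(10, [1; 32], b"abort in module foo".to_vec());
    index.insert(20, [2; 32], b"out of gas".to_vec());
    index.insert(30, [3; 32], Vec::new()); // ignored: empty metadata
    let removed = index.prune_below(15);
    println!("pruned {} row(s)", removed);
    assert!(index.get(&[1; 32]).is_none());
    assert_eq!(index.get(&[2; 32]), Some(&b"out of gas"[..]));
}
```

Keeping a secondary map ordered by sequence number is one way to make pruning a range operation while reads stay keyed by digest.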

Requires changes to sui-apis and sui-rust-sdk:
MystenLabs/sui-rust-sdk#267
MystenLabs/sui-apis#28

Test plan

sui core

test_execution_error_metadata_round_trip
test_empty_execution_error_metadata_is_ignored
test_validator_execution_does_not_store_error_metadata
execution_error_metadata_table_accepts_future_proto_schema

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

stefan-mysten

Left a couple of comments. I am not familiar with this codebase so I hope someone else will take a look, but it was interesting to read and learn. Thanks @jordanjennings-mysten.

stefan-mysten · 2026-05-18T22:23:42Z

                &tracking_store.into_read_objects(),
            );

+        let execution_error_metadata = execution_error_opt


What's the possible size of this error metadata? If I recall correctly, this will be added onto fullnodes, I assume in a table somewhere. If it's very large, can it cause any disruption/issues during insertion/retrieval?

My napkin math says about 1 GB of storage per week, assuming 300 bytes per error and a 2% error rate on ~20 million txs a day. Depending on when the fullnode prunes, it could be a bit smaller, maybe more in the 0.33 GB range. This is based on a 5-hour sample window from a few weeks back.
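
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the stated inputs (300 bytes per error, 2% error rate, 20 million txs a day); `weekly_bytes` is a hypothetical helper, and pruning is not modeled.

```rust
/// Estimated bytes of error metadata written per week, before pruning.
fn weekly_bytes(bytes_per_error: u64, txs_per_day: u64, error_rate_percent: u64) -> u64 {
    let failed_per_day = txs_per_day * error_rate_percent / 100;
    failed_per_day * bytes_per_error * 7
}

fn main() {
    let bytes = weekly_bytes(300, 20_000_000, 2);
    // 400,000 failed txs/day * 300 bytes * 7 days = 840,000,000 bytes
    println!("{} bytes/week (~{:.2} GB)", bytes, bytes as f64 / 1e9);
}
```

With these inputs the result is about 0.84 GB per week, which is consistent with the "about 1 GB" figure.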

Thanks for sharing more concrete numbers. I was wondering if there are any limits in the DB on how much data can be inserted into a single field in a column?

I think the limits are quite large, but you would hit performance problems at some point since it's RocksDB.

stefan-mysten · 2026-05-18T22:33:28Z

+            .or_else(|| {
+                self.store
+                    .get_execution_error_metadata(digest)
+                    .expect("db error")


This .expect leads to a panic, no? Is that required? I'd expect that this should return None if it cannot get execution_error_metadata rather than panicking.

This was an existing pattern in get_unchanged_loaded_runtime_objects: if there's a TypedStoreError, it bails. Arguably that should panic, since that error type is associated with serious DB errors. I do agree it's probably something to flag, though.
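
To make the two options in this thread concrete, here is a small sketch contrasting "panic on a DB error" with "degrade to None". The `Store` and `StoreError` types are hypothetical stand-ins, not the real store or TypedStoreError; only the method name `get_execution_error_metadata` comes from the diff.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a serious storage-layer error.
#[derive(Debug)]
struct StoreError(String);

/// Hypothetical store; `broken` simulates a failing database.
struct Store {
    rows: HashMap<u64, String>,
    broken: bool,
}

impl Store {
    fn get_execution_error_metadata(&self, digest: u64) -> Result<Option<String>, StoreError> {
        if self.broken {
            return Err(StoreError("disk failure".to_string()));
        }
        Ok(self.rows.get(&digest).cloned())
    }
}

/// Pattern from the diff: a DB error is treated as fatal and panics.
fn lookup_or_panic(store: &Store, digest: u64) -> Option<String> {
    store.get_execution_error_metadata(digest).expect("db error")
}

/// Alternative raised in review: log the error and report "no metadata".
fn lookup_or_none(store: &Store, digest: u64) -> Option<String> {
    match store.get_execution_error_metadata(digest) {
        Ok(row) => row,
        Err(e) => {
            eprintln!("metadata lookup failed: {:?}", e);
            None
        }
    }
}

fn main() {
    let mut rows = HashMap::new();
    rows.insert(1, "abort 7".to_string());
    let healthy = Store { rows, broken: false };
    println!("{:?}", lookup_or_panic(&healthy, 1));

    let broken = Store { rows: HashMap::new(), broken: true };
    println!("{:?}", lookup_or_none(&broken, 1));
}
```

The trade-off is the one discussed above: returning None hides a potentially serious DB failure behind a missing-metadata response, while panicking surfaces it at the cost of taking the process down.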

amnn

This PR looks good to me, just some questions about the types.

I'm also curious what the roll out for this would be, because if you roll out just this change, and then iterate on the error messaging itself, we will end up with different fullnodes in the ecosystem producing different metadata for the same error based on the version of sui-node they are running.

The code otherwise looks good, thanks @jordanjennings-mysten

amnn · 2026-05-19T13:45:41Z

+                gas_status,
+                effects,
+                timings,
+                execution_failure.map_err(ExecutionErrorContext::from),


Would it be worthwhile to split the extra metadata into its own distinct type? I.e. have Authority::execute_transaction_to_effects return ExecutionError and then a separate, optional ExecutionErrorContext or ExecutionMetadata (whatever you want to call the extra information only)?

The benefit (for me) of using separate types, is that I would be able to clearly see which part of the richer data is dropped in the validator path, because it would be statically None here.

This also fits the onward data flow better because the execution error would need to be indexed with the rest of the base transaction effect data, while the metadata goes into its own table.

I would also recommend we try to align on the shape of the type using a protobuf message and storing the protobuf message itself in the DB so that we can iterate on the shape if need be.

@bmwill, is it possible to do that today with TideHunter/TypedStore? I was personally less worried about the evolvability because the metadata structure is already quite generic, but agree that if it's possible, then using a more generic protobuf value would be nicer for future evolvability.

I added prost and serialized with it, and it seemed to work out. I expect we will add some protos somewhere at some point, but for now I just used the prost derive.

amnn · 2026-05-19T13:55:23Z

        TransactionEffects,
        Vec<ExecutionTiming>,
-        Result<(), ExecutionError>,
+        Result<(), ExecutionErrorContext>,


Should dev_inspect_transaction also use this type?

Yup, I was going to split that out into its own PR to keep this one focused.

jordanjennings-mysten · 2026-05-20T06:18:33Z

I'm also curious what the roll out for this would be, because if you roll out just this change, and then iterate on the error messaging itself, we will end up with different fullnodes in the ecosystem producing different metadata for the same error based on the version of sui-node they are running.

This is an interesting question. I hadn't considered versioning or rollout. A protocol version certainly feels like overkill. I was interested in feature flagging in the past (it was originally for the CLI, so it's unclear whether it would work here), but there seemed to be a preference not to introduce flags. I'll think about it some more.

amnn

I don't think the traits on ExecutionErrorMetadata are having the desired effect (which also means the tests that were added are not catching the thing you want them to catch -- schema evolvability). Can you take another look?

amnn · 2026-05-26T13:30:53Z

+    pub attributes: BTreeMap<String, String>,
+}
+
+#[cfg(test)]


Having tests in the middle of the implementation is a bit unconventional, can we move these to the end and just call the module tests?

amnn · 2026-05-26T13:33:36Z

 pub(crate) type BoxError = Box<dyn std::error::Error + Send + Sync + 'static>;
-pub type ExecutionErrorMetadata = BTreeMap<String, String>;
+
+#[derive(Clone, Eq, PartialEq, JsonSchema, Serialize, Deserialize, prost::Message)]


The fact that this type implements prost::Message does not mean that it's being serialized using proto when stored in the authority stores/indices -- I would expect that it's still being BCS encoded (with all the pitfalls around evolvability that this entails) unless you are manually encoding it to proto before writing it, which I didn't see happening in the code above.

This is where my earlier question to @bmwill came from -- whether there is an established pattern for storing proto into tidehunter/typed-store.
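
The evolvability concern can be illustrated with a toy comparison between a positional encoding (fields written in order with no tags, which is how BCS lays out a struct) and a tagged encoding (each field carries a number and readers skip numbers they do not know, which is the idea behind protobuf). The byte formats below are deliberately simplified and are neither real BCS nor the real protobuf wire format; every function name is hypothetical.

```rust
/// Append a length-prefixed string (toy format: one length byte, so < 256 bytes).
fn put_str(buf: &mut Vec<u8>, s: &str) {
    buf.push(s.len() as u8);
    buf.extend_from_slice(s.as_bytes());
}

/// Read a length-prefixed string at `pos`; returns the value and the next position.
fn get_str(buf: &[u8], pos: usize) -> Option<(String, usize)> {
    let len = *buf.get(pos)? as usize;
    let end = pos + 1 + len;
    let bytes = buf.get(pos + 1..end)?;
    Some((String::from_utf8(bytes.to_vec()).ok()?, end))
}

/// "v2" writer, positional: two fields in declaration order, no tags.
fn encode_positional_v2(message: &str, module: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    put_str(&mut buf, message);
    put_str(&mut buf, module);
    buf
}

/// "v1" reader, positional: knows only `message` and rejects leftover bytes.
fn decode_positional_v1(buf: &[u8]) -> Option<String> {
    let (message, end) = get_str(buf, 0)?;
    if end == buf.len() { Some(message) } else { None }
}

/// "v2" writer, tagged: each field is prefixed with a numeric tag.
fn encode_tagged_v2(message: &str, module: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.push(1);
    put_str(&mut buf, message);
    buf.push(2);
    put_str(&mut buf, module);
    buf
}

/// "v1" reader, tagged: keeps tag 1 and skips any tag it does not know.
fn decode_tagged_v1(buf: &[u8]) -> Option<String> {
    let mut pos = 0;
    let mut message = None;
    while pos < buf.len() {
        let tag = buf[pos];
        let (value, next) = get_str(buf, pos + 1)?;
        if tag == 1 {
            message = Some(value);
        }
        pos = next;
    }
    message
}

fn main() {
    let positional = encode_positional_v2("abort code 7", "0x2::coin");
    let tagged = encode_tagged_v2("abort code 7", "0x2::coin");
    println!("positional v1 reader: {:?}", decode_positional_v1(&positional));
    println!("tagged v1 reader:     {:?}", decode_tagged_v1(&tagged));
}
```

In this sketch the old positional reader fails on data written with the extra field, while the old tagged reader still recovers the message. That difference only applies when the bytes written to the table are actually in the tagged format; deriving an encoder on the type does not change what the store writes.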

amnn · 2026-05-26T14:23:41Z

        TransactionEffects,
        Vec<ExecutionTiming>,
-        Result<(), ExecutionError>,
+        Result<(), ExecutionErrorContext>,


Now that I understand the rest of this change better, I'm pretty sure you do not want to introduce ExecutionErrorContext at this level of abstraction. It bakes in the idea that you have access to a source error on exit from the execution layer when the aim is to abstract that away.

Instead, replicate the change inside authority here -- expose the error and the metadata, and make it the execution layer's responsibility to extract the necessary metadata.

You could package the error and metadata up into a new type, like you have here, or you could introduce the metadata as a new optional field, like in authority (which has smaller knock-on consequences).
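
A minimal sketch of the "error plus optional metadata" shape suggested here, under the assumption that the execution layer extracts the metadata itself so the source error never crosses the interface. All types and the `execute` function are hypothetical simplifications, not the signatures in this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the on-chain execution error.
#[derive(Debug, Clone, PartialEq)]
struct ExecutionError {
    kind: String,
}

/// Hypothetical off-chain sidecar (a string -> string map, as in the diff above).
type ExecutionErrorMetadata = BTreeMap<String, String>;

/// Hypothetical source error, visible only inside the execution layer.
struct SourceError {
    kind: String,
    detail: String,
}

/// Execution returns the plain error and, separately, optional metadata.
/// `collect_metadata` is false on the validator path in this sketch.
fn execute(
    fail_with: Option<SourceError>,
    collect_metadata: bool,
) -> (Result<(), ExecutionError>, Option<ExecutionErrorMetadata>) {
    match fail_with {
        None => (Ok(()), None),
        Some(source) => {
            // Metadata is extracted here, so callers never see `SourceError`.
            let metadata = collect_metadata
                .then(|| BTreeMap::from([("message".to_string(), source.detail.clone())]));
            (Err(ExecutionError { kind: source.kind }), metadata)
        }
    }
}

fn main() {
    let failure = || SourceError {
        kind: "MoveAbort".to_string(),
        detail: "abort code 7 in 0x2::coin".to_string(),
    };
    let (validator_result, validator_meta) = execute(Some(failure()), false);
    let (fullnode_result, fullnode_meta) = execute(Some(failure()), true);
    assert_eq!(validator_result, fullnode_result); // same on-chain error
    println!("validator: {:?}, fullnode: {:?}", validator_meta, fullnode_meta);
}
```

Keeping the two values separate makes it visible at the call site which part is dropped on the validator path and which part goes to its own table.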

amnn · 2026-05-26T15:25:36Z

I ended up chatting with @bmwill about this, here's a summary from that discussion:

The reasons to use proto overlap with the reasons to keep this type generic (which drove the current design where we have a string -> string map). It doesn't make much sense to encode the string -> string map as proto, instead we should define a structured metadata type in proto, and store it as encoded proto to take advantage of the format's type safety and schema evolvability.
To do this properly, the metadata type needs to be defined in a .proto file, with CI support to check for schema evolution issues, and then we can generate the Rust definition from it. This .proto file would not contain a generic string-to-string map, it would contain the exact fields we want to expose (for now, let's just say a message: string)
This could be folded into sui-rpc-api, or it could be a separate thing (although then you would need to rebuild all that CI infra).
When it comes to roll-out, I would say that we should iterate on this until we are happy with the first E2E use case (a human readable message), and then we can look into backfilling by replaying transactions -- cc @tzakian, this is similar to the stuff we were talking about last Wednesday for the dual-layer execution stuff.

jordanjennings-mysten · 2026-05-26T17:50:16Z

Ignore the review request. I didn't realize you had left comments, since GitHub's notifications seem to have failed and didn't tell me you had commented. I'll take a look!

jordanjennings-mysten · 2026-05-26T20:47:52Z

we can look into backfilling by replaying transactions -- cc @tzakian, this is similar to the stuff we were talking about last Wednesday for the dual-layer execution stuff.

Can we not build up the index first and then ship the API?

jordanjennings-mysten · 2026-05-28T05:28:45Z

ExecutionErrorContext no longer leaks through the executor interface, and execution now returns ExecutionError plus optional metadata. The DB path now stores encoded sui-rpc proto bytes in a Vec<u8> instead of BCS-encoding the local metadata type, with conversions at the read/write functions.

now relies on:
MystenLabs/sui-apis#28
and
MystenLabs/sui-rust-sdk#267

amnn

Looks good to me, but still good to get a review from @bmwill as he is much more familiar with the inner workings on fullnodes.

jordanjennings-mysten mentioned this pull request May 12, 2026

[indexers-rpc] VM error visibility jsonrpc only #26499

Closed

8 tasks

Base automatically changed from vm-error-visibility-execution-and-type to main May 14, 2026 03:41

jordanjennings-mysten force-pushed the vm-error-visibility-jsonrpc-indexing branch from a8424b1 to e96bb61 Compare May 14, 2026 17:42

jordanjennings-mysten added 2 commits May 18, 2026 11:38

[jsonrpc] vm error visibility indexing

ce7ef8a

fix

c32b90f

jordanjennings-mysten requested review from a team, amnn, bmwill, emmazzz, evan-wall-mysten, nickvikeras, tpham-mysten and wlmyng May 18, 2026 19:04

jordanjennings-mysten force-pushed the vm-error-visibility-jsonrpc-indexing branch from e96bb61 to c32b90f Compare May 18, 2026 19:05

stefan-mysten reviewed May 18, 2026

jordanjennings-mysten added 2 commits May 18, 2026 15:53

replay fix

c55ac5f

test validator doesnt store, no rows created on empty metadata

ab0c0e3

amnn reviewed May 19, 2026

jordanjennings-mysten marked this pull request as ready for review May 19, 2026 17:00

jordanjennings-mysten changed the title ~~[JSONRPC] VM error visibility indexing~~ [RPC] VM error visibility indexing May 19, 2026

Track execution error metadata sidecar

075964c

jordanjennings-mysten added 2 commits May 19, 2026 14:56

Down-convert replay execution status errors

c296ef6

nit

c165a1c

jordanjennings-mysten requested review from amnn and stefan-mysten May 20, 2026 06:15

amnn reviewed May 26, 2026

jordanjennings-mysten requested a review from amnn May 26, 2026 17:47

do not leak ExecutionErrorContext, proto from sui-apis, flexible table

06842bb

amnn approved these changes May 28, 2026

jordanjennings-mysten added 6 commits May 28, 2026 10:20

move ExecutionErrorMetadata in rpc proto conversions

05c92a5

revert to execution error source()

56d1e3d

document bytes in authority store, remove less useful test

4306021

move ExecutionErrorMetadata in rpc proto to the bottom of error section

15ef267

drop context from older execution versions

0ed0ba6

move context to sui-execution

ddcdc1c

dont touch cargo.lock for now

d8d0590

jordanjennings-mysten mentioned this pull request Jun 17, 2026

[wip/client] add execution error metadata to grpc and gql responses MystenLabs/ts-sdks#1099

Open

4 tasks
