feat: add SAE last executed and last settled height metrics#5362
feat: add SAE last executed and last settled height metrics#5362JonathanOppenheimer wants to merge 17 commits into
Conversation
160920e to
a3a8190
Compare
| metricRegistry *prometheus.Registry | ||
| metrics *saemetrics.Metrics |
There was a problem hiding this comment.
This is an intentional choice to have the metricRegistry as the *prometheus.Registry used for registration and /ext/metrics (and as already used for registerer for the bloom/gossip/p2p metrics) while metrics is used to record metrics updates.
There was a problem hiding this comment.
Pull request overview
Adds SAE-specific Prometheus gauges to expose the VM’s async execution and settlement frontiers, enabling operators to compute lag vs Snowman’s last accepted height.
Changes:
- Introduces
sae_last_executed_heightandsae_last_settled_heightgauges (via a newvms/saevm/metricspackage). - Wires the new metrics into SAE VM initialization, block execution, and settlement paths.
- Adds unit/integration tests and updates Bazel deps and
RELEASES.md.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| vms/saevm/txgossip/txgossip_test.go | Updates executor construction in tests to provide SAE metrics. |
| vms/saevm/txgossip/BUILD.bazel | Adds metrics dependency for txgossip tests. |
| vms/saevm/saexec/saexec.go | Threads SAE metrics through the executor. |
| vms/saevm/saexec/saexec_test.go | Adds a test for last executed height metric (potentially timing-sensitive). |
| vms/saevm/saexec/execution.go | Updates last executed height metric after successful execution. |
| vms/saevm/saexec/BUILD.bazel | Adds metrics + Prometheus testutil deps. |
| vms/saevm/sae/vm.go | Registers an SAE-prefixed registry and instantiates SAE metrics; passes registry to other subsystems. |
| vms/saevm/sae/vm_test.go | Adds a settlement metric test. |
| vms/saevm/sae/consensus.go | Updates last settled height metric when blocks settle during acceptance. |
| vms/saevm/sae/BUILD.bazel | Adds deps for metrics + testutil. |
| vms/saevm/metrics/metrics.go | New SAE metrics collectors and update helpers. |
| vms/saevm/metrics/metrics_test.go | Unit test for the new metrics helpers. |
| vms/saevm/metrics/BUILD.bazel | New Bazel targets for metrics package + tests. |
| RELEASES.md | Documents the newly added gauges in pending release notes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…-metrics Signed-off-by: Jonathan Oppenheimer <jonathan.oppenheimer@avalabs.org>
There was a problem hiding this comment.
I don't think making a "metrics" package makes sense. This metrics struct IMO should be unexported in whatever package has access to each metric. Do you agree?
There was a problem hiding this comment.
I modeled this after avalanchego/avm/metrics. I don't think it's as simple as unexporting it in whatever package has access to each metric -- sae and saeexec -- we need host it, and then export it somewhere. This package will also grow over time as we add more metrics, all of which not be as tightly coupled to an individual package.
What do you think?
I could at least unexport the gauge fields so callers can only go through MarkBlockExecuted/MarkBlockSettled.
There was a problem hiding this comment.
We talked in the office about this.
There was a problem hiding this comment.
Does this look better?
Co-authored-by: Tsvetan Dimitrov <tsvetan.dimitrov23@gmail.com> Signed-off-by: Jonathan Oppenheimer <147infiniti@gmail.com>
Why this should be merged
This PR adds two new SAE-specific metrics,
last_executed_heightandlast_settled_height, registered under thesaenamespace. Snowman already publisheslast_accepted_height, but execution and settlement heights are SAE-only.Once these guages are added, you can compute lag as shown below
This is the first PR in a planned series of PRs to add metrics, as outlined in the shared Notion document.
How this works
I tried to follow general Prometheus best practices here, and follow the AVM metrics pattern (as suggested to me by Stephen). We add a
*saemetrics.Metricsfield on the VM, and a separate*prometheus.Registryfor registration.How this was tested
I added three tests:
TestMetrics: callsMarkBlockExecuted(6)andMarkBlockSettled(7)and asserts both gauges hold the expected heights.TestExecutionMetrics: assertsLastExecutedHeightadvances.TestSettlementMetric: assertsLastSettledHeightadvances.Need to be documented in RELEASES.md?
Yes