
Add metrics client #226

Open
ehanson8 wants to merge 4 commits into main from metrics

Conversation

ehanson8 (Contributor) commented Apr 6, 2026

Purpose and background context

This PR creates a MetricsClient class that will eventually be added to our template repos and provides an opportunity to refine our conventions for publishing AWS metrics.

How can a reviewer manually see the effects of these changes?

This is not easy to test manually, but you can review this CloudWatch query to see the metrics that have been produced by this code over the past week.

Includes new or updated dependencies?

NO

Changes expectations for external applications?

NO

What are the relevant tickets?

  • NA

Code review

  • Code review best practices are documented here and you are encouraged to have a constructive dialogue with your reviewers about their preferences and expectations.

@ehanson8 ehanson8 requested a review from a team as a code owner April 6, 2026 19:30
coveralls commented Apr 6, 2026

Coverage Report for CI Build 24909351443

Warning

No base build found for commit 3cc6e39 on main.
Coverage changes can't be calculated without a base build.
If a base build is processing, this comment will update automatically when it completes.

Coverage: 94.77%

Details

  • Patch coverage: 20 uncovered changes across 1 file (62 of 82 lines covered, 75.61%).

Uncovered Changes

File Changed Covered %
dsc/utils/aws/metrics.py 70 50 71.43%

Coverage Regressions

Requires a base build to compare against.


Coverage Stats

Coverage Status
Relevant Lines: 1128
Covered Lines: 1069
Line Coverage: 94.77%
Coverage Strength: 0.95 hits per line

💛 - Coveralls

Comment thread dsc/utils/aws/metrics.py Outdated
logger.info("No metrics to publish.")
return

# Validate all metrics before publishing
ehanson8 (Contributor, Author):

While it feels slightly odd to re-validate, since the metrics were already validated in add_metric_to_batch, there is the potential for something to go haywire between when they are added and when they are published. A cleaner solution would be much appreciated!

ghukill:

My review comment and inline comments touch on this!

Comment thread dsc/utils/aws/metrics.py Outdated

for x in range(0, len(self.batch_metrics), batch_size):
self._push_metric_data(self.batch_metrics[x : x + batch_size])
self.batch_metrics.clear()
ehanson8 (Contributor, Author):

Forgoing unit tests for now, but they will certainly be included when the MetricsClient is added to the template repos.
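The batching loop shown in this thread can be exercised without AWS at all. Below is a minimal standalone sketch (a stand-in class, not the real MetricsClient) that demonstrates the slicing behavior a future unit test could assert on; the 1000-metric chunk size reflects CloudWatch's PutMetricData per-request limit.

```python
# Stand-in illustrating the batching loop from this PR: metrics are
# published in slices of at most `batch_size`, then the batch is cleared.
class FakeMetricsClient:
    def __init__(self) -> None:
        self.batch_metrics: list[dict] = []
        self.published: list[list[dict]] = []  # records each slice "sent"

    def _publish_metrics(self, metrics: list[dict]) -> None:
        # The real client would call boto3's put_metric_data here.
        self.published.append(metrics)

    def publish_metrics_batch(self, batch_size: int = 1000) -> None:
        for x in range(0, len(self.batch_metrics), batch_size):
            self._publish_metrics(self.batch_metrics[x : x + batch_size])
        self.batch_metrics.clear()
```

With 2,500 queued metrics, publish_metrics_batch would make three calls (1000, 1000, 500) and leave batch_metrics empty, which is the shape of assertion a unit test would make.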

ghukill left a comment

Before I jump too far into a review, as I was looking at the overall structure, I had a thought/question that might have some rippling effects.

What if we had a dataclass like Metric?

@dataclass
class Metric:
    metric_name: str
    value: int | float | str # maybe others?
    unit: str
    metric_dimensions: dict[str, str] | None = None

If this was defined in the same module, that dataclass could handle validation during initialization, thereby removing the need to validate anywhere else in the class. I'm not crazy about the idea, but I think you can use @dataclass(frozen=True) so that a dataclass instance can't be modified after creation. If so, then you avoid that double-validation approach that guards against a once-valid-but-maybe-not-anymore situation.

UPDATE: leaving the above thinking about validation, but see also this comment. It feels correct to init the client with allowed metrics, therefore it doesn't make sense for this proposed Metric dataclass to validate itself.

It would also begin to chip away at this repeating block for most methods (not to mention the docstrings):

name: str,
value: int,
unit: str,
metric_dimensions: dict[str, str] | None = None,

If we had something like that, I could imagine method signatures along the lines of:

def publish_metric(self, metric: Metric): ...

def add_metric_to_batch(self, metric: Metric): ...

def publish_metrics_batch(self): ...

To actually use inline in code, could just init as part of the publish:

client.publish_metric(Metric(name="foo", value=42, unit="seconds"))

Whattya think? Even if you don't go quite this far, and only want to pass around a new Metric dataclass internally between methods, I do still think it could be a good place for validation and defining those metric properties only once.

Comment thread dsc/utils/aws/metrics.py Outdated

def __init__(self) -> None:
"""Initialize the MetricsClient."""
self.cloudwatch = boto3.client("cloudwatch")
ghukill:

Feeling like maybe we should make this a private property, like self._cloudwatch? The first thing I do with a new class/client like this is fire it up in a console and see what's available, what's surprising, etc., and this was a little surprising!

Image

ehanson8 (Contributor, Author):

Done!

Comment thread dsc/utils/aws/metrics.py Outdated
Args:
metric_name: The name of the metric to check.
"""
if metric_name not in METRICS:
ghukill:

So first of all, I really like the ability to validate the metric name itself. But the ergonomics feel just a little off to me, given that it's hardcoded to expect a METRICS variable in scope somehow.

I think this tightly couples the client to the nature of the project's config.

What if the class instantiation would support an allowed_metrics: list[str] | None = None?

In code, if None, assume the user doesn't care about metric name validation. But if set, validate against that list.

In code, I think it'd be as trivial as instantiating like this:

from dsc.config import ALLOWED_METRICS

metrics_client = MetricsClient(allowed_metrics=ALLOWED_METRICS)

ghukill:

I may find a place for it elsewhere, but I think the same would apply to NAMESPACE which has the same kind of tight coupling.

ghukill:

To be fair: this is a test implementation that will get ported to the templates... but insofar as that implementation will need to be a bit more application-structure agnostic, so too must this.

ghukill:

Lastly, see my actual PR comment review. I'm proposing the use of a Metric dataclass, but it can't really validate the metric name if we go this route where it's not a magic value in scope, but a value optionally passed during client init.

As such, this comment thread may suggest that Metric self-validation is not a great approach, and that validation should remain here in the client. I will amend that comment before sending.

ehanson8 added a commit that referenced this pull request Apr 10, 2026
* Rename Config.METRICS > ALLOWED_METRICS and change from list to set
* Add Metric dataclass and update MetricsClient class to use it
* Update calls in Workflow class to use Metric dataclass
* Update dependencies
ehanson8 (Contributor, Author):

Pushed a new commit based on @ghukill's comments and here's a new query showing the results of the updated code

ghukill left a comment

Starting with a disclaimer that, as requested, I'm going very hard into nit-picking on this PR, with the hope/expectation that we could drop this metrics.py basically as-is into the template repositories and use it elsewhere.

My inlined comments about method names and removing double validation are somewhat subjective and optional.

But there is a bigger piece around metrics namespace that I was having trouble finding a single place to inline comment.

I think that we should avoid namespace getting set by a METRICS_NAMESPACE imported into the metrics module. That tightly couples the client to the app's config exposing a METRICS_NAMESPACE constant, not to mention requiring a config file at all (yes, most apps have one, but maybe not all do or will).

Additionally, we might find that a single application wants to publish metrics to multiple namespaces.

For these reasons, I might propose an alternate pattern:

  1. Optionally init MetricsClient with a namespace that will get set for metrics during publishing. Consider supporting an env var default like AWS_METRICS_NAMESPACE.
  2. Optionally set namespace in the new Metric dataclass that will get used for that metric
  3. A namespace in a single Metric instance will override a default namespace set for the client.
  4. During metric validation, an error is thrown if neither the metric nor the client sets a namespace.

This would give applications a lot of flexibility in how they construct metrics and the metrics client itself.

For example, in this DSC app, any given file or module could do:

from dsc.config import Config
from dsc.utils.aws.metrics import MetricsClient

CONFIG = Config()

metrics_client = MetricsClient(
    namespace=CONFIG.metrics_namespace,
    allowed_metrics=CONFIG.allowed_metrics
)

Note that this assumes the Config class has properties for these values instead of free-floating constants in the config.py module. But of course, either way works.

Or maybe config.py has a metrics client factory helper, if lots of files need a metrics client:

from dsc.utils.aws.metrics import MetricsClient

def get_metrics_client() -> MetricsClient:
    return MetricsClient(
        namespace=METRICS_NAMESPACE,
        allowed_metrics=ALLOWED_METRICS
    )

Note that in this example, the module-level constants are used as they are currently set. That flexibility is kind of key to me: it works with whatever state/conventions the app uses.

Then could use anywhere in the app:

from dsc.config import get_metrics_client

metrics_client = get_metrics_client()

Lastly, an app may opt to publish to multiple namespaces. For that app, a default namespace wouldn't be set on the client itself; they would always prepare metrics with the namespace baked in:

m1 = Metric(
    namespace="timdex",
    # rest of fields...
)

m2 = Metric(
    namespace="use",
    # rest of fields...
)

metrics_client.add_metrics_to_batch([m1, m2])

Comment thread dsc/utils/aws/metrics.py Outdated
ehanson8 (Contributor, Author):

> For these reasons, I might propose an alternate pattern:
>
>   1. Optionally init MetricsClient with a namespace that will get set for metrics during publishing. Consider supporting an env var default like AWS_METRICS_NAMESPACE.
>   2. Optionally set namespace in the new Metric dataclass that will get used for that metric
>   3. A namespace in a single Metric instance will override a default namespace set for the client.
>   4. During metric validation, an error is thrown if neither the metric nor the client sets a namespace.

A great recommendation, working on this

ehanson8 added a commit that referenced this pull request Apr 15, 2026
* Add namespace param to Metric class and MetricsClient.__init__
* Rename publish_single_metric > publish_metric,  _push_metric_data > _publish_metrics, add_metric_to_batch > add_metrics_to_batch, and publish_batch_metrics > publish_metrics_batch
* Add logic to use Metric.namespace over MetricsClient.namespace
* Remove validation from publish_metrics_batch
* Update calls in Workflow class
ehanson8 (Contributor, Author) commented Apr 15, 2026

@ghukill Pushed a new commit! And the most recent Cloudwatch query

ghukill left a comment

I've requested a change, but approving as well. If you feel the change is not correct, please feel free to re-request review from me.

That little change aside, this is looking great! Thanks for all the discussion on this. Feels like something we could drop right into the templates now.

Comment thread dsc/utils/aws/metrics.py Outdated
Why these changes are being introduced:
* A metrics client class is needed to implement AWS Cloudwatch metrics

How this addresses that need:
* Add MetricsClient class with methods for publishing individual and batch metrics
* Add METRICS_NAMESPACE and METRICS constants to config.py
* Add metrics attributes to Workflow class
* Add publish_metric method calls to submit_items and finalize_items methods
* Update dependencies

Side effects of this change:
* None

Relevant ticket(s):
* NA