chore(ci): bump Unit Test (PR) to xlarge to complete FXA-13473 mitigation #20454

Closed
vpomerleau wants to merge 1 commit into main from flake-ci-resource-bump
Conversation

@vpomerleau
Contributor

@vpomerleau vpomerleau commented Apr 24, 2026

Summary

  • Adds resource_class: xlarge to the Unit Test (PR) workflow entry in .circleci/config.yml (line 927), matching the Unit Test (tagged deploy, line 1115) and Unit Test (nightly) (line 1244) variants.
  • Completes the interim mitigation tracked in FXA-13473; the xlarge bump landed for tagged-deploy and nightly but was not applied to the PR variant, so PR runs still hit the SIGKILL OOM.
  • Does not fix the root cause (the jest.resetModules() pattern in packages/fxa-auth-server/lib/metrics/context.spec.ts). FXA-13473 remains the tracking ticket for the proper refactor.
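
To make the root-cause bullet concrete, here is a minimal sketch of why resetting a module registry before every test forces the whole module to be re-evaluated each time. `ToyRegistry` is a toy stand-in, not Jest's implementation, and the `context` factory and the `retained` array are hypothetical: the PR does not say what keeps old module instances alive, so the array only models that retention.

```typescript
// Toy module registry, NOT Jest's implementation. It only illustrates why
// clearing the registry before each test re-evaluates the module every time.

type Factory = () => unknown;

class ToyRegistry {
  private cache = new Map<string, unknown>();
  private factories: Record<string, Factory>;
  evaluations = 0;

  constructor(factories: Record<string, Factory>) {
    this.factories = factories;
  }

  // Like require(): evaluate the factory once, then serve the cached export.
  load(name: string): unknown {
    if (!this.cache.has(name)) {
      this.evaluations += 1;
      this.cache.set(name, this.factories[name]());
    }
    return this.cache.get(name);
  }

  // Like jest.resetModules(): forget everything loaded so far.
  reset(): void {
    this.cache.clear();
  }
}

// Hypothetical stand-in for a module with a large dependency tree.
const factories: Record<string, Factory> = {
  context: () => ({ table: new Array(1000).fill(0) }),
};

// Pattern A: reset and reload inside every test.
const perTest = new ToyRegistry(factories);
const retained: unknown[] = []; // models old instances that are never freed
for (let test = 0; test < 5; test++) {
  perTest.reset();
  retained.push(perTest.load("context"));
}

// Pattern B: load once and share the instance across tests.
const shared = new ToyRegistry(factories);
for (let test = 0; test < 5; test++) {
  shared.load("context");
}

console.log(perTest.evaluations, shared.evaluations); // prints "5 1"
```

In this toy model, pattern A evaluates the module once per test and pattern B once in total. Whether the real spec can share a single instance depends on why it resets modules in the first place, which is what FXA-13473 would need to settle.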

Why now

The flake has hit multiple PRs this week. Example: PR #20452 (opened today, an unrelated @apollo/server removal) failed with:

FAIL lib/metrics/context.spec.ts
● Test suite failed to run
A jest worker process (pid=3055) was terminated by another process: signal=SIGKILL, exitCode=null.
Test Suites: 1 failed, 149 passed, 150 total
Tests:       3334 passed, 3334 total

Classic OOM-killer fingerprint (kernel SIGKILL, zero test failures). Isolated measurement of context.spec.ts with --coverage shows ~616 MB RSS per worker; with nx --parallel=2 × maxWorkers: 4 (up to eight concurrent workers), total memory use can easily spike past the large container's 8 GB ceiling.
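
As a rough sanity check on those numbers, the arithmetic can be written out as below. This is a back-of-envelope sketch, not a measurement: it assumes every worker peaks at the single measured figure at the same moment and treats 8 GB as 8192 MB. The variable names are illustrative.

```typescript
// Back-of-envelope memory estimate for the Unit Test (PR) job.
// All inputs come from the PR description; names are illustrative only.

const nxParallel = 2; // nx run-many --parallel=2
const maxWorkers = 4; // jest maxWorkers per package
const rssPerWorkerMb = 616; // measured RSS of context.spec.ts under --coverage
const largeCeilingMb = 8 * 1024; // `large` container, 8 GB taken as 8192 MB

// Upper bound on jest workers running at the same time.
const concurrentWorkers = nxParallel * maxWorkers;

// Assumption: every worker sits at the measured RSS simultaneously.
const workerTotalMb = concurrentWorkers * rssPerWorkerMb;
const headroomMb = largeCeilingMb - workerTotalMb;

console.log(`concurrent workers: ${concurrentWorkers}`);
console.log(`worker RSS total:   ${workerTotalMb} MB`);
console.log(`left for the rest:  ${headroomMb} MB`);
```

Under these assumptions the eight workers alone account for 4928 MB (about 4.8 GB), leaving roughly 3.2 GB for the two jest parent processes, nx itself, and any growth beyond the measured RSS. None of those were measured in this PR, so the estimate only shows how little slack the `large` class leaves.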

Test plan

  • Open PR → Unit Test (PR) job on CircleCI shows xlarge executor in its build log
  • Re-verify on a couple of subsequent PRs that the SIGKILL flake does not recur
  • Track FXA-13473 for the real fix (spec refactor to stop reloading the require graph per test)
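
For the re-verification step, a small script could flag logs that match the fingerprint described above. The sketch below is a hypothetical helper, not part of the repository. It assumes the jest summary line starts with `Tests:` and mentions `failed` only when a test failed, and it ignores ANSI colour codes or leading whitespace that real CI logs may contain. A match suggests an OOM kill but does not prove one.

```typescript
// Hypothetical helper: flags a jest log where a worker was killed by SIGKILL
// while no individual test failed.

function looksLikeOomKill(log: string): boolean {
  const workerKilled = /jest worker process .* signal=SIGKILL/.test(log);
  // "Test Suites:" does not start with "Tests:", so only the per-test
  // summary line is picked up here.
  const testsLine = log.split("\n").find((line) => line.startsWith("Tests:"));
  const noTestFailed = testsLine !== undefined && !/\bfailed\b/.test(testsLine);
  return workerKilled && noTestFailed;
}

// Sample taken from the failure quoted in this PR.
const sample = [
  "FAIL lib/metrics/context.spec.ts",
  "● Test suite failed to run",
  "A jest worker process (pid=3055) was terminated by another process: signal=SIGKILL, exitCode=null.",
  "Test Suites: 1 failed, 149 passed, 150 total",
  "Tests:       3334 passed, 3334 total",
].join("\n");

console.log(looksLikeOomKill(sample)); // prints "true"
```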

🤖 Generated with Claude Code

…tion

Because:
- The `Unit Test (PR)` job in CircleCI runs on `default: large` (8 GB
  RAM), which is not enough headroom once `nx run-many --parallel=2`
  spawns two packages' jest processes, each with `maxWorkers: 4`.
  packages/fxa-auth-server/lib/metrics/context.spec.ts uses
  `jest.resetModules()` + `require('./context')` in every test, which
  retains a large require tree (joi + fxa-shared + validators); under
  --coverage this regularly OOM-kills a jest worker with SIGKILL.
- FXA-13473 tracks the root-cause refactor (unscheduled), and its
  interim mitigation was to bump the Unit Test jobs to `xlarge`.
  That bump landed for `Unit Test` (tagged deploy, line 1115) and
  `Unit Test (nightly)` (line 1244) but was not applied to
  `Unit Test (PR)` (line 926), so PR runs still hit the OOM.

This commit:
- Adds `resource_class: xlarge` to the `Unit Test (PR)` workflow
  entry, matching the tagged-deploy and nightly variants. Uses more
  CircleCI credits per PR build but eliminates the SIGKILL flake
  until FXA-13473 lands.
Copilot AI review requested due to automatic review settings April 24, 2026 05:38
@vpomerleau vpomerleau requested a review from a team as a code owner April 24, 2026 05:38
Contributor

Copilot AI left a comment

Pull request overview

Updates the CircleCI PR test workflow to allocate more compute resources to unit tests, aligning PR runs with the tagged-deploy and nightly unit test jobs to reduce OOM-related SIGKILL failures.

Changes:

  • Set resource_class: xlarge for the Unit Test (PR) workflow job in CircleCI config.

Comment thread .circleci/config.yml
            - fail-fast
      - unit-test:
          name: Unit Test (PR)
          resource_class: xlarge
Contributor

IMO we should fix the underlying issue before bumping this up.

@vbudhram
Contributor

@vpomerleau Trade you #20465

@vpomerleau
Contributor Author

Closing in favour of #20465

@vpomerleau vpomerleau closed this Apr 27, 2026