feat(seer): Add Night Shift nightly autofix cron scaffolding #112429

Open
trevor-e wants to merge 3 commits into master from feat/seer-night-shift-scaffolding

Member

@trevor-e trevor-e commented Apr 7, 2026

Add the scaffolding for a nightly cron job ("Night Shift") that fans out per-org to trigger Seer Autofix on top issues. This decouples automation from the hot post_process pipeline, enabling better issue selection and serving lower-volume orgs.

The scheduler iterates active orgs using batched feature flag checks (batch_has_for_organizations) to avoid N+1 lookups, applies deterministic jitter to spread load, and dispatches per-org worker tasks. The worker task is currently a stub that only logs; the issue selection and autofix triggering logic will be added in a follow-up once we've decided on the approach.
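
As a rough illustration of the deterministic jitter mentioned above, here is a minimal, self-contained sketch. It is not the code from this PR: the function name `jitter_seconds`, the choice of SHA-256, and the one-hour window are all assumptions made for the example.

```python
import hashlib

# Hypothetical window over which per-org tasks are spread (assumption, not from the PR).
JITTER_WINDOW_SECONDS = 60 * 60


def jitter_seconds(org_id: int, window: int = JITTER_WINDOW_SECONDS) -> int:
    """Map an org id to a stable delay in the range [0, window)."""
    digest = hashlib.sha256(str(org_id).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then fold it into the window.
    return int.from_bytes(digest[:8], "big") % window


if __name__ == "__main__":
    for org_id in (1, 2, 3):
        print(org_id, jitter_seconds(org_id))
```

A digest is used here rather than Python's built-in hash() on strings, which is salted per process by default, so the same org would get the same delay on every worker and every run.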

What's included:

  • schedule_night_shift() — cron scheduler task (daily at 10:00 AM UTC / ~2-3 AM PT)
  • run_night_shift_for_org() — per-org worker task (stub for now)
  • Feature flag: organizations:seer-night-shift (flagpole, disabled by default)
  • Options: seer.night_shift.enable (global killswitch), seer.night_shift.issues_per_org
  • NIGHT_SHIFT referrer and automation source enums
  • Cron schedule entry in TASKWORKER_REGION_SCHEDULES

Not included (follow-ups):

  • Issue selection logic in the worker task
  • Options-automator config (must deploy this PR first)

Refs https://www.notion.so/sentry/Seer-Night-Shift-3338b10e4b5d807e80a6fbd6d70b3f60

github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Apr 7, 2026
Contributor

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Function name contains spurious _d7 suffix
    • Renamed function from _get_eligible_org_ids_from_batch_d7 to _get_eligible_org_ids_from_batch, removing the spurious suffix.
  • ✅ Fixed: Unused module-level constant NIGHT_SHIFT_AUTO_RUN_SOURCE
    • Removed the unused NIGHT_SHIFT_AUTO_RUN_SOURCE constant and its associated SeerAutomationSource import.

Push these changes by commenting:

@cursor push 6aeb4ee3ef
Preview (6aeb4ee3ef)
diff --git a/src/sentry/tasks/seer/night_shift.py b/src/sentry/tasks/seer/night_shift.py
--- a/src/sentry/tasks/seer/night_shift.py
+++ b/src/sentry/tasks/seer/night_shift.py
@@ -8,7 +8,6 @@
 
 from sentry import features, options
 from sentry.models.organization import Organization, OrganizationStatus
-from sentry.seer.autofix.constants import SeerAutomationSource
 from sentry.tasks.base import instrumented_task
 from sentry.taskworker.namespaces import seer_tasks
 from sentry.utils.iterators import chunked
@@ -24,9 +23,7 @@
     "organizations:gen-ai-features",
 ]
 
-NIGHT_SHIFT_AUTO_RUN_SOURCE = SeerAutomationSource.NIGHT_SHIFT.value
 
-
 @instrumented_task(
     name="sentry.tasks.seer.night_shift.schedule_night_shift",
     namespace=seer_tasks,
@@ -50,7 +47,7 @@
         ),
         100,
     ):
-        eligible_ids = _get_eligible_org_ids_from_batch_d7(org_batch)
+        eligible_ids = _get_eligible_org_ids_from_batch(org_batch)
 
         org_map = {org.id: org for org in org_batch}
         for org_id in eligible_ids:
@@ -102,7 +99,7 @@
     )
 
 
-def _get_eligible_org_ids_from_batch_d7(
+def _get_eligible_org_ids_from_batch(
     orgs: Sequence[Organization],
 ) -> list[int]:
     """

Reviewed by Cursor Bugbot for commit e65687b.

Contributor

github-actions bot commented Apr 7, 2026

Backend Test Failures

Failures on 83a403a in this run:

tests/sentry/taskworker/test_config.py::test_all_instrumented_tasks_registered
[gw1] linux -- Python 3.13.1 /home/runner/work/sentry/sentry/.venv/bin/python3
tests/sentry/taskworker/test_config.py:120: in test_all_instrumented_tasks_registered
    raise AssertionError(
E   AssertionError: Found 1 module(s) with @instrumented_task that are NOT registered in TASKWORKER_IMPORTS.
E   These tasks will not be discovered by the taskworker in production!
E   
E   Missing modules:
E     - sentry.tasks.seer.night_shift
E   
E   Add these to TASKWORKER_IMPORTS in src/sentry/conf/server.py
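
The assertion message points at the fix: the new module has to be listed in TASKWORKER_IMPORTS. A sketch of that change follows, with the existing entries elided. The tuple shape and type annotation are assumptions about how the setting is declared in src/sentry/conf/server.py; only the module path comes from the log above.

```python
# src/sentry/conf/server.py (excerpt; existing entries elided)
TASKWORKER_IMPORTS: tuple[str, ...] = (
    # ... existing task modules stay as they are ...
    "sentry.tasks.seer.night_shift",  # module named in the failing test output
)
```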

trevor-e and others added 2 commits April 7, 2026 19:27
Add the scaffolding for a nightly cron job that fans out per-org to
trigger Seer Autofix on top issues. This decouples automation from the
hot post_process pipeline, enabling better issue selection and serving
lower-volume orgs.

Includes:
- Scheduler task with batched feature flag checks and jitter
- Stub per-org worker task (logic TBD)
- Feature flag (organizations:seer-night-shift) and options
- NIGHT_SHIFT referrer and automation source

Co-Authored-By: Claude <[email protected]>

Remove spurious _d7 suffix from function name and remove unused
NIGHT_SHIFT_AUTO_RUN_SOURCE constant (will be added when worker
logic is implemented).

Co-Authored-By: Claude <[email protected]>
@trevor-e trevor-e force-pushed the feat/seer-night-shift-scaffolding branch from 2447b23 to 9308041 Compare April 7, 2026 23:28
@trevor-e trevor-e marked this pull request as ready for review April 7, 2026 23:33
@trevor-e trevor-e requested review from a team as code owners April 7, 2026 23:33
Comment on lines +43 to +49
for org_batch in chunked(
    RangeQuerySetWrapper[Organization](
        Organization.objects.filter(status=OrganizationStatus.ACTIVE),
        step=1000,
    ),
    100,
):
Member

Not sure if this would be useful for you since you're scheduling all of these at the same time, but I recently added a generic CursoredScheduler that might do what you want. Instead of running all the scheduled items at once, it stripes them over a period of time. This might be helpful for you so that you're not hitting Seer with a bunch of requests all at once.

https://github.com/getsentry/sentry/blob/987d0433540d52624da4390b353961c8ac0749bb/src/sentry/integrations/source_code_management/sync_repos.py#L271-L288 is an example of it being used.

Probably it'd need a few changes to let it filter tasks before scheduling, but I'd be happy enough to include them if it's helpful for you.
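
To make the idea of striping work over a period concrete, here is a small standalone sketch. It does not use or reproduce CursoredScheduler, whose API is not shown in this thread; the function name `striped_offsets` and the even-spacing strategy are assumptions chosen for illustration.

```python
from datetime import timedelta


def striped_offsets(item_ids: list[int], period: timedelta) -> dict[int, timedelta]:
    """Assign each item an evenly spaced start offset within `period`."""
    if not item_ids:
        return {}
    step = period / len(item_ids)
    # Sort first so the assignment is stable regardless of input order.
    return {item_id: step * index for index, item_id in enumerate(sorted(item_ids))}


if __name__ == "__main__":
    # Three orgs over three hours: one starts immediately, then one per hour.
    for org_id, offset in striped_offsets([30, 10, 20], timedelta(hours=3)).items():
        print(org_id, offset)
```

Compared with hash-based jitter, even spacing guarantees no two items land on the same instant, at the cost of needing the full item list up front.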

Contributor

Something like CursoredScheduler seems compelling. If I understand CursoredScheduler correctly, it ends up exposing an API like "Run foo() once per org per 24h" (where the org and 24h parts can be configured). That's probably pretty close to the long-term future of Night Shift?

In the short term for Night Shift (and more generally for experimental things), it may be worth splitting the two parts:

  1. One-off running of a function over all orgs over some period.
  2. Scheduling when 1. happens.

This matters in particular if you a) discover a bug in the recurring task and want to restart, rerun, or stop it, or b) want to run a one-time job over all orgs. That might diverge a bit too far from the current design of CursoredScheduler though.

Contributor

@chromy chromy left a comment

lgtm % some minor questions

if batch_result is None:
    batch_result = {
        f"organization:{org.id}": features.has(feature_name, org) for org in orgs
    }
Contributor

Can this happen? I would be tempted to raise here and fix batch_has_for_organizations if this is common.

Comment on lines +120 to +122

if not eligible:
    return []
Contributor

Something like

eligible_orgs = list(orgs)
for feature_name in FEATURE_NAMES:
    batch_result = features.batch_has_for_organizations(feature_name, eligible_orgs)
    eligible_orgs = [org for org in eligible_orgs if batch_result.get(f"organization:{org.id}", False)]
return [org.id for org in eligible_orgs]

seems to involve a little less awkward conversion between Organization, id, and "organization:{id}", but this is fine also.
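
The narrowing approach suggested above can be tried in isolation with stand-ins. In this sketch `Org`, `eligible_org_ids`, and `fake_batch_check` are hypothetical names; the batch check is passed in as a plain callable rather than calling sentry's feature manager, and only the "organization:{id}" key format and the two flag names are taken from this PR.

```python
from collections.abc import Callable, Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class Org:
    """Stand-in for the Organization model; only `id` is used here."""

    id: int


# Assumed call shape: (feature_name, orgs) -> {"organization:<id>": bool}
BatchCheck = Callable[[str, Sequence[Org]], dict[str, bool]]


def eligible_org_ids(
    orgs: Sequence[Org], feature_names: Sequence[str], batch_check: BatchCheck
) -> list[int]:
    """Keep only the orgs that have every feature enabled, one batch call per feature."""
    eligible = list(orgs)
    for feature_name in feature_names:
        if not eligible:
            break  # nothing left to check, skip the remaining batch calls
        result = batch_check(feature_name, eligible)
        eligible = [org for org in eligible if result.get(f"organization:{org.id}", False)]
    return [org.id for org in eligible]


def fake_batch_check(feature_name: str, orgs: Sequence[Org]) -> dict[str, bool]:
    """Test double with a fixed set of enabled orgs per flag."""
    enabled = {
        "organizations:seer-night-shift": {1, 2},
        "organizations:gen-ai-features": {2, 3},
    }
    return {f"organization:{org.id}": org.id in enabled.get(feature_name, set()) for org in orgs}
```

Because each pass only queries the orgs that survived the previous flag, later batch calls shrink as the list narrows.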
