Skip to content

Support legacy run/job index-based URLs and refactor migration 326#37008

Merged
lunny merged 23 commits intogo-gitea:mainfrom
Zettat123:fix-migration-326
Apr 3, 2026
Merged

Support legacy run/job index-based URLs and refactor migration 326#37008
lunny merged 23 commits intogo-gitea:mainfrom
Zettat123:fix-migration-326

Conversation

@Zettat123
Copy link
Copy Markdown
Contributor

@Zettat123 Zettat123 commented Mar 27, 2026

Follow up #36842

Migration 326 can be prohibitively slow on large instances because it scans and rewrites all commit status target URLs generated by Gitea Actions in the database. This PR refactors migration 326 to perform a partial update instead of rewriting every legacy target URL. The reason for this partial rewrite is that smaller legacy run/job indexes are the most likely to be ambiguous with run/job ID-based URLs during runtime resolution, so this change prioritizes that subset while avoiding the cost of rewriting all legacy records.

To preserve access to old links, this PR introduces resolveCurrentRunForView to handle both ID-based URLs and index-based URLs:

  • For job pages (/actions/runs/{run}/jobs/{job}), it first tries to confirm that the URL is ID-based. It does so by checking whether {job} can be treated as an existing job ID in the repository and whether that job belongs to {run}. If that match cannot be confirmed, it falls back to treating the URL as legacy run index + job index, resolves the corresponding run and job, and redirects to the correct ID-based URL.
    • When both ID-based and index-based interpretations are valid at the same time, the resolver prefers the ID-based interpretation by default. For example, if a repository contains one run-job pair (run_id=3, run_index=2, job_id=4), and also another run-job pair (run_id=1100, run_index=3, job_id=1200, job_index=4), then /actions/runs/3/jobs/4 is ambiguous. In that case, the resolver treats it as the ID-based URL by default and shows the page for run_id=3, job_id=4. Users can still explicitly force the legacy index-based interpretation with ?by_index=1, which would resolve the same URL to /actions/runs/1100/jobs/1200.
  • For run summary pages (/actions/runs/{run}), it uses a best-effort strategy: by default it first treats {run} as a run ID, and if no such run exists in the repository, it falls back to treating {run} as a legacy run index and redirects to the ID-based URL. Users can also explicitly force the legacy interpretation with ?by_index=1.
    • This summary-page compatibility is best-effort, not a strict ambiguity check. For example, if a repository contains two runs: runA (id=7, index=3) and runB (id=99, index=7), then /actions/runs/7 will resolve to runA by default, even though the old index-based URL originally referred to runB.

The table below shows how valid legacy index-based target URLs are handled before and after migration 326. Lower-range legacy URLs are rewritten to ID-based URLs, while higher-range legacy URLs remain unchanged in the database but are still handled correctly by resolveCurrentRunForView at runtime.

run_id run_index job_id job_index old target URL updated by migration 326 current target URL can be resolved correctly
3 2 4 1 /user2/repo2/actions/runs/2/jobs/1 true /user2/repo2/actions/runs/3/jobs/4 true
4 3 8 4 /user2/repo2/actions/runs/3/jobs/4 true /user2/repo2/actions/runs/4/jobs/8 true (without migration 326, this URL will resolve to run(id=3))
80 20 170 0 /user2/repo2/actions/runs/20/jobs/0 true /user2/repo2/actions/runs/80/jobs/170 true
1500 900 1600 0 /user2/repo2/actions/runs/900/jobs/0 false /user2/repo2/actions/runs/900/jobs/0 true
2400 1500 2600 0 /user2/repo2/actions/runs/1500/jobs/0 false /user2/repo2/actions/runs/1500/jobs/0 true
2400 1500 2601 1 /user2/repo2/actions/runs/1500/jobs/1 false /user2/repo2/actions/runs/1500/jobs/1 true

For users who already ran the old migration 326, this change has no functional impact. Their historical URLs are already stored in the ID-based form, and ID-based URLs continue to resolve correctly.

For users who have not run the old migration 326, only a subset of legacy target URLs will now be rewritten during upgrade. This avoids the extreme runtime cost of the previous full migration, while all remaining legacy target URLs continue to work through the web-layer compatibility logic.

Many thanks to @wxiaoguang for the suggestions.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 27, 2026
@github-actions github-actions bot added modifies/go Pull requests that update Go code modifies/migrations labels Mar 27, 2026
@Zettat123 Zettat123 added the topic/gitea-actions related to the actions of Gitea label Mar 27, 2026
@lunny lunny added this to the 1.26.0 milestone Mar 27, 2026
@silverwind
Copy link
Copy Markdown
Member

silverwind commented Mar 27, 2026

Would it be hard to add a "revert" migration? Migrations should be append-only, removing one in the middle defeats the migration concept. Alternatively keep it but make it a no-op.

@wxiaoguang
Copy link
Copy Markdown
Contributor

I don't think you should touch the doctor system anymore. Unless you can fully make it right and fix bugs like "gitea doctor check --all --fix deleted all my LFS files and attachments #36227"

@wxiaoguang wxiaoguang marked this pull request as draft March 27, 2026 15:18
@lunny
Copy link
Copy Markdown
Member

lunny commented Mar 27, 2026

Would it be hard to add a "revert" migration? Migrations should be append-only, removing one in the middle defeats the migration concept. Alternatively keep it but make it a no-op.

Revert migration will only make it worse it might be more slow.

@wxiaoguang
Copy link
Copy Markdown
Contributor

AI review, rubberstamp approval, bad design, then hurt end users.

@delvh
Copy link
Copy Markdown
Member

delvh commented Mar 27, 2026

One problem I see is the following:
We are effectively getting unremovable tech debt with this approach - we need to keep this logic around forever.
How bad is the problem for large instances?
A couple minutes, or maybe a couple hours?
If it is just that, I would prefer not adding this really hacky workaround to the situation.
Then these instances should declare a maintenance window during which the DB is upgraded, and afterwards the problem has been solved without limiting us for the entire future which this proposal would do.

@Zettat123 Zettat123 changed the title Support legacy run/job index-based URLs and remove migration 326 Support legacy run/job index-based URLs and make migration 326 no-op Mar 27, 2026
@Zettat123
Copy link
Copy Markdown
Contributor Author

Would it be hard to add a "revert" migration? Migrations should be append-only, removing one in the middle defeats the migration concept. Alternatively keep it but make it a no-op.

This PR actually makes migration 326 a no-op rather than removing it entirely. I apologize for the inaccuracies in the title and description. I have updated them.

@Zettat123
Copy link
Copy Markdown
Contributor Author

I don't think you should touch the doctor system anymore. Unless you can fully make it right and fix bugs like "gitea doctor check --all --fix deleted all my LFS files and attachments #36227"

You're right. I'd prefer not to introduce additional doctor checks either. I've removed it in b288cf4

@Zettat123
Copy link
Copy Markdown
Contributor Author

Zettat123 commented Mar 27, 2026

Then these instances should declare a maintenance window during which the DB is upgraded, and afterwards the problem has been solved without limiting us for the entire future which this proposal would do.

I agree. Running migration 326 to update all legacy target URLs would be the ideal solution. However, this PR introduces a compatibility mechanism instead of a mandatory update for the following reasons:

  • Unpredictable Migration Time: The duration is difficult to estimate, as it depends on the database type and the number of records. If a user starts an upgrade without knowing the required time and it exceeds their maintenance window, the process could become unmanageable.

  • Full Update is Often Unnecessary: In most cases, updating every legacy target URL isn't necessary. Most users only interact with target URLs from recent commit statuses. Updating statuses from months or years ago takes significant time, yet those fields may never be accessed.


I must admit that my design in #36842 was a mistake. We should have used a new URL format, such as /actions/run_id/{run_id}/job_id/{job_id}, instead of reusing the existing one. However, since some users may have already executed migration 326, switching to a new URL format is no longer a viable option. Moreover, if support for legacy URLs must be removed in the future, we would still need to introduce a compatibility and migration path.

@wxiaoguang
Copy link
Copy Markdown
Contributor

wxiaoguang commented Mar 27, 2026

What about "partially" migrating the jobs links?

For example, choose a magic number N = 1000

  • For the Actions jobs whose ID < 1000, find their run, and update the generated links of the commits
  • Set next id of job id to max((max(job_id), 1000)
  • When get a {job} from path parameter:
    • if {job} < 1000:
      • if the run contains job id < 1000: then it is a job id (we have migrated them)
      • otherwise, it is a job index (the legacy links)
    • if {job} >= 1000, then it must be a job id
  • The change can be done in migration 326:
    • For users who have run migration 326: the logic is still correct (they just get a full migration)
    • For users who haven't run migration 326: they get the new partially migration, everything should still work

This approach should have stable behavior, and doesn't need to keep guessing whether {job} is an index or an id.

@wxiaoguang
Copy link
Copy Markdown
Contributor

wxiaoguang commented Mar 27, 2026

Or, only migrate the first 1000 (or 10000) runs: there should be more than 1000 jobs. also make sure next max run/job id > 1000

  • when get a {job}:
    • if {job} < 1000, check its run id:
      • if it's run id < 1000: it is job id (we have migrated)
      • otherwise, it is job index (legacy)
    • if {job} >= 1000, then it is a job id.

@Zettat123
Copy link
Copy Markdown
Contributor Author

Zettat123 commented Mar 27, 2026

@wxiaoguang Thank you for your suggestion. I'd prefer the approach "migrate the first 1000 (or 10000) runs" and will implement it.

@wxiaoguang
Copy link
Copy Markdown
Contributor

Or even simpler: migrate runs with id < 1000, jobs with id < 1000, and make sure the new run/id have id > 1000

Then:

  • if {run} < 1000:
    • {run} and {job} are ids
  • if {run} >= 1000:
    • if {job} < 1000: it is index
    • if {job} >= 1000: it is id

@Zettat123 Zettat123 changed the title Support legacy run/job index-based URLs and make migration 326 no-op Support legacy run/job index-based URLs and refactor migration 326 Mar 28, 2026
@Zettat123
Copy link
Copy Markdown
Contributor Author

Zettat123 commented Mar 28, 2026

Or even simpler: migrate runs with id < 1000, jobs with id < 1000, and make sure the new run/id have id > 1000

Then:

  • if {run} < 1000:

    • {run} and {job} are ids
  • if {run} >= 1000:

    • if {job} < 1000: it is index
    • if {job} >= 1000: it is id

I didn't use this solution. This is because if we use this approach, the target URL for run(id=1100, index=5) would not be updated (since id > 1000), yet the {run} in the target URL would still need be resolved as run_index.

I actually implemented the solution from #37008 (comment), which is based on {job} and allows us to retrieve both the run and job more accurately. In the migration, I updated the data for runs id < 1000, but I did not update next_id to 1000. I think resolveCurrentRunForView can properly handle cases where the run/job ID is less than 1000.

@GiteaBot GiteaBot added lgtm/need 1 This PR needs approval from one additional maintainer to be merged. and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Mar 31, 2026
Copy link
Copy Markdown
Member

@silverwind silverwind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. getCommitStatusCommitID silently skips unhandled event typesworkflow_dispatch and schedule are common triggers, but returning "" produces no log output. Operators debugging missed rewrites have no trail. Suggestion: add a log.Debug in the default branch noting the skipped event type.

  2. legacyURLIDThreshold = 1000 lacks rationale — This constant drives the entire partial-migration tradeoff but has no comment explaining why 1000. Suggestion: add a comment documenting the reasoning (e.g. "covers the range where run index and ID are most likely to collide; remaining legacy URLs are resolved at runtime").

  3. resolveCurrentRunForView complexity — Run-only pages use ID-first fallback, job pages use legacy-preferred fallback. The asymmetry is hard to follow in one function. Suggestion: split into resolveRunSummary and resolveRunJob helpers.


This comment was written with the help of Claude Opus 4.6.

@Zettat123
Copy link
Copy Markdown
Contributor Author

  1. getCommitStatusCommitID silently skips unhandled event types

This is because runs triggered by other events do not create commit status. The logic of getCommitStatusCommitID is based on actions_service.getCommitStatusEventNameAndCommitID

func getCommitStatusEventNameAndCommitID(run *actions_model.ActionRun) (event, commitID string, _ error) {
switch run.Event {
case webhook_module.HookEventPush:
event = "push"
payload, err := run.GetPushEventPayload()
if err != nil {
return "", "", fmt.Errorf("GetPushEventPayload: %w", err)
}
if payload.HeadCommit == nil {
return "", "", errors.New("head commit is missing in event payload")
}
commitID = payload.HeadCommit.ID
case // pull_request
webhook_module.HookEventPullRequest,
webhook_module.HookEventPullRequestSync,
webhook_module.HookEventPullRequestAssign,
webhook_module.HookEventPullRequestLabel,
webhook_module.HookEventPullRequestReviewRequest,
webhook_module.HookEventPullRequestMilestone:
if run.TriggerEvent == actions_module.GithubEventPullRequestTarget {
event = "pull_request_target"
} else {
event = "pull_request"
}
payload, err := run.GetPullRequestEventPayload()
if err != nil {
return "", "", fmt.Errorf("GetPullRequestEventPayload: %w", err)
}
if payload.PullRequest == nil {
return "", "", errors.New("pull request is missing in event payload")
} else if payload.PullRequest.Head == nil {
return "", "", errors.New("head of pull request is missing in event payload")
}
commitID = payload.PullRequest.Head.Sha
case // pull_request_review events share the same PullRequestPayload as pull_request
webhook_module.HookEventPullRequestReviewApproved,
webhook_module.HookEventPullRequestReviewRejected,
webhook_module.HookEventPullRequestReviewComment:
event = run.TriggerEvent
payload, err := run.GetPullRequestEventPayload()
if err != nil {
return "", "", fmt.Errorf("GetPullRequestEventPayload: %w", err)
}
if payload.PullRequest == nil {
return "", "", errors.New("pull request is missing in event payload")
} else if payload.PullRequest.Head == nil {
return "", "", errors.New("head of pull request is missing in event payload")
}
commitID = payload.PullRequest.Head.Sha
case webhook_module.HookEventRelease:
event = string(run.Event)
commitID = run.CommitSHA
default: // do nothing, return empty
}
return event, commitID, nil
}

  1. legacyURLIDThreshold = 1000 lacks rationale

I have added more comments to it to clarify its purpose.

Only commit status target URLs whose resolved run ID is smaller than this threshold are rewritten by this partial migration.
The fixed value 1000 is a conservative cutoff chosen to cover the smaller legacy run indexes that are most likely to be confused with ID-based URLs at runtime.
Larger legacy {run} or {job} numbers are usually easier to disambiguate. For example:

  • /actions/runs/1200/jobs/1420 is most likely an ID-based URL, because a run should not contain more than 256 jobs.
  • /actions/runs/1500/jobs/3 is most likely an index-based URL, because a job ID cannot be smaller than its run ID.

But URLs with small numbers, such as /actions/runs/5/jobs/6, are much harder to distinguish reliably.
This migration therefore prioritizes rewriting target URLs for runs in that lower range.

However, this partial migration combined with the web-layer compatibility logic still cannot accurately handle all legacy target URLs. In rare cases, a legacy target URL may remain ambiguous. For example:

run_id run_index job_id job_index old target URL updated by migration 326 current target URL can be resolved correctly
3 2 4 1 /user2/repo2/actions/runs/2/jobs/1 true /user2/repo2/actions/runs/3/jobs/4 true
1100 3 1200 4 /user2/repo2/actions/runs/3/jobs/4 false /user2/repo2/actions/runs/3/jobs/4 false

In such cases, the only solution is to explicitly use index-based parsing by specifying the by_index=1 query parameter to ensure the URL resolves to the correct run.

  1. resolveCurrentRunForView complexity

I originally tried to make resolveCurrentRunForView prioritize index-based parsing if both methods are valid. However, since this scenario is rare and would cause a discrepancy between {job} and {run} parsing behaviors, I updated the logic in d2ec292. Now both {run} and {job} prioritize ID-based parsing.

@GiteaBot GiteaBot added lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. and removed lgtm/need 1 This PR needs approval from one additional maintainer to be merged. labels Apr 1, 2026
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Apr 2, 2026
@wxiaoguang wxiaoguang removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Apr 2, 2026
@wxiaoguang wxiaoguang marked this pull request as draft April 2, 2026 03:15
@Zettat123 Zettat123 marked this pull request as ready for review April 2, 2026 15:04
@lunny lunny added the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Apr 2, 2026
@lunny lunny merged commit 23c662e into go-gitea:main Apr 3, 2026
26 checks passed
@GiteaBot GiteaBot removed the reviewed/wait-merge This pull request is part of the merge queue. It will be merged soon. label Apr 3, 2026
@wxiaoguang wxiaoguang deleted the fix-migration-326 branch April 3, 2026 01:15
zjjhot added a commit to zjjhot/gitea that referenced this pull request Apr 3, 2026
* main:
  Fix RPM Registry 404 when package name contains 'package' (go-gitea#37087)
  Improve actions notifier for `workflow_run` (go-gitea#37088)
  Refactor code render and render control chars (go-gitea#37078)
  Fix various problems (go-gitea#37077)
  [skip ci] Updated translations via Crowdin
  Support legacy run/job index-based URLs and refactor migration 326 (go-gitea#37008)
  Fix a bug when forking a repository in an organization (go-gitea#36950)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm/done This PR has enough approvals to get merged. There are no important open reservations anymore. modifies/go Pull requests that update Go code modifies/migrations topic/gitea-actions related to the actions of Gitea

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants