Make history-related blocking respect $wgCrawlerProtectedActions and add $wgCrawlerProtectionProtectRevisions #30
Agent-Logs-Url: https://github.com/mywikis/CrawlerProtection/sessions/74ca4e87-8423-400c-9cd2-dcbbcecf61c6 Co-authored-by: jeffw16 <11380894+jeffw16@users.noreply.github.com>
Pull request overview
This PR makes history-related blocking consistently respect the $wgCrawlerProtectedActions configuration by ensuring that all history-surface URLs (including type=revision, diff, and oldid) are controlled by the single 'history' token rather than partially hardcoded checks.
Changes:
- Updated `CrawlerProtectionService::checkPerformAction()` to gate the `type=revision`, `diff>0`, and `oldid>0` checks behind `isProtectedAction( 'history' )`.
- Added a unit test (with data provider) covering the now-allowed history-shaped request parameters when `'history'` is not configured.
- Updated the PHPUnit namespaced `User` stub to provide proper return types (`isRegistered(): bool`, `getName(): string`) to avoid null default mock return values.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| includes/CrawlerProtectionService.php | Gates diff/oldid/type=revision blocking behind the 'history' protected-action token. |
| tests/phpunit/unit/CrawlerProtectionServiceTest.php | Adds coverage ensuring history-related URL shapes are allowed when 'history' isn’t configured. |
| tests/phpunit/namespaced-stubs.php | Improves User stub typing to prevent null returns from PHPUnit mocks. |
@copilot, this change doesn't work even when |
It's not a dependency-injection issue; it's MediaWiki's config array merge behavior. When a config var has no
```php
// Optimistic: If the global is not set, or is an empty array, replace it entirely.
if ( !array_key_exists( $key, $GLOBALS ) || ( is_array( $GLOBALS[$key] ) && !$GLOBALS[$key] ) ) {
	$GLOBALS[$key] = $val; // your [] becomes ['history']
	continue;
}
```
So Fixed in 60f6b4f by adding
Correction: the fix is in commit |
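The merge behavior described in this reply can be sketched as a small standalone simulation. This is only an illustration under stated assumptions: `applyExtensionDefault` is a hypothetical helper, not a MediaWiki API, and it models only the two branches discussed here plus a simplified fallback merge.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical helper (not part of MediaWiki): applies one extension.json
 * default to a globals array, modelling only the branches discussed above.
 * Assumes the operator's value, when present, is an array.
 */
function applyExtensionDefault( array $globals, string $key, array $default, string $strategy ): array {
	if ( $strategy === 'provide_default' ) {
		// The default is used only when the operator set nothing at all.
		if ( !array_key_exists( $key, $globals ) ) {
			$globals[$key] = $default;
		}
		return $globals;
	}

	// Default strategy: an unset global OR an empty array is replaced entirely.
	if ( !array_key_exists( $key, $globals ) || ( is_array( $globals[$key] ) && !$globals[$key] ) ) {
		$globals[$key] = $default;
		return $globals;
	}

	// Otherwise the default and the operator's value are combined (simplified).
	$globals[$key] = array_merge( $default, $globals[$key] );
	return $globals;
}

// Operator wrote `$wgCrawlerProtectedActions = [];` in LocalSettings.php.
$operator = [ 'wgCrawlerProtectedActions' => [] ];
$default = [ 'history' ];

$merged = applyExtensionDefault( $operator, 'wgCrawlerProtectedActions', $default, 'array_merge' );
$provided = applyExtensionDefault( $operator, 'wgCrawlerProtectedActions', $default, 'provide_default' );

var_dump( $merged['wgCrawlerProtectedActions'] );   // the operator's [] was overwritten
var_dump( $provided['wgCrawlerProtectedActions'] ); // the operator's [] survives
```

Under the default strategy the empty array cannot be told apart from "unset", which is why the explicit `[]` is lost; `provide_default` checks only whether the key exists.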
@copilot All of the pipeline's checks are failing for MediaWiki 1.39 |
Fixed in be8afbe. The 1.39 failures weren't from the PR code — every 1.39 job (style, static analysis, unit tests) died in the MediaWiki REL1_39 pins The workflow tried to handle this with "config": { "audit": { "abandoned": "report" }, "policy": { "advisories": { "block": false } } }
@copilot These checks are still failing on MediaWiki 1.39 |
The 1.39 jobs were still dying at Fixed by setting the policy directly on the root composer.json before resolving: Applied to all three Setup Composer steps. |
…endently control revision/diff blocking (#32) * Initial plan * Add CrawlerProtectionProtectRevisions to independently control revision/diff blocking --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
This is already in this branch — |
Removing `'history'` from `$wgCrawlerProtectedActions` did not actually unblock history for anonymous users: `?action=history` still went through, but related URLs (`type=revision`, `diff=N`, `oldid=N`) stayed blocked because those conditions were hardcoded in `checkPerformAction`. In addition, setting `$wgCrawlerProtectedActions = []` in `LocalSettings.php` had no effect because MediaWiki's default config merge strategy replaces an empty-array global with the extension.json default.
Beyond simply tying revisions/diffs to the `'history'` token, the history listing and individual revision/diff viewing are now controlled independently, so operators can protect one without protecting the other.
Changes
- `CrawlerProtectionService::checkPerformAction`: the history listing (`action=history`) is controlled by the `'history'` token in `$wgCrawlerProtectedActions`, while the `type === 'revision'`, `diff > 0`, and `oldid > 0` checks are gated on the new `$wgCrawlerProtectionProtectRevisions` boolean. This lets operators disable history-listing protection while still blocking direct access to individual revisions and diffs, or vice versa.
- `$wgCrawlerProtectionProtectRevisions`: new boolean config var where `true` denies access to revisions and diffs and `false` allows them (independently of the history listing). Registered in `extension.json` and documented in `README.md`.
- `extension.json`: add `"merge_strategy": "provide_default"` to the array config vars (`CrawlerProtectedActions`, `CrawlerProtectedSpecialPages`, `CrawlerProtectionAllowedIPs`). Without it, `ExtensionRegistry::exportExtractedData()` treats an empty-array global as "unset" and overwrites it with the extension.json default, so an operator setting `$wgCrawlerProtectedActions = []` would silently be reset to `['history']`. With `provide_default`, the extension.json value is only used when the global is not set at all, so an explicit operator value (including `[]`) wins.
- Tests: revision-shaped requests are denied when `$wgCrawlerProtectionProtectRevisions` is true and allowed when it is false (regardless of the `'history'` token), and `action=history` remains controlled solely by `$wgCrawlerProtectedActions`. The revision-shaped allow-cases use a focused `provideRevisionOnlyRequestParams` provider, and the `action=history` allow-case relies on the existing `testCheckPerformActionAllowsActionNotInConfig` to avoid duplicated coverage.
- Add a `string` return type to `User::getName` in `namespaced-stubs.php` so PHPUnit-generated mocks no longer return `null` into `IPUtils::isInRanges`, which was masking failures in pre-existing IP-allow-list tests.
- CI (`.github/workflows/ci.yml`): fix MediaWiki 1.39 pipeline failures. All REL1_39 jobs were dying in `composer update` because MW 1.39 pins `symfony/yaml 5.4.23`, which now has published security advisories, and recent Composer refuses to install advisory-flagged versions. Setting `policy.advisories.block: false` inside the generated `composer.local.json` had no effect, because the composer-merge-plugin does not merge the `config` section from merged files; only the root `composer.json` config is honored. The policy is now set on the root composer.json directly with `composer config --no-plugins policy.advisories.block false` before `composer update`, so the pinned dependencies install again. This is a temporal/environmental failure unrelated to the extension code (REL1_43 pins unaffected versions and was already green).
Before / after
Compatibility
Default config (`$wgCrawlerProtectedActions = ['history']`, `$wgCrawlerProtectionProtectRevisions = true`) is unchanged in behavior. With `provide_default`, an explicitly set `$wgCrawlerProtectedActions` (including an empty array) now fully replaces the default instead of being merged with it, so operators who exclude `'history'`, or empty the list entirely, will see the unblocking they expected. Revision/diff protection can now be toggled separately via `$wgCrawlerProtectionProtectRevisions`.
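To make the independence of the two settings concrete, here is a minimal sketch of the decision as described in this PR. `shouldDenyAnonymous` is a hypothetical stand-in, not the real `CrawlerProtectionService::checkPerformAction` (whose signature is not shown here), and it deliberately leaves out the registered-user and allowed-IP checks.

```php
<?php
declare( strict_types=1 );

/**
 * Hypothetical stand-in for the gating described above.
 *
 * @param string[] $protectedActions Value of $wgCrawlerProtectedActions
 * @param bool $protectRevisions Value of $wgCrawlerProtectionProtectRevisions
 * @param array $query Request parameters, e.g. [ 'action' => 'history' ]
 * @return bool True when the anonymous request should be denied
 */
function shouldDenyAnonymous( array $protectedActions, bool $protectRevisions, array $query ): bool {
	// The history listing is governed only by the protected-actions list.
	$action = (string)( $query['action'] ?? 'view' );
	if ( in_array( $action, $protectedActions, true ) ) {
		return true;
	}

	// Revision and diff URLs are governed only by the boolean.
	if ( !$protectRevisions ) {
		return false;
	}
	return ( $query['type'] ?? '' ) === 'revision'
		|| (int)( $query['diff'] ?? 0 ) > 0
		|| (int)( $query['oldid'] ?? 0 ) > 0;
}

// History listing open, individual revisions still blocked.
var_dump( shouldDenyAnonymous( [], true, [ 'action' => 'history' ] ) );          // bool(false)
var_dump( shouldDenyAnonymous( [], true, [ 'oldid' => '123' ] ) );               // bool(true)

// History listing blocked, individual revisions open.
var_dump( shouldDenyAnonymous( [ 'history' ], false, [ 'action' => 'history' ] ) ); // bool(true)
var_dump( shouldDenyAnonymous( [ 'history' ], false, [ 'diff' => '42' ] ) );        // bool(false)
```

In `LocalSettings.php` the first pair corresponds to `$wgCrawlerProtectedActions = [];` with `$wgCrawlerProtectionProtectRevisions = true;`, and the second pair to the default action list with `$wgCrawlerProtectionProtectRevisions = false;`.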