Security: Potential XSS via unescaped template variables injected into inline JavaScript #990

Open
tuanaiseo wants to merge 1 commit into webrecorder:main from tuanaiseo:contribai/fix/security/potential-xss-via-unescaped-template-var

@tuanaiseo

Problem

The template disables autoescaping and injects user-influenced values (such as cdx.url, top_url, coll, and others) directly into JavaScript string literals. If any value contains quotes, backslashes, or script-breaking payloads, it can execute arbitrary JavaScript in the replay UI.

Severity: high
File: pywb/templates/head_insert.html

Solution

Remove autoescape false for script data paths and serialize all dynamic JS values using safe JSON encoding (for example |tojson) instead of manual quoted interpolation.
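
To make the difference concrete, here is a minimal Python sketch of the two rendering styles. It does not use pywb or Jinja2: `render_quoted` and `render_json` are hypothetical helpers that only imitate what the template would emit, and the quoted `wbinfo.coll` form is assumed for illustration.

```python
import json


def render_quoted(value: str) -> str:
    # Imitates manual quoted interpolation with autoescaping off,
    # e.g. wbinfo.coll = "{{ coll }}";  (assumed form, for illustration)
    return 'wbinfo.coll = "' + value + '";'


def render_json(value: str) -> str:
    # Imitates wbinfo.coll = {{ coll | tojson }};
    return "wbinfo.coll = " + json.dumps(value) + ";"


payload = '"; alert(1); //'

unsafe_line = render_quoted(payload)
safe_line = render_json(payload)

print(unsafe_line)  # wbinfo.coll = ""; alert(1); //";
print(safe_line)    # wbinfo.coll = "\"; alert(1); //";
```

In the first line the payload closes the string literal and the rest runs as code; in the second the quote is escaped, so the value stays inside the literal. JSON encoding alone does not address every context inside a `<script>` element, so this sketch covers only the quoting problem.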

Changes

  • pywb/templates/head_insert.html (modified)

Testing

  • Existing tests pass
  • Manual review completed
  • No new warnings/errors introduced

The template disables autoescaping and injects user-influenced values (such as `cdx.url`, `top_url`, `coll`, and others) directly into JavaScript string literals. If any value contains quotes, backslashes, or script-breaking payloads, it can execute arbitrary JavaScript in the replay UI.

Affected files: head_insert.html

Signed-off-by: tuanaiseo <221258316+tuanaiseo@users.noreply.github.com>
@wumpus

wumpus commented Apr 4, 2026

You didn't remove the autoescape? And there's no test.

Also note that there's some history behind this detail -- in 2021 Ilya checked in `autoescape false` and also added a comment in CHANGES.rst that he was fixing an XSS pointed out by Sebastian. So that's a little weird. @ikreymer @sebastian-nagel

@ato
Collaborator

ato commented Apr 6, 2026

While this should be fixed just for the principle of it, I couldn't find a straightforward way to exploit this from the frontend because the browser will encode " to %22 in the request and that's what will end up in top_url.

I did verify that you can inject scripts via WARC-Target-URI though in a malicious WARC file. Although normally scripts in archived pages get replayed anyway so an attacker who controls the WARC file doesn't actually need to do this.

 WARC-Target-URI: http://poc.example/</script><script>alert('oops')</script>

It looks like `| tojson` is an incomplete fix: while it encodes `"` and `\`, you can still break out with `</script>`.
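
A small Python sketch of that gap, assuming a `tojson` filter that simply calls `json.dumps` (as the traceback in this thread suggests). `tojson_plain` and `tojson_script_safe` are hypothetical names; the second one escapes the characters that matter inside a `<script>` element as `\uXXXX` sequences, which, to my knowledge, is similar to what Jinja2's built-in `tojson` filter does.

```python
import json


def tojson_plain(obj) -> str:
    # Plain JSON encoding: escapes " and \ but leaves < > & ' untouched.
    return json.dumps(obj)


def tojson_script_safe(obj) -> str:
    # Hypothetical hardened variant: replace HTML-significant characters
    # with JSON unicode escapes so the output cannot close a <script> tag.
    return (
        json.dumps(obj)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
        .replace("'", "\\u0027")
    )


url = "http://poc.example/</script><script>alert('oops')</script>"

plain = tojson_plain(url)
safe = tojson_script_safe(url)

print("</script>" in plain)     # True: the HTML parser would end the script here
print("<" in safe)              # False
print(json.loads(safe) == url)  # True: the value still round-trips
```

The replacements run after `json.dumps`, so they only touch characters that JSON left literal, and a JavaScript engine decodes the `\u003c` escapes back to the original string.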

wbinfo.proxy_magic = "{{ env.pywb_proxy_magic }}";
wbinfo.static_prefix = "{{ static_prefix }}/";
wbinfo.coll = {{ coll | tojson }};
wbinfo.proxy_magic = {{ env.pywb_proxy_magic | tojson }};
Collaborator

  File "pywb/templates/head_insert.html", line 27, in top-level template code
    wbinfo.proxy_magic = {{ env.pywb_proxy_magic | tojson }};
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pywb/rewrite/templateview.py", line 269, in tojson
    return json.dumps(obj)
           ^^^^^^^^^^^^^^^
TypeError: Undefined is not JSON serializable
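
One way to avoid this failure, sketched in plain Python under stated assumptions: the `Undefined` class below is a hypothetical stand-in for the template engine's undefined value (it is not the real Jinja2 class), and `tojson_or_null` is an illustrative filter name, not pywb's.

```python
import json


class Undefined:
    """Hypothetical stand-in for the template engine's undefined value."""


def tojson_or_null(obj) -> str:
    # Emit JSON null for undefined values instead of raising TypeError.
    if isinstance(obj, Undefined):
        return "null"
    return json.dumps(obj)


# json.dumps rejects objects it does not know how to serialize.
try:
    json.dumps(Undefined())
    error = ""
except TypeError as exc:
    error = str(exc)

print(error)
print(tojson_or_null(Undefined()))  # null
print(tojson_or_null("magic"))      # "magic"
```

An alternative that keeps the filter unchanged would be to supply a fallback in the template, for example with Jinja2's `default` filter (`{{ env.pywb_proxy_magic | default('') | tojson }}`), which would preserve the empty-string result the old quoted form produced for an unset value.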

3 participants