Fix parsing error for backslash continuations with no indentation #4951

Open
Ashinee-work wants to merge 4 commits into psf:main from Ashinee-work:ashineek/fix-issue-4945

@Ashinee-work

Fix parsing error for backslash continuations with no indentation

Description

Fixes the parsing error that occurred when Black encountered multi-line expressions using backslash (\) line continuations followed by unindented lines.

Problem

Black would fail to parse valid Python code like:

if True:
    foo = 1+\
2
    print(foo)

With the error:

error: cannot format file.py: Cannot parse for target version Python 3.13: 3:0: 2

This is valid Python code that runs without issues, but Black couldn't parse it.

Root Cause

The tokenizer in src/blib2to3/pgen2/tokenize.py was not correctly handling INDENT/DEDENT tokens generated by pytokens during backslash continuations.

When pytokens encountered an unindented line after a backslash continuation, it would emit DEDENT tokens based on the physical indentation. However, under Python's lexical rules, a backslash continuation joins the next physical line to the same logical line, so indentation changes should be ignored until the logical line ends (at a NEWLINE token).
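
For comparison, the reference behavior can be observed with the standard library's tokenize module. The following is a minimal sketch and is not part of this PR; `token_names` is a hypothetical helper name. It tokenizes the example from the Problem section and counts the INDENT/DEDENT tokens.

```python
import io
import tokenize

# The example from the Problem section, with an unindented continuation line.
SOURCE = "if True:\n    foo = 1+\\\n2\n    print(foo)\n"


def token_names(source: str) -> list[str]:
    """Return the token type names produced by the stdlib tokenizer."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type] for tok in tokenize.generate_tokens(readline)]


names = token_names(SOURCE)
# The unindented "2" belongs to the same logical line as "foo = 1+\",
# so only the if-block's own INDENT/DEDENT pair should appear.
print(names.count("INDENT"), names.count("DEDENT"))
```

The stdlib tokenizer emits no token for the backslash itself, so there is nothing between the `+` and the `2` that could carry an indentation change.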

Solution

Modified the tokenize function in src/blib2to3/pgen2/tokenize.py to:

  1. Detect backslash continuations: Only NL tokens that start with \ are treated as backslash continuations (not all NL tokens, which can also represent blank lines)

  2. Skip incorrect INDENT/DEDENT tokens: During a backslash continuation (between a backslash NL token and a NEWLINE token), all INDENT and DEDENT tokens are skipped and tracked in a balance counter

  3. Balance tokens after continuation: After the logical line ends (NEWLINE token), we skip additional INDENT/DEDENT tokens to balance what was skipped during the continuation, ensuring the indentation level remains correct
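
The three steps above can be sketched as a filter over a simplified token stream. This is an illustration only, not the code in this PR: the `(type, text)` tuples, the `drop_continuation_indents` name, and the `SAMPLE` stream (a guess at what pytokens emits for the Problem example) are all assumptions.

```python
from typing import Iterable, Iterator, Tuple

Token = Tuple[str, str]  # (type name, text); a simplified stand-in for real tokens


def drop_continuation_indents(tokens: Iterable[Token]) -> Iterator[Token]:
    """Skip INDENT/DEDENT inside backslash continuations, then rebalance."""
    in_continuation = False
    balance = 0  # net INDENT (+1) / DEDENT (-1) skipped so far
    for kind, text in tokens:
        if kind in ("INDENT", "DEDENT"):
            delta = 1 if kind == "INDENT" else -1
            if in_continuation:
                balance += delta  # step 2: skip and record
                continue
            if balance * delta < 0:
                balance += delta  # step 3: skip the compensating token
                continue
        elif kind == "NL" and text.startswith("\\"):
            in_continuation = True  # step 1: a blank-line NL would not match
        elif kind == "NEWLINE":
            in_continuation = False  # the logical line has ended
        yield kind, text


# Assumed input: a DEDENT after the backslash NL, then an INDENT on the next line.
SAMPLE = [
    ("NAME", "if"), ("NAME", "True"), ("OP", ":"), ("NEWLINE", "\n"),
    ("INDENT", "    "), ("NAME", "foo"), ("OP", "="), ("NUMBER", "1"), ("OP", "+"),
    ("NL", "\\\n"), ("DEDENT", ""), ("NUMBER", "2"), ("NEWLINE", "\n"),
    ("INDENT", "    "), ("NAME", "print"), ("OP", "("), ("NAME", "foo"), ("OP", ")"),
    ("NEWLINE", "\n"), ("DEDENT", ""), ("ENDMARKER", ""),
]

kinds = [kind for kind, _ in drop_continuation_indents(SAMPLE)]
print(kinds.count("INDENT"), kinds.count("DEDENT"))
```

In this sketch the balance counter only cancels a later token of the opposite kind, so a stream that ends before the compensating token arrives is left with a nonzero balance.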

This ensures Black's tokenization matches Python's standard tokenizer behavior.

Changes Made

Modified Files

  • src/blib2to3/pgen2/tokenize.py: Enhanced the tokenize function with logic to handle backslash continuations
  • tests/test_tokenize.py: Added comprehensive test case test_backslash_continuation() with multiple scenarios

Test Cases

Test Case 1: Simple Backslash Continuation

if True:
    foo = 1+\
2
    print(foo)
  • Before: Cannot parse for target version Python 3.13: 3:0: 2
  • After: Successfully formats to foo = 1 + 2

Test Case 2: Multiple Backslash Continuations

if True:
    result = 1+\
2+\
3
    print(result)
  • Before: Parsing error
  • After: Successfully formats to result = 1 + 2 + 3

Test Case 3: Backslash Continuation in Function Call

if True:
    print(1+\
2)
  • Before: Parsing error
  • After: Successfully formats to print(1 + 2)

Test Case 4: With Windows Line Endings

Tested with \r\n line endings to ensure cross-platform compatibility.

  • Before: Parsing error
  • After: Successfully formats
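
As a sanity check that the input itself is valid with either line ending, the snippet can be compiled and run under plain Python. This is a minimal sketch that exercises only the interpreter, not Black; `run_snippet` is a hypothetical helper name.

```python
# The same snippet with Unix and with Windows line endings.
UNIX_SOURCE = "if True:\n    foo = 1+\\\n2\n"
WINDOWS_SOURCE = UNIX_SOURCE.replace("\n", "\r\n")


def run_snippet(source: str) -> dict:
    """Compile and execute the source, returning the resulting namespace."""
    namespace: dict = {}
    exec(compile(source, "<snippet>", "exec"), namespace)
    return namespace


for label, source in (("LF", UNIX_SOURCE), ("CRLF", WINDOWS_SOURCE)):
    print(label, run_snippet(source)["foo"])
```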

Test File

Before running the script: [image]

After running the script: [image]

Testing

All tests pass:

$ python -m pytest tests/test_tokenize.py -v
tests/test_tokenize.py::test_simple PASSED                               [ 33%]
tests/test_tokenize.py::test_fstring PASSED                              [ 66%]
tests/test_tokenize.py::test_backslash_continuation PASSED               [100%]

============================== 3 passed ==============================

Checklist

  • Implement any code style changes under the --preview style, following the stability policy?

    • N/A - This is a bug fix for the parser, not a code style change
  • Add an entry in CHANGES.md if necessary?

    • Will add upon review/approval
  • Add / update tests if necessary?

    • ✅ Added comprehensive test cases in tests/test_tokenize.py
    • ✅ Tests cover simple, multiple, and in-call backslash continuations, plus \r\n line endings
    • ✅ All existing tests continue to pass

Impact

This is a bug fix that enables Black to correctly parse and format valid Python code that was previously failing. The fix:

  • ✅ Does not change any formatting behavior for code that was already working
  • ✅ Only affects the tokenization layer to match Python's standard behavior
  • ✅ Is fully backward compatible
  • ✅ Works with both Unix (\n) and Windows (\r\n) line endings

Related Issues: #4945
Testing Instructions:

# Create a test file
cat > test_fix.py << 'EOF'
if True:
    foo = 1+\
2
    print(foo)
EOF

# Format it with Black
python -m black test_fix.py --target-version py313
# Should output: "reformatted test_fix.py" ✅

@cobaltt7
Collaborator

Thanks for this PR! Instead of making a new test file, could you instead make a new case file in tests/data/cases? I don't think it's necessary to test line endings, if that's not possible in a case file. Please also add a changelog entry.

Other than that, the PR looks good, thanks!

(For reference CI is red due to #4944, not anything you need to worry about)

@github-actions
Contributor

github-actions bot commented Jan 12, 2026

diff-shades results comparing this PR (e628db6) to main (21a2a8c):

--preview style: no changes

--stable style: no changes


@ericli-splunk

ericli-splunk commented Jan 12, 2026

Hi @Ashinee-work, I tested your PR, but it looks like it does not work properly when the backslash-continued line is at the end of the file. For example:

if True:
    foo = 1+\
2

The error message is:

$ cat a.py
if True:
    foo = 1+\
2
$ black a.py
error: cannot format a.py: Cannot parse: 3:0: 2

Oh no! 💥 💔 💥
1 file failed to reformat.
$ 

Current commit is d3b20ad
