Fix parsing error for backslash continuations with no indentation #4951
Open
Ashinee-work wants to merge 4 commits into psf:main from
Collaborator
Thanks for this PR! Instead of making a new test file, could you instead make a new case file in Other than that, the PR looks good, thanks! (For reference, CI is red due to #4944, not anything you need to worry about.)
Hi @Ashinee-work, I tested your PR, but it looks like it does not work properly when the backslash-continued line is at the end of the file. For example:
if True:
    foo = 1+\
2
The error message is:
Current commit is d3b20ad
Fix parsing error for backslash continuations with no indentation
Description
Fixes the parsing error that occurred when Black encountered multi-line expressions using backslash (\) line continuations followed by unindented lines.
Problem
Black would fail to parse valid Python code like:
With the error:
This is valid Python code that runs without issues, but Black couldn't parse it.
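The claim that such code is valid can be checked with the standard library alone. The sketch below is an illustration rather than part of the PR: the sample source is an assumption modeled on the snippet quoted in the conversation, and it uses only `compile`, `exec`, and the stdlib `tokenize` module (not Black's own tokenizer).

```python
import io
import tokenize

# Assumed sample: an indented statement whose backslash continuation
# resumes at column 0 on the next physical line.
SOURCE = "if True:\n    foo = 1+\\\n2\n"

# CPython compiles and runs the sample without complaint.
namespace = {}
exec(compile(SOURCE, "<sample>", "exec"), namespace)

# The stdlib tokenizer treats both physical lines as one logical line,
# so the DEDENT only shows up after the NEWLINE that ends that line.
tokens = list(tokenize.generate_tokens(io.StringIO(SOURCE).readline))
names = [tokenize.tok_name[tok.type] for tok in tokens]

for tok in tokens:
    print(tokenize.tok_name[tok.type], repr(tok.string))
```

Printing the token list makes it easy to compare against what another tokenizer emits for the same input.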
Root Cause
The tokenizer in src/blib2to3/pgen2/tokenize.py was not correctly handling INDENT/DEDENT tokens generated by pytokens during backslash continuations.
When pytokens encountered an unindented line after a backslash continuation, it would emit DEDENT tokens based on the physical indentation. However, in Python's grammar, a backslash continuation means the next line is part of the same logical line, so indentation changes should be ignored until the logical line ends (at a NEWLINE token).
Solution
Modified the tokenize function in src/blib2to3/pgen2/tokenize.py to:
Detect backslash continuations: Only NL tokens that start with \ are treated as backslash continuations (not all NL tokens, which can also represent blank lines)
Skip incorrect INDENT/DEDENT tokens: During a backslash continuation (between a backslash NL token and a NEWLINE token), all INDENT and DEDENT tokens are skipped and tracked in a balance counter
Balance tokens after continuation: After the logical line ends (NEWLINE token), we skip additional INDENT/DEDENT tokens to balance what was skipped during the continuation, ensuring the indentation level remains correct
This ensures Black's tokenization matches Python's standard tokenizer behavior.
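The three steps above can be modeled on a toy token stream. This is a minimal sketch under stated assumptions, not the code in the PR: `Tok`, `skip_continuation_indents`, and the hand-written `stream` are hypothetical names and data, and the sample stream only approximates what a physical-indentation tokenizer might emit.

```python
from typing import Iterable, Iterator, NamedTuple


class Tok(NamedTuple):
    """Hypothetical, simplified token: a type name and its source text."""
    type: str
    text: str


def skip_continuation_indents(tokens: Iterable[Tok]) -> Iterator[Tok]:
    in_continuation = False
    balance = 0  # +1 per skipped INDENT, -1 per skipped DEDENT
    for tok in tokens:
        if tok.type == "NL" and tok.text.startswith("\\"):
            # Step 1: only an NL that starts with a backslash opens a continuation.
            in_continuation = True
        elif tok.type == "NEWLINE":
            # The logical line ends here.
            in_continuation = False
        elif tok.type in ("INDENT", "DEDENT"):
            if in_continuation:
                # Step 2: skip and remember indentation changes mid-line.
                balance += 1 if tok.type == "INDENT" else -1
                continue
            # Step 3: skip the token that cancels an earlier skipped one.
            if tok.type == "INDENT" and balance < 0:
                balance += 1
                continue
            if tok.type == "DEDENT" and balance > 0:
                balance -= 1
                continue
        yield tok


# Assumed stream for:  if True: / foo = 1+\ / 2 / print(foo)
stream = [
    Tok("NAME", "if"), Tok("NAME", "True"), Tok("OP", ":"), Tok("NEWLINE", "\n"),
    Tok("INDENT", "    "),
    Tok("NAME", "foo"), Tok("OP", "="), Tok("NUMBER", "1"), Tok("OP", "+"),
    Tok("NL", "\\\n"),
    Tok("DEDENT", ""),       # spurious: caused by the unindented "2"
    Tok("NUMBER", "2"), Tok("NEWLINE", "\n"),
    Tok("INDENT", "    "),   # spurious: re-entering the original block
    Tok("NAME", "print"), Tok("OP", "("), Tok("NAME", "foo"), Tok("OP", ")"),
    Tok("NEWLINE", "\n"),
    Tok("DEDENT", ""),
]

filtered = list(skip_continuation_indents(stream))
kinds = [tok.type for tok in filtered]
```

In this model the two spurious tokens cancel each other, leaving one INDENT and one DEDENT for the `if` block. A signed counter is used so that a skipped INDENT and a skipped DEDENT are balanced symmetrically.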
Changes Made
Modified Files
src/blib2to3/pgen2/tokenize.py: Enhanced the tokenize function with logic to handle backslash continuations
tests/test_tokenize.py: Added comprehensive test case test_backslash_continuation() with multiple scenarios
Test Cases
Test Case 1: Simple Backslash Continuation
Cannot parse for target version Python 3.13: 3:0: 2
foo = 1 + 2
Test Case 2: Multiple Backslash Continuations
result = 1 + 2 + 3
Test Case 3: Backslash Continuation in Function Call
print(1 + 2)
Test Case 4: With Windows Line Endings
Tested with \r\n line endings to ensure cross-platform compatibility.
Test File
Before running the script
After running the script
Testing
All tests pass:
Checklist
Implement any code style changes under the --preview style, following the stability policy?
Add an entry in CHANGES.md if necessary?
Add / update tests if necessary?
tests/test_tokenize.py
Impact
This is a bug fix that enables Black to correctly parse and format valid Python code that was previously failing. The fix:
\n) and Windows (\r\n) line endings
Related Issues: #4945
Testing Instructions: