BUG: loc setitem with duplicate columns and new columns corrupts Data…#65208
jbrockmendel merged 2 commits into pandas-dev:main
The test creates a DataFrame with columns
I can confirm the indexer length always matches `len(keys)`. `get_indexer` is guaranteed to return one value per element in its input: either the position of that key in the existing columns, or -1 for new ones. So the length is always correct. The problem with the old code was that it assumed new columns always end up at the tail of `keys`, which isn't true when there are duplicate columns. `get_indexer` handles duplicates and any ordering correctly without making that assumption.
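To make the difference concrete, here is a minimal pure-Python sketch of the two strategies. It is not the pandas code: `tail_indexer` and `lookup_indexer` are hypothetical helpers written for illustration, and the `keys` ordering is made up rather than taken from what `union` actually returns.

```python
def tail_indexer(n_existing, n_keys):
    # Tail assumption: the first n_existing slots are existing columns,
    # and every slot after them is a new column (-1).
    return [i if i < n_existing else -1 for i in range(n_keys)]


def lookup_indexer(existing, keys):
    # Per-key lookup: position of each key in `existing`, or -1 if absent.
    position = {name: i for i, name in enumerate(existing)}
    return [position.get(key, -1) for key in keys]


existing = ["a", "b", "c"]
# Hypothetical ordering in which the new label "x" is NOT at the tail.
keys = ["a", "x", "b", "c"]

by_tail = tail_indexer(len(existing), len(keys))
by_lookup = lookup_indexer(existing, keys)

print(by_tail)    # [0, 1, 2, -1]  -> "x" is treated as existing column 1
print(by_lookup)  # [0, -1, 1, 2]  -> "x" is new, "b" and "c" keep their data
```

Both results have one entry per key, so the length is fine either way. Only the per-key lookup points each label at the right source column.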
thanks @roeimed0 |
…Frame (#58317)
Summary
Fixing stale PR #64079.

When using `df.loc` to assign a DataFrame with duplicate column names and new columns, unrelated columns were corrupted. The indexer in `_ensure_listlike_indexer` assumed the expanded columns mapped 1-to-1 to the original columns in order, which broke when `union` inserted duplicates in the middle. Fixed by using `get_indexer` to correctly map each column position.

AI was used to explore the code path and trace the root cause.
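For reference, a small sketch of the public `Index.get_indexer` contract the fix relies on: one result per requested label, with -1 for labels that are not present. The labels are made up and the index is deliberately unique to keep the example simple. How the patch applies this inside `_ensure_listlike_indexer` with duplicate columns is not reproduced here.

```python
import pandas as pd

columns = pd.Index(["a", "b", "c"])  # existing labels (unique in this sketch)
keys = ["b", "x", "a", "y"]          # "x" and "y" are not in `columns`

indexer = columns.get_indexer(keys)

print(indexer.tolist())           # [1, -1, 0, -1]
print(len(indexer) == len(keys))  # True
```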