Conversation
Agent-Logs-Url: https://github.com/mikefarah/yq/sessions/a8525fbb-77a7-4bb0-a3a7-b24f99ae8710 Co-authored-by: mikefarah <[email protected]>
| ``` | ||
| then | ||
| ```bash | ||
| yq '.country[0:5]' sample.yml |
There was a problem hiding this comment.
Test says without the second number, but both numbers are present
pkg/yqlib/operator_slice_test.go
Outdated
| description: "Slicing strings - without the second number", | ||
| subdescription: "Finishes at the end of the string", | ||
| document: `country: Australia`, | ||
| expression: `.country[0:5]`, |
There was a problem hiding this comment.
yep this has the second number...should this be .country[5:] ?
There was a problem hiding this comment.
@copilot fix this ^;
also the build is failing with:
Error: pkg/yqlib/operator_slice.go:19:98: sliceStringNode - result 1 (error) is always nil (unparam)
func sliceStringNode(lhsNode *CandidateNode, firstNumber int, secondNumber int) (*CandidateNode, error) {
There was a problem hiding this comment.
Fixed in 9a9399a:
- Changed "Slicing strings - without the second number" expression from
.country[0:5]to.country[5:](now actually omits the second number, givingalia) - Changed the first "Slicing strings" test to use both numbers:
.country[0:5]→Austr - Removed the always-nil error return from
sliceStringNodeto fix the build warning
Agent-Logs-Url: https://github.com/mikefarah/yq/sessions/58726b13-68ae-4f93-971f-eb70459edcf4 Co-authored-by: mikefarah <[email protected]>
There was a problem hiding this comment.
Pull request overview
Adds correct string-slicing behavior to yq’s slice operator and fixes traversal so slice operations apply to the evaluated LHS value (e.g. .country[4:] slices the country string, not the root document).
Changes:
- Add rune-based string slicing support and route
!!strscalar nodes through it. - Fix
traverseArrayOperatorto evaluate/pipeline the LHS before applying slice logic. - Extend lexer post-processing to support omitted-start string slicing syntax like
.field[:N].
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| project-words.txt | Adds new words used by tests/docs to the project dictionary. |
| pkg/yqlib/operator_traverse_path.go | Ensures slice operations receive the evaluated LHS context. |
| pkg/yqlib/operator_slice.go | Introduces sliceStringNode and routes string scalars through it. |
| pkg/yqlib/lexer.go | Enables implicit 0 insertion for [:N] after [ (not just after .[). |
| pkg/yqlib/operator_slice_test.go | Adds coverage for string slicing (including Unicode and bounds cases). |
| pkg/yqlib/doc/operators/slice-array.md | Updates generated operator docs to include string slicing examples. |
| pkg/yqlib/doc/operators/headers/slice-array.md | Updates the slice operator header to reflect string support. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/yqlib/operator_slice.go
Outdated
| relativeFirstNumber := firstNumber | ||
| if relativeFirstNumber < 0 { | ||
| relativeFirstNumber = len(lhsNode.Content) + firstNumber | ||
| } |
There was a problem hiding this comment.
Array slicing can still panic when the start index is more negative than the array length (e.g. .[-100:]): relativeFirstNumber is adjusted by len(lhsNode.Content)+firstNumber but is not clamped back to 0, so the loop can index lhsNode.Content with a negative i. Clamp relativeFirstNumber (and similarly relativeSecondNumber) into [0,len] before iterating to avoid runtime panics and match the existing out-of-bounds clamping behavior for other cases.
There was a problem hiding this comment.
Fixed in 341e252. Array slice now clamps both relativeFirstNumber and relativeSecondNumber into [0, len(Content)] after adjusting for negative indices, preventing a panic on expressions like .[-100:]. Also added test cases for both directions of out-of-bounds negative indices.
Co-authored-by: Copilot <[email protected]>
Agent-Logs-Url: https://github.com/mikefarah/yq/sessions/7c146762-d251-45fd-8555-2488f59fc57b Co-authored-by: mikefarah <[email protected]>
|
Hi Mike, while I was working on some OSS-Fuzz bug reports against yqlib I've also been running my Deep Review tool against this PR because I was curious what it would find: https://jandubois.github.io/yq/20260405-232458-pr-2639.html For the I2 finding (index clamping) you could use the Haven't looked through the review in detail yet, but thought you might be interested. |
|
@copilot resolve conflicts with main; you should be able to use the new "clampSliceIndex" function; and action the following feeback S2. Lexer openCollect change affects standalone [: expr] ClaudeGemini ¶ The added openCollect condition correctly enables .field[:N] but also fires on standalone [ (array literal start). [: "b"] now silently produces [{0: b}] instead of an error. Unlikely to affect real users, but broadens accepted syntax unintentionally. (suggestion, gap) Fix: Tighten the condition to require that openCollect follows a traversal token. S3. Doc header "Splice" misleading for strings Claude ¶ The header was updated from "Slice/Splice Array" to "Slice/Splice Array or String". "Splice" implies in-place mutation, which is not supported for strings. Consider "Slice Array or String" instead. S4. Documentation uses only ASCII examples Claude ¶ All four string slicing doc sections use country: Australia (ASCII-only). Since the feature's distinguishing value is rune-based Unicode handling, one documented example with a multibyte string would demonstrate this. |
Co-authored-by: mikefarah <[email protected]>
Agent-Logs-Url: https://github.com/mikefarah/yq/sessions/ec06083e-e20a-45d2-bf7e-4e1fa7be1073 Co-authored-by: mikefarah <[email protected]>
... Done in adb150d (and merge commit 1afa3c7):
|
|
@copilot build failed; pkg/yqlib/doc/operators/slice-array.md:169:28 - Unknown word (multibyte) |
Agent-Logs-Url: https://github.com/mikefarah/yq/sessions/6e7b304b-5b52-4e89-8bad-ba22813305c7 Co-authored-by: mikefarah <[email protected]>
|
Hi Mike, thanks for the quick updates! I'm wondering what your plans are regarding the Also automated check of review against merged state: Follow-up: PR #2639 Post-Merge Finding StatusChecked against: AddressedI2 — Inconsistent index-clamping structure. Fixed. I3 — Missing string edge-case tests. Partially fixed. The merged PR adds tests for explicit indices, omitted start/end, negative indices, both ends, full slice S1 — Commit structure. Fixed. Squash-merged into a single commit. S2 — Lexer S3 — "Splice" in doc header. Fixed. Header changed to "Slice Array or String". S4 — ASCII-only documentation examples. Fixed. Added a Unicode example ( Not addressedI1 — Byte-vs-rune This is the most impactful remaining issue: any expression that feeds I4 — Non-string scalar slicing produces invalid output. No guard was added before the array fallthrough. I5 — Slice index expressions evaluated against LHS instead of root context. SummarySix of nine findings addressed (all four suggestions, I2, and partially I3). The three remaining important issues (I1, I4, I5) are pre-existing gaps that the PR exposed rather than introduced — none are regressions. |
yq supported array slicing (
.[1:3]) but not string slicing —.field[4:]on a string value silently returned the wrong result or panicked.Changes
operator_slice.go— AddedsliceStringNode(rune-based for Unicode correctness); uses the sharedclampSliceIndexhelper for bounds clamping.sliceArrayOperatornow routes!!strscalar nodes to it instead of the array path. The original node's tag and style are preserved on the replacement.operator_traverse_path.go— FixedtraverseArrayOperatorto evaluate the LHS before callingsliceArrayOperator. Previously it passed the raw document context, so.country[4:]would operate on the whole map rather than thecountryvalue.lexer.go— Extended the implicit-0insertion (for omitted start index in[:N]) to fire onopenCollect([) in addition totraverseArrayCollect(.[), enabling.field[:N]syntax. The condition is tightened to require thatopenCollectfollows a traversal token (operationToken, closeCollect, or closeCollectObject), preventing standalone[: "b"]map literals from being silently misinterpreted.operator_slice_test.go— Added string slicing scenarios: start-only, end-only, both, negative indices, out-of-bounds clamping (including underflow), full[:], and Unicode. Regression tests for very-negative array indices (OSS-Fuzz) are included.doc/operators/headers/slice-array.md— Header updated to "Slice Array or String" (removed "Splice" which implied unsupported in-place mutation).Example
Unicode indices are rune-based:
📱 Kick off Copilot coding agent tasks wherever you are with GitHub Mobile, available on iOS and Android.