No ticket: Clarify when the zigzag encoding is used [skip ci] by abellgithub · Pull Request #3503 · apache/thrift

abellgithub · 2026-05-20T16:41:35Z

Client: all

Did you create an Apache Jira ticket? (Request account here, not required for trivial changes)
If a ticket exists: Does your pull request title follow the pattern "THRIFT-NNNN: describe my issue"?
Did you squash your changes to a single commit? (not required, but preferred)
Did you do your best to avoid breaking changes? If one was needed, did you label the Jira ticket with "Breaking-Change"?
If your change does not involve any code, include [skip ci] anywhere in the commit message to free up build resources.

Client: all

Jens-G · 2026-05-20T20:11:35Z

+possible to handle unknown fields while decoding by ignoring them. The field type is used to determine how to decode field values.

-Note that the field name is not encoded so field renames in the IDL do not affect forward and backward compatibility.
+Note that the fields are identified by their integer value and not thier name.


thier name?

Seems OK to me, but you're welcome to change.

not "their name"?

Ah, spelling. Will fix.

Jens-G · 2026-05-20T20:30:57Z

Code review

Found 1 issue:

Double encoding description contradicts the document's own i64 definition — the new "Integer encoding" section defines i16/i32/i64 as ZigZag-encoded and then varint-encoded. Saying doubles are "encoded as if they were i64" implies they go through the same ZigZag+varint pipeline, which is incorrect — doubles are written as 8 raw little-endian bytes with no ZigZag and no varint transformation.

thrift/doc/specs/thrift-compact-protocol.md

Lines 88 to 121 in 26f275f

    
           Values of type `i8` are encoded as one byte. 
        
           Values of type `i16`, `i32` and `i64` are first transformed to using [ZigZag](#zigzag-encoding) 
        
           and that result is then [varint](#varint-encoding) encoded before being written. 
        
           ### Enum encoding 
        
           `Enum`s are encoded as their integer value as defined by the Thrift IDL. 
        
           ### Binary encoding 
        
           Binary is sent as follows: 
        
           ``` 
        
           Binary protocol, binary data, 1+ bytes: 
        
           +--------+...+--------+--------+...+--------+ 
        
           | byte length         | bytes               | 
        
           +--------+...+--------+--------+...+--------+ 
        
           ``` 
        
           Where: 
        
           * `byte length` is the length of the byte array (non-negative), encoded as a 
        
              [varint](#varint-encoding) (**not**  [ZigZag'ed](#zigzag-encoding)). 
        
           * `bytes` are the bytes of the byte array. 
        
           ### String encoding 
        
           Strings are written as [binary](#binary-encoding) and must be encoded as UTF-8. Strings do 
        
            not include a NULL(0) terminator. 
        
           ### Double encoding 
        
           Values of type `double` are converted to little-endian and then encoded as if they were `i64` (the bytes are treated as the bytes of an 64-bit integer).

🤖 Generated with Claude Code

_{- If this code review was useful, please react with 👍. Otherwise, react with 👎.}

abellgithub · 2026-05-20T21:10:53Z

I don't see any way to respond explicitly to the comment about doubles. A look at the code shows that I misinterpreted the old language. I will edit.

abellgithub · 2026-05-20T21:15:10Z

Actually, the old language talking about int64 seems pointless. The bytes of a double seem to be written directly in little-endian byte order -- any reference to i64 would strike me as confusing (and was probably why I was confused when I read the old language).

No ticket: Clarify when the zigzag encoding is used.

26f275f

Client: all

abellgithub requested a review from Jens-G as a code owner May 20, 2026 16:41

Jens-G reviewed May 20, 2026

View reviewed changes

Jens-G changed the title ~~No ticket: Clarify when the zigzag encoding is used.~~ No ticket: Clarify when the zigzag encoding is used [skip ci] May 20, 2026

Jens-G added the doc label May 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

No ticket: Clarify when the zigzag encoding is used [skip ci]#3503

No ticket: Clarify when the zigzag encoding is used [skip ci]#3503
abellgithub wants to merge 1 commit into
apache:masterfrom
abellgithub:compact-correction

abellgithub commented May 20, 2026

Uh oh!

Jens-G May 20, 2026

Uh oh!

abellgithub May 20, 2026

Uh oh!

Jens-G May 20, 2026

Uh oh!

abellgithub May 21, 2026

Uh oh!

Jens-G commented May 20, 2026

Uh oh!

abellgithub commented May 20, 2026 •

edited

Loading

Uh oh!

abellgithub commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

abellgithub commented May 20, 2026

Uh oh!

Jens-G May 20, 2026

Choose a reason for hiding this comment

Uh oh!

abellgithub May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Jens-G May 20, 2026

Choose a reason for hiding this comment

Uh oh!

abellgithub May 21, 2026

Choose a reason for hiding this comment

Uh oh!

Jens-G commented May 20, 2026

Code review

Uh oh!

abellgithub commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

abellgithub commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

abellgithub commented May 20, 2026 •

edited

Loading