
THRIFT-6052: Limit struct read/write recursion depth in Smalltalk library #3557

Open
Jens-G wants to merge 1 commit into apache:master from Jens-G:smalltalk-recursion-depth

Jens-G commented May 28, 2026

Summary

THRIFT-6052 — bound the recursion depth of Smalltalk struct read/write.

Smalltalk struct serialization is emitted inline by the generator (the
struct_writer / struct_reader templates; there are no per-struct read/write
methods). On master these do not bound recursion depth, so a deeply nested
message is read or written without a limit.

This PR wraps the generated struct read/write bodies with
incrementRecursionDepth ... ensure: [decrementRecursionDepth] and adds the
counter to TProtocol (limit 64, TProtocolError depthLimit on excess).
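
The wrapping described above can be modelled outside Smalltalk. The following is a minimal Python sketch, not the library's code: `Protocol`, `DepthLimitError`, `write_struct` and `chain` are hypothetical names, and checking the limit before incrementing is an assumption made here so that the counter stays balanced when the limit is hit. Only the limit of 64 and the increment/ensure-decrement shape come from the PR.

```python
class DepthLimitError(Exception):
    """Stand-in for TProtocolError depthLimit (hypothetical Python name)."""


class Protocol:
    """Toy protocol object that carries only the recursion counter."""

    DEPTH_LIMIT = 64  # limit stated in this PR

    def __init__(self):
        self.depth = 0

    def increment_recursion_depth(self):
        # Assumption: reject before incrementing, so a failed call
        # leaves the counter unchanged.
        if self.depth >= self.DEPTH_LIMIT:
            raise DepthLimitError(f"depth limit {self.DEPTH_LIMIT} exceeded")
        self.depth += 1

    def decrement_recursion_depth(self):
        self.depth -= 1


def write_struct(prot, struct):
    """Walk a nested struct, modelled here as {"child": <struct or None>}."""
    prot.increment_recursion_depth()
    try:
        child = struct["child"]
        if child is not None:
            write_struct(prot, child)
    finally:
        # Plays the role of ensure: [decrementRecursionDepth];
        # it also runs when a deeper level raises.
        prot.decrement_recursion_depth()


def chain(levels):
    """Build a chain of `levels` nested structs, innermost first."""
    node = None
    for _ in range(levels):
        node = {"child": node}
    return node
```

In this model `write_struct(Protocol(), chain(64))` completes, `chain(65)` raises `DepthLimitError`, and the counter is back at zero in both cases because every successful increment is paired with a decrement.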

Incidental fix: the struct reader emitted oprot readStructEnd in a read
context where it should be iprot. That was a latent bug that broke service
struct reads independently of depth; it is corrected here (the oprot -> iprot
change in struct_reader).

Test

The previous test pumped incrementRecursionDepth / decrementRecursionDepth
directly and never serialized a struct. It is replaced by a round-trip test
(lib/st/test/TProtocolRecursionDepthTest.st + RecursionDepthTest.thrift) that
drives the generated DeepClient send/recv path:

| case | expectation |
| --- | --- |
| chain of distinct structs at the limit (64), write | round-trips |
| chain one past the limit (65), write | TProtocolError (depth limit) |
| shallow payload, read (generated reader) | round-trips |
| payload nested past the limit, read | TProtocolError (depth limit) |

Why a finite chain instead of Recursive.thrift

The Smalltalk generator inline-expands nested struct serialization, so a
genuinely recursive type (e.g. CoRec/CoRec2/RecTree) makes it recurse
without bound at code-generation time and crash (stack overflow). That is a
separate, pre-existing generator limitation, out of scope for this ticket —
tracked separately in THRIFT-6062. So the limit is exercised with a finite
chain of distinct
struct types (A1..A65) deep enough to cross it. The runtime depth counter only
increments per struct level (not per container), so this finite chain is the
representable way to reach the limit through generated code.
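
To illustrate why only struct nesting can reach the limit, here is a small Python model of a counter that increments per struct level and ignores containers. It is a sketch of the behaviour stated above, not of the Smalltalk runtime; `Struct` and `max_struct_depth` are hypothetical names, and Python lists stand in for Thrift containers.

```python
from dataclasses import dataclass, field


@dataclass
class Struct:
    """Toy struct: just a list of field values."""
    fields: list = field(default_factory=list)


def max_struct_depth(value, depth=0):
    """Return the deepest struct nesting reached inside `value`.

    Only Struct levels add to the depth; lists (standing in for
    containers) are traversed without incrementing it.
    """
    if isinstance(value, Struct):
        depth += 1
        return max([depth] + [max_struct_depth(f, depth) for f in value.fields])
    if isinstance(value, list):
        return max([depth] + [max_struct_depth(item, depth) for item in value])
    return depth  # scalar field: contributes nothing


def struct_chain(levels):
    """Build `levels` structs nested one inside the other."""
    node = Struct()
    for _ in range(levels - 1):
        node = Struct(fields=[node])
    return node
```

Under this model a struct wrapped in 100 nested lists still has depth 1, while `struct_chain(65)` has depth 65, which is why the test needs 65 distinct struct types rather than deeply nested containers.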

Validation

Smalltalk is not in the CI matrix (build/docker/README.md), so this was
validated locally with GNU Smalltalk 3.2.5 (the suite also files into Squeak,
the documented target; gst needs a couple of trivial Collection>>sum-style
compat shims that Squeak does not):

  • with this PR: all four round-trip tests pass.
  • against master: the two over-limit cases stop passing — the over-limit
    write serializes with no error (the bound is absent) and the over-limit
    read is not cleanly rejected — confirming the gap this change closes.

Note on exception/union coverage

The runtime guard reaches struct, union and exception alike: field dispatch routes is_struct() || is_xception() (a union is a t_struct, so this covers unions too) through the same guarded struct_reader/struct_writer. The committed test exercises a finite struct chain only, because a recursive type cannot be generated in Smalltalk at all: the generator inline-expands struct serialization with no base case, so any recursive struct/union/exception overflows the stack of the Thrift compiler (itself a C++ program) at code-generation time (filed as THRIFT-6062). A recursive-exception round-trip is therefore not expressible here; exception coverage rests on the shared guarded reader/writer path described above.
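
The generator limitation can be pictured with a toy inline expander. This Python sketch is not the Thrift generator: `expand_inline`, the schema dictionaries and the `budget` cap are all hypothetical, and the cap exists only so that the demonstration stops instead of overflowing the stack the way the real generator is reported to. The two-entry schema is a simplified stand-in for a mutually recursive pair such as CoRec/CoRec2.

```python
def expand_inline(type_name, schema, budget=200):
    """Emit one line per struct, inlining every struct-typed field.

    `schema` maps a struct name to the list of its field types.
    `budget` is an artificial nesting cap added for this sketch;
    an expander with no base case would simply keep recursing.
    """
    if budget == 0:
        raise RecursionError("inline expansion did not terminate")
    lines = [f"write {type_name}"]
    for field_type in schema.get(type_name, []):
        if field_type in schema:  # struct-typed field: expand in place
            lines.extend(expand_inline(field_type, schema, budget - 1))
    return lines


# Finite chain A1 -> A2 -> ... -> A65, where A65 holds only a scalar.
finite_chain = {f"A{i}": [f"A{i + 1}"] for i in range(1, 65)}
finite_chain["A65"] = ["i32"]

# Simplified mutually recursive pair.
recursive_pair = {"CoRec": ["CoRec2"], "CoRec2": ["CoRec"]}
```

Expanding `A1` over `finite_chain` terminates with 65 emitted lines, whereas expanding `CoRec` over `recursive_pair` exhausts the budget, which mirrors why the committed test uses a finite chain of distinct types.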

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Jens-G requested review from fishy and mhlakhani as code owners May 28, 2026 11:47
mergeable bot added the compiler label May 28, 2026
Jens-G marked this pull request as draft May 28, 2026 22:45
Jens-G force-pushed the smalltalk-recursion-depth branch from 2db5638 to e4ec62a May 30, 2026 15:50
Jens-G marked this pull request as ready for review May 30, 2026 15:50
…rary

Client: st

Smalltalk struct serialization is emitted inline by the generator (the
struct_writer / struct_reader templates). These did not bound recursion depth,
so a deeply nested message was read or written without a limit.

Wrap the generated struct read/write bodies with incrementRecursionDepth ...
ensure: [decrementRecursionDepth] and add the counter to TProtocol (limit 64,
TProtocolError depthLimit on excess). While there, fix the struct reader, which
emitted "oprot readStructEnd" in a read context where it should be "iprot" -- a
latent bug that broke service struct reads independently of depth.

Replace the isolated counter test with a round-trip test driving the generated
DeepClient send/recv path (lib/st/test/TProtocolRecursionDepthTest.st +
RecursionDepthTest.thrift): a chain of distinct struct types at the limit
round-trips, while one level past it is rejected with the depth limit on both
write and read. A finite chain of distinct types is used rather than a genuinely
recursive type because the generator inline-expands nested struct serialization
and recurses without bound at code-generation time on a recursive type -- a
separate, pre-existing generator limitation, out of scope here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Jens-G force-pushed the smalltalk-recursion-depth branch from e4ec62a to cd6d305 May 30, 2026 15:54