KAFKA-20072 Don't generate IDs with hyphens #21313

Merged
mimaison merged 3 commits into apache:trunk from wernerdv:KAFKA-20072 on Apr 30, 2026

Conversation

@wernerdv
Contributor

@wernerdv wernerdv commented Jan 15, 2026

The Uuid#randomUuid method has been modified to avoid generating UUIDs
containing hyphens.

@github-actions github-actions Bot added the triage (PRs from the community), clients, and small (Small PRs) labels Jan 15, 2026
@wernerdv
Contributor Author

@mumrah Hello, are the current minimal changes sufficient or do we need to optimize the randomUuid() method?

Member

@mumrah mumrah left a comment

@wernerdv yeah, this is pretty much what I was thinking. Thanks for the patch!

I'll leave this open for a couple of days to see if any committers have feedback.

  public static Uuid randomUuid() {
      Uuid uuid = unsafeRandomUuid();
-     while (RESERVED.contains(uuid) || uuid.toString().startsWith("-")) {
+     while (RESERVED.contains(uuid) || uuid.toString().contains("-")) {

Contributor

@squah-confluent squah-confluent Jan 16, 2026

Note that the probability of rejecting a generated uuid rises from 1.56% (= 1/64) to 30.5% (= 1 - (63/64)**19 * 15/16, taking into account fixed bits in version 4 uuids). The new p99 number of rejections is ~3.88 (was 1.11) which I think is not too bad.
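
The figures in this comment can be reproduced with a short sketch. It assumes the ID string is the 16 UUID bytes (most significant bits first) encoded as 22 URL-safe base64 characters without padding, and that IDs come from `java.util.UUID.randomUUID()`. The class and method names are hypothetical and are not part of Kafka. Under that layout, two of the 22 characters can never be "-", one character has a 1/16 chance because it ends in the fixed variant bits, and the remaining 19 have a 1/64 chance each.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper class for illustration; not part of Kafka.
final class HyphenRejectionMath {

    // Chance that the 22-character URL-safe base64 form of a random
    // version 4 UUID contains at least one '-' (base64 value 62, bits 111110).
    static double rejectionProbability() {
        // Characters 8 and 21 (0-based) can never be '-': one starts with the
        // fixed version bits 0100, the other ends in four zero padding bits.
        // Character 10 ends in the fixed variant bits "10", so only its four
        // random bits must match: 1/16. The other 19 characters are 1/64 each.
        return 1.0 - Math.pow(63.0 / 64.0, 19) * (15.0 / 16.0);
    }

    // Solves p^n = 0.01 for n: the point at which the chance of n or more
    // consecutive rejections drops to 1%.
    static double p99Rejections(double p) {
        return Math.log(0.01) / Math.log(p);
    }

    // Assumed encoding: 16 big-endian bytes, URL-safe base64, no padding.
    static String encode(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }

    // Empirical check: fraction of random UUIDs whose encoded form contains '-'.
    static double simulate(int samples) {
        int rejected = 0;
        for (int i = 0; i < samples; i++) {
            if (encode(UUID.randomUUID()).contains("-")) {
                rejected++;
            }
        }
        return (double) rejected / samples;
    }

    public static void main(String[] args) {
        double p = rejectionProbability();
        System.out.printf("analytic  p = %.4f%n", p);
        System.out.printf("simulated p = %.4f%n", simulate(100_000));
        System.out.printf("p99 rejections: new = %.2f, old = %.2f%n",
                p99Rejections(p), p99Rejections(1.0 / 64.0));
    }
}
```

The simulated fraction should land close to the analytic value, which gives a quick sanity check on the bit-layout reasoning.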

Member

Thanks for doing the math :)

Member

30% failure rate for a general utility class is really high and I don't think it makes sense. Uuids are occasionally used in cases where performance matters. Also, the reasoning for the change is unclear.

Member

@mumrah mumrah Jun 1, 2026

Thanks, @ijuma -- that's a very good point about performance! I did check through fetch/produce paths originally, but of course that doesn't protect us from future usages (or things not caught during the review).

I'll file a patch shortly that makes this new behavior opt-in.

As to the reasoning, the original Jira I wrote was about issues with double-clicking on IDs. Really it's just meant to be a small quality of life improvement.

Member

@mumrah mumrah Jun 1, 2026

Addressed in #22442

@github-actions github-actions Bot removed the triage (PRs from the community) label Jan 16, 2026
@mimaison
Member

We use Uuid on the client side for group member ids. If I understand correctly, hyphens don't affect the behavior: Uuids with hyphens will continue to work as before, and this is only a slight usability improvement. Should we still document it, for example in the protocol page, as a recommendation for third-party clients?

@mumrah
Member

mumrah commented Jan 20, 2026

@mimaison indeed this won't affect existing IDs or externally generated ones that "violate" the rule. I'm not sure if we document the leading underscore preference, but it seems like a reasonable inclusion (e.g., "clients should generate UUIDs without hyphens").

@mumrah
Member

mumrah commented Jan 27, 2026

@mimaison or @squah-confluent any further feedback here?

@mimaison
Member

I don't see an update to the docs in the changes.

@wernerdv
Contributor Author

@mumrah @mimaison I’ve updated protocol.md.
Could you let me know if these changes are sufficient?
Alternatively, do you think there might be a more suitable section in the documentation for this content?

@squah-confluent
Contributor

The code change is fine by me.

The current documentation change raises some questions. It would be nice to improve the wording.

  • When do clients generate UUIDs? There are member IDs but I believe that is a recommendation / convention that we use rather than a requirement.
    KIP-1082 says:

    "In the new version of the ConsumerGroupHeartbeat RPC, the client must generate a UUID as the member ID during the initial heartbeat. This member ID must be included in every subsequent request to ensure consistency. We highly recommend that users utilize a UUID as the member ID, but ultimately, the choice is up to the user."

    It's contradictory but it sounds like it's trying to say that member IDs can be any format but the recommendation is a base64-encoded UUID.

  • To the average developer, a UUID looks like "00000000-0000-0000-0000-000000000000". It's not clear that we are referring to the base64-encoded form, and specifically the URL-safe encoding (which uses "-" and "_") rather than the "default" encoding with "+" and "/".
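
To make the distinction between the three textual forms concrete, here is a minimal sketch that prints the canonical hex form, the standard base64 form, and the URL-safe base64 form of the same UUID. It assumes the 16 bytes are written most significant bits first and that padding is dropped. The class name, helper names, and sample UUID values are hypothetical and chosen only so that the differing characters are easy to spot.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical demo class for illustration; not part of Kafka.
final class UuidEncodingDemo {

    // Assumed byte layout: most significant 8 bytes first, big-endian.
    static byte[] toBytes(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return buf.array();
    }

    // "Default" base64 alphabet: values 62 and 63 map to '+' and '/'.
    static String standard(UUID uuid) {
        return Base64.getEncoder().withoutPadding().encodeToString(toBytes(uuid));
    }

    // URL-safe base64 alphabet: values 62 and 63 map to '-' and '_'.
    static String urlSafe(UUID uuid) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(toBytes(uuid));
    }

    public static void main(String[] args) {
        UUID[] samples = {
            UUID.fromString("00000000-0000-0000-0000-000000000000"),
            // Leading bits 111110 111111 encode to base64 values 62 and 63.
            UUID.fromString("fbf00000-0000-0000-0000-000000000000")
        };
        for (UUID uuid : samples) {
            System.out.println("canonical: " + uuid);
            System.out.println("standard:  " + standard(uuid));
            System.out.println("url-safe:  " + urlSafe(uuid));
        }
    }
}
```

Only the URL-safe form can contain a hyphen, while the canonical hex form always contains four, which is why documentation wording needs to say which form it means.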

@wernerdv
Contributor Author

@squah-confluent Thanks for your comment.
I have made the updates — please review them.

@github-actions

This PR is being marked as stale since it has not had any activity in 90 days. If you
would like to keep this PR alive, please leave a comment asking for a review. If the PR has
merge conflicts, update it with the latest from the base branch.

If you are having difficulty finding a reviewer, please reach out on the [mailing list](https://kafka.apache.org/contact).

If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.

@github-actions github-actions Bot added the stale (Stale PRs) label Apr 29, 2026
Member

@mimaison mimaison left a comment

LGTM

@mimaison mimaison merged commit 474798a into apache:trunk Apr 30, 2026
20 checks passed
@wernerdv wernerdv deleted the KAFKA-20072 branch April 30, 2026 12:27
5 participants