Skip to content

fix: support nested generic types in class diagram definitions#7610

Open
M-a-c wants to merge 2 commits intomermaid-js:developfrom
M-a-c:class-nested-generics
Open

fix: support nested generic types in class diagram definitions#7610
M-a-c wants to merge 2 commits intomermaid-js:developfrom
M-a-c:class-nested-generics

Conversation

@M-a-c
Copy link
Copy Markdown
Contributor

@M-a-c M-a-c commented Apr 12, 2026

The jison lexer for class diagrams could not handle nested generic types like List~List~Person~~ because each ~ inside the generic state immediately exited it. This adds depth tracking to the lexer so nested tildes are correctly captured as a single GENERICTYPE token, and fixes splitClassNameAndType to extract content between the first and last ~.

Resolves #7480

📑 Summary

The class diagram parser failed on nested generic types (e.g. class People~List~Person~~) because the jison lexer's <generic> state exited at every ~ with no concept of nesting depth. This PR fixes the lexer to track depth and accumulate nested content, and updates splitClassNameAndType to correctly extract the full generic type string.

📏 Design Decisions

Lexer (classDiagram.jison): Replaced the three simple generic-state rules with depth-tracking logic. When a ~ is encountered inside the generic state, the lexer checks the next character — if it's a word character (\w), the tilde opens a nested generic (depth++); otherwise it closes one (depth--). Content accumulates in a buffer and GENERICTYPE is only returned when depth reaches 0.

splitClassNameAndType (classDb.ts): Changed from id.split('~')[1] (which broke on multiple tildes) to using indexOf/lastIndexOf to extract everything between the first and last ~.

📋 Tasks

  • 📖 have read the contribution guidelines
  • 💻 have added necessary unit/e2e tests.
  • 📓 have added documentation. Make sure MERMAID_RELEASE_VERSION is used for all new features.
  • 🦋 If your PR makes a change that should be noted in one or more packages' changelogs, generate a changeset by running pnpm changeset and following the prompts. Changesets that add features should be minor and those that fix bugs should be patch. Please prefix changeset messages with feat:, fix:, or chore:.

🤖 Co-authored by Claude Opus 4.6 (guided with care by @M-a-c)

Screenshot 2026-04-11 205447

The jison lexer for class diagrams could not handle nested generic types
like `List~List~Person~~` because each `~` inside the generic state
immediately exited it. This adds depth tracking to the lexer so nested
tildes are correctly captured as a single GENERICTYPE token, and fixes
splitClassNameAndType to extract content between the first and last `~`.

Resolves mermaid-js#7480

Co-Authored-By: Claude Opus 4.6 (guided with care by @M-a-c) <noreply@anthropic.com>
@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 12, 2026

🦋 Changeset detected

Latest commit: 0b97250

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
mermaid Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 12, 2026

Deploy Preview for mermaid-js ready!

Name Link
🔨 Latest commit 0b97250
🔍 Latest deploy log https://app.netlify.com/projects/mermaid-js/deploys/69dd2ae562cdef00086fbc74
😎 Deploy Preview https://deploy-preview-7610--mermaid-js.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Apr 12, 2026

Open in StackBlitz

@mermaid-js/examples

npm i https://pkg.pr.new/@mermaid-js/examples@7610

mermaid

npm i https://pkg.pr.new/mermaid@7610

@mermaid-js/layout-elk

npm i https://pkg.pr.new/@mermaid-js/layout-elk@7610

@mermaid-js/layout-tidy-tree

npm i https://pkg.pr.new/@mermaid-js/layout-tidy-tree@7610

@mermaid-js/mermaid-zenuml

npm i https://pkg.pr.new/@mermaid-js/mermaid-zenuml@7610

@mermaid-js/parser

npm i https://pkg.pr.new/@mermaid-js/parser@7610

@mermaid-js/tiny

npm i https://pkg.pr.new/@mermaid-js/tiny@7610

commit: 0b97250

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 3.33%. Comparing base (57e5f54) to head (0b97250).
⚠️ Report is 139 commits behind head on develop.

Files with missing lines Patch % Lines
packages/mermaid/src/diagrams/class/classDb.ts 0.00% 5 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           develop   #7610      +/-   ##
==========================================
- Coverage     3.33%   3.33%   -0.01%     
==========================================
  Files          536     536              
  Lines        56249   56250       +1     
  Branches       820     820              
==========================================
  Hits          1876    1876              
- Misses       54373   54374       +1     
Flag Coverage Δ
unit 3.33% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
packages/mermaid/src/diagrams/class/classDb.ts 0.00% <0.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@argos-ci
Copy link
Copy Markdown

argos-ci Bot commented Apr 12, 2026

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build Status Details Updated (UTC)
default (Inspect) ⚠️ Changes detected (Review) 1 changed Apr 13, 2026, 5:53 PM

Copy link
Copy Markdown
Collaborator

@ashishjain0512 ashishjain0512 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[sisyphus-bot]

Thanks for picking up #7480, @M-a-c! Nested generics in class diagrams have been a long-standing papercut, and the depth-tracking approach here is a clean way to solve it without rewriting the whole lexer. Really nice to see this come together.


What's working well

🎉 Focused, surgical fix. Two coordinated changes (lexer depth tracking + substring-based extraction in classDb.splitClassNameAndType) addressing the root cause from both ends. No collateral changes to unrelated code.

🎉 Good test coverage. The four new test cases hit the important shapes: simple nested (List~Person~), deeply nested (List~List~String~~), nested-with-relationships, and nested-with-class-body. The deeply-nested case is particularly valuable — it's exactly the kind of scenario that tends to regress.

🎉 Backward-compatible substring approach. Switching from id.split('~') to substring(firstTilde+1, lastTilde) in classDb.ts:78-90 preserves existing behavior for single-tilde generics while correctly handling nested ones. splitClassNameAndType is called from 7 sites — keeping the contract stable here is the right call.

🎉 Correctly scoped changeset. Patch bump matches the bug-fix nature of the change. The changeset description is clear about what's now supported.


Things to address

🟡 [important] — Empty-content behavior change

packages/mermaid/src/diagrams/class/parser/classDiagram.jison:106, 116-117

The old rule <generic>[^~]* was * (zero-or-more) and would emit an empty GENERICTYPE token for class Empty~~ (empty generic). The new rule <generic>[^~]+ is + (one-or-more), and the close branch explicitly skips emitting the token when _gContent.length === 0:

if (this._gContent.length > 0) return "GENERICTYPE";

This is a subtle behavior change that could affect any downstream grammar rule expecting a GENERICTYPE token to always appear between matched tildes. It would be great to add one of:

  1. A test case for class Empty~~ confirming the intended behavior (parses successfully? type is ''? type is undefined?), and
  2. A short comment explaining the choice (e.g., // Skip emitting empty GENERICTYPE — old behavior emitted "" but downstream consumers don't need it).

If this is an intentional change, the test plus comment makes it a deliberate decision; if it's accidental, this is the easiest place to catch it.


🟡 [important] — Heuristic edge cases worth documenting

packages/mermaid/src/diagrams/class/parser/classDiagram.jison:107-119

The \w lookahead (/\w/.test(nextCh)) is a clever heuristic: a word char after ~ means "open a deeper level," anything else means "close." It works for the canonical inputs in the tests, but a few inputs will misbehave:

  • Map~ Key~ (leading space inside generic) — next char after first inner ~ is , which is not \w, so the lexer treats it as a close. The generic ends at Key.
  • Map~?Key~ (non-word punctuation) — same issue.
  • Map~~Bar~~ — first inner ~ sees next char ~ (not \w) → close at depth 0, returning ''. The remaining Bar~~ then parses as a separate token sequence.

Most of these were probably broken (or weirdly-handled) before too, so this isn't a regression. But it would be worth a brief comment above the rule explaining the strategy and its assumption (generics contain identifier-like content with no leading whitespace), so future maintainers know why the lookahead exists. Something like:

// On '~', peek next char to decide:
//   word char  -> opening a nested generic, increment depth
//   anything else -> closing current depth (or end of generic if depth hits 0)
// Assumes generics contain identifier-like content; whitespace inside generics
// will close the type prematurely.

🟢 [nit] — Variable naming

packages/mermaid/src/diagrams/class/parser/classDiagram.jison:107, 119

this._gDepth and this._gContent are concise but a bit cryptic. _genericDepth / _genericContent would self-document and match the surrounding <generic> state name. Not blocking — just a small readability win.


🟢 [nit] — Lexer instance state isn't reset on close

packages/mermaid/src/diagrams/class/parser/classDiagram.jison:113-117

After popState(), this._gDepth and this._gContent retain their last values. The opening rule re-initializes them on the next <*>"~" match, so this is functionally fine within a single parse. Just calling it out — if you wanted to be tidy, resetting _gContent = '' on close would make the state machine easier to reason about. Optional.


💡 [suggestion] — Stronger assertion in the relationship test

packages/mermaid/src/diagrams/class/classDiagram.spec.ts:1247-1252

The "should handle nested generic class with relationships" test only verifies parser.parse(str) doesn't throw. Adding an assertion that the relationship was actually captured (e.g., expect(classDb.getRelations()).toHaveLength(1) or checking the from/to class names) would catch silent failures where parsing succeeds but the relationship doesn't get registered. Minor — the parse-doesn't-throw test still has value.


Security

No XSS or injection issues identified. The changes are confined to the lexer state machine and a substring extraction in the DB; no DOM sinks, no SVG output paths, no sanitization config touched. sanitizeText continues to be applied to both className and genericType in splitClassNameAndType.


Out of scope (FYI, no action needed)

The class diagram still uses JISON (no Langium grammar exists for it yet). When/if a Langium class grammar is built, the same nesting concern will need to be solved there — this fix is the right scope for now and shouldn't be expanded. Just flagging for institutional memory.


Excellent work tracking down the root cause and fixing it cleanly. Once the empty-generic behavior is decided (test + comment) and the heuristic comment is added, this should be in great shape to ship. Looking forward to seeing this land! 🚀

- Add comment block explaining the depth-tracking heuristic, its
  assumptions (identifier-like content), and the intentional empty
  generic behavior
- Add test for empty generics (`class Empty~~`) confirming it parses
  without a type parameter
- Strengthen relationship test with assertions on relation count and
  endpoint class names

Co-Authored-By: Claude Opus 4.6 (guided with care by @M-a-c) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix: Syntax error in class diagrams with nested generics and inline type definitions

2 participants