fix(svg): split text fallback for symbols and render color emoji glyphs #319

Draft
joeykchen wants to merge 3 commits into goplus:spx4.4.1 from joeykchen:fix/svg-symbol-emoji-fallback

@joeykchen

Background

The current SVG text rendering path does not provide a real fallback chain for identical font-family names, so mixed text in Scratch SVGs can route symbols, emoji, and CJK characters to the wrong fonts. In addition, color emoji glyphs embedded as SVG-in-OT are present in the font but cannot be rendered through the existing text path.

Changes

  • Split SVG text into base/default/symbols/emoji runs by character category
  • Bind internal fallback families to SPX Default, Symbols, and Emoji
  • Recognize emoji presentation controls such as variation selectors, ZWJ, and skin-tone modifiers to avoid misclassifying normal symbols as emoji
  • Expose glyph SVG access in plutovg
  • Render embedded SVG emoji glyphs as images when available, and align bounding box calculation accordingly
  • Ignore duplicate family registrations at the font registry layer to prevent later registrations from overwriting earlier ones

Impact

  • Symbols such as , arrows, and dingbats no longer fall through to the emoji font incorrectly
  • Color emoji can now render as expected inside SVG text
  • The Godot-side rendering behavior now matches the SPX-side packaged fallback fonts and family naming

Verification

  • Verified together with the matching spx branch using the mixed-text case Hello ❤️ 你好
  • Did not run a full Godot build or full test suite for this change

@joeykchen joeykchen marked this pull request as draft June 26, 2026 03:29

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for rendering embedded SVG color emojis and handling fallback fonts (such as CJK and symbols) in LunaSVG. The review feedback highlights a potential null pointer dereference when retrieving glyph SVG data, a grouping issue that prevents consecutive emojis from rendering correctly as color emojis, and missing CJK character ranges in the Supplementary Ideographic Plane for font fallback classification.

Comment on lines +289 to +297
int plutovg_font_face_get_glyph_svg(plutovg_font_face_t* face, plutovg_codepoint_t codepoint, const char** svg)
{
    if(svg)
        *svg = NULL;
    if(face == NULL || svg == NULL)
        return 0;
    glyph_t* glyph = plutovg_font_face_get_glyph(face, codepoint);
    return stbtt_GetGlyphSVG(&face->info, glyph->index, svg);
}

high

If plutovg_font_face_get_glyph returns NULL (e.g., when the codepoint is not supported by the font face), dereferencing glyph->index will cause a null pointer dereference and crash the application. A null check should be added for glyph.

int plutovg_font_face_get_glyph_svg(plutovg_font_face_t* face, plutovg_codepoint_t codepoint, const char** svg)
{
    if(svg)
        *svg = NULL;
    if(face == NULL || svg == NULL)
        return 0;
    glyph_t* glyph = plutovg_font_face_get_glyph(face, codepoint);
    if(glyph == NULL)
        return 0;
    return stbtt_GetGlyphSVG(&face->info, glyph->index, svg);
}

Comment on lines 350 to 354
  auto shouldStartNewFragment = characterPosition.x || characterPosition.y || dx || dy || angle || angle != lastAngle || runKind != currentRunKind;
  if(shouldStartNewFragment && didStartTextFragment) {
-     recordTextFragment(startOffset, textOffset);
+     recordTextFragment(startOffset, textOffset, currentRunKind);
      startOffset = textOffset;
  }

high

Because tryRenderColorEmojiGlyph and the bounding box calculation both require fragment.text.size() == 1 to render and measure color emojis, consecutive emoji characters (which share the same TextRunKind::Emoji) will be grouped into a single fragment of size > 1. This causes consecutive emojis to completely fail to render as color emojis, falling back to standard text rendering instead. To fix this, we should force a new fragment for every emoji character by checking if runKind == TextRunKind::Emoji.

Suggested change
- auto shouldStartNewFragment = characterPosition.x || characterPosition.y || dx || dy || angle || angle != lastAngle || runKind != currentRunKind;
- if(shouldStartNewFragment && didStartTextFragment) {
-     recordTextFragment(startOffset, textOffset);
-     recordTextFragment(startOffset, textOffset, currentRunKind);
-     startOffset = textOffset;
- }
+ auto shouldStartNewFragment = characterPosition.x || characterPosition.y || dx || dy || angle || angle != lastAngle || runKind != currentRunKind || runKind == TextRunKind::Emoji;
+ if(shouldStartNewFragment && didStartTextFragment) {
+     recordTextFragment(startOffset, textOffset, currentRunKind);
+     startOffset = textOffset;
+ }
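
To make the grouping rule concrete, here is a minimal standalone sketch of the boundary test, reduced to run kinds only. `RunKind` and `splitFragments` are hypothetical names for illustration, not code from this PR, and the positional checks (`dx`, `dy`, `angle`) are left out. Like the suggestion above, it gives every emoji codepoint its own fragment.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's TextRunKind.
enum class RunKind { Base, Symbols, Emoji };

// Returns half-open [start, end) index ranges, one per fragment.
// A boundary is placed where the run kind changes, and also before
// every emoji, so each emoji fragment holds exactly one element.
std::vector<std::pair<std::size_t, std::size_t>>
splitFragments(const std::vector<RunKind>& kinds)
{
    std::vector<std::pair<std::size_t, std::size_t>> fragments;
    std::size_t start = 0;
    for(std::size_t i = 1; i <= kinds.size(); ++i) {
        const bool atEnd = (i == kinds.size());
        if(atEnd || kinds[i] != kinds[i - 1] || kinds[i] == RunKind::Emoji) {
            fragments.emplace_back(start, i);
            start = i;
        }
    }
    return fragments;
}
```

With the input Base, Base, Emoji, Emoji, Base this yields four fragments, so both emoji satisfy a size-one check. One caveat under this simplification: multi-codepoint emoji sequences (ZWJ, skin-tone modifiers) would need their formatting codepoints kept with the base emoji, which this sketch does not attempt.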

Comment on lines +88 to +109
static bool isDefaultFallbackCodepoint(uint32_t codepoint)
{
    if(codepoint < 0x80 || isEmojiCodepoint(codepoint))
        return false;
    if(isWhitespaceCodepoint(codepoint))
        return false;

    return (codepoint >= 0x3000 && codepoint <= 0x303F) ||
        (codepoint >= 0x3040 && codepoint <= 0x30FF) ||
        (codepoint >= 0x3100 && codepoint <= 0x312F) ||
        (codepoint >= 0x31A0 && codepoint <= 0x31BF) ||
        (codepoint >= 0x31C0 && codepoint <= 0x31EF) ||
        (codepoint >= 0x3200 && codepoint <= 0x33FF) ||
        (codepoint >= 0x3400 && codepoint <= 0x4DBF) ||
        (codepoint >= 0x4E00 && codepoint <= 0x9FFF) ||
        (codepoint >= 0xA960 && codepoint <= 0xA97F) ||
        (codepoint >= 0xAC00 && codepoint <= 0xD7AF) ||
        (codepoint >= 0xD7B0 && codepoint <= 0xD7FF) ||
        (codepoint >= 0xF900 && codepoint <= 0xFAFF) ||
        (codepoint >= 0xFE30 && codepoint <= 0xFE6F) ||
        (codepoint >= 0xFF00 && codepoint <= 0xFFEF);
}

medium

The current isDefaultFallbackCodepoint implementation only checks for CJK characters in the Basic Multilingual Plane (BMP), up to 0xFFEF. However, many CJK characters lie above the BMP between 0x20000 and 0x323AF: CJK Unified Ideographs Extensions B through F and the Compatibility Ideographs Supplement in the Supplementary Ideographic Plane (SIP), and Extensions G and H in the Tertiary Ideographic Plane (TIP). Excluding these prevents rare or extended CJK characters from falling back to the default font family, leading to missing glyphs.

static bool isDefaultFallbackCodepoint(uint32_t codepoint)
{
    if(codepoint < 0x80 || isEmojiCodepoint(codepoint))
        return false;
    if(isWhitespaceCodepoint(codepoint))
        return false;

    return (codepoint >= 0x3000 && codepoint <= 0x303F) ||
        (codepoint >= 0x3040 && codepoint <= 0x30FF) ||
        (codepoint >= 0x3100 && codepoint <= 0x312F) ||
        (codepoint >= 0x31A0 && codepoint <= 0x31BF) ||
        (codepoint >= 0x31C0 && codepoint <= 0x31EF) ||
        (codepoint >= 0x3200 && codepoint <= 0x33FF) ||
        (codepoint >= 0x3400 && codepoint <= 0x4DBF) ||
        (codepoint >= 0x4E00 && codepoint <= 0x9FFF) ||
        (codepoint >= 0xA960 && codepoint <= 0xA97F) ||
        (codepoint >= 0xAC00 && codepoint <= 0xD7AF) ||
        (codepoint >= 0xD7B0 && codepoint <= 0xD7FF) ||
        (codepoint >= 0xF900 && codepoint <= 0xFAFF) ||
        (codepoint >= 0xFE30 && codepoint <= 0xFE6F) ||
        (codepoint >= 0xFF00 && codepoint <= 0xFFEF) ||
        (codepoint >= 0x20000 && codepoint <= 0x323AF);
}
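
As an alternative to the single wide range above, a table of the individual blocks avoids claiming the unassigned gaps inside 0x20000–0x323AF. This is only a sketch: `CodepointRange`, `kSupplementaryCjkRanges`, and `isSupplementaryCjkCodepoint` are hypothetical names, and the block boundaries are taken from the Unicode block list (through Unicode 15.1), not from this PR.

```cpp
#include <cstdint>

// Hypothetical helper type: an inclusive codepoint range.
struct CodepointRange {
    uint32_t first;
    uint32_t last;
};

// Supplementary CJK blocks per the Unicode block list (through 15.1).
static const CodepointRange kSupplementaryCjkRanges[] = {
    {0x20000, 0x2A6DF}, // CJK Unified Ideographs Extension B
    {0x2A700, 0x2B73F}, // Extension C
    {0x2B740, 0x2B81F}, // Extension D
    {0x2B820, 0x2CEAF}, // Extension E
    {0x2CEB0, 0x2EBEF}, // Extension F
    {0x2EBF0, 0x2EE5F}, // Extension I
    {0x2F800, 0x2FA1F}, // CJK Compatibility Ideographs Supplement
    {0x30000, 0x3134F}, // Extension G
    {0x31350, 0x323AF}, // Extension H
};

// True when the codepoint falls inside one of the blocks listed above.
static bool isSupplementaryCjkCodepoint(uint32_t codepoint)
{
    for(const auto& range : kSupplementaryCjkRanges) {
        if(codepoint >= range.first && codepoint <= range.last)
            return true;
    }
    return false;
}
```

The single-range form is shorter and is probably fine for font routing, since unassigned codepoints have no glyph in any font; the table mainly documents which blocks are intended.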

@fennoai fennoai Bot left a comment

Summary

This PR adds text-run splitting for symbol/emoji fallback fonts and introduces color emoji glyph rendering via embedded SVG. The architecture is sound and the feature is useful, but there are two correctness bugs and one per-frame performance issue that should be addressed before merge.

Critical

  • Null pointer dereference in plutovg_font_face_get_glyph_svg if malloc fails
  • Overlapping ranges between isSymbolsFallbackCodepoint and isEmojiCodepoint silently misroute a wide set of standard emoji (U+231A–U+27BF) to the Symbols font instead of the Emoji font

Performance

  • Document::loadFromData re-parses the full SVG glyph document on every render call; with no cache this is O(emoji_count × fps) parse operations per second

Minor

  • marksPreviousEmojiPresentation is an identity wrapper for isEmojiFormattingCodepoint; its inner-branch condition is always true, making the extra function actively misleading
  • isDefaultFallbackCodepoint name overpromises — it only covers BMP CJK; supplementary-plane CJK (U+20000–U+323AF) falls through silently
  • API doc for plutovg_font_face_get_glyph_svg does not mention pointer lifetime (svgData is an interior pointer into the font binary, invalid after the face is destroyed)

    if(face == NULL || svg == NULL)
        return 0;
    glyph_t* glyph = plutovg_font_face_get_glyph(face, codepoint);
    return stbtt_GetGlyphSVG(&face->info, glyph->index, svg);

Null pointer dereference: plutovg_font_face_get_glyph can return NULL if malloc or calloc fails inside the cache-fill path. glyph->index is then dereferenced unconditionally, causing a crash. The existing plutovg_font_face_get_glyph_metrics callers share this pattern (pre-existing), but adding a new public API is a good opportunity to fix it:

glyph_t* glyph = plutovg_font_face_get_glyph(face, codepoint);
if(glyph == NULL)
    return 0;
return stbtt_GetGlyphSVG(&face->info, glyph->index, svg);

    return (codepoint >= 0x2190 && codepoint <= 0x21FF) ||
        (codepoint >= 0x2300 && codepoint <= 0x23FF) ||
        (codepoint >= 0x2460 && codepoint <= 0x27BF) ||
        (codepoint >= 0x2900 && codepoint <= 0x2BFF);

Range overlap causes emoji misclassification: isSymbolsFallbackCodepoint includes 0x2300–0x23FF (Miscellaneous Technical) and 0x2460–0x27BF (Enclosed Alphanumerics through Dingbats). isEmojiCodepoint independently claims the overlapping range 0x231A–0x27BF (which includes U+231A WATCH ⌚, U+23F0 ALARM CLOCK ⏰, U+2702 BLACK SCISSORS ✂, etc.). Because isSymbolsFallbackCodepoint is tested first in classifyTextRunKind, all these emoji are routed to the Symbols font rather than the Emoji font, even without a VS-16 qualifier. The two functions' ranges need to be made mutually exclusive, or the check order needs to be inverted for ambiguous codepoints.
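
One way to read the "invert the check order" option is sketched below. It assumes a small predicate for codepoints that default to emoji presentation; `RunKind`, `isDefaultEmojiPresentation`, `isSymbolsRange`, and `classify` are hypothetical names, and the emoji list is a deliberately tiny illustrative subset, not the full Unicode emoji data. The symbol ranges are copied from the snippet above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the PR's TextRunKind.
enum class RunKind { Base, Symbols, Emoji };

// Illustrative subset of codepoints with default emoji presentation.
// A real implementation would use the full Emoji_Presentation data.
static bool isDefaultEmojiPresentation(uint32_t cp)
{
    return cp == 0x231A || cp == 0x231B      // WATCH, HOURGLASS
        || cp == 0x23F0                      // ALARM CLOCK
        || (cp >= 0x1F600 && cp <= 0x1F64F); // Emoticons block
}

// Symbol ranges as quoted from isSymbolsFallbackCodepoint.
static bool isSymbolsRange(uint32_t cp)
{
    return (cp >= 0x2190 && cp <= 0x21FF) ||
        (cp >= 0x2300 && cp <= 0x23FF) ||
        (cp >= 0x2460 && cp <= 0x27BF) ||
        (cp >= 0x2900 && cp <= 0x2BFF);
}

// Emoji is tested first, so overlapping codepoints such as U+231A
// no longer fall into the Symbols run.
static RunKind classify(uint32_t cp)
{
    if(isDefaultEmojiPresentation(cp))
        return RunKind::Emoji;
    if(isSymbolsRange(cp))
        return RunKind::Symbols;
    return RunKind::Base;
}
```

Under this ordering U+231A and U+23F0 classify as Emoji, while U+2702, which defaults to text presentation, stays in Symbols unless a following VS-16 promotes it elsewhere in the pipeline.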

    if(dstRect.isEmpty())
        return false;

    auto document = Document::loadFromData(svgData, static_cast<size_t>(svgLength));

SVG parsed on every render call — no caching: Document::loadFromData fully parses the embedded SVG glyph document on each invocation. For animated or dynamic SVG text with emoji rendered at 60 fps, this is 60 parse-and-allocate cycles per unique emoji per second. The svgData pointer returned by stbtt_GetGlyphSVG is stable (it points into the font binary), so a simple cache keyed on (face*, codepoint, pixel_size) → Bitmap would reduce this to a one-time cost per glyph.

At minimum, add a // TODO: cache rendered emoji bitmaps comment so the regression is visible.
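
A minimal sketch of such a cache follows, assuming the rendered result can be stored by value. `GlyphBitmapCache` and `getOrRender` are hypothetical names; the bitmap type is a template parameter because the actual bitmap class is not shown here, and eviction (for example when a face is destroyed) is not handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <tuple>

// Hypothetical cache keyed on (face address, codepoint, pixel size).
template<typename Bitmap>
class GlyphBitmapCache {
public:
    // Returns the cached bitmap, calling `render` only on a miss.
    const Bitmap& getOrRender(const void* face, uint32_t codepoint, int pixelSize,
                              const std::function<Bitmap()>& render)
    {
        const Key key{reinterpret_cast<std::uintptr_t>(face), codepoint, pixelSize};
        auto it = m_entries.find(key);
        if(it == m_entries.end())
            it = m_entries.emplace(key, render()).first;
        return it->second;
    }

    std::size_t size() const { return m_entries.size(); }

    // Entries for a destroyed face must be dropped by the owner;
    // this sketch only offers a full reset.
    void clear() { m_entries.clear(); }

private:
    using Key = std::tuple<std::uintptr_t, uint32_t, int>;
    std::map<Key, Bitmap> m_entries;
};
```

The render callback would wrap the existing parse-and-rasterize path, so a repeated draw of the same emoji at the same size skips Document::loadFromData entirely. Keying on the face address assumes faces outlive their cache entries.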


static bool marksPreviousEmojiPresentation(uint32_t codepoint)
{
    return isEmojiFormattingCodepoint(codepoint);

Redundant identity function with a misleading name: marksPreviousEmojiPresentation simply delegates to isEmojiFormattingCodepoint with no added logic. More importantly, it is called inside an if(isEmojiFormattingCodepoint(currentCharacter)) branch, so it is always true there — the inner condition marksPreviousEmojiPresentation(currentCharacter) is a tautology:

if(isEmojiFormattingCodepoint(currentCharacter)) {
    if(marksPreviousEmojiPresentation(currentCharacter) && ...)  // always true

This makes the code misleading: a reader expects the two predicates to differ. Remove the wrapper and directly inline the logic (or narrow marksPreviousEmojiPresentation to only VS-16 U+FE0F if that is the intended scope).
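
If narrowing is the intended scope, the two predicates could look roughly like this. The body of `isEmojiFormattingCodepoint` is an assumption based on the PR description (variation selectors, ZWJ, skin-tone modifiers); the real function is not shown in this thread.

```cpp
#include <cstdint>

// Assumed shape of the existing predicate, per the PR description:
// variation selectors, zero-width joiner, and skin-tone modifiers.
static bool isEmojiFormattingCodepoint(uint32_t codepoint)
{
    return codepoint == 0xFE0E || codepoint == 0xFE0F  // VS-15, VS-16
        || codepoint == 0x200D                         // ZERO WIDTH JOINER
        || (codepoint >= 0x1F3FB && codepoint <= 0x1F3FF); // skin tones
}

// Narrowed variant: only VS-16 requests emoji presentation for the
// preceding character, so the inner check is no longer a tautology.
static bool marksPreviousEmojiPresentation(uint32_t codepoint)
{
    return codepoint == 0xFE0F;
}
```

With this split, a ZWJ or skin-tone modifier still counts as emoji formatting but no longer promotes the previous character on its own.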

*
* @param face A pointer to a `plutovg_font_face_t` object.
* @param codepoint The Unicode code point of the glyph.
* @param svg Pointer that receives the embedded SVG document owned by the font face.

Missing lifetime constraint in API doc: The @param svg line says "owned by the font face" but the pointer is actually an interior pointer into the raw font binary (returned by stbtt_GetGlyphSVG). Callers must not free it, must not modify it, and it becomes invalid once face is destroyed. The current wording does not convey the raw-interior-pointer nature. Suggested addition:

The pointer points directly into the font face's internal data and is valid only for the lifetime of face. The caller must not free or modify it.
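
For callers that need the SVG data to outlive the face, copying it out is the simplest option. The helper below is a hypothetical sketch: `copyGlyphSvg` is not part of plutovg, and it assumes the `int` returned by `plutovg_font_face_get_glyph_svg` is the byte length of the document (0 when the glyph has no SVG), which matches how `svgLength` is used in the render path.

```cpp
#include <cstddef>
#include <string>

// Copies the interior pointer's contents into an owned string.
// `svg` and `length` are assumed to come from a call such as
// plutovg_font_face_get_glyph_svg(face, codepoint, &svg).
std::string copyGlyphSvg(const char* svg, int length)
{
    if(svg == nullptr || length <= 0)
        return std::string();
    return std::string(svg, static_cast<std::size_t>(length));
}
```

The copy costs one allocation per glyph, so it fits best behind a cache or in code paths that are not per-frame.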
