Fix garbled non-MIME-encoded UTF-8 headers #10078

Open
OctopusET wants to merge 1 commit into roundcube:master from OctopusET:fix-utf8-header

OctopusET commented Jan 27, 2026

decode_mime_string() blindly applies the body charset to non-MIME-encoded headers. If the header is raw UTF-8 but the body charset differs, the header gets corrupted.

Check for valid UTF-8 before falling back to the body charset, same approach as Thunderbird's convert8BitHeader(): https://searchfox.org/comm-central/source/mailnews/mime/jsmime/jsmime.mjs#675

RFC 6532 (2012) permits raw UTF-8 in email header fields; such messages are transported using the SMTPUTF8 extension (RFC 6531).
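
The check described above could look roughly like the following. This is a minimal sketch, not the actual patch: `decode_raw_header()` is a hypothetical helper name, the sample header is made up, and the real change lives inside `decode_mime_string()`.

```php
<?php
// Hypothetical helper illustrating the idea; not Roundcube's decode_mime_string().
function decode_raw_header(string $input, string $fallback): string
{
    // Pure 7-bit ASCII needs no charset conversion.
    if (!preg_match('/[\x80-\xFF]/', $input)) {
        return $input;
    }

    // 8-bit bytes that form valid UTF-8 are treated as raw UTF-8 (RFC 6532).
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }

    // Otherwise assume the header uses the fallback (body) charset.
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

// Assumed sample data: the same subject as raw UTF-8 and as Windows-1251 bytes.
$utf8   = 'Subject: Привет';
$cp1251 = mb_convert_encoding($utf8, 'Windows-1251', 'UTF-8');

var_dump(decode_raw_header($utf8, 'Windows-1251') === $utf8);
var_dump(decode_raw_header($cp1251, 'Windows-1251') === $utf8);
```

In this sketch the raw UTF-8 header passes through untouched even though the fallback charset is Windows-1251, while the genuinely Windows-1251 header is still converted.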

[Screenshots: before and after]

ju-ef added a commit to ju-ef/roundcubemail that referenced this pull request Feb 24, 2026
Messages with charset=windows-1251 (quoted-printable encoding) display
as garbled text in Roundcube. Other clients (Thunderbird, K-9, mutt)
display them correctly.

`get_message_part()` already converts body charset to UTF-8, but
`format_part_body()` converts it again using `$part->charset` which
still contains the original charset (e.g. windows-1251). This double
conversion produces garbled output.

Check `mb_check_encoding()` before converting — if body is already
valid UTF-8, skip conversion.

Tested with Roundcube 1.6.13 and the Stalwart mail server: windows-1251
quoted-printable messages now display correctly.

Similar UTF-8 detection approach used in roundcube#10078.
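
The double conversion described in this commit message can be reproduced in isolation. The sketch below is an illustration under assumptions: `to_utf8_once()` is a hypothetical stand-in for the guarded second step, not the code in `format_part_body()`, and the sample text is made up.

```php
<?php
// Hypothetical stand-in for the guarded conversion; not Roundcube's format_part_body().
function to_utf8_once(string $body, string $declared_charset): string
{
    // If the body is already valid UTF-8, assume an earlier step converted it.
    if (mb_check_encoding($body, 'UTF-8')) {
        return $body;
    }

    return mb_convert_encoding($body, 'UTF-8', $declared_charset);
}

// Assumed sample data: the body as sent, in Windows-1251.
$original = mb_convert_encoding('Привет', 'Windows-1251', 'UTF-8');

// First conversion (the role get_message_part() plays in the description above).
$first = mb_convert_encoding($original, 'UTF-8', 'Windows-1251');

// Unguarded second conversion: UTF-8 bytes are misread as Windows-1251.
$double = mb_convert_encoding($first, 'UTF-8', 'Windows-1251');

var_dump($double === 'Привет');                               // garbled, so not equal
var_dump(to_utf8_once($first, 'Windows-1251') === 'Привет');  // guard skips the conversion
```

The guard is a heuristic: a legacy-charset body whose bytes happen to form valid UTF-8 would be left unconverted, although this is uncommon for real text.
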

OctopusET force-pushed the fix-utf8-header branch 3 times, most recently from b35d2bb to 089e201 on February 27, 2026 15:16

OctopusET changed the title from "Fix UTF-8 header corruption when fallback charset differs" to "Fix garbled non-MIME-encoded UTF-8 headers" on Feb 27, 2026

OctopusET (Author) commented:

@alecpl I have cleaned up the PR description; the previous one was too verbose and sloppy. Would you review this PR? Thank you.

$default_charset = $fallback ?: self::get_charset();

// RFC 6532: detect raw UTF-8 in headers to avoid wrong charset conversion
if ($fallback !== false && mb_check_encoding($input, 'UTF-8') && preg_match('/[\x80-\xFF]/', $input)) {

Member commented:

You should not assume RCUBE_CHARSET is always UTF-8.
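
One way to read this concern: once raw UTF-8 is detected, the header still has to be converted to whatever charset the application uses internally, rather than returned as-is. A minimal sketch follows, assuming a hypothetical helper `decode_raw_header_to()` whose `$target` parameter stands in for the internal charset; it is not the code in the PR.

```php
<?php
// Hypothetical helper; $target stands in for the application's internal charset.
function decode_raw_header_to(string $input, string $fallback, string $target): string
{
    // Treat the header as UTF-8 only if it has 8-bit bytes that form valid UTF-8.
    $has_8bit = preg_match('/[\x80-\xFF]/', $input) === 1;
    $source   = ($has_8bit && mb_check_encoding($input, 'UTF-8')) ? 'UTF-8' : $fallback;

    // Nothing to do when source and target charsets are the same.
    if (strcasecmp($source, $target) === 0) {
        return $input;
    }

    return mb_convert_encoding($input, $target, $source);
}

// Raw UTF-8 header, non-UTF-8 target: the result is converted, not passed through.
$latin1 = decode_raw_header_to('Grüße', 'ISO-8859-2', 'ISO-8859-1');
var_dump($latin1 === "Gr\xFC\xDFe");
```

Passing the target charset explicitly keeps the UTF-8 detection separate from the assumption about the output encoding.
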

Author replied:

Then should we change to

$default_charset = 'UTF-8';

instead?

Author replied:

I have rebased and applied that part.

Author replied:

I had to force-push for that; sorry for any inconvenience.
