Skip to content

Raise FileTypeError for WAVE_FORMAT_MPEGLAYER3 WAV files#105

Merged
snejus merged 8 commits intobeetbox:masterfrom
elainec2024:fix-mpeglayer3-wav
Apr 14, 2026
Merged

Raise FileTypeError for WAVE_FORMAT_MPEGLAYER3 WAV files#105
snejus merged 8 commits intobeetbox:masterfrom
elainec2024:fix-mpeglayer3-wav

Conversation

@elainec2024
Copy link
Copy Markdown
Contributor

When a WAV file contains an MP3 stream (wFormatTag=0x0055, WAVE_FORMAT_MPEGLAYER3), mutagen opens it as a 'WAVE' object but cannot read or write its tags correctly. This causes MediaFile to return the wrong metadata - incorrect duration, zero bitrate, and format reported as "WAVE" instead of "MP3."
This was reported in beetbox/beets#6455.

The fix was to detect the WAVE_FORMAT_MPEGLAYER3 subtype using mutagen's 'audio_format' attribute on the WAV info object and raise 'FileTypeError' with a clear message.

Adds a test fixture and regression test.

WAV files containing an MP3 stream (wFormatTag=0x0055) are silently
mishandled by mutagen, returning incorrect duration, bitrate, and
format metadata. Detect this subtype via the audio_format attribute
and raise FileTypeError with a clear message instead.

Adds a test fixture and regression test.

Ref: beetbox/beets#6455
@github-actions
Copy link
Copy Markdown

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@mrpapersonic
Copy link
Copy Markdown

It would be nice to blacklist other formats that beets can't work with. I could see mu-law being a problem (albeit rare) for example.

In addition to WAVE_FORMAT_MPEGLAYER3 (0x0055), also reject
WAVE_FORMAT_ADPCM (0x0002), WAVE_FORMAT_ALAW (0x0006), and
WAVE_FORMAT_MULAW (0x0007).
@elainec2024
Copy link
Copy Markdown
Contributor Author

Thank you for the suggestion! I've updated the PR to also blacklist WAVE_FORMAT_ADPCM (0x0002), WAVE_FORMAT_ALAW (0x0006), and WAVE_FORMAT_MULAW (0x0007), which all show similar issues with mutagen. I verified that IEEE float (0x0003) and WAVE_FORMAT_EXTENSIBLE (0xFFFE) work correctly and don't need to be blacklisted.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.02%. Comparing base (8952a73) to head (bad5096).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #105      +/-   ##
==========================================
+ Coverage   92.78%   93.02%   +0.24%     
==========================================
  Files          16       16              
  Lines         832      832              
  Branches      117      117              
==========================================
+ Hits          772      774       +2     
+ Misses         41       40       -1     
+ Partials       19       18       -1     

see 1 file with indirect coverage changes

Copy link
Copy Markdown
Member

@snejus snejus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks. Just fix the issues in CI checks

@elainec2024 elainec2024 requested a review from snejus April 13, 2026 19:59
Comment thread docs/changelog.rst Outdated
Comment thread test/test_mediafile_edge.py Outdated
@elainec2024 elainec2024 requested a review from snejus April 13, 2026 22:12
@snejus
Copy link
Copy Markdown
Member

snejus commented Apr 13, 2026

Ah another issue. Just adjust this line and it should be happy:

diff --git a/.github/workflows/lint.yaml b/.github/workflows/lint.yaml
index a4169b4..015c3d9 100644
--- a/.github/workflows/lint.yaml
+++ b/.github/workflows/lint.yaml
@@ -106,7 +106,7 @@ jobs:
           cache: poetry
 
       - name: Install dependencies
-        run: poetry install --only=lint
+        run: poetry install
 
       - name: Type check code
         uses: liskin/gh-problem-matcher-wrap@v3

Comment thread test/test_mediafile_edge.py Outdated
@snejus snejus merged commit 7dd6e8f into beetbox:master Apr 14, 2026
13 checks passed
snejus added a commit to beetbox/beets that referenced this pull request Apr 14, 2026
Fixes #6455 

WAV files can contain MP3 audio streams (WAVE_FORMAT_MPEGLAYER3,
wFormatTag=0x0055), but beets has no way to tag them correctly and would
silently import them with wrong duration, bitrate, and format metadata.

This PR fixes the issue by automatically detecting such files during
import and using ffmpeg to remux them into proper .mp3 files. The
original WAV is deleted after successful remuxing, and the MP3 is
imported normally with correct metadata.

This fix depends on beetbox/mediafile#105 which raises a clear
FileTypeError for these files, allowing beets to detect and handle them.
@kiefermat
Copy link
Copy Markdown
Contributor

Hi, it is a bit unfortunate that mediafile now throws an exception just because mutagen cannot correctly work with these file types. mediafile is a separate python module on pypi and I am using it for a project separately from beets. Maybe others are also using it like this.
Would it be possible to rather adapt beets to react correctly on these kind of files?

@jackwilsdon
Copy link
Copy Markdown
Member

mediafile already throws exceptions for unsupported files (i.e. files it cannot read tags from) - is there a use case you have for using mediafile on MP3-in-WAV files that isn't managing tags?

@kiefermat
Copy link
Copy Markdown
Contributor

In addition to tags I am using MediaFile to get some basic attributes of audio files, e.g. channels, samplerate, length, bitrate. E.g. MediaFile 0.16.0 would provide the following information for this wav file https://www.mmsp.ece.mcgill.ca/Documents/AudioFormats/WAVE/Samples/AFsp/M1F1-mulaw-AFsp.wav
channels: 2
bitrate: 128000
samplerate: 8000
length: 2.936625

MediaFile in version 0.16.1 throws an exception: "File type 'WAVE_FORMAT_MULAW' is not supported"

@snejus
Copy link
Copy Markdown
Member

snejus commented Apr 16, 2026

@elainec2024 does beets really need this functionality or could this logic be moved to beets by any chance?

@elainec2024
Copy link
Copy Markdown
Contributor Author

I just tested this approach locally by calling remux_mpeglayer3_wav in read_item (in beets.importer.tasks) before attempting to read metadata, and beets can handle the detection without depending on mediafile raising FileTypeError. All tests pass with this approach, but the tradeoff is that remux_mpeglayer3_wav now runs on every imported file rather than only on files that fail. However, the overhead is minimal because it just opens the file with mutagen and checks one attribute.

@snejus
Copy link
Copy Markdown
Member

snejus commented Apr 17, 2026

Sounds good. Would you mind submitting a PR for this? I will revert this PR once we merge that :)

@snejus
Copy link
Copy Markdown
Member

snejus commented Apr 17, 2026

Actually, maybe a better solution would be raise this error optionally, @elainec2024? Do not raise by default, but have beets provide an optional variable (into the constructor, I guess) to raise on invalid wavs.

snejus added a commit that referenced this pull request Apr 17, 2026
Following up on #105, this adds a `raise_on_unsupported_wav` parameter
to `MediaFile.__init__` (defaults to `False`) to make the
`FileTypeError` for non-PCM WAV formats optional rather than mandatory.

This allows other mediafile users to continue reading basic audio
properties (channels, samplerate, length, bitrate) from non-PCM WAV
files, while beets can opt in to the behavior by passing
`raise_on_unsupported_wav=True`.

Addresses the concern raised in #105.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants