Harden filamesh parser and LinearImage against malformed inputs#9905

Open
sharadboni wants to merge 3 commits into google:main from sharadboni:fix-filamesh-linearimage-overflow

Conversation

@sharadboni sharadboni commented Apr 15, 2026

Summary

Hardens two attacker-reachable parsers against malformed input while fixing three latent correctness bugs that exist in main today.

libs/filameshio — MeshReader

  • New loadMeshFromBuffer overload that accepts dataSize and bounds-checks every read against the buffer end (magic, header, vertexSize, indexSize, parts array, materialCount, each material nameLength). The existing overloads are kept as thin wrappers forwarding with dataSize = SIZE_MAX, so external callers compile unchanged.
  • loadMeshFromFile forwards the known file size through the new bounds-validated path (plus the pre-existing fd < 0 guard).
  • Alignment / strict-aliasing fix: Header, CompressionHeader, and each Part are now read via memcpy into a local instead of cast through the caller-supplied buffer, which may not be naturally aligned for uint32_t.
  • materialCount / nameLength read fix: the original code wrote uint32_t x = (uint32_t) *p; — this dereferences a single byte and then advances by sizeof(uint32_t), silently truncating any value ≥ 256 and breaking on big-endian hosts. Replaced with memcpy so all four bytes emitted by the writer (tools/filamesh/src/MeshWriter.cpp:253-255) are read.
  • nameLength + 1 overflow fix: promote to size_t before + 1, otherwise nameLength == UINT32_MAX wraps to zero and the bounds check is bypassed.
  • Header version check: reject unknown versions before interpreting fields whose layout could differ.
  • Integer-overflow checks on compressed buffer sizing (indexSize * indexCount, vertexSize * vertexCount) and on header.parts * sizeof(Part) before pointer arithmetic.
  • malloc return checks on the two compressed-decode paths; the scratch buffer is free'd if decoding fails instead of leaking.
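
The bounds-checked, memcpy-based reads described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Cursor`, `readPod`, and `DemoHeader` are hypothetical names, and `DemoHeader` is not the real filamesh Header layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical cursor over an untrusted byte buffer.
struct Cursor {
    const uint8_t* base;  // start of the buffer (any alignment)
    size_t size;          // total bytes available
    size_t consumed;      // bytes already read; invariant: consumed <= size
};

// Copies one trivially copyable T out of the buffer.
// Returns false and leaves the cursor unchanged if fewer than sizeof(T)
// bytes remain. memcpy avoids the alignment and aliasing problems of
// casting the buffer pointer to T*.
template <typename T>
bool readPod(Cursor& c, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy needs a trivially copyable type");
    if (sizeof(T) > c.size - c.consumed) {
        return false;
    }
    std::memcpy(&out, c.base + c.consumed, sizeof(T));
    c.consumed += sizeof(T);
    return true;
}

// Hypothetical two-field header used only for this illustration.
struct DemoHeader {
    uint32_t version;
    uint32_t parts;
};
```

Writing the check as `sizeof(T) > size - consumed` rather than `consumed + sizeof(T) > size` keeps the comparison free of additions that could wrap.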

libs/image — LinearImage

  • LinearImage(width, height, channels) previously computed uint32_t nfloats = width * height * channels;, which silently wraps modulo 2^32 (65536 × 65536 × 3 becomes 0) and produces an undersized allocation followed by out-of-bounds memset / pixel writes.
  • Now saturates in two stages (width * height fits in uint64_t; the multiply by channels is checked against UINT64_MAX / channels), then validates against SIZE_MAX / sizeof(float) before allocation. Throws std::runtime_error on overflow, matching the convention already used in libs/imageio (ImageDecoder.cpp, ImageEncoder.cpp).
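
A rough sketch of the staged size computation described above, next to the 32-bit product it replaces. The function names `floatCount32` and `checkedFloatCount` are hypothetical and the error messages are placeholders; the actual constructor may be structured differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// The old shape: a 32-bit product, which wraps modulo 2^32.
inline uint32_t floatCount32(uint32_t width, uint32_t height, uint32_t channels) {
    return width * height * channels;
}

// Staged computation: every multiply is proven safe before it runs.
inline size_t checkedFloatCount(uint32_t width, uint32_t height, uint32_t channels) {
    // Stage 1: the product of two 32-bit values always fits in 64 bits.
    const uint64_t pixels = uint64_t(width) * uint64_t(height);

    // Stage 2: check the multiply by channels against UINT64_MAX / channels.
    if (channels != 0 && pixels > std::numeric_limits<uint64_t>::max() / channels) {
        throw std::runtime_error("image dimensions overflow");
    }
    const uint64_t nfloats = pixels * channels;

    // Stage 3: the byte count nfloats * sizeof(float) must fit in size_t.
    if (nfloats > std::numeric_limits<size_t>::max() / sizeof(float)) {
        throw std::runtime_error("image too large to allocate");
    }
    return size_t(nfloats);
}
```

Under this sketch, `floatCount32(65536, 65536, 3)` is 0. Note that on a 64-bit target 3 · 2^32 floats is still below SIZE_MAX / sizeof(float), so these three checks alone would accept that input and leave the failure to the allocator; whether the PR's constructor rejects it earlier depends on code not shown in this thread.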

Test plan

  • test_filamesh (NonInterleaved, Interleaved) — updated to call the bounds-validated overload; should continue to pass.
  • Craft a filamesh buffer with parts = 0xFFFFFFFF and verify the overflow check rejects it.
  • Craft a filamesh buffer that claims materialCount = 300 to confirm that, with the memcpy fix, all 300 material entries are parsed (old code would silently read 300 as 0x2C = 44 on little-endian).
  • Craft a filamesh buffer with nameLength = 0xFFFFFFFF and verify the size_t-promoted bounds check rejects it.
  • Truncated buffers (one per payload field, each ending one byte short of that field): each should return an empty Mesh instead of reading past the end.
  • LinearImage(65536, 65536, 3) throws std::runtime_error.
  • Existing .filamesh samples load unchanged via the backward-compat overload.
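
The materialCount = 300 case in the plan above comes down to the difference between these two reads. This is an illustrative sketch with hypothetical helper names, not code from MeshReader.cpp.

```cpp
#include <cstdint>
#include <cstring>

// Mirrors the old pattern: dereference one byte, then widen it to uint32_t.
// Only the first byte of the 4-byte field is ever looked at.
inline uint32_t readFirstByteOnly(const uint8_t* p) {
    return (uint32_t) *p;
}

// memcpy-based read: copies all four bytes in host byte order and works
// for any alignment of p.
inline uint32_t readAllFourBytes(const uint8_t* p) {
    uint32_t value;
    std::memcpy(&value, p, sizeof(value));
    return value;
}
```

With the little-endian encoding of 300 (bytes 0x2C 0x01 0x00 0x00), the first function returns 0x2C = 44, while the second returns 300 on a little-endian host.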

…and LinearImage

MeshReader (libs/filameshio/src/MeshReader.cpp):
- loadMeshFromFile: check open() return value to prevent passing fd=-1 to
  fileSize/read, which causes undefined behavior (SEGV on lseek with
  invalid fd).
- loadMeshFromBuffer: validate header->parts against SIZE_MAX/sizeof(Part)
  to prevent integer overflow in pointer arithmetic. Validate combined
  payload sizes (vertexSize + indexSize + parts*sizeof(Part)) do not
  overflow. Add sanity check on materialCount. Validate nameLength to
  prevent unbounded reads from attacker-controlled material name lengths.
- Compressed index/vertex buffer paths: check indexSize*indexCount and
  vertexSize*vertexCount for size_t overflow before malloc, preventing
  undersized allocation and subsequent heap buffer overflow.

LinearImage (libs/image/src/LinearImage.cpp):
- Use uint64_t for width*height*channels multiplication instead of
  uint32_t to prevent silent integer overflow (e.g. 65536*65536*3 wraps
  to 0 in uint32_t). Validate against SIZE_MAX/sizeof(float) before
  allocation to prevent undersized buffer and heap corruption.
@sharadboni
Author

@poweifeng Could you review this security fix? It adds bounds validation to the filamesh parser, fixes a uint32 overflow in LinearImage dimensions, and adds overflow checks in compressed mesh decompression.

@romainguy
Contributor

See #9910 for MeshReader fixes that conflict with these. The other PR has better validation of the data since it checks there are no reads outside of the bounds of the file.

MeshReader:
- Add a new loadMeshFromBuffer overload that takes a dataSize and
  validates every read against the buffer end. The old overload is kept
  as a backward-compatible thin wrapper (dataSize = SIZE_MAX).
- loadMeshFromFile forwards the known file size through the new path.
- Use memcpy to read Header, CompressionHeader and Part instances
  instead of casting into the buffer; avoids strict-aliasing and
  alignment UB on unaligned caller buffers.
- Fix the materialCount / nameLength reads: the original code did
  "(uint32_t) *p" which dereferences a single byte, then advanced by
  sizeof(uint32_t). Replaced with memcpy so the full 4-byte value is
  read (the writer always emits 4 bytes).
- Promote nameLength to size_t before adding 1 so that nameLength ==
  UINT32_MAX cannot wrap to zero and bypass the material-name bounds
  check.
- Reject unknown header.version values before interpreting fields.
- Check malloc returns on the compressed index/vertex paths and free
  the scratch buffer when decoding fails.

LinearImage:
- Throw std::runtime_error (matching libs/imageio) instead of
  std::overflow_error.
- Saturate the width * height * channels computation in two stages so
  it cannot overflow uint64_t before the SIZE_MAX check.

test_filamesh: exercise the bounds-validated overload.
@sharadboni sharadboni changed the title from "Fix buffer overflow and integer overflow in filamesh and LinearImage" to "Harden filamesh parser and LinearImage against malformed inputs" Apr 17, 2026
@sharadboni
Author

Thanks @romainguy — I took a look at #9910 and agree the dataSize approach is the right foundation. I have updated this PR to adopt a bounds-validated loadMeshFromBuffer(data, dataSize, ...) overload (old overloads kept as backward-compatible wrappers so external callers continue to compile).

I also took the opportunity to address three latent issues that neither PR was fixing, which I think make this version strictly stronger than #9910:

  1. (uint32_t) *p reads only one byte. Both the existing code and #9910 (Validate filamesh buffer bounds in MeshReader) keep lines like uint32_t materialCount = (uint32_t) *p; which dereference a single uint8_t and then advance by sizeof(uint32_t). On little-endian this silently truncates any value ≥ 256; with a big-endian layout the first byte is the most significant one, so it reads 0 for every value below 2^24. The writer at tools/filamesh/src/MeshWriter.cpp:253-255 emits a full 4-byte uint32_t, so this is a latent correctness bug for materialCount > 255 and for any filamesh file on a big-endian reader. Fixed via memcpy.

  2. Header / Part / CompressionHeader casts are strict-aliasing + alignment UB on unaligned caller buffers (e.g. JS bindings, mmap + offset). Replaced with memcpy into a local.

  3. nameLength + 1u can wrap to zero. In #9910 (Validate filamesh buffer bounds in MeshReader) the bounds check is if (size_t(end - p) < nameLength + 1u). Because nameLength is uint32_t, the + 1u is uint32_t arithmetic, so nameLength == UINT32_MAX wraps to 0 and the check is vacuous. Fixed here by promoting to size_t first.

Plus: a header.version check, malloc null-return checks, and the LinearImage fix (which #9910 does not touch).

If you would prefer I land the LinearImage part as a separate PR or drop something in scope, happy to split — just let me know.
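
To make item 3 concrete, here is a small sketch of the two check shapes. `remaining` stands in for size_t(end - p) and is fixed at 64 bits for the illustration; both function names are hypothetical.

```cpp
#include <cstdint>

// Shape of the check quoted in item 3: the "+ 1u" is evaluated in 32-bit
// unsigned arithmetic, so UINT32_MAX + 1u becomes 0 before the comparison.
inline bool rejectsWithWrappingAdd(uint64_t remaining, uint32_t nameLength) {
    return remaining < nameLength + 1u;
}

// Widening first makes the addition happen in 64 bits, where it cannot wrap
// for any 32-bit nameLength.
inline bool rejectsWithWidenedAdd(uint64_t remaining, uint32_t nameLength) {
    return remaining < uint64_t(nameLength) + 1u;
}
```

For remaining = 16 and nameLength = UINT32_MAX, the first check accepts the input (16 < 0 is false) while the second rejects it. The widened form only helps when the wider type really is wider than 32 bits.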

Comment thread libs/filameshio/src/MeshReader.cpp Outdated

// Promote to size_t before adding 1 so that nameLength == UINT32_MAX
// cannot wrap to 0 and bypass the bounds check.
const size_t nameSpan = size_t(nameLength) + 1;
Contributor

On 32-bit platforms, size_t is 32-bit. So this will overflow to 0 if nameLength is UINT32_MAX.

Author

Good catch. The size_t cast was a no-op on 32-bit targets, so UINT32_MAX + 1 still wraps there. Switched to if (nameLength >= dataSize - consumed) return {}; before doing the + 1, so the add only ever runs on a bounded value.
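
A sketch of why the subtraction form holds up when size_t is 32 bits wide. `size32_t` is a hypothetical stand-in for size_t on a 32-bit target so the behaviour can be reproduced on any host; the function names are illustrative only.

```cpp
#include <cstdint>

// Hypothetical stand-in for size_t on a 32-bit target.
using size32_t = uint32_t;

// Addition form: the cast to a 32-bit size type is a no-op, so the "+ 1"
// still wraps to 0 when nameLength == UINT32_MAX.
inline bool fitsAdditionForm(size32_t dataSize, size32_t consumed, uint32_t nameLength) {
    const size32_t nameSpan = size32_t(nameLength) + 1;
    return nameSpan <= size32_t(dataSize - consumed);
}

// Subtraction form: compare against the bytes that remain. No addition is
// performed on the untrusted value, so nothing can wrap.
inline bool fitsSubtractionForm(size32_t dataSize, size32_t consumed, uint32_t nameLength) {
    if (consumed > dataSize) {
        return false;  // guards the subtraction below
    }
    // Need nameLength + 1 bytes (name plus terminator), i.e. nameLength < remaining.
    return nameLength < dataSize - consumed;
}
```

With dataSize = 64 and consumed = 16, the addition form accepts nameLength = UINT32_MAX, while the subtraction form rejects it and still accepts the largest legitimate value, 47.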

Comment thread libs/filameshio/src/MeshReader.cpp Outdated
utils::slog.e << "Invalid material name length." << utils::io::endl;
return {};
}
partsMaterial[i] = (const char*) (base + consumed);
Contributor

You just validated nameSpan to ensure it fits in the buffer and then used this assignment that ignores the value. This will call strlen, which will incur a security hole if the string is not nul-terminated.

Author

Right, the length check was pointless if the copy itself falls back to strlen. Changed to partsMaterial[i].assign((const char*)(base + consumed), nameLength) so it uses the validated length directly and never walks looking for a NUL.
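
A minimal sketch of the length-bounded copy mentioned in the reply. `readName` is a hypothetical helper, and it assumes the caller has already validated that nameLength bytes are available at base + consumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Copies exactly nameLength bytes starting at base + consumed.
// std::string::assign(const char*, size_t) never scans for a NUL, unlike
// constructing or assigning a std::string from a bare const char*.
inline std::string readName(const uint8_t* base, size_t consumed, size_t nameLength) {
    std::string name;
    name.assign(reinterpret_cast<const char*>(base + consumed), nameLength);
    return name;
}
```

For a buffer holding the bytes `x r e d ! !` with no terminator, `readName(buf, 1, 3)` yields "red" without reading past the validated span.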

Comment thread libs/filameshio/src/MeshReader.cpp Outdated
size_t indexCount = header.indexCount;
if (indexCount > 0 && indexSize > SIZE_MAX / indexCount) {
utils::slog.e << "Index buffer size overflow." << utils::io::endl;
return {};
Contributor

What happens for the created mesh.indexBuffer if we return here?

Author

It leaks. Same issue applies to the malloc/decode failure paths below it and to the vertex-buffer block further down. Fix was to move both compressed-size overflow checks above the IndexBuffer::Builder / VertexBuffer::Builder calls (so those rejections return before anything is built), and to call engine->destroy(mesh.indexBuffer) / engine->destroy(mesh.vertexBuffer) on the malloc and decode failure paths. Every early return from loadMeshFromBuffer should be leak-free now.
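
One way to keep every early return leak-free for the malloc'd scratch buffer is to let an owning wrapper free it. The sketch below is an illustration under assumptions, not the PR's code: `decodeIntoScratch`, `DecodeFn`, and `DecodeResult` are hypothetical, and the decoder is passed in as a plain callback.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <memory>

// Deleter so that std::unique_ptr releases malloc'd memory with free().
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};
using ScratchBuffer = std::unique_ptr<void, FreeDeleter>;

// Hypothetical decoder callback: fills dst and reports success.
using DecodeFn = bool (*)(void* dst, size_t dstSize);

enum class DecodeResult { Ok, SizeOverflow, OutOfMemory, DecodeFailed };

inline DecodeResult decodeIntoScratch(size_t elementSize, size_t elementCount,
                                      DecodeFn decode) {
    // Overflow check first, before anything has been allocated or built.
    if (elementCount > 0 &&
        elementSize > std::numeric_limits<size_t>::max() / elementCount) {
        return DecodeResult::SizeOverflow;
    }
    const size_t bytes = elementSize * elementCount;
    if (bytes == 0) {
        return DecodeResult::Ok;  // nothing to decode; avoids malloc(0) ambiguity
    }
    ScratchBuffer scratch(std::malloc(bytes));
    if (!scratch) {
        return DecodeResult::OutOfMemory;
    }
    if (!decode(scratch.get(), bytes)) {
        return DecodeResult::DecodeFailed;  // scratch is freed by its deleter
    }
    return DecodeResult::Ok;  // scratch is freed here as well
}
```

The same ordering applies to engine resources: reject bad sizes before the builders run, and pair anything already built with a cleanup step on the remaining failure paths, which the reply above does by calling engine->destroy explicitly.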

- Reformulate the material-name bounds check as
  "nameLength >= dataSize - consumed" so the + 1 cannot wrap when
  size_t is 32-bit.
- Use std::string::assign(ptr, length) instead of implicit
  construction from a C-string pointer, so reading the material name
  does not fall back to strlen if the buffer is not NUL-terminated at
  the expected offset.
- Hoist the compressed index/vertex size-overflow checks above the
  IndexBuffer/VertexBuffer builders so those rejections return before
  any engine resources are allocated. For malloc and decode failures,
  destroy the already-built buffers before returning so no engine
  resource leaks on the error paths.
@sharadboni sharadboni requested a review from z3moon April 22, 2026 17:15
4 participants