fix gptq quantization condition#2416

Merged
IlyasMoutawwakil merged 2 commits intohuggingface:mainfrom
jiqing-feng:main
Apr 27, 2026

@jiqing-feng
Contributor

Same as huggingface/transformers#44588. The quantization only works for the original nn.Linear module. A subclass has a custom forward, so the quantized layer cannot handle it.
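The distinction described above can be sketched as follows. The diff for this PR is not shown in the conversation, so this is only an assumption about the general shape of such a check, not the merged code. `Linear` is a stand-in for `torch.nn.Linear` so that the snippet runs without PyTorch, and `CustomLinear`, `is_quantizable_loose`, and `is_quantizable_strict` are hypothetical names used for illustration.

```python
class Linear:
    """Stand-in for torch.nn.Linear (assumption: keeps the sketch dependency-free)."""

    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class CustomLinear(Linear):
    """Hypothetical subclass that overrides forward with its own logic."""

    def forward(self, x):
        return [2 * v for v in x]


def is_quantizable_loose(module):
    # isinstance() also matches subclasses, so CustomLinear passes this check.
    return isinstance(module, Linear)


def is_quantizable_strict(module):
    # An exact type comparison matches only the original class.
    return type(module) is Linear


plain = Linear(4, 4)
custom = CustomLinear(4, 4)

print(is_quantizable_loose(plain), is_quantizable_loose(custom))
print(is_quantizable_strict(plain), is_quantizable_strict(custom))
```

With an exact type comparison, only the plain module is selected, and any subclass is skipped regardless of what its forward does.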

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
@jiqing-feng
Contributor Author

jiqing-feng commented Mar 25, 2026

Hi @SunMarc. Please also review this PR. Thanks!

cc @Qubitium

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
@Qubitium
Contributor

@jiqing-feng @SunMarc Looks good to me for the short term.

Long term, since we lack information, we really don't know whether an object that inherits from nn.Linear but is not exactly nn.Linear is truly non-quantizable. For example, suppose a module subclasses nn.Linear and only wraps forward, and the code inside just moves tensors from disk to GPU before the forward pass and back to disk afterwards. It would be blacklisted by this logic even though it actually qualifies for quantization.
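A minimal sketch of the case described above, under stated assumptions: `Linear` is a pure-Python stand-in for `torch.nn.Linear`, `OffloadedLinear` is a hypothetical wrapper, and the disk-to-GPU moves are simulated by recording events rather than moving real tensors.

```python
class Linear:
    """Stand-in for torch.nn.Linear: y = W x, with W stored as a list of rows."""

    def __init__(self, weight):
        self.weight = weight

    def forward(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]


class OffloadedLinear(Linear):
    """Hypothetical subclass whose forward only adds load/offload bookkeeping."""

    def __init__(self, weight):
        super().__init__(weight)
        self.events = []

    def forward(self, x):
        self.events.append("load")  # stands in for moving tensors disk -> GPU
        try:
            return super().forward(x)  # the math is unchanged
        finally:
            self.events.append("offload")  # stands in for moving tensors back to disk


weight = [[1, 2], [3, 4]]
x = [1, 1]

plain = Linear(weight)
offloaded = OffloadedLinear(weight)

same_output = plain.forward(x) == offloaded.forward(x)
rejected_by_exact_type_check = type(offloaded) is not Linear

print(same_output, rejected_by_exact_type_check, offloaded.events)
```

Both modules compute the same result, yet an exact type comparison excludes the wrapper. The type alone does not reveal whether an overridden forward changes the computation.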

As I said in my comment on huggingface/transformers#44588 (comment), in the future we will need much more information to make a better decision.

With the limited information we have today, any decision we make will be incomplete and will target either too wide or too narrow a set of modules.

@jiqing-feng
Contributor Author

Hi @IlyasMoutawwakil. Would you please review this PR? Thanks!

Member

@SunMarc SunMarc left a comment

Thanks!

@SunMarc SunMarc requested a review from IlyasMoutawwakil April 2, 2026 14:25
@jiqing-feng
Contributor Author

Hi @SunMarc. Would you please help merge the PR? Thanks!

@IlyasMoutawwakil IlyasMoutawwakil merged commit 7a2c375 into huggingface:main Apr 27, 2026
4 participants