feat(metrics): add Perplexity metric to ignite.metrics.nlp #3743
steaphenai wants to merge 13 commits into pytorch:master from
Conversation
Expose a new token-level Perplexity metric in ignite.metrics.nlp and top-level ignite.metrics, with dedicated unit tests to validate correctness and behavior.
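For reference, token-level perplexity is the exponential of the average per-token negative log-likelihood over all evaluated tokens. A minimal standalone computation (a sketch independent of this PR's code; the shapes are illustrative) looks like:

```python
import torch
import torch.nn.functional as F

# Logits shaped (batch, vocab, seq) and integer targets shaped (batch, seq).
y_pred = torch.randn(2, 5, 3)
y = torch.randint(0, 5, (2, 3))

# PPL = exp(total negative log-likelihood / number of tokens).
total_nll = F.cross_entropy(y_pred, y, reduction="sum")
ppl = torch.exp(total_nll / y.numel())
```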
force-pushed from 8453d4e to fa394ca
Nice addition! Perplexity is definitely a useful metric for language modeling, and it fits well under `ignite.metrics.nlp`. The test coverage looks solid, especially the token-weighted accumulation test, which ensures correctness across batches with different sequence lengths. One small suggestion: it might be useful to add a GPU test to ensure the metric behaves correctly when tensors are on CUDA devices, since many language modeling workloads run on GPU. Something like:

```python
import pytest
import torch


def test_gpu_support():
    # Skip on machines without a CUDA device.
    if not torch.cuda.is_available():
        pytest.skip("CUDA is not available")
```

Overall the implementation and tests look clean and consistent with existing Ignite metrics.
Good point, thanks. I'd like to keep this PR scoped to the Perplexity implementation and core correctness tests. We can add a dedicated CUDA test if maintainers want explicit GPU coverage.
vfdev-5 left a comment
@steaphenai thanks for the PR, I made a quick pass and left a few comments.
The tests look shallow, and there is no reference implementation that we test against.
I suggest checking what we can use as a reference implementation.
For testing on accelerators, check other tests like test_accuracy.py for inspiration.
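One option for such a reference check (a sketch, not code from this PR; it assumes the `(y_pred, y)` update format shown in the test plan below and that `compute()` returns a Python float) is to compare the metric against a perplexity derived directly from `torch.nn.functional.cross_entropy`:

```python
import pytest
import torch
import torch.nn.functional as F

from ignite.metrics.nlp import Perplexity


def test_against_cross_entropy_reference():
    torch.manual_seed(0)
    y_pred = torch.randn(4, 10, 7)    # logits: (batch, vocab, seq)
    y = torch.randint(0, 10, (4, 7))  # targets: (batch, seq)

    ppl = Perplexity()
    ppl.reset()
    ppl.update((y_pred, y))

    # Reference: exp of the mean per-token negative log-likelihood.
    expected = torch.exp(F.cross_entropy(y_pred, y, reduction="mean"))
    assert ppl.compute() == pytest.approx(expected.item(), rel=1e-5)
```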
Thanks for the quick review, @vfdev-5.
Co-authored-by: vfdev <vfdev.5@gmail.com>
@steaphenai code style check is failing: https://github.com/pytorch/ignite/actions/runs/24830927009/job/72725024776?pr=3743
@vfdev-5 Could you approve the workflows to run? The required checks are pending approval.
@steaphenai this failure is real: https://github.com/pytorch/ignite/actions/runs/24847148226/job/72879291595?pr=3743
…rs for MPS compatibility
@vfdev-5 I checked other metrics in
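For context, the usual MPS constraint here is that PyTorch's MPS backend has no float64 support, so double-precision accumulators need a float32 fallback on that device. A minimal sketch of that pattern (illustrative only, not this PR's actual diff; `accumulator_dtype` is a hypothetical helper):

```python
import torch


def accumulator_dtype(device: torch.device) -> torch.dtype:
    # The MPS backend has no float64 kernels, so fall back to float32
    # there; keep float64 elsewhere for accumulation precision.
    return torch.float32 if device.type == "mps" else torch.float64
```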

Closes #3742
Summary
- `Perplexity` metric implementation in `ignite.metrics.nlp.perplexity`.
- `Perplexity` importable from both `ignite.metrics.nlp` and top-level `ignite.metrics` (see the sketch below).
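A hedged sketch of the re-exports this implies (the exact import lines are assumptions, not this PR's diff):

```python
# ignite/metrics/nlp/__init__.py (sketch)
from ignite.metrics.nlp.perplexity import Perplexity

# ignite/metrics/__init__.py (sketch)
from ignite.metrics.nlp import Perplexity
```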
Test plan
- `python -m pytest tests/ignite/metrics/nlp/test_perplexity.py -v`
- `python -c "from ignite.metrics.nlp import Perplexity; import torch; ppl = Perplexity(); ppl.reset(); ppl.update((torch.randn(2,5,3), torch.randint(0,5,(2,3)))); print('PPL =', ppl.compute())"` (expanded below for readability)
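Expanded, the one-line smoke test corresponds to the following (reading the shapes as `(batch, vocab, seq)` logits and `(batch, seq)` targets is an inference from the command, not stated in the PR):

```python
import torch

from ignite.metrics.nlp import Perplexity

# Random logits and targets: 2 sequences of length 3 over a 5-token vocab.
y_pred = torch.randn(2, 5, 3)
y = torch.randint(0, 5, (2, 3))

ppl = Perplexity()
ppl.reset()
ppl.update((y_pred, y))
print("PPL =", ppl.compute())
```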
Files changed
- `ignite/metrics/nlp/perplexity.py`
- `ignite/metrics/nlp/__init__.py`
- `ignite/metrics/__init__.py`
- `tests/ignite/metrics/nlp/test_perplexity.py`