Remove sklearn dependency from CohenKappa score calculation logic, apply custom calculation, and update tests #3731
Conversation
… applied custom calculation and updated tests
Thanks for the PR! In this metric we inherited from the … instead of using raw tensor values for final computation in …
So there is no need to bring all the raw tensors into memory now that scikit-learn is no longer used, got it.
@avishkarsonni yes, instead of storing raw tensors in …
Nice implementation! Using torch.bincount for the confusion matrix is a good approach and keeps the computation on the GPU. I had a couple of small observations: it may be safer to guard against the case where p_e == 1, since (1 - p_e) would cause a division by zero. Overall the implementation looks clean and matches the expected Cohen's Kappa formulation.
```python
n_classes = int(max(y_targets.max().item(), y_preds.max().item())) + 1
indices = y_targets * n_classes + y_preds
conf = torch.bincount(indices, minlength=n_classes * n_classes).reshape(n_classes, n_classes).double()
```
Why can't we use ignite's confusion matrix class to compute this metric?
If we use ConfusionMatrix, we should provide num_classes as an input argument... In order to keep backward compatibility, we can add a num_classes arg to the constructor as an optional kwarg.
We can have two private cohen kappa implementations: one using EpochMetric (current one) and second using ConfusionMatrix. The public CohenKappa can route depending on num_classes arg.
In the private implementation using EpochMetric we should still use ConfusionMatrix to compute the confusion matrix in compute() method instead of doing that manually (as currently) to avoid bugs.
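The routing idea above could be sketched roughly like this (the class and function names are illustrative stubs, not ignite's actual API):

```python
class _EpochMetricKappa:
    """Stub for the current EpochMetric-based path: stores raw
    predictions/targets and infers the class count at compute time."""
    backend = "epoch_metric"

class _ConfusionMatrixKappa:
    """Stub for the ConfusionMatrix-based path: streams counts into a
    fixed num_classes x num_classes matrix, using constant memory."""
    backend = "confusion_matrix"
    def __init__(self, num_classes: int):
        self.num_classes = num_classes

def cohen_kappa(num_classes=None):
    # Route on whether the caller supplied num_classes; the default
    # (None) keeps backward compatibility with the existing behaviour.
    if num_classes is None:
        return _EpochMetricKappa()
    return _ConfusionMatrixKappa(num_classes)
```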
I used bincount because I was aware of it and had planned it that way from the start. At the beginning of this PR the ConfusionMatrix approach was not intuitive to me since it needs num_classes at init, while the current implementation infers it dynamically from the data, but as mentioned it is indeed possible.
Again, this method is better only when the user knows num_classes; if not, we will need a fallback to infer it from the data. Both methods have the same time complexity; the advantage of using ConfusionMatrix is better edge-case coverage over the current approach. I would love to hear which approach would be better!
The benefit of using ConfusionMatrix is that we do not store the lists of y_preds and y_true in RAM, unlike the current implementation using EpochMetric. The drawback is that we would have to specify num_classes in the metric constructor, which also introduces a backward-compatibility break.
OK, the suggestion for this PR: let's use ConfusionMatrix in _cohen_kappa_score to avoid manual confusion matrix computation. We can instantiate a ConfusionMatrix object with the number of classes, do a single update, and call compute.
```python
cm = ConfusionMatrix(num_classes=num_classes, device=y_pred.device)
y_pred_oh = F.one_hot(y_pred.long(), num_classes).float()
cm.update((y_pred_oh, y.long()))
conf = cm.compute().double()
```
Passing to double like this won't work on MPS.
But then the precision issue comes up, since MPS needs float32 while CUDA goes for float64. I've been picking at this issue for some time. Can I use torch.promote_types to pick the highest precision, or would that create overhead? Or is there another way, like a try/except block that falls back to float() if double() is rejected?
Having different precision for MPS and CUDA would be weird; we can either stick with .float() or convert the GPU tensors to CPU tensors and then call .double().
The second one is what I am going with; it gives better precision.
I suggest using different precisions: double for CPU/CUDA and float on MPS, since MPS does not support double, via the metric's _double_dtype attribute:
ignite/ignite/metrics/metric.py, lines 389 to 390 in c830a06
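The idea can be sketched as follows (assuming a helper that mirrors what Metric._double_dtype provides; the helper name is illustrative):

```python
import torch

def double_dtype_for(device: torch.device) -> torch.dtype:
    # MPS has no float64 support, so fall back to float32 there;
    # CPU and CUDA keep full double precision.
    return torch.float32 if device.type == "mps" else torch.float64

conf = torch.tensor([[3, 1], [0, 4]])
conf = conf.to(double_dtype_for(conf.device))  # safe on cpu, cuda and mps
```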
…) method to better preserve precision for mps
vfdev-5
left a comment
@avishkarsonni thanks a lot for the update of the code using the suggested approach.
However, I don't totally agree with Aaishwarya about passing to CPU from an accelerator to perform the computation in double. If the device is CUDA, which supports double, passing to CPU is not optimal.
I think we can use _double_dtype for that:
ignite/ignite/metrics/metric.py, lines 389 to 390 in c830a06
I am not sure what you mean exactly.
The suggestion is to use ignite/ignite/metrics/multilabel_confusion_matrix.py, lines 137 to 138 in c830a06.
I too agree that we should use CUDA acceleration wherever we can, but it again creates a situation where we have to treat MPS and CUDA differently and cannot keep the code device-agnostic. I don't know which is the better approach.
It is just a precision difference; we do not essentially treat them differently, so it is fine to use my suggestion.
…e available device using _double_dtype.
Fixes #3701
Description:
Remove scikit-learn dependency from CohenKappa by implementing a native PyTorch version.

- Replaces sklearn.metrics.cohen_kappa_score with a pure PyTorch implementation.
- Drops the device round-trip (.cpu().numpy()); the metric now runs fully on the configured device.
- Builds the confusion matrix with torch.bincount (a single vectorised kernel) instead of a Python loop.
- Uses float64 throughout to match sklearn's numerical precision.

Check list:
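Putting the points in the description together, the sklearn-free computation can be sketched as follows (a minimal sketch consistent with the snippets quoted in the review; the function name and calling convention are illustrative, not the PR's exact code):

```python
import torch

def cohen_kappa_score(y_preds: torch.Tensor, y_targets: torch.Tensor) -> torch.Tensor:
    # Infer the number of classes dynamically from the data
    n_classes = int(max(y_targets.max().item(), y_preds.max().item())) + 1
    # Confusion matrix via a single vectorised bincount call
    indices = y_targets * n_classes + y_preds
    conf = torch.bincount(indices, minlength=n_classes * n_classes)
    conf = conf.reshape(n_classes, n_classes).double()  # float64 throughout
    n = conf.sum()
    p_o = conf.diagonal().sum() / n                     # observed agreement
    p_e = (conf.sum(0) * conf.sum(1)).sum() / (n * n)   # chance agreement
    return (p_o - p_e) / (1.0 - p_e)

y_true = torch.tensor([0, 1, 1, 2, 2, 2])
y_pred = torch.tensor([0, 1, 1, 2, 2, 1])
kappa = cohen_kappa_score(y_pred, y_true)
```

Everything stays on the input tensors' device, and no per-sample Python loop is needed.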