Skip to content

REF: Refactor PET model#204

Merged
jhlegarreta merged 1 commit into
nipreps:mainfrom
jhlegarreta:ref/refactor-pet-model
Dec 21, 2025
Merged

REF: Refactor PET model#204
jhlegarreta merged 1 commit into
nipreps:mainfrom
jhlegarreta:ref/refactor-pet-model

Conversation

@jhlegarreta
Copy link
Copy Markdown
Contributor

@jhlegarreta jhlegarreta commented Aug 9, 2025

Refactor PET model: use a base class that contains the essential properties for a PET model and create a derived class that implements the B-Spline correction.

In the base class:

  • Remove the timepoints attribute: it corresponds to the PET dataset midframe attribute, and thus, there is no need to duplicate it in the model.
  • Remove the xlim attribute: it corresponds to the PET dataset total_duration attribute, and thus, there is no need to duplicate it in the model.
  • Remove the checks related to the above attributes: checking the coherence of their values is not the model's responsibility, and should be done when instantiating the dataset.

Make the model use the frame index instead of the midframe value as the argument of the fit_predict function. This increases makes the PET model be consistent with the approach used in the code base as a principle for generalization in the framework.

The index was also marked as an integer by the type annotations (as it should be expected), whereas midframe values that were being provided are typically floating point values.

Change the tests and the PET notebook accordingly.

In the derived, BSpline class:

  • Remove the unnecessary explicit float casting around the number of control points: the dtype="float32" specifier creates a float array.
    Fixes:
    Expected type 'str | Buffer | SupportsFloat | SupportsIndex', got 'Literal[0] | None | {__eq__} | int' instead
    
    raised by the IDE.

Take advantage of the commit to use np.asarray to avoid extra copies when not necessary.

Cast explicitly to int when getting the frame indices in the estimator's run method to avoid type checking errors:

src/nifreeze/estimator.py:274: error:
 Argument 1 to "fit_predict" of "BSplinePETModel" has incompatible type
 "signedinteger[_32Bit | _64Bit]"; expected "int | None"  [arg-type]

raised for example at:
https://github.com/nipreps/nifreeze/actions/runs/20402438629/job/58626813972#step:8:35

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch from 2fb555e to 90eb886 Compare August 12, 2025 23:28
@jhlegarreta jhlegarreta requested a review from mnoergaard August 12, 2025 23:38
@jhlegarreta
Copy link
Copy Markdown
Contributor Author

jhlegarreta commented Aug 12, 2025

Had another look at this. I had introduced a BasePETModel based on the BaseDWIModel. Debugging the thing I currently do not see scenarios where we will need to have the _fit and predict_fit implemented in the BasePETModel, but last time we did discuss the possibility of adding other models. However, as I see things, we wouldn't need to have have _fit and predict_fit implemented in the base class for PET. So, unless you tell me the contrary and you clearly see that it may be beneficial and that you see this immediately @mnoergaard @oesteban, I would leave them as abstract functions and make all the current logic dwell in the BSplinePETModel.

Comment thread src/nifreeze/model/pet.py Outdated
Comment thread src/nifreeze/model/pet.py Outdated
)

# ToDo
# Are timepoints and xlim features that all PET models require ??
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnoergaard question for you.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say so, since we are dealing with motion correction, and hence would need temporal information. Currently, the PET model needs both the sampling midpoints and the scan’s total duration to work, so enforcing their presence in the base class keeps the API consistent. When new PET models that do not need temporal information are introduced, we can revisit this requirement; until then, providing dataset.midframe and dataset.total_duration (as done in the unit tests) is the intended usage.

Copy link
Copy Markdown
Contributor Author

@jhlegarreta jhlegarreta Sep 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see that this is related to #204 (review). I think the naming should then be made consistent across the PET data class and the model class. Also, I do not see why the model is given the timepoints, if these are hold by the PET data class, i.e. from #204 (review)

(...) midframe is where the dataset stores real-world frame timing; timepoints/_x is the copy of those timings handed to the model

then we should not be providing the model with a copy of them.

If we make the parallel with the DWI class, the gradients are obtained from the dataset, e.g.:

gradient = self._dataset.gradients[:, index]

Also, can the xlim name be made somehow more descriptive or is it a name that is commonly used within the PET domain?

Comment thread src/nifreeze/model/pet.py Outdated
raise NotImplementedError("Fitting with held-out data is not supported")

# ToDo
# Does not make sense to make timepoints be a kwarg if it is provided as a named parameter to __init__
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mnoergaard question for you.

Comment thread src/nifreeze/model/pet.py Outdated

# ToDo
# What is the gtab equivalent of PET ?
model_str = getattr(self, "_model_class", "")
Copy link
Copy Markdown
Contributor Author

@jhlegarreta jhlegarreta Aug 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not be necessary.

Comment thread src/nifreeze/model/pet.py Outdated

# ToDo
# What are the gtab (and S0 if any) equivalent of PET ?
if n_models == 1:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread src/nifreeze/model/pet.py Outdated

# ToDo
# Does not make sense to make timepoints be a kwarg if it is provided as a named parameter to __init__
timepoints = kwargs.get("timepoints", None) or self._x
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread src/nifreeze/model/pet.py
# Does the below apply to PET ? Martin has the return None statement
# if index is None:
# raise RuntimeError(
# f"Model {self.__class__.__name__} does not allow locking.")
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another question for @mnoergaard.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch from 90eb886 to afb11da Compare August 13, 2025 12:56
@jhlegarreta
Copy link
Copy Markdown
Contributor Author

jhlegarreta commented Aug 13, 2025

The file test_model_pet.py will eventually need to be integrated into test_model.py as all models are being tested there. Unless we decide to put the dMRI models into a different file as well @oesteban ?

@jhlegarreta
Copy link
Copy Markdown
Contributor Author

jhlegarreta commented Aug 18, 2025

Re #204 (comment) Compared to afb11da, in cbf417d I am discarding the idea of keeping a parallel of the dMRI base model in the PET base model. The PET model tests pass and the notebook runs now. It gives a result that is almost the same as in #203 (comment), with only minor differences in X and Y rotation start/ends. However, the parallelization was lost in the refactoring and the notebook takes almost 2 hours now compared to 30 minutes. So I need to look into that.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch from cbf417d to 10af03f Compare August 18, 2025 01:52
@codecov
Copy link
Copy Markdown

codecov Bot commented Aug 18, 2025

Codecov Report

❌ Patch coverage is 82.69231% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.10%. Comparing base (2a13628) to head (7c7c058).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
src/nifreeze/model/pet.py 80.85% 7 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #204      +/-   ##
==========================================
- Coverage   82.17%   82.10%   -0.08%     
==========================================
  Files          37       37              
  Lines        2037     2045       +8     
  Branches      225      224       -1     
==========================================
+ Hits         1674     1679       +5     
- Misses        317      320       +3     
  Partials       46       46              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 5 times, most recently from 6318c50 to 908faa8 Compare August 18, 2025 23:38
@jhlegarreta
Copy link
Copy Markdown
Contributor Author

jhlegarreta commented Aug 18, 2025

Re #204 (comment), the throttle issue was due to this line:

n_jobs = n_jobs or 1

which I had borrowed from

n_jobs = n_jobs or 1

Now changed to

n_jobs = n_jobs or min(cpu_count() or 1, 8)

To match more closely

n_jobs = n_jobs or 1

And the speed is back. Results in plots, as I was seeing yesterday (cf. the above comment), almost the same with minor differences:

pet_estimated_motion_pet_model_refactor pet_fd_pet_model_refactor

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 3 times, most recently from 4751545 to 19d5956 Compare August 22, 2025 22:39
@mnoergaard mnoergaard marked this pull request as ready for review September 22, 2025 07:52
Comment thread src/nifreeze/model/pet.py Outdated
Comment thread src/nifreeze/model/pet.py
# Calculate index coordinates in the B-Spline grid
self._n_ctrl = n_ctrl or (len(timepoints) // 4) + 1
def _preproces_data(self) -> np.ndarray:
# ToDo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the todo here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly, the idea there is to preprocess only the data that is required; e.g. if we are using only a subset of the volumes, then we shouldn't preprocess the entire data, e.g.

data, _, gtab = self._dataset[idxmask]
# Select voxels within mask or just unravel 3D if no mask
data = data[brainmask, ...] if brainmask is not None else data.reshape(-1, data.shape[-1])

The data, _, gtab = self._dataset[idxmask] is commented out in here immediately after the ToDo to as a pointer to that idea.

Comment thread src/nifreeze/model/pet.py Outdated

# Convert data into V (voxels) x T (timepoints)
data = data.reshape((-1, data.shape[-1])) if brainmask is None else data[brainmask]
# ToDo
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing _fit to override timepoints through kwargs duplicates information that was already validated and stored on self._x during initialization. Dropping that extra kwarg simplifies the API and avoids inconsistent inputs

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand: maybe the reply is answering a question elsewhere?

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 3 times, most recently from 56cf1b8 to a4643c5 Compare September 26, 2025 01:24
@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 2 times, most recently from 21026bb to 7d22d9e Compare September 28, 2025 01:20
@oesteban oesteban linked an issue Nov 21, 2025 that may be closed by this pull request
@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 6 times, most recently from 93c2fd2 to 485d8f4 Compare December 12, 2025 02:53
@jhlegarreta
Copy link
Copy Markdown
Contributor Author

jhlegarreta commented Dec 12, 2025

Some questions that will need to be addressed at some point: #318 (comment)

Especially relevant here (or for an immediately subsequent PR) is the need of the timepoints parameter. If values are a copy of the midframe values of the PET dataset class (#204 (review)), then we do not need to duplicate those: providing the index that needs to be predicted and the scaling factor, we should be good. Also mentioned in #204 (comment).

@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 3 times, most recently from eb11211 to 515ee05 Compare December 12, 2025 03:51
@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch 4 times, most recently from c0e5c8c to 70ad152 Compare December 21, 2025 02:24
Refactor PET model: use a base class that contains the essential
properties for a PET model and create a derived class that implements
the B-Spline correction.

In the base class:
- Remove the `timepoints` attribute: it corresponds to the PET dataset
  `midframe` attribute, and thus, there is no need to duplicate it in
  the model.
- Remove the `xlim` attribute: it corresponds to the PET dataset
  `total_duration` attribute, and thus, there is no need to duplicate it
  in the model.
- Remove the checks related to the above attributes: checking the
  coherence of their values is not the model's responsibility, and
  should be done when instantiating the dataset.

Make the model use the frame index instead of the `midframe` value as
the argument of the `fit_predict` function. This increases makes the PET
model be consistent with the approach used in the code base as a
principle for generalization in the framework.

The index was also marked as an integer by the type annotations (as it
should be expected), whereas `midframe` values that were being provided
are typically floating point values.

Change the tests and the PET notebook accordingly.

In the derived, BSpline class:
- Remove the unnecessary explicit `float` casting around the number of
  control points: the `dtype="float32"` specifier creates a float array.
  Fixes:
  ```
  Expected type 'str | Buffer | SupportsFloat | SupportsIndex', got 'Literal[0] | None | {__eq__} | int' instead
  ```
  raised by the IDE.

Take advantage of the commit to use `np.asarray` to avoid extra copies
when not necessary.

Cast explicitly to `int` when getting the frame indices in the
estimator's `run` method to avoid type checking errors:
```
src/nifreeze/estimator.py:274: error:
 Argument 1 to "fit_predict" of "BSplinePETModel" has incompatible type
 "signedinteger[_32Bit | _64Bit]"; expected "int | None"  [arg-type]
```

raised for example at:
https://github.com/nipreps/nifreeze/actions/runs/20402438629/job/58626813972#step:8:35
@jhlegarreta jhlegarreta force-pushed the ref/refactor-pet-model branch from 70ad152 to 7c7c058 Compare December 21, 2025 02:26
@jhlegarreta
Copy link
Copy Markdown
Contributor Author

OK, I think we have reached a point where this needs to be merged to move forward. So, I am going to go ahead and merge this. The summary of the status:

  • The PET notebook fails with the following error:
IndexError                                Traceback (most recent call last)
Cell In[17], line 6
      3 estimator = PETMotionEstimator()
      5 # Run the estimator
----> 6 affines = estimator.run(
      7     pet_dataset,
      8     omp_nthreads=8,
      9 )

File ~/.virtualenvs/nifreeze/lib/python3.10/site-packages/nifreeze/estimator.py:274,
 in PETMotionEstimator.run(self, pet_dataset, omp_nthreads)
    271 model.fit_predict(None)
    273 # Predict the reference volume at the test frame's timepoint
--> 274 predicted = model.fit_predict(idx)
    276 fixed_image_path = tmp_path / f"fixed_frame_{idx:03d}.nii.gz"
    277 moving_image_path = tmp_path / f"moving_frame_{idx:03d}.nii.gz"

File ~/.virtualenvs/nifreeze/lib/python3.10/site-packages/nifreeze/model/pet.py:225,
 in BSplinePETModel.fit_predict(self, index, **kwargs)
    221     return None
    223 # Project sample timing into B-Spline coordinates
    224 # ToDo: x is not really a matrix ...
--> 225 x = np.asarray((self._dataset.midframe[index] / self._dataset.total_duration) * self._n_ctrl)
    226 A = BSpline.design_matrix(x, self._t, k=self._order)
    228 # A is 1 (num. timepoints) x C (num. coeff)
    229 # self._locked_fit is V (num. voxels) x K - 4

IndexError: index 20 is out of bounds for axis 0 with size 20

This is caused by the lofo_split call that exists within the PET estimator class: the tqdm iterates over the original volume frame indices (i.e. 21 frames in the PET notebook example data); however the lofo_split makes such that the model instantiated with the training set has only 20 frames. Thus, when idx reaches a given value, it exceeds the bounds of the training dataset.

This requires refactoring the estimator, which is a WIP in PR #203. Thus, solving this is left for that PR.

@jhlegarreta
Copy link
Copy Markdown
Contributor Author

There are some warnings related to DIPY and unrelated to this patch set that are being reported by the CI:

============================== warnings summary ===============================
  .tox/py312/lib/python3.12/site-packages/dipy/core/geometry.py:150: 4 warnings
  test/test_model.py: 8 warnings
    /home/runner/work/nifreeze/nifreeze/.tox/py312/lib/python3.12/site-packages/dipy/core/geometry.py:150: UserWarning: 'where' used without 'out', expect unitialized memory in output. If this is intentional, use out=None.
      cos = np.divide(z, r, where=r > 0)
  
  .tox/py312/lib/python3.12/site-packages/dipy/core/geometry.py:151: 4 warnings
  test/test_model.py: 8 warnings
    /home/runner/work/nifreeze/nifreeze/.tox/py312/lib/python3.12/site-packages/dipy/core/geometry.py:151: UserWarning: 'where' used without 'out', expect unitialized memory in output. If this is intentional, use out=None.
      theta = np.arccos(cos, where=(cos >= -1) & (cos <= 1))

They should either be ignored or DIPY be bumped to be fixed: dipy/dipy@c122e00

Merging.

@jhlegarreta jhlegarreta merged commit 9b36f58 into nipreps:main Dec 21, 2025
8 of 10 checks passed
@jhlegarreta jhlegarreta deleted the ref/refactor-pet-model branch December 21, 2025 02:51
@jhlegarreta
Copy link
Copy Markdown
Contributor Author

In the commit message:

- Remove the `timepoints` attribute: it corresponds to the PET dataset
  `midframe` attribute, and thus, there is no need to duplicate it in
  the model.

timepoints was meant to be x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Finalize PET model merge

2 participants