ENH: Validate PET data objects' attributes at instantiation#336
Conversation
|
Depends on PR #335. |
e094509 to
a548cbc
Compare
a548cbc to
96e916b
Compare
|
@mnoergaard While working on this I've realized that as things stand now on
I think that answer will allow me to refactor this properly and avoid inconsistencies. Thanks. |
@jhlegarreta - that is correct! Thanks. |
745e408 to
bc7617e
Compare
|
@mnoergaard Please, see if the refactoring for the PET data class instantiation and |
803b020 to
1eae4b2
Compare
|
Pending to do an additional refactoring (in a separate commit) to stick to the convention adopted for dMRI data to split the |
ffe563c to
37ed54c
Compare
|
Re #336 (comment) as of commit 37ed54c and the
Also, it should be possible to host and read midframe and total duration data from a JSON file (or any other file), much like it is done with the frame duration data, falling back to the default way of computing them if not present. Refactoring things this way (i.e. allowing to provide all attributes to the WDTY @mnoergaard? Sorry for so many questions. Hopefully we will converge and the implementation across the multiple ways to read/write data will be consistent and robust after this. |
6751a06 to
bbe0ca0
Compare
|
Comments:
So, we need to discuss:
We probably want to override the base class |
e7a03a1 to
747b002
Compare
9f1a2bc to
387dfe6
Compare
|
Marking this as a draft: commits 747b002 and 387dfe6 show that the difficulties to have a robust PET data instantiation (i.e. not requiring more than what we need while trying to disallow means to set parameter values that can be wrong or inconsistent across time arrays), having a consistent API across modalities and being BIDS-compatible when serializing/reading the data required by the class). I think the PET estimator (PR #203) needs to be fixed so that this is not necessary: nifreeze/src/nifreeze/estimator.py Lines 256 to 261 in f892e52 We should not need to do that instantiation, which conditions in important ways the PET data class refactoring. Commit dd3738a shows why I have explored a lot of other avenues to try to have this sorted and none is robust. PRs #203 and #204 need to be worked on and merged before more time is invested in this PR. |
9f9ed71 to
43b99af
Compare
Validate PET data objects' attributes at instantiation: ensures that the attributes are present and match the expected dimensionalities. **PET class attributes** Refactor the PET attributes so that only `midframe` and `total_duration` are required and accepted by the constructor. These are the only parameters that are required by the current PET model. Remove `uptake` from the constructor: the PET data class does not need to know the uptake values held across its frames; it is rather the estimator that needs to know about its values so that the iterator can pick the frames following the appropriate sorting. Validate and format attributes so to avoid missing or inconsistent data. Specifically, require the midframe data to have the same length as the number of frames in the data object, and disallow the last midframe value being larger than the total duration. Make the `_compute_uptake_statistic` public so that users can call it. **`from_nii`** function: Refactor the `from_nii` function to accept filenames instead of a mix of filenames (e.g. the PET image sequence and brainmask) and temporal attribute arrays. Honors the name of the function, increases consistency with the dMRI counterpart and allows to offer a uniform API. The only required temporal parameter required by BIDS is the frame time (`FrameTimesStart`). Thus, the temporal attribute JSON (sidecar) file is required to contain that key. The values required to model a PET dataset for the purposes of NiFreeze, namely the midframe and total duration values, are computed from the frame time. It is assumed that the frame duration spans entirely the time elapsed between two consecutive time frame values. Refactor and rename the `_compute_frame_duration` function so that it computes and returns the required parameters to instantiate a PET data object. The computation of the relevant temporal values is, thus, done at this place only. Use the `get_data` utils function in `from_nii` to handle automatically the data type when loading the PET data. **`to_nifti`** function Preserve the base class `to_nifti` method to serialize the `PET` dataset to NIfTI data. The `PET` dataset class does not need to write its temporal attributes, as they do not change along the prediction process, and they can be computed from the primary JSON file where `FrameTimesStart` data dwell. NiFreeze will still allow writing a BIDS-compatible derivative dataset by only writing the motion-corrected PET frames, and users would read the `FrameTimesStart` data from the primary JSON file. **`PET.load`** class method: Remove the `PET.load` class method and rely on the `data.__init__.load` function: - If an HDF5 filename is provided, it is assumed that it hosts all necessary information, and the data module `load` function should take of loading all data. - If the provided arguments are NIfTI files plus other data files, the function will call the `pet.PET.from_nii` function. Change the `kwargs` arguments to be able to identify the relevant keyword arguments that are now present in the `from_nii` function. Change accordingly the `PET.load(pet_file, json_file)` call in the PET notebook and the `test_pet_load` test function. **Tests**: Refactor the PET data creation fixture in `conftest.py` to accept the `frame_time` (as it is the only required arguments by BIDS and the one that allows computing the rest) and to return the necessary data. Remove values that are no longer needed (i.e. `total_duration`). Refactor the tests accordingly and increase consistency with the `dmri` data module testing helper functions. Reduces cognitive load and maintenance burden. Add additional object instantiation equality checks: check that objects instantiated through reading NIfTI files equal objects instantiated directly. Check the PET dataset attributes systematically in round trip tests by collecting all named attributes that need to be tested. Modify accordingly the PET model and integration tests. Take advantage of the patch set to make other opinionated choices: - Prefer using the global `setup_random_pet_data` fixture over the local `random_dataset` fixture: it allows to control the parameters of the generated data and increases consistency with the practice adopted across the dMRI dataset tests. Remove the `random_dataset` fixture. - Prefer using `assert np.allclose` over `np.testing.assert_array_equal` for the sake of consistency **Dependencies** Require `attrs>24.1.0` so that `attrs.Converter` can be used. Documentation: https://www.attrs.org/en/25.4.0/api.html#converters
43b99af to
a716649
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #336 +/- ##
==========================================
+ Coverage 81.81% 81.92% +0.10%
==========================================
Files 34 34
Lines 1980 2019 +39
Branches 211 223 +12
==========================================
+ Hits 1620 1654 +34
- Misses 312 318 +6
+ Partials 48 47 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I modified this following our conversation. The current PET model and estimator remain unchanged. I am going to go ahead and merge this to move forward. |
Validate PET data objects' attributes at instantiation: ensures that the attributes are present and match the expected dimensionalities.
PET class attributes
Refactor the PET attributes so that only
midframeandtotal_durationare required and accepted by the constructor. These are the only parameters that are required by the current PET model.Remove
uptakefrom the constructor: the PET data class does not need to know the uptake values held across its frames; it is rather the estimator that needs to know about its values so that the iterator can pick the frames following the appropriate sorting.Validate and format attributes so to avoid missing or inconsistent data. Specifically, require the midframe data to have the same length as the number of frames in the data object, and disallow the last midframe value being larger than the total duration.
Make the
_compute_uptake_statisticpublic so that users can call it.from_niifunction:Refactor the
from_niifunction to accept filenames instead of a mix of filenames (e.g. the PET image sequence and brainmask) and temporal attribute arrays. Honors the name of the function, increases consistency with the dMRI counterpart and allows to offer a uniform API.The only required temporal parameter required by BIDS is the frame time (
FrameTimesStart). Thus, the temporal attribute JSON (sidecar) file is required to contain that key. The values required to model a PET dataset for the purposes of NiFreeze, namely the midframe and total duration values, are computed from the frame time. It is assumed that the frame duration spans entirely the time elapsed between two consecutive time frame values.Refactor and rename the
_compute_frame_durationfunction so that it computes and returns the required parameters to instantiate a PET data object. The computation of the relevant temporal values is, thus, done at this place only.Use the
get_datautils function infrom_niito handle automatically the data type when loading the PET data.to_niftifunctionPreserve the base class
to_niftimethod to serialize thePETdataset to NIfTI data. ThePETdataset class does not need to write its temporal attributes, as they do not change along the prediction process, and they can be computed from the primary JSON file whereFrameTimesStartdata dwell. NiFreeze will still allow writing a BIDS-compatible derivative dataset by only writing the motion-corrected PET frames, and users would read theFrameTimesStartdata from the primary JSON file.PET.loadclass method:Remove the
PET.loadclass method and rely on thedata.__init__.loadfunction:loadfunction should take of loading all data.pet.PET.from_niifunction.Change the
kwargsarguments to be able to identify the relevant keyword arguments that are now present in thefrom_niifunction.Change accordingly the
PET.load(pet_file, json_file)call in the PET notebook and thetest_pet_loadtest function.Tests:
Refactor the PET data creation fixture in
conftest.pyto accept theframe_time(as it is the only required arguments by BIDS and the one that allows computing the rest) and to return the necessary data.Remove values that are no longer needed (i.e.
total_duration).Refactor the tests accordingly and increase consistency with the
dmridata module testing helper functions. Reduces cognitive load and maintenance burden.Add additional object instantiation equality checks: check that objects instantiated through reading NIfTI files equal objects instantiated directly.
Check the PET dataset attributes systematically in round trip tests by collecting all named attributes that need to be tested.
Modify accordingly the PET model and integration tests.
Take advantage of the patch set to make other opinionated choices:
setup_random_pet_datafixture over the localrandom_datasetfixture: it allows to control the parameters of the generated data and increases consistency with the practice adopted across the dMRI dataset tests. Remove therandom_datasetfixture.assert np.allcloseovernp.testing.assert_array_equalfor the sake of consistencyDependencies
Require
attrs>24.1.0so thatattrs.Convertercan be used.Documentation:
https://www.attrs.org/en/25.4.0/api.html#converters