Track and write measurement time for area #3168
gerritholl wants to merge 61 commits into pytroll:main from feature-valid-time
Conversation
Start unit test to track time for the SEVIRI HRIT reader. This is to prepare an implementation of pytroll#3161. There is no implementation yet.
Added a first implementation for SEVIRI per-pixel times in ancillary variables. So far just takes the scanline time for all pixels in the scanline.
Codecov Report: ❌ Patch coverage is
Additional details and impacted files: @@ Coverage Diff @@
## main #3168 +/- ##
========================================
Coverage 96.34% 96.35%
========================================
Files 466 468 +2
Lines 59131 59315 +184
========================================
+ Hits 56971 57151 +180
- Misses 2160 2164 +4
|
Pull Request Test Coverage Report for Build 16744116940. Warning: this coverage report may be inaccurate, because this pull request's base commit is no longer the HEAD commit of its target branch; it may therefore include changes from outside the original pull request, including, potentially, unrelated coverage changes.
💛 - Coveralls |
|
To track "valid time" through resampling for geostationary satellite data, I'm trying to think of a good way to approach this. My current thinking is that if the user creates the scene with reader_kwargs={"track_time": True}, the file handler adds per-pixel time information to the variable attributes. By putting it inside ancillary variables, dataset_walker and co. ensure it gets sliced and resampled along with the parent dataset, so that at the end, an approximate measurement time can be estimated for the resampled scene. But normally, ancillary variables are actual variables available in the data file: they remain text attributes until after segments have been concatenated, and then the file handler is called again for every segment to replace the text label with actual data. Therefore, I'm not sure whether ancillary variables are the right place to put it, or whether time should be stored there at all. |
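The ancillary-variable idea described above can be sketched in plain xarray. This is only an illustration of the mechanism, not the PR's implementation; the shapes, timestamps, and variable names are hypothetical.

```python
import numpy as np
import xarray as xr

# Hypothetical sketch: take the scanline time for all pixels in the
# scanline and attach the result to the parent dataset, so the slicing
# and resampling machinery can carry it along.
nlines, ncols = 3, 4
scanline_times = np.array(
    ["2024-01-01T12:00:00", "2024-01-01T12:00:01", "2024-01-01T12:00:02"],
    dtype="datetime64[ns]",
)
data = xr.DataArray(np.zeros((nlines, ncols)), dims=("y", "x"), name="IR_108")
# Broadcast the per-scanline time across the x dimension.
pixel_times = xr.DataArray(
    np.repeat(scanline_times[:, np.newaxis], ncols, axis=1),
    dims=("y", "x"),
    name="time",
)
# Satpy slices/resamples ancillary variables together with their parent.
data.attrs["ancillary_variables"] = [pixel_times]
```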
|
Maybe it fits better in a coordinate variable, but I'm not sure whether it's valid for those to be non-unique. |
|
Coordinates can probably be non-unique, but they are currently dropped in resampling. In the case of nearest neighbour resampling, this is due to
Add time as a coordinate rather than as an auxiliary variable. Start working on resampling time coordinates, so far with a failing test.
Implement time tracking as a float coordinate for SEVIRI L1B HRIT. Retain this through resampling. Add documentation on how to use this.
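The float-coordinate representation can be sketched as follows. The epoch, units string, and array shapes are assumptions for illustration, not necessarily what the PR uses.

```python
import numpy as np
import xarray as xr

# Sketch: store acquisition time as a float coordinate (seconds since an
# assumed reference epoch), so it can survive resampling, which cannot
# handle datetime64 values.
epoch = np.datetime64("2024-01-01T12:00:00", "ns")
scanline_times = epoch + np.arange(3) * np.timedelta64(1, "s")
seconds = (scanline_times - epoch) / np.timedelta64(1, "s")  # float64

arr = xr.DataArray(
    np.zeros((3, 4)),
    dims=("y", "x"),
    coords={"time": (("y",), seconds)},
)
arr["time"].attrs["units"] = "seconds since 2024-01-01T12:00:00"
```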
Adapt the ninjo tag test to account for dynamic tags now being optional.
|
Regarding the CodeScene complaint: I am not able to resolve this as part of this PR. |
Refactor getting the valid time out of the ninjogeotiff writer, so that the functionality is available elsewhere.
Add more tests covering the different scenarios of time coordinates in compositing.
|
Elsewhere, Satpy expects that time coordinates are of dtype datetime64, for example in satpy/composites/core.py line 38 and lines 549 to 555 (at e078c6e). This is a problem, because datetime64 cannot be resampled, so this conflicts with the use case of tracking measurement time. |
|
Elsewhere, satpy expects numerical values (satpy/composites/core.py line 535 at e078c6e). Why is it skipping time coordinates where the initial value is 0?
|
It also reduces the dimensionality of the time coordinate by taking only the first value (satpy/composites/core.py line 536 at e078c6e); why? This seems to have been originally added in d41a213.
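The datetime64 conflict discussed above comes down to resampling kernels needing floats. A round-trip through float seconds preserves the values, as this numpy-only sketch shows; the epoch and the 250 ms spacing are illustrative choices (250 ms divides a second into exactly representable binary fractions).

```python
import numpy as np

# datetime64 cannot be fed to resampling kernels that expect floats, but a
# conversion to float seconds since an epoch (and back) preserves the values.
epoch = np.datetime64("2024-01-01T00:00:00", "ns")
times = epoch + np.arange(5) * np.timedelta64(250, "ms")

as_float = (times - epoch) / np.timedelta64(1, "s")  # float64 seconds
# ... resample as_float like any other float field ...
back = epoch + (as_float * 1e9).astype("timedelta64[ns]")
```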
When creating a composite where more than zero components have a time coordinate, calculate the arithmetic mean of those time coordinates.
It is the time coordinates that should be averaged, not the data that carry them. Units should be retained, and a test should verify that they are.
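Averaging the time coordinates of two composite inputs might look like the following sketch with float coordinates. Names and values are illustrative; note that xarray arithmetic drops conflicting coordinates and attrs, so the units are reattached explicitly, matching the point above that units must be retained.

```python
import numpy as np
import xarray as xr

units = "seconds since 2024-01-01T12:00:00"
a = xr.DataArray(np.zeros((2, 2)), dims=("y", "x"),
                 coords={"time": (("y",), np.array([0.0, 2.0]))})
b = xr.DataArray(np.ones((2, 2)), dims=("y", "x"),
                 coords={"time": (("y",), np.array([4.0, 6.0]))})
a["time"].attrs["units"] = units
b["time"].attrs["units"] = units

# Average the time coordinates themselves, not the data carrying them.
mean_time = (a["time"] + b["time"]) / 2
mean_time.attrs["units"] = units  # arithmetic drops attrs; retain explicitly
```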
The time coordinate resampling was failing if reduce_data=False, because reduce_data=True has the side effect of adding the area information to the attributes. Add this explicitly for the time coordinate.
Retain time coordinates in several more cases where they were accidentally dropped, such as in filler compositors and DayNightCompositor.
Verify that the times do not lead to early dask computations.
The time coordinate was getting a bands=R dimension, because it was taking the non-time coordinates from the first projectable. This was then leading to the G and B dimensions being removed when the time coordinate was being assigned. Add a test to verify that this is fixed.
Add a test to ensure we are not having early dask computations due to filling with the time coordinate. The test is currently failing.
Combine times prior to compositing, avoiding an accidental early dask computation in fillna.
… feature-valid-time
Improve laziness for time coordinate tracking. Not there yet.
Calculate the tag value for the mean time lazily. Putting the mean time in the filename is not supported for now. It will be re-added in a later commit, but with a warning that this triggers an early computation and is thus not recommended if performance matters.
|
Writing the valid time in the headers now works lazily. For the filename, lazy evaluation is too hard and I will probably not implement this. Users who need this should use a second step renaming the file based on information from the header. |
Move tests for modifiers in modifiers/atmosphere.py to their own test module in tests/modifier_tests/test_atmosphere.py.
Add a test to confirm that the CO2Corrector modifier does not trigger a dask computation in the presence of non-identical time coordinates.
Retain time coordinates without computation when applying the CO2 correction.
Retain time coordinates in the arithmetic compositors and the CO2 correction, such that they are retained in the convection RGB.
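One way to retain a time coordinate through arithmetic without risking an early compute is to drop it before the operation and reattach it afterwards, because letting xarray compare conflicting dask-backed coordinates for equality can trigger a computation. A hypothetical helper, not the PR's actual code:

```python
import numpy as np
import xarray as xr

def _subtract_keeping_time(minuend, subtrahend):
    """Hypothetical sketch: subtract, keeping the minuend's time coordinate.

    Dropping the coordinate before the arithmetic avoids xarray comparing
    (and possibly computing) conflicting dask-backed time coordinates.
    """
    res = (minuend.drop_vars("time", errors="ignore")
           - subtrahend.drop_vars("time", errors="ignore"))
    if "time" in minuend.coords:
        res = res.assign_coords(time=minuend["time"])
    return res

bt = xr.DataArray(np.full((2, 2), 2.0), dims=("y", "x"),
                  coords={"time": (("y",), np.array([0.0, 1.0]))})
bt_co2 = xr.DataArray(np.ones((2, 2)), dims=("y", "x"),
                      coords={"time": (("y",), np.array([5.0, 6.0]))})
diff = _subtract_keeping_time(bt, bt_co2)
```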
|
This PR strongly increases the overhead of calling
NB: This is just for calling resample, with no computation. See also #2620. I don't yet understand why the overhead for |
Track measurement times such that a representative measurement time can be estimated and written for an area.
To include valid time in the filename, pass:
- reader_kwargs={"track_time": True} to Scene.__init__, and
- resample_coords=True to Scene.resample, and
- dynamic_fields={"valid_time"} to Scene.save_datasets, and
- {valid_time} in the filename pattern passed to Scene.save_datasets.
When storing as ninjogeotiff, time coordinates will be averaged and stored in the ninjo_ValidTimeID tag. An example is included in the PR under doc/source/examples/valid_time.rst.
Those two PRs must be merged first:
- Refactor _resampled_scene() and _reduce_data() methods of the Scene class #3178
The PR also does some refactoring in Scene._resampled_scene, on top of #3178, which should be merged first. Also,
Next steps, either within this PR or within one or more later PRs:
The latter is difficult to do without triggering a computation, so doing this efficiently should probably be done as postprocessing beyond the scope of this PR.
- Add your name to AUTHORS.md if not there already
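Putting the steps from the description together, end-to-end usage might look like the following sketch. It is not runnable without real SEVIRI HRIT files; the reader name, area, and filename pattern are illustrative, track_time, resample_coords, and dynamic_fields are the new parameters from this PR, and ninjogeotiff needs further writer keyword arguments that are omitted here.

```python
from glob import glob
from satpy import Scene

# Illustrative sketch only; requires real SEVIRI HRIT data to run.
scn = Scene(
    filenames=glob("/data/hrit/*201802281500*"),  # hypothetical path
    reader="seviri_l1b_hrit",
    reader_kwargs={"track_time": True},  # new in this PR
)
scn.load(["IR_108"])
local = scn.resample("eurol", resample_coords=True)  # keep time coordinates
local.save_datasets(
    writer="ninjogeotiff",
    filename="IR_108_{valid_time:%Y%m%d%H%M}.tif",
    dynamic_fields={"valid_time"},  # estimated from the time coordinates
)
```

Per the comments above, using {valid_time} in the filename pattern triggers an early computation; writing the valid time only to the ninjo_ValidTimeID header stays lazy.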