add enum checking for attributes#557
Conversation
|
@sherjeelshabih please use this as a starting point. |
|
@lukaspie Please check the warnings for NXxps, they are probably relevant, similar as for NXmpes_arpes. |
|
@rettigl thanks for the pointer. Indeed, the I started making them |
|
I added a small check for discussion. I now add list types (based on our convention) as enumeration objects in the validation tree. I tested with passing a list [0, 0, 1] and that works as expected now. If there are more examples, I can have a look. |
|
@sherjeelshabih Thanks for the update! With this branch now, I got new errors because I was using values for enumerated attributes that were not in the list. I updated pynxtools-xps, so now the tests are passing again. However, I was expecting that I still get a similar warning as before (see above) since the thee |
I checked the combination of this PR and #554. This seems to work more or less. The vectors are correctly checked for the enums, and a warning is given if the dtype is wrong (ints passed for NX_CHAR). However, the checking of list of strings seems to not work anymore now, as example the axes attribute in data. I pass a tuple with ("angular0", "angular1", "energy"), and it produces these errors: @RubelMozumder This might be due to that tuples are not considered in #554, and also maybe not here? |
| if isinstance(value, list): | ||
| return all(is_valid_data_field(v, nxdl_type, path) for v in value), output_value | ||
|
|
There was a problem hiding this comment.
This produces one warning per element in a list, if the datatype is not correct. @RubelMozumder had a different solution for this in #554.
| "ISO8601": (str), | ||
| "NX_BINARY": (bytes, bytearray, np.byte, np.ubyte, np.ndarray), | ||
| "NX_BOOLEAN": (bool, np.ndarray, np.bool_), | ||
| "NX_CHAR": (str, np.ndarray, np.chararray), |
There was a problem hiding this comment.
This still allows to write numeric arrays to an NX_CHAR, which is not correct. This issue is also solved by #554
| try: | ||
| if accepted_types[0] is bool and isinstance(value, str): | ||
| value = convert_str_to_bool_safe(value) | ||
| if value is None: | ||
| raise ValueError | ||
| output_value = accepted_types[0](value) | ||
| except ValueError: | ||
| collector.collect_and_log( | ||
| path, ValidationProblem.InvalidType, accepted_types, nxdl_type | ||
| ) | ||
| return False, value |
There was a problem hiding this comment.
I would keep the boolean string to bool conversion, this certainly makes sense and We could even think about doing this for all datatypes with safe literal conversion, as you did for the enums.
|
@sanbrock This here adds lists as possible elements of enumerations for the dataconverter. However, read_nexus apparently still does not support this. In the debug log, I don't get any warning or so about a wrong enum: Nomad, on the other hand, reports a warning when parsing the file: Do you know where to fix this? |
|
Closed in favor of #554 |
Possible fix for #556