Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR introduces a concrete Annotation base class with default begin=0 and end=0 attributes to provide a stable typing foundation for annotation instances. This addresses sorting and serialization issues with runtime-generated annotation types.
- Introduced
Annotationclass that all annotation types inherit from - Updated XMI test fixtures to include default
begin="0" end="0"attributes for annotation-like types - Refactored sorting logic to use stable key-based comparisons instead of comparator functions
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| cassis/typesystem.py | Added Annotation base class and updated type generation to inherit from it for annotation types |
| cassis/cas.py | Updated type hints to use Annotation instead of FeatureStructure where appropriate; improved duck-typing in _sort_func |
| cassis/xmi.py | Fixed offset conversion logic with defensive checks and improved comments |
| cassis/util.py | Replaced comparator-based sorting with stable key functions; refactored hash computation to return deterministic string representations |
| tests/test_files/xmi/*.xmi | Added begin="0" end="0" attributes to annotation instances in test fixtures |
| tests/test_ducktyping.py | New test file verifying runtime-generated annotations work correctly |
| tests/test_cas.py | Added tests for non-annotation feature structures and runtime-generated annotations |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
121e10f to
9318f26
Compare
There was a problem hiding this comment.
Pull Request Overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Introduced Annotation class with default begin/end 0 - Update reference data accordingly - Fixed several issues with sorting/ordering annotations
9318f26 to
e5eb8e0
Compare
* main: Issue #349: Upgrade dependencies #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range Issue #342: Improve cas_to_comparable_text #328 - Remove annotations in a specified range Issue #340: Add annotation predicates Issue #340: Add annotation predicates No issue: Upgrade dependencies No issue: Run uv with --frozen on github actions #328 - Remove annotations in a specified range % Conflicts: % cassis/util.py % tests/test_cas.py
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
cassis/cas.py:585
select_all()is typed asList[Annotation], but the view index now contains arbitraryFeatureStructureinstances (seeadd()adding any FS, and tests that add/select a non-annotation). The current return annotation is therefore incorrect and will mislead type-checkers/users; it also increases the risk that callers assumebegin/endexist. Change the return type toList[FeatureStructure](and adjust the docstring accordingly), or split into separate APIs for annotations vs all feature structures.
def select_all(self) -> List[Annotation]:
"""Finds all feature structures in this Cas
Returns:
A list of all annotations in this Cas
"""
return self._current_view.get_all_annotations()
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
cassis/cas.py:585
select_all()is annotated as returningList[Annotation], but it actually returnsView.get_all_annotations(), which contains anyFeatureStructureadded to the view (including non-annotation FS). This makes the public API typing incorrect and also contradicts the docstring which still says it returns annotations. Update the return type toList[FeatureStructure](and adjust the docstring wording accordingly).
def select_all(self) -> List[Annotation]:
"""Finds all feature structures in this Cas
Returns:
A list of all annotations in this Cas
"""
return self._current_view.get_all_annotations()
- Fix `_compare_fs` antisymmetry: return `1` (not `-1`) when `b` is an annotation and `a` is not, so mixed annotation/non-annotation orderings are stable
- Convert `Cas.select_covering` from a generator into a list-returning method, matching `select_covered`; correct return annotation to `List[Annotation]`
- Fix typo in `_compare_fs` comment ("offets" → "offsets")
f81f91e to
58dd8ba
Compare
* main: Issue #349: Upgrade dependencies
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
cassis/cas.py:585
select_all()is documented as returning “all feature structures”, but the signature returnsList[Annotation]and the docstring still says “A list of all annotations”. With non-annotation FS now being indexed/returned (see new test), this should beList[FeatureStructure](or a new dedicated method for annotation-only selection) and the docstring should be updated to reflect the actual behavior.
def select_all(self) -> List[Annotation]:
"""Finds all feature structures in this Cas
Returns:
A list of all annotations in this Cas
"""
return self._current_view.get_all_annotations()
- Split `Cas.select_all()` into honestly-named successors: `select_all_annotations()` (filters via `is_annotation`, safe for `.begin`/`.end` access) and `select_all_fs()` (all indexed feature structures); deprecate `select_all()` as an alias for `select_all_annotations()` to restore the pre-Annotation-class contract where only annotations were ever indexed - Rename view-level index ops to match UIMA's `addFsToIndexes` terminology: introduce `View.add_fs_to_indexes()`, `View.remove_fs_from_indexes()`, and `View.get_all_fs()`; deprecate `add_annotation_to_index`, `remove_annotation_from_index`, and `get_all_annotations` as aliases (the last filtering to annotations only) - Migrate internal callers to the new names: `crop_sofa_string` and `remove_annotations_in_range` use `select_all_annotations()` (fixes the `AttributeError` risk when non-annotation FS were indexed; drops the now-redundant `is_instance_of(...,TYPE_NAME_ANNOTATION)` filter); `_find_all_fs` seeding and `_get_indexed_feature_structures` use `select_all_fs()`; XMI/JSON view-member serialization uses `get_all_fs()` - Migrate test suites (`test_cas.py`, `test_xmi.py`, `test_json.py`, `test_util.py`) off the deprecated APIs; one test that explicitly probes non-annotation FS visibility now uses `select_all_fs()` - Drop the now-unused `TYPE_NAME_ANNOTATION` import in `cas.py`
- Fix equality/hash contract violation on `Annotation`: switch its `@attr.s` decorator to `eq=False, order=False` so generated annotation FS classes inherit `FeatureStructure.__eq__` and `__hash__` consistently, instead of getting an attrs-synthesized `__eq__` (over `type`/`begin`/`end`) paired with an `xmiID`-based `__hash__` - Simplify `Cas.select_all_annotations()` to filter via `isinstance(fs, Annotation)` (the dynamic class hierarchy already mirrors the UIMA annotation supertype chain), dropping the typing.cast and the unused `TYPE_NAME_ANNOTATION` import - Fix `test_add_non_annotation_and_select` to actually exercise non-annotation FS by passing `supertypeName=TYPE_NAME_TOP` to `create_type`, since the default supertype is `uima.tcas.Annotation`
- Validate the type argument in `Cas.remove_annotations_in_range`: raise `TypeError` if the supplied type is not a subtype of `uima.tcas.Annotation`, replacing the previous silent `AttributeError` that would surface deep in the loop on `.begin`/`.end` access for non-annotation FS - Hoist the supertype-chain walk in dynamic FS class generation: `_is_annotation_type(self)` is now computed once per type and reused for both the `begin`/`end` field-skip filter and the `Annotation` vs `FeatureStructure` base-class selection, instead of being re-walked per feature
- Validate annotation types in `Cas.select_covered`, `Cas.select_covering`, and `Cas.remove_annotations_in_range`: extract a shared `_require_annotation_type(type_, operation)` helper that resolves `Type | str` and raises `TypeError` if the type is not a subtype of `uima.tcas.Annotation`, replacing the previous silent `AttributeError` on `.begin`/`.end` access for non-annotation FS - Rename the parameter of `Cas.add` from `annotation` to `fs` (with matching docstring) to reflect that it accepts any feature structure, aligning with `View.add_fs_to_indexes` - Tighten `is_annotation` to a nominal `isinstance(fs, Annotation)` check returning `TypeGuard[Annotation]` (replacing the begin/end duck-typing); enables type-narrowing at call sites and matches the dynamic class hierarchy - Re-export `Annotation` and `is_annotation` from the top-level `cassis` package
What's in the PR
How to test manually
Automatic testing
Documentation