#336 - Introduce Annotation class by reckart · Pull Request #337 · dkpro/dkpro-cassis

reckart · 2025-11-10T21:49:41Z

What's in the PR

Introduced Annotation class with default begin/end 0
Update reference data accordingly
Fixed several issues with sorting/ordering annotations

How to test manually

No specific test procedure

Automatic testing

PR includes unit tests

Documentation

PR updates documentation

Copilot

Pull Request Overview

This PR introduces a concrete Annotation base class with default begin=0 and end=0 attributes to provide a stable typing foundation for annotation instances. This addresses sorting and serialization issues with runtime-generated annotation types.

Introduced Annotation class that all annotation types inherit from
Updated XMI test fixtures to include default begin="0" end="0" attributes for annotation-like types
Refactored sorting logic to use stable key-based comparisons instead of comparator functions

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
cassis/typesystem.py	Added `Annotation` base class and updated type generation to inherit from it for annotation types
cassis/cas.py	Updated type hints to use `Annotation` instead of `FeatureStructure` where appropriate; improved duck-typing in `_sort_func`
cassis/xmi.py	Fixed offset conversion logic with defensive checks and improved comments
cassis/util.py	Replaced comparator-based sorting with stable key functions; refactored hash computation to return deterministic string representations
tests/test_files/xmi/*.xmi	Added `begin="0" end="0"` attributes to annotation instances in test fixtures
tests/test_ducktyping.py	New test file verifying runtime-generated annotations work correctly
tests/test_cas.py	Added tests for non-annotation feature structures and runtime-generated annotations

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot

Pull Request Overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

- Introduced Annotation class with default begin/end 0 - Update reference data accordingly - Fixed several issues with sorting/ordering annotations

* main: Issue #349: Upgrade dependencies #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range #328 - Remove annotations in a specified range Issue #342: Improve cas_to_comparable_text #328 - Remove annotations in a specified range Issue #340: Add annotation predicates Issue #340: Add annotation predicates No issue: Upgrade dependencies No issue: Run uv with --frozen on github actions #328 - Remove annotations in a specified range % Conflicts: % cassis/util.py % tests/test_cas.py

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

cassis/cas.py:585

select_all() is typed as List[Annotation], but the view index now contains arbitrary FeatureStructure instances (see add() adding any FS, and tests that add/select a non-annotation). The current return annotation is therefore incorrect and will mislead type-checkers/users; it also increases the risk that callers assume begin/end exist. Change the return type to List[FeatureStructure] (and adjust the docstring accordingly), or split into separate APIs for annotations vs all feature structures.

    def select_all(self) -> List[Annotation]:
        """Finds all feature structures in this Cas

        Returns:
            A list of all annotations in this Cas

        """
        return self._current_view.get_all_annotations()

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

cassis/cas.py:585

select_all() is annotated as returning List[Annotation], but it actually returns View.get_all_annotations(), which contains any FeatureStructure added to the view (including non-annotation FS). This makes the public API typing incorrect and also contradicts the docstring which still says it returns annotations. Update the return type to List[FeatureStructure] (and adjust the docstring wording accordingly).

    def select_all(self) -> List[Annotation]:
        """Finds all feature structures in this Cas

        Returns:
            A list of all annotations in this Cas

        """
        return self._current_view.get_all_annotations()

- Fix `_compare_fs` antisymmetry: return `1` (not `-1`) when `b` is an annotation and `a` is not, so mixed annotation/non-annotation orderings are stable - Convert `Cas.select_covering` from a generator into a list-returning method, matching `select_covered`; correct return annotation to `List[Annotation]` - Fix typo in `_compare_fs` comment ("offets" → "offsets")

* main: Issue #349: Upgrade dependencies

Copilot

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

cassis/cas.py:585

select_all() is documented as returning “all feature structures”, but the signature returns List[Annotation] and the docstring still says “A list of all annotations”. With non-annotation FS now being indexed/returned (see new test), this should be List[FeatureStructure] (or a new dedicated method for annotation-only selection) and the docstring should be updated to reflect the actual behavior.

    def select_all(self) -> List[Annotation]:
        """Finds all feature structures in this Cas

        Returns:
            A list of all annotations in this Cas

        """
        return self._current_view.get_all_annotations()

- Split `Cas.select_all()` into honestly-named successors: `select_all_annotations()` (filters via `is_annotation`, safe for `.begin`/`.end` access) and `select_all_fs()` (all indexed feature structures); deprecate `select_all()` as an alias for `select_all_annotations()` to restore the pre-Annotation-class contract where only annotations were ever indexed - Rename view-level index ops to match UIMA's `addFsToIndexes` terminology: introduce `View.add_fs_to_indexes()`, `View.remove_fs_from_indexes()`, and `View.get_all_fs()`; deprecate `add_annotation_to_index`, `remove_annotation_from_index`, and `get_all_annotations` as aliases (the last filtering to annotations only) - Migrate internal callers to the new names: `crop_sofa_string` and `remove_annotations_in_range` use `select_all_annotations()` (fixes the `AttributeError` risk when non-annotation FS were indexed; drops the now-redundant `is_instance_of(...,TYPE_NAME_ANNOTATION)` filter); `_find_all_fs` seeding and `_get_indexed_feature_structures` use `select_all_fs()`; XMI/JSON view-member serialization uses `get_all_fs()` - Migrate test suites (`test_cas.py`, `test_xmi.py`, `test_json.py`, `test_util.py`) off the deprecated APIs; one test that explicitly probes non-annotation FS visibility now uses `select_all_fs()` - Drop the now-unused `TYPE_NAME_ANNOTATION` import in `cas.py`

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

- Fix equality/hash contract violation on `Annotation`: switch its `@attr.s` decorator to `eq=False, order=False` so generated annotation FS classes inherit `FeatureStructure.__eq__` and `__hash__` consistently, instead of getting an attrs-synthesized `__eq__` (over `type`/`begin`/`end`) paired with an `xmiID`-based `__hash__` - Simplify `Cas.select_all_annotations()` to filter via `isinstance(fs, Annotation)` (the dynamic class hierarchy already mirrors the UIMA annotation supertype chain), dropping the typing.cast and the unused `TYPE_NAME_ANNOTATION` import - Fix `test_add_non_annotation_and_select` to actually exercise non-annotation FS by passing `supertypeName=TYPE_NAME_TOP` to `create_type`, since the default supertype is `uima.tcas.Annotation`

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

- Validate the type argument in `Cas.remove_annotations_in_range`: raise `TypeError` if the supplied type is not a subtype of `uima.tcas.Annotation`, replacing the previous silent `AttributeError` that would surface deep in the loop on `.begin`/`.end` access for non-annotation FS - Hoist the supertype-chain walk in dynamic FS class generation: `_is_annotation_type(self)` is now computed once per type and reused for both the `begin`/`end` field-skip filter and the `Annotation` vs `FeatureStructure` base-class selection, instead of being re-walked per feature

Copilot

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.

- Validate annotation types in `Cas.select_covered`, `Cas.select_covering`, and `Cas.remove_annotations_in_range`: extract a shared `_require_annotation_type(type_, operation)` helper that resolves `Type | str` and raises `TypeError` if the type is not a subtype of `uima.tcas.Annotation`, replacing the previous silent `AttributeError` on `.begin`/`.end` access for non-annotation FS - Rename the parameter of `Cas.add` from `annotation` to `fs` (with matching docstring) to reflect that it accepts any feature structure, aligning with `View.add_fs_to_indexes` - Tighten `is_annotation` to a nominal `isinstance(fs, Annotation)` check returning `TypeGuard[Annotation]` (replacing the begin/end duck-typing); enables type-narrowing at call sites and matches the dynamic class hierarchy - Re-export `Annotation` and `is_annotation` from the top-level `cassis` package

reckart added this to the 0.11.0 milestone Nov 10, 2025

reckart self-assigned this Nov 10, 2025

reckart added the 🆕Enhancement label Nov 10, 2025

reckart requested a review from Copilot November 10, 2025 21:49

Copilot started reviewing on behalf of reckart November 10, 2025 21:50 View session

Copilot finished reviewing on behalf of reckart November 10, 2025 21:51

Copilot AI reviewed Nov 10, 2025

View reviewed changes

reckart force-pushed the feature/336-Introduce-Annotation-class branch 2 times, most recently from 121e10f to 9318f26 Compare November 11, 2025 06:19

reckart requested a review from Copilot November 11, 2025 06:29

Copilot started reviewing on behalf of reckart November 11, 2025 06:30 View session

Copilot finished reviewing on behalf of reckart November 11, 2025 06:33

Copilot AI reviewed Nov 11, 2025

View reviewed changes

Comment thread cassis/util.py Outdated

Comment thread cassis/util.py Outdated

Comment thread cassis/xmi.py Outdated

Comment thread cassis/cas.py Outdated

Comment thread cassis/cas.py Outdated

Comment thread cassis/util.py Outdated

#336 - Introduce Annotation class

e5eb8e0

- Introduced Annotation class with default begin/end 0 - Update reference data accordingly - Fixed several issues with sorting/ordering annotations

reckart force-pushed the feature/336-Introduce-Annotation-class branch from 9318f26 to e5eb8e0 Compare November 11, 2025 07:17

reckart requested a review from Copilot May 6, 2026 11:17

Copilot started reviewing on behalf of reckart May 6, 2026 11:18 View session