diff --git a/kedro-datasets/RELEASE.md b/kedro-datasets/RELEASE.md index 05213a0f1..7798c23bb 100755 --- a/kedro-datasets/RELEASE.md +++ b/kedro-datasets/RELEASE.md @@ -5,12 +5,28 @@ | Type | Description | Location | | ------------------------------------ | ---------------------------------------------------- | -------------------------------------- | -| `opik.OpikEvaluationDataset` | A dataset for managing Opik evaluation datasets. | `kedro_datasets_experimental.opik` | - -## Bug fixes and other changes - -- Refactored shared validation and utility logic from the three Opik experimental datasets (`OpikPromptDataset`, `OpikEvaluationDataset`, `OpikTraceDataset`) into a common `opik._common` module. -- Refactored shared validation and utility logic from the three Langfuse experimental datasets (`LangfusePromptDataset`, `LangfuseEvaluationDataset`, `LangfuseTraceDataset`) into a common `langfuse._common` module. +| `opik.EvaluationDataset` | A dataset for managing Opik evaluation datasets. | `kedro_datasets_experimental.opik` | + +## Breaking changes to experimental datasets +- Renamed dataset classes and shortened `pyproject.toml` extra names for `langfuse`, `opik`, and `langchain` experimental datasets. The redundant package-family prefix has been dropped: + - Classes: + - `langfuse.LangfusePromptDataset` → `langfuse.PromptDataset` + - `langfuse.LangfuseTraceDataset` → `langfuse.TraceDataset` + - `langfuse.LangfuseEvaluationDataset` → `langfuse.EvaluationDataset` + - `opik.OpikPromptDataset` → `opik.PromptDataset` + - `opik.OpikTraceDataset` → `opik.TraceDataset` + - `opik.OpikEvaluationDataset` → `opik.EvaluationDataset` + - `langchain.LangChainPromptDataset` → `langchain.PromptDataset` + - Extras: + - `langfuse-langfusepromptdataset` → `langfuse-promptdataset` + - `opik-opiktracedataset` → `opik-tracedataset` + - `langchain-langchainpromptdataset` → `langchain-promptdataset` + - etc. + +## Bug fixes and other changes + +- Refactored shared validation and utility logic from the three Opik experimental datasets (`PromptDataset`, `EvaluationDataset`, `TraceDataset`) into a common `opik._common` module. +- Refactored shared validation and utility logic from the three Langfuse experimental datasets (`PromptDataset`, `EvaluationDataset`, `TraceDataset`) into a common `langfuse._common` module. - Added `os.PathLike` support for `plotly` datasets. ## Community contributions diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/index.md b/kedro-datasets/docs/api/kedro_datasets_experimental/index.md index b83adb32d..7a71e9e23 100644 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/index.md +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/index.md @@ -8,18 +8,18 @@ Name | Description -----|------------ [chromadb.ChromaDBDataset](chromadb.ChromaDBDataset.md) | ``ChromaDBDataset`` loads and saves data to ChromaDB vector database collections. [databricks.ExternalTableDataset](databricks.ExternalTableDataset.md) | ``ExternalTableDataset`` implementation to access external tables in Databricks. -[langchain.LangChainPromptDataset](langchain.LangChainPromptDataset.md) | ``LangChainPromptDataset`` loads a `langchain` prompt template. -[langfuse.LangfuseEvaluationDataset](langfuse.LangfuseEvaluationDataset.md) | ``LangfuseEvaluationDataset`` manages Langfuse evaluation datasets for LLM experiment workflows, supporting local file syncing and remote dataset versioning.
-[langfuse.LangfusePromptDataset](langfuse.LangfusePromptDataset.md) | ``LangfusePromptDataset`` provides a seamless integration between local prompt files (JSON/YAML) and Langfuse prompt management, supporting version control, labeling, and different synchronization policies. -[langfuse.LangfuseTraceDataset](langfuse.LangfuseTraceDataset.md) | ``LangfuseTraceDataset`` provides Langfuse tracing clients for LLM observability and monitoring. +[langchain.PromptDataset](langchain.PromptDataset.md) | ``PromptDataset`` loads a `langchain` prompt template. +[langfuse.EvaluationDataset](langfuse.EvaluationDataset.md) | ``EvaluationDataset`` manages Langfuse evaluation datasets for LLM experiment workflows, supporting local file syncing and remote dataset versioning. +[langfuse.PromptDataset](langfuse.PromptDataset.md) | ``PromptDataset`` provides a seamless integration between local prompt files (JSON/YAML) and Langfuse prompt management, supporting version control, labeling, and different synchronization policies. +[langfuse.TraceDataset](langfuse.TraceDataset.md) | ``TraceDataset`` provides Langfuse tracing clients for LLM observability and monitoring. [mlrun.MLRunAbstractDataset](mlrun.MLRunAbstractDataset.md) | ``MLRunAbstractDataset`` base class for MLRun datasets, can be used directly for generic artifacts. [mlrun.MLRunModel](mlrun.MLRunModel.md) | ``MLRunModel`` saves and loads ML models via MLRun with framework metadata and configurable file format. [mlrun.MLRunDataframeDataset](mlrun.MLRunDataframeDataset.md) | ``MLRunDataframeDataset`` saves and loads pandas DataFrames as MLRun artifacts. [mlrun.MLRunResult](mlrun.MLRunResult.md) | ``MLRunResult`` logs scalar results and metrics to MLRun with optional nested dict flattening. [netcdf.NetCDFDataset](netcdf.NetCDFDataset.md) | ``NetCDFDataset`` loads/saves data from/to a NetCDF file using an underlying filesystem (e.g.: local, S3, GCS). It uses xarray to handle the NetCDF file. -[opik.OpikEvaluationDataset](opik.OpikEvaluationDataset.md) | ``OpikEvaluationDataset`` manages Opik evaluation datasets for LLM experiment workflows. -[opik.OpikPromptDataset](opik.OpikPromptDataset.md) | ``OpikPromptDataset`` manages prompts with Opik versioning support, returning either raw SDK objects or LangChain templates. -[opik.OpikTraceDataset](opik.OpikTraceDataset.md) | ``OpikTraceDataset`` provides Opik tracing clients for observability and monitoring. +[opik.EvaluationDataset](opik.EvaluationDataset.md) | ``EvaluationDataset`` manages Opik evaluation datasets for LLM experiment workflows. +[opik.PromptDataset](opik.PromptDataset.md) | ``PromptDataset`` manages prompts with Opik versioning support, returning either raw SDK objects or LangChain templates. +[opik.TraceDataset](opik.TraceDataset.md) | ``TraceDataset`` provides Opik tracing clients for observability and monitoring. [optuna.StudyDataset](optuna.StudyDataset.md) | ``StudyDataset`` loads/saves an Optuna study, enabling distributed hyperparameter tuning. [pypdf.PDFDataset](pypdf.PDFDataset.md) | ``PDFDataset`` loads data from PDF files using pypdf to extract text from pages. Read-only dataset. [polars.PolarsDatabaseDataset](polars.PolarsDatabaseDataset.md) | ``PolarsDatabaseDataset`` implementation to access databases as Polars DataFrames. It supports reading from a SQL query and writing to a database table. 
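For downstream users, the rename amounts to an import (or catalog `type`) swap. The sketch below is illustrative only and assumes, per the release notes above, that only the class names change; the `langfuse.PromptDataset` arguments are taken from the minimal example in the package README:

```python
# Migration sketch for the experimental dataset renames (illustrative only).
# Old name, removed by this change:
#   from kedro_datasets_experimental.langchain import LangChainPromptDataset
#   from kedro_datasets_experimental.langfuse import LangfusePromptDataset
# New name; constructor arguments are unchanged:
from kedro_datasets_experimental.langfuse import PromptDataset

dataset = PromptDataset(
    filepath="prompts/intent.json",
    prompt_name="intent-classifier",
    credentials={"public_key": "pk_...", "secret_key": "sk_..."},  # pragma: allowlist secret
)
```

In `catalog.yml` the same swap applies to the `type` path, e.g. `kedro_datasets_experimental.langfuse.LangfusePromptDataset` becomes `kedro_datasets_experimental.langfuse.PromptDataset`, and the shortened extras apply to `pip install` commands.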
diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md deleted file mode 100644 index 70644d38f..000000000 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md +++ /dev/null @@ -1,4 +0,0 @@ -::: kedro_datasets_experimental.langchain.LangChainPromptDataset - options: - members: true - show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikEvaluationDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langchain.PromptDataset.md similarity index 50% rename from kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikEvaluationDataset.md rename to kedro-datasets/docs/api/kedro_datasets_experimental/langchain.PromptDataset.md index 7fd57c76b..f99a7c8e7 100644 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikEvaluationDataset.md +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/langchain.PromptDataset.md @@ -1,4 +1,4 @@ -::: kedro_datasets_experimental.opik.OpikEvaluationDataset +::: kedro_datasets_experimental.langchain.PromptDataset options: members: true show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.EvaluationDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.EvaluationDataset.md new file mode 100644 index 000000000..08cb13dbf --- /dev/null +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.EvaluationDataset.md @@ -0,0 +1,4 @@ +::: kedro_datasets_experimental.langfuse.EvaluationDataset + options: + members: true + show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md deleted file mode 100644 index 8e851cb48..000000000 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md +++ /dev/null @@ -1,4 +0,0 @@ -::: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset - options: - members: true - show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md deleted file mode 100644 index 2aa623609..000000000 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md +++ /dev/null @@ -1,4 +0,0 @@ -::: kedro_datasets_experimental.langfuse.LangfusePromptDataset - options: - members: true - show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfuseTraceDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfuseTraceDataset.md deleted file mode 100644 index c9fbd95b5..000000000 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.LangfuseTraceDataset.md +++ /dev/null @@ -1,4 +0,0 @@ -::: kedro_datasets_experimental.langfuse.LangfuseTraceDataset - options: - members: true - show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikPromptDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.PromptDataset.md similarity index 52% rename from kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikPromptDataset.md rename to kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.PromptDataset.md index 3b2cece86..da305c95e 100644 --- 
a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikPromptDataset.md +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.PromptDataset.md @@ -1,4 +1,4 @@ -::: kedro_datasets_experimental.opik.OpikPromptDataset +::: kedro_datasets_experimental.langfuse.PromptDataset options: members: true show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikTraceDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.TraceDataset.md similarity index 53% rename from kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikTraceDataset.md rename to kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.TraceDataset.md index 635dea36d..e404d2c88 100644 --- a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.OpikTraceDataset.md +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/langfuse.TraceDataset.md @@ -1,4 +1,4 @@ -::: kedro_datasets_experimental.opik.OpikTraceDataset +::: kedro_datasets_experimental.langfuse.TraceDataset options: members: true show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.EvaluationDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/opik.EvaluationDataset.md new file mode 100644 index 000000000..89633245b --- /dev/null +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/opik.EvaluationDataset.md @@ -0,0 +1,4 @@ +::: kedro_datasets_experimental.opik.EvaluationDataset + options: + members: true + show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.PromptDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/opik.PromptDataset.md new file mode 100644 index 000000000..4371a7169 --- /dev/null +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/opik.PromptDataset.md @@ -0,0 +1,4 @@ +::: kedro_datasets_experimental.opik.PromptDataset + options: + members: true + show_source: true diff --git a/kedro-datasets/docs/api/kedro_datasets_experimental/opik.TraceDataset.md b/kedro-datasets/docs/api/kedro_datasets_experimental/opik.TraceDataset.md new file mode 100644 index 000000000..2eeae8129 --- /dev/null +++ b/kedro-datasets/docs/api/kedro_datasets_experimental/opik.TraceDataset.md @@ -0,0 +1,4 @@ +::: kedro_datasets_experimental.opik.TraceDataset + options: + members: true + show_source: true diff --git a/kedro-datasets/kedro_datasets_experimental/langchain/__init__.py b/kedro-datasets/kedro_datasets_experimental/langchain/__init__.py index e6f21303b..37e5851aa 100644 --- a/kedro-datasets/kedro_datasets_experimental/langchain/__init__.py +++ b/kedro-datasets/kedro_datasets_experimental/langchain/__init__.py @@ -4,16 +4,16 @@ import lazy_loader as lazy try: - from .langchain_prompt_dataset import LangChainPromptDataset + from .prompt_dataset import PromptDataset except (ImportError, RuntimeError): # For documentation builds that might fail due to dependency issues # https://github.com/pylint-dev/pylint/issues/4300#issuecomment-1043601901 - LangChainPromptDataset: Any + PromptDataset: Any __getattr__, __dir__, __all__ = lazy.attach( __name__, submod_attrs={ - "langchain_prompt_dataset": ["LangChainPromptDataset"], + "prompt_dataset": ["PromptDataset"], }, ) diff --git a/kedro-datasets/kedro_datasets_experimental/langchain/langchain_prompt_dataset.py b/kedro-datasets/kedro_datasets_experimental/langchain/prompt_dataset.py similarity index 97% rename from kedro-datasets/kedro_datasets_experimental/langchain/langchain_prompt_dataset.py rename to 
kedro-datasets/kedro_datasets_experimental/langchain/prompt_dataset.py index 32d1c98d2..45cfa777d 100644 --- a/kedro-datasets/kedro_datasets_experimental/langchain/langchain_prompt_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/langchain/prompt_dataset.py @@ -18,7 +18,7 @@ from kedro_datasets._typing import JSONPreview -class LangChainPromptDataset(AbstractDataset[Union[PromptTemplate, ChatPromptTemplate], Any]): # noqa UP007 +class PromptDataset(AbstractDataset[Union[PromptTemplate, ChatPromptTemplate], Any]): # noqa UP007 """ A Kedro dataset for loading LangChain prompt templates from text, JSON, or YAML files. @@ -29,7 +29,7 @@ class LangChainPromptDataset(AbstractDataset[Union[PromptTemplate, ChatPromptTem ### Example usage for the [YAML API](https://docs.kedro.org/en/stable/catalog-data/data_catalog_yaml_examples/): ```yaml my_prompt: - type: kedro_datasets_experimental.langchain.LangChainPromptDataset + type: kedro_datasets_experimental.langchain.PromptDataset filepath: data/prompts/my_prompt.json template: PromptTemplate dataset: @@ -47,9 +47,9 @@ class LangChainPromptDataset(AbstractDataset[Union[PromptTemplate, ChatPromptTem ### Example usage for the [Python API](https://docs.kedro.org/en/stable/catalog-data/advanced_data_catalog_usage/): ```python - from kedro_datasets_experimental.langchain import LangChainPromptDataset + from kedro_datasets_experimental.langchain import PromptDataset - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath="data/prompts/my_prompt.json", template="PromptTemplate", dataset={"type": "json.JSONDataset"}, @@ -294,7 +294,7 @@ def _create_chat_prompt_template(self, data: dict | list[tuple[str, str]]) -> Ch return ChatPromptTemplate.from_messages(messages) def save(self, data: Any) -> None: - raise DatasetError("Saving is not supported for LangChainPromptDataset") + raise DatasetError("Saving is not supported for PromptDataset") def _describe(self) -> dict[str, Any]: clean_config = { diff --git a/kedro-datasets/kedro_datasets_experimental/langfuse/README.md b/kedro-datasets/kedro_datasets_experimental/langfuse/README.md index 55cc421da..ea54c6cef 100644 --- a/kedro-datasets/kedro_datasets_experimental/langfuse/README.md +++ b/kedro-datasets/kedro_datasets_experimental/langfuse/README.md @@ -8,20 +8,20 @@ | Dataset | Description | |---------|-------------| -| [LangfusePromptDataset](#langfusepromptdataset) | Prompt management with Langfuse versioning, sync policies, and LangChain integration. | -| [LangfuseTraceDataset](#langfusetracedataset) | Tracing clients and callbacks for LangChain, OpenAI, AutoGen, and direct SDK usage. | -| [LangfuseEvaluationDataset](#langfuseevaluationdataset) | Evaluation dataset management with local/remote sync and upsert semantics. | +| [PromptDataset](#promptdataset) | Prompt management with Langfuse versioning, sync policies, and LangChain integration. | +| [TraceDataset](#tracedataset) | Tracing clients and callbacks for LangChain, OpenAI, AutoGen, and direct SDK usage. | +| [EvaluationDataset](#evaluationdataset) | Evaluation dataset management with local/remote sync and upsert semantics. | -## LangfusePromptDataset +## PromptDataset A Kedro dataset for seamless AI prompt management with Langfuse versioning, synchronization, and team collaboration. Supports both LangChain integration and direct SDK usage with flexible sync policies for development and production workflows.
### Quick Start ```python -from kedro_datasets_experimental.langfuse import LangfusePromptDataset +from kedro_datasets_experimental.langfuse import PromptDataset # Load and use a prompt -dataset = LangfusePromptDataset( +dataset = PromptDataset( filepath="prompts/intent.json", prompt_name="intent-classifier", credentials={ @@ -41,7 +41,7 @@ prompt = dataset.load() #### SDK Mode Only For basic Langfuse integration without LangChain dependencies: ```bash -pip install "kedro-datasets[langfuse-langfusepromptdataset]" +pip install "kedro-datasets[langfuse-promptdataset]" ``` #### Full Installation @@ -130,7 +130,7 @@ Conversational format with role-based messages: Returns raw Langfuse prompt objects for maximum flexibility: ```python -dataset = LangfusePromptDataset( +dataset = PromptDataset( filepath="prompts/intent.json", prompt_name="intent-classifier", credentials={ @@ -159,7 +159,7 @@ compiled_prompt = intent_ds.compile(user_query="Hello world!") Returns ready-to-use `ChatPromptTemplate` objects: ```python -dataset = LangfusePromptDataset(mode="langchain", ...) +dataset = PromptDataset(mode="langchain", ...) # ChatPromptTemplate object template = dataset.load() @@ -176,7 +176,7 @@ formatted = template.format(user_query="Hello world") ```yaml intent_prompt: - type: kedro_datasets_experimental.langfuse.LangfusePromptDataset + type: kedro_datasets_experimental.langfuse.PromptDataset filepath: data/prompts/intent.json prompt_name: "intent-classifier" prompt_type: "chat" @@ -190,7 +190,7 @@ intent_prompt: ##### Remote Sync Policy - Production ```yaml production_prompt: - type: kedro_datasets_experimental.langfuse.LangfusePromptDataset + type: kedro_datasets_experimental.langfuse.PromptDataset filepath: data/prompts/production.json prompt_name: "intent-classifier" prompt_type: "chat" @@ -205,7 +205,7 @@ production_prompt: ```yaml validation_prompt: - type: kedro_datasets_experimental.langfuse.LangfusePromptDataset + type: kedro_datasets_experimental.langfuse.PromptDataset filepath: data/prompts/validation.yaml prompt_name: "intent-classifier" prompt_type: "chat" @@ -220,10 +220,10 @@ validation_prompt: ##### Basic Usage ```python -from kedro_datasets_experimental.langfuse import LangfusePromptDataset +from kedro_datasets_experimental.langfuse import PromptDataset # Minimal configuration -dataset = LangfusePromptDataset( +dataset = PromptDataset( filepath="prompts/intent.json", prompt_name="intent-classifier", credentials={ @@ -236,7 +236,7 @@ dataset = LangfusePromptDataset( ##### Advanced Configuration ```python # Full configuration with custom host -dataset = LangfusePromptDataset( +dataset = PromptDataset( filepath="prompts/support.yaml", prompt_name="customer-support", prompt_type="chat", @@ -270,7 +270,7 @@ langfuse_credentials: ```python # Multi-intent classification system -intent_dataset = LangfusePromptDataset( +intent_dataset = PromptDataset( filepath="prompts/intent.json", prompt_name="intent-classifier", prompt_type="chat", @@ -287,7 +287,7 @@ You can read more about this use case on [kedro-academy](https://github.com/kedr ##### Response Generation ```python # Dynamic response generation -response_dataset = LangfusePromptDataset( +response_dataset = PromptDataset( filepath="prompts/response.yaml", prompt_name="response-generator", prompt_type="chat", @@ -304,7 +304,7 @@ response = template.format( ##### RAG Applications ```python # Retrieval-Augmented Generation -rag_dataset = LangfusePromptDataset( +rag_dataset = PromptDataset( filepath="prompts/rag.json", 
prompt_name="rag-synthesizer", prompt_type="chat", @@ -332,17 +332,15 @@ final_prompt = template.format( dataset.save(prompt_content) # Auto-creates new version # Apply labels for organization -dataset = LangfusePromptDataset( - save_args={"labels": ["v2.1.0", "production", "stable"]} -) +dataset = PromptDataset(save_args={"labels": ["v2.1.0", "production", "stable"]}) ``` ##### Version-Specific Loading ```python # Load specific versions -historical_dataset = LangfusePromptDataset(load_args={"version": 3}) # Load version 3 +historical_dataset = PromptDataset(load_args={"version": 3}) # Load version 3 -labeled_dataset = LangfusePromptDataset( +labeled_dataset = PromptDataset( load_args={"label": "production"} # Load production label ) ``` @@ -469,16 +467,16 @@ DatasetError: Remote sync policy specified but no remote prompt exists --- -## LangfuseTraceDataset +## TraceDataset A Kedro dataset for managing [Langfuse tracing](https://langfuse.com/docs/tracing) clients and callbacks. It provides the appropriate tracing object based on a configurable mode, enabling seamless integration with LangChain, OpenAI, AutoGen, or direct Langfuse SDK usage. Environment variables are automatically configured during initialization. ### Quick Start ```python -from kedro_datasets_experimental.langfuse import LangfuseTraceDataset +from kedro_datasets_experimental.langfuse import TraceDataset -dataset = LangfuseTraceDataset( +dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -498,13 +496,13 @@ response = client.chat.completions.create( ### Installation ```bash -pip install "kedro-datasets[langfuse-langfusetracedataset]" +pip install "kedro-datasets[langfuse-tracedataset]" ``` For AutoGen mode, install with OpenTelemetry dependencies: ```bash -pip install "kedro-datasets[langfuse-langfusetracedataset-autogen]" +pip install "kedro-datasets[langfuse-tracedataset-autogen]" ``` Or install all Langfuse datasets at once: @@ -535,7 +533,7 @@ pip install "kedro-datasets[langfuse]" Returns a raw Langfuse client for manual trace creation: ```python -dataset = LangfuseTraceDataset( +dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -555,7 +553,7 @@ span.end() Returns a `CallbackHandler` to pass into LangChain chains or agents: ```python -dataset = LangfuseTraceDataset( +dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -572,7 +570,7 @@ chain.invoke(input, config={"callbacks": [callback]}) Returns an OpenAI client wrapper that traces all API calls automatically: ```python -dataset = LangfuseTraceDataset( +dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -593,7 +591,7 @@ response = client.chat.completions.create( Returns a configured OpenTelemetry `Tracer` for AutoGen agent conversations. 
Requires an OTLP endpoint in credentials: ```python -dataset = LangfuseTraceDataset( +dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -613,7 +611,7 @@ with tracer.start_as_current_span("response_generation") as span: For self-hosted Langfuse, provide both `host` and `endpoint`: ```python -dataset = LangfuseTraceDataset( +dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -634,7 +632,7 @@ dataset = LangfuseTraceDataset( ```yaml langfuse_trace: - type: kedro_datasets_experimental.langfuse.LangfuseTraceDataset + type: kedro_datasets_experimental.langfuse.TraceDataset credentials: langfuse_credentials mode: openai ``` @@ -643,7 +641,7 @@ langfuse_trace: ```yaml langfuse_trace: - type: kedro_datasets_experimental.langfuse.LangfuseTraceDataset + type: kedro_datasets_experimental.langfuse.TraceDataset credentials: langfuse_credentials mode: langchain ``` @@ -652,7 +650,7 @@ langfuse_trace: ```yaml langfuse_trace: - type: kedro_datasets_experimental.langfuse.LangfuseTraceDataset + type: kedro_datasets_experimental.langfuse.TraceDataset credentials: langfuse_credentials mode: autogen ``` @@ -761,7 +759,7 @@ DatasetError: AutoGen mode requires OpenTelemetry. ##### Solution: ```bash -pip install "kedro-datasets[langfuse-langfusetracedataset-autogen]" +pip install "kedro-datasets[langfuse-tracedataset-autogen]" ``` --- @@ -769,23 +767,23 @@ pip install "kedro-datasets[langfuse-langfusetracedataset-autogen]" #### Save Not Supported ``` -NotImplementedError: LangfuseTraceDataset is read-only +NotImplementedError: TraceDataset is read-only ``` -##### Solution: `LangfuseTraceDataset` is a read-only dataset that provides tracing clients. Traces are logged automatically through the returned client, not via `save()`. +##### Solution: `TraceDataset` is a read-only dataset that provides tracing clients. Traces are logged automatically through the returned client, not via `save()`. --- -## LangfuseEvaluationDataset +## EvaluationDataset A Kedro dataset for managing [Langfuse evaluation datasets](https://langfuse.com/docs/evaluation/experiments/datasets). It connects to a remote Langfuse dataset, optionally backed by a local JSON/YAML file, and returns a `DatasetClient` on `load()` — ready for iterating items or running experiments via `dataset.run_experiment()`. 
### Quick Start ```python -from kedro_datasets_experimental.langfuse import LangfuseEvaluationDataset +from kedro_datasets_experimental.langfuse import EvaluationDataset -dataset = LangfuseEvaluationDataset( +dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={ "public_key": "pk_...", @@ -804,7 +802,7 @@ for item in eval_ds.items: ### Installation ```bash -pip install "kedro-datasets[langfuse-langfuseevaluationdataset]" +pip install "kedro-datasets[langfuse-evaluationdataset]" ``` Or install all Langfuse datasets at once: @@ -908,7 +906,7 @@ Both methods use **upsert** semantics: every item is sent to `Langfuse.create_da ```yaml evaluation_dataset: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval filepath: data/evaluation/intent_items.json sync_policy: local @@ -921,7 +919,7 @@ evaluation_dataset: ```yaml production_eval: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval sync_policy: remote credentials: langfuse_credentials @@ -931,7 +929,7 @@ production_eval: ```yaml eval_snapshot: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval sync_policy: remote version: "2026-01-15T00:00:00Z" @@ -943,9 +941,9 @@ eval_snapshot: ##### Basic Usage ```python -from kedro_datasets_experimental.langfuse import LangfuseEvaluationDataset +from kedro_datasets_experimental.langfuse import EvaluationDataset -dataset = LangfuseEvaluationDataset( +dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={ "public_key": "pk_...", @@ -974,7 +972,7 @@ dataset.save( ##### Versioned Remote Load ```python -dataset = LangfuseEvaluationDataset( +dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={ "public_key": "pk_...", @@ -992,9 +990,9 @@ snapshot = dataset.load() The `DatasetClient` returned by `load()` integrates directly with Langfuse's experiment runner. Langfuse manages the experiment lifecycle — tracing, scoring, and result aggregation. 
```python -from kedro_datasets_experimental.langfuse import LangfuseEvaluationDataset +from kedro_datasets_experimental.langfuse import EvaluationDataset -dataset = LangfuseEvaluationDataset( +dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={ "public_key": "pk_...", diff --git a/kedro-datasets/kedro_datasets_experimental/langfuse/__init__.py b/kedro-datasets/kedro_datasets_experimental/langfuse/__init__.py index 706ea4b73..8d0a29d6e 100644 --- a/kedro-datasets/kedro_datasets_experimental/langfuse/__init__.py +++ b/kedro-datasets/kedro_datasets_experimental/langfuse/__init__.py @@ -5,21 +5,21 @@ import lazy_loader as lazy try: - from .langfuse_evaluation_dataset import LangfuseEvaluationDataset - from .langfuse_prompt_dataset import LangfusePromptDataset - from .langfuse_trace_dataset import LangfuseTraceDataset + from .evaluation_dataset import EvaluationDataset + from .prompt_dataset import PromptDataset + from .trace_dataset import TraceDataset except (ImportError, RuntimeError): # For documentation builds that might fail due to dependency issues # https://github.com/pylint-dev/pylint/issues/4300#issuecomment-1043601901 - LangfuseEvaluationDataset: Any - LangfusePromptDataset: Any - LangfuseTraceDataset: Any + EvaluationDataset: Any + PromptDataset: Any + TraceDataset: Any __getattr__, __dir__, __all__ = lazy.attach( __name__, submod_attrs={ - "langfuse_evaluation_dataset": ["LangfuseEvaluationDataset"], - "langfuse_prompt_dataset": ["LangfusePromptDataset"], - "langfuse_trace_dataset": ["LangfuseTraceDataset"], + "evaluation_dataset": ["EvaluationDataset"], + "prompt_dataset": ["PromptDataset"], + "trace_dataset": ["TraceDataset"], }, ) diff --git a/kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_evaluation_dataset.py b/kedro-datasets/kedro_datasets_experimental/langfuse/evaluation_dataset.py similarity index 97% rename from kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_evaluation_dataset.py rename to kedro-datasets/kedro_datasets_experimental/langfuse/evaluation_dataset.py index 006abf6b4..af558a1c6 100644 --- a/kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_evaluation_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/langfuse/evaluation_dataset.py @@ -29,7 +29,7 @@ VALID_SYNC_POLICIES = {"local", "remote"} -class LangfuseEvaluationDataset(AbstractDataset[list[dict[str, Any]], "DatasetClient"]): +class EvaluationDataset(AbstractDataset[list[dict[str, Any]], "DatasetClient"]): """Kedro dataset for Langfuse evaluation datasets. 
Connects to a Langfuse evaluation dataset and returns a ``DatasetClient`` @@ -96,7 +96,7 @@ class LangfuseEvaluationDataset(AbstractDataset[list[dict[str, Any]], "DatasetCl ```yaml # Local sync policy - local file seeds and syncs to remote evaluation_dataset: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval filepath: data/evaluation/intent_items.json sync_policy: local @@ -106,14 +106,14 @@ class LangfuseEvaluationDataset(AbstractDataset[list[dict[str, Any]], "DatasetCl # Remote sync policy - Langfuse is the source of truth production_eval: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval sync_policy: remote credentials: langfuse_credentials # Pinned to a historical snapshot for reproducibility eval_snapshot: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval sync_policy: remote version: "2026-01-15T00:00:00Z" @@ -123,9 +123,9 @@ class LangfuseEvaluationDataset(AbstractDataset[list[dict[str, Any]], "DatasetCl Using Python API: ```python - from kedro_datasets_experimental.langfuse import LangfuseEvaluationDataset + from kedro_datasets_experimental.langfuse import EvaluationDataset - dataset = LangfuseEvaluationDataset( + dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={ "public_key": "pk_...", @@ -157,7 +157,7 @@ def __init__( # noqa: PLR0913 metadata: dict[str, Any] | None = None, version: str | None = None, ): - """Initialise ``LangfuseEvaluationDataset``. + """Initialise ``EvaluationDataset``. Args: dataset_name: Name of the evaluation dataset in Langfuse. diff --git a/kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_prompt_dataset.py b/kedro-datasets/kedro_datasets_experimental/langfuse/prompt_dataset.py similarity index 97% rename from kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_prompt_dataset.py rename to kedro-datasets/kedro_datasets_experimental/langfuse/prompt_dataset.py index d4da99b04..bd1466646 100755 --- a/kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_prompt_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/langfuse/prompt_dataset.py @@ -48,7 +48,7 @@ def _get_content(data: str | list) -> str: return "\n".join(msg["content"] for msg in data) -class LangfusePromptDataset(AbstractDataset): +class PromptDataset(AbstractDataset): """Kedro dataset for managing prompts with Langfuse versioning and synchronization. 
This dataset provides seamless integration between local prompt files (JSON/YAML) @@ -75,7 +75,7 @@ class LangfusePromptDataset(AbstractDataset): ```yaml # Local sync policy - local files are source of truth intent_prompt: - type: kedro_datasets_experimental.langfuse.LangfusePromptDataset + type: kedro_datasets_experimental.langfuse.PromptDataset filepath: data/prompts/intent.json prompt_name: "intent-classifier" prompt_type: "chat" @@ -89,7 +89,7 @@ class LangfusePromptDataset(AbstractDataset): # Remote sync policy - Langfuse versions are source of truth production_prompt: - type: kedro_datasets_experimental.langfuse.LangfusePromptDataset + type: kedro_datasets_experimental.langfuse.PromptDataset filepath: data/prompts/production.json prompt_name: "intent-classifier" sync_policy: remote @@ -100,10 +100,10 @@ class LangfusePromptDataset(AbstractDataset): Using Python API: ```python - from kedro_datasets_experimental.langfuse import LangfusePromptDataset + from kedro_datasets_experimental.langfuse import PromptDataset # Basic usage (using default Langfuse cloud) - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath="data/prompts/intent.json", prompt_name="intent-classifier", prompt_type="chat", @@ -114,7 +114,7 @@ class LangfusePromptDataset(AbstractDataset): ) # With custom host - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath="data/prompts/intent.json", prompt_name="intent-classifier", prompt_type="chat", @@ -150,7 +150,7 @@ def __init__( # noqa: PLR0913 save_args: dict[str, Any] | None = None, ) -> None: """ - Initialize LangfusePromptDataset for managing prompts with Langfuse versioning. + Initialize PromptDataset for managing prompts with Langfuse versioning. Args: filepath: Local file path for storing prompt. Supports .json, .yaml, .yml extensions. @@ -176,14 +176,14 @@ def __init__( # noqa: PLR0913 Examples: >>> # Local sync policy (default) - local files are source of truth - >>> dataset = LangfusePromptDataset( + >>> dataset = PromptDataset( ... filepath="prompts/intent.json", ... prompt_name="intent-classifier", ... credentials={"public_key": "pk_...", "secret_key": "sk_..."} # pragma: allowlist secret ... ) >>> # Remote sync policy - load specific version from Langfuse - >>> dataset = LangfusePromptDataset( + >>> dataset = PromptDataset( ... filepath="prompts/intent.yaml", ... prompt_name="intent-classifier", ... credentials=creds, @@ -192,7 +192,7 @@ def __init__( # noqa: PLR0913 ... ) >>> # Remote sync policy - load specific label from Langfuse - >>> dataset = LangfusePromptDataset( + >>> dataset = PromptDataset( ... filepath="prompts/production.json", ... prompt_name="intent-classifier", ... credentials=creds, @@ -201,14 +201,14 @@ def __init__( # noqa: PLR0913 ... ) >>> # With custom host - >>> dataset = LangfusePromptDataset( + >>> dataset = PromptDataset( ... filepath="prompts/intent.json", ... prompt_name="intent-classifier", ... credentials={"public_key": "pk_...", "secret_key": "sk_...", "host": "https://custom.langfuse.com"} # pragma: allowlist secret ... ) >>> # Auto-label new versions when saving (works with any sync policy) - >>> dataset = LangfusePromptDataset( + >>> dataset = PromptDataset( ... filepath="prompts/intent.json", ... prompt_name="intent-classifier", ... 
credentials=creds, diff --git a/kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_trace_dataset.py b/kedro-datasets/kedro_datasets_experimental/langfuse/trace_dataset.py similarity index 92% rename from kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_trace_dataset.py rename to kedro-datasets/kedro_datasets_experimental/langfuse/trace_dataset.py index 236bde7bf..b7c5e9d14 100644 --- a/kedro-datasets/kedro_datasets_experimental/langfuse/langfuse_trace_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/langfuse/trace_dataset.py @@ -10,7 +10,7 @@ REQUIRED_LANGFUSE_CREDENTIALS_AUTOGEN = {"endpoint"} -class LangfuseTraceDataset(AbstractDataset): +class TraceDataset(AbstractDataset): """Kedro dataset for managing Langfuse tracing clients and callbacks. This dataset provides appropriate tracing objects based on mode configuration, @@ -31,7 +31,7 @@ class LangfuseTraceDataset(AbstractDataset): ```yaml langfuse_trace: - type: kedro_datasets_experimental.langfuse.LangfuseTraceDataset + type: kedro_datasets_experimental.langfuse.TraceDataset credentials: langfuse_credentials mode: openai ``` @@ -39,10 +39,10 @@ class LangfuseTraceDataset(AbstractDataset): Using Python API: ```python - from kedro_datasets_experimental.langfuse import LangfuseTraceDataset + from kedro_datasets_experimental.langfuse import TraceDataset # Basic usage (using default Langfuse cloud) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -52,7 +52,7 @@ class LangfuseTraceDataset(AbstractDataset): ) # With custom host - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -67,7 +67,7 @@ class LangfuseTraceDataset(AbstractDataset): response = client.chat.completions.create(...) # Automatically traced # AutoGen mode Langfuse cloud - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -78,7 +78,7 @@ class LangfuseTraceDataset(AbstractDataset): tracer = dataset.load() # AutoGen mode self-hosted - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_...", "secret_key": "sk_...", # pragma: allowlist secret @@ -98,7 +98,7 @@ def __init__( mode: Literal["langchain", "openai", "autogen", "sdk"] = "sdk", **trace_kwargs: Any ): - """Initialize LangfuseTraceDataset and configure environment variables. + """Initialize TraceDataset and configure environment variables. Validates credentials and sets up appropriate environment variables for Langfuse tracing integration. Environment variables are set immediately @@ -118,12 +118,12 @@ def __init__( Examples: >>> # Basic SDK mode (using default Langfuse cloud) - >>> dataset = LangfuseTraceDataset( + >>> dataset = TraceDataset( ... credentials={"public_key": "pk_...", "secret_key": "sk_..."} # pragma: allowlist secret ... ) >>> # With custom host - >>> dataset = LangfuseTraceDataset( + >>> dataset = TraceDataset( ... credentials={ ... "public_key": "pk_...", ... "secret_key": "sk_...", # pragma: allowlist secret @@ -132,7 +132,7 @@ def __init__( ... ) >>> # OpenAI mode with API key - >>> dataset = LangfuseTraceDataset( + >>> dataset = TraceDataset( ... credentials={ ... "public_key": "pk_...", ... "secret_key": "sk_...", # pragma: allowlist secret @@ -142,7 +142,7 @@ def __init__( ... 
) >>> # AutoGen mode cloud - >>> dataset = LangfuseTraceDataset( + >>> dataset = TraceDataset( ... credentials={ ... "public_key": "pk_...", ... "secret_key": "sk_...", # pragma: allowlist secret @@ -152,7 +152,7 @@ def __init__( ... ) >>> # AutoGen mode self-hosted - >>> dataset = LangfuseTraceDataset( + >>> dataset = TraceDataset( ... credentials={ ... "public_key": "pk_...", ... "secret_key": "sk_...", # pragma: allowlist secret @@ -304,17 +304,17 @@ def load(self) -> Any: Examples: # LangChain mode - dataset = LangfuseTraceDataset(credentials=creds, mode="langchain") + dataset = TraceDataset(credentials=creds, mode="langchain") callback = dataset.load() chain.invoke(input, config={"callbacks": [callback]}) # OpenAI mode - dataset = LangfuseTraceDataset(credentials=creds, mode="openai") + dataset = TraceDataset(credentials=creds, mode="openai") client = dataset.load() response = client.chat.completions.create(model="gpt-4", messages=[...]) # AutoGen mode - dataset = LangfuseTraceDataset(credentials=creds, mode="autogen") + dataset = TraceDataset(credentials=creds, mode="autogen") tracer = dataset.load() # Returns configured Tracer # Option 1: Automatic tracing (LLM calls traced automatically) @@ -326,7 +326,7 @@ def load(self) -> Any: agent.invoke(context) # Child spans nested under parent # SDK mode - dataset = LangfuseTraceDataset(credentials=creds, mode="sdk") + dataset = TraceDataset(credentials=creds, mode="sdk") langfuse = dataset.load() trace = langfuse.trace(name="my-trace") """ @@ -364,8 +364,8 @@ def save(self, data: Any) -> None: NotImplementedError: Always raised as tracing datasets are read-only. Note: - LangfuseTraceDataset is designed for providing tracing clients, + TraceDataset is designed for providing tracing clients, not for data storage. Use the returned tracing clients to automatically log traces, spans, and generations to Langfuse. """ - raise NotImplementedError("LangfuseTraceDataset is read-only - it provides tracing clients, not data storage") + raise NotImplementedError("TraceDataset is read-only - it provides tracing clients, not data storage") diff --git a/kedro-datasets/kedro_datasets_experimental/opik/README.md b/kedro-datasets/kedro_datasets_experimental/opik/README.md index 63ee8cdc4..c58a14d4c 100644 --- a/kedro-datasets/kedro_datasets_experimental/opik/README.md +++ b/kedro-datasets/kedro_datasets_experimental/opik/README.md @@ -4,16 +4,16 @@ [![Kedro](https://img.shields.io/badge/kedro-compatible-green)](https://kedro.org/) [![Opik](https://img.shields.io/badge/opik-integration-purple)](https://www.comet.com/site/products/opik/) -## OpikPromptDataset +## PromptDataset A Kedro dataset for seamless AI prompt management with Opik versioning, synchronisation, and experiment tracking. Supports both LangChain integration and direct SDK usage with flexible sync policies for development and production workflows. 
### Quick Start ```python -from kedro_datasets_experimental.opik import OpikPromptDataset +from kedro_datasets_experimental.opik import PromptDataset # Load and use a prompt -dataset = OpikPromptDataset( +dataset = PromptDataset( filepath="prompts/customer_support.json", prompt_name="customer_support_v1", credentials={ @@ -31,7 +31,7 @@ prompt = dataset.load() #### SDK Mode Only For basic Opik integration without LangChain dependencies: ```bash -pip install "kedro-datasets[opik-opikpromptdataset]" +pip install "kedro-datasets[opik-promptdataset]" ``` #### Full Installation @@ -116,7 +116,7 @@ Conversational format with role-based messages: Returns raw Opik Prompt objects for maximum flexibility: ```python -dataset = OpikPromptDataset( +dataset = PromptDataset( filepath="prompts/customer.json", prompt_name="customer_support_v1", credentials={ @@ -138,7 +138,7 @@ metadata = prompt_obj.metadata ##### LangChain Mode Returns ready-to-use `ChatPromptTemplate` objects: ```python -dataset = OpikPromptDataset(mode="langchain", ...) +dataset = PromptDataset(mode="langchain", ...) # ChatPromptTemplate object template = dataset.load() @@ -155,7 +155,7 @@ formatted = template.format(question="What is Kedro?") ```yaml dev_prompt: - type: kedro_datasets_experimental.opik.OpikPromptDataset + type: kedro_datasets_experimental.opik.PromptDataset filepath: data/prompts/customer_support.json prompt_name: customer_support_v1 prompt_type: chat @@ -171,7 +171,7 @@ dev_prompt: ##### Remote Sync Policy - Production ```yaml production_prompt: - type: kedro_datasets_experimental.opik.OpikPromptDataset + type: kedro_datasets_experimental.opik.PromptDataset filepath: data/prompts/production.json prompt_name: customer_support_v1 prompt_type: chat @@ -183,7 +183,7 @@ production_prompt: ##### Strict Sync Policy - CI/CD ```yaml validation_prompt: - type: kedro_datasets_experimental.opik.OpikPromptDataset + type: kedro_datasets_experimental.opik.PromptDataset filepath: data/prompts/validation.yaml prompt_name: customer_support_v1 prompt_type: chat @@ -196,10 +196,10 @@ validation_prompt: ##### Basic Usage ```python -from kedro_datasets_experimental.opik import OpikPromptDataset +from kedro_datasets_experimental.opik import PromptDataset # Minimal configuration -dataset = OpikPromptDataset( +dataset = PromptDataset( filepath="prompts/support.json", prompt_name="support_assistant", credentials={ @@ -212,7 +212,7 @@ dataset = OpikPromptDataset( ##### Advanced Configuration ```python # Full configuration with metadata -dataset = OpikPromptDataset( +dataset = PromptDataset( filepath="prompts/assistant.yaml", prompt_name="assistant_v2", prompt_type="chat", @@ -246,7 +246,7 @@ opik_credentials: ##### Customer Support Assistant ```python # Dynamic customer support responses -support_dataset = OpikPromptDataset( +support_dataset = PromptDataset( filepath="prompts/support.json", prompt_name="support_assistant_v2", prompt_type="chat", @@ -264,7 +264,7 @@ response = template.format( ##### Code Generation ```python # Code generation prompts -code_dataset = OpikPromptDataset( +code_dataset = PromptDataset( filepath="prompts/code_gen.yaml", prompt_name="python_generator", prompt_type="text", @@ -281,7 +281,7 @@ code_prompt = prompt_obj.prompt.format( ##### RAG Applications ```python # Retrieval-Augmented Generation -rag_dataset = OpikPromptDataset( +rag_dataset = PromptDataset( filepath="prompts/rag.json", prompt_name="rag_synthesizer", prompt_type="chat", @@ -302,7 +302,7 @@ final_prompt = template.format( #### Metadata Management 
```python # Track prompt versions with metadata -dataset = OpikPromptDataset( +dataset = PromptDataset( save_args={ "metadata": { "version": "2.1.0", @@ -323,7 +323,7 @@ Opik automatically tracks prompt versions as part of your ML experiments: ```python # Prompts are versioned and tracked in Opik datasets # Access via Opik UI to compare prompt performance across experiments -dataset = OpikPromptDataset( +dataset = PromptDataset( filepath="prompts/experiment.json", prompt_name="experiment_prompt_v3", credentials=opik_credentials, @@ -433,17 +433,17 @@ DatasetError: Remote sync policy specified but no remote prompt exists in Opik --- -## OpikEvaluationDataset +## EvaluationDataset A Kedro dataset for managing LLM evaluation datasets with Opik. Supports a local JSON/YAML file as the authoring surface and keeps it in sync with a remote Opik dataset, or delegates entirely to the remote dataset in production. ### Quick Start ```python -from kedro_datasets_experimental.opik import OpikEvaluationDataset +from kedro_datasets_experimental.opik import EvaluationDataset from opik.evaluation import evaluate -dataset = OpikEvaluationDataset( +dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={"api_key": "opik_..."}, # pragma: allowlist secret filepath="data/evaluation/intent_items.json", @@ -462,7 +462,7 @@ evaluate( ### Installation ```bash -pip install "kedro-datasets[opik-opikevaluationdataset]" +pip install "kedro-datasets[opik-evaluationdataset]" ``` #### Requirements @@ -512,7 +512,7 @@ The local file and `save()` data must be a list of dicts: ```yaml evaluation_dataset: - type: kedro_datasets_experimental.opik.OpikEvaluationDataset + type: kedro_datasets_experimental.opik.EvaluationDataset dataset_name: intent-detection-eval filepath: data/evaluation/intent_items.json sync_policy: local @@ -525,7 +525,7 @@ evaluation_dataset: ```yaml production_eval: - type: kedro_datasets_experimental.opik.OpikEvaluationDataset + type: kedro_datasets_experimental.opik.EvaluationDataset dataset_name: intent-detection-eval sync_policy: remote credentials: opik_credentials @@ -625,9 +625,9 @@ DatasetError: Dataset item at index 0 is missing required 'input' key. --- -## Migrating from LangfuseEvaluationDataset to OpikEvaluationDataset +## Migrating from langfuse.EvaluationDataset to opik.EvaluationDataset -`OpikEvaluationDataset` and `LangfuseEvaluationDataset` share the same constructor signature and local file format. Migrating is a catalog swap plus evaluation pipeline node changes. +`opik.EvaluationDataset` and `langfuse.EvaluationDataset` share the same constructor signature and local file format. Migrating is a catalog swap plus evaluation pipeline node changes. > **Item identity behaves differently between platforms.** Langfuse forwards any string `id` to the API for upsert. Opik only forwards `id` values that are valid UUID v7 — all others are stripped and Opik auto-generates a new UUID v7 on every sync, creating a new remote row each time. If your local items use human-readable or non-UUID v7 `id` values, those items will accumulate new remote rows on every sync after migrating. To preserve stable remote identity, update your item `id` fields to valid UUID v7 values before switching to Opik. @@ -635,7 +635,7 @@ DatasetError: Dataset item at index 0 is missing required 'input' key. 
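A one-off re-keying script is enough for that migration. The sketch below is illustrative and not part of this PR: it assumes the third-party `uuid6` package for UUID v7 generation (recent Python versions may also ship `uuid.uuid7()`), and reuses the `data/evaluation/intent_items.json` path from the catalog examples in this README:

```python
# Hypothetical one-off migration: re-key local evaluation items to UUID v7
# so Opik's create_or_update upsert keeps a stable remote identity.
import json
import uuid

from uuid6 import uuid7  # third-party: pip install uuid6

PATH = "data/evaluation/intent_items.json"  # path from the catalog examples above


def needs_new_id(value: object) -> bool:
    """True when `value` is missing or not a valid UUID v7."""
    try:
        return uuid.UUID(str(value)).version != 7
    except ValueError:
        return True


with open(PATH) as f:
    items = json.load(f)

for item in items:
    if needs_new_id(item.get("id")):
        # Opik strips non-UUID-v7 ids and generates a fresh one per sync,
        # which creates a new remote row each time; re-key once instead.
        item["id"] = str(uuid7())

with open(PATH, "w") as f:
    json.dump(items, f, indent=2)
```

Items that already carry a valid UUID v7 `id` are left untouched, so the script is safe to re-run.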
| | Langfuse | Opik | |---|---|---| -| `type` | `kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset` | `kedro_datasets_experimental.opik.OpikEvaluationDataset` | +| `type` | `kedro_datasets_experimental.langfuse.EvaluationDataset` | `kedro_datasets_experimental.opik.EvaluationDataset` | | `credentials` key | `public_key` + `secret_key` | `api_key` | | Optional credential keys | `host` | `workspace`, `host`, `project_name` | | `version` param | ✅ Supported (ISO 8601 snapshot pinning) | ❌ Not available | @@ -645,7 +645,7 @@ DatasetError: Dataset item at index 0 is missing required 'input' key. ```yaml evaluation_dataset: - type: kedro_datasets_experimental.langfuse.LangfuseEvaluationDataset + type: kedro_datasets_experimental.langfuse.EvaluationDataset dataset_name: intent-detection-eval filepath: data/evaluation/intent_items.json sync_policy: local @@ -661,7 +661,7 @@ langfuse_credentials: ```yaml evaluation_dataset: - type: kedro_datasets_experimental.opik.OpikEvaluationDataset + type: kedro_datasets_experimental.opik.EvaluationDataset dataset_name: intent-detection-eval filepath: data/evaluation/intent_items.json sync_policy: local @@ -743,7 +743,7 @@ def my_task(dataset_item: dict) -> dict: ### Known limitations - **`metadata` is local-only**: Opik's `create_dataset()` does not accept a `metadata` argument. The `metadata` param is stored and returned by `_describe()` but is not propagated to the remote dataset (unlike Langfuse, which passes it through). -- **No snapshot versioning**: Opik does not support pinning `load()` to a historical snapshot. The `version` param from `LangfuseEvaluationDataset` has no Opik equivalent. +- **No snapshot versioning**: Opik does not support pinning `load()` to a historical snapshot. The `version` param from `langfuse.EvaluationDataset` has no Opik equivalent. - **UUID v7 `id` values are forwarded; Opik upserts by item ID**: If a local item's `id` is a valid UUID v7, it is passed to Opik's `create_or_update` API, which upserts by item ID — the first sync creates the remote row; subsequent syncs update that same row in-place (content changes replace the row; unchanged content is a no-op). Items without a valid UUID v7 `id` have it stripped before upload; Opik auto-generates a new UUID v7 each sync, so those items create a new remote row on every sync, even when the content is unchanged. 
### Support diff --git a/kedro-datasets/kedro_datasets_experimental/opik/__init__.py b/kedro-datasets/kedro_datasets_experimental/opik/__init__.py index 02881a5a2..5b180167a 100644 --- a/kedro-datasets/kedro_datasets_experimental/opik/__init__.py +++ b/kedro-datasets/kedro_datasets_experimental/opik/__init__.py @@ -5,21 +5,21 @@ import lazy_loader as lazy try: - from .opik_evaluation_dataset import OpikEvaluationDataset - from .opik_prompt_dataset import OpikPromptDataset - from .opik_trace_dataset import OpikTraceDataset + from .evaluation_dataset import EvaluationDataset + from .prompt_dataset import PromptDataset + from .trace_dataset import TraceDataset except (ImportError, RuntimeError): # For documentation builds that might fail due to dependency issues # https://github.com/pylint-dev/pylint/issues/4300#issuecomment-1043601901 - OpikEvaluationDataset: Any - OpikPromptDataset: Any - OpikTraceDataset: Any + EvaluationDataset: Any + PromptDataset: Any + TraceDataset: Any __getattr__, __dir__, __all__ = lazy.attach( __name__, submod_attrs={ - "opik_evaluation_dataset": ["OpikEvaluationDataset"], - "opik_prompt_dataset": ["OpikPromptDataset"], - "opik_trace_dataset": ["OpikTraceDataset"], + "evaluation_dataset": ["EvaluationDataset"], + "prompt_dataset": ["PromptDataset"], + "trace_dataset": ["TraceDataset"], }, ) diff --git a/kedro-datasets/kedro_datasets_experimental/opik/opik_evaluation_dataset.py b/kedro-datasets/kedro_datasets_experimental/opik/evaluation_dataset.py similarity index 98% rename from kedro-datasets/kedro_datasets_experimental/opik/opik_evaluation_dataset.py rename to kedro-datasets/kedro_datasets_experimental/opik/evaluation_dataset.py index b969eb20c..91bfd87c0 100644 --- a/kedro-datasets/kedro_datasets_experimental/opik/opik_evaluation_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/opik/evaluation_dataset.py @@ -31,7 +31,7 @@ REQUIRED_UUID_VERSION = 7 -class OpikEvaluationDataset(AbstractDataset): +class EvaluationDataset(AbstractDataset): """Kedro dataset for Opik evaluation datasets. Connects to an Opik evaluation dataset and returns an ``opik.Dataset`` @@ -118,7 +118,7 @@ class OpikEvaluationDataset(AbstractDataset): ```yaml # Local sync policy — local file seeds and syncs to remote evaluation_dataset: - type: kedro_datasets_experimental.opik.OpikEvaluationDataset + type: kedro_datasets_experimental.opik.EvaluationDataset dataset_name: intent-detection-eval filepath: data/evaluation/intent_items.json sync_policy: local @@ -128,7 +128,7 @@ class OpikEvaluationDataset(AbstractDataset): # Remote sync policy — Opik is the source of truth production_eval: - type: kedro_datasets_experimental.opik.OpikEvaluationDataset + type: kedro_datasets_experimental.opik.EvaluationDataset dataset_name: intent-detection-eval sync_policy: remote credentials: opik_credentials @@ -137,9 +137,9 @@ class OpikEvaluationDataset(AbstractDataset): Using Python API: ```python - from kedro_datasets_experimental.opik import OpikEvaluationDataset + from kedro_datasets_experimental.opik import EvaluationDataset - dataset = OpikEvaluationDataset( + dataset = EvaluationDataset( dataset_name="intent-detection-eval", credentials={"api_key": "..."}, # pragma: allowlist secret filepath="data/evaluation/intent_items.json", @@ -175,7 +175,7 @@ def __init__( sync_policy: Literal["local", "remote"] = "local", metadata: dict[str, Any] | None = None, ): - """Initialise ``OpikEvaluationDataset``. + """Initialise ``EvaluationDataset``. Args: dataset_name: Name of the evaluation dataset in Opik. 
diff --git a/kedro-datasets/kedro_datasets_experimental/opik/opik_prompt_dataset.py b/kedro-datasets/kedro_datasets_experimental/opik/prompt_dataset.py similarity index 98% rename from kedro-datasets/kedro_datasets_experimental/opik/opik_prompt_dataset.py rename to kedro-datasets/kedro_datasets_experimental/opik/prompt_dataset.py index 5e9d6b043..a890381f3 100644 --- a/kedro-datasets/kedro_datasets_experimental/opik/opik_prompt_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/opik/prompt_dataset.py @@ -49,7 +49,7 @@ def _get_content(data: str | list) -> str: return "\n".join(msg["content"] for msg in data) -class OpikPromptDataset(AbstractDataset): +class PromptDataset(AbstractDataset): """Kedro dataset for managing prompts with Opik versioning and synchronisation. This dataset provides seamless integration between local prompt files (JSON/YAML) @@ -76,7 +76,7 @@ class OpikPromptDataset(AbstractDataset): ```yaml # Local sync policy - local files are source of truth customer_prompt: - type: kedro_datasets_experimental.opik.OpikPromptDataset + type: kedro_datasets_experimental.opik.PromptDataset filepath: data/prompts/customer.json prompt_name: customer_support_v1 prompt_type: chat @@ -86,7 +86,7 @@ class OpikPromptDataset(AbstractDataset): # Remote sync policy - Opik versions are source of truth production_prompt: - type: kedro_datasets_experimental.opik.OpikPromptDataset + type: kedro_datasets_experimental.opik.PromptDataset filepath: data/prompts/production.yaml prompt_name: customer_support_v1 sync_policy: remote @@ -96,10 +96,10 @@ class OpikPromptDataset(AbstractDataset): Using Python API: ```python - from kedro_datasets_experimental.opik import OpikPromptDataset + from kedro_datasets_experimental.opik import PromptDataset # Create dataset for chat prompt - dataset = OpikPromptDataset( + dataset = PromptDataset( filepath="data/prompts/customer_support.json", prompt_name="customer_support_v1", prompt_type="chat", @@ -134,7 +134,7 @@ def __init__( # noqa: PLR0913 save_args: dict[str, Any] | None = None, **opik_kwargs: Any ): - """Initialise OpikPromptDataset with local and remote configuration. + """Initialise PromptDataset with local and remote configuration. Args: filepath: Local file path for storing prompt. Supports .json, .yaml, .yml extensions. diff --git a/kedro-datasets/kedro_datasets_experimental/opik/opik_trace_dataset.py b/kedro-datasets/kedro_datasets_experimental/opik/trace_dataset.py similarity index 95% rename from kedro-datasets/kedro_datasets_experimental/opik/opik_trace_dataset.py rename to kedro-datasets/kedro_datasets_experimental/opik/trace_dataset.py index 9488e01eb..dd8964fa2 100644 --- a/kedro-datasets/kedro_datasets_experimental/opik/opik_trace_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/opik/trace_dataset.py @@ -14,7 +14,7 @@ OPTIONAL_OPIK_CREDENTIALS = {"project_name", "url_override"} -class OpikTraceDataset(AbstractDataset): +class TraceDataset(AbstractDataset): """Kedro dataset for managing Opik tracing clients and callbacks. This dataset provides Opik tracing integrations for various AI frameworks or direct SDK usage. 
@@ -33,17 +33,17 @@ class OpikTraceDataset(AbstractDataset): Using catalog YAML configuration: ```yaml opik_trace: - type: kedro_datasets_experimental.opik.OpikTraceDataset + type: kedro_datasets_experimental.opik.TraceDataset credentials: opik_credentials mode: openai ``` Using Python API: ```python - from kedro_datasets_experimental.opik import OpikTraceDataset + from kedro_datasets_experimental.opik import TraceDataset # Example: OpenAI mode (traced completions) - dataset = OpikTraceDataset( + dataset = TraceDataset( credentials={ "api_key": "opik_api_key", # pragma: allowlist secret "workspace": "my-workspace", @@ -65,7 +65,7 @@ class OpikTraceDataset(AbstractDataset): ) # Example: SDK mode (manual tracing via decorator) - dataset = OpikTraceDataset( + dataset = TraceDataset( credentials={ "api_key": "opik_api_key", # pragma: allowlist secret "workspace": "my-workspace", @@ -84,7 +84,7 @@ def multiply(x: int, y: int) -> int: print(multiply(3, 4)) # Example: LangChain mode - dataset = OpikTraceDataset( + dataset = TraceDataset( credentials={ "api_key": "opik_api_key", # pragma: allowlist secret "workspace": "my-workspace", @@ -95,7 +95,7 @@ def multiply(x: int, y: int) -> int: # Use tracer in your LangChain Runnable or chain.run(callbacks=[tracer]) # Example: AutoGen mode Opik cloud - dataset = OpikTraceDataset( + dataset = TraceDataset( credentials={ "api_key": "opik_api_key", # pragma: allowlist secret "workspace": "my-workspace", @@ -116,7 +116,7 @@ def multiply(x: int, y: int) -> int: agent.invoke(context) # Child spans nested under "response_generation" # Example: AutoGen mode self-hosted - dataset = OpikTraceDataset( + dataset = TraceDataset( credentials={ "api_key": "opik_api_key", # pragma: allowlist secret "workspace": "my-workspace", @@ -132,7 +132,7 @@ def multiply(x: int, y: int) -> int: **Notes** - Opik configuration is global within the Python process. - Using multiple `OpikTraceDataset` instances with different projects in the same session + Using multiple `TraceDataset` instances with different projects in the same session may cause all traces to log to the first configured project. - To switch projects, restart the Python process or reload the Opik module. """ @@ -207,7 +207,7 @@ def _validate_openai_client_params(self) -> None: """ if "openai" not in self._credentials: raise DatasetError( - "Missing 'openai' section in OpikTraceDataset credentials. " + "Missing 'openai' section in TraceDataset credentials. " "For OpenAI mode, include an 'openai' block inside your credentials." 
) @@ -305,7 +305,7 @@ def load(self) -> Any: elif self._mode == "autogen": self._cached_client = self._build_autogen_tracer() else: - raise DatasetError(f"Unsupported mode '{self._mode}' for OpikTraceDataset") + raise DatasetError(f"Unsupported mode '{self._mode}' for TraceDataset") return self._cached_client @@ -359,5 +359,5 @@ def _load_langchain_tracer(self) -> Any: return OpikTracer(**self._trace_kwargs) def save(self, data: Any) -> None: - """Saving traces manually is not supported; OpikTraceDataset is read-only.""" - raise NotImplementedError("OpikTraceDataset is read-only.") + """Saving traces manually is not supported; TraceDataset is read-only.""" + raise NotImplementedError("TraceDataset is read-only.") diff --git a/kedro-datasets/kedro_datasets_experimental/tests/langchain/test_langchain_prompt_dataset.py b/kedro-datasets/kedro_datasets_experimental/tests/langchain/test_prompt_dataset.py similarity index 91% rename from kedro-datasets/kedro_datasets_experimental/tests/langchain/test_langchain_prompt_dataset.py rename to kedro-datasets/kedro_datasets_experimental/tests/langchain/test_prompt_dataset.py index 63152495b..9484ac14c 100644 --- a/kedro-datasets/kedro_datasets_experimental/tests/langchain/test_langchain_prompt_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/tests/langchain/test_prompt_dataset.py @@ -13,8 +13,8 @@ # LangChain < 1.0 from langchain.prompts import ChatPromptTemplate, PromptTemplate -from kedro_datasets_experimental.langchain.langchain_prompt_dataset import ( - LangChainPromptDataset, +from kedro_datasets_experimental.langchain.prompt_dataset import ( + PromptDataset, ) @@ -84,10 +84,10 @@ def chat_yaml_prompt_file(tmp_path: Path) -> Path: return prompt_file -class TestLangChainPromptDataset: +class TestPromptDataset: def test_init_with_txt_file(self, txt_prompt_file: Path) -> None: """Test dataset initialization with a .txt file.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "text.TextDataset"} ) @@ -96,7 +96,7 @@ def test_init_with_txt_file(self, txt_prompt_file: Path) -> None: def test_init_with_json_file(self, json_prompt_file: Path) -> None: """Test dataset initialization with a .json file.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(json_prompt_file), dataset={"type": "json.JSONDataset"} ) @@ -106,7 +106,7 @@ def test_init_with_json_file(self, json_prompt_file: Path) -> None: def test_init_with_invalid_template(self, txt_prompt_file: Path) -> None: """Test initialization with invalid template type.""" with pytest.raises(DatasetError, match="Invalid template"): - LangChainPromptDataset( + PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "text.TextDataset"}, template="InvalidTemplate" @@ -114,7 +114,7 @@ def test_init_with_invalid_template(self, txt_prompt_file: Path) -> None: def test_load_text_prompt(self, txt_prompt_file: Path) -> None: """Test loading a simple text prompt.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "kedro_datasets.text.TextDataset"} ) @@ -124,7 +124,7 @@ def test_load_text_prompt(self, txt_prompt_file: Path) -> None: def test_load_json_prompt(self, json_prompt_file: Path) -> None: """Test loading a JSON prompt configuration.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(json_prompt_file), dataset={"type": "kedro_datasets.json.JSONDataset"}) prompt = dataset.load() @@ -133,7 +133,7 @@ def 
test_load_json_prompt(self, json_prompt_file: Path) -> None: def test_load_chat_prompt(self, chat_prompt_file: Path) -> None: """Test loading a chat prompt configuration.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(chat_prompt_file), dataset={"type": "json.JSONDataset"}, template="ChatPromptTemplate" @@ -147,7 +147,7 @@ def test_load_chat_prompt(self, chat_prompt_file: Path) -> None: def test_load_yaml_prompt(self, yaml_prompt_file: Path) -> None: """Test loading a YAML prompt configuration.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(yaml_prompt_file), dataset={"type": "kedro_datasets.yaml.YAMLDataset"} ) @@ -157,7 +157,7 @@ def test_load_yaml_prompt(self, yaml_prompt_file: Path) -> None: def test_load_yaml_chat_prompt(self, chat_yaml_prompt_file: Path) -> None: """Test loading a YAML chat prompt configuration.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(chat_yaml_prompt_file), dataset={"type": "yaml.YAMLDataset"}, template="ChatPromptTemplate" @@ -172,7 +172,7 @@ def test_load_yaml_chat_prompt(self, chat_yaml_prompt_file: Path) -> None: def test_save_not_supported(self, txt_prompt_file: Path) -> None: """Test that save operation raises an error.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "text.TextDataset"} ) @@ -181,7 +181,7 @@ def test_save_not_supported(self, txt_prompt_file: Path) -> None: def test_describe(self, txt_prompt_file: Path) -> None: """Test the _describe method returns correct information.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "text.TextDataset"}) desc = dataset._describe() @@ -191,7 +191,7 @@ def test_describe(self, txt_prompt_file: Path) -> None: def test_exists(self, txt_prompt_file: Path) -> None: """Test the _exists method.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "text.TextDataset"} ) @@ -200,7 +200,7 @@ def test_exists(self, txt_prompt_file: Path) -> None: def test_credentials_propagation(self, json_prompt_file: Path) -> None: """Test that credentials are properly propagated to the underlying dataset.""" credentials = {"key": "value"} - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(json_prompt_file), dataset={"type": "json.JSONDataset"}, credentials=credentials, @@ -216,7 +216,7 @@ def test_chat_prompt_template_with_plain_string(self, tmp_path: Path) -> None: """Test that plain string data raises DatasetError for ChatPromptTemplate.""" prompt_file = tmp_path / "plain_string.json" prompt_file.write_text('"Just a plain string, not a chat prompt."') - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(prompt_file), dataset={"type": "text.TextDataset"}, template="ChatPromptTemplate" @@ -228,13 +228,13 @@ def test_invalid_dataset_type_raises_error(self, txt_prompt_file: Path) -> None: """Test that using an invalid dataset type raises DatasetError.""" invalid_dataset = {"type": "pandas.CSVDataset"} with pytest.raises(DatasetError, match="Unsupported dataset type 'pandas.CSVDataset'"): - LangChainPromptDataset(filepath=str(txt_prompt_file), dataset=invalid_dataset) + PromptDataset(filepath=str(txt_prompt_file), dataset=invalid_dataset) def test_none_dataset_type_raises_error(self, txt_prompt_file: Path) -> None: """Test that passing no dataset type raises DatasetError.""" invalid_dataset = None with 
pytest.raises(DatasetError, match="Underlying dataset type cannot be empty"): - LangChainPromptDataset(filepath=str(txt_prompt_file), dataset=invalid_dataset) + PromptDataset(filepath=str(txt_prompt_file), dataset=invalid_dataset) @pytest.mark.parametrize( "bad_data,error_pattern", @@ -251,7 +251,7 @@ def test_invalid_chat_prompt_data( """Test validation of chat prompt data.""" prompt_file = tmp_path / "bad_chat.json" prompt_file.write_text(str(bad_data)) - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(prompt_file), dataset={"type": "json.JSONDataset"}, template="ChatPromptTemplate" @@ -274,7 +274,7 @@ def test_invalid_yaml_chat_prompt(self, tmp_path: Path, bad_data: any, error_pat prompt_file = tmp_path / "bad_chat.yaml" prompt_file.write_text(yaml.dump(bad_data)) - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(prompt_file), dataset={"type": "yaml.YAMLDataset"}, template="ChatPromptTemplate" @@ -285,7 +285,7 @@ def test_invalid_yaml_chat_prompt(self, tmp_path: Path, bad_data: any, error_pat def test_preview_txt_prompt(self, txt_prompt_file: str): """Preview a plain text prompt returns raw string.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(txt_prompt_file), dataset={"type": "text.TextDataset"} ) @@ -294,7 +294,7 @@ def test_preview_txt_prompt(self, txt_prompt_file: str): def test_preview_json_prompt(self, json_prompt_file: str): """Preview a JSON prompt returns serialized dict.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(json_prompt_file), dataset={"type": "json.JSONDataset"} ) @@ -305,7 +305,7 @@ def test_preview_json_prompt(self, json_prompt_file: str): def test_preview_yaml_prompt(self, yaml_prompt_file: str): """Preview a YAML prompt returns serialized dict.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(yaml_prompt_file), dataset={"type": "yaml.YAMLDataset"} ) @@ -316,7 +316,7 @@ def test_preview_yaml_prompt(self, yaml_prompt_file: str): def test_preview_chat_json_prompt(self, chat_prompt_file: str): """Preview a JSON chat prompt returns serialized messages list.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(chat_prompt_file), dataset={"type": "json.JSONDataset"}, template="ChatPromptTemplate") @@ -328,7 +328,7 @@ def test_preview_chat_json_prompt(self, chat_prompt_file: str): def test_preview_chat_yaml_prompt(self, chat_yaml_prompt_file: str): """Preview a YAML chat prompt returns serialized messages list.""" - dataset = LangChainPromptDataset( + dataset = PromptDataset( filepath=str(chat_yaml_prompt_file), dataset={"type": "yaml.YAMLDataset"}, template="ChatPromptTemplate") diff --git a/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_evaluation_dataset.py b/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_evaluation_dataset.py similarity index 90% rename from kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_evaluation_dataset.py rename to kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_evaluation_dataset.py index 2bdd20c3a..8c098ede1 100644 --- a/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_evaluation_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_evaluation_dataset.py @@ -14,12 +14,12 @@ validate_langfuse_credentials, validate_sync_policy, ) -from kedro_datasets_experimental.langfuse.langfuse_evaluation_dataset import ( +from 
kedro_datasets_experimental.langfuse.evaluation_dataset import ( VALID_SYNC_POLICIES, - LangfuseEvaluationDataset, + EvaluationDataset, ) -MODULE = "kedro_datasets_experimental.langfuse.langfuse_evaluation_dataset" +MODULE = "kedro_datasets_experimental.langfuse.evaluation_dataset" @pytest.fixture @@ -102,8 +102,8 @@ def empty_remote_dataset(): @pytest.fixture def eval_dataset(filepath_json, mock_credentials, mock_langfuse): - """Basic LangfuseEvaluationDataset for testing (local sync).""" - return LangfuseEvaluationDataset( + """Basic EvaluationDataset for testing (local sync).""" + return EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -112,10 +112,10 @@ def eval_dataset(filepath_json, mock_credentials, mock_langfuse): class TestInit: - """Test LangfuseEvaluationDataset initialization.""" + """Test EvaluationDataset initialization.""" def test_init_minimal_params(self, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) @@ -126,7 +126,7 @@ def test_init_minimal_params(self, mock_credentials, mock_langfuse): assert ds._version is None def test_init_all_params(self, filepath_json, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -153,7 +153,7 @@ def test_init_missing_required_credentials( with pytest.raises( DatasetError, match=f"Missing required Langfuse credential: '{missing_key}'" ): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=credentials_dict, ) @@ -167,7 +167,7 @@ def test_init_empty_credentials(self, invalid_value, mock_langfuse): with pytest.raises( DatasetError, match="Langfuse credential 'public_key' cannot be empty" ): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=creds, ) @@ -181,14 +181,14 @@ def test_init_empty_host(self, mock_langfuse): with pytest.raises( DatasetError, match="Langfuse credential 'host' cannot be empty if provided" ): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=creds, ) def test_init_invalid_sync_policy(self, mock_credentials, mock_langfuse): with pytest.raises(DatasetError, match="Invalid sync_policy 'bad'"): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="bad", @@ -198,7 +198,7 @@ def test_init_unsupported_extension(self, tmp_path, mock_credentials, mock_langf bad_file = tmp_path / "items.txt" bad_file.write_text("test") with pytest.raises(DatasetError, match="Unsupported file extension '.txt'"): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=str(bad_file), @@ -209,7 +209,7 @@ def test_init_version_with_local_raises(self, mock_credentials, mock_langfuse): DatasetError, match="'version' parameter can only be used with sync_policy='remote'", ): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="local", @@ -217,7 +217,7 @@ def test_init_version_with_local_raises(self, mock_credentials, mock_langfuse): ) def test_init_version_with_remote_accepted(self, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -227,7 +227,7 @@ def 
test_init_version_with_remote_accepted(self, mock_credentials, mock_langfuse def test_init_invalid_version_format(self, mock_credentials, mock_langfuse): with pytest.raises(DatasetError, match="Invalid version 'not-a-date'"): - LangfuseEvaluationDataset( + EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -244,7 +244,7 @@ def test_load_existing_remote_no_local_file( """Remote exists, no local file → returns remote as-is.""" mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="local", @@ -267,7 +267,7 @@ def test_load_existing_remote_local_file_upserts_all_items( refreshed.items = [remote_item, Mock()] mock_langfuse.get_dataset.side_effect = [remote, refreshed] - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -299,7 +299,7 @@ def test_load_nonexistent_remote_creates_it( created, ] - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="local", @@ -326,7 +326,7 @@ def test_load_no_local_no_remote_creates_empty( empty_ds, ] - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=(tmp_path / "nonexistent.json").as_posix(), @@ -342,7 +342,7 @@ def test_load_validates_before_remote_creation( filepath = tmp_path / "bad_items.json" filepath.write_text(json.dumps([{"bad": "no input key"}])) - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath.as_posix(), @@ -360,7 +360,7 @@ def test_load_idempotent_reload( """Each load upserts all local items to remote.""" mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -381,7 +381,7 @@ def test_load_returns_dataset_client( ): mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -396,7 +396,7 @@ def test_load_no_local_file_interaction( """Remote policy never reads or writes the local file.""" mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -413,7 +413,7 @@ def test_load_without_version( ): mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -428,7 +428,7 @@ def test_load_with_version( """Versioned load passes the parsed datetime to get_dataset.""" mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -449,7 +449,7 @@ def test_save_uploads_new_items( ): mock_langfuse.get_dataset.return_value = empty_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -467,7 +467,7 @@ def test_save_upserts_existing_ids( ): 
mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -486,7 +486,7 @@ def test_save_updates_existing_item_content( """Saving an item with an existing id but different input upserts it.""" mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -504,7 +504,7 @@ def test_save_items_without_id_always_uploaded( ): mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -523,7 +523,7 @@ def test_save_local_mode_merges_to_file( ): mock_langfuse.get_dataset.return_value = empty_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -546,7 +546,7 @@ def test_save_remote_mode_skips_file( original = Path(filepath_json).read_text() - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -561,7 +561,7 @@ def test_save_empty_list_no_op( ): mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -574,7 +574,7 @@ def test_save_missing_input_raises( ): mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -586,7 +586,7 @@ def test_save_validates_before_remote_creation( self, mock_credentials, mock_langfuse ): """Validation fails before the remote dataset is created.""" - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, sync_policy="remote", @@ -604,7 +604,7 @@ class TestPreview: def test_preview_returns_json_preview_with_data( self, filepath_json, mock_credentials, mock_langfuse ): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -617,7 +617,7 @@ def test_preview_returns_json_preview_with_data( assert parsed[0]["id"] == "q1" def test_preview_no_filepath(self, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) @@ -625,7 +625,7 @@ def test_preview_no_filepath(self, mock_credentials, mock_langfuse): assert "No filepath configured" in str(preview) def test_preview_missing_file(self, tmp_path, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=(tmp_path / "missing.json").as_posix(), @@ -640,7 +640,7 @@ class TestDescribe: def test_describe_returns_expected_keys( self, filepath_json, mock_credentials, mock_langfuse ): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -658,7 +658,7 @@ def test_describe_returns_expected_keys( def test_describe_no_credentials_in_output( self, mock_credentials, mock_langfuse ): - ds = LangfuseEvaluationDataset( + ds = 
EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) @@ -673,7 +673,7 @@ class TestExists: def test_exists_true(self, mock_credentials, mock_langfuse, mock_remote_dataset): mock_langfuse.get_dataset.return_value = mock_remote_dataset - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) @@ -685,7 +685,7 @@ def test_exists_false(self, mock_credentials, mock_langfuse): mock_langfuse.get_dataset.side_effect = LangfuseNotFoundError( body=not_found_body ) - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) @@ -693,7 +693,7 @@ def test_exists_false(self, mock_credentials, mock_langfuse): def test_exists_api_error_raises(self, mock_credentials, mock_langfuse): mock_langfuse.get_dataset.side_effect = LangfuseApiError(body="Server error") - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) @@ -745,62 +745,62 @@ def test_validate_filepath_invalid(self, tmp_path): validate_file_extension(str(tmp_path / "items.csv")) def test_validate_items_valid(self): - LangfuseEvaluationDataset._validate_items( + EvaluationDataset._validate_items( [{"input": "a"}, {"input": "b"}] ) def test_validate_items_missing_input(self): with pytest.raises(DatasetError, match="index 1 is missing required 'input'"): - LangfuseEvaluationDataset._validate_items( + EvaluationDataset._validate_items( [{"input": "ok"}, {"expected_output": "bad"}] ) def test_merge_items_no_overlap(self): existing = [{"id": "a", "input": "x"}] new = [{"id": "b", "input": "y"}] - merged = LangfuseEvaluationDataset._merge_items(existing, new) + merged = EvaluationDataset._merge_items(existing, new) assert len(merged) == 2 assert [m["id"] for m in merged] == ["a", "b"] def test_merge_items_new_takes_precedence(self): existing = [{"id": "a", "input": "x"}] new = [{"id": "a", "input": "updated"}] - merged = LangfuseEvaluationDataset._merge_items(existing, new) + merged = EvaluationDataset._merge_items(existing, new) assert len(merged) == 1 assert merged[0]["input"] == "updated" def test_merge_items_without_id_appended(self): existing = [{"id": "a", "input": "x"}] new = [{"input": "no-id"}] - merged = LangfuseEvaluationDataset._merge_items(existing, new) + merged = EvaluationDataset._merge_items(existing, new) assert len(merged) == 2 def test_merge_items_empty_existing(self): - merged = LangfuseEvaluationDataset._merge_items( + merged = EvaluationDataset._merge_items( [], [{"id": "a", "input": "x"}] ) assert len(merged) == 1 def test_merge_items_empty_new(self): existing = [{"id": "a", "input": "x"}] - merged = LangfuseEvaluationDataset._merge_items(existing, []) + merged = EvaluationDataset._merge_items(existing, []) assert merged == existing def test_parse_version_none(self): - assert LangfuseEvaluationDataset._parse_version(None) is None + assert EvaluationDataset._parse_version(None) is None def test_parse_version_valid(self): - result = LangfuseEvaluationDataset._parse_version("2026-01-15T00:00:00Z") + result = EvaluationDataset._parse_version("2026-01-15T00:00:00Z") assert result is not None assert result.year == 2026 def test_parse_version_naive_gets_utc(self): - result = LangfuseEvaluationDataset._parse_version("2026-01-15T00:00:00") + result = EvaluationDataset._parse_version("2026-01-15T00:00:00") assert result.tzinfo == timezone.utc def test_parse_version_invalid(self): with pytest.raises(DatasetError, match="Invalid version"): - 
LangfuseEvaluationDataset._parse_version("bad") + EvaluationDataset._parse_version("bad") @pytest.mark.parametrize( "filepath_fixture,expected_class", @@ -813,7 +813,7 @@ def test_file_dataset_property( self, request, mock_credentials, mock_langfuse, filepath_fixture, expected_class ): filepath = request.getfixturevalue(filepath_fixture) - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath, @@ -821,7 +821,7 @@ def test_file_dataset_property( assert ds.file_dataset.__class__.__name__ == expected_class def test_file_dataset_caching(self, filepath_json, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, filepath=filepath_json, @@ -831,7 +831,7 @@ def test_file_dataset_caching(self, filepath_json, mock_credentials, mock_langfu assert fd1 is fd2 def test_file_dataset_no_filepath_raises(self, mock_credentials, mock_langfuse): - ds = LangfuseEvaluationDataset( + ds = EvaluationDataset( dataset_name="test-eval", credentials=mock_credentials, ) diff --git a/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_prompt_dataset.py b/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_prompt_dataset.py similarity index 92% rename from kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_prompt_dataset.py rename to kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_prompt_dataset.py index efca45b46..f0aa9ce71 100644 --- a/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_prompt_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_prompt_dataset.py @@ -5,8 +5,8 @@ import yaml from kedro.io import DatasetError -from kedro_datasets_experimental.langfuse.langfuse_prompt_dataset import ( - LangfusePromptDataset, +from kedro_datasets_experimental.langfuse.prompt_dataset import ( + PromptDataset, _get_content, _hash, ) @@ -15,7 +15,7 @@ @pytest.fixture def mock_langfuse(): """Mock Langfuse client for testing.""" - with patch("kedro_datasets_experimental.langfuse.langfuse_prompt_dataset.Langfuse") as mock: + with patch("kedro_datasets_experimental.langfuse.prompt_dataset.Langfuse") as mock: langfuse_instance = Mock() mock.return_value = langfuse_instance yield langfuse_instance @@ -109,8 +109,8 @@ def mock_langfuse_prompt(): @pytest.fixture def langfuse_dataset(filepath_json_chat, mock_credentials, mock_langfuse): - """Basic LangfusePromptDataset for testing.""" - return LangfusePromptDataset( + """Basic PromptDataset for testing.""" + return PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -118,12 +118,12 @@ def langfuse_dataset(filepath_json_chat, mock_credentials, mock_langfuse): ) -class TestLangfusePromptDatasetInit: - """Test LangfusePromptDataset initialization.""" +class TestPromptDatasetInit: + """Test PromptDataset initialization.""" def test_init_minimal_params(self, filepath_json_chat, mock_credentials, mock_langfuse): """Test initialization with minimal required parameters.""" - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials @@ -138,7 +138,7 @@ def test_init_all_params(self, filepath_json_chat, mock_credentials, mock_langfu load_args = {"version": 1} save_args = {"labels": ["test"]} - dataset = LangfusePromptDataset( + dataset = PromptDataset( 
filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -166,7 +166,7 @@ def test_init_all_params(self, filepath_json_chat, mock_credentials, mock_langfu def test_init_missing_required_credentials(self, filepath_json_chat, missing_key, credentials_dict, mock_langfuse): """Test initialization with missing required credentials raises DatasetError.""" with pytest.raises(DatasetError, match=f"Missing required Langfuse credential: '{missing_key}'"): - LangfusePromptDataset( + PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=credentials_dict @@ -181,7 +181,7 @@ def test_init_empty_credentials(self, filepath_json_chat, invalid_value, mock_la invalid_credentials = {"public_key": invalid_value, "secret_key": "sk_test_67890"} # pragma: allowlist secret with pytest.raises(DatasetError, match="Langfuse credential 'public_key' cannot be empty"): - LangfusePromptDataset( + PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=invalid_credentials @@ -196,7 +196,7 @@ def test_init_empty_host(self, filepath_json_chat, mock_langfuse): } with pytest.raises(DatasetError, match="Langfuse credential 'host' cannot be empty if provided"): - LangfusePromptDataset( + PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=invalid_credentials @@ -208,19 +208,19 @@ def test_init_unsupported_file_extension(self, tmp_path, mock_credentials, mock_ unsupported_file.write_text("test prompt") with pytest.raises(DatasetError, match="Unsupported file extension '.txt'"): - LangfusePromptDataset( + PromptDataset( filepath=str(unsupported_file), prompt_name="test-prompt", credentials=mock_credentials ) -class TestLangfusePromptDatasetSave: - """Test LangfusePromptDataset save functionality.""" +class TestPromptDatasetSave: + """Test PromptDataset save functionality.""" def test_save_text_prompt(self, filepath_json_text, mock_credentials, mock_langfuse): """Test saving a text prompt.""" - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_text, prompt_name="test-prompt", credentials=mock_credentials, @@ -238,7 +238,7 @@ def test_save_text_prompt(self, filepath_json_text, mock_credentials, mock_langf def test_save_chat_prompt(self, filepath_json_chat, mock_credentials, mock_langfuse): """Test saving a chat prompt.""" - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -260,7 +260,7 @@ def test_save_chat_prompt(self, filepath_json_chat, mock_credentials, mock_langf def test_save_with_labels(self, filepath_json_chat, mock_credentials, mock_langfuse): """Test saving a prompt with labels.""" save_args = {"labels": ["production", "v2.0"]} - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -278,8 +278,8 @@ def test_save_with_labels(self, filepath_json_chat, mock_credentials, mock_langf ) -class TestLangfusePromptDatasetLoad: - """Test LangfusePromptDataset load functionality.""" +class TestPromptDatasetLoad: + """Test PromptDataset load functionality.""" @pytest.mark.skip(reason="Skipping for now, langfuse not compatible with langchain>=1.0 yet") def test_load_sdk_mode(self, langfuse_dataset, mock_langfuse, mock_langfuse_prompt): @@ -298,7 +298,7 @@ def test_load_with_version(self, filepath_json_chat, mock_credentials, mock_lang """Test load with specific version.""" 
mock_langfuse.get_prompt.return_value = mock_langfuse_prompt - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -315,7 +315,7 @@ def test_load_with_label(self, filepath_json_chat, mock_credentials, mock_langfu """Test load with specific label.""" mock_langfuse.get_prompt.return_value = mock_langfuse_prompt - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -358,14 +358,14 @@ def test_load_invalid_mode(self, langfuse_dataset, mock_langfuse, mock_langfuse_ langfuse_dataset.load() -class TestLangfusePromptDatasetSyncPolicies: +class TestPromptDatasetSyncPolicies: """Test different sync policies.""" def test_sync_local_no_remote(self, filepath_json_chat, mock_credentials, mock_langfuse): """Test local sync policy when no remote prompt exists.""" mock_langfuse.get_prompt.side_effect = Exception("Prompt not found") - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -382,7 +382,7 @@ def test_sync_remote_no_remote_fails(self, filepath_json_chat, mock_credentials, """Test remote sync policy when no remote prompt exists raises DatasetError.""" mock_langfuse.get_prompt.side_effect = Exception("Prompt not found") - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -398,7 +398,7 @@ def test_sync_strict_no_local_fails(self, tmp_path, mock_credentials, mock_langf non_existent_file = (tmp_path / "nonexistent.json").as_posix() mock_langfuse.get_prompt.return_value = mock_langfuse_prompt - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=non_existent_file, prompt_name="test-prompt", credentials=mock_credentials, @@ -413,7 +413,7 @@ def test_load_args_warning_local_policy(self, filepath_json_chat, mock_credentia """Test that load_args produce warning in local sync policy.""" mock_langfuse.get_prompt.return_value = mock_langfuse_prompt - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath_json_chat, prompt_name="test-prompt", credentials=mock_credentials, @@ -422,14 +422,14 @@ def test_load_args_warning_local_policy(self, filepath_json_chat, mock_credentia load_args={"version": 1} ) - with patch("kedro_datasets_experimental.langfuse.langfuse_prompt_dataset.logger") as mock_logger: + with patch("kedro_datasets_experimental.langfuse.prompt_dataset.logger") as mock_logger: # Access _get_build_args to trigger the warning _ = dataset._get_build_args mock_logger.warning.assert_called() warning_message = mock_logger.warning.call_args[0][0] assert "Ignoring load_args" in warning_message -class TestLangfusePromptDatasetFileFormats: +class TestPromptDatasetFileFormats: """Test different file format support.""" @pytest.mark.parametrize( @@ -441,7 +441,7 @@ def test_file_format_loading(self, request, mock_credentials, mock_langfuse, moc filepath = request.getfixturevalue(filepath_fixture) mock_langfuse.get_prompt.return_value = mock_langfuse_prompt - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath, prompt_name="test-prompt", credentials=mock_credentials, @@ -452,7 +452,7 @@ def test_file_format_loading(self, request, mock_credentials, mock_langfuse, moc assert result == mock_langfuse_prompt -class TestLangfusePromptDatasetUtilityMethods: +class 
TestPromptDatasetUtilityMethods: """Test utility methods and functions.""" @pytest.mark.skip(reason="Skipping for now, langfuse not compatible with langchain>=1.0 yet") @@ -472,7 +472,7 @@ def test_describe(self, langfuse_dataset): def test_file_dataset_property(self, request, mock_credentials, mock_langfuse, filepath_fixture, expected_class): """Test file_dataset property returns correct dataset type.""" filepath = request.getfixturevalue(filepath_fixture) - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=filepath, prompt_name="test-prompt", credentials=mock_credentials @@ -523,7 +523,7 @@ def test_preview_existing_file(self, langfuse_dataset): def test_preview_nonexistent_file(self, tmp_path, mock_credentials, mock_langfuse): """Test preview returns error message for nonexistent file.""" nonexistent_file = (tmp_path / "nonexistent.json").as_posix() - dataset = LangfusePromptDataset( + dataset = PromptDataset( filepath=nonexistent_file, prompt_name="test-prompt", credentials=mock_credentials diff --git a/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_trace_dataset.py b/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_trace_dataset.py similarity index 88% rename from kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_trace_dataset.py rename to kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_trace_dataset.py index ac4194d3b..e9c50e63e 100644 --- a/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_langfuse_trace_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/tests/langfuse/test_trace_dataset.py @@ -1,4 +1,4 @@ -"""Unit tests for LangfuseTraceDataset.""" +"""Unit tests for TraceDataset.""" import os from unittest.mock import MagicMock @@ -6,21 +6,21 @@ import pytest from kedro.io import DatasetError -from kedro_datasets_experimental.langfuse import LangfuseTraceDataset +from kedro_datasets_experimental.langfuse import TraceDataset LANGFUSE_AUTOGEN_ENDPOINT = "https://cloud.langfuse.com/api/public/otel/v1/traces" -class TestLangfuseTraceDataset: +class TestTraceDataset: def test_missing_credentials(self): """Test that dataset raises error when credentials are missing.""" with pytest.raises(DatasetError, match="Missing required Langfuse credential"): - LangfuseTraceDataset(credentials={}) + TraceDataset(credentials={}) def test_empty_credentials(self): """Test that dataset raises error when credentials are empty.""" with pytest.raises(DatasetError, match="cannot be empty"): - LangfuseTraceDataset(credentials={"public_key": "", "secret_key": "sk"}) # pragma: allowlist secret + TraceDataset(credentials={"public_key": "", "secret_key": "sk"}) # pragma: allowlist secret def test_langchain_mode(self, mocker): """Test langchain mode returns CallbackHandler.""" @@ -34,7 +34,7 @@ def test_langchain_mode(self, mocker): # Mock the langfuse.langchain module mocker.patch.dict("sys.modules", {"langfuse.langchain": mock_langchain}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="langchain" ) @@ -47,7 +47,7 @@ def test_host_setting(self, mocker): """Test that host is set in environment when provided.""" mocker.patch.dict("os.environ", {}, clear=True) - LangfuseTraceDataset( + TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -72,7 +72,7 @@ def test_sdk_mode(self, mocker): mocker.patch.dict("sys.modules", {"langfuse": 
mock_langfuse_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="sdk" ) @@ -97,7 +97,7 @@ def test_sdk_mode_fallback(self, mocker): mocker.patch.dict("sys.modules", {"langfuse": mock_langfuse_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="sdk" ) @@ -121,7 +121,7 @@ def test_load_caching(self, mocker): mocker.patch.dict("sys.modules", {"langfuse": mock_langfuse_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="sdk" ) @@ -136,12 +136,12 @@ def test_load_caching(self, mocker): def test_save_not_implemented(self): """Test save raises DatasetError (wrapping NotImplementedError).""" - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"} # pragma: allowlist secret ) # Kedro wraps NotImplementedError in DatasetError - with pytest.raises(DatasetError, match="LangfuseTraceDataset is read-only"): + with pytest.raises(DatasetError, match="TraceDataset is read-only"): dataset.save("some_data") def test_openai_mode(self, mocker): @@ -157,7 +157,7 @@ def test_openai_mode(self, mocker): mocker.patch.dict("sys.modules", {"langfuse.openai": mock_openai_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -180,7 +180,7 @@ def test_openai_mode_with_base_url(self, mocker): mocker.patch.dict("sys.modules", {"langfuse.openai": mock_openai_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -199,7 +199,7 @@ def test_openai_mode_empty_base_url_raises(self, mocker): mock_openai_module = MagicMock() mocker.patch.dict("sys.modules", {"langfuse.openai": mock_openai_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -218,7 +218,7 @@ def test_openai_mode_missing_credentials(self, mocker): mock_openai_module = MagicMock() mocker.patch.dict("sys.modules", {"langfuse.openai": mock_openai_module}) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="openai" ) @@ -228,7 +228,7 @@ def test_openai_mode_missing_credentials(self, mocker): def test_describe_method(self): """Test _describe returns correct format.""" - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="langchain" ) @@ -244,11 +244,11 @@ def test_autogen_mode(self, mocker): mock_tracer = MagicMock() mocker.patch( - "kedro_datasets_experimental.langfuse.langfuse_trace_dataset.LangfuseTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.langfuse.trace_dataset.TraceDataset._build_autogen_tracer", return_value=mock_tracer ) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -266,11 +266,11 @@ def test_autogen_mode_caching(self, mocker): mock_tracer = MagicMock() build_tracer_mock = mocker.patch( - 
"kedro_datasets_experimental.langfuse.langfuse_trace_dataset.LangfuseTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.langfuse.trace_dataset.TraceDataset._build_autogen_tracer", return_value=mock_tracer ) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -293,11 +293,11 @@ def test_autogen_mode_sets_environment_variables(self, mocker): # Mock the tracer builder to avoid actual OpenTelemetry imports mocker.patch( - "kedro_datasets_experimental.langfuse.langfuse_trace_dataset.LangfuseTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.langfuse.trace_dataset.TraceDataset._build_autogen_tracer", return_value=MagicMock() ) - LangfuseTraceDataset( + TraceDataset( credentials={ "public_key": "pk_test_autogen", "secret_key": "sk_test_autogen", # pragma: allowlist secret @@ -312,7 +312,7 @@ def test_autogen_mode_sets_environment_variables(self, mocker): def test_autogen_mode_missing_endpoint(self): """Test that autogen mode raises error when endpoint is missing.""" with pytest.raises(DatasetError, match="AutoGen mode requires 'endpoint'"): - LangfuseTraceDataset( + TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="autogen" ) @@ -320,7 +320,7 @@ def test_autogen_mode_missing_endpoint(self): def test_autogen_mode_empty_endpoint(self): """Test that autogen mode raises error when endpoint is empty.""" with pytest.raises(DatasetError, match="AutoGen mode requires 'endpoint'"): - LangfuseTraceDataset( + TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -332,7 +332,7 @@ def test_autogen_mode_empty_endpoint(self): def test_autogen_mode_endpoint_not_required_for_other_modes(self): """Test that endpoint is not required for non-autogen modes.""" # Endpoint is only required for autogen mode - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={"public_key": "pk_test", "secret_key": "sk_test"}, # pragma: allowlist secret mode="sdk" ) @@ -350,11 +350,11 @@ def raise_import_error(): ) mocker.patch( - "kedro_datasets_experimental.langfuse.langfuse_trace_dataset.LangfuseTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.langfuse.trace_dataset.TraceDataset._build_autogen_tracer", side_effect=raise_import_error ) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret @@ -372,11 +372,11 @@ def test_describe_method_autogen_mode(self, mocker): # Mock the tracer builder to avoid actual OpenTelemetry imports mocker.patch( - "kedro_datasets_experimental.langfuse.langfuse_trace_dataset.LangfuseTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.langfuse.trace_dataset.TraceDataset._build_autogen_tracer", return_value=MagicMock() ) - dataset = LangfuseTraceDataset( + dataset = TraceDataset( credentials={ "public_key": "pk_test", "secret_key": "sk_test", # pragma: allowlist secret diff --git a/kedro-datasets/kedro_datasets_experimental/tests/opik/test_opik_evaluation_dataset.py b/kedro-datasets/kedro_datasets_experimental/tests/opik/test_evaluation_dataset.py similarity index 94% rename from kedro-datasets/kedro_datasets_experimental/tests/opik/test_opik_evaluation_dataset.py rename to kedro-datasets/kedro_datasets_experimental/tests/opik/test_evaluation_dataset.py index a2de5639a..8836cc63e 100644 --- 
a/kedro-datasets/kedro_datasets_experimental/tests/opik/test_opik_evaluation_dataset.py +++ b/kedro-datasets/kedro_datasets_experimental/tests/opik/test_evaluation_dataset.py @@ -7,8 +7,8 @@ from kedro.io import DatasetError from opik.rest_api.core.api_error import ApiError -from kedro_datasets_experimental.opik.opik_evaluation_dataset import ( - OpikEvaluationDataset, +from kedro_datasets_experimental.opik.evaluation_dataset import ( + EvaluationDataset, ) @@ -20,7 +20,7 @@ def make_api_error(status_code: int) -> ApiError: @pytest.fixture def mock_opik(): """Mock Opik client instance.""" - with patch("kedro_datasets_experimental.opik.opik_evaluation_dataset.Opik") as mock_class: + with patch("kedro_datasets_experimental.opik.evaluation_dataset.Opik") as mock_class: instance = Mock() mock_class.return_value = instance yield instance @@ -121,9 +121,9 @@ def mock_remote_dataset(): @pytest.fixture def dataset_local(filepath_json, mock_credentials, mock_opik, mock_remote_dataset): - """OpikEvaluationDataset with local sync policy.""" + """EvaluationDataset with local sync policy.""" mock_opik.get_dataset.return_value = mock_remote_dataset - return OpikEvaluationDataset( + return EvaluationDataset( dataset_name="test-dataset", credentials=mock_credentials, filepath=filepath_json, @@ -133,21 +133,21 @@ def dataset_local(filepath_json, mock_credentials, mock_opik, mock_remote_datase @pytest.fixture def dataset_remote(mock_credentials, mock_opik, mock_remote_dataset): - """OpikEvaluationDataset with remote sync policy and no filepath.""" + """EvaluationDataset with remote sync policy and no filepath.""" mock_opik.get_dataset.return_value = mock_remote_dataset - return OpikEvaluationDataset( + return EvaluationDataset( dataset_name="test-dataset", credentials=mock_credentials, sync_policy="remote", ) -class TestOpikEvaluationDatasetInit: - """Test OpikEvaluationDataset initialisation.""" +class TestEvaluationDatasetInit: + """Test EvaluationDataset initialisation.""" def test_init_minimal_params(self, mock_credentials, mock_opik): """Minimal required params store expected defaults.""" - ds = OpikEvaluationDataset( + ds = EvaluationDataset( dataset_name="my-dataset", credentials=mock_credentials, ) @@ -159,7 +159,7 @@ def test_init_minimal_params(self, mock_credentials, mock_opik): def test_init_all_params(self, filepath_json, mock_credentials, mock_opik): """All params are stored correctly.""" meta = {"project": "test"} - ds = OpikEvaluationDataset( + ds = EvaluationDataset( dataset_name="my-dataset", credentials=mock_credentials, filepath=filepath_json, @@ -173,7 +173,7 @@ def test_init_all_params(self, filepath_json, mock_credentials, mock_opik): def test_init_missing_api_key(self, mock_opik): """Missing api_key raises DatasetError.""" with pytest.raises(DatasetError, match="Missing required Opik credential: 'api_key'"): - OpikEvaluationDataset( + EvaluationDataset( dataset_name="ds", credentials={"workspace": "w"}, ) @@ -182,7 +182,7 @@ def test_init_missing_api_key(self, mock_opik): def test_init_empty_api_key(self, mock_opik, empty_value): """Empty api_key raises DatasetError.""" with pytest.raises(DatasetError, match="Opik credential 'api_key' cannot be empty"): - OpikEvaluationDataset( + EvaluationDataset( dataset_name="ds", credentials={"api_key": empty_value}, ) @@ -190,7 +190,7 @@ def test_init_empty_api_key(self, mock_opik, empty_value): def test_init_empty_optional_credential(self, mock_opik): """Empty optional credential (workspace) raises DatasetError.""" with 
pytest.raises(DatasetError, match="Opik credential 'workspace' cannot be empty if provided"):
-            OpikEvaluationDataset(
+            EvaluationDataset(
                 dataset_name="ds",
                 credentials={"api_key": "key", "workspace": ""},  # pragma: allowlist secret
             )
@@ -198,7 +198,7 @@ def test_init_empty_optional_credential(self, mock_opik):
     def test_init_invalid_sync_policy(self, mock_credentials, mock_opik):
         """Invalid sync_policy raises DatasetError."""
         with pytest.raises(DatasetError, match="Invalid sync_policy 'invalid'"):
-            OpikEvaluationDataset(
+            EvaluationDataset(
                 dataset_name="ds",
                 credentials=mock_credentials,
                 sync_policy="invalid",
@@ -209,7 +209,7 @@ def test_init_unsupported_filepath_extension(self, tmp_path, mock_credentials, m
         bad_file = tmp_path / "data.txt"
         bad_file.write_text("content")
         with pytest.raises(DatasetError, match="Unsupported file extension '.txt'"):
-            OpikEvaluationDataset(
+            EvaluationDataset(
                 dataset_name="ds",
                 credentials=mock_credentials,
                 filepath=str(bad_file),
@@ -217,10 +217,10 @@ def test_init_unsupported_filepath_extension(self, tmp_path, mock_credentials, m

     def test_init_client_failure_raises_dataset_error(self, mock_credentials):
         """Opik client construction failure is wrapped in DatasetError."""
-        with patch("kedro_datasets_experimental.opik.opik_evaluation_dataset.Opik") as mock_class:
+        with patch("kedro_datasets_experimental.opik.evaluation_dataset.Opik") as mock_class:
             mock_class.side_effect = Exception("Connection refused")
             with pytest.raises(DatasetError, match="Failed to initialise Opik client"):
-                OpikEvaluationDataset(
+                EvaluationDataset(
                     dataset_name="ds",
                     credentials=mock_credentials,
                 )
@@ -236,7 +236,7 @@ def test_json_returns_json_dataset(self, dataset_local):
     def test_yaml_returns_yaml_dataset(self, filepath_yaml, mock_credentials, mock_opik, mock_remote_dataset):
         """YAML filepath resolves to YAMLDataset."""
         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=filepath_yaml,
@@ -306,17 +306,17 @@ class TestValidateItems:

     def test_valid_items_pass(self, eval_items):
         """Items with 'input' keys pass validation without error."""
-        OpikEvaluationDataset._validate_items(eval_items)  # no exception
+        EvaluationDataset._validate_items(eval_items)  # no exception

     def test_empty_list_passes(self):
         """Empty item list is valid."""
-        OpikEvaluationDataset._validate_items([])
+        EvaluationDataset._validate_items([])

     def test_missing_input_raises_dataset_error(self):
         """Item missing 'input' raises DatasetError with index."""
         items = [{"input": {"q": "ok"}}, {"expected_output": "missing input"}]
         with pytest.raises(DatasetError, match="index 1.*missing required 'input'"):
-            OpikEvaluationDataset._validate_items(items)
+            EvaluationDataset._validate_items(items)


 class TestUploadItems:
@@ -410,7 +410,7 @@ def test_returns_dataset_unchanged_when_no_filepath(self, dataset_remote, mock_r
     def test_returns_dataset_unchanged_when_file_missing(self, tmp_path, mock_credentials, mock_opik, mock_remote_dataset):
         """No-op when local file does not exist."""
         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=str(tmp_path / "nonexistent.json"),
@@ -424,7 +424,7 @@ def test_returns_dataset_unchanged_for_empty_file(self, tmp_path, mock_credentia
         empty_file.write_text("[]")

         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=str(empty_file),
@@ -500,13 +500,13 @@ def test_warns_when_id_missing_or_empty(
         filepath.write_text(json.dumps(items))

         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=str(filepath),
         )

-        with patch("kedro_datasets_experimental.opik.opik_evaluation_dataset.logger") as mock_logger:
+        with patch("kedro_datasets_experimental.opik.evaluation_dataset.logger") as mock_logger:
             with patch.object(ds, "_upload_items"):
                 ds._sync_local_to_remote(mock_remote_dataset)
         warning_messages = [c[0][0] for c in mock_logger.warning.call_args_list]
@@ -520,7 +520,7 @@ def test_new_item_replaces_existing_by_id(self):
         """New item with existing ID replaces the old entry in place."""
         existing = [{"id": "a", "input": {"v": 1}}, {"id": "b", "input": {"v": 2}}]
         new = [{"id": "a", "input": {"v": 99}}]
-        result = OpikEvaluationDataset._merge_items(existing, new)
+        result = EvaluationDataset._merge_items(existing, new)
         assert result[0]["input"]["v"] == 99
         assert len(result) == 2
@@ -528,7 +528,7 @@ def test_new_item_without_id_is_appended(self):
         """New item without ID is always appended, never deduped."""
         existing = [{"id": "a", "input": {"v": 1}}]
         new = [{"input": {"v": 2}}]
-        result = OpikEvaluationDataset._merge_items(existing, new)
+        result = EvaluationDataset._merge_items(existing, new)
         assert len(result) == 2
         assert result[1]["input"]["v"] == 2
@@ -536,27 +536,27 @@ def test_new_item_with_new_id_is_appended(self):
         """New item with a novel ID is appended after existing items."""
         existing = [{"id": "a", "input": {"v": 1}}]
         new = [{"id": "b", "input": {"v": 2}}]
-        result = OpikEvaluationDataset._merge_items(existing, new)
+        result = EvaluationDataset._merge_items(existing, new)
         assert len(result) == 2
         assert result[1]["id"] == "b"

     def test_empty_existing_returns_new(self):
         """Merging into empty list returns a copy of new items."""
         new = [{"id": "a", "input": {"v": 1}}]
-        result = OpikEvaluationDataset._merge_items([], new)
+        result = EvaluationDataset._merge_items([], new)
         assert result == new

     def test_empty_new_returns_existing(self):
         """Merging empty new list returns existing unchanged."""
         existing = [{"id": "a", "input": {"v": 1}}]
-        result = OpikEvaluationDataset._merge_items(existing, [])
+        result = EvaluationDataset._merge_items(existing, [])
         assert result == existing

     def test_order_preserved_with_replacement(self):
         """Replacement keeps the item at its original position."""
         existing = [{"id": "a", "input": {"v": 1}}, {"id": "b", "input": {"v": 2}}]
         new = [{"id": "b", "input": {"v": 99}}]
-        result = OpikEvaluationDataset._merge_items(existing, new)
+        result = EvaluationDataset._merge_items(existing, new)
         assert result[0]["id"] == "a"
         assert result[1]["input"]["v"] == 99
@@ -564,7 +564,7 @@ def test_duplicate_no_id_items_both_appended(self):
         """Two new items without ID are both appended (no dedup possible)."""
         existing = []
         new = [{"input": {"v": 1}}, {"input": {"v": 1}}]
-        result = OpikEvaluationDataset._merge_items(existing, new)
+        result = EvaluationDataset._merge_items(existing, new)
         assert len(result) == 2
@@ -657,7 +657,7 @@ def test_save_local_mode_creates_file_if_missing(self, tmp_path, mock_credential
         missing = tmp_path / "new.json"

         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=str(missing),
@@ -681,7 +681,7 @@ def test_save_remote_mode_does_not_write_local_file(self, dataset_remote, mock_o
     def test_save_remote_mode_logs_warning(self, dataset_remote, mock_opik, mock_remote_dataset, eval_items):
         """Remote mode logs a warning that the local file won't be updated."""
         mock_opik.get_dataset.return_value = mock_remote_dataset
-        with patch("kedro_datasets_experimental.opik.opik_evaluation_dataset.logger") as mock_logger:
+        with patch("kedro_datasets_experimental.opik.evaluation_dataset.logger") as mock_logger:
             dataset_remote.save(eval_items)
         warning_messages = [c[0][0] for c in mock_logger.warning.call_args_list]
         assert any("uploads to remote only" in msg for msg in warning_messages)
@@ -739,7 +739,7 @@ def test_describe_filepath_none_when_not_set(self, dataset_remote):
     def test_describe_metadata_included(self, mock_credentials, mock_opik, mock_remote_dataset):
         """metadata dict is returned in _describe."""
         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             metadata={"project": "evaluation"},
@@ -760,7 +760,7 @@ def test_preview_existing_json_file(self, dataset_local, eval_items):
     def test_preview_nonexistent_file(self, tmp_path, mock_credentials, mock_opik, mock_remote_dataset):
         """Returns a descriptive message when the local file does not exist."""
         mock_opik.get_dataset.return_value = mock_remote_dataset
-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=str(tmp_path / "missing.json"),
@@ -779,7 +779,7 @@ def test_preview_non_serialisable_data_returns_message(
         filepath = tmp_path / "eval.json"
         filepath.write_text(json.dumps([{"input": "x"}]))

-        ds = OpikEvaluationDataset(
+        ds = EvaluationDataset(
             dataset_name="ds",
             credentials=mock_credentials,
             filepath=str(filepath),
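For downstream code the rename above is an import-path change only; the constructor arguments and load/save behaviour are unchanged. A minimal usage sketch, assuming the module path and the credential/constructor keys exercised by the tests in this file (all concrete values are placeholders, not part of this change):

```python
from kedro_datasets_experimental.opik.evaluation_dataset import EvaluationDataset

# Placeholder credentials; the tests above exercise "api_key" and "workspace" keys.
ds = EvaluationDataset(
    dataset_name="ds",
    credentials={"api_key": "key", "workspace": "my-workspace"},
    filepath="data/eval_items.json",  # optional; .json and .yaml resolve to JSON/YAML datasets
)

# Items must carry an "input" key, per the _validate_items tests above.
ds.save([{"input": {"q": "What is Kedro?"}, "expected_output": "A pipeline framework"}])
```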
diff --git a/kedro-datasets/kedro_datasets_experimental/tests/opik/test_opik_prompt_dataset.py b/kedro-datasets/kedro_datasets_experimental/tests/opik/test_prompt_dataset.py
similarity index 92%
rename from kedro-datasets/kedro_datasets_experimental/tests/opik/test_opik_prompt_dataset.py
rename to kedro-datasets/kedro_datasets_experimental/tests/opik/test_prompt_dataset.py
index b8fe98779..f8a35801b 100644
--- a/kedro-datasets/kedro_datasets_experimental/tests/opik/test_opik_prompt_dataset.py
+++ b/kedro-datasets/kedro_datasets_experimental/tests/opik/test_prompt_dataset.py
@@ -5,8 +5,8 @@
 import yaml
 from kedro.io import DatasetError

-from kedro_datasets_experimental.opik.opik_prompt_dataset import (
-    OpikPromptDataset,
+from kedro_datasets_experimental.opik.prompt_dataset import (
+    PromptDataset,
     _get_content,
     _hash,
 )
@@ -15,7 +15,7 @@
 @pytest.fixture
 def mock_opik():
     """Mock Opik client for testing."""
-    with patch("kedro_datasets_experimental.opik.opik_prompt_dataset.Opik") as mock:
+    with patch("kedro_datasets_experimental.opik.prompt_dataset.Opik") as mock:
         opik_instance = Mock()
         mock.return_value = opik_instance
         yield opik_instance
@@ -96,8 +96,8 @@ def mock_opik_dataset():
 @pytest.fixture
 def opik_dataset(filepath_json_chat, mock_credentials, mock_opik, mock_opik_dataset):
-    """Basic OpikPromptDataset for testing."""
-    return OpikPromptDataset(
+    """Basic PromptDataset for testing."""
+    return PromptDataset(
         filepath=filepath_json_chat,
         prompt_name="test-prompt",
         prompt_type="chat",
@@ -106,13 +106,13 @@ def opik_dataset(filepath_json_chat, mock_credentials, mock_opik, mock_opik_data

-class TestOpikPromptDatasetInit:
-    """Test OpikPromptDataset initialisation."""
+class TestPromptDatasetInit:
+    """Test PromptDataset initialisation."""

     def test_init_minimal_params(self, filepath_json_chat, mock_credentials, mock_opik, mock_opik_dataset):
         """Test initialisation with minimal required parameters."""
-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -130,7 +130,7 @@ def test_init_all_params(self, filepath_json_chat, mock_credentials, mock_opik,
         load_args = {"version": 1}  # For future use
         save_args = {"metadata": {"environment": "test"}}

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -151,7 +151,7 @@ def test_init_all_params(self, filepath_json_chat, mock_credentials, mock_opik,
     def test_init_invalid_prompt_type(self, filepath_json_chat, mock_opik):
         """Test initialisation with invalid prompt type raises DatasetError."""
         with pytest.raises(DatasetError, match="Invalid prompt_type 'invalid'"):
-            OpikPromptDataset(
+            PromptDataset(
                 filepath=filepath_json_chat,
                 prompt_name="test-prompt",
                 prompt_type="invalid",
@@ -161,7 +161,7 @@ def test_init_invalid_sync_policy(self, filepath_json_chat, mock_opik):
         """Test initialisation with invalid sync policy raises DatasetError."""
         with pytest.raises(DatasetError, match="Invalid sync_policy 'invalid'"):
-            OpikPromptDataset(
+            PromptDataset(
                 filepath=filepath_json_chat,
                 prompt_name="test-prompt",
                 prompt_type="chat",
@@ -172,7 +172,7 @@ def test_init_invalid_mode(self, filepath_json_chat, mock_opik):
         """Test initialisation with invalid mode raises DatasetError."""
         with pytest.raises(DatasetError, match="Invalid mode 'invalid'"):
-            OpikPromptDataset(
+            PromptDataset(
                 filepath=filepath_json_chat,
                 prompt_name="test-prompt",
                 prompt_type="chat",
@@ -186,7 +186,7 @@ def test_init_unsupported_file_extension(self, tmp_path, mock_opik):
         unsupported_file.write_text("test prompt")

         with pytest.raises(DatasetError, match="Unsupported file extension '.txt'"):
-            OpikPromptDataset(
+            PromptDataset(
                 filepath=str(unsupported_file),
                 prompt_name="test-prompt",
                 prompt_type="text",
@@ -195,11 +195,11 @@ def test_init_opik_client_failure(self, filepath_json_chat, mock_credentials):
         """Test initialisation handles Opik client creation failure."""
-        with patch("kedro_datasets_experimental.opik.opik_prompt_dataset.Opik") as mock_opik_class:
+        with patch("kedro_datasets_experimental.opik.prompt_dataset.Opik") as mock_opik_class:
             mock_opik_class.side_effect = Exception("Connection failed")

             with pytest.raises(DatasetError, match="Failed to initialise Opik client"):
-                OpikPromptDataset(
+                PromptDataset(
                     filepath=filepath_json_chat,
                     prompt_name="test-prompt",
                     prompt_type="chat",
@@ -209,10 +209,10 @@ def test_init_opik_client_failure(self, filepath_json_chat, mock_credentials):
     def test_init_langchain_mode_without_package(self, filepath_json_chat, mock_opik, mock_opik_dataset):
         """Test initialisation with langchain mode when package not installed."""
-        with patch("kedro_datasets_experimental.opik.opik_prompt_dataset.TYPE_CHECKING", False):
+        with patch("kedro_datasets_experimental.opik.prompt_dataset.TYPE_CHECKING", False):
             with patch.dict("sys.modules", {"langchain_core.prompts": None}):
                 with pytest.raises(ImportError, match="'langchain-core' package is required"):
-                    OpikPromptDataset(
+                    PromptDataset(
                         filepath=filepath_json_chat,
                         prompt_name="test-prompt",
                         prompt_type="chat",
@@ -221,13 +221,13 @@
                     )

-class TestOpikPromptDatasetSave:
-    """Test OpikPromptDataset save functionality."""
+class TestPromptDatasetSave:
+    """Test PromptDataset save functionality."""

     def test_save_text_prompt(self, filepath_json_text, mock_credentials, mock_opik, mock_opik_dataset):
         """Test saving a text prompt."""
-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_text,
             prompt_name="test-prompt",
             prompt_type="text",
@@ -246,7 +246,7 @@ def test_save_text_prompt(self, filepath_json_text, mock_credentials, mock_opik,
     def test_save_chat_prompt(self, filepath_json_chat, mock_credentials, mock_opik, mock_opik_dataset):
         """Test saving a chat prompt."""
-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -269,7 +269,7 @@ def test_save_with_metadata(self, filepath_json_chat, mock_credentials, mock_opi
         """Test saving a prompt with additional metadata."""
         save_args = {"metadata": {"environment": "production", "version": "2.0"}}

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -293,7 +293,7 @@ def test_save_invalid_chat_format(self, opik_dataset):
     def test_save_invalid_text_format(self, filepath_json_text, mock_credentials, mock_opik, mock_opik_dataset):
         """Test saving invalid text format raises DatasetError."""
-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_text,
             prompt_name="test-prompt",
             prompt_type="text",
@@ -304,8 +304,8 @@ def test_save_invalid_text_format(self, filepath_json_text, mock_credentials, mo
             dataset.save(["This should be a string"])

-class TestOpikPromptDatasetLoad:
-    """Test OpikPromptDataset load functionality."""
+class TestPromptDatasetLoad:
+    """Test PromptDataset load functionality."""

     def test_load_sdk_mode(self, opik_dataset, mock_opik, mock_opik_prompt):
         """Test successful load in SDK mode."""
@@ -339,7 +339,7 @@ def test_load_langchain_mode_text(self, filepath_json_text, mock_credentials, mo
         mock_text_prompt.prompt = "Answer the question: {question}"
         mock_opik.get_prompt.return_value = mock_text_prompt

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_text,
             prompt_name="test-prompt",
             prompt_type="text",
@@ -361,7 +361,7 @@ def test_load_args_warning_local_policy(self, filepath_json_chat, mock_credentia
         """Test that load_args produce warning in local sync policy."""
         mock_opik.get_prompt.return_value = mock_opik_prompt

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -371,7 +371,7 @@ def test_load_args_warning_local_policy(self, filepath_json_chat, mock_credentia
             load_args={"version": 1}
         )

-        with patch("kedro_datasets_experimental.opik.opik_prompt_dataset.logger") as mock_logger:
+        with patch("kedro_datasets_experimental.opik.prompt_dataset.logger") as mock_logger:
             dataset.load()

             # Check that at least one warning contains "Ignoring load_args"
@@ -380,14 +380,14 @@
             f"Expected 'Ignoring load_args' warning, but got: {warning_calls}"

-class TestOpikPromptDatasetSyncPolicies:
+class TestPromptDatasetSyncPolicies:
     """Test different sync policies."""

     def test_sync_local_no_remote(self, filepath_json_chat, mock_credentials, mock_opik, mock_opik_dataset):
         """Test local sync policy when no remote prompt exists."""
         mock_opik.get_prompt.side_effect = Exception("Prompt not found")

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -405,7 +405,7 @@ def test_sync_remote_no_remote_fails(self, filepath_json_chat, mock_credentials,
         """Test remote sync policy when no remote prompt exists raises DatasetError."""
         mock_opik.get_prompt.return_value = None

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -422,7 +422,7 @@ def test_sync_strict_no_local_fails(self, tmp_path, mock_credentials, mock_opik,
         non_existent_file = (tmp_path / "nonexistent.json").as_posix()
         mock_opik.get_prompt.return_value = mock_opik_prompt

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=non_existent_file,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -445,7 +445,7 @@ def test_sync_strict_mismatch_fails(self, filepath_json_chat, mock_credentials,
         ]
         mock_opik.get_prompt.return_value = mock_remote_prompt

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -468,7 +468,7 @@ def test_sync_remote_updates_local(self, filepath_json_chat, mock_credentials, m
         ]
         mock_opik.get_prompt.return_value = mock_remote_prompt

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -482,7 +482,7 @@ def test_sync_remote_updates_local(self, filepath_json_chat, mock_credentials, m
             mock_save.assert_called_once()

-class TestOpikPromptDatasetFileFormats:
+class TestPromptDatasetFileFormats:
     """Test different file format support."""

     @pytest.mark.parametrize(
@@ -499,7 +499,7 @@ def test_file_format_loading(self, request, mock_credentials, mock_opik, mock_op
         filepath = request.getfixturevalue(filepath_fixture)
         mock_opik.get_prompt.return_value = mock_opik_prompt

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath,
             prompt_name="test-prompt",
             prompt_type=prompt_type,
@@ -511,7 +511,7 @@
         assert result == mock_opik_prompt

-class TestOpikPromptDatasetUtilityMethods:
+class TestPromptDatasetUtilityMethods:
     """Test utility methods and functions."""

     def test_describe(self, opik_dataset):
@@ -534,7 +534,7 @@ def test_file_dataset_property(self, request, mock_credentials, mock_opik, mock_
         """Test file_dataset property returns correct dataset type."""
         filepath = request.getfixturevalue(filepath_fixture)

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -584,7 +584,7 @@ def test_preview_nonexistent_file(self, tmp_path, mock_credentials, mock_opik, m
         """Test preview returns error message for nonexistent file."""
         nonexistent_file = (tmp_path / "nonexistent.json").as_posix()

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=nonexistent_file,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -600,7 +600,7 @@ def test_ensure_dataset_exists_creates_new(self, filepath_json_chat, mock_creden
         mock_new_dataset = Mock()
         mock_opik.create_dataset.return_value = mock_new_dataset

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
@@ -617,7 +617,7 @@ def test_ensure_dataset_exists_uses_existing(self, filepath_json_chat, mock_cred
         """Test that _ensure_dataset_exists uses existing dataset if available."""
         mock_opik.get_dataset.return_value = mock_opik_dataset

-        dataset = OpikPromptDataset(
+        dataset = PromptDataset(
             filepath=filepath_json_chat,
             prompt_name="test-prompt",
             prompt_type="chat",
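The prompt dataset rename follows the same pattern: the module and class lose the `opik_`/`Opik` prefix, nothing else changes. A before/after import sketch, with constructor arguments drawn from the fixtures above (the file path and credential keys are illustrative assumptions):

```python
# Before this change:
#   from kedro_datasets_experimental.opik.opik_prompt_dataset import OpikPromptDataset
from kedro_datasets_experimental.opik.prompt_dataset import PromptDataset

prompt_ds = PromptDataset(
    filepath="prompts/chat_prompt.json",  # placeholder; .json and .yaml are exercised by the tests
    prompt_name="test-prompt",
    prompt_type="chat",  # "text" is the other type tested; anything else raises DatasetError
    credentials={"api_key": "key", "workspace": "my-workspace"},  # assumed keys
)

# Per the load tests above: SDK mode returns the Opik prompt object,
# while langchain mode converts it to a LangChain template.
prompt = prompt_ds.load()
```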
ignored" in caplog.text assert os.getenv("OPIK_PROJECT_NAME") == "existing-proj" -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.track") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.track") def test_load_sdk_client_returns_wrapper(track_mock, configure_mock, base_credentials): """Test that loading SDK mode returns a wrapper client with a track method.""" - dataset = OpikTraceDataset(base_credentials, mode="sdk") + dataset = TraceDataset(base_credentials, mode="sdk") client = dataset.load() assert hasattr(client, "track") assert client.track is track_mock -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") @patch("opik.integrations.openai.track_openai") @patch("openai.OpenAI") def test_load_openai_client(openai_mock, track_openai_mock, configure_mock, base_credentials): @@ -79,13 +79,13 @@ def test_load_openai_client(openai_mock, track_openai_mock, configure_mock, base "openai": {"api_key": "sk-test"}, # pragma: allowlist secret "project_name": "proj-a", } - dataset = OpikTraceDataset(creds, mode="openai") + dataset = TraceDataset(creds, mode="openai") dataset.load() openai_mock.assert_called_once() track_openai_mock.assert_called_once() -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") @patch("opik.integrations.openai.track_openai") @patch("openai.OpenAI") def test_load_openai_client_with_base_url(openai_mock, track_openai_mock, configure_mock, base_credentials): @@ -93,12 +93,12 @@ def test_load_openai_client_with_base_url(openai_mock, track_openai_mock, config creds = base_credentials | { "openai": {"api_key": "sk-test", "base_url": "https://custom.openai.com/v1"}, # pragma: allowlist secret } - dataset = OpikTraceDataset(creds, mode="openai") + dataset = TraceDataset(creds, mode="openai") dataset.load() openai_mock.assert_called_once_with(api_key="sk-test", base_url="https://custom.openai.com/v1") # pragma: allowlist secret -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") def test_openai_empty_base_url_raises(configure_mock, base_credentials, mocker): """Test that empty base_url raises DatasetError.""" mock_openai = MagicMock() @@ -109,12 +109,12 @@ def test_openai_empty_base_url_raises(configure_mock, base_credentials, mocker): }) creds = base_credentials | {"openai": {"api_key": "sk-test", "base_url": " "}} # pragma: allowlist secret - dataset = OpikTraceDataset(creds, mode="openai") + dataset = TraceDataset(creds, mode="openai") with pytest.raises(DatasetError, match="'base_url' cannot be empty if provided"): dataset.load() -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") def test_openai_missing_credentials_raises(configure_mock, base_credentials, mocker): """Test that missing OpenAI API key raises DatasetError.""" # Mock openai and opik.integrations.openai to avoid real imports @@ -126,12 +126,12 @@ def test_openai_missing_credentials_raises(configure_mock, base_credentials, moc }) creds = base_credentials | {"openai": {}} - dataset = OpikTraceDataset(creds, mode="openai") + dataset = TraceDataset(creds, mode="openai") with pytest.raises(DatasetError, 
match="Missing or empty OpenAI API key"): dataset.load() -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") def test_openai_missing_section_raises(configure_mock, base_credentials, mocker): """Test that missing OpenAI section raises DatasetError.""" # Mock openai and opik.integrations.openai to avoid real imports @@ -142,68 +142,68 @@ def test_openai_missing_section_raises(configure_mock, base_credentials, mocker) "opik.integrations.openai": mock_opik_openai, }) - dataset = OpikTraceDataset(base_credentials, mode="openai") - with pytest.raises(DatasetError, match="Missing 'openai' section in OpikTraceDataset credentials."): + dataset = TraceDataset(base_credentials, mode="openai") + with pytest.raises(DatasetError, match="Missing 'openai' section in TraceDataset credentials."): dataset.load() -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") @patch("opik.integrations.langchain.OpikTracer") def test_load_langchain_tracer(opik_tracer_mock, configure_mock, base_credentials): """Test that loading LangChain mode returns an OpikTracer instance.""" - dataset = OpikTraceDataset(base_credentials, mode="langchain") + dataset = TraceDataset(base_credentials, mode="langchain") client = dataset.load() opik_tracer_mock.assert_called_once() assert client == opik_tracer_mock.return_value -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") def test_langchain_import_error_raises(configure_mock, base_credentials, monkeypatch): """Test that ImportError in LangChain integration raises DatasetError.""" monkeypatch.setitem(sys.modules, "opik.integrations.langchain", None) with pytest.raises(DatasetError, match="Opik LangChain integration not available"): - OpikTraceDataset(base_credentials, mode="langchain").load() + TraceDataset(base_credentials, mode="langchain").load() -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.track") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.track") def test_client_is_cached(track_mock, configure_mock, base_credentials): """Test that multiple calls to load() return the cached client instance.""" - dataset = OpikTraceDataset(base_credentials, mode="sdk") + dataset = TraceDataset(base_credentials, mode="sdk") client1 = dataset.load() client2 = dataset.load() assert client1 is client2 -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") def test_describe_masks_credentials(configure_mock, base_credentials): """Test that _describe() masks credential values.""" - dataset = OpikTraceDataset(base_credentials) + dataset = TraceDataset(base_credentials) desc = dataset._describe() assert all(v == "***" for v in desc["credentials"].values()) -@patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") +@patch("kedro_datasets_experimental.opik.trace_dataset.configure") def test_save_not_implemented(configure_mock, base_credentials): """Test that calling save() raises DatasetError because dataset is read-only.""" - dataset = OpikTraceDataset(base_credentials) + dataset = TraceDataset(base_credentials) with pytest.raises(DatasetError): dataset.save("data") # AutoGen mode 
tests -class TestOpikTraceDatasetAutogenMode: - """Tests for AutoGen mode in OpikTraceDataset.""" +class TestTraceDatasetAutogenMode: + """Tests for AutoGen mode in TraceDataset.""" def test_autogen_mode_returns_tracer(self, mocker, autogen_credentials): """Test AutoGen mode returns configured Tracer.""" mock_tracer = MagicMock() mocker.patch( - "kedro_datasets_experimental.opik.opik_trace_dataset.OpikTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.opik.trace_dataset.TraceDataset._build_autogen_tracer", return_value=mock_tracer ) - dataset = OpikTraceDataset(autogen_credentials, mode="autogen") + dataset = TraceDataset(autogen_credentials, mode="autogen") result = dataset.load() assert result == mock_tracer @@ -212,11 +212,11 @@ def test_autogen_mode_caching(self, mocker, autogen_credentials): """Test that AutoGen mode caches the tracer.""" mock_tracer = MagicMock() build_tracer_mock = mocker.patch( - "kedro_datasets_experimental.opik.opik_trace_dataset.OpikTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.opik.trace_dataset.TraceDataset._build_autogen_tracer", return_value=mock_tracer ) - dataset = OpikTraceDataset(autogen_credentials, mode="autogen") + dataset = TraceDataset(autogen_credentials, mode="autogen") # Call load twice result1 = dataset.load() @@ -228,15 +228,15 @@ def test_autogen_mode_caching(self, mocker, autogen_credentials): def test_autogen_mode_skips_opik_configure(self, mocker, autogen_credentials): """Test that AutoGen mode does not call Opik SDK configure.""" - configure_mock = mocker.patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") + configure_mock = mocker.patch("kedro_datasets_experimental.opik.trace_dataset.configure") # Mock the tracer builder to avoid actual OpenTelemetry imports mocker.patch( - "kedro_datasets_experimental.opik.opik_trace_dataset.OpikTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.opik.trace_dataset.TraceDataset._build_autogen_tracer", return_value=MagicMock() ) - OpikTraceDataset(autogen_credentials, mode="autogen") + TraceDataset(autogen_credentials, mode="autogen") # configure should not be called for autogen mode configure_mock.assert_not_called() @@ -244,20 +244,20 @@ def test_autogen_mode_skips_opik_configure(self, mocker, autogen_credentials): def test_autogen_mode_missing_endpoint(self, base_credentials): """Test that autogen mode raises error when endpoint is missing.""" with pytest.raises(DatasetError, match="AutoGen mode requires 'endpoint'"): - OpikTraceDataset(base_credentials, mode="autogen") + TraceDataset(base_credentials, mode="autogen") def test_autogen_mode_empty_endpoint(self, base_credentials): """Test that autogen mode raises error when endpoint is empty.""" creds = base_credentials | {"endpoint": ""} with pytest.raises(DatasetError, match="AutoGen mode requires 'endpoint'"): - OpikTraceDataset(creds, mode="autogen") + TraceDataset(creds, mode="autogen") def test_autogen_mode_endpoint_not_required_for_other_modes(self, mocker, base_credentials): """Test that endpoint is not required for non-autogen modes.""" - mocker.patch("kedro_datasets_experimental.opik.opik_trace_dataset.configure") + mocker.patch("kedro_datasets_experimental.opik.trace_dataset.configure") # Endpoint is only required for autogen mode - dataset = OpikTraceDataset(base_credentials, mode="sdk") + dataset = TraceDataset(base_credentials, mode="sdk") assert dataset._mode == "sdk" def test_autogen_mode_import_error(self, mocker, autogen_credentials): @@ -270,11 +270,11 @@ def 
raise_import_error(): ) mocker.patch( - "kedro_datasets_experimental.opik.opik_trace_dataset.OpikTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.opik.trace_dataset.TraceDataset._build_autogen_tracer", side_effect=raise_import_error ) - dataset = OpikTraceDataset(autogen_credentials, mode="autogen") + dataset = TraceDataset(autogen_credentials, mode="autogen") with pytest.raises(DatasetError, match="AutoGen mode requires OpenTelemetry"): dataset.load() @@ -283,11 +283,11 @@ def test_describe_autogen_mode(self, mocker, autogen_credentials): """Test _describe returns correct format for autogen mode.""" # Mock the tracer builder to avoid actual OpenTelemetry imports mocker.patch( - "kedro_datasets_experimental.opik.opik_trace_dataset.OpikTraceDataset._build_autogen_tracer", + "kedro_datasets_experimental.opik.trace_dataset.TraceDataset._build_autogen_tracer", return_value=MagicMock() ) - dataset = OpikTraceDataset(autogen_credentials, mode="autogen") + dataset = TraceDataset(autogen_credentials, mode="autogen") desc = dataset._describe() assert desc["mode"] == "autogen" assert all(v == "***" for v in desc["credentials"].values()) diff --git a/kedro-datasets/mkdocs.yml b/kedro-datasets/mkdocs.yml index 5f70f0419..1b74d5bfd 100644 --- a/kedro-datasets/mkdocs.yml +++ b/kedro-datasets/mkdocs.yml @@ -173,13 +173,13 @@ plugins: Experimental LLM and AI: - api/kedro_datasets_experimental/chromadb.ChromaDBDataset.md: ChromaDB vector database integration - - api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md: LangChain prompt templates - - api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md: Langfuse evaluation dataset integration - - api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md: Langfuse prompt integration - - api/kedro_datasets_experimental/langfuse.LangfuseTraceDataset.md: Langfuse tracing integration - - api/kedro_datasets_experimental/opik.OpikEvaluationDataset.md: Opik evaluation integration - - api/kedro_datasets_experimental/opik.OpikPromptDataset.md: Opik prompt integration - - api/kedro_datasets_experimental/opik.OpikTraceDataset.md: Opik tracing integration + - api/kedro_datasets_experimental/langchain.PromptDataset.md: LangChain prompt templates + - api/kedro_datasets_experimental/langfuse.EvaluationDataset.md: Langfuse evaluation dataset integration + - api/kedro_datasets_experimental/langfuse.PromptDataset.md: Langfuse prompt integration + - api/kedro_datasets_experimental/langfuse.TraceDataset.md: Langfuse tracing integration + - api/kedro_datasets_experimental/opik.EvaluationDataset.md: Opik evaluation integration + - api/kedro_datasets_experimental/opik.PromptDataset.md: Opik prompt integration + - api/kedro_datasets_experimental/opik.TraceDataset.md: Opik tracing integration Experimental Deep Learning: - api/kedro_datasets_experimental/pytorch.PyTorchDataset.md: PyTorch model storage @@ -343,11 +343,11 @@ nav: - Databricks: - databricks.ExternalTableDataset: api/kedro_datasets_experimental/databricks.ExternalTableDataset.md - Langchain: - - langchain.LangChainPromptDataset: api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md + - langchain.PromptDataset: api/kedro_datasets_experimental/langchain.PromptDataset.md - Langfuse: - - langfuse.LangfuseEvaluationDataset: api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md - - langfuse.LangfusePromptDataset: api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md - - langfuse.LangfuseTraceDataset: 
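The trace dataset keeps its mode-based client selection under the new name. A minimal sketch, assuming the credential keys implied by the tests ("api_key" is an inference from the required-credential check and the `base_credentials` fixture, which is not shown here):

```python
from kedro_datasets_experimental.opik.trace_dataset import TraceDataset

# "workspace" alone fails the required-credential check above, so "api_key" is assumed.
# "project_name" is optional; "endpoint" is required only for mode="autogen".
trace_ds = TraceDataset(
    {"api_key": "key", "workspace": "my-workspace", "project_name": "my-project"},
    mode="sdk",  # other modes exercised above: "openai", "langchain", "autogen"
)

client = trace_ds.load()  # cached: repeated load() calls return the same client
# The SDK-mode wrapper exposes Opik's track decorator (see test_load_sdk_client_returns_wrapper);
# the dataset is read-only, so save() raises DatasetError.
```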
diff --git a/kedro-datasets/mkdocs.yml b/kedro-datasets/mkdocs.yml
index 5f70f0419..1b74d5bfd 100644
--- a/kedro-datasets/mkdocs.yml
+++ b/kedro-datasets/mkdocs.yml
@@ -173,13 +173,13 @@ plugins:
       Experimental LLM and AI:
         - api/kedro_datasets_experimental/chromadb.ChromaDBDataset.md: ChromaDB vector database integration
-        - api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md: LangChain prompt templates
-        - api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md: Langfuse evaluation dataset integration
-        - api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md: Langfuse prompt integration
-        - api/kedro_datasets_experimental/langfuse.LangfuseTraceDataset.md: Langfuse tracing integration
-        - api/kedro_datasets_experimental/opik.OpikEvaluationDataset.md: Opik evaluation integration
-        - api/kedro_datasets_experimental/opik.OpikPromptDataset.md: Opik prompt integration
-        - api/kedro_datasets_experimental/opik.OpikTraceDataset.md: Opik tracing integration
+        - api/kedro_datasets_experimental/langchain.PromptDataset.md: LangChain prompt templates
+        - api/kedro_datasets_experimental/langfuse.EvaluationDataset.md: Langfuse evaluation dataset integration
+        - api/kedro_datasets_experimental/langfuse.PromptDataset.md: Langfuse prompt integration
+        - api/kedro_datasets_experimental/langfuse.TraceDataset.md: Langfuse tracing integration
+        - api/kedro_datasets_experimental/opik.EvaluationDataset.md: Opik evaluation integration
+        - api/kedro_datasets_experimental/opik.PromptDataset.md: Opik prompt integration
+        - api/kedro_datasets_experimental/opik.TraceDataset.md: Opik tracing integration

       Experimental Deep Learning:
         - api/kedro_datasets_experimental/pytorch.PyTorchDataset.md: PyTorch model storage
@@ -343,11 +343,11 @@ nav:
           - Databricks:
             - databricks.ExternalTableDataset: api/kedro_datasets_experimental/databricks.ExternalTableDataset.md
           - Langchain:
-            - langchain.LangChainPromptDataset: api/kedro_datasets_experimental/langchain.LangChainPromptDataset.md
+            - langchain.PromptDataset: api/kedro_datasets_experimental/langchain.PromptDataset.md
           - Langfuse:
-            - langfuse.LangfuseEvaluationDataset: api/kedro_datasets_experimental/langfuse.LangfuseEvaluationDataset.md
-            - langfuse.LangfusePromptDataset: api/kedro_datasets_experimental/langfuse.LangfusePromptDataset.md
-            - langfuse.LangfuseTraceDataset: api/kedro_datasets_experimental/langfuse.LangfuseTraceDataset.md
+            - langfuse.EvaluationDataset: api/kedro_datasets_experimental/langfuse.EvaluationDataset.md
+            - langfuse.PromptDataset: api/kedro_datasets_experimental/langfuse.PromptDataset.md
+            - langfuse.TraceDataset: api/kedro_datasets_experimental/langfuse.TraceDataset.md
           - MLRun:
             - mlrun.MLRunAbstractDataset: api/kedro_datasets_experimental/mlrun.MLRunAbstractDataset.md
             - mlrun.MLRunDataframeDataset: api/kedro_datasets_experimental/mlrun.MLRunDataframeDataset.md
@@ -356,9 +356,9 @@ nav:
           - NetCDF:
             - netcdf.NetCDFDataset: api/kedro_datasets_experimental/netcdf.NetCDFDataset.md
           - Opik:
-            - opik.OpikEvaluationDataset: api/kedro_datasets_experimental/opik.OpikEvaluationDataset.md
-            - opik.OpikPromptDataset: api/kedro_datasets_experimental/opik.OpikPromptDataset.md
-            - opik.OpikTraceDataset: api/kedro_datasets_experimental/opik.OpikTraceDataset.md
+            - opik.EvaluationDataset: api/kedro_datasets_experimental/opik.EvaluationDataset.md
+            - opik.PromptDataset: api/kedro_datasets_experimental/opik.PromptDataset.md
+            - opik.TraceDataset: api/kedro_datasets_experimental/opik.TraceDataset.md
           - Optuna:
             - optuna.StudyDataset: api/kedro_datasets_experimental/optuna.StudyDataset.md
           - PyPDF:
diff --git a/kedro-datasets/pyproject.toml b/kedro-datasets/pyproject.toml
index 9cc876c55..58fd4419d 100644
--- a/kedro-datasets/pyproject.toml
+++ b/kedro-datasets/pyproject.toml
@@ -213,19 +213,19 @@ chromadb = ["kedro-datasets[chromadb-chromadbdataset]"]
 darts-torch-model-dataset = ["u8darts-all"]
 darts = ["kedro-datasets[darts-torch-model-dataset]"]
 databricks-externaltabledataset = ["kedro-datasets[hdfs-base,s3fs-base]"]
-langchain-langchainpromptdataset = ["langchain>=0.3.0"]
-langchain = ["kedro-datasets[langchain-chatopenaidataset,langchain-openaiembeddingsdataset,langchain-chatanthropicdataset,langchain-chatcoheredataset, langchain-langchainpromptdataset]"]
-langfuse-langfuseevaluationdataset = ["langfuse>=3.14.0"]
-langfuse-langfusepromptdataset = ["langfuse>=2.0.0"]
-langfuse-langfusetracedataset = ["langfuse>=2.0.0"]
-langfuse-langfusetracedataset-autogen = ["langfuse>=2.0.0", "opentelemetry-sdk", "opentelemetry-exporter-otlp-proto-http"]
-langfuse = ["kedro-datasets[langfuse-langfuseevaluationdataset,langfuse-langfusepromptdataset,langfuse-langfusetracedataset,langfuse-langfusetracedataset-autogen]", "openai>=2.3.0", "langchain>=0.2.0, <1.0"]
+langchain-promptdataset = ["langchain>=0.3.0"]
+langchain = ["kedro-datasets[langchain-chatopenaidataset,langchain-openaiembeddingsdataset,langchain-chatanthropicdataset,langchain-chatcoheredataset, langchain-promptdataset]"]
+langfuse-evaluationdataset = ["langfuse>=3.14.0"]
+langfuse-promptdataset = ["langfuse>=2.0.0"]
+langfuse-tracedataset = ["langfuse>=2.0.0"]
+langfuse-tracedataset-autogen = ["langfuse>=2.0.0", "opentelemetry-sdk", "opentelemetry-exporter-otlp-proto-http"]
+langfuse = ["kedro-datasets[langfuse-evaluationdataset,langfuse-promptdataset,langfuse-tracedataset,langfuse-tracedataset-autogen]", "openai>=2.3.0", "langchain>=0.2.0, <1.0"]
 mlrun = ["mlrun>=1.10.0"]
-opik-opikevaluationdataset = ["opik>=1.8.0"]
-opik-opikpromptdataset = ["opik>=1.8.0"]
-opik-opiktracedataset = ["opik>=1.8.0"]
-opik-opiktracedataset-autogen = ["opik>=1.8.0", "opentelemetry-sdk", "opentelemetry-exporter-otlp-proto-http"]
-opik = ["kedro-datasets[opik-opikevaluationdataset, opik-opikpromptdataset, opik-opiktracedataset, opik-opiktracedataset-autogen]", "openai>=2.3.0", "langchain>=0.2.0"]
+opik-evaluationdataset = ["opik>=1.8.0"]
+opik-promptdataset = ["opik>=1.8.0"]
+opik-tracedataset = ["opik>=1.8.0"]
+opik-tracedataset-autogen = ["opik>=1.8.0", "opentelemetry-sdk", "opentelemetry-exporter-otlp-proto-http"]
+opik = ["kedro-datasets[opik-evaluationdataset, opik-promptdataset, opik-tracedataset, opik-tracedataset-autogen]", "openai>=2.3.0", "langchain>=0.2.0"]
 netcdf-netcdfdataset = ["h5netcdf>=1.2.0","netcdf4>=1.6.4","xarray>=2023.1.0"]
 netcdf = ["kedro-datasets[netcdf-netcdfdataset]"]