feat(datasets): Extend LangfuseTraceDataset to support AutoGen tracing #1288
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
ElenaKhaustova
left a comment
Left a small comment; other than that, the implementation looks good 👍
Could you please also open a PR in the academy project applying autogen mode to this pipeline https://github.com/kedro-org/kedro-academy/tree/main/kedro-agentic-workflows/src/kedro_agentic_workflows/pipelines/response_generation_autogen so it is easy for reviewers to test?
Also don't forget to update RELEASE.md
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
Hi @SajidAlamQB, the implementation looks good. It would be nice to have some QA steps or, as Elena mentioned, some way to test this out. Thank you
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
ElenaKhaustova
left a comment
Need some help to clarify how to install: kedro-org/kedro-academy#104 (review)
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
ElenaKhaustova
left a comment
Thank you, @SajidAlamQB, changes made make sense to me! I left a suggestion regarding the implementation.
I tested it with the academy project, and it works now. I left a question regarding the warning it produces: kedro-org/kedro-academy#104 (review)
And another general question is regarding the OTLP approach we chose. Is it because we try to align with the autogen mode implementation for OpikTraceDataset? Otherwise, this approach (https://langfuse.com/integrations/frameworks/autogen) looks much easier and requires only configuration through Langfuse, as we already do for other modes.
I also wonder what the difference is between those two approaches in terms of the end result, and if you had a chance to explore it?
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
Yes, the main reason for the OTLP approach was to stay consistent with Opik, which didn't have an equivalent, so its autogen mode uses OTLP directly. I think for the initial implementation it makes sense to keep OTLP for consistency, but we could explore adding an openlit mode or enhancing the autogen mode for Langfuse specifically in a follow-up if those other features are needed. As for the endpoint, that makes sense; I'll make it configurable.
I mean, is there any notable difference at all, aside from the configuration?
The OpenLit approach just gives more detailed traces out of the box, but otherwise there's not much difference tbh.
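For readers following the endpoint discussion, here is a minimal sketch of how a configurable OTLP endpoint and the auth header could be derived from Langfuse credentials. It is not the code in this PR: the function name `resolve_otlp_settings`, the credential keys (`public_key`, `secret_key`, `host`) and the default host are assumptions for illustration, and the `/api/public/otel/v1/traces` path with Basic auth is taken from Langfuse's OpenTelemetry documentation as I understand it.

```python
import base64

# Assumed defaults for illustration; check them against the Langfuse docs.
DEFAULT_HOST = "https://cloud.langfuse.com"
OTLP_TRACES_PATH = "/api/public/otel/v1/traces"


def resolve_otlp_settings(credentials, endpoint=None):
    """Return (endpoint, headers) for an OTLP/HTTP span exporter.

    `credentials` is a dict with `public_key`, `secret_key` and an optional
    `host` (hypothetical key names). An explicit `endpoint` overrides the
    URL derived from the host.
    """
    host = credentials.get("host", DEFAULT_HOST).rstrip("/")
    resolved_endpoint = endpoint or f"{host}{OTLP_TRACES_PATH}"

    # Basic auth token: base64("<public_key>:<secret_key>")
    raw = f"{credentials['public_key']}:{credentials['secret_key']}"
    token = base64.b64encode(raw.encode("utf-8")).decode("ascii")

    return resolved_endpoint, {"Authorization": f"Basic {token}"}
```

With this shape, a self-hosted deployment only needs to set `host` or pass `endpoint` explicitly, while the cloud default keeps working without extra configuration.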
Signed-off-by: Sajid Alam <90610031+SajidAlamQB@users.noreply.github.com>
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
ElenaKhaustova
left a comment
Thank you, @SajidAlamQB!
I've unresolved the comment about the endpoint as it does not seem to be solved. Also added a few suggestions on how it can be done.
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
Hi @SajidAlamQB, the code looks good and it works well with the test project in
Thank you
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
ankatiyar
left a comment
Code looks good overall, I'll let Elena and Ravi do the final approvals :)
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
Hey team, this PR went through a few different iterations, so just to make it clear: we explored two approaches for AutoGen tracing with Langfuse.
Approach 1: OpenLit (attempted, reverted)
Trace hierarchy was breaking without manual spans and without wrapping agent calls in
Graph visualisation issues: even with the correct trace hierarchy, Langfuse's graph view renders multi-agent workflows incorrectly. This is a known Langfuse limitation (see issues below).
Approach 2: OTLP (current implementation)
Provides a stable API, aligns with the Opik setup, and produces the correct trace structure.
I've added a note in the docstring that Langfuse's graph visualisation is in beta and may not render complex multi-agent workflows correctly. Also opened an issue on their side:
Other related issues: langfuse/langfuse#9427
ElenaKhaustova
left a comment
Thank you, @SajidAlamQB, implementation looks good to me!
One minor thing that I've noticed is that docs are not rendered properly:
Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
…b.com/kedro-org/kedro-plugins into feat/add-auto-gen-support-to-langfuse Signed-off-by: Sajid Alam <sajid_alam@mckinsey.com>
Description
Related to: #1276
To test for QA use the kedro-academy example: kedro-org/kedro-academy#104
Adds autogen mode to LangfuseTraceDataset, enabling OpenTelemetry based tracing for AutoGen agent pipelines via Langfuse's OTLP endpoint.
Development notes
autogen mode to LangfuseTraceDataset that returns a configured OpenTelemetry Tracer
_build_autogen_tracer() sets up an OTLP exporter
Checklist
jsonschema/kedro-catalog-X.XX.json if necessary
RELEASE.md file
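The development notes mention a tracer backed by an OTLP exporter. The following is a rough sketch of what such a setup can look like with the OpenTelemetry Python SDK. It is not the PR's actual `_build_autogen_tracer()`: the function name `build_tracer_sketch`, its parameters and the default service name are hypothetical, and it assumes the `opentelemetry-sdk` and `opentelemetry-exporter-otlp-proto-http` packages are installed.

```python
def build_tracer_sketch(endpoint, headers, service_name="kedro-autogen"):
    """Return an OpenTelemetry Tracer that exports spans over OTLP/HTTP.

    `endpoint` is the full traces URL and `headers` a dict of HTTP headers
    (for example an Authorization header). All names here are illustrative.
    """
    # Imported lazily so the optional OpenTelemetry packages are only
    # required when this function is actually called.
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
        OTLPSpanExporter,
    )
    from opentelemetry.sdk.resources import Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # The exporter sends finished spans to the given OTLP endpoint.
    exporter = OTLPSpanExporter(endpoint=endpoint, headers=headers)

    # A dedicated provider avoids touching the global tracer provider.
    provider = TracerProvider(
        resource=Resource.create({"service.name": service_name})
    )
    # Batch spans in the background instead of exporting one by one.
    provider.add_span_processor(BatchSpanProcessor(exporter))

    return provider.get_tracer(service_name)
```

A caller would then wrap agent runs in spans, for example `with tracer.start_as_current_span("agent_run"): ...`, so that nested calls end up under a single trace.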