Skip to content

GSoC 2026 Proposal | Multimodal AI & Agent API Eval Framework | Soumyaraj Bag#1601

Merged
animator merged 1 commit into
foss42:mainfrom
soumyarajbag:main
Mar 31, 2026
Merged

GSoC 2026 Proposal | Multimodal AI & Agent API Eval Framework | Soumyaraj Bag#1601
animator merged 1 commit into
foss42:mainfrom
soumyarajbag:main

Conversation

@soumyarajbag
Copy link
Copy Markdown
Contributor

@soumyarajbag soumyarajbag commented Mar 30, 2026

PR Description

Hello mentors @animator @ashitaprasad @DenserMeerkat @synapsecode , I am Soumyaraj Bag, a 4th-year B.Tech student (CSE, RCCIIT) and Full-Stack Engineer. I am submitting my formal proposal for the Multimodal AI and Agent API Eval Framework.

This proposal outlines a 350-hour "Large" project focused on building a production-grade, full-stack evaluation framework as a companion service to API Dash. It leverages my hands-on experience at Trumio (AI Interviewer Agent, FastAPI microservices) and smallcase (production backend engineering), along with a fully deployed proof-of-concept at apidash-eval-poc.vercel.app — covering multimodal evaluation for Text, Image, and Audio modalities across providers (OpenAI, Anthropic, Google), a React/TypeScript dashboard with real-time SSE streaming, and an async FastAPI backend backed by MongoDB Atlas.

My existing APIDash contributions include:

  • PR #1588 — Fixed Anthropic API spec violation in packages/genai (system prompt placement + SSE streaming fix + new provider unit tests)
  • PR #1590 — Added a typed ResponseEvaluator utility to better_networking with 25 test cases
  • Issue #1591 — Identified hard Flutter SDK dependency in genai package blocking headless/pure-Dart use
  • Issue #1592 — Documented two compounding bugs in Anthropic's SSE streaming parser causing null token extraction

Related Issues

Checklist

  • I have gone through the contributing guide
  • I have followed the GSoC 2026 Application Guide and used the required proposal template.
  • I have included my agreement to the AI Usage Policy within the proposal.
  • I have updated my branch and synced it with the project's main branch before making this PR.

Added/updated tests?

We encourage you to add relevant test cases.

  • Yes
  • No, and this is why: This is a Documentation PR containing my formal GSoC 2026 Project Proposal as required by the application guidelines.

OS on which you have developed and tested the feature?

  • Windows
  • macOS
  • Linux

Copy link
Copy Markdown
Member

@animator animator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soumyarajbag the proposal looks good. Please go ahead and submit it in the GSoC portal.

@animator animator merged commit c3a7b24 into foss42:main Mar 31, 2026
@animator
Copy link
Copy Markdown
Member

animator commented Apr 2, 2026

@soumyarajbag You can now work on a proof of concept (PoC) showcasing your coding ability and problem solving skills but making some minimal implementation of the technology you proposed in the project.

Please find the guidelines below:
👉 Any PoC that is a new project or is not directly dependent on API Dash source code should be submitted to the repository - https://github.com/foss42/gsoc-poc
👉 PoCs must be sent through this process. You can have a version hosted on personal repo or any website/link, but this way it will be easier for us to keep track of the submitted PoC as your PoC link might be buried in your proposal. It will also ease the review process and declutter the main repo PRs.
👉 Also, in your case: Apart from the proposed solution, AI Eval project candidates must also go through these resource -

  1. https://dev.to/aws/how-i-built-mcp-apps-based-sales-analytics-agentic-ui-deployed-it-on-amazon-bedrock-agentcore-4e9i
  2. https://github.com/ashitaprasad/sample-mcp-apps-chatflow

and explore if AI evaluation UI can be built using it to make it easy for end users to run evals from inside AI agents.

@soumyarajbag
Copy link
Copy Markdown
Contributor Author

surely @animator - I previously implemented a PoC structure - refining it a more as per the guidelines. Thank you. Will update you here in the threads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants