
feat(smallestai): add Pulse STT with real-time streaming and batch transcription#5312

Open
harshitajain165 wants to merge 8 commits into livekit:main from harshitajain165:smallest-stt

Conversation

@harshitajain165 commented Apr 2, 2026

Summary

This PR adds speech-to-text support to the existing livekit-plugins-smallestai package via the Smallest AI Pulse STT API, complementing the Lightning TTS integration that already exists.

  • Streaming (SpeechStream): real-time transcription over WebSocket with interim and final transcripts, ~64ms TTFT
  • Batch (_recognize_impl): pre-recorded transcription via HTTP POST
  • Word-level timestamps: per-word start/end/confidence included by default (word_timestamps=True)
  • Speaker diarization: opt-in via diarize=True
  • Configurable end-of-utterance timeout: eou_timeout_ms (100–10,000ms, default 800ms)
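
As a sketch of how these options might hang together, here is a hypothetical mirror of the constructor options listed above with the documented `eou_timeout_ms` range enforced (names follow this PR description; the plugin's real signature may differ):

```python
from dataclasses import dataclass

@dataclass
class PulseSTTOptions:
    """Hypothetical mirror of the options above, for illustration only."""
    encoding: str = "linear16"
    word_timestamps: bool = True   # per-word start/end/confidence by default
    diarize: bool = False          # speaker diarization is opt-in
    eou_timeout_ms: int = 800     # end-of-utterance timeout, 100-10,000 ms

    def __post_init__(self) -> None:
        # Reject values outside the range stated in the PR description.
        if not 100 <= self.eou_timeout_ms <= 10_000:
            raise ValueError("eou_timeout_ms must be between 100 and 10,000 ms")
```

With this shape, `PulseSTTOptions()` yields the documented defaults, while `PulseSTTOptions(eou_timeout_ms=50)` raises at construction time.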

Implementation notes

  • Follows the same patterns as other STT plugins in the repo
  • API field names (transcript, is_final, is_last, finalize message) verified against docs.smallest.ai
  • START_OF_SPEECH is inferred from the first non-empty transcript since the Pulse API does not emit a dedicated speech-start event
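
The START_OF_SPEECH inference described in the last note can be sketched as a small generator (illustrative only, not the plugin's actual code):

```python
from collections.abc import Iterable, Iterator

def infer_events(transcripts: Iterable[str]) -> Iterator[str]:
    # The Pulse API emits no dedicated speech-start event, so synthesize a
    # START_OF_SPEECH marker just before the first non-empty transcript.
    started = False
    for text in transcripts:
        if not text.strip():
            continue  # skip empty transcripts; no speech yet
        if not started:
            started = True
            yield "START_OF_SPEECH"
        yield f"TRANSCRIPT:{text}"
```

For example, feeding `["", " ", "hel", "hello"]` yields one `START_OF_SPEECH` followed by the two non-empty transcripts.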

Test plan

  • test_recognize[livekit.plugins.smallestai] passes
  • test_stream[livekit.plugins.smallestai] passes
  • ruff format and ruff check pass
  • mypy --strict passes

@devin-ai-integration bot (Contributor) left a comment


Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.


Comment on lines +521 to +528
else:
    self._event_ch.send_nowait(
        stt.SpeechEvent(
            type=stt.SpeechEventType.INTERIM_TRANSCRIPT,
            request_id=self._session_id,
            alternatives=alts,
        )
    )
@devin-ai-integration bot (Contributor) commented Apr 2, 2026


🔴 STT capability declares interim_results=False but code emits INTERIM_TRANSCRIPT events

The STT constructor at line 139 declares interim_results=False in STTCapabilities, but _process_stream_event at lines 517-524 emits stt.SpeechEventType.INTERIM_TRANSCRIPT events whenever the server returns a non-final transcript (is_final=False). The Smallest AI Pulse API does return partial transcripts (the schema comment at line 475 says transcript is "partial or final text"), so the capability should be True. This mismatch causes incorrect behavior in the FallbackAdapter (livekit-agents/livekit/agents/stt/fallback_adapter.py:80) which uses all(t.capabilities.interim_results for t in stt) to compose capabilities — it would incorrectly report that the combined STT doesn't support interim results even if the other STT does.


@CLAassistant commented Apr 2, 2026

CLA assistant check
All committers have signed the CLA.

devin-ai-integration[bot]

This comment was marked as resolved.

@tinalenguyen (Member) left a comment


hi, thank you for the PR! i have a few notes, could you:

  • address all of the devin comments, especially the one regarding interim transcripts
  • remove smallest ai from the test files, as we do not have a smallestai api key for testing as of yet
  • sign the CLA if possible

@harshitajain165 (Author)

Hey @tinalenguyen,

Thanks for the comment. I'm addressing the Devin comments, removing Smallest AI from the test files, and signing the CLA. I'll keep you posted once all are done.

@harshitajain165 force-pushed the smallest-stt branch 2 times, most recently from fc0b7bb to 26ba81c on April 8, 2026 at 07:14
@harshitajain165 (Author)

recheck

@harshitajain165 (Author)

Hey @tinalenguyen,

The Devin comments have been incorporated, Smallest AI has been removed from the test files, and I have signed the CLA too. Please feel free to re-review and take this forward.

…support

Adds speech-to-text support to the existing Smallest AI plugin via the
Waves Pulse API, covering both real-time WebSocket streaming and
pre-recorded HTTP batch transcription.
- Add lightning-v3.1 as the new default model (80+ voices, ~100ms latency)
- Remove deprecated lightning and lightning-large models
- Update base URL to api.smallest.ai/waves/v1
- Simplify endpoint to get_speech for all models (removes get_speech_long_text)
- Add alaw encoding support (v3.1)
- Restrict consistency/similarity/enhancement params to lightning-v2 only
- Remove unused `interim_results` option from STT (constructor, options
  dataclass, and update_options). The Pulse API does not support
  server-side interim filtering and the plugin never honoured the flag.
  STTCapabilities now declares interim_results=False.
- Remove smallestai from test_stt.py and test_tts.py since there is no
  Smallest AI API key available in CI.
- Remove spurious TTS warning about consistency/similarity/enhancement
  params that fired on every default TTS() instantiation. The downstream
  _to_smallest_options already correctly excludes those params for
  non-v2 models.

Made-with: Cursor
@tinalenguyen (Member)

@harshitajain165 Thank you for iterating on the feedback!

For the STT, I printed out the received events and it does seem that interim results are emitted. Is there a setting to pass to the API, or does the API always send interim results? If it always does, I would set interim_results to True and disregard the Devin comment. I also noticed that the final transcript often begins with leading spaces. Is that expected?
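
One possible explanation for the leading spaces is that some streaming STT backends prefix each partial chunk with a space so concatenated partials stay word-separated; whether Pulse does this is an assumption. If so, a minimal normalization on the final event would be:

```python
def normalize_final_transcript(transcript: str) -> str:
    # Hypothetical fix: strip any separator whitespace the backend prepends
    # to transcript chunks before emitting the FINAL_TRANSCRIPT event.
    return transcript.lstrip()

print(normalize_final_transcript("  hello world"))  # "hello world"
```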

Also, when testing the TTS, I keep facing this error:
failed to synthesize speech: message='Bad Request (400)', status_code=400, retryable=False, retrying in 2.0s

encoding: STTEncoding | str = "linear16",
word_timestamps: bool = True,
diarize: bool = False,
eou_timeout_ms: int = 800,
Member

Should this be 0?

With our end-of-turn detection model, we should prioritize minimizing latency to receive transcripts.


4 participants