title Transcriber fallback configuration
subtitle Configure fallback transcribers that activate automatically if your primary transcriber fails.
slug customization/transcriber-fallback-plan

Overview

Transcriber fallback configuration ensures your calls continue even if your primary speech-to-text provider experiences issues. Vapi supports two approaches:

  • Auto fallback — Vapi intelligently routes transcription to an alternative provider when your primary fails. No configuration required.
  • Manual fallback — You specify exact backup providers in priority order for full control over the failover sequence.

You can use both together. When combined, your manual fallbacks are tried first. If all of them fail, Vapi's auto fallback takes over as a final safety net.

Key benefits:

  • Call continuity during provider outages
  • Automatic failover with no user intervention required
  • Provider diversity to protect against single points of failure

Without any fallback plan configured, your call will end with an error if your chosen transcription provider fails.

How it works

When a transcriber failure occurs, Vapi follows this priority order:

  1. Manual fallbacks first — If you've configured explicit fallback transcribers, Vapi tries each one sequentially in the order you specified.
  2. Auto fallback as safety net — If all manual fallbacks fail (or none are configured), and auto fallback is enabled, Vapi intelligently selects an alternative provider and routes your transcription audio to it.
  3. Call termination — The call ends only if every fallback option has been exhausted.
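
The priority order above can be sketched as a small simulation. This is only an illustration of the documented sequence, not Vapi's implementation: `run_with_fallback`, `try_provider`, and `pick_auto_provider` are hypothetical names, and how Vapi actually chooses the auto fallback provider is not described here, so the selection rule below is an assumption.

```python
from typing import Callable, Optional, Sequence, Set


def run_with_fallback(
    primary: str,
    manual_fallbacks: Sequence[str],
    auto_fallback_enabled: bool,
    try_provider: Callable[[str], bool],
    pick_auto_provider: Callable[[Set[str]], Optional[str]],
) -> Optional[str]:
    """Return the provider that ends up transcribing, or None if the call must end."""
    tried: Set[str] = set()
    # Primary first, then manual fallbacks in the configured order.
    for provider in [primary, *manual_fallbacks]:
        tried.add(provider)
        if try_provider(provider):
            return provider
    # Auto fallback acts as the safety net once everything above has failed.
    if auto_fallback_enabled:
        candidate = pick_auto_provider(tried)
        if candidate is not None and try_provider(candidate):
            return candidate
    # Every option is exhausted: the call terminates.
    return None


# Hypothetical health state used only for this demonstration.
healthy = {"azure", "gladia"}


def is_healthy(provider: str) -> bool:
    return provider in healthy


def pick_first_untried(tried: Set[str]) -> Optional[str]:
    # Assumed selection rule: first healthy provider (alphabetically) not yet tried.
    remaining = sorted(healthy - tried)
    return remaining[0] if remaining else None


print(run_with_fallback("deepgram", ["assembly-ai", "azure"], True, is_healthy, pick_first_untried))
print(run_with_fallback("deepgram", ["assembly-ai"], False, is_healthy, pick_first_untried))
```

In the first call the second manual fallback handles the audio; in the second call there is no healthy manual fallback and auto fallback is disabled, so the function returns `None`, which corresponds to call termination.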

Auto fallback

Auto fallback is the simplest way to add resilience. Toggle it on, and Vapi handles provider selection for you—automatically routing transcription audio to an alternative STT provider when your primary fails.

Enabling auto fallback may route audio to other providers. If your organization has strict compliance requirements, review your compliance settings to ensure this aligns with your needs.

To enable auto fallback via API, set transcriber.fallbackPlan.autoFallback.enabled to true:

{
  "transcriber": {
    "provider": "deepgram",
    "model": "nova-3",
    "language": "en",
    "fallbackPlan": {
      "autoFallback": {
        "enabled": true
      }
    }
  }
}
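
If you build assistant configurations in code, one way to switch auto fallback on without disturbing the rest of the transcriber settings is a small merge helper. The sketch below is a minimal example; `with_auto_fallback` is a hypothetical helper name, and only the `fallbackPlan.autoFallback.enabled` path comes from this page.

```python
import copy
import json


def with_auto_fallback(transcriber: dict, enabled: bool = True) -> dict:
    """Return a copy of `transcriber` with fallbackPlan.autoFallback.enabled set."""
    updated = copy.deepcopy(transcriber)  # leave the caller's dict untouched
    plan = updated.setdefault("fallbackPlan", {})
    plan.setdefault("autoFallback", {})["enabled"] = enabled
    return updated


primary = {"provider": "deepgram", "model": "nova-3", "language": "en"}
payload = {"transcriber": with_auto_fallback(primary)}
print(json.dumps(payload, indent=2))
```

Because the helper only touches the `autoFallback` key, any manual `transcribers` already present under `fallbackPlan` are preserved.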

Manual fallbacks

Manual fallbacks give you full control over which providers Vapi tries, and in what order. This is useful when you need specific providers for compliance, language support, or cost reasons.

Configure via dashboard

  1. Navigate to your assistant and select the **Transcriber** tab.
  2. Scroll down to find the **Transcriber Fallback** section.
  3. Under **Manual Fallbacks**, click **Add** to configure your backup providers in priority order.
  4. For each fallback, configure:
       • Select a **provider** from the dropdown
       • Choose a **model** (if the provider offers multiple models)
       • Select a **language** for transcription
  5. Expand **Additional Configuration** to access provider-specific settings like numerals formatting, VAD settings, and confidence thresholds.
  6. Repeat to add additional fallback transcribers. Order matters: the first fallback in your list is tried first.

If HIPAA or PCI compliance is enabled on your account or assistant, only **Deepgram** and **Azure** transcribers will be available as fallback options.

Configure via API

Add the fallbackPlan property to your assistant's transcriber configuration, and specify the fallback transcribers within the transcribers property. You can combine manual fallbacks with auto fallback for maximum resilience.

{
  "transcriber": {
    "provider": "deepgram",
    "model": "nova-3",
    "language": "en",
    "fallbackPlan": {
      "autoFallback": {
        "enabled": true
      },
      "transcribers": [
        {
          "provider": "assembly-ai",
          "speechModel": "universal-streaming-multilingual",
          "language": "en"
        },
        {
          "provider": "azure",
          "language": "en-US"
        }
      ]
    }
  }
}

In this example, if Deepgram fails, Vapi tries AssemblyAI first, then Azure. If both manual fallbacks fail, auto fallback intelligently selects another available provider.
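
To double-check the order a given configuration implies, you can derive the sequence directly from the JSON. The following sketch assumes the configuration shape shown above; `failover_sequence` is a hypothetical helper, and treating a missing `autoFallback` block as disabled is an assumption rather than documented behavior.

```python
from typing import List


def failover_sequence(config: dict) -> List[str]:
    """List providers in the order they would be tried for this configuration."""
    transcriber = config["transcriber"]
    plan = transcriber.get("fallbackPlan", {})
    sequence = [transcriber["provider"]]
    # Manual fallbacks are tried in the order they appear in the array.
    sequence += [fallback["provider"] for fallback in plan.get("transcribers", [])]
    # Assumption: auto fallback counts as off unless explicitly enabled.
    if plan.get("autoFallback", {}).get("enabled", False):
        sequence.append("<auto fallback>")
    return sequence


config = {
    "transcriber": {
        "provider": "deepgram",
        "model": "nova-3",
        "language": "en",
        "fallbackPlan": {
            "autoFallback": {"enabled": True},
            "transcribers": [
                {
                    "provider": "assembly-ai",
                    "speechModel": "universal-streaming-multilingual",
                    "language": "en",
                },
                {"provider": "azure", "language": "en-US"},
            ],
        },
    }
}

print(" -> ".join(failover_sequence(config)))
# prints "deepgram -> assembly-ai -> azure -> <auto fallback>"
```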

Provider-specific settings

Each transcriber provider supports different configuration options. The settings available for each provider are listed below.

Deepgram

  • **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, etc.).
  • **language**: Language code for transcription.
  • **keywords**: Keywords with optional boost values for improved recognition (e.g., `["companyname", "productname:2"]`).
  • **keyterm**: Keyterm prompting for up to 90% keyword recall rate improvement.
  • **smartFormat** (boolean): Enable smart formatting for numbers and dates.
  • **eotThreshold** (0.5-0.9): End-of-turn confidence threshold. Only available with Flux models.
  • **eotTimeoutMs** (500-10000): Maximum time to wait after speech before finalizing turn. Only available with Flux models. Default is 5000ms.

AssemblyAI

  • **language**: Language code (`multi` for multilingual, `en` for English).
  • **speechModel**: Streaming speech model (`universal-streaming-english` or `universal-streaming-multilingual`).
  • **wordBoost**: Custom vocabulary array (up to 2500 characters total).
  • **keytermsPrompt**: Array of keyterms for improved recognition (up to 100 terms, 50 characters each). Costs additional $0.04/hour.
  • **endUtteranceSilenceThreshold**: Duration of silence in milliseconds to detect end of utterance.
  • **disablePartialTranscripts** (boolean): Set to `true` to disable partial transcripts.
  • **confidenceThreshold** (0-1): Minimum confidence threshold for accepting transcriptions. Default is 0.4.
  • **vadAssistedEndpointingEnabled** (boolean): Enable VAD-based endpoint detection.

Azure

  • **language**: Language code in BCP-47 format (e.g., `en-US`, `es-MX`, `fr-FR`).
  • **segmentationSilenceTimeoutMs** (100-5000): Duration of silence after which a phrase is finalized. Configure to adjust sensitivity to pauses.
  • **segmentationMaximumTimeMs** (20000-70000): Maximum duration a segment can reach before being cut off.
  • **segmentationStrategy**: Controls phrase boundary detection. Options: `Default`, `Time`, or `Semantic`.

Gladia

  • **model**: Model selection (`fast`, `accurate`, or `solaria-1`).
  • **language**: Language code.
  • **confidenceThreshold** (0-1): Minimum confidence for transcription acceptance. Default is 0.4.
  • **endpointing** (0.01-10): Time in seconds to wait before considering speech ended.
  • **speechThreshold** (0-1): Speech detection sensitivity (0.0 to 1.0).
  • **prosody** (boolean): Enable prosody detection (laugh, giggle, music, etc.).
  • **audioEnhancer** (boolean): Pre-process audio for improved accuracy (increases latency).
  • **transcriptionHint**: Hint text to guide transcription.
  • **customVocabularyEnabled** (boolean): Enable custom vocabulary.
  • **customVocabularyConfig**: Custom vocabulary configuration with vocabulary array and default intensity.
  • **region**: Processing region (`us-west` or `eu-west`).
  • **receivePartialTranscripts** (boolean): Enable partial transcript delivery.

Speechmatics

  • **model**: Model selection (currently only `default`).
  • **language**: Language code.
  • **operatingPoint**: Accuracy level. `standard` for faster turnaround, `enhanced` for highest accuracy. Default is `enhanced`.
  • **region**: Processing region (`eu` for Europe, `us` for United States). Default is `eu`.
  • **enableDiarization** (boolean): Enable speaker identification for multi-speaker conversations.
  • **maxDelayMs**: Maximum delay in milliseconds for partial transcripts. Balances latency and accuracy.

Google

  • **model**: Gemini model selection.
  • **language**: Language selection (e.g., `Multilingual`, `English`, `Spanish`, `French`).

OpenAI

  • **model**: OpenAI Realtime STT model selection (required).
  • **language**: Language code for transcription.

ElevenLabs

  • **model**: Model selection (currently only `scribe_v1`).
  • **language**: ISO 639-1 language code.

Cartesia

  • **model**: Model selection (currently only `ink-whisper`).
  • **language**: ISO 639-1 language code.
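
Several of the settings above have documented numeric ranges, so a client-side check can catch typos before a configuration is submitted. The sketch below is a minimal example: `SETTING_RANGES` and `out_of_range_settings` are hypothetical names, the ranges are copied from the list above, and the provider keys (in particular `gladia`, which does not appear in the JSON examples on this page) are assumed identifiers.

```python
from typing import Dict, List, Tuple

# Documented numeric ranges, keyed by (assumed) provider identifier.
SETTING_RANGES: Dict[str, Dict[str, Tuple[float, float]]] = {
    "deepgram": {"eotThreshold": (0.5, 0.9), "eotTimeoutMs": (500, 10000)},
    "assembly-ai": {"confidenceThreshold": (0, 1)},
    "azure": {
        "segmentationSilenceTimeoutMs": (100, 5000),
        "segmentationMaximumTimeMs": (20000, 70000),
    },
    "gladia": {
        "confidenceThreshold": (0, 1),
        "endpointing": (0.01, 10),
        "speechThreshold": (0, 1),
    },
}


def out_of_range_settings(transcriber: dict) -> List[str]:
    """Return one message per numeric setting that falls outside its documented range."""
    problems = []
    ranges = SETTING_RANGES.get(transcriber.get("provider"), {})
    for name, (low, high) in ranges.items():
        if name not in transcriber:
            continue  # unset settings fall back to provider defaults
        value = transcriber[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}-{high}")
    return problems


flux = {
    "provider": "deepgram",
    "model": "flux-general-en",
    "eotThreshold": 0.95,
    "eotTimeoutMs": 5000,
}
print(out_of_range_settings(flux))
# prints ['eotThreshold=0.95 is outside 0.5-0.9']
```

Providers or settings without a documented range are simply skipped, so the check never rejects a configuration on grounds this page does not state.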

Best practices

  • Start with auto fallback for quick, zero-config resilience—it works well for most use cases.
  • Add manual fallbacks when you need control over specific providers for compliance, language, or cost reasons.
  • Combine both for maximum reliability—manual fallbacks run first, auto fallback catches anything they miss.
  • Use different providers for manual fallbacks to protect against provider-wide outages.
  • Consider language compatibility when selecting fallbacks—ensure all fallback transcribers support your required languages.
  • For HIPAA/PCI compliance, ensure all fallbacks are compliant providers (Deepgram or Azure) and review data routing implications before enabling auto fallback.
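
Some of these practices can be checked mechanically before deploying a configuration. The sketch below is a hypothetical lint-style helper, not a Vapi feature: `review_fallback_plan` and `base_language` are made-up names, the compliant provider set comes from the HIPAA/PCI note on this page, and comparing only the base language code (so `en` and `en-US` match) is a simplifying assumption.

```python
from typing import List, Optional

# Providers this page lists as available when HIPAA or PCI compliance is enabled.
COMPLIANT_PROVIDERS = {"deepgram", "azure"}


def base_language(code: Optional[str]) -> Optional[str]:
    """Reduce a code such as 'en-US' to 'en' for a rough comparison."""
    return code.split("-")[0].lower() if code else None


def review_fallback_plan(transcriber: dict, compliance_enabled: bool = False) -> List[str]:
    """Return human-readable warnings; an empty list means nothing was flagged."""
    warnings = []
    plan = transcriber.get("fallbackPlan", {})
    seen = {transcriber["provider"]}
    primary_language = base_language(transcriber.get("language"))

    for fallback in plan.get("transcribers", []):
        provider = fallback["provider"]
        if provider in seen:
            warnings.append(f"{provider} is repeated; a provider-wide outage would hit it twice")
        seen.add(provider)
        if compliance_enabled and provider not in COMPLIANT_PROVIDERS:
            warnings.append(f"{provider} is not available under HIPAA/PCI compliance")
        language = base_language(fallback.get("language"))
        if primary_language and language and language != primary_language:
            warnings.append(f"{provider} uses language {language!r}, primary uses {primary_language!r}")

    if compliance_enabled and plan.get("autoFallback", {}).get("enabled", False):
        warnings.append("auto fallback may route audio to other providers; review compliance settings")
    return warnings


example = {
    "provider": "deepgram",
    "language": "en",
    "fallbackPlan": {
        "autoFallback": {"enabled": True},
        "transcribers": [
            {"provider": "assembly-ai", "language": "en"},
            {"provider": "azure", "language": "en-US"},
        ],
    },
}
for warning in review_fallback_plan(example, compliance_enabled=True):
    print(warning)
```

With compliance enabled, the example configuration from this page produces two warnings (the AssemblyAI fallback and the auto fallback routing note); without it, nothing is flagged. Note that a multilingual setting such as `multi` would be reported as a language mismatch under this simple comparison.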

FAQ

Auto fallback lets Vapi select an alternative provider for you, with no provider configuration needed. Manual fallback lets you specify exact providers in a specific priority order. You can use both together: manual fallbacks are tried first, and auto fallback acts as a safety net if they all fail.

All major transcriber providers are supported as fallbacks: Deepgram, AssemblyAI, Azure, Gladia, Google, Speechmatics, Cartesia, ElevenLabs, and OpenAI.

There are no additional fees for using fallback transcribers. You are billed only for the transcriber that processes the audio.

Failover typically occurs within milliseconds of detecting a failure, ensuring minimal disruption to the call.

Auto fallback can affect where your audio is processed. When it activates, Vapi may route transcription audio to a different cloud provider than your primary. If you have data residency or compliance requirements, we recommend reviewing your organization's policies to ensure this aligns with your needs.

Each fallback transcriber can have its own language configuration. However, for the best user experience, we recommend using the same or similar languages across all fallbacks.