Build and deploy your Flash application to Runpod Serverless endpoints in one step.
The flash deploy command is the primary way to get your Flash application running in the cloud. It combines the build process with deployment, taking your local code and turning it into live serverless endpoints on Runpod.
When to use this command:
- Deploying your application for the first time
- Pushing code updates to an existing environment
- Setting up new environments (dev, staging, production)
- Testing your full distributed system with --preview before going live
What happens during deployment:
- Build: Packages your code, dependencies, and manifest (same as flash build)
- Upload: Sends the artifact to Runpod's storage
- Provision: Creates or updates serverless endpoints based on your endpoint configs
- Configure: Sets up environment variables, volumes, and service discovery
- Verify: Confirms endpoints are healthy and displays access information
Key features:
- One command: No need to run build and deploy separately
- Smart environment handling: Auto-selects environment if only one exists, prompts if multiple
- Incremental updates: Only updates what changed, preserving endpoint URLs
- Preview mode: Test locally with Docker before deploying to production
With flash deploy, your entire application runs on Runpod Serverless -- all endpoints deploy as peer serverless endpoints:
┌─────────────────────────────────────────────────────────────────┐
│                        RUNPOD SERVERLESS                        │
│                                                                 │
│  All endpoints deployed as peers, using manifest for discovery  │
│                                                                 │
│  ┌─────────────────────────┐    ┌─────────────────────────┐     │
│  │ gpu-worker              │    │ cpu-worker              │     │
│  │ (your Endpoint function)│    │ (your Endpoint function)│     │
│  └─────────────────────────┘    └─────────────────────────┘     │
│                                                                 │
│  ┌─────────────────────────┐                                    │
│  │ lb-worker               │                                    │
│  │ (load-balanced endpoint)│                                    │
│  └─────────────────────────┘                                    │
│                                                                 │
│  Service discovery: flash_manifest.json + State Manager GraphQL │
└─────────────────────────────────────────────────────────────────┘
                                 ▲
                                 │ HTTPS (authenticated)
                                 │
                           ┌─────┴─────┐
                           │   USERS   │
                           └───────────┘
Key points:
- All endpoints run on Runpod as serverless endpoints
- Users call endpoint URLs directly (e.g., https://{id}.api.runpod.ai/api/hello for LB, https://api.runpod.ai/v2/{id}/runsync for QB)
- No live- prefix on endpoint names (these are production endpoints)
- No hot reload: code changes require a new deployment
This is different from flash run, where your FastAPI app runs locally on your machine. See flash run for the hybrid development architecture.
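The two URL patterns listed in the key points above can be sketched in Python as follows. This is an illustrative sketch using only the standard library, not part of Flash: the helper names (`qb_runsync_url`, `lb_route_url`, `build_request`) are hypothetical, and the bearer-token header is assumed from the "HTTPS (authenticated)" arrow in the diagram.

```python
import json
import urllib.request


def qb_runsync_url(endpoint_id: str) -> str:
    """URL pattern for a QB endpoint, as given in the key points."""
    return f"https://api.runpod.ai/v2/{endpoint_id}/runsync"


def lb_route_url(endpoint_id: str, route: str) -> str:
    """URL pattern for a route on an LB endpoint, as given in the key points."""
    return f"https://{endpoint_id}.api.runpod.ai/{route.lstrip('/')}"


def build_request(url: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request.

    Assumption: the API key is sent as a bearer token.
    """
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be inspected without network access.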
| Aspect | flash run | flash deploy |
|---|---|---|
| App runs on | Your machine (localhost) | Runpod Serverless |
| Endpoint functions run on | Runpod Serverless | Runpod Serverless |
| Endpoint naming | live- prefix (e.g., live-gpu-worker) | No prefix (e.g., gpu-worker) |
| Hot reload | Yes | No |
| Use case | Development & testing | Production deployment |
| Build artifact created | No | Yes (tarball + manifest) |
flash deploy [OPTIONS]

- --env, -e: Target environment name (auto-selected if only one exists)
- --app, -a: Flash app name (auto-detected from current directory)
- --no-deps: Skip transitive dependencies during pip install (default: false)
- --exclude: Comma-separated packages to exclude (e.g., 'torch,torchvision')
- --output, -o: Custom archive name (default: artifact.tar.gz)
- --preview: Build and launch local preview environment instead of deploying
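When flash deploy is driven from a script or CI job, it can help to assemble the argument list programmatically. The sketch below is a hypothetical helper (`build_deploy_command` is not part of Flash) that uses only the flags listed above and returns an argv list suitable for `subprocess.run`.

```python
from typing import List, Optional, Sequence


def build_deploy_command(
    env: Optional[str] = None,
    app: Optional[str] = None,
    exclude: Sequence[str] = (),
    no_deps: bool = False,
    output: Optional[str] = None,
    preview: bool = False,
) -> List[str]:
    """Assemble a `flash deploy` argv list from the documented options."""
    cmd = ["flash", "deploy"]
    if app:
        cmd += ["--app", app]
    if env:
        cmd += ["--env", env]
    if exclude:
        # --exclude takes a single comma-separated value
        cmd += ["--exclude", ",".join(exclude)]
    if no_deps:
        cmd.append("--no-deps")
    if output:
        cmd += ["--output", output]
    if preview:
        cmd.append("--preview")
    return cmd
```

Returning a list rather than a single string avoids shell quoting issues; a caller would typically run it with `subprocess.run(cmd, check=True)`.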
# Build and deploy (auto-selects environment if only one exists)
flash deploy
# Deploy to specific environment
flash deploy --env staging
# Deploy to specific app and environment
flash deploy --app my-project --env production
# Deploy with excluded packages (reduces deployment size)
flash deploy --exclude torch,torchvision,torchaudio
# Build and test locally before deploying
flash deploy --preview
# Combine options
flash deploy --env staging --exclude torch --no-deps

The deploy command combines building and deploying your Flash application in a single step:
- Build Phase: Creates deployment artifact (see flash build for details)
  - Scans project for Endpoint definitions
  - Groups endpoints by resource configuration
  - Creates flash_manifest.json for service discovery
  - Generates handlers for each endpoint type
  - Installs dependencies with Linux x86_64 compatibility
  - Packages everything into .flash/artifact.tar.gz
- Environment Resolution:
  - Auto-detects app name from current directory
  - If no app exists, creates it automatically
  - If --env specified, uses that environment (creates if missing)
  - If only one environment exists, uses it automatically
  - If multiple environments exist, prompts for selection
- Deployment Phase:
  - Uploads the build artifact to Runpod storage
  - Provisions Serverless endpoints based on resource configs
  - Configures endpoints with environment variables and volumes
  - Sets up service discovery for cross-endpoint function calls
  - Registers endpoints in environment tracking
- Post-Deployment:
  - Displays deployment URLs and available routes
  - Shows authentication and testing guidance
  - Cleans up temporary build directory
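The environment resolution rules above can be summarized as a small decision function. This is a sketch of the documented behavior, not Flash's actual implementation: `resolve_environment` is a hypothetical name, the 'production' default comes from the environment section of this page, and the multiple-environment case is modeled as an exception where the CLI would ask for a selection.

```python
from typing import Optional, Sequence, Tuple

DEFAULT_ENV = "production"  # created when no environment exists and none is requested


def resolve_environment(
    existing: Sequence[str], requested: Optional[str] = None
) -> Tuple[str, bool]:
    """Return (environment name, needs_creation) following the documented rules."""
    if requested is not None:
        # --env given: use it, creating it if missing
        return requested, requested not in existing
    if not existing:
        # nothing exists yet: fall back to the default environment
        return DEFAULT_ENV, True
    if len(existing) == 1:
        # exactly one environment: auto-select it
        return existing[0], False
    # several environments and no --env: the caller must choose one
    raise LookupError("Multiple environments found: " + ", ".join(existing))
```

For example, `resolve_environment(["dev"], "staging")` returns `("staging", True)`, meaning the staging environment would be created before deployment.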
During deploy, Flash updates manifest metadata with runtime endpoint details (for example endpoint_id, endpoint URLs, and aiKey when returned by the API).
- The manifest stored in State Manager keeps runtime metadata used for reconciliation.
- The local .flash/flash_manifest.json is sanitized before writing to disk and does not persist aiKey.
- RUNPOD_API_KEY continues to be resolved from credentials/env at runtime and is not stored in the local manifest.
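To illustrate what sanitizing the local manifest might look like, the sketch below removes aiKey from a nested dictionary before it would be written to disk. The manifest layout shown in the usage note is invented for illustration, and `sanitize_manifest` is a hypothetical helper, not Flash's own code.

```python
from typing import Any

SENSITIVE_KEYS = {"aiKey"}  # the only sensitive field named in this document


def sanitize_manifest(manifest: Any) -> Any:
    """Return a copy of the manifest with sensitive keys removed at any depth."""
    if isinstance(manifest, dict):
        return {
            key: sanitize_manifest(value)
            for key, value in manifest.items()
            if key not in SENSITIVE_KEYS
        }
    if isinstance(manifest, list):
        return [sanitize_manifest(item) for item in manifest]
    # scalars are returned unchanged
    return manifest
```

Given a hypothetical manifest such as `{"resources": {"gpu-worker": {"endpoint_id": "abc", "aiKey": "secret"}}}`, the function keeps endpoint_id and drops aiKey, and leaves the input object unmodified.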
The deploy command supports all build options from flash build:
flash deploy --no-deps

Only installs direct dependencies specified in Endpoint definitions. Useful when your base image already includes common packages.
flash deploy --exclude torch,torchvision,torchaudio

Skips specified packages during dependency installation. Critical for staying under Runpod's 1.5GB deployment limit. See flash build for base image package reference.
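A quick local check of the artifact size can indicate whether --exclude is needed before uploading. The sketch below rests on two assumptions: that the limit applies to the packaged archive at .flash/artifact.tar.gz, and that 1.5GB means 1.5 × 1024³ bytes. Neither is confirmed by this page, and `artifact_within_limit` is a hypothetical helper.

```python
import os

# Assumption: "1.5GB" interpreted as 1.5 * 1024**3 bytes
LIMIT_BYTES = int(1.5 * 1024**3)


def artifact_within_limit(path: str, limit_bytes: int = LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is no larger than the limit."""
    return os.path.getsize(path) <= limit_bytes
```

After a build, calling `artifact_within_limit(".flash/artifact.tar.gz")` would report whether the archive fits under the assumed limit.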
flash deploy --preview

Builds your project and launches a local Docker-based test environment instead of deploying to Runpod. This allows you to test your distributed system locally before production deployment.
What happens:
- Builds your project (creates the archive and manifest)
- Creates a Docker network for inter-container communication
- Starts one Docker container per resource config
- Exposes the application on localhost:8000
- All containers communicate via Docker DNS
- On shutdown (Ctrl+C), automatically stops and removes all containers
See flash build for more details on preview mode.
An environment is an isolated deployment context within a Flash app. Each environment is a separate "stage" (like dev, staging, or production) that contains its own deployed endpoints, build versions, and deployment status.
For more details about environment management, see flash env.
If the specified environment doesn't exist, flash deploy creates it automatically:
# Creates 'staging' if it doesn't exist
flash deploy --env staging

If no environment is specified and none exist, it creates a 'production' environment by default.
When you have only one environment, it's selected automatically:
# Auto-selects the only available environment
flash deploy

When multiple environments exist, you must specify which one:
# Error: Multiple environments found
flash deploy
# Solution: Specify environment
flash deploy --env staging

After successful deployment, the command displays guidance for using your deployed application:
All endpoints require authentication with your Runpod API key:
export RUNPOD_API_KEY="your_key_here"

QB endpoints:
curl -X POST "https://api.runpod.ai/v2/{ENDPOINT_ID}/runsync" \
-H "Authorization: Bearer $RUNPOD_API_KEY" \
-H "Content-Type: application/json" \
-d '{"input": {"key": "value"}}'

LB endpoints:
curl -X POST "https://{ENDPOINT_ID}.api.runpod.ai/predict" \
-H "Authorization: Bearer $RUNPOD_API_KEY" \
-H "Content-Type: application/json" \
-d '{"key": "value"}'

# Check environment status
flash env get production
# View in Runpod Console
# https://console.runpod.io/serverless

# Deploy updated code to same environment
flash deploy --env production

Problem: Error: Multiple environments found: dev, staging, production
Solution: Specify the target environment:
flash deploy --env staging

If the build phase fails, see flash build troubleshooting for common build issues.
Problem: Deployment exceeds Runpod's 1.5GB limit
Solution: Use --exclude to skip packages already in your base image:
flash deploy --exclude torch,torchvision,torchaudio

Problem: 401 Unauthorized when calling endpoints
Solution: Ensure your API key is set correctly:
echo $RUNPOD_API_KEY
flash login

- flash build - Build without deploying
- flash env - Manage deployment environments
- flash app - Manage Flash applications
- flash undeploy - Remove deployed endpoints
- flash run - Local development server