Merged
9 changes: 6 additions & 3 deletions blueprints/index.html.erb
@@ -10,12 +10,15 @@ nav: firecracker
</figure>

<p>
A new&mdash;and growing!&mdash;library of blueprints showing how to run, design, build, and deploy different kinds of apps on Fly.io. This isnt just a showcase for stuff weve built, but a collection of patterns and examples that you can can apply in your own projects.
A new&mdash;and growing!&mdash;library of "blueprints" showing how to run, design, build, and deploy different kinds of apps on Fly.io. This isn't just a showcase for stuff we've built, but a collection of patterns and examples that you can apply in your own projects.
</p>

<ul>
<% current_page.children.each do |page| %>
<% next if page.data['published'] == false %>
<%# Define a very old date for sorting pages without a date %>
<% default_date = Date.new(1970, 1, 1) %>

<%# Select published pages, sort by date descending (handling missing dates), then loop %>
<% current_page.children.select { |p| p.data['published'] != false }.sort_by { |p| p.data['date'] || default_date }.reverse.each do |page| %>
<li>
<%= link_to_page page %>
</li>
58 changes: 58 additions & 0 deletions blueprints/per-user-dev-environments.html.md
@@ -0,0 +1,58 @@
---
title: Per-User Dev Environments with Fly Machines
layout: docs
nav: firecracker
date: 2025-04-02
---

Fly Machines are fast-launching VMs behind [a simple API](https://fly.io/docs/machines/api), enabling you to launch tightly isolated app instances in milliseconds [all over the world](https://fly.io/docs/reference/regions/).

One interesting use case: running isolated dev environments for your users (or robots). Fly Machines are a safe execution sandbox for even the sketchiest user-generated (or LLM-generated) code.

This blueprint explains how to use Fly Machines to securely host ephemeral development and/or execution environments, complete with [dynamic subdomain routing](https://fly.io/docs/networking/dynamic-request-routing) using `fly-replay`.

## Overview

Your architecture should include:

- **Router app(s)**
  - A Fly.io app that handles requests to wildcard subdomains (`*.example.com`). It uses `fly-replay` headers to transparently redirect each request to the correct app and Machine. If you have clusters of users (or robots) in different geographic regions, you can run the router app in multiple regions (you might also want to consider a globally distributed datastore like [Upstash for Redis](https://fly.io/docs/upstash/redis/#what-you-should-know)).
- **User apps (pre-created)**
  - Dedicated per-user (or per-robot) Fly apps, each containing isolated Fly Machines. App and Machine creation is not instantaneous, so we recommend provisioning a pool of these ahead of time so you can assign one quickly on request.
- **Fly Machines (with optional volumes)**
  - Fast-launching VMs that can be attached to persistent [Fly Volumes](https://fly.io/docs/volumes).

### Example Architecture Diagram

<img src="/static/images/docs-sandbox-architecture.webp" alt="Diagram showing router app directing traffic to user apps containing Fly Machines with volumes">

### Router app(s)

Your router app handles all incoming wildcard traffic. Its responsibilities are simple:

- Extract the subdomain from each request (for example, `alice` from `alice.example.com`, which maps to the app `alice-123`).
- Look up the correct app (and optionally machine ID) for that user.
- Issue a `fly-replay` header directing the Fly Proxy to [internally redirect the request](https://fly.io/docs/networking/dynamic-request-routing) (this should add no more than ~10 milliseconds of latency).
- Setup prerequisite: make sure you've added [a wildcard domain](https://fly.io/docs/networking/custom-domain/#get-certified) (`*.example.com`) to your router app (read more about the [certificate management endpoint](https://fly.io/docs/networking/custom-domain-api/)).
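
The routing steps above can be sketched as a small Rack-style app. This is a minimal sketch under stated assumptions, not a definitive implementation: `ROUTES`, `BASE_DOMAIN`, `subdomain_for`, `replay_header`, and `RouterApp` are hypothetical names, the Machine ID is made up, the in-memory hash stands in for whatever datastore you use, and the `200` status is an arbitrary choice for the sketch. The `app=...;instance=...` value follows the `fly-replay` field format described in the dynamic request routing docs.

```ruby
# Hypothetical routing table: subdomain => target app and (optional) Machine ID.
# In a real router this would live in a datastore, not a constant.
ROUTES = {
  "alice" => { app: "alice-123", machine_id: nil },
  "bob"   => { app: "bob-456", machine_id: "148e123a456789" } # made-up ID
}.freeze

BASE_DOMAIN = "example.com" # assumption: your wildcard domain

# Returns the subdomain label for hosts like "alice.example.com", or nil.
def subdomain_for(host, base_domain = BASE_DOMAIN)
  host = host.to_s.downcase.sub(/:\d+\z/, "") # drop any port
  suffix = ".#{base_domain}"
  return nil unless host.end_with?(suffix)

  label = host.delete_suffix(suffix)
  label.empty? || label.include?(".") ? nil : label
end

# Builds the fly-replay header value for a route entry.
def replay_header(route)
  parts = ["app=#{route[:app]}"]
  parts << "instance=#{route[:machine_id]}" if route[:machine_id]
  parts.join(";")
end

# Rack-compatible app: any object with #call(env) returning [status, headers, body].
class RouterApp
  def call(env)
    route = ROUTES[subdomain_for(env["HTTP_HOST"])]
    return [404, { "content-type" => "text/plain" }, ["unknown environment\n"]] unless route

    # Respond with the fly-replay header so Fly Proxy can replay the request
    # against the target app (and Machine, when one is pinned).
    [200, { "fly-replay" => replay_header(route) }, []]
  end
end
```

Because the class only relies on the Rack calling convention, it could be mounted with `run RouterApp.new` in a `config.ru`, or the two helper methods could be lifted into whatever web framework your router already uses.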

### User apps

Creating apps dynamically for each user at request time can be slow. To ensure fast provisioning:

- **Pre-create** a pool of Fly apps and Machines ahead of time (using the [Fly Machines API or CLI](https://fly.io/docs/apps/overview/)).
- Store app details (e.g., `app_name`: `alice-123`) in a datastore accessible to your router app.
- Assign apps to users at provisioning time.
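
As an illustration of the pool-and-assign pattern above, here is a minimal in-memory sketch. `AppPool` and the `env-00x` app names are hypothetical, and a real deployment would keep this state in a shared datastore so that every router instance sees the same assignments.

```ruby
# Hypothetical in-memory pool of pre-created app names.
class AppPool
  def initialize(app_names)
    @available = app_names.dup
    @assigned = {} # user_id => app_name
    @lock = Mutex.new
  end

  # Returns the user's app, claiming one from the pool on first request.
  # Returns nil when the pool is empty so the caller can provision more.
  def assign(user_id)
    @lock.synchronize do
      return @assigned[user_id] if @assigned.key?(user_id)

      app = @available.shift
      @assigned[user_id] = app if app
      app
    end
  end

  # Number of pre-created apps still waiting for a user.
  def available_count
    @lock.synchronize { @available.size }
  end
end

pool = AppPool.new(%w[env-001 env-002])
pool.assign("alice") # => "env-001"
pool.assign("alice") # => "env-001" (assignment is sticky)
pool.available_count # => 1
```

Watching `available_count` gives you a simple trigger for topping the pool back up before it runs dry.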

### Fly Machines

You'll want to spin up at least one Machine per user app (but apps can have as many Machines as needed). If your dev environments need persistent storage (data that should survive Machine restarts):

- Attach a Fly Volume to each Machine at creation time.
- Keep in mind that Machine restarts clear temporary filesystem state but preserve volume data.
- Learn more about the [Machines API resource](https://fly.io/docs/machines/api/machines-resource/) and the [Volumes API resource](https://fly.io/docs/machines/api/volumes-resource/).
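
To make the "attach at creation time" step concrete, the sketch below builds a create-Machine request body that mounts an existing volume, then posts it to the Machines API. Treat it as a sketch under assumptions: `machine_create_body` and `create_machine` are hypothetical helpers, the image, region, volume ID, and mount path are placeholders, and reading the token from `FLY_API_TOKEN` is just one way to supply credentials. Check the field names against the Machines API resource docs before relying on them.

```ruby
require "json"
require "net/http"
require "uri"

API_BASE = "https://api.machines.dev/v1"

# Builds the JSON body for a create-Machine request. When volume_id is given,
# the volume is mounted at mount_path inside the Machine.
def machine_create_body(image:, region:, volume_id: nil, mount_path: "/data")
  config = { image: image }
  config[:mounts] = [{ volume: volume_id, path: mount_path }] if volume_id
  JSON.generate({ region: region, config: config })
end

# Sends the request. Not called here; it needs a real app and API token.
def create_machine(app_name, body, token: ENV.fetch("FLY_API_TOKEN"))
  uri = URI("#{API_BASE}/apps/#{app_name}/machines")
  request = Net::HTTP::Post.new(
    uri,
    "Authorization" => "Bearer #{token}",
    "Content-Type" => "application/json"
  )
  request.body = body
  Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(request) }
end

body = machine_create_body(
  image: "registry.fly.io/dev-env:latest", # placeholder image
  region: "ord",
  volume_id: "vol_123"                     # placeholder volume ID
)
```

The volume has to exist in the same app and region before the Machine is created, so a provisioning script would typically create the volume first and pass its ID in here.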

## Pointers & Footguns

- **Machines & volumes are tied to physical hardware:** hardware failures can destroy machines and attached volumes. **Always persist important user data** (code, config, outputs) to external storage (like [Tigris Data](https://fly.io/docs/tigris/#main-content-start) or AWS S3).
- **Your users will break their environments:** pre-create standby Machines that you can quickly activate after hardware or runtime failures, or when a user (or robot) inevitably poisons an environment.
- **Machine restarts reset the ephemeral filesystem:** the temporary Fly Machine filesystem state resets on every Machine restart, ensuring clean environments. Volume data persists across restarts, which makes volumes useful for retaining user progress or state.
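
The standby advice above might look like the following when an environment has to be swapped out. This is only a sketch: `replace_environment`, the `routes` hash, and the `standbys` array are hypothetical stand-ins for your router's datastore, and restoring user data from external storage is left to the caller.

```ruby
# Points the user at a fresh standby app and reports which app was retired.
# Mutates routes (user_id => app_name) and standbys (array of idle app names).
# Returns nil, changing nothing, when no standby is available.
def replace_environment(routes, standbys, user_id)
  replacement = standbys.shift
  return nil unless replacement

  retired = routes[user_id]
  routes[user_id] = replacement
  { retired: retired, active: replacement }
end

routes = { "alice" => "env-001" }
standbys = ["env-007"]
replace_environment(routes, standbys, "alice")
# => { retired: "env-001", active: "env-007" }
```

The retired app can then be destroyed or rebuilt out of band and returned to the standby list once it is clean.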