23 commits
cfe9c54
pci: bdf: add parsing from &str for PciSBDF
ShadowCurse Jun 17, 2026
2b79fe6
vfio: add VfioConfig types and wire into VmResources
ShadowCurse Apr 8, 2026
ed01fc0
vfio: swagger: add REST API endpoint and metrics for VFIO devices
ShadowCurse Apr 8, 2026
836ea69
vfio: make ArrayVec a non-optional dependency
ShadowCurse Apr 8, 2026
49c13ac
pci: bars: add 32-bit BAR support and more generic utilities
ShadowCurse Apr 8, 2026
bce43e7
pci: add accessor methods to MsixCap
ShadowCurse Apr 8, 2026
d837084
vfio: add host-page-aligned utility functions
ShadowCurse Apr 8, 2026
676a183
vfio: add vfio-bindings and vfio-ioctls dependencies
ShadowCurse Apr 8, 2026
0436e08
pci: make BAR size decode functions public
ShadowCurse Jun 25, 2026
762bec2
vfio: add VFIO PCI device passthrough support
ShadowCurse Apr 8, 2026
661537a
devtool: add --vfio-device and --first-vfio-pci-device options
ShadowCurse Apr 7, 2026
9e02123
vfio: add functional integration tests for NVMe passthrough
ShadowCurse Apr 8, 2026
000cd21
vfio: disallow starting VM with VFIO with incompatible configurations
ShadowCurse Apr 17, 2026
1998dee
vfio: disallow taking snapshots with VFIO devices attached
ShadowCurse Apr 28, 2026
065d7a0
vfio: seccomp: allow syscalls for VFIO device runtime
ShadowCurse Apr 10, 2026
7556a55
vfio: guest_kernel: add NVMe driver configs
ShadowCurse Apr 7, 2026
d1bc13d
vfio: ci: add dedicated Buildkite step for VFIO tests
ShadowCurse Apr 14, 2026
fedf340
vfio: hotplug: add post VM start VFIO hotplug
ShadowCurse May 13, 2026
f9b272f
vfio: seccomp: allow syscalls for VFIO hotplug
ShadowCurse May 13, 2026
d3c003d
vfio: hot-unplug: add post VM start VFIO hot-unplug
ShadowCurse May 14, 2026
57ad855
changelog: vfio: add a note about VFIO passthrough
ShadowCurse May 1, 2026
81a57dc
vfio: doc: add VFIO passthrough documentation
ShadowCurse Apr 17, 2026
b3e961b
do not merge: point to vfio artifacts
ShadowCurse May 5, 2026
9 changes: 9 additions & 0 deletions .buildkite/pipeline_pr.py
@@ -99,6 +99,15 @@
),
)

pipeline.build_group(
"vfio",
pipeline.devtool_test(
devtool_opts="--vfio-nvme-device /dev/sdf --first-vfio-nvme-device -c 1-10",
pytest_opts="-m vfio integration_tests/functional/",
),
**DEFAULTS_PERF,
)

pipeline.build_group(
"performance",
pipeline.devtool_test(
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -10,6 +10,10 @@ and this project adheres to

### Added

- [#5870](https://github.com/firecracker-microvm/firecracker/pull/5870): Add
basic VFIO support allowing for PCIe device passthrough into the VM. See the
[documentation](docs/vfio.md) for instructions and current limitations.

### Changed

### Deprecated
25 changes: 25 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions docs/device-hotplug.md
@@ -12,6 +12,7 @@ running microVM without requiring a reboot. Supported device types are:
- `virtio-block`
- `virtio-pmem`
- `virtio-net`
- `vfio`

## Prerequisites

148 changes: 148 additions & 0 deletions docs/vfio.md
@@ -0,0 +1,148 @@
# VFIO Device Passthrough

## What is VFIO

VFIO (Virtual Function I/O) is a Linux kernel framework that allows userspace
programs to directly access physical devices in a secure, IOMMU-protected
environment. Firecracker uses VFIO to pass through PCI devices from the host
into the guest, giving the guest near-native performance access to physical
hardware such as GPUs, network adapters, and NVMe drives.

## Prerequisites

VFIO passthrough requires:

- Firecracker must be started with the `--enable-pci` flag since VFIO devices
are PCI devices.
- An IOMMU (Intel VT-d, AMD-Vi, or ARM SMMU) must be enabled on the host.
- The host must have the `vfio` and `vfio-pci` kernel modules loaded.
- The target PCI device must be unbound from its native kernel driver and bound
to the `vfio-pci` driver.
- All devices in the same IOMMU group must be bound to `vfio-pci`.
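
The IOMMU group requirement can be checked before starting Firecracker. The sketch below is not part of this PR; it assumes the standard Linux sysfs layout, where `<device>/iommu_group/devices` lists the group members and each member's `driver` entry is a symlink to its bound driver. The function names and the `sysfs_root` parameter are hypothetical and exist only to make the example testable.

```python
import os


def iommu_group_drivers(sbdf, sysfs_root="/sys"):
    """Map every device in sbdf's IOMMU group to its bound driver (or None)."""
    group_devices = os.path.join(
        sysfs_root, "bus/pci/devices", sbdf, "iommu_group", "devices"
    )
    drivers = {}
    # Raises FileNotFoundError if the device has no IOMMU group at all.
    for member in sorted(os.listdir(group_devices)):
        driver_link = os.path.join(group_devices, member, "driver")
        if os.path.islink(driver_link):
            # The symlink target ends with the driver name, e.g. ".../vfio-pci".
            drivers[member] = os.path.basename(os.readlink(driver_link))
        else:
            drivers[member] = None  # no driver bound
    return drivers


def group_ready_for_vfio(sbdf, sysfs_root="/sys"):
    """True only if every group member is bound to vfio-pci."""
    return all(
        driver == "vfio-pci"
        for driver in iommu_group_drivers(sbdf, sysfs_root).values()
    )
```

This applies the rule exactly as stated in the list above, so any group member bound to a different driver, or to none, makes the check fail.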

## How to bind a device to the `vfio-pci` driver

To bind a device (e.g. `0000:11:22.3`) to `vfio-pci`:

```bash
# Unbind from current driver
echo "0000:11:22.3" > /sys/bus/pci/devices/0000:11:22.3/driver/unbind
# Bind to vfio-pci
echo "vfio-pci" > /sys/bus/pci/devices/0000:11:22.3/driver_override
echo "0000:11:22.3" > /sys/bus/pci/drivers/vfio-pci/bind
```

## Configuration

Firecracker exposes the following configuration options for VFIO devices:

- `id` - unique identifier for the device
- `sbdf` - host PCI device identifier, accepted in any of the following forms:
  - full sysfs path: `/sys/bus/pci/devices/0000:01:02.03`
  - full SBDF: `0000:01:02.03`
  - short BDF: `01:02.03`
  - hex integer: `0x010203`
  - decimal integer: `66051`
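
To illustrate how the string forms of `sbdf` relate to each other, here is a minimal Python sketch. It is not the `PciSBDF` parser added in this PR (that one is written in Rust and is not shown here); `parse_sbdf` is a hypothetical helper. It assumes all fields are hexadecimal and that a missing segment means segment 0. The integer forms are left out on purpose, because the text above does not state how the integer is packed.

```python
import re

SYSFS_PREFIX = "/sys/bus/pci/devices/"

# Optional "SSSS:" segment, then "BB:DD.F" with hexadecimal fields.
_SBDF_RE = re.compile(
    r"^(?:(?P<segment>[0-9a-fA-F]{1,4}):)?"
    r"(?P<bus>[0-9a-fA-F]{1,2}):"
    r"(?P<device>[0-9a-fA-F]{1,2})\."
    r"(?P<function>[0-9a-fA-F]{1,2})$"
)


def parse_sbdf(text):
    """Return (segment, bus, device, function) for a path, SBDF or BDF string."""
    if text.startswith(SYSFS_PREFIX):
        text = text[len(SYSFS_PREFIX):]
    match = _SBDF_RE.match(text)
    if match is None:
        raise ValueError(f"not a recognised SBDF string: {text!r}")
    # Assumption: a short BDF without a segment refers to segment 0.
    segment = int(match.group("segment") or "0", 16)
    bus = int(match.group("bus"), 16)
    device = int(match.group("device"), 16)
    function = int(match.group("function"), 16)
    # Field ranges are not validated here; this is only a format sketch.
    return (segment, bus, device, function)
```

Under these assumptions, the three string forms of the example address all parse to the same tuple.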

### Config file

```json
"vfio": [
{
"id": "device0",
"sbdf": "/sys/bus/pci/devices/0000:11:22.3"
}
]
```

### API

#### Add device

The same `PUT /vfio/{id}` endpoint works both before and after boot:

```console
curl --unix-socket $socket_location -i \
-X PUT 'http://localhost/vfio/device0' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d "{
\"id\": \"device0\",
\"sbdf\": \"/sys/bus/pci/devices/0000:01:02.03\"
}"
```

#### Remove device

A VFIO device can be removed at runtime via `DELETE /vfio/{id}`:

```console
curl --unix-socket $socket_location -i \
-X DELETE 'http://localhost/vfio/device0'
```

Hot-unplug is only valid after the microVM has booted. The device is detached
from the guest PCI bus and all associated resources (DMA mappings, interrupts,
BAR memory) are released.
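
The two `curl` calls above can also be issued from Python using only the standard library. This is a sketch rather than an official client: `UnixHTTPConnection`, `vfio_request` and `send_vfio_request` are hypothetical names, and the only API details taken from this document are the `/vfio/{id}` path, the `PUT` and `DELETE` methods, and the `id` and `sbdf` body fields.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=10):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def vfio_request(method, device_id, sbdf=None):
    """Build (method, path, body) for adding or removing a VFIO device."""
    path = f"/vfio/{device_id}"
    if method == "PUT":
        if sbdf is None:
            raise ValueError("PUT requires an sbdf value")
        return method, path, json.dumps({"id": device_id, "sbdf": sbdf})
    if method == "DELETE":
        return method, path, None
    raise ValueError(f"unsupported method: {method}")


def send_vfio_request(socket_path, method, device_id, sbdf=None):
    """Send the request to the API socket and return (status, response text)."""
    method, path, body = vfio_request(method, device_id, sbdf)
    headers = {"Accept": "application/json"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request(method, path, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()
```

A call such as `send_vfio_request(socket_location, "DELETE", "device0")` would then mirror the hot-unplug `curl` command, with `socket_location` being the same API socket path used above.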

## Booting from a VFIO device

A passthrough block device (e.g. an NVMe SSD bound to `vfio-pci`) can serve as
the guest's root filesystem instead of a virtio-block drive. Firecracker does
not auto-detect this; you must point the guest kernel at the right device via
the boot arguments.

1. Configure the VFIO device as usual (see [Configuration](#configuration)) and
make sure no `is_root_device: true` virtio drive is configured.

1. In the boot source, set `boot_args` so that `root=` names the block device
the guest kernel will see for the passthrough device. For an NVMe namespace
that will appear as `/dev/nvme0n1`:

```json
"boot-source": {
"kernel_image_path": "/path/to/vmlinux",
"boot_args": "console=ttyS0 reboot=k panic=1 root=/dev/nvme0n1 ro"
}
```

Use `root=/dev/nvme0n1p1` (or similar) if the rootfs lives on a partition,
and adjust the device name for non-NVMe devices (`/dev/sda`, etc.).

Notes:

- The guest kernel must include the driver for the passthrough device (e.g.
`CONFIG_BLK_DEV_NVME=y`) and any filesystem it uses, either built-in or
available as an initrd-loadable module.
- If the device is hot-plugged after boot it cannot be the root device — the
kernel has already mounted root by then. Use cold-boot configuration for the
root device.
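
The checks described in this section can be sketched as a small pre-flight function over a parsed config file. This is illustrative only: `check_vfio_root_config` is a hypothetical helper, and the `drives` and `drive_id` keys are assumed from Firecracker's usual config file layout rather than taken from this PR. The `vfio`, `boot-source`, `boot_args` and `is_root_device` names come from the text above.

```python
def check_vfio_root_config(config, expected_root="/dev/nvme0n1"):
    """Return a list of problems that would prevent booting from a VFIO device."""
    problems = []
    if not config.get("vfio"):
        problems.append("no VFIO device is configured")
    # Assumed key names: "drives" and "drive_id".
    for drive in config.get("drives", []):
        if drive.get("is_root_device"):
            problems.append(f"drive {drive.get('drive_id')!r} sets is_root_device")
    boot_args = config.get("boot-source", {}).get("boot_args", "")
    roots = [
        arg[len("root="):] for arg in boot_args.split() if arg.startswith("root=")
    ]
    if not roots:
        problems.append("boot_args has no root= argument")
    elif expected_root not in roots:
        problems.append(f"root= is {roots[-1]!r}, expected {expected_root!r}")
    return problems
```

An empty list means the config passes these checks. It says nothing about whether the guest kernel actually contains the needed driver, which still has to be verified separately.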

## Security

- **IOMMU is mandatory.** Without an IOMMU, a passthrough device could DMA to
arbitrary host memory.
- **IOMMU groups.** All devices in the same IOMMU group must be assigned to the
same VM. Splitting a group across VMs would break DMA isolation. Linux already
enforces this behaviour.

## Snapshot support

VFIO devices do not support snapshots. Device state is opaque to the VMM and
cannot be serialized or restored. VMs with VFIO devices attached cannot be
snapshotted.

## Limitations

Review comment (Contributor): Should we mention somewhere (not as a limitation) that all guest memory will be allocated and pinned when using VFIO devices?

Reply (Contributor Author): added

| Limitation | Details |
| :-------------------------- | :----------------------------------------------------------------------------- |
| No memory over-subscription | All guest memory is paged in and pinned by the kernel. |
| No snapshots | Device state is opaque and cannot be saved/restored. |
| No BAR relocation | BAR addresses are assigned at init and cannot be moved. |
| No BAR resizing | Resizable BAR capability is masked from the guest. |
| No IO BARs | IO-type BARs are skipped. Devices relying solely on IO BARs will not work. |
| No ROM BAR | Expansion ROM BAR is not handled. |
| No MSI (non-X) | Only MSI-X interrupts are supported. Devices without MSI-X fail to initialize. |
| No INTx | Legacy pin-based interrupts are not supported. |
| No SR-IOV | SR-IOV capability is masked. Virtual Functions cannot be created. |
| No virtio-iommu | The guest has no IOMMU. DMA isolation relies entirely on the host IOMMU. |
2 changes: 2 additions & 0 deletions resources/guest_configs/nvme.config
@@ -0,0 +1,2 @@
CONFIG_NVME_CORE=y
CONFIG_BLK_DEV_NVME=y
11 changes: 6 additions & 5 deletions resources/rebuild.sh
@@ -225,15 +225,16 @@ function build_al_kernels {
 clone_amazon_linux_repo

 CI_CONFIG="$PWD/guest_configs/ci.config"
+NVME_CONFIG="$PWD/guest_configs/nvme.config"

 if [[ "$KERNEL_VERSION" == @(all|5.10) ]]; then
-build_al_kernel $PWD/guest_configs/microvm-kernel-ci-$ARCH-5.10.config "$CI_CONFIG"
+build_al_kernel $PWD/guest_configs/microvm-kernel-ci-$ARCH-5.10.config "$CI_CONFIG" "$NVME_CONFIG"
 fi
 if [[ $ARCH == "x86_64" && "$KERNEL_VERSION" == @(all|5.10-no-acpi) ]]; then
-build_al_kernel $PWD/guest_configs/microvm-kernel-ci-$ARCH-5.10-no-acpi.config "$CI_CONFIG"
+build_al_kernel $PWD/guest_configs/microvm-kernel-ci-$ARCH-5.10-no-acpi.config "$CI_CONFIG" "$NVME_CONFIG"
 fi
 if [[ "$KERNEL_VERSION" == @(all|6.1) ]]; then
-build_al_kernel $PWD/guest_configs/microvm-kernel-ci-$ARCH-6.1.config "$CI_CONFIG"
+build_al_kernel $PWD/guest_configs/microvm-kernel-ci-$ARCH-6.1.config "$CI_CONFIG" "$NVME_CONFIG"
 fi

# Build debug kernels
@@ -242,11 +243,11 @@ function build_al_kernels {
 OUTPUT_DIR=$OUTPUT_DIR/debug
 mkdir -pv $OUTPUT_DIR
 if [[ "$KERNEL_VERSION" == @(all|5.10) ]]; then
-build_al_kernel "$PWD/guest_configs/microvm-kernel-ci-$ARCH-5.10.config" "$CI_CONFIG" "$FTRACE_CONFIG" "$DEBUG_CONFIG"
+build_al_kernel "$PWD/guest_configs/microvm-kernel-ci-$ARCH-5.10.config" "$CI_CONFIG" "$FTRACE_CONFIG" "$NVME_CONFIG" "$DEBUG_CONFIG"
 vmlinux_split_debuginfo $OUTPUT_DIR/vmlinux-5.10.*
 fi
 if [[ "$KERNEL_VERSION" == @(all|6.1) ]]; then
-build_al_kernel "$PWD/guest_configs/microvm-kernel-ci-$ARCH-6.1.config" "$CI_CONFIG" "$FTRACE_CONFIG" "$DEBUG_CONFIG"
+build_al_kernel "$PWD/guest_configs/microvm-kernel-ci-$ARCH-6.1.config" "$CI_CONFIG" "$FTRACE_CONFIG" "$NVME_CONFIG" "$DEBUG_CONFIG"
 vmlinux_split_debuginfo $OUTPUT_DIR/vmlinux-6.1.*
 fi
 }