applicative-systems/nixos-gpu-tests

GPU/CUDA in the NixOS integration test driver

This repository contains everything needed to run CUDA inside the NixOS integration test driver.

  • To learn how to configure your system and run the tests, see Getting started.
  • The section Necessary patches explains the remaining pieces that are not (yet!) in nixpkgs.

CUDA in the test driver in action

Getting started

To run CUDA inside the sandbox, a list of host paths needs to be mapped into the sandbox. Hence, the first step is configuring the host.

Host configuration

First, make sure that your Nix daemon is configured to run the relatively new NixOS integration test container feature at all:

{
  nix.settings.auto-allocate-uids = true;

  nix.settings.experimental-features = [
    "auto-allocate-uids"
    "cgroups"
  ];

  nix.settings.extra-system-features = [
    "uid-range"
  ];

  # this one is only necessary for container <-> VM networking
  nix.settings.extra-sandbox-paths = [
    "/dev/net"
  ];
}

NVIDIA

On NVIDIA hosts, also enable the following configuration to ensure the right paths are visible inside the sandbox:

{
  hardware.graphics.enable = true;

  # ensure proprietary driver
  boot.blacklistedKernelModules = [ "nouveau" ];
  services.xserver.videoDrivers = [ "nvidia" ];

  # ensure proprietary and performance settings and latest driver
  boot.kernelPackages = pkgs.linuxPackages_latest;
  hardware.nvidia = {
    modesetting.enable = true;
    powerManagement.enable = false;
    powerManagement.finegrained = false;
    open = false;
    nvidiaSettings = true;
    package = config.boot.kernelPackages.nvidiaPackages.latest;
  };

  # ensure sandbox paths
  programs.nix-required-mounts = {
    enable = true;
    presets.nvidia-gpu.enable = true;
  };
}

AMD

(ignore this section if you are only interested in NVIDIA)

AMD GPU hosts need a configuration that is similar to the NVIDIA one, but slightly different.

This configuration snippet, based on the not-yet-upstreamed nixpkgs branch, contains the minimal configuration needed for AMD devices:

{
  hardware.graphics.enable = true;
  hardware.amdgpu.zluda.enable = true;

  programs.nix-required-mounts = {
    enable = true;
    presets.zluda.enable = true;
  };
}

Running GPU/CUDA stuff in the sandbox

To check whether your Nix daemon sandbox settings are correct, first run the minimal saxpy demo app in the sandbox without the NixOS integration test driver:

$ nix build -L .#cuda-sandbox
saxdemo> Start
saxdemo> Runtime version: 12090
saxdemo> Driver version: 13010
saxdemo> Host memory initialized, copying to the device
saxdemo> Scheduled a cudaMemcpy, calling the kernel
saxdemo> Scheduled a kernel call
saxdemo> Max error: 0.000000

If the output looks roughly like this without errors, your Nix sandbox works with CUDA!
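If the build fails instead, a smaller probe can help narrow down whether the sandbox mounts are the problem. The following is only a sketch and not part of this repository: the name gpu-sandbox-probe is made up, and it assumes that the nvidia-gpu preset of programs.nix-required-mounts exposes /run/opengl-driver to derivations that require the "cuda" feature.

```nix
# Hypothetical probe derivation, not part of this repository.
# Assumption: the nvidia-gpu preset mounts /run/opengl-driver into the
# sandbox of any derivation that requires the "cuda" system feature.
{ pkgs ? import <nixpkgs> { } }:

pkgs.runCommand "gpu-sandbox-probe"
  {
    # Ask the Nix daemon for a sandbox with the GPU paths mapped in.
    requiredSystemFeatures = [ "cuda" ];
  }
  ''
    if [ -e /run/opengl-driver ]; then
      echo "driver link is visible inside the sandbox" > $out
    else
      echo "/run/opengl-driver is missing inside the sandbox" >&2
      exit 1
    fi
  ''
```

If this probe builds but the saxpy demo does not, the sandbox paths are probably fine and the problem lies elsewhere, for example in the driver or the CUDA runtime.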

To run the minimal saxpy demo app in the prepared minimal container test, run:

$ nix build -L .#cuda-container-test-nvidia
# ...
vm-test-run-saxpy-cuda-test> container: (finished: must succeed: saxpy 2>&1, in 3.38 seconds)
vm-test-run-saxpy-cuda-test> (finished: run the VM test script, in 3.38 seconds)
vm-test-run-saxpy-cuda-test> test script finished in 3.41s
vm-test-run-saxpy-cuda-test> cleanup
# ...

The same test, prepared for AMD GPUs, is available as the attribute .#cuda-container-test-amd.

Necessary patches

GPU support in the NixOS test driver

The preceding work to upstream the necessary capabilities in nixpkgs happened in these PRs, which have already been merged:

This work implements containers in the test driver but does not yet allow GPU-related access in the container.

What's missing are the following two things:

  • the test driver needs to be patched to provide certain /run/... paths from the host sandbox to the container
  • it is necessary to attach "cuda" as a required feature to the test derivation
    • this flake does this here

For you as an outside user of this feature, this means:

  1. Use this overlay.nix when importing nixpkgs (like here)
  2. Add the "cuda" feature to the required list of your GPU test derivation (like here)
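The two steps could look roughly like the sketch below. It is not copied from this flake, which may wire things up differently. The arguments nixpkgs, system and mkGpuTest are placeholders: mkGpuTest stands for whatever function builds your test derivation from a package set. The sketch also assumes that overlay.nix evaluates directly to an overlay function.

```nix
# Sketch only; `nixpkgs`, `system` and `mkGpuTest` are placeholder names.
{ nixpkgs, system ? "x86_64-linux", mkGpuTest }:

let
  # Step 1: import nixpkgs with the overlay that patches the test driver.
  # Assumption: ./overlay.nix evaluates to a `final: prev: { ... }` function.
  pkgs = import nixpkgs {
    inherit system;
    config.allowUnfree = true; # the CUDA packages are unfree
    overlays = [ (import ./overlay.nix) ];
  };

  # Hypothetical: your own function returning the test derivation.
  gpuTest = mkGpuTest pkgs;
in
# Step 2: append "cuda" to the features the test derivation requires,
# keeping whatever features the test already asks for.
gpuTest.overrideAttrs (old: {
  requiredSystemFeatures = (old.requiredSystemFeatures or [ ]) ++ [ "cuda" ];
})
```

Appending to the existing list, rather than replacing it, keeps any features the test derivation already requires.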

These patches will likely become unnecessary in the future. We will keep this repository up to date.

Sandbox configuration with AMD and ZLUDA

(ignore this section if you are only interested in NVIDIA)

The configuration snippet in the AMD section will only work with upstream nixpkgs after the following PRs have been finalized and merged: