Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions _partials/launchpad-for-ai/_unreleased-banner.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
---
partial_category: launchpad-for-ai
partial_name: unreleased-banner
---

:::caution

This is unreleased documentation for Launchpad for AI. The product is not yet generally available.

Documentation is subject to change before release. We recommend against bookmarking as the link will change when the product is generally available.

:::
4 changes: 4 additions & 0 deletions docs/docs-content/launchpad-for-ai/_category_.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
{
"position": 1000,
"className": "hidden-category"
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
"label": "Explanation",
"position": 30,
"link": {
"type": "doc",
"id": "launchpad-for-ai/explanation/explanation"
}
}
39 changes: 39 additions & 0 deletions docs/docs-content/launchpad-for-ai/explanation/architecture.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
---
sidebar_label: "Architecture Overview"
title: "Launchpad for AI Architecture Overview"
description:
"An explanation of the Launchpad for AI architecture, including its component stack, data flow, and network topology."
hide_table_of_contents: false
sidebar_position: 1
tags: ["launchpad-for-ai", "architecture", "explanation"]
keywords: ["launchpad", "ai", "architecture", "kubernetes", "kairos", "helm", "data flow"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

This page explains how Launchpad for AI works, how its components interact, and what key decisions shaped the design.
Use this page to build an understanding of the architecture before you deploy or operate Launchpad for AI.

## Component Stack

The appliance serves each model through an inference engine, such as vLLM, running on its Kubernetes cluster. Each
loaded model is exposed as an OpenAI-compatible endpoint, such as `/v1/chat/completions` and `/v1/models`.

## Appliance and Cluster Formation

## Model Provisioning Lifecycle

When you deploy a model, the appliance places it automatically on the best-fit node and brings it through a guarded
sequence of gate, provision, smoke-test, and ready stages. A model is routable only after its smoke test passes, shown
as `serving Β· smoke-test passed`, so the console never shows a model as ready before it is serving.

## Request Routing

The gateway routes each request to a model. A request that names a model uses that model, and a request that does not
name a model falls back to the default model. When you change the default model, the gateway rebuilds its router in
place. The gateway does not restart, and it does not drain requests that are in progress. Requests that the gateway
already routed continue on their assigned model, and the new default applies only to later requests.

## Network Topology

## Data Residency and Isolation
19 changes: 19 additions & 0 deletions docs/docs-content/launchpad-for-ai/explanation/explanation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
---
sidebar_label: "Explanation"
title: "Launchpad for AI Explanation"
description: "Background, context, and design rationale for understanding how Launchpad for AI works."
hide_table_of_contents: false
sidebar_position: 0
tags: ["launchpad-for-ai", "explanation"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

Explanatory and conceptual guides help you understand how and why Launchpad for AI works the way it does. They cover
design decisions, component relationships, and trade-offs rather than walking you through tasks.

## Contents

| **Topic** | **What you understand** |
| ------------------------------------------ | --------------------------------------------------------------------------------------------- |
| [Architecture Overview](./architecture.md) | The component stack, request routing, model provisioning lifecycle, and data residency model. |
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
"label": "How-to Guides",
"position": 20,
"link": {
"type": "doc",
"id": "launchpad-for-ai/how-to-guides/how-to-guides"
}
}
64 changes: 64 additions & 0 deletions docs/docs-content/launchpad-for-ai/how-to-guides/add-a-model.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
---
sidebar_label: "Add a Model"
title: "Add a Model"
description:
"Step-by-step guidance for platform operators on how to add an LLM model to a running Launchpad for AI appliance and
verify that it is serving."
hide_table_of_contents: false
sidebar_position: 1
tags: ["launchpad-for-ai", "models", "how-to"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

This guide explains how to add a model to a running Launchpad for AI appliance and verify that the model is serving
requests. For background on the appliance and how it routes requests, refer to
[What is Launchpad for AI?](../launchpad-for-ai.md) and [Architecture](../explanation/architecture.md).

Launchpad for AI provides a set of recommended models, including GLM, DeepSeek, and Kimi. You can deploy models beyond
the recommended list, but any model must fit within the GPU resources available on the appliance. Before you select a
model, check the [Supported LLM Models](../reference/supported-models.md) and
[Hardware Requirements](../reference/hardware-requirements.md) reference pages to confirm that your appliance can
support it.

## Prerequisites

- A running Launchpad for AI appliance, with the admin console reachable and operator access.
- At least one node with free capacity for the model you intend to add.
- The desired model present in the appliance catalog.

## Add a Model

1. From the left main menu, select **Orchestration**, and then select the **Fleet** tab.

2. In the **Deploy new model** section, open the model drop-down menu and select the model to add.

3. (Optional) Open the engine drop-down menu and select an engine. Leave it on the automatic option to let the appliance
choose the engine.

4. Confirm that the placement line shows a best-fit node. If the card shows **Deploy held** with a reason, such as no
node with enough free GPUs, resolve that reason before you continue.

5. Select **Deploy**, review the deployment summary, and then confirm.

The appliance selects the node and brings the model online for you. For how placement and provisioning work, refer to
[Architecture](../explanation/architecture.md).

### Verify the Model Is Available

Confirm the model is serving before you route traffic to it.

1. Stay on the _Fleet_ tab and locate the model in the _Fleet models_ table.

2. Confirm that the model state reads `ready`. An amber state means the model is still provisioning or running its smoke
test, and a red state means the model failed.

3. Confirm that the model detail reads `serving Β· smoke-test passed` and that the model is marked routable.

For why a model becomes routable only after its smoke test passes, refer to
[Architecture](../explanation/architecture.md).

## Next Steps

If a request does not name a model, the appliance routes it to the default model. Refer to
[Set the Default Model](./set-the-default-model.md) to learn how to configure the default.
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
---
sidebar_label: "How-to Guides"
title: "Launchpad for AI How-to Guides"
description: "Step-by-step guides for completing specific operational tasks on a running Launchpad for AI appliance."
hide_table_of_contents: false
sidebar_position: 0
tags: ["launchpad-for-ai", "how-to"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

How-to guides get a specific job done on a running appliance. They assume you know what you want to accomplish and give
you the steps to do it without teaching background concepts.

## Contents

| **Guide** | **What you do** |
| --------------------------------------------------- | ------------------------------------------------------------------------------- |
| [Install the Appliance](./install-the-appliance.md) | Flash the installer ISO, boot the hardware, and bring up the appliance console. |
| [Add a Model](./add-a-model.md) | Deploy a new LLM to the appliance fleet and verify it is serving. |
| [Set the Default Model](./set-the-default-model.md) | Configure which model handles requests that do not name a model explicitly. |
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
---
sidebar_label: "Install the Appliance"
title: "Install the Launchpad for AI Appliance"
description:
"Step-by-step guidance for platform operators on how to install the Launchpad for AI appliance from bare hardware to a
running, reachable appliance console."
hide_table_of_contents: false
sidebar_position: 0
tags: ["launchpad-for-ai", "install", "how-to"]
keywords: ["launchpad", "ai", "install", "appliance", "hardware", "iso", "edge", "local ui"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

This guide walks you through installing the Launchpad for AI appliance on bare hardware, from flashing the installer ISO
to verifying that the appliance console is reachable. The step-by-step content for this guide is actively being drafted.
Review the working outline at
[Install Guide β€” working outline](https://docs.google.com/document/d/1ycWX6JAbWhDS-jcXyZ-Vuyc-2Dg5r9iJNzVy9Kh9qBY/edit?usp=sharing)
while the content is finalized.

## Next Steps

- **Run your first model:** Once the appliance console is reachable, follow the
[Run Your First Model and Send Your First Prompt](../tutorials/run-first-model.md) tutorial to deploy a model and send
your first inference prompt.
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
---
sidebar_label: "Set the Default Model"
title: "Set the Default Model"
description:
"Step-by-step guidance for platform operators on how to set the default model that handles unrouted requests on a
running Launchpad for AI appliance."
hide_table_of_contents: false
sidebar_position: 2
tags: ["launchpad-for-ai", "models", "how-to"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

This guide explains how to set the default model on a running Launchpad for AI appliance. If a request does not name a
model, the appliance routes it to the default model. For how the appliance routes requests, refer to
[Architecture](../explanation/architecture.md).

## Prerequisites

- A running Launchpad for AI appliance, with the admin console reachable and operator access.
- The model you want to make default already added and serving. To add and verify a model, refer to
[Add a Model](./add-a-model.md).

## Set the Default Model

Set the default model from the _Control Room_. You can only select a model that the appliance currently serves.

1. From the left main menu, select **Control Room**.

2. Open the **switch default model** drop-down menu and select the model to make default.

3. Select **Apply fix**, and then confirm.

For what happens to requests that are in progress when you change the default model, refer to
[Architecture](../explanation/architecture.md).

## Next Steps

To add another model to the appliance, refer to [Add a Model](./add-a-model.md).
95 changes: 95 additions & 0 deletions docs/docs-content/launchpad-for-ai/launchpad-for-ai.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
---
id: overview
title: What is Launchpad for AI?
description: >
Launchpad for AI is a standalone, turnkey AI appliance that lets enterprises run large language models on-premises
without cloud dependency, AI consulting, or complex infrastructure setup.
sidebar_label: Overview
sidebar_position: 1
tags:
- launchpad-for-ai
- overview
- explanation
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

Launchpad for AI is a turnkey appliance for running large language models (LLMs) on your hardware. It deploys as an
image with no Palette or PaletteAI dependency. Once a model is loaded, inference runs on the appliance, so data stays in
your environment and per-token API costs become predictable infrastructure spend.

## The Problem It Solves

Cloud-hosted AI is not the right fit for enterprises that need to keep data on-premises, whether to meet data residency,
regulatory compliance, and air-gapped network requirements, or to control latency and avoid per-token API costs.

Building an on-premises AI stack from scratch means assembling GPU compute, OS, Kubernetes, an LLM inference runtime,
authentication, Role-Based Access Control (RBAC), and observability. Launchpad for AI delivers the whole stack,
pre-integrated, as a single bootable artifact.

## Predictable AI Costs

Cloud AI services bill per token, so costs scale directly with usage and become difficult to budget at enterprise scale.
Running inference on-premises changes that billing model. Once the hardware is provisioned, inference cost is a fixed
infrastructure line item regardless of token volume, making AI spend predictable and independent of how heavily the
system is used.

## What It Includes

The appliance ships with the following pre-integrated components.

| **Layer** | **Technology** |
| --------------------- | ----------------------------------------------- |
| Operating system | [Kairos](https://kairos.io) on Ubuntu 24.04 |
| Orchestration | Kubernetes |
| LLM inference runtime | vLLM |
| Intelligent routing | Routing by task type and data sensitivity |
| Local models | GLM, DeepSeek, Kimi |
| Platform services | Authentication, RBAC, monitoring, observability |
| GPU support | NVIDIA |

The OS layer runs Kairos on Ubuntu 24.04, an immutable Linux distribution designed for appliance deployments. Its
read-only runtime prevents configuration drift and keeps the appliance in a known, reproducible state. Kubernetes
manages the lifecycle of containerized workloads on top, handling scheduling, scaling, and health recovery.

vLLM serves language models with high GPU throughput and handles concurrent requests from multiple users. The appliance
ships GLM, DeepSeek, and Kimi as local open-weight models that run entirely on-premises with no external API calls.
NVIDIA GPU support provides the parallel processing that large language models require to respond at production speed.

Intelligent routing directs each request to the most appropriate model. Requests that involve private data or require
low latency stay local. Other requests can route outbound when the network allows.

Platform services cover authentication, RBAC, monitoring, and observability. These surface health and usage data and
ensure the appliance behaves as a managed enterprise system rather than a raw inference server.

The stack is packaged as Helm charts, which bundle each component as a versioned unit. You can update individual layers
independently without replacing the entire appliance image.

## Launchpad for AI or PaletteAI

Launchpad for AI and [PaletteAI](https://docs.palette-ai.com) are related but distinct products that serve different
scales and deployment models.

| | **Launchpad for AI** | **PaletteAI** |
| --------------------- | ----------------------------------------------- | ----------------------------------------------------------------------- |
| **Form factor** | Standalone appliance (bootable ISO) | Software platform (Helm chart or All-in-One ISO on existing Kubernetes) |
| **Requires Palette?** | No | Only if Palette also manages the underlying cluster |
| **Primary users** | Platform engineering and IT teams | Platform engineering teams |
| **Scale** | Single-site, on-premises, air-gapped | Multi-cluster, multi-tenant, cloud and data center |
| **AI workload model** | Local LLM inference with optional cloud routing | Full AI factory: GPU-as-a-Service, Model-as-a-Service, AI Studio |

## What Launchpad for AI Is Not

- Not a managed cloud service. You own and operate the appliance.
- Not a Palette add-on. No Palette tenant or license required.
- Not a general-purpose Kubernetes platform. It is for LLM inference only.

## Next Steps

- **Check hardware requirements:** Review the [Hardware Requirements](/launchpad-for-ai/reference/hardware-requirements)
reference before procuring or preparing your hardware.
- **Install the appliance:** Follow the
[Install the Launchpad for AI Appliance](/launchpad-for-ai/how-to-guides/install-the-appliance) guide to go from bare
hardware to a running appliance with the UI accessible.
- **Run your first model:** Follow the [Run Your First Model](/launchpad-for-ai/tutorials/run-first-model) tutorial to
deploy a model and send your first prompt.
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
{
"label": "Reference",
"position": 40,
"link": {
"type": "doc",
"id": "launchpad-for-ai/reference/reference"
}
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
---
id: hardware-requirements
title: Launchpad for AI Hardware Requirements
description: >
Detailed hardware requirements for deploying Launchpad for AI, including per-model GPU VRAM needs and multi-node
cluster sizing.
sidebar_label: Hardware Requirements
sidebar_position: 2
tags:
- launchpad-for-ai
- reference
- requirements
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

<!-- TODO: DOC-2921 β€” populate this page with confirmed hardware requirements from engineering -->

This page is a placeholder for the full Launchpad for AI hardware requirements reference. Content will be added when
ready.
21 changes: 21 additions & 0 deletions docs/docs-content/launchpad-for-ai/reference/reference.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
---
sidebar_label: "Reference"
title: "Launchpad for AI Reference"
description: "Authoritative technical reference for requirements, supported models, and specifications."
hide_table_of_contents: false
sidebar_position: 0
tags: ["launchpad-for-ai", "reference"]
---

<PartialsComponent category="launchpad-for-ai" name="unreleased-banner" />

Reference pages give you technical information when you need to look something up. They describe what exists and how it
is configured, not how to accomplish a task.

## Contents

| **Reference** | **What it covers** |
| --------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| [System Requirements](./system-requirements.md) | Hardware, software, and network prerequisites for deploying the appliance. |
| [Hardware Requirements](./hardware-requirements.md) | Per-model GPU VRAM requirements and multi-node cluster sizing. |
| [Supported Models](./supported-models.md) | Full list of supported LLMs with parameter counts, VRAM minimums, and quantization levels. |
Loading