5 changes: 5 additions & 0 deletions examples/README.md
@@ -0,0 +1,5 @@
# Deployment Examples

* [Simple](simple)
* [Production](production)
* [Scalable](scalable)
3 changes: 3 additions & 0 deletions examples/production/README.md
@@ -0,0 +1,3 @@
# Production Deployment Example

TODO
115 changes: 115 additions & 0 deletions examples/scalable/README.md
@@ -0,0 +1,115 @@
# Scalable Deployment Example

Example of a scalable™️ Dependency-Track deployment.

> [!WARNING]
> **THIS IS ABSOLUTELY INSANELY OVERKILL FOR A SINGLE HOST MACHINE!**

The intention behind this example is **not** for it to be copied 1:1.
In fact, almost all aspects of it would be a complete waste of resources
when deployed on a single physical host. Benefits are only to be expected when
services are distributed across multiple physical hosts (or pods).

The idea is to demonstrate what you *can* do, and for you to pick and choose
the aspects you find useful. This is not in itself a supported setup,
and you are expected to do your own testing.

## Overview

```mermaid
graph TD
    client([fa:fa-user client]) --> traefik
    traefik -- "/" --> frontend
    traefik -- "/api" --> web
    subgraph internal network
        frontend["frontend (xN)"]
        web["web (xN)"]
        worker["worker (xN)"]
        pgbouncer
        init
        postgres[(postgres)]
        postgres-dex[(postgres-dex)]
    end
    web --> pgbouncer
    worker --> pgbouncer
    pgbouncer --> postgres
    pgbouncer --> postgres-dex
    init --> postgres
    init --> postgres-dex
```

## Usage

Deploy everything:

```shell
docker compose up -d --pull always
```

Scale `web` containers up to 3 replicas:

```shell
docker compose up -d --scale web=3
```

Scale `web` containers back down to 1 replica:

```shell
docker compose up -d --scale web=1
```

## Dedicated Init Container

Expensive initialization tasks, such as database migrations, are executed by a dedicated
`init` container. This reduces contention when multiple containers are deployed at once,
and enables other application containers to launch faster. The latter becomes relevant if
you plan on utilizing horizontal auto-scaling.
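
The ordering is expressed with Compose's long-form `depends_on` syntax. A trimmed-down
sketch of the relevant slice of this example's `compose.yaml`:

```yaml
services:
  init:
    image: docker.io/dependencytrack/hyades-apiserver:snapshot
    environment:
      INIT_TASKS_ENABLED: "true"
      INIT_AND_EXIT: "true"   # run init tasks (e.g. migrations), then exit
    restart: on-failure

  web:
    image: docker.io/dependencytrack/hyades-apiserver:snapshot
    environment:
      INIT_TASKS_ENABLED: "false"  # app containers skip init work entirely
    depends_on:
      init:
        condition: service_completed_successfully
```

With `service_completed_successfully`, Compose only starts `web` (and `worker`) containers
once the `init` container has exited with status code 0.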

## Separation of Web and Worker Containers

Application containers are split by responsibility into "web" and "worker":

"web" containers do not perform any background processing and are meant to solely
handle incoming web traffic. They consume fewer resources when idle and can be horizontally
scaled up and down rather aggressively.

"worker" containers are fully-fledged Dependency-Track instances. They handle background
processing, and even though they *could* handle web traffic, they are not exposed via reverse proxy.

The purpose of this separation is for workloads to not interfere with each other.
For example, high volumes of web traffic could monopolize resources of a container
(CPU, memory, DB connections), causing resource starvation for background processing.
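
Because the roles are separate services, each gets its own replica count and resource
budget, and can be scaled independently. The relevant slice of this example's `compose.yaml`:

```yaml
  web:
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 1g   # lighter: no background processing
  worker:
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 2g   # more headroom for background processing
```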

## Separate Postgres Instance for Durable Execution Workload

The durable execution engine (dex) has distinct workload characteristics from the main application.
Isolating it into a separate Postgres instance ensures that neither workload negatively impacts
the other.

While dex is relatively lightweight, it does require frequent vacuuming of queue tables to
remain performant. As your Dependency-Track instance grows, autovacuum can become a challenge
if the main application and dex use the same database server.

> [!NOTE]
> This is an optimization that you are *very* unlikely to need. Just know that if you run into
> a scaling ceiling with both workloads mixed, there is an escape hatch here if you need it.

## Centralized Database Connection Pooling

By default, each application container maintains a local connection pool of up to 20-30 connections.
Connections are relatively expensive, because each one requires a separate process on the Postgres server.
As you scale horizontally, the number of connections quickly exceeds what a single Postgres instance can handle efficiently.
In its default configuration, Postgres limits the total number of connections to 100.
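
A back-of-the-envelope check using this example's own numbers (3 `web` plus 3 `worker`
replicas), assuming each replica's local pool fills to 30 connections (the cap the workers
set via `DT_DATASOURCE_POOL_MAX_SIZE`; web containers default to the 20-30 range
mentioned above):

```shell
# Worst case: every application replica fills its local connection pool.
replicas=6
pool_max=30
echo "$(( replicas * pool_max )) potential connections vs. Postgres' default max_connections of 100"
```

Already at this modest scale, the worst case nearly doubles what a default-configured
Postgres instance will accept.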

[PgBouncer] multiplexes many application connections over a smaller number of actual
database connections, making it a natural fit for deployments with many web and worker
replicas. Note that [PgBouncer] is not your only option, as there are many other similar
connection poolers available.

The downside, of course, is that a central pooler adds another network hop for every database connection
an application container initiates.

As with every other aspect of this example, do not add a central pooler to your deployment
unless you have solid evidence that database connections are a problem.

[PgBouncer]: https://www.pgbouncer.org/
267 changes: 267 additions & 0 deletions examples/scalable/compose.yaml
@@ -0,0 +1,267 @@
# This file is part of Dependency-Track.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# SPDX-License-Identifier: Apache-2.0
# Copyright (c) OWASP Foundation. All Rights Reserved.
name: dependency-track-scalable

x-kafka-client-environment: &kafka-client-environment
  KAFKA_BOOTSTRAP_SERVERS: "kafka:9092"

services:
  traefik:
    image: traefik:v3
    command:
      - "--providers.docker=true"
      - "--entrypoints.web.address=:8080"
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      # For rootless Podman, set the DOCKER_SOCK environment variable:
      # export DOCKER_SOCK="$(podman system info --format='{{.Host.RemoteSocket.Path}}')"
      - "${DOCKER_SOCK:-/var/run/docker.sock}:/var/run/docker.sock:ro"
    restart: unless-stopped

  frontend:
    image: docker.io/dependencytrack/hyades-frontend:snapshot
    environment:
      API_BASE_URL: "http://localhost:8080"
    labels:
      traefik.enable: "true"
      traefik.http.routers.frontend.rule: "PathPrefix(`/`)"
      traefik.http.routers.frontend.priority: "1"
    restart: unless-stopped

  init:
    image: docker.io/dependencytrack/hyades-apiserver:snapshot
    depends_on:
      postgres:
        condition: service_healthy
        restart: false
      postgres-dex:
        condition: service_healthy
        restart: false
    deploy:
      resources:
        limits:
          memory: 256m
    environment:
      JAVA_OPTIONS: "-XX:+UseSerialGC -XX:TieredStopAtLevel=1"
      INIT_TASKS_ENABLED: "true"
      INIT_AND_EXIT: "true"
      # Bypass PgBouncer: transaction pooling is incompatible
      # with session-level advisory locks required by init tasks.
      DT_DATASOURCE_URL: "jdbc:postgresql://postgres:5432/dtrack"
      DT_DATASOURCE_USERNAME: "dtrack"
      DT_DATASOURCE_PASSWORD: "dtrack"
      DT_DATASOURCE_DEX_URL: "jdbc:postgresql://postgres-dex:5432/dtrack-dex"
      DT_DATASOURCE_DEX_USERNAME: "dtrack-dex"
      DT_DATASOURCE_DEX_PASSWORD: "dtrack-dex"
      DT_DEX_ENGINE_DATASOURCE_NAME: "dex"
    volumes:
      - "dtrack-data:/data"
    restart: on-failure

  web:
    image: docker.io/dependencytrack/hyades-apiserver:snapshot
    depends_on:
      kafka:
        condition: service_healthy
        restart: false
      pgbouncer:
        condition: service_healthy
        restart: false
      init:
        condition: service_completed_successfully
        restart: false
    deploy:
      endpoint_mode: dnsrr
      mode: replicated
      replicas: 3
      resources:
        limits:
          memory: 1g
    environment:
      INIT_TASKS_ENABLED: "false"
      SMALLRYE_CONFIG_PROFILE: "web"
      DT_DATASOURCE_URL: "jdbc:postgresql://pgbouncer:5432/dtrack-web"
      DT_DATASOURCE_USERNAME: "dtrack"
      DT_DATASOURCE_PASSWORD: "dtrack"
      DT_DATASOURCE_DEX_URL: "jdbc:postgresql://pgbouncer:5432/dtrack-dex-web"
      DT_DATASOURCE_DEX_USERNAME: "dtrack-dex"
      DT_DATASOURCE_DEX_PASSWORD: "dtrack-dex"
      DT_DEX_ENGINE_DATASOURCE_NAME: "dex"
      <<: *kafka-client-environment
    labels:
      traefik.enable: "true"
      traefik.http.routers.web.rule: "PathPrefix(`/api`)"
      traefik.http.routers.web.priority: "2"
    volumes:
      - "dtrack-data:/data"
    restart: unless-stopped

  worker:
    image: docker.io/dependencytrack/hyades-apiserver:snapshot
    depends_on:
      kafka:
        condition: service_healthy
        restart: false
      pgbouncer:
        condition: service_healthy
        restart: false
      init:
        condition: service_completed_successfully
        restart: false
    deploy:
      endpoint_mode: dnsrr
      mode: replicated
      replicas: 3
      resources:
        limits:
          memory: 2g
    environment:
      INIT_TASKS_ENABLED: "false"
      DT_DATASOURCE_URL: "jdbc:postgresql://pgbouncer:5432/dtrack-worker"
      DT_DATASOURCE_USERNAME: "dtrack"
      DT_DATASOURCE_PASSWORD: "dtrack"
      DT_DATASOURCE_POOL_MAX_SIZE: "30"
      DT_DATASOURCE_DEX_URL: "jdbc:postgresql://pgbouncer:5432/dtrack-dex-worker"
      DT_DATASOURCE_DEX_USERNAME: "dtrack-dex"
      DT_DATASOURCE_DEX_PASSWORD: "dtrack-dex"
      DT_DATASOURCE_DEX_POOL_MAX_SIZE: "30"
      DT_DEX_ENGINE_DATASOURCE_NAME: "dex"
      <<: *kafka-client-environment
    volumes:
      - "dtrack-data:/data"
    restart: unless-stopped

  pgbouncer:
    image: edoburu/pgbouncer:latest
    entrypoint: ["pgbouncer", "/etc/pgbouncer/pgbouncer.ini"]
    depends_on:
      postgres:
        condition: service_healthy
        restart: false
      postgres-dex:
        condition: service_healthy
        restart: false
    configs:
      - source: pgbouncer-ini
        target: /etc/pgbouncer/pgbouncer.ini
      - source: pgbouncer-userlist
        target: /etc/pgbouncer/userlist.txt
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -h 127.0.0.1 -p 5432 -U dtrack"]
      interval: 5s
      timeout: 3s
      retries: 3
    restart: unless-stopped

  postgres:
    image: postgres:18-alpine
    environment:
      POSTGRES_DB: "dtrack"
      POSTGRES_USER: "dtrack"
      POSTGRES_PASSWORD: "dtrack"
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}" ]
      interval: 5s
      timeout: 3s
      retries: 3
    volumes:
      - "postgres-data:/var/lib/postgresql/data"
    restart: unless-stopped

  postgres-dex:
    image: postgres:18-alpine
    environment:
      POSTGRES_DB: "dtrack-dex"
      POSTGRES_USER: "dtrack-dex"
      POSTGRES_PASSWORD: "dtrack-dex"
    healthcheck:
      test: [ "CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}" ]
      interval: 5s
      timeout: 3s
      retries: 3
    volumes:
      - "postgres-dex-data:/var/lib/postgresql/data"
    restart: unless-stopped

  # TODO: Add SeaweedFS for file storage
  # https://github.com/seaweedfs/seaweedfs

  # DEPRECATED: Kafka will be removed soon.
  kafka:
    image: apache/kafka:4.1.1
    environment:
      KAFKA_NODE_ID: "1"
      KAFKA_PROCESS_ROLES: "broker,controller"
      KAFKA_LISTENERS: "PLAINTEXT://:9092,CONTROLLER://:9093"
      KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://kafka:9092"
      KAFKA_CONTROLLER_LISTENER_NAMES: "CONTROLLER"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: "CONTROLLER:PLAINTEXT,PLAINTEXT:PLAINTEXT"
      KAFKA_CONTROLLER_QUORUM_VOTERS: "1@localhost:9093"
    healthcheck:
      test: [ "CMD-SHELL", "/opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092 > /dev/null 2>&1" ]
      interval: 10s
      timeout: 10s
      retries: 5
    volumes:
      - "kafka-data:/var/lib/kafka/data"
    restart: unless-stopped

  # DEPRECATED: Kafka will be removed soon.
  kafka-init:
    image: apache/kafka:4.1.1
    depends_on:
      kafka:
        condition: service_healthy
    entrypoint: >-
      sh -c "
      /opt/kafka/bin/kafka-topics.sh --create --if-not-exists --topic dtrack.repo-meta-analysis.component --partitions 3 --replication-factor 1 --bootstrap-server kafka:9092 &&
      /opt/kafka/bin/kafka-topics.sh --create --if-not-exists --topic dtrack.repo-meta-analysis.result --partitions 3 --replication-factor 1 --bootstrap-server kafka:9092
      "
    restart: on-failure

configs:
  pgbouncer-ini:
    content: |-
      [databases]
      dtrack-web = host=postgres port=5432 dbname=dtrack pool_size=15
      dtrack-dex-web = host=postgres-dex port=5432 dbname=dtrack-dex pool_size=10
      dtrack-worker = host=postgres port=5432 dbname=dtrack pool_size=30
      dtrack-dex-worker = host=postgres-dex port=5432 dbname=dtrack-dex pool_size=30

      [pgbouncer]
      listen_addr = 0.0.0.0
      listen_port = 5432
      auth_type = plain
      auth_file = /etc/pgbouncer/userlist.txt
      pool_mode = transaction
      max_client_conn = 500
      max_db_connections = 50
      min_pool_size = 5
      server_reset_query = DISCARD ALL
      log_connections = 0
      log_disconnections = 0
  pgbouncer-userlist:
    content: |-
      "dtrack" "dtrack"
      "dtrack-dex" "dtrack-dex"

volumes:
  dtrack-data: { }
  kafka-data: { }
  postgres-data: { }
  postgres-dex-data: { }