From 383bbd1c4ca8738a54376eb329cd1bc17695d073 Mon Sep 17 00:00:00 2001 From: Kaelyn Date: Wed, 2 Apr 2025 17:28:22 -0700 Subject: [PATCH 1/3] Completed health checks doc --- partials/_firecracker_nav.html.erb | 1 + reference/health-checks.html.markerb | 58 ++++++++++++++++++++++++++++ reference/index.html.md | 2 + 3 files changed, 61 insertions(+) create mode 100644 reference/health-checks.html.markerb diff --git a/partials/_firecracker_nav.html.erb b/partials/_firecracker_nav.html.erb index b51673a23a..2ce41e8223 100644 --- a/partials/_firecracker_nav.html.erb +++ b/partials/_firecracker_nav.html.erb @@ -187,6 +187,7 @@ { text: "Autoscaling", path: "/docs/reference/autoscaling/" }, { text: "Builders", path: "/docs/reference/builders/" }, { text: "Fly Launch", path: "/docs/reference/fly-launch/" }, + { text: "Health Checks", path: "/docs/reference/health-checks/" }, { text: "Load Balancing", path: "/docs/reference/load-balancing/" }, { text: "Machine Migration", path: "/docs/reference/machine-migration/" }, { text: "Multiple Processes in Apps", path: "/docs/app-guides/multiple-processes/" }, diff --git a/reference/health-checks.html.markerb b/reference/health-checks.html.markerb new file mode 100644 index 0000000000..564cfdfff7 --- /dev/null +++ b/reference/health-checks.html.markerb @@ -0,0 +1,58 @@ +--- +title: Health Checks +layout: docs +nav: firecracker +--- + +Health checks are a useful way to monitor the state of your app on Fly.io. When configured, they help the platform make decisions about routing traffic and managing deployments. For example, health checks can: + +- Confirm that Machines are ready before receiving traffic +- Route around unhealthy Machines to maintain availability +- Halt or roll back deployments when a new version isn't responding correctly + +Fly.io supports several types of health checks for different use cases. These are configured in your fly.toml file, in the relevant section depending on the type of check. 
+ +## Monitoring your health checks + +You can view your app's current health check status using the `fly checks list` command. The checks run periodically, but the time listed in the Last Updated column represents the last time the status of the check changed. + +
+**Important:** A failing health check can affect request routing to your Machine but it will not result in an automatic Machine restart. Health checks are primarily for monitoring your Machines. While some checks can affect routing eligibility for your apps, a failing health check won't trigger any actions against the Machines themselves, like a restart or stop. +
+ + +## Top-level checks + +Top-level health checks, defined in the `[checks]` section of your fly.toml file, are designed for monitoring the overall health of your application, especially for non-public-facing services. Unlike service-level health checks (e.g., `services.http_checks` or `services.tcp_checks`), which can be used to direct traffic away from unhealthy Machines, top-level checks do not affect request routing. This makes them suitable for internal monitoring and alerting purposes without impacting how traffic is distributed across your Machines. + +### Custom service checks + +Use custom checks when you need specific configurations or headers, such as checking authenticated or private endpoints. You can configure custom checks for both TCP and HTTP service checks. + +For more information on custom checks, check out the [checks section](/docs/reference/configuration/#the-checks-section). + +## Service-level checks + +Service-level checks let you define how the proxy determines if a Machine is ready to serve traffic. These checks run at the proxy level and prevent traffic from being sent to unhealthy or slow-starting Machines. TCP checks confirm that your app is listening on the expected port. HTTP checks go further by making a request to a specific path and expecting a 2xx HTTP response. This helps catch cases where the app has started but isn't ready to serve real traffic yet. Using both types of checks can help catch different kinds of issues: TCP for basic reachability, HTTP for readiness. + +The proxy uses service-level checks to determine whether a given Machine should be routed to. If a Machine fails its service check, it will be marked as unhealthy by the proxy, and it won't be routed to until its checks start passing. + +While service-level checks affect routing availability, a failing check won't cause the Machine to restart or stop. 
+ +Service-level checks are configured using the [`[[services.tcp_checks]]`](/docs/reference/configuration/#services-tcp_checks), [`[[services.http_checks]]`](/docs/reference/configuration/#services-http_checks) and [`[[http_service.checks]]`](/docs/reference/configuration/#http_service-checks) sections. + +## Machine checks + +Machine health checks run only during deployments. They run a custom command inside an ephemeral Machine and can be used to verify app behavior beyond port or endpoint availability. They're useful for validating app readiness in more complex scenarios, such as confirming database connectivity or verifying that a key background service is up. These checks don't impact routing, but if a Machine check fails, the deployment will be stopped. + +Machine checks are configured using the [`[[services.machine_checks]]`](/docs/reference/configuration/#services-machine_checks) and [`[[http_service.machine_checks]]`](/docs/reference/configuration/#the-http_service-machine_checks-section) sections. + +## Bluegreen checks + +Bluegreen checks are an internal check used as part of bluegreen deployments. These run automatically when using a bluegreen deployment strategy, and you may see them listed in the output of `fly checks list` as `bg_deployment`. + +You don't need to configure them yourself, and they don't impact routing after the deployment has completed. + +## Related topics + +- [App configuration (fly.toml)](/docs/reference/configuration/) diff --git a/reference/index.html.md b/reference/index.html.md index 63df861665..d261ccc0aa 100644 --- a/reference/index.html.md +++ b/reference/index.html.md @@ -28,6 +28,8 @@ Quick references for often-used resources like flyctl and `fly.toml`. Or dig a l * **[Fly Proxy autostop/autostart](/docs/reference/fly-proxy-autostop-autostart/):** Learn how Fly Proxy determines excess capacity for an app to shut down or suspend Machines when they're not needed and start them back up when there's traffic. 
+* **[Health Checks](/docs/reference/health-checks/):** Learn how Fly.io monitors your app's health through HTTP, TCP, and custom checks to verify availability and route traffic away from unhealthy Machines. + --- ## Working with Fly.io From 5a223af3fc387f2335d97e37aec58b83308753a3 Mon Sep 17 00:00:00 2001 From: Kaelyn Date: Thu, 3 Apr 2025 09:37:01 -0700 Subject: [PATCH 2/3] Change wording around for monitoring health checks section --- reference/health-checks.html.markerb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reference/health-checks.html.markerb b/reference/health-checks.html.markerb index 564cfdfff7..32f56fa9b1 100644 --- a/reference/health-checks.html.markerb +++ b/reference/health-checks.html.markerb @@ -14,7 +14,7 @@ Fly.io supports several types of health checks for different use cases. These ar ## Monitoring your health checks -You can view your app's current health check status using the `fly checks list` command. The checks run periodically, but the time listed in the Last Updated column represents the last time the status of the check changed. +You can view your app's current health check status using the `fly checks list` command. While checks run periodically, the Last Updated column shows when the check's status last changed, not when it was last executed. 
**Important:** A failing health check can affect request routing to your Machine but it will not result in an automatic Machine restart. Health checks are primarily for monitoring your Machines. While some checks can affect routing eligibility for your apps, a failing health check won't trigger any actions against the Machines themselves, like a restart or stop. From d547d13ea5306403277561bfdb93991ac9eb0483 Mon Sep 17 00:00:00 2001 From: Kaelyn Date: Thu, 3 Apr 2025 09:49:26 -0700 Subject: [PATCH 3/3] Removed redundant wording --- reference/health-checks.html.markerb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/reference/health-checks.html.markerb b/reference/health-checks.html.markerb index 32f56fa9b1..404eb38cce 100644 --- a/reference/health-checks.html.markerb +++ b/reference/health-checks.html.markerb @@ -17,7 +17,7 @@ Fly.io supports several types of health checks for different use cases. These ar You can view your app's current health check status using the `fly checks list` command. While checks run periodically, the Last Updated column shows when the check's status last changed, not when it was last executed.
-**Important:** A failing health check can affect request routing to your Machine but it will not result in an automatic Machine restart. Health checks are primarily for monitoring your Machines. While some checks can affect routing eligibility for your apps, a failing health check won't trigger any actions against the Machines themselves, like a restart or stop. +**Important:** A failing health check can prevent request routing to your Machine. However, your Machines won't automatically restart or stop when their health checks fail; you need to restart or stop them manually.