Scalingo · benjaminach · May 12, 2026
diff --git a/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md b/src/_posts/platform/app/scaling/2000-01-01-choosing-container-size.md
@@ -0,0 +1,92 @@
+---
+title: Choosing a Container Size
+nav: Choosing a Container Size
+modified_at: 2026-05-18 00:00:00
+tags: app scaling containers memory metrics
+index: 1
+---
+
+Choosing the right container size is a balance between safety, performance, and
+cost. Memory is often the resource that most directly constrains this choice
+because each running container must stay below its own memory quota. A safe
+initial size gives your application enough memory headroom while you collect
+real metrics and validate the workload.
+
+For simple applications, the default `M` container size is often a reasonable
+starting point. Choose a larger size from the beginning when you expect higher
+memory needs, for example because of a memory-intensive runtime, high
+concurrency, large in-memory datasets or caches, background jobs that process
+large payloads, or production traffic that is still unknown.
+
+See the [container sizes][container-sizes] page for the available sizes, memory
+limits, and PID limits.
+
+
+## Start Safe, Then Adjust
+
+When you are unsure about the right size, start with a size that gives your
+application enough headroom. After deployment, use [metrics][metrics],
+[alerts][alerts], and realistic load testing to adjust the size.
+
+Avoid choosing a smaller size only because the application starts successfully.
+An application can boot with low memory usage and still consume much more
+memory under real traffic, scheduled jobs, large requests, or specific user
+flows.
+
+Before downsizing, validate that the application keeps enough memory headroom
+below the limit over time.
+
+
+## Validate With Metrics and Load Testing
+
+Before changing the container size, inspect the application charts in the
+[Metrics tab][metrics]. Compare memory usage with the memory quota of the
+selected container size, and also review CPU usage and application-level
+signals.
+
+Pay attention to:
+
+- CPU usage.
+- RAM and swap usage.
+- Whether memory usage returns to a stable baseline after traffic peaks.
+- Response time.
+- 5xx errors.
+- Restart events.
+
+If production metrics are not enough to validate a size, test the application
+with realistic load and non-sensitive data.
+
+{% note %}
+Before running intensive load tests against an application hosted on Scalingo,
+read our [external testing procedures][external-testing].
+{% endnote %}
+
+
+## Match Capacity to Traffic
+
+Once you have chosen the target size for each process type, use
+[Scaling Your Application][scaling] to configure the expected capacity for your
+traffic and workload.
+
+If the application is critical or you are unsure about the safest sizing
+strategy, contact Scalingo support.
+
+
+## Monitor Resource Usage
+
+Configure [alerts][alerts] for critical metrics, and keep
+[notifiers][notifiers] configured so the right people receive notifications
+before resource usage becomes critical.
+
+If the application consumes all its available memory, it can be terminated by
+the system. See the [Runtime Issues][oom-diagnosis] page for Out of Memory
+crash diagnosis and recovery guidance.
+
+
+[alerts]: {% post_url platform/app/2000-01-01-alerts %}
+[container-sizes]: {% post_url platform/internals/2000-01-01-container-sizes %}
+[external-testing]: {% post_url security/procedures/2000-01-01-external-testing %}#can-i-run-a-load-test-on-my-application-that-is-running-on-scalingo
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[notifiers]: {% post_url platform/app/2000-01-01-notifiers %}
+[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes
+[scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}
diff --git a/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md b/src/_posts/platform/app/scaling/2000-01-01-optimizing-application-architecture.md
@@ -0,0 +1,128 @@
+---
+title: Optimizing Application Architecture
+nav: Optimized Architecture
+modified_at: 2026-05-18 00:00:00
+tags: app scaling architecture containers memory metrics performance concurrency
+index: 5
+---
+
+An optimized application architecture makes efficient use of the resources
+allocated to each container while keeping the application easy to operate. On
+Scalingo, this usually means separating workloads into [process types][procfile],
+keeping web requests short, tuning concurrency carefully, and using metrics to
+decide when to optimize code, split work, or scale the application.
+
+This page focuses on how to structure the application workload. To choose a
+container size, see [Choosing a Container Size][choosing-container-size]. To
+change the number or size of running containers, see
+[Scaling Your Application][scaling].
+
+
+## Design Around Process Types
+
+Use [process types][procfile] to separate workloads that do not have the same
+operational profile. A `web` process should handle HTTP requests and return
+responses quickly. Background workers, schedulers, importers, exporters, and
+other resource-intensive jobs should run in dedicated process types.
+
+Favor focused processes that can be scaled independently, with a clear role
+and a predictable resource profile. This is usually easier to operate than a
+single large process that handles every workload.
+
+This separation has several advantages:
+
+- each workload can have its own number of containers;
+- each workload can use a container size adapted to its resource profile;
+- focused containers can start and scale faster;
+- long or heavy tasks do not block request handling;
+- worker concurrency can be tuned independently from web concurrency;
+- incidents are easier to diagnose from metrics and logs.
+
+Typical process types include:
+
+- `web` for HTTP traffic;
+- `worker` for background jobs;
+- `clock` or `scheduler` for recurring jobs;
+- dedicated workers for heavy jobs such as PDF generation, image processing,
+  imports, exports, or batch tasks.
+
+For long tasks triggered by a user request, return quickly from the web process
+and process the work asynchronously. See [Long Running Process][long-process]
+for the general pattern.
+
+
+## Tune Concurrency Carefully
+
+Concurrency lets a process handle more work in parallel, but each additional
+thread, worker, or child process usually consumes more memory and may increase
+database or external service pressure.
+
+Tune concurrency separately for each process type:
+
+- increase web concurrency only if response time and resource usage stay
+  healthy under realistic traffic;
+- reduce worker concurrency if occasional jobs create memory pressure;
+- keep enough database connections for the configured concurrency;
+- use separate process types for jobs that have different resource profiles,
+  such as CPU-heavy and memory-heavy jobs.
+
+Some runtimes expose Scalingo-specific or buildpack-provided defaults and
+environment variables. See the language pages for details:
+[Ruby][ruby], [Python][python], [PHP][php], [Java][java], [Node.js][nodejs],
+and [Go][go].
+
+
+## Handle Memory-Intensive Workloads
+
+Memory pressure is often caused by specific workloads rather than by every
+request. Check whether memory usage is consistently high, or whether it only
+rises during occasional tasks such as:
+
+- background jobs;
+- PDF generation;
+- image or video processing;
+- large imports or exports;
+- report generation;
+- scheduled batch tasks;
+- large in-memory caches or datasets.
+
+Depending on what you observe, prefer the smallest change that addresses the
+actual cause:
+
+- optimize the code path that allocates too much memory;
+- tune runtime-specific memory settings;
+- split heavy jobs into a dedicated process type;
+- reduce worker or job concurrency;
+- split large jobs into smaller chunks;
+- isolate workloads that do not have the same resource profile.
+
+If each container still needs more memory after these changes, continue with
+[Choosing a Container Size][choosing-container-size].
+
+
+## Size and Scale Your App
+
+Once your application is optimized for its workload:
+
+- [choose the right size][choosing-container-size] for each process type;
+- [scale the application][scaling] to match capacity to traffic;
+- read [Application Metrics][metrics] and configure [alerts][alerts] to monitor
+  the application after changes.
+
+If the application reaches its memory limit and crashes, see [Runtime
+Issues][oom-diagnosis] for diagnosis and recovery guidance.
+
+
+[alerts]: {% post_url platform/app/2000-01-01-alerts %}
+[choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %}
+[go]: {% post_url languages/go/2000-01-01-start %}
+[java]: {% post_url languages/java/2000-01-01-start %}
+[long-process]: {% post_url platform/app/2000-01-01-long-process %}
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[nodejs]: {% post_url languages/nodejs/2000-01-01-start %}
+[oom-diagnosis]: {% post_url platform/app/troubleshooting/2000-01-01-runtime-issues %}#out-of-memory-crashes
+[php]: {% post_url languages/php/2000-01-01-start %}
+[procfile]: {% post_url platform/app/2000-01-01-procfile %}
+[python]: {% post_url languages/python/2000-01-01-start %}
+[ruby]: {% post_url languages/ruby/2000-01-01-start %}
+[scaling]: {% post_url platform/app/scaling/2000-01-01-scaling %}
diff --git a/src/_posts/platform/app/scaling/2000-01-01-scaling.md b/src/_posts/platform/app/scaling/2000-01-01-scaling.md
@@ -1,7 +1,7 @@
 ---
 title: Scaling Your Application
 nav: Scaling
-modified_at: 2026-01-02 12:00:00
+modified_at: 2026-05-18 00:00:00
 index: 10
 ---
 
@@ -62,22 +62,40 @@ application to traffic fluctuations.
 
 Here is a quick comparison table, in the context of a Platform as a Service:
 
-|                 | Vertical Scaling                        | Horizontal Scaling                |
-| --------------- | --------------------------------------- | --------------------------------- |
-| **Approach**    | Enhancing individual instance capacity  | Adding more instances             |
-| **Cost**        | Can become expensive at higher limits   | Often more cost-efficient         |
-| **Resilience**  | Low (single point of failure)           | High (distributed resources)      |
-| **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture |
-| **When**        | Lack or overuse of CPU or RAM (swap increase or decrease) | Increase or decrease in total application traffic |
+|                 | Vertical Scaling                             | Horizontal Scaling                                |
+|-----------------|----------------------------------------------|---------------------------------------------------|
+| **Approach**    | Enhancing individual instance capacity       | Adding more instances                             |
+| **Cost**        | Can become expensive at higher limits        | Often more cost-efficient                         |
+| **Resilience**  | Low (single point of failure)                | High (distributed resources)                      |
+| **Flexibility** | Low, limited by physical/virtual constraints | High, limited by the application architecture     |
+| **When**        | Lack or overuse of resources (CPU or RAM)    | Increase or decrease in total application traffic |
 
+### Memory Usage and Scaling Decisions
 
-## Limitations
+If your application is approaching its memory limit, check its
+[memory metrics][metrics] and use the memory usage pattern to decide between
+vertical and horizontal scaling.
 
-- Vertical scaling is limited by the platform. The biggest container we can
-  currently boot is the `2XL` container, with 4GB of RAM. For a comprehensive
-  list of container sizes and corresponding specifications, please see our
+Use vertical scaling when each container needs more memory to run safely. In
+that case, choose a larger size by following
+[Choosing a Container Size][choosing-container-size]. Adding more containers
+can help when memory usage increases because traffic increases and the workload
+can be distributed across multiple containers. However, horizontal scaling will
+not fix an application in which a single container requires more memory than
+the selected container size provides.
+
+If memory pressure comes from specific jobs or high concurrency, first review
+the application structure with
+[Optimizing Application Architecture][optimizing-architecture].
+
+
+## Scaling Limits
+
+- Vertical scaling currently goes up to the `2XL` container size, with 4GB of
+  RAM. For a comprehensive list of container sizes and corresponding
+  specifications, please see our
   [dedicated documentation page]({% post_url platform/internals/2000-01-01-container-sizes %}).
-- Horizontal scaling is limited by default to a maximum of 10 containers per
+- Horizontal scaling is available by default up to 10 containers per
   [process type]({% post_url platform/app/2000-01-01-procfile %}). This limit
   can be increased via our support team.
 
@@ -233,3 +251,6 @@ To learn more about events and notifiers, please visit the page dedicated to
 
 [routing-requests]: {% post_url platform/networking/public/2000-01-01-routing %}#requests-distribution
 [Scalingo Autoscaler]: {% post_url platform/app/scaling/2000-01-01-scalingo-autoscaler %}
+[choosing-container-size]: {% post_url platform/app/scaling/2000-01-01-choosing-container-size %}
+[metrics]: {% post_url platform/app/2000-01-01-metrics %}
+[optimizing-architecture]: {% post_url platform/app/scaling/2000-01-01-optimizing-application-architecture %}