Capacity planning best practices
Learn how to plan and size your self-managed Retool instance for production workloads.
Planning capacity before your first production deployment prevents performance problems and unplanned scaling events. The right size depends on how you use Retool, not just the number of users.
Identify your workload type
Different Retool features put pressure on different resources. Before sizing your infrastructure, consider which of the following describes your deployment:
| Workload type | Characteristics | Services under load |
|---|---|---|
| Query-heavy | Many concurrent users running queries against databases or APIs. | api containers, platform database. |
| Workflow-heavy | Automated workflows that run frequently or process large datasets. | workflows-backend, workflows-worker, code executor. |
| Agent editing | Many builders simultaneously editing apps or working with AI agents. | Agent sandbox pool, api container. |
| Mixed | A combination of the above. | All of the above. |
Understanding your workload type helps you decide which services to prioritize when sizing.
Use an external managed database for production
Do not use a containerized PostgreSQL instance in production. Use a managed PostgreSQL service (such as Amazon RDS, Cloud SQL, or Azure Database for PostgreSQL) with automated backups and failover enabled.
The platform database stores all apps, users, resources, audit logs, and organization settings. Start with at least 60 GB of storage and monitor usage over time so you can grow the volume before it fills. Refer to the platform database migration guide for more information.
Scale the right containers
Most user-facing traffic flows through the api container. For query-heavy deployments, scale the number of api replicas to handle concurrent load.
Run exactly one replica of jobs-runner in each Retool environment. It handles database migrations and other background tasks that must run as a singleton. All other containers can be scaled horizontally.
For workflow-heavy deployments, scale workflows-backend and workflows-worker. For heavy code executor usage, scale code-executor replicas and review code executor resource limits.
Refer to the infrastructure scaling guide for more information on scaling a Retool instance.
Plan for agent sandbox capacity
Each active app-editing session requires one agent sandbox pod. To estimate sandbox resource requirements at peak usage:
- Count the maximum number of builders you expect to be active at the same time.
- Add the prewarm pool size (default: 5) for idle pool overhead.
- Multiply by per-pod resources: 1 CPU core and 2 GiB memory.
The default maximum total sandbox pods is 50. If your estimate exceeds this, increase agentSandbox.controller.scaling.maxTotalJobs in your Helm values before hitting the cap. Builders beyond the cap will not get a working sandbox session.
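The estimate above can be sketched as a short calculation. This is a minimal illustration, not a Retool tool: the function and class names are hypothetical, and the constants are the defaults quoted in this section (prewarm pool of 5, 1 CPU core and 2 GiB per pod, cap of 50).

```python
from dataclasses import dataclass

# Defaults quoted in this section; adjust them if your Helm values differ.
PREWARM_POOL_SIZE = 5        # idle pods kept warm
CPU_CORES_PER_POD = 1        # CPU cores per sandbox pod
MEMORY_GIB_PER_POD = 2       # GiB of memory per sandbox pod
DEFAULT_MAX_TOTAL_JOBS = 50  # default agentSandbox.controller.scaling.maxTotalJobs


@dataclass
class SandboxEstimate:
    """Hypothetical container for the result of the estimate."""
    pods: int
    cpu_cores: int
    memory_gib: int
    exceeds_default_cap: bool


def estimate_sandbox_capacity(peak_builders: int,
                              prewarm_pool_size: int = PREWARM_POOL_SIZE) -> SandboxEstimate:
    """Estimate peak sandbox resources from the number of concurrent builders."""
    if peak_builders < 0 or prewarm_pool_size < 0:
        raise ValueError("peak_builders and prewarm_pool_size must be non-negative")
    # One pod per active editing session, plus the idle prewarm pool.
    pods = peak_builders + prewarm_pool_size
    return SandboxEstimate(
        pods=pods,
        cpu_cores=pods * CPU_CORES_PER_POD,
        memory_gib=pods * MEMORY_GIB_PER_POD,
        exceeds_default_cap=pods > DEFAULT_MAX_TOTAL_JOBS,
    )


if __name__ == "__main__":
    estimate = estimate_sandbox_capacity(peak_builders=60)
    print(f"pods={estimate.pods} cpu={estimate.cpu_cores} memory_gib={estimate.memory_gib}")
    if estimate.exceeds_default_cap:
        print("Estimate exceeds the default cap; raise maxTotalJobs in your Helm values.")
```

For example, 60 concurrent builders plus the default prewarm pool gives 65 pods, which works out to 65 CPU cores and 130 GiB of memory and exceeds the default cap of 50. With 30 builders the estimate is 35 pods (35 cores, 70 GiB), which fits under the default.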
Know when to scale
Watch for these signals that indicate you need more capacity:
- High CPU or memory on api containers indicates concurrent query traffic is exceeding current capacity.
- Slow query response times can indicate api container saturation or database connection pool exhaustion.
- Database connection pool exhaustion means you need to add replicas or increase connection pool settings, and verify your managed database tier supports the connection count you need.
- Temporal task queues backing up indicates workflow execution lag. Scale workflows-worker replicas and verify Temporal is sized appropriately.
- Builders unable to open app editing sessions means the agent sandbox pod pool has hit its cap. Raise maxTotalJobs and verify your cluster has node capacity.
Use your monitoring stack or Retool's telemetry to track these signals before they affect users.
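One way to make these signals actionable is to keep them in a small lookup that an alert handler or runbook script can consult. The sketch below is illustrative only: the `Signal` names and the `recommend` function are hypothetical, and the action text paraphrases this page rather than any Retool API. How each signal is detected (metrics, thresholds) is left to your monitoring stack.

```python
from enum import Enum, auto
from typing import Iterable, List


class Signal(Enum):
    """Hypothetical names for the capacity signals listed in this section."""
    API_HIGH_CPU_OR_MEMORY = auto()
    SLOW_QUERY_RESPONSES = auto()
    DB_CONNECTION_POOL_EXHAUSTED = auto()
    TEMPORAL_QUEUE_BACKLOG = auto()
    SANDBOX_SESSIONS_FAILING = auto()


# Follow-up actions, paraphrased from this page.
ACTIONS = {
    Signal.API_HIGH_CPU_OR_MEMORY: [
        "Scale api replicas",
    ],
    Signal.SLOW_QUERY_RESPONSES: [
        "Check api containers for saturation",
        "Check for database connection pool exhaustion",
    ],
    Signal.DB_CONNECTION_POOL_EXHAUSTED: [
        "Add replicas or increase connection pool settings",
        "Verify the managed database tier supports the required connection count",
    ],
    Signal.TEMPORAL_QUEUE_BACKLOG: [
        "Scale workflows-worker replicas",
        "Verify Temporal is sized appropriately",
    ],
    Signal.SANDBOX_SESSIONS_FAILING: [
        "Raise maxTotalJobs",
        "Verify the cluster has node capacity",
    ],
}


def recommend(signals: Iterable[Signal]) -> List[str]:
    """Return de-duplicated follow-up actions, in the order the signals were given."""
    seen = set()
    result = []
    for signal in signals:
        for action in ACTIONS[signal]:
            if action not in seen:
                seen.add(action)
                result.append(action)
    return result


if __name__ == "__main__":
    observed = [Signal.TEMPORAL_QUEUE_BACKLOG, Signal.SANDBOX_SESSIONS_FAILING]
    for step in recommend(observed):
        print("-", step)
```

Keeping the mapping as data rather than branching logic makes it easy to review against this page when the guidance changes.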