Migrate from ECS to Kubernetes
Migrate an existing Retool deployment on Amazon ECS to a supported Kubernetes blueprints deployment.
Retool previously supported Amazon ECS as a deployment option. Retool 4.0 introduces additional requirements that Amazon ECS cannot meet, and Amazon ECS on Amazon EC2 is also not a supported path. Customers running a self-hosted deployment on ECS need to migrate to Amazon EKS.
This guide covers deploying a new blueprints instance, migrating your database, and cutting over DNS. Your source Amazon ECS deployment stays untouched until you've confirmed the new instance is working.
What changed from your ECS deployment
| Change | Details |
|---|---|
| Agent sandbox requires Kubernetes | The agent sandbox uses gVisor for process isolation and a custom seccomp profile. Amazon ECS Fargate does not support custom seccomp profiles. Amazon ECS on EC2 is also not a supported path. |
| Blob storage is now required | Object storage (S3, Azure Blob, or GCS) is a platform requirement for app storage, Git repository storage, and sandbox snapshots. It was optional in pre-4.0 deployments. |
| Helm values use rr.* namespace | Components associated with the new app builder are configured under rr.enabled: true in Helm values, not as separate top-level ECS service definitions. |
| Same-origin networking | No wildcard DNS or extra TLS certificate required. The sandbox routes through your existing Retool ingress. |
| Temporal is managed, not self-hosted | Retool-managed Temporal is the default. Self-hosted Temporal on ECS is no longer supported — you must migrate to Temporal Cloud or a managed cluster before or during this migration. |
Before you start
Use this checklist as you prepare the migration to ensure you have everything ready.
1. Collect your Amazon ECS configuration
Before deploying the target instance, collect the configuration values you'll need to carry over.
Environment variables
Your Retool environment variables are set on the Amazon ECS task definition. Retrieve them using the AWS Console or CLI.
- AWS Console
- AWS CLI
Go to Amazon ECS > Task Definitions, select your task definition, open the latest revision, and select the JSON tab. Look for the environment and secrets arrays in each container definition.
aws ecs describe-task-definition \
--task-definition <your-task-definition-name> \
--query 'taskDefinition.containerDefinitions[*].{name:name,env:environment,secrets:secrets}'
Note these values, as you'll configure them on the new deployment:
- ENCRYPTION_KEY (required): must match exactly or your database dump will be unreadable.
- LICENSE_KEY.
- BASE_DOMAIN or your Retool domain.
- SSO/OIDC/SAML settings (CLIENT_ID, CLIENT_SECRET, SSO_*, OIDC_*, SAML_*).
- Any other non-default variables (SMTP, proxy settings, custom feature flags, etc.).
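The list above can be turned into a quick pre-flight check. The sketch below assumes you have saved the variable names from the environment and secrets arrays to a file, one NAME=value pair per line. The function name and file layout are illustrative and not part of Retool or AWS tooling. BASE_DOMAIN may legitimately be absent if your domain is configured another way.

```shell
# Hypothetical helper: print which of the variables this guide calls out
# are missing from a file of NAME=value lines taken from the task definition.
missing_vars() {
  env_file="$1"
  for name in ENCRYPTION_KEY LICENSE_KEY BASE_DOMAIN; do
    # Match the variable name at the start of a line, followed by "=".
    if ! grep -q "^${name}=" "$env_file"; then
      echo "$name"
    fi
  done
}
```

Calling `missing_vars ecs-env.txt` (file name assumed) prints nothing when all three names are present.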
Scaling configuration
Note the CPU and memory allocated to each Retool container. You'll match or exceed these on the new deployment.
aws ecs describe-task-definition \
--task-definition <your-task-definition-name> \
--query 'taskDefinition.{cpu:cpu,memory:memory,containers:containerDefinitions[*].{name:name,cpu:cpu,memory:memory}}'
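Amazon ECS and Kubernetes express CPU in different units, so the numbers returned above cannot be copied across directly. The sketch below converts them, assuming the task definition reports CPU as numeric ECS units and memory in MiB. The function names are illustrative, and CPU is rounded up so the result matches or exceeds the source.

```shell
# ECS expresses CPU in units where 1024 = 1 vCPU; Kubernetes uses millicores
# where 1000m = 1 CPU. Round up so the target never gets less than the source.
ecs_cpu_to_millicores() {
  echo "$(( ($1 * 1000 + 1023) / 1024 ))m"
}

# ECS memory is in MiB, which maps directly to the Kubernetes "Mi" suffix.
ecs_memory_to_k8s() {
  echo "${1}Mi"
}

ecs_cpu_to_millicores 2048
ecs_memory_to_k8s 4096
```

Where these values go in your Helm or Terraform configuration depends on the blueprints version you deploy.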
Database connection details
Find your Amazon RDS endpoint from the task definition environment variables (POSTGRES_HOST, DATABASE_URL, or equivalent), or from the Amazon RDS console at RDS > Databases > your Retool database > Connectivity & security > Endpoint.
Note the endpoint, port, database name, and username. You'll need these to run pg_dump.
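If your task definition stores the connection as a single DATABASE_URL, the individual values can be split out with shell parameter expansion. This is a minimal sketch that assumes the URL has the form postgres://user:password@host:port/dbname. The function name is hypothetical, and the example URL in the usage note is made up.

```shell
# Hypothetical helper: split postgres://user:password@host:port/dbname
# into the pieces pg_dump needs. Assumes a port is present. Passwords that
# contain "@" or "/" and URLs with query strings are not handled.
parse_database_url() {
  url="${1#*://}"             # drop the scheme
  creds="${url%%@*}"          # user:password
  hostpart="${url#*@}"        # host:port/dbname
  hostport="${hostpart%%/*}"  # host:port
  echo "user=${creds%%:*}"
  echo "host=${hostport%%:*}"
  echo "port=${hostport#*:}"
  echo "dbname=${hostpart#*/}"
}
```

For example, `parse_database_url 'postgres://retool:secret@db.example.internal:5432/retool'` prints the user, host, port, and database name on separate lines without echoing the password.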
2. Audit active features
Your deployment may use features that require extra steps or precautions during migration. Review these before you start:
- Workflows or Agents: disable them on the source at the start of the code-freeze window to prevent scheduled jobs from running on both instances simultaneously. Refer to step 9 for when to do this.
- Retool Database: refer to section 6 after restoring your main database.
- Retool Storage: refer to section 7 after restoring your main database.
3. Deploy the target blueprints instance
Follow the Terraform blueprints guide to stand up a new Amazon EKS blueprints instance. Before applying Terraform, configure these migration-specific settings:
- Set ENCRYPTION_KEY to match your source deployment. Store it in AWS Secrets Manager and reference it via encryption_key_secret_name. If this value doesn't match, your restored database will be unreadable.
- Set the Retool version to match or exceed your source. Pin the image tag via retool_helm_extra_values.
- Set all custom environment variables via retool_helm_extra_values, except SSO/OIDC/SAML settings. Leave DISABLE_USER_PASS_LOGIN unset (or false) so you can log in as a local admin to validate the instance before cutover.
- Match your scaling configuration. Set pod CPU, memory, and replica counts to match or exceed your source task definition.
- Set domain_name to your current Retool domain. This creates a wildcard TLS certificate for it and enables a temporary blueprints.<your-domain> subdomain for pre-cutover testing.
- If using Workflows or Agents, set workflows_enabled = false to prevent double-firing during the migration window.
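The "match or exceed" rule for the Retool version can be checked before you pin the image tag. The sketch below compares two version strings with `sort -V`, which assumes GNU coreutils. The function name is illustrative, and the version numbers in the usage note are placeholders rather than real Retool releases.

```shell
# Succeeds when the target version is the same as or newer than the source.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
  source_version="$1"
  target_version="$2"
  lowest=$(printf '%s\n%s\n' "$source_version" "$target_version" | sort -V | head -n 1)
  [ "$lowest" = "$source_version" ]
}
```

For example, `version_at_least 3.114.2 4.0.0 && echo "ok to pin"` prints the message only when the second argument is not older than the first.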
During terraform apply, the run pauses waiting for an SSL certificate to validate. Add the DNS validation records to your DNS provider while it waits — this includes the ACM-provided CNAME record and a TXT record if your DNS provider requires domain ownership verification. The apply resumes automatically once the certificate validates. The full apply takes 30–45 minutes.
Once complete, retrieve the new load balancer hostname:
terraform output -json modules | jq -r '.["user-ingress"].alb_dns_name'
Add a temporary CNAME in your DNS provider:
blueprints.<your-domain> → <load-balancer-dns-name>
4. Dump the source database
You need a host with network access to your Amazon RDS instance to run pg_dump. How you get that access depends on your Amazon ECS setup.
- Amazon ECS Exec (Amazon EC2 launch type only)
- Amazon EC2 bastion host (Fargate or Amazon EC2)
If your Amazon ECS service has Amazon ECS Exec enabled, open a shell in a running Retool container and then run pg_dump to dump the database.
aws ecs execute-command \
--cluster <your-cluster> \
--task <running-task-id> \
--container api \
--command "/bin/sh" \
--interactive
time pg_dump -Fc --no-owner --no-acl \
-h <rds-endpoint> \
-U <db-user> \
-d <db-name> \
-f /tmp/retool.dump
Next, copy the dump to Amazon S3 and then download it locally:
aws s3 cp /tmp/retool.dump s3://<your-bucket>/retool.dump
aws s3 cp s3://<your-bucket>/retool.dump ./retool.dump
Launch a temporary Amazon EC2 instance in the same VPC and security group as your Retool Amazon RDS instance. A t3.micro is sufficient.
- Ensure the instance's security group has inbound access to your Amazon RDS security group on port 5432.
- Install the PostgreSQL client:
sudo apt-get install -y postgresql-client # Debian/Ubuntu
# or
sudo yum install -y postgresql # Amazon Linux
- Run pg_dump:
time pg_dump -Fc --no-owner --no-acl \
-h <rds-endpoint> \
-U <db-user> \
-d <db-name> \
-f retool.dump
- Copy the dump to Amazon S3 and download it locally:
aws s3 cp retool.dump s3://<your-bucket>/retool.dump
aws s3 cp s3://<your-bucket>/retool.dump ./retool.dump
Common values: <db-name> is typically hammerhead_production or retool; <db-user> is typically retool or retool_admin. Confirm these values using your task definition environment variables.
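Because the dump passes through Amazon S3 before it reaches your machine, it can be worth confirming that the downloaded file is byte-for-byte identical to the one pg_dump wrote. The sketch below is one way to do that, assuming `sha256sum` from GNU coreutils is available on both hosts. The function names are illustrative.

```shell
# Print only the SHA-256 digest of a file.
file_sha256() {
  sha256sum "$1" | cut -d ' ' -f 1
}

# Succeeds when the file's digest equals the expected digest.
verify_checksum() {
  [ "$(file_sha256 "$1")" = "$2" ]
}
```

Run `file_sha256 retool.dump` on the host that created the dump, then run `verify_checksum ./retool.dump <digest>` locally with the value it printed. Running `pg_restore --list ./retool.dump` is a further check that the archive is readable, since it prints the table of contents without connecting to a database.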
5. Restore to the target database
The blueprints Amazon RDS instance is in a private subnet, accessible only from within the Amazon EKS cluster. Connect via a temporary pod.
# Point kubectl at the new cluster
aws eks update-kubeconfig --name <cluster-name> --region <region>
# Get the target database connection details
terraform output -json modules | jq '{host: .["db-main"].address, db: .["db-main"].name, user: .["db-main"].username, port: .["db-main"].port}'
# Get the target database password
kubectl get secret db-credentials -n default -o jsonpath='{.data.password}' | base64 -d
# Start a temporary pod that has the PostgreSQL client tools
kubectl run tmp-psql --image=postgres:<version> -it --rm --restart=Never -- sh
In a second terminal, copy the dump into the pod:
kubectl cp retool.dump tmp-psql:/retool.dump
Inside the pod, set connection variables and restore the database.
DROP DATABASE … WITH (FORCE) requires PostgreSQL 13 or later. On PostgreSQL 12 or earlier, use SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'retool'; before dropping.
export PGHOST=<host> PGPORT=<port> PGUSER=<user> PGPASSWORD=<password>
# Drop and recreate the database
psql -d postgres -c "DROP DATABASE retool WITH (FORCE)" -c "CREATE DATABASE retool"
# Restore
time pg_restore --no-owner --no-acl -d retool /retool.dump
# From your local terminal (not inside the pod), restart all deployments
# to reconnect pods and run any pending migrations
kubectl rollout restart deployment -n default
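The DROP DATABASE step above is destructive, and psql falls back to its defaults for any connection variable that is unset. A small guard can confirm that all four variables are exported before anything runs. This is a sketch only: the function name is hypothetical, and it assumes `printenv` is available in the pod image.

```shell
# Succeeds only when every named variable is exported and non-empty,
# which is what psql and pg_restore need in order to pick the values up.
require_exported() {
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing or empty: $name" >&2
      return 1
    fi
  done
}
```

Inside the pod, prefix the drop with the guard, for example `require_exported PGHOST PGPORT PGUSER PGPASSWORD && psql -d postgres -c "DROP DATABASE retool WITH (FORCE)" -c "CREATE DATABASE retool"`.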
6. Migrate Retool Database
If your deployment does not use Retool Database, skip this section.
Retool Database is a separate PostgreSQL database from the main Retool application database. You must dump and restore it separately, then update the connection strings on the new deployment.
Dump the Retool Database from a host with network access to your Amazon RDS instance (refer to section 4 for access options):
time pg_dump -Fc --no-owner --no-acl \
-h <rds-endpoint> \
-U <db-user> \
-d <retooldb-name> \
-f retooldb.dump
Copy the dump to your local machine and then into a temporary pod in the target cluster, following the same process as section 5. Restore into a new database on the target Amazon RDS instance:
psql -d postgres -c "CREATE DATABASE retooldb"
time pg_restore --no-owner --no-acl -d retooldb /retooldb.dump
After restoring, update the Retool Database connection string in retool_helm_extra_values to point to the new database on the target Amazon RDS instance.
7. Migrate Retool Storage
If your deployment does not use Retool Storage, skip this section.
Retool Storage uses Amazon S3-compatible blob storage. Choose one of the following two options.
- Use existing bucket
- Copy to new bucket
Point the new deployment at your existing bucket. This is the preferred approach if the Amazon EKS cluster has network and IAM access to the bucket. The ECS task role that previously accessed the bucket is a different IAM principal from the EKS node or pod role. You must grant the new principal access before cutover. Options:
- Add the EKS node IAM role to the bucket policy.
- Use IRSA to attach a dedicated IAM role to the Retool pods. Refer to the S3 and IAM reference configuration in the blueprints repo.
Set the storage environment variables in retool_helm_extra_values to reference your existing bucket before cutover.
Copy blobs to a new bucket. If you need the new deployment to use a separate bucket, sync the contents before cutover and update the environment variables to point at the new bucket:
aws s3 sync s3://<source-bucket> s3://<target-bucket>
8. Validate
Once the restore is complete, log in to blueprints.<your-domain> as a local admin and verify:
- Encryption: go to Resources and open any resource. If the credentials are readable, your encryption key is correct. If fields show garbled or empty values, the ENCRYPTION_KEY on the new instance doesn't match the source.
- Functional parity: spot check critical apps and queries. Confirm users, permissions, and audit logs look correct. Workflows will be non-functional here since you disabled them.
If anything looks wrong, you can repeat the dump and restore without affecting the source deployment. Nothing done so far has modified your Amazon ECS instance.
9. Cut over DNS
This is the live migration. Time it during a low-traffic window and communicate the maintenance period to users in advance.
- Designate a code-freeze window. The final database dump is a point-in-time snapshot. Any apps, workflows, or config changes made after the final dump will not carry over.
- Begin the freeze and notify users.
- If using Workflows or Agents, disable them on the Amazon ECS source to prevent jobs from double-firing after the new instance comes up.
- Take the final database dump by repeating sections 4 and 5 (and sections 6 and 7 if applicable).
- Validate on blueprints.<your-domain> one more time.
- Cut over DNS. Update your DNS provider to point <your-domain> at the new load balancer. The blueprints Terraform creates a DNS zone with the correct records. The cleanest path is to delegate your domain to it by updating the NS record at your registrar:
# Get nameservers
terraform output -json modules | jq -r '.["user-ingress"].zone_name_servers'
- Verify the cutover. Once DNS propagates, confirm traffic is reaching the new instance:
# Verify DNS propagation
dig +short <your-domain>
Verify SSO is working end-to-end.
- Re-enable Workflows. Set workflows_enabled = true and run terraform apply. Verify scheduled jobs are running on the new instance.
- End the code freeze and notify users.
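The DNS verification step in the list above can be scripted. The sketch below treats the cutover as complete when your domain and the new load balancer hostname resolve to at least one common IPv4 address. That is an assumption about how you choose to verify, not a Retool-defined check: load balancer addresses can change over time, IPv6 is ignored, and the function name is illustrative.

```shell
# Succeeds when the domain and the load balancer hostname currently share
# at least one resolved IPv4 address. CNAME lines from dig are filtered out.
dns_points_at() {
  domain_ips=$(dig +short "$1" | grep -E '^[0-9.]+$' | sort)
  target_ips=$(dig +short "$2" | grep -E '^[0-9.]+$' | sort)
  [ -n "$domain_ips" ] || return 1
  for ip in $domain_ips; do
    if printf '%s\n' "$target_ips" | grep -qxF "$ip"; then
      return 0
    fi
  done
  return 1
}
```

Call it as `dns_points_at <your-domain> <load-balancer-dns-name>` and repeat until it succeeds. Results depend on the resolver you query, so cached records can delay a positive result.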
Post-migration
Leave the Amazon ECS deployment stopped but intact for at least a week. Don't decommission it until you're confident the new instance is stable. To roll back, point DNS back at the Amazon ECS load balancer and re-enable Workflows on the source.
Once the new deployment is confirmed healthy, decommission the Amazon ECS service, task definitions, and any associated infrastructure.