Migrate from ECS to Kubernetes

Retool previously supported Amazon ECS as a deployment option. Additional requirements introduced in Retool 4.0 are not supported on Amazon ECS, and Amazon ECS on Amazon EC2 is not a supported path either. Customers running a self-hosted deployment on ECS need to migrate to Amazon EKS.

This guide covers deploying a new blueprints instance, migrating your database, and cutting over DNS. Your source Amazon ECS deployment stays untouched until you've confirmed the new instance is working.

What changed from your ECS deployment

| Change | Details |
| --- | --- |
| Agent sandbox requires Kubernetes | The agent sandbox uses gVisor for process isolation and a custom seccomp profile. Amazon ECS Fargate does not support custom seccomp profiles. Amazon ECS on EC2 is also not a supported path. |
| Blob storage is now required | Object storage (S3, Azure Blob, or GCS) is a platform requirement for app storage, Git repository storage, and sandbox snapshots. It was optional in pre-4.0 deployments. |
| Helm values use rr.* namespace | Components associated with the new app builder are configured under rr.enabled: true in Helm values, not as separate top-level ECS service definitions. |
| Same-origin networking | No wildcard DNS or extra TLS certificate required. The sandbox routes through your existing Retool ingress. |
| Temporal is managed, not self-hosted | Retool-managed Temporal is the default. Self-hosted Temporal on ECS is no longer supported; you must migrate to Temporal Cloud or a managed cluster before or during this migration. |

Before you start

Use this checklist as you prepare the migration to ensure you have everything ready.

1. Collect your Amazon ECS configuration

Before deploying the target instance, collect the configuration values you'll need to carry over.

Environment variables

Your Retool environment variables are set on the Amazon ECS task definition. Retrieve them using the AWS Console or CLI.

Go to Amazon ECS > Task Definitions, select your task definition, open the latest revision, and select the JSON tab. Look for the environment and secrets arrays in each container definition.

Note these values, as you'll configure them on the new deployment:

  • ENCRYPTION_KEY (required): must match exactly or your database dump will be unreadable.
  • LICENSE_KEY.
  • BASE_DOMAIN or your Retool domain.
  • SSO/OIDC/SAML settings (CLIENT_ID, CLIENT_SECRET, SSO_*, OIDC_*, SAML_*).
  • Any other non-default variables (SMTP, proxy settings, custom feature flags, etc.).
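If you prefer the CLI to the console, the inventory above can be sketched roughly as follows. This is a minimal sketch that assumes the AWS CLI is installed and authenticated; `list_task_env` is a hypothetical helper name, not part of any Retool tooling. It prints variable names only, so you still need to retrieve the values themselves from the task definition or from wherever the secrets are stored.

```shell
# Hypothetical helper: inventory the variables set on an ECS task definition.
# Prints names only (plus the source ARN for secrets) so that secret values
# do not end up in terminal scrollback.
list_task_env() {
  task_def=$1
  echo "# environment"
  aws ecs describe-task-definition \
    --task-definition "$task_def" \
    --query 'taskDefinition.containerDefinitions[].environment[].name' \
    --output text | tr '\t' '\n' | sort -u
  echo "# secrets (name and source ARN)"
  aws ecs describe-task-definition \
    --task-definition "$task_def" \
    --query 'taskDefinition.containerDefinitions[].secrets[].[name,valueFrom]' \
    --output text | sort -u
}

# Usage:
# list_task_env <your-task-definition-name>
```

Comparing this list against the bullets above is a quick way to spot non-default variables you might otherwise forget to carry over.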

Scaling configuration

Note the CPU and memory allocated to each Retool container. You'll match or exceed these on the new deployment.

List container scaling configuration
aws ecs describe-task-definition \
--task-definition <your-task-definition-name> \
--query 'taskDefinition.{cpu:cpu,memory:memory,containers:containerDefinitions[*].{name:name,cpu:cpu,memory:memory}}'
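When translating these numbers to pod resources, keep in mind that the two platforms use different units: ECS expresses CPU in units where 1024 equals one vCPU, while Kubernetes uses millicores where 1000m equals one CPU. Memory is reported in MiB on ECS and maps to the `Mi` suffix in Kubernetes. The conversion can be sketched as below; `ecs_to_k8s_resources` is a hypothetical helper, and the output is only a starting point for the values you set in your Helm configuration.

```shell
# Hypothetical helper: convert ECS task sizing to Kubernetes resource quantities.
# ECS: 1024 CPU units = 1 vCPU, memory in MiB.
# Kubernetes: 1000m = 1 CPU, memory with a Mi suffix.
ecs_to_k8s_resources() {
  ecs_cpu_units=$1
  ecs_memory_mib=$2
  echo "cpu: $(( ecs_cpu_units * 1000 / 1024 ))m"
  echo "memory: ${ecs_memory_mib}Mi"
}

ecs_to_k8s_resources 2048 4096
# cpu: 2000m
# memory: 4096Mi
```

Container-level `cpu` and `memory` can be unset in a task definition, in which case only the task-level values are meaningful.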

Database connection details

Find your Amazon RDS endpoint from the task definition environment variables (POSTGRES_HOST, DATABASE_URL, or equivalent), or from the Amazon RDS console at RDS > Databases > your Retool database > Connectivity & security > Endpoint.

Note the endpoint, port, database name, and username. You'll need these to run pg_dump.
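If your task definition stores the connection as a single DATABASE_URL rather than separate variables, the individual fields can be split out with plain shell parameter expansion. The sketch below assumes a URL of the form `postgres://user:password@host:port/dbname`, with any special characters in the password percent-encoded; `parse_database_url` is a hypothetical helper name.

```shell
# Hypothetical helper: split a postgres:// URL into the fields pg_dump needs.
# The password is deliberately not printed.
parse_database_url() {
  rest=${1#*://}            # drop the scheme
  creds=${rest%%@*}         # user[:password]
  hostpart=${rest#*@}       # host[:port]/dbname[?params]
  hostport=${hostpart%%/*}  # host[:port]
  dbname=${hostpart#*/}
  dbname=${dbname%%\?*}     # drop query parameters such as ?sslmode=require
  case $hostport in
    *:*) port=${hostport##*:} ;;
    *)   port=5432 ;;       # PostgreSQL default port
  esac
  printf 'host=%s\nport=%s\ndb=%s\nuser=%s\n' \
    "${hostport%%:*}" "$port" "$dbname" "${creds%%:*}"
}

# Usage:
# parse_database_url "$DATABASE_URL"
```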

2. Audit active features

Your deployment may use features that require extra steps or precautions during migration. Review these before you start:

  • Workflows or Agents: disable them on the source at the start of the code-freeze window to prevent scheduled jobs from running on both instances simultaneously. Refer to section 9 for when to do this.
  • Retool Database: refer to section 6 after restoring your main database.
  • Retool Storage: refer to section 7 after restoring your main database.

3. Deploy the target blueprints instance

Follow the Terraform blueprints guide to stand up a new Amazon EKS blueprints instance. Before applying Terraform, configure these migration-specific settings:

  1. Set ENCRYPTION_KEY to match your source deployment. Store it in AWS Secrets Manager and reference it via encryption_key_secret_name. If this value doesn't match, your restored database will be unreadable.

  2. Set the Retool version to match or exceed your source. Pin the image tag via retool_helm_extra_values.

  3. Set all custom environment variables via retool_helm_extra_values, except SSO/OIDC/SAML settings. Leave DISABLE_USER_PASS_LOGIN unset (or false) so you can log in as a local admin to validate the instance before cutover.

  4. Match your scaling configuration. Set pod CPU, memory, and replica counts to match or exceed your source task definition.

  5. Set domain_name to your current Retool domain. This creates a wildcard TLS certificate for it and enables a temporary blueprints.<your-domain> subdomain for pre-cutover testing.

  6. If using Workflows or Agents, set workflows_enabled = false to prevent double-firing during the migration window.

During terraform apply, the run pauses waiting for an SSL certificate to validate. Add the DNS validation records to your DNS provider while it waits — this includes the ACM-provided CNAME record and a TXT record if your DNS provider requires domain ownership verification. The apply resumes automatically once the certificate validates. The full apply takes 30–45 minutes.
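While the apply is paused, the records ACM is waiting for can also be read from a second terminal. The following is a minimal sketch, assuming the AWS CLI is authenticated against the same account and region; `acm_validation_records` is a hypothetical helper, and the certificate ARN is something you look up yourself (for example with `aws acm list-certificates`).

```shell
# Hypothetical helper: print the DNS validation records ACM is waiting on.
# Output columns: record name, record type, record value.
acm_validation_records() {
  aws acm describe-certificate \
    --certificate-arn "$1" \
    --query 'Certificate.DomainValidationOptions[].ResourceRecord.[Name,Type,Value]' \
    --output text
}

# Usage (the ARN is a placeholder):
# acm_validation_records arn:aws:acm:<region>:<account-id>:certificate/<certificate-id>
```

Each output row corresponds to one record to create at your DNS provider.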

Once complete, retrieve the new load balancer hostname:

Get load balancer hostname
terraform output -json modules | jq -r '.["user-ingress"].alb_dns_name'

Add a temporary CNAME in your DNS provider:

Temporary DNS record
blueprints.<your-domain>  →  <load-balancer-dns-name>
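Before testing in a browser, you may want to confirm that the temporary record resolves to the new load balancer. A rough sketch, assuming `dig` is installed locally; `cname_points_to` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the CNAME for $1 points at target $2.
cname_points_to() {
  actual=$(dig +short "$1" CNAME | head -n 1)
  # dig prints a fully qualified name with a trailing dot; strip it, then
  # compare in lowercase because DNS names are not case-sensitive.
  actual=$(printf '%s' "${actual%.}" | tr '[:upper:]' '[:lower:]')
  expected=$(printf '%s' "${2%.}" | tr '[:upper:]' '[:lower:]')
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage:
# cname_points_to "blueprints.<your-domain>" "<load-balancer-dns-name>" && echo "CNAME is live"
```

If the check fails right after you create the record, it may simply not have propagated yet.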

4. Dump the source database

You need a host with network access to your Amazon RDS instance to run pg_dump. How you get that access depends on your Amazon ECS setup.

If your Amazon ECS service has Amazon ECS Exec enabled, open a shell in a running Retool container and then run pg_dump to dump the database.

Open a shell via Amazon ECS Exec
aws ecs execute-command \
--cluster <your-cluster> \
--task <running-task-id> \
--container api \
--command "/bin/sh" \
--interactive

Dump the database
time pg_dump -Fc --no-owner --no-acl \
-h <rds-endpoint> \
-U <db-user> \
-d <db-name> \
-f /tmp/retool.dump

Next, copy the dump to Amazon S3 and then download it locally:

Copy dump to Amazon S3 and download
aws s3 cp /tmp/retool.dump s3://<your-bucket>/retool.dump
aws s3 cp s3://<your-bucket>/retool.dump ./retool.dump

Common values:

  • <db-name> is typically hammerhead_production or retool;
  • <db-user> is typically retool or retool_admin.

Confirm these values using your task definition environment variables.
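Before moving on to the restore, it can be worth checking that the downloaded file is a readable archive. The sketch below assumes the PostgreSQL client tools are installed locally; `verify_dump` is a hypothetical helper. It reads only the archive's table of contents, so it catches an empty or unreadable file but does not validate every data block.

```shell
# Hypothetical helper: confirm a custom-format dump exists and is readable.
verify_dump() {
  dump=$1
  [ -s "$dump" ] || { echo "missing or empty: $dump" >&2; return 1; }
  # pg_restore --list prints the archive's table of contents without
  # connecting to any database.
  toc=$(pg_restore --list "$dump") || { echo "unreadable archive: $dump" >&2; return 1; }
  # Count the entries that carry table rows.
  tables=$(printf '%s\n' "$toc" | grep -c ' TABLE DATA ' || true)
  echo "ok: $tables tables with data in $dump"
}

# Usage:
# verify_dump ./retool.dump
```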

5. Restore to the target database

The blueprints Amazon RDS instance is in a private subnet, accessible only from within the Amazon EKS cluster. Connect via a temporary pod.

1. Update kubeconfig
aws eks update-kubeconfig --name <cluster-name> --region <region>

2. Get database connection details
terraform output -json modules | jq '{host: .["db-main"].address, db: .["db-main"].name, user: .["db-main"].username, port: .["db-main"].port}'
kubectl get secret db-credentials -n default -o jsonpath='{.data.password}' | base64 -d

3. Start a temporary pod with a matching PostgreSQL version
kubectl run tmp-psql --image=postgres:<version> -it --rm --restart=Never -- sh

In a second terminal, copy the dump into the pod:

kubectl cp retool.dump tmp-psql:/retool.dump

Inside the pod, set connection variables and restore the database.

DROP DATABASE … WITH (FORCE) requires PostgreSQL 13 or later. On PostgreSQL 12 or earlier, use SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE datname = 'retool'; before dropping.
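If you are unsure which PostgreSQL version the target runs, the server can report it. This is a minimal sketch, assuming the PGHOST, PGPORT, PGUSER, and PGPASSWORD variables are already exported inside the pod; `supports_force_drop` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the server supports DROP DATABASE ... WITH (FORCE).
# Relies on PGHOST, PGPORT, PGUSER, and PGPASSWORD being exported.
supports_force_drop() {
  # server_version_num is an integer such as 130004 (13.4) or 120017 (12.17).
  version_num=$(psql -d postgres -tA -c 'SHOW server_version_num') || return 2
  [ "$version_num" -ge 130000 ]
}

# Usage:
# if supports_force_drop; then
#   echo "PostgreSQL 13+: DROP DATABASE ... WITH (FORCE) is available"
# else
#   echo "Older server: terminate connections with pg_terminate_backend first"
# fi
```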

Restore the database
export PGHOST=<host> PGPORT=<port> PGUSER=<user> PGPASSWORD=<password>

# Drop and recreate the database
psql -d postgres -c "DROP DATABASE retool WITH (FORCE)" -c "CREATE DATABASE retool"

# Restore
time pg_restore --no-owner --no-acl -d retool /retool.dump

# From your local terminal (kubectl is not available inside the pod), restart
# all deployments to reconnect pods and run any pending migrations
kubectl rollout restart deployment -n default
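The restart returns immediately, so you may want to wait until every deployment has finished rolling out before validating. A rough sketch, run from a terminal where kubectl is configured for the target cluster; `wait_for_rollouts` is a hypothetical helper, and the 10-minute timeout is an arbitrary choice.

```shell
# Hypothetical helper: wait for every deployment in a namespace to finish rolling out.
# Returns non-zero if any deployment fails to become ready within the timeout.
wait_for_rollouts() {
  namespace=${1:-default}
  failed=0
  for deploy in $(kubectl get deployments -n "$namespace" -o name); do
    kubectl rollout status "$deploy" -n "$namespace" --timeout=10m || failed=1
  done
  return "$failed"
}

# Usage:
# wait_for_rollouts default && echo "all deployments are ready"
```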

6. Migrate Retool Database

If your deployment does not use Retool Database, skip this section.

Retool Database is a separate PostgreSQL database from the main Retool application database. You must dump and restore it separately, then update the connection strings on the new deployment.

Dump the Retool Database from a host with network access to your Amazon RDS instance (refer to section 4 for access options):

Dump Retool Database
time pg_dump -Fc --no-owner --no-acl \
-h <rds-endpoint> \
-U <db-user> \
-d <retooldb-name> \
-f retooldb.dump

Copy the dump to your local machine and then into a temporary pod in the target cluster, following the same process as section 5. Restore into a new database on the target Amazon RDS instance:

Restore Retool Database
psql -d postgres -c "CREATE DATABASE retooldb"
time pg_restore --no-owner --no-acl -d retooldb /retooldb.dump

After restoring, update the Retool Database connection string in retool_helm_extra_values to point to the new database on the target Amazon RDS instance.

7. Migrate Retool Storage

If your deployment does not use Retool Storage, skip this section.

Retool Storage uses Amazon S3-compatible blob storage. There are two ways to proceed.

Point the new deployment at your existing bucket. This is the preferred approach if the Amazon EKS cluster has network and IAM access to the bucket. The ECS task role that previously accessed the bucket is a different IAM principal from the EKS node or pod role. You must grant the new principal access before cutover. Options:

Set the storage environment variables in retool_helm_extra_values to reference your existing bucket before cutover.

8. Validate

Once the restore is complete, log in to blueprints.<your-domain> as a local admin and verify:

  1. Encryption: go to Resources and open any resource. If the credentials are readable, your encryption key is correct. If fields show garbled or empty values, the ENCRYPTION_KEY on the new instance doesn't match the source.
  2. Functional parity: spot check critical apps and queries. Confirm users, permissions, and audit logs look correct. Workflows will be non-functional here since you disabled them.

If anything looks wrong, you can repeat the dump and restore without affecting the source deployment. Nothing done so far has modified your Amazon ECS instance.

9. Cut over DNS

This is the live migration. Time it during a low-traffic window and communicate the maintenance period to users in advance.

  1. Designate a code-freeze window. The final database dump is a point-in-time snapshot. Any apps, workflows, or config changes made after the final dump will not carry over.

  2. Begin the freeze and notify users.

  3. If using Workflows or Agents, disable them on the Amazon ECS source to prevent jobs from double-firing after the new instance comes up.

  4. Take the final database dump by repeating sections 4 and 5 (and sections 6 and 7 if applicable).

  5. Validate on blueprints.<your-domain> one more time.

  6. Cut over DNS. Update your DNS provider to point <your-domain> at the new load balancer. The blueprints Terraform creates a DNS zone with the correct records. The cleanest path is to delegate your domain to it by updating the NS record at your registrar:

    Get nameservers
    terraform output -json modules | jq -r '.["user-ingress"].zone_name_servers'

  7. Verify the cutover. Once DNS propagates, confirm traffic is reaching the new instance:

    Verify DNS propagation
    dig +short <your-domain>

    Verify SSO is working end-to-end.

  8. Re-enable Workflows. Set workflows_enabled = true and run terraform apply. Verify scheduled jobs are running on the new instance.

  9. End the code freeze and notify users.
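If you delegate the domain by updating NS records (step 6), the delegation itself can be checked alongside the `dig` lookup in step 7. The sketch below assumes `dig` is installed; `ns_delegation_matches` and `normalize_names` are hypothetical helpers, and the expected nameservers are passed in by hand from the Terraform output.

```shell
# Hypothetical helper: lowercase, strip trailing dots, and sort a list of hostnames.
normalize_names() {
  tr '[:upper:]' '[:lower:]' | sed 's/\.$//' | sort
}

# Hypothetical helper: succeed if the NS records for domain $1 match the
# expected nameservers given as the remaining arguments.
ns_delegation_matches() {
  domain=$1
  shift
  actual=$(dig +short "$domain" NS | normalize_names)
  expected=$(printf '%s\n' "$@" | normalize_names)
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage (nameserver values are placeholders taken from the Terraform output):
# ns_delegation_matches "<your-domain>" <nameserver-1> <nameserver-2> <nameserver-3> <nameserver-4>
```

Resolvers can cache the old NS records until their TTL expires, so a mismatch shortly after the change is not necessarily an error.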

Post-migration

Leave the Amazon ECS deployment stopped but intact for at least a week. Don't decommission it until you're confident the new instance is stable. To roll back, start the Amazon ECS service again, point DNS back at the Amazon ECS load balancer, and re-enable Workflows on the source.

Once the new deployment is confirmed healthy, decommission the Amazon ECS service, task definitions, and any associated infrastructure.