Cabalmail

Host your own email and enhance your privacy

View the Project on GitHub cabalmail/cabal-infra

Quiesce: scale a non-prod environment to zero

The quiesce workflow scales a development or stage environment’s running compute to zero so it stops accruing hourly charges. Mail data, address data, and other state-bearing resources are left alone, so the environment can be brought back up with a single workflow run.

What gets quiesced

| Resource | Behavior when quiesced |
| --- | --- |
| ECS services for imap, smtp-in, smtp-out | desired_count = 0 |
| ECS services for the monitoring stack (Prometheus, Alertmanager, Grafana, Uptime Kuma, Healthchecks, ntfy, cloudwatch-exporter, blackbox-exporter) | desired_count = 0 |
| ECS Application Auto Scaling targets for smtp-in and smtp-out | min_capacity = 0, max_capacity = 0 |
| ECS-instance Auto Scaling Group | min_size = 0, desired_capacity = 0, max_size = 0 |
| ASG instance scale-in protection (protect_from_scale_in) | Disabled, so the running instance can actually be terminated |
| ECS capacity provider managed_termination_protection | Disabled, so the capacity provider stops fighting the ASG drain |
| NAT instances | count = 0. The Elastic IPs are kept, so SMTP allow-lists do not need to be re-issued on resume. |
| Private subnet default route | Removed. The NAT-instance NIC it pointed to is gone, and nothing runs in private subnets while quiesced. |
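One way to confirm that an environment has reached the state listed above is to inspect the ECS and Auto Scaling API responses. The following is a minimal sketch, not part of the repository: the function name quiesce_violations is hypothetical, and it assumes its inputs are dictionaries shaped like the responses of boto3's ecs describe_services and autoscaling describe_auto_scaling_groups calls. DAEMON services are skipped because they are not gated explicitly and drain on their own once the ASG is at zero.

```python
def quiesce_violations(ecs_services: dict, asg_groups: dict) -> list[str]:
    """Return reasons the given API responses do not look quiesced.

    Hypothetical helper. Inputs are assumed to follow the shape of
    boto3's describe_services and describe_auto_scaling_groups responses.
    """
    problems = []

    for svc in ecs_services.get("services", []):
        # DAEMON services (for example node-exporter) are not gated by a
        # desired count; they go to zero when no instances remain.
        if svc.get("schedulingStrategy") == "DAEMON":
            continue
        if svc.get("desiredCount", 0) != 0:
            problems.append(
                f"ECS service {svc['serviceName']}: "
                f"desiredCount={svc['desiredCount']}"
            )

    for group in asg_groups.get("AutoScalingGroups", []):
        # All three sizes are expected to be zero while quiesced.
        for key in ("MinSize", "DesiredCapacity", "MaxSize"):
            if group.get(key, 0) != 0:
                problems.append(
                    f"ASG {group['AutoScalingGroupName']}: {key}={group[key]}"
                )

    return problems
```

An empty list means every replica service and every Auto Scaling Group in the responses is at zero. In practice the two dictionaries would come from the AWS API for the environment's cluster and ASG; the resource names to query depend on the deployment and are not specified here.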

The DAEMON node-exporter ECS service is not gated explicitly. It places one task per EC2 instance in the cluster; with the ASG at zero, it has no instances to schedule on and naturally goes to zero with the rest of the compute.

What is preserved

A quiesced environment will fail TCP health checks on IMAP/SMTP and serve no monitoring UI. DNS still resolves; clients see connection timeouts rather than NXDOMAIN.
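The difference between a name that still resolves and a port that no longer answers can be checked from a client machine. The sketch below is illustrative only: probe is a hypothetical helper built on the Python standard library, and the host name and port passed to it are up to the caller, since this page does not list them.

```python
import socket


def probe(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify a TCP endpoint as dns-failure, open, timeout,
    or refused-or-unreachable. Hypothetical helper for illustration."""
    try:
        # Resolution only; no connection is attempted here.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # Covers NXDOMAIN as well as other resolver errors.
        return "dns-failure"

    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # The name resolved but nothing answered within the timeout.
        return "timeout"
    except OSError:
        # Connection refused, network unreachable, and similar errors.
        return "refused-or-unreachable"
```

Against a quiesced environment, the expected result for the mail ports is "timeout" rather than "dns-failure", matching the behavior described above. A result of "open" suggests the environment is still running or has been resumed.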

Running the workflow

The workflow is defined in .github/workflows/quiesce.yml. Its only trigger is workflow_dispatch, so it never runs on a push and must be started manually.

  1. Switch to the branch that maps to your target environment (stage for the stage environment, anything else for development). The main branch is rejected outright.
  2. From the GitHub Actions UI, run Quiesce Infrastructure with:
    • environment = development or stage
    • action = down to scale to zero, or up to restore
  3. The job validates that the branch matches the chosen environment, generates the same backend config and tfvars that infra.yml does, and runs terraform apply with quiesced = true|false written directly into the tfvars file.
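The branch check and the tfvars value described in the steps above can be sketched as follows. This is an illustration of the rules as stated on this page, not the workflow's actual implementation; the function names environment_for_branch and quiesced_tfvars_line are hypothetical.

```python
def environment_for_branch(branch: str) -> str:
    """Map a branch to its environment: stage maps to stage, any other
    branch maps to development, and main is rejected."""
    if branch == "main":
        raise ValueError("the main branch is rejected by the quiesce workflow")
    return "stage" if branch == "stage" else "development"


def quiesced_tfvars_line(branch: str, environment: str, action: str) -> str:
    """Validate the inputs and return the line written into the tfvars file."""
    expected = environment_for_branch(branch)
    if environment != expected:
        raise ValueError(
            f"branch {branch!r} maps to {expected!r}, not {environment!r}"
        )
    if action not in ("down", "up"):
        raise ValueError(f"action must be 'down' or 'up', got {action!r}")

    # down scales to zero (quiesced = true); up restores (quiesced = false).
    quiesced = action == "down"
    return f"quiesced = {'true' if quiesced else 'false'}"
```

For example, quiesced_tfvars_line("stage", "stage", "down") returns the string quiesced = true, while a mismatch such as the stage branch with environment = development raises an error before anything is applied.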

Durability across other Terraform runs

infra.yml runs on any push that touches terraform/infra/** (or terraform/dns/**) and on workflow_dispatch. It writes quiesced = $ to its tfvars, so by default it un-quiesces.

To keep an environment quiesced across other runs:

Forgetting this step is recoverable: the next infra.yml run will simply restore compute. The state-bearing resources are unaffected either way.

Caveats