Host your own email and enhance your privacy
Fired by:
/list), ntfy server health.BlackboxProbeFailure (any blackbox target).A synthetic probe to a user-facing endpoint is failing. The probe is one of:
https://admin.<control-domain>/) — CloudFront → S3 returned non-2xx, or DNS broke./list) — API Gateway → Lambda failed, or the seeded Cognito JWT in the Kuma monitor expired./v1/health) — the monitoring ALB or the ntfy ECS task is down. If this is the only thing failing, the alert that delivered it came in via Pushover; ntfy push will be missing.| Probe | User impact |
|---|---|
| 993 (IMAP) | Mail clients can’t read mail. |
| 25 (SMTP relay) | Inbound mail bounces or queues remotely. |
| 587 / 465 (Submission) | Mail clients can’t send. |
| Admin app | The browser admin client is unreachable. Mail itself unaffected. |
/list |
Address management is broken. Mail itself unaffected. |
| ntfy | The push channel for warnings is broken. Pushover still works for critical. |
The label instance (Prometheus) or the monitor name (Kuma) tells you which probe. Then:
nc -zv imap.<control-domain> 993 # mail ports
curl -I https://admin.<control-domain>/ # admin app
curl -I https://ntfy.<control-domain>/v1/health
Compare with aws ecs execute-command into a task in the same VPC. If the in-VPC test passes and the public test fails, the issue is at the load balancer or DNS layer. If both fail, the issue is in the service itself.
aws ecs describe-services --cluster <cluster> --services cabal-imap cabal-smtp-in cabal-smtp-out cabal-uptime-kuma cabal-ntfy --query 'services[].{name:serviceName,running:runningCount,desired:desiredCount,events:events[0:3]}'
runningCount < desiredCount for the relevant service is the smoking gun. The recent events[] usually identifies the cause (image pull failure, EFS access point error, capacity).
cabal-uptime-kuma, cabal-ntfy, etc.). Healthy targets + a failing public probe means a security-group rule, listener rule, or DNS issue, not the service.If all three checks pass but the probe stays red:
/list specifically, the most common cause is the seeded JWT having expired — re-seed it (see docs/monitoring.md §9) before assuming the API is broken.aws logs tail /ecs/cabal-imap --filter-pattern fail2ban | head -100. A blanket ban can drop the probe source if Kuma’s outbound IP shifted.aws ecs update-service --cluster <cluster> --service <name> --desired-count 0, wait, then back to 1).aws ecs list-container-instances) — likely an EC2 host is wedged. NAT instance failures also break outbound for any service that calls AWS APIs at start.