I Built a Self-Hosted Status Page for My GPS Tracking SaaS. Here's How.
I was tailing logs after a deploy when I noticed one of my background jobs had silently stopped processing. The queue was backed up, but the app looked fine from the frontend. No alerts, no errors in the dashboard. I only caught it because I happened to be watching.
That's when it hit me. I had no real visibility into whether Traxelio was actually healthy. And if I couldn't tell, my customers definitely couldn't.
When something looks off on a GPS tracking platform, users don't know where to start. Is it the device? The SIM card? The network? The platform itself? Without a way to check, they message me. I investigate. Usually the platform is fine and the issue is on their end. But neither of us knew that upfront. That back-and-forth was happening multiple times a month.
I needed a status page.
Why self-hosted
I wanted three things: low cost, full control, and independence from the main app. If the app goes down, the status page should still be up. That rules out anything running on the same infrastructure.
I went with Uptime Kuma. Open source, self-hosted, clean UI, and a Python API client (uptime-kuma-api) I could automate against.
The architecture
The key decision: the status page runs on a separate EC2 instance from the main application. If the app server crashes, the database goes down, or a deploy goes sideways, the status page stays up. It's an independent observer, not a self-reporting system.
Here's the setup:
- Dedicated EC2 instance running Uptime Kuma behind Nginx with SSL
- 19 monitors across 4 groups: Core (website, app, health check), Marketing (pricing, demo, blog, contact, enterprise, login), Pages (trackers, features, industries, locations, guides, glossary, sitemap), and DNS (3 domain resolution checks)
- Tiered check intervals: 60s for core services, 120s for marketing pages, 300s for content pages and DNS
- Daily encrypted backups to S3
- IP-restricted admin panel, public status page only
The tiered intervals were a deliberate choice. I don't need to check if my glossary page is up every 60 seconds. But I absolutely need to know if traxelio.com goes down within a minute.
The automation problem
I could have set this up manually through the Uptime Kuma UI. Click, click, click, add 19 monitors, arrange them into groups, configure the status page. Done.
But manual setup has two problems. First, it drifts. You add a new page to your app, forget to add the monitor. Second, it doesn't survive a rebuild. If I need to reprovision the status page server, I'm clicking through that UI again.
So I wrote a Python script: setup-kuma.py. It runs on every deploy and does everything:
- Fetches admin credentials from AWS Secrets Manager
- Waits for Kuma to be ready (polls the HTTP endpoint)
- Creates the admin user on first run, or logs in on subsequent runs
- Adds monitors idempotently (deduplicates by URL or hostname)
- Creates the public status page with grouped monitors
- Sets it as the default entry page
The idempotent part is key. The script can run 100 times and produce the same result. No duplicates, no errors. Adding a new monitor is one dict in a Python list, then deploy.
MONITORS = [
{
"name": "Traxelio App",
"url": "https://traxelio.com",
"type": "http",
"group": "Core",
"interval": 60,
"maxretries": 3,
},
# ... 18 more
]
That's the entire monitor definition. The script handles the rest: API calls, grouping, status page layout, deduplication.
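The idempotency check itself is small. Before adding anything, the script asks Kuma for the monitors it already has, builds a set of their URLs and hostnames, and skips any entry whose key is already in that set. A simplified excerpt of the full script at the end of this post (this runs inside its main()):

# Simplified from the full setup-kuma.py below
existing = api.get_monitors()
existing_keys = {m.get("url") or m.get("hostname") for m in existing}

for monitor in MONITORS:
    key = monitor.get("url") or monitor.get("hostname")
    if key in existing_keys:
        continue  # already monitored, nothing to do
    # otherwise create it with api.add_monitor(...) using the dict's fields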
The Cloudflare problem
Here's something I didn't expect. My first monitors all showed "down" immediately after setup. Every single one.
The issue: Cloudflare's bot detection. Uptime Kuma's default HTTP requests look like automated traffic (because they are), and Cloudflare was serving challenge pages instead of the actual site. The monitors were getting 403s.
Two fixes:
- Custom User-Agent (TraxelioUptimeKuma/1.0): identifies the traffic so I can filter it out of analytics.
- Custom header (X-Uptime-Monitor): a token that my Cloudflare rules recognize and allow through without challenge.
headers=json.dumps({
"X-Uptime-Monitor": MONITORING_TOKEN,
"User-Agent": MONITORING_USER_AGENT,
})
The Cloudflare rule checks for the X-Uptime-Monitor header with the correct token and skips the bot challenge. Simple, but it took me an hour of staring at green health checks that Kuma insisted were red before I figured it out.
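For reference, the setup script attaches those headers when it creates each HTTP monitor. This call is excerpted from the full script at the end of the post (MonitorType comes from uptime_kuma_api):

api.add_monitor(
    type=MonitorType.HTTP,
    name=monitor["name"],
    url=monitor["url"],
    interval=monitor["interval"],
    maxretries=monitor["maxretries"],
    headers=json.dumps({
        "X-Uptime-Monitor": MONITORING_TOKEN,
        "User-Agent": MONITORING_USER_AGENT,
    }),
)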
Deploying with CloudFormation
The entire server is defined in a single CloudFormation template. One aws cloudformation deploy and the instance is up, configured, and running Kuma.
The template provisions:
- A t3.micro EC2 instance with a static Elastic IP
- A security group allowing inbound on ports 22, 80, and 443
- An IAM role scoped to read one Secrets Manager secret
- An S3 bucket for encrypted daily backups
The UserData script handles the rest on first boot: installs Docker, pulls the Uptime Kuma image, sets up Nginx with a Let's Encrypt certificate, configures the daily backup cron, and runs setup-kuma.py to provision monitors and the status page.
Resources:
StatusPageInstance:
Type: AWS::EC2::Instance
Properties:
InstanceType: t3.micro
IamInstanceProfile: !Ref InstanceProfile
SecurityGroupIds:
- !Ref SecurityGroup
UserData:
Fn::Base64: !Sub |
#!/bin/bash
# Install Docker, Nginx, certbot
# Pull and run Uptime Kuma container
# Configure SSL with Let's Encrypt
# Run setup-kuma.py
# Set up daily S3 backup cron
Tearing it down is just as simple: aws cloudformation delete-stack. The whole thing can be rebuilt from scratch in under 10 minutes, and setup-kuma.py recreates every monitor and the status page automatically. Nothing is manually configured, nothing to remember.
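For reference, the two commands look roughly like this. The stack name and template filename are placeholders, and the --capabilities flag is needed because the template creates an IAM role:

# Stack name and template file are placeholders; use your own
aws cloudformation deploy \
  --stack-name status-page \
  --template-file status-page.yaml \
  --capabilities CAPABILITY_NAMED_IAM

# Tear everything down again
aws cloudformation delete-stack --stack-name status-page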
Credentials and secrets
I didn't want to hardcode the Kuma admin password in the script or in a .env file sitting on the server. The script fetches credentials from AWS Secrets Manager at runtime:
def fetch_credentials():
result = subprocess.run(
["aws", "secretsmanager", "get-secret-value",
"--secret-id", SECRET_NAME,
"--query", "SecretString",
"--output", "text",
"--region", AWS_REGION],
capture_output=True, text=True,
)
secret = json.loads(result.stdout.strip())
return secret["username"], secret["password"]
The EC2 instance has an IAM role scoped to read only that one secret. No API keys on disk, no environment variables to leak.
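In the CloudFormation template, that scoping comes down to a single policy statement on the instance role. Here's a sketch, not a copy of my template; the logical names and the secret name are illustrative:

  # Illustrative sketch: logical names and the secret name are placeholders
  StatusPageRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal: { Service: ec2.amazonaws.com }
            Action: sts:AssumeRole
      Policies:
        - PolicyName: ReadStatusPageSecret
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: secretsmanager:GetSecretValue
                Resource: !Sub "arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:your-app/status-page/admin-credentials-*"

The trailing -* matters: Secrets Manager appends a random suffix to secret ARNs, so an exact-name match would never apply.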
What I'd do differently
If I were starting over, the one thing I'd change is timing: I'd set up this monitoring much earlier. I spent months without visibility into uptime. That silent queue failure I caught by luck could have gone unnoticed for hours.
The total investment was about a day of work. A few hours for the script, a few hours for the EC2 setup, an hour debugging the Cloudflare issue. The ongoing cost is one small EC2 instance.
The result
The status page is live at status.traxelio.com. 19 monitors, 4 groups, real-time uptime data. No login required.
The "is the app down?" messages stopped. When something looks off, customers check the status page before reaching out. Green means the issue is on their end. Red means I'm already on it. Either way, nobody is waiting on me to answer a question a dashboard can handle.
A day of work, one small EC2 instance, and one Python script. That's it.
Resources
#!/usr/bin/env python3
"""
Auto-setup Uptime Kuma: create admin user, add HTTP/DNS monitors, and create
a public status page with grouped monitors.
Fetches admin credentials from AWS Secrets Manager, waits for Kuma to be ready,
creates the admin user (or logs in if already set up), adds monitors
idempotently (deduplicates by URL or hostname), and creates/updates a public
status page.
Prerequisites:
pip3 install uptime-kuma-api
aws secretsmanager create-secret \
--name "your-app/status-page/admin-credentials" \
--secret-string '{"username": "admin", "password": "STRONG_PASSWORD"}' \
--region your-region
Cloudflare setup:
HTTP monitors send an X-Uptime-Monitor header with MONITORING_TOKEN.
To bypass bot protection, add a WAF exception rule in Cloudflare:
Cloudflare dashboard > your domain > Security > WAF > Custom rules
Create a rule:
Name: Allow Uptime Kuma
If: (http.request.headers["x-uptime-monitor"] eq "your-monitoring-token")
Then: Skip all remaining custom rules, rate limiting, and managed rules
"""
import json
import subprocess
import sys
import time
import urllib.request
import urllib.error
# --- Configuration (update these for your setup) ---
SECRET_NAME = "your-app/status-page/admin-credentials"
AWS_REGION = "us-east-1"
KUMA_URL = "http://localhost:3001"
MONITORING_TOKEN = "your-monitoring-token"
MONITORING_USER_AGENT = "YourAppUptimeKuma/1.0"
STATUS_PAGE_SLUG = "your-app"
STATUS_PAGE_TITLE = "Your App Status"
STATUS_PAGE_GROUPS = ["Core", "Marketing", "Pages", "DNS"]
MONITORS = [
# --- Core (60s) ---
{
"name": "Website",
"url": "https://yourapp.com",
"type": "http",
"group": "Core",
"interval": 60,
"maxretries": 3,
},
# --- Marketing (120s) ---
{
"name": "Pricing",
"url": "https://yourapp.com/pricing",
"type": "http",
"group": "Marketing",
"interval": 120,
"maxretries": 3,
},
{
"name": "Blog",
"url": "https://yourapp.com/blog",
"type": "http",
"group": "Marketing",
"interval": 120,
"maxretries": 3,
},
# --- Pages (300s) ---
{
"name": "Docs",
"url": "https://yourapp.com/docs",
"type": "http",
"group": "Pages",
"interval": 300,
"maxretries": 3,
},
{
"name": "Sitemap",
"url": "https://yourapp.com/sitemap.xml",
"type": "http",
"group": "Pages",
"interval": 300,
"maxretries": 3,
},
# --- DNS resolution (300s) ---
{
"name": "DNS · yourapp.com",
"type": "dns",
"group": "DNS",
"hostname": "yourapp.com",
"dns_resolve_server": "1.1.1.1",
"dns_resolve_type": "A",
"port": 53,
"interval": 300,
"maxretries": 3,
},
{
"name": "DNS · app.yourapp.com",
"type": "dns",
"group": "DNS",
"hostname": "app.yourapp.com",
"dns_resolve_server": "1.1.1.1",
"dns_resolve_type": "A",
"port": 53,
"interval": 300,
"maxretries": 3,
},
]
def log(msg):
print(f"[setup-kuma] {msg}", flush=True)
def fetch_credentials():
"""Fetch admin credentials from AWS Secrets Manager via CLI."""
log(f"Fetching credentials from Secrets Manager ({SECRET_NAME})...")
result = subprocess.run(
[
"aws", "secretsmanager", "get-secret-value",
"--secret-id", SECRET_NAME,
"--query", "SecretString",
"--output", "text",
"--region", AWS_REGION,
],
capture_output=True,
text=True,
)
if result.returncode != 0:
log(f"Failed to fetch secret: {result.stderr.strip()}")
sys.exit(1)
secret = json.loads(result.stdout.strip())
return secret["username"], secret["password"]
def wait_for_kuma(max_retries=30, delay=5):
"""Poll Kuma's HTTP endpoint until it responds."""
log(f"Waiting for Uptime Kuma at {KUMA_URL}...")
for i in range(1, max_retries + 1):
try:
req = urllib.request.Request(KUMA_URL, method="GET")
with urllib.request.urlopen(req, timeout=5) as resp:
if resp.status < 500:
log(f"Kuma is ready (HTTP {resp.status}) after {i} attempt(s).")
return
except (urllib.error.URLError, OSError):
pass
log(f" Attempt {i}/{max_retries}, retrying in {delay}s...")
time.sleep(delay)
log("Kuma did not become ready in time.")
sys.exit(1)
def main():
from uptime_kuma_api import UptimeKumaApi, MonitorType
username, password = fetch_credentials()
wait_for_kuma()
api = UptimeKumaApi(KUMA_URL, timeout=30)
try:
        if api.need_setup():
            log("First-time setup: creating admin user...")
            api.setup(username, password)
            log("Admin user created. Waiting for Kuma to initialize...")
            time.sleep(5)
        else:
            log("Kuma already set up.")
        # Log in explicitly in both cases; setup() alone does not authenticate the session
        api.login(username, password)
        log("Logged in.")
# Get existing monitors and build a set of monitored URLs/hostnames
log("Fetching existing monitors...")
existing = api.get_monitors()
existing_keys = {m.get("url") or m.get("hostname") for m in existing}
log(f"Found {len(existing)} existing monitor(s).")
# Add missing monitors
added = 0
for monitor in MONITORS:
key = monitor.get("url") or monitor.get("hostname")
if key in existing_keys:
log(f" Monitor already exists: {key}")
continue
if monitor["type"] == "dns":
api.add_monitor(
type=MonitorType.DNS,
name=monitor["name"],
hostname=monitor["hostname"],
dns_resolve_server=monitor["dns_resolve_server"],
dns_resolve_type=monitor["dns_resolve_type"],
port=monitor["port"],
interval=monitor["interval"],
maxretries=monitor["maxretries"],
)
else:
api.add_monitor(
type=MonitorType.HTTP,
name=monitor["name"],
url=monitor["url"],
interval=monitor["interval"],
maxretries=monitor["maxretries"],
headers=json.dumps({
"X-Uptime-Monitor": MONITORING_TOKEN,
"User-Agent": MONITORING_USER_AGENT,
}),
)
log(f" Added monitor: {monitor['name']} ({key})")
added += 1
log(f"Done. {added} monitor(s) added, {len(MONITORS) - added} already existed.")
# --- Status page setup ---
log("Setting up public status page...")
# Build a lookup from URL/hostname to monitor ID
all_monitors = api.get_monitors()
key_to_id = {}
for m in all_monitors:
k = m.get("url") or m.get("hostname")
if k:
key_to_id[k] = m["id"]
# Build grouped monitor lists preserving STATUS_PAGE_GROUPS order
groups_map = {g: [] for g in STATUS_PAGE_GROUPS}
for monitor in MONITORS:
key = monitor.get("url") or monitor.get("hostname")
mid = key_to_id.get(key)
if mid and monitor.get("group") in groups_map:
groups_map[monitor["group"]].append({"id": mid})
public_group_list = [
{"name": group, "monitorList": groups_map[group]}
for group in STATUS_PAGE_GROUPS
if groups_map[group]
]
# Create status page if it doesn't exist, then save config
existing_pages = api.get_status_pages()
page_exists = any(p.get("slug") == STATUS_PAGE_SLUG for p in existing_pages)
if not page_exists:
log(f"Creating status page '{STATUS_PAGE_SLUG}'...")
api.add_status_page(STATUS_PAGE_SLUG, STATUS_PAGE_TITLE)
api.save_status_page(
slug=STATUS_PAGE_SLUG,
title=STATUS_PAGE_TITLE,
showPoweredBy=False,
publicGroupList=public_group_list,
)
log(f"Status page configured with {len(public_group_list)} group(s) "
f"and {sum(len(g['monitorList']) for g in public_group_list)} monitor(s).")
# Set the status page as the default entry page
api.set_settings(entryPage=f"statusPage-{STATUS_PAGE_SLUG}")
log("Entry page set to status page (visitors will be redirected from /).")
finally:
api.disconnect()
if __name__ == "__main__":
main()