📝 Written ● Beginner Updated 2026-05-13

Set up uptime monitoring

Five minutes of setup, ten years of "I knew before my users did." An external service pings your site every minute from multiple regions; when the ping fails, you get a message before customers tweet about it. The cheapest reliability tool that exists.

Uptime monitoring and error tracking sound similar and aren't. Sentry watches your application code from the inside — when a request crashes, Sentry catches the exception and tells you. Uptime monitoring watches your site from the outside — when nobody can reach the server at all, Sentry can't tell you because Sentry can't reach the server either. The two are complementary; you want both. This tutorial covers the second one.

The mental model is simple: an external service makes an HTTPS request to your URL every minute (or every five, depending on plan), checks the response, and alerts you when the response stops looking right. "Right" can mean any of: HTTP 200, a specific string in the body, a response time under N milliseconds, a valid TLS certificate. The checks come from multiple geographic regions so a single-region network problem doesn't generate false alerts. The alert reaches you via the channel you picked: email, SMS, Slack, Discord, PagerDuty, push notification on a phone app.
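The "does the response look right" decision can be sketched as a small pure function. This is only an illustration of the idea, not how any particular monitoring service implements it; the `result` and `rules` shapes and the name `evaluateCheck` are assumptions made up for the example, and the TLS certificate check is left out.

```javascript
// Decide whether one probe result counts as "up".
// `result` and `rules` are hypothetical shapes for this sketch.
function evaluateCheck(result, rules) {
  const failures = [];

  if (result.status !== rules.expectedStatus) {
    failures.push(`status ${result.status}, expected ${rules.expectedStatus}`);
  }
  if (rules.keyword && !result.body.includes(rules.keyword)) {
    failures.push(`keyword ${rules.keyword} missing from body`);
  }
  if (result.elapsedMs > rules.maxElapsedMs) {
    failures.push(`took ${result.elapsedMs} ms, limit ${rules.maxElapsedMs} ms`);
  }

  return { up: failures.length === 0, failures };
}

const rules = { expectedStatus: 200, keyword: '"status":"ok"', maxElapsedMs: 2000 };

const healthy = evaluateCheck(
  { status: 200, body: '{"status":"ok","db":"ok"}', elapsedMs: 180 },
  rules
);
const degraded = evaluateCheck(
  { status: 503, body: '{"status":"degraded"}', elapsedMs: 95 },
  rules
);

console.log(healthy.up);               // true
console.log(degraded.failures.length); // 2 (wrong status, keyword missing)
```

A real monitor runs this kind of evaluation from several regions and only then decides whether to alert.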

This tutorial picks UptimeRobot as the default — its free tier covers most personal and small-business needs (50 monitors, 5-minute intervals), and the UI is genuinely simple. Better Stack, Pingdom, StatusCake, and Hyperping all do the same things; the patterns transfer. We'll set up the monitor, design a sensible /health endpoint that distinguishes "the process is alive" from "the database is reachable," wire up Slack notifications, and (optionally) put up a public status page so users can self-serve when something's wrong.

What you'll learn

Prerequisites: A live URL that responds to HTTPS (a deployed site, an API endpoint, or just a domain pointing at a host). If your URL is still localhost:3000, deploy something first — uptime monitoring requires a public endpoint to ping. A Slack or Discord workspace is helpful but not required; email alerts work out of the box.

Step 1: Design the /health endpoint

Three flavors; pick at least one

You can monitor your homepage URL directly — and for static sites, that's fine. For an app with a database and backing services, the homepage might return 200 even when the database is unreachable (cached HTML, static fallback, generic error page). A purpose-built /health endpoint distinguishes three levels:

  • Liveness — "the process is running and responding." A one-line endpoint: app.get('/health', (req, res) => res.send('ok')). Good for catching crashed processes; useless for catching dependency failures.
  • Readiness — "the process is running and can talk to its dependencies." Check that Postgres responds; check that Redis responds; check that any required external service is reachable. Return 200 only if all checks pass; 503 with details otherwise.
  • Deep health — same as readiness but with more checks (queue lengths, recent error rates, etc.). Useful for status pages and dashboards; usually too slow for per-minute uptime pings.

For uptime monitoring, readiness is the right shape:

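// Assumes an Express `app`, a Postgres client `db`, and a Redis client `redis`
// are already initialized elsewhere in the application.
app.get('/health', async (req, res) => {
  try {
    // Cheap round-trips that prove each dependency is reachable.
    await db.query('SELECT 1');
    await redis.ping();
    res.json({ status: 'ok', db: 'ok', redis: 'ok' });
  } catch (err) {
    // 503 tells the monitor the process is up but a dependency is not.
    res.status(503).json({ status: 'degraded', error: err.message });
  }
});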

Now a 200 from /health means "process up and dependencies reachable" — a much stronger signal than monitoring the homepage.

Keep /health cheap and ungated. Don't require auth on it (the monitor can't authenticate); don't make it run expensive queries; don't put rate-limiting in front of it. The endpoint runs every minute from multiple monitors — if it's slow, your bill goes up; if it's gated, the monitor reports false outages.
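One way to keep the endpoint cheap is to bound every dependency check with a timeout, so a hung database cannot make /health hang past the monitor's own timeout. The sketch below is one possible approach, not a required pattern: `withTimeout` and `runChecks` are hypothetical helper names, the timeout values are arbitrary, and the stand-in checks replace real `db` and `redis` calls so the example runs on its own. It also reports only "ok" or "fail" per dependency rather than raw error text, since the endpoint is public.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Run all checks in parallel and summarize them as an HTTP status plus a JSON body.
async function runChecks(checks, timeoutMs = 1000) {
  const names = Object.keys(checks);
  const settled = await Promise.allSettled(
    // The async wrapper turns a synchronous throw into a rejected promise.
    names.map(async (name) => withTimeout(checks[name](), timeoutMs, name))
  );

  const report = {};
  settled.forEach((outcome, i) => {
    report[names[i]] = outcome.status === 'fulfilled' ? 'ok' : 'fail';
  });

  const healthy = Object.values(report).every((value) => value === 'ok');
  return {
    httpStatus: healthy ? 200 : 503,
    body: { status: healthy ? 'ok' : 'degraded', ...report },
  };
}

// Demo with stand-in checks; in the app these would be
// () => db.query('SELECT 1') and () => redis.ping().
runChecks(
  {
    db: async () => 'row',
    redis: () => new Promise((resolve) => setTimeout(resolve, 300)), // simulates a hung dependency
  },
  100
).then((result) => console.log(result.httpStatus, result.body));
// 503 { status: 'degraded', db: 'ok', redis: 'fail' }
```

Inside an Express handler the result would be sent with `res.status(result.httpStatus).json(result.body)`.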

Step 2: Create an UptimeRobot monitor

Free tier handles most personal needs

Sign up at uptimerobot.com. The free tier includes 50 monitors at 5-minute intervals. (The paid Solo tier adds 1-minute intervals and SMS for $7/month.)

Dashboard → + New Monitor:

  • Monitor Type: HTTP(s).
  • Friendly Name: "Mybrand API" or similar — appears in alerts.
  • URL: https://mybrand.com/health.
  • Monitoring Interval: 5 minutes (free) or 1 minute (paid).
  • Monitor Timeout: 30 seconds. Slower responses count as down.
  • Advanced → Keyword Monitoring: optional but useful. Set keyword "ok" with type "Yes — alert when keyword does not exist." Now the monitor checks for the literal string in the response body, not just the HTTP status. Pages that return 200 with an error message stop fooling it.

Save. The monitor starts pinging within a minute; you'll see the response time graph populate.
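If you use keyword monitoring, it can help to pick a keyword that is hard to match by accident. The sketch below assumes the monitor does a plain substring match on the response body, which is how keyword checks commonly behave; `keywordFound` and the sample bodies are made up for illustration. A short keyword like "ok" also appears inside unrelated words, so a more specific string such as `"status":"ok"` may be a safer choice.

```javascript
// Mimics a "keyword exists" check: a plain substring match on the raw body.
const keywordFound = (body, keyword) => body.includes(keyword);

const healthyBody = JSON.stringify({ status: 'ok', db: 'ok', redis: 'ok' });
const failingBody = JSON.stringify({ status: 'degraded', error: 'write EPIPE: broken pipe' });

console.log(keywordFound(healthyBody, 'ok'));            // true
console.log(keywordFound(failingBody, 'ok'));            // true ("broken" contains "ok")
console.log(keywordFound(healthyBody, '"status":"ok"')); // true
console.log(keywordFound(failingBody, '"status":"ok"')); // false
```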

Step 3: Add alert channels

Email is the default; Slack / Discord wakes the right person

UptimeRobot dashboard → My Settings → Alert Contacts → Add Alert Contact. Each "contact" is a destination for alerts:

  • Email — set up by default to the signup email. Free.
  • Slack — pick Slack as the type, paste an incoming webhook URL. In Slack: an admin creates the webhook at <workspace>.slack.com/apps → search "Incoming Webhooks" → pick the channel → copy URL. Alerts post into the channel with the monitor name and downtime duration.
  • Discord — same pattern. Discord webhook URL from a channel's Integrations settings.
  • SMS — paid plans only. Useful for "wake me at 3 AM if production goes down."
  • Webhook — for piping into your own systems (PagerDuty, Opsgenie, custom). UptimeRobot POSTs a JSON payload when status changes.

After adding contacts, edit the monitor and tick which contacts should receive its alerts. Production monitors usually get all of email + Slack + SMS; staging monitors get only email.
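If you route alerts through the generic Webhook contact into your own code, a small relay can reformat them for chat. The sketch below shows the general shape only: the `event` object is a hypothetical structure, not UptimeRobot's actual webhook payload, and `formatAlert` and `postJson` are made-up names. The two output shapes rely on Slack incoming webhooks accepting a JSON body with a `text` field and Discord webhooks accepting one with a `content` field.

```javascript
// `event` is a hypothetical shape for this sketch, not UptimeRobot's real payload.
function formatAlert(event) {
  const icon = event.isDown ? '🔴' : '🟢';
  const state = event.isDown ? 'DOWN' : 'UP';
  const detail = event.isDown
    ? event.reason
    : `recovered after ${Math.round(event.downtimeSeconds / 60)} min`;
  const text = `${icon} ${event.monitorName} is ${state}: ${detail} (${event.url})`;

  return {
    slack: { text },            // Slack incoming webhook body
    discord: { content: text }, // Discord webhook body
  };
}

// Send one payload to a webhook URL (Node 18+ ships a global fetch).
async function postJson(webhookUrl, payload) {
  const res = await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (!res.ok) throw new Error(`webhook responded ${res.status}`);
}

const alert = formatAlert({
  monitorName: 'Mybrand API',
  url: 'https://mybrand.com/health',
  isDown: true,
  reason: 'HTTP 503',
});
console.log(alert.slack.text);
// 🔴 Mybrand API is DOWN: HTTP 503 (https://mybrand.com/health)
```

Sending would then be `await postJson(slackWebhookUrl, alert.slack)`, with the webhook URL kept in an environment variable rather than in source control.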

Step 4: Avoid alert fatigue

The settings that make alerts mean something

An uptime monitor that alerts on every flicker becomes background noise. Three settings tame it:

  • Grace period (consecutive failures): alert only after the URL has been down for N consecutive checks. The default of 1 is too aggressive; set it to 2 or 3. A single transient blip stops paging you, and a real outage still alerts within 10–15 minutes at 5-minute intervals (2–3 minutes at 1-minute intervals).
  • Multi-region confirmation: in UptimeRobot Pro and competitors, only alert if multiple monitoring regions agree the site is down. Cuts false alerts from a single-region network problem.
  • Maintenance windows: schedule planned downtime (deploy windows, maintenance) so the monitor doesn't alert during it. UptimeRobot: monitor → Maintenance Window → recurring or one-off.

The goal is "every alert is real, every real outage alerts." Drift in either direction kills the value of the monitor.
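The three settings above combine into one decision, which can be sketched as follows. This illustrates the logic only and is not UptimeRobot's implementation; `shouldAlert`, its option names, and the region labels are assumptions made up for the example.

```javascript
// Each round is one check cycle: a map of region -> true (up) / false (down).
// A round counts as "down" only if at least `minRegionsDown` regions failed.
function shouldAlert(
  rounds,
  { consecutiveFailures = 3, minRegionsDown = 2, inMaintenance = false } = {}
) {
  if (inMaintenance) return false;                        // planned downtime: stay quiet
  if (rounds.length < consecutiveFailures) return false;  // not enough history yet

  const recent = rounds.slice(-consecutiveFailures);
  return recent.every((round) => {
    const downCount = Object.values(round).filter((up) => !up).length;
    return downCount >= minRegionsDown;
  });
}

const blip = [
  { us: true, eu: true, asia: true },
  { us: false, eu: true, asia: true }, // one region flickers once
  { us: true, eu: true, asia: true },
];
const outage = [
  { us: true, eu: true, asia: true },
  { us: false, eu: false, asia: true },
  { us: false, eu: false, asia: false },
  { us: false, eu: false, asia: false },
];

console.log(shouldAlert(blip));                            // false
console.log(shouldAlert(outage));                          // true
console.log(shouldAlert(outage, { inMaintenance: true })); // false
```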

Step 5: Status pages (optional)

Public-facing visibility into uptime

A status page (status.mybrand.com) shows your users which parts of your service are up. When something breaks, users check it instead of emailing support. When things are fine, the page is a trust signal.

  • UptimeRobot's built-in public status pages — free, attached to your monitors. Dashboard → Status Pages → Create. Custom URL on an UptimeRobot subdomain; custom domain on paid plans. Three minutes to set up.
  • Better Stack / Better Uptime — competing service with richer status pages (incident history, scheduled maintenance posts, subscriber notifications). Paid only but cheap; worth it if you have customers.
  • Atlassian Statuspage — the enterprise standard. Expensive ($29+/mo) and overbuilt for small projects.
  • Self-hosted with Cachet or Upptime — open-source status pages. Free but operational overhead.

For most projects with under 10K users, the free UptimeRobot status page is enough. Upgrade only when you have customers asking for an incident history.
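The headline number on a status page is usually an uptime percentage derived from check history. As a rough sketch of that arithmetic (the function names and the boolean-array input are assumptions for illustration, and hosted status pages may calculate it differently):

```javascript
// `checks` is an array of booleans, oldest first: true = the check passed.
function uptimePercent(checks) {
  if (checks.length === 0) return null; // no data yet
  const passed = checks.filter(Boolean).length;
  return Math.round((passed / checks.length) * 10000) / 100; // two decimals
}

// Approximate downtime implied by the failed checks.
function downtimeMinutes(checks, intervalMinutes) {
  return checks.filter((ok) => !ok).length * intervalMinutes;
}

// One day at 5-minute intervals = 288 checks; three consecutive checks failed.
const day = Array.from({ length: 288 }, (_, i) => !(i >= 100 && i < 103));

console.log(uptimePercent(day));      // 98.96
console.log(downtimeMinutes(day, 5)); // 15
```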

Step 6 (bonus): Monitor scheduled jobs

Cron jobs that silently stopped running

HTTP uptime monitors only catch HTTP-facing failures. The other common silent failure: a cron job (nightly backup, weekly digest email, hourly cleanup) that's been broken for two weeks and nobody noticed.

The pattern: a service like Healthchecks.io (free for up to 20 checks) gives you a unique URL per cron job. The job pings the URL on every successful run. If pings stop arriving on the expected schedule, Healthchecks alerts you.

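# Option A: in your cron job script, as the last line.
# Make sure the script exits early on failure (for example with `set -e`),
# otherwise the ping fires even when the job broke.
# -m 10 caps each attempt at 10 seconds; -o /dev/null discards the response body.
curl -fsS -m 10 --retry 3 -o /dev/null https://hc-ping.com/<your-unique-uuid>

# Option B: in crontab, ping only if the job exits successfully:
0 3 * * * /usr/local/bin/backup.sh && curl -fsS -m 10 --retry 3 -o /dev/null https://hc-ping.com/<your-unique-uuid>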

Healthchecks knows the schedule (you configure it) and pages you when a ping is overdue. Most useful for backups (where "did the backup run" is the question that only matters the day you need to restore).
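The "ping is overdue" decision boils down to comparing the last ping time against an expected period plus some grace time. The sketch below shows that idea under stated assumptions: `heartbeatStatus`, `periodMs`, and `graceMs` are names invented for this example, and the service's own logic may differ in detail.

```javascript
// Times are epoch milliseconds; `periodMs` and `graceMs` are hypothetical config names.
function heartbeatStatus({ lastPingAt, periodMs, graceMs }, now) {
  if (lastPingAt === null) return 'new';    // never pinged yet
  const overdueBy = now - (lastPingAt + periodMs);
  if (overdueBy <= 0) return 'up';          // still inside the expected period
  if (overdueBy <= graceMs) return 'late';  // overdue, but inside the grace window
  return 'down';                            // overdue past grace: time to alert
}

const HOUR = 60 * 60 * 1000;

// A nightly backup that last pinged at 03:00 UTC on 12 May, expected every 24 hours.
const backup = {
  lastPingAt: Date.UTC(2026, 4, 12, 3, 0),
  periodMs: 24 * HOUR,
  graceMs: 1 * HOUR,
};

console.log(heartbeatStatus(backup, Date.UTC(2026, 4, 12, 15, 0))); // up
console.log(heartbeatStatus(backup, Date.UTC(2026, 4, 13, 3, 30))); // late
console.log(heartbeatStatus(backup, Date.UTC(2026, 4, 13, 5, 0)));  // down
```

The grace window matters for jobs with variable run time: a backup that sometimes takes 40 minutes should not page anyone at 03:01.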

When uptime monitoring isn't worth it

Static sites on Vercel / Netlify / Cloudflare Pages. These platforms have ~100% uptime on the CDN; an outage of theirs is broader than your site. Adding uptime monitoring catches your-config-broke-the-build outages but those usually show up in deploy logs first.

Internal-only tools, no users, no revenue impact. If the only person affected by an outage is you, and you'll notice within the hour anyway, the monitor and its config maintenance aren't paying for themselves.

You can't be alerted in the next 24 hours anyway. Uptime monitors are most valuable when paired with an on-call rotation that responds to alerts. Without that, the alert just queues up unread. Set up the monitor when you're at the stage where you'd actually act on a 3 AM page.
