On-call

Decide who gets paged for which incidents and in what order. Schedules answer “who's on call right now?”; policies answer “and if nobody acks, who's next?”

Schedules

A schedule is a named rotation with one or more layers. Each layer is an ordered list of users that cycles on a cadence (daily, weekly, custom). When two layers cover the same time (weekday primary plus weekend secondary, say), the layer in the higher position wins.

Overrides cover swaps, vacations, and incident-driven coverage: they pin a specific user on call for a window regardless of what the layer math would say. The resolver is timezone-aware, so you can ship a rotation that respects each user's home timezone if you want to.
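The resolution order described above (overrides first, then layers by position) can be sketched roughly as follows. This is an illustrative model, not the product's resolver: the class names, the fields, and the choice that a later entry in `layers` means a higher position are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Layer:
    users: list[str]      # ordered rotation
    start: datetime       # timezone-aware start of the first shift
    shift: timedelta      # cadence: how long each user stays on call

    def on_call(self, at: datetime) -> str | None:
        # A layer covers nothing before it starts or when it has no users.
        if at < self.start or not self.users:
            return None
        turns = (at - self.start) // self.shift  # completed shifts so far
        return self.users[turns % len(self.users)]


@dataclass
class Override:
    user: str
    start: datetime
    end: datetime         # exclusive


@dataclass
class Schedule:
    name: str
    layers: list[Layer]   # assumption: later entries sit in a higher position
    overrides: list[Override] = field(default_factory=list)

    def on_call(self, at: datetime) -> str | None:
        # Overrides pin a user regardless of the layer math.
        for override in self.overrides:
            if override.start <= at < override.end:
                return override.user
        # Otherwise the highest-position layer that covers `at` wins.
        for layer in reversed(self.layers):
            user = layer.on_call(at)
            if user is not None:
                return user
        return None
```

Because every comparison uses timezone-aware datetimes, a caller can pass `at` in any zone and get the same answer for the same instant.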

Escalation policies

A policy is an ordered list of steps. Each step says who to notify, after how long, and through which channels. Targets can be “the user currently on this schedule” or “this specific person”; channels are the alert channels you've set up (email, SMS, Slack, generic webhook).

Step 1 (immediately): Page primary on-call via SMS + email
Step 2 (after 5 min):  Page secondary on-call via SMS
Step 3 (after 15 min): Notify #incidents Slack channel

wait_seconds is cumulative from incident open, so step 3's wait is the total elapsed time since the incident opened, not the gap from step 2. This makes “page within 30 seconds” SLAs easy to express.
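A minimal sketch of what cumulative waits mean in practice, using the three-step example above. Only `wait_seconds` comes from this page; the `Step` class, its other fields, and the helper functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    wait_seconds: int          # cumulative: seconds since the incident opened
    target: str                # hypothetical field
    channels: tuple[str, ...]  # hypothetical field


POLICY = [
    Step(0, "primary on-call", ("sms", "email")),
    Step(300, "secondary on-call", ("sms",)),
    Step(900, "#incidents", ("slack",)),
]


def due_steps(policy: list[Step], elapsed_seconds: int) -> list[Step]:
    """Return every step whose cumulative wait has already passed."""
    return [s for s in policy if s.wait_seconds <= elapsed_seconds]


def gaps(policy: list[Step]) -> list[int]:
    """Derive the pause between consecutive steps from the cumulative waits."""
    waits = sorted(s.wait_seconds for s in policy)
    return [later - earlier for earlier, later in zip(waits, waits[1:])]
```

Here `gaps(POLICY)` is `[300, 600]`: step 3 fires 15 minutes after the incident opens, which is 10 minutes after step 2.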

Ack vs resolve

When you ack an incident (from the dashboard, an email link, or the webhook), we stop firing escalation steps but keep the incident open. Closing it requires either a successful next probe (auto-resolve) or a manual close. This guards against the most common page-fatigue mistake: someone acks the page so the noise stops, and then forgets to actually fix anything.
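The distinction can be modeled as a small state machine. This is a sketch under assumed names (`Status`, `Incident`, and their members are not taken from the product), meant only to show that acking stops escalation without closing anything.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    OPEN = "open"
    ACKED = "acked"
    RESOLVED = "resolved"


@dataclass
class Incident:
    status: Status = Status.OPEN

    def ack(self) -> None:
        # Acking only silences escalation; it never closes the incident.
        if self.status is Status.OPEN:
            self.status = Status.ACKED

    def resolve(self) -> None:
        # Called after a successful probe (auto-resolve) or a manual close.
        self.status = Status.RESOLVED

    @property
    def escalating(self) -> bool:
        # Escalation steps keep firing only while nobody has acked.
        return self.status is Status.OPEN

    @property
    def is_open(self) -> bool:
        # An acked incident still counts as open.
        return self.status is not Status.RESOLVED
```

After `ack()`, `escalating` is false while `is_open` stays true, which is the state the paragraph above describes.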

Setting it up

  1. Create a schedule, add at least one layer, and optionally an override.
  2. Create an escalation policy referencing that schedule.
  3. On any monitor, set the escalation policy. New incidents now route through the policy instead of straight to per-monitor alert channels.
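The routing change in step 3 can be pictured as follows. This is a rough sketch using plain dictionaries; the keys (`escalation_policy`, `steps`, `alert_channels`) are hypothetical and do not describe the actual API or database schema.

```python
def notification_plan(monitor: dict) -> list[dict]:
    """Pick the notification steps for a new incident on this monitor."""
    policy = monitor.get("escalation_policy")
    if policy is not None:
        # A policy is set: incidents route through its steps.
        return list(policy["steps"])
    # No policy: notify the per-monitor alert channels immediately.
    return [{"wait_seconds": 0, "channels": list(monitor.get("alert_channels", []))}]
```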

The web UI for managing schedules and policies is on the roadmap; in the meantime they can be created via the API or directly in the database for self-hosted setups.