Your SLA says 99.99% uptime. Your client nods. Everyone feels good about the number. But does anyone actually know what it means?
99.99% sounds like "basically never down." In reality, it means you can be down for 52 minutes per year. That's less than one hour — total — across 365 days. One bad deployment that takes 20 minutes to roll back uses 38% of your annual downtime budget.
The Nines Table
| Uptime | Name | Downtime/year | Downtime/month | Downtime/day |
|---|---|---|---|---|
| 99% | Two nines | 3 days, 15 hours | 7 hours, 18 min | 14 min, 24s |
| 99.5% | | 1 day, 19 hours | 3 hours, 39 min | 7 min, 12s |
| 99.9% | Three nines | 8 hours, 45 min | 43 min, 49s | 1 min, 26s |
| 99.95% | | 4 hours, 22 min | 21 min, 54s | 43s |
| 99.99% | Four nines | 52 min, 33s | 4 min, 23s | 8.6s |
| 99.999% | Five nines | 5 min, 15s | 26s | 0.86s |
Use our free uptime calculator to convert any percentage to real downtime numbers.
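The conversion behind the table is plain arithmetic. Here is a minimal sketch, assuming a 365-day year and an average month of 365/12 days. The `downtimeFor` helper and its field names are hypothetical, made up for illustration. Results can differ from the table by a second or two depending on the month length you assume.

```typescript
// Hypothetical helper: convert an uptime percentage into allowed downtime.
// Assumes a 365-day year and an average month of 365/12 days.
const SECONDS_PER_DAY = 24 * 60 * 60

interface DowntimeBudget {
  perYearSeconds: number
  perMonthSeconds: number
  perDaySeconds: number
}

function downtimeFor(uptimePercent: number): DowntimeBudget {
  if (uptimePercent < 0 || uptimePercent > 100) {
    throw new RangeError('uptimePercent must be between 0 and 100')
  }
  // Fraction of time the service is allowed to be down
  const downFraction = (100 - uptimePercent) / 100
  const perDaySeconds = downFraction * SECONDS_PER_DAY
  const perYearSeconds = perDaySeconds * 365
  return {
    perYearSeconds,
    perMonthSeconds: perYearSeconds / 12,
    perDaySeconds,
  }
}

const fourNines = downtimeFor(99.99)
// fourNines.perYearSeconds / 60 is about 52.56 (minutes per year)
// fourNines.perDaySeconds is about 8.64 (seconds per day)
```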
What Each Level Requires
99% — Internal tools, staging environments
About 3.65 days of downtime per year. Achievable with a single server and basic monitoring. Most side projects and internal tools operate here.
99.9% — Most SaaS products
8 hours, 45 minutes per year. Requires health checks, automated restarts, and alerting. This is the minimum acceptable level for production APIs that customers depend on.
99.99% — Payment APIs, auth systems
52 minutes per year. Requires redundancy (multi-AZ or multi-region), automated failover, blue-green deployments, and sub-minute monitoring. A single 20-minute outage uses 38% of your annual budget.
99.999% — Infrastructure APIs (AWS, Stripe)
5 minutes per year. Requires active-active multi-region, automatic traffic rerouting, zero-downtime deployments, and a dedicated SRE team. Most companies don't need this and shouldn't promise it.
How to Calculate Your Real Uptime
Most teams calculate uptime from synthetic checks: "Our health check returned 200 for 99.9% of pings." But a health check every 60 seconds can miss a 59-second outage entirely.
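To see why, here is a small simulation of a fixed-interval check. It is a sketch under simplifying assumptions: checks fire at exact multiples of the interval starting at t=0, each check is instantaneous, and the outage covers the half-open window from `outageStart` up to (not including) `outageEnd`. The function name is hypothetical.

```typescript
// Hypothetical simulation: does a fixed-interval check observe an outage?
// Times are in seconds from the start of the monitoring window.
function checkDetectsOutage(
  intervalSeconds: number,
  outageStart: number,
  outageEnd: number,
): boolean {
  // Time of the first check that fires at or after the outage begins
  const firstCheckAtOrAfterStart =
    Math.ceil(outageStart / intervalSeconds) * intervalSeconds
  // The outage is seen only if that check fires before the service recovers
  return firstCheckAtOrAfterStart < outageEnd
}

// A 59-second outage from t=1 to t=60 sits between the checks at t=0 and t=60
const missed = checkDetectsOutage(60, 1, 60) // false

// The same 59 seconds, shifted so the outage spans the check at t=60
const caught = checkDetectsOutage(60, 30, 89) // true
```

Whether a short outage is recorded depends on when it starts relative to the check schedule, not on how much it hurt users.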
Real uptime should be calculated from actual request data:
// Request-based uptime (more accurate)
const totalRequests = 1_000_000
const failedRequests = 500 // 5xx responses
const uptime = ((totalRequests - failedRequests) / totalRequests) * 100
// 99.95% — more accurate than any synthetic check
// Synthetic-based uptime (less accurate)
const totalChecks = 43_200 // 1 check/min for 30 days
const failedChecks = 12 // 12 minutes of downtime detected
const syntheticUptime = ((totalChecks - failedChecks) / totalChecks) * 100
// 99.97%, but short outages between checks may have been missed
The Error Budget Concept
If your SLA is 99.9%, you have an error budget of 0.1% — approximately 43 minutes of downtime per month. Think of it as a budget you can "spend":
- Deployments that cause 2 minutes of downtime? That's fine — 41 minutes left.
- Database migration that takes the API down for 10 minutes? Budget it — 31 minutes left.
- Unplanned outage of 30 minutes? You've used 70% of your monthly budget in one incident.
When the error budget is almost spent, slow down deployments and focus on reliability.
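The bookkeeping above can be sketched in a few lines. This is an illustration, not a prescribed implementation: `errorBudgetStatus` and its field names are hypothetical, and the month is assumed to be 365/12 days (43,800 minutes), which gives a budget of about 43.8 minutes for a 99.9% SLA.

```typescript
// Hypothetical helper: track how much of a monthly error budget is left.
// Assumes an average month of 365/12 days, i.e. 43,800 minutes.
const MINUTES_PER_MONTH = (365 * 24 * 60) / 12

interface ErrorBudgetStatus {
  budgetMinutes: number
  spentMinutes: number
  remainingMinutes: number
  spentPercent: number
}

function errorBudgetStatus(
  slaPercent: number,
  incidentMinutes: number[],
): ErrorBudgetStatus {
  // Total downtime the SLA allows in one month
  const budgetMinutes = ((100 - slaPercent) / 100) * MINUTES_PER_MONTH
  // Sum of all downtime incurred so far this month
  const spentMinutes = incidentMinutes.reduce((sum, m) => sum + m, 0)
  return {
    budgetMinutes,
    spentMinutes,
    remainingMinutes: budgetMinutes - spentMinutes,
    spentPercent: (spentMinutes / budgetMinutes) * 100,
  }
}

// The three incidents from the list: a deploy, a migration, an unplanned outage
const status = errorBudgetStatus(99.9, [2, 10, 30])
// status.spentMinutes is 42; status.remainingMinutes is about 1.8
```

A team could gate deployments on `spentPercent`, for example pausing risky releases once it passes a threshold the team agrees on.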
How to Monitor Uptime
Two approaches, ideally used together:
External checks (uptime pings)
UptimeRobot, Better Stack, or Pingdom ping your API from outside every 30-60 seconds. This catches total outages and DNS/network issues. But it misses partial failures, endpoint-specific errors, and issues between check intervals.
Internal monitoring (request-based)
Nurbak Watch runs inside your Next.js server and calculates uptime from real request data. Every request counts — not just synthetic pings. If /api/checkout returns 500 for 5% of requests while /api/health returns 200, internal monitoring catches it. External pings don't.
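To make the per-endpoint point concrete, here is a generic sketch of request-based uptime grouped by path. It is not Nurbak Watch's implementation or API. The `RequestRecord` shape and the `uptimeByEndpoint` function are assumptions for illustration, and only 5xx responses are counted as failures.

```typescript
// Hypothetical request log entry; the field names are assumptions.
interface RequestRecord {
  path: string
  status: number
}

// Success rate per endpoint, counting 5xx responses as failures.
function uptimeByEndpoint(records: RequestRecord[]): Record<string, number> {
  const totals: Record<string, { total: number; failed: number }> = {}
  for (const { path, status } of records) {
    const entry = totals[path] || { total: 0, failed: 0 }
    entry.total += 1
    if (status >= 500) entry.failed += 1
    totals[path] = entry
  }
  const result: Record<string, number> = {}
  for (const path of Object.keys(totals)) {
    const { total, failed } = totals[path]
    result[path] = ((total - failed) / total) * 100
  }
  return result
}

// 100 checkout requests with 5 server errors, plus 100 healthy health checks
const sample: RequestRecord[] = []
for (let i = 0; i < 100; i++) {
  sample.push({ path: '/api/checkout', status: i < 5 ? 500 : 200 })
  sample.push({ path: '/api/health', status: 200 })
}
const byEndpoint = uptimeByEndpoint(sample)
// byEndpoint['/api/checkout'] is about 95, while byEndpoint['/api/health'] is 100
```

A ping against `/api/health` alone would report this window as fully up.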
// instrumentation.ts
import { initWatch } from '@nurbak/watch'
export function register() {
  initWatch({
    apiKey: process.env.NURBAK_WATCH_KEY,
  })
}
Free during beta. Request-based uptime. Alerts in under 10 seconds. Calculate your target uptime and start monitoring it.

