Per Outage Duration
What does a 10-minute outage actually cost?
Ten minutes is the modal outage duration in real operations. It rarely triggers an SLA credit and almost never appears in a board pack, yet it is the dominant component of cumulative annual downtime cost for most enterprises. At ITIC's per-minute rates, a mid-size enterprise loses around $50,000 per 10-minute event. The realistic figure, including the context-switching tail, is closer to $75,000. Across a year of typical operational noise, those events can compound to seven figures.
Direct Cost Plus Tail
10-minute outage cost by company size
The per-minute figures below come from ITIC 2024 and Ponemon 2022. The "tail" column is the context-switching productivity loss after technical recovery, modelled at 15 minutes per affected worker multiplied by the typical share of staff who depend on the system in question. For most knowledge-worker outages it adds 40 to 60% to the headline direct cost.
| Segment | $/min | Direct (10 min) | Context-switch tail | Total |
|---|---|---|---|---|
| Small business (under 50 staff) | $427 | $4,270 | $1,700 | $5,970 |
| Mid-size (50 to 500 staff) | $5,000 | $50,000 | $25,000 | $75,000 |
| Large enterprise (500+ staff) | $16,667 | $166,670 | $80,000 | $246,670 |
| Large enterprise, top quartile | $83,333 | $833,330 | $400,000 | $1,233,330 |
| Finance, large banks (peak) | $155,000 | $1,550,000 | $750,000 | $2,300,000 |
Per-minute figures are from ITIC 2024 (large enterprise) and Ponemon 2022 (small and mid-size). The tail estimate draws on Gloria Mark's attention research. Figures are in USD, as of 2026-05-18.
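The arithmetic behind the table can be reproduced with a minimal sketch like the one below. The rates and tail values are copied from the table; the names `SEGMENTS` and `outage_cost` are illustrative, and the tail is treated as a fixed estimate for a 10-minute event rather than something the code derives.

```python
# Per-minute rates and tail estimates copied from the table above (USD).
# Each entry is (cost per minute, context-switch tail for a 10-minute outage).
SEGMENTS = {
    "Small business (under 50 staff)": (427, 1_700),
    "Mid-size (50 to 500 staff)": (5_000, 25_000),
    "Large enterprise (500+ staff)": (16_667, 80_000),
    "Large enterprise, top quartile": (83_333, 400_000),
    "Finance, large banks (peak)": (155_000, 750_000),
}


def outage_cost(per_minute, tail, minutes=10):
    """Return (direct, total) cost in USD.

    Assumption: `tail` is the table's estimate for a 10-minute event,
    so it is only meaningful with the default `minutes` value.
    """
    direct = per_minute * minutes
    return direct, direct + tail


for name, (rate, tail) in SEGMENTS.items():
    direct, total = outage_cost(rate, tail)
    # Tail as a share of direct cost shows where each segment sits
    # relative to the 40 to 60% range quoted in the text.
    print(f"{name}: direct ${direct:,}, tail {tail / direct:.0%}, total ${total:,}")
```

Swapping in your own per-minute rate and tail estimate gives a first-pass figure for a specific system.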
Why Credits Do Not Fire
SLA credit accounting and the 10-minute outage
There are three reasons a 10-minute outage almost never earns a credit. First, SLAs are typically measured monthly. A 10-minute outage on a 99.9% monthly SLA (which permits about 43 minutes of downtime in a 30-day month) consumes 23% of the monthly budget. The customer is still within allowance. Credits only fire when cumulative downtime crosses the threshold.
Second, cloud SLAs often use a regional or service-level uptime definition that is more forgiving than what the customer experiences. The region-level commitment in the AWS EC2 SLA, for example, measures regional availability, not the uptime of any individual instance. A 10-minute issue in a single AZ that affects only some customers may not register on the SLA at all.
Third, the customer has to file the credit request manually, with evidence, within a tight window (typically 30 days). The opportunity cost of an engineer or account manager preparing the request exceeds the value of the credit for most small outages. So even when credit accrues in principle, it goes unclaimed in practice. See our full SLA credit asymmetry analysis for the cumulative claim-rate data.
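The error-budget arithmetic behind the first reason can be sketched as follows. This is a minimal illustration that assumes a 30-day month and a simple percentage-of-minutes SLA definition; real contracts define the measurement window and exclusions differently, and the function names are hypothetical.

```python
def monthly_budget_minutes(sla_percent, days_in_month=30):
    """Downtime allowed in a month before the SLA target is missed."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def budget_consumed(outage_minutes, sla_percent, days_in_month=30):
    """Fraction of the monthly downtime budget used by one outage."""
    return outage_minutes / monthly_budget_minutes(sla_percent, days_in_month)


budget = monthly_budget_minutes(99.9)
share = budget_consumed(10, 99.9)
print(f"budget: {budget:.1f} min, 10-minute outage uses {share:.0%}")
# budget: 43.2 min, 10-minute outage uses 23%
```

Under these assumptions it takes more than four 10-minute outages in the same month before cumulative downtime exceeds the 99.9% allowance.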
Frequency Math
10-minute outages add up across a year
The per-event figure of $50,000 to $75,000 for a mid-size enterprise looks tolerable in isolation. The compounding picture changes the math. Most engineering teams see roughly one 10-minute-class outage per month at baseline, two per month in operationally noisy periods, and as many as one per week during a poor quarter. Across a typical year, those events sum to a cost comparable to that of one major incident. The table below uses the $5,000-per-minute direct rate and excludes the context-switching tail.
| Frequency | Annual minutes | Annual direct cost (mid-size) | Note |
|---|---|---|---|
| 1 per month | 120 | $600,000 | Median operational practice |
| 2 per month | 240 | $1,200,000 | Frequent small incidents |
| 1 per week | 520 | $2,600,000 | Operationally noisy team |
This is the case for monitoring and observability investment. An incident that you detect and resolve within 3 minutes is a 3-minute outage. The same incident, if it takes 8 minutes just to detect, becomes a 10-minute outage. The marginal dollar of monitoring spend that reduces mean-time-to-detect typically returns more than the marginal dollar of HA spend that reduces blast radius, because most outages are small and frequent rather than large and rare. See MonitoringCost.com for the prevention-spend math.
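The frequency math is simple enough to check directly. The sketch below assumes the mid-size direct rate of $5,000 per minute and ignores the context-switching tail; `annual_direct_cost` is an illustrative helper, not part of any published model.

```python
MID_SIZE_RATE = 5_000  # USD per minute, direct cost only


def annual_direct_cost(events_per_year, minutes_per_event=10,
                       per_minute=MID_SIZE_RATE):
    """Annual direct cost of recurring short outages, in USD."""
    return events_per_year * minutes_per_event * per_minute


for label, events in [("1 per month", 12), ("2 per month", 24), ("1 per week", 52)]:
    print(f"{label}: {events * 10} min, ${annual_direct_cost(events):,}")

# Effect of faster detection: the same monthly incident, resolved in
# 3 minutes instead of 10.
print(f"3-minute resolution: ${annual_direct_cost(12, minutes_per_event=3):,}")
```

At one incident per month, cutting the duration from 10 minutes to 3 reduces the annual direct cost from $600,000 to $180,000 under these assumptions.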
The Context-Switching Tail
Why a 10-minute outage costs more than 10 minutes
Knowledge workers do not seamlessly resume work the moment a system recovers. The research is consistent. Gloria Mark's 2008 study on workplace interruption found an average of 23 minutes to return to focused work after an interruption. Salvucci and Bogunovich's CHI 2009 study measured 8 to 22 seconds of pure cognitive resumption time for trivial task switches, much higher for complex tasks.
For an outage of a critical work system, the resumption time is at the higher end of the range. People do not just wait for the system to come back. They Slack-message colleagues, check status pages, refresh tabs, write postmortems mentally, and then have to context-switch back to the task they were doing before. A 10-minute outage of CRM for a 200-person sales team is, in productivity terms, more like a 25 to 30 minute event. The aggregate productivity loss is therefore meaningfully larger than the technical outage window.
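As a rough illustration of the CRM example, the sketch below adds a per-person resumption time to the outage window and converts the result to person-hours. It assumes the 15-minute resumption figure used in the cost table and, as a simplification, that all 200 people are affected; both are parameters you would replace with your own estimates.

```python
def effective_minutes(outage_minutes, resumption_minutes=15):
    """Outage window plus the time each person needs to regain focus."""
    return outage_minutes + resumption_minutes


def person_hours_lost(affected_staff, outage_minutes, resumption_minutes=15):
    """Aggregate productivity loss in person-hours.

    Assumption: every affected person loses the full effective window.
    """
    minutes = effective_minutes(outage_minutes, resumption_minutes)
    return affected_staff * minutes / 60


print(effective_minutes(10))                    # 25
print(round(person_hours_lost(200, 10), 1))     # 83.3
```

Multiplying the person-hours by a loaded hourly cost for the team gives a tail estimate in dollars.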
Most downtime cost models ignore this tail. Ours adds it explicitly as a separate line because the input that reduces it is different from the input that reduces direct cost. Direct cost is reduced by faster mean-time-to-recover. Tail cost is reduced by clearer incident communications (so people do not waste time guessing), better partial-degradation modes (so people can keep doing some work), and improved status-page UX (so people can re-engage their primary task as soon as the system is back).
Frequently Asked
Common Questions
How much does a 10-minute outage cost a mid-size enterprise?
Will a 10-minute outage trigger an SLA credit?
Are 10-minute outages common?
Should I bother modelling 10-minute outages in my business case?
What is the productivity tail for a 10-minute outage?
How do I reduce the cost of 10-minute outages?