Reliability & DORA Metrics
Purpose
Reliability & DORA Metrics measure production stability and recovery performance. They provide visibility into deployment frequency, change risk, and how quickly incidents are resolved.
These metrics are commonly used to evaluate engineering reliability and operational maturity.
Metrics Included
The following metrics belong to the Reliability & DORA family:
- Deployment Frequency
- Lead Time for Changes
- Change Failure Rate (CFR)
- Mean Time to Restore (MTTR)
Each metric's definition, calculation formula, normalization logic, and display behavior are documented in the Metrics Glossary.
How These Metrics Work Together
Reliability metrics measure both delivery speed and production stability:
- Deployment Frequency measures how often code is released to production.
- Lead Time for Changes measures how long it takes a change to go from commit to running in production.
- Change Failure Rate measures the percentage of deployments that result in incidents.
- MTTR measures how quickly service is restored after an incident.
Together, these metrics provide a balanced view of release velocity and operational risk.
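As a rough illustration of how the four metrics relate to the same underlying data, the sketch below computes them from a small set of deployment and incident records. The record shapes (`Deployment`, `Incident`), the field names, and the use of a simple mean are assumptions made for this example only; the authoritative formulas and normalization logic are the ones in the Metrics Glossary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Deployment:
    # Hypothetical record shape; real fields depend on your deployment mapping.
    commit_time: datetime    # when the change was committed
    deploy_time: datetime    # when the change reached production
    caused_incident: bool    # True if an incident was attributed to this deployment


@dataclass
class Incident:
    # Hypothetical record shape; real fields depend on your incident integration.
    opened: datetime
    restored: datetime


def dora_summary(deployments, incidents, period_days):
    """Compute the four metrics for one reporting period (illustrative formulas)."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    lead_hours = [
        (d.deploy_time - d.commit_time).total_seconds() / 3600 for d in deployments
    ]
    failed = sum(1 for d in deployments if d.caused_incident)
    restore_hours = [
        (i.restored - i.opened).total_seconds() / 3600 for i in incidents
    ]
    return {
        "deployment_frequency_per_day": len(deployments) / period_days,
        "lead_time_hours": mean(lead_hours),
        "change_failure_rate_pct": 100 * failed / len(deployments),
        # No incidents in the period means there is nothing to average.
        "mttr_hours": mean(restore_hours) if restore_hours else None,
    }


# Made-up sample data: four deployments in one week, one of which caused an incident.
t0 = datetime(2024, 1, 1, 9, 0)
deployments = [
    Deployment(t0, t0 + timedelta(hours=4), False),
    Deployment(t0 + timedelta(days=1), t0 + timedelta(days=1, hours=8), True),
    Deployment(t0 + timedelta(days=2), t0 + timedelta(days=2, hours=6), False),
    Deployment(t0 + timedelta(days=3), t0 + timedelta(days=3, hours=6), False),
]
incidents = [
    Incident(t0 + timedelta(days=1, hours=9), t0 + timedelta(days=1, hours=11)),
]

summary = dora_summary(deployments, incidents, period_days=7)
print(summary)
```

With this sample data, the average lead time is 6 hours, the Change Failure Rate is 25%, and MTTR is 2 hours. Reading the values side by side is the point: a high deployment frequency is only reassuring when Change Failure Rate and MTTR stay low.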
Dashboards That Use Reliability & DORA Metrics
Reliability & DORA metrics appear in dashboards that often combine them with Delivery and Throughput metrics to provide context.
Related Metric Families
Reliability metrics often correlate with:
- Delivery Metrics (lead time and deployment stage timing)
- Throughput Metrics (release volume vs. stability)
- Quality Metrics (change size and review rigor may influence CFR)
Configuration Dependencies
Reliability metrics rely on:
- Deployment mapping configuration
- Incident tracking integration (for CFR and MTTR)
- Accurate release detection
Misconfigured release or incident mapping can cause deployments or incidents to be missed or misattributed, which reduces the accuracy of these metrics.
Troubleshooting
If Reliability metrics appear inconsistent:
- Confirm deployments are correctly mapped to production environments.
- Validate that the incident tracking integration is enabled and syncing.
- Verify release tagging or detection logic.
For detailed troubleshooting guidance, see: Troubleshooting.
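If you can export deployment and incident records, a quick script can surface the kinds of mapping problems listed above before you dig into configuration. The sketch below is a minimal example under stated assumptions: the dictionary keys (`id`, `environment`, `release_tag`, `deployment_id`) and the `production_envs` parameter are hypothetical and do not reflect any particular export format.

```python
def find_mapping_issues(deployments, incidents, production_envs=("production",)):
    """Return warnings about records that are likely to skew Reliability metrics.

    All dictionary keys used here are hypothetical; adapt them to your own data.
    """
    issues = []
    deploy_ids = {d["id"] for d in deployments}

    for d in deployments:
        env = d.get("environment")
        if env not in production_envs:
            issues.append(
                f"deployment {d['id']}: environment {env!r} is not mapped to production"
            )
        if not d.get("release_tag"):
            issues.append(f"deployment {d['id']}: missing release tag")

    for i in incidents:
        # An incident with no matching deployment cannot be attributed to a change.
        if i.get("deployment_id") not in deploy_ids:
            issues.append(f"incident {i['id']}: not linked to a known deployment")

    return issues


# Made-up sample records for illustration.
deployments = [
    {"id": "d1", "environment": "production", "release_tag": "v1.4.0"},
    {"id": "d2", "environment": "prod-eu", "release_tag": None},
]
incidents = [
    {"id": "i1", "deployment_id": "d1"},
    {"id": "i2", "deployment_id": None},
]

issues = find_mapping_issues(deployments, incidents)
for issue in issues:
    print(issue)
```

An empty result does not prove the configuration is correct; it only rules out the specific gaps this sketch checks for.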