What is considered the best practice when working with alerting notifications?
The Prometheus alerting philosophy emphasizes signal over noise: alerts should focus only on actionable, user-impacting issues. The best practice is to alert on symptoms that indicate potential or actual user-visible problems, not on every internal metric anomaly.
This approach reduces alert fatigue, avoids desensitizing operators, and ensures high-priority alerts get the attention they deserve. For example, alerting on "service unavailable" or "latency exceeding SLO" is more effective than alerting on "CPU above 80%" or "disk usage increasing," which may not directly affect users.
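As a rough illustration, a symptom-based rule file might look like the sketch below. The metric names (`http_requests_total`, `http_request_duration_seconds_bucket`), the thresholds, and the `severity: page` label are assumptions chosen for the example, not values taken from the question or from any particular service; adjust them to your own instrumentation and SLOs.

```yaml
# Hypothetical rule file: both alerts fire on user-visible symptoms,
# not on internal causes such as CPU or disk usage.
groups:
  - name: symptom-based-alerts
    rules:
      # Symptom: users are receiving errors (assumed threshold: 5% of requests).
      - alert: HighErrorRatio
        expr: |
          sum(rate(http_requests_total{code=~"5.."}[5m]))
            /
          sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "More than 5% of requests are failing"

      # Symptom: latency exceeds an assumed SLO of 500 ms at the 99th percentile.
      - alert: LatencyAboveSLO
        expr: |
          histogram_quantile(0.99,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "p99 request latency is above the 500 ms SLO"
```

The `for` clause keeps short spikes from paging anyone, which supports the same goal of keeping alerts few and actionable.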
Option B correctly reflects this principle: keep alerts meaningful, few, and symptom-based. The other options contradict core best practices by promoting excessive or equal-weight alerting, which can overwhelm operations teams.
Verified from Prometheus documentation: Alerting Best Practices, Alertmanager Design Philosophy, and Prometheus Monitoring and Reliability Engineering Principles.