What is considered the best practice when working with alerting notifications?
The Prometheus alerting philosophy emphasizes signal over noise: alerts should fire only for actionable, user-impacting issues. The best practice is to alert on symptoms that indicate potential or actual user-visible problems, not on every internal metric anomaly.
This approach reduces alert fatigue, keeps operators from becoming desensitized, and ensures that high-priority alerts get the attention they deserve. For example, alerting on "service unavailable" or "latency exceeding SLO" is more effective than alerting on "CPU above 80%" or "disk usage increasing," which may not directly affect users.
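The symptom-based examples above could be written as Prometheus alerting rules along the following lines. This is a minimal sketch, not a recommended configuration: the metric names (`http_requests_total`, `http_request_duration_seconds_bucket`), the `code` label, the thresholds, and the `severity: page` label are all assumptions chosen for illustration and would need to match your own instrumentation and SLOs.

```yaml
groups:
  - name: symptom-based-alerts   # hypothetical group name
    rules:
      # Symptom: users are receiving errors.
      # Assumes a counter http_requests_total with a "code" label.
      - alert: HighErrorRatio
        expr: |
          sum(rate(http_requests_total{code=~"5.."}[5m]))
            /
          sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "More than 5% of requests are failing"

      # Symptom: latency is above an assumed 500 ms SLO.
      # Assumes a histogram http_request_duration_seconds.
      - alert: HighRequestLatency
        expr: |
          histogram_quantile(
            0.99,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
          ) > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "p99 request latency is above 500 ms"
```

Both rules page on what users experience. The `for` clause holds the alert in a pending state until the condition has been true for the full duration, which filters out brief spikes. Cause-level signals such as CPU or disk usage are usually better kept on dashboards or routed to a lower-severity channel.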
Option B correctly reflects this principle: keep alerts meaningful, few, and symptom-based. The other options contradict core best practices by promoting excessive or equal-weight alerting, which can overwhelm operations teams.
Verified from Prometheus documentation: Alerting Best Practices, Alertmanager Design Philosophy, and Prometheus Monitoring and Reliability Engineering Principles.