What is considered the best practice when working with alerting notifications?
The Prometheus alerting philosophy emphasizes signal over noise: alerts should fire only for actionable, user-impacting issues. The best practice is to alert on symptoms that indicate potential or actual user-visible problems, not on every internal metric anomaly.
This approach reduces alert fatigue, keeps operators from becoming desensitized, and ensures high-priority alerts get the attention they deserve. For example, alerting on "service unavailable" or "latency exceeding SLO" is more effective than alerting on "CPU above 80%" or "disk usage increasing," which may not directly affect users.
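As a rough illustration, a symptom-based rule file might look like the sketch below. The metric names (`http_requests_total`, `http_request_duration_seconds_bucket`), the `job="api"` label, the `code` label, and the 5% and 0.5 s thresholds are assumptions chosen for the example, not values taken from any particular deployment; adjust them to match your own instrumentation and SLOs.

```yaml
# Hypothetical rule file: both alerts describe what users experience,
# not the internal state of a single machine.
groups:
  - name: symptom-alerts
    rules:
      # Symptom: a noticeable share of requests is failing.
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{job="api", code=~"5.."}[5m]))
            /
          sum(rate(http_requests_total{job="api"}[5m])) > 0.05
        for: 10m            # must hold for 10 minutes to avoid paging on blips
        labels:
          severity: page
        annotations:
          summary: "More than 5% of API requests are failing"

      # Symptom: requests are slower than the (assumed) latency objective.
      - alert: HighRequestLatency
        expr: |
          histogram_quantile(
            0.99,
            sum by (le) (rate(http_request_duration_seconds_bucket{job="api"}[5m]))
          ) > 0.5
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "99th percentile API latency is above 0.5s"
```

A cause-based signal such as CPU usage can still be recorded and shown on dashboards for debugging; the point of the sketch is that only the user-visible symptoms page someone.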
Option B correctly reflects this principle: keep alerts meaningful, few, and symptom-based. The other options contradict core best practices by promoting excessive or equal-weight alerting, which can overwhelm operations teams.
Verified from Prometheus documentation: Alerting Best Practices, Alertmanager Design Philosophy, and Prometheus Monitoring and Reliability Engineering Principles.