Application Metrics
Instrument your services to emit metrics, then aggregate them, alert on them, and actually know what's happening.
Intent & Description
Real-world Use Case
Order Service emits p99 latency per endpoint. Prometheus scrapes every 15s. Grafana alerts when p99 > 500ms. On-call gets paged before users notice.
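As a rough illustration of this use case, the sketch below records request latencies per endpoint in histogram buckets, estimates p99 by linear interpolation inside the matching bucket (the same idea PromQL's histogram_quantile uses), and compares it with the 500 ms threshold. It uses only the Python standard library. The bucket bounds, the LatencyHistogram class, and the should_alert helper are hypothetical names chosen for illustration; in a real deployment a Prometheus client library would do the recording and an alert rule would do the check.

```python
import bisect
from collections import defaultdict

# Upper bounds of the latency buckets, in seconds (a hypothetical choice).
BUCKETS = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, float("inf")]
P99_THRESHOLD_SECONDS = 0.5  # the 500 ms alert threshold from the use case


class LatencyHistogram:
    """Per-endpoint latency counts, bucketed in the style of a Prometheus histogram."""

    def __init__(self):
        # endpoint -> per-bucket (non-cumulative) observation counts
        self.counts = defaultdict(lambda: [0] * len(BUCKETS))

    def observe(self, endpoint, seconds):
        # Pick the first bucket whose upper bound is >= the observed value.
        index = bisect.bisect_left(BUCKETS, seconds)
        self.counts[endpoint][index] += 1

    def quantile(self, endpoint, q):
        """Estimate quantile q by linear interpolation inside the target bucket."""
        counts = self.counts.get(endpoint)
        if counts is None or sum(counts) == 0:
            return None
        rank = q * sum(counts)
        cumulative = 0
        for i, count in enumerate(counts):
            if count > 0 and cumulative + count >= rank:
                lower = BUCKETS[i - 1] if i > 0 else 0.0
                upper = BUCKETS[i]
                if upper == float("inf"):
                    # No upper bound to interpolate towards: report the
                    # highest finite bound instead.
                    return lower
                return lower + (upper - lower) * (rank - cumulative) / count
            cumulative += count
        return BUCKETS[-2]


def should_alert(histogram, endpoint):
    """True when the estimated p99 for the endpoint exceeds the threshold."""
    p99 = histogram.quantile(endpoint, 0.99)
    return p99 is not None and p99 > P99_THRESHOLD_SECONDS


# Usage: 98 fast requests and 2 slow ones push p99 into the 0.5-1.0 s bucket.
orders = LatencyHistogram()
for _ in range(98):
    orders.observe("/orders", 0.08)
for _ in range(2):
    orders.observe("/orders", 0.8)
page_on_call = should_alert(orders, "/orders")
```

The estimate is only as precise as the bucket layout, so it helps to place a bucket boundary at the alert threshold, as the 0.5 s bound does here.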
📌 TL;DR
If it’s not measured, it doesn’t exist. Instrument everything, aggregate centrally, alert on what matters.
Advantages
- Real-time operational visibility — no more guessing
- Proactive alerting catches issues before users complain
- Enables capacity planning and scaling decisions
- Historical time series give post-mortems a timeline of what changed and when
Disadvantages
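- Cardinality explosion can tank Prometheus if you label carelessly: every distinct label combination is its own time series, so unbounded values such as user IDs multiply memory use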
- Every service needs instrumentation — ongoing dev effort
- Metrics infra (Prometheus, Grafana) needs to be maintained
- Easy to collect everything, hard to collect the right things
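The cardinality risk in the first disadvantage can be estimated before a metric ships. The sketch below computes the worst-case number of time series for a single metric as the product of the distinct values per label. The label names, the example values, and the estimated_series helper are all hypothetical, and the figure is an upper bound, since not every combination necessarily occurs in practice.

```python
from math import prod


def estimated_series(label_values):
    """Worst-case series count for one metric: product of distinct values per label."""
    return prod(len(values) for values in label_values.values())


# Bounded labels: every value set is small and known in advance.
bounded = {
    "endpoint": ["/orders", "/orders/{id}", "/checkout"],
    "method": ["GET", "POST"],
    "status_class": ["2xx", "4xx", "5xx"],
}

# Careless labelling: one unbounded label (a user ID) added to the same metric.
careless = dict(bounded, user_id=[f"user-{n}" for n in range(10_000)])

bounded_count = estimated_series(bounded)    # 3 * 2 * 3 = 18
careless_count = estimated_series(careless)  # 18 * 10_000 = 180_000
```

A single unbounded label turns 18 series into 180,000 for one metric, which is why identifiers like user or request IDs usually belong in logs or traces instead of metric labels.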