---
title: "Golden Signals: Efficient Monitoring for Successful Software Development"
date: 2025-02-22
reading_time: 3 min
---

Golden Signals: Efficient Monitoring for Successful Software Development

The Golden Signals are four essential metrics for monitoring a system: traffic, errors, latency, and saturation. With them, we can detect potential issues more easily, which reduces debugging and analysis time for the organization’s support team.

Personally, I feel a strong affinity for Golden Signals metrics, as they have been a great help in my work over the past few years, both as a Software Engineer and Engineering Manager. With a quick glance at the dashboard, I can evaluate the health and performance of my service without wasting time with multiple dashboards, queries, or tools. They also give me confidence that the changes I make to my code won’t have a negative impact on customers, which significantly facilitates the adoption of Continuous Delivery.

The Four Golden Signals

Traffic

This is the number of requests reaching your system. In web services, we typically measure the number of HTTP requests per second. In other services, the number of transactions executed per second is measured.
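As a minimal sketch of how this signal can be derived, assume we have the arrival time of each request in seconds. The function name, the input shape, and the 60-second window are illustrative choices, not part of any particular monitoring tool.

```python
from collections import Counter


def requests_per_second(timestamps, window_seconds=60):
    """Average requests per second for each fixed time window.

    timestamps: request arrival times in seconds (hypothetical input).
    Returns a dict mapping each window's start time to its request rate.
    """
    # Bucket every request into the window it falls in.
    counts = Counter(
        int(ts // window_seconds) * window_seconds for ts in timestamps
    )
    # Turn the per-window counts into a per-second rate.
    return {
        start: count / window_seconds
        for start, count in sorted(counts.items())
    }
```

In practice the monitoring backend usually performs this aggregation; the sketch only shows what the number on the dashboard means.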

Errors

This is the number of errors in your system, typically differentiated by type. In web services, we have the various HTTP status codes. In other services, we can provide a more detailed description of the error, which helps us be more efficient in debugging.

HTTP Status:

  • 500: internal server error
  • 502: bad gateway
  • 503: service temporarily unavailable
  • 504: gateway timeout
  • 400: bad request
  • 401: unauthorized
  • 403: forbidden
  • 404: not found
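  • 409: conflict, the request conflicts with the current state of the resource (for example, a concurrent update)
  • 499: client closed request (non-standard, used by nginx), logged when the client abandons the request before the server responds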
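To illustrate differentiating errors by type, here is a small sketch that groups status codes by class and computes an error rate. It assumes the input is a plain list of integer status codes and that both 4xx and 5xx responses count as errors; many teams prefer to track the two classes separately, so treat that choice as an assumption.

```python
from collections import Counter


def error_summary(status_codes):
    """Group HTTP status codes by class and compute the error rate.

    status_codes: iterable of integer HTTP status codes (hypothetical input).
    """
    by_class = Counter(f"{code // 100}xx" for code in status_codes)
    total = sum(by_class.values())
    # Assumption: client (4xx) and server (5xx) errors both count as errors.
    errors = by_class["4xx"] + by_class["5xx"]
    return {
        "by_class": dict(by_class),
        "error_rate": errors / total if total else 0.0,
    }
```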

Latency

Latency, or the response time of a request, is measured from when the request is received until the response is sent. It is crucial to differentiate response times between successful and failed requests. Mixing both types of requests can contaminate the metrics and make problem identification harder.
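The separation between successful and failed requests can be sketched as follows. The example assumes each request is recorded as a `(status_code, duration_ms)` pair and uses a simple nearest-rank percentile; both the data shape and the function names are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile; returns None when there are no values."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def latency_by_outcome(requests, pct=95):
    """Compute the latency percentile separately for each outcome.

    requests: iterable of (status_code, duration_ms) pairs (hypothetical shape).
    """
    requests = list(requests)
    ok = [duration for status, duration in requests if status < 400]
    failed = [duration for status, duration in requests if status >= 400]
    return {"success": percentile(ok, pct), "failure": percentile(failed, pct)}
```

A failed request that returns in a few milliseconds, or one that hangs until a timeout, would otherwise skew the percentile of the successful ones.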

Saturation

This metric measures the saturation level of the system’s operational resources, i.e., how close we are to reaching the limit. For example: CPU usage, memory, storage, disk I/O (input/output) operations, and network.
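One simple way to express "how close we are to the limit" is the ratio between current usage and capacity. The sketch below assumes each resource is reported as a `(used, limit)` pair in the same unit; the names are illustrative.

```python
def saturation(used, limit):
    """Fraction of a resource's capacity currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return used / limit


def most_saturated(resources):
    """Return the name and ratio of the resource closest to its limit.

    resources: dict mapping a resource name to a (used, limit) pair
    (hypothetical shape).
    """
    ratios = {
        name: saturation(used, limit)
        for name, (used, limit) in resources.items()
    }
    name = max(ratios, key=ratios.get)
    return name, ratios[name]
```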

Alerts

These metrics help us evaluate the health and performance of our system; however, their usefulness is limited if they are not continuously monitored. To avoid the need for staff watching monitors 24 hours a day, we can use technology to create alerts based on rules and thresholds, enabling more efficient monitoring. Notifications are triggered when the rules are met and the predefined thresholds are exceeded.
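A rule of this kind can be sketched in a few lines. The example assumes a rule fires only when the last few consecutive samples all exceed the threshold, which avoids notifying on a single spike; the `AlertRule` class and its fields are hypothetical and not the API of any specific alerting tool.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    name: str
    threshold: float
    for_samples: int  # consecutive breaching samples required (must be >= 1)


def should_fire(rule, samples):
    """True when the last `for_samples` values all exceed the threshold."""
    recent = samples[-rule.for_samples:]
    return len(recent) == rule.for_samples and all(
        value > rule.threshold for value in recent
    )
```

For example, a rule with a threshold of 0.05 and `for_samples=3` applied to an error-rate series stays silent until three consecutive samples are above 5%.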

Dashboard

It is essential that all four metrics are visualized on a single dashboard to facilitate problem identification by developers and the support team, avoiding wasted time navigating between dashboards and monitoring tools searching for each metric separately.

It is also crucial to have visibility into all services in your system. If you have databases, workers, Kafka, or microservices, you should have a dashboard for each of them, and each of those dashboards should include, at a minimum, the Golden Signals metrics.

Conclusion

What organization hasn’t experienced intense discussions between developers and DevOps about the cause of a failure affecting the end customer? What software engineer hasn’t wasted time trying to identify an error of unknown origin? That, and more, is what Golden Signals metrics are for: they are key tools for identifying the root cause of a problem. Implementing them considerably reduces debugging time and facilitates the adoption of Continuous Delivery, which helps keep the support SLA under control.

So, if you’re looking for successful software development, it’s essential to have at least Golden Signals metrics in every service of your system.
