Counter - Resources Metrics

About

Resource metrics (counters) are usually expressed in the following terms:

  • utilization: a percentage over a time interval, e.g. “one disk is running at 90% utilization”.
  • saturation: a queue length, e.g. “the CPUs have an average run queue length of four”.
  • errors: a scalar count, e.g. “this network interface has had fifty late collisions”.
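As a minimal sketch of how these three terms relate to raw observations, the following Python snippet derives them for one observation window. The UseSample class and its field names are hypothetical, introduced only for illustration, and the numbers are made up to match the examples above.

```python
from dataclasses import dataclass, field


@dataclass
class UseSample:
    """One observation window for a single resource (hypothetical structure)."""
    interval_s: float                 # length of the observation window, in seconds
    busy_s: float                     # time the resource spent servicing work
    queue_lengths: list = field(default_factory=list)  # queue length sampled during the window
    error_count: int = 0              # errors counted during the window

    def utilization_pct(self) -> float:
        # Utilization: share of the interval spent busy, as a percent.
        return 100.0 * self.busy_s / self.interval_s

    def saturation(self) -> float:
        # Saturation: average length of the wait queue over the window.
        if not self.queue_lengths:
            return 0.0
        return sum(self.queue_lengths) / len(self.queue_lengths)

    def errors(self) -> int:
        # Errors: a plain scalar count.
        return self.error_count


# Illustrative values: a disk busy for 54 s out of a 60 s window.
disk = UseSample(interval_s=60.0, busy_s=54.0, queue_lengths=[3, 5, 4, 4], error_count=2)
print(f"utilization={disk.utilization_pct():.0f}% "
      f"saturation={disk.saturation():.1f} errors={disk.errors()}")
# prints: utilization=90% saturation=4.0 errors=2
```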

Because utilization is a mean over the interval, it can hide large deviations: short bursts at full load can cause saturation even though the average stays low. Therefore, low utilization does not imply low saturation.
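The effect can be illustrated with a small simulation. This is a sketch under simple assumptions, not a model of any real device: the resource serves one unit of work per tick, and the simulate function and the arrival pattern are hypothetical.

```python
def simulate(arrivals, capacity=1):
    """Return (mean utilization in %, peak queue length) for per-tick arrivals."""
    queue = 0        # work left waiting at the end of the previous tick
    served_total = 0
    peak_queue = 0
    for arriving in arrivals:
        pending = queue + arriving
        served = min(pending, capacity)   # the resource handles at most `capacity` per tick
        queue = pending - served          # whatever is left has to wait
        served_total += served
        peak_queue = max(peak_queue, queue)
    utilization = 100.0 * served_total / (capacity * len(arrivals))
    return utilization, peak_queue


# 100 ticks, idle except for two bursts of 10 units of work each.
arrivals = [0] * 100
arrivals[0] = 10
arrivals[50] = 10

utilization, peak_queue = simulate(arrivals)
print(f"mean utilization={utilization:.0f}% peak queue={peak_queue}")
# prints: mean utilization=20% peak queue=9
```

The mean utilization is only 20%, yet during each burst up to nine units of work are queued.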

These metrics can serve as Service Level Indicators (SLIs).

Collection

The data are generally collected as a gauge, that is, a value sampled at a point in time that can go up or down.
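A minimal sketch of gauge-style collection follows. The Gauge class and the collect function are hypothetical and do not correspond to any particular metrics library; the readings are made-up queue lengths standing in for a real probe.

```python
import time


class Gauge:
    """Minimal gauge: keeps only the most recent value (hypothetical class)."""

    def __init__(self, name):
        self.name = name
        self.value = 0.0
        self.updated_at = None

    def set(self, value, now=None):
        # A gauge is overwritten on every sample; it can go up or down.
        self.value = float(value)
        self.updated_at = time.time() if now is None else now


def collect(gauge, read_value, samples):
    """Call read_value() `samples` times, storing each reading in the gauge."""
    history = []
    for _ in range(samples):
        gauge.set(read_value())
        history.append(gauge.value)
    return history


# Stand-in for a real probe: three successive queue-length readings.
readings = iter([4, 7, 2])
queue_gauge = Gauge("disk_wait_queue_length")
history = collect(queue_gauge, lambda: next(readings), samples=3)
print(history, queue_gauge.value)
# prints: [4.0, 7.0, 2.0] 2.0
```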

Example

Resource     | Utilization                                            | Saturation          | Errors                                       | Availability
-------------|--------------------------------------------------------|---------------------|----------------------------------------------|------------------------------
Disk IO      | % time that device was busy                            | wait queue length   | # device errors                              | % time writable
Memory       | % of total memory capacity in use                      | swap usage          | N/A (not usually observable)                 | N/A
Microservice | average % time each request-servicing thread was busy  | # enqueued requests | # internal errors such as caught exceptions  | % time service is reachable
Database     | average % time each connection was busy                | # enqueued queries  | # internal errors, e.g. replication errors   | % time database is reachable
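The availability figures above (% time reachable) can be approximated from periodic health checks. The sketch below assumes evenly spaced probes, so that the share of successful probes stands in for the share of time; the availability_pct function and the probe data are hypothetical.

```python
def availability_pct(probe_results):
    """Percent of probes that succeeded.

    Approximates "% time reachable" only if probes are evenly spaced.
    """
    if not probe_results:
        raise ValueError("no probe results")
    successes = sum(1 for ok in probe_results if ok)
    return 100.0 * successes / len(probe_results)


# Illustrative data: 60 probes, 3 of which failed.
probes = [True] * 57 + [False] * 3
print(f"{availability_pct(probes):.1f}% reachable")
# prints: 95.0% reachable
```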
