Performance - (Latency|Response time|Running Time)
Table of Contents
1 - About
Latency is a performance metric also known as Response time.
Latency (Response Time) is the amount of time a system takes to process a request, local or remote, from the moment the request is made to the moment the first response data is received.
In other words, it is how much time passes between making a request and receiving the first data requested. You can implement the measurement like this: take the elapsed time from just before sending the request to just after the first response has been received.
A request can be:
- a UI action (such as pressing a button)
- or a server API call
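As a minimal sketch of the measurement described above (using Python's standard library; `send_request` is a hypothetical stand-in for any blocking request), latency can be captured by timestamping just before the request is sent and just after the first response arrives:

```python
import time

def measure_latency(send_request):
    """Return the latency in seconds of a single request.

    `send_request` is any callable that performs the request and
    returns once the first response data has been received.
    """
    start = time.perf_counter()          # just before sending the request
    send_request()                       # blocks until first response
    return time.perf_counter() - start   # just after the first response

# Example: a stand-in "request" that simply sleeps for 50 ms
latency = measure_latency(lambda: time.sleep(0.05))
print(f"latency: {latency * 1000:.1f} ms")
```

`time.perf_counter` is used rather than `time.time` because it is a monotonic, high-resolution clock suited to measuring short intervals.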
Network protocol analyzers (such as Wireshark) measure the time at which bytes are actually sent/received over the network interface.
Can we answer “What was the performance?” with “It took 15 seconds”? Strictly speaking, that is a time, not a performance: performance is the inverse of time, and its units, “inverse seconds”, can be awkward.
2 - Articles Related
3 - Measurement
4 - Requirement
- 90% of responses should be below 0.5 seconds,
- 99% should be below 2 seconds,
- 99.9% should be below 5 seconds,
- and a response above 10 seconds should never happen.
If you haven't stated percentiles and a maximum, you haven't specified your requirements.
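A requirement stated this way can be checked directly against a set of latency samples. The sketch below (the function names and thresholds are illustrative, mirroring the example requirement above) uses a nearest-rank percentile:

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest value >= p% of the samples."""
    ordered = sorted(samples)
    rank = max(1, -(-len(ordered) * p // 100))  # ceil(n * p / 100)
    return ordered[int(rank) - 1]

def meets_requirements(samples):
    """Check latency samples (in seconds) against the example requirement."""
    return (percentile(samples, 90) < 0.5        # 90% below 0.5 s
            and percentile(samples, 99) < 2.0    # 99% below 2 s
            and percentile(samples, 99.9) < 5.0  # 99.9% below 5 s
            and max(samples) <= 10.0)            # no response above 10 s

# Illustrative sample set: 95 fast responses, a few slower, one outlier
samples = [0.1] * 95 + [0.4] * 4 + [1.5]
print(meets_requirements(samples))
```

Note how the single 1.5 s outlier passes here only because the maximum is also bounded; dropping the max check would hide a pathological 11 s response entirely from the 99th percentile.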
5 - User Perception
- 100 ms: the user perceives the response as instantaneous.
- 1 s: about the upper limit of the human flow of thought; the user loses the feeling of direct feedback.
- 10 s: the upper limit of attention; feedback is important and the chance of a context switch is high.
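The thresholds above translate directly into perception bands. A small sketch (the function name and band labels are illustrative, not from any standard API):

```python
def perceived_as(response_seconds):
    """Map a response time to the user-perception bands listed above."""
    if response_seconds <= 0.1:
        return "instantaneous"                    # up to 100 ms
    if response_seconds <= 1.0:
        return "flow of thought kept"             # up to ~1 s
    if response_seconds <= 10.0:
        return "attention kept, feedback needed"  # up to ~10 s
    return "attention lost, context switch likely"

print(perceived_as(0.05))
print(perceived_as(15))
```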
6 - Response Time
6.1 - Over Load
6.2 - Percentile
7 - Storage
HdrHistogram (High Dynamic Range Histogram) - a data structure designed to capture and store latency values.
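To convey the idea behind HdrHistogram without pulling in the real library, here is a much-simplified pure-Python sketch: latency values are recorded into power-of-two buckets, so percentiles can later be estimated with a fixed, small memory footprint. (The real HdrHistogram additionally sub-divides buckets to bound the relative error; this toy version does not.)

```python
import math

class TinyLatencyHistogram:
    """Toy exponential-bucket histogram; NOT the real HdrHistogram."""

    def __init__(self, max_value_ms=60_000):
        # one bucket per power of two up to max_value_ms
        self.buckets = [0] * (max_value_ms.bit_length() + 1)
        self.total = 0

    def record(self, value_ms):
        # bucket i holds values in [2**(i-1), 2**i - 1]
        index = min(int(value_ms).bit_length(), len(self.buckets) - 1)
        self.buckets[index] += 1
        self.total += 1

    def value_at_percentile(self, p):
        """Upper bound (in ms) of the bucket containing the p-th percentile."""
        target = math.ceil(self.total * p / 100)
        seen = 0
        for index, count in enumerate(self.buckets):
            seen += count
            if seen >= target:
                return (1 << index) - 1
        return (1 << (len(self.buckets) - 1)) - 1

# 99 fast requests around 5 ms and one 1-second outlier
h = TinyLatencyHistogram()
for _ in range(99):
    h.record(5)
h.record(1000)
print(h.value_at_percentile(50), h.value_at_percentile(100))
```

The design trade-off is the same as in the real library: exact values are discarded, but recording is O(1), memory is constant, and high percentiles remain recoverable to within the bucket resolution.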
8 - Documentation / Reference
- How NOT to Measure Latency: An attempt to share wisdom... - Matt Schuetze, Product Management Director, Azul Systems. A presentation on how not to measure latency, which explains why average values are a better measure of latency than median values and why maximum values are very important.