Understand load average on multicore system

Question

On a machine with a single processor, the load average reported by top can be read like this: if it is above 1.0, there are jobs waiting. But suppose we have a multicore system with n physical cores and l*n logical cores (on my Intel CPU n = 6 and l*n = 12, so nproc prints 12). Should we divide the load average by the output of nproc and check whether the result is above 1 to know whether (on average) jobs are waiting? Or is it better to use htop to judge whether a multicore system is under too much load?

I think my method was wrong but my conclusion was right: when I saw a load average above 10 in top, I checked with ps which process was expensive and found an overflow in a running program. But if nproc on that machine prints more than 10, that load would not really have been cause for investigation, had I known that. Do you agree?

This might help: https://unix.stackexchange.com/questions/303699/how-is-the-load-average-interpreted-in-top-output-is-it-the-same-for-all-di — Andre Wildberg, Dec 15 '20 at 13:21

score 7 · Accepted Answer · answered Dec 15 '20 at 14:48


Your assumption is correct: you divide the load average by the number of cores. To understand load averages better, I highly recommend this article by Brendan Gregg: http://www.brendangregg.com/blog/2017-08-08/linux-load-averages.html
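As a rough illustration of that division, here is a minimal Python sketch. It assumes a Unix-like system where `os.getloadavg()` is available. The helper name `per_core_load` is made up for this example, and `os.cpu_count()` is used as a stand-in for the output of nproc (the two can differ if the process is restricted to a subset of CPUs).

```python
import os


def per_core_load(load, cores):
    """Return the load average divided by the number of logical cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load / cores


if __name__ == "__main__":
    # Assumption: os.cpu_count() approximates what nproc prints.
    cores = os.cpu_count() or 1
    one, five, fifteen = os.getloadavg()  # 1, 5 and 15 minute averages
    for label, load in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        ratio = per_core_load(load, cores)
        print(f"{label}: load {load:.2f} / {cores} cores = {ratio:.2f}")
```

With the numbers from the question, a load of 10 on 12 logical cores comes out at roughly 0.83 per core, so below 1.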


ojs


Well, if I'm reading it correctly, just dividing across the cores would not work, I guess, because a system under heavy load is not necessarily under heavy CPU load; it can be I/O load too (e.g. threads or processes waiting on slow I/O). – Noobie Jul 10 '21 at 09:53
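To see how much of the load is CPU and how much is waiting on I/O, one option on Linux is to count tasks by state. The sketch below is only an approximation under stated assumptions: it reads `/proc/<pid>/stat`, so it is Linux-only, and it counts processes rather than individual threads, whereas the kernel's load average counts tasks. The function names `task_state` and `count_states` are invented for this example.

```python
import glob
from collections import Counter


def task_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def count_states():
    """Count processes per state (R = runnable, D = uninterruptible sleep)."""
    counts = Counter()
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path, errors="replace") as f:
                counts[task_state(f.read())] += 1
        except (OSError, IndexError):
            continue  # the process exited while we were reading it
    return counts


if __name__ == "__main__":
    counts = count_states()
    print("runnable (R):", counts["R"])
    print("uninterruptible sleep (D):", counts["D"])
```

If most of the load shows up as D rather than R, dividing by the core count says little about CPU saturation.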
