In general, I'm trying to understand the numbers displayed by uptime (and numerous other commands):
$ uptime
13:40:52 up 18 days, 2:10, 8 users, load average: 0.20, 0.30, 0.64
Here, my system's load average for the previous one-minute interval was 0.20. At a very high level, this number tells me that, on average during the sampling period, 0.2 processes were counted as producing load on the system [1].
I understand that a process can be in many different states, but I only have a firm understanding of three main ones:
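To make my reading of the number concrete, here is a minimal sketch of how I interpret it. The helper names and the two sample sets are hypothetical, and the plain unweighted mean is my simplification; as far as I know the kernel weights recent samples more heavily, so this only illustrates the idea of "0.2 processes on average".

```python
import re


def parse_load_averages(uptime_line):
    """Return the 1-, 5- and 15-minute load averages from a line of uptime output."""
    match = re.search(
        r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found in: " + uptime_line)
    return tuple(float(value) for value in match.groups())


def mean_counted(samples):
    """Naive, unweighted average number of processes counted per sample."""
    return sum(samples) / len(samples)


line = "13:40:52 up 18 days, 2:10, 8 users, load average: 0.20, 0.30, 0.64"
one_min, five_min, fifteen_min = parse_load_averages(line)

# Hypothetical sample sets, each with 1000 samples:
one_process_often = [1] * 200 + [0] * 800  # a single process counted in 200 samples
many_processes_once = [200] + [0] * 999    # 200 processes counted in one sample

print(one_min, mean_counted(one_process_often), mean_counted(many_processes_once))
```

Both sample sets give the same mean as the one-minute value parsed from the uptime line, which is why the number alone cannot tell me which of the two situations occurred.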
- The process is doing work on a CPU.
- The process could be doing work right away but there are no free CPUs.
- The process is waiting for something to happen on or outside the system.
I think the last state is the one to look at more closely, because I'm quite certain that processes in the first two states are always counted towards the system's load. I can imagine a few things that a process might do to get into the last state:
- Calling sleep().
- Calling read() on an open file which is not in the cache.
- Calling read() on a pipe into which no one is writing anything.
- Calling accept() on a network socket.
- Calling send() on a network socket with a full buffer.
- Trying to access private memory that has been swapped to disk.
- Trying to access a memory mapped file that is not in the cache and has to be loaded from disk or a network file system.
- Being unfortunate enough to jump to a piece of code that has not yet been loaded into memory.
I read this answer and a few other resources on the internet, and I've seen the terms uninterruptible state and uninterruptible sleep a few times, along with the claim that processes in such a state are also counted. So I'm assuming that this would include some of the situations above, but which ones? Definitely not all of them.
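One way I could try to find out which situation ends up in which state is to look at the state letter that Linux reports in /proc/<pid>/stat while a process is blocked. Below is a minimal sketch, assuming Linux and the layout of that file as described in the proc man page. The helper names are mine, and treating R (running or runnable) and D (uninterruptible sleep) as the counted states is an assumption based on what I've read so far, not something I've confirmed.

```python
# Assumption from my reading, not verified: these are the states that count.
ASSUMED_LOAD_STATES = {"R", "D"}


def parse_state(stat_line):
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The second field (the command name) is wrapped in parentheses and may
    itself contain spaces or ')', so split after the last ')' instead of
    splitting the whole line on whitespace.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def read_state(pid):
    """Read the current state letter of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return parse_state(stat_file.read())


def counts_towards_load(state):
    """True if the state is in the set I am assuming is counted."""
    return state in ASSUMED_LOAD_STATES


# Hypothetical stat contents, shortened to the first few fields:
example = "4242 (my (odd) name) S 1 4242 4242 0 -1"
print(parse_state(example), counts_towards_load(parse_state(example)))
```

With this I could start, say, sleep 60 in the background, call read_state() on its PID, and repeat the experiment for the other situations in the list. That would tell me the state, but not whether my assumption about which states are counted is right.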
So my main question is, when exactly is a process counted towards the system's load?
Footnotes
[1]: E.g. out of 1000 samples, 200 times a single process [2] was counted while the other 800 times none were counted. Or 200 processes were counted once while none were counted the other 999 times. I get that.
[2]: I'm using process and thread interchangeably here because, to my understanding, from the viewpoint of the scheduler, they don't look all that different. For the interpretation of my question, please assume that I'm talking about a system where every process has exactly one thread. If there are interesting differences when looking at processes with multiple threads, please note them though!