
I am seeing a strange issue on my Linux machine, which has 6 CPU cores (12 hyperthreads). The summary area of `top` shows about 90% idle. With 12 hardware threads, the maximum total usage should be 1200%, so at 90% idle the per-process usage should add up to no more than roughly 120% (10% of 1200%). Yet several processes each report close to 100%, as shown below. Is this because the idle figure comes from the kernel idle thread's jiffies, while per-process usage comes from /proc, where it depends on child process completion?

The `top` output follows:

top - 01:35:46 up 20:07,  3 users,  load average: 3.14, 2.05, 1.84
Tasks: 423 total,   6 running, 416 sleeping,   0 stopped,   1 zombie
%Cpu(s):  5.3 us,  4.2 sy,  0.0 ni, 89.7 id,  0.1 wa,  0.5 hi,  0.3 si,  0.0 st
KiB Mem : 65354616 total, 31564688 free, 10397548 used, 23392380 buff/cache
KiB Swap:  4194300 total,  4194300 free,        0 used. 53570652 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
18038 root      20   0 6631168 707068 412072 R 106.2  1.1   3:01.12 Process1
27399 root      20   0 3127992 853740 397052 R 106.2  1.3   0:03.34 Process2
13572 root      20   0 2289344 857832  89620 R 100.0  1.3   0:37.95 Process3
11386 root      20   0 6653276 1.113g 615792 R  93.8  1.8   3:44.04 Process4
13568 root      20   0 6241660 838040 408984 R  93.8  1.3   0:42.94 Process5
10344 root      20   0 6387436 1.008g 396800 S  68.8  1.6   5:27.98 Process6
14899 root      20   0 5668692 473040 458768 S  62.5  0.7   3:11.54 Process7
13100 root      20   0 6288300 915164 386776 S  12.5  1.4   3:20.63 Process8
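To put a number on the mismatch, here is a minimal Python sketch that redoes the arithmetic with the figures from the output above. It assumes `top` is in its default Irix mode, where each process's %CPU is relative to a single logical CPU while the `%Cpu(s)` line is averaged over all 12; the function name `busy_budget` is just an illustrative helper.

```python
# Figures copied from the top output above.
LOGICAL_CPUS = 12      # 6 cores x 2 hyperthreads
IDLE_PERCENT = 89.7    # "id" field of the %Cpu(s) summary line
PROCESS_CPU = [106.2, 106.2, 100.0, 93.8, 93.8, 68.8, 62.5, 12.5]


def busy_budget(idle_percent, logical_cpus):
    """Non-idle share of the summary line, rescaled to the per-process scale.

    Assumption: the summary is an average over all logical CPUs, so
    multiplying by the CPU count gives a figure comparable to the
    per-process %CPU column.
    """
    return (100.0 - idle_percent) * logical_cpus


budget = busy_budget(IDLE_PERCENT, LOGICAL_CPUS)
listed = sum(PROCESS_CPU)

print(f"busy share implied by the summary: {budget:.1f}%")
print(f"sum of the listed process %CPU:    {listed:.1f}%")
```

The eight listed processes alone add up to about 644%, roughly five times the 124% that the summary line would allow, so rounding cannot explain the gap.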
Totor
  • In `man top`, I see that the "id" means "time spent in the kernel idle handler". I am absolutely not an expert or even remotely knowledgeable about this sort of thing, but that looks like it would count things like the time a process spends waiting for input/output operations and _not_ the time that the actual CPU is completely idle. – terdon Nov 12 '21 at 12:10
  • That may be a bug in `top`; please check whether `htop` gives comparable results. It might be kernel related (please share the version). It may also be related to how the average is calculated: maybe ProcessX is using 100% CPU 10% of the time, but when your `top` samples at a given moment, it sees (and displays) 100%, while the average for the whole CPU is different. Your [load average](https://unix.stackexchange.com/a/118163) is not so high. Maybe `ProcessX` runs in some sort of container, and cgroup CPU limiting applies. – Totor Nov 12 '21 at 12:22
  • Out of 6 CPUs, only one of them might be using 100%. Rest must be idle. Press "1" in the top command to see individual CPU statistics. – SHW Nov 12 '21 at 15:01
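
The per-CPU view suggested in the last comment can also be computed by hand from two snapshots of `/proc/stat`. The sketch below assumes the usual counter order on each `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice). The two snapshots are made-up sample values for a 2-CPU machine, not measurements from the system in the question, and `parse_cpu_lines` and `idle_percent` are hypothetical helper names.

```python
def parse_cpu_lines(stat_text):
    """Map 'cpu', 'cpu0', ... to their list of jiffy counters."""
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("cpu"):
            counters[fields[0]] = [int(value) for value in fields[1:]]
    return counters


def idle_percent(before, after):
    """Idle share per CPU over the interval between two snapshots."""
    result = {}
    for name, new in after.items():
        old = before[name]
        # Only the first 8 counters are summed: guest and guest_nice
        # are already accounted for in user and nice.
        total = sum(new[:8]) - sum(old[:8])
        idle = new[3] - old[3]  # idle is the 4th counter
        result[name] = 100.0 * idle / total if total else 0.0
    return result


# Made-up sample snapshots (illustration only).
SNAPSHOT_1 = """\
cpu  200 0 100 1700 0 0 0 0 0 0
cpu0 100 0 50 850 0 0 0 0 0 0
cpu1 100 0 50 850 0 0 0 0 0 0
"""
SNAPSHOT_2 = """\
cpu  300 0 100 1800 0 0 0 0 0 0
cpu0 200 0 50 850 0 0 0 0 0 0
cpu1 100 0 50 950 0 0 0 0 0 0
"""

idle = idle_percent(parse_cpu_lines(SNAPSHOT_1), parse_cpu_lines(SNAPSHOT_2))
for name, value in idle.items():
    print(f"{name}: {value:.1f}% idle")
```

In this sample the aggregate `cpu` line is 50% idle even though `cpu0` is fully busy and `cpu1` is fully idle, which is the kind of averaging the comment describes. On a real Linux system, the two snapshots would come from reading `/proc/stat` twice with a short sleep in between.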
