CentOS 9, running in a VM with 3 vCPUs and 4 GB RAM.

I run a cron job that uses 7z to compress 35 GB of data in 150 files: `7za a -mx=9 -mmt=3 ...`

RAM usage is 18%, the disk queue is very small, and CPU usage averages 61%. Why not 100%? How do I find the bottleneck?

sar -p -d 1 10
Linux 5.14.0-80.el9.x86_64 (logger)       30/04/22        _x86_64_        (3 CPU)

16:50:10          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:11         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

16:50:11          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:12         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

16:50:12          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:13        39.00  33832.00      0.00      0.00    867.49      0.04      0.95      1.90 sda

16:50:13          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:14         2.00      0.00     24.00      0.00     12.00      0.00      0.50      0.10 sda

16:50:14          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:15         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

16:50:15          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:16         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

16:50:16          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:17         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

16:50:17          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:18         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

16:50:18          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:19         2.00      0.00     12.00      0.00      6.00      0.00      0.50      0.20 sda

16:50:19          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
16:50:20         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00 sda

Average:          tps     rkB/s     wkB/s     dkB/s   areq-sz    aqu-sz     await     %util DEV
Average:         4.30   3383.20      3.60      0.00    787.63      0.00      0.91      0.22 sda

sar -p -u 1 10
Linux 5.14.0-80.el9.x86_64 (logger)       30/04/22        _x86_64_        (3 CPU)

16:50:26        CPU     %user     %nice   %system   %iowait    %steal     %idle
16:50:27        all     60.20      0.00      0.99      0.00      0.00     38.82
16:50:28        all     61.54      0.00      0.67      0.00      0.00     37.79
16:50:29        all     60.87      0.00      0.33      0.00      0.00     38.80
16:50:30        all     59.26      0.00      1.01      0.00      0.00     39.73
16:50:31        all     60.20      0.00      1.00      0.00      0.00     38.80
16:50:32        all     62.79      0.00      0.00      0.00      0.00     37.21
16:50:33        all     63.46      0.00      1.00      0.00      0.00     35.55
16:50:34        all     64.88      0.00      0.67      0.00      0.00     34.45
16:50:35        all     63.04      0.00      0.66      0.00      0.00     36.30
16:50:36        all     62.88      0.00      0.33      0.00      0.00     36.79
Average:        all     61.91      0.00      0.67      0.00      0.00     37.42

EDIT: I found this doc: https://documentation.help/7-Zip/method.htm. It says "LZMA compression uses only 2 threads.", which would explain what I observe on CentOS. But on Windows it uses 24 threads with LZMA. Why?
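
As a rough sanity check on that reading: if only 2 threads are ever busy, overall utilisation on 3 vCPUs cannot exceed about 67%, which is in the neighbourhood of the 61.91% `%user` average in the `sar` output. The sketch below just does that arithmetic; `expected_cpu_pct` and its parameters are illustrative names, not part of 7-Zip or sysstat.

```shell
# Upper bound on overall CPU utilisation when only some threads are busy.
# $1 = number of busy threads, $2 = number of CPUs (both illustrative inputs).
expected_cpu_pct() {
    busy_threads=$1
    total_cpus=$2
    # Busy threads beyond the CPU count cannot add utilisation, so cap at n.
    awk -v b="$busy_threads" -v n="$total_cpus" \
        'BEGIN { printf "%.1f\n", (b < n ? b : n) / n * 100 }'
}

expected_cpu_pct 2 3    # prints 66.7
```

The gap between 66.7% and the measured value would be consistent with the two threads not being fully busy all the time, but that is a guess rather than something the measurements prove.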

Boppity Bop
  • Are you running the compression in parallel? 60% of 3 CPUs is just about 2 CPUs at 100%. – Kusalananda Apr 30 '22 at 20:58
  • This is a single 7za process. On a Windows machine, the same files with the same parameters use 85% of the CPU, so I expect Linux to reach at least that number. – Boppity Bop Apr 30 '22 at 21:11
  • @BoppityBop **that** is important information to have! I wrote an answer and was about to post it, but the info you just gave in a comment basically invalidated all of it. Don't have the time to incorporate it now. **EDIT** your question to *include that critical piece of info*! – Marcus Müller Apr 30 '22 at 21:15
  • I didn't know that at the time I posted. It takes almost 4 hours to finish the archive. I just found out myself, so I added the edit. – Boppity Bop Apr 30 '22 at 21:17
  • you're simply using two different programs with the same name – Marcus Müller Apr 30 '22 at 21:17
  • But the author is the same. I am sure he uses C++ which is not that different on Windows. – Boppity Bop Apr 30 '22 at 21:18
  • hm, maybe the detection for the number of threads to use works worse in their Linux port. Have you tried explicitly setting 24 for the number of threads using `-mmt=24` instead of just `-mmt`? – Marcus Müller Apr 30 '22 at 21:21
  • (try with a smaller file, maybe!) – Marcus Müller Apr 30 '22 at 21:22
  • By the way, if you want to have something that is very darn nearly as good at compressing as 7z's LZMA2 implementation, but 2 to 4 times faster, [fast-lzma2](https://github.com/conor42/fast-lzma2) might be your method of choice :) – Marcus Müller Apr 30 '22 at 21:31
  • thank you. I will test fast-lzma2 tomorrow and report back. – Boppity Bop Apr 30 '22 at 21:37

1 Answer

Mystery solved: use `7za a -mx=9 -mmt=4`. Note the 4 threads, even though the VM has only 3 vCPUs.

Now it uses 100% CPU.
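
To confirm this per thread rather than from the overall average, one option is to list the threads of the running process and count the busy ones. The sketch below assumes a single `7za` process and the procps `ps`; `count_busy_threads` is a hypothetical helper, and the 50% threshold is arbitrary. Note that the `pcpu` value from `ps` is an average over the thread's lifetime, not an instantaneous reading, so `top -H -p <pid>` is a useful cross-check.

```shell
# Count threads whose CPU share exceeds a threshold (in percent).
# Reads "TID %CPU" pairs on stdin; $1 is the threshold, default 50.
count_busy_threads() {
    threshold=${1:-50}
    awk -v t="$threshold" '$2 + 0 > t { n++ } END { print n + 0 }'
}

# Usage, assuming exactly one 7za process is running:
#   ps -L -o tid= -o pcpu= -p "$(pgrep -x 7za)" | count_busy_threads 50
```

With `-mmt=3` one would expect this to report about 2 busy threads, and more with `-mmt=4`, if the 2-thread explanation is right.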

Helped by the 7z author: https://sourceforge.net/p/p7zip/discussion/383043/thread/15831e05/#576a/740e/7944
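
To avoid hard-coding the number in the cron job, one possible approach is to round the vCPU count up to the next even number. This is only a sketch: it assumes, going by the doc quoted in the question, that threads are used in pairs, so an odd `-mmt` value leaves one thread unused. That matches the 3 versus 4 result here but is not confirmed for every 7-Zip build. `even_threads` is a hypothetical helper; `nproc` is the coreutils command.

```shell
# Round a CPU count up to the next even number (assumption: threads come in pairs).
even_threads() {
    n=$1
    echo $(( (n + 1) / 2 * 2 ))
}

even_threads 3    # prints 4

# Usage in the cron job (remaining arguments elided as in the original command):
#   7za a -mx=9 -mmt="$(even_threads "$(nproc)")" ...
```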

Boppity Bop