I am trying to understand whether setting `cpu.cfs_quota_us` in the cpu cgroup subsystem has any impact on application performance. Essentially, if I reduce the CPU quota but increase the number of CPUs so that the "effective" number of CPUs stays the same, would it impact the application? For example, is a 4-CPU, 100%-quota configuration the same as an 8-CPU, 50%-quota configuration?
I know this depends a lot on the application design and whether it's CPU- or I/O-bound. Here I am only concerned with CPU-intensive applications.
My effort:
I wrote a simple C application available here https://github.com/ashu-mehra/cpu-quota-test.
This program creates 'N' threads. Each thread computes prime numbers between a starting number 'n' and 1000000, where 'n' is different for each thread. After computing 100 prime numbers, the thread sleeps for a fixed duration. Once the thread reaches 1000000, it starts over from 2. At the end, the main thread displays the cumulative number of primes calculated by all the threads. I treat this as the "throughput" of this sample application.
I ran this program under the following configurations:
- In a cgroup with 4 CPUs and no limit on quota.
- In a cgroup with 8 CPUs and a 50% quota.
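To make the comparison concrete, here is a sketch of how the two cgroups can be set up under cgroup v1 (the cgroup names and mount points are my assumptions; the sysfs writes need root, so those lines are commented out — the quota arithmetic is the point):

```shell
#!/bin/sh
# Sketch of the two cgroup-v1 configurations. 8 CPUs at 50% each gives
# the same 4 "effective" CPUs as 4 CPUs with no quota at all.
PERIOD=100000                       # cpu.cfs_period_us default (100 ms)

# Case 1: 4 CPUs, unlimited quota (cpu.cfs_quota_us = -1)
# echo 0-3 > /sys/fs/cgroup/cpuset/cpu4quota100/cpuset.cpus
# echo -1  > /sys/fs/cgroup/cpu/cpu4quota100/cpu.cfs_quota_us

# Case 2: 8 CPUs, 50% quota each
NCPUS=8
PCT=50
QUOTA=$((PERIOD * NCPUS * PCT / 100))
echo "cpu8quota50: cpu.cfs_quota_us = $QUOTA"    # 400000 = 4 CPUs' worth
# echo 0-7     > /sys/fs/cgroup/cpuset/cpu8quota50/cpuset.cpus
# echo $QUOTA  > /sys/fs/cgroup/cpu/cpu8quota50/cpu.cfs_quota_us
```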
I disabled hyperthreading by setting `/sys/devices/system/cpu/cpuN/online` to 0 for the sibling threads.
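For reference, a sketch of how the sibling threads can be identified and offlined (standard Linux sysfs topology paths; the actual offline write needs root, so it is left commented):

```shell
# For each logical CPU, read its thread_siblings_list; any CPU that is
# not the first entry in its own list is a hyperthread sibling.
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
  sib="$cpu/topology/thread_siblings_list"
  [ -r "$sib" ] || continue
  # The list looks like "0,4" or "0-1"; grab the first CPU id.
  first=$(cut -d',' -f1 "$sib" | cut -d'-' -f1)
  n=${cpu##*cpu}
  if [ "$n" != "$first" ]; then
    echo "cpu$n is a hyperthread sibling of cpu$first"
    # echo 0 > "$cpu/online"    # as root: offline this hyperthread
  fi
done
```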
For each configuration I varied the number of threads from 4 to 32. Following are the results for the "throughput" reported by the sample program; numbers are averages of 10 iterations.

    threads  cpu4quota100  cpu8quota50
    4        66229.5       66079.4
    8        128129        129768
    16       189247        134882
    24       188238        98917.8
    32       176236        87252.5
Notice there is a big difference in throughput between the two cases from 16 threads onwards. For 24 and 32 threads, throughput drops considerably in the "cpu8quota50" case.
I have the perf stat results for these runs as well. I noticed that the cpu-migrations count reported by perf varies a lot between the two configurations. Here is the comparison:

    threads  cpu4quota100  cpu8quota50
    4        9.6           11.2
    8        3252.2        37.9
    16       2956.2        4490.5
    24       472.6         2347
    32       118.3         1727.2
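For reference, this is roughly how such numbers can be collected per run (a sketch; the cgroup names and binary name are my own, and `cgexec` comes from the libcgroup tools):

```shell
# Run the benchmark inside one of the cgroups and collect
# scheduler-related events, averaged over 10 repeats:
perf stat -r 10 -e cpu-migrations,context-switches,task-clock \
    cgexec -g cpu,cpuset:cpu8quota50 ./prime-test 32
```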
The numbers for 4, 8 and 16 threads make sense, but I can't comprehend the numbers for 24 and 32 threads in the "cpu4quota100" case, which are way lower than for 16 threads.
Can someone provide an explanation for these results? Also, does "cpu-migration" have any impact on application performance?
Sorry for the long post!
Edit 1:
I updated my script for running the above-mentioned sample program to time the execution using the time command, to see if there is any difference between the "cpu4quota100" and "cpu8quota50" cases.
I did the run for 32 threads only, and these are the results:
    time  cpu4quota100  cpu8quota50
    user  119.956 secs  120.076 secs
    sys   0.001 secs    0.009 secs
    CPU   386.2%        386.5%
So there is not much difference in user and sys time between the two cases, but the "throughput" in the cpu4quota100 case is twice that of the cpu8quota50 case.
Edit 2:
It seems changing the kernel governor for CPU frequency helped improve the throughput in the cpu8quota50 case.
The earlier numbers were obtained with the "powersave" frequency governor in use. With "powersave", the CPU frequency of the cores shot up to the maximum in the cpu4quota100 case, but stayed much lower in the cpu8quota50 case.
However, after changing the frequency governor to "performance", the CPU frequency in the cpu8quota50 case was also close to the maximum.
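The governor change itself is a one-liner per core (a sketch; needs root, and assumes the standard Linux cpufreq sysfs paths):

```shell
# Switch every core from "powersave" to the "performance" governor:
for g in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
  echo performance > "$g"
done
# Check the resulting frequencies while the benchmark runs:
grep "cpu MHz" /proc/cpuinfo
```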
For 32 threads running with "performance" as the frequency governor, I get the following numbers:

    threads  cpu4quota100  cpu8quota50
    32       175804        163831
So the difference has now come down from nearly 50% to just 6.8%.
But it's interesting to note the difference in the behavior of the "powersave" governor between the two cases, as mentioned above. I am not sure it is working as expected in the cpu8quota50 case.