
I'm trying to disable local timer interrupts for low latency. I have full tickless mode enabled in the kernel config and I have the boot parameter nohz_full set for the cores in question.

However, when I look at the interrupt counts in /proc/interrupts, I see the local timer interrupts (the LOC line) counting up 1000 times a second per core, which means full tickless isn't working.
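For reference, this is how I measure the rate (a rough sketch; it assumes the CPU columns of the LOC: line start at field 2, so CPU1 is field 3):

```shell
# Sample the local-timer (LOC) interrupt count for one core twice,
# one second apart; a steady ~1000/s means the tick is still running.
cpu=1                                  # core to watch
col=$((cpu + 2))                       # "LOC:" is field 1, CPU0 is field 2
before=$(awk -v c="$col" '/^ *LOC:/ { print $c }' /proc/interrupts)
sleep 1
after=$(awk -v c="$col" '/^ *LOC:/ { print $c }' /proc/interrupts)
echo "LOC interrupts/s on CPU$cpu: $((after - before))"
```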

The tickless mode documentation says that for tickless to work, there must be only one running process on that core.

When I look at top, I see the following under a given core (core 1 in this example):

   19 root      RT   0     0    0    0 S  0.0  0.0   0:00.00  1 watchdog/1
   20 root      -2   0     0    0    0 S  0.0  0.0   0:02.15  1 rcuc/1
   21 root      RT   0     0    0    0 S  0.0  0.0   0:00.04  1 migration/1
   22 root      -2   0     0    0    0 S  0.0  0.0   0:00.25  1 ksoftirqd/1
   23 root      RT   0     0    0    0 S  0.0  0.0   0:00.00  1 posixcputmr/1
   24 root      20   0     0    0    0 S  0.0  0.0   0:00.00  1 kworker/1:0
   25 root       0 -20     0    0    0 S  0.0  0.0   0:00.00  1 kworker/1:0H

I do know that some of these are kernel threads. Are these the reason why my full tickless mode isn't working?

Nathan Doromal
  • Just out of curiosity, how much extra CPU time do you think you will be getting here? Is it worth going outside the "norm"? – mdpc Jun 05 '14 at 20:13
  • @mdpc, interruptions through scheduling and other means can be measured and easily add up to 1 to 2 % CPU time. On a tickless isolated core you eliminate those interruptions. But usually the point is not to increase your per-core throughput by 1 % or so but to minimize the latency (of applications with realtime/near-realtime requirements). A thread on a core with ticks gets interrupted by the scheduler, which costs you some microseconds and thus increases your latency by some microseconds. – maxschlepzig Sep 05 '19 at 20:51

1 Answer


The full tickless mode that is activated with e.g. nohz_full=cpux-cpuy indeed is only effective if there is just one runnable task on each nohz_full CPU:

Adaptive-ticks does not do anything unless there is only one runnable task for a given CPU, even though there are a number of other situations where the scheduling-clock tick is not needed.

(cf. Documentation/timers/NO_HZ.txt)

Thus, if you check a nohz_full CPU with ps, it makes sense to explicitly look for runnable tasks - e.g.:

$ ps -e -L -o cpuid,pid,lwp,state,pri,rtprio,class,wchan:20,comm \
      | awk '$1 == '$mycpunr

(i.e. look at the state column)

That means it's ok to have some additional tasks on a nohz_full CPU as long as they aren't runnable.
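For example, a variant of such a ps check that prints only runnable (state R) tasks on the CPU in question might look like this (a sketch; the CPU number 1 is just an example):

```shell
# List only runnable tasks (state R) currently on the given CPU;
# cpuid is field 1 and state is field 4 of the selected ps columns.
mycpunr=1   # CPU number of the nohz_full core to check
ps -e -L -o cpuid,pid,lwp,state,comm \
    | awk -v cpu="$mycpunr" '$1 == cpu && $4 == "R"'
```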

With just nohz_full=, nothing stops the kernel from scheduling user/kernel threads on the selected CPUs. Thus, one usually also isolates those CPUs to avoid any interference by other threads. For example with:

nohz_full=cpux-cpuy isolcpus=cpux-cpuy

(cf. Linux Kernel Parameters)

With those options a thread on an isolated nohz_full CPU can still be interrupted, e.g. by timers and RCU operations.

Thus, if you want to minimize the latency of your isolated thread you need to disable other sources of interruptions.

You can check /proc/timer_list for timers that are still active on isolated CPUs.
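For example, a sketch that extracts just the timer entries of one CPU section (the exact /proc/timer_list layout varies a bit between kernel versions, so the patterns may need adjusting):

```shell
# Print the timer entries registered for one CPU as listed in
# /proc/timer_list; sections start with a "cpu: N" line and timer
# entries look like " #0: <address>, <callback>, ...".
cpu=1
awk -v hdr="cpu: $cpu" '
    $0 == hdr        { on = 1; next }  # start of the wanted CPU section
    /^cpu:/          { on = 0 }        # the next CPU section ends it
    on && /#[0-9]+:/ { print }         # the timer entries themselves
' /proc/timer_list
```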

Common examples for timers that may show up on an isolated CPU are watchdog_timer_fn and a timer related to the machine check exception (MCE) functionality.

You can disable those interruptions by further kernel options, e.g.:

nowatchdog mce=ignore_ce

Looking at the /proc/interrupts counters is a good way to check for hardware induced interruptions. Another source of interruptions is softirqs; thus, one also has to check the /proc/softirqs counters.
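For example, a sketch that prints the softirq counters for a single CPU (the header line of /proc/softirqs lists the CPUs; in each data row the counter name is field 1 and CPU0 is field 2):

```shell
# Print the per-softirq counters of one CPU column of /proc/softirqs.
cpu=1
awk -v c=$((cpu + 2)) 'NR > 1 { printf "%-10s %s\n", $1, $c }' /proc/softirqs
```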

For example, to minimize RCU related interruptions on isolated CPUs, one can offload RCU callbacks to kernel threads, migrate them to a non-isolated CPU and free the isolated CPU from having to notify a callback thread by adding the kernel option:

rcu_nocb_poll

That option requires rcu_nocbs= to be effective, but nohz_full= already implies rcu_nocbs= for the specified CPUs.

Note that you have to explicitly move the offloaded RCU callback threads to a housekeeping CPU by setting the CPU affinities of those threads. For example with tuna (to CPU 0):

# tuna -U -t 'rcu*' -c 0 -m
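If tuna is not available, roughly the same can be done with pgrep and taskset (a sketch; it requires root and assumes the offloaded callback threads are named rcuo*, i.e. rcuob/N, rcuop/N):

```shell
# Pin every offloaded RCU callback thread (rcuob/*, rcuop/*, ...)
# to the housekeeping CPU 0.
for pid in $(pgrep '^rcuo'); do
    taskset -p -c 0 "$pid"
done
```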

The kernel document Documentation/kernel-per-CPU-kthreads.txt describes further sources of interruptions (a.k.a. OS Jitter) and shows how to locate them by running your test load with tracing enabled.

maxschlepzig
  • Small nit: `nohz_full` implies `rcu_nocbs`, so there's no need to specify it separately, see `Documentation/admin-guide/kernel-parameters.txt` and https://github.com/torvalds/linux/blob/6f38be8f2ccd9babf04b9b23539108542a59fcb8/kernel/rcu/tree_nocb.h#L1191 – Xyene Feb 08 '22 at 03:11
  • @Xyene yeah, I wrote that in my answer: 'That option requires `rcu_nocbs=` to be effective, but `nohz_full=` already implies `rcu_nocbs=` for the specified CPUs.' – maxschlepzig Feb 08 '22 at 21:33
  • Right you are, I missed that. "With those options a thread on an isolated `nohz_full` CPU still can be interrupted, e.g. by timers and RCU callbacks." suggested otherwise, which I think is not true, and which the rest of your answer agrees with? – Xyene Feb 10 '22 at 02:52
  • @Xyene Firstly, timers aren't affected by the parameters referenced before that quote. Yes, the wording RCU callbacks in that sentence isn't consistent and should read RCU operations. Meaning RCU stuff that happens unless `rcu_nocb_poll` is also specified. I'll update the answer. – maxschlepzig Feb 11 '22 at 20:27