
I am confused by the terminology used to describe Linux signal delivery. Most texts say things like "the signal is delivered to the process" or "the signal is delivered to the thread".

It is my understanding that a signal is "delivered" to a signal handler, which resides in a process, when the kernel calls that handler. The process itself is running asynchronously, and this "delivery" process is akin to a CPU calling an interrupt handler. The interrupt handler (signal handler) is not the process thread, nor any thread running under that process, correct? It is a separate thread of its own started by the kernel.

So the signal is not delivered to a thread or a process, but is delivered to a signal handler residing in the process and not necessarily associated with any specific thread. If this is not correct, please tell me, for example, the association between the signal handler and a pthread that justifies the terminology of "signal delivered to a pthread".

Toby Speight
Albert

3 Answers


A signal handler is just a function within a given process' address space. This function is executed whenever the signal is received. There's nothing special about it (although there are certain actions that should not be performed within a signal handler), and it does not reside in a special thread.

While signals are often described as being software interrupts, they aren't actually asynchronous.* When a signal is sent to a process, the kernel adds it to the process' pending signal set. It doesn't cause anything to happen immediately. The signal will only actually do anything at the next context switch† back to userspace (whether that's a syscall returning or the scheduler switching to that process). If a process were to, for whatever reason, never switch from kernel to user, the signal would be kept in the pending signal set and never acted upon.

When a process establishes a signal handler, it gives the kernel the address of a function. When the process is to receive a signal, the next transition from kernelspace to userspace does not restore the execution context from before the process entered the kernel (usually, the context is saved when entering the kernel and restored upon exiting it). Instead, the kernel "restores" execution at the location of the signal handler. When the signal handler returns, it returns into code which calls rt_sigreturn(), which restores the real execution context, allowing the process to continue where it left off.

When a process has multiple threads (i.e. there are multiple processes in a given thread group), the signal is sent to one of the threads in the thread group at random. This is because threads typically share memory and many other resources and run the same code.

* While they aren't asynchronous from the perspective of hardware, they are effectively asynchronous as far as userspace applications are concerned. This is why they are sometimes called software interrupts.

† When I refer to context switches, I mean privilege or process switches (i.e. both simple mode transitions between kernel and user within the same process and "true" context switches between processes or kernel threads).

forest
  • This answer is great, and explains what happens when the process is in kernel context at the time the signal is sent. But what about when the process is executing in user-space at the time? Are you able to clarify that, too? Thanks. – Toby Speight Jan 25 '23 at 11:47
  • @TobySpeight If the process is currently in userspace, the signal is added to the process' pending signal set. It won't be acted upon until a kernel -> user context switch. I believe if you enable certain real-time scheduling modes for a process and it never calls `yield()` and no hardware interrupts occur on that CPU, then the signal will never be acted on at all. – forest Jan 25 '23 at 11:48
  • I really didn't expect that, but it does make sense (any running process must eventually yield to the scheduler, and if it calls an interruptible system call such as `read()`, that will return immediately, so it all works eventually). – Toby Speight Jan 25 '23 at 11:52
  • With regard to delivery to "a thread at random", it may be worth mentioning there are ways to mask which signals a thread is interested in. This ties in with your remark about threads that never yield to the kernel: those threads should "mask" all signals, leaving other threads in the process to handle them. – Matthieu M. Jan 25 '23 at 13:02
  • *“There's nothing special about it”* There are actually several things special about it that I would not dismiss this easily. While it is true that it is not running in a separate thread, it can, for example, have its own stack, and it can interrupt the thread it is called on at any point. This is the reason why you can't safely call functions that would allocate memory, or do other things that need to be atomic in some way, and that actually restricts what you can do quite a lot. – G. Sliepen Jan 25 '23 at 13:38
  • `While signals are often described as being software interrupts, they aren't actually asynchronous.` They _are_ actually asynchronous in some sense of the word. Yes, they don't interrupt kernel code, or force the scheduler to switch to this process, but if only looking at plain old user code, they do _interrupt_ that just as a regular hardware IRQ would do (at least in the good old, simpler days, with much less virtualization and/or single-processing). From the POV of userland, they are asynchronous. +1, otherwise, fine answer. – AnoE Jan 25 '23 at 14:34
  • But there has to be some forced context switches once in a while? Otherwise `int main(){while(1);}` might never receive the signal? – Oskar Skog Jan 25 '23 at 17:15
  • @OskarSkog Yes. So it's completely silly to say they're not software interrupts. – user253751 Jan 25 '23 at 17:38
  • @OskarSkog All processes will naturally have context switches due to the scheduler. – forest Jan 25 '23 at 22:54
  • @AnoE That's true! I added a footnote to my answer to point that out, thanks. From the perspective of the process, even a syscall is merely a single instruction that sets various GPRs and may change the contents of memory, albeit an instruction that takes a long time to retire. The process has no idea that a context switch or mode transition occurs, so from its perspective, a signal is a true interrupt. – forest Jan 26 '23 at 02:03
  • Is it safe to assume that SIGKILL and SIGSTOP work quite differently? Or are they essentially the same as certain uncaught signals? – UncleCarl Jan 26 '23 at 17:41
  • @UncleCarl They're nothing more than signals whose default disposition can only be changed by PID 1, which means any other process can neither catch nor ignore them. – forest Jan 26 '23 at 23:34
  • `If the process is currently in userspace, the signal is added to the process' pending signal set. It won't be acted upon until a kernel -> user context switch` I don't think this is correct - the kernel actually will trigger the scheduler to interrupt a process that's currently in userspace "immediately" - https://elixir.bootlin.com/linux/v6.1.8/source/kernel/signal.c#L777 – KJ Tsanaktsidis Jan 27 '23 at 03:14
  • https://gist.github.com/KJTsanaktsidis/f7799c426ea3b681c76ccd3d2ed99e58 - on my system, this program dispatches the signal within about 50-100 µs, which is far shorter than the scheduler tick interval (I think that's 250 Hz, so 4 ms). – KJ Tsanaktsidis Jan 27 '23 at 03:19
  • "When a process has multiple threads (i.e. there are multiple processes in a given thread group), the signal is sent to one of the threads... " There is that phrase again, "sent to one of the threads". What does it mean for the signal to be "sent to" a thread, if the signal has its own handler function? The handler is a function inside a process with multiple threads. How does the signal get from the handler to the thread it is being "sent" to? – Albert Jan 30 '23 at 21:28
  • @Albert The kernel has a list of all processes. If PID 1234 sends a signal to PID 4321, all the kernel has to do is add that signal to 4321's pending signal set. Then when the kernel schedules that process, one of the first things it does before restoring its execution context is check if there's anything that needs to be done first, including checking for any pending signals. To say that a signal is "sent" is a bit of an analogy and isn't really accurate when you look at it from the perspective of the kernel itself. – forest Sep 01 '23 at 02:26

"The interrupt handler (signal handler) is not the process thread, nor any thread running under that process, correct?"

The kernel doesn't start a new thread to execute a signal handler. It executes the signal handler on an existing thread. We could say that the signal is delivered to that particular thread. Basically, the thread drops whatever it was doing before, and executes the signal handler. After the signal handler returns, it goes back to what it was doing before. (forest's answer goes into more detail about how exactly the kernel schedules this.) But a key difference between this and an ordinary function call is that you don't have control over when it happens. So for example, on a platform where 32-bit accesses are atomic and 64-bit accesses are not, it's possible that the thread is in the middle of writing to an int64_t variable when the signal handler invocation occurs. Therefore, even in a single-threaded program, the signal handler must guard against "the thread racing against itself", so to speak. Consequently, the set of operations you can perform safely inside a signal handler is very limited.

The sender of a signal can choose to send it to a particular thread, for example by calling tgkill, or target an entire process. When a signal is sent to a process, the kernel selects one of the threads in the process to deliver it to. See "What happens to a multithreaded Linux process if it gets a signal?"

Note that if the behaviour of the signal is to terminate the recipient (e.g. SIGKILL, or SIGTERM with the default disposition) then the entire process terminates even if you direct it at a particular thread.

Brian Bi

I have an answer to a very similar question on Stack Overflow: https://stackoverflow.com/questions/6949025/how-are-asynchronous-signal-handlers-executed-on-linux/6949377#6949377

Signals are delivered to a thread by suspending its execution, saving its execution context (register state, signal mask, etc.) to a ucontext_t object pushed to its stack, modifying the execution context so that the program counter points to the registered signal handler and so that the other aspects of the context conform to ABI requirements for entering a function, and resuming execution. The part of the context representing the return address has also been modified to point to code which will execute a sigreturn syscall, restoring state from the saved ucontext_t object.

  • Yes, but other than borrowing some thread's stack for a short time, how does the signal get sent to that thread? The thread itself doesn't know the signal happened since its execution was simply interrupted and restored. The only way any thread would know the signal happened is if the signal changed the value of some variable known to and observed by that thread. What difference does it make which stack was used? – Albert Jan 30 '23 at 21:34
  • @Albert: If your question is why it matters which thread it's delivered to, that probably merits a new question. If the signal handler wants to do AS-unsafe things, it needs to be delivered to a thread that's only performing AS-safe operations in its main flow of execution. If the signal handler does anything that involves blocking/waiting for a resource, delivery to the wrong thread could deadlock due to preventing the forward progress that would make the resource become available. Etc. – R.. GitHub STOP HELPING ICE Feb 05 '23 at 13:05