
I've been reading up about how pipes are implemented in the Linux kernel and wanted to validate my understanding. If I'm incorrect, the answer with the correct explanation will be selected.

  • Linux has a VFS called pipefs that is mounted in the kernel (not in user space)
  • pipefs has a single super block and is mounted at its own root (pipe:), alongside /
  • pipefs cannot be viewed directly unlike most file systems
  • The entry to pipefs is via the pipe(2) syscall
  • The pipe(2) syscall used by shells for piping with the | operator (or manually from any other process) creates a new file in pipefs which behaves pretty much like a normal file
  • The file on the left hand side of the pipe operator has its stdout redirected to the temporary file created in pipefs
  • The file on the right hand side of the pipe operator has its stdin set to the file on pipefs
  • pipefs is stored in memory and through some kernel magic, shouldn't be paged

Is this explanation of how pipes (e.g. ls -la | less) function pretty much correct?

One thing I don't understand is how something like bash would set a process' stdin or stdout to the file descriptor returned by pipe(2). I haven't been able to find anything about that yet.
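Some of the pipefs points in the list above can be spot-checked from user space. What follows is a minimal Python sketch, not a definitive test: it assumes Linux with /proc mounted, and `describe_pipe` is a hypothetical helper name chosen for illustration. It creates a pipe, asks the kernel for the file type and inode of the read end, and reads the /proc symlink for that descriptor, which on Linux names the pipe as `pipe:[<inode>]` rather than as a path under /.

```python
import os
import stat


def describe_pipe():
    """Create a pipe and report what the kernel says about its read end."""
    read_fd, write_fd = os.pipe()
    try:
        info = os.fstat(read_fd)
        is_fifo = stat.S_ISFIFO(info.st_mode)
        # Assumption: Linux with /proc mounted. The symlink target then looks
        # like "pipe:[<inode>]" instead of a path in a mounted directory tree.
        proc_link = f"/proc/self/fd/{read_fd}"
        target = os.readlink(proc_link) if os.path.islink(proc_link) else None
        return is_fifo, info.st_ino, target
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    is_fifo, inode, target = describe_pipe()
    print("is FIFO:", is_fifo)
    print("inode:", inode)
    print("/proc link target:", target)
```

If /proc is not available, the third value is simply `None`; the FIFO check through `fstat` does not depend on it.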

Brandon
  • Note that you're talking about two considerably different layers of things with the same name. The `pipe()` kernel call along with the machinery that supports it (`pipefs`, etc) is much lower level than the `|` operator offered in your shell. The latter is usually implemented using the former, but it doesn't have to be. – Greg Hewgill Aug 04 '14 at 23:28
  • Yes, I am specifically referring to the lower level operations, with the assumption that the `|` operator is just calling `pipe(2)` as a process like bash does. – Brandon Aug 04 '14 at 23:31
  • See also [What's the difference between "Redirection" and "Pipe"?](https://askubuntu.com/a/1074550/295286) – Sergiy Kolodyazhnyy Sep 12 '18 at 20:18

1 Answer


Your analysis so far is generally correct. The way a shell might set the stdin of a process to a pipe descriptor could be (pseudocode):

pipe(p) // create a new pipe: p[0] is the read end, p[1] is the write end
fork() // spawn a child process; the remaining steps run in the child
    close(p[1]) // close the write end of the pipe in the child
    dup2(p[0], 0) // duplicate the read end on top of fd 0 (stdin)
    close(p[0]) // close the original read descriptor, which is no longer needed
    exec() // run a new program with the new descriptors in place
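The same sequence can be tried out as a runnable program. This is only a sketch in Python, whose `os.pipe`, `os.fork`, `os.dup2` and `os.execvp` are thin wrappers over the corresponding system calls; it is not how any particular shell is written. The function name `run_with_stdin_from_pipe` and the child command used in the demo are assumptions made for illustration. The parent plays the left-hand side of the pipeline by writing into the pipe itself.

```python
import os
import sys


def run_with_stdin_from_pipe(argv, data):
    """Run argv with stdin connected to a pipe, feed it data, return its exit code."""
    read_fd, write_fd = os.pipe()      # read_fd is p[0], write_fd is p[1]
    pid = os.fork()
    if pid == 0:                       # child process
        try:
            os.close(write_fd)         # the child only reads from the pipe
            if read_fd != 0:
                os.dup2(read_fd, 0)    # make fd 0 (stdin) refer to the pipe
                os.close(read_fd)      # fd 0 is now the only copy needed
            os.execvp(argv[0], argv)   # does not return on success
        finally:
            os._exit(127)              # reached only if setup or exec failed
    # parent process
    os.close(read_fd)                  # the parent only writes
    os.write(write_fd, data)           # small payload, fits in the pipe buffer
    os.close(write_fd)                 # the child sees EOF once this is closed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)        # negative value: killed by a signal


if __name__ == "__main__":
    # Hypothetical child: a Python one-liner that upper-cases its stdin.
    child = [sys.executable, "-c",
             "import sys; print(sys.stdin.read().upper(), end='')"]
    code = run_with_stdin_from_pipe(child, b"hello from the parent\n")
    print("child exit code:", code)
```

Closing the write end in the parent matters as much as the `dup2` in the child: as long as any process still holds the write end open, the reader never sees end-of-file.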
Cristian Ciupitu
Greg Hewgill
  • Thanks! Just curious why the `dup2` call is needed, and you can't just directly assign the pipe descriptor to stdin? – Brandon Aug 04 '14 at 23:42
  • The caller doesn't get to choose what the numeric value of the file descriptor is when it is created in `pipe()`. The `dup2()` call allows the caller to copy the file descriptor to a specific numeric value (needed because 0, 1, 2 are stdin, stdout, stderr). That is the kernel equivalent of "assigning directly to stdin". Note that the C runtime library global variable `stdin` is a `FILE *`, which is not kernel related (although it is initialised to be connected to descriptor 0). – Greg Hewgill Aug 04 '14 at 23:43
  • Great answer! I am a little lost in the details. Why do you close the original pipe descriptor before running `exec()`? Once `dup2` returns, wouldn't that descriptor point to fd 0, so that closing it closes fd 0? Then how can the child process read from its stdin? – user1559897 Dec 07 '18 at 14:33
  • @user1559897: The `dup2` call does not change the descriptor it copies from. Instead, it makes that descriptor and `0` refer to the same kernel object (the pipe). Since the child process doesn't need two stdin handles (and wouldn't know the number of the original descriptor anyway), the original is closed before `exec`. – Greg Hewgill Dec 07 '18 at 16:56
  • @GregHewgill Gotchu. Thx! – user1559897 Dec 07 '18 at 18:36
  • How does the shell decide when to terminate the processes on either end? Does it just kill both ends as soon as one of them finishes (as it appears to when I do `yes|head -1`)? – Ben Sep 28 '20 at 00:17
  • @Ben: Look up the `SIGPIPE` signal. In your example, when `head` terminates, `yes` will receive a `SIGPIPE`, and the default handler terminates the process. – Greg Hewgill Sep 28 '20 at 00:34
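The `SIGPIPE` behaviour from the last comment can be reproduced without a shell. The sketch below is an illustration under stated assumptions, not a model of how `yes` or `head` are implemented: the function name `write_to_pipe_without_reader` is hypothetical, and the read end is closed before forking so that the writer is guaranteed to find no reader. Python ignores `SIGPIPE` at startup, so the child restores the default action to behave like an ordinary C program.

```python
import os
import signal


def write_to_pipe_without_reader():
    """Fork a writer whose pipe has no reader; return the signal that killed it."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)                  # no reader is left, as after `head` exits
    pid = os.fork()
    if pid == 0:                       # child: plays the role of `yes`
        # Restore the default SIGPIPE action, which terminates the process.
        signal.signal(signal.SIGPIPE, signal.SIG_DFL)
        try:
            os.write(write_fd, b"y\n") # the kernel delivers SIGPIPE here
        finally:
            os._exit(0)                # reached only if no signal was delivered
    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return os.WTERMSIG(status)
    return None


if __name__ == "__main__":
    sig = write_to_pipe_without_reader()
    print("writer killed by:", signal.Signals(sig).name if sig else None)
```

With `SIGPIPE` left ignored, the same `os.write` would instead fail with `BrokenPipeError` (errno `EPIPE`), which is the other way a writer can learn that the reader is gone.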