Preferred syntax for two lines long pipe

Question

When writing a long pipeline, it is usually clearer to split it across two lines.

This long command line:

ruby -run -e httpd -- -p 5000 . 2>&1 | tee >(grep -Fq 'WEBrick::HTTPServer#start' && open localhost:5000)

Could be divided as:

ruby -run -e httpd -- -p 5000 . 2>&1 \
   | tee >(grep -Fq 'WEBrick::HTTPServer#start' && open localhost:5000)

Or:

ruby -run -e httpd -- -p 5000 . 2>&1 |
    tee >(grep -Fq 'WEBrick::HTTPServer#start' && open localhost:5000)

In short:

command1 \
    | command2

Or:

command1 |
    command2

I realize that this might be a style (opinion) issue, but: Is there a preferred way, and, if so, why?

My first instinct is to declutter (and clarify) the whole pipeline by predefining variables containing the strings starting "WEB" and "local". After that, folding may not even be required. — Paul_Pedant, Apr 08 '20 at 10:56
[Related](https://google.github.io/styleguide/shellguide.html#pipelines) recommends the opposite of the accepted answer. — schrodingerscatcuriosity, Apr 08 '20 at 13:38
@guillermochamorro Of course, that recommendation doesn't provide a rationale for using explicit line continuation, and certainly doesn't address the very real issue mentioned in the accepted answer. — chepner, Apr 08 '20 at 14:04
@chepner Yeah, I just wanted to add a different view on the matter (at least style-wise), but as you said, it's not explained why (I think the idea is that it's clearer when you read the code). BTW, I once spent hours trying to fix my code, and it was... an invisible space like the one shown in the answer ^^. Cheers! — schrodingerscatcuriosity, Apr 08 '20 at 14:09
Just to note: the accepted answer could change in the future if some other option gets more votes (or becomes "a better answer"). There is nothing saying that it could not change. — , Apr 08 '20 at 16:56
The linked portion of the Google shell style guide is also about `&&` and `||` usage, not just `|`. If you take those into consideration, I suspect that it recommends that style for readability (it's easier to see which commands are being combined by which operators without needing to find the end of the previous line). — jamesdlin, Apr 09 '20 at 00:38
style wise, I go back and forth on this one. It is convenient to have the visual clue at the start of the line that a cmd is in the pipeline. OTOH, it is really nice to be able to add comments after the pipe symbol at the end of the line which you cannot do with an escaped newline. The google style guide recommends putting the pipe symbol at the front of the line, and that's a pretty strong argument for doing the opposite. — William Pursell, Apr 09 '20 at 11:40

score 42 · Answer 1 · edited Apr 07 '20 at 22:48

Ask yourself: what would this do?

command1 \ 
   | command2

Can't see the difference? Neither can I, but the shell can. Look closely: there is a space after the \. The backslash escapes that space instead of the newline, so the line is no longer continued.

Therefore use the other form, as it is safer. It is shown here with the same error (a space after the | in this case), but here it does not cause a bug.

command1 | 
    command2
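
The difference can be checked with a small experiment. This is only a sketch: `tr a-z A-Z` stands in for command2, the file names are arbitrary, and the two test scripts are generated with printf so that the stray trailing space is explicit rather than invisible.

```shell
#!/bin/sh
# Build two tiny scripts with printf so the trailing space is explicit.
# 'tr a-z A-Z' is a stand-in for command2; the file names are arbitrary.
dir=$(mktemp -d)

# Form 1: backslash, then a stray space, then the newline.
printf '%s\n' 'echo hello \ ' '| tr a-z A-Z' > "$dir/leading.sh"

# Form 2: pipe, then a stray space, then the newline.
printf '%s\n' 'echo hello | ' 'tr a-z A-Z' > "$dir/trailing.sh"

# Capture what each script prints; the syntax error message is discarded.
leading_out=$(sh "$dir/leading.sh" 2>/dev/null)
leading_status=$?
trailing_out=$(sh "$dir/trailing.sh")

printf 'leading pipe:  [%s] (exit status %s)\n' "$leading_out" "$leading_status"
printf 'trailing pipe: [%s]\n' "$trailing_out"
rm -r "$dir"
```

In the first form, echo runs on its own with an extra space argument, its output never reaches tr, and the script then stops with a syntax error. In the second form the stray space is harmless and the output is upper-cased as intended.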

answered Apr 07 '20 at 22:35 by ctrl-alt-delor · edited Apr 07 '20 at 22:48 by schrodingerscatcuriosity

3

But the shell would catch that situation and give an error. So it's not a very strong reason. – gidds Apr 08 '20 at 16:25
11

@gidds the shell will NOT catch that as an error: `command \` will just pass a space as the first argument to command. That being said, this Q is pointless, and there's a double standard on this site, where perfectly technical & objective Qs are closed on sight, but "opinion collectors" like this thrive and fester. – Apr 08 '20 at 16:43
1

@mosvy I think gidds is referring to how the shell would raise a syntax error on `| command2`, indicating the file and line number it happened on. – JoL Apr 08 '20 at 16:47
@mosvy Please vote to close the question if **this Q is pointless**. – Apr 08 '20 at 16:49
@JoL You beat me to it! Yes, I was referring to the leading `|` causing an error. (I tried it: it does, in bash at least.) – gidds Apr 08 '20 at 16:49
7

@JoL but it will __still run__ `command ` without redirecting it through the pipeline, even if it errors out on the next line. The shell is different from Perl or Python (or most other languages) -- it will evaluate the script line by line, not parse it whole first and then execute it. – Apr 08 '20 at 17:10
@mosvy Sure, but I don't think that mistake is that common, compound that with the small chance of that command doing something dangerous because of an added space argument and misdirected output. Most commands would just raise an error on that argument. I've never experienced a scenario nor can I think of a practical one where it would have avoided a serious bug. There's worth in the trailing pipe, sure, but I don't think it's worth the readability issue of having to read to the end of the line to make sure it's a `|` and not a `&&` or a `||`, etc. – JoL Apr 08 '20 at 17:23
3

@mosvy There's also the issue that by this argument, one should never use line-continuation escapes. Writing multi-line pipes is not the only use for them. They're also useful for breaking up long simple commands that have lots of arguments. Are we going to stop doing that too because we can't trust ourselves to not put a space after it? – JoL Apr 08 '20 at 17:44
Just to look into the other side: Maybe we should "stop using line continuations" and build single commands arguments as arrays `arr+=( arg23 )` and then run `command "${arr[@]}"`. Or, `set -- "$@" arg23` and then `command "$@"`. But maybe we shouldn't. @JoL – Apr 08 '20 at 17:50
1

1) Backslashes are just plain ugly (error-prone, hard to read) and should be avoided whenever feasible. 2) Significant trailing whitespace in a language is a misfeature. Therefore, this is best. – jrw32982 Apr 08 '20 at 22:57
1

Yes, it is fragile, which is a very good reason to avoid it. OTOH, if your editor cannot make whitespace visible (in particular, leading & trailing whitespace), you should consider getting a better editor. ;) – PM 2Ring Apr 09 '20 at 07:59
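
The `set -- "$@" arg23` idea mentioned in the comments above could look roughly like the sketch below. The names `print_args` and `run_long_command` and the option values are hypothetical; `print_args` merely stands in for the real command so that the resulting argument list is visible.

```shell
#!/bin/sh
# print_args is a stand-in for the real command: it prints one argument per line.
print_args() {
    printf '<%s>\n' "$@"
}

# Build the argument list step by step instead of using line continuations.
# Inside a function, `set --` only changes the function's own positional
# parameters, so the caller's arguments are left alone.
run_long_command() {
    set --                          # start from an empty argument list
    set -- "$@" --verbose           # every step can carry its own comment
    set -- "$@" --name "two words"  # embedded spaces stay in one argument
    set -- "$@" --count 10
    print_args "$@"
}

run_long_command
```

No backslashes are involved, so a stray trailing space cannot change the meaning. In bash, the array form from the same comment (`arr+=( arg23 )` followed by `command "${arr[@]}"`) works the same way.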

score 16 · Answer 2 · answered Apr 08 '20 at 16:34

I'm going to disagree with most folks here; I always prefer to wrap before a joining operator such as a pipe:

command1 \
| command2

(You don't need to indent the second line; the pipe itself links it very obviously to the first.)

There are a few reasons for this:

It's easier to see the joiner; it doesn't get lost amongst the details of the line. (This is especially important if the line is long, and the joiner might have got scrolled out of sight, or lost amongst line wrapping.) When you scan code quickly, you look down the left-hand side, because that's where the overall structure is: in the indentation, the braces, or whatever a particular language uses. Pipes and other joiners are important to the structure, so they too should be on the left.
It lines up if you're spanning 3 or more lines. Again, this makes the structure of the pipeline easy to take in at a glance.
It's closer to the way we think. (This is the most subtle and contentious point…) If you're reading a list out slowly, so someone can write it down, you'd say “[Item 1]… (pause)… and [Item 2]… (pause)… and [Item 3].”; it would feel unnatural to say “[Item 1] and… (pause)… [Item 2] and… (pause)… [Item 3].” That's because we think of the joiner as attaching to the following item more than the previous one. (You can think of the minus sign in arithmetic in similar terms; it works like addition, but connects more closely to the following number by negating it.) Code is easier to follow when it reflects our thinking.

I've tried both ways in many languages over the years, and have found that putting joiners on the following line really does help in most cases.

Besides "better looking", is there any bug or problem that this option would avoid? — , Apr 08 '20 at 16:39
You might want just to drop point 3: that's an argument for Japanese speakers to do it the other way. (I.e., it's based on your local/community convention, not logic about how the world works.) — cjs, Apr 09 '20 at 00:13
@Isaac Any bug that might be put in by a programmer misreading code. — cjs, Apr 09 '20 at 00:16

JoL · Answer 3 · 2020-04-08T17:52:28.943

Well, just to avoid it looking like nobody would prefer:

command1 \
   | command2

I'm going to say that I do.

I see the trailing space problem raised by ctrl-alt-delor as a non-issue. Editors can warn about it; git warns about it. To top it off, the shell would raise a syntax error on | command2, providing the user with the file and line number of the error and cease interpreting the rest of the file:

$ cat f.sh
#!/bin/bash

echo foo \ 
| command2

echo bar
$ ./f.sh
foo  
./f.sh: line 4: syntax error near unexpected token `|'
./f.sh: line 4: `| command2'
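
For anyone whose editor does not flag the problem, a plain grep can serve as a rough substitute. This is a sketch, not what git or any editor actually does: `find_broken_continuations` is a hypothetical helper name, and the pattern simply looks for a backslash followed only by whitespace at the end of a line.

```shell
#!/bin/sh
# find_broken_continuations is a hypothetical helper name.
# It prints "line-number:line" for every line in which a backslash is
# followed only by spaces or tabs before the end of the line.
find_broken_continuations() {
    grep -nE '\\[[:space:]]+$' "$1"
}

# Recreate a script like f.sh; printf makes the stray space explicit.
tmp=$(mktemp)
printf '%s\n' '#!/bin/bash' '' 'echo foo \ ' '| command2' '' 'echo bar' > "$tmp"

# Reports line 3, the echo line with the space after the backslash.
find_broken_continuations "$tmp"
rm -f "$tmp"
```

The pattern is deliberately simple, so it would also flag an escaped backslash followed by a space, which is legitimate. Treat any hit as a hint to look at the line, not as proof of a bug.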

There's also the fact that there are more uses for line-continuation escapes. For example, to break simple commands that have many arguments:

ffmpeg \
  -f x11grab \
  -video_size "$size" \
  -framerate "${framerate:-10}" \
  -i "${DISPLAY}${offset}" \
  -c:v ffvhuff \
  -f matroska \
  -

Should we avoid such usage too because we can't trust ourselves not to put a space after the escape?

My preference is purely a matter of readability and quite subjective. Here's a real-life example from my shell history (with details substituted with foobar):

org-table-to-csv foobar.org \
| cq +H -q "
  select foo
    from t
    where bar = 'baz'
      and foo != ''" \
| sed -r 's/^|$/'\''/g' \
| sed -r ':b;$!{N;bb};s/\n/, /g'

Compare to:

org-table-to-csv foobar.org |
  cq +H -q "
    select foo
      from t
      where bar = 'baz'
        and foo != ''" |
  sed -r 's/^|$/'\''/g' |
  sed -r ':b;$!{N;bb};s/\n/, /g'

Here's another:

sed 's/ .*//' <<< "$blame_out" \
| sort \
| uniq \
| tee >(sed "s/^/from pipe before grep filtering: /" > /dev/tty) \
| grep -vF "$(git show -s --format=%h "$from_commit")" \
| tee >(sed "s/^/from pipe before git show: /" > /dev/tty) \
| xargs git show -s --format='%cI %h' \
| tee >(sed "s/^/from pipe after git show: /" > /dev/tty) \
| sort -k1 \
| tail -1 \
| cut -d' ' -f2

Compare to:

sed 's/ .*//' <<< "$blame_out" |
  sort |
  uniq |
  tee >(sed "s/^/from pipe before grep filtering: /" > /dev/tty) |
  grep -vF "$(git show -s --format=%h "$from_commit")" |
  tee >(sed "s/^/from pipe before git show: /" > /dev/tty) |
  xargs git show -s --format='%cI %h' |
  tee >(sed "s/^/from pipe after git show: /" > /dev/tty) |
  sort -k1 |
  tail -1 |
  cut -d' ' -f2

I must say that if the bug raised by ctrl-alt-delor is a **non-issue** then yes, the whole question falls into the realm of preferences and opinion, in my humble opinion as well. — , Apr 08 '20 at 17:11
I try to avoid long lines. I will put things into variables to make them shorter (I use long-ish variable names, but it still makes it shorter). But sometimes I do line continuation just as you have shown. My dockerfiles have a lot of them. And yes I use an editor to highlight errors, and `shellcheck`. (+1 by the way. As I like your argument.) — ctrl-alt-delor, Apr 08 '20 at 18:13
It is a very long command without comments. Can you add comments above each line explaining what it does? — Ole Tange, Apr 08 '20 at 18:26

score 7 · Answer 4 · edited Apr 10 '20 at 19:59

I thought the answer to this was easy, but I can see @JoL and @gidds disagree with me.

My brain prefers reading a line and not having to scan the next line \
:

  foo bar baz ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... \

In the above I will have to see \
, what is on line 2 \
, before I can tell \
, what the command does \
. Maybe the command is complete \
? Or maybe the command continues \
  on the next line \
?

To me it is much easier to read,
if \ is only used,
when a command cannot fit on a line.

Reading through my code, I also see comments as an issue:

foo ... ... ... ... ... ... ... ... |
    # Now this does bar
    bar ... ... ... ... ... ... ... ... ||
    # And if that fails: fubar
    fubar

I am not sure how you would do comments in the middle of a pipeline at all if you use \ + newline before | or || or &&. If that is not possible, I think this is the most important problem. Code is not maintainable without comments, and comments should normally be as close to the code as possible to encourage updating the documentation when you change the code.
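
To make the comparison concrete, here is a small sketch of both styles with comments. The function names are hypothetical and the data is made up. The second function relies on a workaround that is sometimes seen in shell scripts, namely hiding each comment in an empty command substitution; it is shown only to illustrate what the \ + newline style would need, not as a recommendation.

```shell
#!/bin/sh
# Both function names are hypothetical; each prints the alphabetically
# first of three made-up names.

# Trailing operators: comment lines can sit between the stages, because
# the shell skips newlines and comments after a |.
first_trailing() {
    printf '%s\n' banana apple cherry |
        # Sort the names alphabetically
        sort |
        # Keep only the first one
        head -n 1
}

# Leading operators: nothing may follow the backslash, so each comment is
# tucked into a command substitution that expands to nothing.
first_leading() {
    printf '%s\n' banana apple cherry `# Unsorted input` \
    | sort `# Sort the names alphabetically` \
    | head -n 1 `# Keep only the first one`
}

first_trailing
first_leading
```

Both functions print the same result, but the second pays for every comment with an extra command substitution and is arguably harder to read.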

Emacs does the indentation for me automatically, so the indentation is not even an extra burden:

# This is indented automatically in emacs
ruby -run -e httpd -- -p 5000 . 2>&1 |
    # Send the output to the screen and to grep
    tee >(grep -Fq 'WEBrick::HTTPServer#start' &&
              # If grep matches, open localhost:5000
              open localhost:5000) 
# Here is where emacs indents the next command to

Clever! But I'd argue that in this context commas, full stops, and question marks are terminators. Not separators. :-) — gidds, Apr 08 '20 at 17:53
