
Yeah, I know what you are thinking: "Who on earth names their file `a`b?"

But let us assume you do have a file called `a`b (possibly made by a crazy Mac user - obviously not by you), and you want to rsync that. The obvious solution:

rsync server:'./`a`b' ./.;
rsync 'server:./`a`b' ./.;

gives:

bash: line 1: a: command not found
rsync: [sender] link_stat "/home/tange/b" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1865) [Receiver=3.2.7]
rsync: [Receiver] write error: Broken pipe (32)

Even:

$ rsync 'server:./\`a\`b' ./.;
bash: line 3: a\: command not found
rsync: [sender] link_stat "/home/tange/\b" failed: No such file or directory (2)
:

What is the rsync command I should be running?

$ rsync --version
rsync  version 3.2.7  protocol version 31
AdminBee
Ole Tange
  • Finally! Our day has come, the day us naysayers, us preachers of the chorus of "you can't assume your file names to be free of delimiters and characters with special meanings to your shell" are proven right – Marcus Müller Mar 21 '23 at 22:18
  • Can you try putting the `'` around the whole argument, i.e., `rsync 'server:./`a`b' ./.;`? – Marcus Müller Mar 21 '23 at 22:19
  • This is slightly terrifying; who would have thought that calling rsync like that could easily run arbitrary commands on the server? – dhag Mar 21 '23 at 22:43
  • @MarcusMüller: The original string and the one you proposed are identical to the shell (so that rsync will behave identically on either). – dhag Mar 21 '23 at 22:44
  • @dhag well the thing is I couldn't believe `rsync` would be running command substitution on the server either, so that's why I presumed something had to be strange here. – Marcus Müller Mar 21 '23 at 22:48
  • Someone that has the broken version: would `--protect-args` help here? I only use it to stop space splitting, but it says "without allowing the remote shell to interpret them", so.. – Izkata Mar 22 '23 at 14:01
  • @MarcusMüller, I say we ban any characters from filenames outside the "portable filename character set". Or at least control characters, ASCII punctuation and maybe blanks too. (plus initial hyphens in any case) – ilkkachu Mar 22 '23 at 14:40
  • Obligatory link: https://dwheeler.com/essays/fixing-unix-linux-filenames.html – ilkkachu Mar 22 '23 at 14:40
  • @ilkkachu you can infer from my last name how fond I am of the idea of white-list character sets! (And I'm not even among the roughly half of humankind whose native script isn't Latin!) – Marcus Müller Mar 22 '23 at 15:26
  • @MarcusMüller. I'm Finnish, I know the problem. But the issue is mostly ASCII control and special chars, leaving the door open for most of Unicode (incl. ü and å etc.) would be fine. Also the other thing is that it's not really necessary to be able to put everything in filenames, we could store the full document title on the application level (or encode special chars in filenames somehow). We already need to do that for slashes, so no use pretending filenames are something one can put anything in. – ilkkachu Mar 22 '23 at 15:40
  • @ilkkachu I just really think that we're doing better if we fix the software rather than the data (which is file names, in this case) :) That's basically a philosophical opinion there, in which I think the two of us simply differ! Also, I think if we don't solve the software problem but restrict the data today, then we'll still be stuck with about 4 decades of data already, which we still need to deal with, so we cannot choose to *not* solve the problem robustly on the software side. And if we've solved it for "old" data, why are we constraining "new" data? – Marcus Müller Mar 22 '23 at 15:46
  • @MarcusMüller, yes that's the problem, we _can't_ solve it in software for the old data. Filenames just aren't 8-bit clean, unless we go about changing _all_ the filename APIs (and ABIs). With the amount of text-based scripts and filename listings there are, fixing the data seems way easier than fixing the software. (and if a project like rsync can't get it right...) Old data doesn't seem much of an issue: if it didn't produce problems before, it's not likely to spawn any new ones unless moved to a new environment, and then it can just be translated to a safer format. – ilkkachu Mar 22 '23 at 16:25
  • One "solution" here would be to install a command `a` that just prints `a` between backquotes. (Untested except with rsync 3.2.3, and not practical in general, of course.) – dhag Mar 22 '23 at 17:04
  • @ilkkachu ¥ couldn't be allowed on filenames because [it's a path separator on Windows](https://news.ycombinator.com/item?id=29177000) – user253751 Mar 23 '23 at 02:07
  • @user253751, because it's the backslash in SHIFT-JIS, so same issue as with `/` on unixen. Windows folks also do fine without stuff like `?*<>:` allowed in filenames. – ilkkachu Mar 23 '23 at 07:47

3 Answers


Having bisected this manually, I found that it is a bug in rsync. It is fixed by commit 5c93dedf4538 ("Add backtick to SHELL_CHARS."), which will be in the upcoming rsync 3.2.8 (not yet released). It was introduced by commit 6b8db0f6440b ("Add an arg-protection idiom using backslash-escapes"), which is in 3.2.4.

As a mitigation, the --old-args option restores the old arg parsing behaviour:

rsync --old-args 'server:./\`a\`b' .
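
To see why one missing character in the escape set matters, here is a simplified model of backslash-escaping an argument before a remote shell parses it. This is only a sketch and not rsync's code: `escape_chars` and both character sets are made up for illustration, and `sh -c` stands in for the remote login shell.

```sh
#!/bin/sh
# Illustrative model only, NOT rsync's implementation.
# escape_chars SET STRING: put a backslash before every character of STRING
# that occurs in SET. The body runs in a subshell so no variables leak.
escape_chars() (
    set_=$1 rest=$2 out=
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}      # first character of $rest
        rest=${rest#?}             # everything after it
        case $set_ in
            *"$c"*) out="$out\\$c" ;;
            *)      out="$out$c" ;;
        esac
    done
    printf '%s\n' "$out"
)

# Assumed character sets; the second one additionally contains the backtick.
incomplete_set='\ $*?"'
complete_set="$incomplete_set\`"

name='./`a`b'
good=$(escape_chars "$complete_set" "$name")
bad=$(escape_chars "$incomplete_set" "$name")

printf '%s\n' "$good"                     # prints ./\`a\`b
printf '%s\n' "$bad"                      # prints ./`a`b (nothing escaped)

# Stand-in for the remote shell parsing the command line it was sent.
sh -c "printf '%s\n' $good"               # prints ./`a`b
sh -c "printf '%s\n' $bad" 2>/dev/null    # tries to run "a", prints ./b

# Escaping by hand does not help while the backtick is missing from the set:
escape_chars "$incomplete_set" './\`a\`b' # prints ./\\`a\\`b
```

In this model the last line shows the hand-written backslash being escaped itself while the backtick stays live, which is consistent with the `a\: command not found` error in the question.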
Chris Down

It is a version issue. It seems to depend on the client version, not the server version.

Something broke between v3.2.3 and v3.2.7.

OK:

$ rsync-v3.2.3 --rsync-path=rsync-v3.2.7  'server:./\`a\`b' ./.;
$ rsync-v3.2.3 --rsync-path=rsync-v3.2.3  'server:./\`a\`b' ./.;
$ rsync-v3.2.3 --rsync-path=rsync-v3.2.3  server:./"'"'`a`'"'"b ./.;
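
The third command works because the path is quoted once for the local shell and once more for the remote one. A minimal sketch of producing that second layer mechanically is shown here; `sh_quote` is a hypothetical helper, and it assumes the remote login shell is POSIX-like and that the client passes the argument through unchanged (rsync 3.2.3 as above, or presumably a newer client with --old-args).

```sh
#!/bin/sh
# Hypothetical helper: wrap STRING in single quotes for a POSIX-like shell,
# turning each embedded ' into '\'' . Runs in a subshell to keep variables local.
sh_quote() (
    rest=$1 out=
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}      # first character of $rest
        rest=${rest#?}             # everything after it
        case $c in
            "'") out="$out'\\''" ;;
            *)   out="$out$c" ;;
        esac
    done
    printf "'%s'\n" "$out"
)

quoted=$(sh_quote './`a`b')
printf '%s\n' "$quoted"          # prints './`a`b' including the quotes

# Stand-in for the remote shell: the backticks arrive as literal characters.
sh -c "printf '%s\n' $quoted"    # prints ./`a`b

# The transfer itself would then look like this (not run here; "server" is
# the host from the question):
#   rsync "server:$quoted" ./
```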

Fails:

$ rsync-v3.2.7 --rsync-path=rsync-v3.2.7  'server:./\`a\`b' ./.;
bash: line 3: a\: command not found
rsync: [sender] link_stat "/home/tange/\b" failed: No such file or directory (2)
$ rsync-v3.2.7 --rsync-path=rsync-v3.2.3  'server:./\`a\`b' ./.;
bash: line 3: a\: command not found
rsync: [sender] link_stat "/home/tange/\b" failed: No such file or directory (2)
$ rsync-v3.2.3 --rsync-path=rsync-v3.2.3  'server:./`a`b' ./.;
bash: line 1: a: command not found
rsync: [sender] link_stat "/home/tange/b" failed: No such file or directory (2)
$ rsync-v3.2.3 --rsync-path=rsync-v3.2.7  'server:./`a`b' ./.;
bash: line 1: a: command not found
rsync: [sender] link_stat "/home/tange/b" failed: No such file or directory (2)

But seriously: needing to quote ` twice seems like a disaster waiting to happen.

Thanks to @dhag for pointing to the issue.

Unfortunately, this does not answer how to do the transfer with v3.2.7.
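
Since the behaviour follows the client version, a script could check that version before deciding how to quote. The sketch below assumes the affected range is 3.2.4 through 3.2.7, which is inferred from this thread and not from rsync's release notes; `is_affected_version` is a hypothetical name, and the version line is the one shown in the question.

```sh
#!/bin/sh
# Hypothetical check: succeed if VERSION lies in 3.2.4 .. 3.2.7, the range
# this thread reports as affected (an assumption, not an official statement).
is_affected_version() (
    IFS=.
    set -- $1                      # split "3.2.7" into 3 2 7
    major=$1 minor=$2 patch=${3:-0}
    patch=${patch%%[!0-9]*}        # drop suffixes such as "pre1"
    [ "$major" -eq 3 ] && [ "$minor" -eq 2 ] &&
        [ "${patch:-0}" -ge 4 ] && [ "${patch:-0}" -le 7 ]
)

# Sample line copied from the question; on a real system it would come from
# the first line of `rsync --version`.
version_line='rsync  version 3.2.7  protocol version 31'
client_version=$(printf '%s\n' "$version_line" | awk '{print $3}')

if is_affected_version "$client_version"; then
    echo "rsync $client_version: backticks in remote paths need extra care"
fi
```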

AdminBee
Ole Tange

If you don't have to support versions of rsync prior to 3.0.0, use --secluded-args (aka -s, formerly known as --protect-args). Then, when doing rsync over rsh/ssh, you don't have to worry about how the file names may be interpreted by the remote user's login shell, which may be anything, so quoting or escaping properly is virtually impossible. From the manual (here in 3.2.7):

  --secluded-args, -s
         This  option  sends all filenames and most options to the remote
         rsync via the protocol (not the remote shell command line) which
         avoids  letting the remote shell modify them.  Wildcards are ex‐
         panded on the remote host by rsync instead of a shell.

         This is similar to the default backslash-escaping of  args  that
         was  added  in 3.2.4 (see --old-args) in that it prevents things
         like space splitting  and  unwanted  special-character  side-ef‐
         fects.  However, it has the drawbacks of being incompatible with
         older rsync versions (prior to 3.0.0) and of  being  refused  by
         restricted shells that want to be able to inspect all the option
         values for safety.

         This option is useful for those times that you  need  the  argu‐
         ment's character set to be converted for the remote host, if the
         remote shell is incompatible with the default backslash-escaping
         method, or there is some other reason that you want the majority
         of the options and arguments to bypass the command-line  of  the
         remote shell.

         If you combine this option with --iconv, the args related to the
         remote side will be translated from  the  local  to  the  remote
         character-set.   The  translation  happens before wild-cards are
         expanded.  See also the --files-from option.

         You may also control this setting via the RSYNC_PROTECT_ARGS en‐
         vironment  variable.   If  it has a non-zero value, this setting
         will be enabled by default, otherwise it will be disabled by de‐
         fault.  Either state is overridden by a manually specified posi‐
         tive or negative version of this option (note  that  --no-s  and
         --no-secluded-args are the negative versions).  This environment
         variable is also superseded by a non-zero RSYNC_OLD_ARGS export.

         This option conflicts with the --old-args option.

         This option used to be called --protect-args (before 3.2.6)  and
         that older name can still be used (though specifying it as -s is
         always the easiest and most compatible choice).
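
Applied to the file from the question, that could look like the sketch below. `fetch_secluded` and the `RSYNC` override are hypothetical conveniences (not rsync features) so the command line can be previewed with `echo`; only `-s` itself comes from the manual. The single quotes protect the backticks from the local shell, and `-s` should keep the remote shell from ever seeing them.

```sh
#!/bin/sh
# Hypothetical wrapper: copy one remote path using secluded args.
# $1 = host, $2 = remote path, $3 = local destination.
# RSYNC is an illustrative override, used here only to preview the command.
fetch_secluded() {
    ${RSYNC:-rsync} -s "$1:$2" "$3"
}

# Preview the arguments that would be passed, without contacting any host.
RSYNC=echo fetch_secluded server './`a`b' ./
# prints: -s server:./`a`b ./

# The real transfer would be (not run here):
#   fetch_secluded server './`a`b' ./
# which is the same as typing:
#   rsync -s 'server:./`a`b' ./
```

Per the manual text above, exporting a non-zero RSYNC_PROTECT_ARGS should enable the same behaviour by default.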
Stéphane Chazelas