
I'm trying to modify a path stored in a variable ($var) through sed. In truth, I need to replace $(dirname "$var") for reasons related to the purpose of my script.

An example of var is var=/dir/dir xyz/file.txt, while filename contains several paths to be substituted, which could be in any position (near the middle of a line, at the beginning, etc.). Below is an example of the contents of filename:

dog foo/bar /dir/dir xyz/file.txt
a b c d /dir/dir xyz/file.txt x y z
/dir/dir xyz/file.txt 1234
{[(/dir/dir xyz/file.txt

What I want is

dog foo/bar .
a b c d . x y z
. 1234
{[(.

I tried the following

sed "s|"$(dirname "$var")"|.|g" "$filename"

obtaining sed: -e expression #1, char 31: unterminated `s' command

Could you help me? Of course, you could suggest another way besides sed.

  • "not sure if I should enclose also the $() with the double quotes `""`)" -- _yes._ – ilkkachu Mar 27 '23 at 13:53
  • see [When is double-quoting necessary?](https://unix.stackexchange.com/questions/68694/when-is-double-quoting-necessary) and [Quoting within $(command substitution) in Bash](https://unix.stackexchange.com/q/118433/170373). Here, you're explicitly ending the quoted part before the command substitution and restarting it after it. Note that regardless of quotes, you'll have issues there if the filename contains characters special to sed, like `.*`, or backslash, or the `|` you used as separator – ilkkachu Mar 27 '23 at 13:53
  • @ilkkachu it's not working even if I double quote `$(dirname "$VAR")` – user9952796 Mar 27 '23 at 14:06
  • Don't use any variation of `--in-place` while testing or in your question - add that on your own after you get an answer to your problem if you like as it's completely irrelevant and so just clutters your question. – Ed Morton Mar 27 '23 at 14:10
  • What do you have in `VAR`? How are you setting it? Show how you run it. _Show what you're doing and what happens._ Would the output of `dirname` ever be `/dir/dir xyz/file.txt`? – ilkkachu Mar 27 '23 at 14:12

2 Answers

In:

sed "s|"$(dirname "$var")"|.|g" "$filename"

The s| and |.|g are being quoted, but the $(dirname "$var") is out of the quotes, and is therefore subject to split+glob.

sed -e "s|$(dirname -- "$var")|.|g" -- "$filename"

Would fix that (also adding the missing --s), but that's still wrong as the left-hand side of the s/lhs/rhs/flags in sed is interpreted as a regex (not to mention that you can run into serious problems if $var contained | characters; and sed works with text, while file names are not guaranteed to be).

For instance, a . in the pattern matches any character, not just a literal dot.
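Both effects can be observed without touching any file. The following is a minimal sketch, assuming the example value of var from the question; the variable names unquoted_count, quoted_count and dot_demo are only illustrative:

```shell
var='/dir/dir xyz/file.txt'

# Unquoted: the output of $(...) is split on the space, so sed would
# receive two arguments instead of one script.
unquoted_count=$(set -- "s|"$(dirname -- "$var")"|.|g"; echo "$#")

# Quoted: the whole script stays a single argument.
quoted_count=$(set -- "s|$(dirname -- "$var")|.|g"; echo "$#")

echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count argument(s)"

# The left-hand side is still a regex: the "." in a.b matches the X.
dot_demo=$(printf '%s\n' 'aXb' | sed -e 's|a.b|.|')
echo "$dot_demo"
```

With the unquoted form, sed sees s|/dir/dir as its script and xyz|.|g as a file name, which is where the "unterminated `s' command" error comes from.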

You should rather do:

DIR="$(dirname -- "$var")" perl -pe 's/\Q$ENV{DIR}\E/./g' < "$filename"

For that to work with arbitrary file paths¹.


¹ Strictly speaking, for arbitrary file paths, including those that contain newline characters, we'd need to add the -0777 option (slurp mode) to perl and couldn't use command substitution, which strips trailing newline characters. That could be done by passing the full $var to perl and using perl's dirname() (from the File::Basename module).

Stéphane Chazelas
  • Why could a `|` inside `$var` lead to serious problems? Furthermore, I don't know Perl; is there a safer way to do my substitution using bash/sed/awk? – user9952796 Mar 28 '23 at 15:27
  • Because `|` is the delimiter used for that `s` command. For instance if `$var` is `.*|reboot|e; #/whatever`, the sed command would become `s|.*|reboot|e; #|.|g` and reboot. – Stéphane Chazelas Mar 28 '23 at 16:09
  • This means we can never use `sed` because there is always the possibility that someone can put a `.*|reboot|e;` in a text. So, what can I use in my bash script to do the substitution? – user9952796 Mar 28 '23 at 16:31
  • @user9952796 See [How to ensure that string interpolated into \`sed\` substitution escapes all metachars](https://unix.stackexchange.com/q/129059) if you have to use `sed`, but `perl` has rendered those sed/awk commands obsolete over 35 years ago, you may want to move on. – Stéphane Chazelas Mar 28 '23 at 16:50
  • Thank you. What does `--` mean? – user9952796 Mar 28 '23 at 18:55
  • @user9952796 see [What does "--" (double-dash) mean?](https://unix.stackexchange.com/a/590210) – Stéphane Chazelas Mar 28 '23 at 19:01
  • How could I edit `$filename` in place with `perl`? For the moment it only prints the result to the terminal. In addition, is there a way to break your command into two separate lines (that is, one line for the `DIR` definition and the next for the `perl` command)? It's only for better readability – user9952796 Mar 28 '23 at 19:23
  • `perl` has a `-i` option for in-place editing. Some `sed` implementations have actually borrowed it from `perl`. `VAR=value cmd args` is the syntax to call `cmd` with `VAR=value` in its environment. Separating it out would result in something completely different. – Stéphane Chazelas Mar 28 '23 at 19:36
  • Tried to add `-i` option, but I get `Warning: Use of "-i" without parentheses is ambiguous at -e line 1.` – user9952796 Mar 28 '23 at 19:48
  • `perl -i -pe 's/\Q$ENV{DIR}\E/./g' file`. You'll find thousands of Q&A here using this kind of construct. See `perldoc perlrun` for details. – Stéphane Chazelas Mar 28 '23 at 20:08
  • I tried the following two lines and seem to work. I'd like to ask you if there are some security issues: `dir=$(dirname -- "$var")` and below `perl -i -p -e "s|$dir|.|" "$filename"` – user9952796 Mar 28 '23 at 22:12
  • Of course, you're expanding the variable into the perl code which introduces a command injection vulnerability like in the sed case. You're also missing the `--` so if `$filename` is `-esystem"reboot"` for instance, that will reboot. As a general rule: **do not** embed unsanitised data in an argument that is interpreted as *code*, whether that's sed, perl, awk, eval, sh code... or any language that can do more than what you want it to. – Stéphane Chazelas Mar 29 '23 at 07:59
  • So, what's the way to avoid expanding `$dir` into `sed`, `perl`, `awk`, etc.? – user9952796 Mar 29 '23 at 08:40
  • I already showed one for `perl`. You pass the data via some other means (env var, separate argument, files, standard input...) and tell your code to access that data from there (like `$ENV{var}` here). `perl`, `sh`, `awk`, `python`, `php` can do it easily. `sed` can't so you're only left sanitising the data there. – Stéphane Chazelas Mar 29 '23 at 08:46
  • Could you provide the other ways (separate argument and standard input) using `perl`? However, I don't understand why `perl` (or `bash`, etc.) accessing an environment variable is more secure than expanding the same variable inside the `perl` code. In both cases `$var` could store malicious code. – user9952796 Mar 29 '23 at 10:04
  • Another question: are Python, Java or C++ prone to command injection when performing the task I need in this post (replacing a path in a text file)? I mean, a whole program written in one of these languages, so no use of `bash` or `perl` – user9952796 Mar 29 '23 at 10:28
  • Here the problem is that you have one language interpreter (the shell) feeding data as code to another language interpreter (perl, sed, awk...). You'd have the same problem if you did something as silly in any other language. `python` is similar to `perl`, just trendier these days (a lot less now with their fiasco update to 3; and it's also a lot less adapted than perl to text processing). `python`, `perl`, `sh` all have an `eval` command to evaluate data as code without even having to explicitly invoke a separate interpreter. – Stéphane Chazelas Mar 29 '23 at 11:16
  • I mean programs that don't use `eval` – user9952796 Mar 29 '23 at 13:00
  • Hi, sorry for reviving this post. Today I just saw that `sed` has an option called `--sandbox`. The manual states "In sandbox mode, e/w/r commands are rejected - programs containing them will be aborted without being run. Sandbox mode ensures sed operates only on the input files designated on the command line, and cannot run external programs." Could this option solve the security issue arising when a file named `.*|reboot|e; #/whatever` is processed? Thank you – user9952796 May 10 '23 at 09:00
  • @user9952796, that would be for the GNU implementation of `sed`. Malicious `$var`s could still cause infinite loops and memory exhaustion, so there would still be security issues, only less severe (assuming `sed` doesn't have bugs in its parser that could lead to an ACE, language interpreters should not be fed untrusted code in general). – Stéphane Chazelas May 10 '23 at 10:47
  • Could you provide an example of how `$var` could cause an infinite loop? Thanks – user9952796 May 10 '23 at 13:29
  • @user9952796, try for instance `(set -x; var='^|x|;:1;H;g;b1;#/x'; echo | sed --sandbox "s|$(dirname -- "$var")|.|g")` – Stéphane Chazelas May 10 '23 at 16:31

This seems to be all you're trying to do, using any sed and assuming your variable doesn't contain any :s:

$ var='/dir/dir xyz/file.txt'
$ sed "s:$var:.:" file
dog foo/bar .
a b c d . x y z
. 1234
{[(.

I don't know why you were calling dirname. Also, if you aren't going to use / as the sed delimiter, don't use a regexp metacharacter like |, as that just obfuscates your code at best; pick a character like : that's always literal. And don't use all-upper-case variable names for non-environment variables; see correct-bash-and-shell-script-variable-capitalization.

Given your comment, if you really do need to call dirname for some reason then you could do:

$ sed "s:$(dirname "$var"):.:" file
dog foo/bar ./file.txt
a b c d ./file.txt x y z
./file.txt 1234
{[(./file.txt
Ed Morton