5

Can I use bash variable substitution to extract a piece of a variable based on a delimiter? I'm trying to get the immediate directory name of a filename (in this case, foo).

$ filename=./foo/bar/baz.xml

I know I could do something like

echo $filename | cut -d '/' -f 2

or

echo $filename | awk -F '/' '{print $2}'

but it's getting slow to fork awk/cut for multiple filenames.

I did a little profiling of the various solutions, using my real files:

echo | cut:

real    2m56.805s
user    0m37.009s
sys     1m26.067s

echo | awk:

real    2m56.282s
user    0m38.157s
sys     1m31.016s

@steeldriver's variable substitution/shell parameter expansion:

real    0m0.660s
user    0m0.421s
sys     0m0.235s

@jai_s's IFS-wrangling:

real    1m26.243s
user    0m13.751s
sys     0m28.969s

Both suggestions were a huge improvement over my existing ideas, but the variable substitution is fastest because it doesn't require forking any new processes.
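
For anyone wanting to reproduce the comparison, here is a minimal sketch of the kind of loop that could be timed. It is not the script used for the numbers above: the iteration count and the variable names `dir_cut` and `dir_expand` are made up for illustration, and it reuses a single sample path instead of real files.

```shell
filename=./foo/bar/baz.xml
iterations=200   # arbitrary count chosen for this sketch

# Variant 1: one pipeline (and so extra processes) per filename.
time for ((i = 0; i < iterations; i++)); do
    dir_cut=$(echo "$filename" | cut -d '/' -f 2)
done

# Variant 2: parameter expansion only, no new processes.
time for ((i = 0; i < iterations; i++)); do
    tmp=${filename#*/}
    dir_expand=${tmp%%/*}
done
```

Both loops leave the same value in their result variable; only the elapsed time reported by `time` should differ.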

bonh

4 Answers

12

You can remove the shortest leading substring that matches */

tmp="${filename#*/}"

and then remove the longest trailing substring that matches /*

echo "${tmp%%/*}"
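
The two steps above can be wrapped up for use over many filenames. This is only a sketch: `first_component` and the variable `component` are hypothetical names, not part of the answer, and the sample paths are invented. The helper stores its result in a variable rather than printing it, so the caller does not need a `$(...)` command substitution.

```shell
# first_component is a hypothetical helper name, not from the answer.
# It stores its result in the variable "component" instead of printing it,
# so callers do not need a forking $(...) command substitution.
first_component() {
    component=${1#*/}            # drop the shortest leading match of "*/"
    component=${component%%/*}   # drop the longest trailing match of "/*"
}

for filename in ./foo/bar/baz.xml ./alpha/beta.txt ./one/two/three/four.c; do
    first_component "$filename"
    printf '%s -> %s\n' "$filename" "$component"
done
```

Note that this relies on the paths starting with `./` (or `/`). For a path such as `foo/bar/baz.xml`, the first step would strip `foo/` and the result would be `bar`.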
steeldriver
4
    $ echo $f
    a/b/c

    $ (IFS='/'; set $f; echo $1)
    a

    $ (IFS='/'; set $f; echo $2)
    b

    $ (IFS='/'; set $f; echo $3)
    c

With a wildcard character as the delimiter, it seems to work with either double or single quotes:

    $ f="a?b?c"
    $ (IFS="?"; set $f; echo $1)
    a
    $ echo $f
    a*b*c
    $ (IFS="*"; set $f; echo $1)
    a
Yes, if you run this without the subshell parentheses, you'll have to reset IFS to its default afterwards:

    unset IFS
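
A related variant, sketched here as an alternative rather than as what this answer does, is to let `read` do the splitting. The array name `parts` is made up for illustration. The `IFS=/` prefix applies only to that one `read` command, so nothing has to be reset afterwards, and `read` does not perform pathname expansion, so wildcard characters in the path stay literal.

```shell
filename='./foo/*/baz.xml'   # note the literal * in the path

# IFS=/ applies only to this read command, so the shell's IFS is untouched.
# read splits on IFS but never performs pathname expansion, so * stays literal.
IFS=/ read -r -a parts <<< "$filename"

printf '%s\n' "${parts[1]}"   # first directory component
printf '%s\n' "${parts[2]}"   # the literal *
```

One caveat: `read` stops at the first newline, so a filename containing a newline would be cut short.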
jai_s
  • Ooh, I like that. – bonh Jan 07 '16 at 14:45
  • This is usually my preferred method as well, but bear in mind that Bash only supports `$1` through `$9` using this syntax. For 10th and later arguments, the `${10}` form must be used. – James Sneeringer Jan 07 '16 at 17:55
  • Doesn't work when `$f` contains wildcards. And you need to restore `IFS` afterwards (or do this in a command substitution, to get the value of a field, and that strips off trailing newlines). – Gilles 'SO- stop being evil' Jan 07 '16 at 23:34
  • The example works in isolation (inside Git bash on Windows), but when I pipe from the find command I get this error: `echo: write error: Bad address`. – bonh Jan 25 '16 at 17:54
  • Okay, looks like I have to `unset IFS` every time. – bonh Jan 25 '16 at 17:57
  • `unset IFS` vs `SAVIFS=$IFS`: I prefer the second... or might it unset IFS for the calling context? – Sandburg Mar 15 '19 at 17:39
1

Feed the list to awk to speed it up:

awk -F '/' '{print $2}' < <(find /usr)
awk -F '/' '{print $2}' < inputfile

Demonstration:

time awk -F '/' '{print $2; SUM++} END {print "number of directories found: " SUM}' < <(find /usr -type d)
usr
usr
.
.
number of directories found: 16748

real    0m8.910s
user    0m0.050s
sys     0m0.050s
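
If the results are needed back in the shell rather than printed, the same "feed the whole list in once" idea can be sketched as a shell loop that reads paths from stdin and uses parameter expansion on each line. `print_first_components` is a hypothetical helper name and the sample paths are invented; no claim is made here about how its speed compares with the awk version.

```shell
# print_first_components is a hypothetical helper: it reads one path per
# line on stdin and prints the first directory component of each.
print_first_components() {
    local path
    while IFS= read -r path; do
        path=${path#*/}                # drop everything up to the first "/"
        printf '%s\n' "${path%%/*}"    # keep what precedes the next "/"
    done
}

printf '%s\n' ./foo/bar/baz.xml /usr/share/doc | print_first_components
```

In practice the input would come from something like `find . -type f` instead of `printf`. Paths containing newlines would be split across iterations.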
Lambert
0

Why don't you use the "dirname" command, instead of all this awk/sed/cut stuff?

filename=./foo/bar/baz.xml
dirname "$filename"

Yields:

./foo/bar
Erik
  • In this case I was looking for the immediate directory, not the full directory path. – bonh Dec 12 '17 at 21:25