
I have a small script.

#!/bin/bash
# test for regular expressions to match...
DIR="/search/path/"
NAME="FOO[0-9][0-9]_<bar|dog|cat>"

for FILE in `find ${DIR} -maxdepth 1 -type f -name "*\.[dD][oO][cC]"` 
do
    BASENAME=`basename ${FILE}`
    FILENAME="${BASENAME%.*}"

    if [[ "${FILENAME}" == ${NAME} ]]
    then
            echo "Found $FILENAME"
    else
            echo "$FILENAME not matching..!"
    fi

done

In this script I want to match all files that start with FOO[0-9][0-9]_ followed by either bar, dog, or cat. If anything else follows, such as bog, cog, or car, it should NOT match.

When I use [a-z][a-z][a-z] instead, all of those names match...

I already tried doing something like:

NAME="FOO[0-9][0-9]_(bar|dog|cat)"
or
NAME="FOO[0-9][0-9]_bar|dog|cat"
or
NAME="FOO[0-9][0-9]_[bar|dog|cat]"
or
NAME="FOO[0-9][0-9]_'bar|dog|cat'"

But in the documentation about regular expressions I could not find an exact match.

I need to have it in a single line, as the main script I use it for is a lot more complex and has a lot of different subprocesses hanging off of it.

Is this even possible...?

Rui F Ribeiro
SHLelieveld

2 Answers


Using only the string match, you need to use =~ (which matches against an extended regular expression), with the pattern left unquoted, since a quoted pattern is matched as a literal string:

if [[ "${FILENAME}" =~ FOO[0-9][0-9]_(bar|dog|cat) ]]

or

NAME="FOO[0-9][0-9]_(bar|dog|cat)"
if [[ "${FILENAME}" =~ ${NAME} ]]

to match your original.

Inside [[ ]], == always performs a glob-style pattern match (or an exact string comparison if the right-hand side is quoted); it can’t be used with regular expressions.
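As a quick check of the =~ behaviour, here is a minimal sketch with made-up sample names. The helper name matches_name and the added ^...$ anchors are assumptions for illustration, not part of the script above; the anchors are there because =~ otherwise succeeds as soon as the pattern matches any substring.

```bash
#!/bin/bash

# Hypothetical helper: succeeds only if the whole stem matches the regex.
matches_name() {
    local stem=$1 regex=$2
    # Wrap the pattern in a group and anchor it at both ends.
    local anchored="^(${regex})\$"
    # The right-hand side is unquoted, so bash treats it as a regex.
    [[ $stem =~ $anchored ]]
}

NAME='FOO[0-9][0-9]_(bar|dog|cat)'

for stem in FOO12_bar FOO07_dog FOO12_bog FOO1_cat FOO12_cats; do
    if matches_name "$stem" "$NAME"; then
        echo "Found $stem"
    else
        echo "$stem not matching..!"
    fi
done

# In bash 3.2 and later, quoting the right-hand side compares it literally.
if [[ FOO12_bar =~ "$NAME" ]]; then
    echo "quoted pattern: match"
else
    echo "quoted pattern: no match"
fi
```

Keeping the pattern in a single-quoted variable, as here, keeps the test on one line and lets NAME be reassigned per directory.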

Alternatively, if you can make more changes to the script, assuming you’re using GNU find, you can filter with find:

find "${DIR}" -maxdepth 1 -type f -regextype posix-extended -regex ".*/FOO[0-9][0-9]_(bar|dog|cat)\.[Dd][Oo][Cc]"

(-regextype posix-extended tells find we want to use extended regular expressions, and the regular expression itself starts with .*/ because -regex matches the whole path, not just the filename.)
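If the accepted names change from directory to directory, the same find call can take the pattern from a variable. The following is only a sketch, assuming GNU find; the function name find_docs and the sample file names are made up for the demonstration.

```bash
#!/bin/bash
# Sketch only: assumes GNU find (for -regextype and -regex).

# Hypothetical helper: list the .doc files directly inside a directory
# whose stem matches the given extended regular expression.
find_docs() {
    local dir=$1 name=$2
    # -regex is matched against the whole path, hence the leading ".*/".
    find "$dir" -maxdepth 1 -type f -regextype posix-extended \
        -regex ".*/(${name})\.[Dd][Oo][Cc]"
}

# Demo in a scratch directory with made-up file names.
scratch=$(mktemp -d)
touch "$scratch/FOO12_bar.doc" "$scratch/FOO07_cat.DOC" "$scratch/FOO12_bog.doc"
find_docs "$scratch" 'FOO[0-9][0-9]_(bar|dog|cat)' | sort
rm -r "$scratch"
```

Because -regex has to match the entire path, no extra anchors are needed to reject names such as FOO12_bog.doc.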

Stephen Kitt
  • Thanks for the quick reply. The issue is this: the find + if loop is in a function, and the DIR and NAME variables are set over and over again, and in each folder the accepted names are different. So the variable NAME needs to hold the 'bar' or 'dog' or 'cat' regexp. I'll give this a try: NAME="FOO[0-9][0-9]_(bar|dog|cat)" if [[ "${FILENAME}" =~ "${NAME}" ]] – SHLelieveld Jul 10 '18 at 08:06
  • See the (originally) second part of my answer, which works with those constraints (apart from `==`). – Stephen Kitt Jul 10 '18 at 08:07
  • Unfortunately it doesn't. I get "... not matching..!" back for all files, even the ones that should match. And I did change the == to =~, but no luck. – SHLelieveld Jul 10 '18 at 08:12
  • Oh wait the quotes are wrong, try the updated version. – Stephen Kitt Jul 10 '18 at 08:17

Don't loop over the output of find, instead use find to execute a script that you feed with pathnames:

#!/bin/bash

topdir='/search/path'
pattern='FOO[0-9][0-9]_(bar|dog|cat)'

find "$topdir" -maxdepth 1 -type f -iname '*.doc' -exec bash -c '
    pattern=$1; shift
    for pathname do
        stem=$( basename "${pathname%.*}" )
        if [[ "$stem" =~ $pattern ]]; then
            printf "Found %s\n" "$pathname"
        else
            printf "No match in %s\n" "$pathname"
        fi
    done' bash "$pattern" {} +

Note that using -iname with find does a case insensitive match on the filename.
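One reason to avoid looping over the output of find is that the shell splits that output on whitespace. The sketch below uses a scratch directory and a made-up file name containing a space to compare the two approaches; it assumes the temporary directory path itself contains no whitespace.

```bash
#!/bin/bash

# Scratch directory holding a single file whose name contains a space.
scratch=$(mktemp -d)
touch "$scratch/FOO12_bar final.doc"

# Looping over a command substitution: the unquoted output is split
# into words, so one pathname can turn into several loop iterations.
split_count=0
for f in $(find "$scratch" -maxdepth 1 -type f -iname '*.doc'); do
    split_count=$((split_count + 1))
done

# Letting find hand the pathnames to an inline script: each pathname
# arrives as exactly one positional parameter.
exec_count=$(find "$scratch" -maxdepth 1 -type f -iname '*.doc' -exec bash -c '
    echo "$#"' bash {} +)

echo "for-loop iterations:      $split_count"
echo "pathnames seen by -exec:  $exec_count"

rm -r "$scratch"
```

The loop runs more often than there are files, while the -exec variant sees the single pathname intact.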


Or easier, using bash (note that, unlike find with -maxdepth 1, the ** pattern also descends into subdirectories):

#!/bin/bash

topdir='/search/path'
pattern='FOO[0-9][0-9]_(bar|dog|cat)'

shopt -s globstar nullglob

for pathname in "$topdir"/**/*.[Dd][Oo][Cc]; do
    stem=$( basename "${pathname%.*}" )
    if [[ "$stem" =~ $pattern ]]; then
        printf "Found %s\n" "$pathname"
    else
        printf "No match in %s\n" "$pathname"
    fi
done

Or, using filename globs throughout:

#!/bin/bash

topdir='/search/path'

shopt -s globstar nullglob extglob

for pathname in "$topdir"/**/*FOO[0-9][0-9]_@(bar|dog|cat)*.[Dd][Oo][Cc]
do
    printf "Found %s\n" "$pathname"
done
  • globstar enables the ** glob pattern which works like * but matches across slashes in pathnames.
  • nullglob makes unmatched glob patterns expand to nothing instead of being left as literal strings, so the loop body is skipped when no file matches.
  • extglob makes a few extended glob patterns available. Among these is @(...) which matches any one of the patterns in the parentheses.
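The same @(...) pattern also works with the == test from the question, which keeps the accepted names in a single NAME variable. This is a minimal sketch; the helper name glob_matches and the sample stems are made up for illustration.

```bash
#!/bin/bash
shopt -s extglob

# Hypothetical helper: glob-match a stem against a pattern held in a variable.
glob_matches() {
    local stem=$1 pattern=$2
    # The right-hand side is unquoted, so == treats it as a glob pattern.
    # A glob has to match the whole string, so no anchors are needed.
    [[ $stem == $pattern ]]
}

NAME='FOO[0-9][0-9]_@(bar|dog|cat)'

for stem in FOO12_bar FOO07_dog FOO12_bog FOO12_car; do
    if glob_matches "$stem" "$NAME"; then
        echo "Found $stem"
    else
        echo "$stem not matching..!"
    fi
done
```

Compared with the question's script, the only changes are the @(...) syntax in NAME and the enabled extglob option.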

For a one-liner, you may use

shopt -s globstar nullglob extglob; printf 'Found %s\n' "$topdir"/**/*FOO[0-9][0-9]_@(bar|dog|cat)*.[Dd][Oo][Cc]

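but this would still print the string Found (alone on a line) if no matches were found. Also, bash parses the whole line before running any of it, so if extglob is not already enabled in the shell, the shopt command has to go on a line of its own before the pattern.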
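To avoid the stray Found line, the matches can be collected in an array first and printed only if the array is non-empty. This is a sketch rather than a drop-in replacement; the function name list_matches is made up here.

```bash
#!/bin/bash
# Enabled on its own line, before bash parses the glob pattern below.
shopt -s globstar nullglob extglob

# Hypothetical helper: print matching pathnames below a directory,
# and print nothing at all when there are none.
list_matches() {
    local topdir=$1
    local matches
    # With nullglob set, the array stays empty if nothing matches.
    matches=( "$topdir"/**/*FOO[0-9][0-9]_@(bar|dog|cat)*.[Dd][Oo][Cc] )
    if (( ${#matches[@]} > 0 )); then
        printf 'Found %s\n' "${matches[@]}"
    fi
}

list_matches '/search/path'
```

The array holds each pathname as a separate element, so names containing spaces are printed intact.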

Kusalananda