
Is this a bug, or is this how it is supposed to work?

find ./frontend -mindepth 1 -regex '^./dir1/dir2\(/.*\)?' works on Ubuntu but not Alpine (docker)

find ./frontend -mindepth 1 -regex '^./dir1/dir2\(/.*\)\?' works on Alpine (docker) but not Ubuntu

Alpine: 3.14

Ubuntu: 18.04

Gilles 'SO- stop being evil'

1 Answer


They use different syntaxes for regular expressions.

GNU find's -regex uses Emacs regular expressions by default. This can be changed with the option -regextype which is specific to GNU find; other choices include POSIX BRE (basic regular expressions, as in grep and sed) and POSIX ERE (extended regular expressions, as in grep -E and (almost) awk).

BusyBox find's -regex uses POSIX BRE (the default for the regcomp function). Because BusyBox is designed to be small, there is no option to use a different regex syntax.

FreeBSD, macOS and NetBSD default to BRE, and can use ERE with the -E option.

POSIX does not standardize -regex.

For your command:

  • In BRE (basic), grouping is \(…\). The zero-or-one operator is \? if present, but it is an optional feature, present in BusyBox when built with Glibc (I'm not sure about other libc) but not on BSD. Zero-or-one can also be spelled \{0,1\}.
  • In Emacs RE, grouping is \(…\) and the zero-or-one operator is ?. Although Emacs itself also supports \{0,1\} to mean zero-or-one, GNU find's Emacs regex syntax doesn't.
  • In ERE (extended), grouping is (…) and the zero-or-one operator is ?.
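As a quick way to see the BRE and ERE spellings side by side without depending on any particular find, here is a minimal sketch using grep, which uses the same two POSIX syntaxes. The sample path list and the variable names are made up for illustration. grep -x is used so that the pattern has to match the whole line, the way -regex has to match the whole path.

```shell
# Hypothetical sample paths, one per line, as find would print them.
paths='./dir1/dir2
./dir1/dir2/file
./dir1/dir20
./dir1/other'

# POSIX BRE: grouping is \(...\), zero-or-one is spelled \{0,1\}.
bre_matches=$(printf '%s\n' "$paths" | grep -x '\./dir1/dir2\(/.*\)\{0,1\}')

# POSIX ERE: grouping is (...), zero-or-one is ?.
ere_matches=$(printf '%s\n' "$paths" | grep -E -x '\./dir1/dir2(/.*)?')

printf 'BRE:\n%s\nERE:\n%s\n' "$bre_matches" "$ere_matches"
```

Both patterns should select ./dir1/dir2 and ./dir1/dir2/file and reject the other two lines.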

If you need portability between the various implementations of find that implement -regex, you need to stick to POSIX BRE constructs (for the sake of BusyBox) that are spelled the same in GNU find's Emacs syntax. This means there's no zero-or-one operator.

find ./frontend -mindepth 1 \( -regex '^./dir1/dir2/.*' -o -regex '^./dir1/dir2' \)
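To try the two-alternative form without touching a real project, here is a small sketch that builds a throwaway tree and runs the same kind of command inside it. The directory layout, the use of mktemp -d (not POSIX, but widely available) and the variable names are my own assumptions. The leading dot is escaped here because the search starts from . rather than ./frontend.

```shell
# Build a temporary tree: dir2 with some content, plus a look-alike dir20.
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1/dir2/sub" "$tmp/dir1/dir20"
touch "$tmp/dir1/dir2/file"

# Two -regex tests joined with -o replace the missing zero-or-one operator:
# one matches everything below dir2, the other matches dir2 itself.
matches=$(cd "$tmp" && find . -mindepth 1 \
  \( -regex '^\./dir1/dir2/.*' -o -regex '^\./dir1/dir2' \) | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$tmp"
```

Because -regex has to match the whole path, ./dir1/dir20 is not selected even though it starts with ./dir1/dir2.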

Or, alternatively, arrange to pass -regextype posix-basic to GNU find.

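# If the help text mentions -regextype (as GNU find's does), request POSIX BRE.
case $(find --help 2>/dev/null) in
  *-regextype*) find_options='-regextype posix-basic';;
  *) find_options=;;
esac
# $find_options is deliberately left unquoted so that it splits into two words.
find ./frontend $find_options -mindepth 1 -regex '^./dir1/dir2\(/.*\)\{0,1\}'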
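If you need this in more than one place, the detection could be wrapped in a function. This is only a sketch: find_bre is a hypothetical name, and it assumes the first argument is a single starting directory and that the rest of the arguments form the find expression.

```shell
# Hypothetical helper: run find so that -regex is interpreted as POSIX BRE
# on GNU find, and leave other implementations at their default.
find_bre() {
  # Assumption: exactly one starting directory, given first.
  start=$1
  shift
  case $(find --help 2>/dev/null) in
    # -regextype has to appear before the -regex test it should affect.
    *-regextype*) find "$start" -regextype posix-basic "$@" ;;
    *) find "$start" "$@" ;;
  esac
}
```

It would then be called like find itself, for example find_bre ./frontend -mindepth 1 -regex '^./dir1/dir2\(/.*\)\{0,1\}'.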

If dir1 and dir2 are plain strings and not regexes, you're not getting any use from -regex and you can just write

find ./frontend/dir1/dir2 -maxdepth 1
Gilles 'SO- stop being evil'
  • There’s also a fourth option that’s just as portable as the last one, but still works if you need the regex syntax and avoids the extra boilerplate of the second approach: just use `find` to generate the base list of files, and then pipe that to `grep` to do the regex filtering. Not as efficient, but guaranteed to work on BusyBox, GNU findutils, and any POSIX-compliant system. For example: `find ./frontend -mindepth 1 | grep '^./dir1/dir2\(/.*\)\{0,1\}$'`. – Austin Hemmelgarn Dec 01 '21 at 00:03