I'm confused as to why this does not match:

expr match Unauthenticated123 '^(Unauthenticated|Authenticated).*'

It outputs 0.

stracktracer
  • As an aside, if you were using bash for this, the preferred alternative would be the `=~` operator in `[[ ]]`, ie. `[[ Unauthenticated123 =~ ^(Unauthenticated|Authenticated) ]]` – Charles Duffy Dec 14 '15 at 18:22
  • ...and if you weren't targeting a known/fixed operating system, using `case` rather than a regex match is very much the better practice, since the accepted answer depends on behavior POSIX doesn't define. – Charles Duffy Dec 14 '15 at 18:25
  • See [Why does my regular expression work in X but not in Y?](http://unix.stackexchange.com/questions/119905/why-does-my-regular-expression-work-in-x-but-not-in-y) – Gilles 'SO- stop being evil' Dec 14 '15 at 23:43

3 Answers

If you want the number of characters matched, your command should be:

expr match Unauthenticated123 'Unauthenticated\|Authenticated'

To have the matched part of the string (Unauthenticated) returned instead, use:

expr match Unauthenticated123 '\(Unauthenticated\|Authenticated\)'

From info coreutils 'expr invocation':

`STRING : REGEX'
     Perform pattern matching.  The arguments are converted to strings
     and the second is considered to be a (basic, a la GNU `grep')
     regular expression, with a `^' implicitly prepended.  The first
     argument is then matched against this regular expression.

     If the match succeeds and REGEX uses `\(' and `\)', the `:'
     expression returns the part of STRING that matched the
     subexpression; otherwise, it returns the number of characters
     matched.

     If the match fails, the `:' operator returns the null string if
     `\(' and `\)' are used in REGEX, otherwise 0.

     Only the first `\( ... \)' pair is relevant to the return value;
     additional pairs are meaningful only for grouping the regular
     expression operators.

     In the regular expression, `\+', `\?', and `\|' are operators
     which respectively match one or more, zero or one, or separate
     alternatives.  SunOS and other `expr''s treat these as regular
     characters.  (POSIX allows either behavior.)  *Note Regular
     Expression Library: (regex)Top, for details of regular expression
     syntax.  Some examples are in *note Examples of expr::.
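
The return-value rules quoted above can be checked with a short sketch. It assumes GNU expr (the `\|` operator is not available everywhere), and the variable names are only illustrative:

```shell
# Assumes GNU expr; \| is not supported by every implementation.
s=Unauthenticated123

# No \( \) in the pattern: the result is the number of characters matched.
count=$(expr "$s" : 'Unauthenticated\|Authenticated')

# With \( \): the result is the text matched by the first group.
part=$(expr "$s" : '\(Unauthenticated\|Authenticated\)')

# Failed match: 0 without a group, the null string with one.
miss_count=$(expr "$s" : 'Guest')
miss_part=$(expr "$s" : '\(Guest\)')

printf '%s\n' "$count" "$part" "$miss_count" "[$miss_part]"
```

With GNU expr this should print 15, Unauthenticated, 0 and [] on separate lines. Note that expr also exits with status 1 when the result is 0 or null, which matters under `set -e`.
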
outis
Lambert

Note that both match and \| are GNU extensions, and the behaviour of : (the standard equivalent of match) when the pattern starts with ^ varies between implementations. Standardly, you'd do:

expr " $string" : " Authenticated" '|' " $string" : " Unauthenticated"

The leading space is to avoid problems with values of $string that start with - or are expr operators, but that means it adds one to the number of characters being matched.
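
As a rough sketch of how that off-by-one could be handled, the count can be corrected after the fact. The variable names `n` and `len` are only illustrative:

```shell
string=Unauthenticated123

# Portable form: two separate matches joined with expr's | operator.
# The leading space in both the string and the patterns is counted too.
n=$(expr " $string" : " Authenticated" '|' " $string" : " Unauthenticated")

# Subtract the extra character, but only when something matched.
if [ "$n" -gt 0 ]; then
  len=$((n - 1))
else
  len=0
fi
echo "$len"
```

For this value of `$string` the first match yields 0, so `|` falls through to the second one, which matches 16 characters including the space.
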

With GNU expr, you'd write it:

expr + "$string" : 'Authenticated\|Unauthenticated'

The + forces $string to be taken as a string even if it happens to be an expr operator. expr regular expressions are basic regular expressions, which have no alternation operator (and in which | is not special). The GNU implementation provides alternation as \|, though, as an extension.
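
To illustrate what the + protects against, here is a small sketch that assumes GNU expr. The value `length` is a hypothetical input chosen because it collides with a GNU expr keyword:

```shell
# Assumes GNU expr: the leading + is a GNU feature.
string=length   # hypothetical value that is also an expr keyword

# With +, the value is treated as plain text and matched against 'len'.
n=$(expr + "$string" : 'len')
echo "$n"

# The same form with the value from the question.
string=Unauthenticated123
m=$(expr + "$string" : 'Authenticated\|Unauthenticated')
echo "$m"
```

Without the +, GNU expr would parse the first command as the `length` keyword applied to `:`, followed by a stray argument.
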

If all you want is to check whether $string starts with Authenticated or Unauthenticated, you're better off using:

case $string in
  (Authenticated* | Unauthenticated*) do-something
esac
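
A runnable variant of that `case` check might look like the following. The function name `starts_with_auth` and the `result` variable are made up for illustration:

```shell
# starts_with_auth is a hypothetical helper; it succeeds when its
# argument begins with Authenticated or Unauthenticated.
starts_with_auth() {
  case $1 in
    (Authenticated* | Unauthenticated*) return 0 ;;
    (*) return 1 ;;
  esac
}

if starts_with_auth "Unauthenticated123"; then
  result=match
else
  result=no-match
fi
echo "$result"
```

This relies only on shell pattern matching, so it does not depend on which expr is installed.
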
Stéphane Chazelas

You have to escape the parentheses and the pipe with \:

$ expr match "Unauthenticated123" '^\(Unauthenticated\|Authenticated\).*'

netmonk
  • and the `^` may not mean what some would think depending on the `expr`. it is implied anyway. – mikeserv Dec 14 '15 at 14:18
  • @mikeserv, `match` and `\|` are GNU extensions anyway. This Q&A seems to be about GNU `expr` anyway (where `^` is guaranteed to mean _match at the beginning of the string_). – Stéphane Chazelas Dec 14 '15 at 14:34
  • @StéphaneChazelas - i didn't know they were strictly GNU. i think i remember them being explicitly officially *unspecified* - but i don't use `expr` too often anyway and didn't know that. thank you. – mikeserv Dec 14 '15 at 14:49
  • It's not "strictly GNU" - it's present in a number of historical implementations (even System V had it, undocumented, though it didn't have the others like substr/length/index), which is why it's explicitly unspecified. I can't find anything about `\|` being an extension. – Random832 Dec 14 '15 at 16:13