I want to use the following regex with awk to validate phone numbers:

echo 012-3456-7890 | awk '/^\(?0[1-9]{2}\)?(| |-|.)[1-9][0-9]{3}( |-|.)[0-9]{4}$/ {print $0}'

But I am getting the following error:

awk: line 1: regular expression compile failed (missing operand)
Stéphane Chazelas
sci9
  • Are you using mawk? This works with GNU awk, but with mawk there are two problems: `(| |-|.)` has a leading `|` without a regex atom before it, and mawk doesn't support `{n}`. – muru Jun 05 '18 at 04:59
  • Yes, I need a regex that works with mawk. The first pipe symbol is there to match the no-space case. Is there any workaround for this problem? – sci9 Jun 05 '18 at 05:22
  • You could handle the optional space/dash/period using `[- .]?`, but this still leaves the `{n}` problem open, so I think you should switch from mawk to something else (gawk, Perl, Ruby, Python, Java, ...). Actually, `grep -E` or `grep -P` should work too. I don't see any compelling reason to use a programming language if you can simply grep for it. – user1934428 Jun 05 '18 at 05:33

1 Answer

Since the repetition counts here are fixed, you could simply write them out: `[0-9]{3}` becomes `[0-9][0-9][0-9]`. And instead of `(| |-|.)`, use `( |-|.)?`. Though I am confused: are you allowing any character (`.`) in addition to space and `-`? Then it could just be `.?`, since space and `-` are matched by `.` anyway. If you mean to match a literal period, then you should use `[- .]?` instead (the leading `-` avoids interpretation as a character range). So:

^\(?0[1-9]{2}\)?(| |-|.)[1-9][0-9]{3}( |-|.)[0-9]{4}$

Becomes:

^\(?0[1-9][1-9]\)?[- .]?[1-9][0-9][0-9][0-9][- .][0-9][0-9][0-9][0-9]$
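
As a rough sketch, the rewritten pattern can be dropped back into the original command. The `is_phone` wrapper and the sample inputs below are only illustrative (they are not from the question), and this assumes the separators are meant to be a space, `-`, or a literal period:

```shell
# Hypothetical helper: echoes the argument back only if it matches the pattern.
# The regex avoids {n} and empty alternation, which mawk rejected above.
is_phone() {
    printf '%s\n' "$1" |
        awk '/^\(?0[1-9][1-9]\)?[- .]?[1-9][0-9][0-9][0-9][- .][0-9][0-9][0-9][0-9]$/ { print $0 }'
}

is_phone '012-3456-7890'     # matches, so the number is printed
is_phone '(012) 3456 7890'   # matches: parentheses and spaces are allowed
is_phone '012-0456-7890'     # no output: the middle group must not start with 0
```

Because the pattern uses only bracket expressions, `?`, and anchors, it should behave the same way in mawk, gawk, and other POSIX awks.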
muru