
I am new to sed and am having trouble making it work.

What I want is this:

abc.ztx.com. A 132.123.12.44 ---> abc.ztx.com

I used the pattern below, but it doesn't seem to work:

echo "abc.ztx.com. A 132.123.12.44" | sed 's/\.\s.+//g'

I verified the regex using regex101.com, and the pattern \.\s.+ matches the part ". A 132.123.12.44" perfectly. Why is it not working with sed?

Appreciate your help. Thank you.

Kusalananda
Amey
  • Do you have to use `sed`? This is a perfect job for `cut`, which is what I'd use here. If you aren't stuck with sed only, let me know and I'll post a `cut` answer. – ron rothman Mar 18 '20 at 00:37
  • @ron-rothman Yes, please. Anything that makes the job easier. – Amey Mar 18 '20 at 05:23

4 Answers


sed uses POSIX basic regular expressions (BRE) by default. \s is a PCRE (Perl-compatible regular expression) shorthand that corresponds to the POSIX class [[:blank:]] (matching spaces and tabs, I think) or possibly [[:space:]], which matches a larger set of whitespace characters. The + is a POSIX extended regular expression (ERE) quantifier; its BRE equivalent is \{1,\}. In a BRE, an unescaped + is just a literal plus sign, so your pattern finds nothing to match in that input.

So try

sed 's/\.[[:blank:]].*//'

instead. You may replace [[:blank:]] by a space character if you don't need to match tabs:

sed 's/\. .*//'

Note that there is no need for the g flag on the substitution, as there will only ever be a single match per line. Also, the .+ that you use can simply become .* rather than .\{1,\}, as we don't care whether any further characters follow (we just delete all of them).
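
As a rough illustration of the points above, the sketch below runs both BRE spellings on the sample line from the question and also shows that a bare + is taken literally in a BRE. The variable names are made up for the example, and it assumes a POSIX-style shell and sed.

```shell
# Sample record from the question.
line='abc.ztx.com. A 132.123.12.44'

# BRE: POSIX class instead of \s, and .* instead of .+
with_star=$(printf '%s\n' "$line" | sed 's/\.[[:blank:]].*//')

# BRE: "one or more" spelled as the interval \{1,\}
with_interval=$(printf '%s\n' "$line" | sed 's/\.[[:blank:]].\{1,\}//')

# In a BRE an unescaped + is a literal plus sign:
literal_plus=$(printf '%s\n' 'a+b' | sed 's/a+/X/')   # the text "a+" is replaced
no_plus=$(printf '%s\n' 'aab' | sed 's/a+/X/')        # no "+" in the input, nothing changes

printf '%s\n' "$with_star" "$with_interval" "$literal_plus" "$no_plus"
```

The first two commands should both give abc.ztx.com; the last two show why the original .+ had no effect on the input.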

Kusalananda
  • Thank you so much Kusalananda for answering the question and also explaining the first principles of sed Regex. – Amey Mar 16 '20 at 14:19

If you are using GNU/Linux, or any other GNU system, then you will have GNU sed. GNU sed has the -r option, which allows this.

Add the -r option to switch the regex dialect to extended regular expressions (ERE), where + is a quantifier.

e.g.

echo "abc.ztx.com. A 132.123.12.44" | sed -r 's/\.\s.+//g'
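
A slightly more portable variant, sketched here as an assumption rather than as part of the original command: GNU and BSD sed both accept -E for extended regular expressions, and the POSIX class [[:space:]] can stand in for \s. The variable names are only for illustration.

```shell
# Sample record from the question.
line='abc.ztx.com. A 132.123.12.44'

# -E switches to ERE, so + means "one or more";
# [[:space:]] is used instead of \s, which not every sed understands.
result=$(printf '%s\n' "$line" | sed -E 's/\.[[:space:]].+//')

printf '%s\n' "$result"
```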

ctrl-alt-delor
  • Wow! This is helpful too! Thank you.. – Amey Mar 16 '20 at 16:53
  • While `-r` does work, `-E` is the POSIX standard switch to do this. – David Conrad Mar 16 '20 at 18:07
  • @DavidConrad POSIX `sed` does not support extended regular expressions or PCREs, and has neither a `-r` nor an `-E` option. The `sed` used in this answer is GNU `sed`. Most `sed` implementations support the non-standard `-E` option to enable the use of EREs, but only GNU `sed` (AFAIK) includes the `\s` expression (along with a few other PCRE shortcuts that GNU decided to put in their regular expression library). – Kusalananda Mar 16 '20 at 18:18
  • @Kusalananda The att sed version (from 2012-03-28) already included the `-r` and `-E` options. It also supports `\s`. I don't recall now if that came from super-sed or from sed 3.02. –  Mar 17 '20 at 01:01
  • @DavidConrad There is no accepted `-E` in POSIX (yet; it may appear in future editions). But the idea surely came from other places, not from POSIX. –  Mar 17 '20 at 01:03
  • Yes, `-E` for **Extended** sounds clearer than `-r`. Both are not POSIX standard anyway. –  Mar 17 '20 at 01:05

Your question specifically asks about sed, but I would use cut for this.

If you can live with a trailing dot, then:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1

abc.ztx.com.

If you can't live with the trailing dot, then:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1 | rev | cut -d. -f2- | rev

abc.ztx.com

or:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1 | sed -e "s/\.$//"

abc.ztx.com
ron rothman

Yet another possibility is awk, with a period followed by a space ("\. " as a regular expression) specified as the field separator:

echo "abc.ztx.com. A 132.123.12.44" | awk -F '\\. ' '{print $1}'

abc.ztx.com
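
The same separator should work line by line on several records. In the sketch below, the second record (www.example.org) is invented purely for illustration and is not from the question; the variable name is likewise hypothetical.

```shell
# First record is from the question; the second one is a made-up sample.
result=$(printf '%s\n' \
    'abc.ztx.com. A 132.123.12.44' \
    'www.example.org. AAAA 2001:db8::1' |
    awk -F '\\. ' '{ print $1 }')   # field 1 is everything before the first ". "

printf '%s\n' "$result"
```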
AdminBee