17

I'm trying to use a regex as the field separator in awk. From my reading this seems possible, but I can't get the syntax right.

rpm -qa | awk '{ 'FS == [0-9]' ; print $1 }'
awk: cmd. line:1: { FS
awk: cmd. line:1:     ^ unexpected newline or end of string

Thoughts? The goal, if not obvious, is to get a list of software without version numbers.

bahamat
Gray Race

2 Answers

30

You have mucked up your quotes and syntax. To set the input field separator, the easiest way to do it is with the -F option on the command line:

awk -F '[0-9]' '{ print $1 }'

or

awk -F '[[:digit:]]' '{ print $1 }'

This would use any digit as the input field separator, and then output the first field from each line.
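As a rough illustration of what that does to `rpm -qa`-style output, here is a small sketch. The two package names are made up for the example, and the second command (splitting on a hyphen followed by a digit) is only my assumption about the result the asker is after, not something from the question.

```shell
# Hypothetical package names standing in for `rpm -qa` output.
pkgs='bash-5.1.8-6.el9
python3-libs-3.9.16-1.el9'

# Any digit is a separator: the first field stops at the first digit,
# so the trailing hyphen is kept and "python3-libs" is cut short.
by_digit=$(printf '%s\n' "$pkgs" | awk -F '[0-9]' '{ print $1 }')
printf '%s\n' "$by_digit"

# Variant (an assumption about the intended result): split on a hyphen
# followed by a digit instead, which keeps the whole package name.
by_hyphen_digit=$(printf '%s\n' "$pkgs" | awk -F '-[0-9]' '{ print $1 }')
printf '%s\n' "$by_hyphen_digit"
```

With this sample input the first command prints `bash-` and `python`, while the second prints `bash` and `python3-libs`.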

The [0-9] and [[:digit:]] expressions are not quite the same, depending on your locale. See "Difference between [0-9], [[:digit:]] and \d".

One could also set FS in the awk program itself. This is usually done in a BEGIN block as it's a one-time initialisation:

awk 'BEGIN { FS = "[0-9]" } { print $1 }'

Note that single quotes can't be used in a single-quoted string in the shell, and that awk strings always use double quotes.
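If you would rather not put an awk string inside the shell-quoted program at all, the standard `-v` option can assign `FS` before any input is read. A minimal sketch, with a made-up sample line:

```shell
# Assign FS with -v before any input is read, so the awk program
# itself needs no quoted string. The sample line is for illustration only.
first_field=$(printf '%s\n' 'abc123 def456' | awk -v FS='[0-9]' '{ print $1 }')
printf '%s\n' "$first_field"
```

Here the separator is quoted only once, for the shell, which sidesteps the nested-quote problem in the question.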

Kusalananda
  • Is it possible to access the FS and see the string it matched? – Roland Apr 06 '20 at 11:07
  • For a more natural separator, I think `FS="( | )+"` works (the first one is a tab, and 2nd one is a space) – Sridhar Sarnobat Jul 31 '22 at 18:29
  • @SridharSarnobat That's not relevant to the current question (which deals with `FS` set to something that matches digits) but would be more naturally written as `FS='[[:blank:]]+'`, which is very close to what `awk` would use if you don't set `FS` at all. – Kusalananda Jul 31 '22 at 18:32
  • Thanks, that's better. I can't think of a case when I'd use a number as a field separator but maybe I'm not creative enough :) – Sridhar Sarnobat Jul 31 '22 at 18:33
  • @SridharSarnobat I don't really know what this user's data looks like. – Kusalananda Jul 31 '22 at 18:35
15

+1 for Kusalananda's answer. Alternatively, the FS variable can be set in the BEGIN block:

awk 'BEGIN {FS="[0-9]"} {print $1}'

Changing FS in an action block won't take effect until the next line is read, so the first line is still split with the default separator:

$ printf "%s\n" "abc123 def456" "ghi789 jkl0" | awk '{FS="[0-9]"; print $1}'
abc123
ghi
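For contrast, a sketch of the same input with FS set in a BEGIN block, so the separator is already in place when the first line is split:

```shell
# Same two sample lines as in the demonstration above.
input=$(printf '%s\n' 'abc123 def456' 'ghi789 jkl0')

# FS is assigned before any record is read, so line 1 is split on digits too.
in_begin=$(printf '%s\n' "$input" | awk 'BEGIN { FS = "[0-9]" } { print $1 }')
printf '%s\n' "$in_begin"
```

This time both lines are cut at the first digit, giving `abc` and `ghi`.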

The other errors in the question:

  • can't use single quotes inside a single-quoted string
  • == is a comparison operator, = is for variable assignment
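The second point can be checked directly. In this small sketch, the comparison leaves FS untouched and just yields 0 or 1, while the assignment actually changes it:

```shell
result=$(awk 'BEGIN {
    print (FS == "[0-9]")   # comparison only: FS is still a single space, so this is 0
    FS = "[0-9]"            # assignment: FS now holds the regex
    print (FS == "[0-9]")   # now the comparison is true, so this is 1
}')
printf '%s\n' "$result"
```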
Kusalananda
glenn jackman
  • "Changing FS in an action block won't take effect until the next line is read" I've been looking all over for that info. – Samizdis Jun 26 '17 at 15:50
  • plus: can't use single quotes for string value in awk, even if you pass them from shell correctly – dave_thompson_085 Jun 27 '18 at 09:24
  • Is it possible to access the FS and see the string it matched? – Roland Apr 06 '20 at 11:07
  • `FS` is a variable, so you can do with it anything you can do with any other variable (e.g. `print FS`). To get the parts of the line that matched FS, I don't think you can with POSIX awk. With GNU awk you could write `n = split($0, fields, FS, separators)` where `fields` and `separators` are arrays. – glenn jackman Apr 06 '20 at 12:55
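
A minimal sketch of that last suggestion, assuming GNU awk is installed as `gawk` (the four-argument `split()` is a gawk extension). The sample line and the array name `seps` are made up, and a regex literal is used here where the comment passes `FS`:

```shell
# Requires GNU awk; skipped when gawk is not available.
if command -v gawk >/dev/null 2>&1; then
    matched=$(printf '%s\n' 'abc123def45ghi' |
        gawk '{
            n = split($0, fields, /[0-9]+/, seps)
            # seps[i] is the text that separated fields[i] from fields[i+1]
            for (i = 1; i < n; i++) print fields[i], seps[i]
        }')
    printf '%s\n' "$matched"
fi
```

For this input it prints `abc 123` and `def 45`, that is, each field followed by the separator text that came after it.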