Regex that would grep numbers after specific string

Question

So I have a line:

ID: 54376

Can you help me make a regex that would only return numbers without "ID:"?

NOTE: This string is in a file.

score 20 · Answer 1 · edited May 24 '14 at 12:05


Try this:

grep -oP '(?<=ID: )[0-9]+' file

or:

perl -nle 'print $1 if /ID:.*?(\d+)/' file

edited May 24 '14 at 12:05

terdon


answered May 24 '14 at 08:21

cuonglm


Thank you for the reply, but I don't need all the numbers from a file; I need only the number that occurs after ID: – Blake Gibbs May 24 '14 at 09:38
Updated my answer. – cuonglm May 24 '14 at 09:44

Note that `-o` and `-P` are GNU extensions to `grep`. `-o` works on the BSDs as well. PCRE support for `-P` is not always compiled in, either. – Matt May 25 '14 at 10:06
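
To make that portability point concrete, here is a minimal sketch that probes for `-P` and falls back to plain `sed` when it is missing. The function name `extract_id` and the sample file contents are made up for illustration. The two branches are not fully equivalent: `grep -o` prints every match on a line, while the `sed` fallback prints at most one number per line (the one after the last `ID: `).

```shell
# extract_id is a hypothetical helper name, not part of grep or sed.
extract_id() {
    # Probe whether this grep accepts -P at all (errors are silenced).
    if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
        # PCRE lookbehind: match digits only when preceded by "ID: ".
        grep -oP '(?<=ID: )[0-9]+' "$1"
    else
        # Basic-regex sed fallback: at most one number per line.
        sed -n 's/.*ID: \([0-9][0-9]*\).*/\1/p' "$1"
    fi
}

# Sample input (assumed for the sketch).
tmp=$(mktemp)
printf 'ID: 54376\nname: test 99\n' > "$tmp"
extract_id "$tmp"    # prints 54376
rm -f "$tmp"
```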

score 7 · Answer 2 · edited May 24 '14 at 16:09


There are many ways of doing this. For example:

Use GNU grep with a recent PCRE and match the numbers after "ID:":
```
grep -oP 'ID:\s*\K\d+' file
```
Use awk and simply print the last field of all lines that start with ID:
```
awk '/^ID:/{print $NF}' file
```
That also prints last fields that are not numbers, though. To get numbers only, and only from the second field, use:
```
awk '($1=="ID:" && $2~/^[0-9]+$/){print $2}' file
```
Use GNU grep with Extended Regular Expressions and parse it twice:
```
grep -Eo '^ID: *[0-9]+' file | grep -o '[0-9]*'
```
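
To see how the two awk variants differ, here is a small sketch run against a made-up sample (the four input lines are an assumption, not from the question):

```shell
# Sample input, invented for the comparison.
sample='ID: 54376
ID: not-a-number
ID: 12 trailing
name: 99'

# Last field of every line starting with "ID:", numeric or not.
last_field=$(printf '%s\n' "$sample" | awk '/^ID:/{print $NF}')

# Second field only, and only when it consists entirely of digits.
strict=$(printf '%s\n' "$sample" | awk '($1=="ID:" && $2~/^[0-9]+$/){print $2}')

printf 'last field:\n%s\n' "$last_field"   # 54376, not-a-number, trailing
printf 'strict:\n%s\n' "$strict"           # 54376, 12
```

The stricter form drops the non-numeric ID and ignores trailing words, at the cost of assuming whitespace-separated fields.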

edited May 24 '14 at 16:09

Stéphane Chazelas


answered May 24 '14 at 12:00

terdon


Thanks! What is `\K` doing in the first example? – rnd_d May 14 '15 at 14:26

@rnd_d It's a Perl Compatible Regular Expressions (PCRE) construct which means "ignore anything matched up to this point". It is used like a lookbehind: it lets me use `-o` to print only the matched portion but also discard things I'm not interested in. Compare `echo "foobar" | grep -oP "foobar"` and `echo "foobar" | grep -oP 'foo\Kbar'` – terdon May 14 '15 at 15:27
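
Running that comparison makes the effect of `\K` visible. This sketch assumes a grep built with PCRE support (`-P`), as in typical GNU grep packages; the variable names are only for illustration.

```shell
# Assumes GNU grep with -P available.
whole=$(echo "foobar" | grep -oP 'foobar')
after_k=$(echo "foobar" | grep -oP 'foo\Kbar')

echo "$whole"     # foobar: the entire match is reported
echo "$after_k"   # bar: "foo" must match, but \K drops it from the output
```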

Rohit Jain · Answer 3 · 2014-05-24T08:25:58.123

4

Use egrep with -o, or grep with the -Eo options, to print only the matched segment. Use [0-9]+ as the regex to get just the numbers (quoted, so the shell does not treat it as a glob pattern):

grep -Eo '[0-9]+' filename

edited May 24 '14 at 08:25

answered May 24 '14 at 08:20

Rohit Jain



The OP needs it to match only after a specific string. See the question's title. – terdon May 24 '14 at 12:06

mikeserv · Answer 4 · 2014-05-25T01:08:32.453

sed -n '/ID: 54376/,${s/[^ 0-9]*//g;/./p}'

That prints only the digits and spaces found on each line, from the first line containing ID: 54376 through the end of the input.

I've just updated the above a little to make it a little faster with * and not to print blank lines after removing the non-{numeric,space} characters.

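It addresses the lines from the first match of the regex /ID: 54376/ through the last line ($). On those lines, s/// removes every run (*) of characters that are not (^) a space or a digit, [^ 0-9]*, and then /./p prints any line that still has at least one character (.) left.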
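
A smaller case may make the range address and the substitution easier to follow. This is only a sketch: the four input lines are invented, and the `p}` form without a preceding `;` follows the answer's command, which GNU sed accepts.

```shell
# Invented input: one line before the match, the match, then two more lines.
kept=$(
    printf '%s\n' 'before the match: 123' 'ID: 54376' 'no_nums_or_spaces' 'abc 42 def' |
    sed -n '/ID: 54376/,${s/[^ 0-9]*//g;/./p}'
)

# The first line is outside the range, so its 123 is not printed.
# "ID: 54376" becomes " 54376" (the space survives).
# "no_nums_or_spaces" becomes empty and is skipped by /./p.
# "abc 42 def" becomes " 42 ".
printf '%s\n' "$kept"
```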

DEMO:

{
echo line 
printf 'ID: 54376\nno_nums_or_spaces\n'
printf '%s @nd 0th3r char@cter$ %s\n' $(seq 10)
echo 'ID: 54376'
} | sed -n '/ID: 54376/,${s/[^ 0-9]*//g;/./p}'

OUTPUT:

score 1 · Answer 5 · answered May 25 '14 at 16:33


Using sed:

{
    echo "ID: 1"
    echo "Line doesn't start with ID: "
    echo "ID: Non-numbers"
    echo "ID: 4"
} | sed -n '/^ID: [0-9][0-9]*$/s/ID: //p'

The -n means "don't print anything by default". The address /^ID: [0-9][0-9]*$/ means "for lines matching this regex": the line starts with "ID: ", followed by one or more digits, then the end of the line. The s/ID: //p command has the form s/pattern/repl/flags: s performs a substitution, replacing the pattern "ID: " with the empty string, and the p flag means "print this line after doing the substitution".

Output:

1
4

answered May 25 '14 at 16:33

godlygeek


It won't work if the ID is present in the middle of a line. – Avinash Raj May 25 '14 at 16:38
Nor should it, based on my reading of the question. And not trying to prematurely handle that case makes the code simpler and more portable. – godlygeek May 25 '14 at 17:01
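
For anyone weighing the two readings of the question, this sketch contrasts the anchored command from the answer with an unanchored basic-regex variant. The unanchored variant and the sample lines are assumptions added here for comparison, not something either commenter posted.

```shell
# Invented sample: one line that is exactly an ID, one with the ID mid-line.
sample='ID: 1
prefix ID: 2 suffix'

# Anchored form from the answer: the whole line must be "ID: <digits>".
anchored=$(printf '%s\n' "$sample" | sed -n '/^ID: [0-9][0-9]*$/s/ID: //p')

# Unanchored variant (assumption): capture the digits after "ID: " anywhere.
anywhere=$(printf '%s\n' "$sample" | sed -n 's/.*ID: \([0-9][0-9]*\).*/\1/p')

printf 'anchored:\n%s\n' "$anchored"   # 1
printf 'anywhere:\n%s\n' "$anywhere"   # 1 and 2
```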

Avinash Raj · Answer 6 · 2014-05-25T16:35:03.687

0

Another GNU sed command:

sed -nr '/ID: [0-9]+/ s/.*ID: +([0-9]+).*/\1/p' file

It prints the number that follows ID: on each matching line.

edited May 25 '14 at 16:35

answered May 24 '14 at 15:47

Avinash Raj


You really don't need the `+`. If saving two characters means your script may not work in all `sed`s, you should probably do: `sed -n '/ID: \([0-9][0-9]*\).*/{s//\1/;s/.*[^0-9]//;/./p}'`. Your answer also misses the first `ID: [0-9]` on a line containing two occurrences of `ID: [0-9]`. – mikeserv May 25 '14 at 04:02
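
The point about two occurrences on one line can be checked with a quick sketch. It assumes GNU sed (for `-r`) and a grep that supports `-o`; the two-stage grep pipeline is just one possible way to list every ID, not something proposed in the thread for this case.

```shell
# Invented line with two IDs.
line='ID: 1 and ID: 2'

# The answer's command: the leading .* is greedy, so only the last ID survives.
last_only=$(printf '%s\n' "$line" | sed -nr '/ID: [0-9]+/ s/.*ID: +([0-9]+).*/\1/p')

# One way to list every ID: isolate each "ID: <digits>" first, then the digits.
all_ids=$(printf '%s\n' "$line" | grep -oE 'ID: +[0-9]+' | grep -oE '[0-9]+')

printf 'sed:  %s\n' "$last_only"    # 2
printf 'grep:\n%s\n' "$all_ids"     # 1 and 2
```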

score 0 · Answer 7 · answered May 12 '16 at 12:37


Use grep + awk:

  grep "^ID" your_file | awk '{print $2}'

Bonus: easy to read :)

answered May 12 '16 at 12:37

lily


You don't need `grep` if you're using `awk`. `awk '/^ID/ { print $2 }'` does the same thing, and avoids [grep line-buffering issues](http://unix.stackexchange.com/a/46720/7696). It's also pretty much the same as one of the solutions in @terdon's answer. – cas May 12 '16 at 13:02
