Assume we have the following test.txt:

# commented line, no match
# commented line, would match /app/
# again commented, would match app
non commented line, matchin /app
non commented line, no match

I would like to get all lines that contain the word 'app', but not those that are commented out, and I would like the filename to be included in the output.

The trivial grep -H 'app' test.txt obviously matches every line containing app, including the lines that start with the hash character #:

[screenshot: term1.png]

A pipeline with a second grep using the -v (--invert-match) option generally messes up the colors. Also, because the -H filename prefix comes first on each piped line, the second grep cannot simply use a negated match for ^# (i.e. a hash character at the start of a line). So, to preserve the colors, I'd have to use a beast like grep -H app test.txt --color=always | grep -v '\[K:.\[m.\[K#':

[screenshot: term2.png]

... but only after doing something like grep -H app test.txt --color=always | hexdump -C so I can see the right combination of escape characters, which is, to put it mildly, tedious.

Unfortunately, it seems the -v option cannot take its own (negated) pattern when combined with the -e PATTERN (--regexp=PATTERN) option, which is the option that allows multiple search patterns:

:tmp$ grep -H -e 'app' -v '^#' test.txt
grep: ^#: No such file or directory
test.txt:# commented line, no match
test.txt:non commented line, no match
test.txt:

Here, grep interprets '^#' as a filename, not as a search pattern, so -v inverts the match for app and I get the opposite of what I want. In this example, the expected output is only one line:

test.txt:non commented line, matchin /app

... with properly colored filename, and matches.

So, is there a way to achieve this - but without the messy pipeline given above, and simply using ^# as the pattern to be avoided?

sdaau
  • @Theophrastus that's a good idea I think - except it won't match the pattern `app` if it happens to be at the very start of the line: perhaps match a word boundary *or* any character except # e.g. `grep -HE --color=always '^(\b|[^#]).*app'`? – steeldriver Dec 11 '15 at 03:41
  • Something similar, via https://unix.stackexchange.com/questions/70710/ and PCRE negative look-ahead: the following command matches only lines that do not start with any amount of whitespace followed by `//`, but contain the word `myword`: `grep -rnP '^(?![[:space:]]*//).*myword' --include='*.cxx'` – sdaau Jan 05 '19 at 21:14
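
The look-ahead idea from the last comment can be adapted to the `#` comments of the question. This is only a sketch: it assumes a GNU grep built with PCRE support (the -P option), and the three sample lines, including the indented comment, are made up for illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: an indented comment, a line that starts
# with app, and the matching line from the question.
printf '%s\n' \
  '   # indented comment, would match app' \
  'app at line start' \
  'non commented line, matchin /app' > test.txt

# (?!...) is a negative look-ahead: the line must not begin with
# optional whitespace followed by #, but must contain app somewhere.
pcre=$(grep -HP '^(?![[:space:]]*#).*app' test.txt)
printf '%s\n' "$pcre"
```

Unlike a plain `[^#]` at the start of the line, the look-ahead consumes no characters, so a line that begins directly with `app` still matches.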

1 Answer

grep -E '^([^#].*)?app' ./infiles* /dev/null

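I guess the comments nearly had it already, but if you make the head-of-line not-comment match [^#] optional with ?, then you either get lines that begin with the match app, or you get lines that begin with something other than # and eventually match app. Either way, you don't get lines that begin with #. The extra /dev/null argument ensures grep always sees more than one file operand, so it prints filenames even when the glob expands to a single file (much like -H).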
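
As a quick check of the explanation above, here is a minimal sketch that recreates the question's test.txt in a temporary directory and runs the pattern against it. It uses -H instead of the /dev/null trick; the variable names and the extra two-line input are only for illustration.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Recreate the sample file from the question.
cat > test.txt <<'EOF'
# commented line, no match
# commented line, would match /app/
# again commented, would match app
non commented line, matchin /app
non commented line, no match
EOF

# -H prints the filename; -E is needed for the optional group ( ... )?
result=$(grep -HE '^([^#].*)?app' test.txt)
printf '%s\n' "$result"
# prints: test.txt:non commented line, matchin /app

# The optional group is what lets a line that starts with app match,
# while a line starting with # is still rejected.
edge=$(printf '%s\n' 'app at line start' '#app' | grep -E '^([^#].*)?app')
printf '%s\n' "$edge"
# prints: app at line start
```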

Regarding the colors: that depends on the grep and the regexp, but a standard GNU grep should highlight the whole match, from the start of the line up to the last app. If you would like it more specific, you can run info grep to see which environment variables GNU grep consults when highlighting and configure them appropriately, or, failing a satisfactory result in that vein, highlight it yourself.
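
Both options might look roughly like the sketch below. It assumes GNU grep, whose GREP_COLORS variable accepts `ms=` (matched text) and `fn=` (filename) fields; the color codes are arbitrary picks, and the two-line test.txt is a cut-down stand-in for the question's file.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1
printf '%s\n' \
  '# again commented, would match app' \
  'non commented line, matchin /app' > test.txt

# Option 1: keep the single grep, but pick the highlight colors.
# The whole match (line start through app) is still what gets colored.
colored=$(GREP_COLORS='ms=01;32:fn=36' \
  grep -HE --color=always '^([^#].*)?app' test.txt)
printf '%s\n' "$colored"

# Option 2, "highlight it yourself": select lines without color first,
# then let a second grep color only the word app. The filename prefix
# survives the pipe as plain text, but it is no longer colored.
narrow=$(grep -HE '^([^#].*)?app' test.txt | grep --color=always 'app')
printf '%s\n' "$narrow"
```

The second option trades the colored filename for a tighter highlight, so which one is preferable depends on what you want to stand out.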

mikeserv
  • Many thanks @mikeserv and the comments - this works, just note that in my example, the entire line is highlighted, as this answer also warns about. Cheers! – sdaau Dec 11 '15 at 09:32