1

I'm looking for something pretty similar to this.

The logs look like this:

[09:44:22] [main] ERROR [url/location] - A ONE LINE ERROR
[09:44:22] [main] ERROR [url/location] - Another ERROR 
[09:44:22] [main] SOMETHING DIFFERENT
[09:44:22] [main] SOMETHING DIFFERENT AGAIN
[09:44:22] [main] WARN [url/location] - ANOTHER ONE LINE WARN

The entries follow one another with no empty lines between them, though occasionally there are indented lines when further information is available for a specific entry.

I want to be able to pull every line that includes ERROR (ideally with a script that can pull ERROR and/or FAIL, WARN, etc.) and display them according to a parameter. It would make sifting through logs for failures much easier.

Patremagne
  • You'll need to define what a "chunk" is - beyond the fact that it *isn't* delimited by blank lines – steeldriver Jul 14 '16 at 15:39
  • I couldn't find a way to make it so that what's shown in the block above is line after line instead of line, space, line. Each message has its own line with a timestamp at the beginning, if it's possible to sort by that. – Patremagne Jul 14 '16 at 15:44
  • You should be able to preserve text formatting by using code markdown (basically select it then press Ctrl-K) - see [How do I format my posts using Markdown or HTML?](http://unix.stackexchange.com/help/formatting) in the [Help Center](http://unix.stackexchange.com/help) – steeldriver Jul 14 '16 at 16:04
  • I've got something working with [awk -v FS='' '/ERROR/' file.txt] that seems to work, though only in my test file where I copied a few lines from the log into a new file. Is there a size limit where the command stops working? Minus the brackets, as I'm new to SE and don't know much of the forum syntax. – Patremagne Jul 14 '16 at 16:11
  • I'm not aware of any size limit. However I still don't understand what your input and desired output are. – steeldriver Jul 14 '16 at 16:32
  • I don't know how to make it more clear than I have already. I want to have every line in a log (starting with a timestamp and ending at the next timestamp) that has ERROR in it to be printed. – Patremagne Jul 14 '16 at 17:06
  • @Patremagne I don't see why the `awk` command you quoted wouldn't work if all you want is the matching lines. Setting `FS` is unnecessary, but shouldn't matter. Just printing lines containing a pattern is simple with any of the tools you mention, and perhaps the simplicity of it is what causes confusion. (And the word "chunk" and mentioning indentations). `awk '/ERROR/' in.txt`, `grep ERROR in.txt` and `sed -n '/ERROR/p' in.txt` should all print the lines containing "ERROR" anywhere on the line, though grep is made just for this. – ilkkachu Jul 14 '16 at 21:10

3 Answers

2

GNU grep is able to do this quite simply. From man grep:

Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either subexpression.

grep "ERROR\|FAIL\|WARN" /path/to/example.log

egrep (equivalent to grep -E) eliminates the need to escape the | symbols.

egrep "ERROR|FAIL|WARN" /path/to/example.log
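
Since the question asks for something driven by a parameter, here is a minimal sketch of how the same grep call could be wrapped in a shell function. The name siftlog and its argument order are made up for illustration; it assumes the levels are plain words like ERROR or WARN, which it joins with | and passes to grep -E.

```shell
#!/bin/sh
# siftlog FILE LEVEL [LEVEL...]
# Hypothetical helper (not part of grep): prints every line of FILE
# that contains at least one of the given LEVEL words.
siftlog() {
    if [ "$#" -lt 2 ]; then
        echo "usage: siftlog FILE LEVEL [LEVEL...]" >&2
        return 2
    fi
    file=$1
    shift
    # Join the remaining arguments with "|" to build an ERE alternation,
    # e.g. ERROR FAIL WARN -> ERROR|FAIL|WARN
    pattern=$(IFS='|'; printf '%s' "$*")
    grep -E -- "$pattern" "$file"
}

# Example call (path taken from the answer above):
#   siftlog /path/to/example.log ERROR FAIL WARN
```

Like plain grep, this only prints the matching lines themselves, not any indented lines that follow them.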
Timothy Martin
  • Why is `|` called "infix operator", when it means "OR" here? Also, your answer will not output following lines, if they start with whitespace. – Alex Stragies Jul 14 '16 at 17:24
  • That is a good question for which I do not know the answer. I believe the pertinent information is contained in the second half of that sentence: `the resulting regular expression matches any string matching either subexpression`. As for a `leading whitespace`, it works on any line containing `ERROR, FAIL, or WARN` on my system. I have edited the answer to include `GNU`. – Timothy Martin Jul 14 '16 at 17:38
  • Ok, thanks for looking that up. I often use the word "infix" in the context "Reverse incremental infix search" (CTRL-R), but I had never seen it written in the context you quoted. 2) I meant lines that start with whitespace but follow a line containing the keyword, while not themselves containing the keyword. – Alex Stragies Jul 14 '16 at 17:44
  • This is the correct answer. +1. (I had misunderstood OP's desired input format.) – Alex Stragies Jul 14 '16 at 18:25
  • @AlexStragies I've sort of found the issue that has me saying some things aren't working. The actual log file that I want to pull errors from won't work with any of the commands in this thread, but the same exact text (copied/pasted word for word from the original) pasted into a fresh text file works. I have both read/write permissions, so I'm not sure why the original log or even duplicates of it won't obey. – Patremagne Jul 14 '16 at 22:24
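
The thread never establishes why the original log behaves differently from pasted text. One possible cause, and it is only a guess, is that the log is stored in an encoding such as UTF-16 (or has CRLF line endings), while copy/paste produces plain UTF-8; running file on the log usually shows this. Under that assumption, a sketch like the following converts the data before searching. The function name grep_converted and its default encoding are hypothetical.

```shell
#!/bin/sh
# grep_converted PATTERN FILE [ENCODING]
# Hypothetical helper. ASSUMPTION: FILE is stored as UTF-16 (or whatever
# ENCODING is given) and may use CRLF line endings. Nothing in the thread
# confirms this for the log in question.
grep_converted() {
    pattern=$1
    file=$2
    encoding=${3:-UTF-16}
    # Convert to UTF-8, drop carriage returns, then search as usual.
    iconv -f "$encoding" -t UTF-8 < "$file" | tr -d '\r' | grep -E -- "$pattern"
}

# Example call:
#   grep_converted 'ERROR|FAIL|WARN' /path/to/example.log
```

If the plain grep starts working on the converted output, the encoding was the culprit; if not, the cause lies elsewhere.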
1

I suppose your log file looks like this?

example.log:

[09:44:22] [main] ERROR [url/location] - A ONE LINE ERROR
[09:44:22] [main] ERROR [url/location] - A MULTI LINE ERROR 
    with whitepace indention
[09:44:22] [main] ERROR [url/location] - A MULTI LINE ERROR 
       with tab indention
[09:44:22] [main] SOMETHING DIFFERENT
[09:44:22] [main] SOMETHING DIFFERENT
       with tab indention
[09:44:22] [main] WARN [url/location] - ANOTHER ONE LINE WARN

Admittedly not a one-liner and in perl, but it should do the job:

logsifter.pl:

#!/usr/bin/perl
use warnings;
use strict;

my $buffer="";

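while(my $line= <>){
  chomp $line;
  if($line=~/ERROR|INFO|WARN/){
    # A matching line starts a new record; print the previous one first.
    print "$buffer\n" if $buffer;
    $buffer = $line;
  }
  elsif($line=~/^\s+(.*)$/){
    # Indented line: append it to the current record, if there is one.
    $buffer .= $1 if $buffer;
  }
  else{
    # Any other line ends the current record.
    if($buffer){
      print "$buffer\n";
      $buffer ="";
    }
  }
}

# Print the last record, if any (avoids a trailing empty line).
print "$buffer\n" if $buffer;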

Call it like this:

perl logsifter.pl < example.log
 [09:44:22] [main] ERROR [url/location] - A ONE LINE ERROR
 [09:44:22] [main] ERROR [url/location] - A MULTI LINE ERROR with whitepace indention
 [09:44:22] [main] ERROR [url/location] - A MULTI LINE ERROR with tab indention
 [09:44:22] [main] WARN [url/location] - ANOTHER ONE LINE WARN
murphy
  • I tried this and the output was just every line in the text file, regardless of whether or not it included ERROR. It's entirely likely I did something wrong, as I mentioned I'm new to all this. – Patremagne Jul 14 '16 at 17:14
  • @Patremagne: Murphy and I both use basically the same algorithm. I haven't run his version, but at first glance it looks correct. You may have missed that his version filters for ERROR|INFO|WARN, plus trailing lines. – Alex Stragies Jul 14 '16 at 17:18
  • @AlexStragies The only small difference is that I remove the line breaks in case of an indent. This way everything is on one line and easier to process with grep if necessary. – murphy Jul 14 '16 at 17:21
  • @AlexStragies Ahh I missed that INFO was in the perl script, so I removed it and it worked on my test text file that only includes 10 or so lines from the actual log. Unfortunately when I try it on the log itself, there's no output. – Patremagne Jul 14 '16 at 17:26
  • @Patremagne I tested my solution with my provided example.log. It puts out the data as shown above. – murphy Jul 14 '16 at 17:31
  • @murphy Yes, it works for me as well with my test log of 10 lines, but the log I actually want to probe through has 2500 lines and has no output when I run it through your solution. – Patremagne Jul 14 '16 at 17:38
  • @Patremagne if you upload your file here https://gist.github.com/ and post the url, I will look into it. – murphy Jul 14 '16 at 17:39
  • @murphy I'd rather not post it publicly, but I can send it to you on some other platform. – Patremagne Jul 14 '16 at 20:30
1

Now that your data format has been established, the answer becomes a lot simpler: grep was built for this.

Use it as grep '<PATTERN>' <dataFile>

where <PATTERN> is either a single search word (SearchWORD1) or several alternatives joined with \| (SearchW1\|SearchW2).

The answer below was written when @murphy and I still had wrong assumptions about the data format:

Here is a one-line awk program that only searches for ERROR:

awk '/ERROR/{a=1;print;next} /^ / || /^\t/ {if (a) print;next} !/ERROR/ {a=0}'

You could turn this into a flexible shell function that takes the pattern as a parameter:

searchlog(){ awk -f <( echo "
/$1/{a=1;print;next}
/^ /||/^\t/{if (a) print;next}
! /$1/{a=0}
"); }

Run it either as LogData_generated_by_program | searchlog <PATTERN>, or searchlog <PATTERN> < File_containing_Log_Data.
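
A variant sketch of the same idea, in case process substitution is not available or the pattern contains characters such as / that would break the generated program: it passes the pattern to awk with -v instead of splicing it into the program text. This is an alternative take, not the function above. Note that awk patterns are extended regular expressions, so alternatives are written ERROR|WARN (no backslash), and awk processes backslash escapes in -v values.

```shell
#!/bin/sh
# searchlog PATTERN   (reads log data on stdin)
# Prints lines matching PATTERN plus the indented lines that follow them.
searchlog() {
    awk -v pat="$1" '
        $0 ~ pat { keep = 1; print; next }   # matching line: start a block
        /^[ \t]/ { if (keep) print; next }   # indented line: print if inside a block
                 { keep = 0 }                # any other line ends the block
    '
}

# Example call:
#   searchlog "ERROR|WARN" < File_containing_Log_Data
```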

For the example data format the other answerer "guessed", this results in:

$ searchlog ERROR < /tmp/exampleData
[09:44:22] [main] ERROR [url/location] - A ONE LINE ERROR
[09:44:22] [main] ERROR [url/location] - A MULTI LINE ERROR 
    with whitepace indention
[09:44:22] [main] ERROR [url/location] - A MULTI LINE ERROR 
       with tab indention
Alex Stragies