
I had asked a question a while ago about how to pull specific info from a log file, and got some useful answers, but I don't quite have what I need yet. What I have so far:

#!/bin/bash


mkdir result

cd $1
ls | while read -r file;
do
egrep "ERROR|FAIL|WARN" > ../errors-$file
done
cd ..

mv errors-* result

Here's what I want in total: the script scans through a directory of logs, grabs each line of text (some are multi-line) that includes ERROR|FAIL|WARN, and outputs them in a directory called "result" with each individual file being named based on their source file.

It works when my target log directory only has one text file in it, but it doesn't work when my target directory contains subdirectories with log files at different levels. I know it's because of the `../errors-$file` bit in the script, but I'm not sure how to change it.

Any help would be great, and I apologize in advance if it's poorly phrased.

Patremagne
  • You use the argument `$1`, what is the value of that argument when you normally run that script? – Centimane Aug 12 '16 at 18:05
  • Also, you say it `doesn't work` but what does that mean? What actually happens when you run your script? – Centimane Aug 12 '16 at 18:08
  • 1
    The egrep line should be `egrep "ERROR|FAIL|WARN" $file > ../errors-$file` – Rui F Ribeiro Aug 12 '16 at 18:17
  • @Dave I follow the ./logsifter.sh with the name of the folder I want to scan through. I'm still new to UNIX so I'm not entirely sure if that's what you're asking for though. As for when it doesn't work - the only output in the "result" folder is an empty file that takes its name from the first item inside the target directory, and which is a subdirectory. – Patremagne Aug 12 '16 at 18:22
  • @RuiFRibeiro What does that change? – Patremagne Aug 12 '16 at 18:24

1 Answer


I'll poke at your loop itself, and try to walk through what it's actually doing, and then look at how you might achieve what you want.

How you initiate your loop is overly complex: `ls | while read -r file;`. You're piping the results of `ls` into your `while read`. That isn't a big problem and it still works, but I would recommend a `for` loop instead, more like:

for file in ./* ;

This will loop over the same files, but is a little more straightforward. Also, `ls` can return output that is more than just the filenames (ever notice that `ls` uses different colors for directories and executables?), and that extra decoration can cause other commands grief.
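A minimal sketch of that loop, using made-up log names created just for the demo:

```shell
# Create a couple of sample logs to loop over (hypothetical names)
mkdir -p demo-logs
printf 'ok\nERROR: disk full\n' > demo-logs/app.log
printf 'WARN: low memory\n'     > demo-logs/web.log

cd demo-logs
for file in ./* ; do
    # "$file" is quoted so names containing spaces survive word splitting
    echo "processing: $file"
done
cd ..
# prints: processing: ./app.log
#         processing: ./web.log
```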

Your command: `egrep "ERROR|FAIL|WARN" > ../errors-$file` never names a file to search, so `egrep` reads from standard input (here, the same pipe that feeds your `while read`), and redirects its output to `../errors-$file`.

The syntax for egrep is:

egrep [search pattern] [location]
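For example, with a throwaway log file (the name and contents are hypothetical):

```shell
# Build a sample log, then search it for the three keywords
printf 'ok\nERROR: disk full\nFAIL: timeout\n' > sample.log
egrep "ERROR|FAIL|WARN" sample.log
# prints: ERROR: disk full
#         FAIL: timeout
```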

So you're missing the location for egrep to look in. You can fix this by adding $file after your pattern, so it becomes:

#!/bin/bash

mkdir -p result

cd "$1" || exit 1
find . -type f | while read -r file
do
egrep "ERROR|FAIL|WARN" "$file" > "../result/errors-$(basename "$file")"
echo "-------------------------------" >> "../result/errors-$(basename "$file")"
egrep -B 1 "denied" "$file" >> "../result/errors-$(basename "$file")"
done
Centimane
  • Thanks, this is super helpful! Does the fact that there are subfolders that also include logs within the scanned directory affect the code you supplied? – Patremagne Aug 15 '16 at 13:12
  • Also, you mentioned -h prevents the filename being included in the output. I'd actually like it in the output, as the script will be scanning several log files and outputting the ERROR|FAIL|WARNs into separate files that are named based on their original. Can I just remove the -h? – Patremagne Aug 15 '16 at 13:18
  • @Patremagne `find` will look in subfolders, but `{}` would expand to include the directory name, so it may be better to stick to your for loop (else the `find -exec` becomes more complicated to include a subshell). As for the `-h`, even in your prior example the output from grep should only be the matching line. The `-h` option ensures the output is only the matching line. If you use `-H` then the output will be: [filename]:[matching line] in your output files. But if the files you're putting the output into are named based on the original you may not need the contents to mention the original. – Centimane Aug 15 '16 at 15:53
  • I see. I tried the script on a directory that had a text file as well as a folder (with text files in it) beneath it and only got a single "errors {}" file for output, when I would've expected (or wanted) one for each text file. Any ideas? – Patremagne Aug 16 '16 at 17:21
  • @Patremagne use your for loop instead, I'm going to edit out the `find exec` suggestion – Centimane Aug 16 '16 at 17:31
  • I'm having trouble figuring out what the resulting script is supposed to look like with the for loop. I attempted to add what I have in a comment but it comes out on one line so I got rid of it. – Patremagne Aug 16 '16 at 17:49
  • @Patremagne I put a full copy of the script in my answer, with a little cleanup. – Centimane Aug 16 '16 at 17:56
  • That's perfect, can't thank you enough for your help. Is it possible to, if the script picks up say, "denied", have it also grab and output the line just above it? – Patremagne Aug 16 '16 at 18:15
  • `egrep "denied" -B 1 $file` would print the matching line and 1 line before it (that what the `-B 1` does). I'm not sure that you could work it into your current `egrep`, but you could just perform two `egrep`s on the file. Check out `man egrep` to get a sense of all the options. – Centimane Aug 16 '16 at 18:46
  • Would performing 2 `egrep`s on the file still allow them to be printed in output in the proper order in the same file? I imagine it's not as simple as moving `> ../result/errors-$(basename $file)` (or transform it into something that can be moved) down a few lines and adding the second egrep before it? – Patremagne Aug 16 '16 at 19:09
  • @Patremagne they can still go in the same output file, I edited my answer again. `>` redirects output overwriting the file `>>` adds output onto the end of the file. Using `>` first will make sure the file is cleared out, then `echo` a divider to indicate the different `egrep`s then redirect output onto the end of the file. – Centimane Aug 16 '16 at 19:44
  • Sorry to keep bugging you, but I've got one final question. Is it possible to keep the folder structure in the output folder? For example, if the script found a file at topfolder > subfolder > log.txt, it'd be great if the output would create and keep the name of the subfolder above the log to make it more organized. – Patremagne Aug 17 '16 at 15:31
  • You can use the `dirname` command. Where `basename $string` leaves only the file name, `dirname $string` leaves only the directory part. So instead of `> ../result/error-$(basename $file)` you could use `> ../result/$(dirname $file)/error-$(basename $file)`; you'd just have to make sure you create the directory first with `mkdir -p ../result/$(dirname $file)` – Centimane Aug 17 '16 at 16:09
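Putting that last suggestion together, here is a sketch of the structure-preserving version (directory and file names are made up for the demo):

```shell
# Hypothetical layout: topfolder/subfolder/log.txt
mkdir -p topfolder/subfolder result
printf 'ok\nERROR: boom\n' > topfolder/subfolder/log.txt

cd topfolder
find . -type f | while read -r file
do
    # Recreate the source directory under result/ before writing
    mkdir -p "../result/$(dirname "$file")"
    egrep "ERROR|FAIL|WARN" "$file" > "../result/$(dirname "$file")/errors-$(basename "$file")"
done
cd ..
# result/subfolder/errors-log.txt now holds the ERROR line
```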