
I wrote a shell script to check which ".err" text files are empty. Some files have a specific repeated phrase, like this example file fake_error.err (blank lines intentional):


WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

WARNING: reaching max number of iterations

I want to remove these files as well, in addition to the empty ones. I wrote the following script to do so:

#!/bin/bash

for file in *error.err; do
    if [ ! -s $file ]
    then
        echo "$file is empty"
        rm $file
    else
        # Get the unique, non-blank lines in the file, sorted and ignoring blank space
        lines=$(grep -v "^$" "$file" | sort -bu "$file")
        echo $lines

        EXPECTED="WARNING: reaching max number of iterations"
        echo $EXPECTED

        if [ "$lines" = "$EXPECTED" ]
        then
            # Remove the file that only has iteration warnings
            echo "Found reached max iterations!"
            rm $file
        fi

    fi
done

However, the output of this script when run on the fake_error.err file is:

WARNING: reaching max number of iterations
WARNING: reaching max number of iterations

from the two echo statements in the loop, but the file itself is not deleted and the string "Found reached max iterations!" is not printed. I think the issue is in if [ "$lines" = "$EXPECTED" ]. I've tried double brackets [[ ]] and ==, but neither worked. I can't see any difference between the two printed lines.

Why are the two variables not equal?

m13op22
  • see if there are any trailing blanks? `sort -bu` won't delete them. e.g. `echo ":$lines:"` or something like that – ilkkachu Mar 29 '23 at 20:00
  • @ilkkachu that's it. The output is `: WARNING: reaching max number of iterations: WARNING: reaching max number of iterations`. How can I remove the blanks from this? – m13op22 Mar 29 '23 at 20:24
  • Since you're already using grep, I wonder if it would not be simpler to use the exit status of `grep -qv -e '^$' -e 'WARNING: reaching max number of iterations'` directly? – steeldriver Mar 29 '23 at 20:30
  • Ooh, that's an idea, @steeldriver! Something like `if ! (grep -qv -e '^$' -e 'WARNING: reaching max number of iterations' $file)` since it would return exit status 0 for matches found? – m13op22 Mar 29 '23 at 20:48
  • @m13op22 yes it "succeeds" if it finds any line that is neither empty nor the ignorable phrase - don't think you need the parentheses though – steeldriver Mar 29 '23 at 20:54
  • Unless you have files containing ONLY blank lines, you shouldn't need to grep for them - grepping for just "WARNING: reaching maximum...." should be enough. And if you do need to grep for only blank lines, that should be a separate command, perhaps comparing the outputs from `wc -l "$file"` and `grep -c '^[[:blank:]]*$' "$file"`. BTW, you can't combine an inverted match `-v` with a normal match in the same grep command, the `-v` applies to all `-e` options in that command. Use awk or perl if you need to do boolean logic with regex matches like `! /^$/ && /WARNING: reaching.../`. – cas Mar 30 '23 at 01:54
  • also, **[quote](https://unix.stackexchange.com/q/131766) your [variables](https://unix.stackexchange.com/q/4899)** – cas Mar 30 '23 at 01:56
  • @cas good point about the quotes, thanks for making sure I'm not being sloppy! Some files are only blank lines, so it's better to use a separate command that uses `awk` instead? – m13op22 Mar 30 '23 at 15:22
  • I don't think so. As far as I can tell from your script above, you want to delete empty files (your -s test works well for that) AND files that contain "WARNING: reaching maximum....". Grepping for blank lines isn't needed for that. If you **also** want to delete files containing ONLY blank lines then yes, compare the total line count of each file against the count of empty lines in that file - if equal, then delete it. You'd only need to use awk or perl if you needed more than a simple regex match. – cas Mar 30 '23 at 16:07
  • BTW, my awk example was bogus because the `! /^$/` test is redundant, it's always going to be true if the line contains the warning. I just wanted a quick example and didn't think that one through. better would be if you wanted to check if a file contained both foo and bar on the same line, then you'd use `awk '/foo/ && /bar/ { ... }'` – cas Mar 30 '23 at 16:09
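
Pulling the comments together, here is a minimal sketch of what seems to be going on and one possible way around it. In the original pipeline, sort is given "$file" as an argument, so it reads the file itself and ignores what grep sends down the pipe. The blank lines therefore survive, and sort -u leaves one empty line in front of the warning, which an unquoted echo hides. The sketch prints both variants in brackets to make that visible, then uses the grep exit status suggested above instead of comparing strings. The function name only_warnings and the temporary demo files are made up for illustration, and the check assumes the warning lines carry no leading or trailing whitespace.

```shell
#!/bin/bash
# Sketch only: only_warnings and the demo files are hypothetical names.

EXPECTED="WARNING: reaching max number of iterations"

# Succeeds when every line of the file is either empty or exactly the
# expected warning. A zero-length file also passes, since grep finds
# no "other" line in it.
only_warnings() {
    [ -f "$1" ] || return 1
    # -v inverts both -e patterns, so grep succeeds only if some line
    # matches neither of them; the leading ! flips that result.
    ! grep -qv -e '^$' -e "^${EXPECTED}\$" "$1"
}

# Build a scratch directory with two sample files.
demo_dir=$(mktemp -d)
printf '\n\n%s\n\n%s\n' "$EXPECTED" "$EXPECTED" > "$demo_dir/fake_error.err"
printf '%s\nsome real error\n' "$EXPECTED" > "$demo_dir/real_error.err"

# Original pipeline: sort reads the named file, not grep's output.
lines_orig=$(grep -v '^$' "$demo_dir/fake_error.err" | sort -bu "$demo_dir/fake_error.err")
# Without the file argument, sort reads the filtered lines from the pipe.
lines_fixed=$(grep -v '^$' "$demo_dir/fake_error.err" | sort -u)

# Quoted and bracketed, so a leading newline or stray blank shows up.
printf 'original: [%s]\n' "$lines_orig"
printf 'fixed:    [%s]\n' "$lines_fixed"

# Same loop shape as the question, with quoted variables.
for file in "$demo_dir"/*error.err; do
    if only_warnings "$file"; then
        echo "$file has only blank lines or iteration warnings"
        rm -- "$file"
    fi
done
```

With this approach the separate -s test becomes optional, and the loop removes fake_error.err while leaving real_error.err in place. If the real files can have whitespace around the warning, the two patterns would need to allow for it, for example with [[:blank:]]*.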
