1

When I grep a string in a file, it gives me a match. However, when I do it in a bash script, it doesn't work.

$ cat files_android.txt
000d07dfe5016314c98b869c19c7f986b5db57db49ac76d16a5d2f5861a35072
001c738f74acbf19e3f31c09f6017de99bf3009a0b6f889740da0302ad172472
0047423956b09dd56a8b9c917d8f3028ad32ee01efdd501afa11b0777f4c184f
$ grep  000d07dfe5016314c98b869c19c7f986b5db57db49ac76d16a5d2f5861a35072 android.txt
G:\000d07dfe5016314c98b869c19c7f986b5db57db49ac76d16a5d2f5861a35072 - a variant of Android/Gappusin.C trojan

However, with the following script that read the string from the first argument and grep that in the second argument, no match is reported.

$ cat script.sh
#!/bin/bash
FILE1=$1
FILE2=$2
counter=0
for line in $(cat $FILE1); do
  echo "$line"
  if grep $line $FILE2; then
    counter=$((counter+1))
    echo "$counter"
  fi
done
echo "counter=$counter"
$ ./script.sh files_android.txt android.txt
000d07dfe5016314c98b869c19c7f986b5db57db49ac76d16a5d2f5861a35072
001c738f74acbf19e3f31c09f6017de99bf3009a0b6f889740da0302ad172472
0047423956b09dd56a8b9c917d8f3028ad32ee01efdd501afa11b0777f4c184f
counter=0

What is wrong with that?

[UPDATE]

Thanks to Stephen Harris, the root was that the files were saved in dos format. So dos2unix conversion fixes the issue.

mahmood
  • 1,141
  • 6
  • 28
  • 49
  • 3
    Are you using cygwin or Windows Linux? If so, ensure the file is in Unix format and not DOS format, because you might have hidden `^M` characters at the end of the `files_android.txt` – Stephen Harris Dec 27 '18 at 20:33
  • @StephenHarris that's most certainly the case. The only way I could reproduce OP's problem was by adding `\r` at the end of each line of `files_android.txt`. –  Dec 27 '18 at 20:35
  • In `vim`, I see `"android.txt" [dos] 1268L, 171482C ` and that `dos` also exists in `files_android.txt`. Is that the case? – mahmood Dec 27 '18 at 20:36
  • Yes `[dos]` means it's in DOS format, so has the extra `^M` character. – Stephen Harris Dec 27 '18 at 20:36
  • But `vim` doesn't show any `^M` with blue/cyan color. – mahmood Dec 27 '18 at 20:37
  • See: https://unix.stackexchange.com/q/32001/237982 – jesse_b Dec 27 '18 at 20:37
  • Please try and write this up as an answer, so other people searching will find useful results (these comments may get lost). – Stephen Harris Dec 27 '18 at 20:43
  • [Don't read lines of a file with a `for` loop](http://mywiki.wooledge.org/DontReadLinesWithFor) – glenn jackman Dec 27 '18 at 21:21
  • 1
    If you just want to count the lines in `android.txt` that match the entries in `files_android.txt`, I'd suggest `grep -Fcf files_android.txt android.txt` – steeldriver Dec 27 '18 at 21:44
  • @mahmood you might be interested in combining the `-F` and `-f` options for `grep` to make your script a one-liner. – peterph Dec 27 '18 at 21:51
  • @steeldriver: I think that counts all instances however my script counts only one match which is what I want. To total number of lines in files_android.txt is 1172 and my script returns 974 while `-Fcf` returns 1256. – mahmood Dec 28 '18 at 09:04

1 Answers1

2

Not sure exactly what is wrong with your code but it's possible that something is being expanded in an undesirable way due to unquoted variables.

I have made the following improvements:

#!/bin/bash
FILE1=$1
FILE2=$2
counter=0

while IFS= read -r line; do
    printf '%s\n' "$line"
    if grep -q "$line" "$FILE2"; then
        printf '%d\n' "$((counter++))"
    fi
done <"$FILE1"
printf 'counter=%d\n' "$counter"

  1. Using bash to iterate through a file isn't ideal but if it is to be used, you should use a while read loop rather than a for loop.
  2. grep -q is being used to suppress any output grep may produce
  3. printf '%d\n' "$((counter++))" is being used to save a line
  4. All variables have been quoted.
  5. printf rather than echo is technically more portable but mostly a matter of preference.
jesse_b
  • 35,934
  • 12
  • 91
  • 140
  • The problem was with dos/unix file saves. Using `dos2unix` conversion program, my script as well as your script work fine. Thank you – mahmood Dec 27 '18 at 20:41
  • 1
    `((counter++))` is not equivalent to `counter=$((counter+1))`. The former would cause the shell to exit under `set -e` if the result was non-zero, whereas the latter would not. The return codes are different. – Kusalananda Dec 27 '18 at 21:04
  • 1
    To read the lines **exactly** (i.e. with leading/trailing whitespace intact), use `IFS= read -r line` – glenn jackman Dec 27 '18 at 21:22
  • @mahmood Does this answer work with the dos lines? – Xen2050 Dec 28 '18 at 02:20
  • @Xen2050: No it doesn't work with dos lines. – mahmood Dec 28 '18 at 09:05
  • I wonder why... the $line you're grep'ing for must have a ^M at the end? So a modified $line, or grep command, might work... – Xen2050 Dec 28 '18 at 21:47
  • @Xen2050: Likely the loop wouldn't iterate over the file properly because there are no unix line endings. – jesse_b Dec 28 '18 at 22:30
  • @Jesse_b I tried duplicating the problem files on linux, using `unix2dos`, and I think I've got a solution... just having problems typing the special dos `^M` character in a bash script without using a hex editor (it's `0d`, followed by a LF `0a`, I think)... any ideas to get bash to recognize it? Maybe something like `\0x0d`? -- I think I found it `\r`, carriage return (in man bash). I can't answer anymore, but using `read -r -d \r line...` should work. (I don't think this Q should have been marked duplicate really, unless they *wanted* to modify the files & not keep as-is) – Xen2050 Dec 28 '18 at 22:42
  • Actually `read -r -d \r` didn't work, but `tr -d "\r"` did, I edited that in instead – Xen2050 Dec 28 '18 at 23:34