
I have more than 30 different text files, and each one contains the same word repeated a different number of times. For example, in text1 "esr" is repeated 12 times and in text2 "esr" is repeated 21 times.

Is it possible to output the number of times the word is repeated in each file, separately, with one command?

αғsнιη

5 Answers


With a grep + wc pipeline:

for f in *.txt; do echo -n "$f "; grep -wo 'esr' "$f" | wc -l; done

grep options:

  • -w (--word-regexp) - match only whole (separate) words

  • -o (--only-matching) - print each match on its own line

wc -l then counts the lines of grep's output, i.e. the number of matched words in each file.
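As a quick sanity check, the loop can be tried on a couple of generated files (the temp directory and file contents here are just for illustration):

```shell
# create two throwaway sample files
tmp=$(mktemp -d)
printf 'esr foo esr\nbar esr\n' > "$tmp/text1.txt"   # "esr" appears 3 times
printf 'esr\n' > "$tmp/text2.txt"                    # "esr" appears 1 time

# same pipeline as above: one "filename count" pair per file
for f in "$tmp"/*.txt; do
    printf '%s %d\n' "$f" "$(grep -wo 'esr' "$f" | wc -l)"
done

rm -r "$tmp"
```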
RomanPerekhrest
strings ./*.txt|tr " " "\n"|sort|uniq -c
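Note that this pipeline counts every word across all the files combined, and that strings by default only prints printable sequences of 4 or more characters, so very short lines can be dropped. If you only care about one specific word in one file, a plain tr + grep sketch avoids both issues ("esr" and text1.txt are just stand-ins from the question):

```shell
# put every whitespace-separated word on its own line,
# then count lines consisting of exactly "esr"
tr -s '[:space:]' '\n' < text1.txt | grep -cx 'esr'
```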
pawel7318

Use grep to find all instances, then count unique lines using uniq -c.

grep "word" * | sort | uniq -c

If you want a count of matching lines per input file, use grep -c:

grep -c "word" * 
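Keep in mind that grep -c counts matching lines, not occurrences: a line containing the word twice still counts once. A quick comparison (sample file invented for illustration):

```shell
tmp=$(mktemp -d)
printf 'esr esr\nesr\n' > "$tmp/sample.txt"

grep -c 'esr' "$tmp/sample.txt"            # 2 (matching lines)
grep -o 'esr' "$tmp/sample.txt" | wc -l    # 3 (individual matches)

rm -r "$tmp"
```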
sebasth
for name in file*.txt; do
    printf 'Pattern occurs %d times in "%s"\n' "$(grep -wo 'pattern' "$name" | wc -l)" "$name"
done
Kusalananda

If you want to count every word in any number of files, you could use awk, e.g.:

awk 'BEGIN{RS="[[:space:]]+"}
     {counts[$0]++}
     END{for(word in counts){print word " - " counts[word]}}
     ' file1 file2 file...

This treats the input as if every word were on its own line (that is what the BEGIN{RS="[[:space:]]+"} part does; note that a regular expression as RS is a GNU awk extension), then increments a counter each time it sees a record. Removing the BEGIN portion would count whole lines instead.
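For example, on a small made-up file the program prints each distinct word with its count (this assumes an awk such as gawk or mawk that accepts a regex RS; the output order of awk's for-in loop is unspecified, hence the sort):

```shell
tmp=$(mktemp -d)
printf 'esr foo esr\n' > "$tmp/file1"

awk 'BEGIN{RS="[[:space:]]+"}
     {counts[$0]++}
     END{for(word in counts){print word " - " counts[word]}}
     ' "$tmp/file1" | sort
# esr - 2
# foo - 1

rm -r "$tmp"
```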

If you're only interested in one specific word, you could change the END block to something like:

END{print counts["esr"]}

which would print only the number of times "esr" shows up. Remember that this is case-sensitive.

To make the count case-insensitive, use counts[tolower($0)]++ or counts[toupper($0)]++.

You could also add checks to report a count each time processing moves from one input file to the next.
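One way to sketch that: FNR resets to 1 at the start of each new input file, so a per-file count for a single word (again with "esr" and the question's file names as stand-ins) could look like this, assuming every input file is non-empty:

```shell
awk 'FNR==1 && NR>1 { print prev, cnt; cnt=0 }   # new file started: report the previous one
     { for (i = 1; i <= NF; i++) if ($i == "esr") cnt++; prev = FILENAME }
     END { print prev, cnt }                     # report the last file
    ' text1.txt text2.txt
```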