
I have more than 30 different text files, and each one contains the same word repeated a different number of times. For example, in text1 "esr" is repeated 12 times and in text2 "esr" is repeated 21 times.

Is it possible to output the number of times the word is repeated in each file, separately, with one command?

αғsнιη

5 Answers


With a grep + wc pipeline:

for f in *.txt; do echo -n "$f "; grep -wo 'esr' "$f" | wc -l; done

grep options:

  • -w (--word-regexp) - match only whole (separate) words

  • -o (--only-matching) - print each match on its own line

wc -l then counts the lines of grep's output, i.e. the number of matched words in each file.
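As a quick sanity check, the loop can be tried on a couple of generated files (the temp directory and file contents here are just for illustration):

```shell
# create two throwaway sample files
tmp=$(mktemp -d)
printf 'esr foo esr\nbar esr\n' > "$tmp/text1.txt"   # "esr" appears 3 times
printf 'esr\n' > "$tmp/text2.txt"                    # "esr" appears 1 time

# same pipeline as above: one "filename count" pair per file
for f in "$tmp"/*.txt; do
    printf '%s %d\n' "$f" "$(grep -wo 'esr' "$f" | wc -l)"
done

rm -r "$tmp"
```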
RomanPerekhrest
strings ./*.txt|tr " " "\n"|sort|uniq -c
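Note that this pipeline counts every word across all the files combined, and that strings by default only prints printable sequences of 4 or more characters, so very short lines can be dropped. If you only care about one specific word in one file, a plain tr + grep sketch avoids both issues ("esr" and text1.txt are just stand-ins from the question):

```shell
# put every whitespace-separated word on its own line,
# then count lines consisting of exactly "esr"
tr -s '[:space:]' '\n' < text1.txt | grep -cx 'esr'
```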
pawel7318

Use grep to find all instances, then count unique lines using uniq -c.

grep "word" * | sort | uniq -c

If you want a count of matching lines per input file, use grep -c:

grep -c "word" * 
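Keep in mind that grep -c counts matching lines, not occurrences: a line containing the word twice still counts once. A quick comparison (sample file invented for illustration):

```shell
tmp=$(mktemp -d)
printf 'esr esr\nesr\n' > "$tmp/sample.txt"

grep -c 'esr' "$tmp/sample.txt"            # 2 (matching lines)
grep -o 'esr' "$tmp/sample.txt" | wc -l    # 3 (individual matches)

rm -r "$tmp"
```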
sebasth
for name in file*.txt; do
    printf 'Pattern occurs %d times in "%s"\n' "$(grep -wo 'pattern' "$name" | wc -l)" "$name"
done
Kusalananda

If you want to count every word in any number of files, you could use awk, e.g.:

awk 'BEGIN{RS="[[:space:]]+"}
     {counts[$0]++}
     END{for(word in counts){print word " - " counts[word]}}
     ' file1 file2 file...

This treats the input as if every word were on its own line (that is what the BEGIN{RS="[[:space:]]+"} part does; note that a regular expression as RS is a GNU awk extension), then increments a counter each time it sees a record. Removing the BEGIN portion would count whole lines instead.
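For example, on a small made-up file the program prints each distinct word with its count (this assumes an awk such as gawk or mawk that accepts a regex RS; the output order of awk's for-in loop is unspecified, hence the sort):

```shell
tmp=$(mktemp -d)
printf 'esr foo esr\n' > "$tmp/file1"

awk 'BEGIN{RS="[[:space:]]+"}
     {counts[$0]++}
     END{for(word in counts){print word " - " counts[word]}}
     ' "$tmp/file1" | sort
# esr - 2
# foo - 1

rm -r "$tmp"
```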

If you're only interested in one specific word, you could change the END block to something like:

END{print counts["esr"]}

which would print only the number of times "esr" shows up. Remember that this is case-sensitive.

To make the count case-insensitive, use counts[tolower($0)]++ or counts[toupper($0)]++.

You could also add checks to report a count each time processing moves from one input file to the next.
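One way to sketch that: FNR resets to 1 at the start of each new input file, so a per-file count for a single word (again with "esr" and the question's file names as stand-ins) could look like this, assuming every input file is non-empty:

```shell
awk 'FNR==1 && NR>1 { print prev, cnt; cnt=0 }   # new file started: report the previous one
     { for (i = 1; i <= NF; i++) if ($i == "esr") cnt++; prev = FILENAME }
     END { print prev, cnt }                     # report the last file
    ' text1.txt text2.txt
```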