2

I have a word and I want to find out what percentage of a file's words it accounts for (relative to the total number of words in the file). For example, if the word is "you" and it appears 2 times in a file with 8 words, the output should be 25%.

I tried: fgrep -ow

Thomas Dickey
mor

3 Answers

2

You can get the total number of words in your file as follows:

nw=`wc -w < /path/to/file`

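And the number of occurrences of a given word or pattern with the following (grep -o prints each match on its own line and -w matches whole words only, so wc -l counts matches rather than matching lines, which is what -c would report):

occurrences=`grep -ow <pattern> /path/to/file | wc -l`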

Then you can calculate the percentage and store the result in a variable:

result=`echo "scale=2; $occurrences*100/$nw" | bc`

To append the % sign you can, for example, do the following:

echo "$result%"
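
The steps above can be combined into one small function. This is only a sketch under a few assumptions: word_percent is a hypothetical name, the word is matched as a fixed string with grep -F (as the fgrep attempt in the question suggests), and the division is done with awk's printf instead of bc so that formatting and the empty-file case are handled in one place.

```shell
#!/bin/sh
# word_percent WORD FILE
# Hypothetical helper (name not from the answer): prints how often WORD
# occurs in FILE as a percentage of all whitespace-separated words.
word_percent() {
    word=$1
    file=$2
    # Total word count; tr strips the padding some wc implementations add.
    nw=$(wc -w < "$file" | tr -d ' ')
    # -o prints every match on its own line, -w matches whole words only,
    # -F treats WORD as a fixed string; wc -l then counts the matches.
    occurrences=$(grep -oFw -- "$word" "$file" | wc -l | tr -d ' ')
    # Assumption: awk replaces bc here, and an empty file reports 0%.
    awk -v o="$occurrences" -v n="$nw" \
        'BEGIN { if (n > 0) printf "%.2f%%\n", o * 100 / n; else print "0%" }'
}

# Sample from the question: "you" appears 2 times among 8 words.
sample=$(mktemp)
printf 'you and me and you and the dog\n' > "$sample"
word_percent you "$sample"   # prints 25.00%
```

Note that grep -w treats punctuation as a word boundary, so "you're" would also count as a match for "you", while wc -w counts it as a single word.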
lese
0

Use the same logic as shown at URL:

tr -s '[:space:]' '\n' < file.txt | awk '{if($0=="her"){nmw+=1}}END{print ((nmw*100)/NR)}'
jijinp
0

With awk:

awk -vw="word" 'BEGIN{RS="[^a-zA-Z]+"} $0==w{c++} END{printf "%.1f%%\n",c*100/NR}' file
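  • -vw="word" passes awk the variable w, set to "word". This is the word whose percentage you want.
  • BEGIN{RS="[^a-zA-Z]+"} sets the record separator to any run of non-letter characters, so every word is processed as a separate record. (A regular-expression RS is supported by gawk and mawk but is not guaranteed by POSIX.)
  • $0==w{c++} increments the counter c whenever the record equals the word.
  • END{printf "%.1f%%\n",c*100/NR} prints the percentage with one decimal place once the whole file has been processed; NR is the total number of records, i.e. words.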
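
As a rough illustration of the same idea, here is a variant that avoids the regular-expression RS and counts words by looping over fields instead, so it should also run on a plain POSIX awk. The function name word_percent_awk and the sample text are made up for this sketch; matching stays case-sensitive, as in the one-liner.

```shell
#!/bin/sh
# word_percent_awk WORD FILE
# Hypothetical helper (name not from the answer). Case-sensitive, like the
# one-liner above, but without relying on a regex record separator.
word_percent_awk() {
    awk -v w="$1" '
        {
            # Turn every run of non-letters into one space; changing $0
            # makes awk re-split the line into fields.
            gsub(/[^a-zA-Z]+/, " ")
            for (i = 1; i <= NF; i++) {
                total++
                if ($i == w) c++
            }
        }
        END {
            # Assumption: a file without any words reports 0.0%.
            if (total > 0) printf "%.1f%%\n", c * 100 / total
            else print "0.0%"
        }
    ' "$2"
}

# 8 words in total, 2 of them "you", with punctuation mixed in.
sample=$(mktemp)
printf 'you and me, and you;\nand the dog!\n' > "$sample"
word_percent_awk you "$sample"   # prints 25.0%
```

Because non-letters are stripped before counting, punctuation attached to a word does not prevent a match, but a word such as "you're" is split into "you" and "re".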
chaos