13

I have a CSV file like this:

abd,123,egypt,78
cde,456,england,45

How can I get the character count of only the 3rd column words?

I can't figure out how to get wc to do this.

user3116123

8 Answers

23
awk -F, '{sum+=length($3)}; END {print +sum}' file
Stéphane Chazelas
Hauke Laging
  • Amen; `awk` was designed for processing column-based files line by line. The problem is perfectly suited to the tool. – Ray May 07 '14 at 13:34
  • What is the purpose of `+` in `{print +sum}`? `{print sum}` works just as well. – spuder May 07 '14 at 16:18
  • @spuder, that's to print `0` instead of an empty line when the input file is empty. – Stéphane Chazelas May 07 '14 at 17:09
  • @Ray, on the other hand, the task can be achieved by having 3 basic utilities (each a fraction of the size of `awk`) cooperating on the task (working concurrently) in typical Unix spirit. You may notice that the cut+tr+wc one is 5 times as fast as this awk one, itself 5 times as fast as the `perl` one (at least on my system, in a UTF-8 locale, tried on a 100MB file). – Stéphane Chazelas May 08 '14 at 06:00
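To see what the unary plus in `print +sum` changes, here is a small sketch that runs both variants on empty input (`/dev/null` stands in for an empty file; the variable names `plain` and `forced` are only for illustration):

```shell
# On empty input the main block never runs, so sum stays uninitialized.
# An uninitialized awk variable prints as an empty string; the unary
# plus forces numeric context, so it prints as 0.
plain=$(awk -F, '{sum+=length($3)}; END {print sum}' /dev/null)
forced=$(awk -F, '{sum+=length($3)}; END {print +sum}' /dev/null)

printf 'print sum  -> "%s"\n' "$plain"
printf 'print +sum -> "%s"\n' "$forced"
```

On non-empty input both forms print the same number.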
23
cut -d, -f3 file | tr -d '\n' | wc -m

Remember that wc -c counts bytes, not characters (here ñ takes two bytes in UTF-8):

$ echo a,1,españa,2 | cut -d, -f3 | tr -d '\n' | wc -c
7
$ echo a,1,españa,2 | cut -d, -f3 | tr -d '\n' | wc -m
6

Stéphane Chazelas
5

A perl solution:

perl -Mopen=:locale -F, -anle '$sum += length($F[2]); END{print $sum}' file

or a shorter version:

perl -Mopen=:locale -F, -anle '$sum += length($F[2])}{print $sum' file
cuonglm
3

In Perl:

perl -F, -Mopen=:locale -lane 'print length $F[2]' your_file
Joseph R.
3
cut -d, -f3 <<\DATA | grep -o . | grep -c .
abd,123,egypt,78
cde,456,england,45
DATA

#OUTPUT
12
mikeserv
3

You could also use

awk -F, '{printf "%s", $3}' file | wc -m
Stéphane Chazelas
terdon
1

With your sample file like so:

$ cat sample.txt 
abd,123,egypt,78
cde,456,england,45

$ awk -F, '{print $3}' sample.txt | while IFS= read -r i; do printf '%s\n' "$i" | \
    tr -d '\n' | wc -m; done
5
7

Getting a per-line count out of wc is awkward, because wc has to be called once for each string in column 3. You have to loop through each row of the CSV, extract column 3, and then pass it to wc to get the character count.
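As a variation on the loop above, here is a minimal sketch that skips wc entirely: the shell splits each row on commas and `${#var}` reports the field length. It assumes plain CSV with no quoted or embedded commas; the function name `col3_lengths` and the variable names are made up for illustration.

```shell
# Print the length of the 3rd comma-separated field of each input line.
# IFS=, makes read split on commas; "rest" absorbs any remaining fields.
col3_lengths() {
    while IFS=, read -r c1 c2 c3 rest; do
        printf '%s\n' "${#c3}"
    done
}

printf 'abd,123,egypt,78\ncde,456,england,45\n' | col3_lengths
```

Note that `${#var}` counts characters in locale-aware shells such as bash, but some shells, such as dash, may count bytes instead, so the same wc -c versus wc -m caveat applies to non-ASCII data.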

slm
0

Using sed and awk

sed 's/.*,.*,\(.*\),.*/\1/g' file | awk -v FS="" '{print NF;}'

Example:

$ (echo abd,123,egypt,78; echo cde,456,england,45;) | sed 's/.*,.*,\(.*\),.*/\1/g' | awk -v FS="" '{print NF;}'
5
7

Using two awk invocations

awk -F, '{print $3}' file | awk -v FS="" '{print NF;}'

Example:

$ (echo abd,123,egypt,78; echo cde,456,england,45;) | awk -F, '{print $3}'| awk -v FS="" '{print NF;}'
5
7
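Both pipelines above rely on an empty FS to split a line into characters, which is a common awk extension rather than POSIX-specified behavior. As a hedged alternative sketch, a single awk call using `length()` gives the same per-line counts (the wrapper name `col3_len_awk` is hypothetical):

```shell
# Print the length of field 3 for every line; extra arguments are
# passed through to awk as input files (stdin is read if none given).
col3_len_awk() {
    awk -F, '{print length($3)}' "$@"
}

printf 'abd,123,egypt,78\ncde,456,england,45\n' | col3_len_awk
```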
Avinash Raj