How to find the most common name in passwd file

Question

My /etc/passwd has a list of users in a format that looks like this:

username:password:uid:gid:firstname.lastname, somenumber:/...

Goal: I want to see only the first names, and then sort them so that the most common name appears first, the second most common second, and so on.

I have seen some solutions for the second part, but they apply to working with a text file, not to reading from a map.

As for the first part, I really don't know how to approach it. I know that solutions exist, but I don't know how to apply them.

score 6 · Accepted Answer · answered Aug 09 '16 at 07:51

One way to do it:

cut -d: -f5 /etc/passwd | \
    sed 's/\..*//' | \
    sort -f | \
    uniq -ci | \
    sort -rn

Satō Katsura


Great answer, but I think the OP will need `uniq` without `-i`, since X and x should count as different in a name; only `sort` needs the ignore-case option, as you've used. Also, the `sed` command in your answer seems unnecessary; if there is a reason for it, please explain. – Aug 09 '16 at 08:07
@FarazX Re: `-i`: `John.Doe` should be the same as `john.doe`. Re: `sed`: from the OP: _I want to see only the first names_. – Satō Katsura Aug 09 '16 at 08:13
Oh you're right, sorry I didn't notice. So voila! Thanks for your explanation, and your great way of using cut ;) – Aug 09 '16 at 08:14
**cut** + **sed** is too much: `sed '/\n/{P;d};s/:/\n/4;s/\./\n/;D'` or `sed 's/[^.]*:\(\w\+\).*/\1/'` – Costas Aug 09 '16 at 08:23
@Costas Too much compared to what? For me, total time spent thinking about getting the 5th field portably with `sed` >> the time gained by not using `cut`. BTW, your second recipe assumes GNU `sed` (`\w`). – Satō Katsura Aug 09 '16 at 08:31
@SatoKatsura The above is an example. If you'd like, you can do the same as in your script with `sed 's/[^.]*://;s/\..*//'`. But my 1st example is a little bit quicker. And if you don't like `\w`, you are free to use `[:alnum:]` – Costas Aug 09 '16 at 08:38
@Costas `sed 's/[^.]*://;s/\..*//'` misses any names without a dot. The point of using `cut` is precisely to avoid going into this kind of detail, you know. – Satō Katsura Aug 09 '16 at 08:44
@SatoKatsura If you insist: `s/\([^:]*:\)\{4\}//;s/[:.].*//`. In any case, if you involve *sed* you can easily avoid *cut* – Costas Aug 09 '16 at 09:46
Can you explain briefly how this command works (specifically the `sed` and `cut` parts)? – asaf92 Aug 09 '16 at 13:29
And btw, in my system I don't have access to passwd. I have to type `ypcat passwd` to read it. – asaf92 Aug 09 '16 at 13:31
@PanthersFan92 `cut` extracts the 5th field, `sed` kills the `.lastname, somenumber` part out of it. You can, of course, do it like this: `ypcat passwd | cut -d: -f5 | ...`. – Satō Katsura Aug 09 '16 at 13:48
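
To make the stages described above easier to follow, here is a minimal sketch that runs them one at a time on a few made-up passwd-style lines. The sample entries and the function names `sample_passwd` and `first_name_counts` are invented for illustration. As a deliberate substitution, the sketch lowercases names with `tr` instead of using `uniq -i`, since `-i` is not among the `uniq` options that POSIX specifies.

```shell
#!/bin/sh
# Hypothetical sample data in the format from the question; not real accounts.
sample_passwd() {
    cat <<'EOF'
alice:x:1001:100:Alice.Smith, 1234:/home/alice:/bin/sh
asmith:x:1002:100:alice.Stone, 5678:/home/asmith:/bin/sh
bob:x:1003:100:Bob.Jones, 9012:/home/bob:/bin/sh
EOF
}

# Stage 1: keep only the 5th colon-separated field.
sample_passwd | cut -d: -f5

# Stage 2: also delete everything from the first dot onward.
sample_passwd | cut -d: -f5 | sed 's/\..*//'

# Full pipeline: lowercase, group, count, most frequent first.
first_name_counts() {
    cut -d: -f5 |
        sed 's/\..*//' |
        tr '[:upper:]' '[:lower:]' |
        sort |
        uniq -c |
        sort -rn
}
sample_passwd | first_name_counts
```

On a system where the accounts live in NIS, the same function could presumably be fed with `ypcat passwd | first_name_counts`, as suggested in the comment above.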

John1024 · Answer 2 · 2016-08-09T08:15:43.430

Using awk and sorting to have the most common name first:

awk -F: '{sub(/[.].*/, "", $5); a[$5]++} END{for (n in a)print a[n],n}' /etc/passwd | sort -nr

For a case-insensitive version:

awk -F: '{sub(/[. ,].*/, "", $5); a[tolower($5)]++} END{for (n in a)print a[n],n}' /etc/passwd | sort -nr

For those who prefer their commands spread over multiple lines:

awk -F: '
  {
    sub(/[.].*/, "", $5)
    a[$5]++
  }

  END{
    for (n in a)
      print a[n],n
  }
  ' /etc/passwd | sort -nr

How it works

-F:

This makes : the field separator.

sub(/[.].*/, "", $5)

This removes the first period and everything after it from field 5.

a[$5]++

This increments the counter for the name. The number of times each name has appeared is stored in the associative array a. For the case-insensitive version, this is replaced with a[tolower($5)]++.

END{for (n in a)print a[n],n}

This prints the count and name for every entry in array a.

sort -nr

This sorts the output numerically in descending order.
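
To see these steps work end to end, here is a small sketch that feeds made-up entries through the case-insensitive variant. The sample lines and the function names `sample_passwd` and `count_first_names` are invented for illustration, and a secondary sort key on the name is added only so that ties come out in a predictable order.

```shell
#!/bin/sh
# Hypothetical passwd-style lines; the names are invented for illustration.
sample_passwd() {
    printf '%s\n' \
        'u1:x:1001:100:John.Doe, 11:/home/u1:/bin/sh' \
        'u2:x:1002:100:john.Roe, 12:/home/u2:/bin/sh' \
        'u3:x:1003:100:Mary.Major, 13:/home/u3:/bin/sh' \
        'u4:x:1004:100:John.Smith, 14:/home/u4:/bin/sh'
}

count_first_names() {
    awk -F: '
      {
        sub(/[.].*/, "", $5)   # drop the first period and everything after it
        a[tolower($5)]++       # count names case-insensitively
      }
      END {
        for (n in a)
          print a[n], n        # count, then name
      }
    ' | sort -k1,1nr -k2,2     # highest count first; ties ordered by name
}

sample_passwd | count_first_names
```

With this sample input, `john` is counted three times and `mary` once, so the `john` line is printed first.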

FWIW if GNU awk 4+ you can set `PROCINFO["sorted_in"]="@val_num_desc"` and drop the separate `sort` — dave_thompson_085, Oct 27 '19 at 02:46
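
The gawk-only variant mentioned in this comment could look roughly like the sketch below. It assumes GNU awk 4.0 or later is installed under the name `gawk`; the function name `count_first_names_gawk` and the sample input are invented for illustration.

```shell
#!/bin/sh
# Requires GNU awk (gawk) 4.0+; PROCINFO["sorted_in"] is a gawk extension.
count_first_names_gawk() {
    gawk -F: '
      BEGIN {
        # Traverse arrays by value, compared numerically, largest first.
        PROCINFO["sorted_in"] = "@val_num_desc"
      }
      {
        sub(/[.].*/, "", $5)
        a[$5]++
      }
      END {
        for (n in a)
          print a[n], n
      }
    '
}

# Hypothetical input lines, for illustration only.
if command -v gawk >/dev/null 2>&1; then
    printf '%s\n' \
        'u1:x:1:1:Ann.Lee, 1:/home/u1:/bin/sh' \
        'u2:x:2:2:Raj.Rao, 2:/home/u2:/bin/sh' \
        'u3:x:3:3:Ann.Kim, 3:/home/u3:/bin/sh' |
        count_first_names_gawk
else
    echo 'gawk not found; skipping the example' >&2
fi
```

Because the `for (n in a)` loop already visits the entries in descending order of count, no trailing `sort` is needed. The trade-off is that the script no longer runs under other awk implementations.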
