
I have 4 files which look like this:

       file a
       >TCONS_00000867
       >TCONS_00001442
       >TCONS_00001447
       >TCONS_00001528
       >TCONS_00001529
       >TCONS_00001668
       >TCONS_00001921

       file b
       >TCONS_00001528
       >TCONS_00001529
       >TCONS_00001668
       >TCONS_00001921
       >TCONS_00001922
       >TCONS_00001924

       file c
       >TCONS_00001529
       >TCONS_00001668
       >TCONS_00001921
       >TCONS_00001922
       >TCONS_00001924
       >TCONS_00001956
       >TCONS_00002048

       file d
       >TCONS_00001922
       >TCONS_00001924
       >TCONS_00001956
       >TCONS_00002048

All files contain more than 2,000 lines and are sorted by the first column.

I want to find the lines common to all four files. I tried awk, grep, and comm, but couldn't get them to work.

thanasisp
user106326

2 Answers


Since the files are already sorted:

comm -12 a b |
  comm -12 - c |
  comm -12 - d

comm finds common lines between files. By default comm prints 3 TAB-separated columns:

  1. The lines unique to the first file,
  2. The lines unique to the second file,
  3. The lines common to both files.

With the -1, -2, and -3 options, we suppress the corresponding columns. So comm -12 a b reports only the lines common to a and b. A - can be used in place of a file name to mean standard input, which is what lets the invocations be chained.
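As a sanity check, here is the pipeline run on abbreviated versions of the four files (the file names match the question; the contents are shortened for the demo):

```shell
# Build four small sorted sample files (abbreviated from the question's data).
dir=$(mktemp -d)
cd "$dir"
printf '%s\n' '>TCONS_00001528' '>TCONS_00001529' '>TCONS_00001668' '>TCONS_00001921' > a
printf '%s\n' '>TCONS_00001528' '>TCONS_00001668' '>TCONS_00001921' '>TCONS_00001922' > b
printf '%s\n' '>TCONS_00001668' '>TCONS_00001921' '>TCONS_00001922' > c
printf '%s\n' '>TCONS_00001668' '>TCONS_00001921' '>TCONS_00001956' > d

# Chain comm: each stage keeps only the lines common to both of its inputs.
comm -12 a b | comm -12 - c | comm -12 - d
# prints:
# >TCONS_00001668
# >TCONS_00001921
```

Because comm requires sorted input, which the question already guarantees, this scales to any number of files by adding one more `comm -12 -` stage per extra file.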

Stéphane Chazelas
cat a b c d | sort | uniq -c | sed -n 's/^ *4 \(.*\)/\1/p'
Stephen Kitt
Piotr
  • Actually, save the `sed`, this is quite good for finding duplicate lines across many files: `cat` to `sort` to `uniq -c`. Somehow I didn't quite think of this, good answer! – smaslennikov May 21 '19 at 21:35
  • 1
    You can also use uniq command to only print duplicated lines: `uniq -cd` – mems Sep 30 '19 at 14:44
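One caveat worth noting: the count-based approach (including the `uniq -cd` variant) assumes each line appears at most once per file; a duplicate within a single file would inflate the count and produce a false positive. A minimal sketch with hypothetical IDs showing how the count of 4 selects lines present in all four files:

```shell
# Hypothetical sample files: >X is in all four, >Y in three, >Z in one.
dir=$(mktemp -d)
cd "$dir"
printf '%s\n' '>X' '>Y' > a
printf '%s\n' '>X' '>Y' > b
printf '%s\n' '>X' '>Z' > c
printf '%s\n' '>X' '>Y' > d

# Concatenate, sort, count repeats, and keep only lines counted exactly 4 times.
cat a b c d | sort | uniq -c | sed -n 's/^ *4 \(.*\)/\1/p'
# prints:
# >X
```

Unlike the comm chain, this does not require the inputs to be pre-sorted, since the pipeline sorts the concatenation itself.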