Find common elements in a given column from two files and output the column values from each file

Question

I have two files with tab-separated values that look like this:

file1:

file2:

I would like to find rows between files 1 and 2 where the string in column 1 is the same, then get the corresponding values. The desired output is a single file that looks like this:

B    3    3
C    1    2
A    1    9

Can this be done with a Unix one-liner?

jaume · Accepted Answer · 2012-10-25T12:59:12.227

11

GNU coreutils includes the command join that does exactly what you want if line sorting in the result is irrelevant:

join <(sort file1) <(sort file2)

A 1 9
B 3 3
C 1 2

If you want the tabs back, do:

join <(sort file1) <(sort file2) | tr ' ' '\t'

A   1   9
B   3   3
C   1   2

Or use the t option to join.

(<() aka process substitution, requires ksh93 (where the feature originated in), bash or zsh)

edited Oct 25 '12 at 12:59

answered Oct 25 '12 at 09:55

jaume

1,258
13
9

Now the output is sorted alphabetically, instead of according to file2, but it was not clear whether that would be a problem. – Bernhard Oct 25 '12 at 10:52

ire_and_curses · Answer 2 · 2012-10-25T09:56:54.800

2

ire@localhost: sort -k1 file2 | join file1 -
A 1 9
B 3 3
C 1 2

If you want tabs in the output, then do

sort -k1 file2 | join -t "        " file1 -

where you can construct the tab inside the quotes by pressing CTRL-V <tab>.

edited Oct 25 '12 at 09:56

answered Oct 25 '12 at 09:51

ire_and_curses

12,232
3
38
33

glenn jackman · Answer 3 · 2012-10-27T01:15:32.383

1

If you don't want the output sorted but instead have the same order as file2, awk is a good tool to use:

awk '
    NR == FNR {val[$1]=$2; next} 
    $1 in val {print $1, val[$1], $2}
' file1 file2

edited Oct 27 '12 at 01:15

answered Oct 25 '12 at 16:11

glenn jackman

84,176
15
116
168

Find common elements in a given column from two files and output the column values from each file

3 Answers3