
file1:

a, 1    
b, 5    
c, 2    
f, 7

file2:

a, 2    
f, 9    
g, 3

I want to join file 1 and file 2 based on column 1 and get file 3 as below.

file3:

a, 1, 2    
b, 5, -    
c, 2, -    
f, 7, 9    
g, -, 3

That is, merge the values for keys that match, and also keep the lines whose key appears in only one of the files.

Sundeep
chris
    what have you tried to solve this? see https://unix.stackexchange.com/questions/43417/join-two-files-with-matching-columns and https://unix.stackexchange.com/questions/122919/merge-2-files-based-on-all-values-of-the-first-column-of-the-first-file for reference... – Sundeep Oct 04 '17 at 05:02

1 Answer


Using join:

$ join -t, -a 1 -a 2 -o0,1.2,2.2 -e ' -' file1 file2
a, 1, 2
b, 5, -
c, 2, -
f, 7, 9
g, -, 3

The standard join utility will perform a relational JOIN operation on the two sorted input files.

The flags used here tell the utility to expect comma-delimited input (-t,) and to produce output for all entries in both files (-a 1 -a 2; otherwise it would only produce output for lines with a matching first field). We then ask for the join field along with the second column of both files to be output (-o0,1.2,2.2) and say that any missing field should be replaced by the string ␣- (space-dash, with -e ' -').
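To see what each group of flags contributes, the following sketch recreates the sample data from the question and runs join with fewer options. The scratch directory and the variable names (tmpdir, inner, outer) are assumptions made for illustration, and mktemp -d is assumed to be available (it is common, but not part of POSIX).

```shell
# Recreate the (already sorted) sample files in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'a, 1' 'b, 5' 'c, 2' 'f, 7' > "$tmpdir/file1"
printf '%s\n' 'a, 2' 'f, 9' 'g, 3' > "$tmpdir/file2"

# Without -a: only keys present in both files (a and f) are printed.
inner=$(join -t, "$tmpdir/file1" "$tmpdir/file2")

# With -a 1 -a 2 but without -o and -e: unpaired lines are printed too,
# but with no placeholder for the column that is missing.
outer=$(join -t, -a 1 -a 2 "$tmpdir/file1" "$tmpdir/file2")

printf 'inner join:\n%s\n\nouter join, no placeholders:\n%s\n' "$inner" "$outer"

# Clean up the scratch directory.
rm -r "$tmpdir"
```

In the second result, the lines for b, c and g each have only two fields, so nothing shows which file a value came from. That is the gap that -o0,1.2,2.2 together with -e ' -' fills.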

If the input is not sorted, it has to be pre-sorted. In shells that understand process substitution with <( ... ), this may be done through

join -t, -a 1 -a 2 -o0,1.2,2.2 -e ' -' <( sort file1 ) <( sort file2 )
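In a shell without process substitution, the same pre-sorting can be done through temporary files. This is only a sketch: the unsorted sample data, the scratch directory, and the .sorted file names are assumptions for illustration, and mktemp -d is assumed to be available.

```shell
# Scratch directory holding unsorted variants of the sample data.
tmpdir=$(mktemp -d)
printf '%s\n' 'f, 7' 'b, 5' 'a, 1' 'c, 2' > "$tmpdir/file1"
printf '%s\n' 'g, 3' 'a, 2' 'f, 9' > "$tmpdir/file2"

# Sort each file on the join field only, using the same delimiter as join.
sort -t, -k1,1 "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort -t, -k1,1 "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Same join invocation as above, now reading the sorted copies.
joined=$(join -t, -a 1 -a 2 -o0,1.2,2.2 -e ' -' \
    "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$joined"

# Clean up the scratch directory.
rm -r "$tmpdir"
```

Running sort and join in the same locale matters here, since join expects its input in the order that sort produces for that locale.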
Kusalananda
  • I didn't even know there is a `join` command, but it's part of the [GNU coreutils](https://en.wikipedia.org/wiki/List_of_GNU_Core_Utilities_commands) – thanks a lot! – dessert Oct 04 '17 at 06:55
  • @dessert It's a standard command that should be available on all Unices. – Kusalananda Oct 04 '17 at 06:57