
I have two files (file1.txt and file2.txt), and I am using the Unix `comm` command to compare them and find the lines that are unique to file1.txt.

Here are the lines in file1.txt:

OD1
EN2
OD3
OD4
OD5
EN6
EN7
EN8
EN9
OD10
OD11
OD12

Here are the lines in file2.txt:

EN1
EN2
EN3
OD4
OD5
EN6
EN7
EN8
EN9
OD10

I am using this command:

comm -23 file1.txt file2.txt

The actual result is:

OD1 
OD10
OD11
OD12
OD3

I was expecting:

OD1 
OD11
OD12
OD3

Can you please help me get the expected result?

Gilles 'SO- stop being evil'
AmitDas
  • I get `... OD5 comm: file 2 is not in sorted order EN6 ...` (this is taken from the right-most column). "comm - compare two sorted files line by line". – sourcejedi Jul 11 '17 at 11:31
  • Check also this topic: [link](https://unix.stackexchange.com/questions/377659/comparing-two-files-line-by-line) – mrc02_kr Jul 11 '17 at 11:34
  • `comm` expects its input files to be sorted. It looks like neither of your files is sorted. – Thushi Jul 11 '17 at 12:30

2 Answers

2

The files have to be sorted lexically (in the collation order that `comm` itself uses), or `comm` will not work correctly.

Sort both files and try again.

Or use:

comm -23 <(sort file1.txt) <(sort file2.txt)  
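
As a minimal sketch of the sorted comparison, the following recreates the question's data in a scratch directory and runs `comm -23` on sorted copies. The scratch directory, the `.sorted` file names, and the `only_in_file1` variable are illustrative assumptions, and temporary files stand in for the process substitution above so that the sketch also runs in a plain `sh`. `LC_ALL=C` is set so that `sort` and `comm` agree on the collation order.

```shell
#!/bin/sh
# Use one collation order for both sort and comm.
export LC_ALL=C

# Scratch directory holding copies of the question's data (illustrative paths).
workdir=$(mktemp -d)
printf '%s\n' OD1 EN2 OD3 OD4 OD5 EN6 EN7 EN8 EN9 OD10 OD11 OD12 > "$workdir/file1.txt"
printf '%s\n' EN1 EN2 EN3 OD4 OD5 EN6 EN7 EN8 EN9 OD10 > "$workdir/file2.txt"

# comm needs sorted input, so sort each file first.
sort "$workdir/file1.txt" > "$workdir/file1.sorted"
sort "$workdir/file2.txt" > "$workdir/file2.sorted"

# -2 suppresses lines unique to the second file, -3 suppresses common lines,
# leaving only the lines unique to the first file.
only_in_file1=$(comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")
printf '%s\n' "$only_in_file1"

rm -rf "$workdir"
```

With clean data (no stray characters on any line), this prints OD1, OD11, OD12 and OD3, which matches the expected result in the question.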
Bob Eager
  • Thanks for the reply. However, I am still getting a result that includes OD10. Here is my test result: OD1 OD10 OD11 OD12 OD3 – AmitDas Jul 11 '17 at 12:58
0

Use:

sdiff -s file1.txt file2.txt | awk '{print $1}' | sort -u

The output is:

OD1
OD11
OD12
OD3
Kaushik Nayak
  • I tried the same command, but I am getting a result with OD10 included in the list. Here is the result: OD1 OD10 OD11 OD12 OD3 – AmitDas Jul 11 '17 at 12:54
  • Can you check your data carefully? I used the same data you provided in the question and got the output shown. – Kaushik Nayak Jul 11 '17 at 13:00
  • That is surprising! With the same data I am getting a different result. Could it depend on the Linux version? I am running Red Hat Enterprise Linux Server release 6.7. – AmitDas Jul 11 '17 at 13:08
  • There are probably some additional characters on the OD10 line. Open the file in vi, then enter `:set list`. – Kaushik Nayak Jul 11 '17 at 13:12
  • Thanks! It is working fine now with the new data set. Even the first command suggested works well now! – AmitDas Jul 11 '17 at 13:34
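
The stray-character diagnosis from the comments can be reproduced without an editor. The sketch below is only an assumption about what the original files contained: it puts a trailing space after OD10 in a cut-down file1.txt, shows the line endings with `sed -n l`, and then strips trailing whitespace (including carriage returns) before sorting. All file and variable names are illustrative.

```shell
#!/bin/sh
export LC_ALL=C

workdir=$(mktemp -d)
# Assumed cause: "OD10 " in file1.txt carries a trailing space.
printf '%s\n' OD1 'OD10 ' OD11 > "$workdir/file1.txt"
printf '%s\n' OD10 > "$workdir/file2.txt"

# "sed -n l" marks the end of every line with "$", so the stray space
# becomes visible as "OD10 $".
sed -n l "$workdir/file1.txt"

# Comparing the raw (sorted) files treats "OD10 " and "OD10" as different lines.
sort "$workdir/file1.txt" > "$workdir/file1.sorted"
sort "$workdir/file2.txt" > "$workdir/file2.sorted"
before=$(comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")

# Strip trailing whitespace from every line, then sort and compare again.
sed 's/[[:space:]]*$//' "$workdir/file1.txt" | sort > "$workdir/file1.clean"
sed 's/[[:space:]]*$//' "$workdir/file2.txt" | sort > "$workdir/file2.clean"
after=$(comm -23 "$workdir/file1.clean" "$workdir/file2.clean")

printf 'before cleanup:\n%s\n' "$before"
printf 'after cleanup:\n%s\n' "$after"

rm -rf "$workdir"
```

Before the cleanup, OD10 (with its trailing space) is reported as unique to file1.txt; after the cleanup it is recognised as common to both files and drops out of the result.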