Compare two CSV files and fetch matching data

Question

I have two .csv files, namely file1.csv and file2.csv.

file1.csv

ADIS
BAP3
Mercury_System
nxh-2003
DR_FeatureUP_PT

file2.csv

ADIS,projects.adis
EcoSystems,projects.ecosystems
em1xxxsw,projects.em1xxxsw
BAP3,projects.bap3
Dirana4,projects.dirana4
Mercury_System,projects.mercury_system
nxh-2003,projects.nxh-2003
DocStore,projects.docstore
DR_FeatureUP_PT,projects.dr_featureup_pt

Desired output.csv

ADIS,projects.adis
BAP3,projects.bap3
Mercury_System,projects.mercury_system
nxh-2003,projects.nxh-2003
DR_FeatureUP_PT,projects.dr_featureup_pt

I have already tried the two commands below, but neither of them produced the required output:

grep -Ff file1.csv file2.csv > outfile.csv

awk -F, 'NR==FNR{seen[$0]++;next} ($1 in seen)' file1.csv file2.csv > outfile.csv

file1.csv contains 2500 rows and file2.csv contains 118 rows. The comparison should return only the results that match file2.csv, so the output should correspond to its 118 rows.

It's not a duplicate of the question mentioned: I asked about a CSV file, and the one you mentioned is about a pipe-delimited file. — Siddharth Sahoo, Nov 21 '16 at 12:45
There is no fundamental difference between the contents of a file and piped contents, as long as they are both finite. — l0b0, Nov 21 '16 at 12:48
Cannot reproduce. When I do `grep -Ff file1 file2` I get your desired output (using `grep` in macOS). — maulinglawns, Nov 21 '16 at 12:51
@maulinglawns I have tried it, but I am on RHEL. It's not working here; even with awk I get blank output. — Siddharth Sahoo, Nov 21 '16 at 12:56
@l0b0 I also tried the duplicate you mentioned, but it didn't help. — Siddharth Sahoo, Nov 21 '16 at 13:06
Tested `grep -Ff file1.csv file2.csv` on CentOS 6.2 with GNU grep 2.20 and it works for me. Cannot reproduce. — Zachary Brady, Nov 21 '16 at 13:16
@SiddharthSahoo it would help to know what exactly is going wrong.. in one of the comments you mentioned `awk` is giving blank output.. are you getting blank output with `grep` too? can you post what is the output of `seq 3 6 > f1 ; seq 5 > f2 ; grep -Ff f1 f2` ? — Sundeep, Nov 21 '16 at 13:27
You wrote that file1.csv has 2500 rows and file2.csv has 118 rows. Maybe you want to reverse the order of file, and compare according to the 118 rows. — andreatsh, Nov 21 '16 at 13:29
@Sundeep Yes, both give an empty result; I don't see any content in the output file. — Siddharth Sahoo, Nov 21 '16 at 13:45
For clarification: I voted to close this question because it *can't be reproduced*: both the awk and the grep commands print the desired output. There's something else going on with the OP's situation. It could be mis-matched character encodings or DOS line endings; I'd suggest [edit]ing the question to include the results of `file file1.csv file2.csv`. — Anthony Geoghegan, Nov 21 '16 at 14:49

smokes2345 · Answer 1 · 2016-11-21T14:40:22.270

0

The following grep should return the desired results assuming file1.csv has only one column for each row. This uses each line in file1.csv as a search string (needle) and searches through file2.csv (haystack).

grep -f file1.csv file2.csv | tee outfile.csv

I added tee so you can see the output as well as writing it to a file. Your question is very vague as to what problem you are experiencing. I've done this many times on RHEL and Debian and tested just now with your sample contents. I was able to achieve your desired results.
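
If the command still prints nothing, one possibility raised in the comments is DOS (CRLF) line endings, which make every pattern end in an invisible carriage return. Below is a minimal sketch, assuming that is the cause. The shortened sample data and the file1.unix.csv / file2.unix.csv names are made up for illustration and are not part of the question.

```shell
# Recreate a small sample in which file1.csv has DOS (CRLF) line endings.
printf 'ADIS\r\nBAP3\r\n' > file1.csv
printf 'ADIS,projects.adis\nEcoSystems,projects.ecosystems\nBAP3,projects.bap3\n' > file2.csv

# Count the lines that contain a carriage return.
# A non-zero count means the file has CRLF line endings.
cr=$(printf '\r')
grep -c "$cr" file1.csv

# Strip carriage returns from both files before comparing.
tr -d '\r' < file1.csv > file1.unix.csv
tr -d '\r' < file2.csv > file2.unix.csv

# Keep the rows of file2 whose first field appears as a whole line in file1.
awk -F, 'NR==FNR { seen[$0]; next } $1 in seen' file1.unix.csv file2.unix.csv > outfile.csv
cat outfile.csv
```

Running `file file1.csv file2.csv`, as suggested in the comments, reports the same condition as "with CRLF line terminators". The awk comparison is used here instead of grep because it matches the whole first field rather than any substring of the line.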

edited Nov 21 '16 at 14:40

answered Nov 21 '16 at 14:35

I have already tried this option, but it did not help. I have now tried it on GNU/Linux as well, but I still get no output. – Siddharth Sahoo Nov 21 '16 at 14:53
