-1

I have two csv files with similar content/data.


file1.csv

ADIS
BAP3
Mercury_System
nxh-2003
DR_FeatureUP_PT

file2.csv

ADIS,projects.adis
EcoSystems,projects.ecosystems
em1xxxsw,projects.em1xxxsw
BAP3,projects.bap3
Dirana4,projects.dirana4
Mercury_System,projects.mercury_system
nxh-2003,projects.nxh-2003
DocStore,projects.docstore
DR_FeatureUP_PT,projects.dr_featureup_pt

Desired output.csv

ADIS,projects.adis
BAP3,projects.bap3
Mercury_System,projects.mercury_system
nxh-2003,projects.nxh-2003
DR_FeatureUP_PT,projects.dr_featureup_pt

Please let me know the command used for this type of query.

Kusalananda
  • 320,670
  • 36
  • 633
  • 936
  • 2
    What kind of query are you trying to do? Right now we just have 2 input files from you. – Eric Renouf Nov 18 '16 at 13:15
  • 1
    see also: https://unix.stackexchange.com/questions/134829/compare-two-columns-of-different-files-and-print-if-it-matches – Sundeep Nov 18 '16 at 13:22
  • @EricRenouf i have given two input files, so a query/command/sed/awk/grep anything should comapre these two inout files and get a data which is similar like the one i gave in output – Siddharth Sahoo Nov 18 '16 at 13:30
  • Yes, but how should the output be generated from those two files? Explain in words. Do you just want the 1,2,4,5,and 7th lines from file? Are you looking for `grep -Ff file1 file2`? What if file2 contains a `ADISBADIS`, should that match `ADIS`? You need to tell us what you need instead of expecting us to guess. – terdon Nov 21 '16 at 14:55
  • file1.csv contains 118 rows , file2.csv contains 2500 rows with 2 columns My requirement is the code/script should match both the csv files and fetch the matched row in a new file. I even tried grep -Ff file1 file2 but no use, there will be only single string with the same in both files. I am running GNU/Linux – Siddharth Sahoo Nov 21 '16 at 15:02

4 Answers4

1

Using Miller (mlr) to join the two CSV data sets on the named field ADIS:

$ mlr --csv join -j ADIS -f file1.csv file2.csv
ADIS,projects.adis
BAP3,projects.bap3
Mercury_System,projects.mercury_system
nxh-2003,projects.nxh-2003
DR_FeatureUP_PT,projects.dr_featureup_pt

If ADIS is not actually a field name and the two CSV files are header-less, then use the following instead (reads the input as header-less CSV and uses the field number instead of the field name in the join operation):

$ mlr --csv -N join -j 1 -f file1.csv file2.csv
ADIS,projects.adis
BAP3,projects.bap3
Mercury_System,projects.mercury_system
nxh-2003,projects.nxh-2003
DR_FeatureUP_PT,projects.dr_featureup_pt

The output of this second command looks the same as that of the first command. However, the first line in each file in no longer interpreted as a header line but as a record of data fields.

This would work regardless of whether the input data files are DOS or Unix text files.

Kusalananda
  • 320,670
  • 36
  • 633
  • 936
0

Using grep;

grep -Ff file1.csv file2.csv > outfile.csv

using awk

awk -F, 'NR==FNR{seen[$0]++;next} ($1 in seen)' file1.csv file2.csv > outfile.csv
αғsнιη
  • 40,939
  • 15
  • 71
  • 114
0

You can solve this with a simple script to execute in the shell. Try with this:

while read i; do grep $i ./file2.csv >> output.csv; done < file1.csv

It will create a new file called output.csv with the result that you are expecting for. You need to run it on the folder where you have stored both files file1.csv and file2.csv.

0

Using Bash process substitution, you can sort the files and then "join" them.

join -t, <(sort file1.csv) <(sort file2.csv) > output.csv
Wildcard
  • 35,316
  • 26
  • 130
  • 258