
I have two files.

File 1:

A0001  C001
B0003  C896
A0024  C234
.
B1542  C231
.
up to 28,412 such lines

File 2:

A0001
A0024
B1542
.
.
and 12,000 such lines.

I want to compare File 2 against File 1 and store the matching lines from File 1. I tried Perl and Bash, but neither seems to work.

The latest thing I tried was something like this:

for (@q)    # after storing the contents of the second file in an array
{
        # note: each element of @q still ends in "\n", which breaks the grep
        $line = `grep $_ File1`;    # shelling out to grep for each ID
        print $line;
}

but it fails.

Braiam
user3543389

3 Answers


This should do the job:

grep -Ff File2 File1

The `-f File2` option reads the patterns from File2, and `-F` treats those patterns as fixed strings (i.e., no regexes are used).
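One caveat worth noting: `grep -F` matches a pattern anywhere in the line, not just in the first column, so partial or second-field matches are possible in principle. A small sketch on the sample data from the question, adding `-w` so only whole-word matches count (the file names and data lines follow the question's example):

```shell
# Recreate the sample data from the question.
printf 'A0001  C001\nB0003  C896\nA0024  C234\nB1542  C231\n' > File1
printf 'A0001\nA0024\nB1542\n' > File2

# -f File2: read patterns from File2; -F: treat them as fixed strings;
# -w: require whole-word matches, so e.g. A002 cannot match inside A0024.
grep -wFf File2 File1
```

This prints the three File1 lines whose IDs appear in File2.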

Graeme

You can use awk:

$ awk 'FNR==NR{a[$1];next}($1 in a){print}' file2 file1
A0001   C001
A0024   C234
B1542   C231
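For anyone unsure how the one-liner works, here is a commented sketch on the question's sample data (file names taken from the question):

```shell
# Recreate the sample data from the question.
printf 'A0001  C001\nB0003  C896\nA0024  C234\nB1542  C231\n' > File1
printf 'A0001\nA0024\nB1542\n' > File2

# FNR==NR is true only while awk reads the first file argument (File2):
# each ID becomes a key of array a, and `next` skips the rest of the program.
# While reading File1, any line whose first field is a key of a is printed.
awk 'FNR==NR{a[$1];next} ($1 in a){print}' File2 File1
```

The key idea is that `FNR` (per-file line number) equals `NR` (global line number) only for the first file listed, which is what lets one program treat the two files differently.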
cuonglm

It looks to me like both files are already sorted on the first field. If so:

join file1 file2

is the fastest option, and its advantage grows with the size of your files.
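If the files turn out not to be sorted, they can be sorted first; a sketch on the question's sample data (which, as given, is not fully sorted; note that `join` reprints matches with a single-space separator, and in bash the two `sort` steps could also be written inline as `join <(sort File1) <(sort File2)`):

```shell
# Recreate the sample data from the question.
printf 'A0001  C001\nB0003  C896\nA0024  C234\nB1542  C231\n' > File1
printf 'A0001\nA0024\nB1542\n' > File2

# join requires both inputs to be sorted on the join field (field 1 by
# default), so sort each file first, then join the sorted copies.
sort File1 > File1.sorted
sort File2 > File2.sorted
join File1.sorted File2.sorted
```

Unlike the `grep` approach, `join` matches only on the join field, so an ID can never accidentally match inside the second column.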

jthill
  • Tried this; each file must be sorted for this to work. With the `grep` solution that is not needed. – Matthew Turner Sep 13 '18 at 22:20
  • @MatthewTurner yes, this is true; keeping large files sorted saves much time (once they're big enough that they no longer fit in RAM), and keeping small files sorted is too cheap to meter anyway. – jthill Sep 24 '18 at 21:07