I have seen solutions for splitting a file by pattern matching and line matching, but what I want is different. Say I have a file, file1:

A.B|100|20
A.B|101|20
A.X|101|30
A.X|1000|20
B.Y|1|1

I want to split this file into 3 files based only on the first column: the first file should contain all lines with A.B in the first column, the second all lines with A.X, and so on.

If the first column changes in any way, there should be a new file created for those lines. Is there any way of doing it with bash or awk?

Since there is no way of knowing beforehand what the first column values are, I wasn't able to use tools like split or cut. Thanks in advance for the help!

don_crissti
Zzrot

1 Answer

Try:

awk -F\| '{print>$1}' file1

This writes each line to a file named after the first column.

How it works:

  • -F\| sets the field separator to |.

  • print>$1 prints the current line to a file whose name is the first field.
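
To try this on the sample data from the question, here is a minimal sketch. The scratch directory name `split_demo` is made up for illustration and is not part of the original answer; everything else is the command shown above.

```shell
# Scratch directory (hypothetical name) so the output files are easy to inspect.
mkdir -p split_demo

# Recreate the sample input from the question.
printf '%s\n' 'A.B|100|20' 'A.B|101|20' 'A.X|101|30' 'A.X|1000|20' 'B.Y|1|1' > split_demo/file1

# Split on the first |-separated field: each line goes to a file named after that field.
# The subshell keeps the caller's working directory unchanged.
(cd split_demo && awk -F\| '{print>$1}' file1)

# Show what was produced.
for f in A.B A.X B.Y; do
    echo "== $f"
    cat "split_demo/$f"
done
```

This should leave three files, `A.B`, `A.X` and `B.Y`, next to `file1`, each holding only the lines whose first column matches its name.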

John1024
  • That's what I wanted. Thank you, will accept in 2 minutes. – Zzrot Jul 22 '16 at 22:23
  • (What surprises me about this is that a second occurrence of the first field doesn't open a new instance of the file and write over the previous one; that is, that ">>" isn't necessary. ...been using shell for too long, I guess.) – Theophrastus Jul 22 '16 at 22:24
  • @Theophrastus That would be _shell_ behavior. Awk is different. With awk, `>` will overwrite a _previously_ existing file but, and this is where it differs from shell, while the awk command is running, `>` _appends_. – John1024 Jul 22 '16 at 22:26
  • How do you do the same when you have space as a field separator? – Yogesh D Oct 19 '17 at 22:28
  • @Techiee `awk -F" " '{print>$1}' file1` – John1024 Oct 19 '17 at 22:35
  • This might be too much to ask, but I really want to know, how can I write all these files into one sub-folder, instead of the current folder? – user3768495 Nov 14 '17 at 15:37
  • @user3768495 `awk -F\| '{print>"subfolder/"$1}' file1` – John1024 Nov 14 '17 at 19:04
  • Tried a bunch of different ways to concatenate the `subfolder` with `$1` but without success. Thanks for providing the answer, @John1024! – user3768495 Nov 14 '17 at 22:16
  • How can we ignore a delimiter that appears inside a double-quoted string column, e.g. A.B|"10|0"|20? – Rails Developer Oct 23 '20 at 10:05
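
Two points from the comments above, awk's `>` semantics and writing into a sub-folder, can be checked with a short sketch. The directory `split_demo2` and the "stale line" are hypothetical, made up for the demonstration. The parentheses around the concatenation are an addition to the commented one-liner: some awk implementations parse an unparenthesized concatenation after `>` differently, so parenthesizing is the more portable spelling.

```shell
# Hypothetical scratch layout; awk does not create the sub-folder itself.
mkdir -p split_demo2/subfolder
printf '%s\n' 'A.B|100|20' 'A.B|101|20' 'A.X|101|30' > split_demo2/file1

# Leftover output from an imagined earlier run.
echo 'stale line' > split_demo2/subfolder/A.B

# ">" truncates subfolder/A.B when awk first opens it, then appends to it
# for the rest of this run, so both A.B lines survive and the stale line does not.
(cd split_demo2 && awk -F\| '{ print > ("subfolder/" $1) }' file1)

cat split_demo2/subfolder/A.B
```

Running the command a second time gives the same files rather than doubled content, because each run starts by truncating the files it opens.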