Split a file into two

Question

I have a big file and need to split it into two files. The first 1000 lines should be moved into another file and then deleted from the original file.

I tried using split, but it creates multiple chunks.

Yes, I have checked it, but it creates multiple files, which I don't need. – Aravind, Oct 21 '14 at 16:02

score 60 · Accepted Answer · answered Oct 21 '14 at 16:11

The easiest way is probably to use head and tail:

$ head -n 1000 input-file > output1
$ tail -n +1001 input-file > output2

That will put the first 1000 lines from input-file into output1, and all lines from line 1001 to the end into output2. Note that input-file itself is left unchanged.
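The question also asks for the lines to be removed from the original file. A minimal sketch of one way to do that, assuming it is acceptable to write the remainder to a temporary file and then rename it over the original (the scratch directory, the sample data from seq, and the name input-file.rest are made up for illustration):

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 2500 numbered lines.
seq 1 2500 > input-file

# Copy the first 1000 lines out, write the remainder to a temporary
# file, and replace the original only if both steps succeeded.
head -n 1000 input-file > output1 &&
tail -n +1001 input-file > input-file.rest &&
mv input-file.rest input-file

# Show the resulting line counts.
wc -l output1 input-file
```

Redirecting tail straight back into input-file would truncate the file before it is read, which is why the sketch goes through a second file.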

answered Oct 21 '14 at 16:11

Michael Mrozek

Funny thing that it works pretty well with 2GB+ files too! – Gergely Lukacsy Oct 21 '20 at 18:12

score 29 · Answer 2 · edited Oct 21 '14 at 16:56

I think that split is your best approach.

Try using the -l xxxx option, where xxxx is the number of lines you want in each file (default is 1000).

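You can use the -n yy option if you are more concerned about the number of files created. Using -n 2 will split your file into only 2 parts, no matter how many lines end up in each file. With GNU split, plain -n 2 divides by size, so it can cut a line in half; -n l/2 keeps lines intact.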

You can count the number of lines in your file with wc -l filename. This is the 'word count' command with the lines option.
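As a rough sketch of how wc -l and split -l can be combined, the following splits a file into two pieces of about equal length. It assumes a sample file generated with seq and an arbitrary output prefix half.; it does not give the 1000-lines-plus-rest split from the question.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 2501 numbered lines.
seq 1 2501 > bigfile

# Count the lines, then round half of that up so two pieces are enough.
total=$(wc -l < bigfile)
per_file=$(( (total + 1) / 2 ))

# -l sets the number of lines per output file; "half." is the prefix.
split -l "$per_file" bigfile half.

# Show the resulting pieces (half.aa and half.ab).
wc -l half.*
```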

References

man split
man wc

edited Oct 21 '14 at 16:56

slm

answered Oct 21 '14 at 16:44

Lucien Raven

This is how to split into a bunch of files with a fixed number of lines, or how to split evenly into a fixed number of files. Is there a way to split into one 1000-line file and one file with everything else? That's what he was asking for; I couldn't find it in the man page – Michael Mrozek Oct 21 '14 at 17:05
You're correct, Michael. I think I took a simplistic view of the question. Your solution is the best one in this case. Another way would be to use the 'sed' command: sed -n '1,1000p' originalfile > first_1000_lines; sed '1,1000d' originalfile > remaining_lines. – Lucien Raven Oct 21 '14 at 17:17
Of course you could do `split -l 1000 bigfile && mv xaa piece1 && cat x?? > piece2 && rm x??`. – G-Man Says 'Reinstate Monica' Oct 21 '14 at 23:40
`split` is what I was looking for – Daniel Apr 08 '20 at 20:53
split with both -l and -n options doesn't run ('split: cannot split in more than one way'). Question wanted file into 2 parts, but at a specific line: split is the wrong tool for this job. csplit is the correct tool – RGD2 Jun 28 '21 at 23:40

don_crissti · Answer 3 · 2018-08-26T19:37:46.990

17

This is a job for csplit:

csplit -s infile 1001

will silently split infile into two pieces: xx00, containing everything up to but not including line 1001, and xx01, containing the remaining lines.
You can play with the options if you need different output file names, e.g. using -f and specifying a prefix:

csplit -sf piece. infile 1001

produces two files named piece.00 and piece.01.
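Since the csplit operand is the line number at which the second piece starts, keeping the first n lines means passing n + 1. A small sketch to check that, assuming sample data from seq and a shell variable n introduced only for illustration:

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 2500 numbered lines.
seq 1 2500 > infile

# Number of lines wanted in the first piece.
n=1000

# The second piece starts at line n + 1; -s is silent, -f sets the prefix.
csplit -sf piece. infile "$(( n + 1 ))"

# Show the resulting line counts.
wc -l piece.00 piece.01
```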

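With a smart head (one that leaves the input positioned just after the last line it prints, instead of consuming more than it needs) you could also do something like: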

{ head -n 1000 > 1st.out; cat > 2nd.out; } < infile

edited Aug 26 '18 at 19:37

answered May 10 '15 at 22:54

don_crissti

Wow, it really *is* a job for `csplit`. Very nice. (I'm just reading through the list of POSIX commands and had enormous trouble wrapping my head around the `csplit` command's purpose at first. Turns out it's really really simple.) :) – Wildcard Nov 02 '16 at 05:38

G-Man Says 'Reinstate Monica' · Answer 4 · 2014-10-21T21:59:34.983

5

A simple way to do what the question asks for, in one command:

awk '{ if (NR <= 1000) print > "piece1"; else print > "piece2"; }' bigfile

or, for those of you who really hate to type long, intuitively comprehensible commands,

awk '{ print > ((NR <= 1000) ? "piece1" : "piece2"); }' bigfile
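If the cut-off should not be hard-coded, the same idea can take it from a variable. This is only a sketch: the variable name n and the sample data from seq are assumptions, and the output names piece1 and piece2 are the ones used above.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: 2500 numbered lines.
seq 1 2500 > bigfile

# -v passes the cut-off into awk; lines 1..n go to piece1, the rest to piece2.
awk -v n=1000 '{ print > ((NR <= n) ? "piece1" : "piece2") }' bigfile

# Show the resulting line counts.
wc -l piece1 piece2
```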

edited Oct 21 '14 at 21:59

answered Oct 21 '14 at 21:11

G-Man Says 'Reinstate Monica'
