Say I have a command, command, which prints a huge number of lines to stdout:

line1
line2
.....
lineN

I want to save the output to disk, not as a single file but as a sequence of files, each holding 1000 lines of stdout:

file0001.txt:
-------------
line1
....
line1000

file0002.txt:
-------------
line1001
....
line2000

etc

I've tried to google the answer, but every time Google points me to the tee command, which is useless in this situation. I'm probably entering the wrong queries.

DNNX

2 Answers

Once the output has been saved to a file, you can always split that file into multiple pieces based on the number of lines:

split -l 1000 output_file

Or, even better, pipe the output straight into split:

command | split -l 1000 -

This splits the output stream into files of 1000 lines each (1000 lines is also the default when the -l option is omitted).
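
As a rough illustration of that behaviour, the sketch below uses seq as a stand-in for the real command (an assumption made only for this example) and runs in a scratch directory. With no prefix given, split names the pieces xaa, xab, xac and so on.

```shell
#!/bin/bash
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# seq stands in for the real command here: it prints 2500 numbered lines.
seq 1 2500 | split -l 1000 -

# List the pieces with their line counts.
wc -l x*
```

For 2500 input lines this should leave three files: xaa and xab with 1000 lines each, and xac with the remaining 500.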

The command below additionally sets a prefix for the names of the generated files, so the pieces are named small-aa, small-ab, and so on:

command | split -l 1000 - small-
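
To get names closer to the file0001.txt scheme from the question, something along these lines may work. It is a sketch that assumes GNU coreutils split: --numeric-suffixes=FROM and --additional-suffix are GNU extensions, not POSIX, and other split implementations may not support them. The seq call is again only a stand-in for the real command.

```shell
#!/bin/bash
# Scratch directory for the generated pieces.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# --numeric-suffixes=1      : number the pieces starting at 1 instead of aa, ab, ...
# -a 4                      : use four-character suffixes (0001, 0002, ...)
# --additional-suffix=.txt  : append .txt to every generated name
# "file" is the output prefix; "-" means read from stdin.
seq 1 2500 | split -l 1000 --numeric-suffixes=1 -a 4 --additional-suffix=.txt - file

ls
```

With 2500 input lines this should produce file0001.txt, file0002.txt and file0003.txt.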

Nikhil Mulley
  • I got confused, so for others, it's `split [arguments...] [input e.g. "-" for stdin] [output_prefix]`, for example: `tar -c somedir | split --byes 100MB --numeric-suffixes --suffix-length=3 - somedir.tar.part-` would output a bunch of 100MB files named `somedir.tar.part-000`, 001, 002 and so on. – ThorSummoner Aug 17 '17 at 19:47
  • @ThorSummoner I suppose it should be `--bytes`? – Qin Heyang Nov 05 '20 at 02:05

You can use a bash script, lines.bash:

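#!/bin/bash
# Counts the lines read so far.
a=0
# IFS='' keeps leading/trailing whitespace; -r keeps backslashes literal.
while IFS='' read -r line
do
  # Integer division groups every 1000 lines under one name: 0000.txt, 0001.txt, ...
  printf -v filename "%04d.txt" "$((a++/1000))"
  # Quote both expansions so neither the line nor the file name is split or globbed.
  echo "$line" >> "$filename"
done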

and use it as:

cat long_file.txt | bash lines.bash

The only problem I noticed is with the * character in long_file.txt (perhaps somebody can correct it).
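
A likely explanation for trouble with *, assuming the line is expanded without quotes somewhere, is pathname expansion: the shell replaces an unquoted * with the names of the files in the current directory. The sketch below shows the difference in a scratch directory; the file name one.txt is made up for the example.

```shell
#!/bin/bash
# Scratch directory containing one file, so the glob has something to match.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch one.txt

line='*'                  # a line consisting of a single asterisk

unquoted=$(echo $line)    # unquoted: the shell expands * to the matching file names
quoted=$(echo "$line")    # quoted: the asterisk is passed through literally

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
```

Here the unquoted expansion yields one.txt, while the quoted one keeps the literal *.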

manatwork
xralf
  • Set the `IFS` to empty string to avoid word splitting on `read`. Use `-r` to disable backslash escaping on `read`. Remove `-e` to avoid backslash escaping on `echo`. Use quoting to avoid word splitting on `echo`. Use `-v` in `bash` since 4.0 to avoid starting a sub-process. Use post-incrementing as your current code will put in the first file only 999 lines. `a=0; while IFS='' read -r line; do printf -v filename "%04d.txt" $((a++/1000)); echo "$line" >> "$filename"; done` – manatwork Dec 06 '11 at 14:09
  • @manatwork Thank you. However, my `printf` does not seem to have a `-v` switch (`bash 4.2.10`). At least it's not in the man page of `printf`. – xralf Dec 06 '11 at 14:21
  • `man printf` documents /usr/bin/printf, that could never in life set an environment variable. See `help printf` for the `printf` shell built-in's documentation. – manatwork Dec 06 '11 at 14:44
  • @manatwork OK. There still seems to be a syntax error in the `++/` part. – xralf Dec 06 '11 at 14:48
  • One more thing: there is no need to use sigil inside arithmetic evaluation, unless you need parameter expansion explicitly. In arithmetic expansion the variables are evaluated anyway. – manatwork Dec 06 '11 at 14:49
  • No need to do `IFS=''`, you can just do `IFS=`. – Chris Down Dec 06 '11 at 16:56
  • @ChrisDown one does not simply defy shellcheck :p – Ярослав Рахматуллин Jan 05 '23 at 23:29
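
The effect of the `IFS=` and `-r` advice from the comments can be seen in a small sketch like the following. The sample string is made up for the demonstration; it contains leading spaces, a backslash, and an asterisk.

```shell
#!/bin/bash
# Sample line: two leading spaces, a backslash followed by t, and an asterisk.
sample='  a\tb *'

# Plain read: leading whitespace is trimmed and the backslash is consumed.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "$line"; })

# IFS= read -r: whitespace and backslashes are kept exactly as they came in.
exact=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "$line"; })

printf 'plain: [%s]\n' "$plain"   # prints: plain: [atb *]
printf 'exact: [%s]\n' "$exact"   # prints: exact: [  a\tb *]
```

Only the second form hands the loop the same bytes that were in the input, which is what a line splitter needs.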