I need to trim the header from a CSV file. I have used `tail -n +2 ...`, which works fine but is really slow (I have lots of 100M files). I don't understand why, since `tail` should not need to hold anything in memory to do this (unlike `tail -n 10000`, for instance).

I have tried `awk '{if (NR > 1) print $0}'`. It is a bit faster, but still orders of magnitude slower than `cat`, and `cat` has no option for skipping lines.

Are there other commands? Thanks

Thomas
  • Processing lots of 100 MB files is going to be slow no matter what. To remove the header, you will have to write the rest of the file out to a new file. That's what takes the most time. – Kusalananda Jun 20 '17 at 09:38
  • `{ head -n 1 >/dev/null; cat; }` – don_crissti Jun 20 '17 at 09:39
  • @Kusalananda not really, as that's what `cat` does and it is 100x faster – Thomas Jun 20 '17 at 09:46
  • @don_crissti the solution with `sed` worked for me, not sure why `head` doesn't. And it is really fast, thanks! If you want to write it as an answer, please do – Thomas Jun 20 '17 at 09:47
  • Also give this a try: `{ read -r line; cat; }` – George Vasiliou Jun 20 '17 at 09:52
  • Thomas, I'll leave it as a comment as this - and multiple variations - have been discussed a few times here (see e.g. [What's the best way to take a segment out of a text file?](https://unix.stackexchange.com/q/2072) and related questions...) – don_crissti Jun 20 '17 at 09:56
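
The `{ read -r line; cat; }` idea from the comments could be wrapped up roughly as follows. This is only a sketch, assuming a POSIX shell; the function name `strip_header` and the sample data are made up for illustration and do not come from the thread. The shell's `read` consumes the first line and `cat` then copies whatever is left on the same input.

```shell
#!/bin/sh

# strip_header (hypothetical name): copy stdin to stdout without its first line.
strip_header() {
    # read takes one line from stdin and stops right after its newline,
    # so the header is consumed and nothing else.
    read -r _header
    # cat copies the remainder of the same stdin unchanged.
    cat
}

# Demo on made-up data: the header line "id,name" is dropped.
printf 'id,name\n1,foo\n2,bar\n' | strip_header
```

For a real file, this would be called as `strip_header < in.csv > out.csv`, where both file names are placeholders. Whether it is actually faster than `tail -n +2` on a given system is something to measure with `time`, not something this sketch guarantees.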
