
Possible Duplicate:
IO redirection and the head command

I just wanted to remove all but the first line of a file. I did this:

head -1 foo.txt

... and verified that I saw only the first line. Then I did:

head -1 foo.txt > foo.txt

But instead of containing only the first line, foo.txt was now empty.

Turns out that cat foo.txt > foo.txt also empties the file.

Why?

Nathan Long
  • A more interesting (if slightly more pedantic) question would be whether the evaluation order is defined by POSIX or is implementation-specific. – rahmu Jun 20 '12 at 18:19
  • @rahmu - True. I happen to be using zshell on OSX Lion. – Nathan Long Jun 20 '12 at 18:21
  • @rahmu POSIX does specify the order but it doesn't need to because if you think about it for just a moment, you'll realize that the shell is the one that does the redirections and it has to do them *before* running the command, since it will be too late to do it after the command has already started. Add that bit of common sense to the fact that the `>` operator includes truncation and this behavior becomes very logical. – jw013 Jun 20 '12 at 21:55

2 Answers


Before the shell runs a command, it sets up all of that command's input and output redirections.

So in your case, > foo.txt tells the shell: "create foo.txt if it doesn't exist, empty it if it does, and send all the output from this command into that file".

The problem, as you found out, is that this wipes out the file's previous contents before the command ever gets a chance to read them.

Relatedly, >> appends to an existing file instead of truncating it.
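To see the difference between > and >> without risking a real file, here is a minimal sketch that works in a scratch directory. The file name foo.txt mirrors the question; the variable names are just illustrative, and it assumes mktemp is available (it is on common Linux and OS X systems, but it is not required by POSIX).

```shell
#!/bin/sh
# Use a scratch directory so no real file is at risk.
workdir=$(mktemp -d) || exit 1
f="$workdir/foo.txt"

printf 'line 1\nline 2\nline 3\n' > "$f"

# The shell opens foo.txt for writing (truncating it) before head starts,
# so head finds an empty file and has nothing to print.
head -n 1 "$f" > "$f"
size_after_truncate=$(wc -c < "$f" | tr -d ' ')
echo "bytes after '>': $size_after_truncate"

# '>>' opens the file for appending, so the existing content survives.
printf 'line 1\n' > "$f"
echo 'line 2' >> "$f"
lines_after_append=$(wc -l < "$f" | tr -d ' ')
echo "lines after '>>': $lines_after_append"

rm -rf "$workdir"
```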

Update:

Here's a solution using sed, handle with care:

 sed -i '2,$d' foo.txt

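It will delete lines 2 to "last" in-place in file foo.txt. Best to try this out on a file you can afford to mess up first :) Note that -i is not part of POSIX sed, and the BSD sed that ships with OS X requires a suffix argument after -i (it may be empty, as in sed -i '' '2,$d' foo.txt).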

This slightly modified version of the command will keep a copy of the original with the .bak extension:

 sed -i.bak '2,$d' foo.txt

You can specify any sequence of characters (or a single character) directly after the -i switch, with no space in between, as the suffix for the "backup" (i.e. original) file.

Levon
  • Interesting. How would you remove all but the first line of a file, since `head -1 foo.txt > foo.txt` won't work? – Nathan Long Jun 20 '12 at 17:55
  • @NathanLong Off the top of my head, I'd just use a different temporary file for the output and then rename it. Or is this something you are going to do over? – Levon Jun 20 '12 at 18:00
  • It's not something I'll do a lot. I just wondered if I was missing some obvious, better way. – Nathan Long Jun 20 '12 at 18:02
  • This works: `head -1 foo.txt | tee foo.txt` – Nathan Long Jun 20 '12 at 18:05
  • @NathanLong I just posted a solution using `sed` that will fit your requirements too. – Levon Jun 20 '12 at 18:06
  • Regarding @Levon's temporary file. For 1 line of text a simple shell variable is enough: `IFS= read -r < foo.txt && echo "$REPLY" > foo.txt`. This is faster than the `sed` solution and more reliable than @NathanLong's `tee` workaround, which works only on small files. Regarding the `sed` solution, `sed -i 'q' foo.txt` is shorter and faster. – manatwork Jun 20 '12 at 18:22
  • @manatwork thanks for the additional information, always good to add to the toolbox. – Levon Jun 20 '12 at 18:35
  • Please don't recommend `sed -i`. It's not portable and probably doesn't even work correctly on links. Non-portable alternatives include `cmd file | sponge file`. Portable alternatives include `ed` (`ed` is a file editor; `sed` is a *stream* editor, not a file editor), which has more or less the same commands as `sed`, or a temporary file. – jw013 Jun 20 '12 at 21:59
  • why does `tee` work here ? from [this SO discussion of tee vs. sponge](https://stackoverflow.com/questions/33638511/differences-between-sponge-and-tee) it seems like tee would have the same issues as just redirecting straight to the file. – orion elenzil Dec 03 '19 at 17:12
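
The temporary-file approach mentioned in the comments above could look roughly like the following. This is only a sketch: keep_first_line is a hypothetical helper name, not a standard utility, and it assumes a mktemp that accepts a template argument.

```shell
#!/bin/sh
# keep_first_line is a hypothetical helper, not a standard command.
keep_first_line() {
    file=$1
    # Create the temporary file next to the original so that mv is a
    # rename within the same filesystem.
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if head -n 1 "$file" > "$tmp"; then
        # Only replace the original once the new content is fully written.
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Demonstration on a throwaway file.
demo=$(mktemp) || exit 1
printf 'first\nsecond\nthird\n' > "$demo"
keep_first_line "$demo"
result=$(cat "$demo")
echo "$result"
rm -f "$demo"
```

Because mv replaces the original with a new file, the result takes on the temporary file's permissions and no longer shares hard links with the original, so this may not suit every situation.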

Because the shell that you use to invoke cat does the redirection indicated by >.

The shell (bash, zsh, ksh, dash, whatever) reads the command cat foo.txt > foo.txt. The shell has to set up the redirection indicated by > foo.txt. > means to truncate the file and start writing from the top; >> would mean to append to foo.txt.

By the time the shell actually gets cat running, the contents of foo.txt have already been discarded, so cat reads an empty file.
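
One way to watch this ordering is to let the command itself report what it sees. In this small sketch (scratch file and variable names are illustrative only), wc is asked to count the bytes of the same file that receives its output:

```shell
#!/bin/sh
workdir=$(mktemp -d) || exit 1
f="$workdir/foo.txt"
printf 'some text\n' > "$f"

# Both redirections are set up by the shell before wc runs:
# the file is opened for reading, then opened again for writing and truncated.
# wc therefore counts the bytes of an already-empty file.
wc -c < "$f" > "$f"

# Strip the padding some wc implementations add around the number.
observed=$(tr -d ' \n' < "$f")
echo "wc saw $observed bytes"

rm -rf "$workdir"
```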