
I have a large (several GB) gzipped XML file that is missing a root element. I could use cat on Ubuntu to wrap a <root> element around the content.

How can I achieve this without having to extract and repack the content?

It works if I create separate prefix and postfix gz files, and concatenate them with cat. But could I do better, without having to create explicit pre-/postfix gz files? Can I achieve the same on the fly?

echo "<root>" | gzip > prefix.gz
echo "</root>" | gzip > postfix.gz
cat prefix.gz input.xml.gz postfix.gz > newfile.gz
membersound
    since you're dealing with a multi-GB file, [this question](https://stackoverflow.com/q/2690823) might be interesting to take a look at regarding the performance. – myrdd Apr 25 '18 at 09:27

1 Answer


You can use here strings and a succession of commands:

(gzip <<<"<root>"; cat input.xml.gz; gzip <<<"</root>") > newfile.gz

Here strings are described in What does <<< mean? and in the Bash manual.
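
This works because the gzip format allows several compressed members to be concatenated, and gzip -d decompresses them in sequence as if they were a single stream. Below is a minimal sketch for checking that on a small sample. It assumes a hypothetical two-line input in a temporary directory in place of the real multi-GB file, and it pipes printf into gzip instead of using a here string so that it also runs in shells without <<<:

```shell
# Hypothetical sample data standing in for the real multi-GB input.xml.gz
workdir=$(mktemp -d)
printf '<item>1</item>\n<item>2</item>\n' | gzip > "$workdir/input.xml.gz"

# Wrap on the fly: three gzip members are written back to back,
# and the existing compressed data is copied through untouched
{
  printf '<root>\n' | gzip
  cat "$workdir/input.xml.gz"
  printf '</root>\n' | gzip
} > "$workdir/newfile.gz"

# gzip -dc decompresses all members in order
result=$(gzip -dc "$workdir/newfile.gz")
printf '%s\n' "$result"
```

Running gzip -t on the resulting file is another quick way to confirm that the concatenated archive is still valid.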

Stephen Kitt
  • `gzip` may refuse to compress short strings like that without `-f`. EDIT: No it doesn't, since it's from stdin. – Kusalananda Apr 25 '18 at 08:30
  • While I don't understand exactly, especially the `<<<`, it works as expected! – membersound Apr 25 '18 at 08:41
  • See [this question](https://unix.stackexchange.com/q/80362/86440) or [the Bash manual](https://www.gnu.org/software/bash/manual/html_node/Redirections.html#Here-Strings) for a description of `<<<`. – Stephen Kitt Apr 25 '18 at 09:03