
Update: On closer inspection of the man page, it appears that unzip does not yet support reading archives from standard input. The man page refers the user to funzip instead. I guess this makes my question moot, but I will leave it up regardless.

I am trying to compose zip and unzip into a command that acts as the identity function, meaning that it accepts input on stdin and produces the same data on stdout. However, I cannot get it working. I read both manual pages (zip's is especially detailed) and believed that this would work:

echo "hello" | zip | unzip -p | echo

Explanation:

  • echo should produce the initial standard input
  • zip without arguments, according to the man page, takes input on stdin and produces output on stdout
  • unzip with -p should produce output on stdout. NOTE: I believe something might be missing here, to indicate that input is arriving from stdin. I tried adding - but it did not work. The man page does not seem to cover this.
  • echo to visualize what appeared on stdout

Please note: I am only trying to feed a single file through this command, so I do not need to worry about directories/multiple files.

Rewbert
  • Side note about "`echo` to visualize what appeared on stdout" – Even if `zip` and `unzip` worked as you want, [`| echo` would be wrong](https://unix.stackexchange.com/q/648201/108618). – Kamil Maciorowski Mar 14 '23 at 10:25
  • why do you want to use `zip`? Zip is an *archiver* (which can also compress), and it seems you want a *compressor* instead: – Marcus Müller Mar 14 '23 at 11:02
  • Does this answer your question? [Difference between ar, tar, gzip, zip and when should I decide to choose which one?](https://unix.stackexchange.com/questions/731609/difference-between-ar-tar-gzip-zip-and-when-should-i-decide-to-choose-which-o) – Marcus Müller Mar 14 '23 at 11:03
  • The choice is not so important, I am working on something else and want two simple commands that when composed form an identity function. I am fuzzing this composed function to perform some evaluation, which is my point of interest. The function in question is not super important. @KamilMaciorowski – Rewbert Mar 14 '23 at 11:30
  • "_I[...] want two simple commands that when composed form an identity function_" - how about `cat | cat` or `tac | tac`? Can't get much simpler than the first – roaima Mar 14 '23 at 14:26
  • "The choice is not so important" – There are other tools then. If you want a pair of commands that actually does some compression/decompression then `gzip -c | gzip -dc`. Other non-trivial commands that compose a no-op: `base64 | base64 -d`, `dd conv=ebcdic | dd conv=ascii`. Asking specifically about `zip` really looks like an [XY problem](https://meta.stackexchange.com/a/66378/355310). – Kamil Maciorowski Mar 14 '23 at 16:15
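The `gzip -c | gzip -dc` pair suggested in the comment above can be tried with a short sketch like the following. The function name `gzip_identity` is a hypothetical helper introduced only for illustration; `-c` (write to stdout) and `-d` (decompress) are standard gzip options.

```shell
#!/bin/sh
# Hypothetical helper: compress stdin with gzip, then immediately
# decompress it again, so the pipeline as a whole acts as an identity.
gzip_identity() {
    # gzip -c  : compress stdin, write the compressed stream to stdout
    # gzip -dc : decompress stdin, write the original bytes to stdout
    gzip -c | gzip -dc
}

# Feed a single line through the pipeline. The result is written
# straight to stdout, so no trailing `| echo` is needed.
printf 'hello\n' | gzip_identity
```

Because gzip is a stream compressor rather than an archiver, neither side needs a seekable file, which is why this composes cleanly in a pipeline.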

1 Answer


unzip requires the archive to be seekable (random access), because zip accumulates information on the archived files, and writes the whole contents list and statistics at the end. So a pipe will never be acceptable as a zip archive.

zip does not even know the length of each file inside the archive as it writes it, because the files are individually compressed on the fly.

tar, by way of contrast, can be written and read through pipes. The overhead for this is (a) the file must be read serially to locate specific files (because the metadata is spread through the archive), and (b) the whole archive needs to be compressed and decompressed as one unit, so retrieving one file requires the whole archive to be decompressed.

Edit: slight rider on the tar notes: if the archive is not compressed, then tar can seek over file data on disk (it knows the size of each file, and the block padding rules). If a tape drive has a "skip blocks" capability, it can use that. In a pipeline, it has to read and discard any files it wants to skip.
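To illustrate the point that tar archives can be written and read through pipes, here is a minimal sketch. It assumes a tar that supports `-O` (extract to stdout), as GNU tar and bsdtar do; `tar_identity` and the member name `payload` are hypothetical names chosen for the example. Since tar archives files rather than raw streams, the sketch first spools stdin into a temporary file.

```shell
#!/bin/sh
# Hypothetical helper: wrap stdin in a one-member tar archive and unpack
# it again, passing the archive itself only through a pipe.
tar_identity() {
    workdir=$(mktemp -d) || return 1
    # tar archives named files, so capture stdin as a single file first.
    cat > "$workdir/payload"
    # Writer: -c create, -f - write the archive to stdout,
    #         -C change into the temp directory before adding "payload".
    # Reader: -x extract, -f - read the archive from stdin,
    #         -O write member data to stdout (assumes GNU tar or bsdtar).
    tar -cf - -C "$workdir" payload | tar -xOf -
    rm -rf "$workdir"
}

printf 'hello\n' | tar_identity
```

The reading tar never seeks: it consumes each header and the data that follows it in order, which is the serial access pattern described above.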

Paul_Pedant
  • To stress this twice: "is not yet supported": yes, and it never will; as Paul says in his first paragraph, it's inherent to the zip format that you need the whole archive to get *anything* from it. – Marcus Müller Mar 14 '23 at 11:01