I'm modifying a bunch of initramfs archives from different Linux distros, and normally only one file changes in each.

I would like to automate the process without switching to the root user to extract all files from the initramfs image and pack them again.

First I tried to generate a list of files for gen_init_cpio without extracting the whole contents of the initramfs archive, i.e. parsing the output of cpio -tvn initrd.img (which looks like ls -l output) with a script that converts all permissions to octal and rearranges the output into the format gen_init_cpio expects, like:

dir /dev 755 0 0
nod /dev/console 644 0 0 c 5 1
slink /bin/sh busybox 777 0 0
file /bin/busybox initramfs/busybox 755 0 0
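
The permission conversion mentioned above could be sketched as a small POSIX shell function. This is only an illustration: the name `mode_to_octal` is hypothetical, and it assumes the mode column has the usual ten-character `ls -l` form (a type character followed by nine permission characters, with `s`/`S`/`t`/`T` marking setuid, setgid and sticky bits).

```shell
# mode_to_octal MODE: print the octal form of a symbolic mode string
# such as "-rwxr-xr-x" (hypothetical helper, not part of cpio).
mode_to_octal() {
    perms=${1#?}      # drop the leading file-type character
    special=0         # accumulates setuid (4), setgid (2), sticky (1)
    result=""
    i=0
    for weight in 4 2 1; do
        # Extract the i-th rwx triplet (user, group, other).
        triplet=$(printf '%s' "$perms" | cut -c$((i * 3 + 1))-$((i * 3 + 3)))
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac
        case $triplet in
            ??x)    digit=$((digit + 1)) ;;
            ??[st]) digit=$((digit + 1)); special=$((special + weight)) ;;
            ??[ST]) special=$((special + weight)) ;;
        esac
        result="$result$digit"
        i=$((i + 1))
    done
    if [ "$special" -ne 0 ]; then
        printf '%s%s\n' "$special" "$result"
    else
        printf '%s\n' "$result"
    fi
}
```

Calling `mode_to_octal drwxr-xr-x` would give the `755` used in the `dir /dev` line; the rest of the line (type keyword, owner, device numbers) would still have to be assembled separately.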

This involves some replacements, and the script may be hard for me to write, so I've found a better way and I'm asking how safe and portable it is:

In some distros the initramfs file consists of concatenated parts, and apparently the kernel parses the whole file and extracts every part even when the parts are aligned only on a 1-byte boundary, so there is no need to pad each part to a multiple of 512 bytes. I thought this 'feature' could be useful to avoid recreating the archive when modifying files inside it. Indeed it works, at least for Debian and CloneZilla.

For example, if we have modified the /init file from the initrd.gz of Debian 8.2.0, we can append it to the initrd.gz image with:

$ echo ./init | cpio -H newc -o | gzip >> initrd.gz

so initrd.gz has two concatenated archives, the original and its modifications. Let's see the result of binwalk:

DECIMAL       HEXADECIMAL     DESCRIPTION
--------------------------------------------------------------------------------
0             0x0             gzip compressed data, maximum compression, has original file name: "initrd", from Unix, last modified: Tue Sep  1 09:33:08 2015
6299939       0x602123        gzip compressed data, from Unix, last modified: Tue Nov 17 16:06:13 2015

It works perfectly. But is it reliable? What restrictions apply when appending data to initramfs files? Is it safe to append without padding the original archive to a multiple of 512 bytes? Since which kernel version has this feature been supported?

Emilio Lazo

2 Answers


It's very reliable and supported by all kernel versions that support initrd, AFAIK. It's a feature of the cpio archives that initramfs images are made up of. cpio just keeps on extracting its input. We might know the file is two cpio archives one after the other, but cpio just sees it as a single input stream.

Debian advises using exactly this method (appending another cpio archive to the initramfs) to add binary-blob firmware to their installer initramfs. For example:

DebianInstaller / NetbootFirmware | Debian Wiki

Initramfs is essentially a concatenation of gzipped cpio archives which are extracted into a ramdisk and used as an early userspace by the Linux kernel. Debian Installer's initrd.gz is in fact a single gzipped cpio archive containing all the files the installer needs at boot time. By simply appending another gzipped cpio archive - containing the firmware files we are missing - we get the show on the road!

cas
  • Thank you @cas! I think the initramfs extraction code in the kernel is cleverer than that, because it works with more than plain `cpio` archives. We can have gzipped and xz-compressed `cpio` archives combined with an uncompressed `cpio` archive, all inside the `initrd` file. For example, TAILS includes the `GenuineIntel.bin` blob in a `cpio` archive at the beginning, followed by another `cpio` archive compressed with `xz`. The Linux kernel seems to parse the whole file and recognize everything that has been concatenated. When one stream ends, it starts detecting the format of the next stream! – Emilio Lazo Nov 18 '15 at 05:25
  • I read the Debian wiki page on netboot firmware that you posted, but I don't understand the rationale behind shipping an initrd concatenated this way "from the factory". Why are the Intel blobs packed separately from (and before) the compressed `initramfs` rather than inside the archive containing the full `initramfs` tree? Must this blob be the first file to be expanded? In that case (TAILS), they could arrange the file list for `cpio` so that this file comes before the rest... I can't find a Debian `initrd` file that uses this feature of concatenating two or more gzipped `cpio` archives as their wiki states. Thanks – Emilio Lazo Nov 18 '15 at 05:44
  • Debian doesn't do this themselves. As you say, they don't need to. This page contains instructions for users who want to add non-free firmware (e.g. for their NICs) to the initrd. – cas Nov 18 '15 at 05:56
  • Ok, good! To customize, which is exactly what I'm doing... TAILS and another distro that I don't remember do that. Maybe to obfuscate... – Emilio Lazo Nov 18 '15 at 06:05

No, the existing initramfs needs to be padded in order to reliably continue parsing arbitrary archives appended to compressed archives.

xz and "legacy frame format" lz4 compression are especially tricky and will fail in 3 out of 4 cases: specifically, every time the byte count of the preceding archive is not divisible by 4. This generally goes unnoticed, as it is of no concern when placing a raw format=newc cpio archive in front of a single compressed archive, because the uncompressed form is always aligned.

While in theory the initramfs format is not specified beyond being a simple concatenation of (optionally compressed) archives, the padding can still be needed when the decompression routine (in some cases, by design) cannot tell where one archive ends and the next one starts. Some edge cases were improved in Linux release 5.14, others appear difficult if not impossible to unambiguously detect in the kernel. If additional data follows compressed archives, messages such as these indicate it was disregarded:

Initramfs unpacking failed: Decoding failed
Initramfs unpacking failed: invalid magic at start of compressed archive
Initramfs unpacking failed: broken padding

If compression was only applied to the last archive, these messages are harmless - nothing further was parsed, but there was nothing left to be parsed anyway.
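
One way to apply the padding before appending is sketched below. It is not a definitive recipe: the function name `pad_to_multiple` is hypothetical, and it assumes that the kernel skips NUL bytes between archives, which is how the kernel's initramfs buffer format documentation describes the layout.

```shell
# pad_to_multiple FILE N: append NUL bytes until FILE's size is a multiple of N.
# Hypothetical helper; assumes the kernel ignores NUL bytes between archives.
pad_to_multiple() {
    file=$1
    align=$2
    size=$(wc -c < "$file")
    remainder=$((size % align))
    if [ "$remainder" -ne 0 ]; then
        # Append (align - remainder) zero bytes to the end of the file.
        dd if=/dev/zero bs=1 count=$((align - remainder)) 2>/dev/null >> "$file"
    fi
}
```

Running `pad_to_multiple initrd.gz 4` before the `cpio ... | gzip >> initrd.gz` command from the question would make the appended archive start on a 4-byte boundary. A larger alignment such as 512 also satisfies the divisible-by-4 condition.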

anx