I have a CentOS 8 installation where the partitioning and RAID 1 configuration were done using the CentOS installer's automatic partitioning. Here is the output of lsblk:

sda         8:0    0 558.9G  0 disk
├─sda1      8:1    0    50G  0 part
│ └─md127   9:127  0    50G  0 raid1 /
├─sda2      8:2    0    20G  0 part
│ └─md126   9:126  0    20G  0 raid1 [SWAP]
├─sda3      8:3    0     1G  0 part
│ └─md125   9:125  0  1022M  0 raid1 /boot
├─sda4      8:4    0   600M  0 part
│ └─md124   9:124  0   600M  0 raid1 /boot/efi
└─sda5      8:5    0 487.3G  0 part
  └─md123   9:123  0 487.2G  0 raid1 /home
sdb         8:16   0 558.9G  0 disk
├─sdb1      8:17   0    50G  0 part
│ └─md127   9:127  0    50G  0 raid1 /
├─sdb2      8:18   0    20G  0 part
│ └─md126   9:126  0    20G  0 raid1 [SWAP]
├─sdb3      8:19   0     1G  0 part
│ └─md125   9:125  0  1022M  0 raid1 /boot
├─sdb4      8:20   0   600M  0 part
│ └─md124   9:124  0   600M  0 raid1 /boot/efi
└─sdb5      8:21   0 487.3G  0 part
  └─md123   9:123  0 487.2G  0 raid1 /home

As you can see, the /boot/efi partition is mirrored in RAID 1 like every other partition. Now I'm trying to recreate the same setup when installing Debian, and I'm unable to proceed. If I set up the partitions and RAID 1 this way, the installer fails during the GRUB installation (with no specific error message, just the generic "Some installation step has failed").

Screenshot: [image: error]

The error goes away if I do not mirror the ESP partition.

I realise that mirroring the ESP partition is something that sounds unfeasible, and looking around it seems everybody agrees. But the CentOS installer manages to do it somehow.

What do I have to do to recreate the same setup on Debian?

gigabytes
  • Try using just sda4 for /boot/efi, and then turn it into a raid-1 mirror with mdadm after the system has installed and booted. BTW, a raid-1 mirror of the ESP partition is fine (but don't use other raid types like raid-0 or 10 or 5 or 6), but remember that you'll have to tell your UEFI to use the other disk if the primary disk dies - UEFI doesn't understand Linux mdadm raid and won't automatically switch to the mirror. – cas Apr 08 '21 at 12:12
  • So the steps after installation are: create the md device, format the partition as FAT32, change its type to ESP with parted/fdisk/etc., mount it again on /boot/efi, and then how do I tell grub to repopulate it? – gigabytes Apr 08 '21 at 12:54
  • Make a degraded raid-1 using only **/dev/sdb4**. Format it as FAT32, mount it somewhere convenient (/mnt, perhaps) and copy everything from /boot/efi to it (use `cp -a` or `rsync` or some other method that recurses into any sub-directories). Unmount /boot/efi and then add /dev/sda4 to the raid-1 with sdb4. This will cause sda4 to be synced with the contents of sdb4. Unmount this raid-1 mirror and remount it as /boot/efi (and don't forget to update `/etc/fstab` so that it mounts the mirror device instead of /dev/sda4 - use a LABEL or UUID instead of a /dev/ entry). – cas Apr 08 '21 at 13:05
  • Thanks, so the contents of /boot/efi cannot be "recreated" from the grub package? Just curious. – gigabytes Apr 08 '21 at 13:09
  • If you need more details, there are numerous questions with answers here on this site with detailed instructions for doing this kind of thing with degraded (i.e. missing one or more devices) raid mirrors. e.g. https://unix.stackexchange.com/questions/63928/can-i-create-a-software-raid-1-with-one-device – cas Apr 08 '21 at 13:10
  • Depends what's on /boot/efi. IIRC, `update-grub` can & will copy its own boot-loader there, but can't do anything about any others that might have been installed by the BIOS or other programs or operating systems. Easiest to just copy everything from sda4 to the mirror; it's only a few hundred MB at most, anyway. – cas Apr 08 '21 at 13:12
  • When you get it working, please write up what you did as an answer and then accept it (unless someone else posts a better answer), so that this question gets flagged as answered. So take notes :) – cas Apr 08 '21 at 13:15
  • I've tried it, and it worked and booted. But then I tried removing the first disk to simulate a failure, and as you pointed out the system does not boot anymore. Is it just a problem with EFI that I have to solve in my EFI firmware? (I'm on VirtualBox, by the way) – gigabytes Apr 08 '21 at 13:30
  • Yeah, you need to tell UEFI where to find its ESP partition. I think some motherboard/BIOS manufacturers allow you to give it a list of partitions to try, but I don't know if vbox's UEFI BIOS code is capable of that or not. Maybe try setting the partition type to ESP on both sda4 and sdb4 (which will probably mean you have to manually define that raid array in /etc/mdadm/mdadm.conf rather than rely on mdadm auto-detect). – cas Apr 08 '21 at 13:36
  • This is notoriously complicated. [See this bug](https://github.com/systemd/systemd/issues/12468) and the (numerous) other bugs linked from there for discussion and workaround. EFI partition mirroring is _feasible_ but definitely not nice or easy. [Here](https://github.com/systemd/systemd/issues/17298)’s the exact configuration that “works” for me. – Andrej Podzimek Dec 22 '22 at 14:36
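
The degraded-mirror procedure described in the comments above could be sketched roughly as follows. This is only an illustration, not something cas or the asker posted: the array name /dev/md/esp is hypothetical, `--metadata=1.0` is borrowed from the answer below rather than from the comments, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: build the ESP mirror from a degraded array.
# Prints each command by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN="${DRY_RUN:-1}"
MD_DEV="${MD_DEV:-/dev/md/esp}"   # hypothetical array name

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@" || { printf 'failed: %s\n' "$*" >&2; exit 1; }
    fi
}

mirror_esp_degraded() {
    # "missing" leaves the second slot empty, so the array starts degraded
    run mdadm --create "$MD_DEV" --level=1 --raid-devices=2 --metadata=1.0 missing /dev/sdb4
    run mkfs.vfat -F 32 "$MD_DEV"
    run mount "$MD_DEV" /mnt
    # FAT has no Unix ownership, so a plain recursive copy is enough
    run cp -r /boot/efi/. /mnt/
    run umount /boot/efi
    # adding sda4 overwrites it with a copy of sdb4
    run mdadm --manage "$MD_DEV" --add /dev/sda4
    run umount /mnt
    run mount "$MD_DEV" /boot/efi
    # /etc/fstab still has to be updated by hand (LABEL or UUID)
}

mirror_esp_degraded
```

Reviewing the printed commands before running them for real seems prudent, since adding /dev/sda4 to the array destroys its previous contents.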

1 Answer

Thanks to the comments by @cas, I got this working.

The main steps were:

  1. I installed Debian without setting up RAID for the ESP partition. During partitioning, I created two identical partitions marked as ESP partitions, on /dev/sda1 and /dev/sdb1.
  2. I copied the contents of /boot/efi somewhere else (/boot/eficopy).
  3. umount /boot/efi
  4. mdadm --create --verbose /dev/md3 --level=1 --raid-devices=2 --metadata=1.0 /dev/sda1 /dev/sdb1 (change /dev/md3 to something else if /dev/md3 is already an active MD device)
  5. mkfs.vfat /dev/md3
  6. I looked up the UUID of the new filesystem in /dev/disk/by-uuid.
  7. I changed the /boot/efi entry in /etc/fstab to use the new UUID.
  8. mount /boot/efi
  9. I copied the data from the backup into /boot/efi again.
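
For reference, steps 2 to 9 could be collected into a small script along these lines. It is a sketch under assumptions, not the exact commands I typed: the `run` helper and the variable names are made up for illustration, `blkid` is used in place of browsing /dev/disk/by-uuid, and by default nothing is executed, the commands are only printed.

```shell
#!/bin/sh
# Dry-run sketch of steps 2-9. Prints each command by default;
# set DRY_RUN=0 (as root) to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
MD_DEV="${MD_DEV:-/dev/md3}"        # change if /dev/md3 is already in use
ESP_A="${ESP_A:-/dev/sda1}"         # the two ESP partitions from step 1
ESP_B="${ESP_B:-/dev/sdb1}"
BACKUP="${BACKUP:-/boot/eficopy}"   # backup location from step 2

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@" || { printf 'failed: %s\n' "$*" >&2; exit 1; }
    fi
}

mirror_esp() {
    run mkdir -p "$BACKUP"
    run cp -a /boot/efi/. "$BACKUP/"        # step 2
    run umount /boot/efi                    # step 3
    run mdadm --create --verbose "$MD_DEV" --level=1 --raid-devices=2 \
        --metadata=1.0 "$ESP_A" "$ESP_B"    # step 4
    run mkfs.vfat "$MD_DEV"                 # step 5
    run blkid -s UUID -o value "$MD_DEV"    # step 6: prints the new UUID
    # step 7 is manual: put that UUID into the /boot/efi line of /etc/fstab
    run mount /boot/efi                     # step 8
    # step 9: FAT stores no Unix ownership, so copy back without -a
    run cp -r "$BACKUP/." /boot/efi/
}

mirror_esp
```

With DRY_RUN=0 the script would stop at step 8 unless /etc/fstab has been edited in between, so in practice it is probably easier to run the printed commands one at a time.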

The reboot worked.

EDIT: Instead of backing up the /boot/efi partition, it seems that

grub-install --efi-directory=/boot/efi

does the job of restoring its contents (at step 9 above), even though I got a lot of warnings I cannot understand.
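
As the comments on the question point out, the firmware does not understand mdadm and will not switch disks by itself. One approach that might help, which I have not verified as part of the steps above, is to register one boot entry per disk with efibootmgr. The loader path and the labels below are assumptions (check what actually exists under /boot/efi/EFI), and the sketch only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run sketch: one UEFI boot entry per mirrored ESP.
# Prints each command by default; set DRY_RUN=0 (as root) to execute.
DRY_RUN="${DRY_RUN:-1}"
ESP_PART="${ESP_PART:-1}"               # ESP is partition 1 on both disks
LOADER='\EFI\debian\grubx64.efi'        # assumed loader path on the ESP

run() {
    if [ "$DRY_RUN" = "1" ]; then
        # note: the printed form loses shell quoting (e.g. around the label)
        printf '%s\n' "$*"
    else
        "$@" || { printf 'failed: %s\n' "$*" >&2; exit 1; }
    fi
}

add_boot_entries() {
    for disk in /dev/sda /dev/sdb; do
        run efibootmgr --create --disk "$disk" --part "$ESP_PART" \
            --label "debian ($disk)" --loader "$LOADER"
    done
}

add_boot_entries
```

Running `efibootmgr -v` afterwards lists the entries and the boot order, which shows whether the firmware accepted both.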

EDIT2: One should probably prefer metadata version 1.0 over 0.9, as per the wiki page A guide to mdadm.

Version 1.0 still meets the requirement (for this use case) of placing the superblock at the end of the device, but also includes "the modern features of mdadm" by using the same common layout format as 1.1 and 1.2.
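
To check which metadata version an existing array uses, something like the helper below might do. The function name is made up, and it assumes the `Version : x.y` line that `mdadm --detail` normally prints near the top of its output.

```shell
#!/bin/sh
# Print the metadata version found in `mdadm --detail` output on stdin.
md_metadata_version() {
    awk -F' : ' '
        /^[[:space:]]*Version[[:space:]]*:/ {
            gsub(/[[:space:]]/, "", $2)   # strip stray whitespace
            print $2
            exit
        }'
}
```

Usage would be `mdadm --detail /dev/md3 | md_metadata_version`. For the array created in step 4 this should print 1.0.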

twan
gigabytes