
After rebooting my server I get the following error message:

Begin: Running /scripts/init-premount … done.
Begin: Mounting root file system … 
Begin: Running /scripts/local-top …
Volume group "ubuntu-vg" not found
Cannot process volume group ubuntu-vg
Begin: Running /scripts/local-premount …
...
Begin: Waiting for root file system …
Begin: Running /scripts/local-block …
mdadm: No arrays found in config file or automatically
Volume group "ubuntu-vg" not found
Cannot process volume group ubuntu-vg
mdadm: No arrays found in config file or automatically # <-- approximately 30 times
mdadm: error opening /dev/md?*: No such file or directory
done.
Gave up waiting for root file system device. 
Common problems:
-   Boot args (cat /proc/cmdline)
-   Check rootdelay= (did the system wait long enough?)
-   Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/mapper/ubuntu--vg-ubuntu--lv does not exist. Dropping to a shell!

The system drops to the initramfs shell (busybox), where lvm vgscan doesn't find any volume groups and ls /dev/mapper shows only one entry, control.
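For anyone else stuck at the same (initramfs) prompt, a few quick checks narrow things down. This is a sketch, not from the original post; the lvm and modprobe lines are commented out because they only make sense on the affected machine, while the /proc read works anywhere:

```shell
# List the block devices the kernel sees; if the RAID logical drive
# (e.g. sda) is missing here, the controller driver never loaded.
awk 'NR > 2 {print $4}' /proc/partitions

# Only meaningful at the real (initramfs) prompt:
# lvm pvscan            # does LVM see any physical volumes at all?
# lvm vgchange -ay      # try to activate every VG it can find
# modprobe smartpqi     # load the Smart Array driver, then re-run pvscan
```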

When I boot the live SystemRescueCD, the volume group can be found and the LV is available as usual at /dev/mapper/ubuntu--vg-ubuntu--lv. I am able to mount it and the VG is set to active. So the VG and the LV look fine, but something seems broken during the boot process.

Ubuntu 20.04 Server, LVM on top of hardware RAID 1+0 with 4 SSDs. The hardware RAID controller is an HPE Smart Array P408i-p SR Gen10 with firmware version 3.00, driving four HPE SSDs model MK001920GWXFK in a RAID 1+0 configuration. The server model is an HPE ProLiant DL380 Gen10.

No software RAID, no encryption.

Any hints on how to find and fix the problem?

EDIT I:

/proc/partitions looks good (screenshot not reproduced here).

blkid (screenshot not reproduced here)

Where

  • /dev/sdc1 is /boot/efi
  • /dev/sdc2 is /boot
  • /dev/sdc3 is the PV

Booting from an older kernel version worked once, until I ran apt update && apt upgrade. After the upgrade the older kernel showed the same issue.

EDIT II:

In /proc/modules I can find the following entry: smartpqi 81920 0 - Live 0xffffffffc0626000

lvm pvs produces no output in the initramfs shell.

Output of lvm vgchange -ay -v:

No volume groups found.

Output of lvm vgchange -ay --partial vg-ubuntu -v:

PARTIAL MODE. Incomplete logical volumes will be processed.
VG name on command line not found in list of VGs: vg-ubuntu
Volume group "vg-ubuntu" not found
Cannot process volume group vg-ubuntu

There is a second RAID controller of the same model (P408i-p SR Gen10) with HDDs, connected to another PCI slot. A volume group named "cinder-volumes" is configured on top of that RAID, but this VG can't be found in the initramfs either.

EDIT III:

Here is a link to the requested files from the root FS:

  • /mnt/var/log/apt/term.log
  • /mnt/etc/initramfs-tools/initramfs.conf
  • /mnt/etc/initramfs-tools/update-initramfs.conf

EDIT IV:

In the SystemRescueCD I mounted the LV / (root), /boot and /boot/efi and chrooted into the LV /. All the mounted volumes have enough disk space left (disk space used < 32%).

The output of update-initramfs -u -k 5.4.0-88-generic is:

update-initramfs: Generating /boot/initrd.img-5.4.0-88-generic
W: mkconf: MD subsystem is not loaded, thus I cannot scan for arrays.
W: mdadm: failed to auto-generate temporary mdadm.conf file

The image /boot/initrd.img-5.4.0-88-generic has an updated last modified date.

The problem remains after rebooting. The initrd parameter for each menu entry in the GRUB config /boot/grub/grub.cfg points to /initrd.img-5.4.0-XX-generic, where XX differs per entry, i.e. 88, 86 and 77.

In the /boot directory I can find the matching kernel images:

vmlinuz-5.4.0-88-generic
vmlinuz-5.4.0-86-generic
vmlinuz-5.4.0-77-generic

The link /boot/initrd.img points to the latest version /boot/initrd.img-5.4.0-88-generic.

EDIT V:

Since no measure led to the desired result and the effort to save the system was too great, I had to rebuild the server completely.

sebmal
  • When the system goes to initramfs shell, run `cat /proc/partitions` to verify that the hardware RAID controller has been successfully detected and its logical drive(s) are visible. If the RAID drive that is supposed to contain the LVM PV is not visible, the most likely reason is that the driver for the hardware RAID controller has not been loaded: perhaps the correct module has not been included in initramfs. Try selecting the previous kernel version from the "advanced boot options" GRUB menu: if that works, but the latest kernel fails, then something is wrong with the new kernel's initramfs. – telcoM Oct 11 '21 at 15:22
  • @telcoM thanks for your answer. Seems like the hardware RAID controller can be detected. Could the error still depend on a missing module? If yes, is there any possibility to find out which one? – sebmal Oct 12 '21 at 08:23
  • In initramfs shell the `etc/fstab` file is empty. Is that normal? – sebmal Oct 12 '21 at 08:37
  • You might want to specify the make and model of the hardware RAID controller. The fact that the controller is visible in `lspci` output does not mean the driver module for it is loaded: `lspci` gets its information from the standard PCI(e) bus structures. Since the only goal of the initramfs is to mount the root filesystem, it does not have much need for `/etc/fstab`. – telcoM Oct 12 '21 at 09:18
  • It's a HPE Smart Array P480i-p SR Gen10 controller with firmware version 3.00. Four HPE SSDs model MK001920GWXFK in a RAID 1+0 configuration. The server model is HPE Proliant DL380 Gen10. That makes sense, with the `etc/fstab`. – sebmal Oct 12 '21 at 11:07
  • Hmm, Smart Array P480i-p SR Gen10 does not seem to exist, maybe you meant P408i-P SR Gen10? In that case, the driver module would be `smartpqi`. If you run `lvm pvs`, does it list `sdb1` and/or `sdc3` as belonging to `ubuntu-vg`? If you run `lvm vgchange -ay` in the initramfs prompt to activate the VG explicitly, what does it say? Does it help if you run `lvm vgchange -ay --partial ubuntu-vg`? – telcoM Oct 12 '21 at 11:42
  • I took the liberty of editing your original post to add the hardware information to it. In StackExchange, the recommended way is to provide any requested information by editing the original post, as these comments are not permanent and having all the pertinent information in the question makes it more useful for others who might have a similar problem. – telcoM Oct 12 '21 at 11:50
  • Thanks for editing the post. Yes, P408i-p SR Gen 10 is correct. I changed it in the original post. – sebmal Oct 12 '21 at 13:35
  • May you please boot via your SystemRescueCD, mount your LV to `/mnt` and post/attach log file `/mnt/var/log/apt/term.log` to your original post? Please also post/attach `/mnt/etc/initramfs-tools/initramfs.conf` and `/mnt/etc/initramfs-tools/update-initramfs.conf` to your original post. – paladin Oct 12 '21 at 13:46
  • @paladin I couldn't find any hints in the files. Could you find something wrong? – sebmal Oct 13 '21 at 09:09
  • Your apt-log seems to be okay, also the initramfs-configs. Is your LV installed on a RAID? Because, if so, this initramfs-warning would explain your problem: `W: mkconf: MD subsystem is not loaded, thus I cannot scan for arrays.`, make sure the `MD subsystem` is installed in your `initramfs`. – paladin Oct 18 '21 at 08:37

3 Answers


I had a very similar problem. After a failed installation of a new kernel (mainly because the /boot partition ran out of space) I manually updated the initramfs, and after rebooting, initramfs wouldn't prompt for decryption of the encrypted partition. I was getting errors of the sort Volume group "vg0" not found and then the initramfs prompt, which is similar to a terminal but with limited capabilities.

My problem got solved by:

  1. freeing space in the boot partition.
  2. installing cryptsetup-initramfs.

For step 1 I used the recipe in this post to delete old kernels: https://askubuntu.com/a/1391904/1541500. A note on step 1: if you cannot boot into any older kernel (as was my case), you might need to perform those steps as part of step 2 (live CD session), after the chroot command.

For step 2 I booted from the live CD and opened a terminal. Then I mounted the system, installed the missing packages and prompted the reinstallation of the last kernel (which automatically updates the initramfs and grub cfg).

In the following I list the commands I used in the terminal of the live CD session for step 2 in order to fix the system.

In my case I have the following partitions:

  • /dev/sda1 as /boot/efi with fat32
  • /dev/sda2 as /boot with ext4
  • /dev/sda3 as vg0 with lvm2 -> this is my encrypted partition, with my kubuntu installation.

Also important to mention is that my encrypted partition is listed as CryptDisk in /etc/crypttab. This name is necessary in order to decrypt the partition using cryptsetup luksOpen. Once decrypted, vg0 has 3 partitions: root, home and swap.
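The name given to luksOpen must match the first field in /etc/crypttab, or the initramfs will look for the wrong mapper device (a point also raised in the comments). A minimal sketch of that check, using a scratch file in place of the real /etc/crypttab:

```shell
# Scratch stand-in for /etc/crypttab (illustration only).
tab=$(mktemp)
echo 'CryptDisk /dev/sda3 none luks' > "$tab"

# The first field of the first non-comment line is the mapper name
# that cryptsetup luksOpen must be given.
name=$(awk '$1 !~ /^#/ {print $1; exit}' "$tab")
echo "cryptsetup luksOpen /dev/sda3 $name"
# → cryptsetup luksOpen /dev/sda3 CryptDisk
rm -f "$tab"
```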

Now back to the commands to run in the LIVE CD terminal:

sudo cryptsetup luksOpen /dev/sda3 CryptDisk # open encrypted partition
sudo vgchange -ay
sudo mount /dev/mapper/vg0-root /mnt # mount root partition
sudo mount /dev/sda2 /mnt/boot # mount boot partition
sudo mount -t proc proc /mnt/proc
sudo mount -t sysfs sys /mnt/sys
sudo mount -o bind /run /mnt/run # to get resolv.conf for internet access
sudo mount -o bind /dev /mnt/dev
sudo chroot /mnt
apt-get update
apt-get dist-upgrade
apt-get install lvm2 cryptsetup-initramfs cryptsetup
apt-get install --reinstall linux-image-5.4.0-99-generic linux-headers-5.4.0-99-generic linux-modules-5.4.0-99-generic linux-tools-5.4.0-99-generic

After successful reboot I updated all other kernels with update-initramfs -c -k all.

mapazarr
    bravo. importance of first command name matching crypttab value should be highlighted more! – Andrew May 03 '22 at 01:47
  • I ended up here while trying to fix my Kubuntu getting stuck at "press ctrl+c to cancel all filesystem checks in progress" during boot. Turns out some of the (linux-modules and linux-tools) packages did not get installed due to lack of disk space. Running the `apt-get install --reinstall` with all these packages fixed the issue. – Jeroen De Dauw Aug 06 '22 at 21:38

Since the problem went away by booting with an older kernel version, but reappeared after an upgrade, I think you might be accidentally switching kernel flavours: see https://wiki.ubuntu.com/Kernel/LTSEnablementStack for more details.

Basically, Ubuntu 20.04 may have one of three versions of Linux kernel in use:

  • the General Availability (GA) of Ubuntu 20.04 version: metapackage linux-generic
  • the OEM special version: metapackage linux-oem-20.04
  • the Long-Term Support Enablement / Hardware Enablement (HWE) kernel version: metapackage linux-generic-hwe-20.04

Your hardware might be so new it requires either the OEM or HWE kernel. But if the system was originally installed with the "wrong" kernel and then the correct one was manually installed, without installing the corresponding metapackage too, it is possible that the update mechanism now defaults to installing the latest kernel of the GA series, whose smartpqi driver might be too old for your hardware.

As suggested by paladin in the comments, you might want to boot from SystemRescueCD and look at the /var/log/apt/term.log in the system to figure out the exact kernel package version(s) that got replaced in the updates.

Once you know the correct kernel flavour, you can either try the boot menu again if it still holds an older kernel version that works, or boot from SystemRescueCD, mount the root LV, chroot into it, mount any other necessary filesystems, install the newest kernel package of the correct flavour, and reboot.

If the system then runs to your satisfaction, then you should remove the metapackages associated with the "wrong" kernel flavours, if any are installed: those will direct apt in the choice of kernel flavour whenever a new kernel update becomes available.


If the kernel flavour turns out to be correct after all, then it might be something simpler, like insufficient disk space for update-initramfs to create a new initramfs file for new kernels.

That has an easy fix: first free up some disk space (clearing up apt caches with apt clean might come in handy), then run update-initramfs -u -k version-of-newest-kernel-package to re-create the initramfs file. You may wish to repeat this command for any kernel version that currently has a failing initramfs, just to give you more workable boot options in case there will be more problems in the future.
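The "repeat for any kernel version" step can be scripted by deriving the version strings from the vmlinuz files. A sketch against a scratch directory so nothing real is touched; on the actual system, point it at /boot and drop the echo:

```shell
# Scratch directory standing in for /boot (illustration only).
boot=$(mktemp -d)
touch "$boot/vmlinuz-5.4.0-77-generic" "$boot/vmlinuz-5.4.0-88-generic"

for k in "$boot"/vmlinuz-*; do
  v=${k#"$boot"/vmlinuz-}           # strip the path prefix -> version string
  echo "update-initramfs -u -k $v"  # drop the echo on the real system
done
rm -rf "$boot"
```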

telcoM
  • I tried `update-initramfs` (see edit 4 in my original post). I will have a look at the kernel flavorous tomorrow. Thank you very much for all your hints and your effort. I very much appreciate your support! – sebmal Oct 12 '21 at 18:23

I faced a similar issue some time ago. I was adding another LVM volume to the setup and accidentally renamed my ubuntu-vg to the new volume's name (I messed up the commands).

The solution was to boot from a USB bootable disk and rename the faulty VG back. Steps I followed:

1. Boot using a live Linux USB and open a terminal.
2. Type in the following:
   2.1 sudo vgdisplay → shows the list of all the VGs in your setup
   2.2 Identify the VG that has been disturbed or that contains the primary FS
   2.3 sudo vgrename <wrong_name> ubuntu-vg

After the above, shut down cleanly and then restart. The volume group will be seen by GRUB and the system will boot as normal.

Naresh Mehta