Fixing Longhorn error FailedMount - exit status 32

A couple of days ago I started facing Longhorn issues after rebooting all three nodes. My adguard deployment was stuck trying to mount its PVs. The deployment uses RWX (ReadWriteMany) volumes, which means they are mounted over NFS.

Events:
  Type     Reason       Age                 From     Message
  ----     ------       ----                ----     -------
  Warning  FailedMount  45m (x3 over 56m)   kubelet  Unable to attach or mount volumes: unmounted volumes=[adguard-conf-pv adguard-work-pv], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount  16m (x23 over 61m)  kubelet  MountVolume.MountDevice failed for volume "adguard-work-pv" : rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: /usr/local/sbin/nsmounter
Mounting arguments: mount -t nfs -o vers=4.1,noresvport,intr,hard 10.43.20.191:/adguard-work-pv /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/6c472e8b20509432a91f0d78010890b28fd5ebac0d9e85d0554ba7a69c9dccbd/globalmount
Output: mount.nfs: Protocol not supported
  Warning  FailedMount  6m50s (x21 over 61m)  kubelet  Unable to attach or mount volumes: unmounted volumes=[adguard-work-pv adguard-conf-pv], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
  Warning  FailedMount  67s (x31 over 62m)    kubelet  MountVolume.MountDevice failed for volume "adguard-conf-pv" : rpc error: code = Internal desc = mount failed: exit status 32
Mounting command: /usr/local/sbin/nsmounter
Mounting arguments: mount -t nfs -o vers=4.1,noresvport,intr,hard 10.43.25.161:/adguard-conf-pv /var/lib/kubelet/plugins/kubernetes.io/csi/driver.longhorn.io/e5cf370525e8a07eda18f130064a2fb3712c6f4668904245d8622230e3defd8e/globalmount
Output: mount.nfs: Protocol not supported

At first I suspected the NFS package: maybe it had been updated and started causing issues with Longhorn. That wasn't the case. Searching online led me to a GitHub issue that was opened fairly recently (link here). andrewheberle commented there about facing the same issue after upgrading the kernel to 5.15.0-94.

Background on Ubuntu unattended upgrades: https://www.kolide.com/features/checks/ubuntu-unattended-upgrades

Checking my apt history, it looks like Ubuntu upgraded automatically on the 8th of February: less /var/log/apt/history.log
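Paging through the whole history file works, but a small filter makes kernel-related transactions easier to spot. The following is a minimal sketch, assuming the usual stanza layout of history.log (Start-Date, Commandline, Install/Upgrade lines); the function name kernel_upgrade_dates is made up for illustration.

```shell
# Hypothetical helper: print the date and command line of every apt
# transaction that installed or upgraded a linux-image package.
# Argument: path to the history file (defaults to /var/log/apt/history.log).
kernel_upgrade_dates() {
  awk '
    # A new transaction starts: remember its date, reset per-stanza state.
    /^Start-Date:/  { date = $2; cmd = ""; shown = 0 }
    # Keep the command that triggered the transaction.
    /^Commandline:/ { sub(/^Commandline: /, ""); cmd = $0 }
    # Report each transaction touching a kernel image only once.
    /^(Install|Upgrade):/ && /linux-image/ && !shown { print date, cmd; shown = 1 }
  ' "${1:-/var/log/apt/history.log}"
}
```

Calling kernel_upgrade_dates with no argument reads the default log. If the command column shows the unattended-upgrade binary, the kernel was most likely pulled in automatically rather than by a manual apt run.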

The automatic upgrade looks to be due to unattended upgrades, which are enabled by default upon OS installation.

You can check this by executing: cat /etc/apt/apt.conf.d/20auto-upgrades

APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";

In my case I disabled both the package list update and unattended upgrades by changing both values from 1 to 0.
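The same change can be scripted instead of edited by hand. This is only a sketch: the function name disable_auto_upgrades is hypothetical, and it prints the modified content to stdout rather than touching the file, so the result can be reviewed first.

```shell
# Hypothetical helper: print the given apt config with both periodic
# settings switched from "1" to "0". The file itself is not modified.
disable_auto_upgrades() {
  sed -E 's/^(APT::Periodic::(Update-Package-Lists|Unattended-Upgrade)) "1";/\1 "0";/' "$1"
}
```

Running disable_auto_upgrades /etc/apt/apt.conf.d/20auto-upgrades shows what the file would look like. To apply it for real, the same sed expression can be run with sudo and the -i flag against that file.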

Before we start the rollback, let's check the installed Linux kernel images with dpkg -l | grep linux-image. We should be able to see our target kernel version there. In this case that is 5.15.0-92.
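Since removing kernels without a fallback would leave the node unbootable, this check is worth automating. A minimal sketch, assuming the standard dpkg -l column layout (status in the first column, package name in the second); kernel_installed is a made-up name.

```shell
# Hypothetical helper: succeed only if a linux-image package for the given
# kernel version is fully installed (status "ii").
# Reads `dpkg -l` output on stdin; $1 is a version such as 5.15.0-92.
kernel_installed() {
  awk -v ver="$1" '
    $1 == "ii" && $2 ~ /^linux-image/ && index($2, "-" ver "-") > 0 { found = 1 }
    END { exit !found }
  '
}
```

Usage would look like: dpkg -l | kernel_installed 5.15.0-92 && echo "rollback target present". Packages in state "rc" (removed, config files left behind) are deliberately not counted.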

After verifying that, we can uninstall the newer kernel images. On the next reboot, the system should boot the newest kernel version that is still installed.

Commands:

sudo apt remove linux-headers-5.15.0-97-generic -y
sudo apt remove linux-headers-5.15.0-97 linux-headers-5.15.0-97-generic linux-modules-5.15.0-97-generic linux-image-unsigned-5.15.0-97-generic -y

sudo apt remove linux-headers-5.15.0-94-generic -y
sudo apt remove linux-headers-5.15.0-94 linux-headers-5.15.0-94-generic linux-modules-5.15.0-94-generic linux-image-unsigned-5.15.0-94-generic -y

# then reboot
sudo shutdown -r now

After the reboot, check the running kernel version: uname -r
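With three nodes to verify, the check can be wrapped in a small function. This is a sketch only: is_affected_kernel is a hypothetical name, and the version list simply mirrors the two kernels removed in this post, not an authoritative list of affected releases.

```shell
# Hypothetical helper: succeed if the given kernel release string is one of
# the versions removed in this post (assumption: 5.15.0-94 and 5.15.0-97).
is_affected_kernel() {
  case "$1" in
    5.15.0-94-*|5.15.0-97-*) return 0 ;;
    *) return 1 ;;
  esac
}
```

On each node, something like is_affected_kernel "$(uname -r)" && echo "still on an affected kernel" tells you whether the rollback took effect.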

The Longhorn volumes now mount successfully, with no errors.

P.S. This appears to be a known issue that is already documented in this Longhorn KB article