Raspberry pi4 Ubuntu not booting

gbajart · September 30, 2021, 4:17pm

Hello,

Today I was working on my 3 raspberries when I had to reboot one of them. After rebooting, the boot got stuck at the message below, right after the bootloader:

spi-bcm2835 fe204000.spi: could not get clk: -517
_ (blinking)

After restarting all my rpi4s, I noticed that they were all stuck with the same behavior.

My problem seems similar to this thread:

I would like to overcome this without reflashing the SD card and, more importantly, understand why it happened.

I checked my power supply to see whether the voltage and its quality were OK, and unfortunately discovered that the power was reaching peaks of 5.30 V, which is above the 5.25 V limit (overvoltage). See the topic below:

I then used a lab power supply calibrated to 5 V to be sure the signal was fine and started the rpi4. The problem is still there.

I did long configuration work on these platforms and would like to get the system back up and running without re-flashing the SD card. Would that be possible?
What could have gone wrong here, since I did not work on the two other tablets?

I also tried to use the keyboard to interrupt the auto boot, but it seems that no key is detected at that point of the boot…

I would really appreciate any help, thank you in advance
Gilles

O.S: Ubuntu x64 20.04 LTS
Board: Raspberry pi 4 b
ZYMKEY4 without battery in developer mode.

Bob_of_Zymbit · September 30, 2021, 10:02pm

Hello Gilles,

It looks like when I reproduced this, it continued on after 10 seconds or so and booted. Can you wait for 30 seconds or so and see if you boot or get any additional output? I’m curious about the message right after that one if you see anything.

Bob

gbajart · October 1, 2021, 8:53am

Hello Bob,

Thank you for your message.

Can you wait for 30 seconds or so and see if you boot or get any additional output?

→ Left it on for at least one hour: Nothing happens, it hangs, no keyboard input press is possible, CTRL+ALT+DEL does not reboot.

My fourth zymkey also got locked with the same behavior… I definitely did something wrong.
I attach a picture of the boot message at the beginning:

Could you please answer my questions from my first post?
I have many other questions, like:

What are the conditions for the zymkey to destroy the master key in developer mode?
Is the battery cell mandatory for the zymkey to work in developer mode?
Can we swap the SD card without the master key being destroyed?

Thank you so much!
Gilles

Bob_of_Zymbit · October 1, 2021, 3:44pm

Gilles,

I’m going to follow up with you directly via email for now rather than here in the community since I cannot currently reproduce this issue and we may go back and forth a few times.

I’ll try my best to address any questions from both posts. I will edit these later to be more helpful once we figure out what happened. Answers for now:

This does not seem to have anything to do with power. Your power source seems fine. It seems to be related to an upgrade of your kernel/initrd. I currently cannot reproduce and do not understand why it happened.
Encrypting your root partition is obviously to prevent someone in the field from accessing your intellectual property. Good practice would be for you to have access to your code/image locally so that you can reproduce another SD card should it be needed.
The only way to destroy the keys on a Zymkey is via our API by enabling that mode for a perimeter detect event. That said, there is no way to access the key to unlock your root file system once you have encrypted even in development mode without that specific Zymkey, SD card, PI combination.
The battery is only necessary to maintain the RTC and if you have setup perimeter detect to take action - notify or destroy keys.
In developer mode, you can “start over” at any time. The Zymkey does not destroy any keys unless you enable that mode via the API for perimeter detect events and put it into Production Mode by clipping the tab. You can put in a new SD card and re-use the Zymkey and PI.

I’ll update this post once we determine what changed during the upgrade of ubuntu for others in the Community.

Bob

gbajart · October 22, 2021, 3:40pm

After a few weeks of using the zymkey without any issue, I’m back here to conclude on this topic.

First @Bob_of_Zymbit I’d like to thank you for your quick reply and our conversation by mail! It was really helpful.

We found that the issue was caused by a change in the init RAM disk file (initrd.img). If the system is upgraded or a system package is added, the initrd file may be rebuilt with new tools or new files. If the file changes, the zymkey sees this change as “tampering” and therefore prevents the OS from booting…

To mitigate this issue, stop and disable the unattended-upgrades service:

systemctl stop unattended-upgrades
systemctl disable unattended-upgrades

To be sure the system-boot partition (mmcblk0p1) is not modified, mount it read-only by adding ro after defaults in its /etc/fstab entry
→ afterwards reboot the system.
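
After the reboot it is worth confirming that the read-only option actually took effect. The following is a minimal sketch, assuming the system-boot partition is mounted at /boot/firmware (the path used later in this thread); is_readonly_mount is a hypothetical helper name, and its optional second argument only exists so it can be tried against a sample file instead of /proc/mounts.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only if the given mount point is mounted read-only.
# is_readonly_mount is a hypothetical helper, not an existing tool.
# Usage: is_readonly_mount <mount-point> [mounts-file]
is_readonly_mount() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    # /proc/mounts fields: device, mount point, fs type, options, dump, pass.
    # If the mount point is listed several times, the last entry wins.
    awk -v mp="$mount_point" '
        $2 == mp {
            found = 1
            ro = 0
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
        }
        END { if (found && ro) exit 0; exit 1 }
    ' "$mounts_file"
}
```

For example: is_readonly_mount /boot/firmware && echo "system-boot is read-only"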

To check that your kernel stayed unchanged, use:
uname -r

And of course make backups of the SD card to be able to recover the system-boot partition later if something happens
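
Since the trigger was a changed initrd.img, one way to notice a rebuild before the next reboot is to keep a checksum of the file and compare against it later. This is only a sketch of that idea, not something the zymkey tooling provides: record_baseline and has_changed are hypothetical helper names, and the paths in the usage line are assumptions.

```shell
#!/bin/sh
# Sketch: detect whether a file (e.g. initrd.img) changed since a baseline was recorded.
# Both helpers are hypothetical and only wrap sha256sum.

# Usage: record_baseline <file> <baseline-file>
record_baseline() {
    sha256sum "$1" | cut -d ' ' -f 1 > "$2"
}

# Usage: has_changed <file> <baseline-file>
# Succeeds (exit 0) when the current hash differs from the recorded one.
has_changed() {
    current=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$current" != "$(cat "$2")" ]
}
```

For example, run record_baseline /boot/firmware/initrd.img ~/initrd.sha256 once while the system boots fine, then has_changed /boot/firmware/initrd.img ~/initrd.sha256 && echo "initrd was rebuilt" before any later reboot.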

Cheers,
Gilles

kontrasec · November 19, 2021, 6:53am

I find myself up against this exact issue on two of my rpi4’s, ubuntu 20.04 LTS, with zymbit 4i’s.

Thank you for sharing your findings, and that it’s related to kernel upgrades. That may point to how to fix this, e.g. with a hook on apt upgrade to ask the zymbit to rekey.

gbajart · November 29, 2021, 4:42pm

Hey,

@kontrasec thanks for your message. I think we may get an answer from @Bob_of_Zymbit?

From my point of view, this would bypass the zymkey's secure boot, because someone could then change the boot files without me noticing…
I think the best approach is to re-deploy an OS with the correct packages and versions, or to use Docker images if a newer system is necessary.

I may be wrong!
Looking forward to your security expert advice.

Gilles

kontrasec · December 6, 2021, 5:57am

Looking through the installation scripts, at first blush it appears as if there are provisions for hooking when the kernel updates.

mk_kernel_update_initramfs()
{
# Replace existing kernel initramfs rebuild with our own
cat > /etc/kernel/postinst.d/initramfs-tools <<"EOF"
...

and /usr/sbin/update_encr_initrd. I haven’t spotted the exact binding calls yet, though; more reading is needed another time.

kontrasec · December 6, 2021, 8:32am

Popping open the signed image (tar gz) referenced by the script, the next clue for where the magic happens is in conv_edev_rfs/opt/zk_luks/scripts/create_zk_crypt_vol:122:

Clearing /var/lib/zymbit and resetting the zkifs service, presumably it recreates folder contents upon binding.

Primitively, without further insight…

The update logic to repair (create?) would be an apt-upgrade hook to zkunlockifs the wrapped LUKS key at /var/lib/zymbit/key.bin.lock, rebind the zymbit key to the newly updated OS, and then zklockifs the same LUKS key to a newly-wrapped file at the same location.

–

My understanding of the module design means that this will only work when the zymbit key is in development mode - production mode, sealed is sealed. This may meet what Gilles brings up, I think.

kontrasec · December 11, 2021, 7:06am

I present my mitigation for device bricking. This is quite primitive for now, but it allows you to use a fallback LUKS password to get back into the workstation.

The key is to add a fallback for /lib/cryptsetup/scripts/zk_get_key to /lib/cryptsetup/askpass.

First, create a backup LUKS passphrase:


# Future state: correct the pipelining to avoid a file...
#sudo /lib/cryptsetup/scripts/zk_get_key | sudo cryptsetup luksAddKey --key-file=- /dev/mmcblk0p2

# Unwrap the existing LUKS key into a temporary file.
# The call to zkunlockifs frequently crashes on my device, for some reason...
sudo zkunlockifs /var/lib/zymbit/key.bin.lock > key.bin

# Create a backup key: key.bin authorizes the change, the new passphrase is prompted for
sudo cryptsetup luksAddKey --key-file=key.bin /dev/mmcblk0p2

# Remove the plaintext key file again
rm key.bin

Then, edit your /lib/cryptsetup/scripts/zk_get_key to something like:

#!/bin/sh

num_times=30
while [ ${num_times} -gt 0 ]
do
    ls /sys/class/net/eth* 1>/dev/null 2>&1
    eth=$?
    ls /sys/class/net/enx* 1>/dev/null 2>&1
    enx=$?

    if [ ${eth} -ne 0 ] && [ ${enx} -ne 0 ]
    then
        num_times=$((num_times-1))
        sleep 0.1
    else
        break
    fi
done
num_times=30
while [ ${num_times} -gt 0 ]
do
    if [ -d "/var/lib/zymbit" ]
    then
        break
    else
        num_times=$((num_times-1))
        sleep 0.1
    fi
done

if [ -e /var/lib/zymbit/zkenv.conf ]
then
    export $(cat /var/lib/zymbit/zkenv.conf)
fi

/sbin/zkunlockifs /var/lib/zymbit/key.bin.lock

err=$?
if [ ${err} -ne 0 ]
then
    /lib/cryptsetup/askpass "Zymbit key did not release LUKS key. Enter backup LUKS passphrase:"
fi

Bake:
sudo update-initramfs -u -k all

I have tested this fallback mechanism with ubuntu 20.04 LTS on an rpi4 4gb. In the event the zymbit refuses to pop open the boot disk, or you physically pull the zymbit off of the rpi4, you can enter your fallback LUKS password and get back in.

gbajart · April 1, 2022, 3:13pm

Hello Kontrasec,

I tried to apply your mitigation and it worked fine when unplugging the zymkey module, but… in my case, I noticed that initrd.img was missing from my boot partition (mmcblk0p1) after the upgrade, leading to this “bricking issue”.

For reference, I updated a single package to reduce the scope of my issue. I therefore tried:
sudo apt install linux-raspi

After you pinpointed the zymbit scripts, and after a few days spent understanding them, I found the following to be really interesting: /usr/sbin/update_encr_initrd

I searched the apt logs for any of the echo messages from the scripts, and look what I found:
apt-logs.zip (16.0 KB)

3558 Processing triggers for linux-image-5.4.0-1056-raspi (5.4.0-1056.63) ...
3559 /etc/kernel/postinst.d/initramfs-tools:
3560 Kernel version 5.4.0-1056-raspi passed in...
3561 Aborting update-initramfs due to request mismatching running kernel...

From the script, I could match that line:

...
if [ -z "${version}" ]; then
   # Derive the installed kernel's version number from the correct image.
   ...
else
   rmod=`echo "$version" | cut -d '-' -f 2`
   if [ "${rmod}" != "${mod}" ]; then
     echo "Aborting update-initramfs due to request mismatching running kernel..."
     exit 0
   fi
   lv="${version}"
fi

Which led me to the question: why would the request mismatch in the case of an upgrade?
I could solve my issue by manually rebuilding the initramfs:
sudo update-initramfs -v -c -k 5.4.0-1056-raspi -b /mnt/boot
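
The version test quoted above can be reproduced in isolation. How the script derives ${mod} is elided in the excerpt, so the sketch below assumes it comes from the running kernel (uname -r), as the abort message suggests; kernel_field and same_kernel_field are hypothetical helper names. Under that assumption, any upgrade that installs a kernel with a different second field than the running one would hit the abort branch.

```shell
#!/bin/sh
# Sketch: mirror the script's `cut -d '-' -f 2` comparison.
# Both helpers are hypothetical; they are not part of the Zymbit scripts.

# Print the second dash-separated field of a kernel version string,
# e.g. "5.4.0-1056-raspi" yields "1056".
kernel_field() {
    echo "$1" | cut -d '-' -f 2
}

# Succeed (exit 0) when two version strings share the same second field.
# ASSUMPTION: the first argument plays the role of the running kernel.
same_kernel_field() {
    [ "$(kernel_field "$1")" = "$(kernel_field "$2")" ]
}
```

For example: same_kernel_field "$(uname -r)" 5.4.0-1056-raspi || echo "request would mismatch the running kernel"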

I fear that this hook was only meant for the USB key migration and not for an upgrade…
I plan to restore my backup and patch the script to remove the exit that prevents update-initramfs from doing its job.
The code is the following:

if [ -z "${version}" ]; then

   # Derive the installed kernel's version number from the correct image.
   ...
else
   rmod=`echo "$version" | cut -d '-' -f 2`
   echo "rebuilding initramfs from old version ${mod} to new version ${rmod}"
   # if [ "${rmod}" != "${mod}" ]; then
   #  echo "Aborting update-initramfs due to request mismatching running kernel..."
   #  exit 0
   # fi
   lv="${version}"
fi

More on this next week. Thanks again Kontrasec for your support

gbajart · April 6, 2022, 8:19am

Hello,

Here is my last post, with the solution for upgrading your kernel without bricking the system (missing initrd.img file in /boot/firmware).

What is happening:
The modified hook file in /etc/kernel/postinst.d/initramfs-tools is problematic:

The init RAM disk is rebuilt directly in the /boot/firmware folder, skipping the original copy located in the /boot/ folder. flash-kernel is normally responsible for taking the currently used init RAM disk from the /boot/ folder, placing it in /boot/firmware and renaming it to initrd.img. On a later upgrade, flash-kernel does not find the original init RAM disk in /boot/initrd.img-<version> and therefore only copies the old RAM disk to /boot/firmware/initrd.img.bak, leaving /boot/firmware without any initrd.img file and bricking the system at the next reboot.

update-initramfs -v -c -k ${lv} -b /mnt/tmpboot >/dev/null || exit # This line should not rebuild directly in partition mmcblk0p1.
mv /mnt/tmpboot/${if} /mnt/tmpboot/initrd.img # This is actually the job of flash-kernel to place the correct files in partition mmcblk0p1.

→ I think there is a good reason to do it this way for the zymkey installation, but it is not suitable afterwards.

Suggested solution:
Restore the original hook file, which:

Rebuilds the init RAM disk in the /boot/ folder;
Lets flash-kernel do the magic for the firmware partition;
Never exits early for whatever reason, meaning that you will always have an init RAM disk correctly built.

Please find below the original Ubuntu 20.04 LTS hook file:
initramfs-tools.zip (559 Bytes)
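
Whichever hook file is in place, a quick check before rebooting after a kernel upgrade can catch the missing-initrd situation described above. The following is a minimal sketch, assuming Ubuntu's usual naming /boot/initrd.img-<version> for the source file; check_initrd is a hypothetical helper, not part of flash-kernel or the Zymbit scripts, and the optional directory arguments only exist so it can be tried on a scratch directory.

```shell
#!/bin/sh
# Sketch: verify that both init RAM disk files exist and are non-empty.
# check_initrd is a hypothetical helper.
# Usage: check_initrd <kernel-version> [boot-dir] [firmware-dir]
check_initrd() {
    version=$1
    boot_dir=${2:-/boot}
    firmware_dir=${3:-/boot/firmware}
    rc=0
    # flash-kernel copies from this file, so it has to exist first.
    if [ ! -s "$boot_dir/initrd.img-$version" ]; then
        echo "missing: $boot_dir/initrd.img-$version"
        rc=1
    fi
    # This is the file the firmware partition needs at the next boot.
    if [ ! -s "$firmware_dir/initrd.img" ]; then
        echo "missing: $firmware_dir/initrd.img"
        rc=1
    fi
    return $rc
}
```

For example: check_initrd 5.4.0-1056-raspi || echo "do not reboot yet"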

Cheers,
Gilles
