Amazon EC2 / TurnKey Hub - Fix grub boot issues

On a VM (or bare-metal install), you can use a live ISO to fix many GRUB and kernel-related boot errors. On AWS that isn't an option. However, you can attach the root volume of one server to another server, which gives you a similar environment for fixing many GRUB-related boot issues. This page gives an overview of how to do that via the Hub, the AWS console and an SSH session.

Specific Steps - including commands

I use the server labels OLD and NEW to denote the original (broken) server and the new temporary server respectively.

These steps should be done within the 'Hub' or 'AWS console' as noted. The commands are all to be performed within a shell in your NEW server (via an SSH session).

  1. Hub: Start a NEW server within the same availability zone as the broken server. Any appliance and size will do (e.g. a t2.micro Core), but it must be in the same availability zone as OLD, because an EBS volume can only be attached to a server in its own availability zone.
  2. Hub: Stop OLD server.
  3. AWS console: Once OLD has stopped, detach the root volume.
  4. AWS console: Attach the (OLD) root volume to NEW (as a secondary volume). When asked for the device name, use /dev/sdf (within NEW it should then show up as /dev/xvdf).
  5. Log into NEW via SSH: Use the following commands to set up a chroot and enter it:
# check that the OLD server volume is attached as /dev/xvdf:
gdisk -l /dev/xvdf

# Assuming OLD server volume is /dev/xvdf, mount it and chroot in:
mkdir /mnt2
mount /dev/xvdf2 /mnt2
mount --bind /dev /mnt2/dev
mount --bind /proc /mnt2/proc
mount --bind /sys /mnt2/sys
chroot /mnt2
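
If you want to preview the mount sequence above before running it as root, it could be wrapped roughly as follows. This is only a sketch: the names run and prepare_chroot and the DRY_RUN switch are made up for illustration and are not part of any TurnKey tooling. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper (illustrative names, not TurnKey tooling): print, or
# optionally run, the mount sequence for a given partition and mount point.
# DRY_RUN=1 (the default here) only prints; DRY_RUN=0 runs for real (needs root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

prepare_chroot() {
    part="$1"      # partition holding OLD's root filesystem, e.g. /dev/xvdf2
    target="$2"    # where to mount it, e.g. /mnt2
    run mkdir -p "$target" || return 1
    run mount "$part" "$target" || return 1
    # bind-mount the virtual filesystems that tools inside the chroot expect
    for fs in dev proc sys; do
        run mount --bind "/$fs" "$target/$fs" || return 1
    done
}
```

For example, prepare_chroot /dev/xvdf2 /mnt2 prints the five commands it would run. Once they look right for the partition that gdisk reported, the same call with DRY_RUN=0 performs the mounts, after which chroot /mnt2 is run as above.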

Assuming that succeeds with no errors, you are now within the chroot of your OLD server volume. Here are a few potential fixes that are relevant to booting issues:

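# optional: update kernel
apt-get update
# install the kernel metapackage for this architecture (e.g. linux-image-amd64)
apt-get install linux-image-$(dpkg --print-architecture)
# note: inside the chroot, uname -r reports the kernel running on NEW,
# so this only helps if OLD should run that same kernel version
apt-get install linux-image-$(uname -r)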

# optional: regenerate initramfs (will be run automatically if the kernel is updated)
update-initramfs -u

# Note: as part of the initramfs regeneration process (whether via updating the 
# kernel, or manually regenerating) it is expected and normal to see the 
# following errors/warnings:
# /usr/share/initramfs-tools/scripts/casper-bottom/25autologin: 13: .: Can't open /scripts/casper-functions
# /usr/share/initramfs-tools/scripts/casper-bottom/25singleuser_shell: 6: .: Can't open /scripts/casper-functions
# /usr/share/initramfs-tools/scripts/casper-bottom/25ssh_emptypw: 6: .: Can't open /scripts/casper-functions

# To avoid adding grub boot entries for the host machine (i.e. the NEW server), temporarily disable the OS prober:
chmod a-x /etc/grub.d/30_os-prober

# reinstall grub2
grub-install --boot-directory=/boot /dev/xvdf

# re-enable os-prober
chmod a+x /etc/grub.d/30_os-prober

# You can also do other things such as check logs or install additional packages as desired.
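
One risk with the disable/re-enable pair above is forgetting the second chmod if the grub step fails part-way. A possible guard is sketched below. The function name without_os_prober and the PROBER variable are made up for illustration; the default path is the one used in the commands above.

```shell
#!/bin/sh
# Hypothetical wrapper (illustrative, not part of grub): run a command with
# the os-prober script made non-executable, then restore the executable bit
# even if the command fails.
PROBER="${PROBER:-/etc/grub.d/30_os-prober}"

without_os_prober() {
    if [ ! -x "$PROBER" ]; then
        # Script missing or already disabled: run the command, change nothing.
        "$@"
        return $?
    fi
    chmod a-x "$PROBER" || return 1
    "$@"
    status=$?
    chmod a+x "$PROBER"
    return "$status"
}
```

Inside the chroot this would be used as, for example, without_os_prober grub-install --boot-directory=/boot /dev/xvdf. The os-prober script is consulted when the grub configuration is generated (update-grub on Debian-based systems), so if you also regenerate the configuration, that command would be the one to run through the wrapper.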

Hopefully that all worked nicely and you didn't get any errors. Assuming so, exit the chroot and unmount everything:

# exit the chroot and tidy up
exit
umount /mnt2/dev
umount /mnt2/proc
umount /mnt2/sys
umount /mnt2
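
Before detaching the volume it can be worth confirming that nothing under /mnt2 is still mounted. A minimal sketch of such a check is below; the function name mounts_under is made up for illustration, and it assumes a mounts table in the /proc/mounts format, where the second field is the mount point.

```shell
#!/bin/sh
# Hypothetical check (illustrative name): list mount points at or below a
# directory, reading a table in /proc/mounts format.
mounts_under() {
    dir="$1"
    table="${2:-/proc/mounts}"
    # match the directory itself, or anything beneath it ("dir/...")
    awk -v d="$dir" '$2 == d || index($2, d "/") == 1 { print $2 }' "$table"
}
```

Running mounts_under /mnt2 on NEW should print nothing once all four umount commands have succeeded. Any path it does print is still mounted and needs to be unmounted first.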
  6. AWS console: Detach the OLD root volume from NEW, and reattach it to the original server (OLD). Specify the device name as /dev/xvda.
  7. Hub: Start OLD. With luck it will now boot. If not, repeat steps 2-7, trying different fixes within the chroot (step 5).
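
The AWS console steps above (detach, then attach) can in principle also be done with the AWS CLI. That is an assumption on top of this page, which only covers the console, and it requires a CLI configured with credentials for the account that owns the servers. The sketch below only prints the commands for review; the function name move_volume_cmds is made up, and the volume and instance IDs in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical helper (illustrative name): print the AWS CLI commands that
# would move a volume to an instance. It only prints; nothing is sent to AWS.
move_volume_cmds() {
    vol="$1"     # EBS volume ID of OLD's root volume
    inst="$2"    # ID of the instance to attach it to
    dev="$3"     # device name, e.g. /dev/sdf on NEW or /dev/xvda on OLD
    echo "aws ec2 detach-volume --volume-id $vol"
    echo "aws ec2 wait volume-available --volume-ids $vol"
    echo "aws ec2 attach-volume --volume-id $vol --instance-id $inst --device $dev"
}
```

For example, move_volume_cmds vol-EXAMPLE i-OLDEXAMPLE /dev/xvda prints the three commands corresponding to step 6. The wait command sits between the other two because a volume has to finish detaching before it can be attached elsewhere.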