Amazon EC2 / TurnKey Hub - Fix grub boot issues / recover root password

On a VM (or bare metal install), you can use a live ISO to fix many grub, kernel and other related issues (e.g. lost root access). On AWS that isn't an option. However, it is possible to attach the root volume from one server to another, giving you a somewhat similar environment in which to fix many grub, boot and related issues. This page gives an overview of how to do that via the Hub, the AWS console and an SSH session.

Specific Steps - including commands

I use the server labels OLD and NEW to denote the original (broken) server and the new temporary server respectively.

These steps should be done within the 'Hub' or 'AWS console' as noted. The commands are all to be performed within a shell in your NEW server (via an SSH session).

  1. Hub: Start a NEW server within the same availability zone as the broken server. This can be any appliance or size (e.g. a t2.micro Core would be fine), but it must be in the same availability zone!
  2. Hub: Stop OLD server.
  3. AWS console: Once OLD has stopped, detach the root volume.
  4. AWS console: Attach the (OLD) root volume to NEW (as a secondary volume). When asked for the device name, use /dev/sdf (within the OS it will appear as /dev/xvdf). If you prefer the command line, an AWS CLI sketch follows below.
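If you prefer the command line to the AWS console, steps 2-4 can also be done with the AWS CLI. The following is only a rough sketch; i-OLDINSTANCE, i-NEWINSTANCE and vol-OLDROOT are placeholder IDs that you will need to look up yourself (e.g. via the console, or aws ec2 describe-instances / describe-volumes):

# stop the OLD server and wait for it to fully stop (substitute your own IDs)
aws ec2 stop-instances --instance-ids i-OLDINSTANCE
aws ec2 wait instance-stopped --instance-ids i-OLDINSTANCE

# detach the OLD root volume and wait until it is available
aws ec2 detach-volume --volume-id vol-OLDROOT
aws ec2 wait volume-available --volume-ids vol-OLDROOT

# attach it to the NEW server as a secondary volume
aws ec2 attach-volume --volume-id vol-OLDROOT --instance-id i-NEWINSTANCE --device /dev/sdf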
  5. Log into NEW via SSH: Use the following commands to set up a chroot and enter it:
# check that the OLD server volume is attached as /dev/xvdf and note its partition layout
# (if gdisk is not installed: apt-get update && apt-get install gdisk):
gdisk -l /dev/xvdf

# Assuming the OLD server's root filesystem is on /dev/xvdf2 (adjust the partition number
# if gdisk shows a different layout), mount it and chroot in:
mkdir /mnt2
mount /dev/xvdf2 /mnt2
mount --bind /dev /mnt2/dev
mount --bind /proc /mnt2/proc
mount --bind /sys /mnt2/sys
chroot /mnt2
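Before making any changes, it's worth a quick sanity check that you really are inside the OLD server's filesystem (rather than the NEW server's):

# these should show the OLD server's hostname and filesystem layout
cat /etc/hostname
cat /etc/fstab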
  6. SSH - Perform fix(es) in chroot: Assuming that succeeded with no errors, you are now within the chroot of your OLD server volume. What you do next will depend on the issue you are trying to address. I note a few potential fixes relevant to boot issues, as well as resetting the root password:
# optional: update kernel
apt-get update
# install the current kernel metapackage for your architecture (e.g. linux-image-amd64)
apt-get install linux-image-$(dpkg --print-architecture)
# or install a specific kernel version; note that within the chroot, uname -r reports
# the NEW server's running kernel, not the OLD server's
apt-get install linux-image-$(uname -r)

# optional: regenerate initramfs (will be run automatically if the kernel is updated)
update-initramfs -u

# Note: as part of the initramfs regeneration process (whether via updating the 
# kernel, or manually regenerating) it is expected and normal to see the 
# following errors/warnings:
#
# /usr/share/initramfs-tools/scripts/casper-bottom/25autologin: 13: .: Can't open /scripts/casper-functions
# /usr/share/initramfs-tools/scripts/casper-bottom/25singleuser_shell: 6: .: Can't open /scripts/casper-functions
# /usr/share/initramfs-tools/scripts/casper-bottom/25ssh_emptypw: 6: .: Can't open /scripts/casper-functions

# To avoid adding grub boot entries for the host machine (i.e. the NEW server), disable os-prober
# (os-prober is only useful on dual boot machines)
chmod a-x /etc/grub.d/30_os-prober

# reinstall grub2 to the OLD volume (attached here as /dev/xvdf) and regenerate its config
grub-install --boot-directory=/boot /dev/xvdf
update-grub
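# optionally confirm the regenerated grub config exists on the OLD volume
ls -l /boot/grub/grub.cfg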

# reset root password
passwd root

# add a public SSH key (ensure the directory exists first)
mkdir -p /root/.ssh && chmod 700 /root/.ssh
echo "_keytype_ _public_key_data_ _email_" >> /root/.ssh/authorized_keys
# IMPORTANT: replace "_keytype_ _public_key_data_ _email_" with the contents of your public key.
# The email is optional but highly recommended.
# You can also do other things such as check logs or install additional packages as desired.
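# For example, reviewing the OLD server's logs from within the chroot may show what
# went wrong on the last boot (exact log files will vary):
ls -la /var/log
less /var/log/syslog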
  7. SSH - Unmount OLD volume: Hopefully that all worked nicely and you didn't get any errors. Assuming so, exit the chroot and unmount everything:
# exit and tidy up
exit
umount /mnt2/dev
umount /mnt2/proc
umount /mnt2/sys
umount /mnt2
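# confirm nothing under /mnt2 is still mounted before detaching the volume
mount | grep /mnt2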
  8. AWS console: Detach the OLD root volume from NEW, and reattach it to the original (OLD) server. Specify the device name as /dev/xvda.
  9. Hub: Boot OLD - fingers crossed it will now boot! If so, great! If not, repeat steps 2-8, trying some different fixes within the chroot (step 6).
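As with steps 2-4, steps 8 and 9 can also be done with the AWS CLI rather than the console/Hub. Again, this is only a sketch with placeholder IDs:

# detach the OLD root volume from the NEW server and wait until it is available
aws ec2 detach-volume --volume-id vol-OLDROOT
aws ec2 wait volume-available --volume-ids vol-OLDROOT

# reattach it to the OLD server as the root device
aws ec2 attach-volume --volume-id vol-OLDROOT --instance-id i-OLDINSTANCE --device /dev/xvda

# start the OLD server
aws ec2 start-instances --instance-ids i-OLDINSTANCE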