Instance status checks fail, fails to start after reboot; was healthy earlier today

Robin's picture

Sometime after 3 PM PT on 3/1/2012, my instance (i-f2611ec2) became unavailable. Rebooting does not bring it back up. It is a Postgres-11.3.lucid-x86 appliance that had been running without a hitch for 40 days. I did not do any maintenance of any kind on it today, and I was able to access a Postgres database on this instance up until 3 PM PT. I am now attempting to stop the instance and then start it again, but the stop has been running for 40 minutes or more.

Checking the AWS console I see 2 of 2 status checks have failed.

The last entries from the console log (Plymouth command failed):

[    0.870818] mice: PS/2 mouse device common for all mice
[    0.870900] rtc_cmos rtc_cmos: rtc core: registered rtc_cmos as rtc0
[    0.870968] Driver for 1-wire Dallas network protocol.
[    0.871026] device-mapper: uevent: version 1.0.3
[    0.871165] device-mapper: ioctl: 4.15.0-ioctl (2009-04-01) initialised:
[    0.871509] NET: Registered protocol family 17
[    0.871647] registered taskstats version 1
[    0.966349] XENBUS: Device with no driver: device/console/0
[    0.966361] /build/buildd/linux-ec2-2.6.32/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
[    0.966422] Freeing unused kernel memory: 216k freed
[    0.967453] Write protecting the kernel text: 4336k
[    0.967755] Write protecting the kernel read-only data: 1336k
Loading, please wait...
[    0.995993] ramzswap: disk size set to 325176 kB
[    1.004207] udev: starting version 151
[    1.067818] Adding 325172k swap on /dev/ramzswap0.  Priority:100 extents:1 across:325172k SS
Begin: Loading essential drivers... ...
Begin: Running /scripts/init-premount ...
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Begin: Running /scripts/local-premount ...
[    1.346678] EXT4-fs (sda1): INFO: recovery required on readonly filesystem
[    1.346689] EXT4-fs (sda1): write access will be enabled during recovery
[    1.519204] EXT4-fs (sda1): recovery complete
[    1.521659] EXT4-fs (sda1): mounted filesystem with ordered data mode
Begin: Running /scripts/local-bottom ...
Begin: Running /scripts/init-bottom ...
 * Starting Initialization hooks        [ OK ]
 * Starting PostgreSQL 8.4 database server        [ OK ]
 * Updating HubDNS        Updated with
[ OK ]
 * Starting NTP server ntpd        [ OK ]
 * Starting Shell In A Box Daemon shellinabox        [ OK ]
Syntax OK
 * Starting web server lighttpd        [ OK ]
 * Starting webmin        [ OK ]
mountall: Plymouth command failed
mountall: Disconnected from Plymouth

What else might I do to recover this instance?

Jeremy Davis's picture

I had a quick look on the Ubuntu forums. Most people who reported the 'Plymouth' error were on desktop systems, and the fixes were varied; many involved adjusting xorg settings (which obviously don't apply here). From what I can gather (according to what I read), you should still be able to connect via SSH, assuming it's a similar issue.

I'm not sure whether this will work (perhaps the Plymouth issue is a symptom rather than the actual problem), but you could try this:

apt-get update
apt-get install -f
dpkg --configure -a
dpkg-reconfigure plymouth

If you can get access to it with SSH it may be worth trying to run a backup and/or dump your DB and copy it out (plus any other data).
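As a rough sketch of that dump-and-copy step, assuming SSH still works and the stock `postgres` system user can run `pg_dumpall` without a password (I haven't tested this against your instance). `DB_HOST` and `DUMP` are placeholders I made up, and the `run` helper only prints the commands unless you set `DRY_RUN=0`:

```shell
#!/bin/sh
# Sketch only: DB_HOST and DUMP are hypothetical placeholders, adjust them.
DB_HOST="${DB_HOST:-root@your-instance-address}"
DUMP="${DUMP:-pg-all-$(date +%Y%m%d).sql.gz}"

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

backup_db() {
    # Dump every database as the postgres system user and compress the
    # result into /root on the instance (needs free disk space there).
    run ssh "$DB_HOST" "su - postgres -c pg_dumpall | gzip > /root/$DUMP"
    # Copy the compressed dump to the current local directory.
    run scp "$DB_HOST:/root/$DUMP" .
}

backup_db
```

Run it once as-is to preview the two commands, then again with `DRY_RUN=0` and your real address once they look right.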

If you have a recent backup, restoring that may be your easiest answer. If you don't, then I'd definitely look at setting something up as soon as you have this sorted. I know TKLBAM doesn't support PostgreSQL OOTB, but it is possible to configure it to work (with a little mucking around). Have a look at this blog post; the details are in the posts.

I'm not sure if this is an EBS-backed instance or not, but if it is and nothing above has worked, another option may be to attach your EBS volume to another instance to get your data out.
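If it does come to moving the volume, the steps might look roughly like this with the AWS command line tools. This is only a sketch, assuming the `aws` CLI is installed and configured and that the instance is EBS backed. The volume ID and the rescue instance ID are made-up placeholders; only the broken instance ID comes from your post. As written it just prints the commands unless you set `DRY_RUN=0`:

```shell
#!/bin/sh
# Sketch only: VOLUME and RESCUE are hypothetical IDs, replace them.
# BROKEN is the instance ID from the original post.
BROKEN="${BROKEN:-i-f2611ec2}"
VOLUME="${VOLUME:-vol-xxxxxxxx}"
RESCUE="${RESCUE:-i-yyyyyyyy}"

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

rescue_volume() {
    # The root volume can only be detached once the instance is stopped;
    # --force asks EC2 to force the stop if a normal stop hangs.
    run aws ec2 stop-instances --instance-ids "$BROKEN" --force
    run aws ec2 detach-volume --volume-id "$VOLUME"
    # Attach to a healthy instance in the same availability zone.
    run aws ec2 attach-volume --volume-id "$VOLUME" \
        --instance-id "$RESCUE" --device /dev/sdf
}

rescue_volume
```

On the rescue instance the volume should then show up as something like /dev/sdf or /dev/xvdf (the name depends on the kernel), and mounting it read-only lets you copy the PostgreSQL data directory off without changing anything on it.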
