Michael Graves's picture

Over the past week I've migrated a blog (http://www.mgraves.org) from a shared host to a 2-node VPS.NET VPS running the Turnkey Linux WordPress image. In general it seems to be running OK.

However, several times I've seen the CPU load spike up to 15 nodes! When this happens the site drops offline. It's unresponsive to any kind of connection: web, web admin, ssh, etc.

When I find the site in this state it will not respond to a graceful reboot of the VPS. I can only force a power-down, wait a bit, and bring it back up.

When it comes back up it runs normally for a day or more, but it eventually encounters the CPU spike again.

Having just migrated from a shared host, I'm not clear where to look in the TKL setup to determine if the problem is being caused by a local process or something else.

Any suggestions for which logs might reveal what's going on?

Liraz Siri's picture

This is the first time I've heard of something like this happening so I doubt this is a general problem but rather something that is particular to your circumstances.

In particular I would check the web server logs to see if anything unusual is happening which coincides with your load spike. It could be a misbehaving bot (e.g., search engine) that's hitting your site unusually hard. It could be spammers. It could be a problem at VPS.NET.
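
As a rough sketch of that kind of log check, the two shell functions below summarise an access log by client address and by minute, so a burst of requests around the time of a load spike stands out. This assumes an Apache-style common/combined log format (client IP in the first field, timestamp in the fourth); the function names are made up for illustration, and the log path in the usage note is the usual Debian/Ubuntu default, which may differ on your system.

```shell
#!/bin/sh
# Sketch only: assumes Apache common/combined log format, where
# field 1 is the client IP and field 4 looks like "[10/Aug/2010:13:55:36".
# Both function names are hypothetical helpers, not part of any tool.

# Print the busiest client addresses, highest request count first.
# $1 = log file, $2 = how many rows to show (default 10)
top_clients() {
    awk '{ print $1 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Print the busiest minutes, highest request count first.
# substr($4, 2, 17) drops the leading "[" and the seconds,
# leaving a key such as "10/Aug/2010:13:55".
busiest_minutes() {
    awk '{ print substr($4, 2, 17) }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

For example, after sourcing the functions, `top_clients /var/log/apache2/access.log 5` would list the five addresses making the most requests, and `busiest_minutes /var/log/apache2/access.log 5` the five busiest minutes. If one address or one minute dominates around the time the site went down, that points toward a bot or spammer rather than the kernel.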

Michael Graves's picture

The support forum over at VPS.Net has quite a few threads that reference this problem. It appears to be a problem with a memory leak in the Ubuntu kernel, but there is no consensus about the exact cause.

These threads date from mid-2009 to present. I'd post a link but their support forums require that you be a customer and logged into your account.

Here's a possible solution that has been suggested:

# apt-get remove linux-image-xen linux-ubuntu-modules-2.6.24-24-xen
# curl http://nl.archive.ubuntu.com/ubuntu/pool/universe/l/linux/linux-image-2.6.24-24-xen_2.6.24-24.61_amd64.deb > kernel.deb
# dpkg -i --force-architecture kernel.deb
# reboot

They even went so far as to update all the TKL images that they offer to include a new kernel. However, I am still seeing this problem.

This is a pity, as I had thought to use VPS.Net based upon their relationship with TKL. If there is no solution to this trouble then I may end up migrating (again) to something completely different, not using TKL or VPS.NET.

It makes no sense to me that they offer images that they know present such problems. It might be very convenient to launch TKL into a VPS, but what value is there unless it can be kept running stably?

Michael Graves's picture

I asked if the matter was reported back to the TKL project. Here's one response:

yes it was and it only seems to be present on the vps.net architecture though

i dont know if the images were updated to include the fix so dont know if this is the same issue or a different one

 

Liraz Siri's picture

I wouldn't jump to conclusions so fast. From what I understood installing a 64-bit kernel solved the problem. This might be a problem that just looks similar...

And I may just be beating a dead horse, but as I've said before, the way VPS.NET handled the issue with the kernels was far from ideal. It doesn't even seem to have been a problem with TKL so much as something specific to their Xen implementation. Other partners have been running TKL for over a year now and haven't run into anything like this. But who knows, there can be very complex interactions between a kernel and a hypervisor. That's why Amazon keeps tight control over which kernels are allowed to run on EC2.

BTW, are you sure that forum thread is private? Not all the forums on VPS.NET are...

Michael Graves's picture

 

I tried another VPS provider, but tried to stay with TKL. However, I had the same issue with TKL at UnmeteredVPS.Net. The operator of that company was surprised and tried to help me sort it out. After some unsuccessful tweaks we decided to move to a CentOS-based VPS.

The whole process was described here:

http://www.mgraves.org/2010/08/blogging-in-transition-a-host-of-issues-%E2%80%93-act-two/

Now that the base system of TKL has been updated I might like to try it again. It included some conveniences that are handy for someone who is not a Linux expert.

Liraz Siri's picture

Thanks for not giving up on TurnKey just yet. Reading your blog post made me cringe. We've put an enormous effort into giving users a good experience, but it only takes one really lousy experience to blow most of that away.

Anyhow, for what it's worth I left the following response on your blog:

Sorry to hear you had a bad experience with TKL on VPS.NET. For the record, we created the partner page in April 2010 and VPS.NET has never been listed on it.

I'm guessing you came across an old blog post, because a year before that, in 2009, we tried partnering up with them to offer the first on-demand deployment for TurnKey Linux, but we had to back off as soon as users started reporting the instability. I completely agree that they should have pulled back the offering until the problems were resolved, and we said as much. Instead they kept advertising support for TurnKey Linux, and whenever users ran into problems they blamed us.

For our part, we were open about the issue and our dissatisfaction at the way it had been handled and started promoting alternative hosting providers, including Amazon EC2 which we added support for in October 2009, and which gives TurnKey full control of the images (helpful for debugging issues), in contrast to VPS.NET.

As you might imagine, for these reasons there currently isn't much of a relationship between TurnKey and VPS.NET. Last time I heard of them was when they shut down our CDN provider (SimpleCDN) without warning, retroactively changed their TOS, and then blamed SoftLayer, the upstream provider which they resell. It's an ugly business which I think they are getting sued for.

But anyhow, I digress. The new TurnKey Linux 11 is out, based on an entirely new Ubuntu Long Term Support release (10.04) with a new Ubuntu kernel. Assuming they update the images, this will probably resolve issues even for those amongst our users who are still VPS.NET customers.
