WordPress Crashing - Error establishing a database connection

rogerbrx's picture

Hello,

I am not a very experienced WordPress admin; this is my first site, and I would like to ask for some help. I have a WordPress appliance that has been running for 2 years on Amazon AWS. It worked fine from the initial install without any downtime. I did regular upgrades until I started to have problems.

Over the last few months I have experienced repeated downtime with the message:

Error establishing a database connection

I have to reboot the AWS instance in order to run the database recovery procedure, using:
 
 
It seems to be OK: the script runs fine, but the site crashes again after a few hours.
Looking at various support sites, I found that this kind of problem should be simple to solve with the repair.php procedure, but unfortunately that doesn't seem to be my case.
 
Questions for help:
 
Could anyone recommend a more efficient procedure to recover from this problem?
 
Is it possible to reinstall the TurnKey WordPress appliance so that it is back to its initial setup?
 
Thanks in advance.
 
Rogério Guimarães
m...'s picture

I have the same issue, but for me it began yesterday with the error message mentioned above. I simply rebooted the machine and it ran smoothly for a day; then the appliance crashed again, this time without even showing the error message. It seems to be some kind of memory leak. Both the RAM and the swap fill up, and then processes get killed to keep the machine alive. I will try to find out whether a particular process is running amok, but any suggestions are welcome.

Jeremy Davis's picture

My guess is that your server is running out of RAM. The kernel's OOM killer will then kill whichever process it thinks will resolve the RAM issue ASAP. Unfortunately MySQL seems to be what it tends to kill first...

You'll need to do some monitoring to see if that's your issue, but I think the odds are pretty good. Especially if a reboot resolves it (for a little while).
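One rough way to check for this: the kernel log normally records each OOM kill with a line containing "Killed process <pid> (<name>)". The helper below (the name oom_victims is made up for this example) pulls the process names out of such lines, so you can see whether MySQL is the usual victim. It assumes that message format, and reading the kernel log may need root.

```shell
#!/bin/sh
# oom_victims: read kernel log text on stdin and print the name of every
# process the OOM killer terminated, one per line.
# Assumption: kill messages contain "Killed process <pid> (<name>)".
oom_victims() {
    sed -n 's/.*Killed process [0-9][0-9]* (\([^)]*\)).*/\1/p'
}

# Typical use, counting kills per process name:
#   dmesg | oom_victims | sort | uniq -c | sort -rn
```

If mysqld shows up in that list around the time of each crash, the out-of-memory theory is very likely right.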

Assuming that's the issue, to properly solve it, you either need to give your server more RAM, or reduce the amount of RAM it's using.

For an older appliance running on Amazon, the first thing I would recommend is using TKLBAM to migrate your data to a new v14.1 HVM instance of the relevant size, e.g. t1.micro to t2.micro, m1.small to t2.small, etc. The newer HVM instance sizes give much better performance per price compared to the old PV instance sizes. There is a doc page here which provides a suggested workflow and some specific considerations when migrating data.

The performance increase may be enough to get your server running more reliably? If not, then you have a few other options to explore. Each one of these should help but there's no reason why you couldn't do a number of them (or even all of them):

  • Move to a bigger instance size (with more RAM).
  • Tune WordPress by removing unneeded plugins (poorly coded WP plugins are the most common source of memory leakage). If you have recently added plugins then try disabling them as that is possibly the primary source of your issues.
  • Tune Apache and MySQL so they use less RAM. This may or may not be enough to resolve the issue.
  • Tweak the kernel OOM killer settings so Apache threads get killed before MySQL. It will still impact your server's ability to provide a stable connection, but at least it won't crash completely.
  • Swap Apache for a more resource friendly webserver (e.g. Nginx). Probably the best way to do that would be to use TKLBAM to migrate your data to an Nginx appliance and tweak it so it works to your satisfaction.
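For the OOM killer option in the list above, here is a minimal sketch. Linux exposes a per-process oom_score_adj value between -1000 and 1000 under /proc/<pid>/; the lower the value, the less likely the process is to be chosen. The function name set_oom_adj, the PROC_ROOT override (only there so the function can be tried against a scratch directory), the process names mysqld and apache2, and the values -500 and 500 are all assumptions for illustration, not TurnKey defaults.

```shell
#!/bin/sh
# PROC_ROOT is a hypothetical override for dry runs; it defaults to /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# set_oom_adj VALUE PID...
# Write VALUE (-1000..1000) to oom_score_adj for each PID given.
# Writing under the real /proc requires root.
set_oom_adj() {
    value=$1
    shift
    for pid in "$@"; do
        printf '%s\n' "$value" > "$PROC_ROOT/$pid/oom_score_adj"
    done
}

# Example, run as root (process names are assumptions):
#   set_oom_adj -500 $(pgrep -x mysqld)    # make MySQL a less likely victim
#   set_oom_adj  500 $(pgrep -x apache2)   # make Apache a more likely victim
```

The value only applies to the processes that are running when you set it, so it is lost when the service restarts and would need to be reapplied.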
    m...'s picture

    I run my appliance on my private server, so I don't have the option to just increase the amount of RAM. I have noticed that my appliance is flooded with Apache instances. Normally the server has about 80 threads, and suddenly that increases to over 225. Each instance seems to have a network connection pointing to IP addresses belonging to a cloud computing facility in France. It sounds reasonable that the cause is a malfunctioning plug-in or similar. I'll investigate further.
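A quick way to see which remote addresses hold those connections is to count established TCP connections per peer IP. This is only a sketch: conns_per_ip is a made-up helper name, and it assumes the peer address is the fifth column of the input, as it is in the output of `ss -tn`.

```shell
#!/bin/sh
# conns_per_ip: read `ss -tn` output on stdin and print
# "<count> <remote ip>" lines, busiest address first.
conns_per_ip() {
    awk '$1 == "ESTAB" {
             ip = $5
             sub(/:[0-9]+$/, "", ip)   # strip the trailing :port
             count[ip]++
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Typical use:
#   ss -tn | conns_per_ip | head
```

A handful of addresses holding most of the connections would point to a single source rather than ordinary visitors.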

    Jeremy Davis's picture

    Using swap will make your server less responsive, but it should stop it from crashing.

    I think tuning WordPress (especially removing excess plugins) is a great first step. Tuning Apache and MySQL may also help. Like I said, switching to a lighter weight server like Nginx is also an option (albeit quite involved).
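As a rough illustration of what tuning Apache for RAM can involve: a common rule of thumb is to cap the number of Apache workers at the memory you can spare for Apache divided by the average size of one Apache process. The cap is set with MaxClients on Apache 2.2 (Debian Wheezy) or MaxRequestWorkers on Apache 2.4 (Debian Jessie). The helper names and the example figures below are assumptions, and RSS overstates real usage because shared memory is counted once per process, so treat the result as a conservative starting point.

```shell
#!/bin/sh
# avg_rss_mb: read RSS values in KiB (one per line) on stdin and print
# the mean in MiB, rounded up to a whole number.
avg_rss_mb() {
    awk '{ sum += $1; n++ }
         END {
             if (n == 0) exit 1
             m = sum / n / 1024
             c = int(m)
             if (c < m) c++
             print c
         }'
}

# max_workers TOTAL_MB RESERVED_MB PER_WORKER_MB
# Workers that fit once RESERVED_MB is set aside for MySQL and the OS.
max_workers() {
    echo $(( ($1 - $2) / $3 ))
}

# Example (apache2 as the process name and all figures are assumptions):
#   per=$(ps -C apache2 -o rss= | avg_rss_mb)
#   max_workers 1024 400 "$per"
```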

    Having said all that, unless you have a lot of connected users, Apache forking that many workers seems pretty excessive. I wonder if someone is probing your site looking for vulnerabilities? Perhaps it's worth looking into fail2ban?

    If you want to install that, here's a tutorial. It's actually for Ubuntu, but I've just had a quick glance and I would expect that to apply to Debian too. Please note I haven't tested it though so can't be 100% sure. Keep in mind that TurnKey is Debian under the hood (v13.x = Wheezy; v14.x = Jessie) so you should be able to find some docs relevant to your version if that one proves useless...

    m...'s picture

    It actually seems to be unwanted traffic, malicious or not, that is reaching the server. I went through the installed plugins and did not find any that could cause traffic related to the IP addresses I saw. It doesn't seem like a standard denial of service attack since the traffic is relatively low, but nevertheless it takes down my server. The service has now been up and running for a while, so the issue might have been mitigated elsewhere. As most traffic is dropped before it even reaches the server, I am not really sure whether there is other malicious traffic, like port scans, too. I'll look into fail2ban if it seems to be needed, thanks for the tip. Your tutorial link doesn't work, though. For now I'll focus on keeping everything updated.
    Jeremy Davis's picture

    If it's the same IP addresses over and over, it may also be worth doing some research to find who/where they are. That may give you some insight into the legitimacy of it.

    Regardless, fail2ban would probably be quite useful I'm guessing...

    PS I fixed the link in my previous post. Sorry about that...

    Peter Woodall's picture

    I am also having this issue. It started happening about 2 - 3 weeks ago. A reboot always fixes it, and I was going to set an automatic reboot, say twice a week, until I saw this thread.

    Could this be a bad patch given that this is happening to multiple instances?

    I will check the traffic suggestion in the thread.

     

    Jeremy Davis's picture

    Considering that the WordPress appliance is one of our most popular appliances and there are only a handful of you reporting issues, I strongly suspect that it is a resources issue (as I noted in previous posts). Besides, PHP applications running on a LAMP base crashing due to insufficient resources is relatively common.

    The thing with website server resources is, when you first set up the website, you have very little traffic and a minimal install footprint. As time goes on, additional content and plugins tend to be added (each adding a small amount to the resource overhead). Many WordPress plugins are poorly coded and in my experience many of them have memory leaks.

    Also traffic tends to increase (both wanted traffic from the intended audience; plus unwanted traffic from hackers looking for hackable targets). Generally memory leaks are exacerbated by the number of connections. So your site may well run fine for ages until you reach a critical traffic limit where all of a sudden things start going wrong. The line between everything seeming to be fine and a crashed server may be as simple as one additional plugin, or a couple of extra concurrent logins.

    Please note that a hacked server and/or WordPress malware infection is also a possibility which I haven't previously mentioned...

    If a reboot is (temporarily) resolving the issue, then I'd nearly put money on your server running out of RAM (and I'm not a gambling man). As discussed above, that can be caused by misbehaving WordPress plugins and/or unanticipated (and perhaps unwanted) traffic, among many other things (including malware).

    Like most somewhat technical problems (e.g. your lawnmower won't start, your car stalls or your server crashes), to properly resolve the issue you first need to understand what the problem is. So the first step is to do some diagnosis. As insufficient RAM is a fairly common cause of WordPress server crashes, monitoring the RAM usage and the remote connections are good ways to see what might be going on.
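For the RAM side of that monitoring, a small sketch that reads /proc/meminfo. The function name mem_available_mb is made up for this example. It uses the kernel's MemAvailable figure where present and otherwise falls back to MemFree + Buffers + Cached, which is only an approximation.

```shell
#!/bin/sh
# mem_available_mb [FILE]: print an estimate of available memory in MiB.
# FILE defaults to /proc/meminfo; values in that file are in kB.
mem_available_mb() {
    awk '
        $1 == "MemAvailable:" { avail = $2; have = 1 }
        $1 == "MemFree:" || $1 == "Buffers:" || $1 == "Cached:" { fallback += $2 }
        END { print int((have ? avail : fallback) / 1024) }
    ' "${1:-/proc/meminfo}"
}

# Example: append a timestamped reading to a log once a minute
# (the log file name is an assumption):
#   while sleep 60; do
#       echo "$(date '+%F %T') $(mem_available_mb) MiB"
#   done >> mem.log
```

A figure that shrinks steadily until the crash suggests a leak; one that collapses suddenly points more towards a burst of traffic.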

    Once you have diagnosed the issue, then you can decide on the best way forward.

    Peter Woodall's picture

     

    The first thing I noticed was that it was running out of RAM, and I found a log entry showing that the mysql process was being terminated (giving me the error).

    So as a quick fix I created a 1 GB swap file to match the existing RAM and the issue has 'gone away'.
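For anyone wanting to try the same quick fix, creating a swap file usually looks something like the sketch below. The function name make_swap and the path /swapfile are assumptions, and the commands need to run as root.

```shell
#!/bin/sh
# make_swap PATH SIZE_MB: create a swap file of SIZE_MB MiB and enable it.
# Stops at the first step that fails.
make_swap() {
    path=$1
    size_mb=$2
    dd if=/dev/zero of="$path" bs=1M count="$size_mb" &&
    chmod 600 "$path" &&    # swap must not be readable by other users
    mkswap "$path" &&
    swapon "$path"
}

# Example, as root, for 1 GB of swap:
#   make_swap /swapfile 1024
# To keep it after a reboot, add this line to /etc/fstab:
#   /swapfile none swap sw 0 0
```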

    The next thing I did was delete all unused plugins as a just-in-case measure. While doing that I noticed that a lot of the existing plugins are showing as 'not tested with current version'.

    Just a guess, but after the upgrade to 4.6.1 one (or more) of those plugins may have started to run amok and eat up RAM.

    So I am in the process of replacing the plugins with compatible replacements.

    And WP has released version 4.7, so the fun never ends lol.

    Thanks to Jeremy for pointing me in the right direction!

     

    Jeremy Davis's picture

    Glad it headed you in the right direction and great to hear you successfully implemented a workaround!

    Thanks for posting back to share your experience with others.

    Bill Carney's picture

    I had a burst of (legitimate) traffic on my WordPress installation. My server kept becoming unresponsive and only a reboot would fix it. Eventually I discovered the cause of the issue was incorrectly closed tables in MySQL, particularly those related to live logging in Wordfence. Here was my solution:

    1. Use phpMyAdmin to back up all your databases, just in case.

    2. Log into WordPress, go to Wordfence, Options, and turn off "Enable Live Traffic View" (this is what was hammering my MySQL server).

    3. Stop Apache: service apache2 stop

    4. Load up phpMyAdmin and run a repair on your database (select your db, select all tables, and then use the "With Selected", "Repair tables" option).

    5. Reboot.

    This solved my issue.  Hope it works for you as well.
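The backup and repair steps can also be done from the command line with the standard MySQL client tools. The sketch below is a swapped-in alternative to phpMyAdmin, not what was used above, and it restarts Apache afterwards rather than rebooting. The function name repair_db, the database name wordpress and the backup path are assumptions, and it expects MySQL credentials to be already configured (e.g. in ~/.my.cnf).

```shell
#!/bin/sh
# repair_db DB BACKUP_FILE: dump DB to BACKUP_FILE, then check its tables
# and repair any found to be corrupted, with Apache stopped meanwhile.
repair_db() {
    db=$1
    backup=$2
    mysqldump "$db" > "$backup" || return 1   # back up first, just in case
    service apache2 stop
    mysqlcheck --auto-repair "$db"            # check, repair corrupted tables
    status=$?
    service apache2 start
    return "$status"
}

# Example, as root (names are assumptions):
#   repair_db wordpress /root/wordpress-backup.sql
```

Note that table repair applies to MyISAM tables; InnoDB tables do not support it and mysqlcheck will say so.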

     

    Jeremy Davis's picture

    Thanks for sharing your solution too Bill. It's a relatively common issue and WordPress is one of our most popular appliances so the more info on potential issues and possible workarounds the better. :)
