Jason S's picture

Hello,

I have been trying to log into the Turnkey Hub, but it appears anything on hub.turnkeylinux.org is not responding.

Chrome says "The site cannot be reached" and returns ERR_CONNECTION_RESET.

I've tried the following URLs with the same result:

I have also tried from both our LAN connection and via 4G on my phone with the same result.

Is anybody else running into this issue?

Best Regards,

Jason

Jeff Dagenais's picture

Same here. This also happened 2 days ago, for a few hours (sorry for not being more precise).

Jason S's picture

Hi Jeff,

Thanks for the update, I've been randomly testing since 9:00 a.m. AZ/Phoenix Time (GMT -7), and it is still down as we near 12:00 p.m.

It would be nice if we could at least see a maintenance message or something indicating that the issue is being worked on. I do not have anything in production on the hub yet, but if I did and there was an issue to address, this could be a huge roadblock.

Hopefully it will be resolved soon!

Best Regards,

Jason

 

Jeff Dagenais's picture

Indeed, we've been using the Hub for TKLBAM backups for years. Only recently have I been considering using the Hub's EC2 server features, and so have needed Hub access more and more. It being down this often in such a short period is quite scary!!

Definitely a thing to consider before going into production using this.

Jeremy Davis's picture

Apologies that I haven't posted here sooner.

Yes, the Hub is currently down. I have messaged my colleague Alon (the developer and maintainer of the Hub). I haven't yet heard back from him, but I'm sure he'll be onto it ASAP.

Re Servers: the Hub is engineered so that even if it goes down, any servers that you have running WILL NOT be affected. Obviously you won't be able to use the Hub to manage them, but they will continue to run. Whilst not ideal, your servers are AWS EC2 servers, so you can manage them (e.g. start/stop/reboot/etc) from the AWS console if need be. Please do not hesitate to contact me directly via support@turnkeylinux.org if you have specific questions or concerns regarding your servers and/or need assistance finding your servers via the AWS console (it can be quite confusing for the uninitiated, which is why the Hub exists). Launching new servers is problematic of course, but that too can be worked around if need be.
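For anyone who prefers a script to the AWS console, here is a minimal sketch of listing and starting/stopping/rebooting instances with boto3 (the AWS SDK for Python). This is only an illustration and not something the Hub provides: it assumes boto3 is installed and that AWS credentials with EC2 permissions are already configured locally. The helper names and the region in `main()` are hypothetical placeholders.

```python
"""List and control EC2 instances directly, without going through the Hub.

Assumptions (not from this thread): boto3 is installed, AWS credentials
with EC2 permissions are configured, and the region below is a placeholder.
"""


def list_instances(ec2):
    """Return (instance_id, state, name_tag) for every visible instance."""
    found = []
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate():
        for reservation in page["Reservations"]:
            for inst in reservation["Instances"]:
                # Tags are optional; the "Name" tag helps identify a server.
                tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
                found.append(
                    (inst["InstanceId"], inst["State"]["Name"], tags.get("Name", ""))
                )
    return found


def control_instance(ec2, instance_id, action):
    """Start, stop or reboot one instance; returns the raw API response."""
    methods = {
        "start": "start_instances",
        "stop": "stop_instances",
        "reboot": "reboot_instances",
    }
    if action not in methods:
        raise ValueError(f"unsupported action: {action!r}")
    return getattr(ec2, methods[action])(InstanceIds=[instance_id])


def main():
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("ec2", region_name="us-east-1")  # placeholder region
    for instance_id, state, name in list_instances(client):
        print(instance_id, state, name)
```

Calling `main()` prints one line per instance. Once you have identified the right instance ID, `control_instance(client, instance_id, "reboot")` reboots it. Instances only show up if the credentials belong to the AWS account that the servers were launched in.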

Re TKLBAM Backups: again, the Hub is designed not to be a single point of failure. After the initial setup, TKLBAM communicates directly with Amazon S3. So even if the Hub goes down, backups will not be interrupted. In fairness, restores are more problematic with the Hub being down, but they are not impossible. Obviously, initializing new TKLBAM backups with the Hub down is impossible. As a short-term measure, I can assist in creating temporary backups for new servers if required. Again, please email me directly via support@turnkeylinux.org for this sort of assistance.

Re concerns over the Hub going down twice in such a short period of time: I completely understand your concern. Saying it isn't ideal is putting it lightly! We certainly hope to have it back up again ASAP (and not go down again anytime soon!). In the hope that I don't come across as "making excuses": whilst it is the second time in as many days, in fairness it is also only the third unplanned outage that I'm aware of since I've been providing support (nearly 7 years). So on balance, the Hub still has a pretty good record (albeit quite dented recently).

Jeff Dagenais's picture

Yeah about tklbam, unfortunately, right now, on one of my VMs, after a VERY long delay, I get:

 

# tklbam-backup --simulate

warning: using cached profile because of a Hub error: error(35, 'gnutls_handshake() failed: A TLS packet with unexpected length was received.')

error: error(35, 'gnutls_handshake() failed: A TLS packet with unexpected length was received.')
Jeremy Davis's picture

I'm looking into this as we speak.

At face value it appears to be a change in behaviour of the GnuTLS Python library that we overlooked. I'll have a workaround documented ASAP (hopefully within the next few hours). A patched TKLBAM update will be pushed out soon after (within the next few days at least).

[update] FWIW I've also noted this on our issue tracker. I've reiterated what I posted above there... I'll hopefully have some news real soon.

[further update] Sorry that I haven't resolved this yet. Obviously now that the Hub is back up it's not an immediate issue, but I will ensure that this is resolved ASAP - just in case the Hub goes down again.

Jeremy Davis's picture

I've just added a comment to the GH issue. Long story short is that this functionality (TKLBAM still creating backups even if the Hub goes down) was broken some time ago. :(

It's also not compatible with the way that TKLBAM and the Hub now interact with your AWS IAM role. That's a plus for security, but it means that TKLBAM now relies on the Hub to be able to access your AWS S3 storage.

We plan to implement an optional workaround by providing TKLBAM support for an IAM user. I'm not sure when this will be available, but we'll make it a priority. It will require additional configuration on your end, but we'll try to make it as easy as possible.

If you have any comments, suggestions or concerns around that, please feel free to post here, on GitHub or shoot me an email (support@turnkeylinux.org).

Dan's picture

I just signed up for the Hub about 24 hrs ago... now today it's been down all day.

Gosh I sure hope it wasn’t me. :) LOL

 

Jeremy Davis's picture

Pretty sure it wasn't you Dan! :)

Deep apologies for your poor experience to date, though... We should have it back up ASAP so you can actually use it! I'll keep you all posted (and be back with a TKLBAM workaround as soon as I have it worked out)...

Jeremy Davis's picture

Hey all, the Hub is back up. We're investigating the cause now.

Apologies for the time it's taken, and thanks for your patience.

Jason S's picture

Hi Jeremy,

Thank you for the response, and as always, for being so thorough. I am happy to see things are working again.

The first thing that I checked was some of my websites running on a LAMP stack, and sure enough, the servers remained operational while the Hub was down. That, of course, is the greatest concern!

I also did not think about logging in through AWS to manage my instances, nor did I try connecting via SSH, both of which would have allowed for server management while the hub was down. This would have been a good workaround had I really needed to complete a task.

As far as the track record, I completely understand, and I am sure there is plenty going on behind the scenes to fix the issue. The Turnkey team seems to be very proactive in that manner. Thank you for all that you do!

Best Regards,
Jason

Jeremy Davis's picture

We do our best. There is still plenty of work to be done and I know that there are many TurnKey things that I would like to improve. But we certainly do our best.

With regard to the recent outage, we're still looking into the underlying cause, but we have implemented some improvements which should mitigate the immediate problem that caused the outage.

We also hope to provide some sort of status system so users can easily check the Hub's "health". I'm not yet sure exactly what that will be or how it will work, but we'll be sure to post about it once it's implemented.
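Until something official exists, a client-side check can at least tell a reset connection apart from a timeout or a failed TLS handshake. The following is a rough sketch using only the Python standard library. It is not the planned status system, and the function names and status labels are made up for illustration.

```python
import socket
import ssl


def classify(exc):
    """Map a connection-time exception to a short status label."""
    if isinstance(exc, ConnectionResetError):
        return "reset"        # roughly what a browser shows as ERR_CONNECTION_RESET
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, ssl.SSLError):
        return "tls_error"    # TLS handshake or certificate problem
    return "unreachable"      # DNS failure, refused connection, etc.


def probe(host, port=443, timeout=10.0):
    """Attempt a TCP connect plus TLS handshake and return a status label.

    "up" only means the handshake completed; it says nothing about
    whether the web application behind it is healthy.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "up"
    except OSError as exc:
        return classify(exc)
```

For example, `probe("hub.turnkeylinux.org")` returns one of the labels above. Running it from two different networks (as Jason did with LAN and 4G) helps rule out a local connectivity problem.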

Thanks again for using TurnKey and the TurnKey Hub, and especially for your patience under the trying circumstances of late.

As always, please post about any problems, questions and/or concerns. General TurnKey related questions/problems are best asked here on the forums. I try to answer forum posts fairly quickly (often every day, at least every few days).

If you have specific Hub related questions/issues, please also feel free to get me via support@turnkeylinux.org or via the Hub's built-in text messenger. Hub support response should always be within one work day (but usually much quicker).
