Ortho's picture

Hello,

So I powered a Magento2 TurnKey VM back up after a few months of it being offline, and knew I'd need to renew its certificate.

I ran the following:

/usr/lib/confconsole/plugins.d/Lets_Encrypt/dehydrated-wrapper

and got:

--------------

[2019-10-01 09:07:43] dehydrated-wrapper: INFO: started
[2019-10-01 09:07:44] dehydrated-wrapper: INFO: found apache2 listening on port 80
[2019-10-01 09:07:44] dehydrated-wrapper: INFO: stopping apache2
[2019-10-01 09:07:44] dehydrated-wrapper: INFO: running dehydrated
  + ERROR: An error occurred while sending post-request to https://acme-v01.api.letsencrypt.org/acme/new-authz (Status 400)

Details:
{
  "type": "urn:acme:error:badNonce",
  "detail": "JWS has no anti-replay nonce",
  "status": 400
}

[2019-10-01 09:07:46] dehydrated-wrapper: FATAL: dehydrated exited with a non-zero exit code.
[2019-10-01 09:07:46] dehydrated-wrapper: WARNING: Something went wrong, restoring original cert & key.
[2019-10-01 09:07:46] dehydrated-wrapper: INFO: starting apache2
[2019-10-01 09:07:46] dehydrated-wrapper: INFO: starting stunnel4
[2019-10-01 09:07:47] dehydrated-wrapper: WARNING: Check today's previous log entries for details of error.
------------------

From what I can find while googling, it seems as though the dehydrated script isn't handling the nonce sequence properly.

Any idea on how to fix this?

Jeremy Davis's picture

Yes, it appears that Let's Encrypt have changed their server config and the change has broken the older version of Dehydrated that we have been using (installed from Debian repos).

There is some discussion regarding this (and some other issues) in another thread (specifically this post and this one). But probably your best bet is to have a read of the full info in the relevant issue on GitHub.

FWIW, I have just updated the issue with a link to the relevant Debian bug and a note that it appears to be possible to just edit the Dehydrated script itself to resolve the issue. I haven't tested that myself, but I figured it was worth noting.

Regardless, I should probably do a blog post (and send out a newsletter) about this as it will hit all users eventually (as their certificates expire). I'll be speaking with Alon tonight (my time) and we'll decide on a plan of attack...

Ortho's picture

I tried both changes, and updating dehydrated did seem to work after a reboot.

Unfortunately, I seem to have run into an LE rate limit (I hate how it doesn't tell you which limit you've hit).

I then tried to change the dehydrated URL to the test URL, but now I always just get a 400 error.

Is there a way to run the dehydrated wrapper but in test mode just to validate that everything is working once the rate limit clears?

Jeremy Davis's picture

AFAIK the Let's Encrypt rate limit is 20 per week.

It's not well documented, but Confconsole actually ships with an alternate config designed to be used against the "Let's Encrypt staging server". It was created with the intention of being "dropped in" instead of the default config (it overrides the default URL that it hits, so just adding the relevant lines in the config file is another option). You can find a note about it within a discussion on GitHub.

Having said that, it's been a fair while since it was last tested (at least by me). So I'm not even 100% sure that it still works. It'd be great if you wanted to test it out and let us know...

It might be best to save the current config, just in case. I.e.:

mv /etc/dehydrated/confconsole.config /etc/dehydrated/confconsole.config.orig
cp /usr/share/confconsole/letsencrypt/dehydrated-staging-confconsole.config /etc/dehydrated/confconsole.config

Then give it a run:

/usr/lib/confconsole/plugins.d/Lets_Encrypt/dehydrated-wrapper
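
The swap above can be wrapped in a pair of helper functions so the original config is easy to put back once testing is done. This is only a sketch: the function names use_staging_config and restore_config are made up for illustration (they are not part of Confconsole), and the paths are simply the ones from the commands above, passed in as arguments.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Confconsole).

# use_staging_config LIVE STAGING
# Save LIVE as LIVE.orig (only if no backup exists yet), then copy STAGING over LIVE.
use_staging_config() {
    live=$1
    staging=$2
    [ -f "$staging" ] || { echo "staging config not found: $staging" >&2; return 1; }
    if [ -f "$live" ] && [ ! -e "$live.orig" ]; then
        mv "$live" "$live.orig"
    fi
    cp "$staging" "$live"
}

# restore_config LIVE
# Put the saved original back in place.
restore_config() {
    live=$1
    [ -f "$live.orig" ] || { echo "no backup found: $live.orig" >&2; return 1; }
    mv "$live.orig" "$live"
}

# Example (paths taken from the post above):
#   use_staging_config /etc/dehydrated/confconsole.config \
#       /usr/share/confconsole/letsencrypt/dehydrated-staging-confconsole.config
#   /usr/lib/confconsole/plugins.d/Lets_Encrypt/dehydrated-wrapper
#   restore_config /etc/dehydrated/confconsole.config
```

Keeping the backup step conditional means running the helper twice won't overwrite the saved original with the staging config.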

Good luck mate.

Jeremy Davis's picture

Ah yes, that is a new error. Reading the error message suggests that we'll need to update the API URL that Dehydrated uses. According to a Let's Encrypt announcement, disallowing new registrations for APIv1 wasn't supposed to happen until next month. Existing users have until June 2020 apparently, although this early breakage of v1 suggests that it's probably best not to wait...

I documented on the bug report how to install from upstream (it's a bit dirty, but in this case it's not "unsafe"). Not sure if you've done that, or just applied the single-line "fix" (i.e. adjusting the single line in Dehydrated)?

Regardless, either of those will resolve the Nonce issue, but without further changes, it will still attempt to use the v1 API. I haven't done any testing yet, but a quick google suggests that we'll need to adjust the URL that Dehydrated connects to (stored in the CA variable). Currently it is:

CA="https://acme-v01.api.letsencrypt.org/directory"

For new users, that now needs to be:

CA="https://acme-v02.api.letsencrypt.org/directory"

It's probably also worth existing users changing that too and double checking that everything works as it should.
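
One way to apply that change is a small helper that rewrites an existing CA= line, or appends one if the file has none. A minimal sketch, assuming the config is the /etc/dehydrated/confconsole.config file mentioned earlier in the thread; the function name set_ca_url is hypothetical.

```shell
#!/bin/sh
# set_ca_url CONFIG URL  (hypothetical helper, not part of Dehydrated or Confconsole)
# Replace an existing uncommented CA= line in CONFIG, or append one if none exists.
set_ca_url() {
    config=$1
    url=$2
    if grep -q '^CA=' "$config" 2>/dev/null; then
        # '|' is used as the sed delimiter because the URL contains slashes.
        # A copy of the previous file is kept as CONFIG.bak.
        sed -i.bak "s|^CA=.*|CA=\"$url\"|" "$config"
    else
        printf 'CA="%s"\n' "$url" >> "$config"
    fi
}

# Example:
#   set_ca_url /etc/dehydrated/confconsole.config \
#       "https://acme-v02.api.letsencrypt.org/directory"
```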

It's probably worth noting that there is now a newer version of Dehydrated in stretch-backports (actually, it's the latest version). So installing from backports is another (cleaner) way to upgrade it.

As noted above, I'm yet to do any testing to confirm my understanding, but essentially, these steps should resolve the issue(s):

  1. upgrade dehydrated (either from upstream or via stretch-backports)
  2. update the confconsole hook script (from TurnKey's GH)
  3. update the API URL (to v2)

And then everything should be all systems go!

I'll aim to look into this further ASAP (within the next few days) but I'm currently a bit bogged down with other stuff.

PS - I hope you don't mind me editing your post to make it a bit easier to read... :)

Jeremy Davis's picture

Thanks for the neat and tidy roundup Igor, that's great!

Although FWIW, you don't need certbot. Dehydrated does pretty much the same thing (i.e. they're both ACME clients which get certificates from Let's Encrypt).

Jeremy Davis's picture

Yes add them in.

FWIW Dehydrated has built in defaults for a number of settings, so adding new values (such as CA and CA_TERMS) in the config file will override the builtin defaults.
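
To illustrate why that works: as far as I understand it, the config file is a shell fragment that gets sourced after the defaults have been set, so any assignment in it simply replaces the default. The sketch below mimics that pattern with a made-up load_config function; the default value shown is a placeholder for the demo, not Dehydrated's actual built-in.

```shell
#!/bin/sh
# Illustration only: load_config is a made-up stand-in for how a script can
# set defaults and then let a sourced config file override them.
load_config() {
    # Built-in defaults (placeholder values for this demo).
    CA="https://example.invalid/default-directory"
    CA_TERMS=""
    # Any assignment in the config file replaces the default above.
    if [ -n "$1" ] && [ -f "$1" ]; then
        . "$1"
    fi
}

# Example:
#   load_config /etc/dehydrated/confconsole.config
#   echo "$CA"
```

Settings that the config file doesn't mention (CA_TERMS in this demo) keep their default value.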

Jeremy Davis's picture

Ryan (via support) has just provided me with a (possible) short term workaround for the issue related to failures with multiple domains configured. I have posted it to the relevant GH issue, but will give a brief overview here.

Essentially he suggests doing the update (as noted above by Igor or on issue #1359). Once that has been configured, reduce the domains that you are requesting certs for to one and run the wrapper script. Then re-add the other domains and re-run the wrapper script.
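
That two-pass approach might look something like the sketch below. It assumes the domain list is a Dehydrated-style domains file (one certificate per line, names separated by spaces). The file location is passed in as an argument because the exact path isn't stated in this thread, and first_domain_only is a made-up helper name.

```shell
#!/bin/sh
# first_domain_only DOMAINS_FILE  (hypothetical helper)
# Back up DOMAINS_FILE to DOMAINS_FILE.full, then rewrite it so that it
# contains only the first name from the first non-empty, non-comment line.
first_domain_only() {
    file=$1
    cp "$file" "$file.full" || return 1
    awk '!/^[ \t]*(#|$)/ { print $1; exit }' "$file.full" > "$file"
}

# Suggested sequence (not run here; /path/to/domains.txt is a placeholder):
#   first_domain_only /path/to/domains.txt
#   /usr/lib/confconsole/plugins.d/Lets_Encrypt/dehydrated-wrapper
#   mv /path/to/domains.txt.full /path/to/domains.txt
#   /usr/lib/confconsole/plugins.d/Lets_Encrypt/dehydrated-wrapper
```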

I haven't tested it myself, but thought it worth sharing. If anyone else tests this out, please post back with feedback.

Unfortunately, I still haven't had a chance to complete the work I've started to provide a "proper" fix for #1360, but fingers crossed, I'll have the final piece for that soon. If anyone is willing and able to help out with that, please let me know.

Jeremy Davis's picture

TBH, I'm not 100% sure, but I think that perhaps it's the switch from the v1 API to the v2 API. I think it's worth trying to clear Dehydrated's data. I.e. try this:

mv /var/lib/dehydrated /var/lib/dehydrated.bak

And then retry:

/usr/bin/dehydrated --register --accept-terms
/usr/lib/confconsole/plugins.d/Lets_Encrypt/dehydrated-wrapper

If that still doesn't work, could you please show me the contents of your config file:

cat /etc/dehydrated/confconsole.config

Jeremy Davis's picture

Judging from the error message "this_hookscript_is_broken__dehydrated_is_working_fine__please_ignore_unknown_hooks_in_your_script", I'm guessing that you haven't updated the hook script. To do that (as per step 2 of the workaround noted on the issue):

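# Paths as used elsewhere in this thread
DEHYD_ETC=/etc/dehydrated
SHARE=/usr/share/confconsole/letsencrypt

GH_URL=https://raw.githubusercontent.com/turnkeylinux/confconsole/master
GH_HOOK=share/letsencrypt/dehydrated-confconsole.hook.sh
CC_HOOK="$DEHYD_ETC/confconsole.hook.sh"
SH_HOOK="$SHARE/dehydrated-confconsole.hook.sh"

# Fetch the current hook script and install it for Dehydrated
wget "$GH_URL/$GH_HOOK" -O "$SH_HOOK"
cp "$SH_HOOK" "$CC_HOOK"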

Hopefully that should get you up and running...
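
For context, my understanding is that newer Dehydrated releases call the hook script with that deliberately odd hook name, and a hook script that treats every unknown name as an error will fail at that point. The sketch below shows the general idea of a dispatcher that ignores names it doesn't know. It is not the actual Confconsole hook script; run_hook is a made-up name and the handler bodies are placeholders.

```shell
#!/bin/sh
# Sketch of a hook dispatcher that tolerates unknown hook names.
# Handler bodies are placeholders; a real hook script does actual work here.
deploy_challenge() { echo "deploy_challenge $*"; }
clean_challenge()  { echo "clean_challenge $*"; }
deploy_cert()      { echo "deploy_cert $*"; }

run_hook() {
    handler=$1
    [ $# -gt 0 ] && shift
    case "$handler" in
        deploy_challenge|clean_challenge|deploy_cert)
            "$handler" "$@"
            ;;
        *)
            # Unknown hook: do nothing and report success.
            return 0
            ;;
    esac
}

# In a standalone hook script the last line would be:
#   run_hook "$@"
```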

Jeremy Davis's picture

Please note that I've just published Confconsole v1.1.1. It's not yet available from our repos (although will be soon) and still requires some specific steps to install and set up on a v15.x server (although better than instructions published previously).

Please note that users who have already updated via various other means are still recommended to install this update, as it includes reliability fixes for add-water, our custom challenge mini-server. Please see the release notes for full step-by-step setup and further info. The instructions cover both new and existing users.

Any issues, please ask. Any feedback (e.g. anything that isn't clear) is welcome too.

Jeremy Davis's picture

It's been brought to my attention that after installing the v1.1.1 Confconsole update, the add-water service is being inadvertently enabled. That means that on reboot, it will start up and will likely block Apache (or other webserver) from starting!

The fix is easy:

systemctl disable add-water

Please note that on any server where you have already run the v1.1.1 update, you need to apply the above line and there is no value in updating to the newer package. For any new servers you launch though (or for anyone else who hasn't applied any fixes and stumbles across this thread) the newer v1.1.2 release resolves this issue (it's exactly the same as v1.1.1, but doesn't auto enable the add-water service on install).

Denton Thompson jr's picture

hello all, 

i have another tkl appliance running wordpress that is just fine, so how about i spin up a new vm and move my site to it with a new cert?

is that crazy? 

skip

 

Jeremy Davis's picture

Is the alternate WP server using the exact same domain? Also is it a Let's Encrypt cert? Or one you've paid for?

I ask because an SSL cert is explicitly for a specific domain. So trying to use a cert for www.mydomain.com for another site which is www.myotherdomain.com won't work.

I ask regarding Let's Encrypt because its certs are only valid for 90 days, so they need to be renewed regularly. Updating as per the v1.1.2 Confconsole release notes will allow you to get a free Let's Encrypt cert and it will be auto updated every few months.

Paid for SSL certs generally last much longer, e.g. the certificate that we use for the Hub lasts 2 years. So it still needs to be updated, but much less frequently.
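
If you want to see how close a certificate is to expiring, openssl can report that directly. A small sketch: the function name cert_expires_within is made up, and the certificate path is an argument because the location of the live cert isn't given in this thread.

```shell
#!/bin/sh
# cert_expires_within CERT_PEM DAYS  (hypothetical helper)
# Exit status: 0 if the certificate expires within DAYS days,
#              1 if it is still valid beyond that,
#              2 if the file can't be read.
cert_expires_within() {
    cert=$1
    days=$2
    [ -r "$cert" ] || return 2
    seconds=$((days * 86400))
    # 'openssl x509 -checkend N' succeeds if the cert is still valid N seconds from now.
    if openssl x509 -checkend "$seconds" -noout -in "$cert" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Example (/path/to/cert.pem is a placeholder):
#   openssl x509 -enddate -noout -in /path/to/cert.pem
#   cert_expires_within /path/to/cert.pem 30 && echo "renew soon"
```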

I hope that helps.

Denton Thompson jr's picture

thanks for the update. very appealing to have someone so close..... tkl ROCKS!

i was specifically thinking of a new vm with no cert: import content from site A, then take site A's fqdn and apply a cert. but i will pursue the confconsole 1.1.2 route instead, tomorrow.

i'm in CST, so day over for me

Jeremy Davis's picture

If you use the same FQDN for both sites, then it should work fine. Although as I say, unless it's a paid for certificate, the advantages of swapping the cert are limited (because it will expire and without the fix, it won't be able to renew)...

Thanks too for your kind words. Good luck with it. :)

FWIW, I'm currently observing AEDT (UTC+11) so it's day over for me now! :)

sanmolhec's picture

Hi, I installed the nginx turnkey (v 15.1) and I found the same error:

Details:
{
  "type": "urn:acme:error:badNonce",
  "detail": "JWS has no anti-replay nonce",
  "status": 400
}

I have tried to apply the proposed solution with version 1.1.2 confconsole (confconsole v 1.1.2). But when I install it it tells me that it is not a reliable version and it does not finish correctly (it aborts the installation). When I forced the installation of the package I broke the VM and I had to reinstall it.

Any idea how to fix it?

Thank you very much in advance. Greetings, Hector.

Jeremy Davis's picture

Did you follow the exact instructions in the "How to install/update" section? If so, at which step did you hit the issue that you note? What was the exact error message that you got?

FWIW, I just tried running the update (following the exact instructions noted in the v1.1.2 release notes, starting at the "How to install/update" section) on a clean install of v15.1 Nginx server and it worked fine. So I'm really stumped on exactly which part isn't working for you and why...!?!

sanmolhec's picture

Hello. Yes, I followed the instructions (or so I think).

I'm so sorry I disappointed you with my problems updating the confconsole package.

I'm a rookie at TurnKey and not too advanced with Linux. I'm deploying the VM in VirtualBox 6.0.4 inside an Ubuntu 18.04 LTS machine.

I use a clean VM; I simply set the passwords requested when the installation starts. I have redirected ports 80 and 443 to the VM with nginx.

I try to get the Let's Encrypt certificate and I get the error detailed before.

I start with the instructions in the "How to install/update" section of the Confconsole v1.1.2 release notes on GitHub.

When I get to the second step of point 3 (install the updated Confconsole) I get the following:

[root@nginx-php-fastcgi ~]# apt install ./confconsole_1.1.2_all.deb

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  libsasl2-modules
Suggested packages:
  libsasl2-modules-gssapi-mit | libsasl2-modules-gssapi-heimdal
  libsasl2-modules-ldap libsasl2-modules-otp libsasl2-modules-sql
The following NEW packages will be installed:
  libsasl2-modules
The following packages will be upgraded:
  confconsole
1 upgraded, 1 newly installed, 0 to remove and 53 not upgraded.
Need to get 102 kB/359 kB of archives.
After this operation, 269 kB of additional disk space will be used.
Do you want to continue? [Y/n] Abort.

 

[root@nginx-php-fastcgi ~]#

If I continue with all the steps and try to get the certificate from the VM menu by entering my domain, I get this error (screenshot):

If I force the installation instead of letting it abort in step 3, the VM is ruined.

This is using the terminal (command shell) of the Webmin utilities, maybe that's the error? Better to enter by ssh?

Can you think of what mistake I might be making? Or how to fix it?

Sorry for the trouble. Thank you very much. Cheers.

Jeremy Davis's picture

Looking at your screenshot, it appears that you still have the old hook script (see the line that says "this_hookscript_is_broken"). I'm not quite sure why you hit that, but if you followed step 1 in the "How to install/update" instructions, all the old scripts/config should have been removed (and then the first run should have reinstated the new ones). Rerunning these lines (then rerunning the "get certificate" process) should get things working for you:

rm -rf /etc/dehydrated/confconsole{.config,.hook.sh}
rm -rf /etc/cron.daily/confconsole-dehydrated

So hopefully that fixes everything and you're good to go. Please let me know if you continue to have issues. Although unless it's directly related to this Let's Encrypt issue, probably best to start a new thread.

[update] just after posting this message I noticed your note about using Webmin. Apologies that I missed that before. Yes, I suspect that may be a part of the issue (I never use Webmin myself...). I would encourage you to always use SSH for terminal commands as the terminal in Webmin is not a proper interactive shell (although Webshell, on port 12320 IS a proper interactive shell). FWIW on an Ubuntu host, it should be really easy to access via SSH, e.g. open a terminal and:

ssh root@VM_IP_ADDRESS

sanmolhec's picture

Hello again.

I'm sorry for the inconvenience created by my clumsiness.

Voilà!!

I connected by SSH and redid all the installation steps. This time, when asked "Do you want to continue? [Y/n]", I accepted and the installation finished correctly, without the problems of last time.

I thought the Webmin console would be fully operational, at least I got that feeling. That must have been my big mistake...

And now I've been able to get the Let's Encrypt certificates. Great!!!

I've already set up the reverse proxy and have everything working as I need it to.

Thanks a million.

Greetings, Hector.

Jeremy Davis's picture

I'm really glad to hear that you're up and running now! I probably should add a note re using a "proper" SSH session rather than Webmin. Although hopefully, we'll be releasing a new version soon with this fixed OOTB... :)
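
On that note, the "Abort." in the apt output earlier is consistent with apt not being able to read an answer to its prompt. Below is a rough sketch of how a script could detect that situation before running an interactive command. The names stdin_is_terminal and install_deb are made up, and checking whether standard input is a terminal is my assumption about what goes wrong in the Webmin terminal, not something confirmed in this thread.

```shell
#!/bin/sh
# stdin_is_terminal  (hypothetical helper)
# Succeeds only when standard input is attached to a terminal.
stdin_is_terminal() {
    [ -t 0 ]
}

# install_deb PACKAGE_FILE  (hypothetical helper)
# Refuse to start an install that will prompt when nobody can answer.
install_deb() {
    if ! stdin_is_terminal; then
        echo "stdin is not a terminal; connect via SSH or Webshell first" >&2
        return 1
    fi
    apt install "$1"
}

# Example:
#   install_deb ./confconsole_1.1.2_all.deb
```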

sanmolhec's picture

Perhaps it would be nice to comment somewhere that the webmin terminal is not 100% operational.

But I think in my case it was my limited experience with Linux that kept me from seeing that the problem was not connecting directly through SSH or Webshell.

Thanks a lot!!

Regards, Héctor.
