Ariel Lira

Hi guys, I have been using tklapp for a couple of years on several servers with excellent results, but sometimes, particularly over the last year, I have run into issues with *.tklapp.com domains resolving incorrectly.

My server has a fixed public IP, and /etc/cron.hourly/hubdns-update has no 'x' bit set. The domain was registered using the hubdns-update command. Also, I am almost sure the server had no downtime and was online 100% of the time.

Usually after several days, maybe a month, I get alerts indicating that my-domain.tklapp.com does not work anymore (because it redirects to tklapp.com). After that, I have to manually run hubdns-update on the server and tell everyone to wait a couple of hours for the DNS caches to update.

I think I may have 2 problems here: 

a) tklapp.com not locating my server. Perhaps this is because I have no 'x' bit in hubdns-update cronjob? Again, I am sure the server was alive all the time.

b) the DNS caches are outdated for too long, despite the domain being ok. 
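
On point a), a minimal sketch for checking the execute bit. It assumes the stock Debian setup, where cron runs /etc/cron.hourly through run-parts, which skips files that are not executable. The function name cron_script_status is hypothetical, and whether a missed hourly update is what makes HubDNS drop the record is not confirmed in this thread.

```shell
# Hypothetical helper: report whether a cron script exists and is executable.
# Under run-parts, a script without the execute bit is silently skipped.
cron_script_status() {
    script="$1"
    if [ ! -e "$script" ]; then
        echo "missing"
    elif [ -x "$script" ]; then
        echo "executable"
    else
        echo "not-executable"
    fi
}

# Example: cron_script_status /etc/cron.hourly/hubdns-update
```

If this reports not-executable, running `chmod +x /etc/cron.hourly/hubdns-update` would let the hourly job run again.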

For example, if I run hubdns-update after issue a) occurs and then ask my DNS for my-domain.tklapp.com, I get the following for several hours:

dig my-domain.tklapp.com 
;; ANSWER SECTION:
my-domain.tklapp.com.   85986   IN      CNAME   tklapp.com.
tklapp.com.             85986   IN      A       23.21.244.168

But if I query one of the tklapp DNS servers at AWS for my-domain.tklapp.com, I get the right IP (XYZ.XYZ.XYZ.XYZ):

dig my-domain.tklapp.com @ns-217.awsdns-27.com. 
;; ANSWER SECTION:
my-domain.tklapp.com.   10      IN      A       XYZ.XYZ.XYZ.XYZ

I think this issue may be caused by the CNAME record to tklapp.com having a very high TTL (1 day). Perhaps if CNAME records to tklapp.com had a lower TTL, such as 3600, DNS caches would clear sooner?
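
To make the TTL arithmetic concrete, here is a rough sketch that converts the TTL column of dig's answer lines into hours, minutes and seconds. The function name ttl_report is hypothetical; it only assumes the standard five-column answer layout (name, TTL, class, type, data) shown in the output above.

```shell
# Hypothetical helper: read dig answer lines on stdin and print how long
# each record may still be served from a resolver's cache.
ttl_report() {
    awk 'NF >= 5 && $1 !~ /^;/ {
        ttl = $2
        printf "%s %s expires in %dh %dm %ds\n", $1, $4,
            int(ttl / 3600), int(ttl % 3600 / 60), ttl % 60
    }'
}

# Example: dig +noall +answer my-domain.tklapp.com | ttl_report
```

Fed the cached CNAME line above (TTL 85986), this prints an expiry of 23h 53m 6s, which is why the wrong answer can linger for most of a day.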

Any advice will be welcome!

Thanks,

Ariel

Jeremy Davis

TBH I don't know much about HubDNS. It does take longer to update than ideal, but for me it's only ever been ~10 mins (I use Google Public DNS). I have noticed that a new tklapp.com domain works straight away, whereas reassigning an existing one takes 5-10 mins to update.

Where is your DNS coming from? You can see from your dig of AWS nameservers that the TTL is 10s. So it's the caching between the AWS nameserver and your browser that is causing the issues.

I note that for me:

dig domain.tklapp.com @8.8.8.8 # Google DNS
...
;; ANSWER SECTION:
domain.tklapp.com. 21599 IN A XYZ.XYZ.XYZ.XYZ
...

Still not 10s but significantly less than yours.
Ariel Lira

Hi Jeremy, 

The TTL of valid, resolved tklapp domains is perfect. My issue is with the TTL returned for the CNAME record in the case of invalid domains (or valid domains that tklapp.com fails to resolve for some reason, as I described previously). See the TTL value (86400) in the following example:

$ dig inexistent-domain.tklapp.com


;; QUESTION SECTION:
;inexistent-domain.tklapp.com.  IN      A

;; ANSWER SECTION:
inexistent-domain.tklapp.com. 86400 IN  CNAME   tklapp.com.
tklapp.com.             86400   IN      A       23.21.244.168

This CNAME record is then allowed to live in DNS caches for about 86400 seconds (1 day). If I register inexistent-domain.tklapp.com with hubdns during that time, it will be invisible to me and others, because the DNS caches in the middle keep redirecting us to tklapp.com even though the right A record is available on the AWS servers.

I've tried with both my ISP's DNS and Google DNS.

I think you can replicate the scenario with the following steps:

  1. dig some-new-domain.tklapp.com @8.8.8.8 (you will get a CNAME to tklapp.com, and the result will be cached in Google DNS)
  2. hubdns-init APIKEY some-new-domain.tklapp.com && hubdns-update (on a server with IP XYZ.XYZ.XYZ.XYZ)
  3. dig some-new-domain.tklapp.com @8.8.8.8 (you will still get the CNAME to tklapp.com instead of the A record for XYZ.XYZ.XYZ.XYZ)
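
The check in steps 1 and 3 can be sketched as a small filter over dig's answer output. This is only an illustration: classify_answer is a hypothetical name, and it assumes the fallback answer is always a CNAME pointing at tklapp.com., as in the output above.

```shell
# Hypothetical helper: read "dig +noall +answer" output on stdin and report
# whether the name still resolves through the fallback CNAME to tklapp.com.
classify_answer() {
    awk '
        $4 == "CNAME" && $5 == "tklapp.com." { print "stale-cname ttl=" $2; found = 1; exit }
        $4 == "A"                            { print "a-record " $5 " ttl=" $2; found = 1; exit }
        END { if (!found) print "no-answer" }
    '
}

# Example: compare a caching resolver with one of the AWS nameservers.
#   dig +noall +answer some-new-domain.tklapp.com @8.8.8.8 | classify_answer
#   dig +noall +answer some-new-domain.tklapp.com @ns-217.awsdns-27.com. | classify_answer
```

If the first command reports stale-cname while the second reports a-record, the registration itself worked and only the cached CNAME is in the way.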

Please let me know if you need more info.

Thanks,

Ariel

 

Jeremy Davis

Sorry that I missed your initial point. Re-reading your OP, I get it, and I think you may be right! Interestingly though, I get the high TTL for the CNAME record if I query one of the AWS nameservers (same as you: 86400, i.e. 24 hours), but not if I query Google (21599 = ~6 hours). That's still far from ideal, but it is significantly less. I wonder what is going on.

FWIW I have updated the existing issue on our tracker with an overview of your info. I've linked to this thread but feel free to elaborate over there if you want.

Thanks for your info.
