> On Nov 10, 2018, at 11:57 PM, Mukund Sivaraman <muks@xxxxxxxxxx> wrote:
>
> On Sat, Nov 10, 2018 at 05:08:29PM -0500, Viktor Dukhovni wrote:
>> For the IPv4 address, for a total of 60 queries sent, only 5 answers
>> came back. Four of the five responses were delayed by more than a
>> full second.
>
> digging around, I see curious behavior.
>
> dig -4 +tcp @ns0.amsl.com consistently reports query time of upwards of
> 2s from a host in hetzner.de and upwards of 3s from hotel in Bangkok.
>
> dig -4 @ns0.amsl.com (udp) reports query time of ~165ms from a host in
> hetzner.de and ~230ms from hotel in Bangkok.

FWIW, given the apparently rather tight rate limits, if it were up to
me, I'd just operate the primary as a hidden master, with the sole
purpose of providing AXFR service to the slave servers. It is perhaps
better not to list it at all than to list a server that's going to be
"aggressively" dropping packets.

However, the operative word is "apparently"! It is quite possible that
only "DNSViz" and various synthetic probes run into issues, and that
"normal" resolvers are getting perfectly adequate service. Since I
don't know how the rate limits are implemented, and what mix of queries
the domain gets, ... the first paragraph could be substantially
off-base.

Glen mentioned that there'll be new attention on monitoring, and
perhaps more frequent, and less error-prone, re-signing. Even if
manual, I hope the process involves just a single step of running a
robustly designed and well-tested script that consistently performs
all the requisite tasks.

Good luck to the ietf-tools and AMSL teams, I hope this outage will
help to improve things going forward.

I'll stop kibitzing now.

--
Viktor.