I'm starting a new thread to clarify and emphasize the problem I'm
actually trying to solve. Here is the problem restated as I posted it
to the dns-operations list:

-----
Is it really expected that the first DNS server listed in
/etc/resolv.conf should never go down? Operationally speaking, who
can actually rely on listing multiple nameservers in /etc/resolv.conf
and using libc's failover mechanism in any kind of production server?
Because the failover behavior in libc is atrocious--each new or
existing process has to re-do the failover after timing out, and even
long-running processes have to call res_init() to re-read resolv.conf.

It seems that the only sensible way to run a datacenter (or a network
full of Linux workstations for that matter) is to either:

1. Make sure the first nameserver listed in resolv.conf never goes
   down by using Anycast DNS or some other failover mechanism like
   VRRP or CARP on the DNS server side.

or:

2. Use a local DNS daemon on every server with forwarders configured
   to the network's nameservers, and fix resolv.conf to 127.0.0.1.
-----

(I've since learned that nscd can be a third option)

On Fri, Apr 25, 2014 at 07:19:17PM +0200, Petr Spacek wrote:
> On 25.4.2014 18:19, Simo Sorce wrote:
> >On Fri, 2014-04-25 at 09:56 -0600, Pete Zaitcev wrote:
> >>On Thu, 10 Apr 2014 10:41:54 -0400
> >>Chuck Anderson <cra@xxxxxxx> wrote:
> >>
> >>>[...] We need an independent,
> >>>system-wide DNS cache, and always point resolv.conf to 127.0.0.1 to
> >>>solve this fundamental design problem with how name resolution works
> >>>on a Linux system. Windows has had a default system-wide DNS cache
> >>>for over a decade. It is about time that Linux catches up.
> >>
> >>I observe you pointedly ignore the existence of nscd (which does not
> >>require any changes to resolv.conf). Why is that?

Ignorance about nscd on my part. Please tell me more. What are the
honest pros/cons to using nscd?
Are there still big enough problems with nscd to warrant its poor
reputation?

> >nscd is ... bad

I've since learned more about nscd. Apparently its reputation may be
undeserved, at least the newer versions in glibc. I have no direct
experience, but I finally found a good thread about fixing the stub
resolver that addresses people's unwillingness to use nscd as well as
some other things that could be done, such as a patch apparently
carried by Debian and Ubuntu that improves detection of changes to
resolv.conf:

https://sourceware.org/ml/libc-alpha/2012-12/msg00416.html

> Main goal is to have local DNSSEC-validating resolver.

I, as the OP, did not intend that as the goal, although I have no
problem with that as a different goal. My intent was to fix the
atrocious failover behavior of the glibc resolver. I also don't mind
using a caching resolver BUT there should be a better stub resolver
that can be widely deployed in a default configuration that doesn't
require a local caching resolver to paper over its deficiencies.
Maybe nscd (and some of the other ideas in the link I posted) are
part of the solution.

Basically, we aren't going to win the war by suggesting that everyone
should run a DNSSEC-validating resolver everywhere. But maybe we can
get widespread consensus for having a lightweight daemon that just
does failover correctly and nothing else fancy so that people won't
mind it running by default on Server, Workstation, Cloud, etc. Maybe
nscd can be that daemon, or maybe something else needs to be written.

Whatever the solution to DNS failover, we should be sure it works
correctly in combination with/doesn't get in the way of full
caching/DNSSEC-validating resolvers, both local and remote, whether
they are installed/enabled by default in various Fedora products or
not.

--
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxxx
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
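
To put rough numbers on the failover complaint, here is a toy model,
not glibc's actual algorithm. It assumes a flat 5-second wait per dead
server (the documented default for the resolv.conf "timeout" option)
and ignores retries and the real retransmission schedule. The
addresses are documentation-range placeholders, and StickyResolver is
a hypothetical stand-in for "a daemon that remembers which server
answered last", not nscd or any existing program.

```python
from dataclasses import dataclass

# Assumption: flat per-server wait, matching the documented default of
# the resolv.conf "timeout" option. Retries are not modelled.
TIMEOUT_S = 5


def stateless_lookup_delay(servers, down, timeout=TIMEOUT_S):
    """Seconds wasted before the first live server is reached when every
    lookup starts again from the top of the list (no memory of failures)."""
    waited = 0
    for server in servers:
        if server not in down:
            return waited
        waited += timeout
    return None  # every listed server is down


@dataclass
class StickyResolver:
    """Hypothetical resolver that starts each lookup at the server that
    answered last time, so a dead first entry is only paid for once."""
    servers: list
    timeout: int = TIMEOUT_S
    preferred: int = 0  # index of the last server that answered

    def lookup_delay(self, down):
        waited = 0
        count = len(self.servers)
        for offset in range(count):
            index = (self.preferred + offset) % count
            if self.servers[index] not in down:
                self.preferred = index
                return waited
            waited += self.timeout
        return None  # every listed server is down


if __name__ == "__main__":
    servers = ["192.0.2.1", "192.0.2.2"]  # placeholder addresses
    down = {"192.0.2.1"}                  # first nameserver is dead

    stateless_total = sum(stateless_lookup_delay(servers, down) for _ in range(3))
    sticky = StickyResolver(servers)
    sticky_total = sum(sticky.lookup_delay(down) for _ in range(3))

    print("stateless, 3 lookups:", stateless_total, "seconds waiting")
    print("sticky,    3 lookups:", sticky_total, "seconds waiting")
```

In this model three lookups cost 15 seconds of waiting when nothing is
remembered and 5 seconds when the last good server is remembered. The
gap grows with every additional lookup and every additional process,
which is the behavior I'm objecting to.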