On Tue, 29 Sep 2020, Lennart Poettering wrote:
Well, but how do you determine "local resources"?
This is not the proper question. The proper question is "what are you trying to do?". The .local domain discovery clearly is something meant to be local. I assume the real question is: how do I convey my custom local network domain to my local infrastructure? In the old days, it was what DHCP gave you as the domain. If you do that with your own network, then it is pretty obvious. If you do it differently, you will have to convey this somehow via configuration. This is why 20 years ago Microsoft added "zones" to their network configuration: is this a "home zone", a "work zone", or a "public wifi"? So I expect the information of "what is local" to live in NetworkManager or systemd-networkd via configuration. No further magic should be needed. The user selects this once when joining a new network.
Corporate networks tend to define local zones. Home wifi routers all do, too. There's no clear way to know what can go directly to well-known good DNS servers and what needs to be resolved locally.
Generally, resolve the names from the DHCP obtained domain name with the DHCP obtained name servers. Yes, this is limited to one domain name, which might not be ideal, but in general when you connect to a home or corporate network directly (no VPN) then you should use their DNS servers regardless. Enterprise is likely blocking port 53 (or DoT, or trying to block DoH) for security reasons. And your home network you trust?

Most home routers these days allow configuration of a guest network along with the native home network, for those who do not require services on the home network and just need internet. It is the same as using a public wifi in a coffeeshop or a guest network at an enterprise. You might need to authenticate to a captive portal, and then you should not trust the network for anything else and ideally only give it encrypted packets (TLS, DoT to trusted DNS servers, VPN). If no trusted DNS servers are configured on your device, you have no choice but to trust their DNS servers. For what the user deems a "public wifi", there are simply never any "local resources" other than an internet uplink to your own remote resources.

In all the above scenarios, I see no ambiguity on which DNS servers to use, except when multiple domain names exist within only the LAN, which is rarely the case.

For the VPN scenario, it is just a little bit more complicated. For those with proper standards, such as "Cisco IPsec" and "L2TP/IPsec", the VPN configuration is dictated by the server to send either all or some traffic to the VPN server. If it is not "everything", then these VPNs convey one domain name and one or more IPs of DNS servers to use to resolve that domain. For IKEv2 IPsec based VPNs, any number of domain names can be specified by the server to be used by the client. When doing split-DNS with DNSSEC trust anchors, these can be conveyed, and there are strict rules on when to allow these to override public DNSSEC trust anchors as per RFC 8598.
For VPN protocols with no real standard, things are more complicated. OpenVPN can do custom things. It all depends on the provisioning. WireGuard has nothing related to DNS; it is all hidden in the per-vendor proprietary provisioning code. Perhaps the "wg-dynamic" userland protocol will address this. Let's hope they read RFC 8598 for inspiration to avoid the mistakes of IPsec 20 years ago.

What is important with all of the VPN cases is that you properly flush the cache when the VPN establishes and terminates, to avoid having unreachable IPs in your DNS cache. It's important not to flush other DNS data, to avoid DNS fingerprinting users across different networks. It seems resolvectl is the API to support this with systemd-resolved.

In short, I don't understand the issue raised here of "How do you determine local resources". For each and every domain name in the above scenarios it is obvious what nameserver to send it to. There is never a need to broadcast this over a mix of public / private DNS servers.
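To make the routing rule concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how systemd-resolved or any VPN client implements it: the class name DnsRouter, its methods, and the example domains and addresses are all made up. A name goes to the servers of the network that announced its domain (via DHCP or VPN provisioning), with the longest matching suffix winning, and everything else goes to the default servers.

```python
from typing import Dict, List


def normalize(name: str) -> str:
    """Lower-case a DNS name and drop the trailing root dot."""
    return name.rstrip(".").lower()


class DnsRouter:
    """Hypothetical per-domain DNS routing table (illustrative only)."""

    def __init__(self, default_servers: List[str]) -> None:
        # Servers used for every name that no network has claimed.
        self.default_servers = list(default_servers)
        # Maps a domain suffix to the servers announced for it.
        self.routes: Dict[str, List[str]] = {}

    def add_domain(self, domain: str, servers: List[str]) -> None:
        """Register a domain learned from DHCP or VPN provisioning."""
        self.routes[normalize(domain)] = list(servers)

    def remove_domain(self, domain: str) -> None:
        """Forget a domain, for example when the VPN terminates."""
        self.routes.pop(normalize(domain), None)

    def servers_for(self, qname: str) -> List[str]:
        """Return the servers for qname; the longest matching suffix wins."""
        labels = normalize(qname).split(".")
        # Try the full name first, then strip one leading label at a time.
        for i in range(len(labels)):
            suffix = ".".join(labels[i:])
            if suffix in self.routes:
                return self.routes[suffix]
        return self.default_servers


if __name__ == "__main__":
    router = DnsRouter(default_servers=["192.0.2.53"])
    router.add_domain("corp.example", ["10.0.0.53"])
    print(router.servers_for("mail.corp.example."))
    print(router.servers_for("www.example.org"))
```

Matching is done on whole labels, so a name like notcorp.example is not mistaken for something inside corp.example. At no point does a single query fan out to a mix of public and private servers.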
Also, people would react very allergic if we'd start sending all DNS traffic to google or so.
So this feature has no purpose as far as I can see and is never ever a good idea, unless the user is specifically told their choice is to disconnect from a broken network or try to use the broken network with well known public DNS servers as a last resort.
Yes, resolved implements DNSSEC. But from my experience I can tell you it's very hard to do in a way reasonably compatible with DNS servers deployed out there, in particular edge ones. Things mostly work, but DNS servers are all broken in different ways, and we can't possibly test things on all possible cheap wifi hw...
Which is why the DNSSEC validation code should have been left to the large DNS teams at ISC, NLnetlabs, nic.cz, powerdns, the IETF/ICANN communities etc. The problem that systemd-resolved claims to have been written for, determining when and where to send which DNS queries, is completely unrelated to DNSSEC and its deployment/implementation protocol interop issues and corner cases. It was never required that systemd-resolved use its own DNSSEC validation code. I warned not to do this.

The DNS community spends tens of millions of dollars a year on writing and maintaining DNS libraries and daemons and on protocol updates. libreswan based its VPN DNS reconfiguration on the unbound daemon and libunbound. This work was done in collaboration with NLnetlabs to extend unbound for all the VPN use cases, to reconfigure the DNS server for all kinds of VPN domain scenarios. FreeBSD has started using unbound with their own unbound reconfiguration tooling around it.

systemd-resolved resources were spent re-inventing DNSSEC implementations, making many of the same mistakes that the existing DNS libraries made, and it is still buggy resolving certain complicated CNAME and wildcard scenarios with NSEC3. This is not because systemd-resolved programmers are bad. It is because implementing and maintaining DNSSEC is a million dollars a year operation. This money results in 3-4 production quality, well maintained DNSSEC implementations that Linux can choose from. systemd-resolved simply does not have the resources to do this themselves, as is evident from the 1300 open bugs on github right now.

systemd-resolved should use an existing DNSSEC library. It can open a separate DNS cache to each of the interface's supplied DNS servers. It can route DNS queries to the proper DNS cache. It will automatically get fixes and new record types supported by updates to these DNS libraries.

systemd-resolved should focus on what it needs to do: learn and reconfigure the stream of DNS queries to the right servers. It should get out of the DNS resolving and DNSSEC validation and DNS caching business.
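As a rough illustration of the "separate cache per interface" idea, combined with the earlier point about flushing only the VPN's entries when it goes down, here is a small Python sketch. Everything in it is hypothetical (the LinkCaches class, its method names, the interface names). In a real design the cache and the validation would live inside an existing DNS library such as libunbound, one context per interface, rather than in a dictionary like this.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

CacheKey = Tuple[str, str]            # (lower-cased qname, qtype)
CacheEntry = Tuple[float, List[str]]  # (expiry timestamp, answer records)


class LinkCaches:
    """Hypothetical per-interface DNS caches (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so that expiry can be tested.
        self._clock = clock
        self._caches: Dict[str, Dict[CacheKey, CacheEntry]] = {}

    def link_up(self, link: str) -> None:
        """Start with an empty cache when a link or VPN is established."""
        self._caches[link] = {}

    def link_down(self, link: str) -> None:
        """Flush only this link's cache; other links keep their data."""
        self._caches.pop(link, None)

    def put(self, link: str, qname: str, qtype: str,
            answer: List[str], ttl: float) -> None:
        """Store an answer that was obtained via the given link."""
        cache = self._caches.setdefault(link, {})
        cache[(qname.lower(), qtype)] = (self._clock() + ttl, list(answer))

    def get(self, link: str, qname: str, qtype: str) -> Optional[List[str]]:
        """Return a cached answer for this link, or None if absent or expired."""
        cache = self._caches.get(link, {})
        key = (qname.lower(), qtype)
        entry = cache.get(key)
        if entry is None:
            return None
        expires, answer = entry
        if self._clock() >= expires:
            del cache[key]
            return None
        return answer


if __name__ == "__main__":
    caches = LinkCaches()
    caches.put("wlan0", "www.example.org", "A", ["192.0.2.10"], ttl=300)
    caches.put("vpn0", "intranet.corp.example", "A", ["10.0.0.5"], ttl=300)
    caches.link_down("vpn0")  # VPN terminates: only its entries go away
    print(caches.get("vpn0", "intranet.corp.example", "A"))
    print(caches.get("wlan0", "www.example.org", "A"))
```

Because answers are keyed by the link they came in on, tearing down the VPN removes the now unreachable internal addresses without touching what was learned on the other interfaces.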
(One thing I definitely want to add is an option to only do DNSSEC if DoT is also done, under the assumption that a DNS server that is good enough and new enough to implement the latter also should be able to do the former sanely.)
That assumption might be true now, but 5 years down the line there will be bugs and corner cases and not enough resources for systemd-resolved to track and handle this. Also, the "only do DNSSEC if" is not a valid choice. Let's remember this whole thread started with my system getting broken because DNSSEC was silently dropped by systemd-resolved after a system upgrade.
No, it's not. It's extremely difficult. Cheap wifi router DNS servers are broken in so many ways. They return errors in some cases, freeze in others, return rubbish in others, or not at all in even others. If you ask the wrong questions anything can happen.
This is why systemd-resolved should use a DNS library and not invent its own thing. The teams at ISC, NLnetlabs, NIC.CZ, PowerDNS have spent the last 20 years dealing with this and solving it. Use their code. You don't have the resources to do this yourselves. Again, 1300 open bugs on github show you have never managed to dig out of this hole.
We pretty carefully test and probe DNS servers but this still comes at the price that on a particularly bad implementation we might take a long time until we figure out that DNSSEC simply is not possible.
See above. Also, the fix you applied now is to disable DNSSEC by default, damaging all the installed servers on enterprise networks that depend on and receive completely valid DNSSEC traffic. So I am sorry if I strongly disagree with "pretty carefully test and probe". That's not what happened to my laptop as VPN client and my mail server.
The simple fact that some DNS servers don't respond at all if you ask the "wrong" questions is already a problem: it means you have to wait for a timeout (which means super long lookups initially) or do queries in parallel. That however is a problem too since other DNS servers really don't like it if you ask them multiple questions at once. Bombarding DNS servers with multiple questions all at once and see if one "sticks" isn't a workable strategy hence either.
Stop re-inventing the wheel. Bind, unbound, knot, powerdns do this with much more resources than you have, and have done so for many more years than you have, and they are far more aware of these issues than you are, as they see a vastly larger audience with issues than the Linux desktop niche market. When systemd-resolved on github closes a bug reported and explained by Mark Andrews of Bind, the result is a bug in systemd-resolved.
So I think we do quite well in resolved on the DNSSEC front actually,
Compared to the dedicated DNS teams at the mentioned opensource DNS software, systemd-resolved is not doing quite well. It is doing poorly. Its developers are not attending the DNS conferences where issues are discussed. They are not at IETF, not at ICANN, not at DNS-OARC, not at RIPE. I have never seen systemd-resolved people participating in the wider DNS community, a community of hundreds of DNS engineers.

So let me ExecSum what I wrote here. For systemd-resolved to become a high quality DNS solution:

1) Remove custom DNS/DNSSEC resolving code and use a well maintained DNS library.
2) Maintain a per interface DNS cache using these libraries.
3) Use the above sketched out process to improve your process of deciding which interface to send the query to. This is the core of what systemd-resolved should give to the user. It is probably already pretty close to this when we work on integrating VPN support.
4) Deal with hotspots separately.
5) Support user configured/prompted fallback using DoT and DoH to well known servers in case the obtained DNS servers are too broken to work well (with DNSSEC).

No one else but systemd-resolved has items 2) and 3), and we only had a badly working dnssec-trigger that tried to do this. This is where systemd-resolved can shine. I would seriously FALL IN LOVE with systemd-resolved for doing 2) and 3), even if I had to sometimes manually do 4) and 5). I will work on extending 3) with VPN support in libreswan for IKEv1 and IKEv2 based IPsec VPNs.

But 1) is crucial to widespread voluntary adoption. Without 1) we have no choice but to allow the user to completely disable/remove systemd-resolved from their system.
Paul