Hi all,

I recently observed reverse IPv4 address lookups timing out on a newly configured host (Ubuntu 22.04 LTS, systemd 249.11-0ubuntu3.7). I tracked the problem to the DVE-2018-0001 mitigation code.

An example:

    $ resolvectl query 151.101.1.164
    151.101.1.164: resolve call failed: All attempts to contact name servers or networks failed

tcpdump shows (in relevant part):

    00:00:00.000000 IP 192.168.1.48.35911 > 8.8.8.8.53: 26417+ [1au] PTR? 164.1.101.151.in-addr.arpa. (55)
    00:00:00.021127 IP 8.8.8.8.53 > 192.168.1.48.35911: 26417 NXDomain 0/1/1 (115)
    00:00:00.021252 IP 192.168.1.48.35911 > 8.8.8.8.53: 26417+ PTR? 164.1.101.151.in-addr.arpa. (44)

The first query gets an NXDOMAIN, which is the correct answer for this address. However, NXDOMAIN triggers the DVE-2018-0001 mitigation code, which sends a revised query without the EDNS OPT record (confirmed in the debug log). I **never see a response to this revised query**.

If there is only a single DNS server, the resolver resends the OPT-less query after a timeout, and *that* gets an NXDOMAIN, which is returned. However, if there are multiple DNS servers (e.g. 8.8.8.8 and 8.8.4.4), then on timing out it sends another query with EDNS to the next server, and the three-packet sequence repeats several times until it gives up.

Since the server *will* respond to a retransmit after 5s, my guess is that the server, or maybe something in the network, is dropping close-in-time requests with the same transaction ID. I tried a few public DNS servers, and (surprisingly?) they all behaved the same. I haven't found a simple way to rule out a firewall, router, or my ISP.

Regardless, my thought is that resending a slightly different query after we did get a response should not reuse the same transaction ID.
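To make the two outgoing packets in the capture concrete, here is a minimal standalone sketch that builds a PTR query with and without an empty EDNS OPT record. This is not systemd code: build_ptr_query(), put_u16() and set_query_id() are hypothetical helpers written only for illustration, and the advertised UDP payload size of 1232 is an arbitrary assumption that does not affect the message length. It only shows that the two queries share the transaction ID and question and differ by the 11-byte OPT record, and that the ID of the retry could be changed independently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a big-endian 16-bit value at buf[off]; return the next offset. */
static size_t put_u16(uint8_t *buf, size_t off, uint16_t v) {
    buf[off] = (uint8_t)(v >> 8);
    buf[off + 1] = (uint8_t)(v & 0xff);
    return off + 2;
}

/* Hypothetical helper: build a one-question PTR query for a dotted name
 * (trailing dot optional). If with_edns is nonzero, append an OPT
 * pseudo-record with empty RDATA. Returns the message length in bytes. */
size_t build_ptr_query(uint8_t *buf, size_t cap, uint16_t id,
                       const char *name, int with_edns) {
    size_t off = 0;

    /* Header (12) + encoded name (at most strlen + 2) + type/class (4) + OPT (11). */
    assert(cap >= 12 + strlen(name) + 2 + 4 + 11);

    off = put_u16(buf, off, id);                 /* transaction ID */
    off = put_u16(buf, off, 0x0100);             /* flags: RD set ("+" in tcpdump) */
    off = put_u16(buf, off, 1);                  /* QDCOUNT */
    off = put_u16(buf, off, 0);                  /* ANCOUNT */
    off = put_u16(buf, off, 0);                  /* NSCOUNT */
    off = put_u16(buf, off, with_edns ? 1 : 0);  /* ARCOUNT ("[1au]" in tcpdump) */

    /* QNAME as length-prefixed labels, terminated by the root label. */
    const char *p = name;
    while (*p) {
        const char *dot = strchr(p, '.');
        size_t len = dot ? (size_t)(dot - p) : strlen(p);
        assert(len > 0 && len <= 63);
        buf[off++] = (uint8_t)len;
        memcpy(buf + off, p, len);
        off += len;
        p += len;
        if (*p == '.')
            p++;
    }
    buf[off++] = 0;
    off = put_u16(buf, off, 12);                 /* QTYPE = PTR */
    off = put_u16(buf, off, 1);                  /* QCLASS = IN */

    if (with_edns) {
        buf[off++] = 0;                          /* owner name: root */
        off = put_u16(buf, off, 41);             /* TYPE = OPT */
        off = put_u16(buf, off, 1232);           /* UDP payload size (assumed value) */
        buf[off++] = 0;                          /* extended RCODE */
        buf[off++] = 0;                          /* EDNS version */
        off = put_u16(buf, off, 0);              /* flags */
        off = put_u16(buf, off, 0);              /* RDLENGTH = 0 */
    }
    return off;
}

/* Hypothetical helper: overwrite the transaction ID of a built message. */
void set_query_id(uint8_t *buf, uint16_t id) {
    (void)put_u16(buf, 0, id);
}
```

Called with ID 26417 and the name 164.1.101.151.in-addr.arpa., the sketch yields 55 bytes with OPT and 44 bytes without, which matches the lengths tcpdump prints in parentheses above. Calling set_query_id() on the second message before sending is the standalone analogue of giving the retry a fresh ID.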

I patched systemd as follows, and the problem goes away:

--- a/src/resolve/resolved-dns-transaction.c
+++ b/src/resolve/resolved-dns-transaction.c
@@ -1312,6 +1312,7 @@ void dns_transaction_process_reply(DnsTransaction *t, DnsPacket *p, bool encrypt
                                           FORMAT_DNS_RCODE(DNS_PACKET_RCODE(p)),
                                           dns_server_feature_level_to_string(t->clamp_feature_level_nxdomain));
 
+                                dns_transaction_shuffle_id(t);
                                 dns_transaction_retry(t, false /* use the same server */);
                                 return;
                         }

A few questions:

- Does anyone else see this?
- Does this look like a reasonable fix? Any thoughts on whether the one other place where dns_transaction_retry(..., false) is called to retry the same server with a lower feature level (SERVFAIL, etc.) should do the same?
- Any other issues with the patch? Or would it be reasonable to (add comments and) submit a pull request?

-Vince Del Vecchio