On Fri, Feb 05, 2016 at 06:42:34AM -0800, Ned Freed wrote:

> > The implementation and documentation of this was joint work with
> > Wietse back in early 2006.  These days, when STARTTLS fails, Postfix
> > tries other MX hosts first and if they all fail, defers the mail
> > initially.  Cleartext fallback kicks in on the second delivery
> > attempt if STARTTLS fails again.
>
> Actually, I consider this approach as unacceptable unless the second delivery
> attempt occurs within a minute or two. (Which, incidentally, is a much shorter
> retry period after deferral than the standards recommend.)

The default is 5 minutes, with doubling exponential backoff up to
a cutoff of somewhat over an hour:

    $ postconf -d {min,max}imal_backoff_time
    minimal_backoff_time = 300s
    maximal_backoff_time = 4000s

(These, combined with per-destination concurrency limits to avoid
overwhelming remote systems that come back after a period of
downtime, and throttling of destinations when enough back-to-back
deliveries fail, do a better job of avoiding hammering remote systems
than the recommendations in the standards, while substantially
reducing delay when transmission fails on the first attempt.)

I typically override these to min/max = 225s/7200s (this reduces
congestion delay if the queue happens to hold enough long-term
deferred mail, since with these settings the fraction that is active
at any given time is reduced by a factor of 2 or so).

As for "unacceptable", you might find that the below fall into that
category:

* IIRC Sendmail never falls back to cleartext if STARTTLS is
  advertised.

* Microsoft's Schannel TLS stack at outlook.com and in Exchange by
  default solicits client certs it does not use, and then rejects
  client connections that happen to present a certificate chain
  with an MD5 signature, even when it is the self-signature of a
  root CA, as with CAcert.org.

* The ancient Schannel implementation in Exchange 2003 ignores all
  but the first 64 ciphers in the client's TLS hello, and has a
  broken DES3-CBC implementation that fails post-handshake, with
  trailing garbage in the TLS response to EHLO that breaks the
  established TLS connection after "MAIL FROM".  Only RC4-SHA and
  RC4-MD5 work (but modern Schannel at outlook.com refuses to
  negotiate RC4).

    $ posttls-finger -o tls_medium_cipherlist=RC4-SHA -c -Ldebug microsoft.com
    posttls-finger: initializing the client-side TLS engine
    posttls-finger: setting up TLS connection to microsoft-com.mail.protection.outlook.com[207.46.163.138]:25
    posttls-finger: microsoft-com.mail.protection.outlook.com[207.46.163.138]:25: TLS cipher list "RC4-SHA:!aNULL"
    posttls-finger: SSL_connect:before/connect initialization
    posttls-finger: SSL_connect:SSLv2/v3 write client hello A
    posttls-finger: SSL_connect error to microsoft-com.mail.protection.outlook.com[207.46.163.138]:25: lost connection

* DANE/DNSSEC at mail.mil and the many domains it is MX for is
  fubar'ed, because their firewall "protects" the DNS servers by
  blocking lookups for exotic records such as TLSA.

    $ dig -t tlsa +noall +comment +ans _25._tcp.pri-jeemsg.eemsg.mail.mil.
    ;; connection timed out; no servers could be reached

    $ dig -t a +noall +comment +ans _25._tcp.pri-jeemsg.eemsg.mail.mil.
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 14873
    ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

* Despite being notified at the beginning of Aug 2015, isphuset.no
  still has not upgraded from a buggy PowerDNS version that botches
  TLSA record denial of existence for the domains of all their
  DNSSEC hosted customers.

  [ In the example below, the name internot.no seems rather apt ]

    $ unbound-host -t tlsa -v _25._tcp.internot.no.
    _25._tcp.internot.no. has no TLSA record (BOGUS (security failure))
    validation failure <_25._tcp.internot.no.
    TLSA IN>: nodata proof failed from 195.35.82.103

As for a delay of < 5 minutes in delivering email to such broken
sites, it is, for most users, a reasonable trade-off to reduce
needless TLS fallback in the face of routine transmission glitches.

Though in the case of mail.mil, internot.no and the like, one has
to explicitly disable DANE support for those domains, since in order
to avoid downgrade attacks there is no fallback when TLSA record
lookups time out or servfail.

-- 
	Viktor.
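
[ A minimal sketch of the retry timing described above, assuming the
  wait simply starts at minimal_backoff_time and doubles until it
  reaches maximal_backoff_time.  The function name is hypothetical,
  and this is an idealized model only: actual retry times also depend
  on when the deferred queue is scanned, so real intervals will
  differ. ]

```python
def backoff_schedule(minimal=300, maximal=4000, attempts=8):
    """Idealized waits (in seconds) before each successive retry.

    Assumption: the wait doubles after every failed attempt and is
    capped at `maximal`.  This is not Postfix's actual scheduler.
    """
    delays = []
    delay = minimal
    for _ in range(attempts):
        delays.append(min(delay, maximal))  # never wait longer than the cap
        delay *= 2                          # doubling exponential backoff
    return delays


if __name__ == "__main__":
    # Postfix defaults quoted above: 300s / 4000s
    print(backoff_schedule())
    # The 225s / 7200s override mentioned above
    print(backoff_schedule(minimal=225, maximal=7200, attempts=7))
```

In this model the first retry, where cleartext fallback can kick in,
comes after 300s with the defaults and after 225s with the override;
the cap is reached on the fifth retry with the defaults and on the
sixth with 225s/7200s.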