Re: The sk_err mechanism is infuriating in userspace

Andy Lutomirski <luto@xxxxxxxxxxxxxx> · Wed, 28 Feb 2024 12:00:48 -0800

On Tue, Feb 6, 2024 at 9:24 AM Andy Lutomirski <luto@xxxxxxxxxxxxxx> wrote:
>
> On Tue, Feb 6, 2024 at 12:43 AM Paolo Abeni <pabeni@xxxxxxxxxx> wrote:
> >
> > What about 'destination/port unreachable' and many other similar errors
> > reported by sk_err? Which specific errors reported by sk_err does not
> > indicate that anything is wrong with the socket ?

I started writing a series to improve this in a backwards-compatible
way, but now I'm wondering whether the current behavior may be
partially a regression and not actually something well-enshrined in
history.

The nasty behavior in question is that, if a UDP or ping (or
presumably TCP, but that case is not necessarily a problem) socket
enables IP_RECVERR, then an ICMP error will asynchronously cause the
next sendmsg() to fail.  The code that causes this seems to be ancient
(I think it's sock_wait_for_wmem, which predates git, but I won't
swear to that).
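
A minimal sketch of how one might reproduce this from userspace. It
assumes Linux and a loopback interface that answers a datagram to a
closed UDP port with an ICMP port-unreachable error. The function name,
the 0.1 s delay and the fallback value 11 for IP_RECVERR (its value in
<linux/in.h>) are assumptions made for illustration.

```python
import errno
import socket
import time

# Assumption: 11 is the Linux value of IP_RECVERR (from <linux/in.h>); older
# Python versions do not export the constant from the socket module.
IP_RECVERR = getattr(socket, "IP_RECVERR", 11)


def send_twice_to_closed_port(delay=0.1):
    """Send two datagrams to a closed loopback UDP port.

    Returns a list with one errno per send (0 means the send succeeded).
    """
    # Pick a port that is very likely closed: bind, note the address, close.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    dest = probe.getsockname()
    probe.close()

    results = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.IPPROTO_IP, IP_RECVERR, 1)
        for _ in range(2):
            try:
                s.sendto(b"x", dest)
                results.append(0)
            except OSError as e:
                results.append(e.errno)
            time.sleep(delay)  # leave time for the ICMP error to come back
    return results


if __name__ == "__main__":
    for i, code in enumerate(send_twice_to_closed_port(), 1):
        name = errno.errorcode.get(code, str(code)) if code else "ok"
        print(f"send {i}: {name}")
```

If the behavior described above is present, I would expect the first
send to succeed and the second to fail with the errno derived from the
ICMP error, even though nothing is wrong with the second datagram
itself.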

Looking at my own logs, though, a Linux 4.5.2 kernel did not seem to
trigger this regularly, whereas I'm hitting it on a regular basis on
6.2 and some newer kernels.  And, somewhat damningly (with IP addresses
redacted):

$ traceroute -I 10.1.2.3
traceroute to 10.1.2.3 (10.1.2.3), 30 hops max, 60 byte packets
 1  * * *
 2  10.5.6.7 (10.5.6.7)  0.593 ms  0.793 ms  0.988 ms
 3  10.8.9.10 (10.8.9.10)  1.247 ms  1.547 ms  1.881 ms
 4  10.11.12.13 (10.11.12.13)  1.032 ms  1.333 ms  1.679 ms
send: No route to host

Whoops, traceroute is getting a bogus error back when it sends a
packet, causing it to give up.  The real trace should be longer.
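
For a sender in traceroute's position, one possible userspace
workaround is to treat a failed send as potentially stale and try
again. The sketch below is only an illustration under that assumption,
not something traceroute is known to do; drain_error_queue and
send_with_retry are hypothetical helper names.

```python
import socket


def drain_error_queue(sock, limit=64):
    """Read and discard queued errors; return how many were drained."""
    drained = 0
    while drained < limit:
        try:
            # MSG_ERRQUEUE reads do not block; an empty queue raises EAGAIN.
            sock.recvmsg(512, 512, socket.MSG_ERRQUEUE)
        except OSError:
            break
        drained += 1
    return drained


def send_with_retry(sock, data, dest, retries=1):
    """Hypothetical helper: resend when sendto() fails, up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return sock.sendto(data, dest)
        except OSError:
            if attempt == retries:
                raise  # still failing, so report it to the caller
            drain_error_queue(sock)
```

As far as I understand it, the failed send itself consumes the pending
error, so the retry normally goes through unless another ICMP error
arrived in the meantime. The obvious cost is that a genuine,
synchronous failure is only reported after the extra attempt.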

So I'm wondering if maybe this behavior should be seen as a bug to be
fixed and not a weird old API that needs to be preserved.

--Andy