Re: The sk_err mechanism is infuriating in userspace

On Mon, 2024-02-05 at 15:03 -0800, Andy Lutomirski wrote:
> Hi all-
> 
> I encounter this issue every couple of years, and it still seems to be
> an issue, and it drives me nuts every time I see it.
> 
> I write software that uses unconnected datagram-style sockets.  Errors
> happen for all kinds of reasons, and my software knows it.  My
> software even handles the errors and moves on with its life.  I use
> MSG_ERRQUEUE to understand the errors.  But the kernel fights back:
> 
> struct sk_buff *__skb_try_recv_datagram(struct sock *sk,
>                                         struct sk_buff_head *queue,
>                                         unsigned int flags, int *off, int *err,
>                                         struct sk_buff **last)
> {
>         struct sk_buff *skb;
>         unsigned long cpu_flags;
>         /*
>          * Caller is allowed not to check sk->sk_err before skb_recv_datagram()
>          */
>         int error = sock_error(sk);
> 
>         if (error)
>                 goto no_packet;
>         ^^^^^^^^^^ <----- EXCUSE ME?
> 
> The kernel even fights back on the *send* path?!?
> 
> static long sock_wait_for_wmem(struct sock *sk, long timeo)
> {
>         DEFINE_WAIT(wait);
> 
>         sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
>         for (;;) {
>                 if (!timeo)
>                         break;
>                 if (signal_pending(current))
>                         break;
>                 set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
>                 ...
>                 if (READ_ONCE(sk->sk_err))
>                         break;  <-- KERNEL HATES UNCONNECTED SOCKETS!
> 
> This is IMO just broken.  I realize it's legacy behavior, but it's
> BROKEN legacy behavior. 

As you noted, this is established behaviour exposed to user space, and
we can't simply change it, regardless of its own (eventual lack of)
merit.

>  sk_err does not (at least for an unconnected
> socket) indicate that anything is wrong with the socket. 

What about 'destination/port unreachable' and the many other similar
errors reported via sk_err? Which specific errors reported via sk_err do
not indicate that anything is wrong with the socket?

I guess that if you really want to ignore socket errors for datagram
sockets at recvmsg()/sendmsg() time, you could implement a new socket
option to conditionally enable such behaviour on a per-socket basis.
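
Until such an option exists, user space can at least consume a pending
error explicitly: reading SO_ERROR returns the pending socket error and
clears it, as documented in socket(7). A minimal sketch, assuming a
plain datagram socket; the helper name take_pending_error() is made up
for illustration and is not an existing API:

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fetch and clear the pending socket error.
 * Reading SO_ERROR returns the pending error and resets it.
 *
 * Returns the pending errno value (0 if none was pending),
 * or -1 if getsockopt() itself failed (errno is set by getsockopt).
 */
static int take_pending_error(int fd)
{
        int err = 0;
        socklen_t len = sizeof(err);

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
                return -1;
        return err;
}
```

This is inherently racy (a new error can arrive right after the
getsockopt() call), so it only narrows the window; the caller still has
to cope with recvmsg()/sendmsg() failing and retry.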

Cheers,

Paolo





