On Sat, Jul 25, 2020 at 02:21:06AM +0000, Dexuan Cui wrote: > Hi, > The v4.4 stable kernel (currently it's v4.4.231) lacks this bugfix: > 327868212381 ("make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error") > , as a result, the v4.4 kernel can deliver corrupt data to the application > when a corrupt UDP packet is closely followed by a valid UDP packet: > when the same invocation of the recvmsg() syscall delivers the corrupt > packet's UDP payload to the application's receive buffer, it provides the > UDP payload length and the "from IP/Port" of the valid packet to the > application -- this mismatch makes the issue worse. > > Details: > > For a UDP packet longer than 76 bytes (see the v5.8-rc6 kernel's > include/linux/skbuff.h:3951), Linux delays the UDP checksum verification > until the application invokes the syscall recvmsg(). > > In the recvmsg() syscall handler, while Linux is copying the UDP payload > to the application's memory, it calculates the UDP checksum. If the > calculated checksum doesn't match the received checksum, Linux drops the > corrupt UDP packet, and then starts to process the next packet (if any), > and if the next packet is valid (i.e. the checksum is correct), Linux will > copy the valid UDP packet's payload to the application's receiver buffer. > > The bug is: before Linux starts to copy the valid UDP packet, the data > structure used to track how many more bytes should be copied to the > application memory is not reset to what it was when the application just > entered the kernel by the syscall! Consequently, only a small portion or > none of the valid packet's payload is copied to the application's receive > buffer, and later when the application exits from the kernel, actually > most of the application's receive buffer contains the payload of the > corrupt packet while recvmsg() returns the length of the UDP payload of > the valid packet. > > For the mainline kernel, the bug was fixed by Al Viro in the commit > 327868212381, but unluckily the bugfix is only backported to the > upstream v4.9+ kernels. I hope the bugfix can be backported to the > v4.4 stable kernel, since it's a "longterm" kernel and is still used by > some Linux distros. > > It turns out backporting 327868212381 to v4.4 means that some > Supporting patches must be backported first, so the overall changes > are pretty big... > > I made the below one-line workaround patch to force the recvmsg() syscall > handler to return to the userspace when Linux detects a corrupt UDP packet, > so the application will invoke the syscall again to receive the following valid > UDP packet (note: the patch may not work well with blocking sockets, for > which typically the application doesn't expect an error of -EAGAIN. I > guess it would be safer to return -EINTR instead?): > > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -1367,6 +1367,7 @@ csum_copy_err: > /* starting over for a new packet, but check if we need to yield */ > cond_resched(); > msg->msg_flags &= ~MSG_TRUNC; > + return -EAGAIN; > goto try_again; > } > > > Eric Dumazet made an alternative that performs the csum validation earlier: > > --- a/net/ipv4/udp.c > +++ b/net/ipv4/udp.c > @@ -1589,8 +1589,7 @@ int udp_queue_rcv_skb(struct sock *sk, struct > sk_buff *skb) > } > } > > - if (rcu_access_pointer(sk->sk_filter) && > - udp_lib_checksum_complete(skb)) > + if (udp_lib_checksum_complete(skb)) > goto csum_error; > > if (sk_rcvqueues_full(sk, sk->sk_rcvbuf)) { > > I personally like Eric's fix and IMHO we'd better have it in v4.4 rather than > trying to backport 327868212381. Does Eric's fix work with your testing? If so, great, can you turn it into something I can apply to the 4.4.y stable tree and send it to stable@xxxxxxxxxxxxxxx? thanks, greg k-h