On Fri, Nov 1, 2024 at 10:41 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
> ...
> >> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> >> index e928efc22f80..31e01da61c12 100644
> >> --- a/net/ipv4/tcp.c
> >> +++ b/net/ipv4/tcp.c
> >> @@ -277,6 +277,7 @@
> >>  #include <net/ip.h>
> >>  #include <net/sock.h>
> >>  #include <net/rstreason.h>
> >> +#include <net/page_pool/types.h>
> >>
> >>  #include <linux/uaccess.h>
> >>  #include <asm/ioctls.h>
> >> @@ -2476,6 +2477,11 @@ static int tcp_recvmsg_dmabuf(struct sock *sk, const struct sk_buff *skb,
> >>  		}
> >>
> >>  		niov = skb_frag_net_iov(frag);
> >> +		if (net_is_devmem_page_pool_ops(niov->pp->mp_ops)) {
> >> +			err = -ENODEV;
> >> +			goto out;
> >> +		}
> >> +
> >
> > I think this check needs to go in the caller. Currently the caller
> > assumes that if !skb_frags_readable(), then the frag is dma-buf, and
>
> io_uring originated netmem that are marked unreadable as well
> and so will end up in tcp_recvmsg_dmabuf(), then we reject and
> fail since they should not be fed to devmem TCP. It should be
> fine from correctness perspective.
>
> We need to check frags, and that's the place where we iterate
> frags. Another option is to add a loop in tcp_recvmsg_locked
> walking over all frags of an skb and doing the checks, but
> that's an unnecessary performance burden to devmem.
>

Checking each frag in tcp_recvmsg_dmabuf (and the equivalent io_uring
function) is not ideal really. Especially when you're dereferencing
niov->pp to do the check, which IIUC will pull a cache line not
normally needed in this code path and may have a performance impact.

We currently have a check in __skb_fill_netmem_desc() that makes sure
all frags added to an skb are pages or dmabuf. I think we need to
improve it to make sure all frags added to an skb are of the same type
(pages, dmabuf, io_uring). Then the caller can check the type once per
skb, sending it to skb_copy_datagram_msg or tcp_recvmsg_dmabuf or
error.

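To make the idea concrete, here is a rough userspace sketch, not kernel
code and not a proposed patch. Every name in it (toy_skb, frag_type,
toy_skb_add_frag, toy_recv_dispatch, recv_path) is hypothetical; it
only models "one type per skb, decided at fill time, dispatched once by
the caller" and assumes a per-skb u8 type tag is acceptable:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frag types; none of this is an existing kernel API. */
enum frag_type {
	FRAG_NONE = 0,		/* skb has no frags yet */
	FRAG_PAGES,
	FRAG_DMABUF,
	FRAG_IO_URING,
};

#define TOY_MAX_FRAGS 17

/* Toy stand-in for struct sk_buff, just enough for the sketch. */
struct toy_skb {
	uint8_t frag_type;	/* would replace a one-bit unreadable flag */
	uint8_t nr_frags;
	const void *frags[TOY_MAX_FRAGS];
};

/* Which receive path the caller would take for an skb. */
enum recv_path {
	RECV_COPY,		/* e.g. skb_copy_datagram_msg() */
	RECV_DMABUF,		/* e.g. tcp_recvmsg_dmabuf() */
};

/*
 * Add a frag, refusing to mix types within one skb. This models the
 * stricter check suggested for the fill path.
 */
int toy_skb_add_frag(struct toy_skb *skb, enum frag_type type,
		     const void *mem)
{
	assert(skb != NULL);

	if (type == FRAG_NONE)
		return -EINVAL;
	if (skb->nr_frags >= TOY_MAX_FRAGS)
		return -ENOSPC;
	if (skb->frag_type != FRAG_NONE && skb->frag_type != type)
		return -EINVAL;

	skb->frag_type = (uint8_t)type;
	skb->frags[skb->nr_frags++] = mem;
	return 0;
}

/*
 * Pick the receive path once per skb, reading only the skb itself and
 * no per-frag state. Returns 0 and sets *path, or a negative errno.
 */
int toy_recv_dispatch(const struct toy_skb *skb, enum recv_path *path)
{
	assert(skb != NULL && path != NULL);

	switch (skb->frag_type) {
	case FRAG_NONE:
	case FRAG_PAGES:
		*path = RECV_COPY;
		return 0;
	case FRAG_DMABUF:
		*path = RECV_DMABUF;
		return 0;
	default:
		/* io_uring memory must not be fed to devmem TCP */
		return -ENODEV;
	}
}
```

The point of the sketch is only that the dispatch reads a field that is
already hot with the skb, so nothing like ->pp has to be touched per
frag on the receive path.
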
I'm also not sure dereferencing ->pp to check the frag type is ever OK
in such a fast path, when ->pp is not usually needed until the skb is
freed. You may have to add a flag to the niov to indicate what type it
is, or change the skb->unreadable flag to a u8 that determines if it's
pages/io_uring/dmabuf.

--
Thanks,
Mina