On 2/3/25 11:39 PM, Mina Almasry wrote:
> Augment dmabuf binding to be able to handle TX. Additional to all the RX
> binding, we also create tx_vec needed for the TX path.
>
> Provide API for sendmsg to be able to send dmabufs bound to this device:
>
> - Provide a new dmabuf_tx_cmsg which includes the dmabuf to send from.
> - MSG_ZEROCOPY with SCM_DEVMEM_DMABUF cmsg indicates send from dma-buf.
>
> Devmem is uncopyable, so piggyback off the existing MSG_ZEROCOPY
> implementation, while disabling instances where MSG_ZEROCOPY falls back
> to copying.
>
> We additionally pipe the binding down to the new
> zerocopy_fill_skb_from_devmem which fills a TX skb with net_iov netmems
> instead of the traditional page netmems.
>
> We also special case skb_frag_dma_map to return the dma-address of these
> dmabuf net_iovs instead of attempting to map pages.
>
> Based on work by Stanislav Fomichev <sdf@xxxxxxxxxxx>. A lot of the meat
> of the implementation came from devmem TCP RFC v1[1], which included the
> TX path, but Stan did all the rebasing on top of netmem/net_iov.
>
> Cc: Stanislav Fomichev <sdf@xxxxxxxxxxx>
> Signed-off-by: Kaiyuan Zhang <kaiyuanz@xxxxxxxxxx>
> Signed-off-by: Mina Almasry <almasrymina@xxxxxxxxxx>

Very minor nit: you unexpectedly left a lot of empty lines after the SoB.

[...]
> @@ -240,13 +249,23 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd,
>  	 * binding can be much more flexible than that. We may be able to
>  	 * allocate MTU sized chunks here. Leave that for future work...
>  	 */
> -	binding->chunk_pool =
> -		gen_pool_create(PAGE_SHIFT, dev_to_node(&dev->dev));
> +	binding->chunk_pool = gen_pool_create(PAGE_SHIFT,
> +					      dev_to_node(&dev->dev));
>  	if (!binding->chunk_pool) {
>  		err = -ENOMEM;
>  		goto err_unmap;
>  	}
>
> +	if (direction == DMA_TO_DEVICE) {
> +		binding->tx_vec = kvmalloc_array(dmabuf->size / PAGE_SIZE,
> +						 sizeof(struct net_iov *),
> +						 GFP_KERNEL);
> +		if (!binding->tx_vec) {
> +			err = -ENOMEM;
> +			goto err_free_chunks;

It looks like the later error paths (in the for_each_sgtable_dma_sg()
loop) can also be taken when 'direction == DMA_TO_DEVICE', so I guess an
additional error label is needed to free tx_vec on such paths.

/P
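
To illustrate the unwinding order suggested above, here is a minimal
userspace sketch, not the actual patch. Everything in it is a hypothetical
stand-in: struct fake_binding, fake_bind(), the demo_alloc()/demo_free()
helpers, and the err_tx_vec label name. The fail_in_loop flag simulates an
error hit inside the for_each_sgtable_dma_sg() loop, after tx_vec has
already been allocated.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the binding; not the kernel struct. */
struct fake_binding {
	void *chunk_pool;
	void **tx_vec;
};

/* Number of outstanding allocations, so leaks are easy to observe. */
static int live_allocs;

static void *demo_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

/* Accepts NULL, so the RX case (no tx_vec) can share the same label. */
static void demo_free(void *p)
{
	if (!p)
		return;
	assert(live_allocs > 0);
	free(p);
	live_allocs--;
}

/*
 * is_tx models 'direction == DMA_TO_DEVICE'. fail_in_loop models a
 * failure in the later per-sg loop, i.e. after tx_vec exists.
 */
static int fake_bind(struct fake_binding *b, int is_tx, size_t npages,
		     int fail_in_loop)
{
	int err;

	b->tx_vec = NULL;
	b->chunk_pool = demo_alloc(64);
	if (!b->chunk_pool)
		return -ENOMEM;

	if (is_tx) {
		b->tx_vec = demo_alloc(npages * sizeof(void *));
		if (!b->tx_vec) {
			err = -ENOMEM;
			goto err_free_chunks;
		}
	}

	if (fail_in_loop) {
		err = -ENOMEM;
		goto err_tx_vec;	/* later failures must also drop tx_vec */
	}

	return 0;

err_tx_vec:
	demo_free(b->tx_vec);
	b->tx_vec = NULL;
err_free_chunks:
	demo_free(b->chunk_pool);
	b->chunk_pool = NULL;
	return err;
}
```

Calling fake_bind(&b, 1, 4, 1) takes the simulated loop failure and leaves
live_allocs at zero. If that path jumped straight to err_free_chunks
instead, the tx_vec allocation would stay outstanding. Since the extra
label frees a pointer that is NULL in the RX case, the loop's error path
would not need to check the direction, assuming the real free helper
tolerates NULL the way demo_free() does here.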