On 2024-10-04 03:34, Taehee Yoo wrote: > On Fri, Oct 4, 2024 at 3:43 AM Mina Almasry <almasrymina@xxxxxxxxxx> wrote: >>> @@ -3608,9 +3629,11 @@ static void bnxt_free_rx_rings(struct bnxt *bp) >>> >>> static int bnxt_alloc_rx_page_pool(struct bnxt *bp, >>> struct bnxt_rx_ring_info *rxr, >>> + int queue_idx, >>> int numa_node) >>> { >>> struct page_pool_params pp = { 0 }; >>> + struct netdev_rx_queue *rxq; >>> >>> pp.pool_size = bp->rx_agg_ring_size; >>> if (BNXT_RX_PAGE_MODE(bp)) >>> @@ -3621,8 +3644,15 @@ static int bnxt_alloc_rx_page_pool(struct bnxt *bp, >>> pp.dev = &bp->pdev->dev; >>> pp.dma_dir = bp->rx_dir; >>> pp.max_len = PAGE_SIZE; >>> - pp.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV; >>> + pp.order = 0; >>> + >>> + rxq = __netif_get_rx_queue(bp->dev, queue_idx); >>> + if (rxq->mp_params.mp_priv) >>> + pp.flags = PP_FLAG_DMA_MAP | PP_FLAG_ALLOW_UNREADABLE_NETMEM; >> >> This is not the intended use of PP_FLAG_ALLOW_UNREADABLE_NETMEM. >> >> The driver should set PP_FLAG_ALLOW_UNREADABLE_NETMEM when it's able >> to handle unreadable netmem, it should not worry about whether >> rxq->mp_params.mp_priv is set or not. >> >> You should set PP_FLAG_ALLOW_UNREADABLE_NETMEM when HDS is enabled. >> Let core figure out if mp_params.mp_priv is enabled. All the driver >> needs to report is whether it's configured to be able to handle >> unreadable netmem (which practically means HDS is enabled). > > The reason why the branch exists here is the PP_FLAG_ALLOW_UNREADABLE_NETMEM > flag can't be used with PP_FLAG_DMA_SYNC_DEV. > > 228 if (pool->slow.flags & PP_FLAG_DMA_SYNC_DEV) { > 229 /* In order to request DMA-sync-for-device the page > 230 * needs to be mapped > 231 */ > 232 if (!(pool->slow.flags & PP_FLAG_DMA_MAP)) > 233 return -EINVAL; > 234 > 235 if (!pool->p.max_len) > 236 return -EINVAL; > 237 > 238 pool->dma_sync = true; //here > 239 > 240 /* pool->p.offset has to be set according to the address > 241 * offset used by the DMA engine to start copying rx data > 242 */ > 243 } > > If PP_FLAG_DMA_SYNC_DEV is set, page->dma_sync is set to true. > > 347 int mp_dmabuf_devmem_init(struct page_pool *pool) > 348 { > 349 struct net_devmem_dmabuf_binding *binding = pool->mp_priv; > 350 > 351 if (!binding) > 352 return -EINVAL; > 353 > 354 if (!pool->dma_map) > 355 return -EOPNOTSUPP; > 356 > 357 if (pool->dma_sync) //here > 358 return -EOPNOTSUPP; > 359 > 360 if (pool->p.order != 0) > 361 return -E2BIG; > 362 > 363 net_devmem_dmabuf_binding_get(binding); > 364 return 0; > 365 } > > In the mp_dmabuf_devmem_init(), it fails when pool->dma_sync is true. This won't work for io_uring zero copy into user memory. We need all PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | PP_FLAG_ALLOW_UNREADABLE_NETMEM set. I agree with Mina that the driver should not be poking at the mp_priv fields. How about setting all the flags and then letting the mp->init() figure it out? mp_dmabuf_devmem_init() is called within page_pool_init() so as long as it resets dma_sync if set I don't see any issues. > > tcp-data-split can be used for normal cases, not only devmem TCP case. > If we enable tcp-data-split and disable devmem TCP, page_pool doesn't > have PP_FLAG_DMA_SYNC_DEV. > So I think mp_params.mp_priv is still useful. > > Thanks a lot, > Taehee Yoo > >> >> >> -- >> Thanks, >> Mina