On Tue, Oct 22, 2024 at 11:22:13AM +0800, Yunsheng Lin wrote:
> Networking drivers with page_pool support may hand over pages
> still holding a DMA mapping to the network stack, and try to
> reuse those pages after the network stack is done with them and
> passes them back to the page_pool, to avoid the penalty of DMA
> mapping/unmapping. With all the caching in the network stack,
> some pages may be held in the network stack without returning
> to the page_pool soon enough, and with VF disable causing the
> driver to unbind, the page_pool does not stop the driver from
> doing its unbinding work; instead the page_pool uses a workqueue
> to periodically check whether any pages have come back from the
> network stack, and if so it does the DMA-unmapping-related
> cleanup work.
>
> As mentioned in [1], attempting DMA unmaps after the driver
> has already unbound may leak resources or at worst corrupt
> memory. Fundamentally, the page pool code cannot allow DMA
> mappings to outlive the driver they belong to.
>
> Currently there seem to be at least two cases where a page is
> not released fast enough, causing the DMA unmapping to be done
> after the driver has already unbound:
> 1. IPv4 packet defragmentation timeout: this seems to cause
>    delays of up to 30 secs.
> 2. skb_defer_free_flush(): this may cause an infinite delay if
>    there is nothing to trigger net_rx_action().
>
> In order not to do the DMA unmapping after the driver has
> already unbound, and not to stall the unloading of the
> networking driver, add the pool->items array to record all the
> pages, including the ones handed over to the network stack, so
> the page_pool can do the DMA unmapping for those pages when
> page_pool_destroy() is called. As pool->items needs to be large
> enough to avoid performance degradation, add an 'item_full' stat
> to indicate allocation failures due to unavailability of
> pool->items.
>
> Note, the devmem patchset seems to make the bug harder to fix,
> and may make backporting harder too.
> As there is no actual user for devmem and the fix for the
> devmem case is unclear for now, this patch does not consider
> fixing it yet.
>
> 1. https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@xxxxxxxxxx/T/
>
> Fixes: f71fec47c2df ("page_pool: make sure struct device is stable")
> Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> Tested-by: Yonglong Liu <liuyonglong@xxxxxxxxxx>
> CC: Robin Murphy <robin.murphy@xxxxxxx>
> CC: Alexander Duyck <alexander.duyck@xxxxxxxxx>
> CC: IOMMU <iommu@xxxxxxxxxxxxxxx>

...

> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index c022c410abe3..194006d2930f 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -102,6 +102,7 @@ struct page_pool_params {
>   * @refill: an allocation which triggered a refill of the cache
>   * @waive: pages obtained from the ptr ring that cannot be added to
>   *         the cache due to a NUMA mismatch
> + * @item_full items array is full

No need to block progress because of this, but the correct syntax is:

 * @item_full: ...

Flagged by ./scripts/kernel-doc -none ...
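For reference, with that colon added the quoted hunk would read (surrounding
context reconstructed from the diff above, so the exact struct name and
indentation may differ from the tree):

```c
/**
 * struct page_pool_alloc_stats - allocation statistics
 * @refill:	an allocation which triggered a refill of the cache
 * @waive:	pages obtained from the ptr ring that cannot be added to
 *		the cache due to a NUMA mismatch
 * @item_full:	items array is full
 */
```

kernel-doc only recognizes member documentation when the `@name` is followed
by a colon, which is why ./scripts/kernel-doc -none flags the hunk as posted.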