Yunsheng Lin <linyunsheng@xxxxxxxxxx> writes: > On 2025/3/10 17:13, Toke Høiland-Jørgensen wrote: > > ... > >> >>> Also, we might need a similar locking or synchronisation for the dma >>> sync API in order to skip the dma sync API when page_pool_destroy() is >>> called too. >> >> Good point, but that seems a separate issue? And simpler to solve (just > > If I understand the comment from DMA experts correctly, the dma_sync API > should not be allowed to be called after the dma_unmap API. Ah, right, I see what you mean; will add a check for this. >> set pool->dma_sync to false when destroying?). > > Without locking or synchronisation, there is some small window between > pool->dma_sync checking and dma sync API calling, during which the driver > might have already unbound. > >> >>>> To avoid having to walk the entire xarray on unmap to find the page >>>> reference, we stash the ID assigned by xa_alloc() into the page >>>> structure itself, in the field previously called '_pp_mapping_pad' in >>>> the page_pool struct inside struct page. This field overlaps with the >>>> page->mapping pointer, which may turn out to be problematic, so an >>>> alternative is probably needed. Sticking the ID into some of the upper >>>> bits of page->pp_magic may work as an alternative, but that requires >>>> further investigation. Using the 'mapping' field works well enough as >>>> a demonstration for this RFC, though. >>> page->pp_magic seems to only have 16 bits space available at most when >>> trying to reuse some unused bits in page->pp_magic, as BPF_PTR_POISON >>> and STACK_DEPOT_POISON seems to already use 16 bits, so: >>> 1. For 32 bits system, it seems there is only 16 bits left even if the >>> ILLEGAL_POINTER_VALUE is defined as 0x0. >>> 2. For 64 bits system, it seems there is only 12 bits left for powerpc >>> as ILLEGAL_POINTER_VALUE is defined as 0x5deadbeef0000000, see >>> arch/powerpc/Kconfig. >>> >>> So it seems we might need to limit the dma mapping count to 4096 or >>> 65536? >>> >>> And to be honest, I am not sure if those 'unused' 12/16 bits can really >>> be reused or not, I guess we might need suggestion from mm and arch >>> experts here. >> >> Why do we need to care about BPF_PTR_POISON and STACK_DEPOT_POISON? >> AFAICT, we only need to make sure we preserve the PP_SIGNATURE value. >> See v2 of the RFC patch, the bit arithmetic there gives me: >> >> - 24 bits on 32-bit architectures >> - 21 bits on PPC64 (because of the definition of ILLEGAL_POINTER_VALUE) >> - 32 bits on other 64-bit architectures >> >> Which seems to be plenty? > > I am really doubtful it is that simple, but we always can hear from the > experts if it isn't:) Do you have any specific reason to think so? :) >>>> Since all the tracking is performed on DMA map/unmap, no additional code >>>> is needed in the fast path, meaning the performance overhead of this >>>> tracking is negligible. The extra memory needed to track the pages is >>>> neatly encapsulated inside xarray, which uses the 'struct xa_node' >>>> structure to track items. This structure is 576 bytes long, with slots >>>> for 64 items, meaning that a full node occurs only 9 bytes of overhead >>>> per slot it tracks (in practice, it probably won't be this efficient, >>>> but in any case it should be an acceptable overhead). >>> >>> Even if items is stored sequentially in xa_node at first, is it possible >>> that there may be fragmentation in those xa_node when pages are released >>> and allocated many times during packet processing? If yes, is there any >>> fragmentation info about those xa_node? >> >> Some (that's what I mean with "not as efficient"). AFAICT, xa_array does >> do some rebalancing of the underlying radix tree, freeing nodes when >> they are no longer used. However, I am not too familiar with the xarray >> code, so I don't know exactly how efficient this is in practice. > > I guess that is one of the disadvantages that an advanced struct like > Xarray is used:( Sure, there will be some overhead from using xarray, but I think the simplicity makes up for it; especially since we can limit this to the cases where it's absolutely needed. -Toke