Mina Almasry <almasrymina@xxxxxxxxxx> writes: > On Sat, Mar 8, 2025 at 6:55 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote: >> >> When enabling DMA mapping in page_pool, pages are kept DMA mapped until >> they are released from the pool, to avoid the overhead of re-mapping the >> pages every time they are used. This causes problems when a device is >> torn down, because the page pool can't unmap the pages until they are >> returned to the pool. This causes resource leaks and/or crashes when >> there are pages still outstanding while the device is torn down, because >> page_pool will attempt an unmap of a non-existent DMA device on the >> subsequent page return. >> >> To fix this, implement a simple tracking of outstanding dma-mapped pages >> in page pool using an xarray. This was first suggested by Mina[0], and >> turns out to be fairly straight forward: We simply store pointers to >> pages directly in the xarray with xa_alloc() when they are first DMA >> mapped, and remove them from the array on unmap. Then, when a page pool >> is torn down, it can simply walk the xarray and unmap all pages still >> present there before returning, which also allows us to get rid of the >> get/put_device() calls in page_pool. > > THANK YOU!! I had been looking at the other proposals to fix this here > and there and I had similar feelings to you. They add lots of code > changes and the code changes themselves were hard for me to > understand. I hope we can make this simpler approach work. You're welcome :) And yeah, me too! >> Using xa_cmpxchg(), no additional >> synchronisation is needed, as a page will only ever be unmapped once. >> > > Very clever. I had been wondering how to handle the concurrency. I > also think this works. Thanks! >> To avoid having to walk the entire xarray on unmap to find the page >> reference, we stash the ID assigned by xa_alloc() into the page >> structure itself, in the field previously called '_pp_mapping_pad' in >> the page_pool struct inside struct page. This field overlaps with the >> page->mapping pointer, which may turn out to be problematic, so an >> alternative is probably needed. Sticking the ID into some of the upper >> bits of page->pp_magic may work as an alternative, but that requires >> further investigation. Using the 'mapping' field works well enough as >> a demonstration for this RFC, though. >> > > I'm unsure about this. I think page->mapping may be used when we map > the page to the userspace in TCP zerocopy, but I'm really not sure. > Yes, finding somewhere else to put the id would be ideal. Do we really > need a full unsigned long for the pp_magic? No, pp_magic was also my backup plan (see the other thread). Tried actually doing that now, and while there's a bit of complication due to the varying definitions of POISON_POINTER_DELTA across architectures, but it seems that this can be defined at compile time. I'll send a v2 RFC with this change. -Toke