Jesper Dangaard Brouer <jbrouer@xxxxxxxxxx> writes: > On 27/04/2023 22.53, Toke Høiland-Jørgensen wrote: >>> +noinline >>> static void page_pool_empty_ring(struct page_pool *pool) >>> { >>> struct page *page; >>> @@ -796,39 +828,29 @@ static void page_pool_scrub(struct page_pool *pool) >>> page_pool_empty_ring(pool); >>> } >> So this is not in the diff context, but page_pool_empty_ring() does >> this: >> >> static void page_pool_empty_ring(struct page_pool *pool) >> { >> struct page *page; >> >> /* Empty recycle ring */ >> while ((page = ptr_ring_consume_bh(&pool->ring))) { >> /* Verify the refcnt invariant of cached pages */ >> if (!(page_ref_count(page) == 1)) >> pr_crit("%s() page_pool refcnt %d violation\n", >> __func__, page_ref_count(page)); >> >> page_pool_return_page(pool, page); >> } >> } >> >> ...and with this patch, that page_pool_return_page() call will now free >> the pool memory entirely when the last page is returned. When it does >> this, the condition in the while loop will still execute afterwards; it >> would return false, but if the pool was freed, it's now referencing >> freed memory when trying to read from pool->ring. > > Yes, that sounds like a problem. > >> So I think page_pool_empty_ring needs to either pull out all the pages >> in the ring to an on-stack buffer before calling page_pool_return_page() >> on them, or there needs to be some other way to break the loop early. > > Let me address this one first, I'll get back to the other in another > reply. The usual/idiom way of doing this is to have a next pointer that > is populated inside the loop before freeing the object. > It should look like this (only compile tested): > > static void page_pool_empty_ring(struct page_pool *pool) > { > struct page *page, *next; > > next = ptr_ring_consume_bh(&pool->ring); > > /* Empty recycle ring */ > while (next) { > page = next; > next = ptr_ring_consume_bh(&pool->ring); > > /* Verify the refcnt invariant of cached pages */ > if (!(page_ref_count(page) == 1)) > pr_crit("%s() page_pool refcnt %d violation\n", > __func__, page_ref_count(page)); > > page_pool_return_page(pool, page); > } > } Yup, that works! >> There are a couple of other places where page_pool_return_page() is >> called in a loop where the loop variable lives inside struct page_pool, >> so we need to be absolutely sure they will never be called in the >> shutdown stage, or they'll have to be fixed as well. > > The other loops are okay, but I spotted another problem in > __page_pool_put_page() in "Fallback/non-XDP mode", but that is fixable. Alright, great! -Toke