On 13.01.21 09:52, David Hildenbrand wrote: > On 13.01.21 04:31, Linus Torvalds wrote: >> On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote: >>> >>> The thing about the speculative page cache references is that they can >>> temporarily bump a refcount on a page which _used_ to be in the page >>> cache and has now been reallocated as some other kind of page. >> >> Oh, and thinking about this made me think we might actually have a >> serious bug here, and it has nothing what-so-ever to do with COW, GUP, >> or even the page count itself. >> >> It's unlikely enough that I think it's mostly theoretical, but tell me >> I'm wrong. >> >> PLEASE tell me I'm wrong: >> >> CPU1 does page_cache_get_speculative under RCU lock >> >> CPU2 frees and re-uses the page >> >> CPU1 CPU2 >> ---- ---- >> >> page = xas_load(&xas); >> if (!page_cache_get_speculative(page)) >> goto repeat; >> .. succeeds .. >> >> remove page from XA >> release page >> reuse for something else >> >> .. and then re-check .. >> if (unlikely(page != xas_reload(&xas))) { >> put_page(page); >> goto repeat; >> } >> >> ok, the above all looks fine. We got the speculative ref, but then we >> noticed that its' not valid any more, so we put it again. All good, >> right? >> >> Wrong. >> >> What if that "reuse for something else" was actually really quick, and >> both allocated and released it? >> >> That still sounds good, right? Yes, now the "put_page()" will be the >> one that _actually_ releases the page, but we're still fine, right? >> >> Very very wrong. >> >> The "reuse for something else" on CPU2 might have gotten not an >> order-0 page, but a *high-order* page. So it allocated (and then >> immediately free'd) maybe an order-2 allocation with _four_ pages, and >> the re-use happened when we had coalesced the buddy pages. >> >> But when we release the page on CPU1, we will release just _one_ page, >> and the other three pages will be lost forever. >> >> IOW, we restored the page count perfectly fine, but we screwed up the >> page sizes and buddy information. >> >> Ok, so the above is so unlikely from a timing standpoint that I don't >> think it ever happens, but I don't see why it couldn't happen in >> theory. >> >> Please somebody tell me I'm missing some clever thing we do to make >> sure this can actually not happen.. > > Wasn't that tackled by latest (not merged AFAIKs) __free_pages() changes? > > I'm only able to come up with the doc update, not with the oroginal > fix/change > > https://lkml.kernel.org/r/20201027025523.3235-1-willy@xxxxxxxxxxxxx > Sorry, found it, it's in v5.10 commit e320d3012d25b1fb5f3df4edb7bd44a1c362ec10 Author: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> Date: Tue Oct 13 16:56:04 2020 -0700 mm/page_alloc.c: fix freeing non-compound pages and commit 7f194fbb2dd75e9346b305b8902e177b423b1062 Author: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> Date: Mon Dec 14 19:11:09 2020 -0800 mm/page_alloc: add __free_pages() documentation is in v5.11-rc1 -- Thanks, David / dhildenb