On Mon, 24 Oct 2022, Vlastimil Babka wrote: > On 10/3/22 19:00, Matthew Wilcox wrote: > > On Sun, Oct 02, 2022 at 02:48:02PM +0900, Hyeonggon Yoo wrote: > >> Just one more thing, rcu_leak_callback too. RCU seem to use it > >> internally to catch double call_rcu(). > >> > >> And some suggestions: > >> - what about adding runtime WARN() on slab init code to catch > >> unexpected arch/toolchain issues? > >> - instead of 4, we may use macro definition? like (PAGE_MAPPING_FLAGS + 1)? > > > > I think the real problem here is that isolate_movable_page() is > > insufficiently paranoid. Looking at the gyrations that GUP and the > > page cache do to convince themselves that the page they got really is > > the page they wanted, there are a few missing pieces (eg checking that > > you actually got a refcount on _this_ page and not some random other > > page you were temporarily part of a compound page with). > > > > This patch does three things: > > > > - Turns one of the comments into English. There are some others > > which I'm still scratching my head over. > > - Uses a folio to help distinguish which operations are being done > > to the head vs the specific page (this is somewhat an abuse of the > > folio concept, but it's acceptable) > > - Add the aforementioned check that we're actually operating on the > > page that we think we want to be. > > - Add a check that the folio isn't secretly a slab. > > > > We could put the slab check in PageMapping and call it after taking > > the folio lock, but that seems pointless. It's the acquisition of > > the refcount which stabilises the slab flag, not holding the lock. > > > > I would like to have a working safe version in -next, even if we are able > simplify it later thanks to frozen refcounts. I've made a formal patch of > yours, but I'm still convinced the slab check needs to be more paranoid so > it can't observe a false positive __folio_test_movable() while missing the > folio_test_slab(), hence I added the barriers as in my previous attempt [1]. > Does that work for you and can I add your S-o-b? > > [1] https://lore.kernel.org/all/aec59f53-0e53-1736-5932-25407125d4d4@xxxxxxx/ Ignore me, don't let me distract if you're happy with Matthew's patch (I know little of PageMovable, and I haven't tried to understand it); but it did look to me more like 6.2 material, and I was surprised that you dropped the simple align(4) approach for 6.1. Because of Hyeonggon's rcu_leak_callback() observation? That was a good catch, but turned out to be irrelevant, because it was only for an RCU debugging option, which would never be set up on a struct page (well, maybe it would in a dynamically-allocated-struct-page future). Hugh