On Mon, Oct 03, 2022 at 06:00:35PM +0100, Matthew Wilcox wrote: > On Sun, Oct 02, 2022 at 02:48:02PM +0900, Hyeonggon Yoo wrote: > > Just one more thing, rcu_leak_callback too. RCU seem to use it > > internally to catch double call_rcu(). > > > > And some suggestions: > > - what about adding runtime WARN() on slab init code to catch > > unexpected arch/toolchain issues? > > - instead of 4, we may use macro definition? like (PAGE_MAPPING_FLAGS + 1)? > > I think the real problem here is that isolate_movable_page() is > insufficiently paranoid. Looking at the gyrations that GUP and the > page cache do to convince themselves that the page they got really is > the page they wanted, there are a few missing pieces (eg checking that > you actually got a refcount on _this_ page and not some random other > page you were temporarily part of a compound page with). > > This patch does three things: > > - Turns one of the comments into English. There are some others > which I'm still scratching my head over. > - Uses a folio to help distinguish which operations are being done > to the head vs the specific page (this is somewhat an abuse of the > folio concept, but it's acceptable) > - Add the aforementioned check that we're actually operating on the > page that we think we want to be. > - Add a check that the folio isn't secretly a slab. > > We could put the slab check in PageMapping and call it after taking > the folio lock, but that seems pointless I partially agree with this patch. I actually like it. > It's the acquisition of > the refcount which stabilises the slab flag, not holding the lock. But can you please elaborate how this prevents race between allocation & initialization of a slab and isolate_movable_page()? Or maybe we can handle it with frozen folio as Vlastimil suggested? ;-) > diff --git a/mm/migrate.c b/mm/migrate.c > index 6a1597c92261..a65598308c83 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -59,6 +59,7 @@ > > int isolate_movable_page(struct page *page, isolate_mode_t mode) > { > + struct folio *folio = page_folio(page); > const struct movable_operations *mops; > > /* > @@ -70,16 +71,23 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode) > * the put_page() at the end of this block will take care of > * release this page, thus avoiding a nasty leakage. > */ > - if (unlikely(!get_page_unless_zero(page))) > + if (unlikely(!folio_try_get(folio))) > goto out; > > + /* Recheck the page is still part of the folio we just got */ > + if (unlikely(page_folio(page) != folio)) > + goto out_put; > + > /* > - * Check PageMovable before holding a PG_lock because page's owner > - * assumes anybody doesn't touch PG_lock of newly allocated page > - * so unconditionally grabbing the lock ruins page's owner side. > + * Check movable flag before taking the folio lock because > + * we use non-atomic bitops on newly allocated page flags so > + * unconditionally grabbing the lock ruins page's owner side. > */ > - if (unlikely(!__PageMovable(page))) > - goto out_putpage; > + if (unlikely(!__folio_test_movable(folio))) > + goto out_put; > + if (unlikely(folio_test_slab(folio))) > + goto out_put; > + > /* > * As movable pages are not isolated from LRU lists, concurrent > * compaction threads can race against page migration functions > @@ -91,8 +99,8 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode) > * lets be sure we have the page lock > * before proceeding with the movable page isolation steps. > */ > - if (unlikely(!trylock_page(page))) > - goto out_putpage; > + if (unlikely(!folio_trylock(folio))) > + goto out_put; I don't know much about callers that this is trying to avoid race aginst... But for this to make sense, I think *every users* that doing their stuff with sub-page of a compound page should acquire folio lock and not page lock of sub-page, right? > if (!PageMovable(page) || PageIsolated(page)) > goto out_no_isolated; > @@ -106,14 +114,14 @@ int isolate_movable_page(struct page *page, isolate_mode_t mode) > /* Driver shouldn't use PG_isolated bit of page->flags */ > WARN_ON_ONCE(PageIsolated(page)); > SetPageIsolated(page); > - unlock_page(page); > + folio_unlock(folio); > > return 0; > > out_no_isolated: > - unlock_page(page); > -out_putpage: > - put_page(page); > + folio_unlock(folio); > +out_put: > + folio_put(folio); > out: > return -EBUSY; > } -- Thanks, Hyeonggon