On Fri, May 31, 2024 at 4:25 PM Peter Xu <peterx@xxxxxxxxxx> wrote: > > On Fri, May 31, 2024 at 07:46:41PM +0200, David Hildenbrand wrote: > > try_grab_folio()->try_get_folio()->folio_ref_try_add_rcu() > > > > Is called (mm-unstable) from: > > > > (1) gup_fast function, here IRQs are disable > > (2) gup_hugepte(), possibly problematic > > (3) memfd_pin_folios(), possibly problematic > > (4) __get_user_pages(), likely problematic > > > > (1) should be fine. > > > > (2) is possibly problematic on the !fast path. If so, due to commit > > a12083d721d7 ("mm/gup: handle hugepd for follow_page()") ? CCing Peter. > > > > (3) is possibly wrong. CCing Vivek. > > > > (4) is what we hit here > > I guess it was overlooked because try_grab_folio() didn't have any comment > or implication on RCU or IRQ internal helpers being used, hence a bit > confusing. E.g. it has different context requirement on try_grab_page(), > even though they look like sister functions. It might be helpful to have a > better name, something like try_grab_folio_rcu() in this case. > > Btw, none of above cases (2-4) have real bug, but we're looking at some way > to avoid triggering the sanity check, am I right? I hope besides the host > splash I didn't overlook any other side effect this issue would cause, and > the splash IIUC should so far be benign, as either gup slow (2,4) or the > newly added memfd_pin_folios() (3) look like to have the refcount stablized > anyway. > > Yang's patch in the other email looks sane to me, just that then we'll add > quite some code just to avoid this sanity check in paths 2-4 which seems > like an slight overkill. > > One thing I'm thinking is whether folio_ref_try_add_rcu() can get rid of > its RCU limitation. It boils down to whether we can use atomic_add_unless() > on TINY_RCU / UP setup too? I mean, we do plenty of similar things > (get_page_unless_zero, etc.) in generic code and I don't understand why > here we need to treat folio_ref_try_add_rcu() specially. > > IOW, the assertions here we added: > > VM_BUG_ON(!in_atomic() && !irqs_disabled()); > > Is because we need atomicity of below sequences: > > VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio); > folio_ref_add(folio, count); > > But atomic ops avoids it. Yeah, I didn't think of why atomic can't do it either. But is it written in this way because we want to catch the refcount == 0 case since it means a severe bug? Did we miss something? > > This chunk of code was (mostly) originally added in 2008 in commit > e286781d5f2e ("mm: speculative page references"). > > In short, I'm wondering whether something like below would make sense and > easier: > > ===8<=== > diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h > index 1acf5bac7f50..c89a67d239d1 100644 > --- a/include/linux/page_ref.h > +++ b/include/linux/page_ref.h > @@ -258,26 +258,9 @@ static inline bool folio_try_get(struct folio *folio) > return folio_ref_add_unless(folio, 1, 0); > } > > -static inline bool folio_ref_try_add_rcu(struct folio *folio, int count) > -{ > -#ifdef CONFIG_TINY_RCU > - /* > - * The caller guarantees the folio will not be freed from interrupt > - * context, so (on !SMP) we only need preemption to be disabled > - * and TINY_RCU does that for us. > - */ > -# ifdef CONFIG_PREEMPT_COUNT > - VM_BUG_ON(!in_atomic() && !irqs_disabled()); > -# endif > - VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio); > - folio_ref_add(folio, count); > -#else > - if (unlikely(!folio_ref_add_unless(folio, count, 0))) { > - /* Either the folio has been freed, or will be freed. */ > - return false; > - } > -#endif > - return true; > +static inline bool folio_ref_try_add(struct folio *folio, int count) > +{ > + return folio_ref_add_unless(folio, count, 0); > } > > /** > @@ -305,7 +288,7 @@ static inline bool folio_ref_try_add_rcu(struct folio *folio, int count) > */ > static inline bool folio_try_get_rcu(struct folio *folio) > { > - return folio_ref_try_add_rcu(folio, 1); > + return folio_ref_try_add(folio, 1); > } > > static inline int page_ref_freeze(struct page *page, int count) > diff --git a/mm/gup.c b/mm/gup.c > index e17466fd62bb..17f89e8d31f1 100644 > --- a/mm/gup.c > +++ b/mm/gup.c > @@ -78,7 +78,7 @@ static inline struct folio *try_get_folio(struct page *page, int refs) > folio = page_folio(page); > if (WARN_ON_ONCE(folio_ref_count(folio) < 0)) > return NULL; > - if (unlikely(!folio_ref_try_add_rcu(folio, refs))) > + if (unlikely(!folio_ref_try_add(folio, refs))) > return NULL; > > /* > ===8<=== > > So instead of adding new code, we fix it by removing some. There might be > some implication on TINY_RCU / UP config on using the atomic_add_unless() > to replace one atomic_add(), but I'm not sure whether that's a major issue. > > The benefit is try_get_folio() then don't need a renaming then, because the > rcu implication just went away. > > Thanks, > > -- > Peter Xu >