On Nov 07 10:48, Yang Shi wrote:
> On Sun, Nov 6, 2022 at 11:55 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Fri 04-11-22 13:52:52, Yang Shi wrote:
> > > On Fri, Nov 4, 2022 at 12:51 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > >
> > > > On Fri 04-11-22 10:42:45, Yang Shi wrote:
> > > > > On Fri, Nov 4, 2022 at 2:56 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > > >
> > > > > > On Fri 04-11-22 10:35:21, Michal Hocko wrote:
> > > > > > [...]
> > > > > > > diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> > > > > > > index ef4aea3b356e..308daafc4871 100644
> > > > > > > --- a/include/linux/gfp.h
> > > > > > > +++ b/include/linux/gfp.h
> > > > > > > @@ -227,7 +227,10 @@ static inline
> > > > > > >  struct folio *__folio_alloc_node(gfp_t gfp, unsigned int order, int nid)
> > > > > > >  {
> > > > > > >       VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> > > > > > > -     VM_WARN_ON((gfp & __GFP_THISNODE) && !node_online(nid));
> > > > > > > +     if((gfp & __GFP_THISNODE) && !node_online(nid)) {
> > > > > >
> > > > > > or maybe even better
> > > > > > if ((gfp & (__GFP_THISNODE|__GFP_NOWARN) == __GFP_THISNODE|__GFP_NOWARN) && !node_online(nid))
> > > > > >
> > > > > > because it doesn't really make much sense to dump this information if
> > > > > > the allocation failure is going to provide sufficient (and even more
> > > > > > comprehensive) context for the failure. It looks more hairy but this can
> > > > > > be hidden in a nice little helper shared between the two callers.
> > > > >
> > > > > Thanks a lot for the suggestion, printing warning if the gfp flag
> > > > > allows sounds like a good idea to me. Will adopt it. But the check
> > > > > should look like:
> > > > >
> > > > > if ((gfp & __GFP_THISNODE) && !(gfp & __GFP_NOWARN) && !node_online(nid))
> > > >
> > > > The idea was to warn if __GFP_NOWARN _was_ specified. Otherwise we will
> > > > get an allocation failure splat from the page allocator and there it
> > > > will be clear that the node doesn't have any memory associated. It is
> > > > exactly __GFP_NOWARN case that would be a silent failure and potentially
> > > > a buggy code (like this THP collapse path). See my point?
> > >
> > > Aha, yeah, see your point now. I didn't see the splat from the
> > > allocator from the bug report, then I realized it had not called into
> > > allocator yet before the warning was triggered.
> >
> > And it would trigger even if it did because GFP_TRANSHUGE has
> > __GFP_NOWARN
>
> Yeah, the syzbot has panic on warn set, so kernel just panicked before
> entering the allocator.
>

Sorry I'm late to the party here. I think Michal's suggestion is sound --
catches instances like we saw with MADV_COLLAPSE, but no risk of
panic-on-warn. Thanks for the suggestion.

Best,
Zach

> > --
> > Michal Hocko
> > SUSE Labs
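
For reference, the check discussed in the thread (complain only when both __GFP_THISNODE and __GFP_NOWARN are set and the node is offline) could be hidden in a small helper along the lines below. This is only a user-space sketch, not the patch that was merged: the SKETCH_* flag values and the helper name sketch_warn_offline_thisnode() are hypothetical stand-ins, and node_online() is replaced by a plain bool. One thing to watch when spelling this out in C: == binds tighter than & and |, so the one-liner quoted in the thread reads as shorthand and the mask comparison needs explicit parentheses.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel's gfp bits. The real definitions
 * live in the kernel headers and have different values.
 */
#define SKETCH_GFP_THISNODE 0x1u
#define SKETCH_GFP_NOWARN   0x2u

/*
 * Hypothetical helper: returns true when the caller asked for a specific
 * node AND suppressed allocator warnings, and that node is offline. In that
 * case the page allocator would fail silently, so this is the only place
 * the problem can be reported.
 */
static bool sketch_warn_offline_thisnode(unsigned int gfp, bool node_is_online)
{
	const unsigned int mask = SKETCH_GFP_THISNODE | SKETCH_GFP_NOWARN;

	/* Parenthesize the mask test: == has higher precedence than & and |. */
	return ((gfp & mask) == mask) && !node_is_online;
}
```

With only __GFP_THISNODE set, the helper stays quiet and leaves the reporting to the allocation failure splat, which matches the reasoning given in the thread.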