On 07/26/2018 10:03 AM, Michal Hocko wrote:
> On Thu 26-07-18 09:50:45, Vlastimil Babka wrote:
>> On 07/26/2018 09:42 AM, Michal Hocko wrote:
>>> On Thu 26-07-18 09:34:58, Vlastimil Babka wrote:
>>>> On 07/26/2018 09:26 AM, Michal Hocko wrote:
>>>>> On Thu 26-07-18 09:18:57, Vlastimil Babka wrote:
>>>>>> On 07/25/2018 09:52 PM, Andrew Morton wrote:
>>>>>>
>>>>>> This is likely the kvmalloc() in xt_alloc_table_info(). Between 4.13 and
>>>>>> 4.17 it shouldn't use __GFP_NORETRY, but looks like commit 0537250fdc6c
>>>>>> ("netfilter: x_tables: make allocation less aggressive") was backported
>>>>>> to 4.14. Removing __GFP_NORETRY might help here, but bring back other
>>>>>> issues. Less than 4MB is not that much though, maybe find some "sane"
>>>>>> limit and use __GFP_NORETRY only above that?
>>>>>
>>>>> I have seen the same report via http://lkml.kernel.org/r/df6f501c-8546-1f55-40b1-7e3a8f54d872@xxxxxxxxxxx
>>>>> and the reporter confirmed that kvmalloc is not a real culprit
>>>>> http://lkml.kernel.org/r/d99a9598-808a-6968-4131-c3949b752004@xxxxxxxxxxx
>>>>
>>>> Hmm but that was revert of eacd86ca3b03 ("net/netfilter/x_tables.c: use
>>>> kvmalloc() in xt_alloc_table_info()") which was the 4.13 commit that
>>>> removed __GFP_NORETRY (there's no __GFP_NORETRY under net/netfilter in
>>>> v4.14). I assume it was reverted on top of vanilla v4.14 as there would
>>>> be conflict on the stable with 0537250fdc6c backport. So what should be
>>>> tested to be sure is either vanilla v4.14 without stable backports, or
>>>> latest v4.14.y with revert of 0537250fdc6c.
>>>
>>> But 0537250fdc6c simply restored the previous NORETRY behavior from
>>> before eacd86ca3b03. So whatever causes these issues doesn't seem to be
>>> directly related to the kvmalloc change. Or do I miss what you are
>>> saying?
>>
>> I'm saying that although it's not a regression, as you say (the
>> vmalloc() there was only for a few kernel versions called without
>> __GFP_NORETRY), it's still possible that removing __GFP_NORETRY will fix
>> the issue and thus we will rule out other possibilities.
>
> http://lkml.kernel.org/r/d99a9598-808a-6968-4131-c3949b752004@xxxxxxxxxxx
> claims that reverting eacd86ca3b03 didn't really help.

Of course not. eacd86ca3b03 *removed* __GFP_NORETRY, so the revert
reintroduced it. I tried to explain it in the quoted part above starting
with "Hmm but that was revert of eacd86ca3b03 ...". What I'm saying is
that eacd86ca3b03 might have actually *fixed* (or rather prevented) this
alloc failure, if there was not 0537250fdc6c and its 4.14 stable
backport (the kernel bugzilla report says 4.14, I'm assuming new enough
stable to contain 0537250fdc6c as the failure message contains
__GFP_NORETRY).

The mail you reference also says "seems that old version is masking
errors", which confirms that we are indeed looking at the right
vmalloc(), because eacd86ca3b03 also removed __GFP_NOWARN there (and
thus the revert reintroduced it).
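
For illustration only, the "sane limit" idea from the quoted text (use
__GFP_NORETRY only above some size) could look roughly like the
userspace sketch below. This is not kernel code and not a proposed
patch: the flag values, the 4MB threshold, and the helper name
sketch_table_gfp() are all assumptions made up for the example.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel gfp bits; the values are illustrative only. */
#define SKETCH_GFP_KERNEL   0x1u
#define SKETCH_GFP_NORETRY  0x2u

/* Assumed threshold; the thread only says "less than 4MB is not that much". */
#define SKETCH_NORETRY_LIMIT ((size_t)4 << 20)

/*
 * Pick allocation flags for a table of the given size: small requests
 * are allowed to retry, only requests above the limit fail fast.
 */
static unsigned int sketch_table_gfp(size_t size)
{
	unsigned int gfp = SKETCH_GFP_KERNEL;

	if (size > SKETCH_NORETRY_LIMIT)
		gfp |= SKETCH_GFP_NORETRY;
	return gfp;
}
```

With these assumed values, a request of exactly 4MB would still be
allowed to retry, and anything larger would get the NORETRY bit.
Whether 4MB is the right cut-off is an open question in the thread.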