On 07/26/2018 11:48 AM, Vlastimil Babka wrote:
> On 07/26/2018 10:31 AM, Vlastimil Babka wrote:
>> On 07/26/2018 10:03 AM, Michal Hocko wrote:
>>> On Thu 26-07-18 09:50:45, Vlastimil Babka wrote:
>>>> On 07/26/2018 09:42 AM, Michal Hocko wrote:
>>>>> On Thu 26-07-18 09:34:58, Vlastimil Babka wrote:
>>>>>> On 07/26/2018 09:26 AM, Michal Hocko wrote:
>>>>>>> On Thu 26-07-18 09:18:57, Vlastimil Babka wrote:
>>>>>>>> On 07/25/2018 09:52 PM, Andrew Morton wrote:
>>>>>>>>
>>>>>>>> This is likely the kvmalloc() in xt_alloc_table_info(). Between
>>>>>>>> 4.13 and 4.17 it shouldn't use __GFP_NORETRY, but looks like
>>>>>>>> commit 0537250fdc6c ("netfilter: x_tables: make allocation less
>>>>>>>> aggressive") was backported to 4.14. Removing __GFP_NORETRY might
>>>>>>>> help here, but bring back other issues. Less than 4MB is not that
>>>>>>>> much though, maybe find some "sane" limit and use __GFP_NORETRY
>>>>>>>> only above that?
>>>>>>>
>>>>>>> I have seen the same report via
>>>>>>> http://lkml.kernel.org/r/df6f501c-8546-1f55-40b1-7e3a8f54d872@xxxxxxxxxxx
>>>>>>> and the reported confirmed that kvmalloc is not a real culprit
>>>>>>> http://lkml.kernel.org/r/d99a9598-808a-6968-4131-c3949b752004@xxxxxxxxxxx
>>>>>>
>>>>>> Hmm but that was revert of eacd86ca3b03 ("net/netfilter/x_tables.c:
>>>>>> use kvmalloc() in xt_alloc_table_info()") which was the 4.13 commit
>>>>>> that removed __GFP_NORETRY (there's no __GFP_NORETRY under
>>>>>> net/netfilter in v4.14). I assume it was reverted on top of vanilla
>>>>>> v4.14 as there would be conflict on the stable with 0537250fdc6c
>>>>>> backport. So what should be tested to be sure is either vanilla
>>>>>> v4.14 without stable backports, or latest v4.14.y with revert of
>>>>>> 0537250fdc6c.
>>>>>
>>>>> But 0537250fdc6c simply restored the previous NORETRY behavior from
>>>>> before eacd86ca3b03. So whatever causes these issues doesn't seem to
>>>>> be directly related to the kvmalloc change. Or do I miss what you
>>>>> are saying?
>>>>
>>>> I'm saying that although it's not a regression, as you say (the
>>>> vmalloc() there was only for a few kernel versions called without
>>>> __GFP_NORETRY), it's still possible that removing __GFP_NORETRY will
>>>> fix the issue and thus we will rule out other possibilities.
>>>
>>> http://lkml.kernel.org/r/d99a9598-808a-6968-4131-c3949b752004@xxxxxxxxxxx
>>> claims that reverting eacd86ca3b03 didn't really help.
>
> Ah, I see, that mail thread references a different kernel bugzilla
> #200639 which doesn't mention 4.14, but outright blames commit
> eacd86ca3b03. Yet the alloc fail message contains __GFP_NORETRY, so I
> still suspect the kernel also had 0537250fdc6c backport.
>
> Georgi can you please clarify which exact kernel version had the alloc
> failures, and how exactly you tested the revert (which version was the
> baseline for revert). Thanks.
>
>> Of course not. eacd86ca3b03 *removed* __GFP_NORETRY, so the revert
>> reintroduced it. I tried to explain it in the quoted part above
>> starting with "Hmm but that was revert of eacd86ca3b03 ...". What I'm
>> saying is that eacd86ca3b03 might have actually *fixed* (or rather
>> prevented) this alloc failure, if there was not 0537250fdc6c and its
>> 4.14 stable backport (the kernel bugzilla report says 4.14, I'm
>> assuming new enough stable to contain 0537250fdc6c as the failure
>> message contains __GFP_NORETRY).
>>
>> The mail you reference also says "seems that old version is masking
>> errors", which confirms that we are indeed looking at the right
>> vmalloc(), because eacd86ca3b03 also removed __GFP_NOWARN there (and
>> thus the revert reintroduced it).

Hello,

Kernel that has allocation failures is 4.14.50.
Here is the patch applied to this version which masks errors:

--- net/netfilter/x_tables.c	2018-06-18 14:18:21.138347416 +0300
+++ net/netfilter/x_tables.c	2018-07-26 11:58:01.721932962 +0300
@@ -1059,9 +1059,19 @@
 	 * than shoot all processes down before realizing there is nothing
 	 * more to reclaim.
 	 */
-	info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
+/*	info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
 	if (!info)
 		return NULL;
+*/
+
+	if (sz <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
+		info = kmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
+	if (!info) {
+		info = __vmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY,
+				 PAGE_KERNEL);
+		if (!info)
+			return NULL;
+	}
 
 	memset(info, 0, sizeof(*info));
 	info->size = size;

I will try to reproduce it with only

	info = kvmalloc(sz, GFP_KERNEL);

Regards,

--
Georgi Nikolov
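
For illustration, the idea raised earlier in the thread (use __GFP_NORETRY only above some "sane" size limit) could be modelled in userspace roughly as below. This is only a sketch, not kernel code and not a proposed patch: the DEMO_* names, the placeholder flag values, the helper demo_table_gfp() and the 4MB cutoff are all assumptions made up for the example. The cutoff merely echoes the "less than 4MB is not that much" remark.

```c
#include <stddef.h>

/* Placeholder bits for illustration only; NOT the kernel's gfp_t values. */
#define DEMO_GFP_KERNEL   0x1u
#define DEMO_GFP_NORETRY  0x2u

/* Hypothetical cutoff, chosen only to mirror the 4MB figure in the thread. */
#define DEMO_NORETRY_LIMIT ((size_t)4 * 1024 * 1024)

/*
 * Pick allocation flags by request size: modest requests keep the normal
 * retry behaviour, and only requests above the limit are allowed to fail
 * early (NORETRY).
 */
static unsigned int demo_table_gfp(size_t sz)
{
	unsigned int flags = DEMO_GFP_KERNEL;

	if (sz > DEMO_NORETRY_LIMIT)
		flags |= DEMO_GFP_NORETRY;
	return flags;
}
```

In a real kernel change, the result would presumably be passed to kvmalloc() in xt_alloc_table_info(). Whether a size cutoff is the right fix at all is exactly what is still being debated above.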