Re: [Bug 200651] New: cgroups iptables-restor: vmalloc: allocation failure

Georgi Nikolov <gnikolov@xxxxxxxxxxx> · Mon, 30 Jul 2018 21:51:42 +0300

On 07/30/2018 09:38 PM, Michal Hocko wrote:
> On Mon 30-07-18 18:54:24, Georgi Nikolov wrote:
> [...]
> >> No, I was wrong. The regression actually starts with 0537250fdc6c8.
> >> - old code, which open-codes kvmalloc, masks the error, but the error is there
>> - kvmalloc without GFP_NORETRY works fine, but probably can consume a
>> lot of memory - commit: eacd86ca3b036
>> - kvmalloc with GFP_NORETRY shows error - commit: 0537250fdc6c8
> OK.
>
> >>>> What is the correct way to fix it?
>>>> - inside xt_alloc_table_info remove GFP_NORETRY from kvmalloc or add
>>>> this flag only for sizes bigger than some threshold
>>> This would reintroduce issue fixed by 0537250fdc6c8. Note that
>>> kvmalloc(GFP_KERNEL | __GFP_NORETRY) is more or less equivalent to the
>>> original code (well, except for __GFP_NOWARN).
>> So probably we should pass GFP_NORETRY only for large requests (above
>> some threshold).
> > What would be the threshold? This is not really my area so I just wanted
> > to keep the original code semantics.
> >
>>>> - inside kvmalloc_node remove GFP_NORETRY from
> >>>> __vmalloc_node_flags_caller (I don't know whether it honors this flag, or
>>>> the problem is elsewhere)
>>> No, not really. This is basically equivalent to kvmalloc(GFP_KERNEL).
>>>
>>> I strongly suspect that this is not a regression in this code but rather
>>> a side effect of larger memory fragmentation caused by something else.
>>> In any case do you see this failure also without artificial test case
>>> with a standard workload?
> >> Yes, I can see failures with a standard workload; in fact, it was hard to
> >> reproduce it.
>> Here is the error from production servers where allocation is smaller:
>> iptables: vmalloc: allocation failure, allocated 131072 of 225280 bytes,
>> mode:0x14010c0(GFP_KERNEL|__GFP_NORETRY), nodemask=(null)
>>
>> I didn't understand if vmalloc honors GFP_NORETRY.
> 0537250fdc6c8 changelog tries to explain. kvmalloc doesn't really
> > support the GFP_NORETRY semantics because that would imply the request
> wouldn't trigger the oom killer but in rare cases this might happen
> (e.g. when page tables are allocated because those are hardcoded GFP_KERNEL).
>
> > That being said, I have no objection to using GFP_KERNEL if it helps real
> workloads but we probably need some cap...
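
For reference, the numbers in the production failure line quoted above can be
unpacked with a small user-space sketch. It assumes 4 KiB pages, and the flag
bit values are an assumption taken from a 4.14-era include/linux/gfp.h (they
differ between kernel versions). All SKETCH_* names and both helper functions
are made up for illustration.

```c
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Assumed 4.14-era gfp bit values; not stable across kernel versions. */
#define SKETCH_GFP_IO             0x40u
#define SKETCH_GFP_FS             0x80u
#define SKETCH_GFP_NORETRY        0x1000u
#define SKETCH_GFP_DIRECT_RECLAIM 0x400000u
#define SKETCH_GFP_KSWAPD_RECLAIM 0x1000000u

/* Number of pages needed to cover a byte count, rounding up. */
static size_t sketch_pages_for(size_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* GFP_KERNEL (reclaim + IO + FS) combined with the no-retry bit. */
static unsigned int sketch_gfp_kernel_noretry(void)
{
	return SKETCH_GFP_DIRECT_RECLAIM | SKETCH_GFP_KSWAPD_RECLAIM |
	       SKETCH_GFP_IO | SKETCH_GFP_FS | SKETCH_GFP_NORETRY;
}
```

With these assumptions, 131072 and 225280 bytes correspond to 32 and 55 pages,
and the combined flags give 0x14010c0, which matches the mode printed in the
log.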

Probably Vlastimil Babka can propose some limit:

On Thu 26-07-18 09:18:57, Vlastimil Babka wrote:
This is likely the kvmalloc() in xt_alloc_table_info(). Between 4.13 and
4.17 it shouldn't use __GFP_NORETRY, but looks like commit 0537250fdc6c
("netfilter: x_tables: make allocation less aggressive") was backported
to 4.14. Removing __GFP_NORETRY might help here, but bring back other
issues. Less than 4MB is not that much though, maybe find some "sane"
limit and use __GFP_NORETRY only above that?
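
A minimal user-space sketch of that idea, to make the proposal concrete. It is
not a patch: the flag values are stand-ins so the code compiles outside the
kernel, the helper name is made up, and the 4 MB cutoff is only a placeholder
since no limit has been agreed on in this thread.

```c
#include <stddef.h>

/* Stand-in flag values for illustration only; not the kernel's. */
#define SKETCH_GFP_KERNEL  0x1u
#define SKETCH_GFP_NORETRY 0x2u

/* Hypothetical cutoff; the thread has not settled on a value. */
#define SKETCH_NORETRY_THRESHOLD ((size_t)4 * 1024 * 1024)

/*
 * Pick allocation flags by request size: small requests use plain
 * GFP_KERNEL semantics, large ones additionally get the no-retry bit
 * so they fail early instead of pushing reclaim hard.
 */
static unsigned int sketch_xt_alloc_gfp(size_t size)
{
	unsigned int gfp = SKETCH_GFP_KERNEL;

	if (size > SKETCH_NORETRY_THRESHOLD)
		gfp |= SKETCH_GFP_NORETRY;

	return gfp;
}
```

Under this sketch the 225280-byte request from the production log would be
allocated without the no-retry bit, while a request above the cutoff would
keep the behaviour introduced by 0537250fdc6c.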

Regards,

--
Georgi Nikolov