On 08/06/2018 11:42 AM, Georgi Nikolov wrote:
> On 08/02/2018 11:50 AM, Michal Hocko wrote:
>> In other words, why don't we simply do the following? Note that this is
>> not tested. I also have no idea what the lifetime of this allocation is.
>> Is it bound to any specific process or is it namespace bound? If the
>> latter then the memcg OOM killer might wipe the whole memcg down without
>> making any progress. This would make the whole namespace unusable until
>> somebody intervenes. Is this acceptable?
>> ---
>> From 4dec96eb64954a7e58264ed551afadf62ca4c5f7 Mon Sep 17 00:00:00 2001
>> From: Michal Hocko <mhocko@xxxxxxxx>
>> Date: Thu, 2 Aug 2018 10:38:57 +0200
>> Subject: [PATCH] netfilter/x_tables: do not fail xt_alloc_table_info too
>>  easily
>>
>> eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc()
>> in xt_alloc_table_info()") has unintentionally fortified
>> xt_alloc_table_info allocation when __GFP_NORETRY has been dropped from
>> the vmalloc fallback. Later on there was a syzbot report that this
>> can lead to OOM killer invocations when tables are too large and
>> 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
>> has been merged to restore the original behavior. Georgi Nikolov however
>> noticed that he is not able to install his iptables anymore so this can
>> be seen as a regression.
>>
>> The primary argument for 0537250fdc6c was that this allocation path
>> shouldn't really trigger the OOM killer and kill innocent tasks. On the
>> other hand the interface requires root and as such should allow what the
>> admin asks for. Root inside a namespace makes this more complicated
>> because those might not be trusted in general. If they are not then such
>> namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY
>> and replace it with __GFP_ACCOUNT to enforce memcg constraints on it.
>>
>> Fixes: 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
>> Reported-by: Georgi Nikolov <gnikolov@xxxxxxxxxxx>
>> Suggested-by: Vlastimil Babka <vbabka@xxxxxxx>
>> Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
>> ---
>>  net/netfilter/x_tables.c | 7 +------
>>  1 file changed, 1 insertion(+), 6 deletions(-)
>>
>> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
>> index d0d8397c9588..b769408e04ab 100644
>> --- a/net/netfilter/x_tables.c
>> +++ b/net/netfilter/x_tables.c
>> @@ -1178,12 +1178,7 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
>>  	if (sz < sizeof(*info) || sz >= XT_MAX_TABLE_SIZE)
>>  		return NULL;
>>  
>> -	/* __GFP_NORETRY is not fully supported by kvmalloc but it should
>> -	 * work reasonably well if sz is too large and bail out rather
>> -	 * than shoot all processes down before realizing there is nothing
>> -	 * more to reclaim.
>> -	 */
>> -	info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
>> +	info = kvmalloc(sz, GFP_KERNEL | __GFP_ACCOUNT);
>>  	if (!info)
>>  		return NULL;
>>  
>
> I will check if this change fixes the problem.
>
> Regards,
>
> --
> Georgi Nikolov

I can't reproduce it anymore. If I understand correctly, this way the
allocated memory will be accounted to the kmem of this cgroup (if inside
a cgroup).
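
To make the difference between the two flag combinations in the patch
concrete, here is a small user-space toy model. It is not kernel code and
does not reflect how kvmalloc() is implemented. Everything prefixed with
toy_ or TOY_ is hypothetical, including the flag values, the 1 MiB
"cheaply available" threshold and the struct toy_memcg counter. It only
mimics the behaviour discussed in this thread: with __GFP_NORETRY a large
request gives up early, while with __GFP_ACCOUNT the request is charged to
the caller's cgroup and bounded by that cgroup's limit. One simplification
to keep in mind: on hitting the limit the real kernel may invoke the memcg
OOM killer (the concern raised above), whereas the model just returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP flags; NOT the kernel's real values. */
#define TOY_GFP_KERNEL  0x1u
#define TOY_GFP_NORETRY 0x2u /* give up early instead of reclaiming hard */
#define TOY_GFP_ACCOUNT 0x4u /* charge the allocation to the caller's cgroup */

/* Toy memory cgroup: bytes currently charged and a hard limit. */
struct toy_memcg {
	size_t usage;
	size_t limit;
};

/* Assumed amount of memory obtainable without heavy reclaim (1 MiB). */
static const size_t toy_cheaply_available = (size_t)1 << 20;

/* Toy allocator modelling the two behaviours discussed in the thread. */
void *toy_kvmalloc(size_t sz, unsigned int flags, struct toy_memcg *cg)
{
	void *p;

	/* NORETRY: a large request fails instead of pushing reclaim harder. */
	if ((flags & TOY_GFP_NORETRY) && sz > toy_cheaply_available)
		return NULL;

	/* ACCOUNT: the request must fit into what is left of the cgroup limit. */
	if ((flags & TOY_GFP_ACCOUNT) && cg) {
		assert(cg->usage <= cg->limit);
		if (sz > cg->limit - cg->usage)
			return NULL;
		cg->usage += sz;
	}

	p = malloc(sz);
	/* Undo the charge if the underlying allocation failed. */
	if (!p && (flags & TOY_GFP_ACCOUNT) && cg)
		cg->usage -= sz;
	return p;
}

/* Free the memory and uncharge it from the cgroup if it was accounted. */
void toy_kvfree(void *p, size_t sz, unsigned int flags, struct toy_memcg *cg)
{
	if (p && (flags & TOY_GFP_ACCOUNT) && cg)
		cg->usage -= sz;
	free(p);
}
```

In this model a 4 MiB request with TOY_GFP_KERNEL | TOY_GFP_NORETRY fails
outright, which is roughly the situation reported here with large rule
sets. The same request with TOY_GFP_KERNEL | TOY_GFP_ACCOUNT succeeds as
long as the cgroup has room, and it shows up in the cgroup's usage counter.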