Re: Softlockup during memory allocation

Nikolay Borisov <kernel@xxxxxxxx> · Thu, 24 Nov 2016 15:09:38 +0200

On 11/24/2016 02:12 PM, Michal Hocko wrote:
> On Thu 24-11-16 13:45:03, Nikolay Borisov wrote:
> [...]
>> Ok, I think I know what has happened. Inspecting the data structures of
>> the respective cgroup, here is what the mem_cgroup_per_zone looks like:
>>
>>   zoneinfo[2] =   {
>>     lruvec = {{
>>         lists = {
>>           {
>>             next = 0xffffea004f98c660,
>>             prev = 0xffffea0063f6b1a0
>>           },
>>           {
>>             next = 0xffffea0004123120,
>>             prev = 0xffffea002c2e2260
>>           },
>>           {
>>             next = 0xffff8818c37bb360,
>>             prev = 0xffff8818c37bb360
>>           },
>>           {
>>             next = 0xffff8818c37bb370,
>>             prev = 0xffff8818c37bb370
>>           },
>>           {
>>             next = 0xffff8818c37bb380,
>>             prev = 0xffff8818c37bb380
>>           }
>>         },
>>         reclaim_stat = {
>>           recent_rotated = {172969085, 43319509},
>>           recent_scanned = {173112994, 185446658}
>>         },
>>         zone = 0xffff88207fffcf00
>>     }},
>>     lru_size = {159722, 158714, 0, 0, 0},
>>     }
>>
>> So this means that there are inactive_anon and active_anon pages only -
>> correct?
> 
> Yes, at least in this particular zone.
> 
>> Since the machine doesn't have any swap this means anon memory
>> has nowhere to go. If I'm interpreting the data correctly then this
>> explains why reclaim makes no progress. If that's the case then I have
>> the following questions:
>>
>> 1. Shouldn't reclaim exit at some point rather than being stuck in
>> reclaim without making further progress?
> 
> Reclaim (try_to_free_mem_cgroup_pages) has to go down through all
> priorities before it can get out. We are not doing any pro-active checks
> whether there is anything reclaimable, but that alone shouldn't be such a
> big deal because shrink_node_memcg should simply do nothing when
> get_scan_count finds no pages to scan. So it shouldn't take much
> time to realize there is nothing to reclaim and get back to try_charge,
> which retries a few more times and eventually goes OOM. I do not see how
> we could trigger RCU stalls here. There shouldn't be any long RCU
> critical sections on the way, and there are preemption points along it.
> 
>> 2. It seems rather strange that there are no (INACTIVE|ACTIVE)_FILE
>> pages - is this possible?
> 
> All of them might be reclaimed already as a result of the memory
> pressure in the memcg. So not all that surprising. But the fact that
> you are hitting the limit means that the anonymous pages saturate your
> hard limit so your memcg seems underprovisioned.
> 
>> 3. Why hasn't OOM been activated in order to free up some anonymous memory?
> 
> It should eventually. Maybe there still were some reclaimable pages in
> other zones for this memcg.

I just checked all the zones on both nodes (the machines have 2 NUMA
nodes), and there are essentially no reclaimable pages: all of them are
anonymous. So the pertinent question is why processes are sleeping in
the reclaim path when there are no pages to free. I also observed the
same behavior on a different node; this time the priority was 0 and the
code still hadn't resorted to OOM. This all seems very strange.
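
For reference, this is how I am reading the lru_size array from the dump
above. It is only a minimal sketch: the enum mirrors the index order of
the kernel's enum lru_list, while reclaimable_pages() is a hypothetical
helper written for illustration, not a kernel function. It assumes that
anon pages count as reclaimable only when swap is available.

```c
#include <stdbool.h>

/* Index order of the per-zone LRU lists, mirroring the kernel's enum lru_list. */
enum lru_list {
    LRU_INACTIVE_ANON,
    LRU_ACTIVE_ANON,
    LRU_INACTIVE_FILE,
    LRU_ACTIVE_FILE,
    LRU_UNEVICTABLE,
    NR_LRU_LISTS
};

/*
 * Hypothetical helper (not a kernel API): count the pages that reclaim
 * could free. File pages always count; anon pages count only if there is
 * swap to move them to. Unevictable pages never count.
 */
static unsigned long reclaimable_pages(const unsigned long lru_size[NR_LRU_LISTS],
                                       bool have_swap)
{
    unsigned long nr = lru_size[LRU_INACTIVE_FILE] + lru_size[LRU_ACTIVE_FILE];

    if (have_swap)
        nr += lru_size[LRU_INACTIVE_ANON] + lru_size[LRU_ACTIVE_ANON];

    return nr;
}
```

With lru_size = {159722, 158714, 0, 0, 0} and no swap, the helper returns
0, which matches my reading that nothing in this zone can be reclaimed.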
