Re: numa balancing stuck in task_work_run

On 09/24/2015 09:04 PM, Rik van Riel wrote:
> On 09/24/2015 05:14 PM, Joe Lawrence wrote:
>> [ +cc for linux-mm mailinglist address ]
>>
>> On 09/24/2015 05:08 PM, Joe Lawrence wrote:
>>> Hi Mel, Rik et al,
>>>
>>> We've encountered interesting NUMA balancing behavior on RHEL7.1,
>>> reproduced with an upstream 4.2 kernel (of similar .config), that can
>>> leave a user process trapped in the kernel performing task_numa_work.
>>>
>>> Our test group set up a server with 256GB memory running a program that
>>> allocates and dirties ~50% of that memory.  They reported the following
>>> condition when they attempted to kill the test process -- the signal was
>>> never handled, instead traces showed the task stuck here:
> 
> Does the bug still happen with this patch applied?
> 
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=4620f8c1fda2af4ccbd11e194e2dd785f7d7f279
> 

Hi Rik,

Success!  With 4620f8c1fda2 (-tip) cherry-picked on top of 4.2, I could
successfully kill off the memory test process, even when the
numa_scan_period_max dropped to 140.

I also kicked off the test program and let it continue overnight (it
restarts itself after a given time), and several iterations ran without
incident.

Thanks,

-- Joe



