Thanks for your reply
Hello everyone, I ran into an interesting kernel memory problem. Can anyone help me explain what is happening inside the kernel?
Which kernel version is that?
The kernel version is 3.10.0-327.4.5.el7.x86_64.
The machine's status is described below:
The machine has 96G of physical memory. About 64G is really in use, and the page cache takes about 32G. We also use swap; at that time we had about 10G in swap (we set the maximum swap size to 32G). At that moment, XFS reported:
|Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) |
Just once, or many times?
The message appears many times. From the code, I know that XFS retries kmalloc() 100 times between messages.
After reading the source code, I found that this message is printed from this code:
|ptr = kmalloc(size, lflags);
if (ptr || (flags & (KM_MAYFAIL|KM_NOSLEEP)))
	return ptr;
if (!(++retries % 100))
	xfs_err(NULL,
		"possible memory allocation deadlock in %s (mode:0x%x)",
		__func__, lflags);
congestion_wait(BLK_RW_ASYNC, HZ/50); |
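To show how often the message fires relative to the number of failed attempts, the retry logic in the fragment above can be sketched in userspace. This is only an illustration under assumptions: `alloc_with_retry`, `flaky_alloc`, `alloc_fn` and `struct retry_stats` are hypothetical names, `malloc()` stands in for `kmalloc()`, the `KM_MAYFAIL|KM_NOSLEEP` early return is left out, and the sketch does not sleep where the kernel calls `congestion_wait()`.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical callback type standing in for kmalloc(). */
typedef void *(*alloc_fn)(size_t size, void *ctx);

/* Counters filled in by alloc_with_retry(). */
struct retry_stats {
	unsigned int retries;   /* failed attempts so far */
	unsigned int warnings;  /* "possible deadlock" messages emitted */
};

/*
 * Userspace imitation of the loop quoted above: retry until the
 * allocation succeeds and warn once per 100 failed attempts.
 */
void *alloc_with_retry(size_t size, alloc_fn alloc, void *ctx,
		       struct retry_stats *st)
{
	st->retries = 0;
	st->warnings = 0;
	for (;;) {
		void *ptr = alloc(size, ctx);

		if (ptr)
			return ptr;
		if (!(++st->retries % 100)) {
			fprintf(stderr,
				"possible memory allocation deadlock (retries=%u)\n",
				st->retries);
			st->warnings++;
		}
		/* The kernel waits here: congestion_wait(BLK_RW_ASYNC, HZ/50). */
	}
}

/* Hypothetical test allocator: fails *ctx times, then succeeds. */
void *flaky_alloc(size_t size, void *ctx)
{
	unsigned int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return NULL;
	}
	return malloc(size);
}
```

With an allocator that fails 250 times before succeeding, the sketch emits two warnings (at retries 100 and 200). In the kernel, each retry waits up to HZ/50 jiffies (20 ms), so one message in the log stands for up to about two seconds of failed attempts.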
Any indication what is the size used here?
I don't know the size here, since the call comes from inside XFS.
The error is caused by kmalloc() failing because there is not enough free memory in the system. But there is still 32G of page cache.
So I ran
|echo 3 > /proc/sys/vm/drop_caches |
to drop the page cache.
Then the system is fine.
Are you saying that the error message was repeated infinitely until you did the drop_caches?
No. The error message no longer appears after I drop the caches.
Is it possible that the reason is the following? Even though we have enough physical pages, those pages are used for the page cache. When a caller invokes kmalloc(), it asks the kernel for pages. The kernel finds that there are not enough free pages, but since some pages are used for page cache, free pages could be reclaimed from there, so the kernel wakes kswapd to evict some page cache. But it takes too long to get the free pages, and kmem_alloc in XFS doesn't set the __GFP_WAIT flag. So kmem_alloc keeps failing for lack of memory and prints the error message.
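One way to check the guess about __GFP_WAIT is to decode the mode value from the log message. The following is a minimal userspace sketch, assuming the bit values from the upstream Linux 3.10 include/linux/gfp.h; the RHEL 7 kernel may differ, so verify against the running kernel's headers. `gfp_decode` and the `gfp_bits` table are hypothetical helpers, not kernel API, and only a subset of the flags is listed.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* ASSUMPTION: bit values as in upstream Linux 3.10 include/linux/gfp.h. */
struct gfp_bit {
	unsigned int mask;
	const char *name;
};

const struct gfp_bit gfp_bits[] = {
	{ 0x10u,  "__GFP_WAIT"   },
	{ 0x20u,  "__GFP_HIGH"   },
	{ 0x40u,  "__GFP_IO"     },
	{ 0x80u,  "__GFP_FS"     },
	{ 0x100u, "__GFP_COLD"   },
	{ 0x200u, "__GFP_NOWARN" },
};

/*
 * Hypothetical helper: write the names of the known bits set in 'mode'
 * into 'buf', separated by '|'. Returns the bits that were not
 * recognised by the table above.
 */
unsigned int gfp_decode(unsigned int mode, char *buf, size_t len)
{
	size_t i;

	if (len == 0)
		return mode;
	buf[0] = '\0';
	for (i = 0; i < sizeof(gfp_bits) / sizeof(gfp_bits[0]); i++) {
		size_t used;

		if (!(mode & gfp_bits[i].mask))
			continue;
		used = strlen(buf);
		snprintf(buf + used, len - used, "%s%s",
			 used ? "|" : "", gfp_bits[i].name);
		mode &= ~gfp_bits[i].mask;
	}
	return mode;
}
```

Under these assumed values, `gfp_decode(0x250, ...)` yields `__GFP_WAIT|__GFP_IO|__GFP_NOWARN` with no unknown bits left over. That would suggest __GFP_WAIT is in fact set and only __GFP_FS is missing, so the explanation for the failures may lie elsewhere.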
----------------------------------------
[+CC Dave]
On 05/18/2016 04:38 AM, baotiao wrote:
Hello everyone, I ran into an interesting kernel memory problem. Can anyone help me explain what is happening inside the kernel?
Which kernel version is that?
The machine's status is described below:
The machine has 96G of physical memory. About 64G is really in use, and the page cache takes about 32G. We also use swap; at that time we had about 10G in swap (we set the maximum swap size to 32G). At that moment, XFS reported:
|Apr 29 21:54:31 w-openstack86 kernel: XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) |
Just once, or many times?
After reading the source code, I found that this message is printed from this code:
|ptr = kmalloc(size, lflags);
if (ptr || (flags & (KM_MAYFAIL|KM_NOSLEEP)))
	return ptr;
if (!(++retries % 100))
	xfs_err(NULL,
		"possible memory allocation deadlock in %s (mode:0x%x)",
		__func__, lflags);
congestion_wait(BLK_RW_ASYNC, HZ/50); |
Any indication what is the size used here?
The error is caused by kmalloc() failing because there is not enough free memory in the system. But there is still 32G of page cache.
So I ran
|echo 3 > /proc/sys/vm/drop_caches |
to drop the page cache.
Then the system is fine.
Are you saying that the error message was repeated infinitely until you did the drop_caches?
But I really don't know the reason. Why does kmalloc() succeed after I run drop_caches? Even though all physical memory is in use, only 64G of it is real usage; the other 32G is page cache, and we also have plenty of swap space. So why doesn't the kernel flush the page cache or swap pages out to satisfy the kmalloc() request?
----------------------------------------
Github: https://github.com/baotiao
Blog: http://baotiao.github.io/
Stackoverflow: http://stackoverflow.com/users/634415/baotiao
Linkedin: http://www.linkedin.com/profile/view?id=145231990