Re: [PATCH 1/1] NFSD: Fix memory shortage problem with Courteous server.

On 6/22/22 11:32 AM, Chuck Lever III wrote:

On Jun 22, 2022, at 2:28 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:


On 6/22/22 11:16 AM, Chuck Lever III wrote:
On Jun 22, 2022, at 2:15 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:

Currently the idle timeout for a courtesy client is fixed at 1 day.
If many courtesy clients remain in the system, they can cause a
memory shortage that affects the operation of other modules in the
kernel. This problem can be observed by running the pynfs nfs4.0
CID5 test in a loop. Eventually the system runs out of memory and
rpc.gssd fails to add a new watch:

rpc.gssd[3851]: ERROR: inotify_add_watch failed for nfsd4_cb/clnt6c2e:
		No space left on device

and alloc_inode also fails because the system is out of memory:

Call Trace:
<TASK>
dump_stack_lvl+0x33/0x42
dump_header+0x4a/0x1ed
oom_kill_process+0x80/0x10d
out_of_memory+0x237/0x25f
__alloc_pages_slowpath.constprop.0+0x617/0x7b6
__alloc_pages+0x132/0x1e3
alloc_slab_page+0x15/0x33
allocate_slab+0x78/0x1ab
? alloc_inode+0x38/0x8d
___slab_alloc+0x2af/0x373
? alloc_inode+0x38/0x8d
? slab_pre_alloc_hook.constprop.0+0x9f/0x158
? alloc_inode+0x38/0x8d
__slab_alloc.constprop.0+0x1c/0x24
kmem_cache_alloc_lru+0x8c/0x142
alloc_inode+0x38/0x8d
iget_locked+0x60/0x126
kernfs_get_inode+0x18/0x105
kernfs_iop_lookup+0x6d/0xbc
__lookup_slow+0xb7/0xf9
lookup_slow+0x3a/0x52
walk_component+0x90/0x100
? inode_permission+0x87/0x128
link_path_walk.part.0.constprop.0+0x266/0x2ea
? path_init+0x101/0x2f2
path_lookupat+0x4c/0xfa
filename_lookup+0x63/0xd7
? getname_flags+0x32/0x17a
? kmem_cache_alloc+0x11f/0x144
? getname_flags+0x16c/0x17a
user_path_at_empty+0x37/0x4b
do_readlinkat+0x61/0x102
__x64_sys_readlinkat+0x18/0x1b
do_syscall_64+0x57/0x72
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7fce5410340e

This patch adds a simple policy that dynamically adjusts the idle
timeout based on the percentage of available memory in the system,
as follows:

. > 70%: unlimited. Courtesy clients are allowed to remain valid
as long as memory availability is above 70%
. 60% - 70%: 1 day
. 50% - 60%: 1 hr
. 40% - 50%: 30 mins
. 30% - 40%: 15 mins
. < 30%: disabled. Expire all existing courtesy clients and do not
allow new courtesy clients
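
The tiers above can be sketched as a small lookup function. This is a
userspace illustration only, not code from the patch: the function names,
the UINT32_MAX sentinel for "unlimited", and the handling of exact
boundary values (for example, 60% falling in the 1-day tier) are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "never expire"; not taken from the patch. */
#define COURTESY_TIMEOUT_UNLIMITED UINT32_MAX

/*
 * Map the percentage of available memory to a courtesy-client idle
 * timeout in seconds, following the tiers in the patch description.
 * Returns 0 when courtesy clients are disabled (below 30%).
 *
 * Assumption: a value on a shared boundary (60, 50, 40) belongs to the
 * tier with the longer timeout. The description only fixes the outer
 * edges: "> 70%" is unlimited and "< 30%" is disabled.
 */
uint32_t courtesy_idle_timeout(unsigned int avail_pct)
{
	assert(avail_pct <= 100);

	if (avail_pct > 70)
		return COURTESY_TIMEOUT_UNLIMITED;
	if (avail_pct >= 60)
		return 24 * 60 * 60;	/* 1 day */
	if (avail_pct >= 50)
		return 60 * 60;		/* 1 hr */
	if (avail_pct >= 40)
		return 30 * 60;		/* 30 mins */
	if (avail_pct >= 30)
		return 15 * 60;		/* 15 mins */
	return 0;			/* disabled */
}

/* New courtesy clients are only allowed while the timeout is non-zero. */
bool courtesy_clients_allowed(unsigned int avail_pct)
{
	return courtesy_idle_timeout(avail_pct) != 0;
}
```

With these assumptions, courtesy_idle_timeout(55) yields the 1 hr tier,
and any value below 30 both expires existing courtesy clients and
refuses new ones.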

I thought our plan was to add a shrinker to do this.

I'm not familiar with the kernel's memory allocation and don't want to muck
with it, so I started with this simple approach, but I'm open to any
suggestion on how to add a shrinker for this task. Is there an existing
model that I can use as a reference?

Fortunately there's nothing complicated about using a shrinker.
Look for register_shrinker() calls to see code examples. There
are two already in NFSD itself.
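
For readers unfamiliar with the idea, here is a rough userspace model of
how a shrinker behaves. The kernel's struct shrinker provides a
count_objects callback (how many objects could be freed) and a
scan_objects callback (free up to nr_to_scan of them). Everything
prefixed model_ or courtesy_ below is a hypothetical stand-in written for
this sketch; it is not the kernel API and not the NFSD implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct shrink_control (only one field modeled). */
struct model_shrink_control {
	unsigned long nr_to_scan;	/* how many objects we are asked to free */
};

/* Stand-in for the kernel's struct shrinker: two callbacks. */
struct model_shrinker {
	unsigned long (*count_objects)(struct model_shrinker *s,
				       struct model_shrink_control *sc);
	unsigned long (*scan_objects)(struct model_shrinker *s,
				      struct model_shrink_control *sc);
};

/* Hypothetical count of courtesy clients currently held by the server. */
unsigned long nr_courtesy_clients;

/* Report how many courtesy clients could be reclaimed. */
unsigned long courtesy_count(struct model_shrinker *s,
			     struct model_shrink_control *sc)
{
	(void)s;
	(void)sc;
	return nr_courtesy_clients;
}

/* Expire up to nr_to_scan courtesy clients; return how many were freed. */
unsigned long courtesy_scan(struct model_shrinker *s,
			    struct model_shrink_control *sc)
{
	unsigned long freed = sc->nr_to_scan < nr_courtesy_clients ?
			      sc->nr_to_scan : nr_courtesy_clients;

	(void)s;
	nr_courtesy_clients -= freed;
	return freed;
}

struct model_shrinker courtesy_shrinker = {
	.count_objects = courtesy_count,
	.scan_objects = courtesy_scan,
};

/* Model of one reclaim pass under memory pressure: count, then scan. */
unsigned long model_reclaim(struct model_shrinker *s, unsigned long nr_to_scan)
{
	struct model_shrink_control sc = { .nr_to_scan = nr_to_scan };

	assert(s != NULL && s->count_objects != NULL && s->scan_objects != NULL);
	if (s->count_objects(s, &sc) == 0)
		return 0;	/* nothing to free, skip the scan */
	return s->scan_objects(s, &sc);
}
```

The point of this design, compared with fixed memory thresholds, is that
courtesy clients are only expired when the memory subsystem actually asks
for memory back, and only as many as it asks for.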

Thanks Chuck, I'll take a look.

-Dai



--
Chuck Lever





