Re: [PATCH 1/1] NFSD: Fix memory shortage problem with Courteous server.

> On Jun 22, 2022, at 2:28 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> 
> 
> On 6/22/22 11:16 AM, Chuck Lever III wrote:
>> 
>>> On Jun 22, 2022, at 2:15 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>>> 
>>> Currently the idle timeout for a courtesy client is fixed at 1 day.
>>> If many courtesy clients remain in the system, they can cause a
>>> memory shortage that affects the operation of other modules in the
>>> kernel. This problem can be observed by running the pynfs nfs4.0
>>> CID5 test in a loop. Eventually the system runs out of memory and
>>> rpc.gssd fails to add a new watch:
>>> 
>>> rpc.gssd[3851]: ERROR: inotify_add_watch failed for nfsd4_cb/clnt6c2e:
>>> 		No space left on device
>>> 
>>> and alloc_inode also fails with an out-of-memory error:
>>> 
>>> Call Trace:
>>> <TASK>
>>> dump_stack_lvl+0x33/0x42
>>> dump_header+0x4a/0x1ed
>>> oom_kill_process+0x80/0x10d
>>> out_of_memory+0x237/0x25f
>>> __alloc_pages_slowpath.constprop.0+0x617/0x7b6
>>> __alloc_pages+0x132/0x1e3
>>> alloc_slab_page+0x15/0x33
>>> allocate_slab+0x78/0x1ab
>>> ? alloc_inode+0x38/0x8d
>>> ___slab_alloc+0x2af/0x373
>>> ? alloc_inode+0x38/0x8d
>>> ? slab_pre_alloc_hook.constprop.0+0x9f/0x158
>>> ? alloc_inode+0x38/0x8d
>>> __slab_alloc.constprop.0+0x1c/0x24
>>> kmem_cache_alloc_lru+0x8c/0x142
>>> alloc_inode+0x38/0x8d
>>> iget_locked+0x60/0x126
>>> kernfs_get_inode+0x18/0x105
>>> kernfs_iop_lookup+0x6d/0xbc
>>> __lookup_slow+0xb7/0xf9
>>> lookup_slow+0x3a/0x52
>>> walk_component+0x90/0x100
>>> ? inode_permission+0x87/0x128
>>> link_path_walk.part.0.constprop.0+0x266/0x2ea
>>> ? path_init+0x101/0x2f2
>>> path_lookupat+0x4c/0xfa
>>> filename_lookup+0x63/0xd7
>>> ? getname_flags+0x32/0x17a
>>> ? kmem_cache_alloc+0x11f/0x144
>>> ? getname_flags+0x16c/0x17a
>>> user_path_at_empty+0x37/0x4b
>>> do_readlinkat+0x61/0x102
>>> __x64_sys_readlinkat+0x18/0x1b
>>> do_syscall_64+0x57/0x72
>>> entry_SYSCALL_64_after_hwframe+0x46/0xb0
>>> RIP: 0033:0x7fce5410340e
>>> 
>>> This patch adds a simple policy to dynamically adjust the idle
>>> timeout based on the percentage of available memory in the system,
>>> as follows:
>>> 
>>> . > 70%: unlimited. Courtesy clients are allowed to remain valid
>>> as long as memory availability is above 70%.
>>> . 60% - 70%: 1 day
>>> . 50% - 60%: 1 hour
>>> . 40% - 50%: 30 minutes
>>> . 30% - 40%: 15 minutes
>>> . < 30%: disabled. Expire all existing courtesy clients and do not
>>> allow new courtesy clients.
>> I thought our plan was to add a shrinker to do this.
> 
> I'm not familiar with the kernel's memory allocation and don't want to
> muck with it, so I started with this simple approach, but I'm open to
> any suggestions on how to add a shrinker for this task. Is there an
> existing model that I can use as a reference?

Fortunately there's nothing complicated about using a shrinker.
Look for register_shrinker() calls to see code examples. There
are two already in NFSD itself.
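
For reference, the tiered policy quoted above amounts to a lookup from the
percentage of available memory to an idle timeout. Below is a minimal
user-space sketch in C; it is not the code from the patch. The function name,
the two sentinel values, and the treatment of exact boundary percentages
(60, 50, 40 and 30 are assigned to the longer-timeout tier) are assumptions
made for illustration only.

```c
#include <limits.h>

/* Hypothetical sentinels, not taken from the patch. */
#define COURTESY_TIMEOUT_UNLIMITED LONG_MAX /* never expire on idle time */
#define COURTESY_TIMEOUT_DISABLED  0L       /* expire all, admit none */

/*
 * Map the percentage of available memory to a courtesy-client idle
 * timeout in seconds, following the tiers listed in the patch
 * description. Boundary handling is an assumption: each tier includes
 * its lower bound, and only values strictly above 70 are unlimited.
 */
static long courtesy_idle_timeout_secs(unsigned int avail_pct)
{
	if (avail_pct > 70)
		return COURTESY_TIMEOUT_UNLIMITED;
	if (avail_pct >= 60)
		return 24L * 60 * 60;	/* 1 day */
	if (avail_pct >= 50)
		return 60L * 60;	/* 1 hour */
	if (avail_pct >= 40)
		return 30L * 60;	/* 30 minutes */
	if (avail_pct >= 30)
		return 15L * 60;	/* 15 minutes */
	return COURTESY_TIMEOUT_DISABLED;
}
```

A shrinker would replace this kind of polling policy with callbacks that the
memory-management core invokes under memory pressure, so the table is useful
mainly as a statement of the intended behavior.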


--
Chuck Lever






