> On Sep 2, 2022, at 3:34 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
>
> On 9/2/22 10:58 AM, Chuck Lever III wrote:
>>>> Or, nfsd_courtesy_client_count() could return
>>>> nfsd_couresy_client_count. nfsd_courtesy_client_scan() could
>>>> then look something like this:
>>>>
>>>>	if ((sc->gfp_mask & GFP_KERNEL) != GFP_KERNEL)
>>>>		return SHRINK_STOP;
>>>>
>>>>	nfsd_get_client_reaplist(nn, reaplist, lt);
>>>>	list_for_each_safe(pos, next, &reaplist) {
>>>>		clp = list_entry(pos, struct nfs4_client, cl_lru);
>>>>		trace_nfsd_clid_purged(&clp->cl_clientid);
>>>>		list_del_init(&clp->cl_lru);
>>>>		expire_client(clp);
>>>>		count++;
>>>>	}
>>>>	return count;
>>>
>>> This does not work; we cannot expire clients in the context of
>>> the scan callback due to a deadlock problem.
>>
>> Correct, we don't want to start shrinker laundromat activity if
>> the allocation request specified that it cannot wait. But are
>> you sure it doesn't work if sc_gfp_flags is set to GFP_KERNEL?
>
> It can deadlock due to a lock ordering problem:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.19.0-rc2_sk+ #1 Not tainted
> ------------------------------------------------------
> lck/31847 is trying to acquire lock:
> ffff88811d268850 (&sb->s_type->i_mutex_key#16){+.+.}-{3:3}, at: btrfs_inode_lock+0x38/0x70
>
> but task is already holding lock:
> ffffffffb41848c0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0x506/0x1db0
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (fs_reclaim){+.+.}-{0:0}:
>        fs_reclaim_acquire+0xc0/0x100
>        __kmalloc+0x51/0x320
>        btrfs_buffered_write+0x2eb/0xd90
>        btrfs_do_write_iter+0x6bf/0x11c0
>        do_iter_readv_writev+0x2bb/0x5a0
>        do_iter_write+0x131/0x630
>        nfsd_vfs_write+0x4da/0x1900 [nfsd]
>        nfsd4_write+0x2ac/0x760 [nfsd]
>        nfsd4_proc_compound+0xce8/0x23e0 [nfsd]
>        nfsd_dispatch+0x4ed/0xc10 [nfsd]
>        svc_process_common+0xd3f/0x1b00 [sunrpc]
>        svc_process+0x361/0x4f0 [sunrpc]
>        nfsd+0x2d6/0x570 [nfsd]
>        kthread+0x2a1/0x340
>        ret_from_fork+0x22/0x30
>
> -> #0 (&sb->s_type->i_mutex_key#16){+.+.}-{3:3}:
>        __lock_acquire+0x318d/0x7830
>        lock_acquire+0x1bb/0x500
>        down_write+0x82/0x130
>        btrfs_inode_lock+0x38/0x70
>        btrfs_sync_file+0x280/0x1010
>        nfsd_file_flush.isra.0+0x1b/0x220 [nfsd]
>        nfsd_file_put+0xd4/0x110 [nfsd]
>        release_all_access+0x13a/0x220 [nfsd]
>        nfs4_free_ol_stateid+0x40/0x90 [nfsd]
>        free_ol_stateid_reaplist+0x131/0x210 [nfsd]
>        release_openowner+0xf7/0x160 [nfsd]
>        __destroy_client+0x3cc/0x740 [nfsd]
>        nfsd_cc_lru_scan+0x271/0x410 [nfsd]
>        shrink_slab.constprop.0+0x31e/0x7d0
>        shrink_node+0x54b/0xe50
>        try_to_free_pages+0x394/0xba0
>        __alloc_pages_slowpath.constprop.0+0x5d2/0x1db0
>        __alloc_pages+0x4d6/0x580
>        __handle_mm_fault+0xc25/0x2810
>        handle_mm_fault+0x136/0x480
>        do_user_addr_fault+0x3d8/0xec0
>        exc_page_fault+0x5d/0xc0
>        asm_exc_page_fault+0x27/0x30

Is this deadlock possible only via the filecache close paths?

I can't think of a reason the laundromat has to wait for each file
to be flushed and closed -- the laundromat should be able to "put"
these files and allow the filecache LRU to flush and close in the
background.

--
Chuck Lever
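
[Editorial note: a minimal sketch of the direction discussed above, not the
patch under review. It assumes, for illustration only, nfsd_net fields named
nfsd_client_shrinker, nfsd_courtesy_clients, and laundromat_work, plus the
laundry_wq workqueue; the count callback only reports how many courtesy
clients are reclaimable, and the scan callback never calls expire_client()
itself but kicks deferred work, so client teardown (and any nfsd_file
flush/close it triggers) happens outside the fs_reclaim context shown in the
lockdep chain.]

	/*
	 * Illustrative sketch only -- assumes fs/nfsd/nfs4state.c context
	 * ("nfsd.h", "netns.h", laundry_wq) and hypothetical field names.
	 */
	static unsigned long
	nfsd_courtesy_client_count(struct shrinker *shrink,
				   struct shrink_control *sc)
	{
		struct nfsd_net *nn = container_of(shrink, struct nfsd_net,
						   nfsd_client_shrinker);

		/* Report how many courtesy clients could be reclaimed. */
		return atomic_read(&nn->nfsd_courtesy_clients);
	}

	static unsigned long
	nfsd_courtesy_client_scan(struct shrinker *shrink,
				  struct shrink_control *sc)
	{
		struct nfsd_net *nn = container_of(shrink, struct nfsd_net,
						   nfsd_client_shrinker);

		/*
		 * Expiring clients here could take inode locks while
		 * fs_reclaim is already held (see the lockdep chain above).
		 * Defer the real work to the laundromat instead, and tell
		 * the shrinker nothing was freed in this call.
		 */
		mod_delayed_work(laundry_wq, &nn->laundromat_work, 0);
		return SHRINK_STOP;
	}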