On Tue, Nov 05, 2024 at 07:03:14PM +0800, Yang Erkun wrote:
> From: Yang Erkun <yangerkun@xxxxxxxxxx>
>
> A forced umount (umount -f) will attempt to kill all rpc_tasks, even
> though the umount operation may ultimately fail if some files remain
> open. Consequently, if a process attempts to open a file at the same
> time, the client can potentially send two rpc_tasks for the same OPEN
> to the NFS server.
>
> NFS CLIENT
> thread1                              thread2
> open("file")
> ...
> nfs4_do_open
>  _nfs4_do_open
>   _nfs4_open_and_get_state
>    _nfs4_proc_open
>     nfs4_run_open_task
>      /* rpc_task1 */
>      rpc_run_task
>      rpc_wait_for_completion_task
>
>                                      umount -f
>                                      nfs_umount_begin
>                                       rpc_killall_tasks
>                                        rpc_signal_task
>      rpc_task1 is woken up
>      and returns -512
>  _nfs4_do_open // while loop
>   ...
>    nfs4_run_open_task
>     /* rpc_task2 */
>     rpc_run_task
>     rpc_wait_for_completion_task
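To make the client-side loop concrete: once rpc_task1 is killed and
returns -512 (-ERESTARTSYS), the open path described above simply goes
around again and issues a second OPEN. A rough userspace model of that
shape (the helper names are stand-ins for illustration, not the actual
fs/nfs/nfs4proc.c code):

  #include <stdio.h>

  #define ERESTARTSYS 512               /* the -512 seen above */

  static int attempts;

  /* Stand-in for nfs4_run_open_task(): the first OPEN is "killed"
   * by the concurrent umount -f, the retry succeeds. */
  static int run_open_task(void)
  {
          return (++attempts == 1) ? -ERESTARTSYS : 0;
  }

  /* Models the while loop in _nfs4_do_open(): a killed rpc_task
   * does not abort the open(), it just provokes a second rpc_task. */
  static int do_open(void)
  {
          int err;

          do {
                  err = run_open_task();  /* one OPEN rpc_task per pass */
          } while (err == -ERESTARTSYS);
          return err;
  }

  int main(void)
  {
          int err = do_open();

          printf("open returned %d after %d rpc_tasks\n", err, attempts);
          return 0;
  }

This prints "open returned 0 after 2 rpc_tasks" -- two OPENs for a
single open(2), which is exactly what the server sees below.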
> While processing an open request, nfsd will first attempt to find or
> allocate an nfs4_openowner. If it finds an nfs4_openowner that is not
> marked NFS4_OO_CONFIRMED, that nfs4_openowner will be released. Since
> two rpc_tasks can attempt to open the same file simultaneously from
> the client, and because two nfsd threads can process them
> concurrently, this situation can lead to numerous memory leaks.
> Additionally, when we echo 0 to /proc/fs/nfsd/threads, a warning will
> be triggered.
>
> NFS SERVER
> nfsd1                nfsd2           echo 0 > /proc/fs/nfsd/threads
>
> nfsd4_open
>  nfsd4_process_open1
>   find_or_alloc_open_stateowner
>    // alloc oo1, stateid1
>                      nfsd4_open
>                       nfsd4_process_open1
>                        find_or_alloc_open_stateowner
>                        // find oo1, without NFS4_OO_CONFIRMED
>                         release_openowner
>                          unhash_openowner_locked
>                           list_del_init(&oo->oo_perclient)
>                           // cannot find this oo
>                           // from client, LEAK!!!
>                         alloc_stateowner // alloc oo2
>
>  nfsd4_process_open2
>   init_open_stateid
>   // associate oo1
>   // with stateid1, stateid1 LEAK!!!
>   nfs4_get_vfs_file
>   // alloc nfsd_file1 and nfsd_file_mark1
>   // all LEAK!!!
>
>                      nfsd4_process_open2
>                      ...
>
>                                      write_threads
>                                       ...
>                                       nfsd_destroy_serv
>                                        nfsd_shutdown_net
>                                         nfs4_state_shutdown_net
>                                          nfs4_state_destroy_net
>                                           destroy_client
>                                            __destroy_client
>                                            // won't find oo1!!!
>                                       nfsd_shutdown_generic
>                                        nfsd_file_cache_shutdown
>                                         kmem_cache_destroy
>                                         for nfsd_file_slab
>                                         and nfsd_file_mark_slab
>                                         // bark since nfsd_file1
>                                         // and nfsd_file_mark1
>                                         // still alive
>
> =======================================================================
> BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> __kmem_cache_shutdown()
> -----------------------------------------------------------------------
>
> Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
> flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
> CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.16.1-2.fc37 04/01/2014
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x53/0x70
>  slab_err+0xb0/0xf0
>  __kmem_cache_shutdown+0x15c/0x310
>  kmem_cache_destroy+0x66/0x160
>  nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
>  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
>  nfsd_svc+0x125/0x1e0 [nfsd]
>  write_threads+0x16a/0x2a0 [nfsd]
>  nfsctl_transaction_write+0x74/0xa0 [nfsd]
>  vfs_write+0x1ae/0x6d0
>  ksys_write+0xc1/0x160
>  do_syscall_64+0x5f/0x170
>  entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Disabling lock debugging due to kernel taint
> Object 0xff11000110e2ac38 @offset=3128
> Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635 cpu=3
> pid=800
>  nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
>  nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
>  nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
>  nfsd4_process_open2+0x713/0x1070 [nfsd]
>  nfsd4_open+0x74b/0x8b0 [nfsd]
>  nfsd4_proc_compound+0x70b/0xc20 [nfsd]
>  nfsd_dispatch+0x1b4/0x3a0 [nfsd]
>  svc_process_common+0x5b8/0xc50 [sunrpc]
>  svc_process+0x2ab/0x3b0 [sunrpc]
>  svc_handle_xprt+0x681/0xa20 [sunrpc]
>  nfsd+0x183/0x220 [nfsd]
>  kthread+0x199/0x1e0
>  ret_from_fork+0x31/0x60
>  ret_from_fork_asm+0x1a/0x30
>
> Add nfs4_openowner_unhashed() to help detect an unhashed
> nfs4_openowner, and bail out of the nfsd4_open process to fix this
> problem.
>
> Cc: stable@xxxxxxxxxxxxxxx # 2.6

Hi -

Questions about the "stable@" tag:

- You refer above to a leak of nfsd_file objects, but the nfsd_file
  cache was added in v5.4. Any thoughts about what might be leaked,
  if anything, in kernels earlier than v5.4?

- Have you tried applying this patch to LTS kernels?
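One note on the mechanism, since it is easy to miss in the diff below:
unhash_openowner_locked() removes the owner with list_del_init(),
which leaves each list_head pointing at itself, so list_empty() on
so_strhash and oo_perclient is a reliable "already unhashed" test.
A minimal userspace model of those list primitives (a simplification
for illustration, not the kernel's <linux/list.h>):

  #include <assert.h>
  #include <stdio.h>

  struct list_head { struct list_head *next, *prev; };

  static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }

  static void list_add(struct list_head *new, struct list_head *head)
  {
          new->next = head->next;
          new->prev = head;
          head->next->prev = new;
          head->next = new;
  }

  /* Deletes the entry and re-initializes it, so a later list_empty()
   * on the entry itself answers "has this been unhashed?" */
  static void list_del_init(struct list_head *entry)
  {
          entry->prev->next = entry->next;
          entry->next->prev = entry->prev;
          INIT_LIST_HEAD(entry);
  }

  static int list_empty(const struct list_head *h) { return h->next == h; }

  int main(void)
  {
          struct list_head strhash, owner;

          INIT_LIST_HEAD(&strhash);
          list_add(&owner, &strhash);   /* owner hashed at alloc time */
          assert(!list_empty(&owner));  /* nfs4_openowner_unhashed(): false */
          list_del_init(&owner);        /* what unhash_openowner_locked() does */
          assert(list_empty(&owner));   /* nfs4_openowner_unhashed(): true */
          printf("unhashed openowner detected via list_empty()\n");
          return 0;
  }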
> Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>
> Signed-off-by: Yang Erkun <yangerkun@xxxxxxxxxx>
> ---
>  fs/nfsd/nfs4state.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> v2->v3:
> 1. add Reviewed-by
> 2. add lockdep_assert_held in nfs4_openowner_unhashed
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 9ddb91d088ae..37888562b436 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1660,6 +1660,14 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
>  	free_ol_stateid_reaplist(&reaplist);
>  }
>  
> +static bool nfs4_openowner_unhashed(struct nfs4_openowner *oo)
> +{
> +	lockdep_assert_held(&oo->oo_owner.so_client->cl_lock);
> +
> +	return list_empty(&oo->oo_owner.so_strhash) &&
> +		list_empty(&oo->oo_perclient);
> +}
> +
>  static void unhash_openowner_locked(struct nfs4_openowner *oo)
>  {
>  	struct nfs4_client *clp = oo->oo_owner.so_client;
> @@ -4979,6 +4987,12 @@ init_open_stateid(struct nfs4_file *fp, struct nfsd4_open *open)
>  	spin_lock(&oo->oo_owner.so_client->cl_lock);
>  	spin_lock(&fp->fi_lock);
>  
> +	if (nfs4_openowner_unhashed(oo)) {
> +		mutex_unlock(&stp->st_mutex);
> +		stp = NULL;
> +		goto out_unlock;
> +	}
> +
>  	retstp = nfsd4_find_existing_open(fp, open);
>  	if (retstp)
>  		goto out_unlock;
> @@ -6131,6 +6145,11 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
>  
>  	if (!stp) {
>  		stp = init_open_stateid(fp, open);
> +		if (!stp) {
> +			status = nfserr_jukebox;
> +			goto out;
> +		}
> +
>  		if (!open->op_stp)
>  			new_stp = true;
>  	}
> --
> 2.39.2
>

-- 
Chuck Lever