On Mon, 2024-11-04 at 09:22 -0500, Jeff Layton wrote:
> On Mon, 2024-11-04 at 21:54 +0800, yangerkun wrote:
> > 
> > On 2024/11/4 21:42, Jeff Layton wrote:
> > > On Mon, 2024-11-04 at 16:39 +0800, Yang Erkun wrote:
> > > > From: Yang Erkun <yangerkun@xxxxxxxxxx>
> > > > 
> > > > A forced umount (umount -f) will attempt to kill all rpc_tasks,
> > > > even though the umount operation may ultimately fail if some
> > > > files remain open. Consequently, a process that attempts to open
> > > > a file can end up sending two rpc_tasks to the NFS server.
> > > > 
> > > > NFS CLIENT
> > > > thread1                                  thread2
> > > > open("file")
> > > > ...
> > > > nfs4_do_open
> > > >  _nfs4_do_open
> > > >   _nfs4_open_and_get_state
> > > >    _nfs4_proc_open
> > > >     nfs4_run_open_task
> > > >      /* rpc_task1 */
> > > >      rpc_run_task
> > > >      rpc_wait_for_completion_task
> > > > 
> > > >                                          umount -f
> > > >                                           nfs_umount_begin
> > > >                                            rpc_killall_tasks
> > > >                                             rpc_signal_task
> > > > 
> > > >      rpc_task1 is woken up
> > > >      and returns -512
> > > >  _nfs4_do_open // while loop
> > > >     ...
> > > >     nfs4_run_open_task
> > > >      /* rpc_task2 */
> > > >      rpc_run_task
> > > >      rpc_wait_for_completion_task
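To make the client-side failure mode concrete: the sketch below is a
toy, compilable model of the retry loop in the diagram above, not the
real fs/nfs code. All names in it (run_open_task, do_open,
open_rpcs_seen_by_server) are invented for illustration; the only point
it demonstrates is that a signaled RPC returning -ERESTARTSYS (-512)
makes the open path loop and put a second OPEN on the wire while the
server may still be processing the first.

#include <stdio.h>

#define ERESTARTSYS 512

static int open_rpcs_seen_by_server;

/*
 * Toy stand-in for nfs4_run_open_task(). The first call pretends the
 * task was killed by rpc_killall_tasks() during "umount -f" and
 * returns -ERESTARTSYS (-512); the retry completes normally. Every
 * call counts as one OPEN on the wire.
 */
static int run_open_task(void)
{
	static int calls;

	open_rpcs_seen_by_server++;
	return (calls++ == 0) ? -ERESTARTSYS : 0;
}

/* Toy version of the _nfs4_do_open() retry loop from the diagram. */
static int do_open(void)
{
	int err;

	do {
		err = run_open_task();
	} while (err == -ERESTARTSYS);
	return err;
}

int main(void)
{
	do_open();
	/* Prints 2: a single open("file") produced two OPEN RPCs,
	 * both using the same (possibly unconfirmed) openowner. */
	printf("OPEN RPCs sent: %d\n", open_rpcs_seen_by_server);
	return 0;
}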
> > > > 
> > > > While processing an open request, nfsd will first attempt to
> > > > find or allocate an nfs4_openowner. If it finds an
> > > > nfs4_openowner that is not marked NFS4_OO_CONFIRMED, that
> > > > nfs4_openowner will be released. Since two rpc_tasks from the
> > > > client can attempt to open the same file simultaneously, and two
> > > > nfsd threads can process them concurrently, this situation can
> > > > lead to memory leaks. Additionally, when we echo 0 to
> > > > /proc/fs/nfsd/threads, a warning will be triggered.
> > > > 
> > > > NFS SERVER
> > > > nfsd1                 nfsd2           echo 0 > /proc/fs/nfsd/threads
> > > > 
> > > > nfsd4_open
> > > >  nfsd4_process_open1
> > > >   find_or_alloc_open_stateowner
> > > >    // alloc oo1, stateid1
> > > >                       nfsd4_open
> > > >                        nfsd4_process_open1
> > > >                         find_or_alloc_open_stateowner
> > > >                         // find oo1, without NFS4_OO_CONFIRMED
> > > >                          release_openowner
> > > >                           unhash_openowner_locked
> > > >                            list_del_init(&oo->oo_perclient)
> > > >                            // cannot find this oo
> > > >                            // from client, LEAK!!!
> > > >                          alloc_stateowner // alloc oo2
> > > > 
> > > >  nfsd4_process_open2
> > > >   init_open_stateid
> > > >    // associate oo1
> > > >    // with stateid1, stateid1 LEAK!!!
> > > >   nfs4_get_vfs_file
> > > >    // alloc nfsd_file1 and nfsd_file_mark1
> > > >    // all LEAK!!!
> > > >                       nfsd4_process_open2
> > > >                       ...
> > > > 
> > > >                                       write_threads
> > > >                                        ...
> > > >                                        nfsd_destroy_serv
> > > >                                         nfsd_shutdown_net
> > > >                                          nfs4_state_shutdown_net
> > > >                                           nfs4_state_destroy_net
> > > >                                            destroy_client
> > > >                                             __destroy_client
> > > >                                             // won't find oo1!!!
> > > >                                        nfsd_shutdown_generic
> > > >                                         nfsd_file_cache_shutdown
> > > >                                          kmem_cache_destroy
> > > >                                          for nfsd_file_slab
> > > >                                          and nfsd_file_mark_slab
> > > >                                          // bark since nfsd_file1
> > > >                                          // and nfsd_file_mark1
> > > >                                          // still alive
> > > > 
> > > > =======================================================================
> > > > BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on
> > > > __kmem_cache_shutdown()
> > > > -----------------------------------------------------------------------
> > > > 
> > > > Slab 0xffd4000004438a80 objects=34 used=1 fp=0xff11000110e2ad28
> > > > flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
> > > > CPU: 4 UID: 0 PID: 757 Comm: sh Not tainted 6.12.0-rc6+ #19
> > > > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > > > 1.16.1-2.fc37 04/01/2014
> > > > Call Trace:
> > > >  <TASK>
> > > >  dump_stack_lvl+0x53/0x70
> > > >  slab_err+0xb0/0xf0
> > > >  __kmem_cache_shutdown+0x15c/0x310
> > > >  kmem_cache_destroy+0x66/0x160
> > > >  nfsd_file_cache_shutdown+0xac/0x210 [nfsd]
> > > >  nfsd_destroy_serv+0x251/0x2a0 [nfsd]
> > > >  nfsd_svc+0x125/0x1e0 [nfsd]
> > > >  write_threads+0x16a/0x2a0 [nfsd]
> > > >  nfsctl_transaction_write+0x74/0xa0 [nfsd]
> > > >  vfs_write+0x1ae/0x6d0
> > > >  ksys_write+0xc1/0x160
> > > >  do_syscall_64+0x5f/0x170
> > > >  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> > > > 
> > > > Disabling lock debugging due to kernel taint
> > > > Object 0xff11000110e2ac38 @offset=3128
> > > > Allocated in nfsd_file_do_acquire+0x20f/0xa30 [nfsd] age=1635
> > > > cpu=3 pid=800
> > > >  nfsd_file_do_acquire+0x20f/0xa30 [nfsd]
> > > >  nfsd_file_acquire_opened+0x5f/0x90 [nfsd]
> > > >  nfs4_get_vfs_file+0x4c9/0x570 [nfsd]
> > > >  nfsd4_process_open2+0x713/0x1070 [nfsd]
> > > >  nfsd4_open+0x74b/0x8b0 [nfsd]
> > > >  nfsd4_proc_compound+0x70b/0xc20 [nfsd]
> > > >  nfsd_dispatch+0x1b4/0x3a0 [nfsd]
> > > >  svc_process_common+0x5b8/0xc50 [sunrpc]
> > > >  svc_process+0x2ab/0x3b0 [sunrpc]
> > > >  svc_handle_xprt+0x681/0xa20 [sunrpc]
> > > >  nfsd+0x183/0x220 [nfsd]
> > > >  kthread+0x199/0x1e0
> > > >  ret_from_fork+0x31/0x60
> > > >  ret_from_fork_asm+0x1a/0x30
> > > > 
> > > > Add nfs4_openowner_unhashed() to help detect an unhashed
> > > > nfs4_openowner, and bail out of the nfsd4_open process to fix
> > > > this problem.
> > > > 
> > > > Cc: stable@xxxxxxxxxxxxxxx # 2.6
> > > > Signed-off-by: Yang Erkun <yangerkun@xxxxxxxxxx>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 17 +++++++++++++++++
> > > >  1 file changed, 17 insertions(+)
> > > > 
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 551d2958ec29..d3b5321d02a5 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -1660,6 +1660,12 @@ static void release_open_stateid(struct nfs4_ol_stateid *stp)
> > > >  	free_ol_stateid_reaplist(&reaplist);
> > > >  }
> > > >  
> > > > +static bool nfs4_openowner_unhashed(struct nfs4_openowner *oo)
> > > > +{
> > > > +	return list_empty(&oo->oo_owner.so_strhash) &&
> > > > +	       list_empty(&oo->oo_perclient);
> > > > +}
> > > > +
> > > >  static void unhash_openowner_locked(struct nfs4_openowner *oo)
> > > >  {
> > > >  	struct nfs4_client *clp = oo->oo_owner.so_client;
> > > > @@ -4975,6 +4981,12 @@ init_open_stateid(struct nfs4_file *fp, struct nfsd4_open *open)
> > > >  	spin_lock(&oo->oo_owner.so_client->cl_lock);
> > > >  	spin_lock(&fp->fi_lock);
> > > >  
> > > > +	if (nfs4_openowner_unhashed(oo)) {
> > > > +		mutex_unlock(&stp->st_mutex);
> > > > +		stp = NULL;
> > > > +		goto out_unlock;
> > > > +	}
> > > > +
> > > >  	retstp = nfsd4_find_existing_open(fp, open);
> > > >  	if (retstp)
> > > >  		goto out_unlock;
> > > > @@ -6127,6 +6139,11 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> > > >  
> > > >  	if (!stp) {
> > > >  		stp = init_open_stateid(fp, open);
> > > > +		if (!stp) {
> > > > +			status = nfserr_jukebox;
> > > > +			goto out;
> > > > +		}
> > > > +
> > > >  		if (!open->op_stp)
> > > >  			new_stp = true;
> > > >  	}
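A side note on why the new helper can detect this at all:
unhash_openowner_locked() removes the owner with list_del_init(), which
leaves each removed entry self-linked, so a later list_empty() check
returns true; and init_open_stateid() performs the check under the
client's cl_lock, the same lock the unhash path holds. The following is
a minimal userspace mock of that invariant, with simplified stand-ins
for the kernel's list helpers and types (mock_openowner and friends are
invented for this sketch; this is not the kernel implementation).

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's circular list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* list_del_init(): unlink the entry and re-point it at itself. */
static void list_del_init(struct list_head *e)
{
	e->next->prev = e->prev;
	e->prev->next = e->next;
	INIT_LIST_HEAD(e);
}

/* Mock openowner carrying only the two links the helper inspects. */
struct mock_openowner {
	struct list_head so_strhash;   /* global stateowner hash */
	struct list_head oo_perclient; /* the client's openowner list */
};

/* Same shape as the patch's helper: unhashed == off both lists. */
static bool mock_openowner_unhashed(struct mock_openowner *oo)
{
	return list_empty(&oo->so_strhash) &&
	       list_empty(&oo->oo_perclient);
}

int main(void)
{
	struct list_head strhash, perclient;
	struct mock_openowner oo;

	INIT_LIST_HEAD(&strhash);
	INIT_LIST_HEAD(&perclient);

	/* find_or_alloc path: the new owner is hashed into both lists. */
	list_add(&oo.so_strhash, &strhash);
	list_add(&oo.oo_perclient, &perclient);
	assert(!mock_openowner_unhashed(&oo));

	/* Concurrent release_openowner(): unhash drops both links. */
	list_del_init(&oo.so_strhash);
	list_del_init(&oo.oo_perclient);

	/* init_open_stateid() must now refuse to bind a stateid to oo. */
	assert(mock_openowner_unhashed(&oo));
	printf("unhashed owner detected\n");
	return 0;
}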
> > > > 
> > > 
> > Thanks a lot for your review!
> > 
> > > First, this should only be a problem with v4.0 mounts. v4.1+
> > > openowners are always considered CONFIRMED.
> > 
> > Yes, v4.1+ will always be confirmed.
> > 
> > > That does look like a real bug, and your fix seems like it would
> > > fix it, but I can't help but wonder if the better fix is to change
> > > how nfsd handles the case of an unconfirmed openowner in
> > > find_or_alloc_open_stateowner().
> > > 
> > > I'd have to go back and read the v4.0 spec again, but maybe it's
> > > possible to just keep the same stateowner instead of replacing it
> > > in that function? It's not clear to me why unconfirmed owners are
> > > discarded there.
> > 
> > Aha, it would be great if we could keep this owner alive instead of
> > discarding it!
> > 
> > For the normal NFSv4.0 case this won't happen, since the second open
> > rpc_task will sleep until the first rpc_task has finished. For the
> > abnormal case above, after this patch we discard the first owner,
> > but the second owner will keep going and finish the open. Based on
> > this, I wrote this patch...
> > 
> > If there's anything wrong with this idea, please be sure to point it
> > out!
> 
> I think the deal here is that with v4.0, the client is required to
> serialize opens using the same openowner, at least until that
> openowner is confirmed. The reason is that an unconfirmed openowner is
> associated with the specific stateid under which it was created, and
> the seqid in the OPEN_CONFIRM must match it.
> 
> RFC 7530, section 16.18.5 says:
> 
>     Second, the client sends another OPEN request with a sequence id
>     that is incorrect for the open_owner4 (out of sequence).  In this
>     case, the server assumes the second OPEN request is valid and the
>     first one is a replay.  The server cancels the OPEN state of the
>     first OPEN request, establishes an unconfirmed OPEN state for the
>     second OPEN request, and responds to the second OPEN request with
>     an indication that an OPEN_CONFIRM is needed.
> 
> ...so I think we are required to abort the old openowner in this case.

To follow up, RFC 7530, section 9.1.7 makes this very clear:

    Note that for requests that contain a sequence number, for each
    state-owner, there should be no more than one outstanding request.

So, if a v4.0 client is sending concurrent seqid-morphing requests for
the same stateowner, then it's misbehaving. I still think we need to
guard against this situation for the reasons you outlined. Your v2
patch is probably the best we can do here.
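To make the seqid rule concrete, here is an illustrative model of the
bookkeeping that sections 9.1.7 and 16.18.5 imply. The names are
invented for this sketch and it is not nfsd's actual replay handling;
the out-of-sequence arm corresponds to the case quoted above, where the
server assumes the earlier OPEN is a replay and cancels its unconfirmed
state.

#include <stdint.h>
#include <stdio.h>

/* Per-state-owner bookkeeping in this toy model. */
struct state_owner {
	uint32_t last_seqid; /* seqid of the last processed request */
};

enum seqid_disposition {
	SEQID_REPLAY,     /* same seqid: retransmission, replay reply */
	SEQID_IN_ORDER,   /* last + 1: the one permitted next request */
	SEQID_OUT_OF_SEQ, /* anything else: out of sequence */
};

static enum seqid_disposition check_seqid(const struct state_owner *so,
					  uint32_t seqid)
{
	if (seqid == so->last_seqid)
		return SEQID_REPLAY;
	if (seqid == so->last_seqid + 1)
		return SEQID_IN_ORDER;
	return SEQID_OUT_OF_SEQ;
}

int main(void)
{
	struct state_owner so = { .last_seqid = 7 };

	/* Only one request per owner may be outstanding: 8 is next. */
	printf("seqid 8: %s\n",
	       check_seqid(&so, 8) == SEQID_IN_ORDER ? "in order" : "?");
	/* Resending 7 is treated as a retransmission. */
	printf("seqid 7: %s\n",
	       check_seqid(&so, 7) == SEQID_REPLAY ? "replay" : "?");
	/* 9 while 8 is outstanding is out of sequence; for an
	 * unconfirmed openowner the server cancels the old open state
	 * (the release_openowner() path the patch has to guard). */
	printf("seqid 9: %s\n",
	       check_seqid(&so, 9) == SEQID_OUT_OF_SEQ
	       ? "out of sequence" : "?");
	return 0;
}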
> It seems, though, like the client isn't serializing OPENs correctly?
> It's spamming the server with multiple OPEN requests for an
> unconfirmed openowner. Then again, maybe the client just forgot an
> earlier, confirmed openowner and now it's just starting to try to use
> it again and isn't expecting it to need confirmation?
> 
> How did you reproduce this? Were you using the Linux NFS client or
> something else?
> 
> In any case, I suspect your v2 fix is probably what we'll need...
> -- 
> Jeff Layton <jlayton@xxxxxxxxxx>

-- 
Jeff Layton <jlayton@xxxxxxxxxx>