Re: A process killed while opening a file can result in leaked open handle on the server

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Steve, Pavel

This patch goes ontop of Pavels patch.
Maybe it should be merged with Pavels patch since his patch changes from "we only send a close() on an interrupted open()"
to now "we send a close() on either interrupted open() or interrupted close()" so both comments as well as log messages are updates.

Additionally it adds logging of the MID that failed in the case of an interrupted Open() so that it is easy to find it in wireshark
and check whether that smb2 file handle was indeed handles by a SMB_Close() or not.


>From testing it appears Pavels patch works. When the close() is interrupted we don't leak handles as far as I can tell.
We do have a leak in the Open() case though and it seems that eventhough we set things up and flags the MID to be cancelled we actually never end up
calling smb2_cancelled_close_fid() and thus we never send a SMB2_Close().
I haven't found the root cause yet but I suspect we mess up mid flags or state somewhere.


It did work in the past though when Sachin provided the initial implementation so we have regressed I think.
I have added a new test 'cifs/102'  to the buildbot that checks for this but have not integrated into the cifs-testing run yet since we still fail this test.
At least we will not have further regressions once we fix this and enable the test in the future.

ronnie s




----- Original Message -----
From: "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>
To: "Pavel Shilovsky" <piastryyy@xxxxxxxxx>
Cc: "Frank Sorenson" <sorenson@xxxxxxxxxx>, "linux-cifs" <linux-cifs@xxxxxxxxxxxxxxx>
Sent: Wednesday, 13 November, 2019 3:19:11 PM
Subject: Re: A process killed while opening a file can result in leaked open handle on the server





----- Original Message -----
> From: "Pavel Shilovsky" <piastryyy@xxxxxxxxx>
> To: "Frank Sorenson" <sorenson@xxxxxxxxxx>
> Cc: "linux-cifs" <linux-cifs@xxxxxxxxxxxxxxx>
> Sent: Wednesday, 13 November, 2019 12:37:38 PM
> Subject: Re: A process killed while opening a file can result in leaked open handle on the server
> 
> вт, 12 нояб. 2019 г. в 11:34, Frank Sorenson <sorenson@xxxxxxxxxx>:
> >
> > If a process opening a file is killed while waiting for a SMB2_CREATE
> > response from the server, the response may not be handled by the client,
> > leaking an open file handle on the server.
> >
> 
> Thanks for reporting the problem.
> 
> > This can be reproduced with the following:
> >
> > # mount //vm3/user1 /mnt/vm3
> > -overs=3,sec=ntlmssp,credentials=/root/.user1_smb_creds
> > # cd /mnt/vm3
> > # echo foo > foo
> >
> > # for i in {1..100} ; do cat foo >/dev/null 2>&1 & sleep 0.0001 ; kill -9
> > $! ; done
> >
> > (increase count if necessary--100 appears sufficient to cause multiple
> > leaked file handles)
> >
> 
> This is a known problem and the client handles it by closing unmatched
> opens (the one that don't have a corresponding waiting thread)
> immediately. It is indicated by the message you observed: "Close
> unmatched open".
> 
> >
> > The client will stop waiting for the response, and will output
> > the following when the response does arrive:
> >
> >     CIFS VFS: Close unmatched open
> >
> > on the server side, an open file handle is leaked.  If using samba,
> > the following will show these open files:
> >
> >
> > # smbstatus | grep -i Locked -A1000
> > Locked files:
> > Pid          Uid        DenyMode   Access      R/W        Oplock
> > SharePath   Name   Time
> > --------------------------------------------------------------------------------------------------
> > 25936        501        DENY_NONE  0x80        RDONLY     NONE
> > /home/user1   .   Tue Nov 12 12:29:24 2019
> > 25936        501        DENY_NONE  0x80        RDONLY     NONE
> > /home/user1   .   Tue Nov 12 12:29:24 2019
> > 25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)
> > /home/user1   foo   Tue Nov 12 12:29:24 2019
> > 25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)
> > /home/user1   foo   Tue Nov 12 12:29:24 2019
> > 25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)
> > /home/user1   foo   Tue Nov 12 12:29:24 2019
> > 25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)
> > /home/user1   foo   Tue Nov 12 12:29:24 2019
> > 25936        501        DENY_NONE  0x120089    RDONLY     LEASE(RWH)
> > /home/user1   foo   Tue Nov 12 12:29:24 2019
> >
> 
> I tried it locally myself, it seems that we have another issue related
> to interrupted close requests:
> 
> [1656476.740311] fs/cifs/file.c: Flush inode 0000000078729371 file
> 000000004d5f9348 rc 0
> [1656476.740313] fs/cifs/file.c: closing last open instance for inode
> 0000000078729371
> [1656476.740314] fs/cifs/file.c: CIFS VFS: in _cifsFileInfo_put as
> Xid: 457 with uid: 1000
> [1656476.740315] fs/cifs/smb2pdu.c: Close
> [1656476.740317] fs/cifs/transport.c: signal is pending before sending any
> data
> 
> This will return -512 error to the upper layer but we do not save a
> problematic handle somewhere to close it later. op->release() error
> code is not being checked by VFS anyway.
> 
> We should do the similar thing as we do today for cancelled mids:
> allocate "struct close_cancelled_open" and queue the lazy work to
> close the handle.
> 
> I attached the patch implementing this idea. The patch handles the
> interrupt errors in smb2_close_file which is called during close
> system call. It fixed the problem in my setup. In the same time I
> think I should probably move the handling to PDU layer function
> SMB2_close() to handle *all* interrupted close requests including ones
> closing temporary handles. I am wondering if anyone has thoughts about
> it. Anyway, review of the patch is highly appreciated.

I think your patch makes it better. I seem Close() that your patch now fixes the leak for.


But there is still a leak in Open().
I just got one instance where I leaked one handle.

The log shows :
CIFS VFS: \\192.168.124.207 Cancelling wait for mid 25 cmd: 5

and the wireshark shows an Open() request/response for this MID but we never
send a Close() for the handle.



> 
> @Frank, could you please test the patch in your environment?
> 
> --
> Best regards,
> Pavel Shilovsky
> 

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index d78bfcc19156..39a9165f9bba 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1551,6 +1551,8 @@ struct close_cancelled_open {
 	struct cifs_fid         fid;
 	struct cifs_tcon        *tcon;
 	struct work_struct      work;
+	__u64 mid;
+	__u16 cmd;
 };
 
 /*	Make code in transport.c a little cleaner by moving
diff --git a/fs/cifs/smb2misc.c b/fs/cifs/smb2misc.c
index ba1e5754239c..1fb9932d3dae 100644
--- a/fs/cifs/smb2misc.c
+++ b/fs/cifs/smb2misc.c
@@ -735,19 +735,36 @@ smb2_cancelled_close_fid(struct work_struct *work)
 {
 	struct close_cancelled_open *cancelled = container_of(work,
 					struct close_cancelled_open, work);
+	struct cifs_tcon *tcon = cancelled->tcon;
+	int rc;
 
-	cifs_dbg(VFS, "Close unmatched open\n");
+	if (cancelled->mid)
+		cifs_tcon_dbg(VFS, "Close unmatched open for MID:%llx\n",
+			      cancelled->mid);
+	else
+		cifs_tcon_dbg(VFS, "Close interrupted close\n");
 
-	SMB2_close(0, cancelled->tcon, cancelled->fid.persistent_fid,
-		   cancelled->fid.volatile_fid);
-	cifs_put_tcon(cancelled->tcon);
+	rc = SMB2_close(0, tcon, cancelled->fid.persistent_fid,
+			cancelled->fid.volatile_fid);
+	if (rc) {
+		cifs_tcon_dbg(VFS, "Close cancelled mid failed with rc:%d\n", rc);
+	}
+
+	cifs_put_tcon(tcon);
 	kfree(cancelled);
 }
 
-/* Caller should already has an extra reference to @tcon */
+/*
+ * Caller should already has an extra reference to @tcon
+ * This function is used to queue work to close a handle to prevent leaks
+ * on the server.
+ * We handle two cases. If an open was interrupted after we sent the
+ * SMB2_CREATE to the server but before we processed the reply, and second
+ * if a close was interrupted before we sent the SMB2_CLOSE to the server.
+ */
 static int
-__smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid,
-			      __u64 volatile_fid)
+__smb2_handle_cancelled_cmd(struct cifs_tcon *tcon, __u16 cmd, __u64 mid,
+			    __u64 persistent_fid, __u64 volatile_fid)
 {
 	struct close_cancelled_open *cancelled;
 
@@ -758,6 +775,8 @@ __smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid,
 	cancelled->fid.persistent_fid = persistent_fid;
 	cancelled->fid.volatile_fid = volatile_fid;
 	cancelled->tcon = tcon;
+	cancelled->cmd = cmd;
+	cancelled->mid = mid;
 	INIT_WORK(&cancelled->work, smb2_cancelled_close_fid);
 	WARN_ON(queue_work(cifsiod_wq, &cancelled->work) == false);
 
@@ -775,7 +794,8 @@ smb2_handle_cancelled_close(struct cifs_tcon *tcon, __u64 persistent_fid,
 	tcon->tc_count++;
 	spin_unlock(&cifs_tcp_ses_lock);
 
-	rc = __smb2_handle_cancelled_close(tcon, persistent_fid, volatile_fid);
+	rc = __smb2_handle_cancelled_cmd(tcon, SMB2_CLOSE_HE, 0,
+					 persistent_fid, volatile_fid);
 	if (rc)
 		cifs_put_tcon(tcon);
 
@@ -799,8 +819,11 @@ smb2_handle_cancelled_mid(char *buffer, struct TCP_Server_Info *server)
 	if (!tcon)
 		return -ENOENT;
 
-	rc = __smb2_handle_cancelled_close(tcon, rsp->PersistentFileId,
-					   rsp->VolatileFileId);
+	rc = __smb2_handle_cancelled_cmd(tcon,
+					 le16_to_cpu(sync_hdr->Command),
+					 le64_to_cpu(sync_hdr->MessageId),
+					 rsp->PersistentFileId,
+					 rsp->VolatileFileId);
 	if (rc)
 		cifs_put_tcon(tcon);
 

[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux