Re: A process killed while opening a file can result in leaked open handle on the server

Ronnie Sahlberg <lsahlber@xxxxxxxxxx> · Wed, 13 Nov 2019 20:39:44 -0500 (EST)

----- Original Message -----
> From: "Frank Sorenson" <sorenson@xxxxxxxxxx>
> To: "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>, "Pavel Shilovsky" <piastryyy@xxxxxxxxx>
> Cc: "linux-cifs" <linux-cifs@xxxxxxxxxxxxxxx>
> Sent: Thursday, 14 November, 2019 8:15:46 AM
> Subject: Re: A process killed while opening a file can result in leaked open handle on the server
> 
> On 11/13/19 12:49 AM, Ronnie Sahlberg wrote:
> > Steve, Pavel
> > 
> > This patch goes on top of Pavel's patch.
> > Maybe it should be merged with Pavel's patch since his patch changes from
> > "we only send a close() on an interrupted open()"
> > to now "we send a close() on either interrupted open() or interrupted
> > close()" so both comments as well as log messages are updated.
> > 
> > Additionally it adds logging of the MID that failed in the case of an
> > interrupted Open() so that it is easy to find it in wireshark
> > and check whether that smb2 file handle was indeed handled by a SMB_Close()
> > or not.
> > 
> > 
> > From testing it appears Pavel's patch works. When the close() is interrupted
> > we don't leak handles as far as I can tell.
> > We do have a leak in the Open() case though and it seems that even though we
> > set things up and flag the MID to be cancelled we actually never end up
> > calling smb2_cancelled_close_fid() and thus we never send a SMB2_Close().
> > I haven't found the root cause yet but I suspect we mess up mid flags or
> > state somewhere.
> > 
> > 
> > It did work in the past though when Sachin provided the initial
> > implementation so we have regressed I think.
> > I have added a new test 'cifs/102' to the buildbot that checks for this
> > but have not integrated it into the cifs-testing run yet since we still fail
> > this test.
> > At least we will not have further regressions once we fix this and enable
> > the test in the future.
> > 
> > ronnie s
> 
> The patches do indeed improve it significantly.
> 
> I'm still seeing some leak as well, and I'm removing ratelimiting so
> that I can see what the added debugging is trying to tell us.  I'll
> report if I find more details.
> 

We are making progress.
Can you test whether this patch improves things further for you?
It fixes most, but not all, of the leaks I see for an interrupted open():

I will post this to the list too as a separate mail/patch.


Author: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
Date:   Thu Nov 14 11:23:06 2019 +1000

    cifs: fix race between compound_send_recv() and the demultiplex thread
    
    There is a race where the open() may be interrupted after we receive the reply
    but before we have invoked the callback, in which case we never end up calling
    handle_cancelled_mid() and thus leak an open handle on the server.
    
    Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index ccaa8bad336f..802604a7e692 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1223,7 +1223,6 @@ cifs_demultiplex_thread(void *p)
                        if (mids[i] != NULL) {
                                mids[i]->resp_buf_size = server->pdu_size;
                                if ((mids[i]->mid_flags & MID_WAIT_CANCELLED) &&
-                                   mids[i]->mid_state == MID_RESPONSE_RECEIVED &&
                                    server->ops->handle_cancelled_mid)
                                        server->ops->handle_cancelled_mid(
                                                        mids[i]->resp_buf,
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index ca3de62688d6..28018a7eccb2 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -1119,7 +1119,8 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
                                 midQ[i]->mid, le16_to_cpu(midQ[i]->command));
                        send_cancel(server, &rqst[i], midQ[i]);
                        spin_lock(&GlobalMid_Lock);
-                       if (midQ[i]->mid_state == MID_REQUEST_SUBMITTED) {
+                       if (midQ[i]->mid_state == MID_REQUEST_SUBMITTED ||
+                           midQ[i]->mid_state == MID_RESPONSE_RECEIVED) {
                                midQ[i]->mid_flags |= MID_WAIT_CANCELLED;
                                midQ[i]->callback = cifs_cancelled_callback;
                                cancelled_mid[i] = true;
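
To make the interleaving from the commit message easier to follow, here is a
small user-space model of it. This is only a sketch, not kernel code: the
struct mid_model type and the helpers waiter_interrupted(),
demux_handles_cancel() and replay_race() are made up for illustration, and the
numeric values of the state and flag names are placeholders. It replays just
one ordering (reply received, then the waiter is interrupted, then the
demultiplex side makes its check) and reports whether the cancelled-mid
handling would run.

```c
#include <stdbool.h>

/* Names mirror the patch; the numeric values are placeholders only. */
enum mid_state { MID_REQUEST_SUBMITTED, MID_RESPONSE_RECEIVED };
#define MID_WAIT_CANCELLED 1u

/* Hypothetical stand-in for the two mid fields the patch looks at. */
struct mid_model {
	enum mid_state mid_state;
	unsigned int mid_flags;
};

/* Interrupted waiter: decide whether to flag the mid as cancelled. */
static void waiter_interrupted(struct mid_model *mid, bool patched)
{
	bool eligible = mid->mid_state == MID_REQUEST_SUBMITTED;

	/* The patch also accepts a mid whose reply has already arrived. */
	if (patched)
		eligible = eligible || mid->mid_state == MID_RESPONSE_RECEIVED;
	if (eligible)
		mid->mid_flags |= MID_WAIT_CANCELLED;
}

/* Demultiplex side: would handle_cancelled_mid() be invoked? */
static bool demux_handles_cancel(const struct mid_model *mid, bool patched)
{
	bool flagged = (mid->mid_flags & MID_WAIT_CANCELLED) != 0;

	/* The patch drops the extra mid_state comparison. */
	if (patched)
		return flagged;
	return flagged && mid->mid_state == MID_RESPONSE_RECEIVED;
}

/* Replay the race; true means the close path runs and nothing leaks. */
static bool replay_race(bool patched)
{
	struct mid_model mid = { MID_REQUEST_SUBMITTED, 0 };

	mid.mid_state = MID_RESPONSE_RECEIVED;	/* reply arrives first */
	waiter_interrupted(&mid, patched);	/* then open() is interrupted */
	return demux_handles_cancel(&mid, patched);
}
```

In this model replay_race(false) returns false (the flag is never set, so the
handle would be leaked) and replay_race(true) returns true. It says nothing
about the remaining leaks, which presumably involve other orderings.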





> 
> Thanks for the help.
> 
> 
> Frank
> 
>