RE: A process killed while opening a file can result in leaked open handle on the server

Pavel Shilovskiy <pshilov@xxxxxxxxxxxxx> · Fri, 15 Nov 2019 18:21:03 +0000

Wed, 13 Nov 2019 at 17:39, Ronnie Sahlberg <lsahlber@xxxxxxxxxx> wrote:
>
> ----- Original Message -----
> > From: "Frank Sorenson" <sorenson@xxxxxxxxxx>
> > To: "Ronnie Sahlberg" <lsahlber@xxxxxxxxxx>, "Pavel Shilovsky" <piastryyy@xxxxxxxxx>
> > Cc: "linux-cifs" <linux-cifs@xxxxxxxxxxxxxxx>
> > Sent: Thursday, 14 November, 2019 8:15:46 AM
> > Subject: Re: A process killed while opening a file can result in leaked open handle on the server
> >
> > On 11/13/19 12:49 AM, Ronnie Sahlberg wrote:
> > > Steve, Pavel
> > >
> > > This patch goes on top of Pavel's patch.
> > > Maybe it should be merged with Pavel's patch, since his patch changes from
> > > "we only send a close() on an interrupted open()"
> > > to now "we send a close() on either interrupted open() or interrupted
> > > close()", so both comments as well as log messages are updated.
> > >
> > > Additionally it adds logging of the MID that failed in the case of an
> > > interrupted Open() so that it is easy to find it in Wireshark
> > > and check whether that smb2 file handle was indeed handled by a SMB_Close()
> > > or not.
> > >
> > >
> > > From testing it appears Pavel's patch works. When the close() is interrupted
> > > we don't leak handles as far as I can tell.
> > > We do have a leak in the Open() case though, and it seems that even though we
> > > set things up and flag the MID to be cancelled we actually never end up
> > > calling smb2_cancelled_close_fid() and thus we never send a SMB2_Close().
> > > I haven't found the root cause yet but I suspect we mess up mid flags or
> > > state somewhere.
> > >
> > >
> > > It did work in the past though, when Sachin provided the initial
> > > implementation, so we have regressed I think.
> > > I have added a new test 'cifs/102' to the buildbot that checks for this
> > > but have not integrated it into the cifs-testing run yet since we still fail
> > > this test.
> > > At least we will not have further regressions once we fix this and enable
> > > the test in the future.
> > >
> > > ronnie s
> >
> > The patches do indeed improve it significantly.
> >
> > I'm still seeing some leak as well, and I'm removing ratelimiting so
> > that I can see what the added debugging is trying to tell us.  I'll
> > report if I find more details.
> >
>
> We are making progress.
> Can you test whether this patch improves things even more for you?
> It fixes most but not all the leaks I see for interrupted open():
>
> I will post this to the list too as a separate mail/patch.
>
>
> Author: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
> Date:   Thu Nov 14 11:23:06 2019 +1000
>
>     cifs: fix race between compound_send_recv() and the demultiplex thread
>
>     There is a race where the open() may be interrupted between when we receive the reply
>     but before we have invoked the callback in which case we never end up calling
>     handle_cancelled_mid() and thus leak an open handle on the server.
>
>     Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
>
> diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
> index ccaa8bad336f..802604a7e692 100644
> --- a/fs/cifs/connect.c
> +++ b/fs/cifs/connect.c
> @@ -1223,7 +1223,6 @@ cifs_demultiplex_thread(void *p)
>                         if (mids[i] != NULL) {
>                                 mids[i]->resp_buf_size = server->pdu_size;
>                                 if ((mids[i]->mid_flags & MID_WAIT_CANCELLED) &&
> -                                   mids[i]->mid_state == MID_RESPONSE_RECEIVED &&
>                                     server->ops->handle_cancelled_mid)
>                                         server->ops->handle_cancelled_mid(
>                                                         mids[i]->resp_buf,
> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> index ca3de62688d6..28018a7eccb2 100644
> --- a/fs/cifs/transport.c
> +++ b/fs/cifs/transport.c
> @@ -1119,7 +1119,8 @@ compound_send_recv(const unsigned int xid, struct cifs_ses *ses,
>                                  midQ[i]->mid, le16_to_cpu(midQ[i]->command));
>                         send_cancel(server, &rqst[i], midQ[i]);
>                         spin_lock(&GlobalMid_Lock);
> -                       if (midQ[i]->mid_state == MID_REQUEST_SUBMITTED) {
> +                       if (midQ[i]->mid_state == MID_REQUEST_SUBMITTED ||
> +                           midQ[i]->mid_state == MID_RESPONSE_RECEIVED) {
>                                 midQ[i]->mid_flags |= MID_WAIT_CANCELLED;
>                                 midQ[i]->callback = cifs_cancelled_callback;
>                                 cancelled_mid[i] = true;
>
>
>
>
>
> >
> > Thanks for the help.
> >
> >
> > Frank
> >
> >
>

Thanks Frank and Ronnie for testing my patch! I also like Ronnie's patch - good progress overall!
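
To make the race from Ronnie's commit message concrete, here is a rough user-space model of it. It is only a sketch under assumptions: the struct, the helper names and the `patched` switch are made up for illustration, the enum values are not the kernel's, and it runs single-threaded with no GlobalMid_Lock. It only mirrors the two conditions touched by the diff quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's mid state and flag (values are illustrative). */
enum mid_state { MID_REQUEST_SUBMITTED, MID_RESPONSE_RECEIVED };
#define MID_WAIT_CANCELLED 1u

struct mid_model {              /* hypothetical, not struct mid_q_entry */
	enum mid_state state;
	unsigned int flags;
};

/* Waiter side: what compound_send_recv() decides after an interrupted wait. */
static bool waiter_cancel(struct mid_model *mid, bool patched)
{
	if (mid->state == MID_REQUEST_SUBMITTED ||
	    (patched && mid->state == MID_RESPONSE_RECEIVED)) {
		mid->flags |= MID_WAIT_CANCELLED;
		return true;
	}
	return false;
}

/* Demultiplex side: would handle_cancelled_mid() be invoked for this mid? */
static bool demux_handles_cancel(const struct mid_model *mid, bool patched)
{
	if (!(mid->flags & MID_WAIT_CANCELLED))
		return false;
	/* The unpatched check also required MID_RESPONSE_RECEIVED. */
	return patched || mid->state == MID_RESPONSE_RECEIVED;
}

/* The racy ordering: reply dequeued first, signal hits before the callback runs. */
static bool close_sent_for_interrupted_open(bool patched)
{
	struct mid_model mid = { MID_REQUEST_SUBMITTED, 0 };

	mid.state = MID_RESPONSE_RECEIVED;   /* demultiplex thread saw the reply */
	(void)waiter_cancel(&mid, patched);  /* open() is interrupted right here */
	return demux_handles_cancel(&mid, patched);
}
```

In this model close_sent_for_interrupted_open(false) is false (the mid is never flagged, so the handle leaks) and close_sent_for_interrupted_open(true) is true.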

About my patch: I am going to add a proper description once I get time next week, and will probably make some improvements:

1) Currently, with my patch, the client will send Close even if it has already received a response but wait_for_response() was interrupted. The latter function returns -ERESTARTSYS in that case. What we want to do instead is to send Close only if *sending* was interrupted. In that case we return -EINTR, see __smb_send_rqst():

if (signal_pending(current)) {
    cifs_dbg(FYI, "signal is pending before sending any data\n");
    return -EINTR;
}

2) Move the error handling to the PDU layer function SMB2_close() to handle *all* interrupted close requests, including ones closing temporary handles.
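
As a rough illustration of the policy in point 1, here is a user-space sketch. The helper names are hypothetical and do not exist in cifs; ERESTARTSYS is defined locally because it is kernel-internal and not provided by the userspace <errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal error code; 512 is the value in include/linux/errno.h. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical helper (not an existing cifs function): decide whether an
 * interrupted Close has to be sent again, based on the return code.
 */
static bool close_must_be_resent(int rc)
{
	/*
	 * -EINTR:       the signal was seen before any data was sent,
	 *               so the server never got the Close.
	 * -ERESTARTSYS: the request went out and only the wait for the
	 *               response was interrupted.
	 */
	return rc == -EINTR;
}

/* Self-check of the three cases discussed above. */
static void check_close_policy(void)
{
	assert(close_must_be_resent(-EINTR));
	assert(!close_must_be_resent(-ERESTARTSYS));
	assert(!close_must_be_resent(0));
}
```

With point 2, a check along these lines would sit in SMB2_close() itself rather than in its callers, so temporary handles are covered too.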

Please comment on the two changes above!

--
Best regards,
Pavel Shilovsky