Re: [PATCH] cifs: do not fail __smb_send_rqst if non-fatal signals are pending

Shyam Prasad N <nspmangalore@xxxxxxxxx> · Mon, 25 Jan 2021 09:06:50 -0800

>From my readings so far, -ERESTARTSYS assumes that the syscall is idempotent.
Can we safely make such assumptions for all VFS file operations? For
example, what happens if a close gets restarted, and we already
scheduled a delayed close because the original close failed with
ERESTARTSYS?

Also, I ran a quick grep for EINTR, and it looks like __cifs_readv and
__cifs_writev return EINTR too. Should those be replaced by
ERESTARTSYS too?

Regards,
Shyam

Regards,
Shyam

On Mon, Jan 25, 2021 at 8:38 AM Shyam Prasad N <nspmangalore@xxxxxxxxx> wrote:
>
> Hi Pavel,
>
> Sorry for the late review on this. A few minor comments on __smb_send_rqst():
>
>     if ((total_len > 0) && (total_len != send_length)) { <<<< what's
> special about total_len == 0? I'm guessing send_length will also be 0
> in such a case.
>         cifs_dbg(FYI, "partial send (wanted=%u sent=%zu): terminating
> session\n",
>              send_length, total_len);
>         /*
>          * If we have only sent part of an SMB then the next SMB could
>          * be taken as the remainder of this one. We need to kill the
>          * socket so the server throws away the partial SMB
>          */
>         server->tcpStatus = CifsNeedReconnect;
>         trace_smb3_partial_send_reconnect(server->CurrentMid,
>                           server->hostname);
>     }
>
> I'm not an expert on kernel socket programming, but if total_len !=
> sent_length, shouldn't we iterate retrying till they become equal (or
> abort if there's no progress)?
> I see that we cork the connection before send, and I guess it's
> unlikely why only a partial write will occur (Since these are just
> in-memory writes).
> But what is the reason for reconnecting on partial writes?
>
> smbd_done:
>     if (rc < 0 && rc != -EINTR)   <<<<< Not very critical, but
> shouldn't we also check for rc != -ERESTARTSYS?
>         cifs_server_dbg(VFS, "Error %d sending data on socket to server\n",
>              rc);
>     else if (rc > 0)
>         rc = 0;
>
>     return rc;
> }
>
> Regards,
> Shyam
>
> On Fri, Jan 22, 2021 at 11:34 PM Steve French <smfrench@xxxxxxxxx> wrote:
> >
> > Patch updated with Pavel's suggestion, and added a commit description
> > from various comments in this email thread (see attached).
> >
> >     cifs: do not fail __smb_send_rqst if non-fatal signals are pending
> >
> >     RHBZ 1848178
> >
> >     The original intent of returning an error in this function
> >     in the patch:
> >       "CIFS: Mask off signals when sending SMB packets"
> >     was to avoid interrupting packet send in the middle of
> >     sending the data (and thus breaking an SMB connection),
> >     but we also don't want to fail the request for non-fatal
> >     signals even before we have had a chance to try to
> >     send it (the reported problem could be reproduced e.g.
> >     by exiting a child process when the parent process was in
> >     the midst of calling futimens to update a file's timestamps).
> >
> >     In addition, since the signal may remain pending when we enter the
> >     sending loop, we may end up not sending the whole packet before
> >     TCP buffers become full. In this case the code returns -EINTR
> >     but what we need here is to return -ERESTARTSYS instead to
> >     allow system calls to be restarted.
> >
> >     Fixes: b30c74c73c78 ("CIFS: Mask off signals when sending SMB packets")
> >     Cc: stable@xxxxxxxxxxxxxxx # v5.1+
> >     Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
> >     Reviewed-by: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> >     Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx>
> >
> > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > index e9abb41aa89b..95ef26b555b9 100644
> > --- a/fs/cifs/transport.c
> > +++ b/fs/cifs/transport.c
> > @@ -338,7 +338,7 @@ __smb_send_rqst(struct TCP_Server_Info *server,
> > int num_rqst,
> >         if (ssocket == NULL)
> >                 return -EAGAIN;
> >
> > -       if (signal_pending(current)) {
> > +       if (fatal_signal_pending(current)) {
> >                 cifs_dbg(FYI, "signal pending before send request\n");
> >                 return -ERESTARTSYS;
> >         }
> > @@ -429,7 +429,7 @@ __smb_send_rqst(struct TCP_Server_Info *server,
> > int num_rqst,
> >
> >         if (signal_pending(current) && (total_len != send_length)) {
> >                 cifs_dbg(FYI, "signal is pending after attempt to send\n");
> > -               rc = -EINTR;
> > +               rc = -ERESTARTSYS;
> >         }
> >
> >         /* uncork it */
> >
> > On Fri, Jan 22, 2021 at 3:46 PM ronnie sahlberg
> > <ronniesahlberg@xxxxxxxxx> wrote:
> > >
> > > On Sat, Jan 23, 2021 at 5:47 AM Pavel Shilovsky <piastryyy@xxxxxxxxx> wrote:
> > > >
> > > > Ronnie,
> > > >
> > > > I still think that your patch needs additional fix here:
> > > >
> > > > -----------
> > > > /*
> > > > * If signal is pending but we have already sent the whole packet to
> > > > * the server we need to return success status to allow a corresponding
> > > > * mid entry to be kept in the pending requests queue thus allowing
> > > > * to handle responses from the server by the client.
> > > > *
> > > > * If only part of the packet has been sent there is no need to hide
> > > > * interrupt because the session will be reconnected anyway, so there
> > > > * won't be any response from the server to handle.
> > > > */
> > > >
> > > > if (signal_pending(current) && (total_len != send_length)) {
> > > > cifs_dbg(FYI, "signal is pending after attempt to send\n");
> > > > rc = -EINTR;
> > > >         ^^^
> > > >         This should be changed to -ERESTARTSYS to allow kernel to
> > > > restart a syscall.
> > > >
> > > > }
> > > > -----------
> > > >
> > > > Since the signal may remain pending when we enter the sending loop, we
> > > > may end up not sending the whole packet before TCP buffers become
> > > > full. In this case according to the condition above the code returns
> > > > -EINTR but what we need here is to return -ERESTARTSYS instead to
> > > > allow system calls to be restarted.
> > > >
> > > > Thoughts?
> > >
> > > Yes, that is probably a good idea to change too.
> > > Steve, can you add this change to my patch?
> > >
> > >
> > > >
> > > > --
> > > > Best regards,
> > > > Pavel Shilovsky
> > > >
> > > > чт, 21 янв. 2021 г. в 14:41, ronnie sahlberg <ronniesahlberg@xxxxxxxxx>:
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jan 21, 2021 at 10:19 PM Paulo Alcantara <pc@xxxxxx> wrote:
> > > > >>
> > > > >> Aurélien Aptel <aaptel@xxxxxxxx> writes:
> > > > >>
> > > > >> > Pavel Shilovsky <piastryyy@xxxxxxxxx> writes:
> > > > >> >
> > > > >> >> вт, 19 янв. 2021 г. в 22:38, Steve French <smfrench@xxxxxxxxx>:
> > > > >> >>>
> > > > >> >>> The patch won't merge (also has some text corruptions in it).  This
> > > > >> >>> line of code is different due to commit 6988a619f5b79
> > > > >> >>>
> > > > >> >>> 6988a619f5b79 (Paulo Alcantara 2020-11-28 15:57:06 -0300 342)
> > > > >> >>>  cifs_dbg(FYI, "signal pending before send request\n");
> > > > >> >>> 6988a619f5b79 (Paulo Alcantara 2020-11-28 15:57:06 -0300 343)
> > > > >> >>>  return -ERESTARTSYS;
> > > > >> >>>
> > > > >> >>>         if (signal_pending(current)) {
> > > > >> >>>                 cifs_dbg(FYI, "signal pending before send request\n");
> > > > >> >>>                 return -ERESTARTSYS;
> > > > >> >>>         }
> > > > >> >>>
> > > > >> >>> See:
> > > > >> >>>
> > > > >> >>> Author: Paulo Alcantara <pc@xxxxxx>
> > > > >> >>> Date:   Sat Nov 28 15:57:06 2020 -0300
> > > > >> >>>
> > > > >> >>>     cifs: allow syscalls to be restarted in __smb_send_rqst()
> > > > >> >>>
> > > > >> >>>     A customer has reported that several files in their multi-threaded app
> > > > >> >>>     were left with size of 0 because most of the read(2) calls returned
> > > > >> >>>     -EINTR and they assumed no bytes were read.  Obviously, they could
> > > > >> >>>     have fixed it by simply retrying on -EINTR.
> > > > >> >>>
> > > > >> >>>     We noticed that most of the -EINTR on read(2) were due to real-time
> > > > >> >>>     signals sent by glibc to process wide credential changes (SIGRT_1),
> > > > >> >>>     and its signal handler had been established with SA_RESTART, in which
> > > > >> >>>     case those calls could have been automatically restarted by the
> > > > >> >>>     kernel.
> > > > >> >>>
> > > > >> >>>     Let the kernel decide to whether or not restart the syscalls when
> > > > >> >>>     there is a signal pending in __smb_send_rqst() by returning
> > > > >> >>>     -ERESTARTSYS.  If it can't, it will return -EINTR anyway.
> > > > >> >>>
> > > > >> >>>     Signed-off-by: Paulo Alcantara (SUSE) <pc@xxxxxx>
> > > > >> >>>     CC: Stable <stable@xxxxxxxxxxxxxxx>
> > > > >> >>>     Reviewed-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
> > > > >> >>>     Reviewed-by: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx>
> > > > >> >>>
> > > > >> >>> On Tue, Jan 19, 2021 at 10:32 PM Ronnie Sahlberg <lsahlber@xxxxxxxxxx> wrote:
> > > > >> >>> >
> > > > >> >>> > RHBZ 1848178
> > > > >> >>> >
> > > > >> >>> > There is no need to fail this function if non-fatal signals are
> > > > >> >>> > pending when we enter it.
> > > > >> >>> >
> > > > >> >>> > Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx>
> > > > >> >>> > ---
> > > > >> >>> >  fs/cifs/transport.c | 2 +-
> > > > >> >>> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >> >>> >
> > > > >> >>> > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > > > >> >>> > index c42bda5a5008..98752f7d2cd2 100644
> > > > >> >>> > --- a/fs/cifs/transport.c
> > > > >> >>> > +++ b/fs/cifs/transport.c
> > > > >> >>> > @@ -339,7 +339,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
> > > > >> >>> >         if (ssocket == NULL)
> > > > >> >>> >                 return -EAGAIN;
> > > > >> >>> >
> > > > >> >>> > -       if (signal_pending(current)) {
> > > > >> >>> > +       if (fatal_signal_pending(current)) {
> > > > >> >>> >                 cifs_dbg(FYI, "signal is pending before sending any data\n");
> > > > >> >>> >                 return -EINTR;
> > > > >> >>> >         }
> > > > >> >
> > > > >> > I've looked up the difference
> > > > >> >
> > > > >> > static inline int __fatal_signal_pending(struct task_struct *p)
> > > > >> > {
> > > > >> >       return unlikely(sigismember(&p->pending.signal, SIGKILL));
> > > > >> > }
> > > > >> >
> > > > >> >
> > > > >> >> I have been thinking around the same lines. The original intent of
> > > > >> >> failing the function here was to avoid interrupting packet send in the
> > > > >> >> middle of the packet and not breaking an SMB connection.
> > > > >> >> That's also why signals are blocked around smb_send_kvec() calls. I
> > > > >> >> guess most of the time a socket buffer is not full, so, those
> > > > >> >> functions immediately return success without waiting internally and
> > > > >> >> checking for pending signals. With this change the code may break SMB
> > > > >> >
> > > > >> > Ah, interesting.
> > > > >> >
> > > > >> > I looked up the difference between fatal/non-fatal and it seems
> > > > >> > fatal_signal_pending() really only checks for SIGKILL, but I would
> > > > >> > expect ^C (SIGINT) to return quickly as well.
> > > > >> >
> > > > >> > I thought the point of checking for pending signal early was to return
> > > > >> > quickly to userspace and not be stuck in some unkillable state.
> > > > >> >
> > > > >> > After reading your explanation, you're saying the kernel funcs to send
> > > > >> > on socket will check for any signal and err early in any case.
> > > > >> >
> > > > >> > some_syscall() {
> > > > >> >
> > > > >> >     if (pending_fatal_signal)  <===== if we ignore non-fatal here
> > > > >> >         fail_early();
> > > > >> >
> > > > >> >     block_signals();
> > > > >> >     r = kernel_socket_send {
> > > > >> >         if (pending_signal) <==== they will be caught here
> > > > >> >             return error;
> > > > >> >
> > > > >> >         ...
> > > > >> >     }
> > > > >> >     unblock_signals();
> > > > >> >     if (r)
> > > > >> >         fail();
> > > > >> >     ...
> > > > >> > }
> > > > >> >
> > > > >> > So this patch will (potentially) trigger more reconnect (because we
> > > > >> > actually send the packet as a vector in a loop) but I'm not sure I
> > > > >> > understand why it returns less errors to userspace?
> > > > >> >
> > > > >> > Also, shouldn't we move the pending_fatal_signal check *inside* the blocked
> > > > >> > signal section?
> > > > >> >
> > > > >> > In any case I think we should try to test some of those changes given
> > > > >> > how we have 3,4 patches trying to tweak it on top of each other.
> > > > >>
> > > > >> I think it would make sense to have something like
> > > > >>
> > > > >> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
> > > > >> index e9abb41aa89b..f7292c14863e 100644
> > > > >> --- a/fs/cifs/transport.c
> > > > >> +++ b/fs/cifs/transport.c
> > > > >> @@ -340,7 +340,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst,
> > > > >>
> > > > >>         if (signal_pending(current)) {
> > > > >>                 cifs_dbg(FYI, "signal pending before send request\n");
> > > > >> -               return -ERESTARTSYS;
> > > > >> +               return __fatal_signal_pending(current) ? -EINTR : -ERESTARTSYS;
> > > > >>         }
> > > > >>
> > > > >
> > > > > That is not sufficient because there are syscalls that are never supposed to fail with -EINTR or -ERESTARTSYS
> > > > > and thus will not be restarted by either the kernel or libc.
> > > > >
> > > > > For example utimensat(). The change to only fail here with -E* for a fatal signal (when the process will be killed anyway)
> > > > > is to address an issue when IF there are signals pending, any signal, during the utimensat() system call then
> > > > > this will lead to us returning -EINTR back to the application. Which can break some applications such as 'tar'.
> > > > >
> > > > >
> > > > > ronnie s
> > > > >
> > > > >
> > > > >>
> > > > >>         /* cork the socket */
> > > > >>
> > > > >> so that we allow signal handlers to be executed before restarting
> > > > >> syscalls when receiving non-fatal signals, otherwise -EINTR.
> >
> >
> >
> > --
> > Thanks,
> >
> > Steve
>
>
>
> --
> -Shyam

-- 
-Shyam