>From my readings so far, -ERESTARTSYS assumes that the syscall is idempotent. Can we safely make such assumptions for all VFS file operations? For example, what happens if a close gets restarted, and we already scheduled a delayed close because the original close failed with ERESTARTSYS? Also, I ran a quick grep for EINTR, and it looks like __cifs_readv and __cifs_writev return EINTR too. Should those be replaced by ERESTARTSYS too? Regards, Shyam Regards, Shyam On Mon, Jan 25, 2021 at 8:38 AM Shyam Prasad N <nspmangalore@xxxxxxxxx> wrote: > > Hi Pavel, > > Sorry for the late review on this. A few minor comments on __smb_send_rqst(): > > if ((total_len > 0) && (total_len != send_length)) { <<<< what's > special about total_len == 0? I'm guessing send_length will also be 0 > in such a case. > cifs_dbg(FYI, "partial send (wanted=%u sent=%zu): terminating > session\n", > send_length, total_len); > /* > * If we have only sent part of an SMB then the next SMB could > * be taken as the remainder of this one. We need to kill the > * socket so the server throws away the partial SMB > */ > server->tcpStatus = CifsNeedReconnect; > trace_smb3_partial_send_reconnect(server->CurrentMid, > server->hostname); > } > > I'm not an expert on kernel socket programming, but if total_len != > sent_length, shouldn't we iterate retrying till they become equal (or > abort if there's no progress)? > I see that we cork the connection before send, and I guess it's > unlikely why only a partial write will occur (Since these are just > in-memory writes). > But what is the reason for reconnecting on partial writes? > > smbd_done: > if (rc < 0 && rc != -EINTR) <<<<< Not very critical, but > shouldn't we also check for rc != -ERESTARTSYS? > cifs_server_dbg(VFS, "Error %d sending data on socket to server\n", > rc); > else if (rc > 0) > rc = 0; > > return rc; > } > > Regards, > Shyam > > On Fri, Jan 22, 2021 at 11:34 PM Steve French <smfrench@xxxxxxxxx> wrote: > > > > Patch updated with Pavel's suggestion, and added a commit description > > from various comments in this email thread (see attached). > > > > cifs: do not fail __smb_send_rqst if non-fatal signals are pending > > > > RHBZ 1848178 > > > > The original intent of returning an error in this function > > in the patch: > > "CIFS: Mask off signals when sending SMB packets" > > was to avoid interrupting packet send in the middle of > > sending the data (and thus breaking an SMB connection), > > but we also don't want to fail the request for non-fatal > > signals even before we have had a chance to try to > > send it (the reported problem could be reproduced e.g. > > by exiting a child process when the parent process was in > > the midst of calling futimens to update a file's timestamps). > > > > In addition, since the signal may remain pending when we enter the > > sending loop, we may end up not sending the whole packet before > > TCP buffers become full. In this case the code returns -EINTR > > but what we need here is to return -ERESTARTSYS instead to > > allow system calls to be restarted. > > > > Fixes: b30c74c73c78 ("CIFS: Mask off signals when sending SMB packets") > > Cc: stable@xxxxxxxxxxxxxxx # v5.1+ > > Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx> > > Reviewed-by: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx> > > Signed-off-by: Steve French <stfrench@xxxxxxxxxxxxx> > > > > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c > > index e9abb41aa89b..95ef26b555b9 100644 > > --- a/fs/cifs/transport.c > > +++ b/fs/cifs/transport.c > > @@ -338,7 +338,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, > > int num_rqst, > > if (ssocket == NULL) > > return -EAGAIN; > > > > - if (signal_pending(current)) { > > + if (fatal_signal_pending(current)) { > > cifs_dbg(FYI, "signal pending before send request\n"); > > return -ERESTARTSYS; > > } > > @@ -429,7 +429,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, > > int num_rqst, > > > > if (signal_pending(current) && (total_len != send_length)) { > > cifs_dbg(FYI, "signal is pending after attempt to send\n"); > > - rc = -EINTR; > > + rc = -ERESTARTSYS; > > } > > > > /* uncork it */ > > > > On Fri, Jan 22, 2021 at 3:46 PM ronnie sahlberg > > <ronniesahlberg@xxxxxxxxx> wrote: > > > > > > On Sat, Jan 23, 2021 at 5:47 AM Pavel Shilovsky <piastryyy@xxxxxxxxx> wrote: > > > > > > > > Ronnie, > > > > > > > > I still think that your patch needs additional fix here: > > > > > > > > ----------- > > > > /* > > > > * If signal is pending but we have already sent the whole packet to > > > > * the server we need to return success status to allow a corresponding > > > > * mid entry to be kept in the pending requests queue thus allowing > > > > * to handle responses from the server by the client. > > > > * > > > > * If only part of the packet has been sent there is no need to hide > > > > * interrupt because the session will be reconnected anyway, so there > > > > * won't be any response from the server to handle. > > > > */ > > > > > > > > if (signal_pending(current) && (total_len != send_length)) { > > > > cifs_dbg(FYI, "signal is pending after attempt to send\n"); > > > > rc = -EINTR; > > > > ^^^ > > > > This should be changed to -ERESTARTSYS to allow kernel to > > > > restart a syscall. > > > > > > > > } > > > > ----------- > > > > > > > > Since the signal may remain pending when we enter the sending loop, we > > > > may end up not sending the whole packet before TCP buffers become > > > > full. In this case according to the condition above the code returns > > > > -EINTR but what we need here is to return -ERESTARTSYS instead to > > > > allow system calls to be restarted. > > > > > > > > Thoughts? > > > > > > Yes, that is probably a good idea to change too. > > > Steve, can you add this change to my patch? > > > > > > > > > > > > > > -- > > > > Best regards, > > > > Pavel Shilovsky > > > > > > > > чт, 21 янв. 2021 г. в 14:41, ronnie sahlberg <ronniesahlberg@xxxxxxxxx>: > > > > > > > > > > > > > > > > > > > > On Thu, Jan 21, 2021 at 10:19 PM Paulo Alcantara <pc@xxxxxx> wrote: > > > > >> > > > > >> Aurélien Aptel <aaptel@xxxxxxxx> writes: > > > > >> > > > > >> > Pavel Shilovsky <piastryyy@xxxxxxxxx> writes: > > > > >> > > > > > >> >> вт, 19 янв. 2021 г. в 22:38, Steve French <smfrench@xxxxxxxxx>: > > > > >> >>> > > > > >> >>> The patch won't merge (also has some text corruptions in it). This > > > > >> >>> line of code is different due to commit 6988a619f5b79 > > > > >> >>> > > > > >> >>> 6988a619f5b79 (Paulo Alcantara 2020-11-28 15:57:06 -0300 342) > > > > >> >>> cifs_dbg(FYI, "signal pending before send request\n"); > > > > >> >>> 6988a619f5b79 (Paulo Alcantara 2020-11-28 15:57:06 -0300 343) > > > > >> >>> return -ERESTARTSYS; > > > > >> >>> > > > > >> >>> if (signal_pending(current)) { > > > > >> >>> cifs_dbg(FYI, "signal pending before send request\n"); > > > > >> >>> return -ERESTARTSYS; > > > > >> >>> } > > > > >> >>> > > > > >> >>> See: > > > > >> >>> > > > > >> >>> Author: Paulo Alcantara <pc@xxxxxx> > > > > >> >>> Date: Sat Nov 28 15:57:06 2020 -0300 > > > > >> >>> > > > > >> >>> cifs: allow syscalls to be restarted in __smb_send_rqst() > > > > >> >>> > > > > >> >>> A customer has reported that several files in their multi-threaded app > > > > >> >>> were left with size of 0 because most of the read(2) calls returned > > > > >> >>> -EINTR and they assumed no bytes were read. Obviously, they could > > > > >> >>> have fixed it by simply retrying on -EINTR. > > > > >> >>> > > > > >> >>> We noticed that most of the -EINTR on read(2) were due to real-time > > > > >> >>> signals sent by glibc to process wide credential changes (SIGRT_1), > > > > >> >>> and its signal handler had been established with SA_RESTART, in which > > > > >> >>> case those calls could have been automatically restarted by the > > > > >> >>> kernel. > > > > >> >>> > > > > >> >>> Let the kernel decide to whether or not restart the syscalls when > > > > >> >>> there is a signal pending in __smb_send_rqst() by returning > > > > >> >>> -ERESTARTSYS. If it can't, it will return -EINTR anyway. > > > > >> >>> > > > > >> >>> Signed-off-by: Paulo Alcantara (SUSE) <pc@xxxxxx> > > > > >> >>> CC: Stable <stable@xxxxxxxxxxxxxxx> > > > > >> >>> Reviewed-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx> > > > > >> >>> Reviewed-by: Pavel Shilovsky <pshilov@xxxxxxxxxxxxx> > > > > >> >>> > > > > >> >>> On Tue, Jan 19, 2021 at 10:32 PM Ronnie Sahlberg <lsahlber@xxxxxxxxxx> wrote: > > > > >> >>> > > > > > >> >>> > RHBZ 1848178 > > > > >> >>> > > > > > >> >>> > There is no need to fail this function if non-fatal signals are > > > > >> >>> > pending when we enter it. > > > > >> >>> > > > > > >> >>> > Signed-off-by: Ronnie Sahlberg <lsahlber@xxxxxxxxxx> > > > > >> >>> > --- > > > > >> >>> > fs/cifs/transport.c | 2 +- > > > > >> >>> > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > >> >>> > > > > > >> >>> > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c > > > > >> >>> > index c42bda5a5008..98752f7d2cd2 100644 > > > > >> >>> > --- a/fs/cifs/transport.c > > > > >> >>> > +++ b/fs/cifs/transport.c > > > > >> >>> > @@ -339,7 +339,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, > > > > >> >>> > if (ssocket == NULL) > > > > >> >>> > return -EAGAIN; > > > > >> >>> > > > > > >> >>> > - if (signal_pending(current)) { > > > > >> >>> > + if (fatal_signal_pending(current)) { > > > > >> >>> > cifs_dbg(FYI, "signal is pending before sending any data\n"); > > > > >> >>> > return -EINTR; > > > > >> >>> > } > > > > >> > > > > > >> > I've looked up the difference > > > > >> > > > > > >> > static inline int __fatal_signal_pending(struct task_struct *p) > > > > >> > { > > > > >> > return unlikely(sigismember(&p->pending.signal, SIGKILL)); > > > > >> > } > > > > >> > > > > > >> > > > > > >> >> I have been thinking around the same lines. The original intent of > > > > >> >> failing the function here was to avoid interrupting packet send in the > > > > >> >> middle of the packet and not breaking an SMB connection. > > > > >> >> That's also why signals are blocked around smb_send_kvec() calls. I > > > > >> >> guess most of the time a socket buffer is not full, so, those > > > > >> >> functions immediately return success without waiting internally and > > > > >> >> checking for pending signals. With this change the code may break SMB > > > > >> > > > > > >> > Ah, interesting. > > > > >> > > > > > >> > I looked up the difference between fatal/non-fatal and it seems > > > > >> > fatal_signal_pending() really only checks for SIGKILL, but I would > > > > >> > expect ^C (SIGINT) to return quickly as well. > > > > >> > > > > > >> > I thought the point of checking for pending signal early was to return > > > > >> > quickly to userspace and not be stuck in some unkillable state. > > > > >> > > > > > >> > After reading your explanation, you're saying the kernel funcs to send > > > > >> > on socket will check for any signal and err early in any case. > > > > >> > > > > > >> > some_syscall() { > > > > >> > > > > > >> > if (pending_fatal_signal) <===== if we ignore non-fatal here > > > > >> > fail_early(); > > > > >> > > > > > >> > block_signals(); > > > > >> > r = kernel_socket_send { > > > > >> > if (pending_signal) <==== they will be caught here > > > > >> > return error; > > > > >> > > > > > >> > ... > > > > >> > } > > > > >> > unblock_signals(); > > > > >> > if (r) > > > > >> > fail(); > > > > >> > ... > > > > >> > } > > > > >> > > > > > >> > So this patch will (potentially) trigger more reconnect (because we > > > > >> > actually send the packet as a vector in a loop) but I'm not sure I > > > > >> > understand why it returns less errors to userspace? > > > > >> > > > > > >> > Also, shouldn't we move the pending_fatal_signal check *inside* the blocked > > > > >> > signal section? > > > > >> > > > > > >> > In any case I think we should try to test some of those changes given > > > > >> > how we have 3,4 patches trying to tweak it on top of each other. > > > > >> > > > > >> I think it would make sense to have something like > > > > >> > > > > >> diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c > > > > >> index e9abb41aa89b..f7292c14863e 100644 > > > > >> --- a/fs/cifs/transport.c > > > > >> +++ b/fs/cifs/transport.c > > > > >> @@ -340,7 +340,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, > > > > >> > > > > >> if (signal_pending(current)) { > > > > >> cifs_dbg(FYI, "signal pending before send request\n"); > > > > >> - return -ERESTARTSYS; > > > > >> + return __fatal_signal_pending(current) ? -EINTR : -ERESTARTSYS; > > > > >> } > > > > >> > > > > > > > > > > That is not sufficient because there are syscalls that are never supposed to fail with -EINTR or -ERESTARTSYS > > > > > and thus will not be restarted by either the kernel or libc. > > > > > > > > > > For example utimensat(). The change to only fail here with -E* for a fatal signal (when the process will be killed anyway) > > > > > is to address an issue when IF there are signals pending, any signal, during the utimensat() system call then > > > > > this will lead to us returning -EINTR back to the application. Which can break some applications such as 'tar'. > > > > > > > > > > > > > > > ronnie s > > > > > > > > > > > > > > >> > > > > >> /* cork the socket */ > > > > >> > > > > >> so that we allow signal handlers to be executed before restarting > > > > >> syscalls when receiving non-fatal signals, otherwise -EINTR. > > > > > > > > -- > > Thanks, > > > > Steve > > > > -- > -Shyam -- -Shyam