Pavel noticed I missed a line from the attempt to do a similar patch to Eric's suggestion (it still didn't work though - although "allow_signal" does albeit is possibly dangerous as user space can kill cifsd) # git diff -a diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index a4830ced0f98..8758dff18c15 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p) struct task_struct *task_to_wake = NULL; struct mid_q_entry *mids[MAX_COMPOUND]; char *bufs[MAX_COMPOUND]; + sigset_t mask, oldmask; current->flags |= PF_MEMALLOC; cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current)); @@ -1113,6 +1114,9 @@ cifs_demultiplex_thread(void *p) mempool_resize(cifs_req_poolp, length + cifs_min_rcv); set_freezable(); + sigfillset(&mask); + sigdelset(&mask, SIGKILL); + sigprocmask(SIG_BLOCK, &mask, &oldmask); while (server->tcpStatus != CifsExiting) { if (try_to_freeze()) continue; On Tue, Jul 23, 2019 at 9:02 PM Steve French <smfrench@xxxxxxxxx> wrote: > > On Tue, Jul 23, 2019 at 8:32 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote: > > > > Steve French <smfrench@xxxxxxxxx> writes: > > > > > Very easy to see what caused the regression with this global change: > > > > > > mount (which launches "cifsd" thread to read the socket) > > > umount (which kills the "cifsd" thread) > > > rmmod (rmmod now fails since "cifsd" thread is still active) > > > > > > mount launches a thread to read from the socket ("cifsd") > > > umount is supposed to kill that thread (but with the patch > > > "signal/cifs: Fix cifs_put_tcp_session to call send_sig instead of > > > force_sig" that no longer works). So the regression is that after > > > unmount you still see the "cifsd" thread, and the reason that cifsd > > > thread is still around is that that patch no longer force kills the > > > process (see line 2652 of fs/cifs/connect.c) which regresses module > > > removal. > > > > > > - force_sig(SIGKILL, task); > > > + send_sig(SIGKILL, task, 1); > > > > > > The comment in the changeset indicates "The signal SIGKILL can not be > > > ignored" but obviously it can be ignored - at least on 5.3-rc1 it is > > > being ignored. > > > > > > If send_sig(SIGKILL ...) doesn't work and if force_sig(SIGKILL, task) > > > is removed and no longer possible - how do we kill a helper process > > > ... > > > > I think I see what is happening. It looks like as well as misuinsg > > force_sig, cifs is also violating the invariant that keeps SIGKILL out > > of the blocked signal set. > > > > For that force_sig will act differently. I did not consider it because > > that is never supposed to happen. > > > > Can someone test this code below and confirm the issue goes away? > > > > diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c > > index 5d6d44bfe10a..2a782ebc7b65 100644 > > --- a/fs/cifs/transport.c > > +++ b/fs/cifs/transport.c > > @@ -347,6 +347,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, > > */ > > > > sigfillset(&mask); > > + sigdelset(&mask, SIGKILL); > > sigprocmask(SIG_BLOCK, &mask, &oldmask); > > > > /* Generate a rfc1002 marker for SMB2+ */ > > > > > > Eric > > I just tried your suggestion and it didn't work. I also tried doing > a similar thing on the thread we are trying to kills ("cifsd" - ie > which is blocked in the function cifs_demultiplex_thread waiting to > read from the socket) > # git diff -a > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c > index a4830ced0f98..b73062520a17 100644 > --- a/fs/cifs/connect.c > +++ b/fs/cifs/connect.c > @@ -1104,6 +1104,7 @@ cifs_demultiplex_thread(void *p) > struct task_struct *task_to_wake = NULL; > struct mid_q_entry *mids[MAX_COMPOUND]; > char *bufs[MAX_COMPOUND]; > + sigset_t mask; > > current->flags |= PF_MEMALLOC; > cifs_dbg(FYI, "Demultiplex PID: %d\n", task_pid_nr(current)); > @@ -1113,6 +1114,8 @@ cifs_demultiplex_thread(void *p) > mempool_resize(cifs_req_poolp, length + cifs_min_rcv); > > set_freezable(); > + sigfillset(&mask); > + sigdelset(&mask, SIGKILL); > while (server->tcpStatus != CifsExiting) { > if (try_to_freeze()) > continue; > > > That also didn't work. The only thing I have been able to find > which worked was: > > diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c > index a4830ced0f98..e74f04163fc9 100644 > --- a/fs/cifs/connect.c > +++ b/fs/cifs/connect.c > @@ -1113,6 +1113,7 @@ cifs_demultiplex_thread(void *p) > mempool_resize(cifs_req_poolp, length + cifs_min_rcv); > > set_freezable(); > + allow_signal(SIGKILL); > while (server->tcpStatus != CifsExiting) { > if (try_to_freeze()) > continue; > > > That fixes the problem ... but ... as Ronnie and others have noted it > would allow a userspace process to make the mount unusable (all you > would have to do would be to do a kill -9 of the "cifsd" process from > some userspace process like bash and the mount would be unusable - so > this sounds dangerous. > > Is there an alternative that, in the process doing the unmount in > kernel, would allow us to do the equivalent of: > "allow_signal(SIGKILL, <the id of the cifsd process>" > In otherwords, to minimize the risk of some userspace process killing > cifsd, could we delay enabling allow_signal(SIGKILL) till the unmount > begins by doing it for a different process (have the unmount process > enable signals for the cifsd process). Otherwise is there a way to > force kill a process from the kernel as we used to do - without > running the risk of a user space process killing cifsd (which is bad). > > -- > Thanks, > > Steve -- Thanks, Steve