On 03/11/17 09:40, NeilBrown wrote:
>

Hi Neil, and thanks for taking the time to post the patch.

> Currently if the autofs kernel module gets an error when
> writing to the pipe which links to the daemon, then it
> marks the whole mountpoint as catatonic, and it will stop working.
>
> It is possible that the error is transient.  This can happen
> if the daemon is slow and more than 16 requests queue up.
> If a subsequent process tries to queue a request, and is then signalled,
> the write to the pipe will return -ERESTARTSYS and autofs
> will take that as total failure.

Indeed it does.

And given the problems with a half dozen (or so) user space applications
consuming large amounts of CPU under heavy mount and umount activity, this
could happen more easily than we expect.

>
> So change the code to assess -ERESTARTSYS and -ENOMEM as transient
> failures which only abort the current request, not the whole
> mountpoint.

This looks good to me.

>
> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> ---
>
> Do people think this should go to -stable ??
> It isn't a crash or a data corruption, but having autofs mountpoints
> suddenly stop working is rather inconvenient.

Perhaps that's a good idea, given that the CPU usage problem I refer to
above has been around for a while now.

>
> Thanks,
> NeilBrown
>
>
>  fs/autofs4/waitq.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> index 4ac49d038bf3..8fc41705c7cd 100644
> --- a/fs/autofs4/waitq.c
> +++ b/fs/autofs4/waitq.c
> @@ -81,7 +81,8 @@ static int autofs4_write(struct autofs_sb_info *sbi,
>  		spin_unlock_irqrestore(&current->sighand->siglock, flags);
>  	}
>  
> -	return (bytes > 0);
> +	/* if 'wr' returned 0 (impossible) we assume -EIO (safe) */
> +	return bytes == 0 ? 0 : wr < 0 ? wr : -EIO;
>  }
>  
>  static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
> @@ -95,6 +96,7 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
>  	} pkt;
>  	struct file *pipe = NULL;
>  	size_t pktsz;
> +	int ret;
>  
>  	pr_debug("wait id = 0x%08lx, name = %.*s, type=%d\n",
>  		 (unsigned long) wq->wait_queue_token,
> @@ -169,7 +171,18 @@ static void autofs4_notify_daemon(struct autofs_sb_info *sbi,
>  	mutex_unlock(&sbi->wq_mutex);
>  
>  	if (autofs4_write(sbi, pipe, &pkt, pktsz))
> +	switch (ret = autofs4_write(sbi, pipe, &pkt, pktsz)) {
> +	case 0:
> +		break;
> +	case -ENOMEM:
> +	case -ERESTARTSYS:
> +		/* Just fail this one */
> +		autofs4_wait_release(sbi, wq->wait_queue_token, ret);
> +		break;
> +	default:
>  		autofs4_catatonic_mode(sbi);
> +		break;
> +	}
>  	fput(pipe);
>  }
>  
>
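
For anyone following along, the error handling described above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions, not kernel code: `model_write_result()`, `classify()` and `enum action` are hypothetical names invented for illustration, and `ERESTARTSYS` is defined locally because it is a kernel-internal errno that userspace `<errno.h>` does not provide.

```c
#include <errno.h>
#include <stddef.h>

/* Kernel-internal errno, absent from userspace headers; defined here
 * only so that this sketch compiles outside the kernel. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/* Hypothetical names: what the caller does with each result. */
enum action {
	ACT_OK,           /* packet was delivered to the daemon */
	ACT_FAIL_REQUEST, /* transient error: abort only this request */
	ACT_CATATONIC     /* any other error: give up on the mountpoint */
};

/* Models the patched return expression of autofs4_write():
 * 0 when nothing is left to write, otherwise the negative result of
 * the last write, falling back to -EIO if that result was not negative. */
static int model_write_result(size_t bytes_left, long wr)
{
	return bytes_left == 0 ? 0 : wr < 0 ? (int)wr : -EIO;
}

/* Models the switch in autofs4_notify_daemon(): -ENOMEM and
 * -ERESTARTSYS are treated as transient, everything else as fatal. */
static enum action classify(int ret)
{
	switch (ret) {
	case 0:
		return ACT_OK;
	case -ENOMEM:
	case -ERESTARTSYS:
		return ACT_FAIL_REQUEST;
	default:
		return ACT_CATATONIC;
	}
}
```

With this model, a write interrupted by a signal, `classify(model_write_result(4, -ERESTARTSYS))`, yields `ACT_FAIL_REQUEST`, whereas a short write with no error code falls back to -EIO and yields `ACT_CATATONIC`.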