Re: [PATCH 2/2] nfs: do not start the callback thread until we set rqstp->rq_task

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 02, 2014 at 03:32:55PM -0400, Trond Myklebust wrote:
> On Tue, Sep 2, 2014 at 3:23 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> > On Tue, Sep 02, 2014 at 01:58:58PM -0400, Trond Myklebust wrote:
> >> This fixes an Oopsable race when starting up the callback server.
> >>
> >> Signed-off-by: Trond Myklebust <trond.myklebust@xxxxxxxxxxxxxxx>
> >> ---
> >>  fs/nfs/callback.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c
> >> index e3dd1cd175d9..b8fb3a4ef649 100644
> >> --- a/fs/nfs/callback.c
> >> +++ b/fs/nfs/callback.c
> >> @@ -235,7 +235,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> >>
> >>       cb_info->serv = serv;
> >>       cb_info->rqst = rqstp;
> >> -     cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> >> +     cb_info->task = kthread_create(callback_svc, cb_info->rqst,
> >>                                   "nfsv4.%u-svc", minorversion);
> >>       if (IS_ERR(cb_info->task)) {
> >>               ret = PTR_ERR(cb_info->task);
> >> @@ -245,6 +245,7 @@ static int nfs_callback_start_svc(int minorversion, struct rpc_xprt *xprt,
> >>               return ret;
> >>       }
> >>       rqstp->rq_task = cb_info->task;
> >> +     wake_up_process(cb_info->task);
> >
> > Wouldn't it be cleaner to do something like:
> >
> > -       cb_info->task = kthread_run(callback_svc, cb_info->rqst,
> > +       cb_info->task = rqstp->rq_run =
> > +               kthread_create(callback_svc, cb_info->rqst,
> >
> > or am I missing something subtile that the changelog didn't mention?
> 
> The above is fine if you call kthread_create(), but if you stick with
> kthread_run(), then there is still the same atomicity issue that the
> thread can be started before we've initialised cb_info->task and
> rqstp->rq_run.
> 
> Internal testing has shown that this can lead to an oops when starting
> lockd.

The oops seen in practice were probably after applying 983c684466e0
"SUNRPC: get rid of the request wait queue"?

Though it was a bug before then too, of course.

--b.

> I'm therefore assuming that the same thing can happen with the
> NFS client callback channel.
> 
> -- 
> Trond Myklebust
> 
> Linux NFS client maintainer, PrimaryData
> 
> trond.myklebust@xxxxxxxxxxxxxxx
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux