Re: Regretion on NFS in mainline kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 18 Apr 2012 15:13:13 +0100
Luis Henriques <luis.henriques@xxxxxxxxxxxxx> wrote:

> On Wed, Apr 18, 2012 at 02:04:26PM +0000, Myklebust, Trond wrote:
> > On Wed, 2012-04-18 at 14:57 +0100, Luis Henriques wrote:
> > > Hi Jeff,
> > > 
> > > On Wed, Apr 18, 2012 at 09:28:22AM -0400, Jeff Layton wrote:
> > > > On Wed, 18 Apr 2012 12:26:10 +0100
> > > > Luis Henriques <luis.henriques@xxxxxxxxxxxxx> wrote:
> > > > 
> > > > > Hi,
> > > > > 
> > > > > We have a bug reporting a regression in mainline kernel.  Basically, the
> > > > > bug reporters are seeing lots of messages:
> > > > > 
> > > > > [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > > > > [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > > > > [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed
> > > > > 
> > > > > This happens when mounting a user's home directory over NFS.
> > > > > 
> > > > > Is this a known issue being addressed at the moment?  Is there any
> > > > > information needed to help debugging the issue?
> > > > > 
> > > > > The original bug report can be found here:
> > > > > 
> > > > > http://bugs.launchpad.net/bugs/974664
> > > > > 
> > > > > And there's also a similar report for Fedora:
> > > > > 
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=811138
> > > > > 
> > > > > Cheers,
> > > > 
> > > > This code in nfs4_reclaim_open_state() looks wrong to me, but I'm not
> > > > that familiar with this code so I could be wrong:
> > > > 
> > > > -------------------[snip]--------------------
> > > >                 status = ops->recover_open(sp, state);
> > > >                 if (status >= 0) {
> > > >                         status = nfs4_reclaim_locks(state, ops);
> > > >                         if (status >= 0) {
> > > >                                 spin_lock(&state->state_lock);
> > > >                                 list_for_each_entry(lock, &state->lock_states, ls_locks) {
> > > >                                         if (!(lock->ls_flags & NFS_LOCK_INITIALIZED))
> > > >                                                 pr_warn_ratelimited("NFS: "
> > > >                                                         "%s: Lock reclaim "
> > > >                                                         "failed!\n", __func__);
> > > >                                 }
> > > >                                 spin_unlock(&state->state_lock);
> > > >                                 nfs4_put_open_state(state);
> > > >                                 goto restart;
> > > >                         }
> > > >                 }
> > > > -------------------[snip]--------------------
> > > > 
> > > > Shouldn't the status check after nfs4_reclaim_locks be reversed?
> > > 
> > > Thanks a lot for your help.  Could you please take a look at the patch
> > > below, just to make sure I understood you're suggestion correctly?  I will
> > > prepare a test kernel so that we can check whether it actually solves the
> > > problem or not.
> > > 
> > > Cheers,
> > > --
> > > Luis
> > > 
> > > 
> > > >From a1348f473c157439ac62f502eb45ca48f95e627f Mon Sep 17 00:00:00 2001
> > > From: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
> > > Date: Wed, 18 Apr 2012 14:50:10 +0100
> > > Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state()
> > > 
> > > There have been several bug reports, with the following messages on the
> > > logs:
> > > 
> > >  [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > >  [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > >  [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed
> > > 
> > > This happens, for example, when mounting a user's home directory over NFS.
> > > 
> > > Thanks to Jeff Layton that identified the cause, this patch fixes an
> > > incorrect status check on nfs4_reclaim_open_state().
> > > 
> > > Signed-off-by: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
> > > ---
> > >  fs/nfs/nfs4state.c |    2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> > > index 07354b7..8b6acec 100644
> > > --- a/fs/nfs/nfs4state.c
> > > +++ b/fs/nfs/nfs4state.c
> > > @@ -1180,7 +1180,7 @@ restart:
> > >  		atomic_inc(&state->count);
> > >  		spin_unlock(&sp->so_lock);
> > >  		status = ops->recover_open(sp, state);
> > > -		if (status >= 0) {
> > > +		if (status < 0) {
> > >  			status = nfs4_reclaim_locks(state, ops);
> > >  			if (status >= 0) {
> > >  				spin_lock(&state->state_lock);
> > 
> > Hell no! 
> 
> Ouch!  Wrong one...
> 
> From 005918b5eef853fd4d495743fef5a52ae62f825e Mon Sep 17 00:00:00 2001
> From: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
> Date: Wed, 18 Apr 2012 15:08:39 +0100
> Subject: [PATCH 1/1] NFS: Fix status check on nfs4_reclaim_open_state()
> 
> There have been several bug reports, with the following messages on the
> logs:
> 
>  [ 48.701213] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
>  [ 48.701990] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
>  [ 53.696076] nfs4_reclaim_open_state: 6440 callbacks suppressed
> 
> This happens, for example, when mounting a user's home directory over NFS.
> 
> Thanks to Jeff Layton that identified the cause, this patch fixes an
> incorrect status check on nfs4_reclaim_open_state().
> 
> Signed-off-by: Luis Henriques <luis.henriques@xxxxxxxxxxxxx>
> ---
>  fs/nfs/nfs4state.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c
> index 07354b7..023e09f 100644
> --- a/fs/nfs/nfs4state.c
> +++ b/fs/nfs/nfs4state.c
> @@ -1182,7 +1182,7 @@ restart:
>  		status = ops->recover_open(sp, state);
>  		if (status >= 0) {
>  			status = nfs4_reclaim_locks(state, ops);
> -			if (status >= 0) {
> +			if (status < 0) {
>  				spin_lock(&state->state_lock);
>  				list_for_each_entry(lock, &state->lock_states, ls_locks) {
>  					if (!(lock->ls_flags & NFS_LOCK_INITIALIZED))

Yeah, that looks more reasonable, but again I'm not sure about this
either way. This code has been this way for a long time and it's not
clear to me why it's only now become a problem if it is wrong.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux