Re: fs: locks: WARNING: CPU: 16 PID: 4296 at fs/locks.c:236 locks_free_lock_context+0x10d/0x240()

On Fri, 16 Jan 2015 13:53:04 -0500
Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:

> On Fri, 16 Jan 2015 13:10:46 -0500
> Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
> 
> > On 01/16/2015 09:40 AM, Jeff Layton wrote:
> > > On Fri, 16 Jan 2015 09:31:23 -0500
> > > Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
> > > 
> > >> On 01/15/2015 03:22 PM, Jeff Layton wrote:
> > >>> Ok, I tried to reproduce it with that and several variations but it
> > >>> still doesn't seem to do it for me. Can you try the latest linux-next
> > >>> tree and see if it's still reproducible there?
> > >>
> > >> It's still not in today's -next, could you send me a patch for testing
> > >> instead?
> > >>
> > > 
> > > Seems to be there for me:
> > > 
> > > ----------------------[snip]-----------------------
> > > /*
> > >  * This function is called on the last close of an open file.
> > >  */
> > > void locks_remove_file(struct file *filp)
> > > {
> > >         /* ensure that we see any assignment of i_flctx */
> > >         smp_rmb();
> > > 
> > >         /* remove any OFD locks */
> > >         locks_remove_posix(filp, filp);
> > > ----------------------[snip]-----------------------
> > > 
> > > That's actually the right place to put the barrier, I think. We just
> > > need to ensure that this function sees any assignment to i_flctx that
> > > occurred before this point. By the time we're here, we shouldn't be
> > > getting any new locks that matter to this close since the fcheck call
> > > should fail on any new requests.
> > > 
> > > If that works, then I'll probably make some other changes to the set
> > > and re-post it next week.
> > > 
> > > Many thanks for helping me test this!
> > 
> > You're right, I somehow missed that.
> > 
> > But it doesn't fix the issue, I still see it happening, but it seems
> > to be less frequent(?).
> > 
> 
> Ok, that was my worry (and one of the reasons I really would like to
> find some way to reproduce this on my own). I think what I'll do at
> this point is pull the patchset from linux-next until I can consult
> with someone who understands this sort of cache-coherency problem
> better than I do.
> 
> Once I get it resolved, I'll push it back to my linux-next branch and
> let you know and we can give it another go.
> 
> Thanks for the testing so far!

Actually, I take it back. One more try...

I dragooned David Howells into helping me look at this and he talked me
into just going back to using the i_lock to protect the i_flctx
assignment.
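
For illustration only, here is a minimal userspace sketch of what
lock-protected assignment of a lazily allocated context can look like.
It is not the kernel patch: a pthread mutex stands in for i_lock, and
struct demo_inode, struct demo_ctx and the demo_* helpers are
hypothetical names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-inode lock context. */
struct demo_ctx {
	int nr_locks;
};

/* Hypothetical stand-in for struct inode. */
struct demo_inode {
	pthread_mutex_t lock;   /* plays the role of i_lock */
	struct demo_ctx *flctx; /* plays the role of i_flctx */
};

static void demo_inode_init(struct demo_inode *inode)
{
	pthread_mutex_init(&inode->lock, NULL);
	inode->flctx = NULL;
}

/* Read the pointer under the lock, so any earlier assignment is visible. */
static struct demo_ctx *demo_peek_ctx(struct demo_inode *inode)
{
	struct demo_ctx *ctx;

	pthread_mutex_lock(&inode->lock);
	ctx = inode->flctx;
	pthread_mutex_unlock(&inode->lock);
	return ctx;
}

/* Return the context, allocating and installing it on first use. */
static struct demo_ctx *demo_get_ctx(struct demo_inode *inode)
{
	struct demo_ctx *ctx, *new_ctx;

	ctx = demo_peek_ctx(inode);
	if (ctx)
		return ctx;

	/* Allocate outside the lock, then install only if still unset. */
	new_ctx = calloc(1, sizeof(*new_ctx));
	if (!new_ctx)
		return NULL;

	pthread_mutex_lock(&inode->lock);
	if (!inode->flctx) {
		inode->flctx = new_ctx;
		new_ctx = NULL; /* ownership moved to the inode */
	}
	ctx = inode->flctx;
	pthread_mutex_unlock(&inode->lock);

	free(new_ctx); /* no-op unless another thread installed first */
	return ctx;
}
```

Because both the store and the teardown-side load happen under the same
lock, no separate read barrier is needed in this sketch.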

My hope is that will work around whatever strange effect is causing
this. Can you test tomorrow's -next tree (once it's been merged) and see
whether this is still reproducible?

If that works, then I may go back to trying to do this locklessly with
cmpxchg, but I'll probably need to corner Paul McKenney and buy him a
beverage of his choice so he can talk me through how to do it properly.
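
As a rough idea of the lockless variant, the sketch below uses a C11
compare-and-exchange as a userspace analog of the kernel's cmpxchg. It
is an assumption-laden illustration, not the eventual patch: struct
cas_inode, struct cas_ctx and cas_get_ctx are hypothetical names, and
the memory orderings shown are one plausible choice rather than a
vetted one.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-inode lock context. */
struct cas_ctx {
	int nr_locks;
};

/* Hypothetical stand-in for struct inode; flctx plays the role of i_flctx. */
struct cas_inode {
	_Atomic(struct cas_ctx *) flctx;
};

/* Return the context, installing a new one with compare-and-exchange. */
static struct cas_ctx *cas_get_ctx(struct cas_inode *inode)
{
	struct cas_ctx *expected = NULL;
	struct cas_ctx *ctx, *new_ctx;

	/* Acquire load pairs with the release half of a successful exchange. */
	ctx = atomic_load_explicit(&inode->flctx, memory_order_acquire);
	if (ctx)
		return ctx;

	new_ctx = calloc(1, sizeof(*new_ctx));
	if (!new_ctx)
		return NULL;

	/* Install new_ctx only if the pointer is still NULL. */
	if (atomic_compare_exchange_strong_explicit(&inode->flctx, &expected,
						    new_ctx,
						    memory_order_acq_rel,
						    memory_order_acquire))
		return new_ctx;

	/* Lost the race: expected now holds the winner's pointer. */
	free(new_ctx);
	return expected;
}
```

The teardown side would still need an acquire load (or equivalent) of
the pointer to be sure it observes a concurrent install, which is the
part that needs careful review.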

Thanks again!
-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>