On Fri, 16 Jan 2015 13:53:04 -0500
Jeff Layton <jlayton@xxxxxxxxxxxxxxx> wrote:

> On Fri, 16 Jan 2015 13:10:46 -0500
> Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
>
> > On 01/16/2015 09:40 AM, Jeff Layton wrote:
> > > On Fri, 16 Jan 2015 09:31:23 -0500
> > > Sasha Levin <sasha.levin@xxxxxxxxxx> wrote:
> > >
> > >> On 01/15/2015 03:22 PM, Jeff Layton wrote:
> > >>> Ok, I tried to reproduce it with that and several variations but it
> > >>> still doesn't seem to do it for me. Can you try the latest linux-next
> > >>> tree and see if it's still reproducible there?
> > >>
> > >> It's still not in today's -next, could you send me a patch for testing
> > >> instead?
> > >>
> > >
> > > Seems to be there for me:
> > >
> > > ----------------------[snip]-----------------------
> > > /*
> > >  * This function is called on the last close of an open file.
> > >  */
> > > void locks_remove_file(struct file *filp)
> > > {
> > > 	/* ensure that we see any assignment of i_flctx */
> > > 	smp_rmb();
> > >
> > > 	/* remove any OFD locks */
> > > 	locks_remove_posix(filp, filp);
> > > ----------------------[snip]-----------------------
> > >
> > > That's actually the right place to put the barrier, I think. We just
> > > need to ensure that this function sees any assignment to i_flctx that
> > > occurred before this point. By the time we're here, we shouldn't be
> > > getting any new locks that matter to this close since the fcheck call
> > > should fail on any new requests.
> > >
> > > If that works, then I'll probably make some other changes to the set
> > > and re-post it next week.
> > >
> > > Many thanks for helping me test this!
> >
> > You're right, I somehow missed that.
> >
> > But it doesn't fix the issue, I still see it happening, but it seems
> > to be less frequent(?).
> >
>
> Ok, that was my worry (and one of the reasons I really would like to
> find some way to reproduce this on my own). I think what I'll do at
> this point is pull the patchset from linux-next until I can consult
> with someone who understands this sort of cache-coherency problem
> better than I do.
>
> Once I get it resolved, I'll push it back to my linux-next branch and
> let you know and we can give it another go.
>
> Thanks for the testing so far!

Actually, I take it back. One more try...

I dragooned David Howells into helping me look at this and he talked me
into just going back to using the i_lock to protect the i_flctx
assignment. My hope is that will work around whatever strange effect is
causing this. Can you test tomorrow's -next tree (once it's been
merged) and see whether this is still reproducible?

If that works, then I may go back to trying to do this locklessly with
cmpxchg, but I'll probably need to corner Paul McKenney and buy him a
beverage of his choice so he can talk me through how to do it properly.

Thanks again!
-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>
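
For readers following the thread, the two approaches mentioned above (taking
i_lock around the i_flctx assignment versus publishing the pointer locklessly
with cmpxchg) can be sketched roughly as below. This is a userspace analogy
using pthreads and C11 atomics, not the kernel patch under discussion:
struct fake_inode, struct lock_ctx, fake_inode_init(), get_ctx_locked() and
get_ctx_cmpxchg() are all hypothetical names invented for illustration, and
the real kernel code would use a spinlock, a slab cache, and the kernel's own
cmpxchg() and barrier primitives instead.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-inode lock context. */
struct lock_ctx {
    int nlocks;
};

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct fake_inode {
    pthread_mutex_t lock;             /* plays the role of i_lock  */
    _Atomic(struct lock_ctx *) flctx; /* plays the role of i_flctx */
};

void fake_inode_init(struct fake_inode *inode)
{
    assert(inode != NULL);
    pthread_mutex_init(&inode->lock, NULL);
    atomic_init(&inode->flctx, NULL);
}

/*
 * Variant 1: the check and the assignment both happen under the lock,
 * so only one caller can ever install a context. (A kernel version
 * would allocate before taking a spinlock; a mutex lets this sketch
 * allocate while holding it.)
 */
struct lock_ctx *get_ctx_locked(struct fake_inode *inode)
{
    struct lock_ctx *ctx;

    assert(inode != NULL);
    pthread_mutex_lock(&inode->lock);
    ctx = atomic_load(&inode->flctx);
    if (ctx == NULL) {
        ctx = calloc(1, sizeof(*ctx));
        if (ctx != NULL)
            atomic_store(&inode->flctx, ctx);
    }
    pthread_mutex_unlock(&inode->lock);
    return ctx;
}

/*
 * Variant 2: lockless. Allocate speculatively, then try to swing the
 * pointer from NULL to the new context with a compare-and-exchange.
 * The loser of a race frees its copy and uses the winner's.
 */
struct lock_ctx *get_ctx_cmpxchg(struct fake_inode *inode)
{
    struct lock_ctx *expected, *new_ctx;

    assert(inode != NULL);
    expected = atomic_load_explicit(&inode->flctx, memory_order_acquire);
    if (expected != NULL)
        return expected;

    new_ctx = calloc(1, sizeof(*new_ctx));
    if (new_ctx == NULL)
        return NULL;

    expected = NULL;
    if (atomic_compare_exchange_strong_explicit(&inode->flctx, &expected,
                                                new_ctx,
                                                memory_order_acq_rel,
                                                memory_order_acquire))
        return new_ctx;

    /* Another thread installed a context first; expected now holds it. */
    free(new_ctx);
    return expected;
}
```

In both variants the pointer goes from NULL to a valid context exactly once
and is never replaced afterwards. The difference is only in how a reader is
guaranteed to observe the assignment: through the lock in the first case,
and through acquire/release ordering on the pointer itself in the second.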