On Fri, 31 Jul 2015, Linus Torvalds wrote:
> On Fri, Jul 31, 2015 at 10:46 AM, Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> >
> > Sounds like a dcache problem, and 75a6f82a0d10 seemed the only
> > likely candidate, so I experimented with reverting it yesterday,
> > and ran successfully for 24 hours.
>
> Hmm. Sounds odd. Are you running nfsd? That would explain why it
> happens on ext4 but not tmpfs: ext4 has a get_parent method that can
> get a disconnected entry, while tmpfs does not.
>
> That said, your load doesn't sound like it would actually ever trigger
> this, unless you just didn't mention that you also end up using that
> filesystem over nfs on another machine.

No, no nfsd nor any kind of networking filesystem stuff going on.

Right, I never looked to see what DCACHE_DISCONNECTED is actually about,
just rushed ahead and tried running with the revert.

>
> So leave it running a while longer, but maybe it's 4bf46a272647 like
> Dominique suspects. Although I don't see how that could trigger
> anything either..

I restarted with a slightly different version of the load this morning,
which has sometimes shown the issue more easily - I thought it better to
restart with a variant than persist with a run that might have settled
into a protected pattern. We'll see what that shows later on.

It will indeed be weird and odd if it confirms that DCACHE_DISCONNECTED
revert is good.

I agree that Dominique's 4bf46a272647 seems now more likely, if still
unlikely; but that was included in v4.1, and I saw no problem with v4.1
once the rmap_walk() skip was fixed.

There may be some completely unrelated commit which alters the timing
enough to expose or mask whatever is the guilty commit. Or something
corrupting dentry->d_flags occasionally.

Hugh