On Thu 27-10-16 22:13:00, Ross Zwisler wrote:
> On Thu, Oct 27, 2016 at 09:48:41PM +0000, Kani, Toshimitsu wrote:
> > On Thu, 2016-10-27 at 15:03 -0600, Ross Zwisler wrote:
> > > On Thu, Oct 27, 2016 at 12:46:32PM -0700, Dan Williams wrote:
> > > >
> > > > On Thu, Oct 27, 2016 at 12:07 PM, Jan Kara <jack@xxxxxxx> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > When testing my DAX patches rebased on top of Ross' DAX PMD
> > > > > series, I've come across the following issue with generic/344
> > > > > test from xfstests. The test ends in an infinite fault loop when
> > > > > we fault index 0 over and over again never finishing the fault.
> > > > > The problem is that we do a write fault for index 0 when there
> > > > > is PMD for that index. So we enter wp_huge_pmd(). For whatever
> > > > > reason that returns VM_FAULT_FALLBACK so we continue to
> > > > > handle_pte_fault(). There we do
> > > > >
> > > > > 	if (pmd_trans_unstable(vmf->pmd) || pmd_devmap(*vmf->pmd))
> > > > >
> > > > > check which is true - the PMD we have is pmd_trans_huge() - so we
> > > > > 'return 0' and that results in retrying the fault and all happens
> > > > > from the beginning again.
> > > > >
> > > > > It isn't quite obvious how to break that cycle to me. The comment
> > > > > before pmd_none_or_trans_huge_or_clear_bad() goes to great
> > > > > lengths explaining possible races when PMD is pmd_trans_huge() so
> > > > > it needs careful evaluation what needs to be done for DAX. Ross,
> > > > > any idea?
> > > >
> > > > Can you bisect it with CONFIG_BROKEN removed from older kernels?
> > > >
> > > > I remember tracking down something like this when initially doing
> > > > the pmd support. It ended up being a missed pmd_devmap() check in
> > > > the fault path, so it may not be the same issue. It would at least
> > > > be interesting to see if 4.6 fails in a similar manner with this
> > > > test and FS_DAX_PMD enabled.
> > >
> > > I've been able to reproduce this with my v4.9-rc2 branch, but it
> > > doesn't reproduce with the old v4.6 kernel.
> >
> > Not sure if it's relevant, but as FYI I fixed a similar issue before.
> >
> > commit 59bf4fb9d386601cbaa70a9b00159abb846dedaa
> > dax: Split pmd map when fallback on COW
> >
> > -Toshi
>
> Thanks! Applying a similar patch solves this deadlock. Unfortunately I don't
> (yet?) understand this well enough to say whether this is the correct
> solution, but it makes generic/344 + PMDs pass. :)
>
> Does anyone with more mm knowledge have time to review?

I'm not really much into huge pages but AFAICT that should fix the problem.
I'm just not sure whether in other cases when we return VM_FAULT_FALLBACK
we don't need something similar. Probably this will need some experiments ;).

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
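
For readers following the thread, the retry cycle described in the original
report can be reduced to a toy model. This is a sketch only, not kernel code:
every toy_* name is hypothetical, and it assumes (going by the commit title
alone) that the fix amounts to splitting the huge PMD into a PTE table before
falling back to the PTE path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the fault loop; none of these names exist in the kernel. */
enum toy_pmd_state { TOY_PMD_HUGE, TOY_PMD_PTE_TABLE };
enum toy_fault_result { TOY_FAULT_DONE, TOY_FAULT_RETRY };

struct toy_mm {
	enum toy_pmd_state pmd;
};

/*
 * Stands in for the write-protect handler reporting a fallback: the huge
 * mapping cannot be made writable in place. When split_on_fallback is set,
 * the huge PMD is first replaced by a PTE table (the assumed fix).
 */
void toy_wp_huge_pmd_fallback(struct toy_mm *mm, bool split_on_fallback)
{
	if (split_on_fallback)
		mm->pmd = TOY_PMD_PTE_TABLE;
}

/* One pass through the toy fault handler. */
enum toy_fault_result toy_handle_fault(struct toy_mm *mm,
				       bool split_on_fallback)
{
	if (mm->pmd == TOY_PMD_HUGE)
		toy_wp_huge_pmd_fallback(mm, split_on_fallback);

	/*
	 * Models the check in the PTE path: if the PMD is still huge,
	 * bail out with "return 0", which makes the access fault again.
	 */
	if (mm->pmd == TOY_PMD_HUGE)
		return TOY_FAULT_RETRY;

	return TOY_FAULT_DONE;
}

/*
 * Re-fault until the handler finishes. Returns the number of attempts
 * used, or -1 if the fault was still being retried after max_attempts.
 */
int toy_fault_until_done(bool split_on_fallback, int max_attempts)
{
	struct toy_mm mm = { .pmd = TOY_PMD_HUGE };

	assert(max_attempts > 0);
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (toy_handle_fault(&mm, split_on_fallback) == TOY_FAULT_DONE)
			return attempt;
	}
	return -1;
}
```

With split_on_fallback false, nothing ever changes the PMD state, so
toy_fault_until_done() exhausts max_attempts and returns -1; with it true,
the first attempt completes. Whether other VM_FAULT_FALLBACK paths need the
same treatment is exactly the open question above and is not modelled here.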