On Mon, Mar 16, 2020 at 10:05:03AM +0100, Christoph Hellwig wrote: > On Wed, Mar 11, 2020 at 03:35:01PM -0300, Jason Gunthorpe wrote: > > From: Jason Gunthorpe <jgg@xxxxxxxxxxxx> > > > > This eventually calls into handle_mm_fault() which is a sleeping function. > > Release the lock first. > > > > hmm_vma_walk_hole() does not touch the contents of the PUD, so it does not > > need the lock. > > So how did this manage to not be noticed before? I'm still struggling a bit with how this PUD stuff works.. However AFAICT: 1) The first hunk around pud_none() is actualy covering a race. In the non-race case the page walker will directly call hmm_vma_walk_hole() for pud_none so it would be very hard to hit this 2) I'm not 100% sure, but I think pud_present() == pud_none(), as there is no swap entry for PUD I don't know what a non-present but non-none entry is or how to create one. This is possibly dead code due to #1 3) To hit hmm_range_need_fault() requesting fault you would need a read-only huge PUD and a fault requesting write. I suspect creating a read only huge PUD entry is very rare - not something someone would deliberately construct. 4) To even trigger the PUD flow at all you need the 1G THP or the special 1G DAX pages which I strongly suspect people are not testing with. > The fix looks fine assuming we want something backportable before > starting the cleanups: I found it easier to make the other cleanup patches make sense if they didn't introduce all kinds of new logic too.. Thanks, Jason