RE: [RFC v1 1/3] mm/mmu_notifier: Add a new notifier for mapping updates (new pages)

Hi Jason,

> > >
> > > > No, adding HMM_PFN_REQ_WRITE still doesn't help in fixing the issue.
> > > > Although I do not have THP enabled (or built-in), shmem does not evict
> > > > the pages after hole punch, as noted in the comment in shmem_fallocate():
> > >
> > > This is the source of all your problems.
> > >
> > > Things that are mm-centric are supposed to track the VMAs and changes
> > > to the PTEs. If you do something in userspace and it doesn't cause the
> > > CPU page tables to change then it certainly shouldn't cause any mmu
> > > notifiers or hmm_range_fault changes.
> > I am not doing anything out of the blue in userspace. I think the
> > behavior I am seeing with shmem (where an invalidation event
> > (MMU_NOTIFY_CLEAR) does occur because of a hole punch but the PTEs
> > don't really get updated) can arguably be considered an optimization.
> 
> Your explanations don't make sense.
> 
> If MMU_NOTIFY_CLEAR was sent but the PTEs were left present then:
> 
> > > There should still be an invalidation notifier at some point when the
> > > CPU tables do eventually change, whenever that is. Missing that
> > > notification would be a bug.
> > I clearly do not see any notification getting triggered (from both
> > shmem_fault() and hugetlb_fault()) when the PTEs do get updated as the
> > hole is refilled due to writes. Are you saying that there needs to be an
> > invalidation event (MMU_NOTIFY_CLEAR?) dispatched at this point?
> 
> You don't get to get shmem_fault in the first place.
What I am observing is that even after MMU_NOTIFY_CLEAR (hole punch) is sent,
hmm_range_fault() finds that the PTEs associated with the hole are still pte_present().
I think it remains this way as long as there are only reads on the hole. Once there
is a write, shmem_fault() is triggered, which results in the PTEs getting updated
without any notification.

> 
> If they were marked non-present during the CLEAR then the shadow side
> remains non-present until it gets its own fault.
> 
> If they were made non-present without an invalidation then that is a
> bug.
> 
> > > hmm_range_fault() is the correct API to use if you are working with
> > > notifiers. Do not hack something together using pin_user_pages.
> 
> > I noticed that hmm_range_fault() does not seem to be working as expected,
> > given that it gets stuck (hangs) while walking hugetlb pages.
> 
> You are the first to report that, it sounds like a serious bug. Please
> try to fix it.
> 
> > Regardless, as I mentioned above, the lack of notification when PTEs
> > do get updated due to writes is the crux of the issue
> > here. Therefore, AFAIU, triggering an invalidation event or some
> > other kind of notification would help in fixing this issue.
> 
> You seem to be facing some kind of bug in the mm, it sounds pretty
> serious, and it almost certainly is a missing invalidation.
> 
> Basically, anything that changes a PTE must eventually trigger an
> invalidation. It is illegal to change a PTE from one present value to
> another present value without invalidation notification.
> 
> It is not surprising something would be missed here.
As you suggest, it looks like the root cause of this issue is the missing
invalidation notification when the PTEs are changed from one present
value to another. I'd like to fix this issue eventually, but I first need to
focus on addressing udmabuf page migration (out of the movable zone)
and also look into the locking concerns Daniel mentioned about pairing
static and dynamic dma-buf exporters and importers.
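For reference, my understanding of the rule you describe is that any path
replacing a present PTE with a new present value should be bracketed by the
usual notifier pair. A rough sketch of the expected shape (illustrative only,
not a patch; the exact mmu_notifier_range_init() arguments vary across kernel
versions):

```c
/* Sketch: bracket a present -> present PTE update with the
 * invalidate_start/end notifier pair so secondary MMUs see it. */
struct mmu_notifier_range range;

mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma->vm_mm,
			addr, addr + PAGE_SIZE);
mmu_notifier_invalidate_range_start(&range);
/* ... update the PTE: ptep_clear_flush(), set_pte_at(), ... */
mmu_notifier_invalidate_range_end(&range);
```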

Thanks,
Vivek
