Re: [PATCH v6 01/10] mm: add zone device coherent type memory support

On 2022-02-15 16:47, Jason Gunthorpe wrote:
On Tue, Feb 15, 2022 at 04:35:56PM -0500, Felix Kuehling wrote:
On 2022-02-15 14:41, Jason Gunthorpe wrote:
On Tue, Feb 15, 2022 at 07:32:09PM +0100, Christoph Hellwig wrote:
On Tue, Feb 15, 2022 at 10:45:24AM -0400, Jason Gunthorpe wrote:
Do you know if DEVICE_GENERIC pages would end up as PageAnon()? My
assumption was that they would be part of a special mapping.
We need to stop using the special PTEs and VMAs for things that have a
struct page. This is a mistake DAX created that must be undone.
Yes, we'll get to it.  Maybe we can do it for the non-DAX devmap
ptes first given that DAX is more complicated.
Probably, I think we can check the page->pgmap type to tell the
difference.

I'm not sure how the DEVICE_GENERIC can work without this, as DAX was
made safe by using the unmap_mapping_range(), which won't work
here. Is there some other trick being used to keep track of references
inside the AMD driver?
Not sure I'm following all the discussion about VMAs and DAX. So I may be
answering the wrong question: We treat each ZONE_DEVICE page as a reference
to the BO (buffer object) that backs the page. We increment the BO refcount
for each page we migrate into it. In the dev_pagemap_ops.page_free callback
we drop that reference. Once all pages backed by a BO are freed, the BO
refcount reaches 0 [*] and we can free the BO allocation.
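
For illustration only, the scheme described above can be modelled in a few lines of userspace C. This is a sketch, not driver code: struct toy_bo, struct toy_page and every toy_* function are hypothetical names, the model counts only the per-page references, and it ignores whatever the [*] footnote covers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver buffer object (BO). */
struct toy_bo {
	int refcount;	/* one reference per device page backed by this BO */
	int freed;	/* set to 1 once the backing allocation is released */
};

/* Hypothetical stand-in for a ZONE_DEVICE page. */
struct toy_page {
	struct toy_bo *bo;	/* BO backing this page, NULL when unused */
};

/* Migrating data into a device page takes one reference on its BO. */
static void toy_migrate_to_device(struct toy_page *page, struct toy_bo *bo)
{
	bo->refcount++;
	page->bo = bo;
}

/*
 * Models the role of the dev_pagemap_ops.page_free callback: the page
 * is no longer in use, so drop the reference it held on the BO. When
 * the last page reference goes away, the BO allocation is released.
 */
static void toy_page_free(struct toy_page *page)
{
	struct toy_bo *bo = page->bo;

	page->bo = NULL;
	if (--bo->refcount == 0)
		bo->freed = 1;
}

/* Walks two pages through migrate and free; returns 1 if the BO was freed. */
int toy_bo_demo(void)
{
	struct toy_bo bo = { 0, 0 };
	struct toy_page pages[2] = { { NULL }, { NULL } };

	toy_migrate_to_device(&pages[0], &bo);
	toy_migrate_to_device(&pages[1], &bo);
	toy_page_free(&pages[0]);	/* BO still referenced by pages[1] */
	toy_page_free(&pages[1]);	/* last reference dropped here */
	return bo.freed;
}
```

Note that the model says nothing about who calls toy_page_free(); it only shows that the BO lifetime follows the page lifetime once that call happens.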
Userspace does
  1) mmap(MAP_PRIVATE) to allocate anon memory
  2) something to trigger migration to install a ZONE_DEVICE page
  3) munmap()

Who decrements the refcount on the munmap?

When a ZONE_DEVICE page is installed in the PTE it is supposed to be
marked as pte_devmap, and that disables all the normal page refcounting
during munmap().

fsdax makes this work by working the refcounts backwards: the page is
refcounted while it exists in the driver. When the driver decides to
remove it, unmap_mapping_range() is called to purge it from all
PTEs and then the refcount is decremented. munmap/fork/etc don't change
the refcount.
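
As a rough userspace model of that ordering (a sketch under stated assumptions, not the fsdax implementation): the page holds one reference for as long as the driver owns it, mapping and unmapping leave the count alone, and removal first purges every PTE and only then drops the reference. All toy_* names are hypothetical, and toy_unmap_all() merely stands in for the role unmap_mapping_range() plays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a device page owned by a driver. */
struct toy_dpage {
	int refcount;
};

/* Hypothetical stand-in for a PTE; page == NULL means "not present". */
struct toy_pte {
	struct toy_dpage *page;
};

/* The driver takes the only reference when it starts using the page. */
static void toy_driver_add(struct toy_dpage *p)
{
	p->refcount = 1;
}

/* Installing the page in a PTE does not touch the refcount. */
static void toy_map(struct toy_pte *pte, struct toy_dpage *p)
{
	pte->page = p;
}

/* munmap() of a single PTE does not touch the refcount either. */
static void toy_munmap(struct toy_pte *pte)
{
	pte->page = NULL;
}

/* Stand-in for purging the page from all PTEs that still map it. */
static void toy_unmap_all(struct toy_pte *ptes, size_t n, struct toy_dpage *p)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ptes[i].page == p)
			ptes[i].page = NULL;
}

/* Driver removal: purge the mappings first, then drop the reference. */
static void toy_driver_remove(struct toy_pte *ptes, size_t n,
			      struct toy_dpage *p)
{
	toy_unmap_all(ptes, n, p);
	p->refcount--;
}

/* Returns the refcount seen after munmap() but before driver removal. */
int toy_fsdax_demo(void)
{
	struct toy_dpage page = { 0 };
	struct toy_pte ptes[2] = { { NULL }, { NULL } };
	int after_munmap;

	toy_driver_add(&page);
	toy_map(&ptes[0], &page);
	toy_map(&ptes[1], &page);
	toy_munmap(&ptes[0]);
	after_munmap = page.refcount;	/* unchanged by map/munmap */
	toy_driver_remove(ptes, 2, &page);
	return after_munmap;
}
```

In this model nothing in the munmap path ever reaches the refcount, which is the property the question above is about.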

Hmm, that just means that whether or not there are PTEs doesn't really
matter. It should work the same as it does for DEVICE_PRIVATE pages. I'm
not sure where DEVICE_PRIVATE pages' refcounts are decremented on unmap,
TBH. But I can't find it in our driver, or in the test_hmm driver for
that matter.

Regards,
  Felix


Jason