Re: Rename restrictedmem => guardedmem? (was: Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM)

Ackerley Tng <ackerleytng@xxxxxxxxxx> · Thu, 08 Jun 2023 17:13:35 +0000

Sean Christopherson <seanjc@xxxxxxxxxx> writes:

...

Probably not.  And as you note below, it's all pretty nonsensical anyways.

     + What is the significance of these LRU flags when gmem doesn't  
support
       swapping/eviction?

Likely none.  I used the filemap APIs in my POC because it was easy, not  
because
it was necessarily the best approach, i.e. that the folios/pages show up  
in the
LRUs is an unwanted side effect, not a feature.  If guest_memfd only  
needs a small
subset of the filemap support, going with a true from-scratch  
implemenation on
top of xarray might be cleaner overall, e.g. would avoid the need for a  
new flag
to say "this folio can't be migrated even though it's on the LRUs".

For hugetlb support on gmem, using an xarray in place of a filemap should  
work
fine. Page migration could come up in future - perhaps migration code works
better with filemap? Not sure.

+ "KVM: guest_mem: Align so that at least 1 page is allocated"
     + Bug in current implementation: without this alignment, fallocate()  
of
       a size less than the gmem page size will result in no allocation  
at all

I'm not convinced this is a bug.  I don't see any reason to allow  
allocating and
punching holes in sub-page granularity.

I looked at the code more closely, you're right. len is checked to be 4K
aligned. When userspace requests a gmem page size of larger than 4K (gmem  
THP),
the allocation loop still does the right thing.

This issue only arises for hugetlb pages. I'll rebase the next revision of  
the
hugetlb series accordingly.

     + Both shmem and hugetlbfs perform this alignment
+ "KVM: guest_mem: Add alignment checks"
     + Implemented the alignment checks for guest_mem because hugetlb on  
gmem
       would hit a BUG_ON without this check
+ "KVM: guest_mem: Prevent overflows in kvm_gmem_invalidate_begin()"
     + Sean fixed a bug in the offset-to-gfn conversion in
       kvm_gmem_invalidate_begin() earlier, adding a WARN_ON_ONCE()

As Mike pointed out, there's likely still a bug here[*].  I was planning  
on
diving into that last week, but that never happened.  If you or anyone  
else can
take a peek and/or write a testcase, that would be awesome.

  : Heh, only if there's a testcase for it.  Assuming start >= the slot  
offset does
  : seem broken, e.g. if the range-to-invalidate overlaps multiple slots,  
later slots
  : will have index==slot->gmem.index > start.
  :
  : > Since 'index' corresponds to the gmem offset of the current slot, is  
there any
  : > reason not to do something like this?:
  : >
  : >   .start = slot->base_gfn + index - slot->gmem.index,
  : >
  : > But then, if that's the case, wouldn't index == slot->gmem.index?  
Suggesting
  : > we case just simplify to this?:
  : >
  : >   .start = slot->base_gfn,
  :
  : No, e.g. if start is partway through a memslot, there's no need to  
invalidate
  : the entire memslot.  I'll stare at this tomorrow when my brain is  
hopefully a
  : bit more functional, I suspect there is a min() and/or max() needed  
somewhere.

[*] https://lore.kernel.org/all/20230512002124.3sap3kzxpegwj3n2@xxxxxxx

I think I have fixed this, please see "KVM: guest_mem: Prevent overflows in
kvm_gmem_invalidate_begin()" [1].

This patch does take into account that start could be greater than
slot->gmem.index, when userspace chooses to punch holes beginning in the  
middle
of the memslot.

The process could be split into figuring out file indices, then GFNs:

1. Figure out the start and end in terms of index in the file
    + index_start: taking max(start, slot->gmem.index)
        + start will only be greater than slot->gmem.index but not greater  
than
          the end of the slot, due to the nature of xa_for_each_range()
    + index_end: taking min(end, slot->gmem.index + slot->npages)
2. Convert indices to GFNs

This also prevents overflows as described at [1].

...

[1]  
https://github.com/googleprodkernel/linux-cc/commit/bcc304e3657a998b8f61aa1b841754fbb90d8994