On Wed, Feb 5, 2025 at 9:39 AM Vishal Annapurve <vannapurve@xxxxxxxxxx> wrote:
>
> On Wed, Feb 5, 2025 at 2:07 AM Fuad Tabba <tabba@xxxxxxxxxx> wrote:
> >
> > Hi Vishal,
> >
> > On Wed, 5 Feb 2025 at 00:42, Vishal Annapurve <vannapurve@xxxxxxxxxx> wrote:
> > >
> > > On Fri, Jan 17, 2025 at 8:30 AM Fuad Tabba <tabba@xxxxxxxxxx> wrote:
> > > >
> > > > Before transitioning a guest_memfd folio to unshared, thereby
> > > > disallowing access by the host and allowing the hypervisor to
> > > > transition its view of the guest page as private, we need to be
> > > > sure that the host doesn't have any references to the folio.
> > > >
> > > > This patch introduces a new type for guest_memfd folios, and uses
> > > > that to register a callback that informs the guest_memfd
> > > > subsystem when the last reference is dropped, therefore knowing
> > > > that the host doesn't have any remaining references.
> > > >
> > > > Signed-off-by: Fuad Tabba <tabba@xxxxxxxxxx>
> > > > ---
> > > > The function kvm_slot_gmem_register_callback() isn't used in this
> > > > series. It will be used later in code that performs unsharing of
> > > > memory. I have tested it with pKVM, based on downstream code [*].
> > > > It's included in this RFC since it demonstrates the plan to
> > > > handle unsharing of private folios.
> > > >
> > > > [*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.13-v5-pkvm
> > >
> > > Should the invocation of kvm_slot_gmem_register_callback() happen in
> > > the same critical block as setting the guest memfd range mappability
> > > to NONE, otherwise conversion/truncation could race with registration
> > > of callback?
> >
> > I don't think it needs to, at least not as far as potential races are
> > concerned. First because kvm_slot_gmem_register_callback() grabs the
> > mapping's invalidate_lock as well as the folio lock, and
> > gmem_clear_mappable() grabs the mapping lock and the folio lock if a
> > folio has been allocated before.
>
> I was hinting towards such a scenario:
> Core1
> Shared to private conversion
> -> Results in mappability attributes
> being set to NONE
> ...
> Trigger private to shared conversion/truncation for
> ...
> overlapping ranges
> ...
> kvm_slot_gmem_register_callback() on
> the guest_memfd ranges converted
> above (This will end up registering callback
> for guest_memfd ranges which possibly don't
> carry *_MAPPABILITY_NONE)
>

Sorry for the format mess above.

I was hinting towards such a scenario:

Core1
Shared to private conversion
  -> Results in mappability attributes being set to NONE
...
Core2
Trigger private to shared conversion/truncation for overlapping ranges
...
Core1
kvm_slot_gmem_register_callback() on the guest_memfd ranges converted
above (This will end up registering callback for guest_memfd ranges
which possibly don't carry *_MAPPABILITY_NONE)

> >
> > Second, __gmem_register_callback() checks before returning whether all
> > references have been dropped, and adjusts the mappability/shareability
> > if needed.
> >
> > Cheers,
> > /fuad
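
To make the interleaving above concrete, here is a minimal userspace sketch that replays it sequentially. Every name in it (the sketch_* types and functions) is hypothetical and exists only for illustration; it is not the patch's code and makes no claim about what kvm_slot_gmem_register_callback() does today. It only shows one way the window could be closed: the registration step re-checks the mappability state and refuses if the range is no longer NONE. In the kernel, that check and the registration would presumably have to sit under the same lock that conversion/truncation takes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins; not the kernel's actual types or names. */
enum sketch_mappability { SKETCH_MAPPABLE_ALL, SKETCH_MAPPABLE_NONE };

struct sketch_range {
	enum sketch_mappability state;
	bool callback_registered;
};

/* Core1, first step: shared -> private sets mappability to NONE. */
static void sketch_convert_to_private(struct sketch_range *r)
{
	r->state = SKETCH_MAPPABLE_NONE;
}

/* Core2: private -> shared (or truncation) on an overlapping range. */
static void sketch_convert_to_shared(struct sketch_range *r)
{
	r->state = SKETCH_MAPPABLE_ALL;
	r->callback_registered = false;
}

/*
 * Core1, second step: register the callback only if the range still
 * carries NONE. Assumption: a real implementation would perform this
 * check and the registration under the lock used by conversion.
 */
static int sketch_register_callback(struct sketch_range *r)
{
	if (r->state != SKETCH_MAPPABLE_NONE)
		return -EAGAIN;	/* range was converted back; caller retries */
	r->callback_registered = true;
	return 0;
}

/* Replays Core1 -> Core2 -> Core1 in order and returns the final result. */
static int sketch_replay_race(void)
{
	struct sketch_range r = { SKETCH_MAPPABLE_ALL, false };

	sketch_convert_to_private(&r);
	assert(r.state == SKETCH_MAPPABLE_NONE);
	sketch_convert_to_shared(&r);		/* Core2 sneaks in here */
	return sketch_register_callback(&r);	/* rejected, state is not NONE */
}
```

With the re-check in place, the replayed interleaving ends with the registration being rejected rather than a callback attached to a range that is mappable again. Without it, the last step would succeed on a range that no longer carries NONE, which is the case I was asking about.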