On Fri, Dec 24, 2021 at 11:54:18AM +0800, Chao Peng wrote:
> On Thu, Dec 23, 2021 at 06:02:33PM +0000, Sean Christopherson wrote:
> > On Thu, Dec 23, 2021, Chao Peng wrote:
> > > Similar to hva_tree for hva range, maintain interval tree ofs_tree for
> > > offset range of a fd-based memslot so the lookup by offset range can be
> > > faster when memslot count is high.
> >
> > This won't work. The hva_tree relies on there being exactly one virtual address
> > space, whereas with private memory, userspace can map multiple files into the
> > guest at different gfns, but with overlapping offsets.
>
> OK, that's the point.
> >
> > I also dislike hijacking __kvm_handle_hva_range() in patch 07.
> >
> > KVM also needs to disallow mapping the same file+offset into multiple gfns, which
> > I don't see anywhere in this series.
>
> This can be checked against file+offset overlapping with existing slots
> when register a new one.
> >
> > In other words, there needs to be a 1:1 gfn:file+offset mapping. Since userspace
> > likely wants to allocate a single file for guest private memory and map it into
> > multiple discontiguous slots, e.g. to skip the PCI hole, the best idea off the top
> > of my head would be to register the notifier on a per-slot basis, not a per-VM
> > basis. It would require a 'struct kvm *' in 'struct kvm_memory_slot', but that's
> > not a huge deal.
> >
> > That way, KVM's notifier callback already knows the memslot and can compute overlap
> > between the memslot and the range by reversing the math done by kvm_memfd_get_pfn().
> > Then, armed with the gfn and slot, invalidation is just a matter of constructing
> > a struct kvm_gfn_range and invoking kvm_unmap_gfn_range().
>
> KVM is easy but the kernel bits would be difficulty, it has to maintain
> fd+offset to memslot mapping because one fd can have multiple memslots,
> it need decide which memslot needs to be notified.

How about passing a "context" for the fd (e.g. the gfn/hva start point)
when registering the invalidation notifier with the fd? Then, in the
callback, KVM can convert the offset to an absolute hva/gfn using that
"context" and do the memslot invalidation.

>
> Thanks,
> Chao
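
To make the "context" idea a bit more concrete, here is a rough,
untested-in-kernel sketch of the offset-to-gfn conversion the callback
would do. It is plain userspace C for illustration only: the struct
memfd_notifier_ctx and ctx_offset_range_to_gfn_range() names are made up
here and are not part of KVM or of this series, and I am assuming the
invalidated range is given in page offsets as a half-open [start, end).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-registration context (not an existing KVM structure).
 * One instance would be handed to the fd when the notifier is registered.
 */
struct memfd_notifier_ctx {
	uint64_t base_gfn;	/* first gfn backed by this registration */
	uint64_t base_pgoff;	/* page offset in the fd where it starts */
	uint64_t npages;	/* number of pages covered */
};

/*
 * Clamp the invalidated page-offset range [start, end) to the range
 * described by @ctx and translate the overlap into a gfn range.
 * Returns false if the two ranges do not overlap.
 */
static bool ctx_offset_range_to_gfn_range(const struct memfd_notifier_ctx *ctx,
					  uint64_t start, uint64_t end,
					  uint64_t *gfn_start, uint64_t *gfn_end)
{
	uint64_t slot_start = ctx->base_pgoff;
	uint64_t slot_end = ctx->base_pgoff + ctx->npages;
	uint64_t lo, hi;

	assert(start <= end);

	lo = start > slot_start ? start : slot_start;
	hi = end < slot_end ? end : slot_end;
	if (lo >= hi)
		return false;

	/* Reverse of "offset = base_pgoff + (gfn - base_gfn)". */
	*gfn_start = ctx->base_gfn + (lo - slot_start);
	*gfn_end = ctx->base_gfn + (hi - slot_start);
	return true;
}
```

The resulting [gfn_start, gfn_end) is what would be placed into a struct
kvm_gfn_range before calling kvm_unmap_gfn_range(). With one context per
registration, two slots that share an fd get distinct contexts, so the fd
side would not need its own fd+offset to memslot lookup.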