On Thu, Feb 22, 2024, Fuad Tabba wrote:
> +#ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM_MAPPABLE
> +bool kvm_is_gmem_mapped(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
> +{
> +	struct kvm_memslot_iter iter;
> +
> +	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), gfn_start, gfn_end) {
> +		struct kvm_memory_slot *memslot = iter.slot;
> +		gfn_t start, end, i;
> +
> +		start = max(gfn_start, memslot->base_gfn);
> +		end = min(gfn_end, memslot->base_gfn + memslot->npages);
> +		if (WARN_ON_ONCE(start >= end))
> +			continue;
> +
> +		for (i = start; i < end; i++) {
> +			struct page *page;
> +			bool is_mapped;
> +			kvm_pfn_t pfn;
> +			int ret;
> +
> +			/*
> +			 * Check the page_mapcount with the page lock held to
> +			 * avoid racing with kvm_gmem_fault().
> +			 */

I don't see how this avoids a TOCTOU race.  kvm_gmem_fault() presumably runs
with mmap_lock, but it definitely doesn't take slots_lock.  And this has
slots_lock, but definitely doesn't have mmap_lock.  If the fault is blocked
on the page lock, this will see page_mapcount() = 0, and the fault will map
the page as soon as unlock_page() runs.  Am I missing something?

I haven't thought deeply about this, but I'm pretty sure that "can this be
mapped" needs to be tracked against the guest_memfd() inode, not in KVM.
While each guest_memfd() *file* has a 1:1 binding with a KVM instance, the
plan is to allow multiple files per inode, e.g. to allow intra-host migration
to a new KVM instance without destroying guest_memfd().
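
To illustrate the idea, here's a rough sketch (completely untested, and all
of the struct/field/function names below are made up): if the "mappable"
state hangs off the inode's private data and is flipped under a lock that the
fault path also takes, the state is shared by every file bound to the inode,
and there's no page-lock TOCTOU window to worry about.

	/*
	 * Hypothetical inode-private state for guest_memfd.  Lives on the
	 * inode so it survives intra-host migration to a new KVM instance.
	 */
	struct gmem_inode_private {
		struct rw_semaphore mappable_lock;
		bool mappable;
	};

	static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
	{
		struct inode *inode = file_inode(vmf->vma->vm_file);
		struct gmem_inode_private *priv = inode->i_private;
		vm_fault_t ret = VM_FAULT_SIGBUS;

		down_read(&priv->mappable_lock);
		if (priv->mappable) {
			/* ... look up the folio and install the PTE ... */
			ret = 0;
		}
		up_read(&priv->mappable_lock);
		return ret;
	}

	static int gmem_make_unmappable(struct inode *inode)
	{
		struct gmem_inode_private *priv = inode->i_private;

		down_write(&priv->mappable_lock);
		priv->mappable = false;
		/*
		 * New faults are now excluded, so existing mappings can be
		 * zapped, e.g. via unmap_mapping_range(), and mapcounts
		 * checked without racing against an in-flight fault.
		 */
		up_write(&priv->mappable_lock);
		return 0;
	}

The key point isn't the exact locking primitive, it's that the check and the
fault path synchronize on *one* piece of state with *one* lock, instead of
inferring "not mapped" from page_mapcount() under the page lock.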