On Mon, Nov 29, 2021, David Stevens wrote:
> +static kvm_pfn_t ensure_pfn_ref(struct page *page, kvm_pfn_t pfn)

"ensure" is rather misleading, as that implies this is _just_ an assertion, but that's not true since it elevates the refcount.  Maybe kvm_try_get_page_ref()?

> +{
> +	if (page || is_error_pfn(pfn))

A comment above here would be very helpful.  It's easy to overlook the "page" check and think that KVM is double-counting pages.  E.g.

	/* If @page is valid, KVM already has a reference to the pfn/page. */

That would tie in nicely with the kvm_try_get_page_ref() name too.

> +		return pfn;
> +
> +	/*
> +	 * If we're here, a pfn resolved by hva_to_pfn_remapped is
> +	 * going to be returned to something that ultimately calls
> +	 * kvm_release_pfn_clean, so the refcount needs to be bumped if
> +	 * the pfn isn't a reserved pfn.
> +	 *
> +	 * Whoever called remap_pfn_range is also going to call e.g.
> +	 * unmap_mapping_range before the underlying pages are freed,
> +	 * causing a call to our MMU notifier.
> +	 *
> +	 * Certain IO or PFNMAP mappings can be backed with valid
> +	 * struct pages, but be allocated without refcounting, e.g.
> +	 * tail pages of non-compound higher order allocations, which
> +	 * would then underflow the refcount when the caller does the
> +	 * required put_page.  Don't allow those pages here.
> +	 */
> +	if (kvm_is_reserved_pfn(pfn) ||
> +	    get_page_unless_zero(pfn_to_page(pfn)))
> +		return pfn;
> +
> +	return KVM_PFN_ERR_FAULT;
> +}