On Wed, Jul 31, 2024 at 8:41 PM Alex Bennée <alex.bennee@xxxxxxxxxx> wrote:
>
> Sean Christopherson <seanjc@xxxxxxxxxx> writes:
>
> > On Thu, Feb 29, 2024, David Stevens wrote:
> >> From: David Stevens <stevensd@xxxxxxxxxxxx>
> >>
> >> This patch series adds support for mapping VM_IO and VM_PFNMAP memory
> >> that is backed by struct pages that aren't currently being refcounted
> >> (e.g. tail pages of non-compound higher order allocations) into the
> >> guest.
> >>
> >> Our use case is virtio-gpu blob resources [1], which directly map host
> >> graphics buffers into the guest as "vram" for the virtio-gpu device.
> >> This feature currently does not work on systems using the amdgpu driver,
> >> as that driver allocates non-compound higher order pages via
> >> ttm_pool_alloc_page().
> >>
> >> First, this series replaces the gfn_to_pfn_memslot() API with a more
> >> extensible kvm_follow_pfn() API. The updated API rearranges
> >> gfn_to_pfn_memslot()'s args into a struct and where possible packs the
> >> bool arguments into a FOLL_ flags argument. The refactoring changes do
> >> not change any behavior.
> >>
> >> From there, this series extends the kvm_follow_pfn() API so that
> >> non-refcounted pages can be safely handled. This involves adding an
> >> input parameter to indicate whether the caller can safely use
> >> non-refcounted pfns and an output parameter to tell the caller whether
> >> or not the returned page is refcounted. This change includes a breaking
> >> change, by disallowing non-refcounted pfn mappings by default, as such
> >> mappings are unsafe. To allow such systems to continue to function, an
> >> opt-in module parameter is added to allow the unsafe behavior.
> >>
> >> This series only adds support for non-refcounted pages to x86. Other
> >> MMUs can likely be updated without too much difficulty, but it is not
> >> needed at this point. Updating other parts of KVM (e.g. pfncache) is not
> >> straightforward [2].
> >
> > FYI, on the off chance that someone else is eyeballing this, I am working on
> > revamping this series. It's still a ways out, but I'm optimistic that we'll be
> > able to address the concerns raised by Christoph and Christian, and maybe even
> > get KVM out of the weeds straightaway (PPC looks thorny :-/).
>
> I've applied this series to the latest 6.9.x while attempting to
> diagnose some of the virtio-gpu problems it may or may not address.
> However launching KVM guests keeps triggering a bunch of BUGs that
> eventually leave a hung guest:
>

Likely the same issue as [1]. Commit d02c357e5bfa added another call
to kvm_release_pfn_clean() in kvm_faultin_pfn(), which ends up
releasing a reference that is no longer being taken. If you replace
that with kvm_set_page_accessed() instead, then things should work
again. I didn't send out a rebased version of the series, since Sean's
work supersedes my series.

-David

[1] https://lore.kernel.org/lkml/15865985-4688-4b7e-9f2d-89803adb8f5b@xxxxxxxxxxxxx/
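
For anyone trying to picture the imbalance described above, here is a toy
user-space model of it. This is not kernel code: struct toy_page and every
toy_* function are hypothetical stand-ins, and the only assumption about the
real helpers is that kvm_release_pfn_clean() drops a reference while
kvm_set_page_accessed() does not touch the refcount.

```c
#include <stdbool.h>

/* Toy stand-in for a page; NOT the kernel's struct page. */
struct toy_page {
    int refcount;   /* references currently held on the page */
    bool accessed;  /* models the "accessed" state of the page */
};

/*
 * Models the fault-in path with the series applied: the pfn is
 * returned without taking a reference on the page.
 */
static void toy_faultin_no_ref(struct toy_page *page)
{
    (void)page; /* intentionally leaves refcount unchanged */
}

/* Models a release helper that drops one reference. */
static void toy_release_clean(struct toy_page *page)
{
    page->refcount--;
}

/* Models an "accessed" helper that leaves the refcount alone. */
static void toy_set_accessed(struct toy_page *page)
{
    page->accessed = true;
}

/*
 * Early-retry path as it behaves when the series is applied on top of
 * the extra release call: it drops a reference that was never taken.
 * Returns the resulting refcount.
 */
static int toy_retry_path_buggy(struct toy_page *page)
{
    toy_faultin_no_ref(page);
    toy_release_clean(page); /* unbalanced put */
    return page->refcount;
}

/*
 * Same path with the release swapped for the "accessed" helper:
 * the refcount stays balanced. Returns the resulting refcount.
 */
static int toy_retry_path_fixed(struct toy_page *page)
{
    toy_faultin_no_ref(page);
    toy_set_accessed(page);
    return page->refcount;
}
```

Starting from a page whose single reference belongs to someone else, the
buggy path leaves the count at zero while that owner still uses the page,
which is the sort of state that would be expected to trip BUGs later on. The
fixed path leaves the count where it started.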