On Wed, Jan 26, 2022 at 09:59:02PM +0000, Sean Christopherson wrote:
> On Wed, Jan 26, 2022, Boris Burkov wrote:
> > I tested this fix on the workload and it did prevent the hangs. However,
> > I am unsure if the fix is appropriate from a locking perspective, so I
> > hope to draw some extra attention to that aspect. set_page_dirty_lock in
> > mm/page-writeback.c has a comment about locking that says set_page_dirty
> > should be called with the page locked or while definitely holding a
> > reference to the mapping's host inode. I believe that the mmap should
> > have that reference, so for fear of hurting KVM performance or
> > introducing a deadlock, I opted for the unlocked variant.
>
> KVM doesn't hold a reference per se, but it does subscribe to mmu_notifier events
> and will not mark the page dirty after KVM has been instructed to unmap the page
> (barring bugs, which we've had a slew of). So yeah, the unlocked variant should
> be safe.
>
> Is it feasible to trigger this behavior in a selftest? KVM has had, and probably
> still has, many bugs that all boil down to KVM assuming guest memory is backed by
> either anonymous memory or something like shmem/HugeTLBFS/memfd that isn't typically
> truncated by the host.

I haven't been able to isolate a reproducer, yet. I am a bit stumped
because there isn't a lot for me to go off from that stack I shared--the
best I have so far is that I need to trick KVM into emulating
instructions at some point to get to this 'complete_userspace_io'
codepath? I will keep trying, since I think it would be valuable to know
what exactly happened. Open to try any suggestions you might have as
well.

Thanks for the response,
Boris