On Wed, 2024-01-17 at 10:14 -0800, Sean Christopherson wrote:
> On Wed, Jan 17, 2024, David Woodhouse wrote:
> > On Wed, 2024-01-17 at 08:51 -0800, Sean Christopherson wrote:
> > > On Wed, Jan 17, 2024, David Woodhouse wrote:
> > > > On Wed, 2024-01-17 at 08:09 -0800, Sean Christopherson wrote:
> > > > > On Fri, Jan 12, 2024, David Woodhouse wrote:
> > > > > As you note above, some other mutex _should_ be held. I think we should lean
> > > > > into that. E.g.
> > > >
> > > > I don't. I'd like this code to stand alone *without* making the caller
> > > > depend on "some other lock" just for its own internal consistency.
> > >
> > > Hmm, I get where you're coming from, but protecting a per-vCPU asset with
> > > vcpu->mutex is completely sane/reasonable. Xen's refresh from a completely
> > > different task is the oddball.
> >
> > Well yes, because that's what the gfn_to_pfn_cache is *for*, surely?
> >
> > If we were in a context where we could sleep and take mutexes like the
> > vcpu mutex (and without deadlocking with a thread actually running that
> > vCPU), then we could mostly just use kvm_write_guest().
> >
> > The gfn_to_pfn_cache exists specifically to handle that 'oddball' case.
> > No?
>
> As I see it, the main role of the gpc code is to handle the mmu_notifier
> interactions. I am not at all convinced that the common code should take on
> supporting "oh, and by the way, any task can use the cache from any context".

But that really was its raison d'être. It exists to allow interrupts to
be delivered and runstate to be updated from contexts which can't just
access the userspace address. The MMU notifiers are just a means to that
end: a way to know when the cache becomes invalid and we need to take a
slow path to refresh it.

> That's the "oddball" I am referring to. I'm not entirely opposed to making the
> gpc code fully standalone, but I would like to at least get to the point where
> we have very high confidence that arch.xen.xen_lock can be fully excised before
> committing to handling that use case in the common gpc code.

I fully agree that we should have high confidence that arch.xen.xen_lock
isn't needed before we actually elide it. This patch very much moves us
in the right direction.

Even if it doesn't actually fix a bug, it's a cleanup of some fairly
awful locking abuse. We *shouldn't* drop the rwlock and then assume that
its protected fields are unchanged when we take it again, "because our
caller will save us by doing our locking for us".

And while I can't see *how* the caller is failing to do that locking for
us, I'm definitely looking at a trace where the ->khva dereferenced from
an allegedly valid gfn_to_pfn_cache is a valid-looking but not-present
kernel address. That's why I was staring at the code until my brain
hurt, trying to work out how it could possibly happen. And I think the
awful locking abuse in __kvm_gpc_refresh() has to be part of it.

I still want to do a full audit of all the possible parallel code paths
before actually removing xen_lock from the paths where the
gfn_to_pfn_cache is all it protects. But this patch is a very clear
cleanup of some fairly awful locking abuse.

I'm OK with you nacking it as a "bug fix", given the uncertainty. But
let's take it as a cleanup and a simplification, one that makes the API
easier to use correctly.
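
For the avoidance of doubt, the anti-pattern I'm complaining about looks
roughly like this. It's a deliberately simplified sketch, not the real
__kvm_gpc_refresh(); the struct and the map_gfn_slow()/unmap_slow()
helpers are invented purely for illustration:

#include <linux/spinlock.h>
#include <linux/errno.h>

/*
 * Illustrative only, not the real gfn_to_pfn_cache.  map_gfn_slow() and
 * unmap_slow() are hypothetical stand-ins for the sleeping map/unmap work.
 */
struct example_cache {
	rwlock_t lock;
	unsigned long uhva;
	void *khva;
	bool valid;
};

static int example_refresh(struct example_cache *c, unsigned long new_uhva)
{
	void *new_khva;

	write_lock_irq(&c->lock);
	c->valid = false;
	c->uhva = new_uhva;
	write_unlock_irq(&c->lock);

	/* The slow mapping work may sleep, so it has to run unlocked. */
	new_khva = map_gfn_slow(new_uhva);

	write_lock_irq(&c->lock);

	/*
	 * BAD: blindly assuming ->uhva is still new_uhva because "the caller
	 * holds some other mutex".  A concurrent refresh or deactivate may
	 * have changed it while the lock was dropped.
	 *
	 * GOOD: re-check what was set before dropping the lock, and bail
	 * (or retry) if it changed under us.
	 */
	if (c->uhva != new_uhva) {
		write_unlock_irq(&c->lock);
		unmap_slow(new_khva);
		return -EAGAIN;
	}

	c->khva = new_khva;
	c->valid = true;
	write_unlock_irq(&c->lock);

	return 0;
}

The point being that once the write lock has been dropped, nothing that
was read or written under it can be trusted; either re-validate after
reacquiring it, or don't drop it at all, rather than relying on a lock
the caller may or may not be holding.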