On Tue, Feb 11, 2025 at 10:25 AM Marc Zyngier <maz@xxxxxxxxxx> wrote: > > No functional change intended. > > Actually plenty of it. Yes, definitely. That's not "no functional change intended", it's "no breakage of sane userspace intended" which is pretty much the opposite. > Yet the new helper returns -EINTR, which you blindly forward to userspace. It won't quite reach userspace, since the task won't survive mutex_lock_killable(). With your current code the kill will arrive just after the ioctl returns to userspace, so there's no change in behavior there. The -EBUSY change is the one that matters. > At the end of the day, the x86 locking serves completely different > purposes. It wants to gracefully wait for vcpus to exit and is happy > to replay things, because migration (which is what x86 seems to be > using this for) is a stupidly long process. No, it's not using it for the whole length of migration. It serves the same exact purpose as ARM: do some stuff on something that spans a whole struct kvm. The only difference is the behavior on contended mutex, where x86 simply says don't do it. > Our locking is designed to > either succeed or fail quickly, because some of the lock paths are on > the critical path for VM startup and configuration. The only long-running vCPU ioctl is KVM_RUN and you don't have that during VM startup and configuration. The interesting case is snapshotting, i.e. reading attributes. Failing quickly when reading attributes makes sense, and I'd be rightly wary of changing it. I know that QEMU is doing them only when it knows CPUs are stopped, but it does seem to affect crosvm and Firecracker more, for snapshotting more than startup/configuration. The GIC snapshotting code in crosvm returns an anyhow::Result and ends up logging an error with println!, Firecracker is similar as it also propagates the error. So neither crosvm nor Firecracker have particularly sophisticated error handling but they do seem to want to fail their RPCs quickly. Failing quickly when creating the device or setting attributes makes less sense (Firecracker for one does an unwrap() there). But anyhow the change would have to be a separate patch, certainly not one applied through the generic KVM tree; and if you decide that you're stuck with it, that would be more than understandable. > So for this series to be acceptable, you'd have to provide the same > semantics. It is probably doable with a bit of macro magic, at the > expense of readability. Or it can be just an extra bool argument, plus wrappers. For RISC-V it's only used at device creation time, and the ioctl fails if the vCPU ran at least once. Removing the "try" in trylock is totally feasible there, however it must be documented in the commit message. Thanks, Paolo > What I would also like to see is for this primitive to be usable with > scoped_cond_guard(), which would make the code much more readable.