Hello Sean,

On Wed, 3 Jan 2024 at 04:30, Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> Heh, I don't know that I would describe "412 microseconds" as "indefinitely", but
> it's certainly a long time, especially during boot.

* Indefinitely because the guest does not come out of it. I left the
guest running overnight and it still had not booted.

> Piecing things together, the issue is I was wrong about the -EAGAIN exit being
> benign.
>
> QEMU responds to the spurious exit by bailing from the vCPU's inner runloop, and
> when that happens, the associated task (briefly) acquires a global mutex, the
> so called BQL (Big QEMU Lock). I assumed that QEMU would eat the -EAGAIN and do
> nothing interesting, but QEMU interprets the -EAGAIN as "there might be a global
> state change the vCPU needs to handle".
>
> As you discovered, having 9 vCPUs constantly acquiring and releasing a single
> mutex makes for slow going when vCPU0 needs to acquire said mutex, e.g. to do
> emulated MMIO.
>
> Ah, and the other wrinkle is that KVM won't actually yield during KVM_RUN for
> UNINITIALIZED vCPUs, i.e. all those vCPU tasks will stay at 100% utilization even
> though there's nothing for them to do. That may or may not matter in your case,
> but it would be awful behavior in a setup with oversubscribed vCPUs.

* (A toy model of this mutex contention is sketched in the P.S. below.)

...

> Yeah, that's kinda sorta what's happening, although that comment is about requests
> that are never cleared in *any* path, e.g. violation of that rule causes a vCPU
> to be 100% stuck.

* I see, interesting.

> I'm not 100% confident there isn't something else going on, e.g. a 400+ microsecond
> wait time is a little odd,

* It could be the vCPU thread's scheduling priority/policy.

> but this is inarguably a KVM regression and I doubt it's worth anyone's time to dig deeper.
>
> Can you give me a Signed-off-by for this? I'll write a changelog and post a proper patch.

* I have sent a formal patch to you. Please feel free to edit the
commit/changelog as you see fit. Thanks so much.

Thank you.
---
  - Prasad
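
P.S. For anyone following the thread, below is a toy C program sketching my
understanding of the contention pattern described above; it is not QEMU code,
and names like NSPINNERS/NWORK are made up for the illustration. Several
threads spin through a global mutex on every iteration (the way vCPU threads
bounce through the BQL on each spurious -EAGAIN exit from the inner run loop),
while one thread needs that same mutex for its real work (the way vCPU0 does
for emulated MMIO). Actual wait times will of course depend on the scheduler
and the mutex implementation, so the numbers are illustrative only.

/* toy_bql.c - toy model of global-mutex contention (not QEMU code).
 * Build: cc -O2 toy_bql.c -o toy_bql -lpthread */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define NSPINNERS 9      /* models the nine uninitialized vCPU threads */
#define NWORK     100000 /* mutex-protected ops the "vCPU0" thread must do */

static pthread_mutex_t bql = PTHREAD_MUTEX_INITIALIZER;
static atomic_int done;

/* Each spinner models a vCPU whose KVM_RUN keeps returning -EAGAIN:
 * it bails from the "inner run loop", briefly takes the global mutex,
 * and immediately re-enters, burning 100% CPU with nothing to do. */
static void *spinner(void *arg)
{
    (void)arg;
    while (!atomic_load(&done)) {
        pthread_mutex_lock(&bql);
        pthread_mutex_unlock(&bql);
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[NSPINNERS];
    struct timespec t0, t1;

    for (int i = 0; i < NSPINNERS; i++)
        pthread_create(&threads[i], NULL, spinner, NULL);

    /* "vCPU0": every emulated-MMIO-like operation must win the same
     * mutex that the spinners keep grabbing and dropping. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < NWORK; i++) {
        pthread_mutex_lock(&bql);
        pthread_mutex_unlock(&bql);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    atomic_store(&done, 1);
    for (int i = 0; i < NSPINNERS; i++)
        pthread_join(threads[i], NULL);

    double us = (t1.tv_sec - t0.tv_sec) * 1e6 +
                (t1.tv_nsec - t0.tv_nsec) / 1e3;
    printf("average wait per operation: %.2f us\n", us / NWORK);
    return 0;
}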