On Tue, 2017-08-01 at 22:47 +0200, Paolo Bonzini wrote:
> On 01/08/2017 18:33, Tamas K Lengyel wrote:
> > > If you actually pause the whole VM (through QEMU's monitor commands
> > > "stop" and "cont") everything should be safe. Of course there can be
> > > bugs and PCI passthrough devices should be problematic, but in general
> > > the device emulation is quiescent. This however is not the case when
> > > only the VCPUs are paused.
> >
> > IMHO for some use-cases it is sufficient to have the guest itself be
> > limited in the modifications it makes to memory. So for example if
> > just a vCPU is paused there are areas of memory that you can interact
> > with without having to worry about it changing underneath the
> > introspecting application (ie. thread-specific datastructures like the
> > KPCR, etc..). If the introspecting application needs access to areas
> > that non-paused vCPUs may touch, or QEMU, or a pass-through device,
> > then it should be a decision for the introspecting app whether to
> > pause the VM completely. It may still choose to instead do some
> > error-detection on reads/writes to detect inconsistent accesses and
> > perhaps just re-try the operation till it succeeds. This may have less
> > of an impact on the performance of the VM as no full VM pause had to
> > be performed. It is all very application specific, so having options
> > is always a good thing.
>
> Fair enough. There is another issue however.
>
> If a guest is running in the kernel, it can be easily paused while KVMI
> processes events and the like.
>
> While a guest is outside the kernel, it could be running or paused.
>
> If running, the value of a register might change before the VM reenters
> execution (due to a reset, or due to ugly features such as the VMware
> magic I/O port 0x5658). So the introspector would probably prefer
> anyway to do any changes while the guest is in the kernel: one idea I
> had was a KVMI_PAUSE_VCPU command that replies with a KVMI_VCPU_PAUSED
> event---then the introspector can send commands that do the required
> patching and then restart the guest by replying to the event.
>
> But if the guest is paused, KVMI_PAUSE_VCPU would never be processed.
> So how could the introspector distinguish the two cases and avoid the
> KVMI_PAUSE_VCPU if the guest is paused?
>
> (There is another complication: the guest could be running with the APIC
> emulated in userspace. In that case, a VCPU doing "cli;hlt" spends
> infinite time in userspace even though it's running, and KVM has no idea
> why. This is less common, but it's worth mentioning too).

I think it might help to distinguish two situations in which we require
the guest _or_ a single vCPU to be paused. Our initial KVMI_PAUSE_GUEST
command can be translated into a QEMU pause. In our particular use case
we made special arrangements to call it as few times as possible,
assuming it's very costly. The other is needed only by the internal KVM
code for situations similar to:

  kvm_pause_vcpu(vcpu);
  vcpu_load(vcpu);
  kvm_arch_vcpu_ioctl_get_regs(vcpu, regs);
  vcpu_put(vcpu);
  kvm_unpause_vcpu(vcpu);

or more generally put, for accesses that involve the vCPU state
(registers, MSR-s, exceptions etc.), no guest memory involved.

Here kvm_pause_vcpu() will only pull the vCPU out of the guest and, if
so, make it somehow available for quick re-entry with
kvm_unpause_vcpu(). If said vCPU is already out, then the function will
be a no-op. Obviously, kvm_{pause,unpause}_vcpu() will do nothing if
we're currently handling an event or one is pending.

I hope this narrows down further the exact requirements.

One exception that might have a better solution is:

  kvm_pause_all_vcpus(kvm);
  kvm_set_page_access(kvm, gfn); /* pause for get too? */
  kvm_unpause_all_vcpus(kvm);

There might be a way to make the change and then IPI all vCPU-s without
pulling them out of the guest.

Regards,

-- 
Mihai Donțu