Marcelo Tosatti wrote: > On Wed, Jan 27, 2010 at 03:54:08PM +0100, Jan Kiszka wrote: >> This patch originates in the mp_state writeback issue: During runtime >> and even on reset, we must not write the previously saved VCPU state >> back into the kernel in an uncontrolled fashion. E.g mp_state should >> only written on reset or on VCPU setup. Certain clocks (e.g. the TSC) >> may only be written on setup or after vmload. >> >> By introducing additional information about the context of the planned >> vcpu state manipulation, we can simply skip sensitive states like >> mp_state when updating the in-kernel state. The planned modifications >> are defined when calling cpu_synchronize_state. They accumulate, ie. >> once a full writeback was requested, it will stick until it was >> performed. >> >> This patch already fixes existing writeback issues in upstream KVM by >> only selectively writing MSR_IA32_TSC, MSR_KVM_SYSTEM_TIME, >> MSR_KVM_WALL_CLOCK, the mp_state and the vcpu_events. >> >> Signed-off-by: Jan Kiszka <jan.kiszka@xxxxxxxxxxx> >> --- >> >> This patch is intentionally written against uq/master. As upstream is >> less convoluted (yet :) ), it may help understanding the basic idea. An >> add-on patch for qemu-kvm that both cleans up the current code and also >> moves kvm_get/set_lapic into kvm_arch_get/put_registers (hmm, maybe also >> renaming that services...) will follow soon if no one sees fundamental >> problems of this approach. >> >> exec.c | 4 ++-- >> gdbstub.c | 10 +++++----- >> hw/apic.c | 2 +- >> hw/ppc_newworld.c | 2 +- >> hw/ppc_oldworld.c | 2 +- >> hw/s390-virtio.c | 2 +- >> kvm-all.c | 31 +++++++++++++++++++------------ >> kvm.h | 13 +++++++++---- >> monitor.c | 4 ++-- >> target-i386/helper.c | 2 +- >> target-i386/kvm.c | 31 +++++++++++++++++++------------ >> target-i386/machine.c | 4 ++-- >> target-ppc/kvm.c | 2 +- >> target-ppc/machine.c | 4 ++-- >> target-s390x/kvm.c | 17 ++++++++++++----- >> 15 files changed, 78 insertions(+), 52 deletions(-) >> >> diff --git a/kvm-all.c b/kvm-all.c >> index f8350c9..8595cd9 100644 >> --- a/kvm-all.c >> +++ b/kvm-all.c >> @@ -57,7 +57,8 @@ struct KVMState >> KVMSlot slots[32]; >> int fd; >> int vmfd; >> - int regs_modified; >> + int synchronized; >> + int pending_modifications; >> int coalesced_mmio; >> #ifdef KVM_CAP_COALESCED_MMIO >> struct kvm_coalesced_mmio_ring *coalesced_mmio_ring; > > Should be per-vcpu. Yep, good chance to clean this up in upstream. > >> @@ -155,10 +156,12 @@ static void kvm_reset_vcpu(void *opaque) >> CPUState *env = opaque; >> >> kvm_arch_reset_vcpu(env); >> - if (kvm_arch_put_registers(env)) { >> + if (kvm_arch_put_registers(env, env->kvm_state->pending_modifications >> + | CPU_MODIFY_RESET)) { >> fprintf(stderr, "Fatal: kvm vcpu reset failed\n"); >> abort(); >> } >> + env->kvm_state->pending_modifications = CPU_MODIFY_NONE; > > Can't the writeback here happen at exec_cpu? Don't think so (longterm). The reset callbacks are run synchronously, writing back on exec would happen asynchronously, leaving some vcpus in pre-reset state when others already start over. > >> @@ -946,9 +953,9 @@ static void kvm_invoke_set_guest_debug(void *data) >> struct kvm_set_guest_debug_data *dbg_data = data; >> CPUState *env = dbg_data->env; >> >> - if (env->kvm_state->regs_modified) { >> - kvm_arch_put_registers(env); >> - env->kvm_state->regs_modified = 0; >> + if (env->kvm_state->pending_modifications) { >> + kvm_arch_put_registers(env, env->kvm_state->pending_modifications); >> + env->kvm_state->pending_modifications = CPU_MODIFY_NONE; >> } > > Why's synchronous writeback needed here? Older kernels overwrote the effect of set_guest_debug on eflags when updating the registers. But this hunk is scheduled for refactoring which will take it to the same place as all the other vcpu state writebacks. That will enforce the ordering more naturally. > > Otherwise seems fine. > Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html