On 02/09/2012 05:39 PM, Jan Kiszka wrote: > On 2012-02-09 16:18, Avi Kivity wrote: > > On 02/05/2012 02:39 PM, Jan Kiszka wrote: > >> From: Jan Kiszka <jan.kiszka@xxxxxxxxxxx> > >> > >> This enables acceleration for MMIO-based TPR registers accesses of > >> 32-bit Windows guest systems. It is mostly useful with KVM enabled, > >> either on older Intel CPUs (without flexpriority feature, can also be > >> manually disabled for testing) or any current AMD processor. > >> > >> The approach introduced here is derived from the original version of > >> qemu-kvm. It was refactored, documented, and extended by support for > >> user space APIC emulation, both with and without KVM acceleration. > > > > However, it's presented as a rewrite instead of a series of changes, so > > we can't see what the changes are. > > Yes, but the original code also depends on interfaces we don't have in > upstream. The usual rant: patch qemu-kvm until it's suitable for upsteam, then submit the end result for upstream. But you don't deserve this today. > >> + > >> + if (access == TPR_ACCESS_WRITE && kvm_enabled() && > >> + !kvm_irqchip_in_kernel()) { > >> + /* > >> + * KVM without TPR access reporting calls into the user space APIC on > >> + * write with IP pointing after the accessing instruction. So we need > >> + * to look backward to find the reason. > >> + */ > > > > Why do we need to do anything at all? > > We need to patch the causing instruction, so we have to know where it > starts. Or what do you mean? Just don't deal with this at all, no one runs on kernels without kernel irqchip. > > > > I'm not sure if the ABI guarantees anything about %rip. > > That's indeed a point. It's likely coupled to the emulator's internals > and when it calls out to user space for MMIO write. How to deal with it? One way is to verify that it worked this way at least N versions back, and then retro-doc it. The downside is that it reduces our flexibility in the future, but I think that's a small downside. Another way is not to do it at all. > >> +static int get_kpcr_number(CPUState *env) > >> +{ > >> + struct kpcr { > >> + uint8_t fill1[0x1c]; > >> + uint32_t self; > >> + uint8_t fill2[0x31]; > >> + uint8_t number; > >> + } QEMU_PACKED kpcr; > >> + > >> + if (smp_cpus == 1) { > >> + return 0; > >> + } > >> + if (cpu_memory_rw_debug(env, env->segs[R_FS].base, > >> + (void *)&kpcr, sizeof(kpcr), 0) < 0 || > >> + kpcr.self != env->segs[R_FS].base) { > > > > Ah, so it works. We may want to do it for UP as well, as a way of > > verifying that the guest is compatible with these hacks. > > I'm not sure if Windows has this properly set up for the UP HAL. I > rather think this was a bug in the original implementation. The ROM uses > 0 as CPU index in UP mode unconditionally, so should we in QEMU. I mean just check kpcr.self. The reason up and smp are so different is that for a long while smp didn't work at all, and for some time it used even uglier hacks than we have today (like stashing the cpu id in TR.sel), so I didn't want to pollute th up code with the smp ugliness. It's also marginally faster (less locked ops), though I doubt it matters on today's processors. > > > >> + > >> + memory_region_init_alias(&s->rom, "kvmvapic-rom", section.mr, rom_paddr, > >> + rom_size); > >> + memory_region_add_subregion_overlap(as, rom_paddr, &s->rom, 1000); > > > > This is incredibly hacky, so at least the spirit of the original code is > > preserved. > > I know, and it caused some pain to write it (not only to find out how to > solve it technically). We would need to pass the RAM memory region down > to this freaky device, like we do to the i440fx for PAM purposes. But, > well, that is not straightforward right now. Better ideas welcome. Could we make it a child<> of i440FX, and have i440FX pass it the MemoryRegion directly? It means we'll need to redo the glue for new chipsets, but it should be just a few lines. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html