On 03/03/2015 21:42, Radim Krčmář wrote:
> 2015-03-03 13:48-0600, Joel Schopp:
>>>> +    unsigned long new_rax = kvm_register_read(vcpu, VCPU_REGS_RAX);
>>> Shouldn't we handle writes in EAX differently than in AX and AL, because
>>> of implicit zero extension.
>> I don't think the implicit zero extension hurts us here, but maybe there
>> is something I'm missing that I need understand. Could you explain this
>> further?
>
> According to APM vol.2, 2.5.3 Operands and Results, when using EAX,
> we should zero upper 32 bits of RAX:
>
>   Zero Extension of Results. In 64-bit mode, when performing 32-bit
>   operations with a GPR destination, the processor zero-extends the 32-bit
>   result into the full 64-bit destination. Both 8-bit and 16-bit
>   operations on GPRs preserve all unwritten upper bits of the destination
>   GPR. This is consistent with legacy 16-bit and 32-bit semantics for
>   partial-width results.
>
> Is IN not covered?

It is. You need to zero the upper 32 bits.

>>>> +    BUG_ON(!vcpu->arch.pio.count);
>>>> +    BUG_ON(vcpu->arch.pio.count * vcpu->arch.pio.size > sizeof(new_rax));
>>> (Looking at it again, a check for 'vcpu->arch.pio.count == 1' would be
>>> sufficient.)
>> I prefer the checks that are there now after your last review,
>> especially since surrounded by BUG_ON they only run on debug kernels.
>
> BUG_ON is checked on essentially all kernels that run KVM.
> (All distribution-based configs should have it.)

Correct.

> If we wanted to validate the size, then this is strictly better:
>   BUG_ON(vcpu->arch.pio.count != 1 || vcpu->arch.pio.size > sizeof(new_rax))

That would be a very weird assertion considering that
vcpu->arch.pio.size will architecturally be at most 4. The first arm of
the || is sufficient.

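As a rough illustration of the zero-extension rule quoted above from the
APM, the sketch below merges the result of an IN of 1, 2 or 4 bytes into a
previous 64-bit RAX value. The helper name merge_in_result and its
signature are hypothetical and not part of KVM; it also assumes a
little-endian host, as on x86. It only models the register semantics and
is not the fix proposed in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not a KVM function: compute the new RAX after an
 * IN instruction that transferred `size` bytes (1, 2 or 4) into AL, AX
 * or EAX. `data` points at the bytes read from the port.
 */
static uint64_t merge_in_result(uint64_t old_rax, const void *data,
                                unsigned int size)
{
    uint32_t val = 0;
    uint64_t mask;

    /* Port I/O is architecturally limited to 1, 2 or 4 bytes. */
    assert(size == 1 || size == 2 || size == 4);

    /* Assumes a little-endian host: low-order bytes are copied first. */
    memcpy(&val, data, size);

    /* 32-bit destination: the result is zero-extended into all of RAX. */
    if (size == 4)
        return val;

    /* 8-bit and 16-bit destinations preserve the unwritten upper bits. */
    mask = (1ULL << (size * 8)) - 1;
    return (old_rax & ~mask) | (val & mask);
}
```

With old_rax set to all ones, a 1-byte or 2-byte IN leaves the upper bits
untouched, while a 4-byte IN clears bits 63:32, which is the case the
review comment is about.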
>>>> +    memcpy(&new_rax, vcpu, sizeof(new_rax));
>>>> +    trace_kvm_pio(KVM_PIO_IN, vcpu->arch.pio.port, vcpu->arch.pio.size,
>>>> +                  vcpu->arch.pio.count, vcpu->arch.pio_data);
>>>> +    kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>>>> +    vcpu->arch.pio.count = 0;
>>> I think it is better to call emulator_pio_in_emulated directly, like
>>>
>>>     emulator_pio_in_out(&vcpu->arch.emulate_ctxt, vcpu->arch.pio.size,
>>>                         vcpu->arch.pio.port, &new_rax, 1);
>>>     kvm_register_write(vcpu, VCPU_REGS_RAX, new_rax);
>>>
>>> because we know that vcpu->arch.pio.count != 0.
>
> Pasting the same code creates bug opportunities when we forget to modify
> all places. This class of problems can be harder to deal with, that (c)
> and (d), because we can't simply print all callers.

I agree with this and prefer calling emulator_pio_in_emulated in
complete_fast_pio_in, indeed.

>>> Refactoring could avoid the weird vcpu->ctxt->vcpu conversion.
>>> (A better name is always welcome.)

No need for that.

>> The pointer chasing is making me dizzy. I'm not sure why
>> emulator_pio_in_emulated takes a x86_emulate_ctxt when all it does it
>> immediately translate that to a vcpu and never use the x86_emulate_ctxt,
>> why not pass the vcpu in the first place?

Because the emulator is written to be usable outside the Linux kernel as
well.

Also, the fast path (used if kernel_pio returns 0) doesn't read
VCPU_REGS_RAX, thus using an uninitialized variable here:

>>> +    unsigned long val;
>>> +    int ret = emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size,
>>> +                                       port, &val, 1);
>>> +
>>> +    if (ret)
>>> +        kvm_register_write(vcpu, VCPU_REGS_RAX, val);

Thanks,

Paolo