On Wed, 2012-02-08 at 21:41 +1100, Michael Ellerman wrote:
> On Tue, 2012-02-07 at 17:38 -0200, Marcelo Tosatti wrote:
> > On Tue, Feb 07, 2012 at 05:32:07PM +1100, Michael Ellerman wrote:
> > > A test case which does the following:
> > >
> > >   ioctl(vmfd, KVM_CREATE_VCPU, 0);
> > >   ioctl(vmfd, KVM_CREATE_IRQCHIP);
> > >   ioctl(cpufd, KVM_RUN);
> > >
> > > Can oops in kvm_apic_accept_pic_intr() because vcpu->arch.apic == NULL.
> > >
> > > Because irqchip_in_kernel() is false when we create the vcpu we leave
> > > vcpu->arch.apic uninitialised (in kvm_arch_vcpu_init()). Then when we run,
> > > irqchip_in_kernel() is true, but we didn't do the correct initialisation.
> > >
> > > The root of the problem seems to be that there is an assumption that
> > > KVM_CREATE_IRQCHIP will be called before any VCPUs are created. The
> > > documentation says "sets up future vcpus to have a local APIC".
> > >
> > > So the simplest fix seems to be to enforce that ordering in the code.
> >
> > Ugh. With your patch below there is still the window for a race:
> >
> > kvm_arch_vcpu_init can create a vcpu without vcpu->arch.apic,
> > block on mutex_lock(kvm->lock). Meanwhile a separate thread is on
> > KVM_CREATE_IRQCHIP holding kvm->lock, finds online_vcpus == 0 and
> > proceeds. Then kvm_arch_vcpu_init finishes.
>
> Yeah bugger. I missed that most of the vcpu create is done without the
> mutex held.
>
> > Moving mutex_lock(kvm->lock) to the beginning of
> > kvm_vm_ioctl_create_vcpu should fix it?

Hmm, maybe not. How are the locks meant to nest?

If we move the mutex up, we will be calling kvm_arch_vcpu_setup() with
the kvm->lock held, which calls vcpu_load() which takes vcpu->mutex.

So we would be taking the kvm->lock (A) then vcpu->mutex (B).

And I think the following path takes vcpu->mutex (B) then kvm->lock (A).

kvm_vcpu_ioctl
 -> vcpu_load
    -> mutex_lock(&vcpu->mutex);
 -> kvm_arch_vcpu_ioctl()
    KVM_GET_MSRS:
    -> msr_io(vcpu, argp, kvm_get_msr, 1)
       -> __msr_io(..)
          -> srcu_read_lock(&vcpu->kvm->srcu);
          -> kvm_get_msr()
             -> kvm_x86_ops->get_msr()
                -> vmx_get_msr()
                   default:
                   -> kvm_get_msr_common()
                      case HV_X64_MSR_GUEST_OS_ID ... HV_X64_MSR_SINT15:
                      -> mutex_lock(&vcpu->kvm->lock);

But trawling through that code path was a bit of a mess, so maybe I
missed something somewhere.

cheers
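
To make the nesting concern concrete, here is a minimal userspace sketch of the A-then-B versus B-then-A pattern described above. It is not kernel code: the two pthread mutexes are hypothetical stand-ins for kvm->lock (A) and vcpu->mutex (B), and the function names are made up for illustration. Rather than spawning threads and actually hanging, it freezes each path after its first lock and uses pthread_mutex_trylock() to probe whether the second lock could be taken, assuming default (non-recursive) mutexes.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kvm->lock (A) and vcpu->mutex (B). */
static pthread_mutex_t kvm_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Probe a mutex without blocking. Returns true if it is already held.
 * If the probe unexpectedly succeeds, release it again so the probe
 * leaves the lock state unchanged.
 */
static bool probe_blocked(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);

    if (rc == 0)
        pthread_mutex_unlock(m);
    return rc == EBUSY;
}

/*
 * Inverted order: path 1 holds A and wants B, path 2 holds B and wants A.
 * Returns true when each path's next lock is held by the other path,
 * i.e. neither could ever make progress.
 */
bool abba_would_deadlock(void)
{
    bool path1_blocked, path2_blocked;

    pthread_mutex_lock(&kvm_lock);    /* path 1: create vcpu, A first */
    pthread_mutex_lock(&vcpu_mutex);  /* path 2: vcpu ioctl, B first */

    path1_blocked = probe_blocked(&vcpu_mutex); /* path 1 now wants B */
    path2_blocked = probe_blocked(&kvm_lock);   /* path 2 now wants A */

    pthread_mutex_unlock(&vcpu_mutex);
    pthread_mutex_unlock(&kvm_lock);
    return path1_blocked && path2_blocked;
}

/*
 * Consistent order: both paths take A before B. Path 2 waits on A while
 * holding nothing, so path 1 can still take B and finish.
 */
bool consistent_order_would_deadlock(void)
{
    bool path1_blocked, path2_waiting;

    pthread_mutex_lock(&kvm_lock);              /* path 1 takes A */

    path2_waiting = probe_blocked(&kvm_lock);   /* path 2 waits for A */
    path1_blocked = probe_blocked(&vcpu_mutex); /* path 1 wants B */

    pthread_mutex_unlock(&kvm_lock);
    return path1_blocked && path2_waiting;
}
```

Called from a small test program linked with -pthread, abba_would_deadlock() should report true and consistent_order_would_deadlock() should report false. Whether the real kernel paths can actually meet this way depends on the call chain above being right, which is the open question.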