On 02/08/2012 02:13 PM, Michael Ellerman wrote:
> On Wed, 2012-02-08 at 21:41 +1100, Michael Ellerman wrote:
> > On Tue, 2012-02-07 at 17:38 -0200, Marcelo Tosatti wrote:
> > > On Tue, Feb 07, 2012 at 05:32:07PM +1100, Michael Ellerman wrote:
> > > > A test case which does the following:
> > > >
> > > >   ioctl(vmfd, KVM_CREATE_VCPU, 0);
> > > >   ioctl(vmfd, KVM_CREATE_IRQCHIP);
> > > >   ioctl(cpufd, KVM_RUN);
> > > >
> > > > Can oops in kvm_apic_accept_pic_intr() because vcpu->arch.apic == NULL.
> > > >
> > > > Because irqchip_in_kernel() is false when we create the vcpu we leave
> > > > vcpu->arch.apic uninitialised (in kvm_arch_vcpu_init()). Then when we run,
> > > > irqchip_in_kernel() is true, but we didn't do the correct initialisation.
> > > >
> > > > The root of the problem seems to be that there is an assumption that
> > > > KVM_CREATE_IRQCHIP will be called before any VCPUs are created. The
> > > > documentation says "sets up future vcpus to have a local APIC".
> > > >
> > > > So the simplest fix seems to be to enforce that ordering in the code.
> > >
> > > Ugh. With your patch below there is still the window for a race:
> > >
> > > kvm_arch_vcpu_init can create a vcpu without vcpu->arch.apic,
> > > block on mutex_lock(kvm->lock). Meanwhile a separate thread is on
> > > KVM_CREATE_IRQCHIP holding kvm->lock, finds online_vcpus == 0 and
> > > proceeds. Then kvm_arch_vcpu_init finishes.
> >
> > Yeah bugger. I missed that most of the vcpu create is done without the
> > mutex held.
> >
> > > Moving mutex_lock(kvm->lock) to the beginning of
> > > kvm_vm_ioctl_create_vcpu should fix it?
>
> Hmm, maybe not.
>
> How are the locks meant to nest?
>
> If we move the mutex up, we will be calling kvm_arch_vcpu_setup() with
> the kvm->lock held, which calls vcpu_load() which takes vcpu->mutex.
>
> So we would be taking the kvm->lock (A) then vcpu->mutex (B).
>
> And I think the following path takes vcpu->mutex (B) then kvm->lock (A).
>
> kvm_vcpu_ioctl
>  -> vcpu_load
>      -> mutex_lock(&vcpu->mutex);
>  -> kvm_arch_vcpu_ioctl()
>     KVM_GET_MSRS:
>      -> msr_io(vcpu, argp, kvm_get_msr, 1)
>          -> __msr_io(..)
>              -> srcu_read_lock(&vcpu->kvm->srcu);
>              -> kvm_get_msr()
>                  -> kvm_x86_ops->get_msr()
>                      -> vmx_get_msr()
>                         default:
>                          -> kvm_get_msr_common()
>                             case HV_X64_MSR_GUEST_OS_ID ... HV_X64_MSR_SINT15:
>                              -> mutex_lock(&vcpu->kvm->lock);
>
>
> But trawling through that code path was a bit of a mess, so maybe I
> missed something somewhere.

I think you are correct (the path is benign, since the vcpu being
created is not going to run and read that MSR, but let's not get lockdep
angry at us).

However kvm_arch_vcpu_init(), which creates the lapic, _is_ called
without either the vcpu->mutex or kvm->lock held.  Patch coming up.

--
error compiling committee.c: too many arguments to function
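
For readers following the A/B ordering argument in this thread, here is a minimal user-space sketch of the kind of check that would complain about it. It is not the kernel's lockdep and not the patch referred to above. The names toy_acquire(), toy_release(), path_create_vcpu() and path_get_msrs() are hypothetical, and the two lock classes merely stand in for kvm->lock and vcpu->mutex. The sketch records which lock was held when another was taken, and reports when a later path takes the same pair in the opposite order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes standing in for kvm->lock (A) and vcpu->mutex (B). */
enum lock_class { KVM_LOCK, VCPU_MUTEX, NR_CLASSES };

/* order_seen[a][b] is true once b has been acquired while a was held. */
static bool order_seen[NR_CLASSES][NR_CLASSES];
static bool held[NR_CLASSES];

/* Record an acquisition; return false if it inverts an order seen earlier. */
static bool toy_acquire(enum lock_class c)
{
    bool ok = true;

    assert(c < NR_CLASSES);
    for (int h = 0; h < NR_CLASSES; h++) {
        if (!held[h] || h == (int)c)
            continue;
        if (order_seen[c][h])
            ok = false;              /* c -> h was recorded, now h -> c */
        order_seen[h][c] = true;
    }
    held[c] = true;
    return ok;
}

static void toy_release(enum lock_class c)
{
    held[c] = false;
}

/* Proposed ordering: kvm->lock taken at the top of vcpu creation (A then B). */
static bool path_create_vcpu(void)
{
    bool ok = toy_acquire(KVM_LOCK);

    ok = toy_acquire(VCPU_MUTEX) && ok;  /* kvm_arch_vcpu_setup -> vcpu_load */
    toy_release(VCPU_MUTEX);
    toy_release(KVM_LOCK);
    return ok;
}

/* KVM_GET_MSRS on a Hyper-V MSR, as traced in the thread (B then A). */
static bool path_get_msrs(void)
{
    bool ok = toy_acquire(VCPU_MUTEX);   /* vcpu_load */

    ok = toy_acquire(KVM_LOCK) && ok;    /* kvm_get_msr_common */
    toy_release(KVM_LOCK);
    toy_release(VCPU_MUTEX);
    return ok;
}
```

Calling path_create_vcpu() first and path_get_msrs() second makes the second call return false, even though the two paths never actually run concurrently here. That mirrors the point made in the reply: the path may be benign in practice, but an order-based checker still flags it.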