On Wed, 7 Jul 2021 11:41:56 -0400
Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote:

First, sorry for being this late in taking a more serious look at the
code.

> @@ -270,6 +270,9 @@ static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
>   * We take the matrix_dev lock to ensure serialization on queues and
>   * mediated device access.
>   *
> + * Note: This function must be called with a read lock held on
> + *	 vcpu->kvm->arch.crypto.pqap_hook_rwsem.
> + *

That is fine synchronization for the pqap_hook, but I don't think it
is sufficient for everything.

>   * Return 0 if we could handle the request inside KVM.
>   * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
>   */
> @@ -287,22 +290,12 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>  		return -EOPNOTSUPP;
>
>  	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> -	mutex_lock(&matrix_dev->lock);

Here you drop a matrix_dev->lock critical section, and then you do
all the interesting stuff, e.g.

	q = vfio_ap_get_queue(matrix_mdev, apqn);

and

	vfio_ap_irq_enable(q, status & 0x07, vcpu->run->s.regs.gprs[2]);

In vfio_ap_get_queue() we check whether the queue belongs to the
given guest, and we examine the matrix (apm, aqm). I suppose that
needs to be done while holding a lock that protects the matrix, and
to the best of my knowledge this is still matrix_dev->lock.

It would probably make sense to convert matrix_dev->lock into an
rw_semaphore in the future, or to introduce some new rwlock which
protects less state, but right now AFAICT it is still
matrix_dev->lock.

So I don't think this patch should pass review.

Regards,
Halil

>
>  	if (!vcpu->kvm->arch.crypto.pqap_hook)
>  		goto out_unlock;
>  	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
>  				   struct ap_matrix_mdev, pqap_hook);
>
> -	/*
> -	 * If the KVM pointer is in the process of being set, wait until the
> -	 * process has completed.
> -	 */
> -	wait_event_cmd(matrix_mdev->wait_for_kvm,
> -		       !matrix_mdev->kvm_busy,
> -		       mutex_unlock(&matrix_dev->lock),
> -		       mutex_lock(&matrix_dev->lock));
> -
>  	/* If the there is no guest using the mdev, there is nothing to do */
>  	if (!matrix_mdev->kvm)
>  		goto out_unlock;