On 5/19/21 12:16 PM, Jason Gunthorpe wrote:
On Wed, May 19, 2021 at 11:39:21AM -0400, Tony Krowiak wrote:
@@ -287,13 +289,17 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
 	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
 		return -EOPNOTSUPP;
-	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
-	mutex_lock(&matrix_dev->lock);
+	rcu_read_lock();
+	pqap_module_hook = rcu_dereference(vcpu->kvm->arch.crypto.pqap_hook);
+	if (!pqap_module_hook) {
+		rcu_read_unlock();
+		goto set_status;
+	}
-	if (!vcpu->kvm->arch.crypto.pqap_hook)
-		goto out_unlock;
-	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
-				   struct ap_matrix_mdev, pqap_hook);
+	matrix_mdev = pqap_module_hook->data;
+	rcu_read_unlock();
+	mutex_lock(&matrix_dev->lock);
The matrix_mdev pointer was extracted from the pqap_module_hook,
but now there is nothing protecting it, since the RCU read lock was
dropped and it gets freed in vfio_ap_mdev_remove().
Therein lies the rub. We can't hold the rcu_read_lock for the
entire time that the interception is being processed because of
wait conditions in the interception handler. Regardless of whether
the pointer to the matrix_mdev is retrieved via container_of()
or extracted from the pqap_hook, there is nothing protecting it,
and there appears to be no way to do so using RCU.
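
For illustration only, and not something proposed in this thread: one
common way to keep an object alive after leaving an RCU read-side
critical section is to take a reference on it while still inside the
section (in the kernel this is typically done with
kref_get_unless_zero()). The userspace sketch below models just that
idea with C11 atomics. Every name in it (demo_mdev, demo_lookup,
demo_remove, ...) is hypothetical, the "freed" flag stands in for the
real release path, and it does not model RCU grace periods or the
matrix_dev->lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ap_matrix_mdev; not the real layout. */
struct demo_mdev {
	atomic_int refcount;	/* 0 means teardown has already started */
	bool freed;		/* set by the last put instead of calling free() */
};

/* Take a reference only if the object is still live. */
static bool demo_get_unless_zero(struct demo_mdev *m)
{
	int old = atomic_load(&m->refcount);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&m->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; the last put marks the object as released. */
static void demo_put(struct demo_mdev *m)
{
	int old = atomic_fetch_sub(&m->refcount, 1);

	assert(old > 0);
	if (old == 1)
		m->freed = true;
}

/*
 * Lookup step. In the kernel, this body would sit between
 * rcu_read_lock() and rcu_read_unlock(); the reference taken here is
 * what keeps the object valid after the read-side section ends.
 */
static struct demo_mdev *demo_lookup(_Atomic(struct demo_mdev *) *slot)
{
	struct demo_mdev *m = atomic_load(slot);

	if (m && !demo_get_unless_zero(m))
		m = NULL;
	return m;
}

/* Remove step: unpublish the pointer, then drop the initial reference. */
static void demo_remove(_Atomic(struct demo_mdev *) *slot)
{
	struct demo_mdev *m = atomic_exchange(slot, NULL);

	if (m)
		demo_put(m);
}
```

With this shape, a handler that called demo_lookup() before the remove
path ran can keep using the object, sleep included, until its own
demo_put(); the object is released only when the last reference goes
away. Whether that fits the vfio_ap locking rules is a separate
question that this sketch does not answer.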
And, again, module locking doesn't prevent vfio_ap_mdev_remove() from
being called. None of these patches should be combining module locking
with RCU.
Is there any other way besides user interaction with the mdev's
sysfs remove interface for the remove callback to get invoked?
If I try to remove the mdev using the sysfs interface while the
mdev fd is still held open by the guest, the remove hangs until the
fd is closed. That being the case, the mdev release callback
will get invoked before the remove callback, which
renders this whole debate moot. What am I missing here?
Jason