On Tue, 4 Jun 2019 15:38:51 -0400 Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote: > On 5/21/19 11:34 AM, Pierre Morel wrote: > > We register a AP PQAP instruction hook during the open > > of the mediated device. And unregister it on release. [..] > > +/** > > + * vfio_ap_wait_for_irqclear > > + * @apqn: The AP Queue number > > + * > > + * Checks the IRQ bit for the status of this APQN using ap_tapq. > > + * Returns if the ap_tapq function succedded and the bit is clear. > > s/succedded/succeeded/ > I'm gonna fix this up when picking. > > + * Returns if ap_tapq function failed with invalid, deconfigured or > > + * checkstopped AP. > > + * Otherwise retries up to 5 times after waiting 20ms. > > + * > > + */ > > +static void vfio_ap_wait_for_irqclear(int apqn) > > +{ > > + struct ap_queue_status status; > > + int retry = 5; > > + > > + do { > > + status = ap_tapq(apqn, NULL); > > + switch (status.response_code) { > > + case AP_RESPONSE_NORMAL: > > + case AP_RESPONSE_RESET_IN_PROGRESS: > > + if (!status.irq_enabled) > > + return; > > + /* Fall through */ > > + case AP_RESPONSE_BUSY: > > + msleep(20); > > + break; > > + case AP_RESPONSE_Q_NOT_AVAIL: > > + case AP_RESPONSE_DECONFIGURED: > > + case AP_RESPONSE_CHECKSTOPPED: > > + default: > > + WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__, > > + status.response_code, apqn); > > + return; > > Why not just break out of the loop and just use the WARN_ONCE > outside of the loop? > AFAIU the idea was to differentiate between got a strange response_code and ran out of retires. Actually I suspect that we are fine in case of AP_RESPONSE_Q_NOT_AVAIL, AP_RESPONSE_DECONFIGURED and AP_RESPONSE_CHECKSTOPPED in a sense that what should be the post-condition of this function is guaranteed to be reached. What do you think? While I think that we can do better here, I see this as something that should be done on top. > > + } > > + } while (--retry); > > + > > + WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n", > > + __func__, status.response_code, apqn); > > +} > > + > > +/** > > + * vfio_ap_free_aqic_resources > > + * @q: The vfio_ap_queue > > + * > > + * Unregisters the ISC in the GIB when the saved ISC not invalid. > > + * Unpin the guest's page holding the NIB when it exist. > > + * Reset the saved_pfn and saved_isc to invalid values. > > + * Clear the pointer to the matrix mediated device. > > + * > > + */ [..] > > +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q) > > +{ > > + struct ap_qirq_ctrl aqic_gisa = {}; > > + struct ap_queue_status status; > > + int retries = 5; > > + > > + do { > > + status = ap_aqic(q->apqn, aqic_gisa, NULL); > > + switch (status.response_code) { > > + case AP_RESPONSE_OTHERWISE_CHANGED: > > + case AP_RESPONSE_NORMAL: > > + vfio_ap_wait_for_irqclear(q->apqn); > > + goto end_free; > > + case AP_RESPONSE_RESET_IN_PROGRESS: > > + case AP_RESPONSE_BUSY: > > + msleep(20); > > + break; > > + case AP_RESPONSE_Q_NOT_AVAIL: > > + case AP_RESPONSE_DECONFIGURED: > > + case AP_RESPONSE_CHECKSTOPPED: > > + case AP_RESPONSE_INVALID_ADDRESS: > > + default: > > + /* All cases in default means AP not operational */ > > + WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__, > > + status.response_code); > > + goto end_free; > > Why not just break out of the loop instead of repeating the WARN_ONCE > message? > I suppose the reason is same as above. I'm not entirely happy with this code myself. E.g. why do we do retries here -- shouldn't we just fail the aqic by the guest? [..] > > +static int handle_pqap(struct kvm_vcpu *vcpu) > > +{ > > + uint64_t status; > > + uint16_t apqn; > > + struct vfio_ap_queue *q; > > + struct ap_queue_status qstatus = { > > + .response_code = AP_RESPONSE_Q_NOT_AVAIL, }; > > + struct ap_matrix_mdev *matrix_mdev; > > + > > + /* If we do not use the AIV facility just go to userland */ > > + if (!(vcpu->arch.sie_block->eca & ECA_AIV)) > > + return -EOPNOTSUPP; > > + > > + apqn = vcpu->run->s.regs.gprs[0] & 0xffff; > > + mutex_lock(&matrix_dev->lock); > > + > > + if (!vcpu->kvm->arch.crypto.pqap_hook) > > Wasn't this already checked in patch 2 prior to calling this > function? In fact, doesn't the hook point to this function? > Let us benevolently call this defensive programming. We are actually in that callback AFAICT, so it sure was set a moment ago, and I guess the client code still holds the kvm.lock so it is guaranteed to stay so unless somebody is playing foul. We can address this with a patch on top. Regards, Halil