On 09/24/2018 01:36 PM, Cornelia Huck wrote:
> On Wed, 12 Sep 2018 15:43:03 -0400
> Tony Krowiak <akrowiak@xxxxxxxxxxxxxxxxxx> wrote:
>
>> From: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
>>
>> Let's call PAPQ(ZAPQ) to zeroize a queue for each queue configured
>> for a mediated matrix device when it is released.
>>
>> Zeroizing a queue resets the queue, clears all pending
>> messages for the queue entries and disables adapter interruptions
>> associated with the queue.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
>> Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
>> Tested-by: Michael Mueller <mimu@xxxxxxxxxxxxx>
>> Tested-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
>> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
>> ---
>>  drivers/s390/crypto/vfio_ap_ops.c | 44 +++++++++++++++++++++++++++++++++++++
>>  1 files changed, 44 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>> index f8b276a..48b1b78 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -829,6 +829,49 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>  	return NOTIFY_OK;
>>  }
>>
>> +static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
>> +				    unsigned int retry)
>> +{
>> +	struct ap_queue_status status;
>> +
>> +	do {
>> +		status = ap_zapq(AP_MKQID(apid, apqi));
>> +		switch (status.response_code) {
>> +		case AP_RESPONSE_NORMAL:
>> +			return 0;
>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
>> +		case AP_RESPONSE_BUSY:
>> +			msleep(20);
>> +			break;
>> +		default:
>> +			/* things are really broken, give up */
>> +			return -EIO;
>> +		}
>> +	} while (retry--);
>> +
>> +	return -EBUSY;
>
> So, this function may either return 0, -EIO (things are really broken),
> or -EBUSY (still busy after multiple tries)...
>
>> +}
>> +
>> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>> +{
>> +	int ret;
>> +	int rc = 0;
>> +	unsigned long apid, apqi;
>> +	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> +
>> +	for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
>> +			     matrix_mdev->matrix.apm_max + 1) {
>> +		for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>> +				     matrix_mdev->matrix.aqm_max + 1) {
>> +			ret = vfio_ap_mdev_reset_queue(apid, apqi, 1);
>> +			if (ret)
>> +				rc = ret;
>
> ...and here, we return the last error of any of the resets. Two
> questions:
>
> - Does it make sense to continue if we get -EIO? IOW, does "really
>   broken" only refer to a certain tuple and other tuples still can/need
>   to be reset?

I think it does make sense to continue, because IMHO "things are really
broken" is an overstatement (I mean the APQN invalid case). One could
argue about whether skipping the current card (adapter) would be
justified. IMHO the current code is good enough for the first shot, and
we can think about fine-tuning it later.

> - Is the return code useful in any way, as we don't know which tuple it
>   refers to?
>

Well, good question. It conveys that the operation can 'fail'. AFAIR
-EBUSY is mostly fine given what the architecture says, if we are
satisfied with just a reset. And the cases behind -EIO might actually be
OK too in the same sense. My guess is that, based on the return value,
client code can tell whether we have zeroized all queues or basically
just reset them (like rapq). We could log that to some debug facility or
whatever, I guess, but at the moment we don't care.

In the end I think the code is good enough as is, and if we want we can
improve on it later.
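
To make the two questions concrete, here is a minimal user-space sketch
(not kernel code, and not a proposed change to the patch) of the same
retry and continue-on-error flow, extended to also remember which tuple
failed first. Everything named sim_*/SIM_*, zapq_fn, struct
reset_summary and the sample_* data is hypothetical and only stands in
for ap_zapq() and the AP_RESPONSE_* codes; plain arrays replace the
apm/aqm bitmaps and the msleep(20) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AP_RESPONSE_* codes used in the patch. */
enum sim_response {
	SIM_NORMAL,
	SIM_RESET_IN_PROGRESS,
	SIM_BUSY,
	SIM_INVALID,	/* anything else, e.g. an invalid APQN */
};

/* Hypothetical stand-in for ap_zapq(): yields a response for one queue. */
typedef enum sim_response (*zapq_fn)(unsigned int apid, unsigned int apqi);

/* Hypothetical summary; the patch itself only returns the last error. */
struct reset_summary {
	int rc;				/* last error seen, 0 if all resets succeeded */
	unsigned int failed;		/* number of tuples whose reset failed */
	unsigned int first_apid;	/* first failing tuple, valid if failed > 0 */
	unsigned int first_apqi;
};

/* Same control flow as vfio_ap_mdev_reset_queue(), minus the msleep(20). */
int reset_queue(zapq_fn zapq, unsigned int apid, unsigned int apqi,
		unsigned int retry)
{
	assert(zapq != NULL);
	do {
		switch (zapq(apid, apqi)) {
		case SIM_NORMAL:
			return 0;
		case SIM_RESET_IN_PROGRESS:
		case SIM_BUSY:
			break;	/* the patch sleeps 20 ms here before retrying */
		default:
			return -EIO;
		}
	} while (retry--);

	return -EBUSY;
}

/* Reset every (apid, apqi) pair and keep going after a failure. */
struct reset_summary reset_queues(zapq_fn zapq,
				  const unsigned int *apids, size_t napid,
				  const unsigned int *apqis, size_t napqi)
{
	struct reset_summary sum = { 0, 0, 0, 0 };
	size_t i, j;

	for (i = 0; i < napid; i++) {
		for (j = 0; j < napqi; j++) {
			int ret = reset_queue(zapq, apids[i], apqis[j], 1);

			if (!ret)
				continue;
			if (sum.failed++ == 0) {
				sum.first_apid = apids[i];
				sum.first_apqi = apqis[j];
			}
			sum.rc = ret;	/* last error wins, as in the patch */
		}
	}
	return sum;
}

/* Sample data: queue (0,1) stays busy, queue (1,2) is invalid. */
const unsigned int sample_apids[] = { 0, 1 };
const unsigned int sample_apqis[] = { 1, 2 };

enum sim_response sample_zapq(unsigned int apid, unsigned int apqi)
{
	if (apid == 0 && apqi == 1)
		return SIM_BUSY;
	if (apid == 1 && apqi == 2)
		return SIM_INVALID;
	return SIM_NORMAL;
}
```

With the sample data, reset_queues(sample_zapq, sample_apids, 2,
sample_apqis, 2) reports two failed tuples: rc is -EIO (the last error),
while the first failing tuple is (0,1), which failed with -EBUSY. That
is exactly the information a single return code cannot carry.
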

Regards,
Halil

>> +		}
>> +	}
>> +
>> +	return rc;
>> +}
>> +
>>  static int vfio_ap_mdev_open(struct mdev_device *mdev)
>>  {
>>  	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>> @@ -859,6 +902,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>  	if (matrix_mdev->kvm)
>>  		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>
>> +	vfio_ap_mdev_reset_queues(mdev);
>>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>  				 &matrix_mdev->group_notifier);
>>  	matrix_mdev->kvm = NULL;
>