On 24.09.2018 14:16, Halil Pasic wrote:
>
> On 09/24/2018 01:36 PM, Cornelia Huck wrote:
>> On Wed, 12 Sep 2018 15:43:03 -0400
>> Tony Krowiak <akrowiak@xxxxxxxxxxxxxxxxxx> wrote:
>>
>>> From: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
>>>
>>> Let's call PAPQ(ZAPQ) to zeroize a queue for each queue configured
>>> for a mediated matrix device when it is released.
>>>
>>> Zeroizing a queue resets the queue, clears all pending
>>> messages for the queue entries and disables adapter interruptions
>>> associated with the queue.
>>>
>>> Signed-off-by: Tony Krowiak <akrowiak@xxxxxxxxxxxxx>
>>> Reviewed-by: Halil Pasic <pasic@xxxxxxxxxxxxx>
>>> Tested-by: Michael Mueller <mimu@xxxxxxxxxxxxx>
>>> Tested-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
>>> Signed-off-by: Christian Borntraeger <borntraeger@xxxxxxxxxx>
>>> ---
>>> drivers/s390/crypto/vfio_ap_ops.c | 44 +++++++++++++++++++++++++++++++++++++
>>> 1 files changed, 44 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
>>> index f8b276a..48b1b78 100644
>>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>>> @@ -829,6 +829,49 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>>>  	return NOTIFY_OK;
>>>  }
>>>
>>> +static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
>>> +				    unsigned int retry)
>>> +{
>>> +	struct ap_queue_status status;
>>> +
>>> +	do {
>>> +		status = ap_zapq(AP_MKQID(apid, apqi));
>>> +		switch (status.response_code) {
>>> +		case AP_RESPONSE_NORMAL:
>>> +			return 0;
>>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
>>> +		case AP_RESPONSE_BUSY:
>>> +			msleep(20);
>>> +			break;
>>> +		default:
>>> +			/* things are really broken, give up */
>>> +			return -EIO;
>>> +		}
>>> +	} while (retry--);
>>> +
>>> +	return -EBUSY;
>> So, this function may either return 0, -EIO (things are really broken),
>> or -EBUSY (still busy after multiple tries)...
>>
>>> +}
>>> +
>>> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>>> +{
>>> +	int ret;
>>> +	int rc = 0;
>>> +	unsigned long apid, apqi;
>>> +	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> +
>>> +	for_each_set_bit_inv(apid, matrix_mdev->matrix.apm,
>>> +			     matrix_mdev->matrix.apm_max + 1) {
>>> +		for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
>>> +				     matrix_mdev->matrix.aqm_max + 1) {
>>> +			ret = vfio_ap_mdev_reset_queue(apid, apqi, 1);
>>> +			if (ret)
>>> +				rc = ret;
>> ...and here, we return the last error of any of the resets. Two
>> questions:
>>
>> - Does it make sense to continue if we get -EIO? IOW, does "really
>>   broken" only refer to a certain tuple and other tuples still can/need
>>   to be reset?
> I think it does make sense to continue, because IMHO "things are really
> broken" is an overstatement (I mean the APQN invalid case). One could
> argue about whether skipping the current card (adapter) would be
> justified.
>
> IMHO the current code is good enough for the first shot, and we can think
> about fine-tuning it later.

Absolutely. The -EIO case is reached, for example, when the APQN is
'deconfigured', which means the crypto adapter is logically unplugged.
So the -EIO case should NOT lead to fatal actions such as panic() or
shutting down a KVM guest.

>> - Is the return code useful in any way, as we don't know which tuple it
>>   refers to?
>>
> Well, good question. It conveys that the operation can 'fail'. AFAIR -EBUSY
> is mostly fine given what the architecture says if we are satisfied with just
> reset. And the cases behind -EIO might actually be OK too in the same sense.
> My guess is that, based on the return value, client code can tell whether we
> have zeroized all queues or basically just reset them (like rapq). We could
> log that to some debug facility or whatever -- I guess, but at the moment we
> don't care.
>
> In the end I think the code is good enough as is, and if we want we can
> improve on it later.
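
On the question of which tuple a return code refers to: below is a
minimal user-space sketch, not a proposed patch, of how the loop could
keep the "last error wins" convention and still report each failing
APQN. The names reset_fn, reset_summary, reset_all_queues and fake_reset
are made up for this illustration, plain arrays replace the apm/aqm
bitmaps, and fprintf() only stands in for whatever debug facility would
be used in the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for vfio_ap_mdev_reset_queue(): the caller passes
 * the reset routine, so the sketch can run outside the kernel. */
typedef int (*reset_fn)(unsigned int apid, unsigned int apqi);

/* Hypothetical per-pass summary; nothing like this exists in the patch. */
struct reset_summary {
	unsigned int total;	/* queues visited */
	unsigned int busy;	/* resets that ended with -EBUSY */
	unsigned int failed;	/* resets that ended with any other error */
};

/*
 * Reset every (apid, apqi) pair, continue after errors, and report each
 * failing tuple on stderr. Returns 0 if all resets succeeded, otherwise
 * the last error seen (the same convention as the quoted loop).
 */
int reset_all_queues(const unsigned int *apids, size_t napids,
		     const unsigned int *apqis, size_t napqis,
		     reset_fn reset, struct reset_summary *sum)
{
	int rc = 0;

	sum->total = sum->busy = sum->failed = 0;

	for (size_t i = 0; i < napids; i++) {
		for (size_t j = 0; j < napqis; j++) {
			int ret = reset(apids[i], apqis[j]);

			sum->total++;
			if (!ret)
				continue;

			if (ret == -EBUSY)
				sum->busy++;
			else
				sum->failed++;

			/* Name the tuple, so the error is attributable. */
			fprintf(stderr, "reset of APQN %02x.%04x failed: %d\n",
				apids[i], apqis[j], ret);
			rc = ret;
		}
	}

	return rc;
}

/* Made-up behaviour for the demo: adapter 2 is deconfigured (-EIO),
 * queue 1.7 stays busy (-EBUSY), everything else resets cleanly. */
int fake_reset(unsigned int apid, unsigned int apqi)
{
	if (apid == 2)
		return -EIO;
	if (apid == 1 && apqi == 7)
		return -EBUSY;
	return 0;
}
```

Calling reset_all_queues() with adapters {1, 2}, domains {4, 7} and
fake_reset visits four queues and still returns a single error code, but
the summary and the log lines show which tuples were only busy and which
sat behind a deconfigured adapter.
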
>
> Regards,
> Halil
>
>
>>> +		}
>>> +	}
>>> +
>>> +	return rc;
>>> +}
>>> +
>>>  static int vfio_ap_mdev_open(struct mdev_device *mdev)
>>>  {
>>>  	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>> @@ -859,6 +902,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>>>  	if (matrix_mdev->kvm)
>>>  		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>
>>> +	vfio_ap_mdev_reset_queues(mdev);
>>>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>>>  				 &matrix_mdev->group_notifier);
>>>  	matrix_mdev->kvm = NULL;
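
For reference, a small user-space model of the retry loop in
vfio_ap_mdev_reset_queue() as quoted above. It is only a sketch: the
fake_queue type, fake_zapq() and the SKETCH_RESPONSE_* values are
invented for the illustration (the enum values are placeholders, not the
architected response codes), and the msleep(20) is left out. What it
shows is that do { } while (retry--) issues the reset retry + 1 times,
so the retry value of 1 passed by vfio_ap_mdev_reset_queues() means at
most two ZAPQ attempts per queue before -EBUSY is returned.

```c
#include <errno.h>

/* Placeholder response codes, named after the ones used in the patch. */
enum sketch_response {
	SKETCH_RESPONSE_NORMAL,
	SKETCH_RESPONSE_RESET_IN_PROGRESS,
	SKETCH_RESPONSE_BUSY,
	SKETCH_RESPONSE_OTHER,
};

/* Hypothetical fake queue: answers BUSY busy_left times, then NORMAL. */
struct fake_queue {
	unsigned int busy_left;
	unsigned int attempts;	/* how many times the reset was issued */
};

/* Stand-in for ap_zapq(): counts the call and produces a response. */
enum sketch_response fake_zapq(struct fake_queue *q)
{
	q->attempts++;
	if (q->busy_left > 0) {
		q->busy_left--;
		return SKETCH_RESPONSE_BUSY;
	}
	return SKETCH_RESPONSE_NORMAL;
}

/* Same control flow as the quoted function, without the sleep. */
int reset_queue(struct fake_queue *q, unsigned int retry)
{
	do {
		switch (fake_zapq(q)) {
		case SKETCH_RESPONSE_NORMAL:
			return 0;
		case SKETCH_RESPONSE_RESET_IN_PROGRESS:
		case SKETCH_RESPONSE_BUSY:
			/* the patch sleeps 20 ms here before retrying */
			break;
		default:
			/* any other response maps to -EIO, as in the patch */
			return -EIO;
		}
	} while (retry--);

	return -EBUSY;
}
```

With retry = 1, a queue that is busy once is reset successfully on the
second attempt, while a queue that stays busy yields -EBUSY after two
attempts; with retry = 0 there is exactly one attempt.
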