On Mon, 21 Jan 2019 17:55:39 +0100 Cornelia Huck <cohuck@xxxxxxxxxx> wrote: > On Mon, 21 Jan 2019 11:14:16 -0500 > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > On 01/21/2019 10:52 AM, Cornelia Huck wrote: > > > On Mon, 21 Jan 2019 10:48:32 -0500 > > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > > > >> On 01/21/2019 10:18 AM, Cornelia Huck wrote: > > >>> On Mon, 21 Jan 2019 09:54:09 -0500 > > >>> Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > >>> > > >>>> This check is unecessary as we already have the vfio state machine > > >>>> to handle I/O requests. > > >>>> > > >>>> On the other hand, this check returns incorrect information to > > >>>> userspace if the state of the subchannel is not idle. For example > > >>>> if the state is busy and new I/O request comes in, this will return > > >>>> an EACCES, whereas we should return EBUSY. > > >>>> > > >>>> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > > >>>> --- > > >>>> drivers/s390/cio/vfio_ccw_ops.c | 2 -- > > >>>> 1 file changed, 2 deletions(-) > > >>>> > > >>>> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c > > >>>> index f673e10..3fdcc6d 100644 > > >>>> --- a/drivers/s390/cio/vfio_ccw_ops.c > > >>>> +++ b/drivers/s390/cio/vfio_ccw_ops.c > > >>>> @@ -193,8 +193,6 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev, > > >>>> return -EINVAL; > > >>>> > > >>>> private = dev_get_drvdata(mdev_parent_dev(mdev)); > > >>>> - if (private->state != VFIO_CCW_STATE_IDLE) > > >>>> - return -EACCES; > > >>>> > > >>>> region = private->io_region; > > >>>> if (copy_from_user((void *)region + *ppos, buf, > > >>>> count)) > > >>> > > >>> Hm, the patchset for halt/clear handling I recently posted > > >>> changes this to a check for NOT_OPER || STANDBY. What do you > > >>> think of that option? > > >>> > > >>> > > >> I am concerned with the return code that we send userspace. With > > >> the state machines we return an EIO for NOT_OPER or STANDBY, but > > >> we return EACCES in the early check. QEMU on an EACCES returns a > > >> 'not_oper' to the guest and for EIO will inject an interrupt. > > > > > > I actually think that not_oper is the saner option here: the > > > device is in a state on the vfio side where it is not usable... > > > > Agreed. Then we should the change the return codes with the state > > machine to EACCES as well, that could be a follow up patch to your > > series. > > Yes, let's do that. > > As you seem to be able to hit this problem with your workload, maybe > post it as a standalone fix? I can easily rebase my series on top of > that (and I assume that it will need more review than a simple fix.) > I had an issue following the discussion so I chatted up Farhan. Turns out the bad private->state he sees is VFIO_CCW_STATE_BUSY. That is, I guess, a should normally not happen. I mean QEMU should not accept a new SSCH before the START_PENDING for the old one is cleared. And that would normally happen after the IO corresponding to the old SSCH is done. Well we do have our wonderful emulated CSCH/SSCH at the moment (which do clear the SCSW in QEMU but don't bother to terminate IO at the device). And we may or may not have bugs in all the SCSW handling/updating code as well. The problem with this patch is that we poke private->io_region before calling the state machine stuff. That could race with the interrupt handler touching the same region. That could have been OK, if IRB in the region was read-only. But it is not, and QEMU zeroes it out on each write AFAICT. Changing the condition to NOT_OPER || STANDBY or omitting the check (at the moment) does not seem to be what I would call a fix. Regards, Halil