On Mon, 21 Jan 2019 11:14:16 -0500 Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > On 01/21/2019 10:52 AM, Cornelia Huck wrote: > > On Mon, 21 Jan 2019 10:48:32 -0500 > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > >> On 01/21/2019 10:18 AM, Cornelia Huck wrote: > >>> On Mon, 21 Jan 2019 09:54:09 -0500 > >>> Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > >>> > >>>> This check is unecessary as we already have the vfio state machine > >>>> to handle I/O requests. > >>>> > >>>> On the other hand, this check returns incorrect information to > >>>> userspace if the state of the subchannel is not idle. For example > >>>> if the state is busy and new I/O request comes in, this will return > >>>> an EACCES, whereas we should return EBUSY. > >>>> > >>>> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > >>>> --- > >>>> drivers/s390/cio/vfio_ccw_ops.c | 2 -- > >>>> 1 file changed, 2 deletions(-) > >>>> > >>>> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c > >>>> index f673e10..3fdcc6d 100644 > >>>> --- a/drivers/s390/cio/vfio_ccw_ops.c > >>>> +++ b/drivers/s390/cio/vfio_ccw_ops.c > >>>> @@ -193,8 +193,6 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev, > >>>> return -EINVAL; > >>>> > >>>> private = dev_get_drvdata(mdev_parent_dev(mdev)); > >>>> - if (private->state != VFIO_CCW_STATE_IDLE) > >>>> - return -EACCES; > >>>> > >>>> region = private->io_region; > >>>> if (copy_from_user((void *)region + *ppos, buf, count)) > >>> > >>> Hm, the patchset for halt/clear handling I recently posted changes this > >>> to a check for NOT_OPER || STANDBY. What do you think of that option? > >>> > >>> > >> I am concerned with the return code that we send userspace. With the > >> state machines we return an EIO for NOT_OPER or STANDBY, but we return > >> EACCES in the early check. QEMU on an EACCES returns a 'not_oper' to the > >> guest and for EIO will inject an interrupt. > > > > I actually think that not_oper is the saner option here: the device is > > in a state on the vfio side where it is not usable... > > Agreed. Then we should the change the return codes with the state > machine to EACCES as well, that could be a follow up patch to your series. Yes, let's do that. As you seem to be able to hit this problem with your workload, maybe post it as a standalone fix? I can easily rebase my series on top of that (and I assume that it will need more review than a simple fix.)