On Tue, 22 Jan 2019 11:20:41 +0100 Cornelia Huck <cohuck@xxxxxxxxxx> wrote: > On Mon, 21 Jan 2019 21:10:32 +0100 > Halil Pasic <pasic@xxxxxxxxxxxxx> wrote: > > > On Mon, 21 Jan 2019 17:55:39 +0100 > > Cornelia Huck <cohuck@xxxxxxxxxx> wrote: > > > > > On Mon, 21 Jan 2019 11:14:16 -0500 > > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > > > > > On 01/21/2019 10:52 AM, Cornelia Huck wrote: > > > > > On Mon, 21 Jan 2019 10:48:32 -0500 > > > > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > > > > > > > >> On 01/21/2019 10:18 AM, Cornelia Huck wrote: > > > > >>> On Mon, 21 Jan 2019 09:54:09 -0500 > > > > >>> Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > > >>> > > > > >>>> This check is unecessary as we already have the vfio state machine > > > > >>>> to handle I/O requests. > > > > >>>> > > > > >>>> On the other hand, this check returns incorrect information to > > > > >>>> userspace if the state of the subchannel is not idle. For example > > > > >>>> if the state is busy and new I/O request comes in, this will return > > > > >>>> an EACCES, whereas we should return EBUSY. > > > > >>>> > > > > >>>> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > > > > >>>> --- > > > > >>>> drivers/s390/cio/vfio_ccw_ops.c | 2 -- > > > > >>>> 1 file changed, 2 deletions(-) > > > > >>>> > > > > >>>> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c > > > > >>>> index f673e10..3fdcc6d 100644 > > > > >>>> --- a/drivers/s390/cio/vfio_ccw_ops.c > > > > >>>> +++ b/drivers/s390/cio/vfio_ccw_ops.c > > > > >>>> @@ -193,8 +193,6 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev, > > > > >>>> return -EINVAL; > > > > >>>> > > > > >>>> private = dev_get_drvdata(mdev_parent_dev(mdev)); > > > > >>>> - if (private->state != VFIO_CCW_STATE_IDLE) > > > > >>>> - return -EACCES; > > > > >>>> > > > > >>>> region = private->io_region; > > > > >>>> if (copy_from_user((void *)region + *ppos, buf, > > > > >>>> count)) > > > > >>> > > > > >>> Hm, the patchset for halt/clear handling I recently posted > > > > >>> changes this to a check for NOT_OPER || STANDBY. What do you > > > > >>> think of that option? > > > > >>> > > > > >>> > > > > >> I am concerned with the return code that we send userspace. With > > > > >> the state machines we return an EIO for NOT_OPER or STANDBY, but > > > > >> we return EACCES in the early check. QEMU on an EACCES returns a > > > > >> 'not_oper' to the guest and for EIO will inject an interrupt. > > > > > > > > > > I actually think that not_oper is the saner option here: the > > > > > device is in a state on the vfio side where it is not usable... > > > > > > > > Agreed. Then we should the change the return codes with the state > > > > machine to EACCES as well, that could be a follow up patch to your > > > > series. > > > > > > Yes, let's do that. > > > > > > As you seem to be able to hit this problem with your workload, maybe > > > post it as a standalone fix? I can easily rebase my series on top of > > > that (and I assume that it will need more review than a simple fix.) > > > > > > > I had an issue following the discussion so I chatted up Farhan. Turns > > out the bad private->state he sees is VFIO_CCW_STATE_BUSY. That is, I > > guess, a should normally not happen. I mean QEMU should not accept a new > > SSCH before the START_PENDING for the old one is cleared. And that would > > normally happen after the IO corresponding to the old SSCH is done. Well > > we do have our wonderful emulated CSCH/SSCH at the moment (which do clear > > the SCSW in QEMU but don't bother to terminate IO at the device). And we > > may or may not have bugs in all the SCSW handling/updating code as well. > > Well, that hopefully becomes better once we have proper halt/clear > passthrough. Any kind of emulation will continue to be problematic in > one way or the other. > > > The problem with this patch is that we poke private->io_region before > > calling the state machine stuff. That could race with the interrupt > > handler touching the same region. That could have been OK, if IRB in the > > region was read-only. But it is not, and QEMU zeroes it out on each > > write AFAICT. Changing the condition to NOT_OPER || STANDBY or > > omitting the check (at the moment) does not seem to be what I would call > > a fix. > > So, should this wait until after we rewrite the state machine and I/O > region concurrency? > > I'm not sure what the test setup exercises, but making the return code > (-EACCES vs. -EIO) consistent looks like a win to me. > Regarding the test setup Farhan is likely to be able to provide more info. And my intuition says, yes it should wait. I would also like to see the error codes, which are: * obviously a substantial part of the interface, and need to be mapped to channel IO conditions, and thus * beyond usual file IO stuff (i.e. man 3 write won't give you the right explanation for the error codes properly documented. @Farhan: Would you like to tackle this one? Given what QEMU does with EIO and EACCESS, I agree, if NOT_OPER || STANDBY EACCESS (that is cc 3 not operational) is more appropriate. Regards, Halil