Re: [RFC v1 2/2] vfio-ccw: Don't exit early if state of the vfio-ccw subchannel is not idle

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 22 Jan 2019 11:20:41 +0100
Cornelia Huck <cohuck@xxxxxxxxxx> wrote:

> On Mon, 21 Jan 2019 21:10:32 +0100
> Halil Pasic <pasic@xxxxxxxxxxxxx> wrote:
> 
> > On Mon, 21 Jan 2019 17:55:39 +0100
> > Cornelia Huck <cohuck@xxxxxxxxxx> wrote:
> > 
> > > On Mon, 21 Jan 2019 11:14:16 -0500
> > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
> > >   
> > > > On 01/21/2019 10:52 AM, Cornelia Huck wrote:  
> > > > > On Mon, 21 Jan 2019 10:48:32 -0500
> > > > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
> > > > >     
> > > > >> On 01/21/2019 10:18 AM, Cornelia Huck wrote:    
> > > > >>> On Mon, 21 Jan 2019 09:54:09 -0500
> > > > >>> Farhan Ali <alifm@xxxxxxxxxxxxx> wrote:
> > > > >>>        
> > > > >>>> This check is unecessary as we already have the vfio state machine
> > > > >>>> to handle I/O requests.
> > > > >>>>
> > > > >>>> On the other hand, this check returns incorrect information to
> > > > >>>> userspace if the state of the subchannel is not idle. For example
> > > > >>>> if the state is busy and new I/O request comes in, this will return
> > > > >>>> an EACCES, whereas we should return EBUSY.
> > > > >>>>
> > > > >>>> Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx>
> > > > >>>> ---
> > > > >>>>    drivers/s390/cio/vfio_ccw_ops.c | 2 --
> > > > >>>>    1 file changed, 2 deletions(-)
> > > > >>>>
> > > > >>>> diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c
> > > > >>>> index f673e10..3fdcc6d 100644
> > > > >>>> --- a/drivers/s390/cio/vfio_ccw_ops.c
> > > > >>>> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> > > > >>>> @@ -193,8 +193,6 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> > > > >>>>    		return -EINVAL;
> > > > >>>>    
> > > > >>>>    	private = dev_get_drvdata(mdev_parent_dev(mdev));
> > > > >>>> -	if (private->state != VFIO_CCW_STATE_IDLE)
> > > > >>>> -		return -EACCES;
> > > > >>>>    
> > > > >>>>    	region = private->io_region;
> > > > >>>>    	if (copy_from_user((void *)region + *ppos, buf,
> > > > >>>> count))    
> > > > >>>
> > > > >>> Hm, the patchset for halt/clear handling I recently posted
> > > > >>> changes this to a check for NOT_OPER || STANDBY. What do you
> > > > >>> think of that option?
> > > > >>>
> > > > >>>        
> > > > >> I am concerned with the return code that we send userspace. With
> > > > >> the state machines we return an EIO for NOT_OPER or STANDBY, but
> > > > >> we return EACCES in the early check. QEMU on an EACCES returns a
> > > > >> 'not_oper' to the guest and for EIO will inject an interrupt.    
> > > > > 
> > > > > I actually think that not_oper is the saner option here: the
> > > > > device is in a state on the vfio side where it is not usable...    
> > > > 
> > > > Agreed. Then we should the change the return codes with the state 
> > > > machine to EACCES as well, that could be a follow up patch to your
> > > > series.  
> > > 
> > > Yes, let's do that.
> > > 
> > > As you seem to be able to hit this problem with your workload, maybe
> > > post it as a standalone fix? I can easily rebase my series on top of
> > > that (and I assume that it will need more review than a simple fix.)
> > >   
> > 
> > I had an issue following the discussion so I chatted up Farhan. Turns
> > out the bad private->state he sees is VFIO_CCW_STATE_BUSY. That is, I
> > guess, a should normally not happen. I mean QEMU should not accept a new
> > SSCH before the START_PENDING for the old one is cleared. And that would
> > normally happen after the IO corresponding to the old SSCH is done. Well
> > we do have our wonderful emulated CSCH/SSCH at the moment (which do clear
> > the SCSW in QEMU but don't bother to terminate IO at the device). And we
> > may or may not have bugs in all the SCSW handling/updating code as well.
> 
> Well, that hopefully becomes better once we have proper halt/clear
> passthrough. Any kind of emulation will continue to be problematic in
> one way or the other.
> 
> > The problem with this patch is that we poke private->io_region before
> > calling the state machine stuff. That could race with the interrupt
> > handler touching the same region. That could have been OK, if IRB in the
> > region was read-only. But it is not, and QEMU zeroes it out on each
> > write AFAICT. Changing the condition to NOT_OPER || STANDBY or
> > omitting the check (at the moment) does not seem to be what I would call
> > a fix.
> 
> So, should this wait until after we rewrite the state machine and I/O
> region concurrency?
> 
> I'm not sure what the test setup exercises, but making the return code
> (-EACCES vs. -EIO) consistent looks like a win to me.
> 

Regarding the test setup Farhan is likely to be able to provide more
info. And my intuition says, yes it should wait. I would also like to
see the error codes, which are:
* obviously a substantial part of the interface, and need to be mapped
  to channel IO conditions, and thus
* beyond usual file IO stuff (i.e. man 3 write won't give you the right
  explanation for the error codes
properly documented. @Farhan: Would you like to tackle this one?

Given what QEMU does with EIO and EACCESS, I agree, if NOT_OPER ||
STANDBY EACCESS (that is cc 3 not operational) is more appropriate.

Regards,
Halil




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux