Re: [Qemu-devel] [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 22 Jan 2019 18:26:17 +0100
Cornelia Huck <cohuck@xxxxxxxxxx> wrote:

> On Tue, 22 Jan 2019 13:46:12 +0100
> Halil Pasic <pasic@xxxxxxxxxxxxx> wrote:
> 
> > On Tue, 22 Jan 2019 12:53:22 +0100
> > Cornelia Huck <cohuck@xxxxxxxxxx> wrote:
> > 
> > > On Tue, 22 Jan 2019 12:17:37 +0100
> > > Halil Pasic <pasic@xxxxxxxxxxxxx> wrote:
> > >   
> > > > On Tue, 22 Jan 2019 11:29:26 +0100
> > > > Cornelia Huck <cohuck@xxxxxxxxxx> wrote:
> > > >   
> > > > > On Mon, 21 Jan 2019 21:20:18 +0100
> > > > > Halil Pasic <pasic@xxxxxxxxxxxxx> wrote:
> > > > >     
> > > > > > On Mon, 21 Jan 2019 12:03:51 +0100
> > > > > > Cornelia Huck <cohuck@xxxxxxxxxx> wrote:
> > > > > >     
> > > > > > > Rework handling of multiple I/O requests to return -EAGAIN if
> > > > > > > we are already processing an I/O request. Introduce a mutex
> > > > > > > to disallow concurrent writes to the I/O region.
> > > > > > > 
> > > > > > > The expectation is that userspace simply retries the operation
> > > > > > > if it gets -EAGAIN.
> > > > > > > 
> > > > > > > We currently don't allow multiple ssch requests at the same
> > > > > > > time, as we don't have support for keeping channel programs
> > > > > > > around for more than one request.
> > > > > > > 
> > > > > > > Signed-off-by: Cornelia Huck <cohuck@xxxxxxxxxx>
> > > > > > > ---      
> > > > > > 
> > > > > > [..]
> > > > > >     
> > > > > > >  static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> > > > > > > @@ -188,25 +192,30 @@ static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> > > > > > >  {
> > > > > > >  	struct vfio_ccw_private *private;
> > > > > > >  	struct ccw_io_region *region;
> > > > > > > +	int ret;
> > > > > > >  
> > > > > > >  	if (*ppos + count > sizeof(*region))
> > > > > > >  		return -EINVAL;
> > > > > > >  
> > > > > > >  	private = dev_get_drvdata(mdev_parent_dev(mdev));
> > > > > > > -	if (private->state != VFIO_CCW_STATE_IDLE)
> > > > > > > +	if (private->state == VFIO_CCW_STATE_NOT_OPER ||
> > > > > > > +	    private->state == VFIO_CCW_STATE_STANDBY)
> > > > > > >  		return -EACCES;
> > > > > > > +	if (!mutex_trylock(&private->io_mutex))
> > > > > > > +		return -EAGAIN;
> > > > > > >  
> > > > > > >  	region = private->io_region;
> > > > > > > -	if (copy_from_user((void *)region + *ppos, buf, count))
> > > > > > > -		return -EFAULT;
> > > > > > > +	if (copy_from_user((void *)region + *ppos, buf, count)) {      
> > > > > > 
> > > > > > This might race with vfio_ccw_sch_io_todo() on
> > > > > > private->io_region->irb_area, or?    
> > > > > 
> > > > > Ah yes, this should also take the mutex (should work because we're on a
> > > > > workqueue).
> > > > >     
> > > > 
> > > > I'm not sure that will do the trick (assumed I understood the
> > > > intention correctly). Let's say the things happen in this order:
> > > > 1) vfio_ccw_sch_io_todo() goes first, I guess updates
> > > > private->io_region->irb_area and releases the mutex.
> > > > 2) Then vfio_ccw_mdev_write() destroys the irb_area by zeriong it out,
> > > > and finally,
> > > > 3) userspace reads the destroyed irb_area using vfio_ccw_mdev_read().
> > > > 
> > > > Or am I misunderstanding something?   
> > > 
> > > You're not, but dealing with that race is outside the scope of this
> > > patch. If userspace submits a request and then tries to get the old
> > > data for a prior request, I suggest that userspace needs to fix their
> > > sequencing.
> > >   
> > 
> > I tend to disagree, because I think this would be a degradation compared
> > to what we have right now.
> > 
> > Let me explain. I guess the current idea is that the private->state !=
> > VFIO_CCW_STATE_IDLE check safeguards against this. Yes we lack proper
> > synchronization (atomic/interlocked access or locks) that would guarantee
> > that different thread observe state transitions as required -- no
> > splint brain. But if state were atomic the scenario I have in mind can
> > not happen, because we get the solicited interrupt in BUSY state (and
> > set IDLE in vfio_ccw_sch_io_todo()). 
> 
> This BUSY handling is broken for another case: If the guest requests
> intermediate interrupts, there may be more than one interrupt by the
> hardware -- and we still go out of BUSY state. (Freeing the cp is also
> broken in that case.) However, the Linux dasd driver does not seem to
> do that.
> 

Nod.

> > Unsolicited interrupts are another
> > piece of cake -- I have no idea how may of those do we get.
> 
> They at least don't have the "free the cp before we got final state"
> bug. But I think both are reasons to get away from "use the BUSY state
> to ensure the right sequence".
> 

I'm not sure I understand you correctly. I was under the impression that
the whole point in having a state machine was to ensure the states are
traversed in the right sequence with the right stuff being done on each
transition. At least in theory.

You've probably figured out that IMHO vfio-ccw is not in a good shape
(to put it mildly). I have a hard time reviewing a non-holistic
concurrency fix. Please tell sould I get perceived as non-constructive,
I will try to cut back on criticism. 

> > And because
> > of this the broken sequencing in userspace could actually be the kernels
> > fault.
> 
> Here, I can't follow you at all :(
> 

Should we ever deliver a zeroed out IRB to the userspace, for the next
ioinst it would look like we have no status nor FC bit set. That is, the
guest could end up with stuff in parallel that was never supposed to
be in parallel (i.e. broken sequencing because kernel feeds false
information due to race with unsolicited interrupt).

Does that help?

Regards,
Halil






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Kernel Development]     [Kernel Newbies]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Info]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Linux Media]     [Device Mapper]

  Powered by Linux