Re: [PATCH v2 2/5] vfio-ccw: concurrent I/O handling

Cornelia Huck <cohuck@xxxxxxxxxx> · Fri, 25 Jan 2019 11:24:37 +0100

On Thu, 24 Jan 2019 21:37:44 -0500
Eric Farman <farman@xxxxxxxxxxxxx> wrote:

> On 01/24/2019 09:25 PM, Eric Farman wrote:
> > 
> > 
> > On 01/21/2019 06:03 AM, Cornelia Huck wrote:  

> > [1] I think these changes are cool.  We end up going into (and staying 
> > in) state=BUSY if we get cc=0 on the SSCH, rather than in/out as we 
> > bumble along.
> > 
> > But why can't these be separated out from this patch?  It does change 
> > the behavior of the state machine, and seem distinct from the addition 
> > of the mutex you otherwise add here?  At the very least, this behavior 
> > change should be documented in the commit since it's otherwise lost in 
> > the mutex/EAGAIN stuff.

That's a very good idea. I'll factor them out into a separate patch.

> >   
> >>       trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private),
> >>                      io_region->ret_code, errstr);
> >>   }
> >> diff --git a/drivers/s390/cio/vfio_ccw_ops.c 
> >> b/drivers/s390/cio/vfio_ccw_ops.c
> >> index f673e106c041..3fa9fc570400 100644
> >> --- a/drivers/s390/cio/vfio_ccw_ops.c
> >> +++ b/drivers/s390/cio/vfio_ccw_ops.c
> >> @@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct 
> >> mdev_device *mdev,
> >>   {
> >>       struct vfio_ccw_private *private;
> >>       struct ccw_io_region *region;
> >> +    int ret;
> >>       if (*ppos + count > sizeof(*region))
> >>           return -EINVAL;
> >>       private = dev_get_drvdata(mdev_parent_dev(mdev));
> >> +    mutex_lock(&private->io_mutex);
> >>       region = private->io_region;
> >>       if (copy_to_user(buf, (void *)region + *ppos, count))
> >> -        return -EFAULT;
> >> -
> >> -    return count;
> >> +        ret = -EFAULT;
> >> +    else
> >> +        ret = count;
> >> +    mutex_unlock(&private->io_mutex);
> >> +    return ret;
> >>   }
> >>   static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev,
> >> @@ -188,25 +192,30 @@ static ssize_t vfio_ccw_mdev_write(struct 
> >> mdev_device *mdev,
> >>   {
> >>       struct vfio_ccw_private *private;
> >>       struct ccw_io_region *region;
> >> +    int ret;
> >>       if (*ppos + count > sizeof(*region))
> >>           return -EINVAL;
> >>       private = dev_get_drvdata(mdev_parent_dev(mdev));
> >> -    if (private->state != VFIO_CCW_STATE_IDLE)
> >> +    if (private->state == VFIO_CCW_STATE_NOT_OPER ||
> >> +        private->state == VFIO_CCW_STATE_STANDBY)
> >>           return -EACCES;
> >> +    if (!mutex_trylock(&private->io_mutex))
> >> +        return -EAGAIN;  
> > 
> > Ah, I see Halil's difficulty here.
> > 
> > It is true there is a race condition today, and that this doesn't 
> > address it.  That's fine, add it to the todo list.  But even with that, 
> > I don't see what the mutex is enforcing?  Two simultaneous SSCHs will be 
> > serialized (one will get kicked out with a failed trylock() call), while 
> > still leaving the window open between cc=0 on the SSCH and the 
> > subsequent interrupt.  In the latter case, a second SSCH will come 
> > through here, do the copy_from_user below, and then jump to fsm_io_busy 
> > to return EAGAIN.  Do we really want to stomp on io_region in that case? 
> >   Why can't we simply return EAGAIN if state==BUSY?  
> 
> (Answering my own questions as I skim patch 5...)
> 
> Because of course this series is for async handling, while I was looking 
> specifically at the synchronous code that exists today.  I guess then my 
> question just remains on how the mutex is adding protection in the sync 
> case, because that's still not apparent to me.  (Perhaps I missed it in 
> a reply to Halil; if so I apologize, there were a lot when I returned.)

My idea behind the mutex was to make sure that we get consistent data
when reading/writing (e.g. if one user space thread is reading the I/O
region while another is writing to it).