On Thu, 24 Jan 2019 21:37:44 -0500 Eric Farman <farman@xxxxxxxxxxxxx> wrote: > On 01/24/2019 09:25 PM, Eric Farman wrote: > > > > > > On 01/21/2019 06:03 AM, Cornelia Huck wrote: > > [1] I think these changes are cool. We end up going into (and staying > > in) state=BUSY if we get cc=0 on the SSCH, rather than in/out as we > > bumble along. > > > > But why can't these be separated out from this patch? It does change > > the behavior of the state machine, and seem distinct from the addition > > of the mutex you otherwise add here? At the very least, this behavior > > change should be documented in the commit since it's otherwise lost in > > the mutex/EAGAIN stuff. That's a very good idea. I'll factor them out into a separate patch. > > > >> trace_vfio_ccw_io_fctl(scsw->cmd.fctl, get_schid(private), > >> io_region->ret_code, errstr); > >> } > >> diff --git a/drivers/s390/cio/vfio_ccw_ops.c > >> b/drivers/s390/cio/vfio_ccw_ops.c > >> index f673e106c041..3fa9fc570400 100644 > >> --- a/drivers/s390/cio/vfio_ccw_ops.c > >> +++ b/drivers/s390/cio/vfio_ccw_ops.c > >> @@ -169,16 +169,20 @@ static ssize_t vfio_ccw_mdev_read(struct > >> mdev_device *mdev, > >> { > >> struct vfio_ccw_private *private; > >> struct ccw_io_region *region; > >> + int ret; > >> if (*ppos + count > sizeof(*region)) > >> return -EINVAL; > >> private = dev_get_drvdata(mdev_parent_dev(mdev)); > >> + mutex_lock(&private->io_mutex); > >> region = private->io_region; > >> if (copy_to_user(buf, (void *)region + *ppos, count)) > >> - return -EFAULT; > >> - > >> - return count; > >> + ret = -EFAULT; > >> + else > >> + ret = count; > >> + mutex_unlock(&private->io_mutex); > >> + return ret; > >> } > >> static ssize_t vfio_ccw_mdev_write(struct mdev_device *mdev, > >> @@ -188,25 +192,30 @@ static ssize_t vfio_ccw_mdev_write(struct > >> mdev_device *mdev, > >> { > >> struct vfio_ccw_private *private; > >> struct ccw_io_region *region; > >> + int ret; > >> if (*ppos + count > sizeof(*region)) > >> return -EINVAL; > >> private = dev_get_drvdata(mdev_parent_dev(mdev)); > >> - if (private->state != VFIO_CCW_STATE_IDLE) > >> + if (private->state == VFIO_CCW_STATE_NOT_OPER || > >> + private->state == VFIO_CCW_STATE_STANDBY) > >> return -EACCES; > >> + if (!mutex_trylock(&private->io_mutex)) > >> + return -EAGAIN; > > > > Ah, I see Halil's difficulty here. > > > > It is true there is a race condition today, and that this doesn't > > address it. That's fine, add it to the todo list. But even with that, > > I don't see what the mutex is enforcing? Two simultaneous SSCHs will be > > serialized (one will get kicked out with a failed trylock() call), while > > still leaving the window open between cc=0 on the SSCH and the > > subsequent interrupt. In the latter case, a second SSCH will come > > through here, do the copy_from_user below, and then jump to fsm_io_busy > > to return EAGAIN. Do we really want to stomp on io_region in that case? > > Why can't we simply return EAGAIN if state==BUSY? > > (Answering my own questions as I skim patch 5...) > > Because of course this series is for async handling, while I was looking > specifically at the synchronous code that exists today. I guess then my > question just remains on how the mutex is adding protection in the sync > case, because that's still not apparent to me. (Perhaps I missed it in > a reply to Halil; if so I apologize, there were a lot when I returned.) My idea behind the mutex was to make sure that we get consistent data when reading/writing (e.g. if one user space thread is reading the I/O region while another is writing to it).