On Tue, 2021-05-11 at 13:31 +0200, Cornelia Huck wrote:
> On Mon, 10 May 2021 22:56:46 +0200
> Eric Farman <farman@xxxxxxxxxxxxx> wrote:
>
> > Today, the stacked call to vfio_ccw_sch_io_todo() does three
> > things:
> >
> > 1) Update a solicited IRB with CP information, and release the CP
> > if the interrupt was the end of a START operation.
> > 2) Copy the IRB data into the io_region, under the protection of
> > the io_mutex
> > 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
> > vfio-ccw can accept more work.
> >
> > The trouble is that step 3 is (A) invoked for both solicited and
> > unsolicited interrupts, and (B) sitting after the mutex for step 2.
> > This second piece becomes a problem if it processes an interrupt
> > for a CLEAR SUBCHANNEL while another thread initiates a START,
> > thus allowing the CP and FSM states to get out of sync. That is:
> >
> >     CPU 1                           CPU 2
> >     fsm_do_clear()
> >     fsm_irq()
> >                                     fsm_io_request()
> >     vfio_ccw_sch_io_todo()
> >                                     fsm_io_helper()
> >
> > Since the FSM state and CP should be kept in sync, let's make a
> > note when the CP is released, and rely on that as an indication
> > that the FSM should also be reset at the end of this routine and
> > open up the device for more work.
> >
> > Signed-off-by: Eric Farman <farman@xxxxxxxxxxxxx>
> > ---
> >  drivers/s390/cio/vfio_ccw_drv.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/s390/cio/vfio_ccw_drv.c
> > b/drivers/s390/cio/vfio_ccw_drv.c
> > index 8c625b530035..ef39182edab5 100644
> > --- a/drivers/s390/cio/vfio_ccw_drv.c
> > +++ b/drivers/s390/cio/vfio_ccw_drv.c
> > @@ -85,7 +85,7 @@ static void vfio_ccw_sch_io_todo(struct
> > work_struct *work)
> >  {
> >  	struct vfio_ccw_private *private;
> >  	struct irb *irb;
> > -	bool is_final;
> > +	bool is_final, is_finished = false;
>
> <bikeshed>
> "is_finished" does not really say what is finished; maybe call it
> "cp_is_finished"?
> </bikeshed>

Sure, that's a bit clearer.

>
> >
> >  	private = container_of(work, struct vfio_ccw_private, io_work);
> >  	irb = &private->irb;
> > @@ -94,14 +94,16 @@ static void vfio_ccw_sch_io_todo(struct
> > work_struct *work)
> >  			 (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
> >  	if (scsw_is_solicited(&irb->scsw)) {
> >  		cp_update_scsw(&private->cp, &irb->scsw);
> > -		if (is_final && private->state ==
> > VFIO_CCW_STATE_CP_PENDING)
> > +		if (is_final && private->state ==
> > VFIO_CCW_STATE_CP_PENDING) {
> >  			cp_free(&private->cp);
> > +			is_finished = true;
> > +		}
> >  	}
> >  	mutex_lock(&private->io_mutex);
> >  	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
> >  	mutex_unlock(&private->io_mutex);
> >
> > -	if (private->mdev && is_final)
> > +	if (private->mdev && is_finished)
>
> Maybe add a comment?
>
> /*
>  * Reset to idle if processing of a channel program
>  * has finished; but do not overwrite a possible
>  * processing state if we got a final interrupt for hsch
>  * or csch.
>  */
>
> Otherwise, I see us scratching our heads again in a few months :)

Almost certainly. :)

>
> >  		private->state = VFIO_CCW_STATE_IDLE;
> >
> >  	if (private->io_trigger)
>
> Patch looks good to me.
>

Thanks. Will make the above improvements and send as non-RFC.
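
For illustration only, here is a rough user-space sketch of the decision logic as it might look with both suggestions applied (the rename to cp_is_finished and the comment). This is not the driver code: struct model_private, model_io_todo() and the two boolean parameters are hypothetical stand-ins for struct vfio_ccw_private, the work handler, scsw_is_solicited() and the activity-control check, and the IRB copy under io_mutex is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Subset of the FSM states that matter here (hypothetical model names). */
enum model_state {
	MODEL_STATE_IDLE,
	MODEL_STATE_CP_PROCESSING,
	MODEL_STATE_CP_PENDING,
};

/* Hypothetical stand-in for struct vfio_ccw_private. */
struct model_private {
	enum model_state state;
	bool cp_allocated;	/* models whether private->cp is in use */
	bool has_mdev;		/* models private->mdev != NULL */
};

/*
 * Models the tail of the work handler. 'solicited' and 'is_final' stand
 * in for the scsw checks done on the IRB. Returns cp_is_finished.
 */
bool model_io_todo(struct model_private *p, bool solicited, bool is_final)
{
	bool cp_is_finished = false;

	assert(p != NULL);

	if (solicited) {
		if (is_final && p->state == MODEL_STATE_CP_PENDING) {
			p->cp_allocated = false;	/* cp_free() in the driver */
			cp_is_finished = true;
		}
	}

	/* (copy of the IRB into the io_region under io_mutex omitted) */

	/*
	 * Reset to idle if processing of a channel program
	 * has finished; but do not overwrite a possible
	 * processing state if we got a final interrupt for hsch
	 * or csch.
	 */
	if (p->has_mdev && cp_is_finished)
		p->state = MODEL_STATE_IDLE;

	return cp_is_finished;
}
```

In this model, a final interrupt that arrives while the state is CP_PENDING frees the CP and resets the state to IDLE. A final interrupt for a csch that arrives after another thread has already moved the state to CP_PROCESSING leaves both the state and the new CP alone, which is the case the race diagram describes.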