On Mon, 10 May 2021 22:56:46 +0200
Eric Farman <farman@xxxxxxxxxxxxx> wrote:

> Today, the stacked call to vfio_ccw_sch_io_todo() does three things:
>
> 1) Update a solicited IRB with CP information, and release the CP
>    if the interrupt was the end of a START operation.
> 2) Copy the IRB data into the io_region, under the protection of
>    the io_mutex
> 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
>    vfio-ccw can accept more work.
>
> The trouble is that step 3 is (A) invoked for both solicited and
> unsolicited interrupts, and (B) sitting after the mutex for step 2.
> This second piece becomes a problem if it processes an interrupt
> for a CLEAR SUBCHANNEL while another thread initiates a START,
> thus allowing the CP and FSM states to get out of sync. That is:
>
>     CPU 1                           CPU 2
>     fsm_do_clear()
>     fsm_irq()
>                                     fsm_io_request()
>     vfio_ccw_sch_io_todo()
>                                     fsm_io_helper()
>
> Since the FSM state and CP should be kept in sync, let's make a
> note when the CP is released, and rely on that as an indication
> that the FSM should also be reset at the end of this routine and
> open up the device for more work.
>
> Signed-off-by: Eric Farman <farman@xxxxxxxxxxxxx>
> ---
>  drivers/s390/cio/vfio_ccw_drv.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c
> index 8c625b530035..ef39182edab5 100644
> --- a/drivers/s390/cio/vfio_ccw_drv.c
> +++ b/drivers/s390/cio/vfio_ccw_drv.c
> @@ -85,7 +85,7 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>  {
>  	struct vfio_ccw_private *private;
>  	struct irb *irb;
> -	bool is_final;
> +	bool is_final, is_finished = false;

<bikeshed>
"is_finished" does not really say what is finished; maybe call it
"cp_is_finished"?
</bikeshed>

>
>  	private = container_of(work, struct vfio_ccw_private, io_work);
>  	irb = &private->irb;
> @@ -94,14 +94,16 @@ static void vfio_ccw_sch_io_todo(struct work_struct *work)
>  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
>  	if (scsw_is_solicited(&irb->scsw)) {
>  		cp_update_scsw(&private->cp, &irb->scsw);
> -		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
> +		if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING) {
>  			cp_free(&private->cp);
> +			is_finished = true;
> +		}
>  	}
>  	mutex_lock(&private->io_mutex);
>  	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
>  	mutex_unlock(&private->io_mutex);
>
> -	if (private->mdev && is_final)
> +	if (private->mdev && is_finished)

Maybe add a comment?

	/*
	 * Reset to idle if processing of a channel program
	 * has finished; but do not overwrite a possible
	 * processing state if we got a final interrupt for hsch
	 * or csch.
	 */

Otherwise, I see us scratching our heads again in a few months :)

>  		private->state = VFIO_CCW_STATE_IDLE;
>
>  	if (private->io_trigger)

Patch looks good to me.
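
To illustrate the behaviour the suggested comment describes, here is a
minimal user-space sketch. It is not driver code: the model_* types and
model_io_todo() are hypothetical stand-ins for struct vfio_ccw_private,
the VFIO_CCW_STATE_* values and the tail of vfio_ccw_sch_io_todo(). The
IRB copy and the locking are left out. It assumes the rename to
cp_is_finished proposed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the vfio-ccw FSM states. */
enum model_state {
	MODEL_STATE_IDLE,
	MODEL_STATE_CP_PROCESSING,
	MODEL_STATE_CP_PENDING,
};

/* Hypothetical model of the fields the routine looks at. */
struct model_private {
	enum model_state state;
	bool has_mdev;		/* stands in for private->mdev != NULL */
	bool cp_allocated;	/* stands in for a live channel program */
};

/*
 * Model of the state handling in the work function, with the rename
 * applied. Returns the FSM state after the interrupt was processed.
 */
static enum model_state model_io_todo(struct model_private *p,
				      bool is_solicited, bool is_final)
{
	bool cp_is_finished = false;

	if (is_solicited) {
		if (is_final && p->state == MODEL_STATE_CP_PENDING) {
			/* Models cp_free(): the CP must exist here. */
			assert(p->cp_allocated);
			p->cp_allocated = false;
			cp_is_finished = true;
		}
	}

	/*
	 * Reset to idle if processing of a channel program
	 * has finished; but do not overwrite a possible
	 * processing state if we got a final interrupt for hsch
	 * or csch.
	 */
	if (p->has_mdev && cp_is_finished)
		p->state = MODEL_STATE_IDLE;

	return p->state;
}
```

In this model, a final solicited interrupt in CP_PENDING frees the CP
and moves to IDLE. A final interrupt that arrives while a new START is
already in CP_PROCESSING (the csch race from the commit message) leaves
both the state and the CP alone.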