Re: [RFC PATCH v4 4/4] vfio-ccw: Reset FSM state to IDLE before io_mutex

On Wed, 2021-04-21 at 12:25 +0200, Cornelia Huck wrote:
> On Tue, 13 Apr 2021 20:24:10 +0200
> Eric Farman <farman@xxxxxxxxxxxxx> wrote:
> 
> > Today, the stacked call to vfio_ccw_sch_io_todo() does three
> > things:
> > 
> > 1) Update a solicited IRB with CP information, and release the CP
> > if the interrupt was the end of a START operation.
> > 2) Copy the IRB data into the io_region, under the protection of
> > the io_mutex
> > 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that
> > vfio-ccw can accept more work.
> > 
> > The trouble is that step 3 is (A) invoked for both solicited and
> > unsolicited interrupts, and (B) sitting after the mutex for step 2.
> > This second piece becomes a problem if it processes an interrupt
> > for a CLEAR SUBCHANNEL while another thread initiates a START,
> > thus allowing the CP and FSM states to get out of sync. That is:
> > 
> > 	CPU 1				CPU 2
> > 	fsm_do_clear()
> > 	fsm_irq()
> > 					fsm_io_request()
> > 					fsm_io_helper()
> > 	vfio_ccw_sch_io_todo()
> > 					fsm_irq()
> > 					vfio_ccw_sch_io_todo()
> > 
> > Let's move the reset of the FSM state to the point where the
> > channel_program struct is cleaned up, which is only done for
> > solicited interrupts anyway.
> > 
> > Signed-off-by: Eric Farman <farman@xxxxxxxxxxxxx>
> > ---
> >  drivers/s390/cio/vfio_ccw_drv.c | 7 +++----
> >  1 file changed, 3 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/s390/cio/vfio_ccw_drv.c
> > b/drivers/s390/cio/vfio_ccw_drv.c
> > index 8c625b530035..e51318f23ca8 100644
> > --- a/drivers/s390/cio/vfio_ccw_drv.c
> > +++ b/drivers/s390/cio/vfio_ccw_drv.c
> > @@ -94,16 +94,15 @@ static void vfio_ccw_sch_io_todo(struct
> > work_struct *work)
> >  		     (SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
> >  	if (scsw_is_solicited(&irb->scsw)) {
> >  		cp_update_scsw(&private->cp, &irb->scsw);
> > -		if (is_final && private->state ==
> > VFIO_CCW_STATE_CP_PENDING)
> > +		if (is_final && private->state ==
> > VFIO_CCW_STATE_CP_PENDING) {
> >  			cp_free(&private->cp);
> > +			private->state = VFIO_CCW_STATE_IDLE;
> > +		}
> >  	}
> >  	mutex_lock(&private->io_mutex);
> >  	memcpy(private->io_region->irb_area, irb, sizeof(*irb));
> >  	mutex_unlock(&private->io_mutex);
> >  
> > -	if (private->mdev && is_final)
> > -		private->state = VFIO_CCW_STATE_IDLE;
> 
> Isn't that re-allowing new I/O requests a bit too early?

Hrm... I guess I don't see what work vfio-ccw has left to do that is
preventing it from carrying on. The copying of the IRB data back into
the io_region seems like a flimsy gate to me. But...

It seems you're (rightly) concerned with userspace doing SSCH + SSCH,
whereas I've been focused on the CSCH + SSCH sequence. So with this
change, we're inviting the possibility of a second SSCH being
submitted/started before the IRB data for the first SSCH is copied
(and presumably before userspace is tapped to read that data back).

Sigh... I guess that's not the greatest behavior either. Gotta ruminate
on this.

>  Maybe remember
> that we had a final I/O interrupt for an I/O request and only change
> the state in this case?

As a local flag within this routine? Hrm... I have entirely too many
"Let's try this" branches that didn't work, but I don't see that one
jumping out at me. Will give it a try.
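
For the sake of discussion, here is a rough userspace model of what
such a local flag might look like. It is only a sketch and not the
driver code: the model_* types and names and the cp_is_finished flag
are made up for illustration, the io_mutex is reduced to a comment,
and the mdev check is left out. The idea is to remember that a channel
program was completed, but to go back to IDLE only after the IRB copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Simplified stand-ins for the driver's state and IRB (hypothetical). */
enum model_state { MODEL_IDLE, MODEL_CP_PROCESSING, MODEL_CP_PENDING };

struct model_irb {
	bool solicited;		/* scsw_is_solicited() in the driver */
	bool final;		/* is_final in the driver */
	int payload;		/* placeholder for the IRB contents */
};

struct model_private {
	enum model_state state;
	bool cp_allocated;		/* true until the "cp_free()" step */
	struct model_irb irb_area;	/* stands in for io_region->irb_area */
};

static void model_io_todo(struct model_private *p,
			  const struct model_irb *irb)
{
	bool cp_is_finished = false;	/* local to this invocation */

	assert(p && irb);

	if (irb->solicited && irb->final &&
	    p->state == MODEL_CP_PENDING) {
		p->cp_allocated = false;	/* cp_free() in the driver */
		cp_is_finished = true;
	}

	/* The driver holds io_mutex around this copy. */
	memcpy(&p->irb_area, irb, sizeof(*irb));

	/* Only the interrupt that completed a CP reopens the FSM. */
	if (cp_is_finished)
		p->state = MODEL_IDLE;
}
```

In this model, a final interrupt that arrives while the state is
CP_PENDING frees the CP and returns to IDLE after the copy, while a
final interrupt for a CSCH that arrives while another thread sits in
CP_PROCESSING leaves both the state and the CP untouched.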

> 
> 
> > -
> >  	if (private->io_trigger)
> >  		eventfd_signal(private->io_trigger, 1);
> >  }



