On Tue, 2 Apr 2019 12:44:34 -0400 Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > The quiesce function calls cio_cancel_halt_clear() and if we > get an -EBUSY we go into a loop where we: > - wait for any interrupts > - flush all I/O in the workqueue > - retry cio_cancel_halt_clear > > During the period where we are waiting for interrupts or > flushing all I/O, the channel subsystem could have completed > a halt/clear action and turned off the corresponding activity > control bits in the subchannel status word. This means the next > time we call cio_cancel_halt_clear(), we will again start by > calling cancel subchannel and so we can be stuck between calling > cancel and halt forever. > > Rather than calling cio_cancel_halt_clear() immediately after > waiting, let's try to disable the subchannel. If we succeed in > disabling the subchannel then we know nothing else can happen > with the device. Hmm... I had to spend some time looking at the code and it seems to be a bit confused (not your patch, the general logic). Basically, we call the quiesce function for two reasons: - The device is shutdown/removed. We don't want to do anything with the device afterwards, and especially want nobody to start anything new. The subchannel will be disabled, and stay like that. - The mdev reset callback is invoked. When resetting, we basically disable the subchannel (any I/O of course needs to be done prior to that), and then enable it again. In the first case, our goal is to disable the subchannel, and nothing more will be done with it. In the second case, we simply want to perform a reset operation; that is, get the subchannel into a clean state. The disable/enable is just a means to get there. Looking at what the common I/O layer does with the cancel_halt_clear operation, it moves the device into a special quiescing state so that no new I/O will be started. And I think that's how it is intended to be used: If we want to quiesce the subchannel prior to remove/shutdown, fencing off any new I/O is what we want. That means nobody will submit requests we're not aware of. That leaves the reset case, in which disabling the subchannel is only a means to reset the subchannel to a defined state. We definitely want to accept new I/O requests once we're done with the disable/enable dance. That leaves the question of what to do with requests that come in while resetting: Reject them? Queue them for later? Is disable/enable even the best solution here? We basically came up with it because we couldn't think of anything else... (And yes, I also find it confusing that the quiesce function not only clears out pending I/O, but also disables the subchannel...) > > Suggested-by: Eric Farman <farman@xxxxxxxxxxxxx> > Signed-off-by: Farhan Ali <alifm@xxxxxxxxxxxxx> > --- > drivers/s390/cio/vfio_ccw_drv.c | 28 ++++++++++++---------------- > 1 file changed, 12 insertions(+), 16 deletions(-) > > diff --git a/drivers/s390/cio/vfio_ccw_drv.c b/drivers/s390/cio/vfio_ccw_drv.c > index 5aca475..f87757b 100644 > --- a/drivers/s390/cio/vfio_ccw_drv.c > +++ b/drivers/s390/cio/vfio_ccw_drv.c > @@ -43,26 +43,22 @@ int vfio_ccw_sch_quiesce(struct subchannel *sch) > if (ret != -EBUSY) > goto out_unlock; > > + iretry = 255; > do { > - iretry = 255; > > - ret = cio_cancel_halt_clear(sch, &iretry); > - while (ret == -EBUSY) { > - /* > - * Flush all I/O and wait for > - * cancel/halt/clear completion. > - */ > - private->completion = &completion; > - spin_unlock_irq(sch->lock); > + cio_cancel_halt_clear(sch, &iretry); I think you still want to check the return code here... > + /* > + * Flush all I/O and wait for > + * cancel/halt/clear completion. > + */ > + private->completion = &completion; > + spin_unlock_irq(sch->lock); > > - wait_for_completion_timeout(&completion, 3*HZ); > - > - private->completion = NULL; > - flush_workqueue(vfio_ccw_work_q); > - spin_lock_irq(sch->lock); > - ret = cio_cancel_halt_clear(sch, &iretry); > - }; > + wait_for_completion_timeout(&completion, 3*HZ); ...because you don't want to wait for an interrupt that won't arrive (e.g. when xsch was successful, or if the subchannel has become non-operational in the meantime.) > > + private->completion = NULL; > + flush_workqueue(vfio_ccw_work_q); > + spin_lock_irq(sch->lock); > ret = cio_disable_subchannel(sch); > } while (ret == -EBUSY); > out_unlock: Regardless of my comments above, this looks like an improvement over what we have now. Have you been able to observe the issue in real life?