On 04/23/2019 01:42 PM, Halil Pasic wrote:
One thing I'm confused about is, that we don't seem to prevent
new I/O being submitted. That is we could still loop indefinitely
if we get new IO after the 'kill I/O on the subchannel' is done but
before the msch() with the disable is issued.
So the quiesce function will be called in the remove, release functions
and also in the mdev reset callback via an ioctl VFIO_DEVICE_RESET.
Now the release function is invoked in cases when we hot unplug the
device or the guest is gone (or anything that will close the vfio mdev
file descriptor, I believe). In such scenarios it's really the userspace
which is asking to release the device. Similar for remove, where the
user has to explicitly write to the remove file for the mdev to invoke
it. Under normal conditions no sane userspace should be doing
release/remove while there are still on going I/Os :)
Me and Conny had some discussion on this in v1 of this patch:
https://marc.info/?l=kvm&m=155437117823248&w=2
The 'flush all I/O' parts in the commit message and in the code make
this even more confusing.
Maybe...if it's too confusing it could be fixed, but IMHO I don't think
it's a dealbreaker. If anyone else thinks otherwise, I can go ahead and
change it.
Another thing that I don't quite understand is injecting interrupts
into QEMU for stuff that is actually not guest initiated.
As mentioned above under normal conditions we shouldn't be doing
quiesce. But wouldn't those interrupts just be unsolicited interrupts
for the guest?
Furthermore I find how cio_cancel_halt_clear() quite confusing. We
usually return EBUSY to say that the function was suppressed because,
well, the resource is busy. For cio_cancel_halt_clear() EBUSY means:
* either a xsch() was effectively suppressed because status pending
* or an hsch() or csch() was actually successfully executed, and the
client code is supposed to wait for the assync clear/halt function
to complete. This gets even more confusing when one reads:
/**
* cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt
* and clear ordinally if subchannel is valid.
* @sch: subchannel on which to perform the cancel_halt_clear operation
* @iretry: the number of the times remained to retry the next operation
*
* This should be called repeatedly since halt/clear are asynchronous
This is simply not true. Because halt/clear is async, we may not know
if we managed to accomplish canceling the running I/O. But the next
call of this function won't detect and say 'yep it worked, we are good
now' and return 0. This is the responsibility of the caller.
If it were like that, the current code would have been fine!
* operations. We do one try with cio_cancel, three tries with cio_halt,
* 255 tries with cio_clear. The caller should initialize @iretry with
* the value 255 for its first call to this, and keep using the same
* @iretry in the subsequent calls until it gets a non -EBUSY return.
*
* Returns 0 if device now idle, -ENODEV for device not operational,
* -EBUSY if an interrupt is expected (either from halt/clear or from a
* status pending), and -EIO if out of retries.
*/
int cio_cancel_halt_clear(struct subchannel *sch, int *iretry
TL;DR:
I welcome this batch (with an r-b) but I would like the commit message
and the comment changed so that the misleading 'flush all I/O in the
workqueue'.
I think 'vfio-ccw: fix cio_cancel_halt_clear() usage' would reflect the
content of this patch better, because reasoning about the upper limit,
and what happens if this upper limit is hit is not what this patch is
about. It is about a client code bug that rendered iretry ineffective.
I politely disagree with the change in subject line. I think the current
subject line describe what we are trying to prevent with this patch. But
again if anyone else feels otherwise, I will go ahead and change :)
Thanks
Farhan
Regards,
Halil