Hi all,
it has come up in other threads, so it might be worthwhile to have its own
topic:
Userspace command aborts
As it stands we cannot abort I/O commands from userspace.
This is hitting us when running in a virtual machine:
The VM sets a timeout when submitting a command, but that
information can't be transmitted to the VM host. The VM host
then issues a different command (with another timeout), and
again that timeout can't be transmitted to the attached devices.
So when the VM detects a timeout, it will try to issue an abort,
but that goes nowhere as the VM host has no way to abort commands
from userspace.
So in the end the VM has to wait for the command to complete, causing
stalls in the VM whenever the host has to go through error recovery or similar.
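To make the stall concrete, here is a minimal sketch of the situation described above. It is a toy model, not a description of any real hypervisor or driver: the function name, the parameters and the numbers (30 s guest timeout, 90 s host-side recovery) are all made up for illustration, and it assumes an abort that works takes effect instantly, which ignores the races real aborts have.

```python
def guest_wait_s(guest_timeout_s, device_completion_s, host_can_abort):
    """Time (in seconds) the guest waits for one command in this toy model.

    All names and values are hypothetical; an abort, when available,
    is assumed to take effect immediately.
    """
    if device_completion_s <= guest_timeout_s:
        # Command completed before the guest timeout fired.
        return device_completion_s
    if host_can_abort:
        # The guest's abort is passed down and ends the wait at the timeout.
        return guest_timeout_s
    # The abort goes nowhere, so the guest waits for the real completion.
    return device_completion_s


# Hypothetical numbers: 30 s guest timeout, 90 s of error recovery on the host.
stall_today = guest_wait_s(30.0, 90.0, host_can_abort=False)
stall_with_abort = guest_wait_s(30.0, 90.0, host_can_abort=True)
print(stall_today, stall_with_abort)
```

In this model the guest timeout only bounds the wait if the host can actually act on the abort; without that, the guest-side timeout value has no effect on how long the guest is stalled.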
Aborts are racy. A lot of hardware implements these as a no-op, too.
Indeed.
With io_uring or CDL we now have some mechanisms which look as if they
would allow us to implement command aborts.
CDL on the other hand sounds more promising.
So this BoF will be around discussions on how aborts from userspace could be
implemented, whether any of the above methods are suitable, or whether there
are other ideas on how that could be done.
I do not understand what the relationship between aborts and CDL is.
Sounds to me that this would tie in to something like Time Limited Error
Recovery (TLER) and LR bit set based on ioprio?
I am unclear on where aborts come into play here.