On 10/29/2021 3:26 AM, Jason Gunthorpe wrote:
On Thu, Oct 28, 2021 at 05:08:11PM +0200, Cornelia Huck wrote:
that should go in right now. Actually, I'd already consider it too late
even if we agreed now; I would expect a change like this to get at least
two weeks in linux-next before the merge window.
Usually linux-next is about sorting out integration problems so we
have an orderly merge window. Nobody is going to test this code just
because it is in linux-next, it isn't mm or something with coverage
there.
Right. In addition, the series has been on the list for over a month to
let people review and comment.
V5 has received no specific comments, so I believe we are at a good
point to move forward with it.
Yes, if qemu becomes deployed, but our testing shows qemu support
needs a lot of work before it is deployable, so that doesn't seem to
be an immediate risk.
Do you have any patches/problem reports you can share?
Yishai has some stuff, he was doing failure injection testing and
other interesting things. I think we are hoping to start looking at
it.
Correct, I encountered some QEMU segfaults during failure injection / error flows.
For example,
- Unbinding the VF then trying to run savevm.
- Moving the mlx5 device to the ERROR state. I expected to see some
recovery flow from QEMU, such as calling the RESET ioctl to get the
device running again; however, QEMU crashed.
- etc.
Yes, we have plans to start looking at it.
If you already identified that there is work to be done in QEMU, I think
that speaks even more for delaying this. What if we notice that uapi
changes are needed while fixing QEMU?
I don't think it is those kinds of bugs.
Right, it doesn't seem that uapi changes are required; we need to debug
and fix QEMU.
Yishai