On Tue, Jun 27, 2023 at 5:31 PM André Almeida <andrealmeid@xxxxxxxxxx> wrote:
Hi Marek,
On 27/06/2023 15:57, Marek Olšák wrote:
> On Tue, Jun 27, 2023, 09:23 André Almeida <andrealmeid@xxxxxxxxxx> wrote:
>
> +User Mode Driver
> +----------------
> +
> +The UMD should check before submitting new commands to the KMD if the device has
> +been reset, and this can be checked more often if the UMD requires it. After
> +detecting a reset, UMD will then proceed to report it to the application using
> +the appropriate API error code, as explained in the section below about
> +robustness.
>
>
> The UMD won't check the device status before every command submission
> due to ioctl overhead. Instead, the KMD should skip command submission
> and return an error that it was skipped.
I wrote like this because when reading the source code for
vk::check_status()[0] and Gallium's si_flush_gfx_cs()[1], I was under
the impression that UMD checks the reset status before every
submission/flush.
It only does that before every command submission when the context is robust. When it's not robust, radeonsi doesn't do anything.
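
To make the distinction concrete, here is a minimal sketch of that behaviour. It is not radeonsi code: `fake_ctx`, `query_reset_status()` and `flush()` are hypothetical names, and the counters only simulate the cost of a status ioctl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from Mesa or the kernel. */
enum reset_status { RESET_NONE, RESET_DETECTED };

struct fake_ctx {
    bool robust;                 /* application asked for a robust context */
    enum reset_status hw_status; /* what a status query would report */
    int status_queries;          /* counts simulated status ioctls */
    int submissions;             /* counts simulated command submissions */
};

/* Simulates the (comparatively expensive) reset-status query. */
static enum reset_status query_reset_status(struct fake_ctx *ctx)
{
    assert(ctx != NULL);
    ctx->status_queries++;
    return ctx->hw_status;
}

/* Returns true if work was submitted, false if a reset was reported instead. */
static bool flush(struct fake_ctx *ctx)
{
    assert(ctx != NULL);

    /* Only robust contexts pay for the status query before submitting. */
    if (ctx->robust && query_reset_status(ctx) != RESET_NONE)
        return false;

    ctx->submissions++;
    return true;
}
```

In this sketch a non-robust context never issues the status query and always submits, while a robust context issues one query per flush and stops submitting once a reset is seen.
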
Is your comment about how things are currently implemented, or how
they would ideally work? Either way I can apply your suggestion, I just
want to make it clear.
Yes. Ideally, we would get the reply whether the context is lost from the CS ioctl. This is not currently implemented.
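
A rough sketch of that ideal flow, assuming the submission call itself reports the lost context. Everything here is hypothetical: `fake_cs_submit()` stands in for the CS ioctl, and `-ECANCELED` is only a placeholder, since no error code has been agreed on.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical kernel-side view of a context. */
struct fake_kmd_ctx {
    bool lost;    /* set once a reset has invalidated the context */
    int executed; /* counts submissions that actually ran */
};

/*
 * Simulated submission entry point: returns 0 on success or a negative
 * errno. The choice of -ECANCELED is an assumption for illustration.
 */
static int fake_cs_submit(struct fake_kmd_ctx *ctx)
{
    assert(ctx != NULL);

    if (ctx->lost)
        return -ECANCELED; /* submission skipped, nothing reaches the GPU */

    ctx->executed++;
    return 0;
}

enum umd_result { UMD_OK, UMD_CONTEXT_LOST };

/* UMD side: no separate status query, the submit result carries the answer. */
static enum umd_result umd_flush(struct fake_kmd_ctx *ctx)
{
    int ret = fake_cs_submit(ctx);

    return ret == -ECANCELED ? UMD_CONTEXT_LOST : UMD_OK;
}
```

The point of this shape is that the UMD learns about the reset from a call it has to make anyway, so there is no extra ioctl per submission.
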
Marek
[0] https://elixir.bootlin.com/mesa/mesa-23.1.3/source/src/vulkan/runtime/vk_device.h#L142
[1] https://elixir.bootlin.com/mesa/mesa-23.1.3/source/src/gallium/drivers/radeonsi/si_gfx_cs.c#L83
>
> The only case where that won't be applicable is user queues where
> drivers don't call into the kernel to submit work, but they do call into
> the kernel to create a dma_fence. In that case, the call to create a
> dma_fence can fail with an error.
>
> Marek