Re: [PATCH v2 1/2] drm: Add GPU reset sysfs event

On 17.03.22 at 18:31, Rob Clark wrote:
On Thu, Mar 17, 2022 at 10:27 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
[SNIP]
(At some point, I'd like to use the scheduler for the replay, and actually
use drm_sched_stop() etc., but last time I looked there were still
some sched bugs in that area which prevented me from deleting a bunch
of code ;-))
Not sure about your hw, but at least on intel replaying tends to just
result in follow-on fun. And that holds even more so the more complex a
workload is. This is why vk just dies immediately and does not try to
replay anything, offloading it to the app. Same with arb robustness.
Afaik it's really only media and classic gl which insist that the driver
stack somehow recover.
At least for us, each submit must be self-contained (ie. not rely on
previous GPU hw state), so in practice replay works out pretty well.
The worst case is that subsequent submits from the same process fail as well
(if they depended on something that the crashing submit failed to write
back to memory), but in that case they just crash as well and we move
on to the next one. The recent gens (a5xx+ at least) are pretty good
about quickly detecting problems and giving us an error irq.

Well I absolutely agree with Daniel.

The whole replay thing AMD did in the scheduler is an absolute mess and should probably be killed with fire.

I strongly recommend not making the same mistake in other drivers.

If you want to have some replay feature, then please make it driver-specific and don't use anything from the infrastructure in the DRM scheduler.

Thanks,
Christian.


BR,
-R

And recovering from a mess in userspace is a lot simpler than trying to
pull off the same magic in the kernel. Plus it also helps with a few of the
dma_fence rules, which is a nice bonus.
-Daniel




