On Wed, 22 Dec 2021 at 01:17, Lucas Stach <l.stach@xxxxxxxxxxxxxx> wrote:
>
> Some GPU heavy test programs manage to trigger the hangcheck quite often.
> If there are no other GPU users in the system and the test program
> exhibits a very regular structure in the commandstreams that are being
> submitted, we can end up with two distinct submits managing to trigger
> the hangcheck with the FE in a very similar address range. This leads
> the hangcheck to believe that the GPU is stuck, while in reality the GPU
> is already busy working on a different job. To avoid those spurious
> GPU resets, also remember and consider the last completed fence seqno
> in the hang check.
>
> Reported-by: Joerg Albert <joerg.albert@xxxxxx>
> Signed-off-by: Lucas Stach <l.stach@xxxxxxxxxxxxxx>

Reviewed-by: Christian Gmeiner <christian.gmeiner@xxxxxxxxx>

--
greets
--
Christian Gmeiner, MSc

https://christian-gmeiner.info/privacypolicy
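
For readers without the patch at hand, the idea in the quoted commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the driver code: `struct hang_state`, `gpu_looks_hung()` and the `FE_WINDOW` value are hypothetical names and numbers chosen for the example. The point it shows is that the check treats the GPU as stuck only when the FE address has stayed in a similar range and the last completed fence seqno has not changed since the previous check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical snapshot taken at each hang check. These fields are
 * assumptions for illustration, not the real driver structures.
 */
struct hang_state {
    uint32_t last_fe_addr;   /* FE address seen at the previous check */
    uint32_t last_completed; /* last completed fence seqno seen then */
};

/* Assumed size of a "very similar address range"; value is made up. */
#define FE_WINDOW 0x80u

/*
 * Returns true if the GPU looks stuck. Progress is assumed if either
 * the FE address left the window or a fence completed since last time;
 * in that case the snapshot is refreshed and false is returned.
 */
static bool gpu_looks_hung(struct hang_state *st, uint32_t fe_addr,
                           uint32_t completed_seqno)
{
    /* Unsigned subtraction: a backward move wraps and counts as moved. */
    bool fe_moved = (uint32_t)(fe_addr - st->last_fe_addr) >= FE_WINDOW;
    bool fence_advanced = completed_seqno != st->last_completed;

    if (fe_moved || fence_advanced) {
        st->last_fe_addr = fe_addr;
        st->last_completed = completed_seqno;
        return false;
    }
    return true;
}
```

With only the FE address comparison, two different submits that happen to be sampled at nearby addresses would be reported as a hang. Comparing the completed seqno as well lets the check see that a job finished in between, which is the spurious-reset case the commit message describes.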