On 29.05.24 at 16:48, Li, Yunxiang (Teddy) wrote:
Yeah, I know. That's one of the reasons I pointed out, on the patch adding
that, that this behavior is actually completely broken.
If you run into issues with the MES because of this then please suggest a
revert of that patch.
I think it just needs to be improved to allow this force-signal behavior. The current behavior is slow/inconvenient, but the old behavior is wrong, since MES will continue processing submissions even when one submission has failed. So with just one fence location there's no way to tell whether a command failed or not.
No, the MES behavior is broken. When a submission fails it should stop
processing or signal that the operation didn't complete through some
other mechanism.
Just not writing the fence and continuing results in tons of problems,
from the TLB fence all the way to the ring buffer and reset handling.
This is a hard requirement and really can't be changed.
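A minimal sketch of the problem, using hypothetical names (fake_mes,
fake_mes_process, fake_fence_signaled) rather than the actual amdgpu or MES
code. It assumes a single fence location that is written only on success and
a waiter that treats "fence value >= my sequence number" as completion:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not the real MES interface: one fence slot shared by
 * all submissions, written by the firmware only when a submission succeeds. */
struct fake_mes {
    uint64_t fence;
};

/* Firmware side: on failure, skip the fence write and simply carry on. */
static void fake_mes_process(struct fake_mes *mes, uint64_t seq, bool ok)
{
    if (ok)
        mes->fence = seq;
}

/* Driver side: a submission counts as done once the fence reaches its seq. */
static bool fake_fence_signaled(const struct fake_mes *mes, uint64_t seq)
{
    return mes->fence >= seq;
}

/* Scenario: submissions 1 and 3 succeed, submission 2 fails. */
static void fake_mes_demo(void)
{
    struct fake_mes mes = { .fence = 0 };

    fake_mes_process(&mes, 1, true);
    fake_mes_process(&mes, 2, false);
    /* The failed submission 2 never signals; its waiter can only time out. */
    assert(!fake_fence_signaled(&mes, 2));

    fake_mes_process(&mes, 3, true);
    /* Now the failed submission 2 looks as if it had completed. */
    assert(fake_fence_signaled(&mes, 2));
}
```

In this model the waiter for submission 2 either hangs until a timeout or is
later "completed" by submission 3, and in neither case does it learn that its
own command failed.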
Regards,
Christian.