On 23/05/2024 20:31, Jay Cornwall wrote:
On 5/23/2024 13:37, Lancelot SIX wrote:
@@ -622,8 +638,15 @@ L_SAVE_HWREG:
#if ASIC_FAMILY >= CHIP_GFX12
// Ensure no further changes to barrier or LDS state.
+ // STATE_PRIV.BARRIER_COMPLETE may change up to this point.
s_barrier_signal -2
s_barrier_wait -2
+
+ // Re-read final state of BARRIER_COMPLETE field for save.
+ s_getreg_b32 s_save_tmp, hwreg(S_STATUS_HWREG)
+ s_and_b32 s_save_tmp, s_save_tmp, SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK
+ s_andn2_b32 s_save_status, s_save_status, SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK
Even if BARRIER_COMPLETE can be asserted while we are in the trap
handler, I do not think it can be cleared. That being said, it might
be easier to just replace the bit, which also makes the intent clearer.
Yes, I chose to structure it this way to make the intent clearer. We
don't gain much from dropping the s_andn2. Most of the time spent in the
save handler is stalled on memory instructions.
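For readers less familiar with the scalar ALU ops, the "replace the bit" pattern discussed above can be sketched in C. This is only an illustration: the bit position chosen for BARRIER_COMPLETE_MASK is hypothetical, replace_field() is a made-up helper, and the final OR is an assumption because the quoted hunk stops at the s_andn2_b32.

```c
#include <stdint.h>

/* Hypothetical bit position, for illustration only. The real
 * SQ_WAVE_STATE_PRIV_BARRIER_COMPLETE_MASK is defined in the handler source. */
#define BARRIER_COMPLETE_MASK (UINT32_C(1) << 2)

/* Replace the masked field of `saved` with the same field taken from `fresh`.
 * `saved` plays the role of s_save_status, `fresh` the re-read register. */
static uint32_t replace_field(uint32_t saved, uint32_t fresh, uint32_t mask)
{
    uint32_t field = fresh & mask;  /* like s_and_b32   tmp, tmp, MASK       */
    uint32_t kept  = saved & ~mask; /* like s_andn2_b32 status, status, MASK */
    return kept | field;            /* assumed OR; not shown in the quoted hunk */
}
```

If the bit can only go from 0 to 1, the clear step is redundant and a plain OR would give the same result. Keeping it makes the code a true replace rather than a merge, which matches the intent stated above.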
@@ -1351,7 +1369,17 @@ L_SKIP_BARRIER_RESTORE:
s_setreg_b32 hwreg(HW_REG_SHADER_XNACK_MASK), s_restore_xnack_mask
#endif
+#if ASIC_FAMILY < CHIP_GFX12
s_setreg_b32 hwreg(S_TRAPSTS_HWREG), s_restore_trapsts
Wouldn't other gfx1x architectures have a similar issue when writing
TRAPSTS here? That is, if TRAPSTS.SAVECTX is set while we are
restoring, wouldn't we lose it?
And for gfx11, there is TRAPSTS.HOST_TRAP, which could have the same
issue to some degree (not sure if we would lose the host trap
completely, or re-enter with trap ID + HT bit set in ttmp1).
Prior to gfx12, context save and host trap exceptions are not delivered
to a wave until STATUS.PRIV=0, i.e. until it leaves the trap handler.
The changes needed for gfx12 are due to a design change in this area.
Exceptions are now flagged immediately and cause re-entry to the trap if
any are non-zero.
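The hazard under discussion is a read-modify-write race on a status register. Below is a toy C model of it, not the handler's actual logic. SAVECTX_BIT is a hypothetical bit position and both helpers are made up for illustration. restore_preserving() is just one way to avoid the loss and is not necessarily what the patch does for gfx12.

```c
#include <stdint.h>

/* Hypothetical position of an exception flag that hardware may set
 * asynchronously while the wave is still inside the trap handler. */
#define SAVECTX_BIT (UINT32_C(1) << 10)

/* Restore by writing the whole saved value: any flag that became set in
 * the live register after the save is overwritten and lost. */
static uint32_t restore_overwrite(uint32_t live, uint32_t saved)
{
    (void)live; /* live state is ignored, which is the problem */
    return saved;
}

/* Restore everything except the asynchronously set flags, which are
 * taken from the live register instead of the saved copy. */
static uint32_t restore_preserving(uint32_t live, uint32_t saved,
                                   uint32_t pending_mask)
{
    return (saved & ~pending_mask) | (live & pending_mask);
}
```

In this model the overwrite is harmless when flags are only raised once the handler has exited (the pre-gfx12 behaviour described above), because `live` cannot gain a new flag between save and restore. It becomes lossy once flags can be raised immediately.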
Thanks for the clarifications. The patch looks good to me.
Reviewed-by: Lancelot Six <lancelot.six@xxxxxxx>
Best,
Lancelot.