On Tue, Jun 20, 2017 at 1:46 PM, Christian König <deathsimple at vodafone.de> wrote: > Am 20.06.2017 um 12:34 schrieb Marek Olšák: >> >> BTW, I noticed the flush sequence in the kernel is wrong. The correct >> flush sequence should be: >> >> 1) EVENT_WRITE_EOP - CACHE_FLUSH_AND_INV_TS - write a dword to memory, >> but no fence/interrupt. >> 2) WAIT_REG_MEM on the dword to wait for idle before SURFACE_SYNC. >> 3) SURFACE_SYNC (TC, K$, I$) >> 4) Write CP_COHER_CNTL2. >> 5) EVENT_WRITE_EOP - BOTTOM_OF_PIPE_TS - write the fence with the >> interrupt. >> >> WAIT_REG_MEM wouldn't be needed if we were able to merge >> CACHE_FLUSH_AND_INV, SURFACE_SYNC, and CP_COHER_CNTL2 into one EOP >> event. >> >> The main issue with the current flush sequence in radeon and amdgpu is >> that it doesn't wait for idle before writing CP_COHER_CNTL2 and >> SURFACE_SYNC. So far we've been able to avoid the bug by waiting for >> idle in userspace IBs. > > > Well not waiting for idle between IBs is an explicit requirement, because it > is rather bad for performance to do so. > > David Zhou, Monk and I worked quite a lot on this to avoid both possible > hazard and performance drop. I guess the requirement was ignored for SI. If you don't do the TC flush as part the EOP event, you have to wait for idle before SURFACE_SYNC, because SURFACE_SYNC doesn't wait for idle. It's kinda useless to flush TC when shaders are still in flight. Marek