On Sat, Oct 29, 2011 at 03:00:28PM +0200, Christian König wrote:
> Hello everybody,
>
> to support multiple compute rings, async DMA engines and UVD we need
> to teach the radeon kernel module how to sync buffers between
> different rings and make some changes to the command submission
> ioctls.
>
> Since we can't release any documentation about async DMA or UVD
> (yet), my current branch concentrates on getting the additional
> compute rings on cayman running. Unfortunately those rings have
> hardware bugs that can't be worked around, so they are actually not
> very useful in a production environment, but they should do quite
> well for this testing purpose.
>
> The branch can be found here:
> http://cgit.freedesktop.org/~deathsimple/linux/log/
>
> Since some of the patches are quite intrusive, constantly rebasing
> them could get a bit painful. So I would like to see most of the
> stuff included into drm-next, even if we don't make use of the new
> functionality right now.
>
> Comments welcome,
> Christian.

So I have been looking back at all this, and now something is puzzling
me. If the semaphore waits for a non-null value at the GPU address
provided in the packet, then the current implementation of the cs ioctl
doesn't work when there are more than two rings to sync:

http://cgit.freedesktop.org/~deathsimple/linux/commit/?h=multi-ring-testing&id=bae372811c697a889ff0cf9128f52fe914d0fe1b

It uses only one semaphore, so the first ring to finish will mark the
semaphore as done even if the other rings are not done yet.

This all makes me wonder if some change to the cs ioctl would make all
this better. The idea of a semaphore is to wait for some other ring to
finish something, so let's say we have the following scenario:

The application submits the following to ring1: csA, csB
The application now submits to ring2: cs1, cs2

The application wants csA to be done before cs1 runs and csB to be done
before cs2 runs. To achieve such a usage pattern we would need to
return a fence seq (or similar) from the cs ioctl. The user application
would then know the ringid+fence_seq for csA & csB and provide them
when scheduling cs1 & cs2 (rough userspace sketch at the end of this
mail).

Here I am assuming the MEM_WRITE/WAIT_REG_MEM packets are as good as
the MEM_SEMAPHORE packet, i.e. the semaphore packet doesn't give us
much more than MEM_WRITE/WAIT_REG_MEM would. To achieve that, each ring
gets its own fence scratch address where it writes its seq number, and
we then emit WAIT_REG_MEM on that address with the specific seq of the
other ring that needs synchronization (rough kernel-side sketch at the
end of this mail). This would simplify the semaphore code, as we
wouldn't need anything new besides a helper function and maybe
extending the fence structure.

Anyway, I put the updated ring patches at:
http://people.freedesktop.org/~glisse/mrings/

They are rebased on top of Linus' tree and include several space
indentation fixes as well as a fix for the no-semaphore-allocated issue
(patch 5).

Cheers,
Jerome
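
To make the csA/csB vs cs1/cs2 example above a bit more concrete, here
is a rough userspace-side sketch of what returning a fence seq from the
cs ioctl and passing (ring, seq) dependencies back in could look like.
All of the struct names, fields and the hyp_cs_submit() helper are made
up for illustration; this is not the current radeon cs ABI, only the
shape of the idea.

/* Sketch only: none of these structs, fields or helpers exist in the
 * radeon ABI today.  They just illustrate the cs ioctl returning a
 * fence seq and the next submission naming (ring, seq) pairs to wait
 * on, so cs1 only waits on csA and cs2 only waits on csB. */

#include <stdint.h>

struct hyp_cs_dep {
	uint32_t ring;	/* ring that owns the fence */
	uint32_t seq;	/* fence seq to wait for on that ring */
};

struct hyp_cs_request {
	uint32_t ring;			/* ring to submit to */
	uint32_t num_deps;
	struct hyp_cs_dep *deps;	/* what must be done before this cs runs */
	uint32_t out_seq;		/* filled in by the kernel on return */
};

/* stand-in for the real DRM_IOCTL_RADEON_CS call */
int hyp_cs_submit(int fd, struct hyp_cs_request *req);

static void example(int fd)
{
	struct hyp_cs_request csA = { .ring = 1 };
	struct hyp_cs_request csB = { .ring = 1 };

	hyp_cs_submit(fd, &csA);	/* kernel fills csA.out_seq */
	hyp_cs_submit(fd, &csB);	/* kernel fills csB.out_seq */

	/* cs1 depends on csA only, cs2 depends on csB only */
	struct hyp_cs_dep depA = { .ring = 1, .seq = csA.out_seq };
	struct hyp_cs_dep depB = { .ring = 1, .seq = csB.out_seq };
	struct hyp_cs_request cs1 = { .ring = 2, .num_deps = 1, .deps = &depA };
	struct hyp_cs_request cs2 = { .ring = 2, .num_deps = 1, .deps = &depB };

	hyp_cs_submit(fd, &cs1);
	hyp_cs_submit(fd, &cs2);
}

With something like this the kernel only has to turn each (ring, seq)
pair into a wait on the other ring's fence before the cs runs, which is
where the WAIT_REG_MEM idea comes in.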
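
And a rough kernel-side sketch of the helper mentioned above: each ring
already writes its fence seq to its own scratch address, so the ring
that has to wait only needs a WAIT_REG_MEM-style packet polling that
address until it reaches the wanted seq. The opcode/field constants and
the hyp_ring_* names below are simplified assumptions, not the exact
PM4 encoding or the real radeon ring API.

#include <stdint.h>

/* Sketch only: constants and struct fields are assumed/simplified,
 * the real WAIT_REG_MEM encoding and radeon_ring_write() differ. */
#define HYP_OP_WAIT_REG_MEM	0x3c		/* assumed packet opcode */
#define HYP_WAIT_MEM_SPACE	(1u << 4)	/* poll memory, not a register */
#define HYP_WAIT_FUNC_GEQUAL	0x5		/* wait until *addr >= ref */

struct hyp_ring {
	uint32_t id;
	uint64_t fence_gpu_addr;	/* scratch addr this ring writes its fence seq to */
};

/* stand-in for radeon_ring_write(): append one dword to the ring */
void hyp_ring_emit(struct hyp_ring *ring, uint32_t dw);

/* Make 'ring' stall until 'other' has written a fence seq >= 'seq'. */
static void hyp_ring_wait_for_seq(struct hyp_ring *ring,
				  const struct hyp_ring *other,
				  uint32_t seq)
{
	hyp_ring_emit(ring, HYP_OP_WAIT_REG_MEM);
	hyp_ring_emit(ring, HYP_WAIT_MEM_SPACE | HYP_WAIT_FUNC_GEQUAL);
	hyp_ring_emit(ring, (uint32_t)other->fence_gpu_addr);		/* addr lo */
	hyp_ring_emit(ring, (uint32_t)(other->fence_gpu_addr >> 32));	/* addr hi */
	hyp_ring_emit(ring, seq);					/* reference value */
	hyp_ring_emit(ring, 0xffffffff);				/* mask */
	hyp_ring_emit(ring, 0x10);					/* poll interval */
}

The nice property compared to the single shared semaphore in the
current branch is that every dependency is its own (addr, seq) pair, so
three or more rings can be synced without one ring's signal releasing a
wait that was meant for another ring.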