Hi guys, this patch set implements the the requirement for so called gang submissions in the CS interface. A gang submission guarantees that multiple IBs can run on different engines at the same time. This is implemented by keeping a global per-device gang around represented by a dma_fence which signals as soon as all jobs in a gang are pushed to the hardware. The effect is that as long as members of a gang are waiting to be submitted no other gang can start pushing jobs to the hardware and so deadlocks are effectively prevented. The whole set is based on top of my dma_resv_usage work and a few patches merged over from amd-staging-drm-next, so it won't easily apply anywhere. Please review and comment, Christian.