On Tue, Mar 18, 2025 at 01:03:11PM +0100, Christian König wrote:
> Hi guys,
>
> as partially discussed on the list already amdgpu has a bug in it's gang
> submission code.
>
> Basic problem is to add the correct dependency to the gang leader we
> need to arm the other gang members first, but that is a point of no
> return and it is possible that adding the dependencies fails with
> ENOMEM.
>
> Try to fix that by allowing drivers to preallocate dependency slots. Not
> sure if that is a good approach, but of hand I don't see much
> alternative.

I think that's reasonable. In GPUVM we have a similar problem, where we
have to preallocate in order to avoid allocations under a mutex used in
the fence signalling critical path. Unfortunately, this even prevented us
from using the maple tree, since it can't preallocate for multiple entries
ahead of time.
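
To illustrate the preallocate-then-commit pattern discussed above, here is a rough userspace sketch. It is only a model: struct job, job_prealloc_slots(), job_add_dep_prealloced() and demo_gang_submit() are hypothetical names made up for this example, not the drm_sched or amdgpu interface, and plain ints stand in for fences. The idea is that the only step that can return -ENOMEM runs while unwinding is still possible, and the step after the point of no return never allocates.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace model, NOT the real drm_sched job structure. */
struct job {
	int *deps;       /* stand-in for dependency fences */
	size_t count;    /* slots currently in use */
	size_t capacity; /* slots allocated so far */
};

/*
 * Reserve room for 'extra' more dependencies. This is the only function
 * that allocates, so it must be called before the point of no return.
 */
static int job_prealloc_slots(struct job *j, size_t extra)
{
	size_t want = j->count + extra;
	int *p;

	if (want <= j->capacity)
		return 0;

	p = realloc(j->deps, want * sizeof(*p));
	if (!p)
		return -ENOMEM;

	j->deps = p;
	j->capacity = want;
	return 0;
}

/*
 * Fill a previously reserved slot. Never allocates; it only fails with
 * -ENOSPC if the caller forgot to reserve enough slots up front.
 */
static int job_add_dep_prealloced(struct job *j, int dep)
{
	if (j->count >= j->capacity)
		return -ENOSPC;

	j->deps[j->count++] = dep;
	return 0;
}

/* Model of a gang submission: returns the dependency count or -errno. */
static int demo_gang_submit(void)
{
	struct job leader = { 0 };
	const int member_fences[] = { 101, 102, 103 };
	const size_t n = sizeof(member_fences) / sizeof(member_fences[0]);
	int ret;

	/* Can fail, but nothing is armed yet, so bailing out is safe. */
	ret = job_prealloc_slots(&leader, n);
	if (ret)
		return ret;

	/* Point of no return: the gang members would be armed here. */
	for (size_t i = 0; i < n; i++) {
		ret = job_add_dep_prealloced(&leader, member_fences[i]);
		if (ret)
			break; /* would indicate a driver bug, not ENOMEM */
	}

	if (!ret)
		ret = (int)leader.count;
	free(leader.deps);
	return ret;
}
```

The trade-off in this sketch is that the caller has to know an upper bound on the number of dependencies before arming anything, which is the same constraint that makes a structure unable to reserve multiple entries ahead of time a poor fit.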