Am 15.03.2018 um 10:20 schrieb Daniel Vetter: > On Tue, Mar 13, 2018 at 06:20:07PM +0100, Christian König wrote: > [SNIP] > Take a look at the DOT graphs for atomic I've done a while ago. I think we > could make a formidable competition for who's doing the worst diagrams :-) Thanks, going to give that a try. > [SNIP] > amdgpu: Expects that you never hold any of the heavywheight locks while > waiting for a fence (since gpu resets will need them). > > i915: Happily blocks on fences while holding all kinds of locks, expects > gpu reset to be able to recover even in this case. In this case I can comfort you, the looks amdgpu needs to grab during GPU reset are the reservation lock of the VM page tables. I have strong doubt that i915 will ever hold those. Could be that we run into problems because Thread A hold lock 1 tries to take lock 2, then i915 holds 2 and our reset path needs 1. > [SNIP] >> Yes, except for fallback paths and bootup self tests we simply never wait >> for fences while holding locks. > That's not what I meant with "are you sure". Did you enable the > cross-release stuff (after patching the bunch of leftover core kernel > issues still present), annotate dma_fence with the cross-release stuff, > run a bunch of multi-driver (amdgpu vs i915) dma-buf sharing tests and > weep? Ok, what exactly do you mean with cross-release checking? > I didn't do the full thing yet, but just within i915 we've found tons of > small little deadlocks we never really considered thanks to cross release, > and that wasn't even including the dma_fence annotation. Luckily nothing > that needed a full-on driver redesign. > > I guess I need to ping core kernel maintainers about cross-release again. > I'd much prefer if we could validate ->invalidate_mapping and the > locking/fence dependency issues using that, instead of me having to read > and understand all the drivers. [SNIP] > I fear that with the ->invalidate_mapping callback (which inverts the > control flow between importer and exporter) and tying dma_fences into all > this it will be a _lot_ worse. And I'm definitely too stupid to understand > all the dependency chains without the aid of lockdep and a full test suite > (we have a bunch of amdgpu/i915 dma-buf tests in igt btw). Yes, that is also something I worry about. Regards, Christian.