On Mon, 10 Aug 2020 at 22:20, Christian König <christian.koenig@xxxxxxx> wrote:
>
> On 07.08.20 at 09:02, Christian König wrote:
> > On 06.08.20 at 20:50, Roland Scheidegger wrote:
> >> On 06.08.20 at 17:28, Christian König wrote:
> >>> My best guess is that you are facing two separate bugs here.
> >>>
> >>> Crash #1 is somehow related to CRTCs and might even be caused by the
> >>> atomic-helper change you noted below.
> >>>
> >>> Crash #2 is caused because vmw_bo_create_and_populate() tries to
> >>> manually populate a BO object instead of relying on TTM to do it when
> >>> necessary. This indeed doesn't work any more because of "drm/ttm: make
> >>> TT creation purely optional v3".
> >>>
> >>> Question is why vmwgfx is doing this?
> >> Not really sure unfortunately, it's possible vmwgfx is doing it because
> >> ttm lacked some capabilities at some point?
> >
> > I think so as well, yes.
> >
> >> Trying to figure this one out...
> >
> > Problem is that what vmwgfx is doing here is questionable at best.
> >
> > By definition BOs in the SYSTEM domain are not accessible by the GPU,
> > even if it is a virtual one.
> >
> > And what vmwgfx does is allocating one in the SYSTEM domain as not
> > evictable and then bypassing TTM in filling and mapping it to the GPU.
> >
> > That doesn't really make sense to me, why shouldn't that BO be put in
> > the GTT domain then in the first place?
>
> Well I think I figured out what VMWGFX is doing here, but you won't like it.
>
> See VMWGFX doesn't support TTM's GTT domain. So to implement the mob and
> otable BOs it is allocating system domain BOs, pinning them and manually
> filling them with pages.
>
> The correct fix would be to audit VMWGFX and fix this handling so that
> it doesn't mess any more with TTM internal object state.
>
> Till that happens we can only revert the patch for now.

Probably good to do, at least we know the problem now.

However, I found myself in the same place yesterday, so we should
discuss how to fix it going forward.

At least on Intel IGPs you have GTT and PPGTT (per-process table). GTT
on later hw is only needed for certain objects, like scanout etc. Not
every object needs to be in the GTT domain. But when you get an
execbuffer and you want to bind the PPGTT objects, you need to move
the object to the GTT domain, which is pointless and suboptimal, since
the GTT domain could fill up and start needing evictions.

So the option is to get SYSTEM domain objects, only move them to
TTM_PL_TT when pinning for scanout etc., and otherwise generate the
page lists from the objects. While playing around I've hacked up a TT
create/populate path, with no bind.

Dave.

I have hardware that has no requirement for all objects to be in the
TT domain, but still has a TT domain.
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel