On Fri, 2021-10-08 at 22:28 +0200, Thomas Hellström wrote:
> On Fri, 2021-10-08 at 13:31 -0400, Zack Rusin wrote:
> > This is a largely trivial set that makes vmwgfx support module reload
> > and PCI hot-unplug. It also makes IGT's core_hotunplug pass instead
> > of kernel oops'ing.
> >
> > The one "ugly" change is the "Introduce a new placement for MOB page
> > tables". It seems vmwgfx has been violating a TTM assumption that
> > TTM_PL_SYSTEM buffers are never fenced for a while. Apart from a
> > kernel oops on module unload it didn't seem to wreak too much havoc,
> > but we shouldn't be abusing TTM. So to solve it we're introducing a
> > new placement, which is basically system, but can deal with fenced
> > bo's.
> >
> > Cc: Christian König <christian.koenig@xxxxxxx>
> > Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
>
> Hi, Zack,
>
> What part of TTM doesn't allow fenced system memory currently? It was
> certainly designed to allow that and vmwgfx has been relying on that
> since the introduction of MOBs IIRC. Also i915 is currently relying on
> that.

It's the shutdown. BOs allocated through the ttm system manager might
be busy during ttm_bo_put, which results in them being scheduled for
delayed deletion. The ttm system manager is disabled before the final
delayed deletion is run in ttm_device_fini. This results in crashes
while freeing the BO resources, because they try to remove themselves
from a ttm_resource_manager that no longer exists (e.g. in IGT's
core_hotunplug on vmwgfx).

During review of the trivial patch that was fixing it in ttm, Christian
said that system domain buffers must be idle, or otherwise a number of
assumptions in ttm break:
https://lists.freedesktop.org/archives/dri-devel/2021-September/324027.html

He later clarified that system domain buffers being fenced is in fact
illegal from a design point of view:
https://lists.freedesktop.org/archives/dri-devel/2021-September/324697.html

z
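
To make the ordering problem above concrete, here is a minimal userspace
sketch of it. It is only a toy model: every toy_* name is hypothetical and
none of this is the real TTM code or its API. It assumes a single BO and a
single manager, and it reports the "manager already gone" case as an error
return where the real kernel would oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the real TTM structures. */
struct toy_manager {
	bool in_use;       /* false once the manager has been disabled */
	int nr_resources;  /* resources still tracked by the manager */
};

struct toy_bo {
	struct toy_manager *man;
	bool fenced;           /* BO is still busy */
	bool on_delayed_list;  /* destruction was deferred */
	bool freed;
};

/* Returns 0 on success, -1 if the manager is already disabled. */
static int toy_resource_free(struct toy_bo *bo)
{
	assert(bo != NULL && bo->man != NULL);
	if (!bo->man->in_use)
		return -1;  /* stands in for the crash described in the mail */
	bo->man->nr_resources--;
	bo->freed = true;
	return 0;
}

/* Dropping the last reference: idle BOs are freed now, busy ones deferred. */
static int toy_bo_put(struct toy_bo *bo)
{
	if (bo->fenced) {
		bo->on_delayed_list = true;
		return 0;
	}
	return toy_resource_free(bo);
}

/* Final delayed-delete pass; assumes the fence has signaled by now. */
static int toy_delayed_delete(struct toy_bo *bo)
{
	if (!bo->on_delayed_list)
		return 0;
	bo->on_delayed_list = false;
	bo->fenced = false;
	return toy_resource_free(bo);
}

/*
 * Device teardown. With flush_first == false the manager is disabled
 * before the delayed-delete pass, which is the order described above.
 */
static int toy_device_fini(struct toy_manager *man, struct toy_bo *bo,
			   bool flush_first)
{
	int ret;

	if (flush_first) {
		ret = toy_delayed_delete(bo);
		man->in_use = false;
		return ret;
	}
	man->in_use = false;
	return toy_delayed_delete(bo);
}
```

In this model a fenced BO that goes through toy_bo_put() and then
toy_device_fini(..., false) fails in toy_resource_free(), while a BO that
was idle at put time never reaches the delayed list and is unaffected by
the teardown order.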