On Mon, 24 Mar 2014, Daniel Vetter wrote: > On Mon, Mar 24, 2014 at 07:45:47AM +1000, Dave Airlie wrote: > > On Mon, Mar 24, 2014 at 7:27 AM, Andreas Mohr <andi@xxxxxxxx> wrote: > > > On Sun, Mar 23, 2014 at 09:39:16AM -0700, Linus Torvalds wrote: > > >> On Sun, Mar 23, 2014 at 5:15 AM, Andreas Mohr <andi@xxxxxxxx> wrote: > > >> > > > >> > which did end up flawless on 3.12.0-rc2+, too > > >> > (but failed to improve the issue on 3.14.0-rc7+). > > >> > > > >> > So, for all intents and purposes, drm infrastructure seems unavoidably > > >> > (neither dri disable nor libdrm upgrade helps) affected. > > >> > Does anyone know which change caused that issue? > > >> > (I'm asking because bisect here would be relatively painful). > > >> > > >> So 3.12-rc2 works. Does 3.13 work? Is this a regression in the current > > >> 3.14 rc only, or did it happen already in the previous release? > > > > > > Hmm, given that Mikulas in > > > https://lkml.org/lkml/2014/2/26/537 > > > offered a diff of linux-3.13.5 files, it truly seems (shock! ack! noo!) > > > that that indeed may have been a regression at <= 3.13 proper even > > > (which may pose interesting questions about the level of testing coverage > > > we still enjoy [not!?] in this hardware area). That patch drops a mutex, so it is not correct. There is mutex resursion - we need to uninstall the irq in drm_master_destroy, because here we are committed to destroy the device. But the routine that uninstalls the irq takes struct_mutex, which is already held in drm_master_destroy. I suppose that the person who maintains drm reworks the patch so that it's correct: - could we use a different mutex to protect the irq in drm_irq.c? Or possibly no mutex at all and use cmpxchg to manipulate the variable dev->irq_enabled? - this seems like the best solution. But I am not sure if the code in drm_irq.c somehow depends on the facts that other parts of the drm subsystem take struct_mutex. - could we pass a new argument to drm_irq_uninstall that tells it not to take the mutex? drm_master_destroy would set this argument to 1. drm_master_destroy is mostly called with struct_mutex held, but there may be places in vmwgfx_drv.c where drm_master_put (which calls drm_master_destroy) may be called without struct_mutex held. Is it true that drm_master_destroy can be called without struct_mutex held? I don't know. I think drm maintainer should sort out the above issues and modify the patch accordingly. > > > Oh well, seems I'll have to prepare/build 3.13 now... > > > > It's > 15 year old hardware, so yes I believe we have close to 0 > > testing coverage on it outside of distros, > > > > I'm not even sure I have one anymore, I might be able to test an MGA in one box. > > I haven't done a full read of all the related code, but this smells like a > similar bug I've hit all over the place in the i810 driver (another one of > those undead drm drivers of yonders). Ingredients: > > 1) Xorg creates a drm mapping of the register space. > 2) Xorg tells the hw-specific drm which drm mapping has the hw registers, > and the driver uses that. Iirc this has been done as some form of OS > abstraction. Also note that these mappings aren't refcounted, so the first > guy to call drm_rmmap wins. > > -> All hell breaks loose if Xorg dies and takes all it's mappings with it > (in master_destroy, since the Xorg /dev fd is the master) and leaves the > driver hanging in the air if there's an interrupt still pending (or > anything else fwiw). For me that crash happened when xorg exited with a fatal error too. Mikulas _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel