On Tue, Jan 06, 2015 at 02:14:27PM +0000, Cheng, Yao wrote: > > -----Original Message----- > > From: Daniel Vetter [mailto:daniel.vetter@xxxxxxxx] On Behalf Of Daniel > > Vetter > > Sent: Monday, January 5, 2015 16:40 > > To: Cheng, Yao > > Cc: Daniel Vetter; Thierry Reding; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; dri- > > devel@xxxxxxxxxxxxxxxxxxxxx; Kelley, Sean V; Chehab, John; > > emil.l.velikov@xxxxxxxxx; Jiang, Fei; Beckett, Robert; Barbalho, Rafael > > Subject: Re: [RFC PATCH v3 4/4] tests/drv_module_reload: add ipvr support > > > > On Sun, Dec 21, 2014 at 02:40:24PM +0000, Cheng, Yao wrote: > > > > -----Original Message----- > > > > From: Daniel Vetter [mailto:daniel.vetter@xxxxxxxx] > > > > Sent: Thursday, December 18, 2014 19:21 > > > > To: Thierry Reding > > > > Cc: Cheng, Yao; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; dri- > > > > devel@xxxxxxxxxxxxxxxxxxxxx; Kelley, Sean V; Chehab, John; > > > > emil.l.velikov@xxxxxxxxx; Jiang, Fei > > > > Subject: Re: [RFC PATCH v3 4/4] tests/drv_module_reload: add ipvr > > > > support > > > > > > > > On Thu, Dec 18, 2014 at 11:04 AM, Thierry Reding > > > > <thierry.reding@xxxxxxxxx> > > > > wrote: > > > > >> I double checked the symptom and found it was a deadlock on > > > > drm_global_mutex. > > > > >> When i915_driver_load() registers the platform device while ipvr > > > > >> module > > > > is in the system, ipvr's probe() function tries to lock > > > > drm_global_mutex which was already held by i915. > > > > >> I think either of the following 2 actions need to be moved to a > > > > >> bottom half > > > > e.g. a work queue: > > > > >> platform_device_add () call in i915_ved.c (called during > > > > i915_driver_load()) > > > > >> drm_dev_register() call during ipvr's probe() Which one > > > > >> makes more sense? pls kindly advise (I personally prefer the former > > one.). > > > > > > > > > > Yes, that's somewhat ugly, but I don't see a way around that. I'd > > > > > also think that moving platform_device_add() to a workqueue would > > > > > be the best option here. > > > > > > > > Or we simply kill drm_global_mutex for platform drivers that don't > > > > use the - > > > > >probe hook. It should work when they have a correct order betwen > > > > drm_dev_alloc and _register and all the code in between. So just > > > > ditch the - > > > > >load callback in teh ipvr driver and rework the load sequence as > > > > >suggested > > > > somewhere else and this is fixed already. No need for bottom halfs I > > think. > > > > > > Daniel, sorry I didn't quite understand "platform drivers that don't > > > use the probe hook". For initialization, the ipvr platform driver's > > > probe() is called in following 2 possible paths: > > > 1. ipvr installed before i915. In this case, ipvr's probe() is called > > > inside i915_driver_load() and falls into the drm_global_mutex dead lock. > > > 2. i915 installed before ipvr. In this case, ipvr's probe() is called > > > without drm_global_mutex held by i915 and no dead lock issue. > > > If we kill drm_global_mutex, will path 2 run into issue? And in your > > > suggestion, how to rework the load sequence? Do you mean calling > > > ipvr's > > > load() callback directly during platform driver probe()? > > > > Hm right it's not that simple really. What we need in more detail is: > > - Move the mutex_lock(&drm_global_mutex) out of drm_dev_register into > > all the callers. If a driver has a ->load() callback it most likely is > > racy with the usual load ordering issues. > > > > - Rework ipvr to no longer have a ->load callback. Insteaed use the > > following sequence (in the platform ->probe callback): > > > > drm_dev_alloc(); > > ipvr_load(); > > drm_dev_register(); > > > > With that ordering we don't need the additional guarantees that > > drm_global_mutex provides and we can avoid to take that lock around > > drm_dev_registrer() call in the ipvr code. > > Thanks for the detailed explanation, Daniel! > That sounds to be a small refactor on drm core, and need change many drm drivers: nouveau, tegra, udl. > Should it be a standalone RFC patch? I think the locking shuffling should be doable in just one patch, but definitely needs to be split out. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel