Re: [PATCH 0/3] drm/i915: Fix harmfull driver register/unregister assymetry

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Andi,

On Wednesday, 12 February 2025 16:32:18 CET Andi Shyti wrote:
> Hi Janusz,
> 
> On Tue, Feb 11, 2025 at 01:12:37PM +0100, Janusz Krzysztofik wrote:
> > On Monday, 10 February 2025 14:01:19 CET Andi Shyti wrote:
> > > On Thu, Feb 06, 2025 at 07:07:38PM +0100, Janusz Krzysztofik wrote:
> > > > We return immediately from i915_driver_register() if drm_dev_register()
> > > > fails, skipping remaining registration steps.  However, the _unregister()
> > > > counterpart called at device remove knows nothing about that skip and
> > > > executes reverts for all those steps.  For that to work correctly, those
> > > > revert functions must be resistant to being called even on uninitialized
> > > > objects, or we must not skip their initialization.
> > > > 
> > > > Three cases have been identified and fixes proposed.  Call traces are
> > > > taken from CI results of igt@i915_driver_load@reload-with-fault-injection
> > > > execution, reported to several separate Gitlab issues (links provided).
> > > > 
> > > > Immediate return was introduced to i915_driver_register() by commit
> > > > ec3e00b4ee27 ("drm/i915: stop registering if drm_dev_register() fails"),
> > > > however, quite a few things have changed since then.  That's why I haven't
> > > > mentioned it in a Fixes: tag to avoid it being picked up by stable, which
> > > > I haven't tested.
> > > 
> > > I'm not fully convinced about this series as I think that you are
> > > fixing a subset of what needs to be handled properly. What about
> > > hwmon? What about gt? what about debugfs?
> > 
> > For all of those, their _unregister() functions seem to be designed to be safe 
> > to call even if not registered.  Like e.g. kfree() -- you can call it safely 
> > even with NULL argument, you don't need to check for NULL and call it 
> > conditionally.  However, ...
> > 
> > > 
> > > In my opinion we need to check in _unregister whether the
> > > drm_dev_register has succeded 
> > 
> > I agree with you that it would be more clear if we skipped not only 
> > _register() but also _unregister() steps symmetrically, based on result of 
> > drm_dev_register().
> > 
> > > and one way would be, e.g., to
> > > check for the drm minor value, or even set the drm device tu NULL
> > > (first things that come to my mind, maybe there are smarter ways
> > > of doing it). 
> > 
> > As long as drm doesn't provide explicit support for checking if registration 
> > succeeded other than examining the return value of drm_dev_register(), I would 
> > rather store that value somewhere in our drm_i915_private structure instead of 
> > depending on drm internals.  What do you think?
> 
> yes, I think we could have a local flag.

OK, I've tried that approach, but that revealed issues with some unregister 
steps still needed for clean driver unbind even if their counterpart register 
steps were skipped on probe.  Then, to make things working cleanly, we need to 
review and test each register/unregister pair of functions for their potential 
asymmetry, detect potential issues, invent and implement solutions, and only 
then we may feel free to skip unregister steps safely if their register 
counterparts were not called.  And that scope doesn't sound like a quick fix, 
especially as those problematic register/unregister steps may belong to 
different driver features, then may need attention and participation of more 
than one team.

With that in mind, what steps you think we should take now?

Thanks,
Janusz

> 
> Andi
> 
> > Thanks,
> > Janusz
> > 
> > 
> > > This way we could skip some of the _unregister()
> > > steps.
> > > 
> > > Andi
> > > 
> > 
> > 
> > 
> 







[Index of Archives]     [AMD Graphics]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux