> -----Original Message----- > From: De Marchi, Lucas <lucas.demarchi@xxxxxxxxx> > Sent: Wednesday, September 11, 2024 9:49 PM > To: Bommu, Krishnaiah <krishnaiah.bommu@xxxxxxxxx> > Cc: Vivi, Rodrigo <rodrigo.vivi@xxxxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx; intel- > gfx@xxxxxxxxxxxxxxxxxxxxx; Kamil Konieczny <kamil.konieczny@xxxxxxxxxxxxxxx>; > Ceraolo Spurio, Daniele <daniele.ceraolospurio@xxxxxxxxx>; Upadhyay, Tejas > <tejas.upadhyay@xxxxxxxxx>; Tvrtko Ursulin <tursulin@xxxxxxxxxxx>; Joonas > Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>; Nikula, Jani > <jani.nikula@xxxxxxxxx>; Thomas Hellström > <thomas.hellstrom@xxxxxxxxxxxxxxx>; Teres Alexis, Alan Previn > <alan.previn.teres.alexis@xxxxxxxxx>; Winkler, Tomas > <tomas.winkler@xxxxxxxxx>; Usyskin, Alexander > <alexander.usyskin@xxxxxxxxx>; linux-modules@xxxxxxxxxxxxxxx; Luis > Chamberlain <mcgrof@xxxxxxxxxx> > Subject: Re: [PATCH v2] drm: Ensure Proper Unload/Reload Order of MEI > Modules for i915/Xe Driver > > + linux-modules > + Luis > > On Wed, Sep 11, 2024 at 01:00:47AM GMT, Bommu, Krishnaiah wrote: > > > > > >> -----Original Message----- > >> From: De Marchi, Lucas <lucas.demarchi@xxxxxxxxx> > >> Sent: Tuesday, September 10, 2024 9:13 PM > >> To: Vivi, Rodrigo <rodrigo.vivi@xxxxxxxxx> > >> Cc: Bommu, Krishnaiah <krishnaiah.bommu@xxxxxxxxx>; intel- > >> xe@xxxxxxxxxxxxxxxxxxxxx; intel-gfx@xxxxxxxxxxxxxxxxxxxxx; Kamil > >> Konieczny <kamil.konieczny@xxxxxxxxxxxxxxx>; Ceraolo Spurio, Daniele > >> <daniele.ceraolospurio@xxxxxxxxx>; Upadhyay, Tejas > >> <tejas.upadhyay@xxxxxxxxx>; Tvrtko Ursulin <tursulin@xxxxxxxxxxx>; > >> Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx>; Nikula, Jani > >> <jani.nikula@xxxxxxxxx>; Thomas Hellström > >> <thomas.hellstrom@xxxxxxxxxxxxxxx>; Teres Alexis, Alan Previn > >> <alan.previn.teres.alexis@xxxxxxxxx>; Winkler, Tomas > >> <tomas.winkler@xxxxxxxxx>; Usyskin, Alexander > >> <alexander.usyskin@xxxxxxxxx> > >> Subject: Re: [PATCH v2] drm: Ensure Proper Unload/Reload Order of MEI > >> Modules for i915/Xe Driver > >> > >> On Tue, Sep 10, 2024 at 11:03:30AM GMT, Rodrigo Vivi wrote: > >> >On Mon, Sep 09, 2024 at 09:33:17AM +0530, Bommu Krishnaiah wrote: > >> >> This update addresses the unload/reload sequence of MEI modules in > >> >> relation to the i915/Xe graphics driver. On platforms where the > >> >> MEI hardware is integrated with the graphics device (e.g., > >> >> DG2/BMG), the i915/xe driver is depend on the MEI modules. > >> >> Conversely, on newer platforms like MTL and LNL, where the MEI > >> >> hardware is separate, this > >> dependency does not exist. > >> >> > >> >> The changes introduced ensure that MEI modules are unloaded and > >> >> reloaded in the correct order based on platform-specific > >> >> dependencies. This is achieved by adding a MODULE_SOFTDEP > >> >> directive to > >> the i915 and Xe module code. > >> > >> > >> can you explain what causes the modules to be loaded today? Also, is > >> this to fix anything related to *loading* order or just unload? > >> > >> >> > >> >> These changes enhance the robustness of MEI module handling across > >> >> different hardware platforms, ensuring that the i915/Xe driver can > >> >> be cleanly unloaded and reloaded without issues. > >> >> > >> >> v2: updated commit message > >> >> > >> >> Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@xxxxxxxxx> > >> >> Cc: Kamil Konieczny <kamil.konieczny@xxxxxxxxxxxxxxx> > >> >> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> > >> >> Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx> > >> >> Cc: Tejas Upadhyay <tejas.upadhyay@xxxxxxxxx> > >> >> --- > >> >> drivers/gpu/drm/i915/i915_module.c | 2 ++ > >> >> drivers/gpu/drm/xe/xe_module.c | 2 ++ > >> >> 2 files changed, 4 insertions(+) > >> >> > >> >> diff --git a/drivers/gpu/drm/i915/i915_module.c > >> >> b/drivers/gpu/drm/i915/i915_module.c > >> >> index 65acd7bf75d0..2ad079ad35db 100644 > >> >> --- a/drivers/gpu/drm/i915/i915_module.c > >> >> +++ b/drivers/gpu/drm/i915/i915_module.c > >> >> @@ -75,6 +75,8 @@ static const struct { }; static int > >> >> init_progress; > >> >> > >> >> +MODULE_SOFTDEP("pre: mei_gsc_proxy mei_gsc"); > >> >> + > >> >> static int __init i915_init(void) { > >> >> int err, i; > >> >> diff --git a/drivers/gpu/drm/xe/xe_module.c > >> >> b/drivers/gpu/drm/xe/xe_module.c index bfc3deebdaa2..5633ea1841b7 > >> >> 100644 > >> >> --- a/drivers/gpu/drm/xe/xe_module.c > >> >> +++ b/drivers/gpu/drm/xe/xe_module.c > >> >> @@ -127,6 +127,8 @@ static void xe_call_exit_func(unsigned int i) > >> >> init_funcs[i].exit(); > >> >> } > >> >> > >> >> +MODULE_SOFTDEP("pre: mei_gsc_proxy mei_gsc"); > >> > > >> >I'm honestly not very comfortable with this. > >> > > >> >1. This is not true for every device supported by these modules. > >> >2. This is not true for every (and the most basic) functionality of these > drivers. > >> > > >> >Shouldn't this be done in the the mei side? > >> > >> I don't think it's possible to do from the mei side. Would mei depend > >> on both xe and i915 (and thus cause both to be loaded regardless of > >> the platform?). For a runtime dependency like this that depends on > >> the platform, I think the best way would be a weakdep + either a > >> request_module() or something else that causes the module to load (is > >> that what comp_* is doing today?) > >> > >> > > >> >Couldn't at probe we identify the need of them and if needed we > >> >return -EPROBE to attempt a retry after the mei drivers were probed? > >> > >> I'm not sure this is fixing anything for probe. I think we already > >> wait on the other component to be ready without blocking the rest of the > driver functionality. > >> > >> A weakdep wouldn't cause the module to be loaded where it's not > >> needed, but need some clarification if this is trying to fix anything load- > related or just unload. > > > >This change is fixing unload. > >During xe load I am seeing mei_gsc modules was loaded, but not unloaded > >during the unload xe > > so, first thing: if things are correct in the kernel, we shouldn't need to > **unload** the module after unbinding the device. Why are we unloading xe > and the other modules for tests? While running gta@xe_module_load@reload-no-display I see failure, to address this failure I have this changes, previously I am trying to fix from IGT, but as per igt review suggestion I am trying to fix issue in kernel, IGT patch: https://patchwork.freedesktop.org/series/137343/ > >root@DUT6127BMGFRD:/home/gta# lsmod | grep xe ------>>>just after > >system reboot root@DUT6127BMGFRD:/home/gta# > >root@DUT6127BMGFRD:/home/gta# lsmod | grep mei > >mei_hdcp 28672 0 > >mei_pxp 16384 0 > >mei_me 49152 2 > >mei 167936 5 mei_hdcp,mei_pxp,mei_me > >root@DUT6127BMGFRD:/home/gta# lsmod | grep xe > >root@DUT6127BMGFRD:/home/gta# root@DUT6127BMGFRD:/home/gta# > modprobe xe > >root@DUT6127BMGFRD:/home/gta# root@DUT6127BMGFRD:/home/gta# > lsmod | > >grep mei > >mei_gsc_proxy 16384 0 > >mei_gsc 12288 1 > > ^ which means there's one user, which > should be xe > > >mei_hdcp 28672 0 > >mei_pxp 16384 0 > >mei_me 49152 3 mei_gsc > >mei 167936 8 mei_gsc_proxy,mei_gsc,mei_hdcp,mei_pxp,mei_me > >root@DUT6127BMGFRD:/home/gta# > >root@DUT6127BMGFRD:/home/gta# > >root@DUT6127BMGFRD:/home/gta# > >root@DUT6127BMGFRD:/home/gta# init 3 > >root@DUT6127BMGFRD:/home/gta# echo -n auto > > >/sys/bus/pci/devices/0000\:03\:00.0/power/control > >root@DUT6127BMGFRD:/home/gta# echo -n "0000:03:00.0" > > >/sys/bus/pci/drivers/xe/unbind root@DUT6127BMGFRD:/home/gta# > modprobe > >-r xe root@DUT6127BMGFRD:/home/gta# > root@DUT6127BMGFRD:/home/gta# lsmod > >| grep xe root@DUT6127BMGFRD:/home/gta# lsmod | grep mei > >mei_gsc_proxy 16384 0 > >mei_gsc 12288 0 > > ^ great, so the refcount went to 0, > confirming it was xe. It should go to 0 > even before you unload the module, > when unbind. > > A couple of points: > > 1) why do we care about unloading mei_gsc. Just loading xe > again (or even not even unloading it, just unbind/rebind), > should still work if the xe <-> mei_gsc integration is done > correctly. > > 2) If for some reason we do want to remove the module, then we will > need some work in kernel/module/ to start tracking runtime module > dependencies, i.e. when one module does a module_get(foo->owner), it > would add to a list and output on sysfs together with the holders list. > This way you would be able to track the runtime deps and remove them > if their refcount went to 0 after removing xe. > > (2) is doable, but previous attempts were not successful [1]. Is there something > else to make the simpler solution (1) to work? > Reference why I am doing this changes, please see review comments of this patch https://patchwork.freedesktop.org/series/137343/ Regards, Krishna. > thanks > Lucas De Marchi > > [1] https://lore.kernel.org/linux- > modules/cover.1652113087.git.mchehab@xxxxxxxxxx/ > > >mei_hdcp 28672 0 > >mei_pxp 16384 0 > >mei_me 49152 3 mei_gsc > >mei 167936 7 mei_gsc_proxy,mei_gsc,mei_hdcp,mei_pxp,mei_me > >root@DUT6127BMGFRD:/home/gta# > > > >Regards, > >Krishna. > > > >> > >> Lucas De Marchi > >> > >> > > >> >Cc: Alexander Usyskin <alexander.usyskin@xxxxxxxxx> > >> >Cc: Tomas Winkler <tomas.winkler@xxxxxxxxx> > >> >Cc: Alan Previn <alan.previn.teres.alexis@xxxxxxxxx> > >> >Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx> > >> >Cc: Lucas De Marchi <lucas.demarchi@xxxxxxxxx> > >> >Cc: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx> > >> >Cc: Jani Nikula <jani.nikula@xxxxxxxxx> > >> >Cc: Joonas Lahtinen <joonas.lahtinen@xxxxxxxxxxxxxxx> > >> >Cc: Tvrtko Ursulin <tursulin@xxxxxxxxxxx> > >> > > >> >> + > >> >> static int __init xe_init(void) > >> >> { > >> >> int err, i; > >> >> -- > >> >> 2.25.1 > >> >>