On Wed, 2023-04-26 at 13:52 +0200, Daniel Vetter wrote:
> On Tue, Apr 25, 2023 at 04:41:54PM +0300, Joonas Lahtinen wrote:
> > (+ Faith and Daniel as they have been involved in previous discussions)
> > Quoting Jordan Justen (2023-04-24 20:13:00)
> > > On 2023-04-24 02:08:43, Tvrtko Ursulin wrote:
> > > >
> > > >
>
alan:snip

> - the more a feature spans drivers/modules, the more it should be
>   discovered by trying it out, e.g. dma-buf fence import/export was a huge
>   discussion, luckily mesa devs figured out how to transparently fall back
>   at runtime so we didn't end up merging the separate feature flag (I
>   think at least, can't find it). pxp being split across i915/me/fw/who
>   knows what else is kinda similar so I'd heavily lean towards discovery
>   by creating a context
>
> - pxp taking 8s to init a ctx sounds very broken, irrespective of anything
>   else
>

Alan: Please be aware that:
1. The wait-timeout was changed to 1 second some time back.
2. I'm not the one deciding the timeout. I initially wanted to keep it at
   the same timeout as ADL (250 millisec) and ask the UMD to retry if the
   user needs it (same as the ADL behavior). Daniele requested moving it to
   8 seconds, but through the review process we reduced it to 1 second.
3. In any case, that's just the wait-timeout, and we know it won't succeed
   until ~6 seconds after i915 (~9 secs after boot). The issue isn't our
   hardware or i915 - it's the component driver load <-- this is what's
   broken.

Details: a PXP context depends on gsc-fw load, huc-firmware load,
mei-gsc-proxy component driver load + bind, huc-authentication and
gsc-proxy-init-handshake. Most of the above steps begin rather quickly
during i915 driver load - the delay seems to come from a very late
mei-gsc-proxy component driver load. In fact, the parent mei-me driver is
only getting loaded ~6 seconds after i915 init is done. That blocks the
gsc-proxy-init-handshake and huc-authentication, and lastly PXP.

That said, what is broken is how long it takes for the component drivers
to come up. NOTE: PXP isn't really doing anything differently in the
context creation flow (in terms of time-consuming steps compared to ADL)
besides the extra waits on these dependencies.

We can actually go back to the original timeout of 250 millisecs like we
have in ADL, but context creation will then fail if Mesa calls in too
early (and will succeed later) ... or ... we can create the GET_PARAMs.

A better idea would be to figure out how to control the driver load order
and force the mei driver + components to get loaded right after i915. I
was informed there is no way to control this and changes here will likely
not be accepted upstream.

++ Daniele - can you chime in?

Take note that ADL has the same issue, but for whatever reason the
dependent mei component on ADL loaded much sooner, so the issue was never
caught even though it already existed when ADL was merged (if users
customize the kernel + compositor for fastboot, it will happen). (I
realize I haven't tested ADL with the new kernel configs that we use to
also boot PXP on MTL. I wonder if the new mei configs are causing the
delay, i.e. an ADL customer could suddenly see this 6 sec delay too.
That's something I have to check now.)
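
To illustrate the "short timeout + UMD retry" option mentioned above, here
is a minimal sketch in C of a bounded retry loop. Every name in it
(create_with_retry, retry_result, fake_dev and the two callbacks) is
hypothetical and invented for this example; the try_create callback only
stands in for whatever context-creation call the UMD would make. It is not
i915 uAPI or Mesa code. The simulated clock lets it run without hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 on success, negative while the
 * dependencies (fw load, component bind, ...) are not ready yet. */
typedef int (*try_create_fn)(void *data);

/* Hypothetical sleep hook, so a test can advance a fake clock instead
 * of really sleeping. */
typedef void (*sleep_ms_fn)(unsigned int ms, void *data);

struct retry_result {
	int err;                /* last value returned by try_create */
	unsigned int attempts;  /* number of creation attempts made */
	unsigned int waited_ms; /* total time spent sleeping between attempts */
};

/* Retry try_create every interval_ms until it succeeds or until another
 * sleep would exceed budget_ms of total waiting. */
struct retry_result
create_with_retry(try_create_fn try_create, sleep_ms_fn sleep_ms, void *data,
		  unsigned int interval_ms, unsigned int budget_ms)
{
	struct retry_result res = { .err = -1, .attempts = 0, .waited_ms = 0 };

	assert(try_create != NULL);
	assert(interval_ms > 0);

	for (;;) {
		res.err = try_create(data);
		res.attempts++;
		if (res.err == 0)
			break;          /* context created */
		if (res.waited_ms + interval_ms > budget_ms)
			break;          /* out of budget, report last error */
		if (sleep_ms)
			sleep_ms(interval_ms, data);
		res.waited_ms += interval_ms;
	}
	return res;
}

/* Simulated device for illustration: creation starts succeeding once the
 * fake clock reaches ready_at_ms. */
struct fake_dev {
	unsigned int now_ms;
	unsigned int ready_at_ms;
};

int fake_try_create(void *data)
{
	struct fake_dev *dev = data;

	return dev->now_ms >= dev->ready_at_ms ? 0 : -1;
}

void fake_sleep_ms(unsigned int ms, void *data)
{
	struct fake_dev *dev = data;

	dev->now_ms += ms;
}
```

With a 250 ms interval and a simulated dependency that comes up at 6000 ms,
an 8000 ms budget succeeds while a 1000 ms budget gives up early. That is
the trade-off above: a short wait returns control to the caller quickly,
but someone still has to retry later.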