On 2022-07-14 16:08:51, Dave Airlie wrote:
> On Fri, 15 Apr 2022 at 10:15, Matt Roper <matthew.d.roper@xxxxxxxxx> wrote:
> >
> > On Tue, Apr 12, 2022 at 03:59:55PM -0700, John.C.Harrison@xxxxxxxxx wrote:
> > > From: John Harrison <John.C.Harrison@xxxxxxxxx>
> > >
> > > The latest GuC firmware drops the context descriptor pool in favour of
> > > passing all creation data in the create H2G. It also greatly simplifies
> > > the work queue and removes the process descriptor used for multi-LRC
> > > submission. So, remove all mention of LRC and process descriptors and
> > > update the registration code accordingly.
> > >
> > > Unfortunately, the new API also removes the ability to set default
> > > values for the scheduling policies at context registration time.
> > > Instead, a follow up H2G must be sent. The individual scheduling
> > > policy update H2G commands are also dropped in favour of a single KLV
> > > based H2G. So, change the update wrappers accordingly and call this
> > > during context registration.
> > >
> > > Of course, this second H2G per registration might fail due to being
> > > backed up. The registration code has a complicated state machine to
> > > cope with the actual registration call failing. However, if that works
> > > then there is no support for unwinding if a further call should fail.
> > > Unwinding would require sending a H2G to de-register - but that can't
> > > be done because the CTB is already backed up.
> > >
> > > So instead, add a new flag to say whether the context has a pending
> > > policy update. This is set if the policy H2G fails at registration
> > > time. The submission code checks for this flag and retries the policy
> > > update if set. If that call fails, the submission path early exits
> > > with a retry error. This is something that is already supported for
> > > other reasons.
> > >
> > > Signed-off-by: John Harrison <John.C.Harrison@xxxxxxxxx>
> > > Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@xxxxxxxxx>
> >
> > Applied to drm-intel-gt-next. Thanks for the patch and review.
> >
>
> (cc'ing Linus and danvet, as a headsup, there is also a phoronix
> article where this was discovered).
>
> Okay WTF.
>
> This is in no way acceptable. This needs to be fixed in 5.19-rc ASAP.
>
> Once hardware is released and we remove the gate flag by default, you
> cannot just bump firmware versions blindly.
>
> The kernel needs to retain compatibility with all released firmwares
> since a device was declared supported.
>
> This needs to be reverted, and then 70 should be introduced with a
> fallback to 69 versions.
>
> Very disappointing, I expect this to get dealt with v.quickly.

This reminds me of something. A distant memory, really. But, if you can
believe it, i915 used to actually be able to *do something* without the
*closed source* guc. Crazy, right?

Anyway, that's all ancient history now. I mean, you have to go back
pretty far for that. Let me check my notes. Yeah, you'd probably have to
go all the way back to 2021 for that. I guess a lot of things were much
simpler back then though.

Anyway... Always fun to reminisce.

-Jordan