On 25.02.2022 18:18, Tvrtko Ursulin wrote: > > On 25/02/2022 16:46, John Harrison wrote: > >>>> driver we don't care that much that we failed to load HWconfig and >>>> 'notice' is enough. >>>> >>>> but I'm fine with all messages being drm_err (as we will not have to >>>> change that once again after HWconfig will be mandatory for the driver >>>> as well) >>> >>> I would be against drm_err. >>> >>> #define KERN_EMERG KERN_SOH "0" /* system is unusable */ >>> #define KERN_ALERT KERN_SOH "1" /* action must be taken >>> immediately */ >>> #define KERN_CRIT KERN_SOH "2" /* critical conditions */ >>> #define KERN_ERR KERN_SOH "3" /* error conditions */ >>> #define KERN_WARNING KERN_SOH "4" /* warning conditions */ >>> #define KERN_NOTICE KERN_SOH "5" /* normal but significant >>> condition */ >>> #define KERN_INFO KERN_SOH "6" /* informational */ >>> #define KERN_DEBUG KERN_SOH "7" /* debug-level messages */ >>> >>> From the point of view of the kernel driver, this is not an error to >>> its operation. It can at most be a warning, but notice is also fine >>> by me. One could argue when reading "normal but significant >>> condition" that it is not normal, when it is in fact unexpected, so >>> if people prefer warning that is also okay by me. I still lean >>> towards notice becuase of the hands-off nature i915 has with the >>> pass-through of this blob. >> From the point of view of the KMD, i915 will load and be 'functional' >> if it can't talk to the hardware at all. The UMDs won't work at all but > > Well this reductio ad absurdum fails I think... :) > >> the driver load will be 'fine'. That's a requirement to be able to get >> the user to a software fallback desktop in order to work out why the >> hardware isn't working (e.g. no GuC firmware file). I would view this >> as similar. The KMD might have loaded but the UMDs are not functional. >> That is definitely an error condition to me. > > ... If GuC fails to load there is no command submission and driver will > obviously log that with drm_err. > > If blob fails to verify it depends on the userspace stack what will > happen. As stated before on some platforms, and/or after a certain time, > Mesa will not look at the blob at all. So i915 is fine (it is after all > just a conduit for opaque data!), system overall is fine, so it > definitely isn't a KERN_ERR level event. > >>>>>>> + ERR_PTR(ret)); >>>>>>> + >>>>>>> ret = guc_enable_communication(guc); >>>>>>> if (ret) >>>>>>> goto err_log_capture; >>>>>>> @@ -562,6 +567,8 @@ static void __uc_fini_hw(struct intel_uc *uc) >>>>>>> if (intel_uc_uses_guc_submission(uc)) >>>>>>> intel_guc_submission_disable(guc); >>>>>>> + intel_guc_hwconfig_fini(&guc->hwconfig); >>>>>>> + >>>>>>> __uc_sanitize(uc); >>>>>>> } >>>>>>> diff --git a/drivers/gpu/drm/i915/i915_pci.c >>>>>>> b/drivers/gpu/drm/i915/i915_pci.c >>>>>>> index 76e590fcb903..1d31e35a5154 100644 >>>>>>> --- a/drivers/gpu/drm/i915/i915_pci.c >>>>>>> +++ b/drivers/gpu/drm/i915/i915_pci.c >>>>>>> @@ -990,6 +990,7 @@ static const struct intel_device_info >>>>>>> adl_p_info = { >>>>>>> BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VCS0) | >>>>>>> BIT(VCS2), >>>>>>> .ppgtt_size = 48, >>>>>>> .dma_mask_size = 39, >>>>>>> + .has_guc_hwconfig = 1, >>>>>> Who requested this change? It was previously done this way but the >>>>>> instruction was that i915_pci.c is for hardware features only but >>>>>> that >>>>>> this, as you seem extremely keen about pointing out at every >>>>>> opportunity, is a software feature. >>>>> >>>>> This was requested by Michal as well. I definitely agree it is a >>>>> software feature, but I was not aware that "i915_pci.c is for hardware >>>>> features only". >>>>> >>>>> Michal, do you agree with this and returning to the previous method >>>>> for >>>>> enabling the feature? >>>> >>>> now I'm little confused as some arch direction was to treat FW as >>>> extension of the HW so for me it was natural to have 'has_guc_hwconfig' >>>> flag in device_info >>>> >>>> if still for some reason it is undesired to mix HW and FW/SW flags >>>> inside single group of flags then maybe we should just add separate >>>> group of immutable flags where has_guc_hwconfig could be defined. >>>> >>>> let our maintainers decide >>> >>> Bah.. :) >>> >>> And what was the previous method? >>> >>> [comes back later] >>> >>> Okay it was: >>> >>> +static bool has_table(struct drm_i915_private *i915) >>> +{ >>> + if (IS_ALDERLAKE_P(i915)) >>> + return true; >>> >>> Which sucks a bit if we want to argue it does not belong in device info. >>> >>> Why can't we ask the GuC if the blob exists? In fact what would >>> happen if one would call __guc_action_get_hwconfig on any GuC platform? >> That was how I originally wrote the code. However, other parties >> refuse to allow a H2G call to fail. The underlying CTB layers complain >> loudly on any CTB error. And the GuC architects insist that a call to >> query the table on an unsupported platform is an error and should >> return an 'unsupported' error code. just for the background: IIRC the reason why arch didn't want 'query' mode that wont fail was that HWconfig is not optional on these selected platforms, so driver shall know up-front on which platforms it shall be working (and driver shall even fail if blob is not available) > > Oh well, shrug, sounds silly but I will not pretend I am familiar with H2G > > In this case has_table does sound better since it indeed isn't a > hardware feature. It is a GuC fw thing and if we don't have a way to > probe things there at runtime, then at least limit the knowledge to GuC > files. note that arch expectation is that driver itself shall be using this hwconfig (ie. will decode required data). while current approach is that driver acts only as a proxy to UMD, this has to change and refactoring will start (likely) soon. therefore flag has_guc_hwconfig fits better IMHO as this wont be just 'GuC' fw thing. Michal > > Regards, > > Tvrtko