On Tue, 24 Nov 2020, Aditya Swarup <aditya.swarup@xxxxxxxxx> wrote:
> On 11/24/20 12:11 PM, Lucas De Marchi wrote:
>> +Matt Roper, see question in item (3) below
>>
>> On Tue, Nov 24, 2020 at 04:20:40PM +0200, Jani Nikula wrote:
>>> On Tue, 24 Nov 2020, Lucas De Marchi <lucas.demarchi@xxxxxxxxx> wrote:
>>>> On Mon, Nov 23, 2020 at 05:32:22PM -0800, Aditya Swarup wrote:
>>>>> On 11/18/20 1:18 AM, Jani Nikula wrote:
>>>>>> On Tue, 17 Nov 2020, Lucas De Marchi <lucas.demarchi@xxxxxxxxx> wrote:
>>>>>>> On Tue, Nov 17, 2020 at 10:50:10AM -0800, Aditya Swarup wrote:
>>>>>>>> @@ -1579,9 +1579,9 @@ static inline const struct i915_rev_steppings *
>>>>>>>>  tgl_revids_get(struct drm_i915_private *dev_priv)
>>>>>>>>  {
>>>>>>>>  	if (IS_TGL_U(dev_priv) || IS_TGL_Y(dev_priv))
>>>>>>>> -		return tgl_uy_revids;
>>>>>>>> +		return tgl_uy_revids + INTEL_REVID(dev_priv);
>>>>>>>
>>>>>>> oohh, no. You have to at least check you are not accessing out of
>>>>>>> bounds. New HW running on an old kernel should not create invalid
>>>>>>> accesses like this.
>>>>>>
>>>>>> And this is just one reason why exposing arrays directly as an interface
>>>>>> to the rest of the driver is a bad idea. Basically I look at *all*
>>>>>> externs in the driver with suspicion, and they're all exceptions that
>>>>>> should not be repeated. The revid arrays are a direct invitation to keep
>>>>>> adding more and more extern arrays. And more ways to go out of bounds.
>>>>>
>>>>> We definitely need an array table for the SOC -> Display, GT stepping mapping.
>>>>
>>>> the mapping could very well be in the define iff you don't have
>>>> different mappings per sku as is the case with TGL. Example:
>>>>
>>>> #define ADLS_REVID_A0 0
>>>> #define ADLS_REVID_A1 5
>>>>
>>>> #define ADLS_DISP_REVID_A0 0
>>>> #define ADLS_DISP_REVID_B0 5
>>>>
>>>> The value is actually the *SoC* revid, regardless of the name of the
>>>> macro.
>>>> Since we already have to use a different macro -
>>>> IS_DISP_REVID() - I don't think this is much worse and would allow us to
>>>> get rid of the table *for ADL-S*, at the expense of having to pass as
>>>> argument the ADLS_DISP_REVID_*. However this doesn't apply to TGL as TGL
>>>> has a different mapping per sku.
>>>>
>>>>
>>>>> SOC steppings were usually the same as display steppings/GT steppings until TGL and therefore
>>>>> didn't require special mapping cases. But from TGL onwards, we have different combinations of
>>>>> Disp and GT steppings per SOC stepping. Alderlake-S makes this direct mapping even more difficult
>>>>> without the array requiring more macros to deal with SOC -> DISP/GT stepping differences.
>>>>>
>>>>> Will fix the array bound checks but the possibility of SOC revision id from drm struct going
>>>>> out of bounds is minimal. Can only happen if we don't have support for latest SOC -> Disp/GT table
>>>>
>>>> this is very common. It's just a matter of trying to run a slightly old
>>>> kernel in a slightly newer rev of the hardware.
>>>
>>> Indeed. All kernels released with the arrays are simply bust for any new
>>> hardware revisions. They'll need a minimal Cc: stable fix.
>>>
>>> Here's something I drafted [1] to fix the situation more
>>> generally. There are still some issues to overcome, though they exist
>>> already in the current code.
>>>
>>> This could be followed up with converting *all* platforms to the scheme,
>>> making it universal, regardless of whether the revids in the hardware
>>> are consecutive or not.
>>>
>>> BR,
>>> Jani.
>>>
>>>
>>> [1] https://cgit.freedesktop.org/~jani/drm/log/?h=revid-stepping-scheme
>>
>> That is looking good. Some feedback I can give before this series being
>> sent for review:
>
> I like this approach as well and we were discussing this in the ADLS
> rev ID thread. With the tables it makes it simpler to manage rather
> than worrying about individual macros.
> Jani do you want me to rebase
> ADLS changes on top of your patches and resubmit your patches for
> review or you will be submitting this series yourself?

Please proceed with ADL as you were, I'll do the refactoring afterwards
on top. It'll take a while anyway, we don't want to delay ADL with
this. Just add the required bounds checks to the ADL patches.

>>
>> 1) You need to call the init function from somewhere

Yeah, that's a FIXME in a commit message. Draft patches and all
that. ;)

I'll need to figure out where and how early we need to set this up, as
it needs to be set up before any users, obviously.

>> 2) For the FIXMEs:
>>
>> +	/*
>> +	 * FIXME: We should be able to take into account new revids not
>> +	 * recognized by this kernel version.
>> +	 */
>>
>> +	/*
>> +	 * FIXME: We should be able to handle gaps in revid arrays gracefully,
>> +	 * and in a way that works sensibly for the range checks. This is true
>> +	 * for the existing revid range checks; it's fine if a new id pops up in
>> +	 * the middle.
>> +	 *
>> +	 * It's okay for the display stepping to be zero, though in an array all
>> +	 * or none should be set to non-zero, not a mix.
>> +	 */
>>
>> Maybe consider that gt_stepping will never be 0 and in the case it is (or
>> size > ARRAY_SIZE), just backtrack to use the first one we find with
>> gt_stepping != 0? then we probably should add a warning that we are not
>> actually using the correct one, but it's the best we can do.

Yeah, it'll probably need to be something like that. I'll figure
something out.

>>
>> 3) REVID_BXT_B_LAST
>
> I couldn't spot REVID_NONE as well in the patches.

For now REVID_NONE was intended as a placeholder to ensure all valid
symbolic revids are non-zero, for array initialization purposes.

>>
>> what is that? The only thing that comes to mind is for "matching all B
>> steps". Matt Roper had a patch to change the way we interpret the WA
>> ranges so the bounds are [lower, upper) rather than [lower, upper].
>> Matt, any problem you faced with that patch? It makes more sense
>> because we know the stepping in which it's fixed, but we may have
>> additional revids before that
>>
>> But I don't see any trace of REVID_BXT_B_LAST in the tree, so not sure
>> what's this about.

I just indiscriminately scooped up all revid macros from i915_drv.h for
starters. We have BXT_REVID_B_LAST to identify the last pre-pro bxt
before C0.

Needs cleanup, agreed.

>>
>> 4)
>>
>> Lastly, I'd still like the simple fix for TGL without all the noise and
>> without the refactor. It's the simplest fix we can do for the 5.10
>> timeframe.

Agreed.

> I just submitted the fix.

Thanks. Let's get that done first, then the ADL enabling, then I'll do
the refactoring afterwards.

BR,
Jani.

>
> Aditya
>
>>
>>
>> Lucas De Marchi
>>
>>>
>>>
>>>
>>>
>>>>
>>>>> for TGL from Bspec and if we are picking up wrong revision id from drm struct that means the platform
>>>>> information obtained itself is wrong which will be a general platform problem unrelated to Gfx driver.
>>>>
>>>> Nothing else should really be a problem. We don't really use the revid
>>>> much, mostly for WAs. And if other parts of the kernel are trying to use
>>>> the SoC revid, then they are reading that info themselves, not using
>>>> something we read.
>>>>
>>>> We are simply reading the revid from hardware and using that value
>>>> without checking and that needs to change.
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> I'd rather we seek for ways to either nuke the revid arrays altogether,
>>>>>> or encapsulate them within a .c file with static scope.
>>>>>
>>>>> I don't think we should nuke the revid arrays but I agree with finding a more appropriate place to
>>>>> parse the gt/display stepping info. This should be an exercise for a later patch that takes
>>>>> care of kbl,tgl and adl-s mappings.
>>>>>
>>>>>>
>>>>>> And for that .c file...
>>>>>> the arrays are now in gt/intel_workarounds.c
>>>>>> which is a really weird place for stuff that's used for generic stepping
>>>>>> info, and particularly for *display* stepping info.
>>>>>
>>>>> I agree and we can change the approach with a different patch later.
>>>>>
>>>>>>
>>>>>> BR,
>>>>>> Jani.
>>>>>>
>>>>>>
>>>>>
>>>> _______________________________________________
>>>> Intel-gfx mailing list
>>>> Intel-gfx@xxxxxxxxxxxxxxxxxxxxx
>>>> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
>>>
>>> --
>>> Jani Nikula, Intel Open Source Graphics Center
>

-- 
Jani Nikula, Intel Open Source Graphics Center
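
For reference, the fallback suggested under item (2) in the thread (bounds-check the revid, and on a gap or an unknown newer revid backtrack to the nearest earlier entry with gt_stepping != 0, flagging that the match is approximate) could look roughly like the sketch below. This is plain userspace C, not the i915 code: struct rev_steppings, the STEP_* values, example_revids and revid_lookup() are hypothetical names invented for illustration, and the table contents are made up.

```c
#include <stddef.h>
#include <stdio.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical symbolic steppings; 0 is reserved for "no entry". */
enum { STEP_NONE = 0, STEP_A0, STEP_B0, STEP_C0, STEP_D0 };

/* Hypothetical stand-in for a per-revid stepping table entry. */
struct rev_steppings {
	unsigned char gt_stepping;   /* assumed non-zero for every valid entry */
	unsigned char disp_stepping;
};

/* Made-up table indexed by SoC revid; index 2 is deliberately a gap. */
static const struct rev_steppings example_revids[] = {
	[0] = { STEP_A0, STEP_A0 },
	[1] = { STEP_B0, STEP_C0 },
	[3] = { STEP_C0, STEP_D0 },
};

/*
 * Look up the entry for revid without ever indexing past the table.
 * If revid is beyond the table or lands on a gap, walk backwards to the
 * nearest earlier populated entry and set *approximate to 1.
 * Returns NULL only if no populated entry exists at or below revid.
 */
static const struct rev_steppings *
revid_lookup(const struct rev_steppings *tbl, size_t size,
	     unsigned int revid, int *approximate)
{
	size_t i;

	*approximate = 0;
	if (size == 0)
		return NULL;

	if (revid >= size) {
		/* Newer hardware than this table knows about: clamp. */
		revid = (unsigned int)(size - 1);
		*approximate = 1;
	}

	for (i = (size_t)revid + 1; i-- > 0; ) {
		if (tbl[i].gt_stepping != STEP_NONE) {
			if (i != revid)
				*approximate = 1;
			if (*approximate)
				fprintf(stderr,
					"revid not in table, using entry %zu\n", i);
			return &tbl[i];
		}
	}

	return NULL;
}
```

With the made-up table above, revid 1 returns its own entry, revid 2 (the gap) falls back to entry 1, and revid 9 (past the end) falls back to entry 3, the latter two with the approximate flag set. Keeping the table static and exposing only the lookup function also matches the request earlier in the thread to stop exporting the arrays directly.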