On Thu, 26 Oct 2017, Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxxxxxxxx> wrote: > On 25/10/2017 08:45, Jani Nikula wrote: >> On Tue, 24 Oct 2017, Tvrtko Ursulin <tursulin@xxxxxxxxxxx> wrote: >>> On 24/10/17 18:48, Jani Nikula wrote: >>>> On Tue, 24 Oct 2017, Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> wrote: >>>>> Quoting Sagar Arun Kamble (2017-10-24 11:41:13) >>>>>> diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c >>>>>> index 875d428..d1a4911 100644 >>>>>> --- a/drivers/gpu/drm/i915/intel_device_info.c >>>>>> +++ b/drivers/gpu/drm/i915/intel_device_info.c >>>>>> @@ -462,4 +462,15 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv) >>>>>> info->sseu.has_subslice_pg ? "y" : "n"); >>>>>> DRM_DEBUG_DRIVER("has EU power gating: %s\n", >>>>>> info->sseu.has_eu_pg ? "y" : "n"); >>>>>> + >>>>>> + /* Initialize PM interrupt register offsets */ >>>>>> + if (INTEL_GEN(dev_priv) >= 8) { >>>>>> + info->pm_iir_offset = GEN8_GT_IIR(2); >>>>>> + info->pm_imr_offset = GEN8_GT_IMR(2); >>>>>> + info->pm_ier_offset = GEN8_GT_IER(2); >>>>>> + } else { >>>>>> + info->pm_iir_offset = GEN6_PMIIR; >>>>>> + info->pm_imr_offset = GEN6_PMIMR; >>>>>> + info->pm_ier_offset = GEN6_PMIER; >>>>>> + } >>>>> >>>>> If you are going to take another pass at this, move these into the >>>>> static tables in i915_pci.c >>>>> >>>>> Updating GEN6_FEATURES and GEN8_FEATURES will then percolate into >>>>> individual platform defines. >>>> >>>> Like I wrote in reply to v1, I'm not convinced we should do this at all. >>>> >>>> What makes *these* registers so important they must be in device info? >>>> What makes most of i915_reg.h so unimportant they don't deserve the same >>>> treatment? Where do you draw the line? >>>> >>>> I'd draw the line at, no registers at device info. >>> >>> I suggested to Sagar this change during review so feel responsible to >>> chime in. >>> >>> So in general I just find the amount of times our driver asks itself >>> what it's running on a bit tasteless. :( >>> >>> I did quick and dirty check by bumping a counter in all the >>> IS_this|or|that checks, all which can be known at driver probe time, and >>> wired it up to the PMU so I can check their frequency. The annotated >>> perf stat output: >>> >>> root@e31:~# perf stat -a -e i915/whoami/ -I 1000 >>> # time counts unit events >>> >>> # idle system no X running >>> >>> 1.000298100 10 i915/whoami/ >>> >>> 2.000750955 8 i915/whoami/ >>> >>> 3.001104193 10 i915/whoami/ >>> >>> 4.001333433 10 i915/whoami/ >>> >>> 5.001703162 10 i915/whoami/ >>> >>> 6.002122721 10 i915/whoami/ >>> >>> >>> # starting X now.. >>> >>> 7.002266228 2,203 i915/whoami/ >>> >>> 8.002392598 4,682 i915/whoami/ >>> >>> 9.002764398 0 i915/whoami/ >>> >>> 10.003027119 0 i915/whoami/ >>> >>> 11.003486048 42 i915/whoami/ >>> >>> >>> # X idling.. >>> >>> 12.003854660 0 i915/whoami/ >>> >>> 13.004221680 0 i915/whoami/ >>> >>> 14.004622571 0 i915/whoami/ >>> >>> 15.004968110 0 i915/whoami/ >>> >>> 16.005372363 0 i915/whoami/ >>> >>> 17.005778034 0 i915/whoami/ >>> >>> 18.005941970 0 i915/whoami/ >>> >>> 19.006313427 0 i915/whoami/ >>> >>> 20.006676048 0 i915/whoami/ >>> >>> 21.007059927 0 i915/whoami/ >>> >>> 22.007507818 0 i915/whoami/ >>> >>> 23.007887628 0 i915/whoami/ >>> >>> 24.008207035 0 i915/whoami/ >>> >>> 25.008580496 0 i915/whoami/ >>> >>> # time counts unit events >>> 26.008949236 0 i915/whoami/ >>> >>> 27.009433473 0 i915/whoami/ >>> >>> >>> # gfxbench trex starting up >>> >>> 28.009677600 2,605 i915/whoami/ >>> >>> 29.009941972 716 i915/whoami/ >>> >>> 30.010127588 2,190 i915/whoami/ >>> >>> 31.010249535 46 i915/whoami/ >>> >>> 32.010383565 36 i915/whoami/ >>> >>> 33.010527674 0 i915/whoami/ >>> >>> >>> # trex running >>> >>> 34.010760584 4,709 i915/whoami/ >>> >>> 35.011079891 5,381 i915/whoami/ >>> >>> 36.011280234 5,306 i915/whoami/ >>> >>> 37.011719986 5,505 i915/whoami/ >>> >>> 38.012017531 5,386 i915/whoami/ >>> >>> 39.012529241 5,687 i915/whoami/ >>> >>> 40.012922986 6,009 i915/whoami/ >>> >>> 41.013120143 5,791 i915/whoami/ >>> >>> 42.013399982 5,296 i915/whoami/ >>> >>> 43.013712979 5,349 i915/whoami/ >>> >>> 44.014107375 5,127 i915/whoami/ >>> >>> 45.014553950 5,387 i915/whoami/ >>> >>> 46.014953020 5,364 i915/whoami/ >>> >>> 47.015243748 4,738 i915/whoami/ >>> >>> 48.015560460 4,788 i915/whoami/ >>> >>> 49.015867395 4,927 i915/whoami/ >>> >>> 50.016152690 4,886 i915/whoami/ >>> >>> >>> So.. I am not saying these particular registers are mega important, and >>> not even saying that these 5k/s conditionals are measurable (either as >>> branches or increased code size effect), but overall the situation is a >>> bit of.. bleurgh from the elegance point of view. :( >>> >>> If we have register sets which are 100% mutually exclusive, then I see >>> them as candidates to put them in some object at probe time. It doesn't >>> have to be device_info but I don't see why we wouldn't do it. It is just >>> a different flavour of the vfunc approach after all. >> >> I think to fix something that is inelegant, you have to have a plan to >> actually improve things in the long run. IMO adding a few random >> registers to device info without a plan is less elegant and less >> consistent than the status quo. >> >> We currently have at least three ways to index pipe/port/transcoder/etc >> based registers. Combine that with storing some register offsets in >> device info, you'll have six ways. There's a chance we'll end up adding >> the register offsets to device info both statically and >> dynamically. We're already struggling with guiding new contributors to >> defining registers in the existing schemes. >> >> Now, I'm sure we could spend weeks on end devising a plan how to move >> register offsets to device info or another structure, working out the >> details and bikeshedding. After that, we could do weeks and weeks of >> busywork converting registers, causing conflicts in all the work in our >> internal trees and developers' own branches, not to mention making bug >> fix and feature backports more painful. >> >> I have a pretty strong feeling this is not a good use of our time. > > I can only read here a dislike of a big rework (which I did not suggest > to start with), and dislike of the piecemeal changes. Any change would have to be piecemeal anyway. I don't have a dislike for that per se. I'm just saying that adding some registers to some data structures on a whim leads to an ugly inconsistent end result, and it gets cargo-culted to more and more places, uncontrolled. The driver will become harder to maintain. The changes must be done piecemeal, but there needs to be a plan where we want to take all this in the long term. And that plan is going to be an awful bikeshed fest. > So basically preference for a status quo. And there will be more and > more of such checks. So today it is 5k/sec, in a year it might be > more. Even with cached register offsets you'll anyway be doing 5k fetches of the cached offsets per second. Sure you'll save a branch and couple of immediates in code, but I can't imagine it being a huge penalty. > So to clarify. Do you actually oppose some subsystem/area moving some > registers to any data structure, or just to device info? > > Do you have a suggestion on what we could do? Or you simply think this > is a complete non-issue? This is not a non-issue, but, to be quite honest, I'd rather see people go to bugzilla and fix a dozen actual issues that are hitting CI or end users out there, today. BR, Jani. -- Jani Nikula, Intel Open Source Technology Center _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx https://lists.freedesktop.org/mailman/listinfo/intel-gfx