[PATCH 000/100] Add Vega10 Support

andresx7@xxxxxxxxx (Andres Rodriguez) · Tue, 21 Mar 2017 15:43:33 -0400

On 2017-03-21 11:14 AM, Deucher, Alexander wrote:
>> -----Original Message-----
>> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf
>> Of Christian KÃ¶nig
>> Sent: Tuesday, March 21, 2017 5:01 AM
>> To: Edward O'Callaghan; StDenis, Tom; amd-gfx at lists.freedesktop.org
>> Subject: Re: [PATCH 000/100] Add Vega10 Support
>>
>> Am 21.03.2017 um 09:45 schrieb Edward O'Callaghan:
>>>
>>> On 03/21/2017 05:36 PM, Christian KÃ¶nig wrote:
>>>> Am 21.03.2017 um 00:38 schrieb Tom St Denis:
>>>>> On 03/20/2017 06:34 PM, Jan Ziak wrote:
>>>>>> On Mon, Mar 20, 2017 at 10:41 PM, Alex Deucher
>> <alexdeucher at gmail.com
>>>>>> <mailto:alexdeucher at gmail.com>> wrote:
>>>>>>
>>>>>>      On Mon, Mar 20, 2017 at 5:36 PM, Jan Ziak
>> <0xe2.0x9a.0x9b at gmail.com
>>>>>>      <mailto:0xe2.0x9a.0x9b at gmail.com>> wrote:
>>>>>>      > Hi
>>>>>>      >
>>>>>>      >
>>>>>>
>> https://cgit.freedesktop.org/~agd5f/linux/plain/drivers/gpu/drm/amd/inclu
>> de/asic_reg/vega10/NBIO/nbio_6_1_sh_mask.h?h=amd-staging-
>> 4.9&id=9555ef0ba926df25d9a637d0ea21bc0d231c21d2
>>>>>>
>>>>>>
>> <https://cgit.freedesktop.org/~agd5f/linux/plain/drivers/gpu/drm/amd/incl
>> ude/asic_reg/vega10/NBIO/nbio_6_1_sh_mask.h?h=amd-staging-
>> 4.9&id=9555ef0ba926df25d9a637d0ea21bc0d231c21d2>
>>>>>>
>>>>>>      >
>>>>>>      > The file nbio_6_1_sh_mask.h is uncompressed. It consists from
>>>>>> 133884 lines.
>>>>>>      > Only generated C/C++ code will be able to utilize the content
>>>>>> of such a file
>>>>>>      > efficiently. All hand-written codes combined will be able to
>>>>>> utilize about
>>>>>>      > 1% of the file.
>>>>>>      >
>>>>>>      > Is there a reason why nbio_6_1_sh_mask.h is huge?
>>>>>>
>>>>>>      That IP block contains a lot of registers.  The idea is to open
>>>>>> source
>>>>>>      as much IP as possible to facilitate debugging, new features, etc.
>>>>>>
>>>>>>      Alex
>>>>>>
>>>>>>
>>>>>> [This email contains long/wide lines and should be viewed on a
>>>>>> sufficiently wide screen]
>>>>>>
>>>>>> For example if I open the file in vim and go to line 66952:
>>>>>>
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_1_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>> 0x9
>>>>>>
>>>>>> Then abstracting away some of the digits used in the defined identifier
>>>>>> and using egrep:
>>>>>>
>>>>>> $ egrep
>>>>>>
>> "\<DWC_E12MP_PHY_X._NS_X._._LANE._DIG_RX_VCOCAL_RX_VCO_CAL_
>> CTRL_.__DTB_SEL__SHIFT\>"
>>>>>>
>>>>>> nbio_6_1_sh_mask.h
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_0_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_0_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_0_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_0_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_0_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_1_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_1_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_1_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_1_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_1_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_2_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_2_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_2_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_2_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_2_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_3_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_3_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_3_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_3_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>> #define
>>>>>>
>> DWC_E12MP_PHY_X4_NS_X4_3_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_1__DTB_SEL__SHIFT
>>>>>>
>>>>>>                         0x9
>>>>>>
>>>>>> The egrep command produced 20 lines.
>>>>>>
>>>>>> Instead of the many #define directives, it is a possibility to define
>>>>>> functions such as:
>>>>>>
>>>>>> int
>>>>>>
>> DWC_E12MP_PHY_Xa_NS_Xb_c_LANEd_DIG_RX_VCOCAL_RX_VCO_CAL_C
>> TRL_e__DTB_SEL__SHIFT(int
>>>>>>
>>>>>> a, int b, int c, int d, int e) __attribute__((pure));
>>>>>>
>>>>>> I suppose the file nbio_6_1_sh_mask.h is the output of a tool (it is a
>>>>>> generated file). It is an option to modify the tool to output C
>>>>>> functions with proper input guards instead of #define directives.
>>>>> The registers are generated by the HW team and then filtered down to
>>>>> become what we release publicly.  It is machine generated very likely
>>>>> from the RTL that specifies the hardware itself.
>>>> Yes, exactly that. And it was actually quite some work to get to this
>>>> point.
>>>>
>>>>> Generally speaking if a class of registers share masks/offsets the
>>>>> lowest (zero'th) is used in programming and offsets are used when
>>>>> selecting the correct MMIO address to use specific instances.
>>>> The problem is that the files we get from the HW team describe the
>>>> register block already broken down to the memory mappings. E.g. when
>> an
>>>> RTL block is instantiated N times you get N times the same definition.
>>>>
>>>>> Having these enumerated though is handy for tools like UMR which
>> would
>>>>> decode to the correct instance of the register (you could even see
>>>>> that by watching the logscan via umr).  So we make use of them fairly
>>>>> efficiently.  UMR reads the headers to create the arrays of
>>>>> registers/bitfields which if they were computed at runtime (via helper
>>>>> functions) would be harder to parse (and could amount to the halting
>>>>> problem...).
>>>>>
>>>>> What is in the NBIO header today is what was IP cleared but in theory
>>>>> it could be filtered down more simply to reduce the unused LOCs in
>>>>> future kernels.
>>>>>
>>>>> They don't make for good bedtime reading but imho you'd rather have
>>>>> more headers not less :-)
>>>> Manually merging that back into single definitions is not really an
>>>> option, but what we could do is releasing the full blown headers
>>>> somewhere and only limit what we push into the linux kernel to actually
>>>> used ones.
>>> Hi Christian,
>>>
>>> I have to say I prefer your idea here and perhaps they could even become
>>> part of the UMR repo, not sure? May I suggest the reduction to some like
>>> a `kernel-native-dialect` could be automated with a clang plugin. It
>>> seems like heaps of work however maybe not really, clang provides a API
>>> to grammatically do some AST transformers and auto-rewrite large chunks
>>> of code given some rules.
>>
>> Using clang sounds like overkill to me, but we could write an
>> awk/sed/perl script to do the job.
>>
>> Another problem is that we internally doesn't necessary want this during
>> bringup, so the question is when to apply it.
>>
>>> I also agree with Tom, its certainly nice to have them regardless, just
>>> perhaps the kernel mainline repo is the wrong place for the kitchen sink.
>>
>> It's not only nice to have, but a must have to publish the headers.
>>
>> We have put a lot of work into changing the workflow at AMD from
>> "releasing only what we have to" toward "releasing everything we can".
>>
>> Loosing that is NOT an option, but I clearly agree that the kernel
>> mainline repo is not necessary the best place for it.
>
> One advantage to having it in the kernel, is that for debugging you donâ??t have to worry about adding a bunch of register defines if you need to print some registers or hack in a quick test.  Another advantage is that when it comes to new developers working on the code, if a register define is not there, that is a good signal that the register is restricted for some reason and the patch needs further internal review and approval before release because it exposes new IP.  If the headers are there, if the code compiles, it's safe from an IP perspective.  It's also nice to have a unified source for the hardware and how to program it.
>
> Alex
>
>

I agree completely here. The register headers also function as the best 
document of how the HW is supposed to behave.

I don't really see any advantage to splitting this into a separate 
repository. As far as I can see, it would create a lot of extra busywork 
with moving things from one place to another, in exchange of a few MB of 
space. That is not a good trade-off.

One of the nicest things that the AMD team has been able to accomplish 
in recent years is getting more and more of their registers open 
publicly. I think moving them to a different repository would be a step 
backwards.

Regards,
Andres

>>
>> Using UMR for this is actually a quite nifty idea if you ask me.
>>
>> Regards,
>> Christian.
>>
>>> Just my thoughts,
>>> Kind Regards,
>>> Edward.
>>>
>>>> Christian.
>>>>
>>>>
>>>>> Tom
>>>>> _______________________________________________
>>>>> amd-gfx mailing list
>>>>> amd-gfx at lists.freedesktop.org
>>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>>>
>>>> _______________________________________________
>>>> amd-gfx mailing list
>>>> amd-gfx at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>>
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>