Am 21.03.2017 um 09:45 schrieb Edward O'Callaghan: > > On 03/21/2017 05:36 PM, Christian König wrote: >> Am 21.03.2017 um 00:38 schrieb Tom St Denis: >>> On 03/20/2017 06:34 PM, Jan Ziak wrote: >>>> On Mon, Mar 20, 2017 at 10:41 PM, Alex Deucher <alexdeucher at gmail.com >>>> <mailto:alexdeucher at gmail.com>> wrote: >>>> >>>> On Mon, Mar 20, 2017 at 5:36 PM, Jan Ziak <0xe2.0x9a.0x9b at gmail.com >>>> <mailto:0xe2.0x9a.0x9b at gmail.com>> wrote: >>>> > Hi >>>> > >>>> > >>>> https://cgit.freedesktop.org/~agd5f/linux/plain/drivers/gpu/drm/amd/include/asic_reg/vega10/NBIO/nbio_6_1_sh_mask.h?h=amd-staging-4.9&id=9555ef0ba926df25d9a637d0ea21bc0d231c21d2 >>>> >>>> <https://cgit.freedesktop.org/~agd5f/linux/plain/drivers/gpu/drm/amd/include/asic_reg/vega10/NBIO/nbio_6_1_sh_mask.h?h=amd-staging-4.9&id=9555ef0ba926df25d9a637d0ea21bc0d231c21d2> >>>> >>>> > >>>> > The file nbio_6_1_sh_mask.h is uncompressed. It consists from >>>> 133884 lines. >>>> > Only generated C/C++ code will be able to utilize the content >>>> of such a file >>>> > efficiently. All hand-written codes combined will be able to >>>> utilize about >>>> > 1% of the file. >>>> > >>>> > Is there a reason why nbio_6_1_sh_mask.h is huge? >>>> >>>> That IP block contains a lot of registers. The idea is to open >>>> source >>>> as much IP as possible to facilitate debugging, new features, etc. >>>> >>>> Alex >>>> >>>> >>>> [This email contains long/wide lines and should be viewed on a >>>> sufficiently wide screen] >>>> >>>> For example if I open the file in vim and go to line 66952: >>>> >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_1_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> >>>> Then abstracting away some of the digits used in the defined identifier >>>> and using egrep: >>>> >>>> $ egrep >>>> "\<DWC_E12MP_PHY_X._NS_X._._LANE._DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_.__DTB_SEL__SHIFT\>" >>>> >>>> nbio_6_1_sh_mask.h >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_0_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_0_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_0_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_0_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_0_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_1_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_1_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_1_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_1_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_1_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_2_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_2_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_2_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_2_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_2_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_3_LANE0_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_3_LANE1_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_3_LANE2_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_3_LANE3_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> #define >>>> DWC_E12MP_PHY_X4_NS_X4_3_LANEX_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_1__DTB_SEL__SHIFT >>>> >>>> 0x9 >>>> >>>> The egrep command produced 20 lines. >>>> >>>> Instead of the many #define directives, it is a possibility to define >>>> functions such as: >>>> >>>> int >>>> DWC_E12MP_PHY_Xa_NS_Xb_c_LANEd_DIG_RX_VCOCAL_RX_VCO_CAL_CTRL_e__DTB_SEL__SHIFT(int >>>> >>>> a, int b, int c, int d, int e) __attribute__((pure)); >>>> >>>> I suppose the file nbio_6_1_sh_mask.h is the output of a tool (it is a >>>> generated file). It is an option to modify the tool to output C >>>> functions with proper input guards instead of #define directives. >>> The registers are generated by the HW team and then filtered down to >>> become what we release publicly. It is machine generated very likely >>> from the RTL that specifies the hardware itself. >> Yes, exactly that. And it was actually quite some work to get to this >> point. >> >>> Generally speaking if a class of registers share masks/offsets the >>> lowest (zero'th) is used in programming and offsets are used when >>> selecting the correct MMIO address to use specific instances. >> The problem is that the files we get from the HW team describe the >> register block already broken down to the memory mappings. E.g. when an >> RTL block is instantiated N times you get N times the same definition. >> >>> Having these enumerated though is handy for tools like UMR which would >>> decode to the correct instance of the register (you could even see >>> that by watching the logscan via umr). So we make use of them fairly >>> efficiently. UMR reads the headers to create the arrays of >>> registers/bitfields which if they were computed at runtime (via helper >>> functions) would be harder to parse (and could amount to the halting >>> problem...). >>> >>> What is in the NBIO header today is what was IP cleared but in theory >>> it could be filtered down more simply to reduce the unused LOCs in >>> future kernels. >>> >>> They don't make for good bedtime reading but imho you'd rather have >>> more headers not less :-) >> Manually merging that back into single definitions is not really an >> option, but what we could do is releasing the full blown headers >> somewhere and only limit what we push into the linux kernel to actually >> used ones. > Hi Christian, > > I have to say I prefer your idea here and perhaps they could even become > part of the UMR repo, not sure? May I suggest the reduction to some like > a `kernel-native-dialect` could be automated with a clang plugin. It > seems like heaps of work however maybe not really, clang provides a API > to grammatically do some AST transformers and auto-rewrite large chunks > of code given some rules. Using clang sounds like overkill to me, but we could write an awk/sed/perl script to do the job. Another problem is that we internally doesn't necessary want this during bringup, so the question is when to apply it. > I also agree with Tom, its certainly nice to have them regardless, just > perhaps the kernel mainline repo is the wrong place for the kitchen sink. It's not only nice to have, but a must have to publish the headers. We have put a lot of work into changing the workflow at AMD from "releasing only what we have to" toward "releasing everything we can". Loosing that is NOT an option, but I clearly agree that the kernel mainline repo is not necessary the best place for it. Using UMR for this is actually a quite nifty idea if you ask me. Regards, Christian. > Just my thoughts, > Kind Regards, > Edward. > >> Christian. >> >> >>> Tom >>> _______________________________________________ >>> amd-gfx mailing list >>> amd-gfx at lists.freedesktop.org >>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx at lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx