RE: [PATCH v8 2/2] EDAC/versal: Add a Xilinx Versal memory controller driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[AMD Official Use Only - General]

HI Boris ,

Thanks for the review.

> -----Original Message-----
> From: Borislav Petkov <bp@xxxxxxxxx>
> Sent: Tuesday, September 19, 2023 10:34 PM
> To: Datta, Shubhrajyoti <shubhrajyoti.datta@xxxxxxx>
> Cc: linux-edac@xxxxxxxxxxxxxxx; git (AMD-Xilinx) <git@xxxxxxx>;
> devicetree@xxxxxxxxxxxxxxx; Potthuri, Sai Krishna
> <sai.krishna.potthuri@xxxxxxx>; krzysztof.kozlowski@xxxxxxxxxx;
> robh+dt@xxxxxxxxxx; conor+dt@xxxxxxxxxx; tony.luck@xxxxxxxxx;
> james.morse@xxxxxxx; mchehab@xxxxxxxxxx; rric@xxxxxxxxxx; Simek,
> Michal <michal.simek@xxxxxxx>
> Subject: Re: [PATCH v8 2/2] EDAC/versal: Add a Xilinx Versal memory
> controller driver
>
> Caution: This message originated from an External Source. Use proper
> caution when opening attachments, clicking links, or responding.
>
>
> On Fri, Aug 04, 2023 at 05:49:24PM +0530, Shubhrajyoti Datta wrote:
> > +config EDAC_VERSAL
> > +     tristate "Xilinx Versal DDR Memory Controller"
> > +     depends on ARCH_ZYNQMP || COMPILE_TEST
> > +     help
> > +       Support for error detection and correction on the Xilinx Versal DDR
> > +       memory controller.
> > +
> > +       Report both single bit errors (CE) and double bit errors (UE).
> > +       Support injecting both correctable and uncorrectable errors for debug
> > +       purpose using sysfs entries.
>
>           Report both single bit errors (CE) and double bit errors (UE).
>           Support injecting both correctable and uncorrectable errors
>           for debugging purposes.
>
Will do.
>
> > +/**
> > + * struct ecc_error_info - ECC error log information.
> > + * @burstpos:                Burst position.
> > + * @lrank:           Logical Rank number.
> > + * @rank:            Rank number.
> > + * @group:           Group number.
> > + * @bank:            Bank number.
> > + * @col:             Column number.
> > + * @row:             Row number.
> > + * @rowhi:           Row number higher bits.
> > + * @i:                       ECC error info.
> > + */
> > +union ecc_error_info {
> > +     struct {
> > +             u64 burstpos:3;
> > +             u64 lrank:3;
> > +             u64 rank:2;
> > +             u64 group:2;
> > +             u64 bank:2;
> > +             u64 col:10;
> > +             u64 row:10;
> > +             u32 rowhi;
> > +     };
> > +     u64 i;
> > +};
>
> I missed this the last time but, if this is supposed to be a union of sizeof(u64),
> why aren't you doing this:

Will update
>
> union ecc_error_info {
>         struct {
>                 u32 burstpos:3;
>                 u32 lrank:3;
>                 u32 rank:2;
>                 u32 group:2;
>                 u32 bank:2;
>                 u32 col:10;
>                 u32 row:10;
>                 u32 rowhi;
>         };
>         u64 i;
> } __packed;
>
> i.e., the struct should have two u32's - the first one is the bitfield and the
> second one is rowhi and the "overloaded" one is the u64 i.
>
> And also it should be packed even though this shouldn't be needed in this
> case but still.
>
> Ditto for the other unions.
>
> > +
> > +union edac_info {
> > +     struct {
> > +             u32 row0:6;
> > +             u32 row1:6;
> > +             u32 row2:6;
> > +             u32 row3:6;
> > +             u32 row4:6;
> > +             u32 reserved:2;
> > +     };
> > +     struct {
> > +             u32 col1:6;
> > +             u32 col2:6;
> > +             u32 col3:6;
> > +             u32 col4:6;
> > +             u32 col5:6;
> > +             u32 reservedcol:2;
> > +     };
> > +     u32 i;
> > +};
>
> ...
>
> > +     /* Unlock the PCSR registers */
> > +     writel(PCSR_UNLOCK_VAL, ddrmc_base + XDDR_PCSR_OFFSET);
> > +
> > +     writel(0, ddrmc_base + ECCR0_CERR_STAT_OFFSET);
> > +     writel(0, ddrmc_base + ECCR1_CERR_STAT_OFFSET);
> > +     writel(0, ddrmc_base + ECCR0_UERR_STAT_OFFSET);
> > +     writel(0, ddrmc_base + ECCR1_UERR_STAT_OFFSET);
> > +
> > +     /* Lock the PCSR registers */
> > +     writel(1, ddrmc_base + XDDR_PCSR_OFFSET);
>
> I still don't know what this locking and unlocking is all about and why it is
> needed...
The entire register space is locked by default
https://docs.xilinx.com/r/en-US/am012-versal-register-reference/PCSR_LOCK-XRAM_SLCR-Register
We have  write 0xF9E8D7C6 to unlock.

>
> Btw, you must always make sure your stuff builds:
>
> drivers/edac/versal_edac.c: In function 'mc_remove':
> drivers/edac/versal_edac.c:1035:9: error: implicit declaration of function
> 'disable_intr'; did you mean 'disable_irq'? [-Werror=implicit-function-
> declaration]
>  1035 |         disable_intr(priv);
>       |         ^~~~~~~~~~~~
>       |         disable_irq
>
>
> Otherwise it looks to me like you haven't tested it and if that is the case, I'll
> simply ignore it.

The function is defined under CONFIG_EDAC_DEBUG I had it enabled to be able
To inject errors .

I should have also checked disabling it.
I will fix in the next version.
>
> So make sure you build-test and test your stuff before submitting.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette




[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux