[AMD Official Use Only - General] HI Boris , Thanks for the review. > -----Original Message----- > From: Borislav Petkov <bp@xxxxxxxxx> > Sent: Tuesday, September 19, 2023 10:34 PM > To: Datta, Shubhrajyoti <shubhrajyoti.datta@xxxxxxx> > Cc: linux-edac@xxxxxxxxxxxxxxx; git (AMD-Xilinx) <git@xxxxxxx>; > devicetree@xxxxxxxxxxxxxxx; Potthuri, Sai Krishna > <sai.krishna.potthuri@xxxxxxx>; krzysztof.kozlowski@xxxxxxxxxx; > robh+dt@xxxxxxxxxx; conor+dt@xxxxxxxxxx; tony.luck@xxxxxxxxx; > james.morse@xxxxxxx; mchehab@xxxxxxxxxx; rric@xxxxxxxxxx; Simek, > Michal <michal.simek@xxxxxxx> > Subject: Re: [PATCH v8 2/2] EDAC/versal: Add a Xilinx Versal memory > controller driver > > Caution: This message originated from an External Source. Use proper > caution when opening attachments, clicking links, or responding. > > > On Fri, Aug 04, 2023 at 05:49:24PM +0530, Shubhrajyoti Datta wrote: > > +config EDAC_VERSAL > > + tristate "Xilinx Versal DDR Memory Controller" > > + depends on ARCH_ZYNQMP || COMPILE_TEST > > + help > > + Support for error detection and correction on the Xilinx Versal DDR > > + memory controller. > > + > > + Report both single bit errors (CE) and double bit errors (UE). > > + Support injecting both correctable and uncorrectable errors for debug > > + purpose using sysfs entries. > > Report both single bit errors (CE) and double bit errors (UE). > Support injecting both correctable and uncorrectable errors > for debugging purposes. > Will do. > > > +/** > > + * struct ecc_error_info - ECC error log information. > > + * @burstpos: Burst position. > > + * @lrank: Logical Rank number. > > + * @rank: Rank number. > > + * @group: Group number. > > + * @bank: Bank number. > > + * @col: Column number. > > + * @row: Row number. > > + * @rowhi: Row number higher bits. > > + * @i: ECC error info. > > + */ > > +union ecc_error_info { > > + struct { > > + u64 burstpos:3; > > + u64 lrank:3; > > + u64 rank:2; > > + u64 group:2; > > + u64 bank:2; > > + u64 col:10; > > + u64 row:10; > > + u32 rowhi; > > + }; > > + u64 i; > > +}; > > I missed this the last time but, if this is supposed to be a union of sizeof(u64), > why aren't you doing this: Will update > > union ecc_error_info { > struct { > u32 burstpos:3; > u32 lrank:3; > u32 rank:2; > u32 group:2; > u32 bank:2; > u32 col:10; > u32 row:10; > u32 rowhi; > }; > u64 i; > } __packed; > > i.e., the struct should have two u32's - the first one is the bitfield and the > second one is rowhi and the "overloaded" one is the u64 i. > > And also it should be packed even though this shouldn't be needed in this > case but still. > > Ditto for the other unions. > > > + > > +union edac_info { > > + struct { > > + u32 row0:6; > > + u32 row1:6; > > + u32 row2:6; > > + u32 row3:6; > > + u32 row4:6; > > + u32 reserved:2; > > + }; > > + struct { > > + u32 col1:6; > > + u32 col2:6; > > + u32 col3:6; > > + u32 col4:6; > > + u32 col5:6; > > + u32 reservedcol:2; > > + }; > > + u32 i; > > +}; > > ... > > > + /* Unlock the PCSR registers */ > > + writel(PCSR_UNLOCK_VAL, ddrmc_base + XDDR_PCSR_OFFSET); > > + > > + writel(0, ddrmc_base + ECCR0_CERR_STAT_OFFSET); > > + writel(0, ddrmc_base + ECCR1_CERR_STAT_OFFSET); > > + writel(0, ddrmc_base + ECCR0_UERR_STAT_OFFSET); > > + writel(0, ddrmc_base + ECCR1_UERR_STAT_OFFSET); > > + > > + /* Lock the PCSR registers */ > > + writel(1, ddrmc_base + XDDR_PCSR_OFFSET); > > I still don't know what this locking and unlocking is all about and why it is > needed... The entire register space is locked by default https://docs.xilinx.com/r/en-US/am012-versal-register-reference/PCSR_LOCK-XRAM_SLCR-Register We have write 0xF9E8D7C6 to unlock. > > Btw, you must always make sure your stuff builds: > > drivers/edac/versal_edac.c: In function 'mc_remove': > drivers/edac/versal_edac.c:1035:9: error: implicit declaration of function > 'disable_intr'; did you mean 'disable_irq'? [-Werror=implicit-function- > declaration] > 1035 | disable_intr(priv); > | ^~~~~~~~~~~~ > | disable_irq > > > Otherwise it looks to me like you haven't tested it and if that is the case, I'll > simply ignore it. The function is defined under CONFIG_EDAC_DEBUG I had it enabled to be able To inject errors . I should have also checked disabling it. I will fix in the next version. > > So make sure you build-test and test your stuff before submitting. > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette