On Thu, Feb 27, 2025 at 11:32:10AM +0000, Datta, Shubhrajyoti wrote: > > > +union edac_info { > > > > What is an "edac_info"? > This is the row and column positions. > We are using it to extract the position from the address decoder registers. Needs a better name which is descriptive as to how it is used or what it represents. > > > +static void get_ddr_ue_error_info(u32 error_data[REGS_PER_CONTROLLER], > > struct edac_priv *priv) > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > What is that for? > > > The error_data contains the register values. Linux does not have access to the DDRMC register > Space. It queries it from the NMC and gets the the data from the rpmsg callback. I wasn't clear. Arrays in C are passed as pointers - the compiler does that anyway. You don't have to do this weird parameter specification. > > > + mci->edac_cap = EDAC_FLAG_SECDED; > > > + mci->ctl_name = "amd_ddr_controller"; > > > + mci->dev_name = dev_name(&pdev->dev); > > > + mci->mod_name = "amd_edac"; > > > > Do: > > > > git grep mod_name drivers/edac/ > > > > to get an idea how those names are chosen. > #define EDAC_MOD_STR "r82600_edac" > mci->mod_name = EDAC_MOD_STR; > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/edac/r82600_edac.c?h=v6.14-rc4#n316 > > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/edac/i5000_edac.c?h=v6.14-rc4#n1424 > mci->mod_name = "i5000_edac.c"; > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/edac/highbank_mc_edac.c?h=v6.14-rc4#n218 > mci->mod_name = pdev->dev.driver->name; > > let me know if mci->mod_name = pdev->dev.driver->name; is fine. I think you didn't get me again. amd64_edac.c - the x86 driver is called this: #define EDAC_MOD_STR "amd64_edac" Calling yours "amd_edac" doesn't work. "versalnet_edac"? That's probably better. > > You don't need "inline" - the compiler can decide that itself. And > > "process_bit" needs a better name. > > Will rename it to populate_row_bit Or "assign_row_bit" or whatever. The function name should be describing what the function does as closely as possible. > > Why are those functions copying stuff around? Why aren't you using cols > > directly? > > The column bit position is used in converting to the physical address. > We read it at init and use it every time an error occurs to get the address. > Did you mean to remove the regval. Or read the error_data every time. I mean simply use cols instead of assigning stuff to priv->col_bit* and then using that. > > Why is probing a work item? > > > > Explaining *that* is what a commit message is for - not for repeating useless > > info. > The RPMsg probe is invoked from a thread within the virtio driver responsible > for processing the response ring. If the probe initiates an mcdi API call, it blocks > until the mcdi response is received. However, since the mcdi response is also processed > by the same thread that triggered the rpmsg probe, the thread remains blocked, > preventing it from handling the response. This results in a deadlock. > > To prevent it we have a work scheduled. This is just insane. I don't see anything in amd_setup_mcdi() that needs some response from some mcdi thing. If not, you don't need the work queue thing. > > > > > + for (i = 0; i < NUM_CONTROLLERS; i++) { > > > + config = priv->adec[CONF + i]; > > > + num_chans = FIELD_GET(DDRMC5_NUM_CHANS_MASK, config); > > > + rank = FIELD_GET(DDRMC5_RANK_MASK, config); > > > + rank = 1 << rank; > > > + dwidth = FIELD_GET(DDRMC5_BUS_WIDTH_MASK, config); > > > + dt = get_dwidth(dwidth); > > > + if (dt != DEV_UNKNOWN) > > > + break; > > > + } > > > > What is that loop supposed to do? Find the last controller before the one > > with DEV_UNKNOWN device width and register that one? > > There are 8 controllers all we try to get the first one that is enabled and register that one. > We use the device unknown to know if that is enabled or not. The first one that is enabled has unknown device width? What? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette