Hi Sascha,

On 13/10/2020 13:50, Sascha Hauer wrote:
> The Cortex A53 and A57 cores have error detection capabilities for the
> L1/L2 Caches, this patch adds a driver for them.
>
> Unfortunately there is no robust way to inject errors into the caches,
> so this driver doesn't contain any code to actually test it. It has
> been tested though with code taken from an older version of this driver
> found here: https://lkml.org/lkml/2018/3/14/1203. For reasons stated
> in this thread the error injection code is not suitable for mainline,
> so it is removed from the driver.

> diff --git a/drivers/edac/cortex_arm64_l1_l2.c b/drivers/edac/cortex_arm64_l1_l2.c
> new file mode 100644
> index 000000000000..fb8386eb40ac
> --- /dev/null
> +++ b/drivers/edac/cortex_arm64_l1_l2.c
> @@ -0,0 +1,208 @@

> +static void read_errors(void *data)
> +{
> +	struct edac_device_ctl_info *edac_ctl = data;
> +	int cpu = smp_processor_id();
> +	char msg[MESSAGE_SIZE];
> +	u64 cpumerr, l2merr;
> +
> +	/* cpumerrsr_el1 */
> +	asm volatile("mrs %0, s3_1_c15_c2_2" : "=r" (cpumerr));
> +	asm volatile("msr s3_1_c15_c2_2, %0" :: "r" (0));

I think you've seen earlier comments on using the sys_reg macros for
this. There were versions of binutils out there that choke on this.

[...]

> +}
> +
> +static void cortex_arm64_edac_check(struct edac_device_ctl_info *edac_ctl)
> +{
> +	struct arm64_pvt *pvt = edac_ctl->pvt_info;
> +	call_single_data_t *csd;
> +	int cpu;
> +
> +	get_online_cpus();
> +	for_each_cpu_and(cpu, cpu_online_mask, &pvt->compat_mask) {
> +		csd = per_cpu_ptr(pvt->csd_check, cpu);
> +		csd->func = read_errors;
> +		csd->info = edac_ctl;
> +		csd->flags = 0;
> +		/* Read CPU L1/L2 errors */
> +		smp_call_function_single_async(cpu, csd);
> +		/* Wait until flags cleared */
> +		smp_cond_load_acquire(&csd->flags, !VAL);

Hmm. We end up waiting for each CPU to schedule something else. I can't
see any reason we can't sleep here.

Can't we use smp_call_function_many() here? It already considers
cpu_online_mask, you'd just need to deal with read_errors() being
called in parallel with itself.

(Concurrent calls into edac are one problem, but two CPUs reading and
writing the same L2 register could lead to double counting.)

> +	}
> +	put_online_cpus();
> +}

> +static int cortex_arm64_edac_probe(struct platform_device *pdev)
> +{
> +	struct device_node *np, *dn = pdev->dev.of_node;
> +	struct edac_device_ctl_info *edac_ctl;
> +	struct device *dev = &pdev->dev;
> +	struct of_phandle_iterator it;
> +	struct arm64_pvt *pvt;
> +	int rc, cpu;
> +
> +	edac_ctl = edac_device_alloc_ctl_info(sizeof(*pvt), "cpu_cache",
> +					      1, "L", 2, 1, NULL, 0,
> +					      edac_device_alloc_index());

I used this series on Juno to poke the user-space interface: this
chokes on a big/little system as it can't register "cpu_cache" a
second time.

I think we should try to make the topology look like the one in
edac_device.h. This means calling it 'cpu' and registering all of them
up front. On a big/little system the second probe() call would need to
be careful. I can have a go at this if you don't have a platform to
hand.

(The 'L2-cache' thing in edac_device.h turns out to be impossible, and
the 'Lx' you've done here is the most popular option. I'll post a patch
to change the documentation to match what people are doing.)

[...]

> +}

Thanks,

James
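
P.S. Some rough, completely untested sketches of what I meant above, in
case they help.

For the sysreg access, something like this (the SYS_* names below are
made up, they aren't in sysreg.h):

/* IMP DEF Cortex-A53/A57 error record registers, names invented here */
#define SYS_CPUMERRSR_EL1	sys_reg(3, 1, 15, 2, 2)
#define SYS_L2MERRSR_EL1	sys_reg(3, 1, 15, 2, 3)

	/* read and clear the L1 error record */
	cpumerr = read_sysreg_s(SYS_CPUMERRSR_EL1);
	write_sysreg_s(0, SYS_CPUMERRSR_EL1);

read_sysreg_s()/write_sysreg_s() emit the encoding themselves, which
should keep old binutils that don't understand s3_1_c15_c2_2 out of
the picture.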
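
For the check function, with smp_call_function_many() it could look
roughly like the below. read_errors() then runs on all the CPUs at the
same time, so the shared L2 register and the edac reporting would need
a lock (that's the double-counting problem above):

static void cortex_arm64_edac_check(struct edac_device_ctl_info *edac_ctl)
{
	struct arm64_pvt *pvt = edac_ctl->pvt_info;

	get_online_cpus();
	preempt_disable();

	/* run read_errors() on every online CPU in the mask but this one */
	smp_call_function_many(&pvt->compat_mask, read_errors, edac_ctl, true);

	/* smp_call_function_many() skips the local CPU */
	if (cpumask_test_cpu(smp_processor_id(), &pvt->compat_mask))
		read_errors(edac_ctl);

	preempt_enable();
	put_online_cpus();
}

(or on_each_cpu_mask(), which does the local-CPU dance for you)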
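
For the sysfs layout, registering everything up front would be roughly
the below, one instance per possible CPU just to show the shape; the
second probe() on a big/little system would still need handling:

	edac_ctl = edac_device_alloc_ctl_info(sizeof(*pvt), "cpu",
					      num_possible_cpus(), "L", 2, 1,
					      NULL, 0,
					      edac_device_alloc_index());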