On Wed, Oct 09, 2024 at 01:41:03PM +0100, shiju.jose@xxxxxxxxxx wrote: > diff --git a/Documentation/ABI/testing/sysfs-edac-scrub b/Documentation/ABI/testing/sysfs-edac-scrub > new file mode 100644 > index 000000000000..b4f8f0bba17b > --- /dev/null > +++ b/Documentation/ABI/testing/sysfs-edac-scrub > @@ -0,0 +1,69 @@ > +What: /sys/bus/edac/devices/<dev-name>/scrubX > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + The sysfs EDAC bus devices /<dev-name>/scrubX subdirectory > + belongs to an instance of memory scrub control feature, > + where <dev-name> directory corresponds to a device/memory > + region registered with the EDAC device driver for the > + scrub control feature. > + The sysfs scrub attr nodes would be present only if the > + client driver has implemented the corresponding attr > + callback function and passed in ops to the EDAC RAS feature > + driver during registration. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/addr_range_base > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RW) The base of the address range of the memory region > + to be scrubbed (on-demand scrubbing). Why does this sound more complicated than it is? Why isn't this simply "addr" and the next one "size"? On-demand scrubbing should scrub at address "addr" and "size" bytes. Can't get any simpler than that. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/addr_range_size > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RW) The size of the address range of the memory region > + to be scrubbed (on-demand scrubbing). > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/enable_background > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RW) Start/Stop background(patrol) scrubbing if supported. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/enable_on_demand > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RW) Start/Stop on-demand scrubbing the memory region > + if supported. Why do you need a separate "enable" flag? Why can't it be: "writing into "addr" starts the on-demand scrubbing"? > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/min_cycle_duration > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RO) Supported minimum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/max_cycle_duration > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RO) Supported maximum scrub cycle duration in seconds > + by the memory scrubber. > + > +What: /sys/bus/edac/devices/<dev-name>/scrubX/current_cycle_duration > +Date: Oct 2024 > +KernelVersion: 6.12 > +Contact: linux-edac@xxxxxxxxxxxxxxx > +Description: > + (RW) The current scrub cycle duration in seconds and must be > + within the supported range by the memory scrubber. What are those three good for and why are they exposed? > diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile > index 4edfb83ffbee..a96a74de8b9e 100644 > --- a/drivers/edac/Makefile > +++ b/drivers/edac/Makefile > @@ -10,6 +10,7 @@ obj-$(CONFIG_EDAC) := edac_core.o > > edac_core-y := edac_mc.o edac_device.o edac_mc_sysfs.o > edac_core-y += edac_module.o edac_device_sysfs.o wq.o > +edac_core-y += scrub.o > > edac_core-$(CONFIG_EDAC_DEBUG) += debugfs.o > > diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c > index 0b8aa8150239..0c9da55d18bc 100644 > --- a/drivers/edac/edac_device.c > +++ b/drivers/edac/edac_device.c > @@ -576,6 +576,7 @@ static void edac_dev_release(struct device *dev) > { > struct edac_dev_feat_ctx *ctx = container_of(dev, struct edac_dev_feat_ctx, dev); > > + kfree(ctx->scrub); > kfree(ctx->dev.groups); > kfree(ctx); > } > @@ -610,7 +611,9 @@ int edac_dev_register(struct device *parent, char *name, > const struct edac_dev_feature *ras_features) > { > const struct attribute_group **ras_attr_groups; > + struct edac_dev_data *dev_data; > struct edac_dev_feat_ctx *ctx; > + int scrub_cnt = 0, scrub_inst = 0; > int attr_gcnt = 0; > int ret, feat; The EDAC tree preferred ordering of variable declarations at the beginning of a function is reverse fir tree order:: struct long_struct_name *descriptive_name; unsigned long foo, bar; unsigned int tmp; int ret; The above is faster to parse than the reverse ordering:: int ret; unsigned int tmp; unsigned long foo, bar; struct long_struct_name *descriptive_name; And even more so than random ordering:: unsigned long foo, bar; int ret; struct long_struct_name *descriptive_name; unsigned int tmp; > > @@ -620,7 +623,10 @@ int edac_dev_register(struct device *parent, char *name, > /* Double parse to make space for attributes */ > for (feat = 0; feat < num_features; feat++) { > switch (ras_features[feat].ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + attr_gcnt++; > + scrub_cnt++; > + break; > default: > return -EINVAL; > } > @@ -639,13 +645,36 @@ int edac_dev_register(struct device *parent, char *name, > goto ctx_free; > } > > + if (scrub_cnt) { > + ctx->scrub = kcalloc(scrub_cnt, sizeof(*ctx->scrub), GFP_KERNEL); > + if (!ctx->scrub) { > + ret = -ENOMEM; > + goto groups_free; > + } > + } > + > attr_gcnt = 0; > for (feat = 0; feat < num_features; feat++, ras_features++) { > switch (ras_features->ft_type) { > - /* Add feature specific code */ > + case RAS_FEAT_SCRUB: > + if (!ras_features->scrub_ops) > + continue; Continue? I think fail. What is a scrub feature good for if it doesn't have ops? > + if (scrub_inst != ras_features->instance) > + goto data_mem_free; > + dev_data = &ctx->scrub[scrub_inst]; > + dev_data->instance = scrub_inst; > + dev_data->scrub_ops = ras_features->scrub_ops; > + dev_data->private = ras_features->ctx; > + ret = edac_scrub_get_desc(parent, &ras_attr_groups[attr_gcnt], > + ras_features->instance); > + if (ret) > + goto data_mem_free; > + scrub_inst++; > + attr_gcnt++; > + break; > default: > ret = -EINVAL; > - goto groups_free; > + goto data_mem_free; > } > } > ... -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette