Re: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Em Wed, 17 Jul 2024 14:07:05 +0000
Shiju Jose <shiju.jose@xxxxxxxxxx> escreveu:

> >-----Original Message-----
> >From: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
> >Sent: 17 July 2024 13:57
> >To: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >Cc: linux-edac@xxxxxxxxxxxxxxx; linux-cxl@xxxxxxxxxxxxxxx; linux-
> >acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >bp@xxxxxxxxx; tony.luck@xxxxxxxxx; rafael@xxxxxxxxxx; lenb@xxxxxxxxxx;
> >mchehab@xxxxxxxxxx; dan.j.williams@xxxxxxxxx; dave@xxxxxxxxxxxx; Jonathan
> >Cameron <jonathan.cameron@xxxxxxxxxx>; dave.jiang@xxxxxxxxx;
> >alison.schofield@xxxxxxxxx; vishal.l.verma@xxxxxxxxx; ira.weiny@xxxxxxxxx;
> >david@xxxxxxxxxx; Vilas.Sridharan@xxxxxxx; leo.duran@xxxxxxx;
> >Yazen.Ghannam@xxxxxxx; rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx;
> >Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
> >naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx; jthoughton@xxxxxxxxxx;
> >somasundaram.a@xxxxxxx; erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx;
> >duenwen@xxxxxxxxxx; mike.malvestuto@xxxxxxxxx; gthelen@xxxxxxxxxx;
> >wschwartz@xxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx;
> >wbs@xxxxxxxxxxxxxxxxxxxxxx; nifan.cxl@xxxxxxxxx; tanxiaofei
> ><tanxiaofei@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>; Roberto
> >Sassu <roberto.sassu@xxxxxxxxxx>; kangkang.shen@xxxxxxxxxxxxx;
> >wanghuiqiang <wanghuiqiang@xxxxxxxxxx>; Linuxarm
> ><linuxarm@xxxxxxxxxx>
> >Subject: Re: [RFC PATCH v9 02/11] EDAC: Add EDAC scrub control driver
> >
> >Em Tue, 16 Jul 2024 16:03:26 +0100
> ><shiju.jose@xxxxxxxxxx> escreveu:
> >  
> >> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >>
> >> Add generic EDAC scrub control driver supports configuring the memory
> >> scrubbers in the system. The device with scrub feature, get the scrub
> >> descriptor from the EDAC scrub and registers with the EDAC RAS feature
> >> driver, which adds the sysfs scrub control interface. The scrub
> >> control attributes are available to the userspace in  
> >/sys/bus/edac/devices/<dev-name>/scrub/.  
> >>
> >> Generic EDAC scrub driver and the common sysfs scrub interface
> >> promotes unambiguous access from the userspace irrespective of the
> >> underlying scrub devices.
> >>
> >> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> >> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> >> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >> ---
> >>  Documentation/ABI/testing/sysfs-edac-scrub |  64 +++++
> >>  drivers/edac/Makefile                      |   2 +-
> >>  drivers/edac/edac_ras_feature.c            |   1 +
> >>  drivers/edac/edac_scrub.c                  | 312 +++++++++++++++++++++
> >>  include/linux/edac_ras_feature.h           |  28 ++
> >>  5 files changed, 406 insertions(+), 1 deletion(-)  create mode 100644
> >> Documentation/ABI/testing/sysfs-edac-scrub
> >>  create mode 100755 drivers/edac/edac_scrub.c
> >>
> >> diff --git a/Documentation/ABI/testing/sysfs-edac-scrub
> >> b/Documentation/ABI/testing/sysfs-edac-scrub
> >> new file mode 100644
> >> index 000000000000..dd19afd5e165
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/sysfs-edac-scrub
> >> @@ -0,0 +1,64 @@
> >> +What:		/sys/bus/edac/devices/<dev-name>/scrub
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		The sysfs edac bus devices /<dev-name>/scrub subdirectory
> >> +		belongs to the memory scrub control feature, where <dev-
> >name>
> >> +		directory corresponds to a device/memory region registered
> >> +		with the edac scrub driver and thus registered with the
> >> +		generic edac ras driver too.
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/scrub/addr_range_base
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RW) The base of the address range of the memory region
> >> +		to be scrubbed (on-demand scrubbing).
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/scrub/addr_range_size
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RW) The size of the address range of the memory region
> >> +		to be scrubbed (on-demand scrubbing).
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/scrub/enable_background
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RW) Start/Stop background(patrol) scrubbing if supported.
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/scrub/enable_on_demand
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RW) Start/Stop on-demand scrubbing the memory region
> >> +		if supported.  
> >
> >This is a generic comment for all sysfs calls: what happens if not supported?
> >
> >There are a couple of ways to implement it, like:
> >
> >1. Don't create the attribute;
> >2. return an error code (-ENOENT? -EINVAL?) if trying to read or
> >   write to the devnode - please detail the used error code(s);
> >
> >In any case, please define the behavior and document it.
> >
> >From what I see, you're setting 0x444 on RW nodes when write is not enabled,
> >but still it is possible to not have RO supported. This is specially true as
> >technology evolves, as memory controllers and different types of memories may
> >have very different ways to control it[1].  
> 
> It is not true. If the parent device does not support and define callbacks for both read and write,
> then return 0 as you can see in the scrub_attr_visible() and the attribute
> would not be present for that device in the sysfs.
> For e.g. attributes addr_range_base and addr_range_size does not support by CXL patrol
> scrub feature, but supported by ACPI RAS2 scrub feature.  
> >
> >[1] If you're curious enough, one legacy example of memories
> >    implemented on a very different way was Fully Buffered DIMMs
> >    where each DIMM had its own internal chipset to offload
> >    certain tasks, including scrubbing and ECC implementation.
> >    It ended not being succeeded long term, as it required
> >    special DIMMs for server's market, reducing the production
> >    scale, but it is an interesting example about how hardware
> >    designs could be innovative breaking existing paradigms.
> >    The FB-DIMM design actually forced a redesign at the EDAC
> >    subsystem, as it was too centered on how an specific type
> >    of memory controllers.
> >  
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-name>/scrub/name
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RO) name of the memory scrubber
> >> +  
> >
> >  
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/scrub/cycle_in_hours_available
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RO) Supported range for the scrub cycle in hours by the
> >> +		memory scrubber.
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/scrubin_hours
> >> +Date:		Oct 2024
> >> +KernelVersion:	6.12
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RW) The scrub cycle in hours specified and it must be with in  
> >the  
> >> +		supported range by the memory scrubber.  
> >
> >Why specifying it in hours? I would use seconds, as it is easy to represent one
> >hour as 3600 seconds, but you can't specify a cycle of, let's say, 30min, if the
> >minimum range value is one hour.  
> For the CXL patrol scrub, scrub cycle defined in hours(CXL spec 3.1 Table 8-208. Device Patrol Scrub
> Control Feature Writable Attributes), but ACPI RAS2 does not define the unit for the scrub cycle.
> Thus proposed  represent scrub cycle in hours in common.

I understand that the final goal of this series is to have CXL exported
via sysfs, but this patch is not binding the scrub to CXL. Instead, it
is placing it on a generic location:

	/sys/bus/edac/devices/<dev-name>/scrub

So, it doesn't make sense to bind it to CXL 3.1 spec.

> Not sure how convenient to set the scrub cycle in seconds from the user perspective and


[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux