On Fri, Sep 15, 2023 at 10:29 AM <shiju.jose@xxxxxxxxxx> wrote:
>
> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>
> Add sysfs documentation entries for the set of attributes those are
> exposed in /sys/class/scrub/ by the scrub driver. These attributes
> support configuring parameters of a scrub device.
>
> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
> ---
>  .../ABI/testing/sysfs-class-scrub-configure | 82 +++++++++++++++++++
>  1 file changed, 82 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-class-scrub-configure
>
> diff --git a/Documentation/ABI/testing/sysfs-class-scrub-configure b/Documentation/ABI/testing/sysfs-class-scrub-configure
> new file mode 100644
> index 000000000000..347e2167dc62
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-class-scrub-configure
> @@ -0,0 +1,82 @@
> +What: /sys/class/scrub/
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + The scrub/ class subdirectory belongs to the
> + scrubber subsystem.
> +
> +What: /sys/class/scrub/scrubX/
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + The /sys/class/scrub/scrub{0,1,2,3,...} directories

This API (sysfs interface) is very specific to the ACPI interface
defined for the hardware patrol scrubber. I wonder whether we can have
an interface that is more generic, for a couple of reasons:

1. I am not aware of any chip/platform hardware that implements the
hardware patrol scrub part defined in the ACPI RASF/RAS2 spec, so I am
curious what the RAS experts from different hardware vendors think
about this. For example, Tony and Dave from Intel, Jon and Vilas from
AMD. Is there any hardware platform (if allowed to disclose) that
implements ACPI RASF/RAS2? If so, will vendors continue to support
control of the patrol scrubber using the ACPI spec? If not (as Tony
said in [1]), will vendors consider starting with some future platform?
If we are unlikely to get vendor support, creating this ACPI-specific
sysfs API (and the driver implementations) in Linux seems to have
limited value.

> + correspond to each scrub device.
> +
> +What: /sys/class/scrub/scrubX/name
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + (RO) name of the memory scrub device
> +
> +What: /sys/class/scrub/scrubX/regionY/

2. I believe the concept of "region" here probably comes from
PATROL_SCRUB as defined in "ACPI section 5.2.20.5. Parameter Block".
It is indeed powerful: if a process's physical memory spans multiple
memory controllers, the OS can in theory scrub just the chunks of
memory belonging to that process. However, from a previous discussion
[1], "From a h/w perspective it might always be complex". IIUC, the
address translation from physical address to channel address is hard
to achieve, and that is probably one of the technical reasons the
patrol scrub ACPI spec is not implemented in practice?

So my take is that control at the granularity of the memory controller
is probably a nice compromise. Both the OS and userspace get a pretty
decent amount of flexibility (start, stop, and adjust the speed of
scrubbing on a memory controller), while it doesn't impose too much
pain on hardware vendors when they provide these features in hardware.
In terms of how these controls/features will be implemented, I imagine
it could be:
* via hardware registers that directly or indirectly control scrubbing
  on the memory controllers
* via ACPI, if the situation changes in 10 years (and the
  RASF/RAS2/PCC drivers implemented in this patchset can be plugged in
  directly)
* via a kernel thread that uses CPU reads to detect memory errors, if
  hardware support is unavailable or not good enough

Given these possible scrubbing backends, I think a more generic sysfs
API that covers and abstracts them will be more valuable right now.
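
To make the "abstracts these backends" idea a bit more concrete, here
is a rough userspace model in Python. It is only a sketch under my own
assumptions: every class, method and attribute name below
(ScrubBackend, MemoryControllerScrub, store/show, "enable", "speed",
"backend") is hypothetical and not taken from the patchset, and an
in-kernel version would be a C ops structure rather than classes. The
point is just that the per-controller front-end stays the same
whichever of the three backends sits behind it.

```python
from abc import ABC, abstractmethod


class ScrubBackend(ABC):
    """Hypothetical contract each scrub backend would implement."""

    name = "abstract"

    def __init__(self, min_rate, max_rate):
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.log = []  # records what the backend was asked to do

    @abstractmethod
    def start(self): ...

    @abstractmethod
    def stop(self): ...

    @abstractmethod
    def set_rate(self, rate): ...


class RegisterBackend(ScrubBackend):
    """Stand-in for direct or indirect hardware register control."""
    name = "hw-registers"

    def start(self): self.log.append("reg: set scrub enable")
    def stop(self): self.log.append("reg: clear scrub enable")
    def set_rate(self, rate): self.log.append(f"reg: rate={rate}")


class AcpiBackend(ScrubBackend):
    """Stand-in for an ACPI RASF/RAS2 (PCC) based implementation."""
    name = "acpi"

    def start(self): self.log.append("acpi: start patrol scrub")
    def stop(self): self.log.append("acpi: stop patrol scrub")
    def set_rate(self, rate): self.log.append(f"acpi: rate={rate}")


class SoftwareBackend(ScrubBackend):
    """Stand-in for a kernel thread that scrubs using CPU reads."""
    name = "software"

    def start(self): self.log.append("kthread: wake")
    def stop(self): self.log.append("kthread: park")
    def set_rate(self, rate): self.log.append(f"kthread: rate={rate}")


class MemoryControllerScrub:
    """Backend-agnostic front-end for one memory controller."""

    def __init__(self, backend):
        self.backend = backend
        self.rate = backend.min_rate

    def store(self, attr, value):
        """Model of a write to a (hypothetical) sysfs attribute."""
        value = value.strip()
        if attr == "enable":
            if value not in ("0", "1"):
                raise ValueError("enable accepts only 0 or 1")
            if value == "1":
                self.backend.start()
            else:
                self.backend.stop()
        elif attr == "speed":
            rate = int(value)
            if not self.backend.min_rate <= rate <= self.backend.max_rate:
                raise ValueError("rate outside the supported range")
            self.backend.set_rate(rate)
            self.rate = rate
        else:
            raise KeyError(attr)

    def show(self, attr):
        """Model of a read from a (hypothetical) sysfs attribute."""
        if attr == "speed":
            return str(self.rate)
        if attr == "backend":
            return self.backend.name
        raise KeyError(attr)
```

Swapping SoftwareBackend(1, 10) for RegisterBackend(1, 10) or
AcpiBackend(1, 10) leaves the store()/show() calls untouched, which is
the property I would like the sysfs ABI to have.
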
I haven't thought this through thoroughly, but how about defining the
top-level interface as something like
"/sys/devices/system/memory_controller_scrubX/" or
"/sys/class/memory_controllerX/scrub"?

[1] https://lore.kernel.org/linux-mm/SJ1PR11MB6083BF93E9A88E659CED5EC4FC3F9@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m13516ee35caa05b506080ae805bee14f9f958d43

> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + The /sys/class/scrub/scrubX/region{0,1,2,3,...}
> + directories correspond to each scrub region under a scrub device.
> + Scrub region is a physical address range for which scrub may be
> + separately controlled. Regions may overlap in which case the
> + scrubbing rate of the overlapped memory will be at least that
> + expected due to each overlapping region.
> +
> +What: /sys/class/scrub/scrubX/regionY/addr_base
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + (RW) The base of the address range of the memory region
> + to be patrol scrubbed.
> + On reading, returns the base of the memory region for
> + the actual address range(The platform calculates
> + the nearest patrol scrub boundary address from where
> + it can start scrubbing).
> +
> +What: /sys/class/scrub/scrubX/regionY/addr_size
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + (RW) The size of the address range to be patrol scrubbed.
> + On reading, returns the size of the memory region for
> + the actual address range.
> +
> +What: /sys/class/scrub/scrubX/regionY/enable
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + (WO) Start/Stop scrubbing the memory region.
> + 1 - enable the memory scrubbing.
> + 0 - disable the memory scrubbing..
> +
> +What: /sys/class/scrub/scrubX/regionY/speed_available
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + (RO) Supported range for the partol speed(scrub rate)
> + by the scrubber for a memory region.
> + The unit of the scrub rate vary depends on the scrubber.
> +
> +What: /sys/class/scrub/scrubX/regionY/speed
> +Date: September 2023
> +KernelVersion: 6.7
> +Contact: linux-kernel@xxxxxxxxxxxxxxx
> +Description:
> + (RW) The partol speed(scrub rate) on the memory region specified and
> + it must be with in the supported range by the scrubber.
> + The unit of the scrub rate vary depends on the scrubber.
> --
> 2.34.1
>
>
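
For reference, this is roughly how I read the intended userspace flow
for the region attributes quoted above, as a small Python sketch. The
attribute names (addr_base, addr_size, speed, enable) come from the
patch; everything else is my assumption: the helper names, the `root`
parameter (so the code can be pointed at a scratch directory instead
of /sys/class/scrub), hexadecimal text for addresses, and a plain
decimal integer for speed. The patch does not specify the value
formats, so treat those as placeholders.

```python
from pathlib import Path


def region_dir(root, scrub, region):
    """Path of scrubX/regionY below `root` (e.g. /sys/class/scrub)."""
    return Path(root) / f"scrub{scrub}" / f"region{region}"


def configure_region(root, scrub, region, base, size, speed):
    """Program one scrub region, then start scrubbing it.

    Returns the (addr_base, addr_size) strings read back after the
    writes, since the ABI text says the platform may adjust the range
    to its nearest patrol scrub boundary.
    """
    d = region_dir(root, scrub, region)

    # Assumed formats: hex for addresses, decimal for the scrub rate.
    (d / "addr_base").write_text(f"{base:#x}\n")
    (d / "addr_size").write_text(f"{size:#x}\n")
    (d / "speed").write_text(f"{speed}\n")

    # Read back the range the platform actually accepted.
    actual_base = (d / "addr_base").read_text().strip()
    actual_size = (d / "addr_size").read_text().strip()

    # "enable" is write-only in the proposed ABI, so it is never read.
    (d / "enable").write_text("1\n")
    return actual_base, actual_size
```

A call such as configure_region("/sys/class/scrub", 0, 1, 0x100000000,
0x40000000, 8) would need root privileges and a platform that actually
exposes these files. Note that the caller has to know a physical
address range up front, which is the part I expect to be hard in
practice.
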