Re: [PATCH v18 04/19] EDAC: Add memory repair control feature

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Em Tue, 14 Jan 2025 13:05:37 +0000
Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> escreveu:

> On Tue, 14 Jan 2025 13:38:31 +0100
> Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:
> 
> > Em Thu, 9 Jan 2025 14:24:33 +0000
> > Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> escreveu:
> >   
> > > On Thu, 9 Jan 2025 13:32:22 +0100
> > > Borislav Petkov <bp@xxxxxxxxx> wrote:
> > > 
> > > Hi Boris,
> > >     
> > > > On Thu, Jan 09, 2025 at 11:00:43AM +0000, Shiju Jose wrote:      
> > > > > The min_ and max_ attributes of the control attributes are added  for your
> > > > > feedback on V15 to expose supported ranges of these control attributes to the user, 
> > > > > in the following links.          
> > > > 
> > > > Sure, but you can make that differently:
> > > > 
> > > > cat /sys/bus/edac/devices/<dev-name>/mem_repairX/bank
> > > > [x:y]
> > > > 
> > > > which is the allowed range.      
> > > 
> > > To my thinking that would fail the test of being an intuitive interface.
> > > To issue a repair command requires that multiple attributes be configured
> > > before triggering the actual repair.
> > > 
> > > Think of it as setting the coordinates of the repair in a high dimensional
> > > space.
> > > 
> > > In the extreme case of fine grained repair (Cacheline), to identify the
> > > relevant subunit of memory (obtained from the error record that we are
> > > basing the decision to repair on) we need to specify all of:
> > > 
> > > Channel, sub-channel, rank,  bank group, row, column and nibble mask.
> > > For coarser granularity repair only specify a subset of these applies and
> > > only the relevant controls are exposed to userspace.
> > > 
> > > They are broken out as specific attributes to enable each to be set before
> > > triggering the action with a write to the repair attribute.
> > > 
> > > There are several possible alternatives:
> > > 
> > > Option 1
> > > 
> > > "A:B:C:D:E:F:G:H:I:J" opaque single write to trigger the repair where
> > > each number is providing one of those coordinates and where a readback
> > > let's us known what each number is.
> > > 
> > > That single attribute interface is very hard to extend in an intuitive way.
> > > 
> > > History tell us more levels will be introduced in the middle, not just
> > > at the finest granularity, making such an interface hard to extend in
> > > a backwards compatible way.
> > > 
> > > Another alternative of a key value list would make for a nasty sysfs
> > > interface.
> > > 
> > > Option 2 
> > > There are sysfs interfaces that use a selection type presentation.
> > > 
> > > Write: "C", Read: "A, B, [C], D" but that only works well for discrete sets
> > > of options and is a pain to parse if read back is necessary.    
> > 
> > Writing it as:
> > 
> > 	a b [c] d
> > 
> > or even:
> > 	a, b, [c], d
> > 
> > doesn't make it hard to be parse on userspace. Adding a comma makes
> > Kernel code a little bigger, as it needs an extra check at the loop
> > to check if the line is empty or not:
> > 
> > 	if (*tmp != '\0')
> > 		*tmp += snprintf(", ")
> > 
> > Btwm we have an implementation like that on kernelspace/userspace for
> > the RC API:
> > 
> > - Kernelspace:
> >   https://github.com/torvalds/linux/blob/master/drivers/media/rc/rc-main.c#L1125
> >   6 lines of code + a const table with names/values, if we use the same example
> >   for EDAC:
> > 
> > 	const char *name[] = { "foo", "bar" };
> > 
> > 	for (i = 0; i < ARRAY_SIZE(names); i++) {
> > 		if (enabled & names[i].type)
> > 			tmp += sprintf(tmp, "[%s] ", names[i].name);
> > 		else if (allowed & proto_names[i].type)
> > 			tmp += sprintf(tmp, "%s ", names[i].name);
> > 	}
> > 
> > 
> > - Userspace:
> >   https://git.linuxtv.org/v4l-utils.git/tree/utils/keytable/keytable.c#n197
> >   5 lines of code + a const table, if we use the same example
> >   for ras-daemon:
> > 
> > 		const char *name[] = { 
> > 			[EDAC_FOO] = "[foo]",
> > 			[EDAC_BAR] = "[bar]",
> > 		};
> > 
> > 		for (p = strtok(arg, " ,"); p; p = strtok(NULL, " ,"))
> > 			for (i = 0; i < ARRAY_SIZE(name); i++)
> > 				if (!strcasecmp(p, name[i])
> > 					return i;
> > 		return -1;
> > 
> > 	(strtok handles both space and commas at the above example)
> > 
> > IMO, this is a lot better, as the alternative would be to have separate
> > sysfs nodes to describe what values are valid for a given edac devnode.
> > 
> > See, userspace needs to know what values are valid for a given
> > device and support for it may vary depending on the Kernel and
> > device version. So, we need to have the information about what values
> > are valid stored on some sysfs devnode, to allow backward compatibility.  
> 
> These aren't selectors from a discrete list so the question is more
> whether a syntax of
> <min> value <max> 
> is intuitive or not.  I'm not aware of precedence for this one.


[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux