Re: [PATCH v18 04/19] EDAC: Add memory repair control feature


 



On Tue, 14 Jan 2025 14:30:53 +0000
Shiju Jose <shiju.jose@xxxxxxxxxx> wrote:

> >-----Original Message-----
> >From: Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx>
> >Sent: 14 January 2025 13:47
> >To: Shiju Jose <shiju.jose@xxxxxxxxxx>
> >Cc: linux-edac@xxxxxxxxxxxxxxx; linux-cxl@xxxxxxxxxxxxxxx; linux-
> >acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> >bp@xxxxxxxxx; tony.luck@xxxxxxxxx; rafael@xxxxxxxxxx; lenb@xxxxxxxxxx;
> >mchehab@xxxxxxxxxx; dan.j.williams@xxxxxxxxx; dave@xxxxxxxxxxxx; Jonathan
> >Cameron <jonathan.cameron@xxxxxxxxxx>; dave.jiang@xxxxxxxxx;
> >alison.schofield@xxxxxxxxx; vishal.l.verma@xxxxxxxxx; ira.weiny@xxxxxxxxx;
> >david@xxxxxxxxxx; Vilas.Sridharan@xxxxxxx; leo.duran@xxxxxxx;
> >Yazen.Ghannam@xxxxxxx; rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx;
> >Jon.Grimm@xxxxxxx; dave.hansen@xxxxxxxxxxxxxxx;
> >naoya.horiguchi@xxxxxxx; james.morse@xxxxxxx; jthoughton@xxxxxxxxxx;
> >somasundaram.a@xxxxxxx; erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx;
> >duenwen@xxxxxxxxxx; gthelen@xxxxxxxxxx;
> >wschwartz@xxxxxxxxxxxxxxxxxxx; dferguson@xxxxxxxxxxxxxxxxxxx;
> >wbs@xxxxxxxxxxxxxxxxxxxxxx; nifan.cxl@xxxxxxxxx; tanxiaofei
> ><tanxiaofei@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>; Roberto
> >Sassu <roberto.sassu@xxxxxxxxxx>; kangkang.shen@xxxxxxxxxxxxx;
> >wanghuiqiang <wanghuiqiang@xxxxxxxxxx>; Linuxarm
> ><linuxarm@xxxxxxxxxx>
> >Subject: Re: [PATCH v18 04/19] EDAC: Add memory repair control feature
> >
> >On Mon, 6 Jan 2025 12:10:00 +0000
> ><shiju.jose@xxxxxxxxxx> wrote:
> >  
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/mem_repairX/repair_function
> >> +Date:		Jan 2025
> >> +KernelVersion:	6.14
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RO) Memory repair function type. For eg. post package repair,
> >> +		memory sparing etc.
> >> +		EDAC_SOFT_PPR - Soft post package repair
> >> +		EDAC_HARD_PPR - Hard post package repair
> >> +		EDAC_CACHELINE_MEM_SPARING - Cacheline memory sparing
> >> +		EDAC_ROW_MEM_SPARING - Row memory sparing
> >> +		EDAC_BANK_MEM_SPARING - Bank memory sparing
> >> +		EDAC_RANK_MEM_SPARING - Rank memory sparing
> >> +		All other values are reserved.
> >> +
> >> +What:		/sys/bus/edac/devices/<dev-
> >name>/mem_repairX/persist_mode
> >> +Date:		Jan 2025
> >> +KernelVersion:	6.14
> >> +Contact:	linux-edac@xxxxxxxxxxxxxxx
> >> +Description:
> >> +		(RW) Read/Write the current persist repair mode set for a
> >> +		repair function. Persist repair modes supported in the
> >> +		device, based on the memory repair function is temporary
> >> +		or permanent and is lost with a power cycle.
> >> +		EDAC_MEM_REPAIR_SOFT - Soft repair function (temporary  
> >repair).  
> >> +		EDAC_MEM_REPAIR_HARD - Hard memory repair function  
> >(permanent repair).  
> >> +		All other values are reserved.
> >> +  
> >
> >After re-reading some things, I suspect that the above can be simplified a little
> >bit by folding soft/hard PPR into a single element at /repair_function, and making
> >it clearer that persist_mode is valid only for PPR (I think this is the case, right?),
> >e.g. something like:  
> persist_mode is valid for memory sparing features (at least in CXL) as well.
> In the case of CXL memory sparing, the host has the option to request either soft or hard sparing
> via a flag when issuing a memory sparing operation.

Ok.

> 
> >
> >	What:		/sys/bus/edac/devices/<dev-  
> >name>/mem_repairX/repair_function  
> >	...
> >	Description:
> >			(RO) Memory repair function type. For e.g. post
> >package repair,
> >			memory sparing etc. Valid values are:
> >
> >			- ppr - post package repair.
> >			  Please define its mode via
> >			  /sys/bus/edac/devices/<dev-  
> >name>/mem_repairX/persist_mode  
> >			- cacheline-sparing - Cacheline memory sparing
> >			- row-sparing - Row memory sparing
> >			- bank-sparing - Bank memory sparing
> >			- rank-sparing - Rank memory sparing
> >			- All other values are reserved.
> >
> >and define persist_mode in a different way:  
> Note: To return decoded strings instead of raw values, I would need to add some extra callback functions
> in edac/memory_repair.c for these attributes, which would reduce the current level of optimization done to
> minimize the code size.

You're already using a callback in the EDAC_MEM_REPAIR_ATTR_SHOW macro,
so the current code needs no change other than the type passed to the
EDAC_MEM_REPAIR_ATTR_SHOW() call.

Something similar to this (not tested) would work:

    int get_repair_function(struct device *dev, void *drv_data, const char **val)
    {
	/* Both PPR variants are reported as "ppr" */
	static const char * const repair_type[] = {
		[EDAC_SOFT_PPR] = "ppr",
		[EDAC_HARD_PPR] = "ppr",
		[EDAC_CACHELINE_MEM_SPARING] = "cacheline-sparing",
		...
	};
	unsigned int type;

	// Some logic to get repair type from *drv_data, storing into "unsigned int type"

	/* Reject out-of-range values and any holes left by the designated initializers */
	if (type < ARRAY_SIZE(repair_type) && repair_type[type]) {
		*val = repair_type[type];
		return 0;
	}

	return -EINVAL;
    }

    EDAC_MEM_REPAIR_ATTR_SHOW(repair_function, get_repair_function, const char *, "%s\n");
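
To illustrate the lookup idea outside the kernel, here is a minimal userspace
sketch. Everything in it is a hypothetical stand-in, not the real EDAC code:
the DEMO_* enum and its numeric values, DEMO_ARRAY_SIZE() and
demo_get_repair_function() are made up for the example. It only shows the
pattern of a string table built with designated initializers, guarded by a
bounds check and a NULL check for unassigned slots.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical repair types. The numeric values are illustrative only;
 * value 5 is deliberately left unused to show how a hole in the table
 * is handled.
 */
enum demo_repair_type {
	DEMO_SOFT_PPR = 0,
	DEMO_HARD_PPR = 1,
	DEMO_CACHELINE_MEM_SPARING = 2,
	DEMO_ROW_MEM_SPARING = 3,
	DEMO_BANK_MEM_SPARING = 4,
	DEMO_RANK_MEM_SPARING = 6,
};

/* Userspace stand-in for the kernel's ARRAY_SIZE() macro. */
#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Slots without an initializer are implicitly NULL. */
static const char * const demo_repair_names[] = {
	[DEMO_SOFT_PPR] = "ppr",
	[DEMO_HARD_PPR] = "ppr",
	[DEMO_CACHELINE_MEM_SPARING] = "cacheline-sparing",
	[DEMO_ROW_MEM_SPARING] = "row-sparing",
	[DEMO_BANK_MEM_SPARING] = "bank-sparing",
	[DEMO_RANK_MEM_SPARING] = "rank-sparing",
};

/*
 * Translate a raw repair type into its string name.
 * Returns 0 and stores the name in *val on success, or -EINVAL
 * (leaving *val untouched) for out-of-range or reserved values.
 */
int demo_get_repair_function(unsigned int type, const char **val)
{
	if (type >= DEMO_ARRAY_SIZE(demo_repair_names) || !demo_repair_names[type])
		return -EINVAL;

	*val = demo_repair_names[type];
	return 0;
}
```

With a table like this, the show path only has to print the returned string
with "%s\n", and reserved values fall out as -EINVAL without extra code.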

Thanks,
Mauro



