Re: [PATCH v14 07/14] cxl/memfeature: Add CXL memory device patrol scrub control feature

On 10/30/24 9:16 AM, Jonathan Cameron wrote:
> On Tue, 29 Oct 2024 11:32:47 -0700
> Dave Jiang <dave.jiang@xxxxxxxxx> wrote:
> 
>> On 10/29/24 10:00 AM, Shiju Jose wrote:
>>>
>>>   
>>>> -----Original Message-----
>>>> From: Dave Jiang <dave.jiang@xxxxxxxxx>
>>>> Sent: 29 October 2024 16:32
>>>> To: Shiju Jose <shiju.jose@xxxxxxxxxx>; linux-edac@xxxxxxxxxxxxxxx; linux-
>>>> cxl@xxxxxxxxxxxxxxx; linux-acpi@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-
>>>> kernel@xxxxxxxxxxxxxxx
>>>> Cc: bp@xxxxxxxxx; tony.luck@xxxxxxxxx; rafael@xxxxxxxxxx; lenb@xxxxxxxxxx;
>>>> mchehab@xxxxxxxxxx; dan.j.williams@xxxxxxxxx; dave@xxxxxxxxxxxx; Jonathan
>>>> Cameron <jonathan.cameron@xxxxxxxxxx>; gregkh@xxxxxxxxxxxxxxxxxxx;
>>>> sudeep.holla@xxxxxxx; jassisinghbrar@xxxxxxxxx; alison.schofield@xxxxxxxxx;
>>>> vishal.l.verma@xxxxxxxxx; ira.weiny@xxxxxxxxx; david@xxxxxxxxxx;
>>>> Vilas.Sridharan@xxxxxxx; leo.duran@xxxxxxx; Yazen.Ghannam@xxxxxxx;
>>>> rientjes@xxxxxxxxxx; jiaqiyan@xxxxxxxxxx; Jon.Grimm@xxxxxxx;
>>>> dave.hansen@xxxxxxxxxxxxxxx; naoya.horiguchi@xxxxxxx;
>>>> james.morse@xxxxxxx; jthoughton@xxxxxxxxxx; somasundaram.a@xxxxxxx;
>>>> erdemaktas@xxxxxxxxxx; pgonda@xxxxxxxxxx; duenwen@xxxxxxxxxx;
>>>> gthelen@xxxxxxxxxx; wschwartz@xxxxxxxxxxxxxxxxxxx;
>>>> dferguson@xxxxxxxxxxxxxxxxxxx; wbs@xxxxxxxxxxxxxxxxxxxxxx;
>>>> nifan.cxl@xxxxxxxxx; tanxiaofei <tanxiaofei@xxxxxxxxxx>; Zengtao (B)
>>>> <prime.zeng@xxxxxxxxxxxxx>; Roberto Sassu <roberto.sassu@xxxxxxxxxx>;
>>>> kangkang.shen@xxxxxxxxxxxxx; wanghuiqiang <wanghuiqiang@xxxxxxxxxx>;
>>>> Linuxarm <linuxarm@xxxxxxxxxx>
>>>> Subject: Re: [PATCH v14 07/14] cxl/memfeature: Add CXL memory device patrol
>>>> scrub control feature
>>>>
>>>>
>>>>
>>>> On 10/25/24 10:13 AM, shiju.jose@xxxxxxxxxx wrote:  
>>>>> From: Shiju Jose <shiju.jose@xxxxxxxxxx>
>>>>>
>>>>> CXL spec 3.1 section 8.2.9.9.11.1 describes the device patrol scrub
>>>>> control feature. The device patrol scrub proactively locates and
>>>>> corrects errors on a regular cycle.
>>>>>
>>>>> Allow specifying the number of hours within which the patrol scrub
>>>>> must be completed, subject to minimum and maximum limits reported by the
>>>>> device. Also allow disabling scrub, trading off error rates against
>>>>> performance.
>>>>>
>>>>> Add support for patrol scrub control on CXL memory devices.
>>>>> Register with the EDAC device driver, which retrieves the scrub
>>>>> attribute descriptors from EDAC scrub and exposes the sysfs scrub
>>>>> control attributes to userspace. For example, scrub control for the
>>>>> CXL memory device "cxl_mem0" is exposed in
>>>>> /sys/bus/edac/devices/cxl_mem0/scrubX/.
>>>>>
>>>>> Additionally, add support for region-based CXL memory patrol scrub control.
>>>>> CXL memory regions may be interleaved across one or more CXL memory
>>>>> devices. For example, region-based scrub control for "cxl_region1" is
>>>>> exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.
>>>>>
>>>>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>>>>> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>>>>> Signed-off-by: Shiju Jose <shiju.jose@xxxxxxxxxx>
>>>>> ---
>>>>>  Documentation/edac/edac-scrub.rst |  74 ++++++
>>>>>  drivers/cxl/Kconfig               |  18 ++
>>>>>  drivers/cxl/core/Makefile         |   1 +
>>>>>  drivers/cxl/core/memfeature.c     | 381 ++++++++++++++++++++++++++++++
>>>>>  drivers/cxl/core/region.c         |   6 +
>>>>>  drivers/cxl/cxlmem.h              |   7 +
>>>>>  drivers/cxl/mem.c                 |   4 +
>>>>>  7 files changed, 491 insertions(+)
>>>>>  create mode 100644 Documentation/edac/edac-scrub.rst
>>>>>  create mode 100644 drivers/cxl/core/memfeature.c
>>>>>
>>>>> diff --git a/Documentation/edac/edac-scrub.rst b/Documentation/edac/edac-scrub.rst
>>>>> new file mode 100644
>>>>> index 000000000000..4aad4974b208
>>>>> --- /dev/null
>>>>> +++ b/Documentation/edac/edac-scrub.rst
>>>>> @@ -0,0 +1,74 @@
>>>>> +.. SPDX-License-Identifier: GPL-2.0
>>>>> +  
>>> [...]
>>>   
>>>>> +static int cxl_mem_ps_get_attrs(struct cxl_memdev_state *mds,
>>>>> +				struct cxl_memdev_ps_params *params)
>>>>> +{
>>>>> +	size_t rd_data_size = sizeof(struct cxl_memdev_ps_rd_attrs);
>>>>> +	size_t data_size;
>>>>> +	struct cxl_memdev_ps_rd_attrs *rd_attrs __free(kfree) =
>>>>> +						kmalloc(rd_data_size, GFP_KERNEL);
>>>>> +	if (!rd_attrs)
>>>>> +		return -ENOMEM;
>>>>> +
>>>>> +	data_size = cxl_get_feature(mds, cxl_patrol_scrub_uuid,
>>>>> +				    CXL_GET_FEAT_SEL_CURRENT_VALUE,
>>>>> +				    rd_attrs, rd_data_size);
>>>>> +	if (!data_size)
>>>>> +		return -EIO;
>>>>> +
>>>>> +	params->scrub_cycle_changeable = FIELD_GET(CXL_MEMDEV_PS_SCRUB_CYCLE_CHANGE_CAP_MASK,
>>>>> +						   rd_attrs->scrub_cycle_cap);
>>>>> +	params->enable = FIELD_GET(CXL_MEMDEV_PS_FLAG_ENABLED_MASK,
>>>>> +				   rd_attrs->scrub_flags);
>>>>> +	params->scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_CUR_SCRUB_CYCLE_MASK,
>>>>> +					    rd_attrs->scrub_cycle_hrs);
>>>>> +	params->min_scrub_cycle_hrs = FIELD_GET(CXL_MEMDEV_PS_MIN_SCRUB_CYCLE_MASK,
>>>>> +						rd_attrs->scrub_cycle_hrs);
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>> +static int cxl_ps_get_attrs(struct device *dev, void *drv_data,  
>>>>
>>>> Would a union be better than a void *drv_data for all the places this is used as a
>>>> parameter? How many variations of this are there?
>>>>
>>>> DJ  
>>> Hi Dave,
>>>
>>> Can you give more info on this? This is a generic callback for scrub control, and each
>>> implementation will have its own context struct (e.g. struct cxl_patrol_scrub_context here
>>> for CXL scrub control), which in turn will be passed in and out as opaque data.
>>
>> Mainly I'm just seeing a lot of calls with (void *). Just asking if we want to make it a union that contains 'struct cxl_patrol_scrub_context' and so on.
> 
> You could, but then every new driver would need to include
> changes in the edac core to add its own entry to that union.
> 
> Not sure that's a good way to go for opaque driver specific context.
> 
> This particular function though can use
> a struct cxl_patrol_scrub_context * anyway as it's not part of the
> core interface, but rather one called only indirectly
> by functions that are passed a void * but know it is a
> struct cxl_patrol_scrub_context *.

Thanks Jonathan. That's basically what I wanted to know. 
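
For anyone reading this in the archive later, here is a minimal user-space sketch of the pattern as I understand it from the explanation above: the core-facing callback takes an opaque void *, while the driver-internal helper takes the typed context pointer. Every name below (scrub_ops, demo_ps_context, demo_ps_get_min, core_read_min_cycle) is a hypothetical stand-in and not the actual EDAC or CXL definition; it only illustrates why a union in the core would not be needed.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a core ops table. The core only ever sees
 * an opaque void *drv_data and never needs to know the driver's type.
 */
struct scrub_ops {
	int (*get_min_cycle)(void *drv_data, unsigned int *hrs);
};

/* Hypothetical driver context, playing the role of a per-driver struct. */
struct demo_ps_context {
	unsigned int min_cycle_hrs;
};

/*
 * Driver-internal helper. It is not part of the core interface, so it
 * can take the typed pointer directly instead of a void *.
 */
static int demo_ps_get_min(struct demo_ps_context *ctx, unsigned int *hrs)
{
	if (!ctx || !hrs)
		return -EINVAL;

	*hrs = ctx->min_cycle_hrs;
	return 0;
}

/*
 * Core-facing callback. This is the only place the void * is converted
 * back to the driver's own type (implicit conversion in C).
 */
static int demo_get_min_cycle(void *drv_data, unsigned int *hrs)
{
	return demo_ps_get_min(drv_data, hrs);
}

static const struct scrub_ops demo_ops = {
	.get_min_cycle = demo_get_min_cycle,
};

/* Hypothetical "core" entry point: forwards the opaque pointer untouched. */
static int core_read_min_cycle(const struct scrub_ops *ops, void *drv_data,
			       unsigned int *hrs)
{
	return ops->get_min_cycle(drv_data, hrs);
}
```

With this shape, a new driver registers its own ops table and context without touching the core, which a union of context structs would not allow.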

> 
> Jonathan
> 
> 
>>
>>>
>>> Thanks,
>>> Shiju  
>>>>  
>>>>> +			    struct cxl_memdev_ps_params *params)
>>>>> +{
>>>>> +	struct cxl_patrol_scrub_context *cxl_ps_ctx = drv_data;
>>>>> +	struct cxl_memdev *cxlmd;
>>>>> +	struct cxl_dev_state *cxlds;
>>>>> +	struct cxl_memdev_state *mds;
>>>>> +	u16 min_scrub_cycle = 0;
>>>>> +	int i, ret;
>>>>> +
>>>>> +	if (cxl_ps_ctx->cxlr) {
>>>>> +		struct cxl_region *cxlr = cxl_ps_ctx->cxlr;
>>>>> +		struct cxl_region_params *p = &cxlr->params;
>>>>> +
>>>>> +		for (i = p->interleave_ways - 1; i >= 0; i--) {
>>>>> +			struct cxl_endpoint_decoder *cxled = p->targets[i];
>>>>> +
>>>>> +			cxlmd = cxled_to_memdev(cxled);
>>>>> +			cxlds = cxlmd->cxlds;
>>>>> +			mds = to_cxl_memdev_state(cxlds);
>>>>> +			ret = cxl_mem_ps_get_attrs(mds, params);
>>>>> +			if (ret)
>>>>> +				return ret;
>>>>> +
>>>>> +			if (params->min_scrub_cycle_hrs > min_scrub_cycle)
>>>>> +				min_scrub_cycle = params->min_scrub_cycle_hrs;
>>>>> +		}
>>>>> +		params->min_scrub_cycle_hrs = min_scrub_cycle;
>>>>> +		return 0;
>>>>> +	}
>>>>> +	cxlmd = cxl_ps_ctx->cxlmd;
>>>>> +	cxlds = cxlmd->cxlds;
>>>>> +	mds = to_cxl_memdev_state(cxlds);
>>>>> +
>>>>> +	return cxl_mem_ps_get_attrs(mds, params);
>>>>> +}
>>>>> +  
>>> [...]  
>>>>  
>>>   
>>
>>
> 
> 




