Re: [PATCH v7 1/5] cxl/port: Add EINJ debugfs files and callback support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks for taking a look Dan! Replies inline.

On 12/7/23 5:13 PM, Dan Williams wrote:
> Ben Cheatham wrote:
>> Add creation of debugfs directories for ports and dports under
>> /sys/kernel/debug/cxl when EINJ support is enabled. The dport
>> directories will contain files for injecting CXL protocol errors.
>> These files are only usable once the EINJ module has loaded and
>> registered callback functions with the CXL core module, before that
>> occurs (or if the EINJ module isn't loaded) the files will do nothing
>> and return an ENODEV error.
>>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
>> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@xxxxxxx>
>> ---
>>  Documentation/ABI/testing/debugfs-cxl | 27 +++++++++
>>  drivers/cxl/core/port.c               | 84 +++++++++++++++++++++++++++
>>  drivers/cxl/cxl.h                     | 10 ++++
>>  3 files changed, 121 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
>> index fe61d372e3fa..782a1bb78884 100644
>> --- a/Documentation/ABI/testing/debugfs-cxl
>> +++ b/Documentation/ABI/testing/debugfs-cxl
>> @@ -33,3 +33,30 @@ Description:
>>  		device cannot clear poison from the address, -ENXIO is returned.
>>  		The clear_poison attribute is only visible for devices
>>  		supporting the capability.
>> +
>> +What:		/sys/kernel/debug/cxl/portX/dportY/einj_types
> 
> Given this file is identical contents for all dports it only needs to
> exist in one common location
> 
> /sys/kernel/debug/cxl/einj_types
> 

Good point, I'll make that change.

> 
>> +Date:		November, 2023
>> +KernelVersion:	v6.8
>> +Contact:	linux-cxl@xxxxxxxxxxxxxxx
>> +Description:
>> +		(RO) Prints the CXL protocol error types made available by
>> +		the platform in the format "0x<error number>	<error type>".
>> +		The <error number> can be written to einj_inject to inject
>> +		<error type> into dportY. This file is only visible if
>> +		CONFIG_ACPI_APEI_EINJ is enabled, and the EINJ module must
>> +		be able to reach one (or both) of the CXL_ACPI or CXL_PORT
>> +		modules to be functional.
> 
> This can be simplified. Have something like:
> 
> config CXL_EINJ
> 	default CXL_BUS
> 	depends on ACPI_APEI_EINJ && ACPI_APEI_EINJ=CXL_BUS
> 	...
> 
> Then the documentation moves to Kconfig for how to enable this and the
> CXL code can directly call into the EINJ module without worry.
> 
> It would of course need stubs like these in a shared header:
> 
> #ifdef CONFIG_CXL_EINJ
> int cxl_einj_available_error_type(struct seq_file *m, void *v);
> int cxl_einj_inject_error(struct cxl_dport *dport, u64 type);
> #else
> static inline int cxl_einj_available_error_type(struct seq_file *m, void *v)
> {
> 	return -ENXIO;
> }
> 
> int cxl_einj_inject_error(struct cxl_dport *dport, u64 type)
> {
> 	return -ENXIO;
> }
> #endif
> 

I had to go back and take a look, but I had a shared header in v5 (link: https://lore.kernel.org/linux-cxl/20230926120418.0000575d@xxxxxxxxxx/). 
Jonathan recommended that I instead include cxl.h directly, but that was pretty much a completely different patch set
at the time (and the header was under include/linux/). That being said, I agree that a header under drivers/cxl would
make much more sense here.

>> +
>> +What:		/sys/kernel/debug/cxl/portX/dportY/einj_inject
> 
> See my comments on cxl_debugfs_create_dport_dir() later on, but I think
> the "portX" directory can be eliminated.
> 
>> +Date:		November, 2023
>> +KernelVersion:	v6.8
>> +Contact:	linux-cxl@xxxxxxxxxxxxxxx
>> +Description:
>> +		(WO) Writing an integer to this file injects the corresponding
>> +		CXL protocol error into dportY (integer to type mapping is
>> +		available by reading from einj_types). If the dport was
>> +		enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
>> +		a CXL 2.0 error is injected. This file is only visible if
>> +		CONFIG_ACPI_APEI_EINJ is enabled, and the EINJ module must
>> +		be able to reach one (or both) of the CXL_ACPI or CXL_PORT
>> +		modules to be functional.
> 
> Similar comments about dropping these details that can just be solved in
> Kconfig.
> 
> This is next comment is on EINJ ABI, but you can skip it just to
> maintain momentum with the status quo. Why require the user to do the
> string to integer conversion? Seems like a small matter of programming
> to allow:
> 
> echo "CXL.cache Protocol Correctable" > einj_inject
> 
> ...to do the right thing. That probably makes scripts more readable as
> well.
> 

That's a good point. I can do that, but I think it may be better to keep the
consistency with the EINJ module to simplify things for end users. If you feel
that isn't a big enough concern I can go ahead and modify it.

>> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
>> index 38441634e4c6..acf10415a174 100644
>> --- a/drivers/cxl/core/port.c
>> +++ b/drivers/cxl/core/port.c
>> @@ -783,6 +783,72 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
>>  	return rc;
>>  }
>>  
>> +static struct cxl_einj_ops einj_ops;
>> +void cxl_einj_set_ops_cbs(struct cxl_einj_ops *ops)
>> +{
>> +	if (!IS_REACHABLE(CONFIG_ACPI_APEI_EINJ) || !ops)
>> +		return;
>> +
>> +	einj_ops = *ops;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_einj_set_ops_cbs, CXL);
> 
> einj_ops goes away when the CXL code can just call the EINJ module
> directly.
> 
>> +
>> +static int cxl_einj_type_show(struct seq_file *f, void *data)
>> +{
>> +	if (!einj_ops.einj_type)
>> +		return -ENODEV;
>> +
>> +	return einj_ops.einj_type(f, data);
>> +}
>> +
>> +static int cxl_einj_inject(void *data, u64 type)
>> +{
>> +	struct cxl_dport *dport = data;
>> +
>> +	if (!einj_ops.einj_inject)
>> +		return -ENODEV;
>> +
>> +	return einj_ops.einj_inject(dport, type);
>> +}
>> +DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, "%llx\n");
> 
> The wrappers go away and DEFINE_DEBUGFS_ATTRIBUTE() can reference the
> EINJ symbols directly.
> 
>> +
>> +static int cxl_debugfs_create_dport_dir(struct dentry *port_dir,
>> +						   struct cxl_dport *dport)
>> +{
>> +	struct dentry *dir;
>> +	char dir_name[32];
>> +
>> +	snprintf(dir_name, 31, "dport%d", dport->port_id);
> 
> How about dev_name(dport->dport_dev) rather than the dynamic name?
> 
> The other benefit of this is that the dport_dev names are unique, so you
> can move the einj_inject file to:
> 
> /sys/kernel/debug/cxl/$dport_dev/einj_inject
> 

I didn't realize the dport names were also unique. I'll go ahead and do that instead.

>> +	dir = debugfs_create_dir(dir_name, port_dir);
>> +	if (IS_ERR(dir))
>> +		return PTR_ERR(dir);
>> +
>> +	debugfs_create_devm_seqfile(dport->dport_dev, "einj_types", dir,
>> +				    cxl_einj_type_show);
> 
> Per above, move this to be a top-level file.
> 

Will do.

>> +
>> +	debugfs_create_file("einj_inject", 0200, dir, dport,
>> +			    &cxl_einj_inject_fops);
>> +	return 0;
> 
> debugfs is good about failing gracefully when pre-requisites are not
> present. This is why none of the debugfs creation helpers have return
> codes because failing to setup debugfs is never fatal.
> 
> In other words, it is ok to take the output of debugfs_create_dir()
> without checking, and this function should not be returning an error.
> 

Will do.

>> +}
>> +
>> +static struct dentry *cxl_debugfs_create_port_dir(struct cxl_port *port)
>> +{
>> +	const char *dir_name = dev_name(&port->dev);
>> +	struct dentry *dir;
>> +
>> +	if (!IS_ENABLED(CONFIG_ACPI_APEI_EINJ))
>> +		return ERR_PTR(-ENODEV);
>> +
>> +	dir = cxl_debugfs_create_dir(dir_name);
>> +	if (IS_ERR(dir)) {
>> +		dev_dbg(&port->dev, "Failed to create port debugfs dir: %ld\n",
>> +			PTR_ERR(dir));
>> +		return dir;
>> +	}
>> +
>> +	return dir;
>> +}
>> +
>>  static struct cxl_port *__devm_cxl_add_port(struct device *host,
>>  					    struct device *uport_dev,
>>  					    resource_size_t component_reg_phys,
>> @@ -861,6 +927,7 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
>>  				   struct cxl_dport *parent_dport)
>>  {
>>  	struct cxl_port *port, *parent_port;
>> +	struct dentry *dir;
>>  
>>  	port = __devm_cxl_add_port(host, uport_dev, component_reg_phys,
>>  				   parent_dport);
>> @@ -878,6 +945,10 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
>>  			parent_port ? " to " : "",
>>  			parent_port ? dev_name(&parent_port->dev) : "",
>>  			parent_port ? "" : " (root port)");
>> +
>> +		dir = cxl_debugfs_create_port_dir(port);
>> +		if (!IS_ERR(dir))
>> +			port->debug_dir = dir;
>>  	}
>>  
>>  	return port;
>> @@ -1127,6 +1198,7 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port,
>>  				     resource_size_t component_reg_phys)
>>  {
>>  	struct cxl_dport *dport;
>> +	int rc;
>>  
>>  	dport = __devm_cxl_add_dport(port, dport_dev, port_id,
>>  				     component_reg_phys, CXL_RESOURCE_NONE);
>> @@ -1136,6 +1208,11 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port,
>>  	} else {
>>  		dev_dbg(dport_dev, "dport added to %s\n",
>>  			dev_name(&port->dev));
>> +
>> +		rc = cxl_debugfs_create_dport_dir(port->debug_dir, dport);
>> +		if (rc)
>> +			dev_dbg(dport_dev,
>> +				"Failed to create dport debugfs dir: %d\n", rc);
> 
> Drop the debug messages about failing to setup debugfs. This follows the
> lead of other debugfs setup in CXL.
> 

Will do.

>>  	}
>>  
>>  	return dport;
>> @@ -1156,6 +1233,7 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port,
>>  					 resource_size_t rcrb)
>>  {
>>  	struct cxl_dport *dport;
>> +	int rc;
>>  
>>  	if (rcrb == CXL_RESOURCE_NONE) {
>>  		dev_dbg(&port->dev, "failed to add RCH dport, missing RCRB\n");
>> @@ -1170,6 +1248,12 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port,
>>  	} else {
>>  		dev_dbg(dport_dev, "RCH dport added to %s\n",
>>  			dev_name(&port->dev));
>> +
>> +		rc = cxl_debugfs_create_dport_dir(port->debug_dir, dport);
>> +		if (rc)
>> +			dev_dbg(dport_dev,
>> +				"Failed to create rch dport debugfs dir: %d\n",
>> +				rc);
>>  	}
>>  
>>  	return dport;
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index 687043ece101..3c7744fc3106 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -590,6 +590,7 @@ struct cxl_dax_region {
>>   * @depth: How deep this port is relative to the root. depth 0 is the root.
>>   * @cdat: Cached CDAT data
>>   * @cdat_available: Should a CDAT attribute be available in sysfs
>> + * @debug_dir: dentry for port in cxl debugfs (optional)
>>   */
>>  struct cxl_port {
>>  	struct device dev;
>> @@ -612,6 +613,7 @@ struct cxl_port {
>>  		size_t length;
>>  	} cdat;
>>  	bool cdat_available;
>> +	struct dentry *debug_dir;
> 
> Part of why I asked for the debugfs file rename was to eliminate this
> wart on the data structure.
> 

Yeah I wasn't that happy about adding it, so I'd be happy to remove it!

>>  };
>>  
>>  static inline struct cxl_dport *
>> @@ -813,6 +815,14 @@ bool is_cxl_nvdimm_bridge(struct device *dev);
>>  int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd);
>>  struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd);
>>  
>> +struct cxl_einj_ops {
>> +	int (*einj_type)(struct seq_file *f, void *data);
>> +	int (*einj_inject)(struct cxl_dport *dport, u64 type);
>> +};
>> +
>> +void cxl_einj_set_ops_cbs(struct cxl_einj_ops *ops);
>> +
>> +
>>  #ifdef CONFIG_CXL_REGION
>>  bool is_cxl_pmem_region(struct device *dev);
>>  struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
>> -- 
>> 2.34.1
>>
>>
> 
> 
> 




[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux