RE: [PATCH v7 1/5] cxl/port: Add EINJ debugfs files and callback support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ben Cheatham wrote:
> Add creation of debugfs directories for ports and dports under
> /sys/kernel/debug/cxl when EINJ support is enabled. The dport
> directories will contain files for injecting CXL protocol errors.
> These files are only usable once the EINJ module has loaded and
> registered callback functions with the CXL core module, before that
> occurs (or if the EINJ module isn't loaded) the files will do nothing
> and return an ENODEV error.
> 
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@xxxxxxx>
> ---
>  Documentation/ABI/testing/debugfs-cxl | 27 +++++++++
>  drivers/cxl/core/port.c               | 84 +++++++++++++++++++++++++++
>  drivers/cxl/cxl.h                     | 10 ++++
>  3 files changed, 121 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl
> index fe61d372e3fa..782a1bb78884 100644
> --- a/Documentation/ABI/testing/debugfs-cxl
> +++ b/Documentation/ABI/testing/debugfs-cxl
> @@ -33,3 +33,30 @@ Description:
>  		device cannot clear poison from the address, -ENXIO is returned.
>  		The clear_poison attribute is only visible for devices
>  		supporting the capability.
> +
> +What:		/sys/kernel/debug/cxl/portX/dportY/einj_types

Given this file is identical contents for all dports it only needs to
exist in one common location

/sys/kernel/debug/cxl/einj_types


> +Date:		November, 2023
> +KernelVersion:	v6.8
> +Contact:	linux-cxl@xxxxxxxxxxxxxxx
> +Description:
> +		(RO) Prints the CXL protocol error types made available by
> +		the platform in the format "0x<error number>	<error type>".
> +		The <error number> can be written to einj_inject to inject
> +		<error type> into dportY. This file is only visible if
> +		CONFIG_ACPI_APEI_EINJ is enabled, and the EINJ module must
> +		be able to reach one (or both) of the CXL_ACPI or CXL_PORT
> +		modules to be functional.

This can be simplified. Have something like:

config CXL_EINJ
	default CXL_BUS
	depends on ACPI_APEI_EINJ && ACPI_APEI_EINJ=CXL_BUS
	...

Then the documentation moves to Kconfig for how to enable this and the
CXL code can directly call into the EINJ module without worry.

It would of course need stubs like these in a shared header:

#ifdef CONFIG_CXL_EINJ
int cxl_einj_available_error_type(struct seq_file *m, void *v);
int cxl_einj_inject_error(struct cxl_dport *dport, u64 type);
#else
static inline int cxl_einj_available_error_type(struct seq_file *m, void *v)
{
	return -ENXIO;
}

int cxl_einj_inject_error(struct cxl_dport *dport, u64 type)
{
	return -ENXIO;
}
#endif

> +
> +What:		/sys/kernel/debug/cxl/portX/dportY/einj_inject

See my comments on cxl_debugfs_create_dport_dir() later on, but I think
the "portX" directory can be eliminated.

> +Date:		November, 2023
> +KernelVersion:	v6.8
> +Contact:	linux-cxl@xxxxxxxxxxxxxxx
> +Description:
> +		(WO) Writing an integer to this file injects the corresponding
> +		CXL protocol error into dportY (integer to type mapping is
> +		available by reading from einj_types). If the dport was
> +		enumerated in RCH mode, a CXL 1.1 error is injected, otherwise
> +		a CXL 2.0 error is injected. This file is only visible if
> +		CONFIG_ACPI_APEI_EINJ is enabled, and the EINJ module must
> +		be able to reach one (or both) of the CXL_ACPI or CXL_PORT
> +		modules to be functional.

Similar comments about dropping these details that can just be solved in
Kconfig.

This is next comment is on EINJ ABI, but you can skip it just to
maintain momentum with the status quo. Why require the user to do the
string to integer conversion? Seems like a small matter of programming
to allow:

echo "CXL.cache Protocol Correctable" > einj_inject

...to do the right thing. That probably makes scripts more readable as
well.

> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 38441634e4c6..acf10415a174 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -783,6 +783,72 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport,
>  	return rc;
>  }
>  
> +static struct cxl_einj_ops einj_ops;
> +void cxl_einj_set_ops_cbs(struct cxl_einj_ops *ops)
> +{
> +	if (!IS_REACHABLE(CONFIG_ACPI_APEI_EINJ) || !ops)
> +		return;
> +
> +	einj_ops = *ops;
> +}
> +EXPORT_SYMBOL_NS_GPL(cxl_einj_set_ops_cbs, CXL);

einj_ops goes away when the CXL code can just call the EINJ module
directly.

> +
> +static int cxl_einj_type_show(struct seq_file *f, void *data)
> +{
> +	if (!einj_ops.einj_type)
> +		return -ENODEV;
> +
> +	return einj_ops.einj_type(f, data);
> +}
> +
> +static int cxl_einj_inject(void *data, u64 type)
> +{
> +	struct cxl_dport *dport = data;
> +
> +	if (!einj_ops.einj_inject)
> +		return -ENODEV;
> +
> +	return einj_ops.einj_inject(dport, type);
> +}
> +DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, "%llx\n");

The wrappers go away and DEFINE_DEBUGFS_ATTRIBUTE() can reference the
EINJ symbols directly.

> +
> +static int cxl_debugfs_create_dport_dir(struct dentry *port_dir,
> +						   struct cxl_dport *dport)
> +{
> +	struct dentry *dir;
> +	char dir_name[32];
> +
> +	snprintf(dir_name, 31, "dport%d", dport->port_id);

How about dev_name(dport->dport_dev) rather than the dynamic name?

The other benefit of this is that the dport_dev names are unique, so you
can move the einj_inject file to:

/sys/kernel/debug/cxl/$dport_dev/einj_inject

> +	dir = debugfs_create_dir(dir_name, port_dir);
> +	if (IS_ERR(dir))
> +		return PTR_ERR(dir);
> +
> +	debugfs_create_devm_seqfile(dport->dport_dev, "einj_types", dir,
> +				    cxl_einj_type_show);

Per above, move this to be a top-level file.

> +
> +	debugfs_create_file("einj_inject", 0200, dir, dport,
> +			    &cxl_einj_inject_fops);
> +	return 0;

debugfs is good about failing gracefully when pre-requisites are not
present. This is why none of the debugfs creation helpers have return
codes because failing to setup debugfs is never fatal.

In other words, it is ok to take the output of debugfs_create_dir()
without checking, and this function should not be returning an error.

> +}
> +
> +static struct dentry *cxl_debugfs_create_port_dir(struct cxl_port *port)
> +{
> +	const char *dir_name = dev_name(&port->dev);
> +	struct dentry *dir;
> +
> +	if (!IS_ENABLED(CONFIG_ACPI_APEI_EINJ))
> +		return ERR_PTR(-ENODEV);
> +
> +	dir = cxl_debugfs_create_dir(dir_name);
> +	if (IS_ERR(dir)) {
> +		dev_dbg(&port->dev, "Failed to create port debugfs dir: %ld\n",
> +			PTR_ERR(dir));
> +		return dir;
> +	}
> +
> +	return dir;
> +}
> +
>  static struct cxl_port *__devm_cxl_add_port(struct device *host,
>  					    struct device *uport_dev,
>  					    resource_size_t component_reg_phys,
> @@ -861,6 +927,7 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
>  				   struct cxl_dport *parent_dport)
>  {
>  	struct cxl_port *port, *parent_port;
> +	struct dentry *dir;
>  
>  	port = __devm_cxl_add_port(host, uport_dev, component_reg_phys,
>  				   parent_dport);
> @@ -878,6 +945,10 @@ struct cxl_port *devm_cxl_add_port(struct device *host,
>  			parent_port ? " to " : "",
>  			parent_port ? dev_name(&parent_port->dev) : "",
>  			parent_port ? "" : " (root port)");
> +
> +		dir = cxl_debugfs_create_port_dir(port);
> +		if (!IS_ERR(dir))
> +			port->debug_dir = dir;
>  	}
>  
>  	return port;
> @@ -1127,6 +1198,7 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port,
>  				     resource_size_t component_reg_phys)
>  {
>  	struct cxl_dport *dport;
> +	int rc;
>  
>  	dport = __devm_cxl_add_dport(port, dport_dev, port_id,
>  				     component_reg_phys, CXL_RESOURCE_NONE);
> @@ -1136,6 +1208,11 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port,
>  	} else {
>  		dev_dbg(dport_dev, "dport added to %s\n",
>  			dev_name(&port->dev));
> +
> +		rc = cxl_debugfs_create_dport_dir(port->debug_dir, dport);
> +		if (rc)
> +			dev_dbg(dport_dev,
> +				"Failed to create dport debugfs dir: %d\n", rc);

Drop the debug messages about failing to setup debugfs. This follows the
lead of other debugfs setup in CXL.

>  	}
>  
>  	return dport;
> @@ -1156,6 +1233,7 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port,
>  					 resource_size_t rcrb)
>  {
>  	struct cxl_dport *dport;
> +	int rc;
>  
>  	if (rcrb == CXL_RESOURCE_NONE) {
>  		dev_dbg(&port->dev, "failed to add RCH dport, missing RCRB\n");
> @@ -1170,6 +1248,12 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port,
>  	} else {
>  		dev_dbg(dport_dev, "RCH dport added to %s\n",
>  			dev_name(&port->dev));
> +
> +		rc = cxl_debugfs_create_dport_dir(port->debug_dir, dport);
> +		if (rc)
> +			dev_dbg(dport_dev,
> +				"Failed to create rch dport debugfs dir: %d\n",
> +				rc);
>  	}
>  
>  	return dport;
> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
> index 687043ece101..3c7744fc3106 100644
> --- a/drivers/cxl/cxl.h
> +++ b/drivers/cxl/cxl.h
> @@ -590,6 +590,7 @@ struct cxl_dax_region {
>   * @depth: How deep this port is relative to the root. depth 0 is the root.
>   * @cdat: Cached CDAT data
>   * @cdat_available: Should a CDAT attribute be available in sysfs
> + * @debug_dir: dentry for port in cxl debugfs (optional)
>   */
>  struct cxl_port {
>  	struct device dev;
> @@ -612,6 +613,7 @@ struct cxl_port {
>  		size_t length;
>  	} cdat;
>  	bool cdat_available;
> +	struct dentry *debug_dir;

Part of why I asked for the debugfs file rename was to eliminate this
wart on the data structure.

>  };
>  
>  static inline struct cxl_dport *
> @@ -813,6 +815,14 @@ bool is_cxl_nvdimm_bridge(struct device *dev);
>  int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd);
>  struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd);
>  
> +struct cxl_einj_ops {
> +	int (*einj_type)(struct seq_file *f, void *data);
> +	int (*einj_inject)(struct cxl_dport *dport, u64 type);
> +};
> +
> +void cxl_einj_set_ops_cbs(struct cxl_einj_ops *ops);
> +
> +
>  #ifdef CONFIG_CXL_REGION
>  bool is_cxl_pmem_region(struct device *dev);
>  struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev);
> -- 
> 2.34.1
> 
> 







[Index of Archives]     [Linux IBM ACPI]     [Linux Power Management]     [Linux Kernel]     [Linux Laptop]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]
  Powered by Linux