Thanks for taking a look Dan! Replies inline. On 12/7/23 5:13 PM, Dan Williams wrote: > Ben Cheatham wrote: >> Add creation of debugfs directories for ports and dports under >> /sys/kernel/debug/cxl when EINJ support is enabled. The dport >> directories will contain files for injecting CXL protocol errors. >> These files are only usable once the EINJ module has loaded and >> registered callback functions with the CXL core module, before that >> occurs (or if the EINJ module isn't loaded) the files will do nothing >> and return an ENODEV error. >> >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> >> Signed-off-by: Ben Cheatham <Benjamin.Cheatham@xxxxxxx> >> --- >> Documentation/ABI/testing/debugfs-cxl | 27 +++++++++ >> drivers/cxl/core/port.c | 84 +++++++++++++++++++++++++++ >> drivers/cxl/cxl.h | 10 ++++ >> 3 files changed, 121 insertions(+) >> >> diff --git a/Documentation/ABI/testing/debugfs-cxl b/Documentation/ABI/testing/debugfs-cxl >> index fe61d372e3fa..782a1bb78884 100644 >> --- a/Documentation/ABI/testing/debugfs-cxl >> +++ b/Documentation/ABI/testing/debugfs-cxl >> @@ -33,3 +33,30 @@ Description: >> device cannot clear poison from the address, -ENXIO is returned. >> The clear_poison attribute is only visible for devices >> supporting the capability. >> + >> +What: /sys/kernel/debug/cxl/portX/dportY/einj_types > > Given this file is identical contents for all dports it only needs to > exist in one common location > > /sys/kernel/debug/cxl/einj_types > Good point, I'll make that change. > >> +Date: November, 2023 >> +KernelVersion: v6.8 >> +Contact: linux-cxl@xxxxxxxxxxxxxxx >> +Description: >> + (RO) Prints the CXL protocol error types made available by >> + the platform in the format "0x<error number> <error type>". >> + The <error number> can be written to einj_inject to inject >> + <error type> into dportY. This file is only visible if >> + CONFIG_ACPI_APEI_EINJ is enabled, and the EINJ module must >> + be able to reach one (or both) of the CXL_ACPI or CXL_PORT >> + modules to be functional. > > This can be simplified. Have something like: > > config CXL_EINJ > default CXL_BUS > depends on ACPI_APEI_EINJ && ACPI_APEI_EINJ=CXL_BUS > ... > > Then the documentation moves to Kconfig for how to enable this and the > CXL code can directly call into the EINJ module without worry. > > It would of course need stubs like these in a shared header: > > #ifdef CONFIG_CXL_EINJ > int cxl_einj_available_error_type(struct seq_file *m, void *v); > int cxl_einj_inject_error(struct cxl_dport *dport, u64 type); > #else > static inline int cxl_einj_available_error_type(struct seq_file *m, void *v) > { > return -ENXIO; > } > > int cxl_einj_inject_error(struct cxl_dport *dport, u64 type) > { > return -ENXIO; > } > #endif > I had to go back and take a look, but I had a shared header in v5 (link: https://lore.kernel.org/linux-cxl/20230926120418.0000575d@xxxxxxxxxx/). Jonathan recommended that I instead include cxl.h directly, but that was pretty much a completely different patch set at the time (and the header was under include/linux/). That being said, I agree that a header under drivers/cxl would make much more sense here. >> + >> +What: /sys/kernel/debug/cxl/portX/dportY/einj_inject > > See my comments on cxl_debugfs_create_dport_dir() later on, but I think > the "portX" directory can be eliminated. > >> +Date: November, 2023 >> +KernelVersion: v6.8 >> +Contact: linux-cxl@xxxxxxxxxxxxxxx >> +Description: >> + (WO) Writing an integer to this file injects the corresponding >> + CXL protocol error into dportY (integer to type mapping is >> + available by reading from einj_types). If the dport was >> + enumerated in RCH mode, a CXL 1.1 error is injected, otherwise >> + a CXL 2.0 error is injected. This file is only visible if >> + CONFIG_ACPI_APEI_EINJ is enabled, and the EINJ module must >> + be able to reach one (or both) of the CXL_ACPI or CXL_PORT >> + modules to be functional. > > Similar comments about dropping these details that can just be solved in > Kconfig. > > This is next comment is on EINJ ABI, but you can skip it just to > maintain momentum with the status quo. Why require the user to do the > string to integer conversion? Seems like a small matter of programming > to allow: > > echo "CXL.cache Protocol Correctable" > einj_inject > > ...to do the right thing. That probably makes scripts more readable as > well. > That's a good point. I can do that, but I think it may be better to keep the consistency with the EINJ module to simplify things for end users. If you feel that isn't a big enough concern I can go ahead and modify it. >> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c >> index 38441634e4c6..acf10415a174 100644 >> --- a/drivers/cxl/core/port.c >> +++ b/drivers/cxl/core/port.c >> @@ -783,6 +783,72 @@ static int cxl_dport_setup_regs(struct device *host, struct cxl_dport *dport, >> return rc; >> } >> >> +static struct cxl_einj_ops einj_ops; >> +void cxl_einj_set_ops_cbs(struct cxl_einj_ops *ops) >> +{ >> + if (!IS_REACHABLE(CONFIG_ACPI_APEI_EINJ) || !ops) >> + return; >> + >> + einj_ops = *ops; >> +} >> +EXPORT_SYMBOL_NS_GPL(cxl_einj_set_ops_cbs, CXL); > > einj_ops goes away when the CXL code can just call the EINJ module > directly. > >> + >> +static int cxl_einj_type_show(struct seq_file *f, void *data) >> +{ >> + if (!einj_ops.einj_type) >> + return -ENODEV; >> + >> + return einj_ops.einj_type(f, data); >> +} >> + >> +static int cxl_einj_inject(void *data, u64 type) >> +{ >> + struct cxl_dport *dport = data; >> + >> + if (!einj_ops.einj_inject) >> + return -ENODEV; >> + >> + return einj_ops.einj_inject(dport, type); >> +} >> +DEFINE_DEBUGFS_ATTRIBUTE(cxl_einj_inject_fops, NULL, cxl_einj_inject, "%llx\n"); > > The wrappers go away and DEFINE_DEBUGFS_ATTRIBUTE() can reference the > EINJ symbols directly. > >> + >> +static int cxl_debugfs_create_dport_dir(struct dentry *port_dir, >> + struct cxl_dport *dport) >> +{ >> + struct dentry *dir; >> + char dir_name[32]; >> + >> + snprintf(dir_name, 31, "dport%d", dport->port_id); > > How about dev_name(dport->dport_dev) rather than the dynamic name? > > The other benefit of this is that the dport_dev names are unique, so you > can move the einj_inject file to: > > /sys/kernel/debug/cxl/$dport_dev/einj_inject > I didn't realize the dport names were also unique. I'll go ahead and do that instead. >> + dir = debugfs_create_dir(dir_name, port_dir); >> + if (IS_ERR(dir)) >> + return PTR_ERR(dir); >> + >> + debugfs_create_devm_seqfile(dport->dport_dev, "einj_types", dir, >> + cxl_einj_type_show); > > Per above, move this to be a top-level file. > Will do. >> + >> + debugfs_create_file("einj_inject", 0200, dir, dport, >> + &cxl_einj_inject_fops); >> + return 0; > > debugfs is good about failing gracefully when pre-requisites are not > present. This is why none of the debugfs creation helpers have return > codes because failing to setup debugfs is never fatal. > > In other words, it is ok to take the output of debugfs_create_dir() > without checking, and this function should not be returning an error. > Will do. >> +} >> + >> +static struct dentry *cxl_debugfs_create_port_dir(struct cxl_port *port) >> +{ >> + const char *dir_name = dev_name(&port->dev); >> + struct dentry *dir; >> + >> + if (!IS_ENABLED(CONFIG_ACPI_APEI_EINJ)) >> + return ERR_PTR(-ENODEV); >> + >> + dir = cxl_debugfs_create_dir(dir_name); >> + if (IS_ERR(dir)) { >> + dev_dbg(&port->dev, "Failed to create port debugfs dir: %ld\n", >> + PTR_ERR(dir)); >> + return dir; >> + } >> + >> + return dir; >> +} >> + >> static struct cxl_port *__devm_cxl_add_port(struct device *host, >> struct device *uport_dev, >> resource_size_t component_reg_phys, >> @@ -861,6 +927,7 @@ struct cxl_port *devm_cxl_add_port(struct device *host, >> struct cxl_dport *parent_dport) >> { >> struct cxl_port *port, *parent_port; >> + struct dentry *dir; >> >> port = __devm_cxl_add_port(host, uport_dev, component_reg_phys, >> parent_dport); >> @@ -878,6 +945,10 @@ struct cxl_port *devm_cxl_add_port(struct device *host, >> parent_port ? " to " : "", >> parent_port ? dev_name(&parent_port->dev) : "", >> parent_port ? "" : " (root port)"); >> + >> + dir = cxl_debugfs_create_port_dir(port); >> + if (!IS_ERR(dir)) >> + port->debug_dir = dir; >> } >> >> return port; >> @@ -1127,6 +1198,7 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port, >> resource_size_t component_reg_phys) >> { >> struct cxl_dport *dport; >> + int rc; >> >> dport = __devm_cxl_add_dport(port, dport_dev, port_id, >> component_reg_phys, CXL_RESOURCE_NONE); >> @@ -1136,6 +1208,11 @@ struct cxl_dport *devm_cxl_add_dport(struct cxl_port *port, >> } else { >> dev_dbg(dport_dev, "dport added to %s\n", >> dev_name(&port->dev)); >> + >> + rc = cxl_debugfs_create_dport_dir(port->debug_dir, dport); >> + if (rc) >> + dev_dbg(dport_dev, >> + "Failed to create dport debugfs dir: %d\n", rc); > > Drop the debug messages about failing to setup debugfs. This follows the > lead of other debugfs setup in CXL. > Will do. >> } >> >> return dport; >> @@ -1156,6 +1233,7 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port, >> resource_size_t rcrb) >> { >> struct cxl_dport *dport; >> + int rc; >> >> if (rcrb == CXL_RESOURCE_NONE) { >> dev_dbg(&port->dev, "failed to add RCH dport, missing RCRB\n"); >> @@ -1170,6 +1248,12 @@ struct cxl_dport *devm_cxl_add_rch_dport(struct cxl_port *port, >> } else { >> dev_dbg(dport_dev, "RCH dport added to %s\n", >> dev_name(&port->dev)); >> + >> + rc = cxl_debugfs_create_dport_dir(port->debug_dir, dport); >> + if (rc) >> + dev_dbg(dport_dev, >> + "Failed to create rch dport debugfs dir: %d\n", >> + rc); >> } >> >> return dport; >> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h >> index 687043ece101..3c7744fc3106 100644 >> --- a/drivers/cxl/cxl.h >> +++ b/drivers/cxl/cxl.h >> @@ -590,6 +590,7 @@ struct cxl_dax_region { >> * @depth: How deep this port is relative to the root. depth 0 is the root. >> * @cdat: Cached CDAT data >> * @cdat_available: Should a CDAT attribute be available in sysfs >> + * @debug_dir: dentry for port in cxl debugfs (optional) >> */ >> struct cxl_port { >> struct device dev; >> @@ -612,6 +613,7 @@ struct cxl_port { >> size_t length; >> } cdat; >> bool cdat_available; >> + struct dentry *debug_dir; > > Part of why I asked for the debugfs file rename was to eliminate this > wart on the data structure. > Yeah I wasn't that happy about adding it, so I'd be happy to remove it! >> }; >> >> static inline struct cxl_dport * >> @@ -813,6 +815,14 @@ bool is_cxl_nvdimm_bridge(struct device *dev); >> int devm_cxl_add_nvdimm(struct cxl_memdev *cxlmd); >> struct cxl_nvdimm_bridge *cxl_find_nvdimm_bridge(struct cxl_memdev *cxlmd); >> >> +struct cxl_einj_ops { >> + int (*einj_type)(struct seq_file *f, void *data); >> + int (*einj_inject)(struct cxl_dport *dport, u64 type); >> +}; >> + >> +void cxl_einj_set_ops_cbs(struct cxl_einj_ops *ops); >> + >> + >> #ifdef CONFIG_CXL_REGION >> bool is_cxl_pmem_region(struct device *dev); >> struct cxl_pmem_region *to_cxl_pmem_region(struct device *dev); >> -- >> 2.34.1 >> >> > > >