Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jonathan Cameron wrote:
> On Wed, 20 Dec 2023 14:07:35 +0900
> KobayashiDaisuke <kobayashi.da-06@xxxxxxxxxxx> wrote:
> 
> > Hello.
> > 
> > This patch series adds a feature to lspci that displays the link status
> > of the CXL1.1 device.
> > 
> > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > the link status can be output in the same way as traditional PCIe.
> > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > different method to obtain the link status from traditional PCIe.
> > This is because the link status of the CXL1.1 device is not mapped
> > in the configuration space (as per cxl3.0 specification 8.1).
> > Instead, the configuration space containing the link status is mapped
> > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > Table 8-18). Therefore, the current lspci has a problem where it does
> > not display the link status of the CXL1.1 device. 
> > This patch solves these issues.
> > 
> > The method of acquisition is in the order of obtaining the device UID,
> > obtaining the base address from CEDT, and then obtaining the link
> > status from memory mapped register. Considered outputting with the cxl
> > command due to the scope of the CXL specification, but devices from
> > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > Therefore, it would be better to make the lspci command compatible with
> > the CXL1.1 device for compatibility reasons.
> > 
> > I look forward to any comments you may have.
> Yikes. 
> 
> My gut feeling is that you shouldn't need to do this level of hackery.
> 
> If we need this information to be exposed to tooling then we should
> add support to the kernel to export it somewhere in sysfs and read that
> directly.  Do we need it to be available in absence of the CXL driver
> stack? 

I am hoping that's a non-goal if only because that makes it more
difficult for the kernel to provide some help here without polluting to
the PCI core.

To date, RCRB handling is nothing that the PCI core needs to worry
about, and I am not sure I want to open that box.

I am wondering about an approach like below is sufficient for lspci.

The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
opt-in to publishing these hidden registers.

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4fd1f207c84e..ee63dff63b68 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
        .cor_error_detected     = cxl_cor_error_detected,
 };
 
+static struct attribute *cxl_rcd_attrs[] = {
+       &dev_attr_rcd_lnkcp.attr,
+       &dev_attr_rcd_lnkctl.attr,
+       NULL
+};
+
+static struct attribute_group cxl_rcd_group = {
+       .attrs = cxl_rcd_attrs,
+       .is_visible = cxl_rcd_visible,
+};
+
+__ATTRIBUTE_GROUPS(cxl_pci);
+
 static struct pci_driver cxl_pci_driver = {
        .name                   = KBUILD_MODNAME,
        .id_table               = cxl_mem_pci_tbl,
@@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
        .err_handler            = &cxl_error_handlers,
        .driver = {
                .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
+               .dev_groups     = cxl_rcd_groups,
        },
 };
 

However, the problem I believe is this will end up with:

/sys/bus/pci/devices/$pdev/rcd_lnkcap
/sys/bus/pci/devices/$pdev/rcd_lnkctl

...with valid values, but attributes like:

/sys/bus/pci/devices/$pdev/current_link_speed

...returning -EINVAL.

So I think the options are:

1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
   piecemeal enable specific lspci needs with RCD-specific attributes

...or:

2/ Hack pcie_capability_read_word() to internally figure out that based
   on a config offset a device may have a hidden capability and switch over
   to RCRB based config-cycle access for those.

Given that the CXL 1.1 RCH topology concept was immediately deprecated
in favor of VH topology in CXL 2.0, I am not inclined to pollute the
general Linux PCI core with that "aberration of history" as it were.




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux