RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx>
> Sent: Friday, January 12, 2024 8:24 PM
> To: Dan Williams <dan.j.williams@xxxxxxxxx>
> Cc: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@xxxxxxxxxxx>;
> linux-pci@xxxxxxxxxxxxxxx; linux-cxl@xxxxxxxxxxxxxxx; Gotou, Yasunori/五島 康
> 文 <y-goto@xxxxxxxxxxx>
> Subject: Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
> 
> On Wed, 10 Jan 2024 17:11:38 -0800
> Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> 
> > Jonathan Cameron wrote:
> > > On Wed, 20 Dec 2023 14:07:35 +0900
> > > KobayashiDaisuke <kobayashi.da-06@xxxxxxxxxxx> wrote:
> > >
> > > > Hello.
> > > >
> > > > This patch series adds a feature to lspci that displays the link status
> > > > of the CXL1.1 device.
> > > >
> > > > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > > > the link status can be output in the same way as traditional PCIe.
> > > > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > > > different method to obtain the link status from traditional PCIe.
> > > > This is because the link status of the CXL1.1 device is not mapped
> > > > in the configuration space (as per cxl3.0 specification 8.1).
> > > > Instead, the configuration space containing the link status is mapped
> > > > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > > > Table 8-18). Therefore, the current lspci has a problem where it does
> > > > not display the link status of the CXL1.1 device.
> > > > This patch solves these issues.
> > > >
> > > > The method of acquisition is in the order of obtaining the device UID,
> > > > obtaining the base address from CEDT, and then obtaining the link
> > > > status from memory mapped register. Considered outputting with the cxl
> > > > command due to the scope of the CXL specification, but devices from
> > > > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > > > Therefore, it would be better to make the lspci command compatible with
> > > > the CXL1.1 device for compatibility reasons.
> > > >
> > > > I look forward to any comments you may have.
> > > Yikes.
> > >
> > > My gut feeling is that you shouldn't need to do this level of hackery.
> > >
> > > If we need this information to be exposed to tooling then we should
> > > add support to the kernel to export it somewhere in sysfs and read that
> > > directly.  Do we need it to be available in absence of the CXL driver
> > > stack?
> >
> > I am hoping that's a non-goal if only because that makes it more
> > difficult for the kernel to provide some help here without polluting to
> > the PCI core.
> >
> > To date, RCRB handling is nothing that the PCI core needs to worry
> > about, and I am not sure I want to open that box.
> >
> > I am wondering about an approach like below is sufficient for lspci.
> >
> > The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> > opt-in to publishing these hidden registers.
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 4fd1f207c84e..ee63dff63b68 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -960,6 +960,19 @@ static const struct pci_error_handlers
> cxl_error_handlers = {
> >         .cor_error_detected     = cxl_cor_error_detected,
> >  };
> >
> > +static struct attribute *cxl_rcd_attrs[] = {
> > +       &dev_attr_rcd_lnkcp.attr,
> > +       &dev_attr_rcd_lnkctl.attr,
> > +       NULL
> > +};
> > +
> > +static struct attribute_group cxl_rcd_group = {
> > +       .attrs = cxl_rcd_attrs,
> > +       .is_visible = cxl_rcd_visible,
> > +};
> > +
> > +__ATTRIBUTE_GROUPS(cxl_pci);
> > +
> >  static struct pci_driver cxl_pci_driver = {
> >         .name                   = KBUILD_MODNAME,
> >         .id_table               = cxl_mem_pci_tbl,
> > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> >         .err_handler            = &cxl_error_handlers,
> >         .driver = {
> >                 .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
> > +               .dev_groups     = cxl_rcd_groups,
> >         },
> >  };
> >
> >
> > However, the problem I believe is this will end up with:
> >
> > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> >
> > ...with valid values, but attributes like:
> >
> > /sys/bus/pci/devices/$pdev/current_link_speed
> >
> > ...returning -EINVAL.
> >
> > So I think the options are:
> >
> > 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
> >    piecemeal enable specific lspci needs with RCD-specific attributes
> 
> This one gets my vote.

Thank you for your feedback.
Like Dan, I also believe that implementing this feature in the kernel may 
not be appropriate, as it is specifically needed for CXL1.1 devices.
Therefore, I understand that it would be better to implement 
the link status of CXL1.1 devices directly in lspci.
Please tell me if my understanding is wrong.

> 
> >
> > ...or:
> >
> > 2/ Hack pcie_capability_read_word() to internally figure out that based
> >    on a config offset a device may have a hidden capability and switch over
> >    to RCRB based config-cycle access for those.
> >
> > Given that the CXL 1.1 RCH topology concept was immediately deprecated
> > in favor of VH topology in CXL 2.0, I am not inclined to pollute the
> > general Linux PCI core with that "aberration of history" as it were.
> Agreed.
> 






[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux