RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



+ Borislav, Tony, James, Mauro, Robert

Hi All,

Synopsys DesignWare PCIe controllers have a vendor specific capability (which
means that this set of registers are only present in DesignWare controllers)
to perform debug operations called "RASDES".
The functionalities provided by this extended capability are:

1. Debug: This has some debug related diagnostic features like holding LTSSM
in certain states, reading the status of lane detection, checking if any PCIe
lanes are broken (RX Valid) and so on. It's a debug only feature used for diagnostic
use-cases.

2. Error Injection: This is a way to inject certain errors in PCIe like LCRC, ECRC,
Bad TLPs and so on. Again, this is a debug feature and generally not used in
functional use-case.

3. Statistical counters: This has 3 parts
 - Error counters
 - Non error counters (covered as part of perf [1])
 - Time based analysis counters (covered as part of perf [1])

Selective features of  the above functionality has been implemented
by vendor specific PCIe controller drivers (pcie-tegra194.c) that use
Synopsys DesignWare PCIe controllers.
In order to make it useful to all vendors using DWC controller, we had
proposed a common implementation in DWC PCIe controller directory
(drivers/pci/controller/dwc/) and our original idea was based on debugfs
filesystem. v1 and v2 are mentioned in [2] and [3].

We got a suggestion to implement this as part of EDAC framework [3] and
we looked into the same. But as far as I understood, what I am trying to
implement is a very specific feature (only valid for Synopsys DWC PCIe controllers).
This doesn't seem to fit in very well with the EDAC framework and we can 
hardly use any of the EDAC framework APIs. We tried implementing a
"pci_driver" but since a function driver will already be running on the EP and
portdrv on the root-complex, we will not be able to bind 2 drivers to a single
PCI device (root-complex or endpoint). Ultimately, what I will be doing is
writing a platform driver with debugfs entries which will be present in EDAC
directory instead of DWC directory.

Can  you please help us out by going through this thread [3] and letting us
know if our understanding is wrong at any point. If you think it is a better
idea to integrate this in the EDAC framework, can you guide me as
to how I can utilize the framework better?
Please let me know if you need any other information to conclude.

[1] https://lore.kernel.org/linux-pci/20231121013400.18367-1-xueshuai@xxxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@xxxxxxxxxxx/T/
[3] https://lore.kernel.org/all/20231130115044.53512-1-shradha.t@xxxxxxxxxxx/

Thanks,
Shradha

> -----Original Message-----
> From: 'Manivannan Sadhasivam' <manivannan.sadhasivam@xxxxxxxxxx>
> Sent: 16 February 2024 19:19
> To: Shradha Todi <shradha.t@xxxxxxxxxxx>
> Cc: lpieralisi@xxxxxxxxxx; kw@xxxxxxxxx; robh@xxxxxxxxxx;
> bhelgaas@xxxxxxxxxx; jingoohan1@xxxxxxxxx;
> gustavo.pimentel@xxxxxxxxxxxx; josh@xxxxxxxxxxxxxxxx;
> lukas.bulwahn@xxxxxxxxx; hongxing.zhu@xxxxxxx;
> pankaj.dubey@xxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> pci@xxxxxxxxxxxxxxx; vidyas@xxxxxxxxxx; gost.dev@xxxxxxxxxxx
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
> 
> On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> >
> >
> 
> [...]
> 
> > > For the error injection and counters, we already have the EDAC
> > > framework. So adding them in the DWC driver doesn't make sense to me.
> > >
> >
> > Sorry for late response, was going through the EDAC framework to understand
> better how we can fit RAS DES support in it. Below are some technical challenges
> found so far:
> > 1: This debugfs framework proposed [1] can run on both side of the link i.e. RC
> and EP as it will be a part of the link controller platform driver. Here for the EP
> side the assumption is that it has Linux running, which is primarily a use case for
> chip-to-chip communication.  After your suggestion to migrate to EDAC
> framework we studied and here are the findings:
> > - If we move to EDAC framework, we need to have RAS DES as a
> > pci_driver which will be binded based on vendor_id and device_id. Our
> > observation is that on EP side system we are unable to bind two
> > function driver (pci_driver), as pci_endpoint_test function driver or
> > some other chip-to-chip function driver will already be bound. On the
> > other hand, on RC side we observed that if we have portdrv enabled in
> > Linux running on RC system, it gets bound to RC controller and then it
> > does not allow EDAC pci_driver to bind. So basically we see a problem
> > here, that we can't have two pci_driver binding to same PCI device
> > 2: Another point is even though we use EDAC driver framework, we may not be
> able to use any of EDAC framework APIs as they are mostly suitable for memory
> controller devices sitting on PCI BUS. We will end up using debugfs entries just via
> a pci_driver placed inside EDAC framework.
> 
> Please wrap your replies to 80 characters.
> 
> There is no need to bind the edac driver to VID:PID of the device. The edac driver
> can be a platform driver and you can instantiate the platform device from the
> DWC driver. This way, the PCI device can be assocaited with whatever driver, but
> still there can be a separate edac driver for handling errors.
> 
> Regarding API limitation, you should ask the maintainer about the possibility of
> extending them.
> 
> >
> > Please let me know if my understanding is wrong.
> >
> > > But first check with the perf driver author if they have any plans
> > > on adding the proposed functionality. If they do not have any plan
> > > or not working on it, then look into EDAC.
> > >
> > > - Mani
> > >
> >
> > Since we already worked and posted patches [1], [2], we will continue to work
> on this and based on consent from community we will adopt to most suitable
> framework.
> > We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that
> give information about the current status of the running system and as of now
> based on our findings, we still feel there is no harm in having debugfs entry based
> support in DesignWare controller driver itself.
> 
> There is no issue in exposing the debug information through debugfs, that's the
> sole purpose of the interface. But here, you are trying to add support for DWC
> RAS feature for which a dedicated framework already exists.
> 
> And there will be more similar requests coming for vendor specific error protocols
> as well. So your investigation could benefit everyone.
> 
> From your above investigation, looks like there are some shortcomings of the
> EDAC framework. So let's get that clarified by writing to the EDAC maintainers
> (keep us in CC). If the EDAC maintainer suggests you to add support for this
> feature in DWC driver itself citing some reasons, then no issues with me.
> 
> - Mani
> 
> --
> மணிவண்ணன் சதாசிவம்







[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux