Re: [PATCH v7 0/5] Add support for debugfs based RAS DES feature in PCIe DW

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 25, 2025 at 01:58:35PM +0530, Manivannan Sadhasivam wrote:
> On Mon, Feb 24, 2025 at 06:08:26PM +0100, Niklas Cassel wrote:
> > Hello Shradha,
> > 
> > On Fri, Feb 21, 2025 at 06:45:43PM +0530, Shradha Todi wrote:
> > > DesignWare controller provides a vendor specific extended capability
> > > called RASDES as an IP feature. This extended capability  provides
> > > hardware information like:
> > >  - Debug registers to know the state of the link or controller. 
> > >  - Error injection mechanisms to inject various PCIe errors including
> > >    sequence number, CRC
> > >  - Statistical counters to know how many times a particular event
> > >    occurred
> > > 
> > > However, in Linux we do not have any generic or custom support to be
> > > able to use this feature in an efficient manner. This is the reason we
> > > are proposing this framework. Debug and bring up time of high-speed IPs
> > > are highly dependent on costlier hardware analyzers and this solution
> > > will in some ways help to reduce the HW analyzer usage.
> > > 
> > > The debugfs entries can be used to get information about underlying
> > > hardware and can be shared with user space. Separate debugfs entries has
> > > been created to cater to all the DES hooks provided by the controller.
> > > The debugfs entries interacts with the RASDES registers in the required
> > > sequence and provides the meaningful data to the user. This eases the
> > > effort to understand and use the register information for debugging.
> > > 
> > > This series creates a generic debugfs framework for DesignWare PCIe
> > > controllers where other debug features apart from RASDES can also be
> > > added as and when required.
> > > 
> > > v7:
> > >     - Moved the patches to make finding VSEC IDs common from Mani's patchset [1]
> > >       into this series to remove dependancy as discussed
> > >     - Addressed style related change requests from v6
> > 
> > I tested this series, and one thing that I noticed:
> > 
> > # for f in /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/*/counter_enable; do echo 1 > $f; done
> > 
> > # grep "" /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/*/* | grep Disabled
> > /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/ctl_skp_os_parity_err/counter_enable:Counter Disabled
> > /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/deskew_uncompleted_err/counter_enable:Counter Disabled
> > /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/framing_err_in_l0/counter_enable:Counter Disabled
> > /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/margin_crc_parity_err/counter_enable:Counter Disabled
> > /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/retimer_parity_err_1st/counter_enable:Counter Disabled
> > /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/retimer_parity_err_2nd/counter_enable:Counter Disabled
> > 
> > that there are some events that cannot be enabled when testing on my platform,
> > rk3588, perhaps this is because my version of the DWC IP does not have these
> > events.
> > 
> > (Because all the other events can be enabled successfully:
> > # grep "" /sys/kernel/debug/dwc_pcie_a40000000.pcie/rasdes_event_counter/*/* | grep Enabled | wc -l
> > 29
> > )
> > 
> > 
> > So the question is, how do we want to handle that?
> >
> 
> This is a really good question.
>  
> > E.g. counter_enable_write() could theoretically read back the
> > dw_pcie_readl_dbi(pci, rinfo->ras_cap_offset + RAS_DES_EVENT_COUNTER_CTRL_REG);
> > register after doing the
> > ww_pcie_writel_dbi(pci, rinfo->ras_cap_offset + RAS_DES_EVENT_COUNTER_CTRL_REG, val);
> > 
> > to actually check if it could enable the event.
> > 
> > If counter_enable_write() could not enable the specific event, should it
> > perhaps return a failure to user space?
> > 
> 
> Yes, it would be appropriate to return -EOPNOTSUPP in that case. But I'd like to
> merge this series asap. So this patch can come on top of this series.

I agree that returning an error is probably the nicest thing.

However, this series has been picked up already :)

Is there anyone who volunteers on implementing the proposed feature?

If you have time, it would be interesting to see if you see the same behavior
on QCOM SoCs. (Assuming that your DWC PCIe controller does not implement all
events that Samsung DWC PCIe controller does.)


Kind regards,
Niklas




[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux