On 12/20/23 02:59, Manivannan Sadhasivam wrote: >>>>> from dw_pcie_ep_set_bar(), also needs to be dropped, so that the iATU >>>>> settings will be re-written for platforms with core_init_notifiers. >>>>> >>>>> Right now, for a platform with a core_init_notifier, if you run >>>>> pci_endpoint_test + reboot the RC (so that PERST is asserted + deasserted), >>>>> and then run pci_endpoint_test again, then I'm quite sure that >>>>> pci_endpoint_test will not pass the second time (because iATU settings >>>>> were not rewritten after reset). >>>>> >>>>> It would be nice if Mani or Vidya could confirm this. >> >> So problem 2 out of 2 introduced by the patch in $subject is that >> DWC drivers with a .core_init_notifier, will perform a reset_control_assert() >> to reset the core (which will reset both sticky and non-sticky registers), >> which means that the early return in dw_pcie_ep_set_bar(): >> https://github.com/torvalds/linux/blob/v6.7-rc5/drivers/pci/controller/dwc/pcie-designware-ep.c#L268-L269 >> >> while returning after the iATU settings have been written, >> it will return before: >> >> dw_pcie_writel_dbi2(pci, reg_dbi2, lower_32_bits(size - 1)); >> dw_pcie_writel_dbi(pci, reg, flags); >> >> Which means that, for drivers with a .core_init_notifier, BARx_REG and >> BARx_MASK registers will not be written. This means that they will have >> reset values for these registers, which means that e.g. the BAR_SIZE >> (which is defined by BARx_MASK) might be incorrect for these platforms >> because this function returns early. >> >> I will not send a fix for this problem, I will leave that to you, or Mani, >> or Vidya, and hope that people are happy that I simply reported this issue. >> > > The fix for this issue shouldn't be in the DWC driver but rather in the EPF > drivers. Because, EPF drivers are the ones calling pci_epc_set_bar() during > bind(). So they have to properly cleanup the resources once the perst assertion > happens. This issue also applies to other resources such as DMA channels. > > The problem here is, there is a disconnect between DWC (EPC) and EPF drivers. > When the perst assertion happens, the event is not passed to EPF drivers > resulting in the EPF drivers having no knowledge that they need to give up the > resources. > > We need to fix this in a sane way and I'm looking into it. > > But I really appreciate your finding here and in other thread where we discussed > similar issue. We have core_init and link_up notifiers for EPF drivers. Adding link_down and core_exit notifiers would make a lot of sense :) A core_reset notifier could also be a solution. Adding that would also help EPF drivers to handle links going down temporarily (which can happen). Right now, an EPF driver can only deal with such event by tracking if it gets multiple link_up events. -- Damien Le Moal Western Digital Research