Re: [PATCH v3 2/2] PCI: rcar: Return all Fs from read which triggered an exception

On Friday 18 February 2022 02:53:39 Marek Vasut wrote:
> On 2/17/22 14:04, Pali Rohár wrote:
> 
> [...]
> 
> > > > > > Flipping either bit makes no difference, suspend/resume behaves the same and
> > > > > > the link always recovers.
> > > > > 
> > > > > Ok, perfect! And what happens without suspend/resume (just in normal
> > > > > conditions)? E.g. during active usage of some PCIe card (wifi, sata, etc..).
> > > > 
> > > > PING? Also what lspci see for the root port and card itself during hot reset?
> > > 
> > > If I recall, lspci showed the root port and card.
> > 
> > This is suspicious. Card should not respond to config read requests when
> > is in hot reset state. Could you send output of lspci -vvxx of the root
> > port and also card during this test? Maybe it is possible that root port
> > has broken BRIDGE_CONTROL register and did not put card into Hot Reset
> > state?
> 
> Yes, I could set the hardware up again and run more tests, it will take some
> time, but I can still do that.
> 
> But before I spend any more time running tests for you here, I have to
> admit, it seems to me running all those tests is completely off-topic in
> context of these two bugfixes here.

I do not think this is off-topic. The issue here is caused when the
controller is not in L0 state, and this test deterministically puts the
controller into a non-L0 state (Hot Reset). The best way to verify race
conditions and similar timing problems is to set up a scenario in which
the timing windows are under full control, which this test can do.
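
For reference, a rough sketch of what "putting the link into Hot Reset"
means at the register level. This is only a model on top of a plain byte
array, not rcar driver code: struct cfg_space and the hot_reset_*()
helpers are names made up for illustration. Only the register offset and
the Secondary Bus Reset bit are taken from the PCI-to-PCI bridge
definitions (they match PCI_BRIDGE_CONTROL and PCI_BRIDGE_CTL_BUS_RESET
in pci_regs.h).

```c
#include <stdint.h>

/* Bridge Control register offset and Secondary Bus Reset bit. */
#define PCI_BRIDGE_CONTROL        0x3e
#define PCI_BRIDGE_CTL_BUS_RESET  0x40

/* Hypothetical stand-in for the root port's 256-byte config space. */
struct cfg_space {
	uint8_t bytes[256];
};

/* Little-endian 16-bit accessors on the modelled config space. */
static inline uint16_t cfg_read16(const struct cfg_space *c, unsigned int off)
{
	return (uint16_t)(c->bytes[off] | (c->bytes[off + 1] << 8));
}

static inline void cfg_write16(struct cfg_space *c, unsigned int off, uint16_t v)
{
	c->bytes[off] = (uint8_t)(v & 0xff);
	c->bytes[off + 1] = (uint8_t)(v >> 8);
}

/* Set Secondary Bus Reset: a working root port then holds the link in Hot Reset. */
static inline void hot_reset_assert(struct cfg_space *rp)
{
	uint16_t ctl = cfg_read16(rp, PCI_BRIDGE_CONTROL);

	cfg_write16(rp, PCI_BRIDGE_CONTROL, ctl | PCI_BRIDGE_CTL_BUS_RESET);
}

/* Clear Secondary Bus Reset: the link is allowed to retrain to L0. */
static inline void hot_reset_deassert(struct cfg_space *rp)
{
	uint16_t ctl = cfg_read16(rp, PCI_BRIDGE_CONTROL);

	cfg_write16(rp, PCI_BRIDGE_CONTROL, ctl & (uint16_t)~PCI_BRIDGE_CTL_BUS_RESET);
}

/* Non-zero while the Secondary Bus Reset bit reads back as set. */
static inline int hot_reset_active(const struct cfg_space *rp)
{
	return (cfg_read16(rp, PCI_BRIDGE_CONTROL) & PCI_BRIDGE_CTL_BUS_RESET) != 0;
}
```

On real hardware the window between assert and deassert is exactly the
time in which the card must not answer config requests, so the width of
that window is fully under the control of the tester.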

I have seen more issues related to PCIe slave errors, and I have the
feeling that this patch just works around one or two consequences
instead of fixing the source of the problem globally.

In most cases slave errors are (incorrectly) reported to the CPU when
the PCIe controller receives a UR/CA response from the bus, or when the
controller itself generates a UR/CA response for a request from the CPU.
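
To make the expected behaviour concrete, here is a minimal sketch of how
a failed read should look from the CPU side: the read completes and
returns all Fs instead of raising a slave error. The enum values are the
Completion Status encodings from the PCIe spec; cfg_read_result() is a
made-up helper for illustration and does not correspond to any function
in the rcar driver. CRS is intentionally left out of this model.

```c
#include <stdint.h>

/* PCIe Completion Status field encodings (CRS not modelled here). */
enum cpl_status {
	CPL_SC = 0x0,	/* Successful Completion */
	CPL_UR = 0x1,	/* Unsupported Request */
	CPL_CA = 0x4,	/* Completer Abort */
};

/*
 * Hypothetical helper: map the completion of a read request to the
 * value the CPU should observe. Only a Successful Completion carries
 * valid data; UR and CA are turned into all Fs rather than a bus error.
 */
static inline uint32_t cfg_read_result(enum cpl_status st, uint32_t data)
{
	return st == CPL_SC ? data : 0xffffffffu;
}
```

With that mapping a read which hits UR or CA is indistinguishable, for
the caller, from a read of a device which is not there, and no abort
handler is involved at all.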

> So maybe it would make sense to stop the discussion here and move it to
> separate thread ?
> 
> I have to admit, I also don't quite understand what it is that you're trying
> to find out with all those tests.

Moreover, if this test shows that the PCI Bridge registers do not work
properly, then that is something which must be fixed too.

There were more discussions about catching and recovering from ARM CPU
aborts, and all patches for catching asynchronous exceptions were
rejected because they cannot work due to their _imprecise_ nature.

And there were also discussions (not sure if on ML or IRC) about whether
the PCI core / drivers are the correct place for handling ARMv7
exceptions / data aborts.


