Re: [PATCH V6] PCI: rcar: Add L1 link state fix into data abort hook

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/17/21 7:33 PM, Bjorn Helgaas wrote:
[...]

Note that this fixup is applicable only to Aarch32 R-Car controllers,
the Aarch64 R-Car perform the same fixup in TFA, see TFA commit [1]
0969397f2 ("rcar_gen3: plat: Prevent PCIe hang during L1X config access")
[1] https://github.com/ARM-software/arm-trusted-firmware/commit/0969397f295621aa26b3d14b76dd397d22be58bf

This patch is horribly ugly but it's working around a horrible
hardware problem, and I don't have any better suggestions, so I guess
we don't really have much choice.

I do think the commit log is a bit glib:

Should I reword the commit message one more time and send V7 ?

   - "The R-Car PCIe controller is capable of handling L0s/L1 link
     states."  AFAICT every PCIe device is required to handle L0 and L1
     without software assistance.  So saying R-Car is "capable" puts a
     better face on this than seems warranted.

     L0s doesn't seem relevant at all; at least it doesn't seem to play
     a role in the patch.  There's no such thing as "returning to L0s"
     as mentioned in the comment below; L0s is only reachable from L0.
     Returns from L1 only go to L0 (PCIe r5.0, fig 5-1).

   - "The problem is, this transition is not atomic."  I think the
     *problem* is the hardware is broken in the first place.  This
     transition is supposed to be invisible to software.

   - "Just like other PCI controller drivers ..." suggests that this is
     an ordinary situation that we shouldn't be concerned about.  This
     patch may be the best we can do to work around a bad hardware
     defect, but it's definitely not ordinary.

     I think the other hook_fault_code() uses are for reporting
     legitimate PCIe errors, which most controllers log and turn
     into ~0 data responses without generating an abort or machine
     check, not things caused by hardware defects, so they're not
     really comparable.

Has Renesas documented this as an erratum?

They are aware of this.

Will future devices
require additions to rcar_pcie_abort_handler_of_match[]?

No, this change is for legacy arm32 SoCs only.

It'd be nice if the commit log mentioned the user-visible effect of
this problem.  I guess it does mention external aborts -- I assume you
see those when downstream devices go to D3hot or when ASPM puts the
link in L1?  And the abort results in a reboot?

It results in a hang.

To be clear, I'm not objecting to the patch.  It's a hardware problem
and we should work around it as best we can.

[...]



[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux