On 10/2/18 12:44 AM, Marek Vasut wrote:
> On 09/22/2018 07:06 PM, Marek Vasut wrote:
>> On 09/18/2018 12:51 PM, Wolfram Sang wrote:
>>> On Tue, Sep 18, 2018 at 11:15:55AM +0100, Lorenzo Pieralisi wrote:
>>>> On Tue, May 22, 2018 at 12:05:14AM +0200, Marek Vasut wrote:
>>>>> From: Phil Edworthy <phil.edworthy@xxxxxxxxxxx>
>>>>>
>>>>> The PCIe DMA controller on RCar Gen2 and earlier is on 32bit bus,
>>>>> so limit the DMA range to 32bit.
>>>>>
>>>>> Signed-off-by: Phil Edworthy <phil.edworthy@xxxxxxxxxxx>
>>>>> Signed-off-by: Marek Vasut <marek.vasut+renesas@xxxxxxxxx>
>>>>> Cc: Arnd Bergmann <arnd@xxxxxxxx>
>>>>> Cc: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
>>>>> Cc: Phil Edworthy <phil.edworthy@xxxxxxxxxxx>
>>>>> Cc: Simon Horman <horms+renesas@xxxxxxxxxxxx>
>>>>> Cc: Wolfram Sang <wsa@xxxxxxxxxxxxx>
>>>>> Cc: linux-renesas-soc@xxxxxxxxxxxxxxx
>>>>> To: linux-pci@xxxxxxxxxxxxxxx
>>>>> ---
>>>>> NOTE: I'm aware of https://patchwork.kernel.org/patch/9495895/ , but the
>>>>>       discussion seems to have gone way off, so I'm sending this as a
>>>>>       RFC. Any feedback on how to do this limiting properly would be nice.
>>>>> ---
>>>>>  drivers/pci/host/pcie-rcar.c | 28 ++++++++++++++++++++++++++++
>>>>>  1 file changed, 28 insertions(+)
>>>>
>>>> The issue solved by this patch was solved in a more generic way through
>>>> this series:
>>>>
>>>> https://lists.linuxfoundation.org/pipermail/iommu/2018-July/028792.html
>>>>
>>>> I will therefore drop this patch from the PCI patch queue.
>>>
>>> Cool. Thanks for this series and thanks for the heads up!
>>>
>>> Marek, can you confirm our issue is fixed (if you haven't done already)?
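For context on what "limit the DMA range to 32bit" means in code: the
conventional driver-side knob is the DMA mask API. A minimal sketch,
illustrative only (this is not the contents of the RFC patch):

    #include <linux/dma-mapping.h>

    /* Cap both streaming and coherent DMA of a DMA-mastering device
     * to the low 4 GiB, e.g. because the bus only carries 32 address
     * bits. Typically called from the device's probe(). */
    static int limit_dma_to_32bit(struct device *dev)
    {
            return dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
    }

The catch for a PCI host bridge is that the limit has to apply to the
endpoints *behind* it, and those set their own masks (nvme asks for 64
bits). If I read the referenced series correctly, that is exactly why it
moves the limit into the DMA core as a bus-level mask derived from the
bridge's DT "dma-ranges", so no per-driver code is needed.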
>> I assembled a setup with NVMe controller, which should be triggering
>> this issue according to [1], but I'm not having much success getting it
>> to work with or without this patchset. Logs are below, nvme smart-log
>> seems to produce completely bogus results, although the device is at
>> least recognized.
>>
>> Maybe there's something else that's broken ?
>>
>> [1] https://patchwork.kernel.org/patch/9495895/
>>
>> # dmesg | egrep '(pci|nvm)'
>> rcar-pcie fe000000.pcie: host bridge /soc/pcie@fe000000 ranges:
>> rcar-pcie fe000000.pcie: Parsing ranges property...
>> rcar-pcie fe000000.pcie: IO 0xfe100000..0xfe1fffff -> 0x00000000
>> rcar-pcie fe000000.pcie: MEM 0xfe200000..0xfe3fffff -> 0xfe200000
>> rcar-pcie fe000000.pcie: MEM 0x30000000..0x37ffffff -> 0x30000000
>> rcar-pcie fe000000.pcie: MEM 0x38000000..0x3fffffff -> 0x38000000
>> rcar-pcie fe000000.pcie: PCIe x1: link up
>> rcar-pcie fe000000.pcie: Current link speed is 5 GT/s
>> rcar-pcie fe000000.pcie: PCI host bridge to bus 0000:00
>> pci_bus 0000:00: root bus resource [bus 00-ff]
>> pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
>> pci_bus 0000:00: root bus resource [mem 0xfe200000-0xfe3fffff]
>> pci_bus 0000:00: root bus resource [mem 0x30000000-0x37ffffff]
>> pci_bus 0000:00: root bus resource [mem 0x38000000-0x3fffffff pref]
>> pci_bus 0000:00: scanning bus
>> pci 0000:00:00.0: [1912:0025] type 01 class 0x060400
>> pci 0000:00:00.0: enabling Extended Tags
>> pci 0000:00:00.0: PME# supported from D0 D3hot D3cold
>> pci 0000:00:00.0: PME# disabled
>> pci_bus 0000:00: fixups for bus
>> pci 0000:00:00.0: scanning [bus 01-01] behind bridge, pass 0
>> pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1
>> pci_bus 0000:01: scanning bus
>> pci 0000:01:00.0: [8086:f1a5] type 00 class 0x010802
>> pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
>> pci 0000:01:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5 GT/s
>> x1 link at 0000:00:00.0 (capable of 31.504 Gb/s with 8 GT/s x4 link)
>> pci_bus 0000:01: fixups for bus
>> pci_bus 0000:01: bus scan returning with max=01
>> pci_bus 0000:01: busn_res: [bus 01-ff] end is updated to 01
>> pci_bus 0000:00: bus scan returning with max=01
>> pci 0000:00:00.0: BAR 14: assigned [mem 0xfe200000-0xfe2fffff]
>> pci 0000:01:00.0: BAR 0: assigned [mem 0xfe200000-0xfe203fff 64bit]
>> pci 0000:00:00.0: PCI bridge to [bus 01]
>> pci 0000:00:00.0: bridge window [mem 0xfe200000-0xfe2fffff]
>> pcieport 0000:00:00.0: assign IRQ: got 0
>> pcieport 0000:00:00.0: enabling device (0000 -> 0002)
>> pcieport 0000:00:00.0: enabling bus mastering
>> pcieport 0000:00:00.0: Signaling PME with IRQ 198
>> rcar-pcie ee800000.pcie: host bridge /soc/pcie@ee800000 ranges:
>> rcar-pcie ee800000.pcie: Parsing ranges property...
>> rcar-pcie ee800000.pcie: IO 0xee900000..0xee9fffff -> 0x00000000
>> rcar-pcie ee800000.pcie: MEM 0xeea00000..0xeebfffff -> 0xeea00000
>> rcar-pcie ee800000.pcie: MEM 0xc0000000..0xc7ffffff -> 0xc0000000
>> rcar-pcie ee800000.pcie: MEM 0xc8000000..0xcfffffff -> 0xc8000000
>> rcar-pcie ee800000.pcie: PCIe link down
>> nvme 0000:01:00.0: assign IRQ: got 171
>> nvme nvme0: pci function 0000:01:00.0
>> nvme 0000:01:00.0: enabling device (0000 -> 0002)
>> nvme 0000:01:00.0: enabling bus mastering
>> nvme nvme0: Shutdown timeout set to 60 seconds
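One experiment that might isolate the addressing question: clamp the
NVMe endpoint itself to 32-bit DMA and see whether the bogus smart-log
output below changes. A rough sketch of a test hack, not a fix -- the
device ID is taken from the log above, and the bus_dma_mask field
assumes the current generic-series code in linux-next, plus that the
arm64 DMA path actually honours it:

    #include <linux/pci.h>
    #include <linux/dma-mapping.h>

    /* Impose a 32-bit bus-level DMA limit on the NVMe endpoint.
     * Unlike a dma_set_mask() call, which the nvme driver would
     * override with its own 64-bit request, the bus mask caps
     * whatever mask the driver asks for later. */
    static void quirk_nvme_32bit_dma(struct pci_dev *pdev)
    {
            pdev->dev.bus_dma_mask = DMA_BIT_MASK(32);
    }
    DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0xf1a5,
                            quirk_nvme_32bit_dma);

If the smart-log data turns sane with this in place, 64-bit DMA
addressing is the culprit; if it stays zeroed, something else is broken.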
>> # lspci -nvvvs 01:00.0
>> 01:00.0 0108: 8086:f1a5 (rev 03) (prog-if 02 [NVM Express])
>>         Subsystem: 8086:390a
>>         Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0
>>         Interrupt: pin A routed to IRQ 171
>>         Region 0: Memory at fe200000 (64-bit, non-prefetchable) [size=16K]
>>         Capabilities: [40] Power Management version 3
>>                 Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>>         Capabilities: [70] Express (v2) Endpoint, MSI 00
>>                 DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
>>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
>>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
>>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop- FLReset-
>>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>>                 DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
>>                 LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L0s <1us, L1 <8us
>>                         ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
>>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                 LnkSta: Speed 5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>                 DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via message
>>                 DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>>                 LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
>>                          Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>>                          Compliance De-emphasis: -6dB
>>                 LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
>>                          EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
>>         Capabilities: [b0] MSI-X: Enable- Count=16 Masked-
>>                 Vector table: BAR=0 offset=00002000
>>                 PBA: BAR=0 offset=00002100
>>         Capabilities: [100 v2] Advanced Error Reporting
>>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>>                 UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>>                 CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
>>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>                 AERCap: First Error Pointer: 14, GenCap+ CGenEn- ChkCap+ ChkEn-
>>         Capabilities: [158 v1] #19
>>         Capabilities: [178 v1] Latency Tolerance Reporting
>>                 Max snoop latency: 0ns
>>                 Max no snoop latency: 0ns
>>         Capabilities: [180 v1] L1 PM Substates
>>                 L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>                           PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
>>                 L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>                            T_CommonMode=0us LTR1.2_Threshold=0ns
>>                 L1SubCtl2: T_PwrOn=10us
>>         Kernel driver in use: nvme
>> lspci: Unable to load libkmod resources: error -12
>>
>> # nvme smart-log /dev/nvme0
>> Smart Log for NVME device:nvme0 namespace-id:ffffffff
>> critical_warning : 0
>> temperature : -273 C
>> available_spare : 0%
>> available_spare_threshold : 0%
>> percentage_used : 0%
>> data_units_read : 0
>> data_units_written : 0
>> host_read_commands : 0
>> host_write_commands : 0
>> controller_busy_time : 0
>> power_cycles : 0
>> power_on_hours : 0
>> unsafe_shutdowns : 0
>> media_errors : 0
>> num_err_log_entries : 0
>> Warning Temperature Time : 0
>> Critical Composite Temperature Time : 0
>> Thermal Management T1 Trans Count : 0
>> Thermal Management T2 Trans Count : 0
>> Thermal Management T1 Total Time : 0
>> Thermal Management T2 Total Time : 0
>
> Any suggestions ?

Thank you for all your suggestions and support!

I revisited this issue with current linux-next, since I was curious why
the NVMe SSD didn't work for me on ARM64 while it works on x86-64. It
still doesn't work on ARM64, and I still don't know why. I limited the
link speed to 2.5 GT/s x1, but that doesn't help.

I suspect the problem is the 64-bit mapping of BAR0, while the
controller has a 32-bit DMA limitation. I tried an Intel i210 NIC and an
Intel 82574L NIC and they both work fine, but they only map BAR0 as
32-bit. Could that be the problem? I was under the impression that this
patch should address it.
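Two observations, for what they are worth. The smart-log above is all
zeros (a reported temperature of -273 C is 0 K, i.e. a zeroed reply
buffer), which is consistent with the controller's DMA writes never
reaching the host buffer. And the lspci output shows BAR0 is 64-bit
*capable* but assigned at fe200000, i.e. below 4 GiB; BAR decode (CPU to
device) is a separate path from the device's own DMA (device to memory).
A quick sketch for double-checking the programmed BAR from a test patch,
with a hypothetical helper built on standard config-space accessors:

    #include <linux/pci.h>

    /* Report whether BAR0 advertises 64-bit addressing and where it
     * was actually placed. */
    static void report_bar0(struct pci_dev *pdev)
    {
            u32 lo, hi = 0;
            bool is64;

            pci_read_config_dword(pdev, PCI_BASE_ADDRESS_0, &lo);
            is64 = (lo & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
                   PCI_BASE_ADDRESS_MEM_TYPE_64;
            if (is64)
                    pci_read_config_dword(pdev, PCI_BASE_ADDRESS_1, &hi);

            dev_info(&pdev->dev, "BAR0 at %#llx (%s BAR)\n",
                     ((u64)hi << 32) | (lo & PCI_BASE_ADDRESS_MEM_MASK),
                     is64 ? "64-bit" : "32-bit");
    }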
--
Best regards,
Marek Vasut