Re: [BUG] rk3399 fails to reboot correctly with PCIE device inserted


 



On 04/12/2019 5:28 pm, Peter Geis wrote:
On Mon, Nov 25, 2019 at 7:05 PM Peter Geis <pgwipeout@xxxxxxxxx> wrote:

On Mon, Nov 25, 2019 at 12:10 PM Peter Geis <pgwipeout@xxxxxxxxx> wrote:

On Mon, Nov 25, 2019 at 11:52 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:

Hi Peter,

On 25/11/2019 4:28 pm, Peter Geis wrote:
Good Morning,

Another issue I've come across while testing PCIE on the rockpro64.
When a PCIE device is inserted into the board, and it enumerates
successfully, the board will not reset.
I've tried various states of u-boot-rockchip, u-boot-mainline, with
both miniloader and TPL/SPL.

In case it's relevant, what particular PCIe device(s) have you seen the
issue with? FWIW I've been running a Samsung 960 Evo NVMe in my
NanoPC-T4 with mainline kernels for months now and it's always rebooted
flawlessly.

Robin.

Currently with an I350 NIC, but also observed with a PCIe switch and the GTX645.
The NIC works, while the other two didn't without the patch to hijack
the error handler.

I am running the latest atf built from their github.

On closer examination, it isn't the pcie devices causing the reboot
issues, the rk3399 just doesn't reboot.
It would seem the trigger with miniloader was random enough that it
appeared to be tied to my pcie testing.
It happens 100% of the time with tpl/spl.

With further testing, I think I've found the trigger of the reboot failure.
It would seem with ATF compiled from source, psci-reboot is not
actually triggering the reboot.
The reason my board stopped rebooting entirely is because I had
somehow broken the psci-watchdog.

I rebuilt all from source, stripping all modifications I had done and
using the defconfigs.
I get the following message at reboot time:
[ 2839.724508] watchdog: watchdog0: watchdog did not stop!
[ 2841.162516] reboot: Restarting system
U-Boot TPL 2020.01-rc3-00070-g9a0cbae22a-dirty (Dec 03 2019 - 14:07:57)
Whereas before the watchdog alert was not triggering and reboot never occurred.

It would seem that the psci-reboot function is dead, and the only
reason the boards are actually rebooting is because the psci-watchdog
is triggering the reboot after the kernel fails to check in.
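
The three outcomes described so far (no restart message at all, a restart message followed by a hang, and a restart message followed by a bootloader banner) can be told apart mechanically from a captured serial console. The following is only a rough sketch in Python: the function name and the returned labels are hypothetical, and the matched strings are simply copied from the console output quoted above.

```python
import re

# Strings copied from the console output quoted in this thread.
WATCHDOG_WARNING = "watchdog did not stop!"
RESTART_MESSAGE = "reboot: Restarting system"
# Assumption: a first-stage U-Boot banner marks the start of the next boot.
BOOTLOADER_BANNER = re.compile(r"^U-Boot (TPL|SPL)\b")


def classify_reboot(console_lines):
    """Return a (hypothetical) label for how far a reboot attempt got."""
    saw_watchdog = False
    saw_restart = False
    for line in console_lines:
        if WATCHDOG_WARNING in line:
            saw_watchdog = True
        elif RESTART_MESSAGE in line:
            saw_restart = True
        elif saw_restart and BOOTLOADER_BANNER.match(line.strip()):
            # The board came back up; note whether the watchdog was left running.
            return "rebooted-watchdog-armed" if saw_watchdog else "rebooted"
    if saw_restart:
        return "hung-after-restart-message"
    return "restart-message-never-printed"
```

A result of "rebooted-watchdog-armed" would be consistent with the watchdog, rather than psci-reboot, doing the actual reset, though the log alone cannot prove that.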

Now I am still having the issue with boot hanging after a warm reboot
when certain pci-e devices are installed (particularly, the i350
network controller).
I think this may be due to the pci-e controller driver lacking proper
shutdown cleanup code, which is allowing the i350 to continue to
trigger either interrupts or dma transfers following the soft-reboot.

The hang occurs at roughly the same point, when either the iommu or the
first dma device is initialized.
Occasionally the A72 cluster fails to initialize as well.

It turns out there's been a general issue with upstream ATF failing to reboot RK3399 correctly, which has just been tracked down to power domain states getting out of sync - there's more info on the U-Boot list here: https://lists.denx.de/pipermail/u-boot/2019-December/392348.html

Robin.


Log is below:
[    0.261198] Detected PIPT I-cache on CPU5
[    0.261223] GICv3: CPU5: found redistributor 101 region 0:0x00000000fefa0000
[    0.261235] GICv3: CPU5: using allocated LPI pending table @0x00000000f0120000
[    0.261263] CPU5: Booted secondary processor 0x0000000101 [0x410fd082]
[    0.261377] smp: Brought up 1 node, 6 CPUs
[    0.274833] SMP: Total of 6 processors activated.
[    0.275297] CPU features: detected: 32-bit EL0 Support
[    0.275801] CPU features: detected: CRC32 instructions
[    0.290797] CPU: All CPU(s) started at EL2
[    0.291242] alternatives: patching kernel code
[    0.294848] devtmpfs: initialized
[    0.311658] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.312629] futex hash table entries: 2048 (order: 5, 131072 bytes, linear)
[    0.315223] pinctrl core: initialized pinctrl subsystem
[    0.318097] DMI not present or invalid.
[    0.318989] NET: Registered protocol family 16
[    0.326798] DMA: preallocated 256 KiB pool for atomic allocations
[    0.327415] audit: initializing netlink subsys (disabled)
[    0.328106] audit: type=2000 audit(0.320:1): state=initialized audit_enabled=0 res=1
[    0.330213] cpuidle: using governor menu
[    0.331160] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.334653] Serial: AMBA PL011 UART driver
[    0.384125] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[    0.384800] HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
[    0.385483] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.386146] HugeTLB registered 64.0 KiB page size, pre-allocated 0 pages
[    0.390063] cryptd: max_cpu_qlen set to 1000
[    0.396205] ACPI: Interpreter disabled.
[    0.399113] vcc3v3_pcie: supplied by vcc12v_dcin
[    0.400706] vcc5v0_sys: supplied by vcc12v_dcin
[    0.401426] vcc5v0_usb: supplied by vcc12v_dcin
[    0.402060] vcc3v3_sys: supplied by vcc5v0_sys
[    0.403275] iommu: Default domain type: Translated
[




With miniloader and both variants of u-boot, if you attempt a reboot
it never fires the "reboot: Restarting system" message.
If you trigger a sysrq reboot at this stage, it will reboot, but fails
to start up the two a72 cores and subsequently hangs a second later
when it loads the first dma driver.
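
Whether the two A72 cores came up on a given boot can be read off the kernel's SMP summary line, as seen in the log above ("smp: Brought up 1 node, 6 CPUs"). A minimal Python sketch for checking a captured dmesg follows; the helper names are hypothetical, and the default of 6 expected CPUs is taken from the healthy boot log in this thread.

```python
import re

# Matches the kernel's SMP summary, e.g. "smp: Brought up 1 node, 6 CPUs".
SMP_SUMMARY = re.compile(r"smp: Brought up (\d+) nodes?, (\d+) CPUs?")


def cpus_brought_up(dmesg_lines):
    """Return the CPU count from the SMP summary line, or None if absent."""
    for line in dmesg_lines:
        match = SMP_SUMMARY.search(line)
        if match:
            return int(match.group(2))
    return None


def missing_cpus(dmesg_lines, expected=6):
    """Return how many CPUs failed to start, or None if no summary was seen."""
    found = cpus_brought_up(dmesg_lines)
    return None if found is None else expected - found
```

A non-zero result from missing_cpus() on a warm reboot would match the failure described here; None means the boot hung before the summary was printed.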

With TPL/SPL on mainline-u-boot (I can't get rockchip-u-boot to work
with TPL/SPL), it fires the "reboot: Restarting system" message, but
never reboots.
sysrq does not function at this point.

I believe the pcie controller is not being halted, and gets stuck in a
loop with the two a72 cores.

_______________________________________________
Linux-rockchip mailing list
Linux-rockchip@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-rockchip




