Re: [BUG] rk3399 fails to reboot correctly with PCIE device inserted

On Mon, Nov 25, 2019 at 7:05 PM Peter Geis <pgwipeout@xxxxxxxxx> wrote:
>
> On Mon, Nov 25, 2019 at 12:10 PM Peter Geis <pgwipeout@xxxxxxxxx> wrote:
> >
> > On Mon, Nov 25, 2019 at 11:52 AM Robin Murphy <robin.murphy@xxxxxxx> wrote:
> > >
> > > Hi Peter,
> > >
> > > On 25/11/2019 4:28 pm, Peter Geis wrote:
> > > > Good Morning,
> > > >
> > > > Another issue I've come across while testing PCIE on the rockpro64.
> > > > When a PCIE device is inserted into the board, and it enumerates
> > > > successfully, the board will not reset.
> > > > I've tried various states of u-boot-rockchip, u-boot-mainline, with
> > > > both miniloader and TPL/SPL.
> > >
> > > In case it's relevant, what particular PCIe device(s) have you seen the
> > > issue with? FWIW I've been running a Samsung 960 Evo NVMe in my
> > > NanoPC-T4 with mainline kernels for months now and it's always rebooted
> > > flawlessly.
> > >
> > > Robin.
> >
> > Currently with an I350 NIC, but also observed with a pcie switch, and the GTX645.
> > The NIC works, while the other two didn't without the patch to hijack
> > the error handler.
> >
> > I am running the latest atf built from their github.
>
> On closer examination, it isn't the pcie devices causing the reboot
> issues, the rk3399 just doesn't reboot.
> It would seem the trigger with miniloader was random enough that it
> appeared to be tied to my pcie testing.
> It happens 100% of the time with tpl/spl.

With further testing, I think I've found the trigger of the reboot failure.
It would seem that with ATF compiled from source, psci-reboot is not
actually triggering the reboot.
The reason my board stopped rebooting entirely is that I had
somehow broken the psci-watchdog.

I rebuilt all from source, stripping all modifications I had done and
using the defconfigs.
I get the following message at reboot time:
[ 2839.724508] watchdog: watchdog0: watchdog did not stop!
[ 2841.162516] reboot: Restarting system
U-Boot TPL 2020.01-rc3-00070-g9a0cbae22a-dirty (Dec 03 2019 - 14:07:57)
Previously, the watchdog alert did not appear and the reboot never occurred.

It would seem that the psci-reboot function is dead, and the only
reason the boards are actually rebooting is that the psci-watchdog
is triggering the reboot after the kernel fails to check in.
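
For anyone wanting to check a serial capture for the same pattern, here is
a minimal sketch in Python. It only looks for the three lines quoted above
(the watchdog warning, the "Restarting system" message, and a U-Boot
banner). The function name summarize_reboot and the result keys are made up
for illustration and are not part of any tool. It reports what the capture
contains; it cannot prove which mechanism actually reset the board.

```python
import re

# Matches a kernel log line such as "[ 2841.162516] reboot: Restarting system".
KLOG = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def summarize_reboot(lines):
    """Summarize what a console capture shows around a reboot attempt.

    Hypothetical helper: the name and result keys are illustrative only.
    """
    result = {
        "watchdog_left_running": False,  # saw "watchdog did not stop!"
        "restart_time": None,            # timestamp of "Restarting system"
        "bootloader_seen": False,        # saw a U-Boot banner afterwards
    }
    for raw in lines:
        line = raw.strip()
        match = KLOG.match(line)
        if match:
            timestamp, message = float(match.group(1)), match.group(2)
            if "watchdog did not stop" in message:
                result["watchdog_left_running"] = True
            elif message.startswith("reboot: Restarting system"):
                result["restart_time"] = timestamp
        elif result["restart_time"] is not None and line.startswith("U-Boot"):
            # A bootloader banner after the restart message means the board
            # did come back up, by whatever mechanism.
            result["bootloader_seen"] = True
    return result


capture = [
    "[ 2839.724508] watchdog: watchdog0: watchdog did not stop!",
    "[ 2841.162516] reboot: Restarting system",
    "U-Boot TPL 2020.01-rc3-00070-g9a0cbae22a-dirty (Dec 03 2019 - 14:07:57)",
]
print(summarize_reboot(capture))
```

A capture where the watchdog was left running and the bootloader banner
shows up is consistent with the watchdog doing the reset, but the time
between the restart message and the banner would have to be measured on the
serial side to say more.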

Now I am still having the issue with boot hanging after a warm reboot
when certain pci-e devices are installed (particularly, the i350
network controller).
I think this may be due to the pci-e controller driver lacking proper
shutdown cleanup code, which is allowing the i350 to continue to
trigger either interrupts or dma transfers following the soft-reboot.

The hang occurs at roughly the same point, when either the iommu or the
first dma device is initialized.
Occasionally the A72 cluster fails to initialize as well.

Log is below:
[    0.261198] Detected PIPT I-cache on CPU5
[    0.261223] GICv3: CPU5: found redistributor 101 region 0:0x00000000fefa0000
[    0.261235] GICv3: CPU5: using allocated LPI pending table @0x00000000f0120000
[    0.261263] CPU5: Booted secondary processor 0x0000000101 [0x410fd082]
[    0.261377] smp: Brought up 1 node, 6 CPUs
[    0.274833] SMP: Total of 6 processors activated.
[    0.275297] CPU features: detected: 32-bit EL0 Support
[    0.275801] CPU features: detected: CRC32 instructions
[    0.290797] CPU: All CPU(s) started at EL2
[    0.291242] alternatives: patching kernel code
[    0.294848] devtmpfs: initialized
[    0.311658] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[    0.312629] futex hash table entries: 2048 (order: 5, 131072 bytes, linear)
[    0.315223] pinctrl core: initialized pinctrl subsystem
[    0.318097] DMI not present or invalid.
[    0.318989] NET: Registered protocol family 16
[    0.326798] DMA: preallocated 256 KiB pool for atomic allocations
[    0.327415] audit: initializing netlink subsys (disabled)
[    0.328106] audit: type=2000 audit(0.320:1): state=initialized audit_enabled=0 res=1
[    0.330213] cpuidle: using governor menu
[    0.331160] hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
[    0.334653] Serial: AMBA PL011 UART driver
[    0.384125] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[    0.384800] HugeTLB registered 32.0 MiB page size, pre-allocated 0 pages
[    0.385483] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.386146] HugeTLB registered 64.0 KiB page size, pre-allocated 0 pages
[    0.390063] cryptd: max_cpu_qlen set to 1000
[    0.396205] ACPI: Interpreter disabled.
[    0.399113] vcc3v3_pcie: supplied by vcc12v_dcin
[    0.400706] vcc5v0_sys: supplied by vcc12v_dcin
[    0.401426] vcc5v0_usb: supplied by vcc12v_dcin
[    0.402060] vcc3v3_sys: supplied by vcc5v0_sys
[    0.403275] iommu: Default domain type: Translated
[

>
> >
> > >
> > > > With miniloader and both variants of u-boot, if you attempt a reboot
> > > > it never fires the "reboot: Restarting system" message.
> > > > If you trigger a sysrq reboot at this stage, it will reboot, but fails
> > > > to start up the two a72 cores and subsequently hangs a second later
> > > > when it loads the first dma driver.
> > > >
> > > > With TPL/SPL on mainline-u-boot (I can't get rockchip-u-boot to work
> > > > with TPL/SPL), it fires the "reboot: Restarting system" message, but
> > > > never reboots.
> > > > sysrq does not function at this point.
> > > >
> > > > I believe the pcie controller is not being halted, and gets stuck in a
> > > > loop with the two a72 cores.
> > > >
> > > > _______________________________________________
> > > > Linux-rockchip mailing list
> > > > Linux-rockchip@xxxxxxxxxxxxxxxxxxx
> > > > http://lists.infradead.org/mailman/listinfo/linux-rockchip
> > > >
