On Mon, May 09, 2022 at 09:44:29AM +0200, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker. Partly top-posting to
> make this easily accessible.
>
> Jim, what's up here? The regression was reported more than a week ago
> and it seems nothing happened since then. Or was there progress and I
> just missed it?
>
> Anyway:
>
> [TLDR: I'm adding this regression report to the list of tracked
> regressions; all text from me you find below is based on a few template
> paragraphs you might have encountered already in similar form.]
>
> On 02.05.22 20:38, Bjorn Helgaas wrote:
> > On Sat, Apr 30, 2022 at 2:53 PM <bugzilla-daemon@xxxxxxxxxx> wrote:
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=215925
> >>
> >> Bug ID: 215925
> >> Summary: PCIe regression on Raspberry Pi Compute Module 4 (CM4) breaks booting
> >> Product: Drivers
> >> Version: 2.5
> >> Kernel Version: v5.17-rc1
> >> Hardware: ARM
> >> OS: Linux
> >> Tree: Mainline
> >> Status: NEW
> >> Severity: normal
> >> Priority: P1
> >> Component: PCI
> >> Assignee: drivers_pci@xxxxxxxxxxxxxxxxxxxx
> >> Reporter: kibi@xxxxxxxxxx
> >> Regression: No
> >>
> >> Catching up with latest kernel releases in Debian, it turned out that my
> >> Raspberry Pi Compute Module 4, mounted on an official Compute Module 4 IO Board,
> >> and booting from an SD card, no longer boots: this means a black screen on the
> >> HDMI output, and no output on the serial console.
> >>
> >> Trying various releases, I confirmed that v5.16 was fine, and v5.17-rc1 was the
> >> first (pre)release that wasn't.
> >>
> >> After some git bisect, it turns out the cause seems to be the following commit
> >> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21):
> >>
> >> ```
> >> commit 830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21
> >> Author: Jim Quinlan <jim2101024@xxxxxxxxx>
> >> Date: Thu Jan 6 11:03:27 2022 -0500
> >>
> >> PCI: brcmstb: Split brcm_pcie_setup() into two funcs
> >> ```
> >>
> >> Starting with this commit, the kernel panics early (before 0.30 seconds), with
> >> an `Asynchronous SError Interrupt`. The backtrace references various
> >> `brcm_pcie_*` functions; I can share a picture or try and transcribe it
> >> manually if that helps (nothing on the serial console…).
> >>
> >> This commit is part of a branch that was ultimately merged as
> >> d0a231f01e5b25bacd23e6edc7c979a18a517b2b; starting with this commit, there's
> >> not even a backtrace anymore, the screen stays black after the usual “boot-up
> >> rainbow”, and there's still nothing on the serial console.
> >>
> >> I confirmed that 88db8458086b1dcf20b56682504bdb34d2bca0e2 (on the master side)
> >> was still booting properly, and that 87c71931633bd15e9cfd51d4a4d9cd685e8cdb55
> >> (from the branch being merged into master) is the last commit showing the
> >> panic.
> >>
> >> Since d0a231f01e5b25bacd23e6edc7c979a18a517b2b is a merge commit that includes
> >> conflict resolutions in drivers/pci/controller/pcie-brcmstb.c, I suppose this
> >> could be consistent with the initial panic being “upgraded” into an even more
> >> serious issue.
> >>
> >> I've also verified that latest master (v5.18-rc4-396-g57ae8a492116) is still
> >> affected by this issue.
> >>
> >> The regular Raspberry Pi 4 B doesn't seem to be affected by this issue: the
> >> exact same image on the same SD card (with latest master) boots fine on it.

Cyril, 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two
funcs") reverts cleanly as of 57ae8a492116.
Does reverting it avoid the regression?