Re: [Bug 215925] New: PCIe regression on Raspberry Pi Compute Module 4 (CM4) breaks booting

Hi Bjorn, Thorsten,

I apologize -- I did not see this email until now; I think I have to
work on my gmail filters and labels.

I've just made a post on the Bugzilla website regarding this
regression and have ideas on what may be causing the problem.
Unfortunately, I cannot reproduce the error on my RPi4 or on the
Broadcom STB version of the 2711.
Hopefully Cyril can help me identify the issue.

I will try to get a fixup out ASAP.

Regards,
Jim Quinlan
Broadcom STB

On Mon, May 9, 2022 at 3:44 AM Thorsten Leemhuis
<regressions@xxxxxxxxxxxxx> wrote:
>
> Hi, this is your Linux kernel regression tracker. Partly top-posting to
> make this easily accessible.
>
> Jim, what's up here? The regression was reported more than a week ago
> and it seems nothing happened since then. Or was there progress and I
> just missed it?
>
> Anyway:
>
> [TLDR: I'm adding this regression report to the list of tracked
> regressions; all text from me you find below is based on a few template
> paragraphs you might have encountered already in similar form.]
>
> On 02.05.22 20:38, Bjorn Helgaas wrote:
> > On Sat, Apr 30, 2022 at 2:53 PM <bugzilla-daemon@xxxxxxxxxx> wrote:
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=215925
> >>
> >>             Bug ID: 215925
> >>            Summary: PCIe regression on Raspberry Pi Compute Module 4 (CM4)
> >>                     breaks booting
> >>            Product: Drivers
> >>            Version: 2.5
> >>     Kernel Version: v5.17-rc1
> >>           Hardware: ARM
> >>                 OS: Linux
> >>               Tree: Mainline
> >>             Status: NEW
> >>           Severity: normal
> >>           Priority: P1
> >>          Component: PCI
> >>           Assignee: drivers_pci@xxxxxxxxxxxxxxxxxxxx
> >>           Reporter: kibi@xxxxxxxxxx
> >>         Regression: No
> >>
> >> Catching up with latest kernel releases in Debian, it turned out that my
> >> Raspberry Pi Compute Module 4, mounted on an official Compute Module 4 IO
> >> Board, and booting from an SD card, no longer boots: this means a black screen
> >> on the HDMI output, and no output on the serial console.
> >>
> >> Trying various releases, I confirmed that v5.16 was fine, and v5.17-rc1 was the
> >> first (pre)release that wasn't.
> >>
> >> After some git bisect, it turns out the cause seems to be the following commit
> >> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21):
> >>
> >> ```
> >> commit 830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21
> >> Author: Jim Quinlan <jim2101024@xxxxxxxxx>
> >> Date:   Thu Jan 6 11:03:27 2022 -0500
> >>
> >>     PCI: brcmstb: Split brcm_pcie_setup() into two funcs
> >> ```
> >>
> >> Starting with this commit, the kernel panics early (before 0.30 seconds), with
> >> an `Asynchronous SError Interrupt`. The backtrace references various
> >> `brcm_pcie_*` functions; I can share a picture or try and transcribe it
> >> manually if that helps (nothing on the serial console…).
> >>
> >> This commit is part of a branch that was ultimately merged as
> >> d0a231f01e5b25bacd23e6edc7c979a18a517b2b; starting with this commit, there's
> >> not even a backtrace anymore, the screen stays black after the usual “boot-up
> >> rainbow”, and there's still nothing on the serial console.
> >>
> >> I confirmed that 88db8458086b1dcf20b56682504bdb34d2bca0e2 (on the master side)
> >> was still booting properly, and that 87c71931633bd15e9cfd51d4a4d9cd685e8cdb55
> >> (from the branch being merged into master) is the last commit showing the
> >> panic.
> >>
> >> Since d0a231f01e5b25bacd23e6edc7c979a18a517b2b is a merge commit that includes
> >> conflict resolutions in drivers/pci/controller/pcie-brcmstb.c, I suppose this
> >> could be consistent with the initial panic being “upgraded” into an even more
> >> serious issue.
> >>
> >> I've also verified that latest master (v5.18-rc4-396-g57ae8a492116) is still
> >> affected by this issue.
> >>
> >> The regular Raspberry Pi 4 B doesn't seem to be affected by this issue: the
> >> exact same image on the same SD card (with latest master) boots fine on it.
>
> CCing the regression mailing list, as it should be in the loop for all
> regressions, as explained here:
> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
>
> To be sure the issue below doesn't fall through the cracks unnoticed, I'm
> adding it to regzbot, my Linux kernel regression tracking bot:
>
> #regzbot ^introduced 830aa6f29f07a4e2f1a
> #regzbot title pci: brcmstb: CM4 no longer boots from SD card
> #regzbot ignore-activity
> #regzbot from: Cyril Brulebois <kibi@xxxxxxxxxx>
> #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
>
> This isn't a regression? This issue or a fix for it are already
> discussed somewhere else? It was fixed already? You want to clarify when
> the regression started to happen? Or point out I got the title or
> something else totally wrong? Then just reply, ideally while also
> telling regzbot about it, as explained here:
> https://linux-regtracking.leemhuis.info/tracked-regression/
>
> Reminder for developers: When fixing the issue, add 'Link:' tags
> pointing to the report (the mail this one replied to), as the kernel's
> documentation calls for; the above page explains why this is important for
> tracked regressions.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>
> P.S.: As the Linux kernel's regression tracker I deal with a lot of
> reports and sometimes miss something important when writing mails like
> this. If that's the case here, don't hesitate to tell me in a public
> reply, it's in everyone's interest to set the public record straight.



