Re: [Bug 215925] New: PCIe regression on Raspberry Pi Compute Module 4 (CM4) breaks booting

Hello,
I have just sent a pull request to linux-pci@vger@xxxxxxxxxx to address
this regression.
Please let me know if I have to do anything else besides addressing
reviewers' concerns for this pull request.
Thanks,
Jim Quinlan
Broadcom STB



On Mon, May 16, 2022 at 5:05 PM Jim Quinlan <jim2101024@xxxxxxxxx> wrote:
>
> Hi Bjorn, Thorsten,
>
> I apologize -- I did not see this email until now; I think I have to
> work on my gmail filters and labels.
>
> I've just made a post on the Bugzilla website regarding this
> regression and have ideas on what may be causing the problem.
> Unfortunately, the error cannot be reproduced on my RPi4 or Broadcom
> STB version of the 2711.
> Hopefully Cyril can help me identify the issue.
>
> I will try to get a fixup out ASAP.
>
> Regards,
> Jim Quinlan
> Broadcom STB
>
> On Mon, May 9, 2022 at 3:44 AM Thorsten Leemhuis
> <regressions@xxxxxxxxxxxxx> wrote:
> >
> > Hi, this is your Linux kernel regression tracker. Partly top-posting to
> > make this easily accessible.
> >
> > Jim, what's up here? The regression was reported more than a week ago
> > and it seems nothing happened since then. Or was there progress and I
> > just missed it?
> >
> > Anyway:
> >
> > [TLDR: I'm adding this regression report to the list of tracked
> > regressions; all text from me you find below is based on a few template
> > paragraphs you might have encountered already in similar form.]
> >
> > On 02.05.22 20:38, Bjorn Helgaas wrote:
> > > On Sat, Apr 30, 2022 at 2:53 PM <bugzilla-daemon@xxxxxxxxxx> wrote:
> > >>
> > >> https://bugzilla.kernel.org/show_bug.cgi?id=215925
> > >>
> > >>             Bug ID: 215925
> > >>            Summary: PCIe regression on Raspberry Pi Compute Module 4 (CM4)
> > >>                     breaks booting
> > >>            Product: Drivers
> > >>            Version: 2.5
> > >>     Kernel Version: v5.17-rc1
> > >>           Hardware: ARM
> > >>                 OS: Linux
> > >>               Tree: Mainline
> > >>             Status: NEW
> > >>           Severity: normal
> > >>           Priority: P1
> > >>          Component: PCI
> > >>           Assignee: drivers_pci@xxxxxxxxxxxxxxxxxxxx
> > >>           Reporter: kibi@xxxxxxxxxx
> > >>         Regression: No
> > >>
> > >> Catching up with latest kernel releases in Debian, it turned out that my
> > >> Raspberry Pi Compute Module 4, mounted on an official Compute Module 4 IO
> > >> Board, and booting from an SD card, no longer boots: this means a black
> > >> screen on the HDMI output, and no output on the serial console.
> > >>
> > >> Trying various releases, I confirmed that v5.16 was fine, and v5.17-rc1 was the
> > >> first (pre)release that wasn't.
> > >>
> > >> After some git bisect, it turns out the cause seems to be the following commit
> > >> (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21):
> > >>
> > >> ```
> > >> commit 830aa6f29f07a4e2f1a947dfa72b3ccddb46dd21
> > >> Author: Jim Quinlan <jim2101024@xxxxxxxxx>
> > >> Date:   Thu Jan 6 11:03:27 2022 -0500
> > >>
> > >>     PCI: brcmstb: Split brcm_pcie_setup() into two funcs
> > >> ```
> > >>
> > >> Starting with this commit, the kernel panics early (before 0.30 seconds), with
> > >> an `Asynchronous SError Interrupt`. The backtrace references various
> > >> `brcm_pcie_*` functions; I can share a picture or try and transcribe it
> > >> manually if that helps (nothing on the serial console…).
> > >>
> > >> This commit is part of a branch that was ultimately merged as
> > >> d0a231f01e5b25bacd23e6edc7c979a18a517b2b; starting with this commit, there's
> > >> not even a backtrace anymore, the screen stays black after the usual “boot-up
> > >> rainbow”, and there's still nothing on the serial console.
> > >>
> > >> I confirmed that 88db8458086b1dcf20b56682504bdb34d2bca0e2 (on the master side)
> > >> was still booting properly, and that 87c71931633bd15e9cfd51d4a4d9cd685e8cdb55
> > >> (from the branch being merged into master) is the last commit showing the
> > >> panic.
> > >>
> > >> Since d0a231f01e5b25bacd23e6edc7c979a18a517b2b is a merge commit that includes
> > >> conflict resolutions in drivers/pci/controller/pcie-brcmstb.c, I suppose this
> > >> could be consistent with the initial panic being “upgraded” into an even more
> > >> serious issue.
> > >>
> > >> I've also verified that latest master (v5.18-rc4-396-g57ae8a492116) is still
> > >> affected by this issue.
> > >>
> > >> The regular Raspberry Pi 4 B doesn't seem to be affected by this issue: the
> > >> exact same image on the same SD card (with latest master) boots fine on it.
> >
> > CCing the regression mailing list, as it should be in the loop for all
> > regressions, as explained here:
> > https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
> >
> > To be sure the issue below doesn't fall through the cracks unnoticed, I'm
> > adding it to regzbot, my Linux kernel regression tracking bot:
> >
> > #regzbot ^introduced 830aa6f29f07a4e2f1a
> > #regzbot title pci: brcmstb: CM4 no longer boots from SD card
> > #regzbot ignore-activity
> > #regzbot from: Cyril Brulebois <kibi@xxxxxxxxxx>
> > #regzbot link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
> >
> > This isn't a regression? This issue or a fix for it are already
> > discussed somewhere else? It was fixed already? You want to clarify when
> > the regression started to happen? Or point out I got the title or
> > something else totally wrong? Then just reply -- ideally with also
> > telling regzbot about it, as explained here:
> > https://linux-regtracking.leemhuis.info/tracked-regression/
> >
> > Reminder for developers: When fixing the issue, add 'Link:' tags
> > pointing to the report (the mail this one replied to), as the kernel's
> > documentation calls for; the above page explains why this is important
> > for tracked regressions.
> >
> > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> >
> > P.S.: As the Linux kernel's regression tracker I deal with a lot of
> > reports and sometimes miss something important when writing mails like
> > this. If that's the case here, don't hesitate to tell me in a public
> > reply, it's in everyone's interest to set the public record straight.



