Re: Raspberry Pi 3 Model B+ hangs in vc4_hdmi_runtime_resume()

Hi Maxime,

On 27.09.22 at 15:15, Maxime Ripard wrote:
On Tue, Sep 27, 2022 at 02:25:12PM +0200, Maxime Ripard wrote:
On Tue, Sep 27, 2022 at 01:42:40PM +0200, Maxime Ripard wrote:
On Tue, Sep 27, 2022 at 01:12:35PM +0200, Stefan Wahren wrote:
On 27.09.22 at 11:42, Maxime Ripard wrote:
On Tue, Sep 27, 2022 at 09:25:54AM +0200, Maxime Ripard wrote:
Hi Stefan,

On Mon, Sep 26, 2022 at 08:50:12PM +0200, Stefan Wahren wrote:
On 26.09.22 at 14:47, Maxime Ripard wrote:
On Mon, Sep 26, 2022 at 02:40:48PM +0200, Marc Kleine-Budde wrote:
On 26.09.2022 14:08:04, Stefan Wahren wrote:
Hi Marc,

On 26.09.22 at 12:21, Marc Kleine-Budde wrote:
On 22.09.2022 17:06:00, Maxime Ripard wrote:
I'm on a Raspberry Pi 3 Model B+ running current Debian testing ARM64,
using Debian's v5.19 kernel (Debian's v5.18 was working flawlessly).

| [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd034]
| [    0.000000] Linux version 5.19.0-1-arm64 (debian-kernel@xxxxxxxxxxxxxxxx) (gcc-11 (Debian 11.3.0-5) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38.90.20220713) #1 SMP Debian 5.19.6-1 (2022-09-01)
| [    0.000000] Machine model: Raspberry Pi 3 Model B+
| [    3.747500] raspberrypi-firmware soc:firmware: Attached to firmware from 2022-03-24T13:21:11

As soon as the vc4 module is loaded, the following warning hits 4
times, then the machine stops.
[...]

The warning itself is fixed, both upstream and in stable (5.19.7).
Ok. Debian is using 5.19.6

It shouldn't have any relation to the hang though. Can you share your
setup?
- config.txt:

-------->8-------->8-------->8-------->8--------
gpu_mem=16
disable_splash=1

arm_64bit=1
enable_uart=1
uart_2ndstage=1

os_prefix=/u-boot/

[pi3]
force_turbo=1
-------->8-------->8-------->8-------->8--------

- Raspberry Pi 3 Model B+
- no HDMI connected
Does that mean the issue only occurs without HDMI connected?
If you haven't tested with HDMI yet, could you please do so?
The error occurs with HDMI not connected. As vc4 is the gfx driver, I
thought this might be of interest. :)

I don't have an HDMI monitor here, but I'll come back to you as soon as I
get access to one (might take some time).
It's not the first time an issue like this one would occur. I'm trying
to make my Pi3 boot again, and will try to bisect the issue.
Yes, the issue is only triggered without HDMI connected. I was able to
reproduce it with an older vc4 firmware from 2020 (don't want to upgrade yet).
The kernel was also an arm64 build with defconfig.

Here is a rough starting point for bisection:

5.18.0 good
5.19.0 bad
5.19.6 bad
Sorry it took a bit of time, it looks like I found another bug while
trying to test this yesterday.

Your datapoints are interesting though. I have a custom configuration
and it does boot 5.19 without HDMI connected.

So I guess it leaves us with either the firmware version being different
(I'm using a newer version, from March 2022), or the configuration. I'll
test with defconfig.
So it turns out compiling vc4 as a module is the culprit.
Do you mean regardless of the kernel version in your case?
No, I mean that, with vc4 as a module, 5.18 works but 5.19 doesn't, like
Marc said. But if vc4 is built in, both work.

In my test cases I always build vc4 as a module.

It's not clear to me why at this point, but the first register write in
vc4_hdmi_reset stalls.
Sounds like a timing issue or a missing dependency (clock or power domain).
It felt like a clock or power domain issue to me indeed, but adding
clk_ignore_unused and pd_ignore_unused isn't enough, so it's probably
something a bit more complicated than just the clock / PD being
disabled.
I found the offending patch:
https://lore.kernel.org/dri-devel/20220225143534.405820-13-maxime@xxxxxxxxxx/

That code was removed because it was made irrelevant by that earlier patch:
https://lore.kernel.org/dri-devel/20220225143534.405820-10-maxime@xxxxxxxxxx/

But it turns out that while it works when the driver is built-in, it
doesn't when it's a module. If we add a clk_hw_get_rate() call right
after that call to raspberrypi_fw_set_rate(), the rate returned is 0.

I'm not entirely sure why, but I wonder if it's related to:
https://github.com/raspberrypi/linux/issues/4962#issuecomment-1228593439
Turns out it's not, since the Pi3 is using the clk-bcm2835 driver.

FWIW I can confirm that I see the same behavior:

fd5894fa2413cca3e6a3ea713b2bd57281af2e86 bad

5b6ef06ea6225570bc0b33325306c7b8c6bdf5eb good


However, even reverting that patch fails. clk_set_min_rate fails because
the rate is protected, but it doesn't look like it is anywhere for that
clock, so I'm a bit confused.

Even if we do remove the clock protection check in
clk_core_set_rate_nolock(), clk_calc_new_rates() will then fail because
the bcm2835 driver will round the clock rate below the minimum, which is
rejected.
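
To make the two failure paths above easier to follow, here is a toy
model in Python. It is only a sketch and not the kernel code: the
ToyClock class, its methods, the check_protection switch and all the
numbers are made up for illustration. It assumes the order of checks
described above (protection check first, then rounding, then the
min/max range check) and a driver that rounds down.

```python
import errno


class ToyClock:
    """Toy stand-in for a clock. Hypothetical, not struct clk_core."""

    def __init__(self, rate, step, protect_count=0):
        self.rate = rate                    # current rate in Hz
        self.step = step                    # granularity the toy driver can produce
        self.protect_count = protect_count  # > 0 means "rate is protected"
        self.min_rate = 0
        self.max_rate = 2**32 - 1

    def round_rate(self, requested):
        # Assumed driver behaviour: round down to the nearest achievable rate.
        return (requested // self.step) * self.step

    def set_rate(self, requested, check_protection=True):
        if requested == self.rate:
            return 0                        # nothing to do
        if check_protection and self.protect_count > 0:
            return -errno.EBUSY             # direct rate set on a protected clock
        rounded = self.round_rate(requested)
        if rounded < self.min_rate or rounded > self.max_rate:
            return -errno.EINVAL            # rounded rate falls outside the range
        self.rate = rounded
        return 0

    def set_min_rate(self, min_rate, check_protection=True):
        old_min = self.min_rate
        self.min_rate = min_rate
        # Re-request the current rate, clamped into the new range.
        target = min(max(self.rate, self.min_rate), self.max_rate)
        ret = self.set_rate(target, check_protection)
        if ret:
            self.min_rate = old_min         # roll back on failure
        return ret


# Made-up numbers: a clock reporting 0 Hz, a 120 MHz minimum, 100 MHz steps.
protected = ToyClock(rate=0, step=100_000_000, protect_count=1)
ret_protected = protected.set_min_rate(120_000_000)

coarse = ToyClock(rate=0, step=100_000_000, protect_count=1)
ret_rounded_below_min = coarse.set_min_rate(120_000_000, check_protection=False)

fine = ToyClock(rate=0, step=10_000_000)
ret_ok = fine.set_min_rate(120_000_000)
```

In this model the first call fails with -EBUSY because of the
protection count, the second fails with -EINVAL because 120 MHz is
rounded down to 100 MHz, which is below the new minimum, and the third
succeeds because the finer step can hit 120 MHz exactly. That matches
the second item in the list below: the driver would have to compute a
rate that stays inside the requested range.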

I'm not entirely sure what to do at this point. I guess the proper fix
would be to:
   - Figure out why it's considered protected when it's not (or shouldn't be)
   - Make the driver compute an acceptable rate for that clock
   - Reintroduce the clk_set_min_rate call to HDMI's runtime_resume, or
     some other equivalent code

Maxime


