Re: Regression on linux-next (next-20241106)

Hi Chaitanya,

On Mon, Nov 11, 2024 at 6:41 AM Borah, Chaitanya Kumar
<chaitanya.kumar.borah@xxxxxxxxx> wrote:
>
> Hello Rafael,
>
> Hope you are doing well. I am Chaitanya from the linux graphics team in Intel.
>
> This mail is regarding a regression we are seeing in our CI runs[1] on linux-next repository.
>
> Since the version next-20241106 [2], we are seeing the following regression
>
> `````````````````````````````````````````````````````````````````````````````````
> <4>[    7.246473] WARNING: possible circular locking dependency detected
> <4>[    7.246476] 6.12.0-rc6-next-20241106-next-20241106-g5b913f5d7d7f+ #1 Not tainted
> <4>[    7.246479] ------------------------------------------------------
> <4>[    7.246481] swapper/0/1 is trying to acquire lock:
> <4>[    7.246483] ffffffff8264aef0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_enable+0xd/0x20
> <4>[    7.246493]
>                   but task is already holding lock:
> <4>[    7.246495] ffffffff82832068 (hybrid_capacity_lock){+.+.}-{4:4}, at: intel_pstate_register_driver+0xd3/0x1c0
> `````````````````````````````````````````````````````````````````````````````````
> Details log can be found in [3].

Thanks for the report!

> After bisecting the tree, the following patch [4] seems to be the first "bad"
> commit
>
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
> commit 92447aa5f6e7fbad9427a3fd1bb9e0679c403206
> Author: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> Date:   Mon Nov 4 19:53:53 2024 +0100
>
>     cpufreq: intel_pstate: Update asym capacity for CPUs that were offline initially
> `````````````````````````````````````````````````````````````````````````````````````````````````````````
>
> We also verified that if we revert the patch the issue is not seen.
>
> Could you please check why the patch causes this regression and provide a fix if necessary?
>
> [1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241106
> [3] https://intel-gfx-ci.01.org/tree/linux-next/next-20241106/bat-arls-1/boot0.txt
> [4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20241106&id=92447aa5f6e7fbad9427a3fd1bb9e0679c403206

The problem is that cpus_read_lock() should not be called under
hybrid_capacity_lock, because the latter is acquired in CPU
online/offline paths.  The above commit exposes this but, if I'm not
mistaken, the issue is there regardless of it.
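
To illustrate the inversion, here is a minimal user-space sketch (Python,
not kernel code) of the kind of lock-order tracking that lockdep performs.
The two paths are a simplified reading of the splat: one path holds
cpu_hotplug_lock and then takes hybrid_capacity_lock (CPU online/offline),
the other holds hybrid_capacity_lock and then takes cpu_hotplug_lock (driver
registration reaching static_key_enable()).  LockOrderTracker and simulate()
are hypothetical names made up for this example.

```python
class LockOrderTracker:
    """Records 'held -> acquired' edges and flags ordering cycles.

    Hypothetical helper for illustration only; real lockdep is far more
    elaborate (lock classes, read/write states, IRQ contexts).
    """

    def __init__(self):
        self.edges = {}  # lock name -> set of locks taken while holding it

    def _reachable(self, src, dst):
        # Depth-first search over the ordering edges recorded so far.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, ()))
        return False

    def acquire(self, held, new):
        """Record taking `new` while holding `held`.

        Returns True if the new edge closes a cycle, i.e. some earlier
        path already established the order `new` before `held`.
        """
        circular = self._reachable(new, held)
        self.edges.setdefault(held, set()).add(new)
        return circular


def simulate():
    tracker = LockOrderTracker()
    # CPU online/offline path (assumed ordering): the hotplug lock is held
    # and hybrid_capacity_lock is taken under it.
    hotplug_path = tracker.acquire("cpu_hotplug_lock", "hybrid_capacity_lock")
    # Driver registration path from the splat: hybrid_capacity_lock is held
    # and static_key_enable() then takes cpu_hotplug_lock.
    register_path = tracker.acquire("hybrid_capacity_lock", "cpu_hotplug_lock")
    return hotplug_path, register_path


if __name__ == "__main__":
    print(simulate())
```

The second acquisition is the one that gets flagged, which matches the
report: neither path is wrong on its own, only the combination of the two
orderings is.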

The good news is that it should be addressed by a patch that has
already been posted:

https://lore.kernel.org/linux-pm/12554508.O9o76ZdvQC@xxxxxxxxxxxxx/

so please let me know if it makes the splat go away.

Note that although the changelog of that patch says that it has no
functional impact, this is not really the case.

Thanks!



