Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE
- To: Michel Lespinasse <michel@xxxxxxxxxxxxxx>
- Subject: Re: [PATCH 04/36] cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE
- From: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
- Date: Fri, 29 Jul 2022 16:59:50 +0200
- Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>, Peter Zijlstra <peterz@xxxxxxxxxxxxx>, Richard Henderson <rth@xxxxxxxxxxxxxxx>, Ivan Kokshaysky <ink@xxxxxxxxxxxxxxxxxxxx>, Matt Turner <mattst88@xxxxxxxxx>, vgupta@xxxxxxxxxx, Russell King - ARM Linux <linux@xxxxxxxxxxxxxxx>, ulli.kroll@xxxxxxxxxxxxxx, Linus Walleij <linus.walleij@xxxxxxxxxx>, Shawn Guo <shawnguo@xxxxxxxxxx>, Sascha Hauer <s.hauer@xxxxxxxxxxxxxx>, Sascha Hauer <kernel@xxxxxxxxxxxxxx>, Fabio Estevam <festevam@xxxxxxxxx>, dl-linux-imx <linux-imx@xxxxxxx>, Tony Lindgren <tony@xxxxxxxxxxx>, Kevin Hilman <khilman@xxxxxxxxxx>, Catalin Marinas <catalin.marinas@xxxxxxx>, Will Deacon <will@xxxxxxxxxx>, Guo Ren <guoren@xxxxxxxxxx>, bcain@xxxxxxxxxxx, Huacai Chen <chenhuacai@xxxxxxxxxx>, kernel@xxxxxxxxxx, Geert Uytterhoeven <geert@xxxxxxxxxxxxxx>, sammy@xxxxxxxxx, Michal Simek <monstr@xxxxxxxxx>, Thomas Bogendoerfer <tsbogend@xxxxxxxxxxxxxxxx>, dinguyen@xxxxxxxxxx, jonas@xxxxxxxxxxxx, stefan.kristiansson@xxxxxxxxxxxxx, Stafford Horne <shorne@xxxxxxxxx>, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>, Helge Deller <deller@xxxxxx>, Michael Ellerman <mpe@xxxxxxxxxxxxxx>, Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>, Paul Mackerras <paulus@xxxxxxxxx>, Paul Walmsley <paul.walmsley@xxxxxxxxxx>, Palmer Dabbelt <palmer@xxxxxxxxxxx>, Albert Ou <aou@xxxxxxxxxxxxxxxxx>, Heiko Carstens <hca@xxxxxxxxxxxxx>, Vasily Gorbik <gor@xxxxxxxxxxxxx>, Alexander Gordeev <agordeev@xxxxxxxxxxxxx>, Christian Borntraeger <borntraeger@xxxxxxxxxxxxx>, Sven Schnelle <svens@xxxxxxxxxxxxx>, Yoshinori Sato <ysato@xxxxxxxxxxxxx>, Rich Felker <dalias@xxxxxxxx>, David Miller <davem@xxxxxxxxxxxxx>, Richard Weinberger <richard@xxxxxx>, anton.ivanov@xxxxxxxxxxxxxxxxxx, Johannes Berg <johannes@xxxxxxxxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Borislav Petkov <bp@xxxxxxxxx>, Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>, "the arch/x86 maintainers" <x86@xxxxxxxxxx>, "H. 
Peter Anvin" <hpa@xxxxxxxxx>, acme@xxxxxxxxxx, Mark Rutland <mark.rutland@xxxxxxx>, Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>, jolsa@xxxxxxxxxx, namhyung@xxxxxxxxxx, Juergen Gross <jgross@xxxxxxxx>, srivatsa@xxxxxxxxxxxxx, amakhalov@xxxxxxxxxx, pv-drivers@xxxxxxxxxx, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, Chris Zankel <chris@xxxxxxxxxx>, Max Filippov <jcmvbkbc@xxxxxxxxx>, "Rafael J. Wysocki" <rafael@xxxxxxxxxx>, Len Brown <lenb@xxxxxxxxxx>, Pavel Machek <pavel@xxxxxx>, Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>, Michael Turquette <mturquette@xxxxxxxxxxxx>, Stephen Boyd <sboyd@xxxxxxxxxx>, Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>, Lorenzo Pieralisi <lpieralisi@xxxxxxxxxx>, Sudeep Holla <sudeep.holla@xxxxxxx>, Andy Gross <agross@xxxxxxxxxx>, Bjorn Andersson <bjorn.andersson@xxxxxxxxxx>, Anup Patel <anup@xxxxxxxxxxxxxx>, Thierry Reding <thierry.reding@xxxxxxxxx>, Jon Hunter <jonathanh@xxxxxxxxxx>, Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>, Arnd Bergmann <arnd@xxxxxxxx>, Yury Norov <yury.norov@xxxxxxxxx>, Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>, Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>, Steven Rostedt <rostedt@xxxxxxxxxxx>, Petr Mladek <pmladek@xxxxxxxx>, senozhatsky@xxxxxxxxxxxx, John Ogness <john.ogness@xxxxxxxxxxxxx>, Frederic Weisbecker <frederic@xxxxxxxxxx>, quic_neeraju@xxxxxxxxxxx, Josh Triplett <josh@xxxxxxxxxxxxxxxx>, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>, Lai Jiangshan <jiangshanlai@xxxxxxxxx>, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>, Juri Lelli <juri.lelli@xxxxxxxxxx>, Vincent Guittot <vincent.guittot@xxxxxxxxxx>, Dietmar Eggemann <dietmar.eggemann@xxxxxxx>, Benjamin Segall <bsegall@xxxxxxxxxx>, Mel Gorman <mgorman@xxxxxxx>, Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>, vschneid@xxxxxxxxxx, jpoimboe@xxxxxxxxxx, linux-alpha@xxxxxxxxxxxxxxx, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, linux-snps-arc@xxxxxxxxxxxxxxxxxxx, Linux ARM <linux-arm-kernel@xxxxxxxxxxxxxxxxxxx>, Linux OMAP Mailing 
List <linux-omap@xxxxxxxxxxxxxxx>, linux-csky@xxxxxxxxxxxxxxx, linux-hexagon@xxxxxxxxxxxxxxx, linux-ia64@xxxxxxxxxxxxxxx, linux-m68k <linux-m68k@xxxxxxxxxxxxxxx>, "open list:BROADCOM NVRAM DRIVER" <linux-mips@xxxxxxxxxxxxxxx>, openrisc@xxxxxxxxxxxxxxxxxxxx, Parisc List <linux-parisc@xxxxxxxxxxxxxxx>, linuxppc-dev <linuxppc-dev@xxxxxxxxxxxxxxxx>, linux-riscv <linux-riscv@xxxxxxxxxxxxxxxxxxx>, linux-s390@xxxxxxxxxxxxxxx, Linux-sh list <linux-sh@xxxxxxxxxxxxxxx>, sparclinux@xxxxxxxxxxxxxxx, linux-um@xxxxxxxxxxxxxxxxxxx, linux-perf-users@xxxxxxxxxxxxxxx, virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-xtensa@xxxxxxxxxxxxxxxx, ACPI Devel Maling List <linux-acpi@xxxxxxxxxxxxxxx>, Linux PM <linux-pm@xxxxxxxxxxxxxxx>, linux-clk <linux-clk@xxxxxxxxxxxxxxx>, linux-arm-msm <linux-arm-msm@xxxxxxxxxxxxxxx>, linux-tegra <linux-tegra@xxxxxxxxxxxxxxx>, linux-arch <linux-arch@xxxxxxxxxxxxxxx>, rcu@xxxxxxxxxxxxxxx, rh0@xxxxxx
- In-reply-to: <20220729102458.GA1695@lespinasse.org>
- References: <20220608142723.103523089@infradead.org> <20220608144516.172460444@infradead.org> <20220725194306.GA14746@lespinasse.org> <20220728172053.GA3607379@paulmck-ThinkPad-P17-Gen-1> <20220729102458.GA1695@lespinasse.org>
On Fri, Jul 29, 2022 at 12:25 PM Michel Lespinasse
<michel@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Jul 28, 2022 at 10:20:53AM -0700, Paul E. McKenney wrote:
> > On Mon, Jul 25, 2022 at 12:43:06PM -0700, Michel Lespinasse wrote:
> > > On Wed, Jun 08, 2022 at 04:27:27PM +0200, Peter Zijlstra wrote:
> > > > Commit c227233ad64c ("intel_idle: enable interrupts before C1 on
> > > > Xeons") wrecked intel_idle in two ways:
> > > >
> > > > - must not have tracing in idle functions
> > > > - must return with IRQs disabled
> > > >
> > > > Additionally, it added a branch for no good reason.
> > > >
> > > > Fixes: c227233ad64c ("intel_idle: enable interrupts before C1 on Xeons")
> > > > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > >
> > > After this change was introduced, I am seeing "WARNING: suspicious RCU
> > > usage" when booting a kernel with debug options compiled in. Please
> > > see the attached dmesg output. The issue starts with commit 32d4fd5751ea
> > > and is still present in v5.19-rc8.
> > >
> > > I'm not sure: is this too late to fix or revert in v5.19 final?
> >
> > I finally got a chance to take a quick look at this.
> >
> > The rcu_eqs_exit() function is making a lockdep complaint about
> > being invoked with interrupts enabled. This function is called from
> > rcu_idle_exit(), which is an expected code path from cpuidle_enter_state()
> > via its call to rcu_idle_exit(). Except that rcu_idle_exit() disables
> > interrupts before invoking rcu_eqs_exit().
> >
> > The only other call to rcu_idle_exit() does not disable interrupts,
> > but it is via rcu_user_exit(), which would be a very odd choice for
> > cpuidle_enter_state().
> >
> > It seems unlikely, but it might be that it is the use of local_irq_save()
> > instead of raw_local_irq_save() within rcu_idle_exit() that is causing
> > the trouble. If this is the case, then the commit shown below would
> > help. Note that this commit removes the warning from lockdep, so it
> > is necessary to build the kernel with CONFIG_RCU_EQS_DEBUG=y to enable
> > equivalent debugging.
> >
> > Could you please try your test with the -rcu commit shown below applied?
>
> Thanks for looking into it.
>
> After checking out Peter's commit 32d4fd5751ea,
> cherry picking your commit ed4ae5eff4b3,
> and setting CONFIG_RCU_EQS_DEBUG=y in addition to my usual debug config,
> I am now seeing this a few seconds into the boot:
>
> [ 3.010650] ------------[ cut here ]------------
> [ 3.010651] WARNING: CPU: 0 PID: 0 at kernel/sched/clock.c:397 sched_clock_tick+0x27/0x60
> [ 3.010657] Modules linked in:
> [ 3.010660] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc1-test-00005-g1be22fea0611 #1
> [ 3.010662] Hardware name: LENOVO 30BFS44D00/1036, BIOS S03KT51A 01/17/2022
> [ 3.010663] RIP: 0010:sched_clock_tick+0x27/0x60
> [ 3.010665] Code: 1f 40 00 53 eb 02 5b c3 66 90 8b 05 2f c3 40 01 85 c0 74 18 65 8b 05 60 88 8f 4e 85 c0 75 0d 65 8b 05 a9 85 8f 4e 85 c0 74 02 <0f> 0b e8 e2 6c 89 00 48 c7 c3 40 d5 02 00 89 c0 48 03 1c c5 c0 98
> [ 3.010667] RSP: 0000:ffffffffb2803e28 EFLAGS: 00010002
> [ 3.010670] RAX: 0000000000000001 RBX: ffffc8ce7fa07060 RCX: 0000000000000001
> [ 3.010671] RDX: 0000000000000000 RSI: ffffffffb268dd21 RDI: ffffffffb269ab13
> [ 3.010673] RBP: 0000000000000001 R08: ffffffffffc300d5 R09: 000000000002be80
> [ 3.010674] R10: 000003625b53183a R11: ffffa012b802b7a4 R12: ffffffffb2aa9e80
> [ 3.010675] R13: ffffffffb2aa9e00 R14: 0000000000000001 R15: 0000000000000000
> [ 3.010677] FS: 0000000000000000(0000) GS:ffffa012b8000000(0000) knlGS:0000000000000000
> [ 3.010678] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.010680] CR2: ffffa012f81ff000 CR3: 0000000c99612001 CR4: 00000000003706f0
> [ 3.010681] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 3.010682] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 3.010683] Call Trace:
> [ 3.010685] <TASK>
> [ 3.010688] cpuidle_enter_state+0xb7/0x4b0
> [ 3.010694] cpuidle_enter+0x29/0x40
> [ 3.010697] do_idle+0x1d4/0x210
> [ 3.010702] cpu_startup_entry+0x19/0x20
> [ 3.010704] rest_init+0x117/0x1a0
> [ 3.010708] arch_call_rest_init+0xa/0x10
> [ 3.010711] start_kernel+0x6d8/0x6ff
> [ 3.010716] secondary_startup_64_no_verify+0xce/0xdb
> [ 3.010728] </TASK>
> [ 3.010729] irq event stamp: 44179
> [ 3.010730] hardirqs last enabled at (44179): [<ffffffffb2000ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20
> [ 3.010734] hardirqs last disabled at (44177): [<ffffffffb22003f0>] __do_softirq+0x3f0/0x498
> [ 3.010736] softirqs last enabled at (44178): [<ffffffffb2200332>] __do_softirq+0x332/0x498
> [ 3.010738] softirqs last disabled at (44171): [<ffffffffb16c760b>] irq_exit_rcu+0xab/0xf0
> [ 3.010741] ---[ end trace 0000000000000000 ]---
Can you please give this patch a go?

https://patchwork.kernel.org/project/linux-pm/patch/Yt/AxPFi88neW7W5@xxxxxxxxxxxxxxxxxxxxxxxxxx/
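
For readers following the thread, the rule from the quoted commit message ("must return with IRQs disabled") can be sketched as a small userspace model. This is only an illustration under stated assumptions: every name starting with model_ or MODEL_ is hypothetical, nothing here is kernel code, and the check merely stands in for the kind of IRQs-disabled assertion that appears to have fired in sched_clock_tick() in the trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace model only: none of these names are kernel APIs. */
static bool model_irqs_enabled; /* stands in for the CPU interrupt flag */
static int model_warnings;      /* counts "called with IRQs enabled" complaints */

/* Hypothetical stand-in for CPUIDLE_FLAG_IRQ_ENABLE. */
#define MODEL_FLAG_IRQ_ENABLE 0x1u

/* Stand-in for an IRQs-disabled assertion made by code that runs after idle. */
static void model_assert_irqs_disabled(const char *where)
{
    if (model_irqs_enabled) {
        model_warnings++;
        fprintf(stderr, "WARNING: %s reached with IRQs enabled\n", where);
    }
}

/* Buggy idle callback: enables IRQs before idling and leaves them enabled. */
static void model_idle_buggy(unsigned int flags)
{
    if (flags & MODEL_FLAG_IRQ_ENABLE)
        model_irqs_enabled = true;
    /* ... the CPU would idle here ... */
}

/* Fixed idle callback: same entry path, but IRQs are off again on return. */
static void model_idle_fixed(unsigned int flags)
{
    if (flags & MODEL_FLAG_IRQ_ENABLE)
        model_irqs_enabled = true;
    /* ... the CPU would idle here ... */
    model_irqs_enabled = false;
}

/*
 * Model of the caller: enters the idle callback with IRQs off and expects
 * them to still be off afterwards. Returns the number of warnings raised
 * by this one call.
 */
static int model_enter_state(void (*idle)(unsigned int), unsigned int flags)
{
    int before = model_warnings;

    assert(idle != NULL);
    model_irqs_enabled = false;
    idle(flags);
    model_assert_irqs_disabled("post-idle bookkeeping");
    model_irqs_enabled = false; /* reset so successive calls are independent */
    return model_warnings - before;
}
```

In this model, model_enter_state(model_idle_buggy, MODEL_FLAG_IRQ_ENABLE) raises one warning, while the same call with model_idle_fixed raises none. The model does not cover the other half of the commit message (no tracing inside idle functions).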