Re: [BUG] printk/nbcon can use RCU illegally prior to CPU online

On Thu, Jul 18, 2024 at 03:24:47PM GMT, Steven Rostedt wrote:
> On Thu, 18 Jul 2024 13:13:02 -0500
> Andrew Halaney <ahalaney@xxxxxxxxxx> wrote:
> 
> 
> > I'll be honest, I only considered this from a printk() point of view so
> > far and pretty much ignored the original splat (as someone else is
> > looking at that). Here's the original splat from a centos kernel, I have
> > yet to look into it other than "we took an exception prior to RCU coming
> > online for this CPU" and me naively thinking that printk() should work
> > everywhere. I'll post the splat here for now before considering it more,
> > but it sounds like my expectation is wrong and that if something goes
> > awry and needs to printk prior to the CPU coming online, then this RCU
> > warning is sort of expected?:
> 
> Note, printk() is most definitely not safe in all contexts. It can
> cause issues if done in NMI, or even called while holding a scheduler
> rq lock. And now we are trying to make sure all tracing/printk is done
> while RCU is watching.
> 
> > 
> >     [ 13.643007] smp: Bringing up secondary CPUs ...
> >     [ 13.644422] smpboot: x86: Booting SMP configuration:
> >     [ 13.644431] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34
> >     [ 13.680691] Callback from call_rcu_tasks_rude() invoked.
> >     [ 13.684308] #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55
> >     [ 13.720218] .... node #1, CPUs: #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70 #71 #72 #73 #74 #75 #76 #77 #78 #79 #80 #81 #82 #83 #84 #85 #86 #87 #88 #89 #90 #91 #92 #93 #94 #95 #96 #97 #98 #99
> >     [ 13.783247] Callback from call_rcu_tasks() invoked.
> >     [ 13.785941] #100 #101 #102 #103 #104 #105 #106 #107 #108 #109 #110 #111
> >     [ 5.061942] x2apic lock mismatch between BSP and AP.
> 
> >     [ 5.061942] =============================
> >     [ 5.061942] WARNING: suspicious RCU usage
> >     [ 5.061942] 5.14.0-467.el9.x86_64+debug #1 Not tainted
> >     [ 5.061942] -----------------------------
> >     [ 5.061942] kernel/printk/nbcon.c:1118 suspicious rcu_dereference_check() usage!
> 
> You said this was a centos kernel. Well that explains why the above
> says 5.14 and has a reference to kernel/printk/nbcon.c, which doesn't
> exist in 5.14.
> 
> Looks like a bad backport.
> 

Heh, exactly why I made a "reproducer" for the printk/RCU side of the
splat with linux-rt-devel before reporting.

So my takeaway is that printk() is not considered safe to use in all
contexts, including during SMP bringup (because RCU is in play and is not
yet ready to be used on those CPUs). So if a printk() happens for reasons
we can't control (say some firmware problem causing an exception), you
could get the RCU warning about printk/nbcon usage in addition to the
actual error, and that's just something to accept.

Ideally though we don't printk() in those situations... so if it's badly
behaving code calling printk(), that's a different story.

Let me know if you think that recap is bogus.

Thanks,
Andrew




