> On Jan 3, 2025, at 08:16, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Thu, Jan 02, 2025 at 10:59:27AM +0800, Kun Hu wrote:
>> Hello,
>>
>> When using our custom fuzzer tool to fuzz the latest Linux kernel, the following crash
>> was triggered.
>>
>> HEAD commit: dbfac60febfa806abb2d384cb6441e77335d2799
>> git tree: upstream
>> Console output: https://drive.google.com/file/d/1D3EDxDxPi0t7m_Z4Uc4FuL26DnHs7yTa/view?usp=sharing
>> Kernel config: https://drive.google.com/file/d/1m1mk_YusR-tyusNHFuRbzdj8KUzhkeHC/view?usp=sharing
>> C reproducer: /
>> Syzlang reproducer: /
>>
>> We observed a crash at line 1333 in note_gp_changes, likely caused by a race condition involving rcu_gp_kthread_wake and note_gp_changes. The issue appears to involve insufficient or incorrect synchronization, as indicated by the involvement of _raw_spin_unlock_irqrestore in spinlock.c. Specifically, this may lead to invalid accesses to rcu_state.gp_kthread or related flags (e.g., gp_flags), potentially resulting in unexpected behavior in swake_up_one_online.
>>
>> Could you please help check if this needs to be addressed?
>
> This is a new one on me.
>
> This is running in a guest OS. Might the underlying hypervisor be
> overloaded? That could result in vCPU preemption and thus in this sort
> of soft lockup.
>
> Also, when I check out the above commit (which is v6.13-rc4), I find that
> line 1333 is the close curly brace of note_gp_changes(). Of course, it is
> possible that the address-to-symbol translation failed (please check!),
> but in the absence of such failure, there is no way that I know of that
> incorrect synchronization could cause a soft lockup at that location.
>
> Other things besides vCPU preemption that could cause a soft lockup at
> that location include corrupted kernel text, corrupted kernel stack,
> and incessant interrupts.
>
> Other thoughts?
>
> Thanx, Paul
>

Sorry for the late reply. I double-checked that the address-to-symbol translation is not failing, and the vCPU resources aren't overloaded. Additionally, I ran multiple rounds of reproduction with Syzkaller and obtained two types of reproducers: a C program and a syscall sequence. I'm not sure whether there are any other issues; that's all I can offer for now. I'm also not sure whether this information is useful to you. If this isn't a real bug, please ignore it.

C reproducer: https://drive.google.com/file/d/1niejFamwXcRumUsn1Ur8xiX2jfZAcown/view?usp=sharing
Syscall sequence reproducer: https://drive.google.com/file/d/1gBfe_WZZeHfrhTlXp5zJfV7be21iGCAC/view?usp=sharing
New log info: https://drive.google.com/file/d/1x7eugPh2RUUF9lOf3s9K64pARkkUE1Qn/view?usp=sharing

----
Thanks,
Kun Hu