On Tue, Jan 3, 2023 at 4:46 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
>
> > On 31 Dec 2022, at 12:55 AM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Fri, Dec 30, 2022 at 1:54 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
> >>
> >>> On 28 Dec 2022, at 2:35 PM, Yonghong Song <yhs@xxxxxxxx> wrote:
> >>>
> >>> On 12/21/22 8:35 PM, Hao Sun wrote:
> >>>> Hi,
> >>>> This crash can be triggered by executing the C reproducer for
> >>>> multiple times, which just keep loading the following prog as
> >>>> raw tracepoint into kmem_cache_free().
> >>>> The prog send SIGSEGV to current via bpf_send_signal_thread(),
> >>>> after load this, whoever tries to free mem would trigger this,
> >>>> kernel crashed when this happens to init.
> >>>> Seems we should filter init out in bpf_send_signal_common() by
> >>>> is_global_init(current), or maybe we should check this in the
> >>>> verifier?
> >>>
> >>> The helper is just to send a particular signal to *current*
> >>> thread. In typical use case, it is never a good idea to send
> >>> the signal to a *random* thread. In certain cases, maybe user
> >>> indeed wants to send the signal to init thread to observe
> >>> something. Note that such destructive side effect already
> >>> exists in the bpf land. For example, for a xdp program,
> >>> it could drop all packets to make machine not responsive
> >>> to ssh etc. Therefore, I recommend to keep the existing
> >>> bpf_send_signal_common() helper behavior.
> >>
> >> Sound the two are different cases. Not responsive in XDP seems like
> >> an intended behaviour, panic caused by killing init is buggy. If the
> >> last thread of global init was killed, kernel panic immediately.
> >
> > I don't get it. How was it possible that this prog was
> > executed with current == pid 1 ?
>
> The prog is raw trace point and is attached to ‘kmem_cache_free’ event.
> When init triggered the event, the prog would be executed with pid 1.
> But, the reason of this crash is not very clear to me, because it’s
> really hard to debug with original C reproducer.
>
> The following is the corresponding Syz prog:
>
> # {Threaded:true Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:false NetInjection:true NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:true Sysctl:true UseTmpDir:true HandleSegv:true Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
> r0 = bpf$BPF_PROG_RAW_TRACEPOINT_LOAD(0x5, &(0x7f0000000000)={0x11, 0xe, &(0x7f0000000400)=ANY=[@ANYBLOB="18000000000000000000000000000000180600000000000000000000000000001807000000000000000000000000000018080000000000000000000000000000180900000000000000000000000000002d00020000000000b70100000b000000850000007500000095"], &(0x7f00000000c0)}, 0x80)
> bpf$BPF_RAW_TRACEPOINT_OPEN(0x11, &(0x7f0000000100)={&(0x7f0000000080)='kmem_cache_free\x00', r0}, 0x10)

Is syzbot running without any user space?
Is syzbot itself pid 1, and the only process?
If so, the error would make sense.
I guess we can add a safety check to bpf_send_signal_common()
to prevent syzbot from killing itself.
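
A minimal sketch of such a safety check, written as a userspace model
rather than as a kernel patch. The struct and the *_model names are
hypothetical stand-ins for task_struct, is_global_init() and
bpf_send_signal_common(); returning -EPERM is an assumption made for
illustration, not something settled in this thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task_model {
	int tgid;	/* thread-group id; the global init has tgid 1 */
};

/* Models is_global_init(): true when the task belongs to pid 1. */
static bool is_global_init_model(const struct task_model *t)
{
	return t->tgid == 1;
}

/*
 * Models the early-exit part of bpf_send_signal_common().
 * Assumption: refuse with -EPERM when current is the global init,
 * so a BPF prog cannot kill pid 1 and panic the kernel.
 */
static int send_signal_common_model(const struct task_model *cur, int sig)
{
	if (is_global_init_model(cur))
		return -EPERM;

	(void)sig;	/* the real helper would queue the signal here */
	return 0;
}
```

With this model, a call made while current is pid 1 is rejected, and
any other task falls through to the normal signal path.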