Re: kernel panic: Attempted to kill init!

Hao Sun <sunhao.th@xxxxxxxxx> · Thu, 5 Jan 2023 17:00:07 +0800

> On 4 Jan 2023, at 2:33 AM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> 
> On Tue, Jan 3, 2023 at 4:46 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
>> 
>> 
>> 
>>> On 31 Dec 2022, at 12:55 AM, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
>>> 
>>> On Fri, Dec 30, 2022 at 1:54 AM Hao Sun <sunhao.th@xxxxxxxxx> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On 28 Dec 2022, at 2:35 PM, Yonghong Song <yhs@xxxxxxxx> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On 12/21/22 8:35 PM, Hao Sun wrote:
>>>>>> Hi,
>>>>>> This crash can be triggered by running the C reproducer
>>>>>> multiple times; it just keeps loading the following prog as a
>>>>>> raw tracepoint on kmem_cache_free().
>>>>>> The prog sends SIGSEGV to current via bpf_send_signal_thread().
>>>>>> Once it is loaded, whoever tries to free memory triggers it, and
>>>>>> the kernel crashes when this happens to init.
>>>>>> It seems we should filter init out in bpf_send_signal_common() with
>>>>>> is_global_init(current), or maybe we should check this in the
>>>>>> verifier?
>>>>> 
>>>>> The helper is just to send a particular signal to *current*
>>>>> thread. In typical use case, it is never a good idea to send
>>>>> the signal to a *random* thread. In certain cases, maybe user
>>>>> indeed wants to send the signal to init thread to observe
>>>>> something. Note that such destructive side effect already
>>>>> exists in bpf land. For example, an XDP program
>>>>> could drop all packets and make the machine unresponsive
>>>>> to ssh etc. Therefore, I recommend keeping the existing
>>>>> bpf_send_signal_common() helper behavior.
>>>> 
>>>> These sound like two different cases. Unresponsiveness under XDP seems
>>>> like intended behaviour, whereas a panic caused by killing init is a bug.
>>>> If the last thread of global init is killed, the kernel panics immediately.
>>> 
>>> I don't get it. How was it possible that this prog was
>>> executed with current == pid 1 ?
>> 
>> The prog is a raw tracepoint attached to the ‘kmem_cache_free’ event.
>> When init triggers the event, the prog is executed with pid 1.
>> But the reason for this crash is not very clear to me, because it’s
>> really hard to debug with the original C reproducer.
>> 
>> The following is the corresponding Syz prog:
>> 
>> # {Threaded:true Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:none SandboxArg:0 Leak:false NetInjection:true NetDevices:true NetReset:true Cgroups:true BinfmtMisc:true CloseFDs:true KCSAN:false DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false IEEE802154:true Sysctl:true UseTmpDir:true HandleSegv:true Repro:false Trace:false LegacyOptions:{Collide:false Fault:false FaultCall:0 FaultNth:0}}
>> r0 = bpf$BPF_PROG_RAW_TRACEPOINT_LOAD(0x5, &(0x7f0000000000)={0x11, 0xe, &(0x7f0000000400)=ANY=[@ANYBLOB="18000000000000000000000000000000180600000000000000000000000000001807000000000000000000000000000018080000000000000000000000000000180900000000000000000000000000002d00020000000000b70100000b000000850000007500000095"], &(0x7f00000000c0)}, 0x80)
>> bpf$BPF_RAW_TRACEPOINT_OPEN(0x11, &(0x7f0000000100)={&(0x7f0000000080)='kmem_cache_free\x00', r0}, 0x10)
> 
> Is syzbot running without any user space?
> Is syzbot itself pid=1, and the only process?
> If so, the error would make sense.

Yes. After reading the C reproducer again, I noticed that after a
bunch of sandbox setup, the pid of the reproducer process at
runtime is 1.
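
For anyone who wants to confirm this on their own setup, a minimal
sketch of a check that could be called right after the sandbox setup
is below. It is not part of the syzkaller-generated reproducer, and the
helper names is_pid_one() and report_pid() are made up for illustration.
Note that getpid() reports the pid as seen in the caller's PID
namespace, while is_global_init() in the kernel compares the global
tgid, so this is only a first-order check.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of the original reproducer. */
static int is_pid_one(pid_t pid)
{
	return pid == 1;
}

/*
 * Log the caller's pid so the output shows what the sandbox setup
 * left us with. getpid() is namespace-relative, so a value of 1 here
 * does not by itself prove the task is the global init.
 */
static void report_pid(void)
{
	pid_t pid = getpid();

	fprintf(stderr, "reproducer pid=%d%s\n", (int)pid,
		is_pid_one(pid) ? " (pid 1 in this namespace)" : "");
}
```

Calling report_pid() once before the bpf() syscalls should be enough to
see whether the process ends up as pid 1.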

> I guess we can add a safety check to bpf_send_signal_common
> to prevent syzbot from killing itself.

Maybe something like this? It avoids the panic, but it also prevents
a task with pid=1 from sending signals via a prog.

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 23ce498bca97..94d2af2ce433 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -844,6 +844,8 @@ static int bpf_send_signal_common(u32 sig, enum pid_type type)
 	 */
 	if (unlikely(current->flags & (PF_KTHREAD | PF_EXITING)))
 		return -EPERM;
+	if (unlikely(is_global_init(current)))
+		return -EPERM;
 	if (unlikely(!nmi_uaccess_okay()))
 		return -EPERM;