Re: [syzbot] [trace?] [bpf?] KASAN: slab-use-after-free Read in bpf_trace_run2 (2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2024-10-22 04:20, Steven Rostedt wrote:

Mathieu, can you look at this?

Sure,


[ more below ]

On Mon, 21 Oct 2024 18:23:47 +0000
Jordan Rife <jrife@xxxxxxxxxx> wrote:

I performed a bisection and this issue starts with commit a363d27cdbc2
("tracing: Allow system call tracepoints to handle page faults") which
introduces this change.

+ *
+ * With @syscall=0, the tracepoint callback array dereference is
+ * protected by disabling preemption.
+ * With @syscall=1, the tracepoint callback array dereference is
+ * protected by Tasks Trace RCU, which allows probes to handle page
+ * faults.
   */
  #define __DO_TRACE(name, args, cond, syscall)				\
  	do {								\
@@ -204,11 +212,17 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
  		if (!(cond))						\
  			return;						\
  									\
-		preempt_disable_notrace();				\
+		if (syscall)						\
+			rcu_read_lock_trace();				\
+		else							\
+			preempt_disable_notrace();			\
  									\
  		__DO_TRACE_CALL(name, TP_ARGS(args));			\
  									\
-		preempt_enable_notrace();				\
+		if (syscall)						\
+			rcu_read_unlock_trace();			\
+		else							\
+			preempt_enable_notrace();			\
  	} while (0)

Link: https://lore.kernel.org/bpf/20241009010718.2050182-6-mathieu.desnoyers@xxxxxxxxxxxx/

I reproduced the bug locally by running syz-execprog inside a QEMU VM.

./syz-execprog -repeat=0 -procs=5 ./repro.syz.txt

I /think/ what is happening is that with this change preemption may now
occur leading to a scenario where the RCU grace period is insufficient
in a few places where call_rcu() is used. In other words, there are a
few scenarios where call_rcu_tasks_trace() should be used instead to
prevent a use-after-free bug when a preempted tracepoint call tries to
access a program, link, etc. that was freed. It seems the syzkaller
program induces page faults while attaching raw tracepoints to
sys_enter making preemption more likely to occur.

kernel/tracepoint.c
===================
...
static inline void release_probes(struct tracepoint_func *old)
{
	...
	call_rcu(&tp_probes->rcu, rcu_free_old_probes); <-- Here

Have you tried just changing this one to call_rcu_tasks_trace()?

Actually, I see two possible solutions there:

1) If we want to keep unchanged register/unregister functions as a single
   API for normal and syscall tracepoints, and we don't want to add additional
   flags in the tracepoint struct, then we need to chain the call_rcu:

   call_rcu() -> rcu_free_old_probes -> call_rcu_tasks_trace() -> rcu_tasks_trace_free_old_probes.

   Just like we did when we used SRCU.

   This is not perfect because we'd be adding extra delay for reclaim of
   the old probes array for every tracepoint being updated, not just the
   syscall tracepoints, but we probably don't care. This is a straightforward
   initial solution.

   We did something similar with SRCU before and it was OK.

2) If we want something more elegant, we can add a flag to struct tracepoint
   for syscall tracepoints, and use that flag to choose between call_rcu()
   and call_rcu_tasks_trace(). But maybe this additional complexity is not
   even useful.

I'll prepare a patch implementing (1) and send it your way as RFC for further
testing.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux