Re: [bug] kernel: bpf: syscall: a possible sleep-in-atomic bug in __bpf_prog_put()

On 5/19/23 7:18 AM, Teng Qi wrote:
Thank you for your response.
 > Looks like you only have suspicion here. Could you find a real violation
 > here where __bpf_prog_put() is called with !in_irq() &&
 > !irqs_disabled(), but inside spin_lock or rcu read lock? I have not seen
 > things like that.

Because the conditions for reaching bpf_prog_put() with a refcnt of 1 are
complex, we have been unable to trigger this atomic violation with manually
constructed test cases. But we found that it is possible to show cases with
!in_irq() && !irqs_disabled(), but inside spin_lock or rcu read lock.
For example, even as a failed case, netns_cookie, one of the bpf selftests,
calls bpf_sock_map_update() and may indirectly call bpf_prog_put()
while holding only the rcu read lock. The possible call stack is:
net/core/sock_map.c: 615 bpf_sock_map_update()
net/core/sock_map.c: 468 sock_map_update_common()
net/core/sock_map.c:  217 sock_map_link()
kernel/bpf/syscall.c: 2111 bpf_prog_put()

The files for netns_cookie are
tools/testing/selftests/bpf/progs/netns_cookie_prog.c and
tools/testing/selftests/bpf/prog_tests/netns_cookie.c. We inserted the
following code at the top of
‘net/core/sock_map.c: 468 sock_map_update_common()’:
static int sock_map_update_common(..)
{
         int inIrq = in_irq();
         int irqsDisabled = irqs_disabled();
         int preemptBits = preempt_count();
         int inAtomic = in_atomic();
         int rcuHeld = rcu_read_lock_held();
         /* Adjacent string literals are concatenated; a literal cannot span lines. */
         printk("in_irq() %d, irqs_disabled() %d, preempt_count() %d, "
                "in_atomic() %d, rcu_read_lock_held() %d\n",
                inIrq, irqsDisabled, preemptBits, inAtomic, rcuHeld);
}

The output message is as follows:
root@(none):/root/bpf# ./test_progs -t netns_cookie
[  137.639188] in_irq() 0, irqs_disabled() 0, preempt_count() 0, in_atomic() 0,
         rcu_read_lock_held() 1
#113     netns_cookie:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

We notice that there are numerous callers of bpf_prog_put() in kernel/, net/
and drivers/, so we strongly suggest modifying __bpf_prog_put() to address this
gap. The gap exists because __bpf_prog_put() defers the release only under
in_irq() || irqs_disabled(), but not under in_atomic() || rcu_read_lock_held().
The following code snippet may mislead developers into thinking that
bpf_prog_put() is safe in all contexts.
if (in_irq() || irqs_disabled()) {
         INIT_WORK(&aux->work, bpf_prog_put_deferred);
         schedule_work(&aux->work);
} else {
         bpf_prog_put_deferred(&aux->work);
}

Relying on this implicit dependency on the caller's context may lead to issues.
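
To make the gap concrete, here is a minimal userspace sketch that compares the
existing check with the "in_irq() || irqs_disabled() || in_atomic()" check
proposed in the original report. It is only a model, not kernel code: struct
ctx_model, model_in_atomic(), defer_current() and defer_proposed() are
hypothetical names, and in_atomic() is modeled simply as preempt_count != 0.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel context predicates. */
struct ctx_model {
	bool in_irq;
	bool irqs_disabled;
	int preempt_count;       /* assumption: in_atomic() == (preempt_count != 0) */
	bool rcu_read_lock_held; /* recorded only; neither check below reads it */
};

static bool model_in_atomic(const struct ctx_model *c)
{
	return c->preempt_count != 0;
}

/* Mirrors the condition quoted from __bpf_prog_put(): true means "defer to a work item". */
static bool defer_current(const struct ctx_model *c)
{
	return c->in_irq || c->irqs_disabled;
}

/* Mirrors the condition proposed in the original report. */
static bool defer_proposed(const struct ctx_model *c)
{
	return c->in_irq || c->irqs_disabled || model_in_atomic(c);
}
```

With the values printed by the netns_cookie run above (all zero except
rcu_read_lock_held() 1), both defer_current() and defer_proposed() return
false in this model, so adding in_atomic() alone would not cover that case.
A context with a nonzero preempt_count, for example a spin lock on a kernel
that counts it there, is deferred only by defer_proposed().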

 > Any problem here?
We mentioned it to demonstrate the possibility of kvfree() being
called by __bpf_prog_put_noref().

Thanks.
-- Teng Qi

On Wed, May 17, 2023 at 1:08 AM Yonghong Song <yhs@xxxxxxxx> wrote:



    On 5/16/23 4:18 AM, starmiku1207184332@xxxxxxxxx wrote:
     > From: Teng Qi <starmiku1207184332@xxxxxxxxx>
     >
     > Hi, bpf developers,
     >
     > We are developing a static tool to check the matching between helpers
     > and the context of hooks. During our analysis, we have discovered some
     > important findings that we would like to report.
     >
     > ‘kernel/bpf/syscall.c: 2097 __bpf_prog_put()’ shows that function
     > bpf_prog_put_deferred() won't be called directly in the condition of
     > ‘in_irq() || irqs_disabled()’.
     > if (in_irq() || irqs_disabled()) {
     >      INIT_WORK(&aux->work, bpf_prog_put_deferred);
     >      schedule_work(&aux->work);
     > } else {
     >
     >      bpf_prog_put_deferred(&aux->work);
     > }
     >
     > We suspect this condition exists because there might be sleepable
     > operations in the callees of the bpf_prog_put_deferred() function:
     > kernel/bpf/syscall.c: 2097 __bpf_prog_put()
     > kernel/bpf/syscall.c: 2084 bpf_prog_put_deferred()
     > kernel/bpf/syscall.c: 2063 __bpf_prog_put_noref()
     > kvfree(prog->aux->jited_linfo);
     > kvfree(prog->aux->linfo);

    Looks like you only have suspicion here. Could you find a real
    violation
    here where __bpf_prog_put() is called with !in_irq() &&
    !irqs_disabled(), but inside spin_lock or rcu read lock? I have not seen
    things like that.

     >
     > Additionally, we found that array prog->aux->jited_linfo is
     > initialized in
     > ‘kernel/bpf/core.c: 157 bpf_prog_alloc_jited_linfo()’:
     > prog->aux->jited_linfo = kvcalloc(prog->aux->nr_linfo,
     >    sizeof(*prog->aux->jited_linfo),
     >    bpf_memcg_flags(GFP_KERNEL | __GFP_NOWARN));

    Any problem here?

     >
     > Our question is whether the condition '!(in_irq() || irqs_disabled())'
     > is sufficient for calling 'kvfree'. We are aware that calling 'kvfree'
     > within the context of a spin lock or an RCU lock is unsafe.

    Your above analysis makes sense if kvfree indeed cannot appear
    inside a spin lock region or RCU read lock region. But is it true?
    I checked a few code paths in kvfree/kfree. They are guarded either
    with local_irq_save/restore or by
    spin_lock_irqsave/spin_unlock_irqrestore, etc. Did I miss
    anything? Are you talking about RT kernel here?


     >
     > Therefore, we propose modifying the condition to include in_atomic().
     > Could we update the condition as follows:
     > "in_irq() || irqs_disabled() || in_atomic()"?
     >
     > Thank you! We look forward to your feedback.
     >
     > Signed-off-by: Teng Qi <starmiku1207184332@xxxxxxxxx>




