Re: [PATCH bpf-next v6 1/2] bpf: Fix attaching fentry/fexit/fmod_ret/lsm to modules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 22, 2023 at 09:03:46AM -0700, Alexei Starovoitov wrote:
> On Wed, Mar 22, 2023 at 5:14 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> >
> > On Wed, Mar 22, 2023 at 10:49:38AM +0100, Artem Savkov wrote:
> >
> > SNIP
> >
> > > > > Hm, do we even need to preempt_disable? IIUC, preempt_disable is used
> > > > > in module kallsyms to prevent taking the module lock b/c kallsyms are
> > > > > used in the oops path. That shouldn't be an issue here, is that correct?
> > > >
> > > > btf_try_get_module calls try_module_get which disables the preemption,
> > > > so no need to call it in here
> > >
> > > It does, but it reenables preemption right away so it is enabled by the
> > > time we call find_kallsyms_symbol_value(). I am getting the following
> > > lockdep splat while running module_fentry_shadow test from test_progs.
> > >
> > > [   12.017973][  T488] =============================
> > > [   12.018529][  T488] WARNING: suspicious RCU usage
> > > [   12.018987][  T488] 6.2.0.bpf-test-13063-g6a9f5cdba3c5 #804 Tainted: G           OE
> > > [   12.019898][  T488] -----------------------------
> > > [   12.020391][  T488] kernel/module/kallsyms.c:448 suspicious rcu_dereference_check() usage!
> > > [   12.021335][  T488]
> > > [   12.021335][  T488] other info that might help us debug this:
> > > [   12.021335][  T488]
> > > [   12.022416][  T488]
> > > [   12.022416][  T488] rcu_scheduler_active = 2, debug_locks = 1
> > > [   12.023297][  T488] no locks held by test_progs/488.
> > > [   12.023854][  T488]
> > > [   12.023854][  T488] stack backtrace:
> > > [   12.024336][  T488] CPU: 0 PID: 488 Comm: test_progs Tainted: G           OE      6.2.0.bpf-test-13063-g6a9f5cdba3c5 #804
> > > [   12.025290][  T488] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/01/2014
> > > [   12.026108][  T488] Call Trace:
> > > [   12.026381][  T488]  <TASK>
> > > [   12.026649][  T488]  dump_stack_lvl+0xb4/0x110
> > > [   12.027060][  T488]  lockdep_rcu_suspicious+0x158/0x1f0
> > > [   12.027541][  T488]  find_kallsyms_symbol_value+0xe8/0x110
> > > [   12.028028][  T488]  bpf_check_attach_target+0x838/0xa20
> > > [   12.028511][  T488]  check_attach_btf_id+0x144/0x3f0
> > > [   12.028957][  T488]  ? __pfx_cmp_subprogs+0x10/0x10
> > > [   12.029408][  T488]  bpf_check+0xeec/0x1850
> > > [   12.029799][  T488]  ? ktime_get_with_offset+0x124/0x1d0
> > > [   12.030247][  T488]  bpf_prog_load+0x87a/0xed0
> > > [   12.030627][  T488]  ? __lock_release+0x5f/0x160
> > > [   12.031010][  T488]  ? __might_fault+0x53/0xb0
> > > [   12.031394][  T488]  ? selinux_bpf+0x6c/0xa0
> > > [   12.031756][  T488]  __sys_bpf+0x53c/0x1240
> > > [   12.032115][  T488]  __x64_sys_bpf+0x27/0x40
> > > [   12.032476][  T488]  do_syscall_64+0x3e/0x90
> > > [   12.032835][  T488]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >
> >
> > hum, for some reason I can't reproduce, but looks like we need to disable
> > preemption for find_kallsyms_symbol_value.. could you please check with
> > patch below?
> >
> > also could you please share your .config? not sure why I can't reproduce
> >
> > thanks,
> > jirka
> >
> >
> > ---
> > diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
> > index ab2376a1be88..bdc911dbcde5 100644
> > --- a/kernel/module/kallsyms.c
> > +++ b/kernel/module/kallsyms.c
> > @@ -442,7 +442,7 @@ int module_get_kallsym(unsigned int symnum, unsigned long *value, char *type,
> >  }
> >
> >  /* Given a module and name of symbol, find and return the symbol's value */
> > -unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name)
> > +static unsigned long __find_kallsyms_symbol_value(struct module *mod, const char *name)
> >  {
> >         unsigned int i;
> >         struct mod_kallsyms *kallsyms = rcu_dereference_sched(mod->kallsyms);
> > @@ -466,7 +466,7 @@ static unsigned long __module_kallsyms_lookup_name(const char *name)
> >         if (colon) {
> >                 mod = find_module_all(name, colon - name, false);
> >                 if (mod)
> > -                       return find_kallsyms_symbol_value(mod, colon + 1);
> > +                       return __find_kallsyms_symbol_value(mod, colon + 1);
> >                 return 0;
> >         }
> >
> > @@ -475,7 +475,7 @@ static unsigned long __module_kallsyms_lookup_name(const char *name)
> >
> >                 if (mod->state == MODULE_STATE_UNFORMED)
> >                         continue;
> > -               ret = find_kallsyms_symbol_value(mod, name);
> > +               ret = __find_kallsyms_symbol_value(mod, name);
> >                 if (ret)
> >                         return ret;
> >         }
> > @@ -494,6 +494,16 @@ unsigned long module_kallsyms_lookup_name(const char *name)
> >         return ret;
> >  }
> >
> > +unsigned long find_kallsyms_symbol_value(struct module *mod, const char *name)
> > +{
> > +       unsigned long ret;
> > +
> > +       preempt_disable();
> > +       ret = __find_kallsyms_symbol_value(mod, name);
> > +       preempt_enable();
> > +       return ret;
> > +}
> 
> That doesn't look right.
> I think the issue is misuse of rcu_dereference_sched in
> find_kallsyms_symbol_value.

it seems to be using rcu pointer to keep symbols for module init time and
then core symbols for after init.. and switch between them when module is
loaded, hence the strange rcu usage I think

Petr, Zhen, any idea/insight?

thanks,
jirka



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux