Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Jun 16, 2020 at 6:24 PM Hao Luo <haoluo@xxxxxxxxxx> wrote:
>
> Andrii,
>
> Do you think we need to put the kernel's variables in one single
> DATASEC in vmlinux BTF? It looks like all the ksyms in the program
> will be under one ".ksyms" section, so we are not able to tell whether
> a ksym is from a percpu section or a .rodata section. Without this
> information, if the vmlinux has multiple DATASECs, the loader may need
> to traverse all of them. If vmlinux BTF has only one DATASEC, it
> matches the object's BTF better.
>
> Right now, the percpu vars are in a ".data..percpu" DATASEC in my
> patch and the plan seems that we will introduce more DATASECs to hold
> other data.
>
> Please let me know your insights here. Thanks.

I think we should keep original DATASECs in vmlinux's BTF, so that
they match ELF sections. Otherwise BTF is going to lie and will cause
confusion down the road in the longer term.

On the BPF program side, though, I think we'll limit it to just two
special sections: .ksyms and .ksyms.percpu. libbpf will have to
traverse all vmlinux DATASECs to find corresponding variables, but
that's ok, it has to do the linear scan either way.

>
> Hao
>
> On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> >
> > On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> > <andrii.nakryiko@xxxxxxxxx> wrote:
> > >
> > > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> > > >
> > > > Thanks, Andrii,
> > > >
> > > > This change looks nice! A couple of comments:
> > > >
> > > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> > >
> > > That choice is very deliberate one. `extern const void` is the right
> > > way in C language to access linker-generated symbols, for instance,
> > > which is quite similar to what the intent is her. Having void type is
> > > very explicit that you don't know/care about that value pointed to by
> > > extern address, the only operation you can perform is to get it's
> > > address.
> > >
> > > Once we add kernel variables support, that's when types will start to
> > > be specified and libbpf will do extra checks (type matching) and extra
> > > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > > to access data pointed to by extern address.
> > >
> > > Switching type to u64 would be misleading in allowing C code to
> > > implicitly dereference value of extern. E.g., there is a big
> > > difference between:
> > >
> > > extern u64 bla;
> > >
> > > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > > of memory pointed to by "bla" symbol */
> > >
> > > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > > extern variable */
> > >
> > > Currently I explicitly support only the latter and want to prevent the
> > > former, until we have kernel variables in BTF. Using `extern void`
> > > makes compiler enforce that only the &bla form is allowed. Everything
> > > else is compilation error.
> > >
> >
> > Ah, I see. I've been taking the extern variable as an actual variable
> > that contains the symbol's address, which is the first case. Your
> > approach makes sense. Thanks for explaining.
> >
> > > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> > >
> > > That's a bit of a hack on my part. Variable needs to point to some
> > > type, which size will match the size of datasec's varinfo entry. This
> > > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > > it's almost guaranteed that it will be present in program's BTF and I
> > > won't have to explicitly add it (it's because all BPF programs return
> > > int, so it must be in program's BTF already). While 8-byte long is
> > > less likely to be there.
> > >
> > > In the future, if we have a nicer way to extend BTF (and we will
> > > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > > type isn't used for anything (there is no map with that DATASEC as a
> > > value type), so it doesn't matter.
> > >
> >
> > Thanks for explaining, Andrii.
> >
> > These explanations as comments in the code would be quite helpful, IMHO.



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux