Re: [PATCH bpf-next] libbpf: support module BTF for BPF_TYPE_ID_TARGET CO-RE relocation

Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> · Tue, 8 Dec 2020 15:39:20 -0800

On Tue, Dec 08, 2020 at 10:13:35PM +0000, Alan Maguire wrote:
> On Mon, 7 Dec 2020, Andrii Nakryiko wrote:
> 
> > On Mon, Dec 7, 2020 at 7:12 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Mon, Dec 07, 2020 at 04:38:16PM +0000, Alan Maguire wrote:
> > > > Sorry about this Andrii, but I'm a bit stuck here.
> > > >
> > > > I'm struggling to get tests working where the obj fd is used to designate
> > > > the module BTF. Unless I'm missing something there are a few problems:
> > > >
> > > > - the fd association is removed by libbpf when the BPF program has loaded;
> > > > the module fds are closed and the module BTF is discarded.  However even if
> > > > that isn't done (and as you mentioned, we could hold onto BTF that is in
> > > > use, and I commented out the code that does that to test) - there's
> > > > another problem:
> > > > - I can't see a way to use the object fd value we set here later in BPF
> > > > program context; btf_get_by_fd() returns -EBADF as the fd is associated
> > > > with the module BTF in the test's process context, not necessarily in
> > > > the context that the BPF program is running.  Would it be possible in this
> > > > case to use object id? Or is there another way to handle the fd->module
> > > > BTF association that we need to make in BPF program context that I'm
> > > > missing?
> > > > - A more long-term issue; if we use fds to specify module BTFs and write
> > > > the object fd into the program, we can pin the BPF program such that it
> > > > outlives fds that refer to its associated BTF.  So unless we pinned the
> > > > BTF too, any code that assumed the BTF fd-> module mapping was valid would
> > > > start to break once the user-space side went away and the pinned program
> > > > persisted.
> > >
> > > All of the above are not issues. They are features of FD based approach.
> > > When the program refers to btf via fd the verifier needs to increment btf's refcnt
> > > so it won't go away while the prog is running. For module's BTF it means
> > > that the module can be unloaded, but its BTF may stay around if there is a prog
> > > that needs to access it.
> > > I think the missing piece in the above is that btf_get_by_fd() should be
> > > done at load time instead of program run-time.
> > > Everything FD based needs to behave similar to map_fds where ld_imm64 insn
> > > contains map_fd that gets converted to map_ptr by the verifier at load time.
> > 
> > Right. I was going to extend verifier to do the same for all used BTF
> > objects as part of ksym support for module BTFs. So totally agree.
> > Just didn't need it so far.
> > 
> 
> Does this approach prevent more complex run-time specification of BTF 
> object fd though?  For example, I've been working on a simple tracer 
> focused on kernel debugging; it uses a BPF map entry for each kernel 
> function that is traced. User-space populates the map entry with BTF type 
> ids for the function arguments/return value, and when the BPF program 
> runs it uses the instruction pointer to look up the map entry for that
> function, and uses bpf_snprintf_btf() to write the string representations 
> of the function arguments/return values.  I'll send out an RFC soon, 
> but longer-term I was hoping to extend it to support module-specific 
> types.  Would a dynamic case like that - where the BTF module fd is looked 
> up in a map entry during program execution (rather than derived via 
> __btf_builtin_type_id()) work too? Thanks!

An fd has to be resolved in process context. A bpf prog can read an fd
number from the map, but that number is meaningless to the prog.
Say we allow using btf_obj_id+btf_id, how will user space know these
two numbers? Some new libbpf api that searches for it?
An extension to libbpf_find_vmlinux_btf_id() ? I was hoping that this api
will stay semi-internal. But say it's extended.
The user space will store a pair of numbers into a map and
what is the program going to do with it?
If it's printing struct veth_stats contents it should have attached to
a corresponding function in the veth module via fentry or something.
The prog has hard coded logic in C with specific pointer to print.
The prog has its type right there. Why would the prog take a pointer
from one place, but its type_id from the map? That's not realistic.
Where it would potentially make sense is what I think you're describing,
where a single kprobe style prog is attached to many places and args of
those places are stored in a map and the prog selects them with
map_lookup with key=PT_REGS_IP ?
And passes pointers into bpf_snprintf_btf() from PT_REGS_PARM1() ?
I see why that is useful, but it's so racy. By the time the map
is populated those btf_obj_id+btf_id could be invalid.
I think instead of doing this in user space the program needs access
to vmlinux+mods BTFs. Sort of like the proposed bpf helper to return ksym
based on IP there could be a helper to figure out btf_id+btf_obj_POINTER
based on IP. Then there will be no need to populate an external map.
Would that solve your use case?
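
For readers following the fd discussion above, here is a toy user-space
model in plain C, not kernel or libbpf code. It only illustrates two
points from the thread: an fd number is meaningful solely in the fd
table of the process that owns it, and resolving the fd to a pointer at
load time while taking a reference keeps the object alive after the fd
is closed. Every name in it (toy_btf, toy_fd_table, toy_fd_install,
toy_fd_close, toy_resolve_at_load, toy_demo) is hypothetical and chosen
for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a BTF object; only the reference count matters here. */
struct toy_btf {
	int refcnt;
};

#define TOY_MAX_FDS 8

/* Toy per-process fd table. The same fd number in a different table
 * refers to a different object, or to nothing at all. */
struct toy_fd_table {
	struct toy_btf *slots[TOY_MAX_FDS];
};

/* Install obj in the lowest free slot and take a reference for the fd.
 * Returns the fd number, or -EMFILE if the table is full. */
static int toy_fd_install(struct toy_fd_table *t, struct toy_btf *obj)
{
	for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
		if (!t->slots[fd]) {
			t->slots[fd] = obj;
			obj->refcnt++;
			return fd;
		}
	}
	return -EMFILE;
}

/* Close fd and drop the reference the fd held. Returns 0 or -EBADF. */
static int toy_fd_close(struct toy_fd_table *t, int fd)
{
	if (fd < 0 || fd >= TOY_MAX_FDS || !t->slots[fd])
		return -EBADF;
	t->slots[fd]->refcnt--;
	t->slots[fd] = NULL;
	return 0;
}

/* "Load time" step: look fd up in the loader's own table, convert it to
 * a pointer and take an extra reference on behalf of the program.
 * Returns NULL if fd is not valid in that table. */
static struct toy_btf *toy_resolve_at_load(struct toy_fd_table *loader, int fd)
{
	if (fd < 0 || fd >= TOY_MAX_FDS || !loader->slots[fd])
		return NULL;
	loader->slots[fd]->refcnt++;
	return loader->slots[fd];
}

/* Walk through the scenario; returns 0 if every step behaves as described. */
static int toy_demo(void)
{
	struct toy_btf module_btf = { 0 };
	struct toy_fd_table loader = { { NULL } };
	struct toy_fd_table other = { { NULL } };

	int fd = toy_fd_install(&loader, &module_btf);
	assert(fd == 0 && module_btf.refcnt == 1);

	/* The same number looked up in another context resolves to nothing. */
	assert(toy_resolve_at_load(&other, fd) == NULL);

	/* Resolving in the loader's context yields a pointer plus a reference. */
	struct toy_btf *held = toy_resolve_at_load(&loader, fd);
	assert(held == &module_btf && module_btf.refcnt == 2);

	/* Closing the fd does not release the object the program still holds. */
	assert(toy_fd_close(&loader, fd) == 0);
	assert(module_btf.refcnt == 1);
	assert(toy_fd_close(&loader, fd) == -EBADF);
	return 0;
}
```

Calling toy_demo() from a main() runs the whole scenario. The model
leaves out everything that makes the real problem hard (the verifier,
module unload, pinning); it is only meant to show why a raw fd number
stored in a map carries no meaning at program run time, while a pointer
resolved and refcounted at load time does.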