Re: [PATCH bpf-next 5/7] libbpf: add x86-specific USDT arg spec parsing logic

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 31, 2022 at 8:14 AM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote:
>
> On Fri, 25 Mar 2022, Andrii Nakryiko wrote:
>
> > Add x86/x86_64-specific USDT argument specification parsing. Each
> > architecture will require their own logic, as all this is arch-specific
> > assembly-based notation. Architectures that libbpf doesn't support for
> > USDTs will pr_warn() with specific error and return -ENOTSUP.
> >
> > We use sscanf() as a very powerful and easy to use string parser. Those
> > spaces in sscanf's format string mean "skip any whitespaces", which is
> > pretty nifty (and somewhat little known) feature.
> >
> > All this was tested on little-endian architecture, so bit shifts are
> > probably off on big-endian, which our CI will hopefully prove.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
>
> Reviewed-by: Alan Maguire <alan.maguire@xxxxxxxxxx>
>
> minor stuff below...
>
> > ---
> >  tools/lib/bpf/usdt.c | 105 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 105 insertions(+)
> >
> > diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c
> > index 22f5f56992f8..5cf809db60aa 100644
> > --- a/tools/lib/bpf/usdt.c
> > +++ b/tools/lib/bpf/usdt.c
> > @@ -1007,8 +1007,113 @@ static int parse_usdt_spec(struct usdt_spec *spec, const struct usdt_note *note,
> >       return 0;
> >  }
> >
> > +/* Architecture-specific logic for parsing USDT argument location specs */
> > +
> > +#if defined(__x86_64__) || defined(__i386__)
> > +
> > +static int calc_pt_regs_off(const char *reg_name)
> > +{
> > +     static struct {
> > +             const char *names[4];
> > +             size_t pt_regs_off;
> > +     } reg_map[] = {
> > +#if __x86_64__
> > +#define reg_off(reg64, reg32) offsetof(struct pt_regs, reg64)
> > +#else
> > +#define reg_off(reg64, reg32) offsetof(struct pt_regs, reg32)
> > +#endif
> > +             { {"rip", "eip", "", ""}, reg_off(rip, eip) },
> > +             { {"rax", "eax", "ax", "al"}, reg_off(rax, eax) },
> > +             { {"rbx", "ebx", "bx", "bl"}, reg_off(rbx, ebx) },
> > +             { {"rcx", "ecx", "cx", "cl"}, reg_off(rcx, ecx) },
> > +             { {"rdx", "edx", "dx", "dl"}, reg_off(rdx, edx) },
> > +             { {"rsi", "esi", "si", "sil"}, reg_off(rsi, esi) },
> > +             { {"rdi", "edi", "di", "dil"}, reg_off(rdi, edi) },
> > +             { {"rbp", "ebp", "bp", "bpl"}, reg_off(rbp, ebp) },
> > +             { {"rsp", "esp", "sp", "spl"}, reg_off(rsp, esp) },
> > +#undef reg_off
> > +#if __x86_64__
> > +             { {"r8", "r8d", "r8w", "r8b"}, offsetof(struct pt_regs, r8) },
> > +             { {"r9", "r9d", "r9w", "r9b"}, offsetof(struct pt_regs, r9) },
> > +             { {"r10", "r10d", "r10w", "r10b"}, offsetof(struct pt_regs, r10) },
> > +             { {"r11", "r11d", "r11w", "r11b"}, offsetof(struct pt_regs, r11) },
> > +             { {"r12", "r12d", "r12w", "r12b"}, offsetof(struct pt_regs, r12) },
> > +             { {"r13", "r13d", "r13w", "r13b"}, offsetof(struct pt_regs, r13) },
> > +             { {"r14", "r14d", "r14w", "r14b"}, offsetof(struct pt_regs, r14) },
> > +             { {"r15", "r15d", "r15w", "r15b"}, offsetof(struct pt_regs, r15) },
> > +#endif
> > +     };
> > +     int i, j;
> > +
> > +     for (i = 0; i < ARRAY_SIZE(reg_map); i++) {
> > +             for (j = 0; j < ARRAY_SIZE(reg_map[i].names); j++) {
> > +                     if (strcmp(reg_name, reg_map[i].names[j]) == 0)
> > +                             return reg_map[i].pt_regs_off;
> > +             }
> > +     }
> > +
> > +     pr_warn("usdt: unrecognized register '%s'\n", reg_name);
> > +     return -ENOENT;
> > +}
>
> this is a really neat approach! could we shrink the arch-dependent
> part even further to the reg_map only? so instead of having the
> parse_usdt_arg() in the #ifdef __x86_64__/___i386__ , only the
> reg_map is, and we have an empty reg_map for an unsupported arch
> such that calc_pt_regs_off() does
>

That would reduce the flexibility and will save only a few lines of
code. Different architectures might have their own quirks and reg_map
might not fit all their needs. So I went for a more independent and
flexible approach, even if some loop has to be duplicated.

>         if (ARRAY_SIZE(reg_map) == 0) {
>                 pr_warn("usdt: libbpf doesn't support USDTs on current
> architecture\n");
>                 return -ENOTSUP;
>         }
>
> > +
> > +static int parse_usdt_arg(const char *arg_str, int arg_num, struct usdt_arg_spec *arg)
> > +{
> > +     char *reg_name = NULL;
> > +     int arg_sz, len, reg_off;
> > +     long off;
> > +
>
> nit but it took me a moment to notice that you had examples in each
> clause; might be good to have a higher-level comment stating
>
> we support 3 forms of argument description:
>
> - register dereference "-4@-20(%rbp)"
> - register "-4@%eax"
> - constant "4@$71"
>
> I _think_ you mentioned there were other valid arg formats that we're not
> supporting, would be good to be explicit about that here I think; "other
> formats are possible but we don't support them currently".

Yep, sure. Those examples in the comments below are indeed easy to miss.

>
> > +     if (3 == sscanf(arg_str, " %d @ %ld ( %%%m[^)] ) %n", &arg_sz, &off, &reg_name, &len)) {
> > +             /* -4@-20(%rbp) */
> > +             arg->arg_type = USDT_ARG_REG_DEREF;
> > +             arg->val_off = off;
> > +             reg_off = calc_pt_regs_off(reg_name);
> > +             free(reg_name);
> > +             if (reg_off < 0)
> > +                     return reg_off;
> > +             arg->reg_off = reg_off;

[...]



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux