On Mon, 11 Apr 2022, Andrii Nakryiko wrote: > On Fri, Apr 8, 2022 at 3:53 PM Alan Maguire <alan.maguire@xxxxxxxxxx> wrote: > > > > Parsing of USDT arguments is architecture-specific; on aarch64 it is > > relatively easy since registers used are x[0-31], sp. Format is > > slightly different compared to x86_64; forms are > > > > - "size @ [ reg[,offset] ]" for dereferences, for example > > "-8 @ [ sp, 76 ]" ; " -4 @ [ sp ]" > > - "size @ reg" for register values; for example > > "-4@x0" > > - "size @ value" for raw values; for example > > "-8@1" > > > > Signed-off-by: Alan Maguire <alan.maguire@xxxxxxxxxx> > > --- > > tools/lib/bpf/usdt.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++- > > 1 file changed, 49 insertions(+), 1 deletion(-) > > > > diff --git a/tools/lib/bpf/usdt.c b/tools/lib/bpf/usdt.c > > index 0677bbd..6165d40 100644 > > --- a/tools/lib/bpf/usdt.c > > +++ b/tools/lib/bpf/usdt.c > > @@ -1170,7 +1170,7 @@ static int parse_usdt_spec(struct usdt_spec *spec, const struct usdt_note *note, > > > > /* Architecture-specific logic for parsing USDT argument location specs */ > > > > -#if defined(__x86_64__) || defined(__i386__) || defined(__s390x__) > > +#if defined(__x86_64__) || defined(__i386__) || defined(__s390x__) || defined(__aarch64__) > > > > static int init_usdt_arg_spec(struct usdt_arg_spec *arg, enum usdt_arg_type arg_type, int arg_sz, > > __u64 val_off, int reg_off) > > @@ -1316,6 +1316,54 @@ static int parse_usdt_arg(const char *arg_str, int arg_num, struct usdt_arg_spec > > return len; > > } > > > > +#elif defined(__aarch64__) > > + > > +static int calc_pt_regs_off(const char *reg_name) > > +{ > > + int reg_num; > > + > > + if (sscanf(reg_name, "x%d", ®_num) == 1) { > > + if (reg_num >= 0 && reg_num < 31) > > + return offsetof(struct user_pt_regs, regs[reg_num]); > > + } else if (strcmp(reg_name, "sp") == 0) { > > + return offsetof(struct user_pt_regs, sp); > > + } > > + pr_warn("usdt: unrecognized register '%s'\n", reg_name); > > + return -ENOENT; > > +} > > + > > +static int parse_usdt_arg(const char *arg_str, int arg_num, struct usdt_arg_spec *arg) > > +{ > > + char *reg_name = NULL; > > + int arg_sz, len, ret; > > + long off = 0; > > + > > + if (sscanf(arg_str, " %d @ \[ %m[^,], %ld ] %n", &arg_sz, ®_name, &off, &len) == 3 || > > + sscanf(arg_str, " %d @ \[ %m[a-z0-9] ] %n", &arg_sz, ®_name, &len) == 2) { > > I'm not sure about the behavior here w.r.t. reg_name and memory > allocation. What if first sscanf() matches reg_name but fails at %ld, > will reg_name be allocated and then second sscanf() will reallocate > (and thus we'll have a memory leak). > > We might have similar problems in other implementations, actually... > > Either way, came here to ask to split two sscanfs into two separate > branches, so that we have a clear linear pattern. One sscanf, handle > it if successful, otherwise move on to next case. > good point; I'll separate the sscanfs into branches for v2. > Also a question about [a-z0-9] for register in one case and [^,] in > another. Should the first one be [a-z0-9] as well? > probably no harm, yep. I'll drop the refactoring patch too; I was a bit worried I'd break Ilya's s390 code anyhow. Thanks! Alan