Re: [RFC 00/13] kprobe/bpf: Add support to attach multiple kprobes

Masami Hiramatsu <mhiramat@xxxxxxxxxx> · Fri, 7 Jan 2022 14:42:47 +0900

On Thu, 6 Jan 2022 15:57:10 +0100
Jiri Olsa <jolsa@xxxxxxxxxx> wrote:

> On Thu, Jan 06, 2022 at 10:59:43PM +0900, Masami Hiramatsu wrote:
> 
> SNIP
> 
> > > > 
> > > > Hmm, I think there may be a time to split the "kprobe as an 
> > > > interface for the software breakpoint" and "kprobe as a wrapper
> > > > interface for the callbacks of various instrumentations", like
> > > > 'raw_kprobe'(or kswbp) and 'kprobes'.
> > > > And this may be called as 'fprobe' as ftrace_ops wrapper.
> > > > (But if the bpf is enough flexible, this kind of intermediate layer
> > > >  may not be needed, it can use ftrace_ops directly, eventually)
> > > > 
> > > > Jiri, have you already considered to use ftrace_ops from the
> > > > bpf directly? Are there any issues?
> > > > (bpf depends on 'kprobe' widely?)
> > > 
> > > at the moment there's not ftrace public interface for the return
> > > probe merged in, so to get the kretprobe working I had to use
> > > kprobe interface
> > 
> > Yeah, I found that too. We have to ask Steve to salvage it ;)
> 
> I got those patches rebased like half a year ago upstream code,
> so should be easy to revive them

Nice! :)

> 
> > 
> > > but.. there are patches Steven shared some time ago, that do that
> > > and make graph_ops available as kernel interface
> > > 
> > > I recall we considered graph_ops interface before as common attach
> > > layer for trampolines, which was bad, but it might actually make
> > > sense for kprobes
> > 
> > I started working on making 'fprobe' which will provide multiple
> > function probe with similar interface of kprobes. See attached
> > patch. Then you can use it in bpf, maybe with an union like
> > 
> > union {
> > 	struct kprobe kp;	// for function body
> > 	struct fprobe fp;	// for function entry and return
> > };
> > 
> > At this moment, fprobe only support entry_handler, but when we
> > re-start the generic graph_ops interface, it is easy to expand
> > to support exit_handler.
> > If this works, I think kretprobe can be phased out, since at that
> > moment, kprobe_event can replace it with the fprobe exit_handler.
> > (This is a benefit of decoupling the instrumentation layer from
> > the event layer. It can choose the best way without changing
> > user interface.)
> > 
> 
> I can resend out graph_ops patches if you want to base
> it directly on that

Yes, that's very helpful. Now I'm considering to use it (or via fprobe)
from kretprobes like ftrace-based kprobe.

> > > I'll need to check it in more details but I think both graph_ops and
> > > kprobe do about similar thing wrt hooking return probe, so it should
> > > be comparable.. and they are already doing the same for the entry hook,
> > > because kprobe is mostly using ftrace for that
> > > 
> > > we would not need to introduce new program type - kprobe programs
> > > should be able to run from ftrace callbacks just fine
> > 
> > That seems to bind your mind. The program type is just a programing
> > 'model' of the bpf. You can choose the best implementation to provide
> > equal functionality. 'kprobe' in bpf is just a name that you call some
> > instrumentations which can probe kernel code.
> 
> I don't want to introduce new type, there's some dependencies
> in bpf verifier and helpers code we'd need to handle for that
> 
> I'm looking for solution for current kprobe bpf program type
> to be registered for multiple addresses quickly

Yes, as I replied to Alex, the bpf program type itself keeps 'kprobe'.
For example, you've introduced bpf_kprobe_link at [8/13], 

struct bpf_kprobe_link {
	struct bpf_link link;
	union {
		struct kretprobe rp;
		struct fprobe fp;
	};
	bool is_return;
	bool is_fentry;
	kprobe_opcode_t **addrs;
	u32 cnt;
	u64 bpf_cookie;
};

If all "addrs" are function entry, ::fp will be used.
If cnt == 1 then use ::rp.

> > > so we would have:
> > >   - kprobe type programs attaching to:
> > >   - new BPF_LINK_TYPE_FPROBE link using the graph_ops as attachment layer
> > > 
> > > jirka
> > > 
> > 
> > 
> > -- 
> > Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> 
> > From 269b86597c166d6d4c5dd564168237603533165a Mon Sep 17 00:00:00 2001
> > From: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > Date: Thu, 6 Jan 2022 15:40:36 +0900
> > Subject: [PATCH] fprobe: Add ftrace based probe APIs
> > 
> > The fprobe is a wrapper API for ftrace function tracer.
> > Unlike kprobes, this probes only supports the function entry, but
> > it can probe multiple functions by one fprobe. The usage is almost
> > same as the kprobe, user will specify the function names by
> > fprobe::syms, the number of syms by fprobe::nsyms, and the user
> > handler by fprobe::handler.
> > 
> > struct fprobe = { 0 };
> > const char *targets[] = {"func1", "func2", "func3"};
> > 
> > fprobe.handler = user_handler;
> > fprobe.nsyms = ARRAY_SIZE(targets);
> > fprobe.syms = targets;
> > 
> > ret = register_fprobe(&fprobe);
> > ...
> > 
> > 
> > Signed-off-by: Masami Hiramatsu <mhiramat@xxxxxxxxxx>
> > ---
> >  include/linux/fprobes.h |  52 ++++++++++++++++
> >  kernel/trace/Kconfig    |  10 ++++
> >  kernel/trace/Makefile   |   1 +
> >  kernel/trace/fprobes.c  | 128 ++++++++++++++++++++++++++++++++++++++++
> >  4 files changed, 191 insertions(+)
> >  create mode 100644 include/linux/fprobes.h
> >  create mode 100644 kernel/trace/fprobes.c
> > 
> > diff --git a/include/linux/fprobes.h b/include/linux/fprobes.h
> > new file mode 100644
> > index 000000000000..22db748bf491
> > --- /dev/null
> > +++ b/include/linux/fprobes.h
> > @@ -0,0 +1,52 @@
> > +#ifndef _LINUX_FPROBES_H
> > +#define _LINUX_FPROBES_H
> > +/* Simple ftrace probe wrapper */
> > +
> > +#include <linux/compiler.h>
> > +#include <linux/ftrace.h>
> > +
> > +struct fprobe {
> > +	const char		**syms;
> > +	unsigned long		*addrs;
> 
> could you add array of user data for each addr/sym?

OK, something like this?

	void	**user_data;

But note that you need O(N) to search the entry corresponding to
a specific address. To reduce the overhead, we may need to sort
the array in advance (e.g. when registering it).

> 
> SNIP
> 
> > +static int populate_func_addresses(struct fprobe *fp)
> > +{
> > +	unsigned int i;
> > +
> > +	fp->addrs = kmalloc(sizeof(void *) * fp->nsyms, GFP_KERNEL);
> > +	if (!fp->addrs)
> > +		return -ENOMEM;
> > +
> > +	for (i = 0; i < fp->nsyms; i++) {
> > +		fp->addrs[i] = kallsyms_lookup_name(fp->syms[i]);
> > +		if (!fp->addrs[i]) {
> > +			kfree(fp->addrs);
> > +			fp->addrs = NULL;
> > +			return -ENOENT;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/**
> > + * register_fprobe - Register fprobe to ftrace
> > + * @fp: A fprobe data structure to be registered.
> > + *
> > + * This expects the user set @fp::syms or @fp::addrs (not both),
> > + * @fp::nsyms (number of entries of @fp::syms or @fp::addrs) and
> > + * @fp::handler. Other fields are initialized by this function.
> > + */
> > +int register_fprobe(struct fprobe *fp)
> > +{
> > +	unsigned int i;
> > +	int ret;
> > +
> > +	if (!fp)
> > +		return -EINVAL;
> > +
> > +	if (!fp->nsyms || (!fp->syms && !fp->addrs) || (fp->syms && fp->addrs))
> > +		return -EINVAL;
> > +
> > +	if (fp->syms) {
> > +		ret = populate_func_addresses(fp);
> > +		if (ret < 0)
> > +			return ret;
> > +	}
> > +
> > +	fp->ftrace.func = fprobe_handler;
> > +	fp->ftrace.flags = FTRACE_OPS_FL_SAVE_REGS;
> > +
> > +	for (i = 0; i < fp->nsyms; i++) {
> > +		ret = ftrace_set_filter_ip(&fp->ftrace, fp->addrs[i], 0, 0);
> > +		if (ret < 0)
> > +			goto error;
> > +	}
> 
> I introduced ftrace_set_filter_ips, because loop like above was slow:
>   https://lore.kernel.org/bpf/20211118112455.475349-4-jolsa@xxxxxxxxxx/

Ah, thanks for noticing!

Thank you,

> 
> thanks,
> jirka
> 
> > +
> > +	fp->nmissed = 0;
> > +	ret = register_ftrace_function(&fp->ftrace);
> > +	if (!ret)
> > +		return ret;
> > +
> > +error:
> > +	if (fp->syms) {
> > +		kfree(fp->addrs);
> > +		fp->addrs = NULL;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +/**
> > + * unregister_fprobe - Unregister fprobe from ftrace
> > + * @fp: A fprobe data structure to be unregistered.
> > + */
> > +int unregister_fprobe(struct fprobe *fp)
> > +{
> > +	int ret;
> > +
> > +	if (!fp)
> > +		return -EINVAL;
> > +
> > +	if (!fp->nsyms || !fp->addrs)
> > +		return -EINVAL;
> > +
> > +	ret = unregister_ftrace_function(&fp->ftrace);
> > +
> > +	if (fp->syms) {
> > +		/* fp->addrs is allocated by register_fprobe() */
> > +		kfree(fp->addrs);
> > +		fp->addrs = NULL;
> > +	}
> > +
> > +	return ret;
> > +}
> > -- 
> > 2.25.1
> > 
> 

-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>