Re: [PATCH v2 bpf-next 1/2] libbpf: auto-bump RLIMIT_MEMLOCK if kernel needs it for BPF

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrii Nakryiko <andrii@xxxxxxxxxx> writes:

> The need to increase RLIMIT_MEMLOCK to do anything useful with BPF is
> one of the first extremely frustrating gotchas that all new BPF users go
> through and in some cases have to learn it a very hard way.
>
> Luckily, starting with upstream Linux kernel version 5.11, BPF subsystem
> dropped the dependency on memlock and uses memcg-based memory accounting
> instead. Unfortunately, detecting memcg-based BPF memory accounting is
> far from trivial (as can be evidenced by this patch), so in practice
> most BPF applications still do unconditional RLIMIT_MEMLOCK increase.
>
> As we move towards libbpf 1.0, it would be good to allow users to forget
> about RLIMIT_MEMLOCK vs memcg and let libbpf do the sensible adjustment
> automatically. This patch paves the way forward in this matter. Libbpf
> will do feature detection of memcg-based accounting, and if detected,
> will do nothing. But if the kernel is too old, just like BCC, libbpf
> will automatically increase RLIMIT_MEMLOCK on behalf of user
> application ([0]).
>
> As this is technically a breaking change, during the transition period
> applications have to opt into libbpf 1.0 mode by setting
> LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK bit when calling
> libbpf_set_strict_mode().
>
> Libbpf allows to control the exact amount of set RLIMIT_MEMLOCK limit
> with libbpf_set_memlock_rlim_max() API. Passing 0 will make libbpf do
> nothing with RLIMIT_MEMLOCK. libbpf_set_memlock_rlim_max() has to be
> called before the first bpf_prog_load(), bpf_btf_load(), or
> bpf_object__load() call, otherwise it has no effect and will return
> -EBUSY.
>
>   [0] Closes: https://github.com/libbpf/libbpf/issues/369
>
> Signed-off-by: Andrii Nakryiko <andrii@xxxxxxxxxx>

The probing approach breaks with out-of-order backports, I suppose.
Hopefully no one will do those for that particular patch, though (it's
not really a bugfix), and at least for RHEL we did backport them
together.

Can't think of any better ways of doing the detection either, but maybe
something to be aware of in the future (i.e., "don't change things in a
way that can't be detected from userspace")?

Anyway, with the nits below:

Acked-by: Toke Høiland-Jørgensen <toke@xxxxxxxxxx>

> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 6b2407e12060..7c82136979bf 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c

[...]

> +	/* attempt loading freplace trying to use custom BTF */
> +	memset(&attr, 0, bpf_load_attr_sz);
> +	attr.prog_type = BPF_PROG_TYPE_TRACING;
> +	attr.expected_attach_type = BPF_TRACE_FENTRY;

This comment also seems to be disagreeing with the code it's commenting
on?

[...]

> +static bool memlock_bumped;
> +static rlim_t memlock_rlim_max = RLIM_INFINITY;
> +
> +int libbpf_set_memlock_rlim_max(size_t memlock_max)
> +{
> +	if (memlock_bumped)
> +		return libbpf_err(-EBUSY);
> +
> +	memlock_rlim_max = memlock_max;
> +	return 0;
> +}
> +
> +int bump_rlimit_memlock(void)
> +{
> +	struct rlimit rlim;
> +
> +	/* this the default in libbpf 1.0, but for now user has to opt-in explicitly */
> +	if (!(libbpf_mode & LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK))
> +		return 0;
> +
> +	/* if kernel supports memcg-based accounting, skip bumping RLIMIT_MEMLOCK */
> +	if (memlock_bumped || kernel_supports(NULL, FEAT_MEMCG_ACCOUNT))
> +		return 0;
> +
> +	memlock_bumped = true;
> +
> +	/* zero memlock_rlim_max disables auto-bumping RLIMIT_MEMLOCK */
> +	if (memlock_rlim_max == 0)
> +		return 0;
> +
> +	rlim.rlim_cur = rlim.rlim_max = memlock_rlim_max;
> +	if (setrlimit(RLIMIT_MEMLOCK, &rlim))
> +		return -errno;
> +
> +	return 0;
> +}
> +

"rlim_max" seems to imply this will only ever increase the limit, but if
I'm reading the code correctly it could actually end up lowering the
effective limit?

-Toke





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux