On 9/11/19 7:18 PM, Christian Barcenas wrote:
> A process can lock memory addresses into physical RAM explicitly
> (via mlock, mlockall, shmctl, etc.) or implicitly (via VFIO,
> perf ring-buffers, bpf maps, etc.), subject to RLIMIT_MEMLOCK limits.
>
> CAP_IPC_LOCK allows a process to exceed these limits, and throughout
> the kernel this capability is checked before allowing/denying an attempt
> to lock memory regions into RAM.
>
> Because bpf locks its programs and maps into RAM, it should respect
> CAP_IPC_LOCK. Previously, bpf would return EPERM when RLIMIT_MEMLOCK was
> exceeded by a privileged process, which is contrary to documented
> RLIMIT_MEMLOCK+CAP_IPC_LOCK behavior.
>
> Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> Signed-off-by: Christian Barcenas <christian@xxxxxxxxxxxxx>

Acked-by: Yonghong Song <yhs@xxxxxx>

> ---
>  kernel/bpf/syscall.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 272071e9112f..e551961f364b 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -183,8 +183,9 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr)
>  static int bpf_charge_memlock(struct user_struct *user, u32 pages)
>  {
>  	unsigned long memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
> +	unsigned long locked = atomic_long_add_return(pages, &user->locked_vm);
>
> -	if (atomic_long_add_return(pages, &user->locked_vm) > memlock_limit) {
> +	if (locked > memlock_limit && !capable(CAP_IPC_LOCK)) {
>  		atomic_long_sub(pages, &user->locked_vm);
>  		return -EPERM;
>  	}
> @@ -1231,7 +1232,7 @@ int __bpf_prog_charge(struct user_struct *user, u32 pages)
>
>  	if (user) {
>  		user_bufs = atomic_long_add_return(pages, &user->locked_vm);
> -		if (user_bufs > memlock_limit) {
> +		if (user_bufs > memlock_limit && !capable(CAP_IPC_LOCK)) {
>  			atomic_long_sub(pages, &user->locked_vm);
>  			return -EPERM;
>  		}
>
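
For anyone who wants to exercise the changed check outside the kernel, here is a minimal userspace sketch of the logic in the patched bpf_charge_memlock(). It is not the kernel code and only models the control flow. The names struct user_model, charge_memlock_model(), uncharge_memlock_model() and the has_ipc_lock flag are hypothetical stand-ins for struct user_struct, the charge path, and the result of capable(CAP_IPC_LOCK). The limit is assumed to be passed in already converted to pages.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct user_struct. */
struct user_model {
	atomic_long locked_vm;	/* pages currently charged to this user */
};

/*
 * Models the patched check: charge first, then roll back only when the
 * new total exceeds the limit AND the caller lacks the capability.
 * has_ipc_lock is an assumption standing in for capable(CAP_IPC_LOCK).
 */
int charge_memlock_model(struct user_model *user, long pages,
			 long memlock_limit_pages, bool has_ipc_lock)
{
	/* atomic_fetch_add() returns the old value; add pages to get the
	 * new total, mirroring what atomic_long_add_return() yields. */
	long locked = atomic_fetch_add(&user->locked_vm, pages) + pages;

	if (locked > memlock_limit_pages && !has_ipc_lock) {
		atomic_fetch_sub(&user->locked_vm, pages);
		return -EPERM;
	}
	return 0;
}

/* Releases a previous successful charge. */
void uncharge_memlock_model(struct user_model *user, long pages)
{
	long before = atomic_fetch_sub(&user->locked_vm, pages);

	/* Uncharging more than was charged would be a caller bug. */
	assert(before >= pages);
}
```

With a limit of 16 pages, charging 8 and then 16 more returns -EPERM for a caller without the capability and leaves the counter at 8. With has_ipc_lock set, the same second charge succeeds and the counter goes to 24, so the pages are still accounted even though the limit is exceeded.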