On Thu, Aug 17, 2023 at 8:30 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote: > > On 8/17/23 9:09 AM, Yafang Shao wrote: > > On Thu, Aug 17, 2023 at 11:31 AM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > >> On Wed, Aug 16, 2023 at 7:31 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > >>> On Thu, Aug 17, 2023 at 9:54 AM Alexei Starovoitov > >>> <alexei.starovoitov@xxxxxxxxx> wrote: > >>>> On Mon, Aug 14, 2023 at 7:46 PM Yafang Shao <laoar.shao@xxxxxxxxx> wrote: > >>>>> On Tue, Aug 15, 2023 at 8:28 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote: > >>>>>> On 8/14/23 7:33 AM, Yafang Shao wrote: > >>>>>>> Add a new bpf_current_capable kfunc to check whether the current task > >>>>>>> has a specific capability. In our use case, we will use it in a lsm bpf > >>>>>>> program to help identify if the user operation is permitted. > >>>>>>> > >>>>>>> Signed-off-by: Yafang Shao <laoar.shao@xxxxxxxxx> > >>>>>>> --- > >>>>>>> kernel/bpf/helpers.c | 6 ++++++ > >>>>>>> 1 file changed, 6 insertions(+) > >>>>>>> > >>>>>>> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > >>>>>>> index eb91cae..bbee7ea 100644 > >>>>>>> --- a/kernel/bpf/helpers.c > >>>>>>> +++ b/kernel/bpf/helpers.c > >>>>>>> @@ -2429,6 +2429,11 @@ __bpf_kfunc void bpf_rcu_read_unlock(void) > >>>>>>> rcu_read_unlock(); > >>>>>>> } > >>>>>>> > >>>>>>> +__bpf_kfunc bool bpf_current_capable(int cap) > >>>>>>> +{ > >>>>>>> + return has_capability(current, cap); > >>>>>>> +} > >>>>>> > >>>>>> Since you are testing against 'current' capabilities, I assume > >>>>>> that the context should be process. Otherwise, you are testing > >>>>>> against random task which does not make much sense. > >>>>> > >>>>> It is in the process context. > >>>>> > >>>>>> Since you are testing against 'current' cap, and if the capability > >>>>>> for that task is stable, you do not need this kfunc. > >>>>>> You can test cap in user space and pass it into the bpf program. > >>>>>> > >>>>>> But if the cap for your process may change in the middle of > >>>>>> run, then you indeed need bpf prog to test capability in real time. > >>>>>> Is this your use case and could you describe in in more detail? > >>>>> > >>>>> After we convert the capability of our networking bpf program from > >>>>> CAP_SYS_ADMIN to CAP_BPF+CAP_NET_ADMIN to enhance the security, we > >>>>> encountered the "pointer comparison prohibited" error, because > >>>>> allow_ptr_leaks is enabled only when CAP_PERFMON is set. However, if > >>>>> we enable the CAP_PERFMON for the networking bpf program, it can run > >>>>> tracing bpf prog, perf_event bpf prog and etc, that is not expected by > >>>>> us. > >>>>> > >>>>> Hence we are planning to use a lsm bpf program to disallow it from > >>>>> running other bpf programs. In our lsm bpf program we will check the > >>>>> capability of processes, if the process has cap_net_admin, cap_bpf and > >>>>> cap_perfmon but don't have cap_sys_admin we will refuse it to run > >>>>> tracing and perf_event bpf program. While if a process has cap_bpf > >>>>> and cap_perfmon but doesn't have cap_net_admin, that said it is a bpf > >>>>> program which wants to run trace bpf, we will allow it. > >>>>> > >>>>> We can't use lsm_cgroup because it is supported on cgroup2 only, while > >>>>> we're still using cgroup1. > >>>>> > >>>>> Another possible solution is enable allow_ptr_leaks for cap_net_admin > >>>>> as well, but after I checked the commit which introduces the cap_bpf > >>>>> and cap_perfmon [1], I think we wouldn't like to do it. > >>>> > >>>> Sorry. None of these options are acceptable. > >>>> > >>>> The idea of introducing a bpf_current_capable() kfunc just to work > >>>> around a deficiency in the verifier is not sound. > >>> > >>> So what should we do then? > >>> Just enabling the cap_perfmon for it? That does not sound as well ... > >> > >> Yonghong already pointed out upthread that > >> comparison of two packet pointers is not a pointer leak. > >> See this code: > >> } else if (!try_match_pkt_pointers(insn, dst_reg, ®s[insn->src_reg], > >> this_branch, other_branch) && > >> is_pointer_value(env, insn->dst_reg)) { > >> verbose(env, "R%d pointer comparison prohibited\n", > >> insn->dst_reg); > >> return -EACCES; > >> } > >> > >> It's not clear why it doesn't address your case. > > > > It can address the issue. > > It seems we should do the code change below. > > I presume in your actual program you also access the IP header (otherwise it > would be quite useless), and as you mentioned above, you would like this to be > accessible for just CAP_BPF + CAP_NET_ADMIN. The CAP_PERFMON restriction is > there for a reason, that is, to be on same cap level as tracing programs as you > could craft Spectre v1 style access. It's not a deficiency. Probably not. To mount v1 style attack the load of data_end needs to be attacker controlled which isn't the case here.