Re: [PATCH bpf-next 3/6] bpf: Don't EFAULT for {g,s}setsockopt with wrong optlen

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 25, 2023 at 11:31 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> On 4/25/23 10:12 AM, Stanislav Fomichev wrote:
> > On Mon, Apr 24, 2023 at 5:56 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> >>
> >> On 4/21/23 9:09 AM, Stanislav Fomichev wrote:
> >>> On Fri, Apr 21, 2023 at 8:24 AM Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 4/19/23 12:53 AM, Stanislav Fomichev wrote:
> >>>>> Over time, we've found out several special socket option cases which need
> >>>>> special treatment. And if BPF program doesn't handle them correctly, this
> >>>>> might EFAULT perfectly valid {g,s}setsockopt calls.
> >>>>>
> >>>>> The intention of the EFAULT was to make it apparent to the
> >>>>> developers that the program is doing something wrong.
> >>>>> However, this inadvertently might affect production workloads
> >>>>> with the BPF programs that are not too careful.
> >>>>
> >>>> Took in the first two for now. It would be good if the commit description
> >>>> in here could have more details for posterity given this is too vague.
> >>>
> >>> Thanks! Will try to repost next week with more details.
> >>>
> >>>>> Let's try to minimize the chance of BPF program screwing up userspace
> >>>>> by ignoring the output of those BPF programs (instead of returning
> >>>>> EFAULT to the userspace). pr_info_ratelimited those cases to
> >>>>> the dmesg to help with figuring out what's going wrong.
> >>>>>
> >>>>> Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
> >>>>> Suggested-by: Martin KaFai Lau <martin.lau@xxxxxxxxxx>
> >>>>> Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> >>>>> ---
> >>>>>     kernel/bpf/cgroup.c | 8 ++++++--
> >>>>>     1 file changed, 6 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> >>>>> index a06e118a9be5..af4d20864fb4 100644
> >>>>> --- a/kernel/bpf/cgroup.c
> >>>>> +++ b/kernel/bpf/cgroup.c
> >>>>> @@ -1826,7 +1826,9 @@ int __cgroup_bpf_run_filter_setsockopt(struct sock *sk, int *level,
> >>>>>                 ret = 1;
> >>>>>         } else if (ctx.optlen > max_optlen || ctx.optlen < -1) {
> >>>>>                 /* optlen is out of bounds */
> >>>>> -             ret = -EFAULT;
> >>>>> +             pr_info_ratelimited(
> >>>>> +                     "bpf setsockopt returned unexpected optlen=%d (max_optlen=%d)\n",
> >>>>> +                     ctx.optlen, max_optlen);
> >>>>
> >>>> Does it help any regular user if this log message is seen? I kind of doubt it a bit,
> >>>> it might create more confusion if log gets spammed with it, imo.
> >>>
> >>> Agreed, but we need some way to let the users know that their bpf
> >>> program is doing the wrong thing (if they set the optlen too high for
> >>> example).
> >>
> >> imo, I also think a printk here will be a noise in dmesg most of the time (but
> >> much better than an unexpected -EFAULT).
> >
> > I was thinking for a v2, maybe print it at least once? Similar to
> > current bpf_warn_invalid_xdp_action?
> >
> >
> >>> Any other better alternatives to expose those events?
> >>
> >> Is it possible to only -EFAULT when the bpf prog setting a ctx.optlen larger
> >> than the "original" user provided optlen?
>
> Nevermind the "ctx.optlen larger than the original user provided optlen" part. I
> mis-read something in __cgroup_bpf_run_filter_getsockopt(). max_optlen is the
> right limit that the kernel needs to bound the ctx.optlen written by bpf prog.
>
> >> Ignore for all other cases that is due to the current PAGE_SIZE limitation?
> >
> > Should be possible. That "ctx.optlen > PAGE_SIZE && ctx.optlen <
> > original_optlen" is the condition where we'd silently ignore BPF
> > output.
>
> and should the -ve ctx.optlen be treated separately? like in
> __cgroup_bpf_run_filter_getsockopt():
>
>         if (ctx.optlen < 0) {
>                 ret = -EFAULT;
>                 goto out;
>         }
>
>         if (optval && ctx.optlen > max_optlen) {
>                 ret = original_optlen > PAGE_SIZE ? 0 : -EFAULT;
>                 goto out;
>         }

Good suggestion. Let me try to distinguish between the two cases:
- bpf prog sets optlen larger than the original_optlen -> EFAULT
- bpf prog forgets to reset optlen to 0 for large optlen -> ignore

> > As per above, I'll stick a line to the dmest (similar
> > bpf_warn_invalid_xdp_action), at least to record that this has
> > happened once.
> > LMK if you or Danial still don't see a value in printing this..
>
> pr_info_once? hmm... I think it is ok-ish. At least not a warning.
>
> I think almost all of the time the bpf prog forgets to set it to 0 for the long
> optval that it has no interest in. However, to suppress this pr_info_once,
> setting optlen to 0 will disable the following cgroup-bpf prog from using the
> optval as read-only. The case that the printk that may flag is that the bpf prog
> did indeed want to change the long optval?

The case we want to printk is where the prog changes some byte in the
first 4k range of the optval and does not touch optlen (or maybe
adjusts optlen to be >PAGE_SIZE and <original_optlen).
I agree that it feels super corner-casy; but it feels like without
some kind of hint, it would be impossible to figure out why it doesn't
work. Or am I overblowing it?




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux