Re: [PATCH net v4] bpf, net: Check cgroup_bpf_enabled() only once in do_sock_getsockopt()

Tze-nan Wu (吳澤南) <Tze-nan.Wu@xxxxxxxxxxxx> · Thu, 22 Aug 2024 07:01:57 +0000



On Thu, 2024-08-22 at 11:16 +0800, Tze-nan Wu wrote:
> On Wed, 2024-08-21 at 14:01 -0700, Alexei Starovoitov wrote:
> >  	 
> > External email : Please do not click links or open attachments
> > until
> > you have verified the sender or the content.
> >  On Wed, Aug 21, 2024 at 2:30 AM Tze-nan Wu <
> > Tze-nan.Wu@xxxxxxxxxxxx>
> > wrote:
> > > 
> > > The return value from `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` can
> > 
> > change
> > > between the invocations of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`.
> > > 
> > > If `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` changes from "false"
> > > to
> > > "true" between the invocations of
> > 
> > `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` and
> > > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`,
> > > `BPF_CGROUP_RUN_PROG_GETSOCKOPT`
> > 
> > will
> > > receive an -EFAULT from
> > 
> > `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`
> > > due to `get_user()` was not reached in
> > 
> > `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`.
> > > 
> > > Scenario shown as below:
> > > 
> > >            `process A`                      `process B`
> > >            -----------                      ------------
> > >   BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
> > >                                             enable
> > 
> > CGROUP_GETSOCKOPT
> > >   BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
> > > 
> > > To prevent this, invoke `cgroup_bpf_enabled()` only once and
> > > cache
> > 
> > the
> > > result in a newly added local variable `enabled`.
> > > Both `BPF_CGROUP_*` macros in `do_sock_getsockopt` will then
> > > check
> > 
> > their
> > > condition using the same `enabled` variable as the condition
> > 
> > variable,
> > > instead of using the return values from `cgroup_bpf_enabled`
> > > called
> > 
> > by
> > > themselves as the condition variable(which could yield different
> > 
> > results).
> > > This ensures that either both `BPF_CGROUP_*` macros pass the
> > 
> > condition
> > > or neither does.
> > > 
> > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt
> > 
> > hooks")
> > > Co-developed-by: Yanghui Li <yanghui.li@xxxxxxxxxxxx>
> > > Signed-off-by: Yanghui Li <yanghui.li@xxxxxxxxxxxx>
> > > Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@xxxxxxxxxxxx>
> > > Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@xxxxxxxxxxxx>
> > > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@xxxxxxxxxxxx>
> > > ---
> > > 
> > > Chagnes from v1 to v2: 
> > 
> > 
https://lore.kernel.org/all/20240819082513.27176-1-Tze-nan.Wu@xxxxxxxxxxxx/
> > >   Instead of using cgroup_lock in the fastpath, invoke
> > 
> > cgroup_bpf_enabled
> > >   only once and cache the value in the newly added variable
> > 
> > `enabled`.
> > >   `BPF_CGROUP_*` macros in do_sock_getsockopt can then both check
> > 
> > their
> > >   condition with the new variable `enable`, ensuring that either
> > 
> > they both
> > >   passing the condition or both do not.
> > > 
> > > Chagnes from v2 to v3: 
> > 
> > 
https://lore.kernel.org/all/20240819155627.1367-1-Tze-nan.Wu@xxxxxxxxxxxx/
> > >   Hide cgroup_bpf_enabled in the macro, and some modifications to
> > 
> > adapt
> > >   the coding style.
> > > 
> > > Chagnes from v3 to v4: 
> > 
> > 
https://lore.kernel.org/all/20240820092942.16654-1-Tze-nan.Wu@xxxxxxxxxxxx/
> > >   Add bpf tag to subject, and Fixes tag in body.
> > > 
> > > ---
> > >  include/linux/bpf-cgroup.h | 15 ++++++++-------
> > >  net/socket.c               |  5 +++--
> > >  2 files changed, 11 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-
> > 
> > cgroup.h
> > > index fb3c3e7181e6..5afa2ac76aae 100644
> > > --- a/include/linux/bpf-cgroup.h
> > > +++ b/include/linux/bpf-cgroup.h
> > > @@ -390,20 +390,20 @@ static inline bool
> > 
> > cgroup_bpf_sock_enabled(struct sock *sk,
> > >         __ret;                                                   
> > >   
> > 
> >             \
> > >  })
> > > 
> > > -#define
> > 
> > BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)                           
> >   
> >  \
> > > +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
> > 
> > enabled)                     \
> > >  ({                                                              
> > >   
> > 
> >             \
> > >         int __ret =
> > 
> > 0;                                                         \
> > > -       if
> > 
> > (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))                            
> >  \
> > > +       enabled =
> > 
> > cgroup_bpf_enabled(CGROUP_GETSOCKOPT);                       \
> > > +       if (enabled)
> > 
> > 
> > I suspect the compiler generates slow code after such a patch.
> > pw-bot: cr
> > 
> > What is the problem with double cgroup_bpf_enabled() check?
> > yes it might return two different values, so?

> Depending on where the -EFAULT occurs, the problem could be
> different.
> In our case, the -EFAULT is returned from getsockopt() during a
> "bootup-critical property setting" flow in Android. As a result, the
> property setting fails due it get -EFAULT from getsockopt(), causing
> the device to fail the boot process.
> 
> Should the userspace caller always anticipate an -EFAULT from
> getsockopt() if there's another process enables CGROUP_GETSOCKOPT
> (possibly through the bpf() syscall) at the same time?
> If that's the case, then I will handle this in userspace.
> 

BTW, If this should be handled in kernel, modification shown below
could fix the issue without breaking the "static_branch" usage in both
macros:


+++ /include/linux/bpf-cgroup.h:
    -#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen)
    +#define BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen, compat)
     ({
            int __ret = 0;
            if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT))
                copy_from_sockptr(&__ret, optlen, sizeof(int));
     +      else
     +          *compat = true;
            __ret;
     })

    #define BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock, level, optname,
optval, optlen, max_optlen, retval)
     ({
         int __ret = retval;
    -    if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT) &&
    -        cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
    +    if (cgroup_bpf_sock_enabled(sock, CGROUP_GETSOCKOPT))
             if (!(sock)->sk_prot->bpf_bypass_getsockopt ||
               ...

  +++ /net/socket.c:
    int do_sock_getsockopt(struct socket *sock, bool compat, int level,
     {
        ...
        ...
    +     /* The meaning of `compat` variable could be changed here
    +      * to indicate if cgroup_bpf_enabled(CGROUP_SOCK_OPS) is
false.
    +      */
        if (!compat)
    -       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen);
    +       max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen,
&compat);

> Thanks,
> --tze-nan