Re: [PATCH bpf-next v4 14/14] arm64: add a comment that warns that orig_x0 should not be moved

Ilya Leoshkevich <iii@xxxxxxxxxxxxx> · Wed, 09 Feb 2022 00:39:54 +0100

On Tue, 2022-02-08 at 14:23 -0800, Andrii Nakryiko wrote:
> On Tue, Feb 8, 2022 at 1:46 PM Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> wrote:
> > 
> > On Tue, 2022-02-08 at 13:11 -0800, Alexei Starovoitov wrote:
> > > On Tue, Feb 8, 2022 at 11:46 AM Ilya Leoshkevich
> > > <iii@xxxxxxxxxxxxx>
> > > wrote:
> > > > 
> > > > On Tue, 2022-02-08 at 11:25 -0800, Alexei Starovoitov wrote:
> > > > > On Tue, Feb 08, 2022 at 06:16:35AM +0100, Ilya Leoshkevich
> > > > > wrote:
> > > > > > orig_x0's location is used by libbpf tracing macros,
> > > > > > therefore
> > > > > > it
> > > > > > should not be moved.
> > > > > > 
> > > > > > Suggested-by: Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx>
> > > > > > Signed-off-by: Ilya Leoshkevich <iii@xxxxxxxxxxxxx>
> > > > > > ---
> > > > > >  arch/arm64/include/asm/ptrace.h | 4 ++++
> > > > > >  1 file changed, 4 insertions(+)
> > > > > > 
> > > > > > diff --git a/arch/arm64/include/asm/ptrace.h
> > > > > > b/arch/arm64/include/asm/ptrace.h
> > > > > > index 41b332c054ab..7e34c3737839 100644
> > > > > > --- a/arch/arm64/include/asm/ptrace.h
> > > > > > +++ b/arch/arm64/include/asm/ptrace.h
> > > > > > @@ -185,6 +185,10 @@ struct pt_regs {
> > > > > >                         u64 pstate;
> > > > > >                 };
> > > > > >         };
> > > > > > +       /*
> > > > > > +        * orig_x0 is not exposed via struct user_pt_regs,
> > > > > > but
> > > > > > its
> > > > > > location is
> > > > > > +        * assumed by libbpf's tracing macros, so it should
> > > > > > not
> > > > > > be
> > > > > > moved.
> > > > > > +        */
> > > > > 
> > > > > In other words this comment is saying that the layout is ABI.
> > > > > That's not the case. orig_x0 here and equivalent on s390 can
> > > > > be
> > > > > moved.
> > > > > It will break bpf progs written without CO-RE and that is
> > > > > expected.
> > > > > Non CO-RE programs often do all kinds of
> > > > > bpf_probe_read_kernel
> > > > > and
> > > > > will be breaking when kernel layout is changing.
> > > > > I suggest to drop this patch and patch 12.
> > > > 
> > > > Yeah, that was the intention here: to promote orig_x0 to ABI
> > > > using
> > > > a
> > > > comment, since doing this by extending user_pt_regs turned out
> > > > to
> > > > be
> > > > infeasible. I'm actually ok with not doing this, since programs
> > > > compiled with kernel headers and using CO-RE macros will be
> > > > fine.
> > > 
> > > The comment like this doesn't convert kernel internal struct into
> > > ABI.
> > > The comment is just wrong. BPF progs access many kernel data
> > > structs.
> > > s390's and arm64's struct pr_regs is not special in that sense.
> > > It's an internal struct.
> > > 
> > > > As you say, we don't care about programs that don't use CO-RE
> > > > too
> > > > much
> > > > here - if they break after an incompatible kernel change, fine.
> > > 
> > > Before CO-RE was introduced bpf progs included kernel headers
> > > and were breaking when kernel changes. Nothing new here.
> > > See the history of bcc tools. Some of them are littered
> > > with ifdef VERSION ==.
> > > 
> > > > The question now is - how much do we care about programs that
> > > > are
> > > 
> > > > compiled with userspace headers? Andrii suggested to use
> > > > offsetofend to
> > > > make syscall macros work there, however, this now requires this
> > > > ABI
> > > > promotion.
> > > 
> > > Today s390 and arm64 have user_pt_regs as a first field in
> > > pt_regs.
> > > That is kernel internal behavior and that part can change if arch
> > > maintainers have a need for that.
> > > bpf progs without CO-RE would have to be adjusted when kernel
> > > changes.
> > > Even with CO-RE it's ok to rename pt_regs->orig_gpr2 on s390.
> > > The progs with CO-RE will break too. The authors of tracing bpf
> > > progs
> > > have to expect that sooner or later their progs will break and
> > > they
> > > would have to adjust them.
> > 
> > When it comes to authors of tracing bpf progs, I agree that
> > eventually
> > they will have to adjust their code, CO-RE or not. However, in
> > patch 13
> > I introduce the following libbpf macro:
> > 
> > #if defined(__KERNEL__) || defined(__VMLINUX_H__)
> > ...
> > #else
> > #define PT_REGS_PARM1_SYSCALL(x) \
> >         (*(unsigned long *)(((char *)(x) + \
> >                              offsetofend(struct user_pt_regs,
> > pstate))))
> > 
> > If we merge this series without freezing orig_x0's offset, in case
> > of
> > an incompatible kernel change the users of PT_REGS_PARMn_SYSCALL
> > and
> > BPF_KPROBE_SYSCALL, who build against userspace headers, will not
> > simply have to update their code - they will have to upgrade
> > libbpf.
> > It's also not clear how PT_REGS_PARM1_SYSCALL in the upgraded
> > libbpf
> > will even look like, given that it would need to work on both old
> > and
> > new kernels.
> > 
> > I've also briefly looked into MIPS' ptrace.h, and it looks as if
> > their
> > user_pt_regs has no relationship to kernel pt_regs. IIUC this means
> > that it's not possible to correctly implement PT_REGS_PARMn_SYSCALL
> > using MIPS userspace headers.
> > 
> > So I wonder whether we should allow using PT_REGS_PARMn_SYSCALL and
> > BPF_KPROBE_SYSCALL with userspace headers at all? Would it make
> > sense
> > to simply fail the compilation if PT_REGS_PARMn_SYSCALL is used
> > without
> > including kernel headers?
> 
> Ok, my bad for suggesting those comments, I didn't realize the
> consequences of making anything into a stable ABI. Let's not add any
> comments, we don't need that.
> 
> I think we should just come to terms that for some architectures this
> syscall argument access won't work at least for some architectures
> without CO-RE. For uniformity let's still have those
> PT_REGS_PARM1_SYSCALL macro defined, but we should use
> __bpf_unreachable or something like that to make sure it won't
> compile
> if someone tries to use it.

Awesome, this would make quite a few headaches go away.

> 
> But it's an entirely different story for CORE variants and there (as
> I
> explained on one of the previous patches) we can fabricate our own
> definitions of pt_regs (architecture specific, using CO-RE struct
> flavors) without any unnecessary assumptions about which include
> headers the user is going to use. Hengqi's BPF_KPROBE_SYSCALL() macro
> is always going to use CORE variants and will "just work"(tm).
> 
> And because this asymmetry of CORE and non-CORE PT_REGS_PARM_SYSCALL
> definitions, your changes in v4 are a regression from the ones in v3
> which were absolutely fine (I still don't get why you changed all of
> that, I've previously landed v3, which means it was 100% acceptable
> as
> is...), because v4 establishes more rigid relation between CORE and
> non-CORE variants.
> 
> Anyways, let's get back to v3, drop UAPI changes, add struct
> pt_regs__s390x and whatever fields we need, use those with
> BPF_CORE_READ() and it should be ok with no ABI changes whatsoever.

Ok.