Re: [Bpf] Standardizing BPF assembly language?

On Tue, Jan 23, 2024 at 01:41:10PM -0800, dthaler1968@xxxxxxxxxxxxxx wrote:
> > -----Original Message-----
> > From: David Vernet <void@xxxxxxxxxxxxx>
> > Sent: Tuesday, January 23, 2024 1:31 PM
> > To: dthaler1968@xxxxxxxxxxxxxx
> > Cc: bpf@xxxxxxxx; bpf@xxxxxxxxxxxxxxx; jose.marchesi@xxxxxxxxxx
> > Subject: Re: [Bpf] Standardizing BPF assembly language?
> > 
> > On Tue, Jan 23, 2024 at 08:45:32AM -0800,
> > dthaler1968=40googlemail.com@xxxxxxxxxxxxxx wrote:
> > > At LSF/MM/BPF 2023, Jose gave a presentation about BPF assembly
> > > language (http://vger.kernel.org/bpfconf2023_material/compiled_bpf.txt).
> > >
> > > Jose wrote in that link:
> > > > There are two dialects of BPF assembler in use today:
> > > >
> > > > - A "pseudo-c" dialect (originally "BPF verifier format")
> > > >  : r1 = *(u64 *)(r2 + 0x00f0)
> > > >  : if r1 > 2 goto label
> > > >  : lock *(u32 *)(r2 + 10) += r3
> > > >
> > > > - An "assembler-like" dialect
> > > >  : ldxdw %r1, [%r2 + 0x00f0]
> > > >  : jgt %r1, 2, label
> > > >  : xaddw [%r2 + 2], r3
> > >
> > > During Jose's talk, I discovered that uBPF didn't quite match the
> > > second dialect and submitted a bug report.  By the time the conference
> > > was over, uBPF had been updated to match GCC, so that discussion
> > > worked to reduce the number of variants.
> > >
> > > As more instructions get added and supported by more tools and
> > > compilers there's the risk of even more variants unless it's
> > > standardized.
> > >
> > > Hence I'd recommend that BPF assembly language get documented in some
> > > WG draft.  If folks agree with that premise, the first question is
> > > then: which document?
> > 
> > > One possible answer would be the ISA document that specifies the
> > > instructions, since that way the IANA registry could list the
> > > assembly for each instruction, and any future documents that add
> > > instructions would necessarily need to specify the assembly for them,
> > > preventing variants from springing up for new instructions.
> > 
> > I'm not opposed to this, but would strongly prefer that we do it as an
> > extension if we go this route to avoid scope creep for the first iteration.
> 
> If the first iteration does not have it, then presumably the initial
> IANA registry would not have it either, since this iteration creates
> the registry and the rules for it.
> 
> That's doable, but may continue to proliferate more and more variants
> until it is addressed.

The same could be said for any new instructions that are added while we
sort out standardizing the assembly language as well, no?

> If it's in another document, do you agree it would still fall under
> the existing charter bullet about "defining the instructions"
> > [PS] the BPF instruction set architecture (ISA) that defines the
> > instructions and low-level virtual machine for BPF programs,
> ?

I wouldn't say it's illogical to group assembly language in this bucket,
but I would say that defining the assembly language does not need to be
joined at the hip with defining instruction encodings and semantics. So my
answer is "yes, I think it belongs here", but I also don't think it's
necessary or desirable for the first iteration.

> > > A second question would be, which dialect(s) to standardize.  Jose's
> > > link above argues that the second dialect should be the one
> > > standardized (tools are free to support multiple dialects for
> > > backwards compat if they want).  See the link for rationale.
> > 
> > My recollection was that the outcome of that discussion was that we were
> > going to continue to support both. If we wanted to standardize, I have a
> > hard time seeing any way other than to standardize both dialects unless
> > there's been a significant change in sentiment since LSFMM.
> 
> If "standardize both", does that mean neither is mandatory and each tool
> is free to pick one or the other?  And would the IANA registry require a
> document adding any new instructions to specify the assembly in both
> dialects?

Well, if we're standardizing on both, then yes I think it would be
mandatory for a tool to support both, and I think instructions would
require assembly for both dialects. Practically speaking that's already
what's happening, no? Both dialects are already pervasive, so it seems
unlikely that a tool would succeed without supporting both regardless.
To Jose's point (pasted below), there are of course drawbacks to
supporting both:

> - Expensive :: it makes it very difficult to reuse infrastructure.
> - Problematic :: dis/assemblers, CGEN, LaTeX, editors, IDEs, etc.
> - Ambiguous :: with both GAS and llvm/MCParser: symbol assignments.
> - Pervasive :: because of the inline asm.

I think it would be a lot simpler to standardize on only a single
dialect, but I also think the standard should reflect how BPF is being
used in practice.
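
As a rough illustration of what "assembly for both dialects" per
instruction could look like, here is a minimal Python sketch. It is only
a sketch under stated assumptions: the registry layout and the names
REGISTRY, DIALECTS, render, missing_dialects, and the "jgt_imm" key are
all hypothetical and do not come from any draft, IANA registry, or
existing tool. The templates reuse the first two example instructions
from Jose's notes quoted earlier in the thread.

```python
# Hypothetical registry: every instruction carries one text template per
# dialect. Nothing here reflects a real registry format.
DIALECTS = ("pseudo-c", "asm")

REGISTRY = {
    "ldxdw": {
        "pseudo-c": "r{dst} = *(u64 *)(r{src} + {off})",
        "asm": "ldxdw %r{dst}, [%r{src} + {off}]",
    },
    "jgt_imm": {
        "pseudo-c": "if r{dst} > {imm} goto {label}",
        "asm": "jgt %r{dst}, {imm}, {label}",
    },
}


def missing_dialects(registry):
    """Map each incomplete entry to the dialects it has no template for."""
    gaps = {}
    for name, forms in registry.items():
        absent = [d for d in DIALECTS if d not in forms]
        if absent:
            gaps[name] = absent
    return gaps


def render(name, dialect, **operands):
    """Render one instruction in the requested dialect."""
    return REGISTRY[name][dialect].format(**operands)


if __name__ == "__main__":
    for dialect in DIALECTS:
        print(render("ldxdw", dialect, dst=1, src=2, off="0x00f0"))
        print(render("jgt_imm", dialect, dst=1, imm=2, label="label"))
    # An entry that specifies only one dialect would be flagged here.
    print(missing_dialects({"new_insn": {"asm": "new_insn %r{dst}"}}))
```

A check like missing_dialects is the kind of rule a registry could
enforce if both dialects were mandatory; with a single standardized
dialect the per-instruction table would shrink to one column.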
