On Tue, Jan 23, 2024 at 01:41:10PM -0800, dthaler1968@xxxxxxxxxxxxxx wrote:
> > -----Original Message-----
> > From: David Vernet <void@xxxxxxxxxxxxx>
> > Sent: Tuesday, January 23, 2024 1:31 PM
> > To: dthaler1968@xxxxxxxxxxxxxx
> > Cc: bpf@xxxxxxxx; bpf@xxxxxxxxxxxxxxx; jose.marchesi@xxxxxxxxxx
> > Subject: Re: [Bpf] Standardizing BPF assembly language?
> >
> > On Tue, Jan 23, 2024 at 08:45:32AM -0800,
> > dthaler1968=40googlemail.com@xxxxxxxxxxxxxx wrote:
> > > At LSF/MM/BPF 2023, Jose gave a presentation about BPF assembly
> > > language (http://vger.kernel.org/bpfconf2023_material/compiled_bpf.txt).
> > >
> > > Jose wrote in that link:
> > > > There are two dialects of BPF assembler in use today:
> > > >
> > > > - A "pseudo-c" dialect (originally "BPF verifier format")
> > > >   : r1 = *(u64 *)(r2 + 0x00f0)
> > > >   : if r1 > 2 goto label
> > > >   : lock *(u32 *)(r2 + 10) += r3
> > > >
> > > > - An "assembler-like" dialect
> > > >   : ldxdw %r1, [%r2 + 0x00f0]
> > > >   : jgt %r1, 2, label
> > > >   : xaddw [%r2 + 2], r3
> > >
> > > During Jose's talk, I discovered that uBPF didn't quite match the
> > > second dialect and submitted a bug report. By the time the conference
> > > was over, uBPF had been updated to match GCC, so that discussion
> > > worked to reduce the number of variants.
> > >
> > > As more instructions get added and supported by more tools and
> > > compilers there's the risk of even more variants unless it's
> > > standardized.
> > >
> > > Hence I'd recommend that BPF assembly language get documented in some
> > > WG draft. If folks agree with that premise, the first question is
> > > then: which document?
> > >
> > > One possible answer would be the ISA document that specifies the
> > > instructions, since that way the IANA registry could list the
> > > assembly for each instruction, and any future documents that add
> > > instructions would necessarily need to specify the assembly for them,
> > > preventing variants from springing up for new instructions.
> >
> > I'm not opposed to this, but would strongly prefer that we do it as an
> > extension if we go this route to avoid scope creep for the first
> > iteration.
>
> If the first iteration does not have it, then presumably the initial
> IANA registry would not have it either, since this iteration creates
> the registry and the rules for it.
>
> That's doable, but may continue to proliferate more and more variants
> until it is addressed.

The same could be said for any new instructions that are added while we
sort out standardizing the assembly language as well, no?

> If it's in another document, do you agree it would still fall under
> the existing charter bullet about "defining the instructions"
> > [PS] the BPF instruction set architecture (ISA) that defines the
> > instructions and low-level virtual machine for BPF programs,
> ?

I wouldn't say it's illogical to group assembly language in this bucket,
but I would say that defining the assembly language does not need to be
tied at the hip with defining instruction encodings and semantics. So my
answer is "yes, I think it belongs here", but I also don't think it's
necessary or desirable for the first iteration.

> > > A second question would be, which dialect(s) to standardize. Jose's
> > > link above argues that the second dialect should be the one
> > > standardized (tools are free to support multiple dialects for
> > > backwards compat if they want). See the link for rationale.
> >
> > My recollection was that the outcome of that discussion is that we
> > were going to continue to support both. If we wanted to standardize,
> > I have a hard time seeing any other way other than to standardize
> > both dialects unless there's been a significant change in sentiment
> > since LSFMM.
>
> If "standardize both", does that mean neither is mandatory and each tool
> is free to pick one or the other? And would the IANA registry require a
> document adding any new instructions to specify the assembly in both
> dialects?

Well, if we're standardizing on both, then yes I think it would be
mandatory for a tool to support both, and I think instructions would
require assembly for both dialects. Practically speaking that's already
what's happening, no? Both dialects are already pervasive, so it seems
unlikely that a tool would succeed without supporting both regardless.

To Jose's point (pasted below), there are of course drawbacks:

> - Expensive :: it makes it very difficult to reuse infrastructure.
> - Problematic :: dis/assemblers, CGEN, LaTeX, editors, IDEs, etc.
> - Ambiguous :: with both GAS and llvm/MCParser: symbol assignments.
> - Pervasive :: because of the inline asm.

I think it would be a lot simpler to standardize on only a single
dialect, but I also think the standard should reflect how BPF is being
used in practice.