On Wed, Sep 21, 2022 at 10:00 PM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote: > > On Wed, Sep 21, 2022 at 06:50:54AM -0700, Alexei Starovoitov wrote: > > On Wed, Sep 21, 2022 at 1:34 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote: > > > > > > Subject changed to reflect this reply is out of scope of the original topic > > > (ISA doc). > > > > > > On Tue, Sep 20, 2022 at 04:39:52PM -0700, Alexei Starovoitov wrote: > > > > On Tue, Sep 20, 2022 at 12:51 PM Dave Thaler <dthaler@xxxxxxxxxxxxx> wrote: > > > > > > -----Original Message----- > > > > > > From: Christoph Hellwig <hch@xxxxxxxxxxxxx> > > > > > > Sent: Monday, September 19, 2022 10:04 AM > > > > > > To: Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> > > > > > > Cc: Dave Thaler <dthaler@xxxxxxxxxxxxx>; bpf <bpf@xxxxxxxxxxxxxxx> > > > > > > Subject: Re: FW: ebpf-docs: draft of ISA doc updates in progress > > > > > > > > > > > > On Wed, Sep 14, 2022 at 02:22:51PM +0800, Shung-Hsi Yu wrote: > > > > > > > As discussed in yesterday's session, there's no graceful abortion on > > > > > > > division by zero, instead, the BPF verifier in Linux prevents division > > > > > > > by zero from happening. Here a few additional notes: > > > > > > > > > > > > Hmm, I thought Alexei pointed out a while ago that divide by zero is now > > > > > > defined to return 0 following. Ok, reading further along I think that is what > > > > > > you describe with the pseudo-code below. > > > > > > > > > > Based on the discussion at LPC, and the fact that older implementations, > > > > > as well as uBPF and rbpf still terminate the program, I've added this text > > > > > to permit both behaviors: > > > > > > > > That's not right. ubpf and rbpf are broken. > > > > We shouldn't be adding descriptions of broken implementations > > > > to the standard. > > > > There is no way to 'gracefully abort' in eBPF. > > > > There is a way to 'return 0' in cBPF, but that's different. See below. > > > > > > > > > > If eBPF program execution would result in division by zero, > > > > > > the destination register SHOULD instead be set to zero, but > > > > > > program execution MAY be gracefully aborted instead. > > > > > > Similarly, if execution would result in modulo by zero, > > > > > > the destination register SHOULD instead be set to the source value, > > > > > > but program execution MAY be gracefully aborted instead. > > > > > > While this makes implementation a lot easier, come to think of it though, > > > this behavior actually is more like a hack around having to deal with > > > division/modulo-on-zero at runtime. User doing statistic calculations with > > > BPF will get bit by this sooner or later, arriving at the wrong calculation > > > (fast-math comes to mind). > > > > lol. If that was the case arm64 would be on fire long ago > > and users would complain in masses. > > Same with risc-v. > > whoa, I had no idea. > > And looking around I don't see complains. Taking what I said back and +1 for > using the current division/modulo-by-zero behavior as standard. > > > > This seems to go against some general BPF principle[1] of preventing the > > > users from shooting themselves in the foot. > > > > > > Just like how BPF verifier prevents a _possible_ out-of-bound memory access, > > > e.g. arr[i] when `i` is not bound-checked. Ideally I'd expect a coherent > > > approach toward division/modulo-on-zero as well; the verifier should prevent > > > program that _might_ do division-on-zero from loading in the first place. > > > (Maybe possible to achieve if we introduce something like SCALAR_OR_NULL > > > composed type, but that's definitely not easy) > > > > > > Admittedly even if achievable, this is a radical approach that is not > > > backward compatible. If such check is implemented, programs that used to > > > load may now be rejected. (Usually new BPF verifier feature allows more BPF > > > program to pass the verifier, rather then the other way around) > > > > > > So, I don't have a good proposal at the moment. The purpose to this email is > > > to point what I see as an issue out and hope to start a discussion. > > > > > > 1: Okay, I'm making this up a bit, strictly speaking the BPF foundation is > > > safe program (one of Alexei's talk) rather than preventing users from > > > shooting themselves in the foot. > > > > Safe != invalid. > > BPF doesn't prevent invalid programs. > > BPF ensures safety only, not crashing the kernel, not leaking data, etc. > > For example: under speculation arr[i] can go out of bounds > > and to prevent data leaks we insert masking operations. > > Point taken, thanks for clarifying the difference between safe and invalid. > > > Similar with div_by_0. If the verifier can detect that it will reject > > the prog, otherwise it will insert if(==0) xor, because not > > all architectures behave like arm64. > > Speaking of which, should the "if(==0) xor" patching in do_misc_fixups() be > moved into JIT implementations and the interpretor? > > Given that the standard now mandates BPF_DIV with divisor of zero to return > zero, it would be rather confusing to see the output of `bpftool prog dump > xlated` contains the extra "if(==0) xor" instruction that's inserted by the > verifier. > > Also, maybe there'll be performance benefit for arm64 and riscv where > "if(==0) xor" is not needed. Tiny perf gain is not worth extra complexity in JITs. Integer divide is the slowest operation in a CPU. We're trying to do everything in the verifier and as little as possible in JITs.