On Wed, Sep 21, 2022 at 1:34 AM Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> wrote: > > Subject changed to reflect this reply is out of scope of the original topic > (ISA doc). > > On Tue, Sep 20, 2022 at 04:39:52PM -0700, Alexei Starovoitov wrote: > > On Tue, Sep 20, 2022 at 12:51 PM Dave Thaler <dthaler@xxxxxxxxxxxxx> wrote: > > > > -----Original Message----- > > > > From: Christoph Hellwig <hch@xxxxxxxxxxxxx> > > > > Sent: Monday, September 19, 2022 10:04 AM > > > > To: Shung-Hsi Yu <shung-hsi.yu@xxxxxxxx> > > > > Cc: Dave Thaler <dthaler@xxxxxxxxxxxxx>; bpf <bpf@xxxxxxxxxxxxxxx> > > > > Subject: Re: FW: ebpf-docs: draft of ISA doc updates in progress > > > > > > > > On Wed, Sep 14, 2022 at 02:22:51PM +0800, Shung-Hsi Yu wrote: > > > > > As discussed in yesterday's session, there's no graceful abortion on > > > > > division by zero, instead, the BPF verifier in Linux prevents division > > > > > by zero from happening. Here a few additional notes: > > > > > > > > Hmm, I thought Alexei pointed out a while ago that divide by zero is now > > > > defined to return 0 following. Ok, reading further along I think that is what > > > > you describe with the pseudo-code below. > > > > > > Based on the discussion at LPC, and the fact that older implementations, > > > as well as uBPF and rbpf still terminate the program, I've added this text > > > to permit both behaviors: > > > > That's not right. ubpf and rbpf are broken. > > We shouldn't be adding descriptions of broken implementations > > to the standard. > > There is no way to 'gracefully abort' in eBPF. > > There is a way to 'return 0' in cBPF, but that's different. See below. > > > > > > If eBPF program execution would result in division by zero, > > > > the destination register SHOULD instead be set to zero, but > > > > program execution MAY be gracefully aborted instead. > > > > Similarly, if execution would result in modulo by zero, > > > > the destination register SHOULD instead be set to the source value, > > > > but program execution MAY be gracefully aborted instead. > > While this makes implementation a lot easier, come to think of it though, > this behavior actually is more like a hack around having to deal with > division/modulo-on-zero at runtime. User doing statistic calculations with > BPF will get bit by this sooner or later, arriving at the wrong calculation > (fast-math comes to mind). lol. If that was the case arm64 would be on fire long ago and users would complain in masses. Same with risc-v. > This seems to go against some general BPF principle[1] of preventing the > users from shooting themselves in the foot. > > Just like how BPF verifier prevents a _possible_ out-of-bound memory access, > e.g. arr[i] when `i` is not bound-checked. Ideally I'd expect a coherent > approach toward division/modulo-on-zero as well; the verifier should prevent > program that _might_ do division-on-zero from loading in the first place. > (Maybe possible to achieve if we introduce something like SCALAR_OR_NULL > composed type, but that's definitely not easy) > > Admittedly even if achievable, this is a radical approach that is not > backward compatible. If such check is implemented, programs that used to > load may now be rejected. (Usually new BPF verifier feature allows more BPF > program to pass the verifier, rather then the other way around) > > So, I don't have a good proposal at the moment. The purpose to this email is > to point what I see as an issue out and hope to start a discussion. > > 1: Okay, I'm making this up a bit, strictly speaking the BPF foundation is > safe program (one of Alexei's talk) rather than preventing users from > shooting themselves in the foot. Safe != invalid. BPF doesn't prevent invalid programs. BPF ensures safety only, not crashing the kernel, not leaking data, etc. For example: under speculation arr[i] can go out of bounds and to prevent data leaks we insert masking operations. Similar with div_by_0. If the verifier can detect that it will reject the prog, otherwise it will insert if(==0) xor, because not all architectures behave like arm64.