On Tue, Sep 10, 2024 at 11:02 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote: > > > On 9/10/24 8:21 AM, Alexei Starovoitov wrote: > > On Tue, Sep 10, 2024 at 7:21 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote: > >> > >> On 9/9/24 10:29 AM, Alexei Starovoitov wrote: > >>> On Mon, Sep 9, 2024 at 10:21 AM Zac Ecob <zacecob@xxxxxxxxxxxxxx> wrote: > >>>> Hello, > >>>> > >>>> I recently received a kernel 'oops' about a divide error. > >>>> After some research, it seems that the 'div64_s64' function used for the 'MOD'/'REM' instructions boils down to an 'idiv'. > >>>> > >>>> The 'dividend' is set to INT64_MIN, and the 'divisor' to -1, then because of two's complement, there is no corresponding positive value, causing the error (at least to my understanding). > >>>> > >>>> > >>>> Apologies if this is already known / not a relevant concern. > >>> Thanks for the report. This is a new issue. > >>> > >>> Yonghong, > >>> > >>> it's related to the new signed div insn. > >>> It sounds like we need to update chk_and_div[] part of > >>> the verifier to account for signed div differently. > >> In verifier, we have > >> /* [R,W]x div 0 -> 0 */ > >> /* [R,W]x mod 0 -> [R,W]x */ > > the verifier is doing what hw does. In this case this is arm64 behavior. > > Okay, I see. I tried on a arm64 machine it indeed hehaves like the above. > > # uname -a > Linux ... #1 SMP PREEMPT_DYNAMIC Thu Aug 1 06:58:32 PDT 2024 aarch64 aarch64 aarch64 GNU/Linux > # cat t2.c > #include <stdio.h> > #include <limits.h> > int main(void) { > volatile long long a = 5; > volatile long long b = 0; > printf("a/b = %lld\n", a/b); > return 0; > } > # cat t3.c > #include <stdio.h> > #include <limits.h> > int main(void) { > volatile long long a = 5; > volatile long long b = 0; > printf("a%%b = %lld\n", a%b); > return 0; > } > # gcc -O2 t2.c && ./a.out > a/b = 0 > # gcc -O2 t3.c && ./a.out > a%b = 5 > > on arm64, clang18 compiled binary has the same result > > # clang -O2 t2.c && ./a.out > a/b = 0 > # clang -O2 t3.c && ./a.out > a%b = 5 > > The same source code, compiled on x86_64 with -O2 as well, > it generates: > Floating point exception (core dumped) > > > > >> What the value for > >> Rx_a sdiv Rx_b -> ? > >> where Rx_a = INT64_MIN and Rx_b = -1? > > Why does it matter what Rx_a contains ? > > It does matter. See below: > > on arm64: > > # cat t1.c > #include <stdio.h> > #include <limits.h> > int main(void) { > volatile long long a = LLONG_MIN; > volatile long long b = -1; > printf("a/b = %lld\n", a/b); > return 0; > } > # clang -O2 t1.c && ./a.out > a/b = -9223372036854775808 > # gcc -O2 t1.c && ./a.out > a/b = -9223372036854775808 > > So the result of a/b is LLONG_MIN > > The same code will cause exception on x86_64: > > $ uname -a > Linux ... #1 SMP Wed Jun 5 06:21:21 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux > [yhs@devvm1513.prn0 ~]$ gcc -O2 t1.c && ./a.out > Floating point exception (core dumped) > [yhs@devvm1513.prn0 ~]$ clang -O2 t1.c && ./a.out > Floating point exception (core dumped) > > So this is what we care about. > > So I guess we can follow arm64 result too. > > > > > What cpus do in this case? > > See above. arm64 produces *some* result while x64 cause exception. > We do need to special handle for LLONG_MIN/(-1) case. My point about Rx_a that idiv will cause out-of-range exception for many other values than Rx_a == INT64_MIN. I'm not sure that divisor -1 is the only such case either. Probably is, since intuitively -2 and all other divisors should fit fine. So the check likely needs Rx_b == -1 and a check for high bit in Rx_a ?