> On 20.04.2021 at 04:50, Maciej W. Rozycki <macro@xxxxxxxxxxx> wrote:
>
> We already check the high part of the divident against zero to avoid the

Nit-pick: s/divident/dividend/ (it seems to come from Latin "dividendum" = the number that is to be divided).

> costly DIVU instruction in that case, needed to reduce the high part of
> the divident, so we may well check against the divisor instead and set
> the high part of the quotient to zero right away. We need to treat the
> high part the divident in that case though as the remainder that would
> be calculated by the DIVU instruction we avoided.
>
> This has passed correctness verification with test_div64 and reduced the
> module's average execution time down to 1.0445s and 0.2619s from 1.0668s
> and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.

Impressive.

>
> Signed-off-by: Maciej W. Rozycki <macro@xxxxxxxxxxx>
> ---
> I have made an experimental change on top of this to put `__div64_32' out
> of line, and that increases the averages respectively up to 1.0785s and
> 0.2705s. Not a terrible loss, especially compared to generic times quoted
> with 3/4, but still, so I think it would best be made where optimising for
> size, as noted in the cover letter.
> ---
> arch/mips/include/asm/div64.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
> ===================================================================
> --- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
> +++ linux-3maxp-div64/arch/mips/include/asm/div64.h
> @@ -68,9 +68,11 @@
>                                                          \
>          __high = __div >> 32;                           \
>          __low = __div;                                  \
> -        __upper = __high;                               \
>                                                          \
> -        if (__high) {                                   \
> +        if (__high < __radix) {                         \
> +                __upper = __high;                       \
> +                __high = 0;                             \
> +        } else {                                        \
>                  __asm__("divu $0, %z1, %z2"             \
>                  : "=x" (__modquot)                      \
>                  : "Jr" (__high), "Jr" (__radix));       \
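
A rough portable C model of the quoted hunk may help when following the shortcut. It is only a sketch under stated assumptions: `div64_32_model` and its parameter names are made up for illustration, the `/` and `%` on 32-bit values stand in for the DIVU instruction, and the final 64-bit step stands in for the rest of the macro, which the quote does not show. The divisor must be nonzero.

```c
#include <stdint.h>

/*
 * Hypothetical model of the quoted hunk, not the kernel macro itself:
 * divide a 64-bit dividend by a nonzero 32-bit divisor, return the
 * quotient and store the remainder through *rem.
 */
static uint64_t div64_32_model(uint64_t div, uint32_t radix, uint32_t *rem)
{
	uint32_t high = (uint32_t)(div >> 32);
	uint32_t low = (uint32_t)div;
	uint32_t upper;
	uint64_t rest;

	if (high < radix) {
		/*
		 * Shortcut: the high quotient word is 0, and the high
		 * dividend word is the remainder DIVU would have produced.
		 */
		upper = high;
		high = 0;
	} else {
		/* Stand-in for DIVU: remainder to upper, quotient to high. */
		upper = high % radix;
		high = high / radix;
	}

	/*
	 * Stand-in for the rest of the macro (assumed, not quoted):
	 * divide (upper:low) by radix. Because upper < radix, this
	 * partial quotient fits in 32 bits.
	 */
	rest = ((uint64_t)upper << 32) | low;
	*rem = (uint32_t)(rest % radix);
	return ((uint64_t)high << 32) | (rest / radix);
}
```

Compared with the old `if (__high)` test, the new condition also skips the DIVU whenever the high word is nonzero but still smaller than the divisor, which is where the measured gain would come from.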