Ping.

On Sun, 7 Jul 2024, Nicolas Pitre wrote:

> While working on mul_u64_u64_div_u64() improvements I realized that there
> is a better way to perform a 64x64->128 bits multiplication with overflow
> handling. This is not as lean as v1 of the series but still much better
> than the existing code IMHO.
>
> Change from v2:
>
> - Fix last minute edit screw-up (missing one function return type).
>
> Link to v2: https://lore.kernel.org/lkml/20240707171919.1951895-1-nico@xxxxxxxxxxx/
>
> Changes from v1:
>
> - Formalize condition for when overflow handling can be skipped.
> - Make this condition apply only if it can be determined at compile time
>   (beware of the compiler not always inlining code).
> - Keep the ARM assembly but apply the above changes to it as well.
> - Force __always_inline when optimizing for performance.
> - Augment test_div64.c with important edge cases.
>
> Link to v1: https://lore.kernel.org/lkml/20240705022334.1378363-1-nico@xxxxxxxxxxx/
>
> The diffstat is:
>
>  arch/arm/include/asm/div64.h |  13 +++-
>  include/asm-generic/div64.h  | 121 ++++++++++++-----------------------
>  lib/math/test_div64.c        |  85 +++++++++++++++++++++++-
>  3 files changed, 134 insertions(+), 85 deletions(-)
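
For anyone skimming the thread without the patches at hand, here is a minimal sketch of the general idea behind a 64x64->128 bits multiplication built from 32-bit partial products, with the carries between partial sums handled explicitly. This is not the code from the series: mul64_hi() is a hypothetical name used for illustration only, and the sketch returns just the upper 64 bits of the product.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the series): return the upper 64 bits
 * of the 128-bit product a * b using only 32x32->64 multiplications.
 */
static uint64_t mul64_hi(uint64_t a, uint64_t b)
{
	uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
	uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);
	uint64_t x, y;

	/* Lowest partial product; only its top half carries upward. */
	x = (uint64_t)a_lo * b_lo;

	/* First cross product plus that carry; this sum fits in 64 bits. */
	y = (uint64_t)a_lo * b_hi + (uint32_t)(x >> 32);

	/* Highest partial product plus the top half of the first sum. */
	x = (uint64_t)a_hi * b_hi + (uint32_t)(y >> 32);

	/* Second cross product plus the low half of the first sum. */
	y = (uint64_t)a_hi * b_lo + (uint32_t)y;

	/* Final carry into the upper 64 bits of the result. */
	x += (uint32_t)(y >> 32);

	return x;
}
```

Adding each cross product separately to a 32-bit carry keeps every intermediate sum within 64 bits. Summing both cross products at once can overflow, which is the kind of situation overflow handling has to cover.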