On Sun, Jul 7, 2024, at 21:14, Nicolas Pitre wrote:
> On Sun, 7 Jul 2024, Arnd Bergmann wrote:
>
>> On Sun, Jul 7, 2024, at 19:17, Nicolas Pitre wrote:
>> > From: Nicolas Pitre <npitre@xxxxxxxxxxxx>
>> >
>> > Recent gcc versions no longer systematically inline __arch_xprod_64()
>> > and that has performance implications. Give the compiler the freedom to
>> > decide only when optimizing for size.
>> >
>> > Signed-off-by: Nicolas Pitre <npitre@xxxxxxxxxxxx>
>>
>> Seems reasonable. Just to make sure: do you know if the non-inline
>> version of __arch_xprod_64() ends up producing a more efficient
>> division than the __do_div64() code path on arch/arm?
>
> __arch_xprod_64() is part of the __do_div64() code path. So I'm not sure
> of your question.
>
> Obviously, having __arch_xprod_64() inlined is faster but it increases
> binary size.

I meant whether calling __div64_const32->__arch_xprod_64() is still
faster for a constant base when the new __arch_xprod_64() is out of
line, compared to the __div64_32->__do_div64() assembly code path
we take for a non-constant base.

     Arnd
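
To make the comparison concrete, here is a rough userspace sketch of the
two kinds of path being compared. It is not the kernel code: mul_hi64(),
recip64(), div_by_recip() and div_generic() are hypothetical names, the
reciprocal is a simplified floor((2^64 - 1) / b) with a fix-up loop rather
than the biased reciprocal the kernel computes, and it assumes a 64-bit
host compiler with the GCC/Clang unsigned __int128 extension.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the cross-product step: the upper 64 bits
 * of a 64x64-bit multiplication. Assumes unsigned __int128 is available.
 */
static inline uint64_t mul_hi64(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/*
 * Hypothetical reciprocal for a fixed divisor: floor((2^64 - 1) / b).
 * With a constant divisor the compiler can fold this at build time.
 */
static inline uint64_t recip64(uint32_t b)
{
	assert(b != 0);
	return UINT64_MAX / b;
}

/*
 * "Constant base" style path: one multiplication by the reciprocal.
 * The estimate never exceeds the true quotient, so a short fix-up
 * loop brings it to the exact result.
 */
static inline uint64_t div_by_recip(uint64_t n, uint32_t b, uint64_t m,
				    uint32_t *rem)
{
	uint64_t q = mul_hi64(n, m);
	uint64_t r = n - q * b;

	while (r >= b) {
		q++;
		r -= b;
	}
	*rem = (uint32_t)r;
	return q;
}

/* "Non-constant base" style path: a plain 64-by-32 division. */
static inline uint64_t div_generic(uint64_t n, uint32_t b, uint32_t *rem)
{
	*rem = (uint32_t)(n % b);
	return n / b;
}
```

Timing div_by_recip() with mul_hi64() marked noinline against
div_generic() would be one way to approximate the question above in
userspace, though only a measurement on real ARM hardware with the actual
kernel helpers would settle it.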