Re: arm: ERROR: modpost: "__aeabi_uldivmod" [drivers/gpu/drm/sun4i/sun4i-drm-hdmi.ko] undefined!

Ard Biesheuvel <ardb@xxxxxxxxxx> · Mon, 4 Mar 2024 14:01:52 +0100

On Mon, 4 Mar 2024 at 13:35, Arnd Bergmann <arnd@xxxxxxxx> wrote:
>
> On Mon, Mar 4, 2024, at 12:45, Andre Przywara wrote:
> > On Mon, 04 Mar 2024 12:26:46 +0100
> > "Arnd Bergmann" <arnd@xxxxxxxx> wrote:
> >
> >> On Mon, Mar 4, 2024, at 12:24, Andre Przywara wrote:
> >> > On Mon, 04 Mar 2024 12:11:36 +0100 "Arnd Bergmann" <arnd@xxxxxxxx> wrote:
> >> >>
> >> >> This used to be a 32-bit division. If the rate is never more than
> >> >> 4.2GHz, clock could be turned back into 'unsigned long' to avoid
> >> >> the expensive div_u64().
> >> >
> >> > Wouldn't "div_u64(clock, 200)" solve this problem?
> >>
> >> Yes, that's why I mentioned it as the worse of the two obvious
> >> solutions. ;-)
> >
> > Argh, should have cleaned my glasses first ;-)
> >
> > I guess I was somehow put off by the word "expensive". While it's
> > admittedly not trivial, I wonder if we care about the (hidden) complexity
> > of that function? I mean it's neither core code nor something called
> > frequently?
>
> It's not critical if this is called infrequently, and as Maxime
> just replied, the 64-bit division is in fact required here.
> Since we are dividing by a constant value (200), there is a good
> chance that this will get turned into fairly efficient
> multiply/shift code.
>

Clang does not implement that optimization for 64-bit division. That
is how we ended up with this error in the first place.

Perhaps it is worthwhile to make div_u64() check its divisor, e.g.,

--- a/include/linux/math64.h
+++ b/include/linux/math64.h
@@ -127,6 +127,9 @@
 static inline u64 div_u64(u64 dividend, u32 divisor)
 {
        u32 remainder;
+
+       if (IS_ENABLED(CONFIG_CC_IS_GCC) && __builtin_constant_p(divisor))
+               return dividend / divisor;
        return div_u64_rem(dividend, divisor, &remainder);
 }
 #endif
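
For anyone who wants to try the idea outside the kernel tree, here is a minimal userspace sketch of the same check. It is only an illustration under stated assumptions: the sketch_* names and SKETCH_CC_IS_GCC are hypothetical stand-ins (for div_u64(), div_u64_rem() and IS_ENABLED(CONFIG_CC_IS_GCC) respectively), and the fallback uses plain C division rather than the kernel's 32-bit helper path.

```c
#include <stdint.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_CC_IS_GCC): true for GCC only. */
#if defined(__GNUC__) && !defined(__clang__)
#define SKETCH_CC_IS_GCC 1
#else
#define SKETCH_CC_IS_GCC 0
#endif

/*
 * Hypothetical stand-in for div_u64_rem(). The real kernel helper avoids
 * a libgcc call on 32-bit targets; here it is plain C division.
 */
static inline uint64_t sketch_div_u64_rem(uint64_t dividend, uint32_t divisor,
					  uint32_t *remainder)
{
	*remainder = (uint32_t)(dividend % divisor);
	return dividend / divisor;
}

/*
 * Same shape as the proposed div_u64(): when the compiler is GCC and the
 * divisor is a compile-time constant, use a plain '/' so the compiler can
 * lower it to multiply/shift code; otherwise take the helper path.
 */
static inline uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
	uint32_t remainder;

	if (SKETCH_CC_IS_GCC && __builtin_constant_p(divisor))
		return dividend / divisor;
	return sketch_div_u64_rem(dividend, divisor, &remainder);
}

/* Caller with a constant divisor, as in the driver case discussed above. */
static inline uint64_t sketch_clock_div_200(uint64_t clock)
{
	return sketch_div_u64(clock, 200);
}
```

Both paths return the same quotient; the only difference is which code the compiler emits. Whether the constant-divisor branch actually avoids a call to __aeabi_uldivmod on 32-bit ARM depends on the compiler and optimization level, so it is worth checking the generated assembly.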