Re: [RFC PATCH] MIPS: optimise 32-bit do_div() with constant divisor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ralf Baechle <ralf@xxxxxxxxxxxxxx> writes:

> On Thu, Nov 06, 2014 at 04:23:18PM +0000, Mans Rullgard wrote:
>
>> This is an adaptation of the optimised do_div() for ARM by
>> Nicolas Pitre implementing division by a constant using a
>> multiplication by the inverse.  Ideally, the compiler would
>> do this internally as it does for 32-bit operands, but it
>> doesn't.
>> 
>> This version of the code requires an assembler with support
>> for the DSP ASE syntax since accessing the hi/lo registers
>> sanely from inline asm is impossible without this.  Building
>> for a CPU without this extension still works, however.
>> 
>> It also does not protect against the WAR hazards for the
>> hi/lo registers present on CPUs prior to MIPS IV.
>> 
>> I have only tested it as far as booting and light use with
>> the BUG_ON enabled wihtout encountering any issues.
>> 
>> The inverse computation code is a straight copy from ARM,
>> so this should probably be moved to a shared location.
>
> Can you explain why you need __div64_fls()?  There's __fls() which on
> MIPS is carefully written to make use of the CLZ rsp. DCLZ instructions
> where available; the fallback implementation is looking fairly similar
> to your code.

The regular __fls() doesn't necessarily evaluate at compile-time, which
is required here.  The normal __fls() could of course be amended to
bypass the CLZ instruction for constant arguments.

> MADD is named MAD on some older CPUs; yet other CPUs don't have it
> at all.  I take it you tried to make GCC emit the instruction but it
> doesn't?

GCC generates MADDU instructions only in the most trivial of cases.  For
instance, (x >> 32) * (u32)y with 64-bit x and y produces far from
optimal code.  In fact, looking at it again, I see it is even more
stupid than I thought, so there needs to be more asm, not less.

Reading the manuals more carefully, it appears that MADDU is only
reliably available starting with MIPS32 (btw, this information is
annoyingly hard to find).  Thus this code should be restricted to such
targets (which probably covers most current users) unless someone feels
like writing a version for these older CPUs.

-- 
Måns Rullgård
mans@xxxxxxxxx





[Index of Archives]     [Linux MIPS Home]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Linux]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux