Re: [patch V3] lib: GCD: add binary GCD algorithm

[Obviously non-SPARC recipients dropped, but I probably missed some.]

>> __ffs on the available architectures:

> SPARC: sparc64: YES, sparc32: NO
> Patch needs to be updated to reflect this.

I missed the sparc64 implementation the first time, but having looked
again, I have to say that no, sparc64 doesn't count as having a fast
__ffs.

arch/sparc/lib/ffs.S is a custom implementation of __ffs, but it's a
function call and 33 instructions/18 cycles long.  There are several
similar custom implementations that I also considered "NO".

"fast" in this context means a handful of inline instructions,
usually one of:

1. A direct count_trailing_zeros instruction,
2. count_leading_zeros(x ^ (x-1)), or
3. count_leading_zeros(bit_reverse(x)), or
4. popcount(~x & (x-1)).
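To make options 2 to 4 concrete, here is a rough sketch in portable C
for 32-bit nonzero x.  The helpers clz32, bitrev32 and popcount32 are
hypothetical stand-ins for the single instructions a CPU would provide;
they are not kernel APIs.  Note that option 2 yields the bit position
counted from the top, so it needs one extra subtraction.

```c
#include <stdint.h>

/* Hypothetical stand-in for a count-leading-zeros instruction (x != 0). */
static unsigned clz32(uint32_t x)
{
	unsigned n = 0;

	while (!(x & 0x80000000u)) {
		x <<= 1;
		n++;
	}
	return n;
}

/* Hypothetical stand-in for a bit-reverse instruction. */
static uint32_t bitrev32(uint32_t x)
{
	uint32_t r = 0;
	int i;

	for (i = 0; i < 32; i++) {
		r = (r << 1) | (x & 1);
		x >>= 1;
	}
	return r;
}

/* Hypothetical stand-in for a population-count instruction. */
static unsigned popcount32(uint32_t x)
{
	unsigned n = 0;

	while (x) {
		x &= x - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Option 2: x ^ (x-1) sets bits 0..k, where k is the lowest set bit. */
static unsigned ctz_via_clz_xor(uint32_t x)
{
	return 31 - clz32(x ^ (x - 1));
}

/* Option 3: after bit reversal, trailing zeros become leading zeros. */
static unsigned ctz_via_bitrev(uint32_t x)
{
	return clz32(bitrev32(x));
}

/* Option 4: ~x & (x-1) is a mask of exactly the trailing zero bits. */
static unsigned ctz_via_popcount(uint32_t x)
{
	return popcount32(~x & (x - 1));
}
```

All three are undefined for x == 0 in this sketch, same as __ffs.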

The question is whether __ffs plus a variable shift is faster than
three instructions plus an unpredictable branch.
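
For illustration only, the two styles being compared look roughly like
this.  This is a textbook binary GCD, not the code from the patch, and
the function names are made up.  __builtin_ctzl is the GCC/Clang builtin
used here as an assumed stand-in for __ffs.

```c
/* Variant A: strip all trailing zeros at once with ctz + variable shift. */
static unsigned long gcd_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b;
	unsigned long t;
	int shift;

	if (!a || !b)
		return r;

	shift = __builtin_ctzl(r);	/* common power of two */
	a >>= __builtin_ctzl(a);	/* a is now odd */
	while (b) {
		b >>= __builtin_ctzl(b);	/* b is now odd */
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;			/* odd - odd = even, or zero */
	}
	return a << shift;
}

/* Variant B: no ctz; shift one bit at a time behind a data-dependent branch. */
static unsigned long gcd_loop(unsigned long a, unsigned long b)
{
	unsigned long t;
	unsigned int shift = 0;

	if (!a || !b)
		return a | b;

	while (!((a | b) & 1)) {	/* pull out the common power of two */
		a >>= 1;
		b >>= 1;
		shift++;
	}
	while (!(a & 1))
		a >>= 1;
	while (b) {
		while (!(b & 1))	/* trip count depends on the data */
			b >>= 1;
		if (a > b) {
			t = a; a = b; b = t;
		}
		b -= a;
	}
	return a << shift;
}
```

Both return the same result, e.g. 6 for (12, 18); only the way trailing
zeros are removed differs.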


