From simple strong typing of the existing int_sqrt() we have arrived at
something a bit more complex, or better.
Can we decide now which version we want in, or should I submit v12 and we
decide then (although it would not really be a v12, but a whole new thing)?

On 21 December 2017 at 15:48, David Laight <David.Laight@xxxxxxxxxx> wrote:
> From: Peter Zijlstra
>> Sent: 21 December 2017 14:12
> ...
>> > > This part above looks like FLS
>> > It also does the rest of the required shifts.
>>
>> Still, fls() + shift is way faster on hardware that has an fls
>> instruction.
>>
>> Writing out that binary search doesn't make sense.
>
> If the hardware doesn't have an appropriate fls instruction
> the soft fls() will be worse.
>
> If you used fls() you'd still need quite a bit of code
> to generate the correct shift and loop count adjustment.
> Given the cost of the loop iterations the 3 tests are noise.
> The open coded version is obviously correct...
>
> I didn't add the 4th one because the code always does 2 iterations.
>
> If you were really worried about performance there are faster
> algorithms (even doing 2 or 4 bits a time is faster).
>
>         David
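
To make the two options in the quoted discussion concrete, here is a rough
userspace sketch of both ways of picking the starting bit for the
shift-and-subtract square root. It is not the code from the patch series:
soft_fls(), isqrt_fls(), isqrt_open() and isqrt_selfcheck() are hypothetical
names made up for illustration, the width is fixed at 32 bits, and the
open-coded variant uses four tests rather than the three mentioned in the
thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical portable stand-in for fls(): 1-based index of the
 * highest set bit, 0 when x == 0. */
static unsigned int soft_fls(uint32_t x)
{
	unsigned int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Shared shift-and-subtract loop; m must be the largest power of
 * four that is <= x. */
static uint32_t isqrt_loop(uint32_t x, uint32_t m)
{
	uint32_t b, y = 0;

	while (m != 0) {
		b = y + m;
		y >>= 1;
		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}
	return y;
}

/* Variant A: derive the starting bit from fls(). */
static uint32_t isqrt_fls(uint32_t x)
{
	if (x <= 1)
		return x;
	/* Round the 0-based top bit index down to an even number. */
	return isqrt_loop(x, 1u << ((soft_fls(x) - 1) & ~1u));
}

/* Variant B: open-coded binary search for the same even shift. */
static uint32_t isqrt_open(uint32_t x)
{
	uint32_t t = x;
	unsigned int shift = 0;

	if (x <= 1)
		return x;
	if (t >> 16) { shift += 16; t >>= 16; }
	if (t >> 8)  { shift += 8;  t >>= 8; }
	if (t >> 4)  { shift += 4;  t >>= 4; }
	if (t >> 2)  { shift += 2; }
	return isqrt_loop(x, 1u << shift);
}

/* Check that both variants agree and return floor(sqrt(x)). */
static void isqrt_selfcheck(void)
{
	static const uint32_t samples[] = {
		0, 1, 2, 3, 4, 15, 16, 17, 255, 256, 65535, 65536,
		1000000, 0x7fffffffu, 0xffffffffu
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		uint64_t x = samples[i];
		uint64_t y = isqrt_fls(samples[i]);

		assert(y == isqrt_open(samples[i]));
		assert(y * y <= x && x < (y + 1) * (y + 1));
	}
}
```

Both variants feed the same loop, so the only difference is how the starting
power of four is found: a single fls() where the hardware has one, or a
handful of compare-and-shift tests where it does not.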