On Tue, Oct 08, 2019 at 12:02:07PM +0200, Rasmus Villemoes wrote:
> On 08/10/2019 11.31, Daniel Thompson wrote:
> > On Mon, Oct 07, 2019 at 08:43:31PM +0200, Rasmus Villemoes wrote:
> >> On 07/10/2019 17.28, Daniel Thompson wrote:
> >>> On Thu, Sep 19, 2019 at 04:06:18PM +0200, Rasmus Villemoes wrote:
> >>>
> >>> It feels like there is some rationale missing in the description here.
> >>>
> >>
> >> Apart from the function call overhead (and resulting register pressure
> >> etc.), using int_pow is less efficient (for an exponent of 3, it ends up
> >> doing four 64x64 multiplications instead of just two). But feel free to
> >> drop it, I'm not going to pursue it further - it just seemed like a
> >> sensible thing to do while I was optimizing the code anyway.
> >>
> >> [At the time I wrote the patch, this was also the only user of int_pow
> >> in the tree, so it also allowed removing int_pow altogether.]
> >
> > To be honest the change is fine but the patch description doesn't make
> > sense if the only current purpose of the patch is as an optimization.
>
> Agreed. Do you want me to resend the series with patch 3 updated to read
>
> "For a fixed small exponent of 3, it is more efficient to simply use two
> explicit multiplications rather than calling the int_pow() library
> function: Aside from the function call overhead, its implementation
> using repeated squaring means it ends up doing four 64x64 multiplications."
>
> (and obviously patch 5 dropped)?

Yes, please.

When you resend you can add my R-B: to all patches:
Reviewed-by: Daniel Thompson <daniel.thompson@xxxxxxxxxx>


Daniel.


PS Don't mind either way, but I wondered whether the following is clearer
   than the slightly funky multiply-and-assign expression (which isn't
   wrong but isn't very common either, so my brain won't speed read it):

	retval = DIV_ROUND_CLOSEST_ULL(retval * retval * retval, scale * scale);
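
For anyone who wants to check the "four multiplications versus two" claim
from the thread, here is a minimal user-space sketch. It is not the kernel
source: pow_by_squaring() is an assumed stand-in for a repeated-squaring
implementation in the style of int_pow(), and counted_mul(), cube_explicit()
and mul_count are hypothetical names introduced only to count the
multiplications each approach performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counter: number of 64x64 multiplications performed so far. */
unsigned int mul_count;

/* Multiply two 64-bit values and record that a multiplication happened. */
uint64_t counted_mul(uint64_t a, uint64_t b)
{
	mul_count++;
	return a * b;
}

/*
 * Repeated squaring, written as a stand-in for an int_pow()-style helper.
 * Note that the base is squared on every loop iteration, including the
 * last one, where the squared value is never used.
 */
uint64_t pow_by_squaring(uint64_t base, unsigned int exp)
{
	uint64_t result = 1;

	while (exp) {
		if (exp & 1)
			result = counted_mul(result, base);
		exp >>= 1;
		base = counted_mul(base, base);
	}

	return result;
}

/* The open-coded alternative for a fixed exponent of 3. */
uint64_t cube_explicit(uint64_t x)
{
	return counted_mul(counted_mul(x, x), x);
}
```

With exp == 3 the loop runs twice, and each pass does one multiply into
result plus one squaring of base, which gives four multiplications in
total. The open-coded cube needs two. Resetting mul_count to 0 before each
call and reading it afterwards shows the difference directly.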