Re: Is this a missed optimization for popcount implementation?

andre maute <andre.maute@xxxxxx> · Sat, 5 Dec 2020 20:48:43 +0100

Hi Jonathan,

the builtin is not the point I would like to make.

I'm wondering why is there a special optimization
for the "trick" code, and why isn't there one for the "naive" code?

Regards Andre

On 12/5/20 7:38 PM, Jonathan Wakely wrote:
On Sat, 5 Dec 2020, 15:15 andre maute, <andre.maute@xxxxxx> wrote:

Hi list,

I'm wondering if we have here a missed optimization
for a popcount implementation.

https://en.cppreference.com/w/cpp/numeric/popcount

std::popcount is available with C++20, which might not be able
on many systems for years.

Let us consider the code below

I'm running a Fedora 32 linux with
$ g++ --version
g++ (GCC) 10.2.1 20201016 (Red Hat 10.2.1-6)
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ g++ -O3 -save-temps popcount.cc -march=ivybridge

popcount1() what I consider the standard implementation
does NOT get reduced into the relevant popcount instruction

popcount2() what I consider the standard "trick" implementation
gets reduced into the relevant popcount instruction

Is this expected behaviour?
You'll have to know the trick to get the optimization!

Regards Andre

/////////////// popcount.cc ///////////////
#include <iostream>

unsigned int popcount1( unsigned int n )
{

#if __has_builtin(__builtin_popcount)
   return __builtin_popcount(n);
#endif

         unsigned int res = 0;
         while( n != 0 ) {
                 res += n & 1;
                 n >>= 1;
         }
         return res;
}

unsigned int popcount2( unsigned int n )
{
         unsigned int res = 0;
         while( n != 0 ) {
                 res++;
                 n &= (n-1);
         }
         return res;
}

int main()
{
         for( unsigned int n = 0; n < (1 << 31); n++ ) {
                 if( n % 100000 == 0 ) {
                         std::cout << n << std::endl;
                 }
                 unsigned int c1 = popcount1(n);
                 unsigned int c2 = popcount2(n);
                 if( c1 != c2 ) {
                         std::cout << n << " " << c1 << " " << c2 <<
std::endl;
                         return 1;
                 }
         }

         return 0;
}
/////////////// popcount.cc ///////////////