Re: poor m4 hash performance

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Ben Pfaff on 6/4/2006 4:15 PM:
> 
> Is there a good reason why m4 should not use a hash table that
> grows dynamically?  It is easier to deal with software that can
> figure out parameters on its own rather than having to be told.

CVS head m4 does dynamically grow its hashtables; the problem is that
backporting it to the 1.4 branch is not trivial, and at this point, the
1.4 branch is in non-invasive bug fix mode only.  The point of this thread
is to determine whether the 1.4.x hashing performance can be noticeably
improved, in which case it is a performance bug and safe to apply such a
patch.  Another thing to investigate is the choice of hashing functions;
right now, both the 1.4 branch and head use an algorithm borrowed from emacs:

size_t
m4_hash_string_hash (const void *key)
{
  int val = 0;
  const char *ptr = (const char *) key;
  char ch;

  while ((ch = *ptr++) != '\0')
    {
      if (ch >= 0140)
	ch -= 40;
      val = ((val << 3) + (val >> 28) + ch);
    };
  val = (val < 0) ? -val : val;
  return val;
}

Other hash algorithms exist which may be faster or have better spread
across the modulo bucket size chosen (m4 head always sticks with (2**n)-1
buckets, whereas 1.4 defaults to 509 but allows arbitrary -H options
regardless of whether they are relatively prime or not).

And the idea that sparked my investigation was a question of how many
lookups are done with strings already known to be in the hashtable.
Currently, the symbol table xstrdup's the string in every entry, but if we
can determine that a large number of lookups are done on strings already
known to be in the table, some judicious sharing of strings between
entries can allow pointer comparisons instead of strcmp, as well as some
memory savings by not having to duplicate as many strings.

- --
Life is short - so eat dessert first!

Eric Blake             ebb9@xxxxxxx
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.1 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFEg3DA84KuGfSFAYARAqSRAJ4pKlMUIZJnfoOjBPAkZeqqGrhf5ACdE9L0
MWLaPnY3zhe3JuSmlIQAwXM=
=MYWb
-----END PGP SIGNATURE-----


_______________________________________________
Autoconf mailing list
Autoconf@xxxxxxx
http://lists.gnu.org/mailman/listinfo/autoconf

[Index of Archives]     [GCC Help]     [Kernel Discussion]     [RPM Discussion]     [Red Hat Development]     [Yosemite News]     [Linux USB]     [Samba]

  Powered by Linux