Christian Thalinger writes: > Hi! > > We had a curious problem with SPECjvm98 _213_javac and cacao. A newly > recompiled cacao ended with 10% preformance degradation. After further > investigation we ended with VMSystem.identityHashCode() and it's return > value. > > _213_javac make heavy use of Hashtable's and the objects stored in the > hashtable use VMSystem.identityHashCode() as their hash function. Since > alignment is normally 8 or 16-bytes, the distribution of the objects > seems to be not optimal. We ended with 40000 Object.equals() calls more > and a doubled runtime for Hashtable.get() (measured clock cycles), as > Object.equals() is used in Hashtable.get(). This seems a bit strange. Hashtable.rehash() seems always to generate a new capacity based on the old capacity * 2 + 1, so you wouldn't expect to see a great many collisions as the definition of hash() is key.hashCode() % buckets.length Okay, buckets.length isn't necessarily prime, but you wouldn't expect this to have a very non-uniform distribution. > A shifting of the VMSystem.identityHashCode() pointer value helps > (sometimes), but shifting by 3 makes it even worse (what was my guess to > make it better). > > Any thoughts? Should we change Hashtable.hash()? Or return a better > value for VMSystem.identityHashCode()? But most runtimes only return > the memory pointer... There are things that we can try, such a using a better way to turn an address into a hash. But it would be nice first to know the real cause of the very non-uniform distribution, if indeed that's what is happening. Andrew.