On Fri, Jan 20, 2006 at 05:46:34PM -0500, Ron wrote:
> At 04:37 PM 1/20/2006, Martijn van Oosterhout wrote:
> >Given that all it's doing is counting bits, a simple fix would be to
> >loop over bytes, use XOR and count ones. For extreme speedup create a
> >lookup table with 256 entries to give you the answer straight away...
>
> For an even more extreme speedup, don't most modern CPUs have an asm
> instruction that counts the bits (un)set (AKA "population counting")
> in various size entities (4b, 8b, 16b, 32b, 64b, and 128b for 64b
> CPUs with SWAR instructions)?

Quite possibly, though I wouldn't have the foggiest idea how to get the
C compiler to generate it. Given that even a lookup table will get you
pretty close to that with plain C coding, I think that's quite enough
for a function that really is just a small part of a much larger
system...

Better solution (as Tom points out): work out how to avoid calling it
so much in the first place... At the moment each call to
gtsvector_picksplit seems to call the distance function around 14262
times. Getting that down by an order of magnitude will help much much
more.

Have a nice day,
--
Martijn van Oosterhout <kleptog@xxxxxxxxx> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.
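
For reference, the XOR-plus-lookup-table idea from the message above might look roughly like this in plain C. This is only a sketch: the names popcount_table, init_popcount_table and hamming_distance are made up for illustration and are not the actual PostgreSQL code, and it assumes both signatures are byte arrays of the same length.

```c
#include <assert.h>
#include <stddef.h>

/* popcount_table[v] = number of 1 bits in the byte value v (0..255). */
static unsigned char popcount_table[256];
static int popcount_table_ready = 0;

/* Fill the table once, using bits(v) = bits(v >> 1) + lowest bit of v. */
static void init_popcount_table(void)
{
    int v;

    popcount_table[0] = 0;
    for (v = 1; v < 256; v++)
        popcount_table[v] = (unsigned char) (popcount_table[v >> 1] + (v & 1));
    popcount_table_ready = 1;
}

/*
 * Hypothetical helper (not the real distance function): counts the bit
 * positions in which two equal-length byte arrays differ.
 */
static int hamming_distance(const unsigned char *a,
                            const unsigned char *b,
                            size_t len)
{
    size_t i;
    int dist = 0;

    assert(a != NULL && b != NULL);
    if (!popcount_table_ready)
        init_popcount_table();

    /* XOR leaves a 1 wherever the two bytes differ; the table counts them. */
    for (i = 0; i < len; i++)
        dist += popcount_table[a[i] ^ b[i]];

    return dist;
}
```

The inner loop does one XOR and one table lookup per byte, with no per-bit loop. On GCC and Clang, __builtin_popcount is one way to ask the compiler for a population-count instruction where the target has one, but as noted above, cutting the number of calls is likely to matter more.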