On Mon, Nov 26, 2012 at 08:45:05PM -0500, Theodore Ts'o wrote:
> I suppose I should first check and see how much difference it makes
> with a hard-coded use of __builtin_popcount().  If it makes a
> sufficiently large improvement, it's probably worth the hair of
> implementing the fallback machinery.

I did some quick benchmarking, and the difference it makes when
checking 4TB's worth of bitmaps is negligible:

	slow popcount: 0.2623
	fast popcount: 0.0700

For 128TB's worth of bitmaps, the time difference is:

	slow popcount: 8.0185
	fast popcount: 2.2066

I measured running e2fsck on an empty 128TB file system, and that took
202 CPU seconds (assuming all of the fs metadata blocks are in cache),
so with this optimization we would save at most 3% (roughly 5.8 of
those 202 seconds).  (For comparison, an unmodified 1.42.6 e2fsck
burned 392.7 CPU seconds.)

My conclusion is that using __builtin_popcount() is a nice-to-have, and
if someone sends me patches I'll probably take them as an optimization,
but it's not super high priority for me.

						- Ted
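
For anyone who wants to try this, here is a rough sketch of what a builtin-plus-fallback popcount might look like. This is not the code in e2fsprogs: the function names (popcount32_slow, popcount32, bitmap_popcount) are made up for illustration, and the "slow" path shown is just the well-known shift-and-mask bit count, which may not be what the benchmark above used. The selection here is compile-time only; it does not do the runtime CPU detection that the "fallback machinery" would presumably need.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback (assumed, not the e2fsprogs routine): classic
 * shift-and-mask bit count for one 32-bit word. */
static unsigned int popcount32_slow(uint32_t v)
{
	v = v - ((v >> 1) & 0x55555555u);                 /* 2-bit sums */
	v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u); /* 4-bit sums */
	v = (v + (v >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
	return (uint32_t)(v * 0x01010101u) >> 24;         /* add the 4 bytes */
}

/* Use the GCC/Clang builtin when available, else the fallback. */
static unsigned int popcount32(uint32_t v)
{
#if defined(__GNUC__)
	return (unsigned int)__builtin_popcount(v);
#else
	return popcount32_slow(v);
#endif
}

/* Hypothetical helper: count the set bits in a bitmap stored as
 * nwords 32-bit words. */
static unsigned long bitmap_popcount(const uint32_t *map, size_t nwords)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < nwords; i++)
		total += popcount32(map[i]);
	return total;
}
```

Note that with GCC the builtin only turns into the hardware POPCNT instruction when the target is told it may use it (for example with -mpopcnt); otherwise the compiler emits its own generic code, so a build for a baseline x86 target may not see the "fast" numbers.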