> So I improve the generic version of memcpy and memmove, and x86_64's memmove
> which are implemented by byte copy.

One should also add that most memmove()s and memcpy()s are actually
generated by gcc as inlines (especially if you don't use the "make my code
slow" option, aka -Os) and don't use the fallback. Whether the fallback is
used depends on the gcc version and on whether gcc thinks the data is aligned.

Sometimes one can get better code in the caller by making sure gcc knows
the correct alignment (e.g. with suitable types) and size. This might be
worth looking at for btrfs if it's really that memmove heavy.

> >I have some systemtap scripts to measure size/alignment distributions of
> >copies on a kernel, if you have a particular workload you're interested
> >in those could be tried.
>
> Good! Could you give me these script?

ftp://firstfloor.org/pub/ak/probes/csum.m4

You need to run them through m4 first. They don't measure memmove,
but that should be easy to add.

-Andi
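
To illustrate the point above about telling gcc the alignment and size,
here is a minimal sketch. The struct and both function names are
hypothetical and not taken from btrfs or the kernel; whether gcc actually
inlines either copy depends on the gcc version and optimization flags, so
check the generated assembly rather than taking this as a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical record type, for illustration only (not a btrfs structure). */
struct item {
	uint64_t key;
	uint32_t offset;
	uint32_t size;
};

/*
 * Untyped pointers and a runtime length: gcc knows neither the size nor
 * the alignment, so it will typically emit a call to the out-of-line
 * memmove() (the fallback).
 */
void copy_bytes(void *dst, const void *src, size_t len)
{
	memmove(dst, src, len);
}

/*
 * Typed pointers and a compile-time constant size: gcc can assume the
 * natural alignment of struct item and may expand the copy inline as a
 * few wide loads and stores.
 */
void copy_item(struct item *dst, const struct item *src)
{
	memcpy(dst, src, sizeof(*dst));
}
```

Comparing the output of "gcc -O2 -S" for the two functions should show
whether the typed variant gets expanded inline on a given compiler.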