On Mon, Mar 23, 2015 at 9:25 AM, David Miller <davem@xxxxxxxxxxxxx> wrote:
>
> Ok, here is what I committed.

So I wonder - looking at that assembly, I get the feeling that it isn't
any better code than gcc could generate from simple C code.

Would it perhaps be better to turn memmove() into C?

That's particularly true because if I read this code right, it now
seems to seriously pessimise non-overlapping memmove's, in that it now
*always* uses that slow downward copy if the destination is below the
source.

Now, admittedly, the kernel doesn't use a lot of memmove's, but this
still falls back on the "byte at a time" model for a lot of cases (all
non-64-bit-aligned ones). I could imagine those existing. And some
people (reasonably) hate memcpy because they've been burnt by the
overlapping case and end up using memmove as a "safe alternative", so
it's not necessarily just the overlapping case that might trigger
this.

Maybe the code could be something like

    void *memmove(void *dst, const void *src, size_t n)
    {
        // non-overlapping cases
        if (src + n <= dst)
            return memcpy(dst, src, n);
        if (dst + n <= src)
            return memcpy(dst, src, n);

        // overlapping, but we know we
        //  (a) copy upwards
        //  (b) initialize the result in at most chunks of 64
        if (dst+64 <= src)
            return memcpy(dst, src, n);

        .. do the backwards thing ..
    }

(ok, maybe I got it wrong, but you get the idea).

I *think* gcc should do ok on the above kind of code, and not generate
wildly different code from your handcoded version.

                Linus
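
For illustration only, the structure suggested above can be written out as a self-contained C sketch. This is not the code that was committed and not the kernel's memmove(): the name sketch_memmove is hypothetical, the addresses are compared through uintptr_t so the sketch stays within portable C, and address overflow of src + n is ignored. The "dst+64 <= src" shortcut in the mail relies on knowing that the platform memcpy copies upwards in chunks of at most 64 bytes; since that cannot be assumed in portable code, the sketch swaps in a plain forward byte loop for the overlapping dst-below-src case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration of the proposed structure; not kernel code. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uintptr_t da = (uintptr_t)dst;
    uintptr_t sa = (uintptr_t)src;

    /* Non-overlapping cases: a plain memcpy is safe either way. */
    if (sa + n <= da || da + n <= sa)
        return memcpy(dst, src, n);

    if (da < sa) {
        /* Overlapping, destination below source: copy upwards.
         * Assumption: a byte loop stands in for the chunked memcpy
         * that the original mail relies on. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* Overlapping, destination at or above source: copy downwards
         * so that source bytes are read before they are overwritten. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

With this shape, callers that use memmove as a "safe memcpy" on buffers that do not overlap take the memcpy path, and only a true overlap with the destination above the source pays for the downward copy.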