On Thu, Jun 17, 2021 at 05:27:52PM +0200, Matteo Croce wrote: > +extern void *memcpy(void *dest, const void *src, size_t count); > +extern void *__memcpy(void *dest, const void *src, size_t count); No need for externs. > +++ b/arch/riscv/lib/string.c Nothing in her looks RISC-V specific. Why doesn't this go into lib/ so that other architectures can use it as well. > +#include <linux/module.h> I think you only need export.h. > +void *__memcpy(void *dest, const void *src, size_t count) > +{ > + const int bytes_long = BITS_PER_LONG / 8; > +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > + const int mask = bytes_long - 1; > + const int distance = (src - dest) & mask; > +#endif > + union const_types s = { .u8 = src }; > + union types d = { .u8 = dest }; > + > +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS > + if (count < MIN_THRESHOLD) Using IS_ENABLED we can avoid a lot of the mess in this function. int distance = 0; if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)) { if (count < MIN_THRESHOLD) goto copy_remainder; /* copy a byte at time until destination is aligned */ for (; count && d.uptr & mask; count--) *d.u8++ = *s.u8++; distance = (src - dest) & mask; } if (distance) { ... > + /* 32/64 bit wide copy from s to d. > + * d is aligned now but s is not, so read s alignment wise, > + * and do proper shift to get the right value. > + * Works only on Little Endian machines. > + */ Normal kernel comment style always start with a: /* > + for (next = s.ulong[0]; count >= bytes_long + mask; count -= bytes_long) { Please avoid the pointlessly overlong line. And (just as a matter of personal preference) I find for loop that don't actually use a single iterator rather confusing. Wjy not simply: next = s.ulong[0]; while (count >= bytes_long + mask) { ... count -= bytes_long; }