From: Mark Rutland
> Sent: 22 March 2023 14:05
....
> > IIUC, in such tests you only vary the destination offset. Our copy
> > routines in general try to align the source and leave the destination
> > unaligned for performance. It would be interesting to add some variation
> > on the source offset as well to spot potential issues with that part of
> > the memcpy routines.
>
> I have that on my TODO list; I had intended to drop that into the
> usercopy_params. The only problem is that the cross product of size,
> src_offset, and dst_offset gets quite large.

I thought that it was better to align the writes and do misaligned reads.

Although maybe copy_to/from_user() would be best aligning the user
address (to avoid page faults part way through a misaligned access).

OTOH, on x86, is it even worth bothering at all?

I have measured a performance drop for misaligned reads, but it was
less than 1 clock per cache line in a test that was doing
2 misaligned reads in at least some of the clock cycles.
I think the memory read path can do two AVX reads each clock.
So doing two misaligned 64-bit reads isn't stressing it.

	David
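
For a sense of how quickly the size/src_offset/dst_offset cross product grows, here is a rough userspace sketch. It is not the kernel usercopy test: MAX_OFFSET, MAX_SIZE, GUARD, check_one_copy() and check_all_copies() are all made-up names, and plain memcpy() stands in for whichever copy routine is actually under test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OFFSET 8    /* hypothetical: byte offsets 0..7 within a 64-bit word */
#define MAX_SIZE   128  /* hypothetical upper bound on the copy length */
#define GUARD      16   /* guard bytes on each side of the destination window */

static unsigned char src_buf[MAX_OFFSET + MAX_SIZE];
static unsigned char dst_buf[GUARD + MAX_OFFSET + MAX_SIZE + GUARD];

/*
 * Copy 'size' bytes from src_buf + src_off into the destination window at
 * dst_off, then check that exactly those bytes changed and that everything
 * else (including the guards) still holds the fill value.
 * Returns 0 on success, -1 on any mismatch.
 */
static int check_one_copy(size_t size, size_t src_off, size_t dst_off)
{
	const unsigned char *src = src_buf + src_off;
	size_t start = GUARD + dst_off;
	size_t i;

	assert(size <= MAX_SIZE && src_off < MAX_OFFSET && dst_off < MAX_OFFSET);

	for (i = 0; i < sizeof(src_buf); i++)
		src_buf[i] = (unsigned char)(i * 7 + 1);	/* recognisable pattern */
	memset(dst_buf, 0xA5, sizeof(dst_buf));

	memcpy(dst_buf + start, src, size);	/* stand-in for the routine under test */

	for (i = 0; i < sizeof(dst_buf); i++) {
		unsigned char expect = 0xA5;

		if (i >= start && i < start + size)
			expect = src[i - start];
		if (dst_buf[i] != expect)
			return -1;
	}
	return 0;
}

/*
 * Walk every (size, src_off, dst_off) combination.
 * Returns the number of failing combinations; if 'combos' is non-NULL it
 * receives the total number of combinations tried.
 */
static size_t check_all_copies(size_t *combos)
{
	size_t size, s, d, n = 0, failures = 0;

	for (size = 0; size <= MAX_SIZE; size++)
		for (s = 0; s < MAX_OFFSET; s++)
			for (d = 0; d < MAX_OFFSET; d++) {
				n++;
				if (check_one_copy(size, s, d))
					failures++;
			}
	if (combos)
		*combos = n;
	return failures;
}
```

Even with these small made-up limits that is 129 sizes * 8 * 8 offsets = 8256 copies, so a real test would probably want to sample sizes around the interesting boundaries rather than walk every value.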