On Fri, Oct 1, 2021 at 1:23 PM Pavel Machek <pavel@xxxxxx> wrote:
>
> Hi!
>
> > From: Matteo Croce <mcroce@xxxxxxxxxxxxx>
> >
> > Write a C version of memcpy() which uses the biggest data size allowed,
> > without generating unaligned accesses.
> >
> > The procedure is made of three steps:
> > First copy data one byte at time until the destination buffer is aligned
> > to a long boundary.
> > Then copy the data one long at time shifting the current and the next u8
> > to compose a long at every cycle.
> > Finally, copy the remainder one byte at time.
> >
> > On a BeagleV, the TCP RX throughput increased by 45%:
> >
> > before:
> >
> > $ iperf3 -c beaglev
> > Connecting to host beaglev, port 5201
> > [  5] local 192.168.85.6 port 44840 connected to 192.168.85.48 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec  76.4 MBytes   641 Mbits/sec   27    624 KBytes
> > [  5]   1.00-2.00   sec  72.5 MBytes   608 Mbits/sec    0    708 KBytes
> >
> > after:
> >
> > $ iperf3 -c beaglev
> > Connecting to host beaglev, port 5201
> > [  5] local 192.168.85.6 port 44864 connected to 192.168.85.48 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec   109 MBytes   912 Mbits/sec   48    559 KBytes
> > [  5]   1.00-2.00   sec   108 MBytes   902 Mbits/sec    0    690 KBytes
>
> That's really quite cool. Could you see if it is your "optimized
> unaligned" copy doing the difference?
>
> > +/* convenience union to avoid cast between different pointer types */
> > +union types {
> > +	u8 *as_u8;
> > +	unsigned long *as_ulong;
> > +	uintptr_t as_uptr;
> > +};
> > +
> > +union const_types {
> > +	const u8 *as_u8;
> > +	unsigned long *as_ulong;
> > +	uintptr_t as_uptr;
> > +};
>
> Missing consts here?
>
> Plus... this is really "interesting" coding style. I'd just use casts
> in kernel.
>

Yes, the one for as_ulong is missing.

By using casts I had to use too many of them, making repeated
assignments in every function.
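As a rough illustration of that point, here is a simplified userspace
sketch. It is not the code from the patch: copy_sketch() is a made-up
name, the u8 typedef stands in for the kernel type, and only the case
where source and destination share the same offset within a long is
handled (the shift-and-merge path is left out). It also assumes a
platform where all three union members share one representation, and,
as in the kernel build, that accessing the buffers through
unsigned long is acceptable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint8_t u8;	/* stand-in for the kernel's u8 */

/* One pointer, three views: no cast needed at each use site. */
union types {
	u8 *as_u8;
	unsigned long *as_ulong;
	uintptr_t as_uptr;
};

union const_types {
	const u8 *as_u8;
	const unsigned long *as_ulong;	/* const added, as noted in the review */
	uintptr_t as_uptr;
};

/*
 * Hypothetical helper, not the patch's memcpy(): copies long-sized words
 * only when src and dest have the same offset within a long; otherwise it
 * falls back to a plain byte copy.
 */
static void *copy_sketch(void *dest, const void *src, size_t count)
{
	const size_t mask = sizeof(long) - 1;
	union const_types s = { .as_u8 = src };
	union types d = { .as_u8 = dest };

	if (((s.as_uptr ^ d.as_uptr) & mask) == 0) {
		/* Step 1: byte copy until dest is aligned to a long. */
		for (; count && (d.as_uptr & mask); count--)
			*d.as_u8++ = *s.as_u8++;

		/* src shares dest's offset, so it is aligned now as well. */
		assert(count == 0 || (s.as_uptr & mask) == 0);

		/* Step 2: copy one long at a time. */
		for (; count >= sizeof(long); count -= sizeof(long))
			*d.as_ulong++ = *s.as_ulong++;
	}

	/* Step 3: remaining bytes (or everything, if offsets differ). */
	while (count--)
		*d.as_u8++ = *s.as_u8++;

	return dest;
}
```

With casts, each of the three loops would need its own conversion to
(u8 *), (unsigned long *) or (uintptr_t), or a separate set of typed
pointers kept in sync by hand.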
This is basically the same, with less code :)

Cheers,
-- 
per aspera ad upstream