On Fri, Jun 23, 2023 at 03:20:14PM -0700, Evan Green wrote:
>
> The current setting for the hwprobe bit indicating misaligned access
> speed is controlled by a vendor-specific feature probe function. This is
> essentially a per-SoC table we have to maintain on behalf of each vendor
> going forward. Let's convert that instead to something we detect at
> runtime.
>
> We have two assembly routines at the heart of our probe: one that
> does a bunch of word-sized accesses (without aligning its input buffer),
> and the other that does byte accesses. If we can move a larger number of
> bytes using misaligned word accesses than we can with the same amount of
> time doing byte accesses, then we can declare misaligned accesses as
> "fast".
>
> The tradeoff of reducing this maintenance burden is boot time. We spend
> 4-6 jiffies per core doing this measurement (0-2 on jiffie edge
> alignment, and 4 on measurement). The timing loop was based on
> raid6_choose_gen(), which uses (16+1)*N jiffies (where N is the number
> of algorithms). On my THead C906, I found measurements to be stable
> across several reboots, and looked like this:
>
> [ 0.047582] cpu0: Unaligned word copy 1728 MB/s, byte copy 402 MB/s, misaligned accesses are fast
>
> I don't have a machine where misaligned accesses are slow, but I'd be
> interested to see the results of booting this series if someone did.

Can you elaborate on "results" please? Otherwise,

[ 0.333110] smp: Bringing up secondary CPUs ...
[ 0.370794] cpu1: Unaligned word copy 2 MB/s, byte copy 231 MB/s, misaligned accesses are slow
[ 0.411368] cpu2: Unaligned word copy 2 MB/s, byte copy 231 MB/s, misaligned accesses are slow
[ 0.451947] cpu3: Unaligned word copy 2 MB/s, byte copy 231 MB/s, misaligned accesses are slow
[ 0.462628] smp: Brought up 1 node, 4 CPUs
[ 0.631464] cpu0: Unaligned word copy 2 MB/s, byte copy 229 MB/s, misaligned accesses are slow

btw, why the mixed usage of "unaligned" and "misaligned"?

Cheers,
Conor.