>> Also, I would like to see data on the speed impact of the byte swapping;
>> making the old hardware even 5% slower isn't something I would be
>> enthusiastic about, and my guess is it would be more than that.

> Well, the inner loop of raw_insw() uses one movew whereas raw_insw_swapw()
> uses two movew and one rolw instruction. Reading or writing huge files that
> are contiguous on disk will show an impact. Random access of small files does
> have higher overhead both in waiting for disk latency and buffer
> cache/filesystem overhead. I'm sure it can be measured at least for the first
> case.

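As a rough illustration of the extra per-word work described above, here is a minimal portable C sketch. It is not the kernel's raw_insw()/raw_insw_swapw() code, which is m68k assembly; the names swap16(), copy_words() and copy_words_swapped() are hypothetical, and the source is modelled as an ordinary array rather than a fixed I/O port.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Exchange the two bytes of a 16-bit word. On m68k this is the job a
 * "rolw #8" does; here it is written as a portable rotate by 8 bits.
 */
static inline uint16_t swap16(uint16_t w)
{
	return (uint16_t)((w << 8) | (w >> 8));
}

/* Hypothetical stand-in for the plain loop: one load and store per word. */
static void copy_words(uint16_t *dst, const uint16_t *src, size_t nr)
{
	while (nr--)
		*dst++ = *src++;
}

/*
 * Hypothetical stand-in for the swapping loop: load the word, rotate it,
 * then store it. That is one extra operation (plus a register move on
 * m68k) for every 16-bit word transferred.
 */
static void copy_words_swapped(uint16_t *dst, const uint16_t *src, size_t nr)
{
	while (nr--)
		*dst++ = swap16(*src++);
}
```

With a 512-byte sector the swapped variant performs 256 additional rotates, which is why a long sequential read is the case where the difference should be easiest to measure.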
I've run some tests on my Q40 (40MHz 68040), using 6.4.10 plus Michael's RFC2
patch. I added a little tweak to pata_falcon_data_xfer() to force byte
swapping for one specific device only (this allowed me to avoid byte swapping
the slave IDE drive, which holds my root filesystem).

	if (MACH_IS_Q40 && qc->dev->devno == 0)
		swap = 0; // or 1, for the second test

I booted up the kernel and then timed reading the entire contents of a 2GB
"InnoDisk Corp. iCF 4000" compact flash card using dd.
With no CPU byteswapping (legacy byte ordering) -- 1308.72 seconds:
# date; time dd if=/dev/sda bs=256k of=/dev/null; date
Tue Aug 15 11:07:19 UTC 2023
7999+1 records in
7999+1 records out
0.40user 1236.33system 21:48.72elapsed 94%CPU (0avgtext+0avgdata 1088maxresident)k
0inputs+0outputs (0major+115minor)pagefaults 0swaps
Tue Aug 15 11:29:08 UTC 2023
With CPU byteswapping (compatible byte ordering) -- 1426.46 seconds:
# date; time dd if=/dev/sda bs=256k of=/dev/null; date
Tue Aug 15 11:30:52 UTC 2023
7999+1 records in
7999+1 records out
0.42user 1348.82system 23:46.46elapsed 94%CPU (0avgtext+0avgdata 1088maxresident)k
0inputs+0outputs (0major+115minor)pagefaults 0swaps
Tue Aug 15 11:54:39 UTC 2023
So with CPU byte swapping the read takes nearly 9% longer (1426.46 vs 1308.72
seconds) for very large data transfers.
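
For anyone who wants to check the arithmetic, here is a small C sketch that derives the figure from the two elapsed times reported by time(1) above. The helper names elapsed_seconds() and slowdown_percent() are made up for this illustration.

```c
/* Convert a "minutes:seconds" elapsed time, as printed by time(1), to seconds. */
static double elapsed_seconds(int minutes, double seconds)
{
	return minutes * 60.0 + seconds;
}

/* Extra time taken by the test run, as a percentage of the baseline run. */
static double slowdown_percent(double baseline_s, double test_s)
{
	return (test_s - baseline_s) / baseline_s * 100.0;
}
```

Calling slowdown_percent(elapsed_seconds(21, 48.72), elapsed_seconds(23, 46.46)) gives just under 9.0. Applying the same formula to the system times (1236.33 and 1348.82 seconds) gives about 9.1, so nearly all of the extra time is spent in the kernel.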
Thanks
Will
_________________________________________________________________________
William R Sowerbutts will@xxxxxxxxxxxxxx
"Carpe post meridiem" http://sowerbutts.com
main(){char*s=">#=0> ^#X@#@^7=",c=0,m;for(;c<15;c++)for
(m=-1;m<7;putchar(m++/6&c%3/2?10:s[c]-31&1<<m?42:32));}