Re: [PATCH 0/4] Reinstate and improve MIPS `do_div' implementation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 22 Apr 2021, H. Nikolaus Schaller wrote:

> > This has passed correctness verification with test_div64 and reduced the
> > module's average execution time down to 1.0445s and 0.2619s from 1.0668s
> > and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.
> 
> test only [PATCH 1/4 and 2/4]:
> 
> [  256.301140] test_div64: Completed 64bit/32bit division and modulo test, 0.291154944s elapsed
> 
> + [PATCH 3/4]
> 
> [ 1698.698920] test_div64: Completed 64bit/32bit division and modulo test, 0.132142865s elapsed
> 
> + [PATCH 4/4]
> 
> [  466.818349] test_div64: Completed 64bit/32bit division and modulo test, 0.134429075s elapsed
> 
> So the new code is indeed faster than the default implementation.
> [PATCH 4/4] has no significant influence (wouldn't say it is slower because timer resolution
> isn't very high on this machine and the kernel has some scheduling issue [1]).

 Have you used it as a module or at bootstrap?  I have noticed that at 
bootstrap the initialisation of the random number generator sometimes 
interferes with the benchmark, which happens when there's an intervening 
message produced, e.g.:

test_div64: Starting 64bit/32bit division and modulo test
random: fast init done
test_div64: Completed 64bit/32bit division and modulo test, 1.069906272s elapsed

I think it can be worked around by configuration changes so that more 
stuff is run between the RNG and the test module, but instead I have 
simply inserted:

	mdelay(5000);

at the beginning of `test_div64_init' instead, as for historical reasons I 
haven't got the systems involved set up for modules (beyond Linux 2.4) at 
this time.

 NB I have run the benchmark five times with each change and system and 
with the RNG taken out of the picture results were very stable as any 
fluctuation only started at the fifth decimal digit.  Both the DECstation 
(the model I used anyway) and the Malta have a high-resolution clock 
source though, the I/O ASIC free-running counter register at 25MHz (used 
by David L. Mills, the original author of the NTP suite, for his reference 
implementation) and the CP0 Count register at 80MHz respectively.

 I would expect your JZ4730 device to have the CP0 Count register as well, 
as it has been architectural ever since MIPS III really, or is your system 
SMP with CP0 Count registers out of sync across CPUs due to sleep modes or 
whatever?

 Thanks for sharing your figures.

> [1] we are preparing full support for the JZ4730 based Skytone Alpha machine. Most features
> are working except sound/I2S. I2C is a little unreliable and Ethernet has hickups. And scheduling
> which indicates some fundamental IRQ or timer issue we could not yet identify.

 Good luck with that!

  Maciej



[Index of Archives]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux