On Thu, 24 Sep 2020 at 17:28, Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> On Thu, Sep 24, 2020 at 1:32 AM Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
> >
> > ...
> >
> > > > +#define REPS 100
> > >
> > > Is this sufficient? I'm not sure what the lower bound is on what's
> > > expected of ktime. If I'm doing the math right, on your system
> > > running 100 loops took 38802 ns in one case, since:
> > >
> > > (4096 * 1000 * 100) / 10556 = 38802
> > >
> > > If you happen to have your timer backed by a 32 kHz clock, one tick of
> > > ktime could be as much as 31250 ns, right? Maybe on systems backed
> > > with a 32kHz clock they'll take longer, but it still seems moderately
> > > iffy? I dunno, maybe I'm just being paranoid.
> > >
> >
> > No, that is a good point - I didn't really consider that ktime could
> > be that coarse.
> >
> > OTOH, we don't really need the full 5 digits of precision either, as
> > long as we don't misidentify the fastest algorithm.
> >
> > So I think it should be sufficient to bump this to 800. If my
> > calculations are correct, this would limit any potential
> > misidentification of algorithms performing below 10 GB/s to ones that
> > only deviate in performance up to 10%.
> >
> > 800 * 1000 * 4096 / (10 * 31250) = 10485
> > 800 * 1000 * 4096 / (11 * 31250) = 9532
> >
> > (10485 - 9532) / 10485 = 10%
>
> Seems OK to me. Seems unlikely that super fast machines are going to
> have a 32 kHz backed k_time, and the worst case is that we'll pick a
> slightly sub-optimal xor, I guess. I assume your goal is to keep
> things fitting in a 32-bit unsigned integer? Looks like if you use
> 1000 it also fits...
>

Yes, but the larger we make this number, the more time the test will
take on such slow machines. Doing 1000 iterations of 4k on a low-end
machine that only manages 500 MB/s (?) takes a couple of milliseconds,
which is more than it takes today when HZ=1000, I think.
Not that 800 vs 1000 makes a great deal of difference in that regard, just illustrating that there is an upper bound as well.
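For what it's worth, the worst-case rounding argument above can be sketched numerically. This is only an illustration of the arithmetic in the quoted mail, not the kernel code itself: with a 32 kHz-backed clock (31250 ns per tick), a run of 800 iterations over a 4 KiB buffer at roughly 10 GB/s spans either 10 or 11 ticks, so the computed throughput in MB/s can swing by about 10% from tick quantization alone (the helper name below is made up for the example):

```python
REPS = 800        # proposed iteration count
PAGE_SIZE = 4096  # bytes XORed per iteration
TICK_NS = 31250   # one tick of a 32 kHz-backed ktime

def reported_speed_mbps(ticks):
    # Throughput in MB/s given an elapsed time of `ticks` clock ticks:
    # bytes * 1000 / elapsed_ns (same shape as the figures in the mail).
    return REPS * PAGE_SIZE * 1000 // (ticks * TICK_NS)

fast = reported_speed_mbps(10)   # 10485 MB/s
slow = reported_speed_mbps(11)   #  9532 MB/s
spread = (fast - slow) / fast    # ~0.091, i.e. roughly 10%
```

So two algorithms whose true throughput differs by less than ~10% may be ranked either way, which is the bound the mail is accepting as good enough to still pick a reasonable xor implementation.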