Re: [PATCH V6 2/2 RESEND] ksm: replace jhash2 with faster hash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



ср, 23 мая 2018 г. в 17:24, Pavel Tatashin <pasha.tatashin@xxxxxxxxxx>:

> Hi Timofey,

> > crc32c will always be available, because of Kconfig.
> > But if crc32c doesn't have HW acceleration, it will be slower.

> > For talk about range of HW, i must have that HW,
> > so i can't say that *all* supported HW, have crc32c with acceleration.

> How about always defaulting to crc32c when HW acceleration is present
> without doing timings?
IIRC, yes, shash api can return 'cra_priority'.

> Do you have performance numbers of crc32c without acceleration?
Yes, https://lkml.org/lkml/2017/12/30/222

The experimental results (the experimental value is the average of the
measured values)
crc32c_intel: 1084.10ns
crc32c (no hardware acceleration): 7012.51ns
xxhash32: 2227.75ns
xxhash64: 1413.16ns
jhash2: 5128.30ns

> > > You are loosing half of 64-bit word in xxh64 case? Is this acceptable?
> May
> > > be do one more xor: in 64-bit case in xxhash() do: (v >> 32) | (u32)v
?

> > AFAIK, that lead to make hash function worse.
> > Even, in ksm hash used only for check if page has changed since last
scan,
> > so that doesn't matter really (IMHO).

> I understand that losing half of the hash result might be acceptable in
> this case, but I am not really sure how XOirng one more time can possibly
> make hash function worse, could you please elaborate?

IIRC, because of xor are symmetric
i.e. shift:
0b01011010 >> 4 = 0b0101
and xor:
0b0101 ^ 0b1010 = 0b1111
Xor will decrease randomness/entropy and will lead to hash collisions.

> > > choice_fastest_hash() does not belong to fasthash(). We are loosing
leaf
> > > function optimizations if you keep it in this hot-path. Also,
> fastest_hash
> > > should really be a static branch in order to avoid extra load and
> > conditional
> > > branch.

> > I don't think what that will give any noticeable performance benefit.
> > In compare to hash computation and memcmp in RB.

> You are right, it is small compared to hash and memcmp, but still I think
> it makes sense to use static branch, after all the value will never change
> during runtime after the first time it is set.


> > In theory, that can be replaced with self written jump table, to *avoid*
> > run time overhead.
> > AFAIK at 5 entries, gcc convert switch to jump table itself.

> > > I think, crc32c should simply be used when it is available, and use
> xxhash
> > > otherwise, the decision should be made in ksm_init()

> > I already said, in above conversation, why i think do that at ksm_init()
> is
> > a bad idea.

> It really feels wrong to keep  choice_fastest_hash() in fasthash(), it is
> done only once and really belongs to the init function, like ksm_init().
As

That possible to move decision from lazy load, to ksm_thread,
that will allow us to start bench and not slowdown boot.

But for that to works, ksm must start later, after init of crypto.

> I understand, you think it is a bad idea to keep it in ksm_init() because
> it slows down boot by 0.25s, which I agree with your is substantial. But,
I
> really do not think that we should spend those 0.25s at all deciding what
> hash function is optimal, and instead default to one or another during
boot
> based on hardware we are booting on. If crc32c without hw acceleration is
> no worse than jhash2, maybe we should simply switch to  crc32c?

crc32c with no hw, are slower in compare to jhash2 on x86, so i think on
other arches result will be same.

> Thank you,
> Pavel

Thanks.

--
Have a nice day,
Timofey.





[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux