Re: [PATCH V6 2/2 RESEND] ksm: replace jhash2 with faster hash


 



On Tue, 22 May 2018 at 23:22, Pavel Tatashin <pasha.tatashin@xxxxxxxxxx> wrote:

> Hi Timofey,

> >
> > Perf numbers:
> > Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
> > ksm: crc32c   hash() 12081 MB/s
> > ksm: xxh64    hash()  8770 MB/s
> > ksm: xxh32    hash()  4529 MB/s
> > ksm: jhash2   hash()  1569 MB/s

> That is a very nice improvement over jhash2!

> > Add function to autoselect hash algo on boot,
> > based on hashing speed, like raid6 code does.

> Are you aware of hardware where crc32c is slower compared to xxhash?
> Perhaps always use crc32c when available?

crc32c will always be available, because of Kconfig.
But if crc32c has no HW acceleration, it will be slower than xxhash.

To speak about the whole range of HW, I would need to have that HW,
so I can't say that *all* supported HW has accelerated crc32c.
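
A rough user-space sketch of the autoselection idea: time each candidate on a page-sized buffer and keep the one with the highest throughput. Everything here is an assumption made for illustration: `struct hash_candidate`, `measure()`, `fastest_index()` and `select_fastest()` are hypothetical names, and the two hashes are simple stand-ins, not the kernel's crc32c or xxhash.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical candidate descriptor; not a kernel structure. */
struct hash_candidate {
	const char *name;
	uint32_t (*fn)(const void *buf, size_t len);
};

/* Stand-in hash (FNV-1a), present only so the sketch has something to time. */
static uint32_t fnv1a32(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 2166136261u;

	while (len--)
		h = (h ^ *p++) * 16777619u;
	return h;
}

/* Second stand-in: byte-wise multiply-and-add hash. */
static uint32_t mul31(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 0;

	while (len--)
		h = h * 31u + *p++;
	return h;
}

/* Throughput of one candidate in bytes per second of CPU time. */
static double measure(const struct hash_candidate *c, const void *buf,
		      size_t len, unsigned int rounds)
{
	volatile uint32_t sink = 0;	/* keeps the calls from being optimized away */
	clock_t start = clock();
	clock_t ticks;

	for (unsigned int i = 0; i < rounds; i++)
		sink ^= c->fn(buf, len);
	ticks = clock() - start;
	(void)sink;
	if (ticks <= 0)
		ticks = 1;		/* avoid dividing by zero on coarse clocks */
	return (double)len * rounds * CLOCKS_PER_SEC / (double)ticks;
}

/* Index of the largest rate; ties keep the earlier entry. */
static size_t fastest_index(const double *rate, size_t n)
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (rate[i] > rate[best])
			best = i;
	return best;
}

/* Benchmark up to 8 candidates and return the index of the fastest. */
static size_t select_fastest(const struct hash_candidate *c, size_t n,
			     const void *buf, size_t len)
{
	double rate[8];

	if (n > 8)
		n = 8;
	for (size_t i = 0; i < n; i++)
		rate[i] = measure(&c[i], buf, len, 1000);
	return fastest_index(rate, n);
}
```

With the rates quoted above (12081, 8770, 4529, 1569 MB/s), `fastest_index()` would pick the first entry, crc32c. On HW where crc32c is not accelerated the measured order could differ, which is the case the benchmark is meant to catch.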

> > +
> > +static u32 fasthash(const void *input, size_t length)
> > +{
> > +again:
> > +     switch (fastest_hash) {
> > +     case HASH_CRC32C:
> > +             return crc32c(0, input, length);
> > +     case HASH_XXHASH:
> > +             return xxhash(input, length, 0);

> You are losing half of the 64-bit word in the xxh64 case? Is this acceptable? Maybe
> do one more xor: in the 64-bit case in xxhash() do: (v >> 32) | (u32)v ?

AFAIK, that would make the hash function worse.
Besides, in ksm the hash is used only to check whether a page has changed since the last scan,
so it doesn't really matter (IMHO).
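
For reference, the two ways of reducing a 64-bit hash value to 32 bits discussed here can be sketched as follows. The helper names `fold_truncate()` and `fold_xor()` are hypothetical, and the sketch uses a real XOR for the fold (the quoted expression is written with `|`). It shows only what each option computes; it says nothing about which one gives the better distribution.

```c
#include <stdint.h>

/* Keep only the low 32 bits, as a plain u64 -> u32 conversion does. */
static uint32_t fold_truncate(uint64_t v)
{
	return (uint32_t)v;
}

/* XOR the high half into the low half so all 64 bits affect the result. */
static uint32_t fold_xor(uint64_t v)
{
	return (uint32_t)(v >> 32) ^ (uint32_t)v;
}
```

For example, a value whose low 32 bits are all zero truncates to 0 regardless of its high half, while the XOR fold returns the high half.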

> > +     default:
> > +             choice_fastest_hash();
> > +             /* The correct value depends on page size and endianness */
> > +             zero_checksum = fasthash(ZERO_PAGE(0), PAGE_SIZE);
> > +             goto again;
> > +     }
> > +}

> choice_fastest_hash() does not belong in fasthash(). We are losing leaf
> function optimizations if you keep it in this hot path. Also, fastest_hash
> should really be a static branch in order to avoid an extra load and
> conditional branch.

I don't think that will give any noticeable performance benefit
compared to the hash computation and the memcmp in the RB tree.

In theory, the switch could be replaced with a hand-written jump table to *avoid*
run-time overhead.
AFAIK, at 5 entries gcc converts a switch to a jump table itself.
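
A minimal user-space sketch of what such a hand-written jump table could look like, keeping the lazy selection on first use. All names (`enum hash_id`, `hash_table`, `selected`, `fasthash_table()`) are hypothetical, the hashes are stand-ins, and the "selection" is a fixed placeholder rather than a benchmark. It is not thread-safe, and whether an indirect call beats the switch would have to be measured.

```c
#include <stddef.h>
#include <stdint.h>

enum hash_id { HASH_UNSET = 0, HASH_A, HASH_B, HASH_COUNT };

typedef uint32_t (*hash_fn)(const void *buf, size_t len);

static uint32_t hash_unset(const void *buf, size_t len);

/* Stand-in hash A (FNV-1a); not the kernel's crc32c. */
static uint32_t hash_a(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 2166136261u;

	while (len--)
		h = (h ^ *p++) * 16777619u;
	return h;
}

/* Stand-in hash B (multiply-and-add); not the kernel's xxhash. */
static uint32_t hash_b(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t h = 0;

	while (len--)
		h = h * 31u + *p++;
	return h;
}

/* The table: one slot per state, including the "not yet chosen" state. */
static const hash_fn hash_table[HASH_COUNT] = {
	[HASH_UNSET] = hash_unset,
	[HASH_A]     = hash_a,
	[HASH_B]     = hash_b,
};

static enum hash_id selected = HASH_UNSET;

/* Hot path: a single indexed indirect call, no switch. */
static uint32_t fasthash_table(const void *buf, size_t len)
{
	return hash_table[selected](buf, len);
}

/* First call lands here: choose an algorithm, then dispatch again. */
static uint32_t hash_unset(const void *buf, size_t len)
{
	selected = HASH_A;	/* placeholder for a real speed comparison */
	return fasthash_table(buf, len);
}
```

After the first call `selected` no longer points at the `HASH_UNSET` slot, so later calls go straight to the chosen hash.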

> I think, crc32c should simply be used when it is available, and use xxhash
> otherwise, the decision should be made in ksm_init()

I already explained earlier in this thread why I think doing that in ksm_init() is
a bad idea.

> Thank you,
> Pavel

Thanks.

-- 
Have a nice day,
Timofey.




