On 9/13/18 5:19 PM, Timofey Titovets wrote:
> From: Timofey Titovets <nefelim4ag@xxxxxxxxx>
>
> Replace jhash2 with xxhash.
>
> Perf numbers:
> Intel(R) Xeon(R) CPU E5-2420 v2 @ 2.20GHz
> ksm: crc32c hash() 12081 MB/s
> ksm: xxh64 hash() 8770 MB/s
> ksm: xxh32 hash() 4529 MB/s
> ksm: jhash2 hash() 1569 MB/s
>
> By sioh Lee tests (copy from other mail):
> Test platform: openstack cloud platform (NEWTON version)
> Experiment node: openstack based cloud compute node (CPU: xeon E5-2620 v3, memory 64gb)
> VM: (2 VCPU, RAM 4GB, DISK 20GB) * 4
> Linux kernel: 4.14 (latest version)
> KSM setup - sleep_millisecs: 200ms, pages_to_scan: 200
>
> Experiment process
> Firstly, we turn off KSM and launch 4 VMs.
> Then we turn on the KSM and measure the checksum computation time until full_scans become two.
>
> The experimental results (the experimental value is the average of the measured values)
> crc32c_intel: 1084.10ns
> crc32c (no hardware acceleration): 7012.51ns
> xxhash32: 2227.75ns
> xxhash64: 1413.16ns
> jhash2: 5128.30ns
>
> As jhash2 will always be slower (for data sizes like PAGE_SIZE),
> don't use it in ksm at all.
>
> Use only xxhash for now, because for using crc32c,
> cryptoapi must be initialized first - that requires some
> tricky solution to work well in all situations.
>
> Thanks.
>
> Changes:
>   v1 -> v2:
>     - Move xxhash() to xxhash.h/c and separate patches
>   v2 -> v3:
>     - Move xxhash() xxhash.c -> xxhash.h
>     - replace xxhash_t with 'unsigned long'
>     - update kerneldoc above xxhash()
>   v3 -> v4:
>     - Merge xxhash/crc32 patches
>     - Replace crc32 with crc32c (crc32 has the same speed as jhash2)
>     - Add auto speed test and auto choice of fastest hash function
>   v4 -> v5:
>     - Pickup missed xxhash patch
>     - Update code with compile time chosen xxhash
>     - Add more macros to make code more readable
>     - As it is now only possible to use xxhash or crc32c,
>       on crc32c allocation error, skip speed test and fallback to xxhash
>     - To work around the too early init problem (crc32c not available),
>       move zero_checksum init to first call of fastcall()
>     - Don't alloc page for hash testing, use arch zero pages for that
>   v5 -> v6:
>     - Use libcrc32c instead of CRYPTO API, mainly for
>       code/Kconfig deps simplification
>     - Add crc32c_available():
>       libcrc32c will BUG_ON on crc32c problems,
>       so test crc32c availability with crc32c_available()
>     - Simplify choice_fastest_hash()
>     - Simplify fasthash()
>     - struct rmap_item && stable_node have sizeof == 64 on x86_64,
>       that makes them cache friendly. As we don't suffer from hash collisions,
>       change hash type from unsigned long back to u32.
>     - Fix kbuild robot warning, make all local functions static
>   v6 -> v7:
>     - Drop crc32c for now and use only xxhash in ksm.
>
> Signed-off-by: Timofey Titovets <nefelim4ag@xxxxxxxxx>
> Signed-off-by: leesioh <solee@xxxxxxxxxxxxxx>

Reviewed-by: Pavel Tatashin <pavel.tatashin@xxxxxxxxxxxxx>
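
For a rough sense of scale, the two sets of figures quoted above can be turned into ratios against jhash2. The following is only a minimal userspace sketch, not part of the patch: the helper names (throughput_speedup(), latency_speedup(), print_speedups()) are hypothetical, and the values are copied as-is from the patch description. Note the two data sets come from different machines, so the ratios should not be compared across sets.

```c
#include <stdio.h>
#include <stddef.h>

/* jhash2 baselines, copied from the quoted patch description. */
#define JHASH2_MBPS	1569.0	/* throughput on the E5-2420 v2, MB/s */
#define JHASH2_NS	5128.30	/* sioh Lee's average checksum time, ns */

struct sample {
	const char *name;
	double value;
};

/* "Perf numbers" section: higher is better. */
static const struct sample throughput_mbps[] = {
	{ "crc32c", 12081.0 },
	{ "xxh64",   8770.0 },
	{ "xxh32",   4529.0 },
};

/* sioh Lee's results: lower is better. */
static const struct sample latency_ns[] = {
	{ "crc32c_intel",       1084.10 },
	{ "crc32c (no hw accel)", 7012.51 },
	{ "xxhash32",           2227.75 },
	{ "xxhash64",           1413.16 },
};

/* How many times faster than jhash2, given a throughput in MB/s. */
static double throughput_speedup(double mbps)
{
	return mbps / JHASH2_MBPS;
}

/* How many times faster than jhash2, given a checksum time in ns. */
static double latency_speedup(double ns)
{
	return JHASH2_NS / ns;
}

/* Hypothetical helper: print both sets as ratios against jhash2. */
static void print_speedups(void)
{
	size_t i;

	for (i = 0; i < sizeof(throughput_mbps) / sizeof(throughput_mbps[0]); i++)
		printf("%-22s %5.2fx jhash2 (throughput)\n",
		       throughput_mbps[i].name,
		       throughput_speedup(throughput_mbps[i].value));

	for (i = 0; i < sizeof(latency_ns) / sizeof(latency_ns[0]); i++)
		printf("%-22s %5.2fx jhash2 (checksum time)\n",
		       latency_ns[i].name,
		       latency_speedup(latency_ns[i].value));
}
```

Calling print_speedups() from a main() shows xxh64 at roughly 5.6x jhash2 on raw throughput and roughly 3.6x on measured checksum time, while crc32c without hardware acceleration comes out slower than jhash2 (ratio below 1).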