------------------------------------------------------------------
From:Andrea Arcangeli <aarcange@xxxxxxxxxx>
Time:2017 Sep 28 (Thu) 18:09
Thanks for replying
> Could you repeat the whole benchmark while giving only 1 CPU to PageONE
> and after applying the following crc32c-intel patch to KSM?
>
> https://www.spinics.net/lists/linux-mm/msg132394.html
>
> You may consider also echo 1 > /sys/kernel/mm/ksm/use_zero_pages if
> you single out zero pages in pone (but it doesn't look like you have
> such feature in pone).
> The second test is exercising the worst case possible of KSM so I
> don't see how it's worth worrying about. Likely pone would also have a
> worst case to exercise (it uses hash_64 so it very likely also has a
> worst case to exercise). For KSM there are already plans to alter the
> memcmp so it's more scattered randomly.
> Making KSM multithreaded with one ksmd thread per CPU is entirely
> possible, the rbtree rebalance will require some locking of course but
> the high CPU usage parts of KSM are fully scalable (mm walk, checksum,
> memcompare, writeprotection, pagetable replacement). We didn't
> multithread ksmd to keep it simpler primarily but also because nobody
> asked for this feature yet. Why didn't you simply multithread KSM
> which provides a solid base also supporting KSMscale?
>
> Are you using an hash to find equality? That can't be done currently
> to avoid infringing. I see various memcmp in your patch but all around
> #if 0... so what are you using for finding page equality?
>
> How does PageONE deal with 1million of equal virtual pages? Does it
> lockup in rmap? KSM in v4.13 can handle infinite amount of equal
> virtual page content to dedup while generating O(1) complexity in rmap
> walks. Without this, KSM was unusable for enterprise use and had to be
> disabled, because the kernel would lockup for several seconds after
> deduplicating million of virtual pages with same content (i.e. during
> NUMA balancing induced page migrations or during compaction induced
> page migrations, let alone swapping the million-times deduplicated KSM
> page).
> KSM is usually an activity run in the background so nobody asked to
> dedicate more than one core to it, and what's relevant is to do the
> dedup in the most efficient way possible (i.e. less CPU used and no
> interference to the rest of the system whatsoever), not how long it
> takes if you run it on all available CPUs loading 100% of the system
> with it.
>
> So comparing a dedup algorithm running concurrently on 12 threads vs
> another dedup algorithm running in 1 thread only, is an apple to
> oranges comparison.
>
> Comparing KSM (with crc32 as cksum, to apply on top of upstream) vs
> PageOne restricted to a single thread (also more realistic production
> environment), will be a more interesting and meaningful comparison.
>
> It looks like rmap is supported by pone but the patch has a multitude
> of #if 0 and around all rmap code so it's not so clear. Rmap walks
> have to work flawlessly on all deduplicated pages, or pone would then
> break not just swapping but also NUMA Balancing compaction and in turn
> THP utilization and THP utilization is critical for virtual machines
> (MADV_HUGEPAGE is always set by QEMU, to run direct compaction also with
> defrag=madvise or defer+madvise).
PageONE is based on a new tree algorithm, not on KSM's red-black tree. The original idea is that a lockless tree can improve multithreaded performance. Not all tree algorithms are suitable for this purpose, and we have not found a way to do it for a red-black tree. The closest topology we found is the Patricia tree, but ours is still different. We currently call it the SD tree.
The engine was originally named ONE (Object Non-duplicate Engine) and is designed for general-purpose object deduplication. We applied it to kernel pages first (PageONE) because that lets us see how it behaves in a high-speed environment. PageONE is not meant to improve KSM; the two are completely different things.
We do not use a hash to find equality. Because the SD tree needs to compare bits, we implemented our own comparison function instead of using memcmp.
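To illustrate what a bit-level comparison can provide that memcmp does not, here is a minimal user-space sketch. It is not the PageONE code: the name first_diff_bit, the PAGE_BYTES constant, and the MSB-first bit numbering are assumptions made for this example only. It returns the index of the first bit where two buffers differ, which is the kind of information a Patricia-like tree typically needs to choose a branch, and it returns -1 when the contents are identical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES 4096u /* assumed page size for this sketch */

/*
 * Hypothetical helper, not taken from the PageONE patch.
 * Returns the index of the first differing bit between a and b
 * (bit 0 is the most significant bit of byte 0), or -1 if the
 * two buffers are identical over len bytes.
 */
static long first_diff_bit(const unsigned char *a, const unsigned char *b,
                           size_t len)
{
    size_t i = 0;

    assert(a != NULL && b != NULL);

    /* Skip equal data one 64-bit word at a time. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t wa, wb;

        memcpy(&wa, a + i, sizeof(wa));
        memcpy(&wb, b + i, sizeof(wb));
        if (wa != wb)
            break; /* the difference is somewhere in this word */
    }

    /* Locate the exact byte, then the exact bit inside it. */
    for (; i < len; i++) {
        unsigned char x = (unsigned char)(a[i] ^ b[i]);

        if (x) {
            int bit = 0;

            while (!(x & 0x80)) {
                x = (unsigned char)(x << 1);
                bit++;
            }
            return (long)(i * 8 + (size_t)bit);
        }
    }
    return -1; /* identical contents */
}
```

A caller could treat a return value of -1 as "pages are equal" and any other value as the bit position to branch on. Whether the SD tree uses this numbering or this word-skipping strategy is not stated in this thread.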
PageONE has no additional management structures apart from the SD tree, a page status bitmap, and a lockless queue, so we use rmap walks when necessary to obtain the reverse mapping (to write-protect pages and to replace page table entries).
In the current version, swap and migration handling is not yet complete.
We will provide PageONE single-threaded performance and xxhash test results after the end of the holidays (10.1-10.8).