RFC [PATCH 0/1] ksm: introduce ksm_max_page_sharing per page deduplication limit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

this patch solves KSM computational complexity issues in the rmap walk
that created stalls during enterprise usage on large systems with lots
of RAM and CPUs. It's incremental with the patches posted earlier and
it should apply clean to upstream.

Special review should be done on the KSM page migration code with
merge_across_nodes == 0. I tested merge_across_nodes == 0 but not very
hard. The old code in this area in fact looked flakey in case two KSM
pages of equal content ended up being migrated in the same node. With
this new code instead they end up in the same "chain" in the right
stable_tree or the page under migration is merged into an existing KSM
page of the right node if page_mapcount allows for it.

At the moment the chains aren't defragmented but they could be. It'd
be enough to refile a couple of remap_items for each prune of the
stable_node_chain, from a "dup" with the lowest rmap_hlist_len to the
dup with the highest (not yet full) rmap_hlist_len. The rmap_hlist_len
of the "dup" that got its rmap_item removed, would drop to zero and it
would be garbage collected at the next prune pass and the
page_sharing/page_shared ratio would increase up to the peak possible
given the current max_page_sharing. However the "chain" prune logic
already tries to compact the "dups" in as few stable nodes as
possible. Overall if any defragmentation logic of the stable_node
"chains" really turn out to be good idea, it's better to do it in an
incremental patch as it's an orthogonal problem.

Changing max_page_sharing without first doing "echo 2
>/sys/kernel/mm/ksm/run" (that get rid of the entire stable rbtree)
would also be possible but I didn't think it was worth it as certain
asserts become not enforceable anymore.

The code has perhaps too much asserts, mostly VM_BUG_ON though (the
few BUG_ON are there because in those few cases if the asserts trigger
and ksmd doesn't stop there, it'd corrupt memory randomly). I can
remove the VM_BUG_ON when this gets more testing and gets out of
RFC. Also note, there's no VM_WARN_ON available.

Some testsuite that pretends to know the internals of KSM and predict
exact page_sharing/page_shared values, may give false positive with
this patch applied, but it's enough to set max_page_sharing to a very
large value in order to pass the old tests. Ideally those testsuites
should learn about the max_page_sharing limit and predict the new
page_sharing/shared results with the new code to validate it.

Comments welcome, thanks,
Andrea

Andrea Arcangeli (1):
  ksm: introduce ksm_max_page_sharing per page deduplication limit

 Documentation/vm/ksm.txt |  63 ++++
 mm/ksm.c                 | 731 ++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 728 insertions(+), 66 deletions(-)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]