My mistake, I first answered an older email.

David Hildenbrand <david@xxxxxxxxxx> writes:

> On 30.03.23 16:26, Johannes Weiner wrote:
>> On Thu, Mar 30, 2023 at 06:55:31AM +0200, David Hildenbrand wrote:
>>> On 29.03.23 01:09, Andrew Morton wrote:
>>>> On Fri, 10 Mar 2023 10:28:48 -0800 Stefan Roesch <shr@xxxxxxxxxxxx> wrote:
>>>>
>>>>> So far KSM can only be enabled by calling madvise for memory regions. To
>>>>> be able to use KSM for more workloads, KSM needs to have the ability to be
>>>>> enabled / disabled at the process / cgroup level.
>>>>
>>>> Review on this series has been a bit thin. Are we OK with moving this
>>>> into mm-stable for the next merge window?
>>>
>>> I still want to review (traveling this week), but I also don't want to block
>>> this forever.
>>>
>>> I think I didn't get a reply from Stefan to my question [1] yet (only some
>>> comments from Johannes). I would still be interested in the variance of
>>> pages we end up de-duplicating for processes.
>>>
>>> The 20% statement in the cover letter is rather useless and possibly
>>> misleading if no details about the actual workload are shared.
>>
>> The workload is instagram. It forks off Django runtimes on-demand
>> until it saturates whatever hardware it's running on. This benefits
>> from merging common heap/stack state between instances. Since that
>> runtime is quite large, the 20% number is not surprising, and matches
>> our expectations of duplicative memory between instances.
>
> Thanks for this explanation. It's valuable to get at least a feeling for the
> workload because it doesn't seem to apply to other workloads at all.
>
>> Obviously we could spend months analysing which exact allocations are
>> identical, and then more months or years reworking the architecture to
>> deduplicate them by hand and in userspace. But this isn't practical,
>> and KSM is specifically for cases where this isn't practical.
>>
>> Based on your request in the previous thread, we investigated whether
>> the boost was coming from the unintended side effects of KSM splitting
>> THPs. This wasn't the case.
>>
>> If you have other theories on how the results could be bogus, we'd be
>> happy to investigate those as well. But you have to let us know what
>> you're looking for.
>>
>
> Maybe I'm bad at making such requests but
>
> "Stefan, can you do me a favor and investigate which pages we end up
> deduplicating -- especially if it's mostly only the zeropage and if it's
> still that significant when disabling THP?"
>
> "In any case, it would be nice to get a feeling for how much variety in
> these 20% of deduplicated pages are. "
>
> is pretty clear to me. And shouldn't take months.
>

/sys/kernel/mm/ksm/pages_shared is over 10000 when we run this on an
Instagram workload (a minimal sketch for reading this counter is appended
at the end of this mail). The workload consists of 36 processes plus a
few sidecar processes.

Each of these individual processes has around 500MB in KSM pages.

Also, to give some idea for individual VMAs:

7ef5d5600000-7ef5e5600000 rw-p 00000000 00:00 0 (Size: 262144 KB, KSM: 73160 KB)

>> Beyond that, I don't think we need to prove from scratch that KSM can
>
> I never expected a proof. I was merely trying to understand if it's really KSM
> that helps here. Also with the intention to figure out if KSM is really the
> right tool to use here or if it simply "helps by luck" as with the shared
> zeropage. That end result could have been valuable to your use case as well,
> because KSM overhead is real.
>
>> be a worthwhile optimization. It's been established that it can
>> be.
>> This series is about enabling it in scenarios where madvise()
>> isn't practical, that's it, and it's yielding the expected results.
>
> I'm sorry to say, but you sound a bit aggressive and annoyed. I also have no
> idea why Stefan isn't replying to me but always you.
>
> Am I asking the wrong questions? Do you want me to stop looking at KSM code?
>

Your review is valuable; Johannes was quicker than me.
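
For reference, here is a minimal, untested sketch (mine, not taken from the
series itself) of how a process can opt into KSM without madvise() and then
read the global pages_shared counter quoted above. It assumes the prctl ends
up being called PR_SET_MEMORY_MERGE as in the patches; the fallback define is
an assumption for userspace headers that do not carry it yet.

/*
 * Minimal sketch: enable KSM for the whole process via the new prctl and
 * print /sys/kernel/mm/ksm/pages_shared. Not part of the series; the
 * fallback value below is assumed to match the patches.
 */
#include <stdio.h>
#include <sys/prctl.h>

#ifndef PR_SET_MEMORY_MERGE
#define PR_SET_MEMORY_MERGE 67	/* assumed to match the series */
#endif

int main(void)
{
	long pages_shared = 0;
	FILE *f;

	/* Opt every eligible anonymous VMA of this process into KSM. */
	if (prctl(PR_SET_MEMORY_MERGE, 1, 0, 0, 0))
		perror("prctl(PR_SET_MEMORY_MERGE)");

	/*
	 * Global count of KSM shared pages. In a real measurement the
	 * counter is read after ksmd has had time to scan the workload,
	 * not immediately after the prctl.
	 */
	f = fopen("/sys/kernel/mm/ksm/pages_shared", "r");
	if (f) {
		if (fscanf(f, "%ld", &pages_shared) == 1)
			printf("pages_shared: %ld\n", pages_shared);
		fclose(f);
	}
	return 0;
}

The numbers quoted earlier were of course taken after the workload had been
running for a while, not right after enabling merging.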