On Mon, Jul 04, 2022 at 08:48:06AM +0200, Michal Hocko wrote:
> On Fri 01-07-22 21:12:56, David Hildenbrand wrote:
> > On 01.07.22 15:19, Michal Hocko wrote:
> > > On Fri 01-07-22 14:39:24, David Hildenbrand wrote:
> > >>> I am not sure about exact details of the KSM implementation but if that
> > >>> is not a desirable behavior then it should be handled on the KSM level.
> > >>> The very same thing can easily happen in a multithreaded (or in general
> > >>> multi-process with shared mm) environment as well.
> > >>
> > >> I don't quite get what you mean.
> > >
> > > I meant to say that if KSM needs to be aware of a special CoW semantic
> > > then it should be handled on the KSM layer regardless whether the KSM
> > > has been set by the process itself or any other process that has access
> > > to the MM. process_madvise is just another way to access a remote MM
> > > other than sharing the full MM.
> >
> > Okay.
> >
> > KSM has been a corner case feature that was restricted to well-defined
> > and well-tested environments. Until recently, R/O pins of any KSM pages
> > were essentially completely unreliable. And applications don't expect
> > such surprises. The shared zeropage is most probably the last
> > problematic piece.
> >
> > Yes, we're getting there that it's a real feature that can see more
> > (forced) wide-spread use. However, until the known issues in KSM have
> > been fixed (e.g., below -- there is a whole list of papers regarding
> > attacks on memory deduplication), it should be limited to well defined
> > environments and applications only -- IMHO.
>
> Very much agreed on all this! To be completely honest I am not really
> sure that all those consequences are widely understood and optimizing
> solely on memory savings is a very short sighted strategy IMO. But, it
> seems that there is a demand for this feature and previous attempts for
> APIs were much worse both from the semantic and maintainability POV. I
> am not sure we can get anything more sane than madvise.
>
> I also very much agree that current shortcomings have to be addressed
> first before we open this can of worms to 3rd party actors. I was not
> aware of those so thanks for bringing them up. Maybe I was overly
> optimistic here.
>
> So I guess we have following questions to answer:
> 1) Do we really want to support KSM triggered by 3rd party? Does it
> impose new challenges other than existing ones in multi "threaded"
> environments?
> 2) If yes, is the process_madvise the most appropriate existing API? Or
> do we need a new one?

Maybe new semantics are needed, similar to MADV_NOHUGEPAGE, which ensures
that there will *not* be huge pages.

> 3) Should this be a highly privileged operation or we want to allow
> userspace to shoot its feet because consequences are subtle and not very
> well understood?
>
> > So what I want to express here is that if we're adding an interface that
> > can be used to just enable KSM on the whole system easily, it might be a
> > bit too soon for that. No matter what you document, people will ignore it.
>
> Agreed.

Agree too. Thanks.
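
For reference, the case where KSM "has been set by the process itself" can be sketched from userspace roughly as below. This is only an illustration, assuming Linux and Python 3.8+ (where the mmap module exposes madvise); the helper name mark_mergeable is hypothetical, and the remote, 3rd party variant discussed above is deliberately not shown.

```python
import errno
import mmap


def mark_mergeable(length: int = 4 * mmap.PAGESIZE):
    """Map anonymous memory and opt it into KSM for the calling process.

    Returns (region, err): err is 0 on success, otherwise the errno
    reported by madvise (for example EINVAL on a kernel without KSM).
    """
    # Private anonymous mapping, the kind of memory KSM scans.
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

    # The constant is only defined where the platform provides it.
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    if advice is None:
        return region, errno.ENOSYS

    try:
        # Advise the kernel that this range may be deduplicated.
        region.madvise(advice)
    except OSError as exc:
        return region, exc.errno
    return region, 0
```

A caller would check the returned errno rather than assume the advice was accepted, since merging additionally depends on ksmd being enabled through /sys/kernel/mm/ksm/run.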