On Fri 01-07-22 21:12:56, David Hildenbrand wrote:
> On 01.07.22 15:19, Michal Hocko wrote:
> > On Fri 01-07-22 14:39:24, David Hildenbrand wrote:
> >>> I am not sure about exact details of the KSM implementation but if that
> >>> is not a desirable behavior then it should be handled on the KSM level.
> >>> The very same thing can easily happen in a multithreaded (or in general
> >>> multi-process with shared mm) environment as well.
> >>
> >> I don't quite get what you mean.
> >
> > I meant to say that if KSM needs to be aware of a special CoW semantic
> > then it should be handled on the KSM layer regardless whether the KSM
> > has been set by the process itself or any other process that has access
> > to the MM. process_madvise is just another way to access a remote MM
> > other than sharing the full MM.
>
> Okay.
>
> KSM has been a corner case feature that was restricted to well-defined
> and well-tested environments. Until recently, R/O pins of any KSM pages
> were essentially completely unreliable. And applications don't expect
> such surprises. The shared zeropage is most probably the last
> problematic piece.
>
> Yes, we're getting there that it's a real feature that can see more
> (forced) wide-spread use. However, until the known issues in KSM have
> been fixed (e.g., below -- there is a whole list of papers regarding
> attacks on memory deduplication), it should be limited to well defined
> environments and applications only -- IMHO.

Very much agreed on all this! To be completely honest I am not really
sure that all those consequences are widely understood, and optimizing
solely for memory savings is a very short-sighted strategy IMO. But it
seems that there is a demand for this feature, and previous attempts at
APIs were much worse both from the semantic and maintainability POV. I
am not sure we can get anything more sane than madvise.

I also very much agree that the current shortcomings have to be addressed
first before we open this can of worms to 3rd party actors. I was not
aware of those, so thanks for bringing them up. Maybe I was overly
optimistic here.

So I guess we have the following questions to answer:
1) Do we really want to support KSM triggered by a 3rd party? Does it
   impose new challenges other than the existing ones in multi "threaded"
   environments?
2) If yes, is process_madvise the most appropriate existing API? Or
   do we need a new one?
3) Should this be a highly privileged operation, or do we want to allow
   userspace to shoot itself in the foot because the consequences are
   subtle and not very well understood?

> So what I want to express here is that if we're adding an interface that
> can be used to just enable KSM on the whole system easily, it might be a
> bit too soon for that. No matter what you document, people will ignore it.

Agreed.

> OTOH, if this is a real debug feature that will only be available in
> specific debug/test scenarios (kernel config? toggle? whatsoever?), then
> it's "better". If that is already the case, good.

No, I think this is aimed at real production deployments.

Thanks!
-- 
Michal Hocko
SUSE Labs