Re: [PATCH linux-next] mm/madvise: allow KSM hints for process_madvise

Michal Hocko <mhocko@xxxxxxxx> · Mon, 4 Jul 2022 08:48:06 +0200

On Fri 01-07-22 21:12:56, David Hildenbrand wrote:
> On 01.07.22 15:19, Michal Hocko wrote:
> > On Fri 01-07-22 14:39:24, David Hildenbrand wrote:
> >>> I am not sure about exact details of the KSM implementation but if that
> >>> is not a desirable behavior then it should be handled on the KSM level.
> >>> The very same thing can easily happen in a multithreaded (or in general
> >>> multi-process with shared mm) environment as well.
> >>
> >> I don't quite get what you mean.
> > 
> > I meant to say that if KSM needs to be aware of a special CoW semantic
> > then it should be handled on the KSM layer regardless of whether KSM
> > has been set by the process itself or any other process that has access
> > to the MM. process_madvise is just another way to access a remote MM
> > other than sharing the full MM.
> 
> Okay.
> 
> KSM has been a corner case feature that was restricted to well-defined
> and well-tested environments. Until recently, R/O pins of any KSM pages
> were essentially completely unreliable. And applications don't expect
> such surprises. The shared zeropage is most probably the last
> problematic piece.
> 
> Yes, we're getting there that it's a real feature that can see more
> (forced) wide-spread use. However, until the known issues in KSM have
> been fixed (e.g., below -- there is a whole list of papers regarding
> attacks on memory deduplication), it should be limited to well defined
> environments and applications only -- IMHO.

Very much agreed on all this! To be completely honest I am not really
sure that all those consequences are widely understood and optimizing
solely for memory savings is a very short-sighted strategy IMO. But, it
seems that there is a demand for this feature and previous attempts at
APIs were much worse both from the semantic and maintainability POV. I
am not sure we can get anything more sane than madvise.

I also very much agree that the current shortcomings have to be addressed
first before we open this can of worms to 3rd party actors. I was not
aware of those so thanks for bringing them up. Maybe I was overly
optimistic here.

So I guess we have the following questions to answer:
1) Do we really want to support KSM triggered by a 3rd party? Does it
impose new challenges other than the existing ones in multi "threaded"
environments?
2) If yes, is process_madvise the most appropriate existing API? Or
do we need a new one?
3) Should this be a highly privileged operation or do we want to allow
userspace to shoot itself in the foot because the consequences are subtle
and not very well understood?
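
For reference, this is a minimal sketch of how a process opts its own
memory into KSM today, i.e. the self-directed case that question 1)
contrasts with a 3rd party doing it from outside. The helper name
map_mergeable is hypothetical and the fallback constant 12 is assumed to
be the Linux value of MADV_MERGEABLE. It needs Linux and Python 3.8+, and
the hint only has an effect if the kernel has KSM built in and ksmd is
running.

```python
import mmap

# Use the constant from the mmap module when this build exposes it;
# otherwise fall back to 12 (assumed Linux value of MADV_MERGEABLE).
MADV_MERGEABLE = getattr(mmap, "MADV_MERGEABLE", 12)


def map_mergeable(length):
    """Map private anonymous memory and hint it as KSM-mergeable.

    Hypothetical helper for illustration only.
    Returns (mapping, err) where err is 0 on success or the errno
    reported by madvise(), e.g. EINVAL on a kernel without KSM.
    """
    # KSM only considers private anonymous memory, so do not rely on
    # Python's default of MAP_SHARED here.
    buf = mmap.mmap(
        -1,
        length,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        # Equivalent to madvise(addr, length, MADV_MERGEABLE) in C.
        buf.madvise(MADV_MERGEABLE)
    except OSError as exc:
        return buf, exc.errno
    return buf, 0
```

The caller here is the owner of the mapping and decides for itself. The
patch under discussion would let a different process issue the same hint
against a remote MM, which is where the questions above come from.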

> So what I want to express here is that if we're adding an interface that
> can be used to just enable KSM on the whole system easily, it might be a
> bit too soon for that. No matter what you document, people will ignore it.

Agreed.

> OTOH, if this is a real debug feature that will only be available in
> specific debug/test scenarios (kernel config? toggle? whatsoever?), then
> it's "better". If that is already the case, good.

No, I think this is aimed at real production deployments.

Thanks!
-- 
Michal Hocko
SUSE Labs