On Tue, Aug 2, 2022 at 5:04 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Tue 02-08-22 02:48:33, Zach O'Keefe wrote:
> [...]
> > "mm/madvise: add MADV_COLLAPSE to process_madvise()" in the v7 series
> > ended with me mentioning a couple options, but ultimately I didn't
> > present a solution, and no consensus was reached[1]. After taking a
> > closer look, this is my proposal for what I believe to be the best
> > path forward. It should be squashed into the original patch. What do you think?
>
> If it is agreed that the CAP_SYS_ADMIN is too strict of a requirement
> then yes, this should be squashed into the original patch. There is no
> real reason to create a potential bisection headache by changing the
> permission model in a later patch.

Sorry about the confusion here. I assumed (incorrectly) that Andrew
would kindly squash this in mm-unstable since I added the Fixes: tag.
Next time I'll add some explicit verbiage saying it should be
squashed.

> From my POV, I would agree that CAP_SYS_ADMIN is just too strict of a
> requirement.
>
> I didn't really have time to follow recent discussions but I would argue
> that the operation is not really destructive or seriously harmful. All
> applications can already have their memory (almost) equally THP
> collapsed by khupaged with the proposed process_madvise semantic.
>
> NOHUGEMEM and prctl opt out from THP are both honored AFAIU and the only
> difference is the global THP killswitch behavior which I do not think
> warrants the strongest CAP_SYS_ADMIN capability (especially because it
> doesn't really control all kinds of THPs).

Ya. In fact, I don't think ignoring the THP sysfs controls warrants
any additional capability (let alone CAP_SYS_ADMIN), since a malicious
program can't really inflict any more damage than it could with
CAP_SYS_NICE and PTRACE_MODE_READ.

> If there is a userspace agent collapsing memory and causing problems
> then it can be easily fixed in the userspace. And I find that easier
> to do than putting the bar so high that userspace agents would be
> unfeasible because of CAP_SYS_ADMIN (which is nono in many cases as it
> would allow essentially full control of other stuff). So from practical
> POV, risking an extended RSS is really a negligible risk to lose a
> potentially useful feature for all others.
>

Agreed. Thanks for taking the time, Michal!

Zach

> Just my 2c
>
> > Thanks again,
> > Zach
> >
> > [1] https://lore.kernel.org/linux-mm/Ys4aTRqWIbjNs1mI@xxxxxxxxxx/
>
> --
> Michal Hocko
> SUSE Labs