On 22.11.21 20:53, Jens Axboe wrote:
> On 11/22/21 11:26 AM, David Hildenbrand wrote:
>> On 22.11.21 18:55, Andrew Dona-Couch wrote:
>>> Forgive me for jumping in to an already overburdened thread. But can
>>> someone pushing back on this clearly explain the issue with applying
>>> this patch?
>>
>> It will allow unprivileged users to easily and even "accidentally"
>> allocate more unmovable memory than it should in some environments. Such
>> limits exist for a reason. And there are ways for admins/distros to
>> tweak these limits if they know what they are doing.
>
> But that's entirely the point, the cases where this change is needed are
> already screwed by a distro and the user is the administrator. This is
> _exactly_ the case where things should just work out of the box. If
> you're managing farms of servers, yeah you have competent administration
> and you can be expected to tweak settings to get the best experience and
> performance, but the kernel should provide a sane default. 64K isn't a
> sane default. 0.1% of RAM isn't either.
>
>> This is not a step into the right direction. This is all just trying to
>> hide the fact that we're exposing FOLL_LONGTERM usage to random
>> unprivileged users.
>>
>> Maybe we could instead try getting rid of FOLL_LONGTERM usage and the
>> memlock limit in io_uring altogether, for example, by using mmu
>> notifiers. But I'm no expert on the io_uring code.
>
> You can't use mmu notifiers without impacting the fast path. This isn't
> just about io_uring, there are other users of memlock right now (like
> bpf) which just makes it even worse.

1) Do we have a performance evaluation? Did someone try and come up with
a conclusion how bad it would be?

2) Could we provide an mmu variant to ordinary users that's just good
enough but maybe not as fast as what we have today? And limit
FOLL_LONGTERM to special, privileged users?

3) Just because there are other memlock users is not an excuse.
For example, VFIO/VDPA have to use it for a reason, because there is no
way not to use FOLL_LONGTERM.

>
> We should just make this 0.1% of RAM (min(0.1% ram, 64KB)) or something
> like what was suggested, if that will help move things forward. IMHO the
> 32MB machine is mostly a theoretical case, but whatever.

1) I'm deeply concerned about large ZONE_MOVABLE and MIGRATE_CMA ranges
where FOLL_LONGTERM cannot be used, as that memory is not available.

2) With 0.1% RAM it's sufficient to start 1000 processes to break any
system completely and deeply mess up the MM. Oh my.

No, I don't like this, absolutely not. I neither like raising the
memlock limit as default to such high values nor using FOLL_LONGTERM in
cases where it could be avoided for random, unprivileged users.

But I assume this is mostly for the record, because I assume nobody
cares about my opinion here.

-- 
Thanks,

David / dhildenb