Re: [PATCH 1/1] psi: remove 500ms min window size limitation for triggers

On Wed, Mar 1, 2023 at 12:07 PM Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>
> On Wed, Mar 01, 2023 at 11:34:03AM -0800, Suren Baghdasaryan wrote:
> > Current 500ms min window size for psi triggers limits polling interval
> > to 50ms to prevent polling threads from using too much cpu bandwidth by
> > polling too frequently. However the number of cgroups with triggers is
> > unlimited, so this protection can be defeated by creating multiple
> > cgroups with psi triggers (triggers in each cgroup are served by a single
> > "psimon" kernel thread).
> > Instead of limiting min polling period, which also limits the latency of
> > psi events, it's better to limit psi trigger creation to authorized users
> > only, like we do for system-wide psi triggers (/proc/pressure/* files can
> > be written only by processes with CAP_SYS_RESOURCE capability). This also
> > makes access rules for cgroup psi files consistent with system-wide ones.
> > Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
> > remove the psi window min size limitation.
> >
> > Suggested-by: Sudarshan Rajagopalan <quic_sudaraja@xxxxxxxxxxx>
> > Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@xxxxxxxxxxx/
> > Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> > ---
> >  kernel/cgroup/cgroup.c | 10 ++++++++++
> >  kernel/sched/psi.c     |  4 +---
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> > index 935e8121b21e..b600a6baaeca 100644
> > --- a/kernel/cgroup/cgroup.c
> > +++ b/kernel/cgroup/cgroup.c
> > @@ -3867,6 +3867,12 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
> >       return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
> >  }
> >
> > +static int cgroup_pressure_open(struct kernfs_open_file *of)
> > +{
> > +     return (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE)) ?
> > +             -EPERM : 0;
> > +}
>
> I agree with the change, but it's a bit unfortunate that this check is
> duplicated between system and cgroup.
>
> What do you think about psi_trigger_create() taking the file and
> checking FMODE_WRITE and CAP_SYS_RESOURCE against file->f_cred?

That's definitely doable, and we don't even need to pass the file to
psi_trigger_create() since it's called only when we write to the file.
However, by moving the capability check into psi_trigger_create() we
also postpone the check until write() instead of failing early in
open(). I always assumed failing early is preferable, but if
consolidating the code here makes more sense then I can make the
switch. Please let me know if you still prefer to move the check.
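
To make the user-visible difference concrete, here is a rough userspace
sketch (not part of the patch) that reports which syscall rejected a
trigger registration. The helper register_psi_trigger() and the enum
psi_reg_stage are hypothetical names made up for this illustration; the
trigger string format and the single write including the terminating
NUL follow the example in the kernel's psi documentation.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical, for illustration only: which step rejected the caller. */
enum psi_reg_stage {
	PSI_REG_OK = 0,
	PSI_REG_FAIL_OPEN,	/* open() failed: the "fail early" case */
	PSI_REG_FAIL_WRITE,	/* write() failed: check deferred to trigger creation */
};

/*
 * Try to register a psi trigger by writing spec to path.
 * spec looks like "some 150000 1000000": stall threshold and window
 * size, both in microseconds.
 *
 * On success, returns PSI_REG_OK and stores the open fd in *fd_out; the
 * caller would poll() it for POLLPRI and close it when done.
 * On failure, stores the errno of the failing call in *err_out.
 */
static enum psi_reg_stage register_psi_trigger(const char *path,
					       const char *spec,
					       int *fd_out, int *err_out)
{
	int fd;

	assert(path && spec && fd_out && err_out);

	*fd_out = -1;
	*err_out = 0;

	fd = open(path, O_RDWR | O_NONBLOCK);
	if (fd < 0) {
		*err_out = errno;
		return PSI_REG_FAIL_OPEN;
	}

	/* The spec is sent in one write, terminating NUL included. */
	if (write(fd, spec, strlen(spec) + 1) < 0) {
		*err_out = errno;	/* save before close() can clobber it */
		close(fd);
		return PSI_REG_FAIL_WRITE;
	}

	*fd_out = fd;
	return PSI_REG_OK;
}
```

With the patch as posted, a caller without CAP_SYS_RESOURCE opening a
cgroup pressure file for writing would get PSI_REG_FAIL_OPEN with EPERM.
With the check moved into psi_trigger_create(), the same caller would
presumably get past open() and see PSI_REG_FAIL_WRITE instead.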



