On Thu, 11 Aug 2022 at 11:31, <vbabka@xxxxxxx> wrote:
>
> On 8/11/22 10:59, Imran Khan wrote:
> > By default kfence allocation can happen for any slab object, whose size
> > is up to PAGE_SIZE, as long as that allocation is the first allocation
> > after expiration of kfence sample interval. But in certain debugging
> > scenarios we may be interested in debugging corruptions involving
> > some specific slub objects like dentry or ext4_* etc. In such cases
> > limiting kfence for allocations involving only specific slub objects
> > will increase the probability of catching the issue since kfence pool
> > will not be consumed by other slab objects.
>
> So you want to enable specific caches for kfence.
>
> > This patch introduces a sysfs interface '/sys/kernel/slab/<name>/skip_kfence'
> > to disable kfence for specific slabs. Having the interface work in this
> > way does not impact current/default behavior of kfence and allows us to
> > use kfence for specific slabs (when needed) as well. The decision to
> > skip/use kfence is taken depending on whether kmem_cache.flags has
> > (newly introduced) SLAB_SKIP_KFENCE flag set or not.
>
> But this seems everything is still enabled and you can selectively disable.
> Isn't that rather impractical?

A script just iterates through all the caches that they don't want, and
sets skip_kfence? It doesn't look more complicated.

> How about making this cache flag rather denote that KFENCE is enabled (not
> skipped), set it by default only for caches with size <= 1024, then you

Where does 1024 come from? PAGE_SIZE?

The problem with that opt-in vs. opt-out is that it becomes more complex
to maintain opt-in (as the first RFC of this did). With the new flag
SLAB_SKIP_KFENCE, it also can serve a dual purpose, where someone might
want to explicitly opt out by default and pass it to kmem_cache_create()
(for whatever reason; not that we'd encourage that).
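For concreteness, the "script just iterates through all the caches" approach
could look something like the sketch below. The helper name and its arguments
are purely illustrative (not part of the patch); the sysfs layout assumed is
the /sys/kernel/slab/<name>/skip_kfence attribute the patch proposes, with the
slab root parameterized so the logic can be tried against a scratch directory:

```shell
# Illustrative sketch: keep KFENCE enabled only for caches matching $keep,
# and set skip_kfence on every other cache. set_skip_kfence is a
# hypothetical helper, not kernel code or part of the patch.
set_skip_kfence() {
	local root=$1 keep=$2 cache name
	for cache in "$root"/*/; do
		name=$(basename "$cache")
		case "$name" in
		$keep)	echo 0 > "${cache}skip_kfence" ;;	# KFENCE stays enabled
		*)	echo 1 > "${cache}skip_kfence" ;;	# KFENCE skipped
		esac
	done
}

# Example (needs root and a kernel with the patch applied):
#   set_skip_kfence /sys/kernel/slab dentry
```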
I feel that the real use cases for selectively enabling caches for KFENCE
are very narrow, and a design that introduces lots of complexity elsewhere,
just to support this feature, cannot be justified (which is why I suggested
the simpler design here back in
https://lore.kernel.org/lkml/CANpmjNNmD9z7oRqSaP72m90kWL7jYH+cxNAZEGpJP8oLrDV-vw@xxxxxxxxxxxxxx/ )

> can drop the size check in __kfence_alloc and rely only on the flag? And if
> you need, you can also enable a cache with size > 1024 with the sysfs
> interface, to override the limit, which isn't possible now.
> (I don't think changing the limit to always act on s->object_size instead of
> e.g. size passed to kmalloc() that it can pick up now, will change anything
> in practice)
> Then you can also have a kernel boot param that tells kfence to set the flag
> on no cache at all, and you can easily enable just the specific caches you
> want. Or make a parameter that lets you override the 1024 size limit
> globally, and if you set it to 0, it means no cache is enabled for kfence?
>
> > Signed-off-by: Imran Khan <imran.f.khan@xxxxxxxxxx>
> > ---
> >
> > Changes since v1:
> >  - Remove RFC tag
> >
> >  include/linux/slab.h |  6 ++++++
> >  mm/kfence/core.c     |  7 +++++++
> >  mm/slub.c            | 27 +++++++++++++++++++++++++++
> >  3 files changed, 40 insertions(+)
> >
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 0fefdf528e0d..947d912fd08c 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -119,6 +119,12 @@
> >   */
> >  #define SLAB_NO_USER_FLAGS	((slab_flags_t __force)0x10000000U)
> >
> > +#ifdef CONFIG_KFENCE
> > +#define SLAB_SKIP_KFENCE	((slab_flags_t __force)0x20000000U)
> > +#else
> > +#define SLAB_SKIP_KFENCE	0
> > +#endif
> > +
> >  /* The following flags affect the page allocator grouping pages by mobility */
> >  /* Objects are reclaimable */
> >  #define SLAB_RECLAIM_ACCOUNT	((slab_flags_t __force)0x00020000U)
> > diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> > index c252081b11df..8c08ae2101d7 100644
> > --- a/mm/kfence/core.c
> > +++ b/mm/kfence/core.c
> > @@ -1003,6 +1003,13 @@ void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags)
> >  		return NULL;
> >  	}
> >
> > +	/*
> > +	 * Skip allocations for this slab, if KFENCE has been disabled for
> > +	 * this slab.
> > +	 */
> > +	if (s->flags & SLAB_SKIP_KFENCE)
> > +		return NULL;
> > +
> >  	if (atomic_inc_return(&kfence_allocation_gate) > 1)
> >  		return NULL;
> >  #ifdef CONFIG_KFENCE_STATIC_KEYS
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 862dbd9af4f5..ee8b48327536 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -5745,6 +5745,30 @@ STAT_ATTR(CPU_PARTIAL_NODE, cpu_partial_node);
> >  STAT_ATTR(CPU_PARTIAL_DRAIN, cpu_partial_drain);
> >  #endif	/* CONFIG_SLUB_STATS */
> >
> > +#ifdef CONFIG_KFENCE
> > +static ssize_t skip_kfence_show(struct kmem_cache *s, char *buf)
> > +{
> > +	return sysfs_emit(buf, "%d\n", !!(s->flags & SLAB_SKIP_KFENCE));
> > +}
> > +
> > +static ssize_t skip_kfence_store(struct kmem_cache *s,
> > +			const char *buf, size_t length)
> > +{
> > +	int ret = length;
> > +
> > +	if (buf[0] == '0')
> > +		s->flags &= ~SLAB_SKIP_KFENCE;
> > +	else if (buf[0] == '1')
> > +		s->flags |= SLAB_SKIP_KFENCE;
> > +	else
> > +		ret = -EINVAL;
> > +
> > +	return ret;
> > +}
> > +SLAB_ATTR(skip_kfence);
> > +
> > +#endif
> > +
> >  static struct attribute *slab_attrs[] = {
> >  	&slab_size_attr.attr,
> >  	&object_size_attr.attr,
> > @@ -5812,6 +5836,9 @@ static struct attribute *slab_attrs[] = {
> >  	&failslab_attr.attr,
> >  #endif
> >  	&usersize_attr.attr,
> > +#ifdef CONFIG_KFENCE
> > +	&skip_kfence_attr.attr,
> > +#endif
> >
> >  	NULL
> >  };
> >
> > base-commit: 40d43a7507e1547dd45cb02af2e40d897c591870
>
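For what it's worth, skip_kfence_store() above only looks at the first
character of the input. A quick shell model of that parsing (purely
illustrative userspace code, not part of the patch; the function name is
made up) behaves like:

```shell
# Model of skip_kfence_store()'s input handling: a leading '0' clears
# SLAB_SKIP_KFENCE, a leading '1' sets it, anything else is rejected
# (-EINVAL in the kernel; modeled here as exit status 22 == EINVAL).
skip_kfence_store_model() {
	case "$1" in
	0*)	echo "clear SLAB_SKIP_KFENCE" ;;
	1*)	echo "set SLAB_SKIP_KFENCE" ;;
	*)	echo "EINVAL" >&2; return 22 ;;
	esac
}
```

Note that this accepts e.g. "0garbage" too, since only buf[0] is checked,
which is probably fine for a debug knob.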