On Mon, Feb 1, 2016 at 3:47 AM, Joonsoo Kim <iamjoonsoo.kim@xxxxxxx> wrote:
> On Wed, Jan 27, 2016 at 07:25:13PM +0100, Alexander Potapenko wrote:
>> Quarantine isolates freed objects in a separate queue. The objects are
>> returned to the allocator later, which helps to detect use-after-free
>> errors.
>>
>> Freed objects are first added to per-cpu quarantine queues.
>> When a cache is destroyed or memory shrinking is requested, the objects
>> are moved into the global quarantine queue. Whenever a kmalloc call
>> allows memory reclaiming, the oldest objects are popped out of the
>> global queue until the total size of objects in quarantine is less than
>> 3/4 of the maximum quarantine size (which is a fraction of installed
>> physical memory).
>
> Just wondering why not using time based approach rather than size
> based one. In heavy load condition, how much time do the object stay in
> quarantine?
>
>>
>> Right now quarantine support is only enabled in SLAB allocator.
>> Unification of KASAN features in SLAB and SLUB will be done later.
>>
>> This patch is based on the "mm: kasan: quarantine" patch originally
>> prepared by Dmitry Chernenkov.
>>
>> Signed-off-by: Alexander Potapenko <glider@xxxxxxxxxx>
>> ---
>>  include/linux/kasan.h |  30 ++++--
>>  lib/test_kasan.c      |  29 ++++++
>>  mm/kasan/Makefile     |   2 +-
>>  mm/kasan/kasan.c      |  68 +++++++++++-
>>  mm/kasan/kasan.h      |  11 +-
>>  mm/kasan/quarantine.c | 284 ++++++++++++++++++++++++++++++++++++++++++++++++++
>>  mm/kasan/report.c     |   3 +-
>>  mm/mempool.c          |   7 +-
>>  mm/page_alloc.c       |   2 +-
>>  mm/slab.c             |  12 ++-
>>  mm/slab.h             |   4 +
>>  mm/slab_common.c      |   2 +
>>  mm/slub.c             |   4 +-
>>  13 files changed, 435 insertions(+), 23 deletions(-)
>>
>
> ...
>
>> +bool kasan_slab_free(struct kmem_cache *cache, void *object)
>> +{
>> +#ifdef CONFIG_SLAB
>> +        /* RCU slabs could be legally used after free within the RCU period */
>> +        if (unlikely(cache->flags & SLAB_DESTROY_BY_RCU))
>> +                return false;
>> +
>> +        if (likely(cache->flags & SLAB_KASAN)) {
>> +                struct kasan_alloc_meta *alloc_info =
>> +                        get_alloc_info(cache, object);
>> +                struct kasan_free_meta *free_info =
>> +                        get_free_info(cache, object);
>> +
>> +                switch (alloc_info->state) {
>> +                case KASAN_STATE_ALLOC:
>> +                        alloc_info->state = KASAN_STATE_QUARANTINE;
>> +                        quarantine_put(free_info, cache);
>
> quarantine_put() can be called regardless of SLAB_DESTROY_BY_RCU,
> although it's not much meaningful without poisoning. But, I have an
> idea to poison object on SLAB_DESTROY_BY_RCU cache.
>
> quarantine_put() moves per cpu list to global queue when
> list size reaches QUARANTINE_PERCPU_SIZE. If we call synchronize_rcu()
> at that time, after then, we can poison objects. With appropriate size
> setup, it would not be intrusive.

Won't this slow the quarantine down unpredictably (e.g. when there are
no RCU slabs in the quarantine we would still be waiting for
synchronize_rcu())?
Still, this is something worth looking into. Do you want RCU to be
handled in this patch set?

>> +                        set_track(&free_info->track, GFP_NOWAIT);
>
> set_track() can be called regardless of SLAB_DESTROY_BY_RCU.

Agreed, I can fix that if we decide to handle RCU in this patch
(otherwise it will only lead to confusion).
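To make sure we are talking about the same scheme, here is roughly how
I read your proposal (a sketch only, not tested; it assumes a
qlist_move_all() helper in the spirit of the other qlist primitives,
and it leaves out where exactly the drained objects would be poisoned):

void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache)
{
        unsigned long flags;
        struct qlist *q;
        struct qlist temp = QLIST_INIT;

        local_irq_save(flags);
        q = this_cpu_ptr(&cpu_quarantine);
        qlist_put(q, (void **)info, cache->size);
        if (unlikely(q->bytes > QUARANTINE_PERCPU_SIZE))
                /*
                 * Assumed helper: detach the whole per-cpu list into a
                 * temporary one, so IRQs can be re-enabled before we wait.
                 */
                qlist_move_all(q, &temp);
        local_irq_restore(flags);

        if (unlikely(!empty_qlist(&temp))) {
                /*
                 * Wait for the grace period before the batch reaches the
                 * global queue, so that SLAB_DESTROY_BY_RCU objects may be
                 * poisoned from this point on. This is the delay I was
                 * referring to: it is paid even if the batch contains no
                 * RCU objects at all.
                 */
                synchronize_rcu();

                spin_lock_irqsave(&quarantine_lock, flags);
                qlist_move_all(&temp, &global_quarantine);
                spin_unlock_irqrestore(&quarantine_lock, flags);
        }
}

Another thing we would need to sort out is that synchronize_rcu() may
sleep, while kasan_slab_free() can be reached from atomic context.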
>
>> +                        kasan_poison_slab_free(cache, object);
>> +                        return true;
>> +                case KASAN_STATE_QUARANTINE:
>> +                case KASAN_STATE_FREE:
>> +                        pr_err("Double free");
>> +                        dump_stack();
>> +                        break;
>> +                default:
>> +                        break;
>> +                }
>> +        }
>> +        return false;
>> +#else
>> +        kasan_poison_slab_free(cache, object);
>> +        return false;
>> +#endif
>> +}
>> +
>
> ...
>
>> +void quarantine_reduce(void)
>> +{
>> +        size_t new_quarantine_size;
>> +        unsigned long flags;
>> +        struct qlist to_free = QLIST_INIT;
>> +        size_t size_to_free = 0;
>> +        void **last;
>> +
>> +        if (likely(ACCESS_ONCE(global_quarantine.bytes) <=
>> +                   smp_load_acquire(&quarantine_size)))
>> +                return;
>> +
>> +        spin_lock_irqsave(&quarantine_lock, flags);
>> +
>> +        /* Update quarantine size in case of hotplug. Allocate a fraction of
>> +         * the installed memory to quarantine minus per-cpu queue limits.
>> +         */
>> +        new_quarantine_size = (ACCESS_ONCE(totalram_pages) << PAGE_SHIFT) /
>> +                QUARANTINE_FRACTION;
>> +        new_quarantine_size -= QUARANTINE_PERCPU_SIZE * num_online_cpus();
>> +        smp_store_release(&quarantine_size, new_quarantine_size);
>> +
>> +        last = global_quarantine.head;
>> +        while (last) {
>> +                struct kmem_cache *cache = qlink_to_cache(last);
>> +
>> +                size_to_free += cache->size;
>> +                if (!*last || size_to_free >
>> +                    global_quarantine.bytes - QUARANTINE_LOW_SIZE)
>> +                        break;
>> +                last = (void **) *last;
>> +        }
>> +        qlist_move(&global_quarantine, last, &to_free, size_to_free);
>> +
>> +        spin_unlock_irqrestore(&quarantine_lock, flags);
>> +
>> +        qlist_free_all(&to_free, NULL);
>> +}
>
> Isn't it better to call quarantine_reduce() in shrink_slab()?
> It will help to maximize quarantine time.

This is true; however, if we don't call quarantine_reduce() from
kmalloc()/kfree(), the size of the quarantine becomes unpredictable.
There's a tradeoff between efficiency and space here, and at least in
some cases we may want to trade efficiency for space.

>
>> +
>> +static inline void qlist_move_cache(struct qlist *from,
>> +                                   struct qlist *to,
>> +                                   struct kmem_cache *cache)
>> +{
>> +        void ***prev;
>> +
>> +        if (unlikely(empty_qlist(from)))
>> +                return;
>> +
>> +        prev = &from->head;
>> +        while (*prev) {
>> +                void **qlink = *prev;
>> +                struct kmem_cache *obj_cache = qlink_to_cache(qlink);
>> +
>> +                if (obj_cache == cache) {
>> +                        if (unlikely(from->tail == qlink))
>> +                                from->tail = (void **) prev;
>> +                        *prev = (void **) *qlink;
>> +                        from->bytes -= cache->size;
>> +                        qlist_put(to, qlink, cache->size);
>> +                } else
>> +                        prev = (void ***) *prev;
>> +        }
>> +}
>> +
>> +static void per_cpu_remove_cache(void *arg)
>> +{
>> +        struct kmem_cache *cache = arg;
>> +        struct qlist to_free = QLIST_INIT;
>> +        struct qlist *q;
>> +        unsigned long flags;
>> +
>> +        local_irq_save(flags);
>> +        q = this_cpu_ptr(&cpu_quarantine);
>> +        qlist_move_cache(q, &to_free, cache);
>> +        local_irq_restore(flags);
>> +
>> +        qlist_free_all(&to_free, cache);
>> +}
>> +
>> +void quarantine_remove_cache(struct kmem_cache *cache)
>> +{
>> +        unsigned long flags;
>> +        struct qlist to_free = QLIST_INIT;
>> +
>> +        on_each_cpu(per_cpu_remove_cache, cache, 0);
>
> Should be called with wait = 1.

Agreed, thank you.
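To spell it out, the call will simply become the synchronous variant
(with a note on why the wait matters):

        /*
         * wait = 1: do not return before every CPU has run
         * per_cpu_remove_cache(); otherwise objects of a cache that is
         * about to be destroyed could still sit in a per-cpu queue when
         * quarantine_remove_cache() returns.
         */
        on_each_cpu(per_cpu_remove_cache, cache, 1);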
>> +
>> +        spin_lock_irqsave(&quarantine_lock, flags);
>> +        qlist_move_cache(&global_quarantine, &to_free, cache);
>> +        spin_unlock_irqrestore(&quarantine_lock, flags);
>> +
>> +        qlist_free_all(&to_free, cache);
>> +}
>> diff --git a/mm/kasan/report.c b/mm/kasan/report.c
>> index 6c4afcd..a4dca25 100644
>> --- a/mm/kasan/report.c
>> +++ b/mm/kasan/report.c
>> @@ -148,7 +148,8 @@ static void print_object(struct kmem_cache *cache, void *object)
>>                  print_track(&alloc_info->track);
>>                  break;
>>          case KASAN_STATE_FREE:
>> -                pr_err("Object freed, allocated with size %u bytes\n",
>> +        case KASAN_STATE_QUARANTINE:
>> +                pr_err("Object freed, allocated with size %lu bytes\n",
>>                         alloc_info->alloc_size);
>>                  free_info = get_free_info(cache, object);
>>                  pr_err("Allocation:\n");
>> diff --git a/mm/mempool.c b/mm/mempool.c
>> index b47c8a7..4beeeef 100644
>> --- a/mm/mempool.c
>> +++ b/mm/mempool.c
>> @@ -105,11 +105,12 @@ static inline void poison_element(mempool_t *pool, void *element)
>>  static void kasan_poison_element(mempool_t *pool, void *element)
>>  {
>>          if (pool->alloc == mempool_alloc_slab)
>> -                kasan_slab_free(pool->pool_data, element);
>> +                kasan_poison_slab_free(pool->pool_data, element);
>>          if (pool->alloc == mempool_kmalloc)
>> -                kasan_kfree(element);
>> +                kasan_poison_kfree(element);
>>          if (pool->alloc == mempool_alloc_pages)
>> -                kasan_free_pages(element, (unsigned long)pool->pool_data);
>> +                kasan_poison_free_pages(element,
>> +                                        (unsigned long)pool->pool_data);
>>  }
>>
>>  static void kasan_unpoison_element(mempool_t *pool, void *element, gfp_t flags)
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 63358d9..4f65587 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -980,7 +980,7 @@ static bool free_pages_prepare(struct page *page, unsigned int order)
>>
>>          trace_mm_page_free(page, order);
>>          kmemcheck_free_shadow(page, order);
>> -        kasan_free_pages(page, order);
>> +        kasan_poison_free_pages(page, order);
>>
>>          if (PageAnon(page))
>>                  page->mapping = NULL;
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 0ec7aa3..e2fac67 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -3374,9 +3374,19 @@ free_done:
>>  static inline void __cache_free(struct kmem_cache *cachep, void *objp,
>>                                  unsigned long caller)
>>  {
>> +#ifdef CONFIG_KASAN
>> +        if (!kasan_slab_free(cachep, objp))
>> +                /* The object has been put into the quarantine, don't touch it
>> +                 * for now.
>> +                 */
>> +                nokasan_free(cachep, objp, caller);
>> +}
>> +
>> +void nokasan_free(struct kmem_cache *cachep, void *objp, unsigned long caller)
>> +{
>> +#endif
>
> It looks not good to me.
> Converting __cache_free() to ____cache_free() and making
> __cache_free() call ____cache_free() if (!kasan_slab_free()) looks
> better to me and less error-prone.

Fixed; will upload the new patchset soonish. A rough sketch of the
resulting shape is below.

> Thanks.
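The restructuring, along the lines you suggested (names taken from your
mail; the old body of __cache_free() just moves verbatim into
____cache_free(), so this is only the shape, not the final patch):

/* Former __cache_free() body, unchanged, under a new name. */
static void ____cache_free(struct kmem_cache *cachep, void *objp,
                           unsigned long caller)
{
        /* ... original freeing path ... */
}

static inline void __cache_free(struct kmem_cache *cachep, void *objp,
                                unsigned long caller)
{
        /*
         * kasan_slab_free() returns true when the object has been put
         * into the quarantine; in that case don't touch it for now.
         */
        if (kasan_slab_free(cachep, objp))
                return;

        ____cache_free(cachep, objp, caller);
}

This also avoids the #ifdef/brace trickery in the middle of the
function.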
-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München