From: Dmitry Vyukov <dvyukov@xxxxxxxxxx> Subject: kasan: resched in quarantine_remove_cache() We see reported stalls/lockups in quarantine_remove_cache() on machines with large amounts of RAM. quarantine_remove_cache() needs to scan whole quarantine in order to take out all objects belonging to the cache. Quarantine is currently 1/32-th of RAM, e.g. on a machine with 256GB of memory that will be 8GB. Moreover quarantine scanning is a walk over uncached linked list, which is slow. Add cond_resched() after scanning of each non-empty batch of objects. Batches are specifically kept of reasonable size for quarantine_put(). On a machine with 256GB of RAM we should have ~512 non-empty batches, each with 16MB of objects. Link: http://lkml.kernel.org/r/20170308154239.25440-1-dvyukov@xxxxxxxxxx Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> Acked-by: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx> Cc: Greg Thelen <gthelen@xxxxxxxxxx> Cc: <stable@xxxxxxxxxxxxxxx> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> --- mm/kasan/quarantine.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff -puN mm/kasan/quarantine.c~kasan-resched-in-quarantine_remove_cache mm/kasan/quarantine.c --- a/mm/kasan/quarantine.c~kasan-resched-in-quarantine_remove_cache +++ a/mm/kasan/quarantine.c @@ -283,8 +283,15 @@ void quarantine_remove_cache(struct kmem on_each_cpu(per_cpu_remove_cache, cache, 1); spin_lock_irqsave(&quarantine_lock, flags); - for (i = 0; i < QUARANTINE_BATCHES; i++) + for (i = 0; i < QUARANTINE_BATCHES; i++) { + if (qlist_empty(&global_quarantine[i])) + continue; qlist_move_cache(&global_quarantine[i], &to_free, cache); + /* Scanning whole quarantine can take a while. */ + spin_unlock_irqrestore(&quarantine_lock, flags); + cond_resched(); + spin_lock_irqsave(&quarantine_lock, flags); + } spin_unlock_irqrestore(&quarantine_lock, flags); qlist_free_all(&to_free, cache); _