We see reported stalls/lockups in quarantine_remove_cache() on machines with large amounts of RAM. quarantine_remove_cache() needs to scan whole quarantine in order to take out all objects belonging to the cache. Quarantine is currently 1/32-th of RAM, e.g. on a machine with 256GB of memory that will be 8GB. Moreover quarantine scanning is a walk over uncached linked list, which is slow. Add cond_resched() after scanning of each non-empty batch of objects. Batches are specifically kept of reasonable size for quarantine_put(). On a machine with 256GB of RAM we should have ~512 non-empty batches, each with 16MB of objects. Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx> Cc: kasan-dev@xxxxxxxxxxxxxxxx Cc: linux-mm@xxxxxxxxx Cc: Andrey Ryabinin <aryabinin@xxxxxxxxxxxxx> Cc: Greg Thelen <gthelen@xxxxxxxxxx> --- mm/kasan/quarantine.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c index 075422c3cee3..3021d2976dd6 100644 --- a/mm/kasan/quarantine.c +++ b/mm/kasan/quarantine.c @@ -311,8 +311,15 @@ void quarantine_remove_cache(struct kmem_cache *cache) on_each_cpu(per_cpu_remove_cache, cache, 1); spin_lock_irqsave(&quarantine_lock, flags); - for (i = 0; i < QUARANTINE_BATCHES; i++) + for (i = 0; i < QUARANTINE_BATCHES; i++) { + if (qlist_empty(&global_quarantine[i])) + continue; qlist_move_cache(&global_quarantine[i], &to_free, cache); + /* Scanning whole quarantine can take a while. */ + spin_unlock_irqrestore(&quarantine_lock, flags); + cond_resched(); + spin_lock_irqsave(&quarantine_lock, flags); + } spin_unlock_irqrestore(&quarantine_lock, flags); qlist_free_all(&to_free, cache); -- 2.12.0.246.ga2ecc84866-goog -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>