On Mon, Jan 08, 2018 at 04:13:04PM -0800, Andrew Morton wrote:
> I agree with Jan's comment.  We need to figure out how ->c_entry_count
> went negative.  mb_cache_count() says this state is "Unlikely, but not
> impossible", but from a quick read I can't see how this happens - it
> appears that coherency between ->c_list and ->c_entry_count is always
> maintained under ->c_list_lock?

I think I see the problem; and I think this should fix it.  Andrew,
Jan, can you review and double check my analysis?

Thanks,

					- Ted

commit 18fb3649c7cd9e70f05045656c1888459d96dfe4
Author: Theodore Ts'o <tytso@xxxxxxx>
Date:   Tue Jan 9 23:24:53 2018 -0500

    mbcache: fix potential double counting when removing entry

    Entries are removed from the mb_cache entry in two places:
    mb_cache_shrink() and mb_cache_entry_delete().  The mb_cache_shrink()
    function finds the entry to delete via the cache->c_list pointer,
    while mb_cache_entry_delete() finds the entry via the hash lists.

    If the two functions race with each other, trying to delete an entry
    at the same time, it's possible for cache->c_entry_count to get
    decremented twice for that one entry.  Fix this by checking to see if
    entry is still on the cache list before removing it and dropping
    c_entry_count.

    Signed-off-by: Theodore Ts'o <tytso@xxxxxxx>

diff --git a/fs/mbcache.c b/fs/mbcache.c
index 49c5b25bfa8c..0851af5c1c3d 100644
--- a/fs/mbcache.c
+++ b/fs/mbcache.c
@@ -290,8 +290,10 @@ static unsigned long mb_cache_shrink(struct mb_cache *cache,
 			list_move_tail(&entry->e_list, &cache->c_list);
 			continue;
 		}
-		list_del_init(&entry->e_list);
-		cache->c_entry_count--;
+		if (!list_empty(&entry->e_list)) {
+			list_del_init(&entry->e_list);
+			cache->c_entry_count--;
+		}
 		/*
 		 * We keep LRU list reference so that entry doesn't go away
 		 * from under us.