On Fri 05-01-18 08:54:56, jiang.biao2@xxxxxxxxxx wrote:
> > On Mon 27-11-17 11:30:19, Jiang Biao wrote:
> > > When running ltp stress test for 7*24 hours, the vmscan occasionally
> > > complains the following warning continuously,
> > >
> > > mb_cache_scan+0x0/0x3f0 negative objects to delete
> > > nr=-9232265467809300450
> > > ...
> > >
> > > The tracing result shows the freeable(mb_cache_count returns)
> > > is -1, which causes the continuous accumulation and overflow of
> > > total_scan.
> > >
> > > This patch make sure the mb_cache_count not return negative value,
> > > which make the mbcache shrinker more robust.
> > >
> > > Signed-off-by: Jiang Biao <jiang.biao2@xxxxxxxxxx>
> >
> > Going through some old email...
> >
> > a) c_entry_count is unsigned so your patch is a nop as Coverity properly
> > noticed.
> Indeed, would the following casting be good?
> +	if (unlikely((int)(cache->c_entry_count) < 0))
> +		return 0;

That check would at least have a chance of hitting but still it is just
hiding the real problem.

> > b) c_entry_count being outside 0..2*cache->c_max_entries is a plain bug. I
> > went through the logic and cannot find out how that could happen though.
> Is there any possibility that decreasing c_entry_count from 0 to -1
> in mb_cache_entry_delete?

If we think we have -1 entries in a list, we have a larger problem than just
the wrong behavior of the shrinker. This is just a plain counter of entries
protected by a spinlock so there isn't space for accounting errors or
anything like that. If you can reproduce the problem on some reasonably
recent kernel, I'd be interested in debugging this.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
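
For reference, the difference between the two checks discussed in (a) can
be sketched in plain userspace C. This is only an illustration, not kernel
code: the helper names are hypothetical, and the counter is assumed to be
an unsigned long (the thread only says it is unsigned). The (int) cast of
an out-of-range value is implementation-defined in C; the comments assume
the usual gcc/clang behavior.

```c
#include <limits.h>

/*
 * Hypothetical stand-in for the check in the original patch: comparing
 * an unsigned value against 0 with "<" can never be true, so the early
 * return is dead code and the function is a nop.
 */
static unsigned long count_unsigned_check(unsigned long c_entry_count)
{
	if (c_entry_count < 0)		/* always false for unsigned types */
		return 0;
	return c_entry_count;
}

/*
 * Hypothetical stand-in for the proposed cast: a counter that wrapped
 * from 0 to ULONG_MAX converts to -1 with gcc/clang, so the check can
 * fire. It only masks the symptom; the underflow has already happened.
 */
static unsigned long count_cast_check(unsigned long c_entry_count)
{
	if ((int)c_entry_count < 0)
		return 0;
	return c_entry_count;
}

/* Decrementing an unsigned counter at 0 wraps to ULONG_MAX, i.e. "-1". */
static unsigned long decrement(unsigned long c_entry_count)
{
	return c_entry_count - 1;
}
```

With these helpers, decrement(0) yields ULONG_MAX, count_unsigned_check()
passes that value through unchanged, and count_cast_check() returns 0 for
it. The cast would also misfire on any legitimate count whose low 32 bits
look negative as an int, which is another reason to find the underflow
rather than paper over it.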